CN116010965A - Method and system for detecting logic loopholes of extensible stateful protocol entity program

Info

Publication number
CN116010965A
CN116010965A CN202211629520.XA CN202211629520A CN116010965A CN 116010965 A CN116010965 A CN 116010965A CN 202211629520 A CN202211629520 A CN 202211629520A CN 116010965 A CN116010965 A CN 116010965A
Authority
CN
China
Prior art keywords
message
input
symbol
output
state machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211629520.XA
Other languages
Chinese (zh)
Inventor
徐向华
黄一乘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202211629520.XA
Publication of CN116010965A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Maintenance And Management Of Digital Transmission (AREA)

Abstract

The invention discloses a method and a system for detecting logic vulnerabilities in extensible stateful protocol entity programs. In the test preparation stage, the entity program under test is deployed locally, and a script file describing the traffic messages of the stateful protocol under test and their corresponding message types is taken as input. In the state machine learning stage, input symbols are selected from the script file and mapped to corresponding messages, and the messages are fuzzed to generate test cases; each test case is checked to determine whether it still belongs to its symbol, and if so it is sent to the program under test for execution, otherwise it is corrected; the response messages returned by the program under test are collected and mapped to corresponding output symbols, and model learning is used to construct the minimal deterministic Mealy state machine corresponding to the protocol entity program under test. In the result organization and analysis stage, the learned result state machine is pruned and tidied. The method is suitable for detecting logic vulnerabilities in stateful protocol entity programs, is highly extensible, and detects vulnerabilities efficiently.

Description

Method and system for detecting logic loopholes of extensible stateful protocol entity program
Technical Field
The invention relates to the technical field of network communication protocol testing, in particular to a logic vulnerability detection method and a system of an extensible stateful protocol entity program.
Background
Fuzz testing is a vulnerability discovery technique that feeds unexpected input to a target system and monitors it for abnormal results. Its advantages include high accuracy, good usability, and little dependence on the source code of the test target. Most protocol implementations are supplied by vendors, and users know little about their internal structure, so fuzz testing is the most common way to search them for vulnerabilities.
Network protocols are classified as stateful or stateless according to whether their messages are correlated. In stateless protocols, such as ICMP and DNS, each request message is independent of the others. In stateful protocols, such as FTP, TCP, TLS, and DTLS, each request-response exchange affects the next, so both communicating parties must maintain state for the protocol.
When a stateful protocol is fuzz tested, the protocol entity accepts a test-case message only if the message is consistent with the entity's current state; otherwise it rejects the message. Testing therefore has to rely on a protocol state machine. To test an intermediate state of the protocol state machine, the preceding messages must first be sent so that the protocol entity program reaches the state under test.
Existing fuzz testing methods for stateful network protocol entities concentrate on crash vulnerabilities, which bring the protocol entity program down. Patent methods of this kind include: CN 104796240A, which proposes a crash vulnerability fuzz testing method for stateful network protocols; CN 114116500A, which proposes a crash vulnerability fuzz testing method for protocol entity programs of unknown protocols; and CN 105763392A, which proposes a method for fuzz testing an industrial control component according to protocol state.
Beyond these crash vulnerability testing methods for stateful protocols, few methods exist for testing protocol logic vulnerabilities. A logic vulnerability is usually caused by a logic error in the protocol implementation code, introduced when the programmer misunderstands the protocol specification. Logic vulnerabilities generally do not crash the protocol entity program and are therefore relatively hard to detect. The paper Protocol state machine fuzzing of TLS Implementations (tls-fuzzer) proposes a method for detecting logic vulnerabilities that uses automata learning to construct the state machines of different protocol entity programs. A mapper module converts abstract input symbols into concrete messages and sends them to the program under test; it also receives the responses returned by that program and converts them into abstract output symbols. These input and output symbols are used to construct the state machine of the protocol entity program. Finally, logic vulnerabilities are discovered by manually analyzing the learned state machine: if a problematic state or an abnormal state transition is found, the corresponding source code is located and examined for a logic error. Analysis of DTLS Implementations Using Protocol State Fuzzing (dtls-fuzzer) adopts a similar method to detect logic vulnerabilities in the DTLS protocol.
However, the above work on protocol logic vulnerability testing has the following shortcoming: the mapper and the fuzz testing code are tightly coupled to the TLS and DTLS protocols, so state machines can be learned and constructed only for TLS and DTLS. Extending the tools to other protocols requires modifying both the protocol mapper and the fuzz testing code, which is difficult, so protocol extensibility is poor.
CN 109525457A describes a method that fuzz tests a protocol entity program based on protocol state transitions and discovers logic vulnerabilities by finding abnormal state transitions in the protocol entity. For a protocol entity program whose protocol specification is public, the protocol state machine must be described according to the specification and used to guide fuzz testing; for a private protocol entity program whose specification is unknown, the state machine information is obtained by protocol reverse analysis and used to guide fuzz testing. Given the state machine of the protocol entity program, the method then solves for the shortest path that traverses all transitions of the protocol state machine and tests each state transition in turn along that path. During testing, abnormal state transitions of the protocol entity are identified from the unique input/output sequence of each protocol state. The method has the following drawbacks: the state machine described by a public protocol specification is not necessarily identical to the state machine of the protocol entity program, because an implementation may introduce reasonable intermediate states and state transitions of its own; and the state machine information obtained for a private protocol by reverse analysis deviates considerably from the real protocol states. These inconsistencies and deviations reduce the accuracy of the test results.
In view of the above problems, the invention provides a method and a system for detecting logic vulnerabilities in extensible stateful protocol entity programs. It takes the traffic messages of the stateful protocol under test and their corresponding message types as input, constructs the state machine of the protocol entity program by combining fuzz testing with automata learning, and discovers potential logic vulnerabilities in the protocol entity program under test through manual analysis of that state machine.
Disclosure of Invention
To address the above problems, the invention provides a method and a system for detecting logic vulnerabilities in extensible stateful protocol entity programs. The method comprises a test preparation stage, a state machine learning stage, and a result organization and analysis stage. In the test preparation stage, the protocol entity program under test is deployed locally, and a script file describing the traffic messages of the stateful protocol under test and their corresponding message types is taken as input. In the state machine learning stage, input symbols are first selected from the script file and mapped to corresponding messages, and the messages are fuzzed to generate test cases; next, similarity calculation determines whether each test case still belongs to its symbol: if so, the test case is sent to the program under test for execution, and otherwise it is corrected by fuzzing again; finally, the response messages returned after the program under test executes the test case are collected and mapped to corresponding output symbols through similarity calculation, and model learning is used to construct the minimal deterministic Mealy state machine corresponding to the protocol entity program under test, referred to as the result state machine. In the result organization and analysis stage, the learned result state machine is pruned and tidied so that testers can analyze abnormal path transitions and redundant states.
The method for detecting logic vulnerabilities in extensible stateful protocol entity programs comprises three main stages: the test preparation stage, the state machine learning stage, and the result organization and analysis stage.
1 Test preparation stage
The communication traffic of the protocol under test is captured with the Wireshark tool, and the captured messages and their corresponding message types are used as input to the fuzz tester. The stage comprises the following steps:
1.1 Deployment of the protocol entity program under test
Deploy the protocol entity program under test, then start it together with its communication peer (if the program under test is a client, start the corresponding server; otherwise start the corresponding client).
1.2 Network traffic capture
Start the network packet analysis tool Wireshark and capture the network traffic between the protocol entity program under test and its communication peer.
1.3 Message type classification
First, the captured network traffic is filtered and sorted into different message types, ensuring that each message type has at least one message instance.
Then, message types sent by the protocol entity program under test are classified as output message types, and message types sent by the communication peer are classified as input message types.
Finally, each message type is uniquely identified by a symbol, and the sorted message types are represented by the data structure <symbol, set of message instances of the message type identified by the symbol>.
1.4 Script file construction
All message instances, message types, and symbols obtained in step 1.3 are described in a script file, which thus describes the traffic messages of the stateful protocol under test and their corresponding message types and serves as the input to the tester. The script file contains three mapping tables: a symbol mapping table, an input symbol mapping table, and an output symbol mapping table. The symbol mapping table describes the one-to-one mapping between symbols and message types; the input symbol mapping table describes the one-to-many mapping between each input symbol and the message instances of the message type it identifies; the output symbol mapping table describes the one-to-many mapping between each output symbol and the message instances of the message type it identifies.
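The three mapping tables can be pictured as a small in-memory structure. The following is only a minimal sketch: the class name `ScriptTables`, its field names, and the example payloads are hypothetical and do not come from the patent, whose actual script file format is shown in Fig. 2 and is not reproduced here. The `validate` method checks the two constraints stated above: the symbol mapping is one-to-one, and every symbol has at least one message instance.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptTables:
    """Hypothetical in-memory form of the three mapping tables."""
    # Symbol mapping table: symbol -> message type (one-to-one).
    symbol_to_type: dict[str, str] = field(default_factory=dict)
    # Input symbol mapping table: input symbol -> message instances.
    input_instances: dict[str, list[bytes]] = field(default_factory=dict)
    # Output symbol mapping table: output symbol -> message instances.
    output_instances: dict[str, list[bytes]] = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of human-readable problems (empty if consistent)."""
        problems = []
        types = list(self.symbol_to_type.values())
        if len(types) != len(set(types)):
            problems.append("symbol mapping is not one-to-one")
        tables = (("input", self.input_instances),
                  ("output", self.output_instances))
        for table_name, table in tables:
            for symbol, instances in table.items():
                if symbol not in self.symbol_to_type:
                    problems.append(
                        f"{table_name} symbol {symbol} has no message type")
                if not instances:
                    problems.append(
                        f"{table_name} symbol {symbol} has no message instance")
        return problems


# Illustrative content only; the payloads are placeholders, not real traffic.
tables = ScriptTables(
    symbol_to_type={"CH": "Client Hello", "SH": "Server Hello"},
    input_instances={"CH": [b"client-hello-bytes"]},
    output_instances={"SH": [b"server-hello-bytes"]},
)
print(tables.validate())
```

Keeping the input and output tables separate mirrors the split between messages sent by the communication peer and messages sent by the program under test.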
2 State machine learning stage
The fuzz tester first selects input symbols from the script file, maps them to corresponding messages, and fuzzes those messages to generate test cases.
Similarity calculation then determines whether each test case still belongs to its symbol; if so, the test case is sent to the program under test for execution, and otherwise it is corrected.
Finally, the response messages returned after the program under test executes the test cases are collected and mapped to corresponding output symbols through similarity calculation, and model learning is used to construct the minimal deterministic Mealy state machine corresponding to the protocol entity program under test. The stage comprises the following steps:
2.1 Message acquisition
Input symbols are selected at random from the symbol mapping table in the script file. For each selected symbol, the message instances of the message type it identifies are looked up in the input symbol mapping table; if the symbol has several instances, one is chosen at random as the message to be fuzzed.
2.2 Message fuzzing
The obtained message is fuzzed with one of the following operations, which change the attribute values of some of its fields: no processing, null processing, random field mutation, and havoc mutation.
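A minimal sketch of the four fuzzing operations follows. The patent does not define how fields are located or what each operation does at the byte level, so the choices here are assumptions made for illustration: a "field" is a short random byte range, null processing empties the message, and havoc stacks a few random bit flips, insertions, and deletions. All function names are hypothetical.

```python
import random


def no_op(msg: bytes, rng: random.Random) -> bytes:
    """No processing: return the message unchanged."""
    return msg


def null_out(msg: bytes, rng: random.Random) -> bytes:
    """Null processing (assumed here to mean an empty message)."""
    return b""


def random_field_mutation(msg: bytes, rng: random.Random,
                          field_len: int = 2) -> bytes:
    """Overwrite one short byte range (a stand-in for a field) with random bytes."""
    if not msg:
        return msg
    start = rng.randrange(len(msg))
    end = min(len(msg), start + field_len)
    noise = bytes(rng.randrange(256) for _ in range(end - start))
    return msg[:start] + noise + msg[end:]


def havoc(msg: bytes, rng: random.Random, rounds: int = 4) -> bytes:
    """Stack several random bit flips, byte insertions, and byte deletions."""
    data = bytearray(msg)
    for _ in range(rounds):
        op = rng.choice(("flip", "insert", "delete"))
        if op == "flip" and data:
            index = rng.randrange(len(data))
            data[index] ^= 1 << rng.randrange(8)
        elif op == "insert":
            data.insert(rng.randrange(len(data) + 1), rng.randrange(256))
        elif op == "delete" and data:
            del data[rng.randrange(len(data))]
    return bytes(data)


MUTATORS = (no_op, null_out, random_field_mutation, havoc)


def fuzz(msg: bytes, rng: random.Random) -> bytes:
    """Apply one randomly chosen fuzzing operation to the message."""
    return rng.choice(MUTATORS)(msg, rng)


rng = random.Random(0)
print(fuzz(b"client-hello-bytes", rng))
```

Because none of these operators knows the protocol's field layout, a mutated message can drift away from its message type, which is why a correction step is needed afterwards.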
2.3 Fuzzing correction
The fuzzing operation is corrected by similarity calculation. The similarity between the fuzzed test case and every message instance of the message types identified by all input symbols is computed with a formula based on the Levenshtein distance:
$$\mathrm{dis}(MM_{input}, M_{input}) = \frac{\mathrm{lev}(MM_{input}, M_{input})}{\max(\mathrm{len}(MM_{input}), \mathrm{len}(M_{input}))}$$
where $\mathrm{lev}(MM_{input}, M_{input})$ is the minimum number of insertion, substitution, and deletion operations required to convert the test case $MM_{input}$ into the message instance $M_{input}$ it is compared with, $\max()$ returns the larger value, and $\mathrm{len}()$ returns the length of a message. $\mathrm{dis}(MM_{input}, M_{input})$ denotes the dissimilarity of the two messages.
The dissimilarity between the fuzzed test case $MM_{input}$ and every message instance is computed, and the message instance $M_{min\_dis}$ with the smallest dissimilarity is found. If that message instance and the current test case belong to the same input symbol $S_{input}$, correction is complete and the next step follows. Otherwise, fuzzing is performed again and the process repeats until the condition is met.
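As a worked illustration of the dissimilarity measure, take two hypothetical strings rather than real protocol messages: the test case "kitten" and the message instance "sitting". Converting one into the other takes three edits (substitute k with s, substitute e with i, insert g), and the longer string has length 7, so:

$$\mathrm{dis} = \frac{\mathrm{lev}(\text{kitten}, \text{sitting})}{\max(6, 7)} = \frac{3}{7} \approx 0.43$$

The value always lies between 0 (identical messages) and 1 (nothing in common), which makes dissimilarities comparable across message instances of different lengths.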
2.4 Test case execution
The test case corrected in step 2.3 is sent to the protocol entity program under test for execution.
2.5 Output symbol matching
The response message returned after the program has executed the test case in step 2.4 is obtained, and similarity calculation determines its output symbol. The similarity between the response message and every message instance of the message types identified by all output symbols is computed with a formula based on the Levenshtein distance:
$$\mathrm{dis}(MR_{output}, M_{output}) = \frac{\mathrm{lev}(MR_{output}, M_{output})}{\max(\mathrm{len}(MR_{output}), \mathrm{len}(M_{output}))}$$
where $\mathrm{lev}(MR_{output}, M_{output})$ is the minimum number of insertion, substitution, and deletion operations required to convert the response message $MR_{output}$ into the message instance $M_{output}$ it is compared with, $\max()$ returns the larger value, and $\mathrm{len}()$ returns the length of a message. $\mathrm{dis}(MR_{output}, M_{output})$ denotes the dissimilarity of the two messages. The dissimilarity between the response message $MR_{output}$ and every message instance of the message types identified by all output symbols is computed, and the output symbol $S_{output}$ of the message instance with the smallest dissimilarity is taken as the output symbol of $MR_{output}$.
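Output symbol matching amounts to a nearest-neighbour lookup under the normalized Levenshtein dissimilarity. The sketch below assumes messages are byte strings and that the output symbol mapping table is a dictionary from symbol to instances; the function names and the sample payloads are hypothetical.

```python
def levenshtein(a: bytes, b: bytes) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        current = [i]
        for j, y in enumerate(b, 1):
            current.append(min(previous[j] + 1,              # delete x
                               current[j - 1] + 1,           # insert y
                               previous[j - 1] + (x != y)))  # substitute
        previous = current
    return previous[-1]


def dissimilarity(a: bytes, b: bytes) -> float:
    """Levenshtein distance divided by the length of the longer message."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def match_output_symbol(response: bytes, output_table: dict):
    """Return the output symbol whose instance is least dissimilar to the response."""
    best_symbol, best_score = None, float("inf")
    for symbol, instances in output_table.items():
        for instance in instances:
            score = dissimilarity(response, instance)
            if score < best_score:
                best_symbol, best_score = symbol, score
    return best_symbol


# Placeholder payloads standing in for captured response messages.
output_table = {"SH": [b"SERVER_HELLO v1"], "ALERT": [b"ALERT fatal"]}
print(match_output_symbol(b"SERVER_HELLO v2", output_table))
```

Ties are resolved in favour of the first instance encountered; the patent does not say how ties should be broken, so this is an arbitrary choice.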
2.6 State machine learning
The state machine of the program under test is learned with the open-source automata learning framework LearnLib. The learner receives the input symbols sent and the output symbols returned, constructs the minimal deterministic Mealy state machine corresponding to the protocol entity program using Angluin's L* algorithm, and validates the state machine using Chow's W-method. If validation fails, the state machine is reconstructed. Learning iterates until validation passes, at which point the learned state machine is considered to conform to the actual program; learning ends and the result is taken as the final state machine.
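The validate-and-refine loop can be illustrated without LearnLib, which is a Java framework. The sketch below is not the L* algorithm or the W-method: it models a Mealy machine as a transition dictionary and searches for a counterexample by exhaustively comparing a hypothesis against the program under test on all input words up to a length bound, a deliberately simplified stand-in for conformance testing. The class, the function names, and the toy machines are hypothetical.

```python
from itertools import product


class Mealy:
    """Deterministic Mealy machine: (state, input) -> (next state, output)."""

    def __init__(self, initial, transitions):
        self.initial = initial
        self.transitions = transitions

    def run(self, word):
        """Return the output symbols produced for a sequence of input symbols."""
        state, outputs = self.initial, []
        for symbol in word:
            state, output = self.transitions[(state, symbol)]
            outputs.append(output)
        return outputs


def find_counterexample(hypothesis, query_sut, alphabet, max_len):
    """Return the first input word on which hypothesis and SUT disagree, or None."""
    for length in range(1, max_len + 1):
        for word in product(alphabet, repeat=length):
            if hypothesis.run(word) != query_sut(word):
                return list(word)
    return None


ALPHABET = ("CH", "FIN")

# Toy stand-in for the program under test; a real run would reset the
# program and exchange messages over the network for each query.
sut = Mealy("s0", {
    ("s0", "CH"): ("s1", "SH"), ("s0", "FIN"): ("s0", "ALERT"),
    ("s1", "CH"): ("s1", "ALERT"), ("s1", "FIN"): ("s1", "DONE"),
})

# A too-small hypothesis that a learner might propose first.
wrong_hypothesis = Mealy("h0", {
    ("h0", "CH"): ("h0", "SH"), ("h0", "FIN"): ("h0", "ALERT"),
})

print(find_counterexample(wrong_hypothesis, sut.run, ALPHABET, 3))
```

A counterexample such as this would be fed back to the learner to refine the hypothesis; when no counterexample is found within the test budget, the hypothesis is accepted as the result state machine.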
3 Result organization and analysis stage
The state machine obtained in step 2.6 is pruned and tidied so that testers can analyze abnormal path transitions and redundant states. The stage comprises the following steps:
3.1 State machine tidying
The constructed state machine is tidied through manual analysis so that testers can analyze abnormal path transitions and redundant states. The following operations are performed:
3.1.1 Path marking
One complete normal state transition path is marked in the state machine. The path contains all the interaction states of a complete run of the protocol under test, and every state transition on it conforms to the protocol specification.
3.1.2 Redundant path deletion
All state transition paths that cannot reach the final interaction state are deleted.
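Redundant path deletion can be viewed as a backward reachability computation: keep only the transitions whose source and target states can still reach the final interaction state. The sketch below assumes the learned state machine is stored as a dictionary from (state, input symbol) to (next state, output symbol); this representation, the function name, and the toy machine are hypothetical. In the patent this tidying is done through manual analysis, so the code only illustrates the criterion.

```python
def prune_dead_paths(transitions, final_state):
    """Drop every transition that touches a state unable to reach final_state."""
    # Build reverse edges: target state -> set of source states.
    predecessors = {}
    for (source, _), (target, _) in transitions.items():
        predecessors.setdefault(target, set()).add(source)

    # Walk backwards from the final state to collect all "alive" states.
    alive = {final_state}
    stack = [final_state]
    while stack:
        state = stack.pop()
        for previous in predecessors.get(state, ()):
            if previous not in alive:
                alive.add(previous)
                stack.append(previous)

    return {key: value for key, value in transitions.items()
            if key[0] in alive and value[0] in alive}


# Toy state machine: "dead" is a sink from which s2 can never be reached.
learned = {
    ("s0", "CH"): ("s1", "SH"),
    ("s1", "FIN"): ("s2", "DONE"),
    ("s0", "BAD"): ("dead", "ALERT"),
    ("dead", "CH"): ("dead", "ALERT"),
}
print(prune_dead_paths(learned, "s2"))
```

After pruning, only paths that actually complete the interaction remain, which keeps the later manual analysis focused on transitions that matter.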
3.2 State machine analysis
After the above operations, the result state machine contains the normal state transition path together with paths that deviate from it but still complete the communication interaction. Relative to the normal path, some of these paths omit or skip states. These states and paths are analyzed to find potential abnormal path transitions. The following operations are performed:
3.2.1 Conformance analysis
Each path in the state machine that can reach the final state is checked to determine whether its state transitions comply with the specification of the protocol under test.
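To support this check, the paths that reach the final state can be enumerated and compared with the marked normal path. The sketch below lists simple paths (no state visited twice) as sequences of input symbols and reports those that differ from the normal path; deciding whether a reported path really violates the protocol specification remains a manual step. The transition representation, the function names, and the toy handshake are hypothetical.

```python
def paths_to_final(transitions, initial, final_state):
    """List input-symbol sequences of all simple paths from initial to final_state."""
    results = []

    def walk(state, visited, inputs):
        if state == final_state:
            results.append(list(inputs))
            return
        for (source, symbol), (target, _) in sorted(transitions.items()):
            if source == state and target not in visited:
                walk(target, visited | {target}, inputs + [symbol])

    walk(initial, {initial}, [])
    return results


def suspicious_paths(transitions, initial, final_state, normal_path):
    """Return every path to the final state that differs from the normal path."""
    return [path for path in paths_to_final(transitions, initial, final_state)
            if path != normal_path]


# Toy handshake: the s1 --FIN--> s3 transition skips the key exchange step.
machine = {
    ("s0", "CH"): ("s1", "SH"),
    ("s1", "KEX"): ("s2", "OK"),
    ("s2", "FIN"): ("s3", "DONE"),
    ("s1", "FIN"): ("s3", "DONE"),
}
print(suspicious_paths(machine, "s0", "s3", ["CH", "KEX", "FIN"]))
```

A path that completes the interaction while skipping a step of the normal path is exactly the kind of candidate a tester would then trace back to the source code.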
3.2.2 Redundant state analysis
The state machine is searched for redundant or unexpected states, and each such state is analyzed to determine whether it is problematic. If a problematic state or an abnormal state transition is found, the corresponding source code is located and examined for a logic error.
Based on the above method, the invention provides a system for detecting logic vulnerabilities in extensible stateful protocol entity programs. The system comprises a test preparation module, a state machine learning module, and a result organization and analysis module; the state machine learning module comprises a mapper unit, a fuzz testing unit, and a learner unit.
Test preparation module: captures, with the Wireshark tool, the network traffic exchanged between the protocol program under test and its communication peer, and sorts the traffic into different message types so that each message type has at least one message; abstracts each message type into a symbol and stores the symbol in the input symbol alphabet or the output symbol alphabet according to whether the message is sent by the protocol entity program under test; each alphabet contains tuples of the form <symbol, set of messages corresponding to the symbol>; all of this information is recorded in a file named symbolAlphabets.xml.
Mapper unit: reads all tuple information from the symbolAlphabets.xml file, namely all input symbols and their corresponding messages in the input symbol alphabet and all output symbols and their corresponding messages in the output symbol alphabet; randomly selects a series of input symbols to form an input symbol sequence and resolves each input symbol into a corresponding input message according to the input symbol alphabet.
Fuzz testing unit: fuzzes the obtained message with operations including no processing, null processing, random field mutation, and havoc, which change the attribute values of some fields in the message; corrects the fuzzing operation through similarity calculation; sends the corrected message as a test case to the protocol entity program under test and obtains the response message returned after the program executes the test case; the mapper unit then maps the response message to the corresponding output symbol according to the output symbol alphabet.
Learner unit: performs model learning with the open-source automata learning framework LearnLib; the learner unit receives the input symbols sent and the output symbols returned, constructs the minimal deterministic Mealy state machine corresponding to the protocol entity program using model learning, and determines the final result state machine through repeated iterative learning; the final result state machine completely reflects all behaviors of the protocol entity program under test.
Result organization and analysis module: prunes and tidies the result state machine constructed by model learning so that testers can analyze abnormal path transitions and redundant states.
The beneficial effects of the invention are as follows:
1. The invention does not require prior knowledge of the protocol specification. It needs only a script file describing the traffic instances of the protocol under test and their message-type mapping as input, whereas methods such as tls-fuzzer and dtls-fuzzer take protocol specification information (descriptions of message formats, message fields, and so on) as input. No complete protocol specification script has to be written, which greatly reduces the preparation work for testing a new protocol, makes logic vulnerability testing of private protocols with unknown specifications more convenient, and gives good protocol extensibility.
2. The mapping strategy and the fuzzing strategy of the invention are not tied to a specific protocol; instead, test cases are corrected through similarity calculation after mutation, so adding a new protocol requires no changes to the test program code. By contrast, tls-fuzzer and dtls-fuzzer both require detailed protocol-specific code in modules such as the mapping and fuzzing strategies, written according to the protocol specification, so testing a new protocol involves substantial code modification and extension. The method therefore has better protocol extensibility.
3. Compared with CN 109525457A, which discovers abnormal state transition vulnerabilities through protocol reverse analysis or a known protocol specification, the invention uses model learning, so the state machine of the protocol entity program is learned and updated during testing. The learned state machine is more complete and the test results are more accurate.
Drawings
FIG. 1 is a flowchart of the method for detecting logic vulnerabilities in extensible stateful protocol entity programs according to the invention.
FIG. 2 is an example of the script file format used in the test preparation stage of the method.
FIG. 3 is a flowchart of the state machine learning stage of the method.
FIG. 4 is an example of state machine tidying in the result organization and analysis stage of the method.
Detailed Description
The technical scheme of the embodiment of the invention is described in detail below with reference to the drawings.
As shown in FIG. 1, the method for detecting logic vulnerabilities in extensible stateful protocol entity programs comprises three main stages: the test preparation stage, the state machine learning stage, and the result organization and analysis stage.
1 Test preparation stage
The communication traffic of the protocol under test is captured with the Wireshark tool, and the captured messages and their corresponding message types are used as input to the fuzz tester. The stage comprises the following steps:
1.1 Deploy the protocol entity program under test, then start it together with its communication peer (if the program under test is a client, start the corresponding server; otherwise start the corresponding client).
1.2 Start the network packet analysis tool Wireshark and capture the network traffic between the protocol entity program under test and its communication peer.
1.3 First, filter and sort the captured network traffic into different message types, ensuring that each message type has at least one message instance.
Then, message types sent by the protocol entity program under test are classified as output message types, and message types sent by the communication peer are classified as input message types.
Finally, each message type is uniquely identified by a symbol, and the sorted message types are represented by the data structure <symbol, set of message instances of the message type identified by the symbol>.
For example, suppose the entity program under test is a server and a message M has the message type Client Hello. First, because the message is sent by the client and is an input to the entity program under test, its type is classified as an input message type. The message type is then uniquely identified by the symbol CH. Finally, the information is represented with the data structure above, namely <CH, {M}>.
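The classification in this example can be sketched as a short routine that splits captured messages into input and output alphabets according to who sent them. The tuple layout of the captured records, the `SYMBOLS` abbreviation table, and the function name are assumptions made for illustration; the patent performs this sorting on Wireshark captures without prescribing a data format.

```python
from collections import defaultdict

# Hypothetical abbreviations; the patent only gives CH for Client Hello.
SYMBOLS = {"Client Hello": "CH", "Server Hello": "SH"}


def build_alphabets(captured, sut_role):
    """Split captured (sender, message_type, payload) records into alphabets.

    Messages sent by the program under test become output symbols;
    messages sent by its communication peer become input symbols.
    """
    inputs, outputs = defaultdict(set), defaultdict(set)
    for sender, message_type, payload in captured:
        symbol = SYMBOLS[message_type]
        table = outputs if sender == sut_role else inputs
        table[symbol].add(payload)
    return dict(inputs), dict(outputs)


# The program under test is the server, so the client's message M is an input.
captured = [
    ("client", "Client Hello", b"M"),
    ("server", "Server Hello", b"R"),
]
input_alphabet, output_alphabet = build_alphabets(captured, sut_role="server")
print(input_alphabet)
print(output_alphabet)
```

With the server as the program under test, the result contains the tuple <CH, {M}> in the input alphabet, matching the example above.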
1.4 All message instances, message types, and symbols obtained in step 1.3 are described in a script file. FIG. 2 shows an example of the script file format.
When the test preparation stage is complete, a script file describing the traffic messages of the stateful protocol under test and their corresponding message types has been obtained. It contains three mapping tables: the symbol mapping table symbolAlphabets (the one-to-one mapping between symbols and message types), the input symbol mapping table inputSymbolAlphabets (the one-to-many mapping between each input symbol and the message instances of the message type it identifies), and the output symbol mapping table outputSymbolAlphabets (the one-to-many mapping between each output symbol and the message instances of the message type it identifies). The script file is used as the input to the fuzz tester.
2 State machine learning stage
As shown in FIG. 3, the fuzz tester first selects input symbols from the script file, maps them to corresponding messages, and fuzzes those messages to generate test cases.
Similarity calculation then determines whether each test case still belongs to its symbol; if so, the test case is sent to the program under test for execution, and otherwise it is corrected.
Finally, the response messages returned after the program under test executes the test cases are collected and mapped to corresponding output symbols through similarity calculation, and model learning is used to construct the minimal deterministic Mealy state machine corresponding to the protocol entity program under test. The stage comprises the following steps:
2.1 Message acquisition
Input symbols are selected at random from the symbol mapping table symbolAlphabet in the script file. For each selected symbol, the message instances of the message type it identifies are looked up in the input symbol mapping table inputSymbolAlphabet; if there are several instances, one is chosen at random as the message to be fuzzed.
The pseudo code for this process is as follows:
[Pseudocode figures BDA0003999130610000091 and BDA0003999130610000101: image content not available in this text.]
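Since the patent's pseudocode for message acquisition is only available as an image, the following is a reconstruction from the prose description rather than the authors' pseudocode. It assumes the input symbol mapping table is a dictionary from symbol to a list of message instances; the function name, the `sequence_length` parameter, and the sample payloads are hypothetical.

```python
import random


def acquire_messages(input_table, sequence_length, rng):
    """Pick random input symbols and one random message instance for each.

    Returns a list of (symbol, message) pairs to be fuzzed.
    """
    symbols = sorted(input_table)  # sorted for reproducible choices
    chosen = [rng.choice(symbols) for _ in range(sequence_length)]
    # When a symbol has several instances, one is selected at random.
    return [(symbol, rng.choice(input_table[symbol])) for symbol in chosen]


# Placeholder instances standing in for captured messages.
input_table = {
    "CH": [b"client-hello-1", b"client-hello-2"],
    "FIN": [b"finished-1"],
}
rng = random.Random(0)
for symbol, message in acquire_messages(input_table, 3, rng):
    print(symbol, message)
```

Passing the random generator in explicitly makes a test run repeatable, which helps when a suspicious state transition has to be reproduced.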
2.2 Message fuzzing
The obtained message is fuzzed with one of the following operations, which change the attribute values of some of its fields: no processing, null processing, random field mutation, and havoc mutation.
2.3 Fuzzing correction
The above fuzzing operation is corrected using a similarity calculation. The similarity between the test case generated by fuzzing and every message instance corresponding to the message types identified by all input symbols is calculated, using a formula based on the Levenshtein distance:
$$\mathrm{dis}(MM_{input}, M_{input}) = \frac{\mathrm{lev}(MM_{input}, M_{input})}{\max(\mathrm{len}(MM_{input}), \mathrm{len}(M_{input}))}$$
where $\mathrm{lev}(MM_{input}, M_{input})$ is the minimum number of insertion, substitution and deletion operations required to convert the test case $MM_{input}$ into the message instance $M_{input}$ it is compared with, $\max()$ returns the maximum value, and $\mathrm{len}()$ returns the length of a message. $\mathrm{dis}(MM_{input}, M_{input})$ represents the dissimilarity of the two messages.
The dissimilarity between the fuzzed test case $MM_{input}$ and every message instance is calculated, and the message instance $M_{min\_dis}$ with the minimum dissimilarity is found. If this message instance and the current test case belong to the same input symbol $S_{input}$, the correction is finished and the next step is carried out. Otherwise, the fuzzing operation is performed again and the process is repeated until the condition is met.
The pseudo code for this process is given in figure BDA0003999130610000111 (image not reproduced in this text).
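The correction step can be sketched as below. This is an illustration of the normalized Levenshtein dissimilarity and the re-fuzz loop described above, not a transcription of the figure. The helper names, the `fuzz` callback and the `max_attempts` bound (with a fallback to the unmodified message) are assumptions; the text itself repeats until the condition is met.

```python
def levenshtein(a, b):
    """Minimum number of insertions, substitutions and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def dissimilarity(a, b):
    """dis(a, b) = lev(a, b) / max(len(a), len(b)); 0.0 for two empty messages."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def nearest_symbol(message, alphabet):
    """Return (symbol, instance, dis) for the instance closest to message."""
    return min(
        ((sym, inst, dissimilarity(message, inst))
         for sym, insts in alphabet.items() for inst in insts),
        key=lambda entry: entry[2],
    )


def fuzz_with_correction(symbol, message, alphabet, fuzz, max_attempts=100):
    """Re-fuzz until the closest known instance still belongs to `symbol`."""
    for _ in range(max_attempts):
        candidate = fuzz(message)
        if nearest_symbol(candidate, alphabet)[0] == symbol:
            return candidate
    return message  # assumption: fall back to the unmodified message
```

The bound on attempts is a practical safeguard: an aggressive mutation operator could otherwise keep producing test cases that drift to another symbol.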
2.4 Test case execution
The test case corrected in step 2.3 is sent to the protocol entity program under test for execution.
2.5 Output symbol matching
The response message returned after the program has executed the test case in step 2.4 is obtained, and the output symbol corresponding to the response message is determined by similarity calculation. The similarity between the response message and every message instance corresponding to the message types identified by all output symbols is calculated, using a formula based on the Levenshtein distance:
$$\mathrm{dis}(MR_{output}, M_{output}) = \frac{\mathrm{lev}(MR_{output}, M_{output})}{\max(\mathrm{len}(MR_{output}), \mathrm{len}(M_{output}))}$$
where $\mathrm{lev}(MR_{output}, M_{output})$ is the minimum number of insertion, substitution and deletion operations required to convert the response message $MR_{output}$ into the message instance $M_{output}$ it is compared with, $\max()$ returns the maximum value, and $\mathrm{len}()$ returns the length of a message. $\mathrm{dis}(MR_{output}, M_{output})$ represents the dissimilarity of the two messages.
The dissimilarity between the response message $MR_{output}$ and every message instance corresponding to the message types identified by all output symbols is calculated, and the output symbol $S_{output}$ of the message instance with the minimum dissimilarity is taken as the output symbol corresponding to the response message $MR_{output}$.
The pseudo code for this process is given in figure BDA0003999130610000121 (image not reproduced in this text).
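As a worked illustration of the matching rule, consider made-up messages that are not taken from any real protocol. Suppose the response message is "OK 200", and the output symbol table holds the instance "OK 100" under a hypothetical symbol $S_{ok}$ and the instance "ERR" under a hypothetical symbol $S_{err}$. One substitution turns "OK 200" into "OK 100", while "OK 200" and "ERR" share no characters, so six operations (three substitutions and three deletions) are needed:

$$\mathrm{dis}(\texttt{OK 200}, \texttt{OK 100}) = \frac{1}{\max(6, 6)} = \frac{1}{6} \approx 0.167$$

$$\mathrm{dis}(\texttt{OK 200}, \texttt{ERR}) = \frac{6}{\max(6, 3)} = 1$$

The minimum dissimilarity is obtained for "OK 100", so the response would be mapped to $S_{ok}$.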
2.6 State machine learning
The input symbols sent in this round and the output symbols returned are collected, and the state machine of the program under test is learned using the open-source automata learning framework LearnLib. A common model learning algorithm is Angluin's L* algorithm, which uses an observation table to build a complete state machine step by step. Its main idea comprises two steps, the membership query and the equivalence query:
2.6.1 Membership queries. In Angluin's L* model learning, a state is uniquely determined by its input/output behavior. If two states return equal output sequences for all input sequences, they are considered to be the same state. Input sequences are therefore generated continually, and the output response sequences are used to judge whether a new state has been produced. States whose input/output sequences are found to be identical are merged into one state. During learning, all inputs, outputs and corresponding states are recorded in an observation table. Learning iterates until the observation table can no longer be updated (i.e., no new states are found), at which point a hypothesis state machine is generated that fully reflects the input/output transitions between the states in the observation table.
2.6.2 Equivalence queries. The membership query stage of 2.6.1 yields a hypothesis state machine, and it is necessary to verify whether this state machine is consistent with the actual state machine of the protocol entity program; this verification is the equivalence query. The equivalence query is implemented with Chow's W-method built into LearnLib: input symbols are generated continually and sent both to the hypothesis state machine and to the protocol entity program under test, and the response sequences of the two are compared. If an inconsistency is found before the maximum query limit is reached (the limit in this method is 10000, i.e., 10000 input symbols are sent in succession), the current hypothesis state machine is inconsistent with the actual state machine of the protocol. The input sequence that caused the inconsistency is taken as a counterexample and added to the observation table, and the membership queries of 2.6.1 are carried out again to relearn the state machine.
The iterative learning process of 2.6 is repeated until no counterexample is found by the time the query limit is reached, that is, until the output sequences returned by the hypothesis state machine are identical to the response sequences returned by the actual protocol entity program. This indicates that the current hypothesis state machine conforms to the actual state machine; learning then ends and the hypothesis is taken as the final state machine result.
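The bounded equivalence check can be illustrated with a small sketch. Note that this is a random-walk conformance check, a deliberately simpler stand-in for the W-method that the text attributes to LearnLib, and it does not use the LearnLib API. The dictionary encoding of the Mealy machine, the `query_sut` callback (assumed to reset the program under test and return its output symbols) and the `max_length` parameter are assumptions; only the 10000-symbol budget comes from the text.

```python
import random


def run_mealy(transitions, start, inputs):
    """Run a Mealy machine given as {(state, input): (next_state, output)}."""
    state, outputs = start, []
    for symbol in inputs:
        state, out = transitions[(state, symbol)]
        outputs.append(out)
    return outputs


def find_counterexample(hypothesis, start, query_sut, alphabet,
                        max_symbols=10000, max_length=8, rng=None):
    """Compare hypothesis and program on random input words.

    Returns the first input word on which the outputs differ, or None if
    the budget of max_symbols input symbols is spent without a mismatch.
    """
    rng = rng or random.Random()
    sent = 0
    while sent < max_symbols:
        length = rng.randint(1, max_length)
        word = [rng.choice(alphabet) for _ in range(length)]
        sent += length
        if run_mealy(hypothesis, start, word) != query_sut(word):
            return word  # counterexample to feed back into the learner
    return None
```

A returned word plays the role of the counterexample that is added to the observation table before membership queries resume.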
3 Result sorting and analysis stage
The state machine learned in 2.6 is pruned and tidied, and then analyzed manually to find logic vulnerabilities. The stage specifically comprises the following steps:
3.1 State machine arrangement
3.1.1 Path marking
A complete normal state transition path is marked in the state machine. The path contains all complete interaction states of the protocol under test, and each state transition in the path conforms to the protocol specification.
3.1.2 redundant Path deletion
All state transition paths that cannot reach the last interaction state are deleted.
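Redundant path deletion amounts to a backward reachability computation on the learned state machine. The following is a minimal sketch under the assumption that transitions are stored as (source, input, output, target) tuples; the function names and this encoding are hypothetical.

```python
def states_reaching(transitions, final_state):
    """Return the set of states from which final_state can be reached."""
    reach = {final_state}
    changed = True
    while changed:  # fixed-point iteration over the transition list
        changed = False
        for src, _inp, _out, dst in transitions:
            if dst in reach and src not in reach:
                reach.add(src)
                changed = True
    return reach


def prune(transitions, final_state):
    """Drop every transition that lies on a path unable to reach final_state."""
    keep = states_reaching(transitions, final_state)
    return [t for t in transitions if t[0] in keep and t[3] in keep]
```

For example, a dead-end state with only a self-loop is removed together with every transition leading into it.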
3.2 State machine analysis
The analysis specifically comprises the following operations:
3.2.1 Conformance analysis
Each path in the state machine that can reach the final state is checked to determine whether its state transitions comply with the specification of the protocol under test.
3.2.2 Redundant state analysis
The state machine is searched for redundant or unexpected states, and these states are analyzed for problems. If a problematic state or abnormal state transition is found, the corresponding source code is located and analyzed for logic errors.
Taking fig. 4 as an example to illustrate the analysis process of 3.2: through the above operations, the paths s6->s7 and s8->s7, which cannot reach the last interaction state, are deleted, and a normal state transition path s1->s2->s3->s4->s5->s9 is highlighted. The state machine also contains some paths that do not follow the normal state transition path but still complete the communication interaction. Relative to the normal path, these paths may omit or skip some states, and the states involved are analyzed to find potential abnormal path transition problems.
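Candidate paths for this manual analysis can be listed automatically. The sketch below enumerates loop-free paths from the first to the last state of the marked normal path and reports those that deviate from it. It is an aid under assumptions (edges given as (source, target) pairs, hypothetical function names), not the analysis procedure itself, which the text leaves to the tester.

```python
def simple_paths(edges, start, final):
    """Yield every loop-free state path from start to final."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    stack = [(start, [start])]
    while stack:  # iterative depth-first search
        state, path = stack.pop()
        if state == final:
            yield path
            continue
        for nxt in graph.get(state, []):
            if nxt not in path:  # never revisit a state on the same path
                stack.append((nxt, path + [nxt]))


def suspicious_paths(edges, normal_path):
    """Return complete paths that differ from the marked normal path."""
    start, final = normal_path[0], normal_path[-1]
    return [p for p in simple_paths(edges, start, final) if p != normal_path]
```

A path such as s1->s3->s4 in a machine whose normal path is s1->s2->s3->s4 would be reported, since it completes the interaction while skipping s2.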
The foregoing is only a preferred embodiment of the invention, it being noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the invention.

Claims (5)

1. A method for detecting logic vulnerabilities of an extensible stateful protocol entity program, characterized by comprising three stages: a test preparation stage, a state machine learning stage, and a result sorting and analysis stage;
in the test preparation stage, the entity program under test is first deployed locally, and a script file describing the traffic messages of the stateful protocol under test and their corresponding message types is taken as input;
in the state machine learning stage, input symbols are first selected from the script file, the symbols are mapped to corresponding messages, and the messages are fuzzed to generate test cases; whether a test case still belongs to the same symbol is then judged through similarity calculation, and if so, the test case is sent to the program under test for execution, and otherwise it is corrected by fuzzing again; finally, the response messages returned after the program under test executes the test cases are collected and mapped to corresponding output symbols through similarity calculation, and model learning is used to construct the minimal deterministic Mealy state machine corresponding to the protocol entity program under test, namely the result state machine;
and in the result sorting and analysis stage, the learned result state machine is pruned and processed so that testers can analyze abnormal path transitions and redundant states.
2. The method for detecting logic vulnerabilities of an extensible stateful protocol entity program according to claim 1, wherein in the test preparation stage an existing tool is used to capture the communication traffic of the protocol under test, and the messages and their corresponding message types are taken as the input of the tester, specifically comprising the following steps:
2.1 Deployment of the protocol entity program under test
The protocol entity program under test is deployed, and the program and its corresponding communication peer are started;
2.2 Network traffic capture
The network packet analysis tool Wireshark is started, and the network traffic between the protocol entity program under test and its corresponding communication peer is captured;
2.3 Message type classification
The captured network traffic is screened and sorted into different message types, while ensuring that each message type corresponds to at least one message instance;
then, messages sent by the protocol entity program under test are classified as output message types, and messages sent by the communication peer are classified as input message types;
finally, each message type is uniquely identified by a symbol, and the data structure of a sorted message type is represented as <symbol, set of message instances of the message type identified by the symbol>.
2.4 Script file construction
All message instances, message types and symbols mentioned in step 2.3 are described in a script file, forming a script file that describes the traffic messages of the stateful protocol under test and their corresponding message types, and this script file is taken as the input of the tester; the script file contains three mapping tables: a symbol mapping table, an input symbol mapping table, and an output symbol mapping table; the symbol mapping table describes a one-to-one mapping between symbols and message types; the input symbol mapping table describes the one-to-many mapping between each input symbol and the message instances of the message type it identifies; the output symbol mapping table describes the one-to-many mapping between each output symbol and the message instances of the message type it identifies.
3. The method for detecting logic vulnerabilities of an extensible stateful protocol entity program according to claim 2, wherein the state machine learning stage is implemented as follows:
3.1 Message acquisition
Input symbols are randomly selected from the symbol mapping table in the script file; for each selected symbol, the message instances corresponding to the message type identified by the symbol are obtained from the input symbol mapping table in the script file, and if one symbol corresponds to several instances, one instance is selected at random as the message to be fuzzed;
3.2 Message fuzzing
A fuzzing operation is performed on the obtained message to change the attribute values of some of its fields, the operations including no processing, null processing, random field mutation and havoc mutation;
3.3 Fuzzing correction
The fuzzing operation is corrected using similarity calculation; the similarity between the test case generated by fuzzing and every message instance corresponding to the message types identified by all input symbols is calculated, using a formula based on the Levenshtein distance:
$$\mathrm{dis}(MM_{input}, M_{input}) = \frac{\mathrm{lev}(MM_{input}, M_{input})}{\max(\mathrm{len}(MM_{input}), \mathrm{len}(M_{input}))}$$
where $\mathrm{lev}(MM_{input}, M_{input})$ is the minimum number of insertion, substitution and deletion operations required to convert the test case $MM_{input}$ into the message instance $M_{input}$ it is compared with, $\max()$ returns the maximum value, and $\mathrm{len}()$ returns the length of a message; $\mathrm{dis}(MM_{input}, M_{input})$ represents the dissimilarity of the two messages;
the dissimilarity between the fuzzed test case $MM_{input}$ and every message instance is calculated, and the message instance $M_{min\_dis}$ with the minimum dissimilarity is found; if the message instance $M_{min\_dis}$ and the current test case belong to the same input symbol $S_{input}$, the correction is finished and the next step is carried out; otherwise, the fuzzing operation is performed again, and the process is repeated until the condition is met;
3.4 Test case execution
The test case corrected in step 3.3 is sent to the protocol entity program under test for execution;
3.5 Output symbol matching
The response message returned after the program has executed the test case in step 3.4 is obtained, and the output symbol corresponding to the response message is determined by similarity calculation; the similarity between the response message and every message instance corresponding to the message types identified by all output symbols is calculated, using a formula based on the Levenshtein distance:
$$\mathrm{dis}(MR_{output}, M_{output}) = \frac{\mathrm{lev}(MR_{output}, M_{output})}{\max(\mathrm{len}(MR_{output}), \mathrm{len}(M_{output}))}$$
where $\mathrm{lev}(MR_{output}, M_{output})$ is the minimum number of insertion, substitution and deletion operations required to convert the response message $MR_{output}$ into the message instance $M_{output}$ it is compared with, $\max()$ returns the maximum value, and $\mathrm{len}()$ returns the length of a message; $\mathrm{dis}(MR_{output}, M_{output})$ represents the dissimilarity of the two messages;
the dissimilarity between the response message $MR_{output}$ and every message instance corresponding to the message types identified by all output symbols is calculated, and the output symbol $S_{output}$ of the message instance with the minimum dissimilarity is taken as the output symbol corresponding to the response message $MR_{output}$;
3.6 State machine learning
The state machine of the program under test is learned using the open-source automata learning framework LearnLib; the input symbols sent in this round and the output symbols returned are received, the minimal deterministic Mealy state machine corresponding to the protocol entity program is constructed using Angluin's L* algorithm, and the state machine is verified using Chow's W-method; if the verification fails, the state machine is reconstructed; iterative learning is repeated until the verification passes, at which point the currently learned state machine is considered to conform to the actual program, learning ends, and the result is taken as the final state machine result.
4. The method for detecting logic vulnerabilities of an extensible stateful protocol entity program according to claim 1, wherein the result sorting and analysis stage is implemented as follows:
4.1 State machine arrangement
The constructed state machine is arranged so that testers can conveniently analyze abnormal path transitions and redundant states; this specifically comprises the following operations:
4.1.1 Path marking
A complete normal state transition path is marked in the state machine, wherein the path contains all complete interaction states of the protocol under test, and each state transition in the path conforms to the protocol specification;
4.1.2 Redundant path deletion
All state transition paths that cannot reach the last interaction state are deleted;
4.2 State machine analysis
After the state machine arrangement, the resulting state machine contains one normal state transition path and some paths that do not follow the normal state transition path but still complete the communication interaction; relative to the normal path, these paths may omit or skip some states, and the states involved are analyzed to find potential abnormal path transition problems; this specifically comprises the following operations:
4.2.1 Conformance analysis
Each path in the state machine that can reach the final state is checked, and whether the state transitions in these paths comply with the specification of the protocol under test is analyzed;
4.2.2 Redundant state analysis
The state machine is searched for redundant or unexpected states, and these states are analyzed for problems; if a problematic state or abnormal state transition is found, the corresponding source code is located and analyzed for logic errors.
5. A system for detecting logic vulnerabilities of an extensible stateful protocol entity program, characterized by comprising a test preparation module, a state machine learning module and a result sorting and analysis module, wherein the state machine learning module comprises a mapper unit, a fuzz testing unit and a learner unit;
the test preparation module: captures, by means of the Wireshark tool, the network traffic exchanged between the protocol program under test and its corresponding communication program, and classifies the traffic into different message types such that each message type has at least one message; each message type is abstracted into a symbol, which is stored in an input symbol alphabet or an output symbol alphabet according to whether the message is sent by the protocol entity program under test; each alphabet comprises a plurality of tuples of the form <symbol, set of messages corresponding to the symbol>; all of this information is recorded in a file named symbol alphabets xml;
the mapper unit: acquires all tuple information in the symbol alphabets xml file, specifically all input symbols and their corresponding messages in the input symbol alphabet, and all output symbols and their corresponding messages in the output symbol alphabet; randomly selects a series of input symbols from all input symbols to form an input symbol sequence, and resolves each input symbol into a corresponding input message according to the input symbol alphabet;
the fuzz testing unit: performs a fuzzing operation on the obtained message to change the attribute values of some of its fields, the operations including no processing, null processing, random field mutation and havoc mutation; corrects the fuzzing operation through similarity calculation; sends the corrected message as a test case to the protocol entity program under test, and obtains the response message returned after the program executes the test case; the mapper unit maps the response messages to corresponding output symbols according to the output symbol alphabet;
the learner unit: performs model learning using the open-source automata learning framework LearnLib; the learner unit receives the input symbols sent in this round and the output symbols returned, constructs the minimal deterministic Mealy state machine corresponding to the protocol entity program using model learning, and determines the final result state machine through continued iterative learning, wherein the final result state machine completely reflects all behaviors of the protocol entity program under test;
and the result sorting and analysis module: prunes and processes the result state machine constructed by model learning so that testers can analyze abnormal path transitions and redundant states.
CN202211629520.XA 2022-12-14 2022-12-14 Method and system for detecting logic loopholes of extensible stateful protocol entity program Pending CN116010965A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211629520.XA CN116010965A (en) 2022-12-14 2022-12-14 Method and system for detecting logic loopholes of extensible stateful protocol entity program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211629520.XA CN116010965A (en) 2022-12-14 2022-12-14 Method and system for detecting logic loopholes of extensible stateful protocol entity program

Publications (1)

Publication Number Publication Date
CN116010965A true CN116010965A (en) 2023-04-25

Family

ID=86024103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211629520.XA Pending CN116010965A (en) 2022-12-14 2022-12-14 Method and system for detecting logic loopholes of extensible stateful protocol entity program

Country Status (1)

Country Link
CN (1) CN116010965A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination