CN116909884B - Configuration defect-oriented database fuzzy test method - Google Patents

Publication number
CN116909884B
Authority
CN
China
Prior art keywords
configuration
seed
database
module
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310805941.1A
Other languages
Chinese (zh)
Other versions
CN116909884A (en)
Inventor
李姗姗
董威
贾周阳
李解
张元良
陈振邦
陈立前
罗朝鹏
刘浩然
白林枭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN202310805941.1A
Publication of CN116909884A
Application granted
Publication of CN116909884B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/3668: Software testing
    • G06F11/3672: Test management
    • G06F11/3684: Test management for test design, e.g. generating new test cases
    • G06F11/368: Test management for test version control, e.g. updating test cases to a new software version
    • G06F11/3688: Test management for test execution, e.g. scheduling of test suites

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a configuration defect-oriented database fuzz testing method, which aims to solve the low coverage of configuration code achieved by existing test methods. The technical solution is as follows: construct a configuration defect-oriented database fuzz testing system consisting of a configuration taint analysis module, a configuration instrumentation module and a database fuzz testing module; the configuration taint analysis module obtains the influence range of each target configuration and the set of mappings between configurations and program basic blocks; the configuration instrumentation module reads in the source code of the software under test and instruments it according to that mapping set; and the database fuzz testing module performs coverage-guided gray-box fuzz testing on the instrumented software and, for seeds that execute configuration code, applies a configuration-oriented two-stage mutation strategy to penetrate configuration-controlled branch conditions, finally outputting a set of configuration defects. The invention tests the configuration code in the software under test thoroughly, improves coverage, and effectively detects configuration defects in database software.

Description

Configuration defect-oriented database fuzzy test method
Technical Field
The invention relates to the field of configuration defect detection in database software, and in particular to a configuration defect-oriented database fuzz testing method.
Background
Database software is critical infrastructure for modern data-intensive software. It keeps growing in size as businesses and organizations produce more and more data in their day-to-day operations. However, like other large and complex software, database software inevitably contains many vulnerabilities that affect the user experience and may even lead to serious security problems.
Vulnerabilities in database software are frequent, and one of the main causes is that database software has a large number of configuration items with complex processing logic. Software configuration refers to selecting and determining the model, version and quantity of the related hardware and software, planning where the software is deployed and how its components are associated, and setting the values of the software's parameters, based on user requirements and on the functions, structure and main characteristics of the software. As the interface through which database software interacts with the user, configuration controls software behavior and manages the allocation of system resources, and it is highly flexible. Users can set specific values for configuration parameters in a configuration file, which the system loads at startup. However, every change to a configuration parameter value then incurs considerable restart overhead, so modern database software has introduced configuration items that can be modified dynamically at runtime. Although dynamically modifiable configuration provides greater flexibility, it also makes the software more error-prone. Many official database vulnerability reports show that dynamically modifying the value of a configuration item at runtime (often to a valid value) may crash the database or cause functional errors; malicious attackers may exploit these vulnerabilities to steal user information or crash the system, leading to unpredictable economic losses.
To improve the security and reliability of database software, researchers have made considerable efforts and achieved some results. One typical approach to database testing is generation-based testing, which requires developers to write a file containing all the grammar rules of Structured Query Language (SQL, a standard computer language for accessing and manipulating databases) queries; the tool then randomly selects SQL keywords and operators and generates SQL queries according to the grammar rules. Random aggregate functions and subqueries can also be added to make the generated SQL queries more complex and diverse. However, the search space of generation-based methods is huge, and triggering a vulnerability often requires executing a rare program path, so this approach, which resembles brute-force enumeration, has limited effectiveness in detecting database vulnerabilities.
In recent years, a great deal of work has been devoted to coverage-guided gray-box fuzz testing (code coverage measures the proportion of code that automated tests execute). Unlike generation-based testing, fuzz testing relies on random mutation to create new test cases and uses feedback such as code coverage to guide the exploration of the input space. At the start of testing, the fuzzer selects a seed (in the field of fuzz testing, a seed is a test case) from the initial seed corpus as input, randomly mutates the selected input (e.g., flips several bits or bytes) to generate slightly different mutated inputs, and then runs the target program on the mutated inputs while detecting abnormal behaviors such as crashes and assertion failures. During execution, the fuzzer also records code path information, and inputs that trigger new code coverage are preferentially selected for the next round of mutation. However, applying conventional fuzz testing directly to databases faces the following challenge: the SQL queries that a database takes as input are highly structured, and every SQL query undergoes syntactic and semantic checks before it is executed. As soon as any syntactic or semantic error is detected, the database management system stops execution, reports the error and exits. To trigger the deep code logic of the database management system, an SQL query must therefore be syntactically and semantically correct.
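The coverage-guided loop described in this paragraph can be sketched as follows. This is only a minimal illustration under stated assumptions: `toy_target` is a hypothetical stand-in for an instrumented program that reports which branches it executed, and the single-bit-flip mutator and uniform seed selection are the simplest possible choices rather than what any particular fuzzer does.

```python
import random
from typing import Callable, List, Set, Tuple


def toy_target(data: bytes) -> Set[str]:
    """Hypothetical instrumented program: returns the ids of covered branches."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("S"):
        covered.add("b1")
        if len(data) > 1 and data[1] == ord("E"):
            covered.add("b2")
    return covered


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    """Random mutation: flip exactly one bit of the input."""
    if not data:
        return data
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def fuzz(seeds: List[bytes],
         target: Callable[[bytes], Set[str]],
         rounds: int,
         rng: random.Random) -> Tuple[List[bytes], Set[str]]:
    """Keep every mutated input that reaches coverage not seen before."""
    queue = list(seeds)
    seen: Set[str] = set()
    for seed in queue:
        seen |= target(seed)
    for _ in range(rounds):
        parent = rng.choice(queue)          # seed selection (uniform here)
        child = flip_random_bit(parent, rng)
        coverage = target(child)
        if not coverage <= seen:            # new coverage: keep the input
            seen |= coverage
            queue.append(child)
    return queue, seen
```

Calling `fuzz([b"RE"], toy_target, 200, random.Random(1))` returns the grown seed queue and the set of branches reached. Because a flipped bit almost always breaks SQL syntax, a loop like this rarely gets past the parser of a real database, which is the limitation the paragraph points out.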
To overcome the limitations that the random mutation of conventional fuzz testing (random bit flipping and byte substitution in test cases) has in database testing, the article "SQUIRREL: Testing Database Management Systems with Language Validity and Coverage Feedback" published by Rui Zhong et al. at CCS 2020 (a database management system testing method based on language validity and coverage feedback, hereinafter referred to as background art one) converts an SQL query into a custom intermediate representation (IR), performs type-based mutation on the IR, and finally converts the IR back into a new SQL query, thereby ensuring syntactic and semantic correctness. Although SQUIRREL has been very successful in database vulnerability detection, it mutates only SQL queries and does not consider mutating the configuration, so it cannot penetrate complex configuration-controlled branch conditions, and its coverage of configuration code (i.e., the code obtained by taint analysis that takes the initial program variables of configurations as sources) remains low.
On the other hand, some work has studied the detection of defects related to software configuration. It falls mainly into two categories: detection of defects in configuration function code, and detection of defects in the ability to react to configuration faults. The former mainly detects functional or performance defects in configuration-related code through dynamic or static program analysis, but cannot quickly generate large numbers of test cases; the latter evaluates how well the software reacts to faults by extracting the software's configuration constraints and injecting configuration errors, thereby improving the efficiency of configuration fault diagnosis. In fact, not only can a misconfiguration cause a software system to fail, but a valid configuration can also expose hidden software defects. Previous work has mostly focused on system failures caused by incorrect configuration values, whereas configuration-related database vulnerabilities are mostly caused by legal values. The study "Testing Configuration Changes in Context to Prevent Production Failures" published by Sun et al. at OSDI 2020 (hereinafter referred to as background art two) found that 46.3% to 61.9% of configuration defects involved completely valid parameter values, and that the proportion of configuration defects caused by valid parameters was similar to, or even higher than, the proportion caused by invalid parameters.
In summary, configuration defects are frequent in database software. Background art two cannot efficiently test the configuration code in a database or detect configuration defects caused by changing a configuration to another valid value, while background art one has difficulty penetrating complex configuration-controlled branch conditions (i.e., the conditions of control branches in configuration code). How to penetrate the configuration-controlled branch conditions in database software, raise the coverage of configuration code in the database and detect more of the configuration defects hidden in the database is a technical problem of great concern to those skilled in the art.
Disclosure of Invention
The technical problem to be solved by the invention is that existing database fuzz testing methods have difficulty penetrating configuration-controlled branch conditions, which leaves the coverage of configuration code in the database low; to solve it, the invention provides a configuration defect-oriented database fuzz testing method. For the first time, the invention mutates the configuration and the database input jointly, and it proposes a two-stage mutation strategy that first mutates the configuration and then mutates the seed. This improves the ability to cover configuration-related paths, so that the configuration code in the database is tested thoroughly and more hidden configuration defects are detected.
To solve the above technical problem, the technical solution of the invention is as follows: first, construct a configuration defect-oriented database fuzz testing system consisting of a configuration taint analysis module, a configuration instrumentation module and a database fuzz testing module; next, the configuration taint analysis module reads the source code of the software under test and the target configuration set input by the user, obtains the influence ranges of all target configurations and the configuration-to-basic-block mapping set MS, and sends MS to the configuration instrumentation module; the configuration instrumentation module then reads in the source code of the software under test input by the user, receives MS from the configuration taint analysis module, instruments the source code according to MS to obtain the instrumented software S, and sends S to the database fuzz testing module; finally, the database fuzz testing module performs coverage-guided gray-box fuzz testing on S and outputs a set of configuration defects.
The invention comprises the following steps:
First step: construct a configuration defect-oriented database fuzz testing system. The system consists of a configuration taint analysis module, a configuration instrumentation module and a database fuzz testing module.
The configuration taint analysis module is connected to the configuration instrumentation module. It reads the source code of the software under test and the target configuration set input by the user, performs taint analysis on them to obtain the influence ranges of all target configurations in the target configuration set and the configuration-to-basic-block mapping set MS, and sends MS to the configuration instrumentation module.
The configuration instrumentation module is connected to the configuration taint analysis module and the database fuzz testing module. It reads in the source code of the software under test input by the user, receives MS from the configuration taint analysis module, instruments the source code according to MS to obtain the instrumented software S, and sends S to the database fuzz testing module.
The database fuzz testing module is connected to the configuration instrumentation module. It receives the instrumented software S from the configuration instrumentation module, performs coverage-guided gray-box fuzz testing on S, uses a two-stage mutation strategy that first mutates the configuration and then mutates the seed (in the field of fuzz testing, a seed is a test case) to penetrate configuration-controlled branch conditions, detects configuration defects in the database software, and outputs a set of configuration defects.
Second step: the configuration taint analysis module reads the source code of the software under test and the target configuration set input by the user, performs taint analysis on them to obtain the influence ranges of all target configurations in the target configuration set and the configuration-to-basic-block mapping set MS, and sends MS to the configuration instrumentation module. The method is as follows:
2.1 the configuration taint analysis module reads in the source code $S_0$ of the software under test and the target configuration set $C$ input by the user, $C = \{c_1, c_2, \ldots, c_i, \ldots, c_I\}$, where $c_i$ is the $i$-th target configuration in $C$ and is represented as a constant string, $I$ is the total number of target configurations in $C$, and $1 \le i \le I$;
2.2 the configuration taint analysis module uses the ConfMapper algorithm (see page 4 of the article "ConfMapper: Automated Variable Finding for Configuration Items in Source Code" published by Shulin Zhou et al. at QRS-C 2016, a method that automatically discovers the initial variables of configuration parameters from software source code) to analyze $S_0$ and find the initial variables of the configuration parameters in it, obtaining the initial program variables of the $I$ target configurations in $C$. These $I$ initial program variables (configuration variables for short) form the configuration variable set $VC = \{vc_1, vc_2, \ldots, vc_i, \ldots, vc_I\}$, where $vc_i$ is the configuration variable corresponding to configuration $c_i$;
2.3 the configuration taint analysis module uses the DG (dependence graph) algorithm from the article "DG: Analysis and Slicing of LLVM Bitcode" published by Marek Chalupa et al. at ATVA 2020 (a program analysis and slicing method for LLVM bitcode based on constructing dependence graphs) to perform taint analysis on the configuration variables in $VC$, obtaining the set $R$ of influence ranges of the target configurations (i.e., the positions of the taint propagation variables in the source code $S_0$ of the software under test), $R = \{R_1, R_2, \ldots, R_i, \ldots, R_I\}$, where $R_i$ is the influence range of $c_i$, $R_i = \{r_1, r_2, \ldots, r_{n_i}, \ldots, r_{N_i}\}$, $r_{n_i}$ is the position in $S_0$ of the $n_i$-th taint propagation variable of $R_i$, $N_i$ is the number of elements of $R_i$, and $1 \le n_i \le N_i$;
2.4 the configuration taint analysis module locates the program basic block of $S_0$ in which each $r_{n_i}$ in $R_i$ of $R$ resides (a basic block is a sequence of instructions that are executed sequentially; each basic block has only one entry and one exit, the entry being its first instruction and the exit being its last instruction), to obtain the configuration-to-basic-block mapping set $MS = \{MS_1, MS_2, \ldots, MS_i, \ldots, MS_I\}$, where $MS_i$ is the mapping set of $c_i$, $c_i$ has a one-to-many mapping to the elements of $MS_i$, $MS_i = \{ms_1, ms_2, \ldots, ms_{n_i}, \ldots, ms_{N_i}\}$, and $ms_{n_i}$ is the $n_i$-th program basic block of $S_0$ to which $c_i$ maps. The method is as follows:
2.4.1 initialize variable $i = 1$;
2.4.2 initialize variable $n_i = 1$;
2.4.3 initialize $MS_i = \varnothing$;
2.4.4 locate $r_{n_i}$ in $S_0$, find the first instruction Inst of the program basic block in which $r_{n_i}$ resides, and use the file name and line number of Inst as $ms_{n_i}$ to represent that program basic block;
2.4.5 add $ms_{n_i}$ to $MS_i$;
2.4.6 let $n_i = n_i + 1$; if $n_i \le N_i$, go to 2.4.4; if $n_i > N_i$, let $n_i = 1$ and $i = i + 1$, and go to 2.4.7;
2.4.7 if $i \le I$, go to 2.4.3; if $i > I$, all the program basic blocks of $S_0$ to which the target configurations map have been placed into MS; go to 2.5;
2.5 send MS to the configuration instrumentation module.
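The loop in steps 2.4.1 to 2.4.7 can be sketched as below. This is an illustrative sketch only: the taint positions and the basic-block table are assumed to be already available as plain data (in the patent they come from DG and from the compiled program), the names `BasicBlock`, `block_id_for` and `build_ms` are hypothetical, and dropping repeated block ids is an added simplification that the text does not prescribe.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

Location = Tuple[str, int]          # (file name, line number)


class BasicBlock(NamedTuple):
    """Hypothetical record of one basic block's source extent."""
    file: str
    first_line: int                 # line of the block's first instruction
    last_line: int                  # line of the block's last instruction


def block_id_for(loc: Location, blocks: List[BasicBlock]) -> Optional[Location]:
    """Step 2.4.4: identify the enclosing block by its first instruction."""
    file, line = loc
    for block in blocks:
        if block.file == file and block.first_line <= line <= block.last_line:
            return (block.file, block.first_line)
    return None


def build_ms(R: Dict[str, List[Location]],
             blocks: List[BasicBlock]) -> Dict[str, List[Location]]:
    """Steps 2.4.1-2.4.7: map every configuration c_i to its basic blocks."""
    ms: Dict[str, List[Location]] = {}
    for config, positions in R.items():      # outer loop over i
        ms_i: List[Location] = []            # step 2.4.3
        for position in positions:           # inner loop over n_i
            block_id = block_id_for(position, blocks)
            if block_id is not None and block_id not in ms_i:
                ms_i.append(block_id)        # step 2.4.5
        ms[config] = ms_i
    return ms
```

For example, two taint positions of one configuration that fall on lines 12 and 18 of a block spanning lines 10 to 19 both resolve to the same block id `(file, 10)`.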
Third step: the configuration instrumentation module reads in the source code $S_0$ of the software under test input by the user, receives MS from the configuration taint analysis module, instruments $S_0$ according to MS to obtain the instrumented software S, and sends S to the database fuzz testing module. The method is as follows:
3.1 the configuration instrumentation module instruments $S_0$ according to MS, so that during fuzz testing the database fuzz testing module can tell whether the execution path of a seed contains a program basic block to which $c_i$ maps. The method is as follows:
3.1.1 initialize variable $i = 1$;
3.1.2 initialize variable $n_i = 1$;
3.1.3 use the ModulePass tool of the LLVM (Low Level Virtual Machine) framework (version 10.0.0 or later; the same applies to all later references to the LLVM framework) to analyze the source code $S_0$ of the software under test and obtain the position $loc_{n_i}$ of $ms_{n_i}$ in $S_0$;
3.1.4 use the IRBuilder interface of the LLVM framework to insert at $loc_{n_i}$ a Store instruction that stores the value of $c_i$ into shared memory, referred to as the value store instruction (in this way, the information that a seed passes through $c_i$ can be obtained while S runs);
3.1.5 let $n_i = n_i + 1$; if $n_i \le N_i$, go to 3.1.3; if $n_i > N_i$, go to 3.1.6;
3.1.6 let $i = i + 1$; if $i \le I$, go to 3.1.2; if $i > I$, the instrumentation of $S_0$ according to MS is complete and the instrumented software S has been obtained; go to 3.2;
3.2 send the instrumented software S to the database fuzz testing module.
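The effect of the inserted value store instruction can be illustrated with a small simulation. This is not LLVM code: `ConfigHitMap` is a hypothetical stand-in for the shared-memory region, and `run_instrumented` is a toy target whose two "basic blocks" are assumed to be mapped to the configurations `autocommit` and `connect_timeout`.

```python
from typing import List


class ConfigHitMap:
    """Stand-in for the shared memory written by the inserted Store instructions."""

    def __init__(self, configs: List[str]) -> None:
        self._index = {name: i for i, name in enumerate(configs)}
        self._shm = bytearray(len(configs))

    def store(self, config: str) -> None:
        """What the instrumentation does on entry to a configuration basic block."""
        self._shm[self._index[config]] = 1

    def reset(self) -> None:
        """Clear the map before each seed is executed."""
        for i in range(len(self._shm)):
            self._shm[i] = 0

    def hit_configs(self) -> List[str]:
        """What the fuzzer reads back after the run."""
        return [name for name, i in self._index.items() if self._shm[i]]


def run_instrumented(query: str, hits: ConfigHitMap) -> List[str]:
    """Toy target: each branch body models one instrumented basic block."""
    hits.reset()
    text = query.upper()
    if "COMMIT" in text:
        hits.store("autocommit")        # block mapped to autocommit
    if "SLEEP" in text:
        hits.store("connect_timeout")   # block mapped to connect_timeout
    return hits.hit_configs()
```

After each execution the fuzzer can read the hit map to learn which target configurations the seed's execution path passed through, which is the information that steps 4.2.8 and 4.4.1 rely on.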
Fourth step: the database fuzz testing module performs coverage-guided gray-box fuzz testing on S, uses a two-stage mutation strategy that first mutates the configuration and then mutates the seed to penetrate configuration-controlled branch conditions, detects configuration defects in the database software, and outputs a set of configuration defects. The method is as follows:
4.1 the database fuzz testing module receives S from the configuration instrumentation module;
4.2 the database fuzz testing module generates the initial seed queue $Q$ from the initial seed pool $SP$ provided by the user (the initial seed pool contains the initial test cases provided by the user and is stored as files), $SP = \{sp_1, sp_2, \ldots, sp_j, \ldots, sp_J\}$, where $sp_j$ is the $j$-th seed in the initial seed pool, $J$ ($0 < J \le 100$) is the number of initial seeds in $SP$, and $1 \le j \le J$. The method is as follows:
4.2.1 initialize variable $j = 1$;
4.2.2 initialize the seed queue $Q$ as empty;
4.2.3 send seed $sp_j$ to the instrumented software S for execution;
4.2.4 obtain the user-defined maximum size MaxSize and the user-defined maximum duration MaxT from the configuration file provided by the user;
4.2.5 judge whether the file size of $sp_j$ exceeds MaxSize; if so, $sp_j$ would slow down seed execution during fuzz testing, so let $j = j + 1$ and go to 4.2.3; if not, go to 4.2.6;
4.2.6 judge whether the execution time of seed $sp_j$ exceeds MaxT; if so, $sp_j$ would cause S to hang, so let $j = j + 1$ and go to 4.2.3; if not, go to 4.2.7;
4.2.7 judge whether seed $sp_j$ causes the software S to crash (judged from the SIGKILL signal issued by the operating system, which indicates that the program was terminated); if so, $sp_j$ poses a potential safety hazard, so let $j = j + 1$ and go to 4.2.3; if not, $sp_j$ is a safe seed; go to 4.2.8;
4.2.8 judge whether the execution path of seed $sp_j$ passes through a configuration basic block in MS (this information is available because the configuration instrumentation module instrumented $S_0$ according to MS); if not, $sp_j$ is unrelated to configuration, so let $j = j + 1$ and go to 4.2.3; if so, $sp_j$ is related to configuration; go to 4.2.9;
4.2.9 let the $z$-th seed of the seed queue $Q$ be $q_z = sp_j$, and add $q_z$ to the initial seed queue $Q$;
4.2.10 if $j = J$, all $J$ seeds in the initial seed pool have been processed and the seed queue $Q = \{q_1, q_2, \ldots, q_z, \ldots, q_Z\}$ has been obtained, where $q_z$ is the $z$-th seed in $Q$ and is a seed whose execution path passes through a configuration basic block (i.e., a program basic block) in MS, $Z$ is the number of seeds in $Q$, $1 \le z \le Z$, and $Z \le J$; go to 4.3. Otherwise, let $j = j + 1$ and go to 4.2.3;
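The seed filtering of steps 4.2.1 to 4.2.10 can be sketched as follows. The sketch assumes a hypothetical `run` callback that executes one seed on the instrumented software and reports its elapsed time, whether it crashed, and which configurations it hit; `RunResult` and `build_seed_queue` are illustrative names. It also checks the file size before running the seed, a small reordering of the steps that avoids executing oversized seeds.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class RunResult:
    """Hypothetical summary of one execution of the instrumented software S."""
    elapsed: float                       # execution time in seconds
    crashed: bool                        # True if S was killed by a signal
    hit_configs: Set[str] = field(default_factory=set)


def build_seed_queue(seed_pool: List[bytes],
                     run: Callable[[bytes], RunResult],
                     max_size: int,
                     max_t: float) -> List[bytes]:
    """Keep only small, fast, non-crashing seeds that reach configuration code."""
    queue: List[bytes] = []
    for seed in seed_pool:
        if len(seed) > max_size:         # 4.2.5: too large, slows fuzzing down
            continue
        result = run(seed)               # 4.2.3: execute on S
        if result.elapsed > max_t:       # 4.2.6: would hang S
            continue
        if result.crashed:               # 4.2.7: unsafe initial seed
            continue
        if not result.hit_configs:       # 4.2.8: unrelated to configuration
            continue
        queue.append(seed)               # 4.2.9
    return queue
```

Only seeds that pass all four checks enter the queue, so every later mutation starts from an input already known to execute configuration code.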
4.3 the database fuzz testing module uses the seed selection strategy from the article "SQUIRREL: Testing Database Management Systems with Language Validity and Coverage Feedback" published by Rui Zhong et al. at CCS 2020 to select a seed from the seed queue $Q$ (denote it $q_z$) for fuzz testing;
4.4 the database fuzz testing module mutates $q_z$ with the configuration-oriented two-stage mutation strategy to generate new mutated seeds, and feeds the new seeds to S for execution in order to penetrate configuration-controlled branch conditions. The method is as follows:
4.4.1 from the configuration instrumentation information (the configuration instrumentation module instrumented $S_0$ according to MS to obtain S, so the information about which $c_i$ a seed passes through is available while S runs), obtain the set $C'$ of configurations that $q_z$ passes through, $C' = \{c_1', c_2', \ldots, c_k', \ldots, c_K'\}$, where $c_k'$ is the $k$-th configuration that seed $q_z$ passes through, $K$ is the number of configurations in $C'$, and $1 \le k \le K$;
4.4.2 mutate the configurations in $C'$ in turn, and add every seed $q_z'$ that triggers new coverage to the seed queue $Q$, as follows:
4.4.2.1 initialize variable $k = 1$;
4.4.2.2 extract the syntactic type, value range and syntactic format of configuration $c_k'$ using the SPEX algorithm from the article "Do Not Blame Users for Misconfigurations" published by Tianyin Xu et al. at SOSP 2013. The four syntactic types extracted are: Boolean (bool), enumeration (enum), string (string) and numeric (int). Extracting the value range of $c_k'$ yields the set $V_k$ of all possible values of $c_k'$, $V_k = \{v_1, v_2, \ldots, v_m, \ldots, v_M\}$, where $v_m$ is the $m$-th legal value of configuration $c_k'$ and $M$ is the number of legal values generated. The syntactic formats extracted for $c_k'$ include: path format, IP (Internet Protocol address) format, URL (Uniform Resource Locator) format and ID (identity code) format;
4.4.2.3 generate the set $V_k'$ of values to be tested for $c_k'$ according to the syntactic type of $c_k'$, as follows:
4.4.2.3.1 if $c_k'$ is of Boolean type (bool), let $V_k' = \{0, 1\}$ and go to 4.4.2.4;
4.4.2.3.2 if $c_k'$ is of enumeration type (enum), let $V_k' = V_k$ and go to 4.4.2.4;
4.4.2.3.3 if $c_k'$ is of string type (string), let $V_k' = \{sv_1, sv_2\}$, where $sv_1$ and $sv_2$ are 2 random values that satisfy the syntactic format of $c_k'$, and go to 4.4.2.4;
4.4.2.3.4 if $c_k'$ is of numeric type (int), sample the values of $c_k'$ as follows: denote the minimum value of $c_k'$ extracted by the SPEX algorithm in step 4.4.2.2 as Min and the maximum value as Max, let $V_k' = \{Min, 10 \cdot Min, 10^2 \cdot Min, 10^{-2} \cdot Max, 10^{-1} \cdot Max, Max\}$, and go to 4.4.2.4;
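The four cases of step 4.4.2.3 can be sketched as one function. This is an illustrative sketch under assumptions: the type, value range and bounds are passed in directly (in the patent they come from SPEX), `make_string` is a hypothetical format-aware generator supplied by the caller, integer division is used for the $10^{-2} \cdot Max$ and $10^{-1} \cdot Max$ samples, and duplicates are dropped because $V_k'$ is a set.

```python
from typing import Callable, List, Optional, Sequence, Union

Value = Union[int, str]


def candidate_values(kind: str,
                     legal_values: Sequence[Value] = (),
                     minimum: Optional[int] = None,
                     maximum: Optional[int] = None,
                     make_string: Optional[Callable[[], str]] = None) -> List[Value]:
    """Return the values to test (V_k') for one configuration item."""
    if kind == "bool":
        values: List[Value] = [0, 1]
    elif kind == "enum":
        values = list(legal_values)                  # V_k' = V_k
    elif kind == "string":
        if make_string is None:
            raise ValueError("string configurations need a format-aware generator")
        values = [make_string(), make_string()]      # two random valid strings
    elif kind == "int":
        if minimum is None or maximum is None:
            raise ValueError("numeric configurations need a value range")
        values = [minimum, 10 * minimum, 100 * minimum,
                  maximum // 100, maximum // 10, maximum]
    else:
        raise ValueError(f"unknown syntactic type: {kind}")
    return list(dict.fromkeys(values))               # drop duplicates, keep order
```

For a numeric configuration with range 2 to 100000, for instance, the six samples cluster near both ends of the range, so the boundary regions of the value range are always exercised.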
4.4.2.4 the database fuzz testing module uses the SQL keyword provided by the database under test to mutate the value of $c_k'$ according to $V_k'$, obtaining the mutated configuration $c_k''$. The keyword for modifying configuration differs between databases: for MySQL, PostgreSQL and MariaDB it is Set, for Redis it is Config Set, and for SQLite it is Pragma. For example, the value of the MySQL configuration autocommit is mutated with the instruction Set autocommit=false; the value of the PostgreSQL configuration archive_timeout with Set archive_timeout=1000; the value of the MariaDB configuration connect_timeout with Set connect_timeout=2000; the value of the Redis configuration slowlog-max-len with Config Set slowlog-max-len 10086; and the value of the SQLite configuration analysis_limit with Pragma analysis_limit=18;
4.4.2.5 splice the mutated configuration $c_k''$ with $q_z$ to form a new seed $q_z'$ and feed $q_z'$ to S for execution. If $q_z'$ covers a new code segment in S (one not covered by previous seeds), add $q_z'$ to the seed queue $Q$; if executing $q_z'$ causes S to crash (judged from the SIGKILL signal issued by the operating system) or hang (judged by whether the seed's execution time exceeds MaxT), add $q_z'$ to the configuration defect set CS;
4.4.2.6 if $k = K$, the configuration mutation of seed $q_z$ is complete; go to 4.4.3. Otherwise, let $k = k + 1$ and go to 4.4.2.2;
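Steps 4.4.2.4 and 4.4.2.5 can be sketched as simple string construction. The statement shapes follow the examples given in step 4.4.2.4; the function names are hypothetical, and placing the configuration statement before the original seed is an assumption, since the text says only that the two are spliced.

```python
from typing import List, Sequence, Union

Value = Union[int, str]

SET_STYLE = ("mysql", "postgresql", "mariadb")


def config_statement(dbms: str, name: str, value: Value) -> str:
    """Build the statement that changes one configuration at runtime."""
    if dbms in SET_STYLE:
        return f"SET {name}={value};"
    if dbms == "redis":
        return f"CONFIG SET {name} {value}"
    if dbms == "sqlite":
        return f"PRAGMA {name}={value};"
    raise ValueError(f"unsupported database: {dbms}")


def config_mutants(dbms: str, name: str,
                   values: Sequence[Value], seed: str) -> List[str]:
    """One new seed q_z' per value to test: mutated configuration + original seed."""
    return [config_statement(dbms, name, value) + "\n" + seed for value in values]
```

Each returned string is a candidate $q_z'$: it is executed on S, kept in the queue if it reaches new coverage, and recorded in CS if it crashes or hangs the target.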
4.4.3 extract the set $PV$ of numeric fields of seed $q_z$, $PV = \{pv_1, pv_2, \ldots, pv_u, \ldots, pv_U\}$, where $pv_u$ is the $u$-th numeric field in $PV$ and $U$ is the number of numeric fields in $PV$. Mutate the numeric fields in $PV$ one by one while monitoring how the program variable $P$ in the configuration-controlled branch changes, establish the mapping between each of $pv_1, pv_2, \ldots, pv_u, \ldots, pv_U$ and the variable $P$, and in the subsequent steps mutate $pv_1, pv_2, \ldots, pv_u, \ldots, pv_U$ in a directed manner to obtain the mutated new seed $q_z''$. The method is as follows:
4.4.3.1 initialize variable $u = 1$;
4.4.3.2 randomly mutate the value of $pv_u$ using the rand() function of the C language to obtain the mutated numeric field $pv_u'$, and replace $pv_u$ in $q_z$ with $pv_u'$ to obtain the new seed $q_z''$ mutated from $q_z$;
4.4.3.3 obtain the runtime value $P'$ of the program variable $P$ through instrumentation, as follows:
4.4.3.3.1 use the ModulePass tool of the LLVM framework to analyze the source code $S_0$ of the software under test and obtain the position $loc_p$ of $P$ in $S_0$;
4.4.3.3.2 use the IRBuilder interface of the LLVM framework to insert at $loc_p$ a Store instruction that stores the value $P'$ of $P$ into shared memory (in this way, $P'$ can be obtained while S runs);
4.4.3.4 feed $q_z''$ to S for execution. If $q_z''$ covers a new code segment in S (one not covered by previous seeds), add $q_z''$ to the seed queue $Q$; if executing $q_z''$ causes S to crash (judged from the SIGKILL signal issued by the operating system) or hang (judged by whether the seed's execution time exceeds MaxT), add $q_z''$ to CS. If $P'$ changes during the execution of S, $pv_u$ is related to $P$: establish the mapping $\langle pv_u, P \rangle$ and go to 4.4.3.5; if $P'$ does not change during the execution of S, go directly to 4.4.3.5;
4.4.3.5 if the mapping was successfully established, randomly mutate the value of $pv_u$ ten times using the rand() function of the C language (with the goal of penetrating the configuration-controlled branch condition), then go to 4.4.3.6; if the mapping was not established, let $u = u + 1$ and go to 4.4.3.2;
4.4.3.6 if $u > U$ and the test time is less than 24 hours, the numeric fields of seed $q_z$ have all been mutated but the test time does not yet meet the user's requirement; go to 4.3 and continue selecting the next seed from the seed queue $Q$ for fuzz testing. If $u > U$ and the test time is greater than or equal to 24 hours, the test time meets the user's requirement; output CS (CS stores the set of test cases that trigger configuration defects, i.e., the configuration defect set). Otherwise, let $u = u + 1$, go to 4.4.3.2, and continue randomly mutating the next numeric field in $PV$.
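The second mutation stage (steps 4.4.3.1 to 4.4.3.6) can be sketched as follows. The sketch makes several assumptions: numeric fields are found with a regular expression over the SQL text, `observe_p` is a hypothetical callback that runs a seed and returns the observed runtime value $P'$ of the branch variable, Python's `random` module replaces C's rand(), and every mutation is forced to differ from the current value so that a probe is never a no-op. Coverage and crash bookkeeping are left out.

```python
import random
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int]
NUMBER = re.compile(r"\b\d+\b")     # standalone integers, not digits in names


def numeric_fields(seed: str) -> List[Span]:
    """Extract PV: the character spans of the numeric fields in the seed."""
    return [m.span() for m in NUMBER.finditer(seed)]


def mutate_field(seed: str, span: Span, rng: random.Random,
                 max_value: int = 2**31 - 1) -> str:
    """Replace one numeric field with a random value different from the current one."""
    start, end = span
    current = int(seed[start:end])
    value = rng.randrange(max_value)
    if value >= current:
        value += 1                  # skip the current value
    return seed[:start] + str(value) + seed[end:]


def map_fields_to_variable(seed: str,
                           observe_p: Callable[[str], int],
                           rng: random.Random,
                           extra_rounds: int = 10) -> Tuple[List[Span], List[str]]:
    """Probe each field once; fields that move P get extra directed mutations."""
    baseline = observe_p(seed)
    related: List[Span] = []
    mutants: List[str] = []
    for span in numeric_fields(seed):                    # loop over u
        probe = mutate_field(seed, span, rng)            # step 4.4.3.2
        mutants.append(probe)
        if observe_p(probe) != baseline:                 # step 4.4.3.4: <pv_u, P>
            related.append(span)
            mutants.extend(mutate_field(seed, span, rng)
                           for _ in range(extra_rounds))  # step 4.4.3.5
    return related, mutants
```

With a seed such as `SELECT a FROM t1 WHERE id = 7 LIMIT 5` and a callback that reports the parsed LIMIT value, only the last field is found to influence $P$, so the ten extra mutations are spent on that field alone.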
Compared with the prior art, the invention has the following beneficial effects:
1. The invention can thoroughly test the configuration code in the software under test. Using the method, configuration defect testing was carried out on three popular open-source database systems, MySQL, PostgreSQL and SQLite, and the coverage of configuration-related basic blocks was measured. The invention covers 48.2% of the configuration-related basic blocks, whereas the background art covers only 35.4%.
2. The invention detected 9 configuration defects in database software. These were reported to the respective software communities for confirmation by the developers, which helps prevent the potential economic and user losses that software configuration defects may cause. Of these, 3 configuration defects were detected in MySQL and 6 in SQLite.
Drawings
FIG. 1 is a general flow chart of the present invention;
FIG. 2 is a logical block diagram of the configuration defect oriented database fuzz testing system constructed in the first step of the present invention.
Detailed Description
The present invention will be described below with reference to the accompanying drawings.
As shown in fig. 1, the present invention includes the steps of:
In the first step, a configuration defect oriented database fuzz testing system is constructed. As shown in fig. 2, the system consists of a configuration taint analysis module, a configuration instrumentation module and a database fuzz testing module.
The configuration taint analysis module is connected with the configuration instrumentation module. It reads the source code of the software under test and the target configuration set input by the user, performs taint analysis on them to obtain the influence ranges of all target configurations in the target configuration set and the set MS of mappings between configurations and program basic blocks, and sends MS to the configuration instrumentation module.
The configuration instrumentation module is connected with the configuration taint analysis module and the database fuzz testing module. It reads the source code of the software under test input by the user, receives MS from the configuration taint analysis module, instruments the source code according to MS to obtain the instrumented software S, and sends S to the database fuzz testing module.
The database fuzz testing module is connected with the configuration instrumentation module. It receives the instrumented software S from the configuration instrumentation module, performs coverage-guided gray-box fuzz testing on S, penetrates configuration control branch conditions using a two-stage mutation strategy that first mutates the configuration and then mutates the seed, detects configuration defects of the database software, and outputs the configuration defect set.
In the second step, the configuration taint analysis module reads the source code of the software under test and the target configuration set input by the user, performs taint analysis on them to obtain the influence ranges of all target configurations in the target configuration set and the set MS of mappings between configurations and program basic blocks, and sends MS to the configuration instrumentation module, as follows:
2.1 the configuration taint analysis module reads in the source code S_0 of the software under test and the target configuration set C input by the user, C = {c_1, c_2, ..., c_i, ..., c_I}, where c_i is the i-th target configuration in C, c_i is a constant character string, I is the total number of target configurations in C, and 1 ≤ i ≤ I;
2.2 the configuration taint analysis module uses the ConfMapper algorithm (see Shulin Zhou et al., "ConfMapper: Automated Variable Finding for Configuration Items in Source Code", published at QRS-C 2016, page 4; a method that automatically finds the initial variables of configuration parameters in software source code) to analyze S_0 and find the initial variables of the configuration parameters, obtaining the I initial program variables of the target configurations in C. These I initial program variables (configuration variables for short) form the configuration variable set VC = {vc_1, vc_2, ..., vc_i, ..., vc_I}, where vc_i is the configuration variable corresponding to configuration c_i;
2.3 the configuration taint analysis module uses the DG algorithm (dependence graphs constructed for program analysis) from the article "DG: Analysis and Slicing of LLVM Bitcode" published by Marek Chalupa et al. at ATVA 2020 to perform taint analysis on the configuration variables in VC, obtaining the influence range of the target configurations, that is, the set R of positions of the taint propagation variables in the source code S_0, R = {R_1, R_2, ..., R_i, ..., R_I}, where R_i is the influence range of c_i, R_i = {r_1, r_2, ..., r_{n_i}, ..., r_{N_i}}, r_{n_i} is the position in S_0 of the n_i-th taint propagation variable in R_i, N_i is the number of elements in R_i, and 1 ≤ n_i ≤ N_i;
2.4 the configuration taint analysis module locates the program basic block of S_0 in which each r_{n_i} in R_i of R resides (a program basic block is a sequence of instructions executed sequentially; each basic block has exactly one entry and one exit, the entry being its first instruction and the exit being its last instruction), obtaining the set MS of mappings between configurations and program basic blocks, MS = {MS_1, MS_2, ..., MS_i, ..., MS_I}, where MS_i is the mapping set of c_i, c_i has a one-to-many mapping with the elements of MS_i, MS_i = {ms_1, ms_2, ..., ms_{n_i}, ..., ms_{N_i}}, and ms_{n_i} is the n_i-th program basic block of S_0 mapped by c_i, as follows:
2.4.1 initializing variable i=1;
2.4.2 initializing variable n i =1;
2.4.3 initialize MS_i to the empty set;
2.4.4 locate r_{n_i} in S_0, find the first instruction Inst of the program basic block containing r_{n_i}, and let ms_{n_i}, the file name and line number of Inst, represent that program basic block;
2.4.5 add ms_{n_i} to MS_i;
2.4.6 let n_i = n_i + 1; if n_i ≤ N_i, go to 2.4.4; if n_i > N_i, let n_i = 1, let i = i + 1, and go to 2.4.7;
2.4.7 if i ≤ I, go to 2.4.3; if i > I, the program basic blocks of S_0 mapped by every c_i have all been placed in MS, so go to 2.5;
2.5 send MS to the configuration instrumentation module.
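The loop of steps 2.4.1 to 2.4.7 can be illustrated by the following minimal Python sketch. It only shows the data flow from taint positions to basic block identifiers. The table BLOCK_STARTS, the helper block_entry_of and the file name vacuum.c are hypothetical stand-ins for the information that the LLVM-based analysis would provide, and recording each block only once is an assumption of the sketch.

```python
import bisect

# Hypothetical table: for each source file, the line numbers at which
# program basic blocks start (the line of each block's first instruction).
BLOCK_STARTS = {"vacuum.c": [10, 25, 40]}

def block_entry_of(location):
    """Return (file, line) of the first instruction of the block containing location."""
    file_name, line = location
    starts = BLOCK_STARTS[file_name]
    index = bisect.bisect_right(starts, line) - 1
    if index < 0:
        raise ValueError(f"{location} precedes every known basic block")
    return (file_name, starts[index])

def build_config_block_map(taint_positions, entry_of):
    """Build MS: each configuration c_i maps to its list MS_i of basic blocks."""
    ms = {}
    for config, positions in taint_positions.items():   # loop over i
        blocks = []                                      # MS_i starts empty (step 2.4.3)
        for position in positions:                       # loop over n_i
            entry = entry_of(position)                   # step 2.4.4
            if entry not in blocks:                      # assumption: record a block once
                blocks.append(entry)                     # step 2.4.5
        ms[config] = blocks
    return ms

# R_i for one target configuration: positions of taint propagation variables.
taint_positions = {"autocommit": [("vacuum.c", 12), ("vacuum.c", 30), ("vacuum.c", 26)]}
ms = build_config_block_map(taint_positions, block_entry_of)
```

Here ms maps the configuration name to two block identifiers, because the positions on lines 30 and 26 fall in the same basic block.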
In the third step, the configuration instrumentation module reads the source code S_0 of the software under test input by the user, receives MS from the configuration taint analysis module, instruments S_0 according to MS to obtain the instrumented software under test S, and sends S to the database fuzz testing module, as follows:
3.1 the configuration instrumentation module instruments S_0 according to MS, so that during fuzz testing the database fuzz testing module can perceive whether the execution path of a seed contains a program basic block mapped by c_i, as follows:
3.1.1 initializing variable i=1;
3.1.2 initializing variable n i =1;
3.1.3 use the ModulePass tool of the LLVM (Low Level Virtual Machine) framework (version 10.0.0 or above; the same version requirement applies to every later mention of the LLVM framework) to analyze the source code S_0 of the software under test and obtain the position loc_{n_i} of ms_{n_i} in S_0;
3.1.4 use the IRBuilder interface of the LLVM framework to insert, at loc_{n_i}, an instruction that stores the value of c_i into shared memory, called the value Store instruction for short (in this way, the information that a seed passes through c_i can be obtained while S is running);
3.1.5 let n_i = n_i + 1; if n_i ≤ N_i, go to 3.1.3; if n_i > N_i, go to 3.1.6;
3.1.6 let i = i + 1; if i ≤ I, go to 3.1.2; if i > I, S_0 has been fully instrumented according to MS, yielding the instrumented software under test S, so go to 3.2;
3.2 send the instrumented software under test S to the database fuzz testing module.
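The runtime effect of the value Store instructions inserted in step 3.1 can be simulated by the following minimal Python sketch. It does not use LLVM. The class ConfigTrace is a hypothetical stand-in for the shared memory region, the function instrument stands in for the inserted probes, and basic blocks are identified by hypothetical (file, line) pairs.

```python
class ConfigTrace:
    """Stand-in for the shared memory written by the inserted Store instructions."""

    def __init__(self, num_configs):
        self.hits = bytearray(num_configs)    # one byte per target configuration

    def store(self, config_index):
        self.hits[config_index] = 1           # effect of one value Store instruction

    def passed_configs(self):
        return [i for i, hit in enumerate(self.hits) if hit]

def instrument(ms):
    """Map each basic block to the configurations whose probe is placed in it."""
    probes = {}
    for config_index, blocks in enumerate(ms):            # loop over i
        for block in blocks:                               # loop over n_i
            probes.setdefault(block, []).append(config_index)
    return probes

def run_path(path, probes, trace):
    """Simulate executing a seed whose execution path visits the given blocks."""
    for block in path:
        for config_index in probes.get(block, ()):
            trace.store(config_index)

# MS for two target configurations, given as lists of (file, line) block identifiers.
ms = [[("a.c", 10)], [("a.c", 25), ("b.c", 3)]]
probes = instrument(ms)
trace = ConfigTrace(num_configs=len(ms))
run_path([("a.c", 1), ("b.c", 3)], probes, trace)
passed = trace.passed_configs()
```

After the simulated run, passed contains only the index of the second configuration, since the path visits one of its mapped blocks and none of the first configuration's blocks.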
In the fourth step, the database fuzz testing module performs coverage-guided gray-box fuzz testing on S, penetrates configuration control branch conditions using a two-stage mutation strategy that first mutates the configuration and then mutates the seed, detects configuration defects of the database software, and outputs the configuration defect set, as follows:
4.1 the database fuzz testing module receives S from the configuration instrumentation module;
4.2 the database fuzz testing module generates an initial seed queue Q from the initial seed library SP provided by the user (the initial seed library contains the initial test cases provided by the user, stored as files), SP = {sp_1, sp_2, ..., sp_j, ..., sp_J}, where sp_j is the j-th seed in the initial seed library, J (0 < J ≤ 100) is the number of initial seeds in SP, and 1 ≤ j ≤ J, as follows:
4.2.1 initializing variable j=1;
4.2.2 initializing seed queue
4.2.3 send seed sp_j to the instrumented software under test S;
4.2.4 obtain the user-defined maximum size MaxSize and the user-defined maximum duration MaxT from the configuration file provided by the user;
4.2.5 judge whether the file size of sp_j exceeds MaxSize; if so, sp_j would slow down seed execution during fuzz testing, so let j=j+1 and go to 4.2.3; if not, go to 4.2.6;
4.2.6 judge whether the execution time of seed sp_j exceeds MaxT; if so, sp_j would cause S to hang, so let j=j+1 and go to 4.2.3; if not, go to 4.2.7;
4.2.7 judge whether seed sp_j causes the software S to crash (judged from the SIGKILL signal issued by the operating system, where SIGKILL indicates that the process is terminated); if so, sp_j poses a potential safety hazard, so let j=j+1 and go to 4.2.3; if not, sp_j is a safe seed, so go to 4.2.8;
4.2.8 judge whether the execution path of seed sp_j passes through a configuration basic block in MS (this information is available after the configuration instrumentation module has instrumented S_0 according to MS); if not, sp_j is unrelated to configuration, so let j=j+1 and go to 4.2.3; if so, sp_j is related to configuration, so go to 4.2.9;
4.2.9 let the z-th seed in the seed queue Q be q_z = sp_j, and add q_z to the initial seed queue Q;
4.2.10 if j=J, all J seeds in the initial seed library have been processed, yielding the seed queue Q = {q_1, q_2, ..., q_z, ..., q_Z}, where q_z is the z-th seed in Q and is a seed whose execution path passes through a configuration basic block (that is, a program basic block) in MS, Z is the number of seeds in Q, 1 ≤ z ≤ Z, and Z ≤ J; go to 4.3; otherwise, let j=j+1 and go to 4.2.3;
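The filtering of steps 4.2.3 to 4.2.10 can be summarized by the following minimal Python sketch. The type RunResult and the executor fake_run are hypothetical. In the method described above, the corresponding information comes from running the instrumented software S and from the operating system signal.

```python
from dataclasses import dataclass, field

@dataclass
class RunResult:
    """Hypothetical summary of one execution of a seed on S."""
    elapsed: float                     # execution time in seconds
    crashed: bool                      # True if S crashed
    passed_configs: list = field(default_factory=list)   # configurations whose blocks were hit

def build_seed_queue(seeds, run_seed, max_size, max_t):
    """Keep only seeds that are small, fast, safe and configuration-related."""
    queue = []
    for seed in seeds:                       # loop over j
        result = run_seed(seed)              # step 4.2.3
        if len(seed) > max_size:             # step 4.2.5: too large
            continue
        if result.elapsed > max_t:           # step 4.2.6: would hang S
            continue
        if result.crashed:                   # step 4.2.7: crashes S
            continue
        if not result.passed_configs:        # step 4.2.8: unrelated to configuration
            continue
        queue.append(seed)                   # step 4.2.9
    return queue

def fake_run(seed):
    """Hypothetical executor used only to make the sketch runnable."""
    return RunResult(
        elapsed=5.0 if b"SLEEP" in seed else 0.1,
        crashed=b"CRASH" in seed,
        passed_configs=["autocommit"] if b"COMMIT" in seed else [],
    )

seeds = [
    b"BEGIN; COMMIT;",
    b"COMMIT;" * 100,
    b"SELECT SLEEP(9); COMMIT;",
    b"CRASH; COMMIT;",
    b"SELECT 1;",
]
queue = build_seed_queue(seeds, fake_run, max_size=64, max_t=1.0)
```

Only the first seed survives: the others are rejected for size, execution time, crash and lack of configuration relevance, in that order.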
4.3 the database fuzz testing module uses the seed selection strategy from the article "SQUIRREL: Testing Database Management Systems with Language Validity and Coverage Feedback" published by Rui Zhong et al. at CCS 2020 to select one seed from the seed queue Q (let the selected seed be q_z) for fuzz testing;
4.4 the database fuzz testing module mutates q_z using the configuration-oriented two-stage mutation strategy to generate mutated new seeds, and feeds the mutated new seeds to S for execution in order to penetrate configuration control branch conditions, as follows:
4.4.1 according to the configuration instrumentation information (the configuration instrumentation module instruments S_0 according to MS to obtain S, so the configurations c_i that a seed passes through can be obtained while S is running), obtain the set C' of configurations passed by q_z, C' = {c_1', c_2', ..., c_k', ..., c_K'}, where c_k' is the k-th configuration passed by seed q_z, K is the number of configurations in C', and 1 ≤ k ≤ K;
4.4.2 mutate the configurations in C' in turn, and add each seed q_z' that triggers new coverage to the seed queue Q, as follows:
4.4.2.1 initializing variable k=1;
4.4.2.2 use the Spex algorithm from the article "Do Not Blame Users for Misconfigurations" published by Tianyin Xu et al. at SOSP 2013 to extract the syntax type, value range and syntax format of configuration c_k'. The four syntax types extracted are: Boolean (bool), enumeration (enum), string (string) and numeric (int). The extracted value range of c_k' is the set V_k of all possible values of c_k', V_k = {v_1, v_2, ..., v_m, ..., v_M}, where v_m is the m-th legal value of configuration c_k' and M is the number of legal values generated. The extracted syntax formats of c_k' include: path format, IP (Internet Protocol address) format, URL (Uniform Resource Locator) format and ID (identity code) format;
4.4.2.3 generate the set V_k' of values to be tested for c_k' according to the syntax type of c_k', as follows:
4.4.2.3.1 if c_k' is of Boolean type (bool), let V_k' = {0, 1} and go to 4.4.2.4;
4.4.2.3.2 if c_k' is of enumeration type (enum), let V_k' = V_k and go to 4.4.2.4;
4.4.2.3.3 if c_k' is of string type (string), let V_k' = {sv_1, sv_2}, where sv_1 and sv_2 are 2 random values that satisfy the syntax format of c_k', and go to 4.4.2.4;
4.4.2.3.4 if c_k' is of numeric type (int), sample the values of c_k' as follows: let Min be the minimum value and Max be the maximum value of c_k' extracted by the Spex algorithm in step 4.4.2.2, let V_k' = {Min, 10·Min, 10^2·Min, 10^-2·Max, 10^-1·Max, Max}, and go to 4.4.2.4;
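The four cases of step 4.4.2.3 can be illustrated by the following minimal Python sketch. The function names, the generator random_path (which produces values in path format) and the use of integer division for the 10^-2·Max and 10^-1·Max samples are assumptions of the sketch and are not prescribed by the method.

```python
import random
import string

def random_path(rng):
    """Hypothetical generator for one random value in path format."""
    name = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return "/tmp/" + name

def candidate_values(syntax_type, value_range=(), make_string=random_path, rng=None):
    """Return V_k', the values to be tested, for one configuration."""
    rng = rng or random.Random()
    if syntax_type == "bool":
        return [0, 1]
    if syntax_type == "enum":
        return list(value_range)                          # V_k' = V_k
    if syntax_type == "string":
        return [make_string(rng), make_string(rng)]       # two random values in the format
    if syntax_type == "int":
        low, high = min(value_range), max(value_range)    # Min and Max
        # Integer division keeps the samples integral (an assumption of this sketch).
        return [low, 10 * low, 100 * low, high // 100, high // 10, high]
    raise ValueError(f"unknown syntax type: {syntax_type}")

bool_values = candidate_values("bool")
int_values = candidate_values("int", (1, 10000))
path_values = candidate_values("string", rng=random.Random(0))
```

For a numeric configuration with Min = 1 and Max = 10000, the sampled set concentrates on both ends of the range, which is where boundary-related defects are most likely to be triggered.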
4.4.2.4 the database fuzz testing module uses the SQL keyword provided by the database under test to mutate the value of c_k' according to V_k', obtaining the mutated configuration c_k". The keyword for modifying a configuration differs between databases: the keyword for MySQL, PostgreSQL and MariaDB is Set, the keyword for Redis is Config Set, and the keyword for SQLite is Pragma. For example, the value of the MySQL configuration autocommit is mutated by the instruction Set autocommit=false; the value of the PostgreSQL configuration archive_timeout is mutated by the instruction Set archive_timeout=1000; the value of the MariaDB configuration connect_timeout is mutated by the instruction Set connect_timeout=2000; the value of the Redis configuration slowlog-max-len is mutated by the instruction Config Set slowlog-max-len 10086; the value of the SQLite configuration analysis_limit is mutated by the instruction Pragma analysis_limit=18;
4.4.2.5 splice the mutated configuration c_k" with q_z to form a new seed q_z', and feed q_z' to S for execution. If q_z' covers a new code segment in S (a code segment not covered by any previous seed), add q_z' to the seed queue Q; if executing q_z' causes S to crash (judged from the SIGKILL signal issued by the operating system) or hang (judged by whether the seed execution time exceeds MaxT), add q_z' to CS;
4.4.2.6 if k=K, the configuration mutation of seed q_z is finished, so go to 4.4.3; otherwise, let k=k+1 and go to 4.4.2.2;
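The configuration mutation stage of steps 4.4.2.4 to 4.4.2.6 can be illustrated by the following minimal Python sketch. The statement templates follow the keywords listed in step 4.4.2.4. The type Outcome, the executor fake_execute and the choice to place the configuration statement in front of the seed are assumptions of the sketch.

```python
from collections import namedtuple

# Statement templates built from the keywords listed in step 4.4.2.4.
SET_TEMPLATES = {
    "mysql": "SET {name}={value};",
    "postgresql": "SET {name}={value};",
    "mariadb": "SET {name}={value};",
    "redis": "CONFIG SET {name} {value}",
    "sqlite": "PRAGMA {name}={value};",
}

# Hypothetical result of executing one seed on the instrumented software S.
Outcome = namedtuple("Outcome", "new_coverage crashed hung")

def config_statement(dbms, name, value):
    """Build the statement that sets one configuration to one value."""
    return SET_TEMPLATES[dbms].format(name=name, value=value)

def config_stage(dbms, seed, candidates, execute, queue, defects):
    """Splice each mutated configuration with the seed and record the outcome."""
    for name, values in candidates.items():                  # loop over k
        for value in values:                                  # values in V_k'
            # Assumption: the configuration statement is placed before the seed.
            new_seed = config_statement(dbms, name, value) + "\n" + seed
            outcome = execute(new_seed)
            if outcome.new_coverage:
                queue.append(new_seed)                        # add q_z' to Q
            if outcome.crashed or outcome.hung:
                defects.append(new_seed)                      # add q_z' to CS

def fake_execute(seed):
    """Hypothetical executor used only to make the sketch runnable."""
    return Outcome(new_coverage="=18;" in seed, crashed="=0;" in seed, hung=False)

queue, defects = [], []
config_stage("sqlite", "SELECT 1;", {"analysis_limit": [0, 18]}, fake_execute, queue, defects)
redis_statement = config_statement("redis", "slowlog-max-len", 10086)
```

Keeping the per-database templates in one table means that supporting another database only requires adding one entry.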
4.4.3 extract the set PV of numeric fields of seed q_z, PV = {pv_1, pv_2, ..., pv_u, ..., pv_U}, where pv_u is the u-th numeric field in PV and U is the number of numeric fields in PV. Mutate the numeric fields in PV in turn and monitor the changes of the program variable P in the configuration control branch, establish the mappings between the numeric fields pv_1, pv_2, ..., pv_u, ..., pv_U and the variable P, and in the subsequent process mutate pv_1, pv_2, ..., pv_u, ..., pv_U directionally to obtain the mutated new seed q_z", as follows:
4.4.3.1 initialize variable u=1;
4.4.3.2 use the rand() function of the C language to randomly mutate the value of pv_u, obtaining the mutated numeric field pv_u', and replace the numeric field pv_u in q_z with pv_u' to obtain the mutated new seed q_z";
4.4.3.3 obtain the runtime value P' of the program variable P through instrumentation, as follows:
4.4.3.3.1 use the ModulePass tool of the LLVM framework to analyze the source code S_0 of the software under test and obtain the position loc_p of P in S_0;
4.4.3.3.2 use the IRBuilder interface of the LLVM framework to insert, at loc_p, a Store instruction that stores the value P' of P into shared memory (in this way the value P' can be obtained while S is running);
4.4.3.4 feed q_z" to S for execution. If q_z" covers a new code segment in S (a code segment not covered by any previous seed), add q_z" to the seed queue Q; if executing q_z" causes S to crash (judged from the SIGKILL signal issued by the operating system) or hang (judged by whether the seed execution time exceeds MaxT), add q_z" to CS; if P' changes during the execution of S, pv_u is associated with P, so establish the mapping <pv_u, P> and go to 4.4.3.5; if P' does not change during the execution of S, go directly to 4.4.3.5;
4.4.3.5 if the mapping was successfully established, use the rand() function of the C language to randomly mutate the value of pv_u ten more times, then go to 4.4.3.6; if the mapping was not established, let u=u+1 and go to 4.4.3.2;
4.4.3.6 if u>U and the test time is less than 24 hours, all numeric fields of seed q_z have been mutated but the test time does not yet meet the user's requirement, so go to 4.3 and continue by selecting the next seed from the seed queue Q for fuzz testing; if u>U and the test time is greater than or equal to 24 hours, the test time meets the user's requirement, so output CS (CS stores the set of test cases that trigger configuration defects, that is, the configuration defect set); otherwise, let u=u+1, go to 4.4.3.2, and continue to randomly mutate the next numeric field in PV.
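The seed mutation stage of step 4.4.3 can be illustrated by the following minimal Python sketch. It covers only the mapping logic: coverage and crash bookkeeping and the 24-hour time limit are omitted. The observer fake_observe is a hypothetical stand-in for reading P' from shared memory, numeric fields are found with a simple regular expression, and a change of P' is detected by comparison with a run of the unmutated seed. All of these are assumptions of the sketch. Python's random module is used in place of the C rand() function.

```python
import random
import re

NUMBER = re.compile(r"\d+")    # assumption: a numeric field is a run of digits

def replace_field(seed, u, value):
    """Return the seed with its u-th numeric field (0-based) replaced by value."""
    match = list(NUMBER.finditer(seed))[u]
    return seed[:match.start()] + str(value) + seed[match.end():]

def numeric_stage(seed, observe_p, rng, extra_rounds=10):
    """Mutate each numeric field once; mutate fields linked to P ten more times."""
    baseline = observe_p(seed)               # P' for the unmutated seed
    linked, executed = [], []
    for u in range(len(NUMBER.findall(seed))):
        mutant = replace_field(seed, u, rng.randrange(100, 2**31))
        executed.append(mutant)
        if observe_p(mutant) != baseline:    # P' changed: establish <pv_u, P>
            linked.append(u)
            for _ in range(extra_rounds):    # directional mutation of pv_u
                extra = replace_field(seed, u, rng.randrange(100, 2**31))
                observe_p(extra)
                executed.append(extra)
    return linked, executed

def fake_observe(seed):
    """Hypothetical observer: P depends only on the first numeric field."""
    return int(NUMBER.search(seed).group())

linked, executed = numeric_stage(
    "SELECT * FROM t LIMIT 5 OFFSET 7;", fake_observe, random.Random(0)
)
```

In this example only the first numeric field influences P, so it alone receives the ten additional mutations, while the second field is mutated once and then skipped.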
To verify the effect of the invention in detecting configuration defects, a comparison experiment between the invention and Squirrel from the background art (Squirrel is the tool designed in "SQUIRREL: Testing Database Management Systems with Language Validity and Coverage Feedback" published by Rui Zhong et al. at CCS 2020) was carried out on a computer with an 8-core Intel Core i7-9700K and 32 GB of memory. The operating system was Ubuntu 18.04 with kernel version 5.8.0, the software environment was LLVM 10.0.0 + Python 3.8, and the main coding language was C++. Three database systems, MySQL, PostgreSQL and SQLite, were chosen as the target software for evaluation. Because the database configuration defect detected by the invention is a new kind of defect for which no dedicated detection technique exists at present, the invention was compared with Squirrel (the first background art), the leading technique for database defect detection. The results are shown in Table 1. The experiments show that, for the same running time, the invention detects more configuration defects than the first background art: the invention detected 9 database configuration defects, whereas the first background art detected none, so both the detection capability and the detection efficiency of the invention are higher than those of the first background art.
Table 1. Comparison of the configuration defect detection capability of the present invention and the first background art

Claims (9)

1. A configuration defect oriented database fuzz testing method, characterized by comprising the following steps:
in the first step, a configuration defect oriented database fuzz testing system is constructed, consisting of a configuration taint analysis module, a configuration instrumentation module and a database fuzz testing module;
the configuration taint analysis module is connected with the configuration instrumentation module;
the configuration instrumentation module is connected with the configuration taint analysis module and the database fuzz testing module;
the database fuzz testing module is connected with the configuration instrumentation module;
in the second step, the configuration taint analysis module reads the source code of the software under test and the target configuration set input by the user, performs taint analysis on them to obtain the influence ranges of all target configurations in the target configuration set and the set MS of mappings between configurations and program basic blocks, and sends MS to the configuration instrumentation module, as follows:
2.1 the configuration taint analysis module reads in the source code S_0 of the software under test and the target configuration set C input by the user, C = {c_1, c_2, ..., c_i, ..., c_I}, where c_i is the i-th target configuration in C, c_i is a constant character string, I is the total number of target configurations in C, and 1 ≤ i ≤ I;
2.2 the configuration taint analysis module analyzes S_0 using the ConfMapper algorithm and finds the initial variables of the configuration parameters in S_0, obtaining the I initial program variables of the target configurations in C; these I initial program variables, namely the I configuration variables, form the configuration variable set VC = {vc_1, vc_2, ..., vc_i, ..., vc_I}, where vc_i is the configuration variable corresponding to target configuration c_i;
2.3 the configuration taint analysis module uses the DG algorithm to perform taint analysis on the configuration variables in VC, obtaining the influence range of the target configurations, that is, the set R of positions of the taint propagation variables in the source code S_0 of the software under test, R = {R_1, R_2, ..., R_i, ..., R_I}, where R_i is the influence range of c_i, R_i = {r_1, r_2, ..., r_{n_i}, ..., r_{N_i}}, r_{n_i} is the position in S_0 of the n_i-th taint propagation variable in R_i, N_i is the number of elements in R_i, and 1 ≤ n_i ≤ N_i;
2.4 the configuration taint analysis module locates the program basic block of S_0 in which each r_{n_i} in R_i of R resides, obtaining the set MS of mappings between configurations and program basic blocks, MS = {MS_1, MS_2, ..., MS_i, ..., MS_I}, where MS_i is the mapping set of c_i, c_i has a one-to-many mapping with the elements of MS_i, MS_i = {ms_1, ms_2, ..., ms_{n_i}, ..., ms_{N_i}}, and ms_{n_i} is the n_i-th program basic block of S_0 mapped by c_i;
2.5 send MS to the configuration instrumentation module;
in the third step, the configuration instrumentation module reads the source code S_0 of the software under test input by the user, receives MS from the configuration taint analysis module, instruments S_0 according to MS to obtain the instrumented software under test S, and sends S to the database fuzz testing module, as follows:
3.1 the configuration instrumentation module instruments S_0 according to MS, so that during fuzz testing the database fuzz testing module can perceive whether the execution path of a seed contains a program basic block mapped by c_i, as follows:
3.1.1 initializing variable i=1;
3.1.2 initializing variable n i =1;
3.1.3 use the ModulePass tool of the LLVM framework to analyze the source code S_0 of the software under test and obtain the position loc_{n_i} of ms_{n_i} in S_0;
3.1.4 use the IRBuilder interface of the LLVM framework to insert, at loc_{n_i}, an instruction that stores the value of c_i into shared memory, called the value Store instruction for short;
3.1.5 let n_i = n_i + 1; if n_i ≤ N_i, go to 3.1.3; if n_i > N_i, go to 3.1.6;
3.1.6 let i = i + 1; if i ≤ I, go to 3.1.2; if i > I, S_0 has been fully instrumented according to MS, yielding the instrumented software under test S, so go to 3.2;
3.2 send the instrumented software under test S to the database fuzz testing module;
in the fourth step, the database fuzz testing module performs coverage-guided gray-box fuzz testing on S, penetrates configuration control branch conditions using a two-stage mutation strategy that first mutates the configuration and then mutates the seed, detects configuration defects of the database software, and outputs the configuration defect set, as follows:
4.1 the database fuzz testing module receives S from the configuration instrumentation module;
4.2 the database fuzz testing module generates an initial seed queue Q from the initial seed library SP provided by the user, SP = {sp_1, sp_2, ..., sp_j, ..., sp_J}, where sp_j is the j-th seed in the initial seed library, J is the number of initial seeds in SP, 1 ≤ j ≤ J, and J is a positive integer, as follows:
4.2.1 initializing variable j=1;
4.2.2 initializing seed queue
4.2.3 send seed sp_j to the instrumented software under test S;
4.2.4 obtain the user-defined maximum size MaxSize and the user-defined maximum duration MaxT from the configuration file provided by the user;
4.2.5 judge whether the file size of sp_j exceeds MaxSize; if so, sp_j would slow down seed execution during fuzz testing, so let j=j+1 and go to 4.2.3; if not, go to 4.2.6;
4.2.6 judge whether the execution time of seed sp_j exceeds MaxT; if so, sp_j would cause S to hang, so let j=j+1 and go to 4.2.3; if not, go to 4.2.7;
4.2.7 judge whether seed sp_j causes the software S to crash; if so, sp_j poses a potential safety hazard, so let j=j+1 and go to 4.2.3; if not, sp_j is a safe seed, so go to 4.2.8;
4.2.8 judge whether the execution path of seed sp_j passes through a configuration basic block in MS; if not, sp_j is unrelated to configuration, so let j=j+1 and go to 4.2.3; if so, sp_j is related to configuration, so go to 4.2.9;
4.2.9 let the z-th seed in the seed queue Q be q_z = sp_j, and add q_z to the seed queue Q;
4.2.10 if j=J, all J seeds in the initial seed library have been processed, yielding the seed queue Q = {q_1, q_2, ..., q_z, ..., q_Z}, where q_z is the z-th seed in Q and is a seed whose execution path passes through a configuration basic block in MS, Z is the number of seeds in Q, 1 ≤ z ≤ Z, and Z ≤ J; go to 4.3; otherwise, let j=j+1 and go to 4.2.3;
4.3 the database fuzz testing module uses a seed selection strategy to select one seed from the seed queue Q for fuzz testing, and the selected seed is denoted q_z;
4.4 the database fuzz testing module mutates q_z using the configuration-oriented two-stage mutation strategy to generate mutated new seeds, and sends the mutated new seeds to S for execution in order to penetrate configuration control branch conditions, as follows:
4.4.1 according to the configuration instrumentation information, obtain the set C' of configurations passed by q_z, C' = {c_1', c_2', ..., c_k', ..., c_K'}, where c_k' is the k-th configuration passed by seed q_z, K is the number of configurations in C', and 1 ≤ k ≤ K;
4.4.2 mutate the configurations in C' in turn, and add each seed q_z' that triggers new coverage to the seed queue Q, as follows:
4.4.2.1 initializing variable k=1;
4.4.2.2 use the Spex algorithm to extract the syntax type, value range and syntax format of configuration c_k'; the four syntax types extracted are: Boolean type bool, enumeration type enum, string type string and numeric type int; the extracted value range of c_k' is the set V_k of all possible values of c_k', V_k = {v_1, v_2, ..., v_m, ..., v_M}, where v_m is the m-th legal value of configuration c_k' and M is the number of legal values generated; the extracted syntax formats of c_k' include: path format, IP format, URL format and ID format;
4.4.2.3 generate the set V_k' of values to be tested for c_k' according to the syntax type of c_k', as follows:
4.4.2.3.1 if c_k' is of Boolean type, let V_k' = {0, 1} and go to 4.4.2.4;
4.4.2.3.2 if c_k' is of enumeration type, let V_k' = V_k and go to 4.4.2.4;
4.4.2.3.3 if c_k' is of string type, let V_k' = {sv_1, sv_2}, where sv_1 and sv_2 are 2 random values that satisfy the syntax format of c_k', and go to 4.4.2.4;
4.4.2.3.4 if c_k' is of numeric type, sample the values of c_k' to obtain the sampled V_k', and go to 4.4.2.4;
4.4.2.4 the database fuzz testing module uses the SQL keyword provided by the database under test to mutate the value of c_k' according to V_k', obtaining the mutated configuration c_k";
4.4.2.5 splice the mutated configuration c_k" with q_z to form a new seed q_z', and send q_z' to S for execution; if q_z' covers a new code segment in S, that is, a code segment not covered by any previous seed, add q_z' to the seed queue Q; if executing q_z' causes S to crash or hang, add q_z' to the configuration defect set CS;
4.4.2.6 if k=K, the configuration mutation of seed q_z is finished, so go to 4.4.3; otherwise, let k=k+1 and go to 4.4.2.2;
4.4.3 extract the set PV of numeric fields of seed q_z, PV = {pv_1, pv_2, ..., pv_u, ..., pv_U}, where pv_u is the u-th numeric field in PV and U is the number of numeric fields in PV; mutate the numeric fields in PV in turn and monitor the changes of the program variable P in the configuration control branch, establish the mappings between the numeric fields pv_1, pv_2, ..., pv_u, ..., pv_U and the variable P, and in the subsequent process mutate pv_1, pv_2, ..., pv_u, ..., pv_U directionally to obtain the mutated new seed q_z", as follows:
4.4.3.1 initializing variable u=1;
4.4.3.2 use the rand() function of the C language to randomly mutate the value of pv_u, obtaining the mutated numeric field pv_u', and replace the numeric field pv_u in q_z with pv_u' to obtain the mutated new seed q_z";
4.4.3.3 obtain the runtime value P' of the program variable P through instrumentation;
4.4.3.4 send q_z" to S for execution; if q_z" covers a new code segment in S, that is, a code segment not covered by any previous seed, add q_z" to the seed queue Q; if executing q_z" causes S to crash or hang, add q_z" to the configuration defect set CS; if P' changes during the execution of S, pv_u is associated with P, so establish the mapping <pv_u, P> and go to 4.4.3.5; if P' does not change during the execution of S, go directly to 4.4.3.5;
4.4.3.5 if the mapping was successfully established, use the rand() function of the C language to randomly mutate the value of pv_u ten more times, then go to 4.4.3.6; if the mapping was not established, let u=u+1 and go to 4.4.3.2;
4.4.3.6 if u>U and the test time is less than 24 hours, all numeric fields of seed q_z have been mutated but the test time does not yet meet the user's requirement, so go to 4.3 and continue by selecting the next seed from the seed queue Q for fuzz testing; if u>U and the test time is greater than or equal to 24 hours, the test time meets the user's requirement, so output CS, which stores the set of test cases that trigger configuration defects, that is, the configuration defect set; otherwise, let u=u+1, go to 4.4.3.2, and continue to randomly mutate the next numeric field in PV.
2. The configuration defect-oriented database fuzzy test method according to claim 1, wherein the method by which the configuration spot analysis module, in step 2.4, locates the position in S_0 of each r_{n_i} in R_i to obtain the set MS of mappings between configurations and program basic blocks is as follows:
2.4.1 Initialize the variable i = 1;
2.4.2 Initialize the variable n_i = 1;
2.4.3 Initialize MS_i to the empty set;
2.4.4 Locate the position of r_{n_i} in S_0, find the first instruction Inst of the program basic block that contains r_{n_i}, and use the file name and line number of Inst, denoted ms_{n_i}, to represent that program basic block;
2.4.5 Add ms_{n_i} to MS_i;
2.4.6 Let n_i = n_i + 1; if n_i ≤ N_i, go to 2.4.4; if n_i > N_i, let n_i = 1 and i = i + 1, and go to 2.4.7;
2.4.7 If i ≤ I, go to 2.4.3; if i > I, the program basic blocks of S_0 mapped to each c_i have all been put into MS, and the procedure ends.
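The nested loop over configurations and their reference sites can be sketched as follows. The sketch assumes, purely for illustration, that each source file comes with a sorted list of the line numbers at which basic blocks start (`block_leaders`), so that the block containing a site is the one with the last start line at or before it. In an actual LLVM pass one would instead ask the instruction for its parent basic block. All names here are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Set, Tuple

Site = Tuple[str, int]  # (file name, line number)


def build_mapping_sets(
    reference_sites: Dict[str, List[Site]],
    block_leaders: Dict[str, List[int]],
) -> Dict[str, Set[Site]]:
    """For each configuration c_i, collect MS_i: the (file, line) of the
    first instruction of every basic block containing one of its sites."""
    mapping: Dict[str, Set[Site]] = {}
    for config, sites in reference_sites.items():     # loop over i
        ms_i: Set[Site] = set()                       # fresh MS_i for c_i
        for file_name, line in sites:                 # loop over n_i
            leaders = block_leaders[file_name]        # sorted ascending
            idx = bisect_right(leaders, line) - 1     # last leader <= line
            if idx >= 0:
                ms_i.add((file_name, leaders[idx]))   # ms_{n_i} joins MS_i
        mapping[config] = ms_i
    return mapping
```

Using a set for each MS_i means that two sites inside the same basic block contribute a single entry.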
3. The configuration defect-oriented database fuzzy test method according to claim 1, wherein the LLVM framework is version 10.0.0 or above.
4. The configuration defect-oriented database fuzzy test method according to claim 1, wherein J in step 4.2 satisfies 0 < J ≤ 100.
5. The configuration defect-oriented database fuzzy test method according to claim 1, wherein in step 4.2.7 whether seed sp_j causes the software S to crash is determined according to the SIGKILL signal sent by the operating system, where SIGKILL denotes process termination.
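A signal-based check of this kind can be sketched as follows. The sketch assumes a POSIX system and that the target is launched as a child process; the helper name `killed_by_sigkill` is hypothetical. It relies on the documented behaviour of Python's `subprocess` module, which reports termination by signal N as a return code of -N.

```python
import signal
import subprocess
import sys
from typing import List


def killed_by_sigkill(command: List[str], timeout_s: float = 10.0) -> bool:
    """Run a command and report whether the OS ended it with SIGKILL.

    POSIX only: subprocess reports death by signal N as returncode == -N.
    """
    completed = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout_s,
    )
    return completed.returncode == -signal.SIGKILL
```

For example, `killed_by_sigkill([sys.executable, "-c", "pass"])` returns `False`, because that child exits normally.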
6. The configuration defect-oriented database fuzzy test method according to claim 1, wherein in step 4.4.2.3.4, if c_k' is of numeric type, the method of sampling the value of c_k' is: let Min be the minimum value and Max the maximum value of c_k' extracted by the Spex algorithm in step 4.4.2.2, and let V_k' = {Min, 10·Min, 10^2·Min, 10^-2·Max, 10^-1·Max, Max}.
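The six sample points of V_k' can be computed directly from Min and Max. A minimal sketch follows; the function name is hypothetical. The claim does not say what happens when a sample such as 10^2·Min falls outside [Min, Max], so this sketch simply returns all six values unchanged.

```python
from typing import List


def sample_numeric_values(min_value: float, max_value: float) -> List[float]:
    """Return V_k' = {Min, 10*Min, 10^2*Min, 10^-2*Max, 10^-1*Max, Max}."""
    return [
        min_value,
        10 * min_value,
        100 * min_value,
        max_value / 100,
        max_value / 10,
        max_value,
    ]
```

The samples cluster near both ends of the valid range, which is where boundary-related configuration defects are most likely to appear.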
7. The configuration defect-oriented database fuzzy test method according to claim 1, wherein in step 4.4.2.4 the database fuzzy test module uses the SQL keywords provided by the database under test to mutate the value of c_k' according to V_k' and obtain the mutated configuration c_k'', as follows: the value of the MySQL configuration automatic is mutated by the instruction Set automatic=false; the value of the PostgreSQL configuration archive_timeout is mutated by the instruction Set archive_timeout=1000; the value of the MariaDB configuration connect_timeout is mutated by the instruction Set connect_timeout=2000; the value of the Redis configuration slowlog-max-len is mutated by the instruction Config Set slowlog-max-len 10086; the value of the SQLite configuration analysis_limit is mutated by the instruction Pragma analysis_limit=18.
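The per-database instruction forms listed in claim 7 lend themselves to a small template table. The following is a sketch only: the helper `build_config_statement` and the lowercase DBMS keys are hypothetical, and the templates mirror just the three forms that appear in the claim (Set, Config Set, Pragma).

```python
from typing import Union

# Statement templates per DBMS, following the instruction forms in claim 7.
_TEMPLATES = {
    "mysql": "SET {name}={value}",
    "postgresql": "SET {name}={value}",
    "mariadb": "SET {name}={value}",
    "redis": "CONFIG SET {name} {value}",
    "sqlite": "PRAGMA {name}={value}",
}


def build_config_statement(dbms: str, name: str, value: Union[int, str]) -> str:
    """Build the statement that sets configuration `name` to `value`."""
    try:
        template = _TEMPLATES[dbms.lower()]
    except KeyError:
        raise ValueError(f"unsupported DBMS: {dbms}") from None
    return template.format(name=name, value=value)
```

A fuzzer would call this once per sampled value in V_k' and splice the resulting statement into the seed.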
8. The configuration defect-oriented database fuzzy test method according to claim 1, wherein in step 4.4.2.5 whether executing q_z' causes S to hang is determined according to whether the seed execution time exceeds the maximum duration MaxT.
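A timeout-based hang check can be sketched as follows, assuming the target is run as a child process and MaxT is given in seconds. The helper name `hangs` is hypothetical. `subprocess.run` kills the child and waits for it before raising `TimeoutExpired`, so no stray process is left behind.

```python
import subprocess
import sys
from typing import List


def hangs(command: List[str], max_t_s: float) -> bool:
    """Return True if the command is still running after max_t_s seconds."""
    try:
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=max_t_s,
        )
    except subprocess.TimeoutExpired:
        return True   # the child was killed by subprocess.run before raising
    return False
```

For example, a child that sleeps for 30 seconds is reported as hung with `max_t_s=0.5`, while a child that exits immediately is not.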
9. The configuration defect-oriented database fuzzy test method according to claim 1, wherein the method of obtaining the runtime value P' of the program variable P through instrumentation in step 4.4.3.3 is as follows:
4.4.3.3.1 Use the ModulePass tool of the LLVM framework to analyze the source code S_0 of the software under test and obtain the position loc_p of P in S_0;
4.4.3.3.2 Use the IRBuilder interface of the LLVM framework to insert a Store instruction at loc_p, which stores the value P' of P into shared memory.
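On the fuzzer side, the value written by the inserted Store instruction has to be read back from shared memory. The sketch below shows one way to do that. The memory layout is an assumption made for illustration only (a little-endian signed 64-bit integer at offset 0 of a named segment), since the claim does not specify it, and the function name `read_p_value` is hypothetical.

```python
import struct
from multiprocessing import shared_memory


def read_p_value(segment_name: str) -> int:
    """Read P' from a named shared memory segment.

    Assumed layout: little-endian signed 64-bit integer at offset 0.
    """
    segment = shared_memory.SharedMemory(name=segment_name)
    try:
        (value,) = struct.unpack_from("<q", segment.buf, 0)
        return value
    finally:
        segment.close()   # detach only; the owner of the segment unlinks it
```

Reading the value after each execution and comparing it with the previous reading is enough to decide whether a mutated field influenced P.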
CN202310805941.1A 2023-07-03 2023-07-03 Configuration defect-oriented database fuzzy test method Active CN116909884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310805941.1A CN116909884B (en) 2023-07-03 2023-07-03 Configuration defect-oriented database fuzzy test method

Publications (2)

Publication Number Publication Date
CN116909884A (en) 2023-10-20
CN116909884B (en) 2024-01-26

Family

ID=88357390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310805941.1A Active CN116909884B (en) 2023-07-03 2023-07-03 Configuration defect-oriented database fuzzy test method

Country Status (1)

Country Link
CN (1) CN116909884B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118211223B (en) * 2024-05-17 2024-07-30 国网电商科技有限公司 Open source software vulnerability discovery method, system, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193731A (en) * 2017-05-12 2017-09-22 北京理工大学 Use the fuzz testing coverage rate improved method of control variation
CN114185802A (en) * 2021-12-16 2022-03-15 杭州电子科技大学 Fuzzy test seed variation intensity optimization method
CN114490353A (en) * 2022-01-06 2022-05-13 清华大学 Database management system fuzzy test method and device and electronic equipment
CN116126698A (en) * 2022-12-29 2023-05-16 中国人民解放军国防科技大学 Run-time configuration updating defect detection method based on metamorphic test

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113157549A (en) * 2020-01-23 2021-07-23 戴尔产品有限公司 Software code testing system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Identify error-handling code snippets in large-scale software; Jinyu Liu, Shanshan Li et al.; IEEE; full text *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant