CN113377651A - Class integration test sequence generation method based on reinforcement learning - Google Patents
- Publication number
- CN113377651A CN113377651A CN202110647435.5A CN202110647435A CN113377651A CN 113377651 A CN113377651 A CN 113377651A CN 202110647435 A CN202110647435 A CN 202110647435A CN 113377651 A CN113377651 A CN 113377651A
- Authority
- CN
- China
- Prior art keywords
- complexity
- class
- value
- action
- reinforcement learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
Abstract
The invention discloses a class integration test sequence generation method based on reinforcement learning, belonging to the technical field of software testing. The method comprises the following steps: 1) defining the reinforcement learning task; 2) statically analyzing the program; 3) measuring the test stub complexity; 4) designing the reward function; 5) designing the value function; 6) generating the class integration test sequence. The invention solves the problem that existing reinforcement-learning-based class integration test sequence generation methods use an insufficiently accurate index to evaluate the total cost of a class integration test sequence; it provides testers with a more accurate measurement method for testing work in actual production, improves the efficiency of integration testing, and allows better control of product quality.
Description
Technical Field
The invention belongs to the technical field of software testing, and particularly relates to a class integration test sequence generation method based on reinforcement learning.
Background
The software testing stage mainly comprises unit testing, integration testing, system testing, verification and validation, regression testing, and the like. Integration testing assembles the software units into modules, subsystems, or systems on the basis of unit testing and checks whether each part reaches or realizes its technical indicators and requirements, so as to ensure that the units cooperate as intended after being combined and that each increment behaves correctly. However, object-oriented programs have no obvious hierarchical division; the call relationships between classes form an intricate mesh structure, and traditional integration test strategies cannot be applied well to such a structure. Therefore, a new integration test strategy that fits the characteristics of object-oriented programs is needed, one that takes classes as the objects of integration and generates an optimal class integration test sequence to determine the test order.
Based on the inter-class dependencies of object-oriented programs, researchers in the field of software engineering have proposed integration strategies based on class integration test sequences. During testing, these strategies often require constructing test stubs for certain classes of the object-oriented program to stand in for certain functions. This task is costly and generally cannot be avoided, so reducing its cost becomes a key issue in integration testing. In the research process, scholars measure the cost of a test stub by calculating its complexity; different class integration test sequences require test stubs of different complexities and therefore incur different test costs. Reasonably ordering the classes of the program under test to obtain a feasible class integration test sequence can greatly reduce the overall complexity of the test stubs to be constructed and thus reduce the test cost as much as possible.
Existing reinforcement-learning-based class integration test sequence generation methods ignore test stub complexity as an evaluation index: they assume that every inter-class dependency has the same strength, i.e., that every test stub has the same complexity. However, different test stubs have different complexities, and fewer test stubs does not imply a lower stubbing cost for a class integration test sequence. The existing reinforcement-learning-based methods therefore use the number of test stubs as the measure of the total cost of a class integration test sequence, and this index is not accurate enough. Proposing a reasonable class integration test sequence generation technique and refining the evaluation index is thus of great importance for integration testing.
Disclosure of Invention
The invention aims to provide a class integration test sequence generation method based on reinforcement learning, solving the problem that the index used by existing reinforcement-learning-based methods to evaluate the total cost of a class integration test sequence is not accurate enough. A more accurate measurement method can thus be provided for testers in actual production, further improving the efficiency of integration testing.
The invention is realized according to the following technical scheme:
a class integration test sequence generation method based on reinforcement learning comprises the following specific processes:
step 1, defining the reinforcement learning task: the task of reinforcement learning is to let the agent continuously try actions in the environment, continuously adjust its strategy according to the reward values obtained, and finally produce a better strategy from which the agent knows which action to execute in which state;
step 2, static analysis of the program: performing static analysis on the source program, using the collected information to calculate the attribute complexity and method complexity between classes; the attribute coupling between classes is computed from the attribute complexity, and the method coupling between classes is computed from the method complexity;
step 3, measuring the test stub complexity: calculating the test stub complexity from the obtained attribute and method complexities, providing information for the subsequent design of the reward function;
step 4, designing the reward function: integrating the calculation of the test stub complexity into the design of the reward function, guiding the agent to learn towards lower test stub complexity;
step 5, designing the value function: the value function is fed back through the reward function, and its definition ensures that the accumulated reward is maximized;
step 6, generating the class integration test sequence: when the agent completes the set number of training episodes, the action path with the maximum overall reward value is selected, which is the class integration test sequence obtained by this learning process.
In a specific scheme, the specific steps of step 1 are as follows:
1.1, regarding a software system to be analyzed as a set of classes to be integrated during testing;
1.2, retaining the action sequence executed by the agent along the path, i.e. the action history, as a candidate solution for the class integration test sequence;
1.3, finding the action history with the maximum overall reward among the candidate solutions, which is the class integration test sequence sought by the learning process (an illustrative sketch of this task setup is given below).
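The following minimal Python sketch illustrates how the task definition above can be represented in code: the system under test as a set of classes, each episode's action history as a candidate solution, and the candidate with the maximum overall reward as the resulting class integration test sequence. The class and field names are illustrative assumptions, not structures prescribed by the invention.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Episode:
    """One learning episode: the classes integrated in order and the reward collected."""
    actions: List[str] = field(default_factory=list)   # action history
    total_reward: float = 0.0

@dataclass
class CitoTask:
    """The reinforcement learning task: integrate the classes of the system under test."""
    classes: List[str]                                  # set of classes to be integrated
    candidates: List[Episode] = field(default_factory=list)

    def best_sequence(self) -> List[str]:
        """Return the action history with the maximum overall reward."""
        if not self.candidates:
            return []
        return max(self.candidates, key=lambda e: e.total_reward).actions
```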
In a specific scheme, the specific steps of step 2 are as follows:
2.1, analyzing the relationships between classes and calculating the attribute coupling between classes from the attribute complexity, denoted A(i, j), where i and j denote classes in the program; the attribute complexity is numerically equal to the sum of the number of member variables, method parameters, and method return values in class i whose type is class j;
2.2, calculating the method coupling between classes from the method complexity, denoted M(i, j); the method complexity is numerically equal to the number of methods of class j called by class i.
2.3, standardizing the attribute complexity and the method complexity (an illustrative sketch of this computation is given below).
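A hedged sketch of step 2 follows: computing the inter-class attribute complexity A(i, j) and method complexity M(i, j) from statically collected counts, then standardizing them. The dictionary-based input format and the min-max standardization are assumptions made for illustration; the invention only requires that the standardized values lie between 0 and 1.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]   # (i, j): class i depends on class j

def attribute_complexity(member_vars: Dict[Pair, int],
                         params: Dict[Pair, int],
                         returns: Dict[Pair, int]) -> Dict[Pair, int]:
    """A(i, j): member variables + method parameters + return values of type j used in class i."""
    pairs = set(member_vars) | set(params) | set(returns)
    return {p: member_vars.get(p, 0) + params.get(p, 0) + returns.get(p, 0) for p in pairs}

def method_complexity(calls: Dict[Pair, int]) -> Dict[Pair, int]:
    """M(i, j): number of methods of class j called by class i."""
    return dict(calls)

def standardize(values: Dict[Pair, float]) -> Dict[Pair, float]:
    """Assumed min-max standardization so that all values fall between 0 and 1."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}
```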
In a specific scheme, the specific steps of step 3 are as follows:
3.1, calculating the weights of the attribute complexity and method complexity between classes by the entropy weight method;
3.2, combining the attribute complexity and method complexity to calculate the test stub complexity;
3.3, when a class integration test sequence is obtained, accumulating the complexities of the test stubs generated in the process to obtain the total test stub complexity.
Wherein,,a (i, j) represents the attribute complexity between classes i and j, M (i, j) represents the method complexity between classes i and j, the entropy weight method firstly standardizes the continuous indexes, the obtained results are all between 0 and 1, and the attribute complexity between the classes after standardization isThe complexity of the method is。
In a specific scheme, the specific steps of step 4 are as follows:
4.1, designing the reward function so that the reward value is higher when the class the agent explores and integrates is more optimal;
4.2, when any action class appears twice in the process, giving the path a minimum value of −∞ so that it is avoided in subsequent exploration;
4.3, designing the reward function in combination with the test stub complexity.
Here the agent reaches σ_i through i − 1 state transitions, where σ_i denotes a state path; r(σ_i) denotes the reward value the state path receives; Max denotes the maximum reward value, here 1000; c is a positive integer, here 100; a_{σ_i} denotes the action history corresponding to the state path; and SCplx() denotes the test stub complexity. When any situation that does not meet the requirements occurs in the process, the environment gives the agent a penalty value.
In a specific scheme, the specific steps of step 5 are as follows:
5.1, obtaining the current Q value from the state generated by interaction with the environment and the selected action, denoted Q(s, a), where s denotes the state and a denotes the action;
5.2, selecting the largest Q(s', a') according to the next state s' and multiplying it by the discount factor γ;
5.3, adding the reward value r obtained by the agent for executing action a in state s;
5.4, multiplying the whole by the learning rate α;
5.5, adding the Q value obtained in 5.1 to obtain the updated Q value.
where α represents the learning rate, r represents the reward value resulting from the agent performing action a in state s, and γ represents the discount factor.
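Steps 5.1 to 5.5 describe a tabular Q-learning style update. The sketch below uses the standard Q-learning temporal-difference form Q(s, a) ← Q(s, a) + α[r + γ·max Q(s', a') − Q(s, a)], which is an assumption about the exact update intended; the state and action types are likewise illustrative.

```python
from collections import defaultdict

class QTable:
    """Tabular value function: Q(s, a) stored in a dictionary keyed by (state, action)."""

    def __init__(self, alpha: float = 0.1, gamma: float = 0.9):
        self.q = defaultdict(float)   # unseen (state, action) pairs default to 0
        self.alpha = alpha            # learning rate α
        self.gamma = gamma            # discount factor γ

    def update(self, s, a, r, s_next, next_actions) -> float:
        """Update Q(s, a) after observing reward r and the next state s_next."""
        best_next = max((self.q[(s_next, a2)] for a2 in next_actions), default=0.0)
        target = r + self.gamma * best_next               # r + γ · max_a' Q(s', a')
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])
        return self.q[(s, a)]
```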
In a specific scheme, the specific steps of step 6 are as follows:
6.1, the agent selects an action according to the action selection mechanism;
6.2, when the agent completes the set number of training episodes, the system returns the action sequence with the maximum overall reward value, which is the required optimal class integration test sequence.
Compared with the prior art, the invention has the beneficial effects that:
the invention solves the problem that the existing class integrated test sequence generation method based on reinforcement learning has an inaccurate index for evaluating and determining the total cost of the class integrated test sequence, provides a more accurate measurement method for testers to carry out test work in actual production life, improves the efficiency of integrated test, and further better controls the quality of products.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention, are incorporated in and constitute a part of this specification; they illustrate embodiments of the invention and, together with the description, serve to explain the invention without unduly limiting it. It is obvious that the drawings in the following description are only some embodiments, and that other drawings can be derived from them by a person skilled in the art without inventive effort.
In the drawings:
FIG. 1 is a flow chart of a class integration test sequence generation method based on reinforcement learning according to an embodiment of the present invention;
FIG. 2 is a flow chart for defining reinforcement learning tasks;
FIG. 3 is a flow chart of program static analysis;
FIG. 4 is a flow chart of measuring test pile complexity;
FIG. 5 is a flow chart for designing a reward function;
FIG. 6 is a flow chart of a design value function;
FIG. 7 is a flow chart for generating a class integration test sequence.
It should be noted that the drawings and the description are not intended to limit the scope of the inventive concept in any way, but to illustrate it for a person skilled in the art with reference to specific embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and the following embodiments are used for illustrating the present invention and are not intended to limit the scope of the present invention.
FIG. 1 is a flowchart of a class integration test sequence generation method based on reinforcement learning according to an embodiment of the present invention.
S1 defines the reinforcement learning task. Reinforcement learning is based on the theory of Markov decision processes: the conditional probability with which the agent selects an action depends only on the current state, while the current action in turn influences, with a certain probability, the transition to the next state, and the agent obtains a reward value according to the state and action. Assume that the path's action history contains n actions in total. Under the given strategy, the goal of the reinforcement learning task is defined as finding an optimal sequence that maximizes the overall reward obtained.
S2 performs static analysis of the program. Static analysis of the source program collects information such as the types of inter-class dependencies, attribute dependency information (member variable dependencies, parameter passing, return values, and the like), and method call information, which is used to calculate the attribute and method complexity between classes. Inter-class dependencies can be divided into static and dynamic dependencies according to whether the program must run: static dependencies can be analyzed without running the program, while relationships formed at run time are called dynamic dependencies. The invention mainly concerns static dependencies. The strength of the dependencies between classes must be considered when constructing test stubs, and measuring these dependencies requires analyzing the coupling information between classes, from which the cost of constructing test stubs is calculated. Analysis shows that the degree of coupling is positively correlated with the strength of the dependency, i.e., the coupling degree is positively correlated with the test cost.
S3 measures the test stub complexity. A test stub is not a real class in the system but a component module or piece of object code that serves the class under test. When the dependency between two classes is strong, the test stub must simulate more functions, its construction is more complex, and its complexity is higher; when the dependency is weak, constructing the test stub is easier, the cost is low, and the complexity is low. The test stub complexity can therefore be calculated from the strength of the inter-class dependencies, yielding the test cost.
S4 designs the reward function. When the agent takes action a in state s, the environment generally gives the agent a reward r; when the agent obtains a positive r, it reinforces the selection of actions in that direction, which also influences the next state. The reward function is the function that calculates the reward value r.
S5 designs the value function. The reward function maximizes the reward or penalty obtained when the agent moves from one state to the next; to guarantee that the accumulated reward is maximal when the agent reaches the target state, the accumulated reward is represented by the designed value function, and a Q table is used to store the Q values. Assuming that the average reward obtained after the first t actions and the reward of the (t+1)-th action are known, the accumulated reward Q value after the (t+1)-th action selection can be predicted and updated; the value function represents this predict-and-update feedback process in reinforcement learning.
S6 generates the class integration test sequence. Through action selection and reward feedback during learning, the agent adds an appropriate action to the action sequence at each step. The reward function is designed by measuring the complexity of the test stubs constructed in the process, and the designed value function ensures that the accumulated reward of the action sequence is also maximal; the action sequence finally selected is the optimal class integration test sequence obtained by the method.
FIG. 2 is a flow chart for defining the reinforcement learning task. Based on the structure of reinforcement learning and related research on class integration test sequences, a reinforcement learning strategy is formulated that aims at the lowest possible test stub complexity, with the accumulated reward of the action sequence selected by the agent as the learning target. The specific steps are as follows: first, the software system to be analyzed is regarded as a set of classes to be integrated during testing; then, the action sequence executed by the agent along the path, i.e. the action history, is retained as a candidate solution for the class integration test sequence; finally, the action history with the maximum overall reward is found among the candidate solutions, which is the class integration test sequence required by the learning process.
FIG. 3 is a flow chart of the program static analysis. The dependencies between classes in the program are analyzed to obtain the attribute and method coupling degrees, in preparation for calculating the test stub complexity in the next step. The specific steps are as follows: first, the relationships between classes are analyzed from the concrete statements of the program; then, the attribute coupling between classes is calculated from the attribute complexity, and the method coupling between classes is calculated from the method complexity; finally, for the convenience of the subsequent calculation, the attribute complexity and the method complexity are each standardized.
FIG. 4 is a flow chart of measuring the test stub complexity. Test stub complexity is an important index for measuring the test cost and is obtained mainly from the attribute and method complexities between classes. The specific steps are as follows: first, the weights of the attribute complexity and method complexity are determined using the entropy weight method; then, the test stub complexity is calculated by combining the attribute and method complexities standardized in the previous step; finally, when a class integration test sequence is obtained, the complexities of the test stubs generated in the process are accumulated to obtain the total test stub complexity, which is used to evaluate the method.
The implementation is as follows: in order to obtain the test stub complexity more accurately, the entropy weight method is used to calculate the weights W_A and W_M of the attribute complexity and the method complexity.
The steps of calculating the test stub complexity by the entropy weight method are as follows:
(1) Standardize the indicators
The test stub complexity is calculated from two indicators, the attribute complexity and the method complexity; A(i, j) denotes the attribute complexity between classes i and j, and M(i, j) denotes the method complexity between classes i and j. The entropy weight method first standardizes these continuous indicators so that the obtained results all lie between 0 and 1; the standardized attribute complexity between classes is denoted Ā(i, j) and the standardized method complexity M̄(i, j). The calculation takes the min-max standardization form:
Ā(i, j) = (A(i, j) − min A) / (max A − min A), and M̄(i, j) is obtained analogously from M(i, j).
(2) Establish the evaluation matrix
Assuming the system under test includes m classes, an m × 2 matrix may be constructed to represent the two relationships between the classes, where the first column contains the evaluation values under the attribute complexity indicator and the second column the evaluation values under the method complexity indicator. Together the two columns form the evaluation matrix R = (r_ij), with i = 1, ..., m and j = 1, 2.
computing information entropy
Before calculating the information entropy, the evaluation value ratio of each class to each index needs to be calculated firstly, and P is usedijExpressing the specific gravity of the jth index, and calculating the information entropy e of the jth index according to the obtained specific gravityjWhere K is a constant, the formula is as follows:
calculating weights
Denote the weight of the jth index as WjWherein j represents the attribute complexity weight when being 1, and represents the method complexity weight when being 2, the formula is as follows:
finally, obtaining the attribute complexity and the method complexity to obtain the weight, and further obtaining the test pile complexity SCplx (i, j).
FIG. 5 is a flow chart of designing the reward function. The reward function is an important indicator guiding the agent's exploration, and the agent tends to explore action paths with larger reward values; therefore, to obtain the class integration test sequence with the lowest test cost, the reward function is improved by incorporating the test stub complexity. The specific steps are as follows: first, the reward function is designed so that the reward value is higher when the class the agent explores and integrates is more optimal; then, when any action class appears twice in the process, the path is given a minimum value of −∞ so that it is avoided in subsequent exploration; finally, the reward function is designed in combination with the test stub complexity, so that the trained agent tends to explore paths with lower test stub complexity.
The implementation is as follows: assume the agent experiences n + 1 states in total from the first state to the f-th state and selects n actions; the f-th state s_f is the final state. The state change function from s_1 to s_f is defined accordingly, with s' denoting the state following state s.
The state path from the initial state to the final state s_f is denoted σ = (σ_0, σ_1, ..., σ_n), where σ_0 denotes the initial state s_1. The n actions executed by the agent along this path form the action history a_σ = (a_1, a_2, ..., a_n) corresponding to the state path. If the path contains no duplicate actions, a_σ can be regarded as an alternative class integration test sequence.
The reward and punishment mechanism in reinforcement learning is the core that controls the agent's exploration of the optimal path: the more optimal the class the agent explores and integrates, the higher the reward it obtains. To further reduce the overall test cost of the class integration test sequence, the invention designs the reward function in combination with the test stub complexity, defined as follows:
The agent reaches σ_i through i − 1 state transitions, where σ_i denotes a state path; r(σ_i) denotes the reward value the state path receives; Max denotes the maximum reward value, here 1000; c is a positive integer, here 100; a_{σ_i} denotes the action history corresponding to the state path; and SCplx() denotes the test stub complexity. When any situation that does not meet the requirements occurs during the agent's exploration, the environment gives the agent a penalty value. For example, if any action class appears twice in the process, the path is given a minimum value of −∞, so that it can be avoided in later exploration. When no duplicate classes appear in the path, the path is considered feasible and the environment assigns the agent a reward value, where c is a positive integer. When the final state is reached and no duplicate classes have appeared, an alternative path has been found and its overall test cost is calculated. If the test cost of the currently obtained class integration test sequence is less than the test costs of all previously obtained sequences, a higher reward value is given. Integrating in the order of the path's action history minimizes the overall test stub complexity of the generated class integration test sequence and saves the test cost to the greatest extent.
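A hedged sketch of the reward shaping just described follows. Max = 1000 and c = 100 follow the text; the way c and SCplx() are combined for an intermediate feasible step is an assumption, since the exact formula is not reproduced in this record.

```python
MAX_REWARD = 1000.0   # Max in the text
C = 100.0             # c in the text

def reward(action_history, scplx_of_history, best_cost_so_far, n_classes):
    """Reward for the state path whose action history is action_history."""
    if len(set(action_history)) < len(action_history):
        return float("-inf")                      # a class repeats: forbid this path
    cost = scplx_of_history(action_history)       # accumulated test stub complexity so far
    if len(action_history) == n_classes and cost < best_cost_so_far:
        return MAX_REWARD                         # complete sequence cheaper than any seen before
    return C / (1.0 + cost)                       # assumed shaping: lower stub cost, higher reward
```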
FIG. 6 is a flow chart of designing the value function. The value function focuses on the accumulated reward obtained during the agent's exploration and supplements the reward function to push the agent to explore in the direction of lower test cost. The specific steps are as follows: first, the current Q value is obtained from the state generated in the interaction with the environment and the selected action, denoted Q(s, a); then, the largest Q(s', a') is selected according to the next state s' and multiplied by the discount factor γ; then, the reward value r obtained by the agent for executing action a in state s is added, and the whole is multiplied by the learning rate α; finally, the Q value obtained at the start is added to give the updated Q value.
Here α denotes the learning rate, r denotes the reward value obtained by the agent for executing action a in state s, and γ denotes the discount factor.
FIG. 7 is a flow chart of generating the class integration test sequence. The invention uses reinforcement learning to train the agent to learn in the direction of lower test cost, and the action sequence obtained by selecting actions in this process is the required class integration test sequence. The specific steps are as follows: first, the agent selects an action according to the action selection mechanism; then, when the agent completes the set number of training episodes, the system returns the action sequence with the maximum overall reward value, which is the optimal class integration test sequence sought.
The implementation is as follows: reinforcement learning is a process of exploration and exploitation when selecting actions. To avoid getting trapped in local optima during learning and to increase the proportion of exploration by the agent, two selection mechanisms are adopted on the basis of the ε-greedy method:
Traditional ε-greedy algorithm: with probability 1 − ε, select the action corresponding to the maximum Q value in the current state; with probability ε, select an action at random.
Dynamically adjusted ε algorithm: the value of ε is adjusted dynamically according to the number of completed training episodes, denoted time.
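The two selection mechanisms can be sketched as follows, using the QTable from the value-function sketch. The traditional ε-greedy rule follows the text directly; the dynamic adjustment of ε is only described as depending on the training count time, so the decay schedule below is an assumption.

```python
import random

def epsilon_greedy(q_table, state, actions, epsilon: float):
    """With probability 1 - ε pick the action with the largest Q value, otherwise a random one."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_table.q[(state, a)])

def dynamic_epsilon(time: int, eps_start: float = 1.0, eps_min: float = 0.05) -> float:
    """Assumed schedule: ε shrinks as the number of completed training episodes grows."""
    return max(eps_min, eps_start / (1 + time))
```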
Through the Q-learning algorithm, a path σ of the agent from the initial state to the final state is obtained; if all actions in all states have been visited, the Q values obtained by the agent reach their optimal values. Finally, the action history associated with the state path is obtained, and the action sequence corresponding to this action history is the optimal class integration test sequence obtained by the method.
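Tying the sketches together, a possible generation loop in the spirit of FIG. 7 is shown below: episodes of class selection with a dynamically adjusted ε, reward feedback, Q-table updates, and finally the action history with the maximum overall reward as the class integration test sequence. The episode structure and parameter values are illustrative assumptions.

```python
def generate_cito(classes, q_table, reward_fn, episodes: int = 1000):
    """Train for a fixed number of episodes and return the best class integration test sequence."""
    best_seq, best_reward = None, float("-inf")
    for t in range(episodes):
        eps = dynamic_epsilon(t)
        history, total = [], 0.0
        state = tuple()                                   # state: classes integrated so far
        while len(history) < len(classes):
            remaining = [c for c in classes if c not in history]
            a = epsilon_greedy(q_table, state, remaining, eps)
            history.append(a)
            r = reward_fn(history)
            next_state = tuple(history)
            next_actions = [c for c in classes if c not in history]
            q_table.update(state, a, r, next_state, next_actions)
            state, total = next_state, total + r
        if total > best_reward:
            best_seq, best_reward = history[:], total
    return best_seq                                       # action history with the maximum overall reward
```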
The experimental data of the class integration test sequence generation method based on reinforcement learning are given as follows:
The experimental subjects are 9 programs from the SIR repository: elevator, SPM, ATM, day, ANT, BCEL, DNS, email_spl, and statepad_spl. Averaged over 30 runs, for 5 of these 9 programs the overall test stub complexity spent in generating the optimal class integration test sequence by the method is the lowest, reduced by 39.4%, 33.3%, 7.6%, 37.9%, and 17.8% respectively compared with the lowest overall test stub complexity obtained when generating class integration test sequences with the particle swarm optimization algorithm and the random algorithm.
In conclusion, the invention solves the problem that the index used by existing reinforcement-learning-based class integration test sequence generation methods to evaluate the total cost of a class integration test sequence is not accurate enough. It not only advances research in the field of class integration test sequence generation, but also further reduces the test cost, provides testers with a more accurate measurement method for testing work in actual production, improves the efficiency of the software testing phase, and safeguards the quality of software products.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments rather than others, combinations of features of different embodiments are also meant to be within the scope of the invention and to form different embodiments. For example, in the above embodiments, those skilled in the art can combine features according to known technical solutions and the technical problem to be solved by the present application.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A class integration test sequence generation method based on reinforcement learning is characterized in that:
step 1, defining the reinforcement learning task: the task of reinforcement learning is to let the agent continuously try actions in the environment, continuously adjust its strategy according to the reward values obtained, and finally produce a better strategy from which the agent knows which action to execute in which state;
step 2, static analysis of the program: performing static analysis on the source program, using the collected information to calculate the attribute complexity and method complexity between classes; the attribute coupling between classes is computed from the attribute complexity, and the method coupling between classes is computed from the method complexity;
step 3, measuring the test stub complexity: calculating the test stub complexity from the obtained attribute and method complexities, providing information for the subsequent design of the reward function;
step 4, designing the reward function: integrating the calculation of the test stub complexity into the design of the reward function, guiding the agent to learn towards lower test stub complexity;
step 5, designing the value function: the value function is fed back through the reward function, and its definition ensures that the accumulated reward is maximized;
step 6, generating the class integration test sequence: when the agent completes the set number of training episodes, the action path with the maximum overall reward value is selected, which is the class integration test sequence obtained by this learning process.
2. The class integration test sequence generation method based on reinforcement learning of claim 1, wherein:
the specific steps of step 1 are as follows:
1.1, regarding a software system to be analyzed as a set of classes to be integrated during testing;
1.2, retaining the action sequence executed by the agent along the path, i.e. the action history, as a candidate solution for the class integration test sequence;
1.3, finding the action history with the maximum overall reward among the candidate solutions, which is the class integration test sequence sought by the learning process.
3. The class integration test sequence generation method based on reinforcement learning of claim 1, wherein:
the specific steps of step 2 are as follows:
2.1, analyzing the relationships between classes and calculating the attribute coupling between classes from the attribute complexity, denoted A(i, j), where i and j denote classes in the program; the attribute complexity is numerically equal to the sum of the number of member variables, method parameters, and method return values in class i whose type is class j;
2.2, calculating the method coupling between classes from the method complexity, denoted M(i, j), the method complexity being numerically equal to the number of methods of class j called by class i, and then standardizing the attribute complexity and the method complexity.
4. The class integration test sequence generation method based on reinforcement learning of claim 1, wherein:
the specific steps of step 3 are as follows:
3.1, calculating the weights of the attribute complexity and method complexity between classes by the entropy weight method;
3.2, combining the attribute complexity and method complexity to calculate the test stub complexity;
3.3, when a class integration test sequence is obtained, accumulating the complexities of the test stubs generated in the process to obtain the total test stub complexity.
5. The class integration test sequence generation method based on reinforcement learning of claim 4, wherein:
Wherein,,a (i, j) represents the attribute complexity between classes i and j, M (i, j) represents the method complexity between classes i and j, the entropy weight method firstly standardizes the continuous indexes, the obtained results are all between 0 and 1, and the attribute complexity between the classes after standardization isThe complexity of the method is。
6. The class integration test sequence generation method based on reinforcement learning of claim 1, wherein:
the specific steps of step 4 are as follows:
4.1, designing the reward function so that the reward value is higher when the class the agent explores and integrates is more optimal;
4.2, when any action class appears twice in the process, giving the path a minimum value of −∞ so that it is avoided in subsequent exploration;
4.3, designing the reward function in combination with the test stub complexity, so that the trained agent tends to explore paths with lower test stub complexity.
7. The class integration test sequence generation method based on reinforcement learning of claim 6, wherein:
The agent reaches σ_i through i − 1 state transitions, where σ_i denotes a state path; r(σ_i) denotes the reward value the state path receives; Max denotes the maximum reward value, here 1000; c is a positive integer, here 100; a_{σ_i} denotes the action history corresponding to the state path; and SCplx() denotes the test stub complexity.
8. The class integration test sequence generation method based on reinforcement learning of claim 1, wherein:
the specific steps of step 5 are as follows:
5.1, obtaining the current Q value from the state generated by interaction with the environment and the selected action, denoted Q(s, a), where s denotes the state and a denotes the action;
5.2, selecting the largest Q(s', a') according to the next state s' and multiplying it by the discount factor γ;
5.3, adding the reward value r obtained by the agent for executing action a in state s;
5.4, multiplying the whole by the learning rate α;
5.5, adding the Q value obtained in 5.1 to obtain the updated Q value.
10. The class integration test sequence generation method based on reinforcement learning of claim 1, wherein:
the specific steps of step 6 are as follows:
6.1, the agent selects an action according to the action selection mechanism;
6.2, when the agent completes the set number of training episodes, the system returns the action sequence with the maximum overall reward value, which is the required optimal class integration test sequence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110647435.5A CN113377651A (en) | 2021-06-10 | 2021-06-10 | Class integration test sequence generation method based on reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110647435.5A CN113377651A (en) | 2021-06-10 | 2021-06-10 | Class integration test sequence generation method based on reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113377651A true CN113377651A (en) | 2021-09-10 |
Family
ID=77573587
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110647435.5A Pending CN113377651A (en) | 2021-06-10 | 2021-06-10 | Class integration test sequence generation method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113377651A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117435516A (en) * | 2023-12-21 | 2024-01-23 | 江西财经大学 | Test case priority ordering method and system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108399127A (en) * | 2018-02-09 | 2018-08-14 | 中国矿业大学 | A kind of integrated method for creating test sequence of class |
CN109800717A (en) * | 2019-01-22 | 2019-05-24 | 中国科学院自动化研究所 | Activity recognition video frame sampling method and system based on intensified learning |
CN110659199A (en) * | 2018-06-29 | 2020-01-07 | 中国矿业大学 | Class integration test sequence generation method based on transfer dependence |
US20200065157A1 (en) * | 2018-08-27 | 2020-02-27 | Vmware, Inc. | Automated reinforcement-learning-based application manager that learns and improves a reward function |
CN111026549A (en) * | 2019-11-28 | 2020-04-17 | 国网甘肃省电力公司电力科学研究院 | Automatic test resource scheduling method for power information communication equipment |
US20210064515A1 (en) * | 2019-08-27 | 2021-03-04 | Nec Laboratories America, Inc. | Deep q-network reinforcement learning for testing case selection and prioritization |
US20210073110A1 (en) * | 2019-09-10 | 2021-03-11 | Sauce Labs Inc. | Authoring automated test suites using artificial intelligence |
- 2021-06-10 CN CN202110647435.5A patent/CN113377651A/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108399127A (en) * | 2018-02-09 | 2018-08-14 | 中国矿业大学 | A kind of integrated method for creating test sequence of class |
CN110659199A (en) * | 2018-06-29 | 2020-01-07 | 中国矿业大学 | Class integration test sequence generation method based on transfer dependence |
US20200065157A1 (en) * | 2018-08-27 | 2020-02-27 | Vmware, Inc. | Automated reinforcement-learning-based application manager that learns and improves a reward function |
CN109800717A (en) * | 2019-01-22 | 2019-05-24 | 中国科学院自动化研究所 | Activity recognition video frame sampling method and system based on intensified learning |
US20210064515A1 (en) * | 2019-08-27 | 2021-03-04 | Nec Laboratories America, Inc. | Deep q-network reinforcement learning for testing case selection and prioritization |
US20210073110A1 (en) * | 2019-09-10 | 2021-03-11 | Sauce Labs Inc. | Authoring automated test suites using artificial intelligence |
CN111026549A (en) * | 2019-11-28 | 2020-04-17 | 国网甘肃省电力公司电力科学研究院 | Automatic test resource scheduling method for power information communication equipment |
Non-Patent Citations (5)
Title |
---|
GABRIELA CZIBULA: "An effective approach for determining the class integration test order using reinforcement learning", 《APPLIED SOFT COMPUTING》, pages 517 - 530 * |
LIONEL C. BRIAND: "Using Genetic Algorithms and Coupling Measures to Devise Optimal Integration Test Orders", 《PROC. OF THE 14TH INT’L CONF. ON SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING》, pages 1 - 8 * |
何柳柳: "Reinforcement learning reward mechanism for continuous integration testing optimization", 《软件学报 (Journal of Software)》, pages 1438 - 1449 *
张艳梅: "A class integration testing method based on dynamic dependency relationships", 《计算机学报 (Chinese Journal of Computers)》, pages 1075 - 1089 *
张艳梅;姜淑娟;陈若玉;王兴亚;张妙: "Class integration test order determination method based on particle swarm optimization", 计算机学报 (Chinese Journal of Computers), no. 04, pages 1 - 5 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117435516A (en) * | 2023-12-21 | 2024-01-23 | 江西财经大学 | Test case priority ordering method and system |
CN117435516B (en) * | 2023-12-21 | 2024-02-27 | 江西财经大学 | Test case priority ordering method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Vanhoucke et al. | An overview of project data for integrated project management and control | |
Attarzadeh et al. | Proposing a new software cost estimation model based on artificial neural networks | |
Chang et al. | Learning to simulate and design for structural engineering | |
Hearty et al. | Predicting project velocity in xp using a learning dynamic bayesian network model | |
Lanubile et al. | Evaluating predictive quality models derived from software measures: lessons learned | |
CN113837356A (en) | Intelligent sewage treatment prediction method based on fusion neural network | |
EP4075281A1 (en) | Ann-based program test method and test system, and application | |
CN111125964B (en) | Sewage treatment process proxy model construction method based on Kriging interpolation method | |
JP2022530868A (en) | Target object attribute prediction method based on machine learning, related equipment and computer programs | |
CN108596800A (en) | Bayes-based open answer decision method | |
CN109925718A (en) | A kind of system and method for distributing the micro- end map of game | |
Srivastava et al. | Software testing effort: An assessment through fuzzy criteria approach | |
CN113377651A (en) | Class integration test sequence generation method based on reinforcement learning | |
CN110659199B (en) | Class integration test sequence generation method based on transfer dependence | |
Zhao et al. | Designing a prediction model for athlete’s sports performance using neural network | |
CN111767991B (en) | Measurement and control resource scheduling method based on deep Q learning | |
CN113868113B (en) | Class integrated test sequence generation method based on Actor-Critic algorithm | |
CN111783930A (en) | Neural network test sufficiency evaluation method based on path state | |
Alsmadi et al. | Effective generation of test cases using genetic algorithms and optimization theory | |
CN115081856A (en) | Enterprise knowledge management performance evaluation device and method | |
Smith et al. | A framework to model and measure system effectiveness | |
CN113987261A (en) | Video recommendation method and system based on dynamic trust perception | |
Cevikcan et al. | Westinghouse method oriented fuzzy rule based tempo rating approach | |
Sun et al. | Prediction of Condition Monitoring Signals Using Scalable Pairwise Gaussian Processes and Bayesian Model Averaging | |
CN113392958A (en) | Parameter optimization and application method and system of fuzzy neural network FNN |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |