CN111611177B - Software performance defect detection method based on configuration item performance expectation - Google Patents
Software performance defect detection method based on configuration item performance expectation
- Publication number
- CN111611177B CN111611177B CN202010610996.3A CN202010610996A CN111611177B CN 111611177 B CN111611177 B CN 111611177B CN 202010610996 A CN202010610996 A CN 202010610996A CN 111611177 B CN111611177 B CN 111611177B
- Authority
- CN
- China
- Prior art keywords
- performance
- configuration item
- label
- software
- expected
- Prior art date
- Legal status: Active
Classifications
- G06F11/3684—Test management for test design, e.g. generating new test cases
- G06F11/3409—Recording or statistical evaluation of computer activity for performance assessment
- G06F11/3692—Test management for test results analysis
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The invention discloses a software performance defect detection method based on configuration item performance expectations, and aims to provide a method that effectively detects configuration-item-related performance defects. The technical scheme is: construct a performance defect detection system consisting of a configuration item expectation prediction module, a test sample generation module and a performance defect detection module, built around configuration item performance expectations; train the configuration item expectation prediction module; read in the software to be detected, let the configuration item expectation prediction module predict the performance expectations of its configuration items, let the test sample generation module generate test samples from the performance expectations and the software test set, and let the performance defect detection module execute the test samples, check whether the performance expectation and the actual performance agree, and output a performance defect whenever they do not. The invention not only detects software performance defects effectively; it has also found new performance defects for the software community, and it can effectively judge the performance difference between defect-free and defective software.
Description
Technical Field
The invention relates to the field of performance defect detection in large software, in particular to a software performance defect detection method based on configuration item performance expectation.
Background
With the continuous progress of society, software systems have been widely deployed in every field and play a vital role in modern society. As software systems keep evolving, users demand ever more of their reliability, safety and performance (software running speed), so software keeps growing in scale and complexity. For example, version 2.8.0 of the Hadoop distributed open-source software has more than 8,000 source files and close to ten million lines of code. At the same time, software systems provide more and more flexible configuration items so that users can configure the software to their needs: Apache httpd has more than 1,000 configuration items, and MySQL more than 800. The proportion of configuration items concerning non-functional properties keeps rising, and such items are closely tied to computing resources (such as CPU and memory) and to performance optimization strategies. As software scale grows, improving software performance has become one of the most important tasks of software evolution and maintenance. The paper "An Empirical Study on Performance Bugs for Highly Configurable Software Systems" published by Xue Han et al. in ESEM 2016 shows that configuration items have become one of the main causes of software performance problems, with a proportion as high as 59%. In a survey of 148 businesses, 92% considered improving software performance one of the most important tasks of the software development process. In recent years, software performance problems caused by code flaws associated with software configuration items have led to significant business losses.
Aiming at software performance problems, the prior art mainly adopts two classes of detection methods. The first class, e.g. "Automating Performance Bottleneck Detection using Search-Based Application Profiling" published by Du Shen et al. in ISSTA 2015, relies on a performance bottleneck diagnosis tool such as a profiler: it generates test cases that make the software run slowly and reports the function that consumes the most execution time on such a case to the developer as a performance defect. Although such methods achieve high coverage in detecting performance defects, they produce a large number of false positives, because a test case may execute slowly not because of a performance defect but simply because the case itself inherently takes long. In other words, this class of methods lacks an effective performance test oracle (Test Oracle: in computing, software engineering, and software testing, a test oracle is a mechanism for determining whether a test has passed or failed).
The second class, e.g. "Toddler: Detecting Performance Problems via Similar Memory-Access Patterns" published by Adrian Nistor et al. in ICSE 2013, matches performance defects in the software under test by summarizing performance defect code patterns and variable read patterns inside loop structures. This class builds its test oracle from defect code patterns and can effectively reduce false alarms on performance faults. However, performance defects inside loop structures account for only a small proportion of performance defects in general, so this class of methods is limited to detecting specific fault types (e.g. defects in loop structures); it has been verified to detect only 9.8% of configuration-item-related performance faults.
In summary, how to construct a performance test oracle with low false alarms and high coverage, and automatically generate corresponding test samples so as to detect software performance defects effectively and comprehensively, is a hot problem under discussion by those skilled in the art.
Disclosure of Invention
The invention aims to provide a software performance defect detection method based on configuration item performance expectations. The method uses the performance expectations of software configuration items to construct a test oracle (namely: a performance defect exists when the actual performance of the software is inconsistent with the performance expectation of a configuration item) and automatically predicts this test oracle for the software under test; based on the test oracle, test samples are automatically generated, and configuration-item-related performance defects are effectively detected.
In order to solve the technical problem, the technical scheme of the invention is as follows: first, a performance defect detection system consisting of a configuration item expectation prediction module, a test sample generation module and a performance defect detection module is constructed around the configuration item performance expectations of "Tuning Backfired? Not (Always) Your Fault: Understanding and Detecting Configuration-Related Performance Bugs" (He Haochen et al., ESEC/FSE 2019); then a training data set whose configuration items are manually annotated with expectations is read in, and the configuration item expectation prediction module is trained; finally the software to be detected (comprising the software itself, its bundled test set, and its configuration item user manual) is read in, the configuration item expectation prediction module predicts the performance expectations of the configuration items and sends them to the test sample generation module and the performance defect detection module, the test sample generation module generates test samples from the performance expectations and the software test set and sends them to the performance defect detection module, and the performance defect detection module executes the test samples, checks whether the performance expectation and the actual performance agree, and outputs a performance defect whenever they do not.
The invention comprises the following steps:
First step: construct a performance defect detection system, consisting of a configuration item expectation prediction module, a test sample generation module and a performance defect detection module.
The configuration item expectation prediction module is a weighted voting classifier connected with the test sample generation module and the performance defect detection module. It reads the descriptions and value ranges of the configuration items from the configuration item user manual of the software to be detected, predicts the performance expectation of each configuration item to be predicted, obtains the performance expectation label of the configuration item (the label denotes the category of the performance expectation), and sends the performance expectation labels of the configuration items to the test sample generation module and the performance defect detection module.
The test sample generation module is connected with the configuration item expectation prediction module and the performance defect detection module. It receives the performance expectation labels of the configuration items from the configuration item expectation prediction module, reads the test commands from the test set of the software to be detected, and generates a test sample set T from the performance expectation labels and the test set.
The performance defect detection module is connected with the configuration item expectation prediction module and the test sample generation module. It receives the test sample set T from the test sample generation module and the performance expectation labels of the configuration items from the configuration item expectation prediction module, executes the test samples in T, checks whether the expected performance corresponding to each performance expectation label agrees with the actual performance, and, if not, outputs the performance defect of the software to be detected.
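The data flow among the three modules can be summarized as a small skeleton. This is only an expository sketch, not the patent's implementation: all class and method names below are assumptions, and Python is used merely as notation.

```python
# Expository skeleton of the three-module system; every name here is an
# illustrative assumption, not an API defined by the patent.

class ExpectationPredictor:
    """Weighted voting classifier over configuration item descriptions."""
    def predict(self, user_manual):
        """Return {configuration item: performance expectation label}."""
        raise NotImplementedError

class TestSampleGenerator:
    def generate(self, labels, software_test_set):
        """Return the test sample set T built from labels and the test set."""
        raise NotImplementedError

class DefectDetector:
    def detect(self, samples, labels):
        """Execute T, compare expected vs. actual performance, report defects."""
        raise NotImplementedError

def run_detection(user_manual, software_test_set):
    labels = ExpectationPredictor().predict(user_manual)                 # third step
    samples = TestSampleGenerator().generate(labels, software_test_set)  # fourth step
    return DefectDetector().detect(samples, labels)                      # fifth step
```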
Second step: train the configuration item expectation prediction module of the performance defect detection system. Read in the configuration items manually annotated with expectations together with the official document descriptions of the configuration items, and train the configuration item expectation prediction module.
2.1, construct a training set as follows: randomly select N (N ≥ 500) configuration items from the roughly 10,000 configuration items of 12 software systems: MySQL, MariaDB, Apache-httpd, Apache-Tomcat, Apache-Derby, H2, PostgreSQL, GCC, Clang, MongoDB, RocksDB, and Squid.
2.2, according to the official document descriptions of the N configuration items, manually annotate the configuration items with performance expectation labels, as follows: according to the document description (denoted d) of a configuration item (denoted c), if the purpose of adjusting the configuration item is to turn on an optimization switch, the performance expectation label of the configuration item is Label_1; if the purpose of adjusting the configuration item is to enhance performance at the cost of reliability, the performance expectation label is Label_2; if the purpose of adjusting the configuration item is to allocate more computer resources, the performance expectation label is Label_3; if the purpose of adjusting the configuration item is to turn on an additional software function, the performance expectation label is Label_4; if adjusting the configuration item is unrelated to software performance, the performance expectation label is Label_5. Finally a training set is obtained, denoted D = D_1 ∪ D_2 ∪ D_3 ∪ D_4 ∪ D_5 with D_l = {d_(l,1), …, d_(l,i_l), …, d_(l,N_l)}, where N_1 + N_2 + N_3 + N_4 + N_5 = N, and N_1, N_2, N_3, N_4, N_5 are the numbers of configuration item document descriptions with performance expectation labels Label_1, Label_2, Label_3, Label_4, Label_5 respectively. d_(l,i_l) is the document description of the i_l-th configuration item in the training set whose performance expectation label is Label_l, and consists of words; 1 ≤ l ≤ 5, 1 ≤ i_l ≤ N_l. Let M_(l,i_l) be the total number of words in d_(l,i_l); d_(l,i_l) is recorded as (word_1, word_2, …, word_(o_l), …, word_(M_(l,i_l))).
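For reference, the five label categories of step 2.2 can be written down as a simple lookup table; this merely restates the prose definitions above, with Python used only as notation.

```python
# The five performance expectation label categories of step 2.2.
PERFORMANCE_EXPECTATION_LABELS = {
    1: "adjusting the item turns on an optimization switch",
    2: "adjusting the item enhances performance at the cost of reliability",
    3: "adjusting the item allocates more computing resources",
    4: "adjusting the item turns on an additional software function",
    5: "adjusting the item is unrelated to software performance",
}
```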
2.3 the configuration item expectation prediction module preprocesses the training set:
2.3.1 initialize variable l = 1;
2.3.2 initialize variable i_l = 1;
2.3.3 preprocess d_(l,i_l) word by word (see the sketch after step 2.3.5):
2.3.3.1 initialize variable o_l = 1;
2.3.3.2 transform word_(o_l) of d_(l,i_l) into the pair (POS_(o_l), DS_(o_l)), where POS_(o_l) is the part-of-speech tag of the word (e.g. Noun, Verb) and DS_(o_l) is its computer-domain synonym (for example, the DS of both memory and CPU is resource);
2.3.3.3 if o_l < M_(l,i_l), let o_l = o_l + 1 and go to 2.3.3.2; if o_l = M_(l,i_l), the preprocessed description of d_(l,i_l) is obtained in the form ((POS_1, DS_1), …, (POS_(M_(l,i_l)), DS_(M_(l,i_l)))), abbreviated d'_(l,i_l); go to 2.3.4;
2.3.4 judge whether i_l equals N_l; if so, go to 2.3.5, otherwise let i_l = i_l + 1 and go to 2.3.3;
2.3.5 judge whether l equals 5; if so, go to 2.4, otherwise let l = l + 1 and go to 2.3.2;
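A minimal sketch of the word-level preprocessing of step 2.3, assuming NLTK for part-of-speech tagging; the computer-domain synonym table here is a tiny illustrative stand-in for whatever domain thesaurus an implementation would actually use.

```python
import nltk  # assumes nltk and its tokenizer/tagger models are installed

# Illustrative stand-in for a computer-domain synonym table (DS).
DOMAIN_SYNONYMS = {"memory": "resource", "cpu": "resource", "disk": "resource"}

def preprocess(description: str):
    """Map a configuration item description to its (POS, DS) sequences."""
    words = nltk.word_tokenize(description.lower())
    pos = [tag for _, tag in nltk.pos_tag(words)]    # part-of-speech tags
    ds = [DOMAIN_SYNONYMS.get(w, w) for w in words]  # domain synonyms
    return pos, ds
```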
2.4 the configuration item expectation prediction module mines frequent subsequences. Using the PrefixSpan algorithm of the paper "PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth" published by Jian Pei et al. in ICDE 2001, frequent subsequence mining is performed separately on the preprocessed sets D'_l = {d'_(l,1), …, d'_(l,N_l)} (l = 1, …, 5), obtaining 5 frequent subsequence sets P_1, P_2, …, P_5 with P_l = {p_(l,1), …, p_(l,q), …, p_(l,Q_l)}, where Q_1, Q_2, …, Q_l, …, Q_5 are positive integers; Q_l is the number of frequent subsequences the PrefixSpan algorithm mines from the set D'_l (l = 1, 2, …, 5); 1 ≤ q ≤ Q_l;
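The frequent-subsequence mining of step 2.4 could look as follows, assuming the third-party prefixspan package; any PrefixSpan implementation with a minimum-support interface would serve equally well.

```python
from prefixspan import PrefixSpan  # assumed third-party PrefixSpan package

def mine_frequent_subsequences(preprocessed_class, min_support=5):
    """preprocessed_class: the token sequences D'_l of one label class.
    Returns (support, subsequence) pairs with support >= min_support."""
    return PrefixSpan(preprocessed_class).frequent(min_support)
```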
2.5 compute the confidence of every frequent subsequence in P_1, P_2, …, P_5 (see the sketch after step 2.5.5), as follows:
2.5.1 initialize variable l = 1;
2.5.2 initialize variable q = 1;
2.5.3 compute the confidence conf_(l,q) of frequent subsequence p_(l,q): conf_(l,q) = (number of preprocessed descriptions in D'_l that p_(l,q) matches) / (total number of preprocessed descriptions in D'_1, …, D'_5 that p_(l,q) matches), where p_(l,q) matches a preprocessed description d'_(k,i_k) once if p_(l,q) is a subsequence of d'_(k,i_k);
2.5.4 judge whether q equals Q_l; if so, go to 2.5.5; if not, let q = q + 1 and go to 2.5.3;
2.5.5 judge whether l equals 5; if so, the confidences of all frequent subsequences in P_1, P_2, …, P_5 are obtained, go to 2.6; if not, let l = l + 1 and go to 2.5.2.
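The confidence of step 2.5.3 is the fraction of a pattern's matches that fall inside its own label class. A sketch, under the assumption that preprocessed descriptions are plain token sequences:

```python
def is_subsequence(sub, seq):
    """True if sub occurs in seq in order (not necessarily contiguously)."""
    it = iter(seq)
    return all(token in it for token in sub)

def confidence(pattern, own_class, all_classes):
    """own_class: preprocessed descriptions D'_l of the pattern's own label;
    all_classes: the five per-class description sets D'_1 ... D'_5."""
    own = sum(is_subsequence(pattern, d) for d in own_class)
    total = sum(is_subsequence(pattern, d)
                for cls in all_classes for d in cls)
    return own / total if total else 0.0
```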
2.6 according to the confidences of the frequent subsequences in P_1, P_2, …, P_5, screen the frequent subsequences, as follows:
2.6.1 initialize variable l = 1;
2.6.2 initialize variable q = 1;
2.6.3 if conf_(l,q) > 1/5 (5 being the number of expectation label categories), put p_(l,q) into the set P'_l;
2.6.4 judge whether q equals Q_l; if so, go to 2.6.5; if not, let q = q + 1 and go to 2.6.3;
2.6.5 judge whether l equals 5; if so, the screened frequent subsequence sets P'_1, P'_2, P'_3, P'_4, P'_5 are obtained, go to 2.7; if not, let l = l + 1 and go to 2.6.2.
2.7 train the configuration item expectation prediction module using P'_1, P'_2, P'_3, P'_4, P'_5, as follows:
2.7.1 initialization: randomly select 100 frequent subsequences (with replacement) from each of P'_1, P'_2, P'_3, P'_4, P'_5, forming the randomly selected frequent subsequence sets P''_1, P''_2, P''_3, P''_4, P''_5, which contain 500 frequent subsequences in total, namely:
{p_(1,1), p_(1,2), …, p_(1,r), …, p_(1,100)}, …, {p_(l,1), p_(l,2), …, p_(l,r), …, p_(l,100)}, …,
{p_(5,1), p_(5,2), …, p_(5,r), …, p_(5,100)}, 1 ≤ r ≤ 100;
2.7.2 compute the Precision, Recall and F-score (the harmonic mean of precision and recall) of P''_1, P''_2, P''_3, P''_4, P''_5 on the training data set;
2.7.3 judge whether the estimated cumulative distribution function value of the maximum F-score is greater than a threshold δ, typically 99%–99.9%; if greater, go to 2.8; if less than or equal to δ, go to 2.7.1;
2.8 the configuration item expectation prediction module selects the P''_1, P''_2, P''_3, P''_4, P''_5 corresponding to the maximum F-score and builds the weighted voting classifier, as follows: the input of the weighted voting classifier is the preprocessed configuration item description (POS_x, DS_x) of any configuration item (abbreviated x) whose expectation label is to be predicted; the output is the votes obtained by the 5 expectation labels, and the performance expectation label of x is the label that obtains the most votes. The votes of category l are the sum of the confidences of the frequent subsequences p_(l,r_x) ∈ P''_l (1 ≤ r_x ≤ 100) that are subsequences of x. The classifier outputs a five-tuple of votes, denoted Votes(x) = [v_1(x), v_2(x), v_3(x), v_4(x), v_5(x)],
where v_l(x) means: the sum of the confidences of the subsequences in P''_l that satisfy "is a subsequence of x" (l = 1, 2, …, 5). If some element of Votes(x) is not 0, find the element with the maximum value in Votes(x); its index l is the index of the performance expectation label of x, which is therefore Label_l; go to the third step. If Votes(x) = [0, 0, 0, 0, 0], the performance expectation label of x is empty; go to the third step. For example, if Votes(x) = [1.1, 1.4, 5.3, 0, 2.0], the performance expectation label of x is Label_3; if Votes(x) = [0, 0, 0, 0, 0], the performance expectation label of configuration item x is null;
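A sketch of the weighted vote of step 2.8: every selected frequent subsequence that is a subsequence of the preprocessed description x casts its confidence as a vote for its own label; an all-zero vote vector yields the empty label. The data layout is an assumption made for exposition.

```python
def is_subsequence(sub, seq):  # same helper as in the step-2.5 sketch
    it = iter(seq)
    return all(token in it for token in sub)

def classify(x, selected):
    """x: preprocessed description; selected: list of five lists of
    (pattern, confidence) pairs, one list per label category."""
    votes = [sum(conf for pattern, conf in patterns
                 if is_subsequence(pattern, x))
             for patterns in selected]
    if not any(votes):
        return None                      # empty label: expectation unknown
    return votes.index(max(votes)) + 1   # Label_1 ... Label_5
```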
Third step: generate a performance expectation label set L for the software to be detected using the trained configuration item expectation prediction module, and send L to the test sample generation module and the performance defect detection module, as follows:
the trained configuration item expectation prediction module reads the configuration item descriptions from the configuration item user manual of the software to be detected, and uses the weighted voting classifier to predict the performance expectation of all configuration items to be detected, C = {c_1, c_2, …, c_z, …, c_N'}, where 1 ≤ z ≤ N' and N' is the number of configuration items in the configuration item user manual, obtaining the performance expectation label set L = [Lab_1, Lab_2, …, Lab_z, …, Lab_N'], where Lab_z ∈ {Label_1, Label_2, Label_3, Label_4, Label_5, null (empty)}; L is sent to the test sample generation module and the performance defect detection module.
Fourth, the test sample generating module generates a test sample set T for the software to be detected, and sends the T to the performance defect detecting module, and the method is as follows:
4.1 the test sample generation module uses the Spex algorithm of article "Do Not Blame Users for Misconfigurations (without blading user's configuration errors)" published by Tianyin Xu et al in SOSP 2013 to extract the grammar type and value range of the software configuration items in C. The types of grammar that Spex ultimately extracts fall into four categories: numerical type (int), boolean type (bool), enumeration type (enum), string type (string);
4.2 test sample generating Module is the configuration item set C= { C 1 ,c 2 ,…,c z ,…,c N' Generating a set of values V, v= { V 1 ,V 2 ,…,V z ,…,V N'}, wherein For configuration item c z Is a value of K z Generating a module c for the test sample z The number of values generated. The method comprises the following steps:
4.2.1 initialize variable z = 1;
4.2.2 read the syntax type of c_z extracted by the Spex algorithm;
4.2.3 if c_z is Boolean (bool), let V_z = {0, 1}, go to 4.2.7;
4.2.4 if c_z is an enumeration (enum), let V_z contain all possible values of c_z extracted by the Spex algorithm, go to 4.2.7;
4.2.5 if c_z is a string (string), let V_z contain only the default value of c_z (following the conclusion of "Tuning Backfired? Not (Always) Your Fault: Understanding and Detecting Configuration-Related Performance Bugs", He Haochen et al., ESEC/FSE 2019, that string-type configuration items are generally unrelated to performance), go to 4.2.7;
4.2.6 if c_z is numeric (int), sample the value range of c_z as follows: record the minimum and maximum values of c_z extracted by the Spex algorithm as Min and Max, and let V_z = {Min, 10·Min, 10^2·Min, Max, 10^-1·Max, 10^-2·Max}, go to 4.2.7;
4.2.7 if z = N', go to 4.3; otherwise let z = z + 1 and go to 4.2.2;
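The per-type value sampling of steps 4.2.3–4.2.6 reduces to one branch per syntax type. In this sketch, spec is an assumed record of what the Spex algorithm extracts for a configuration item (its type, options, default, minimum and maximum); it is not an API the patent defines.

```python
def candidate_values(spec):
    """Return the value set V_z for one configuration item."""
    if spec.type == "bool":
        return [0, 1]
    if spec.type == "enum":
        return list(spec.options)        # all enumeration values
    if spec.type == "string":
        return [spec.default]            # strings: default value only
    if spec.type == "int":
        lo, hi = spec.min, spec.max
        # {Min, 10*Min, 100*Min, Max, Max/10, Max/100} from step 4.2.6
        return [lo, 10 * lo, 100 * lo, hi, hi // 10, hi // 100]
    raise ValueError(f"unknown syntax type: {spec.type}")
```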
4.3 take the Cartesian product of V_1, V_2, …, V_z, …, V_N', obtaining V_Cartesian = V_1 × V_2 × … × V_N';
4.4 the performance test set of software is typically provided in the form of a performance test tool, so the test sample generation module generates test commands based on the performance test tool (e.g. sysbench, apache-benchmark). The method is: sample the parameters of the performance test tool with the classical pair-wise method ("Pair-wise testing is a combinatorial method of software testing that, for each pair of input parameters to a system, tests all possible discrete combinations of those parameters" — "Pragmatic Software Testing: Becoming an Effective and Efficient Test Professional"), then input the sampled parameters (e.g. concurrency, load type, data table size, number of data tables, read operation ratio, write operation ratio) into the performance test tool and output the test commands, obtaining a test command set B = {b_1, b_2, b_3, …, b_y, …, b_Y}, 1 ≤ y ≤ Y, where Y is the number of test commands in B;
4.5 the test sample generation module generates the test sample set T, T = B × V_Cartesian = {t_1, t_2, t_3, …, t_a, …, t_W}, 1 ≤ a ≤ W, where t_a is a two-tuple t_a = (b_y, (v_1^u, …, v_z^h, …, v_N'^j)) (meaning: the test command is b_y and the value of c_z is v_z^h), and W is the number of test samples in T. v_1^u is the u-th (1 ≤ u ≤ K_1) possible value of c_1, v_z^h is the h-th (1 ≤ h ≤ K_z) possible value of c_z, and v_N'^j is the j-th (1 ≤ j ≤ K_N') possible value of c_N'; K_1, K_z, K_N' are the numbers of possible values of configuration items c_1, c_z, c_N' extracted by the Spex algorithm, and all are positive integers. The test sample set T is sent to the performance defect detection module;
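Steps 4.3–4.5 combine pair-wise-sampled benchmark commands with the Cartesian product of the configuration values. The sketch below assumes the third-party allpairspy package for pair-wise sampling. Note that the full Cartesian product over all N' items grows combinatorially, so a practical implementation may restrict it, for instance by varying only a few items away from their defaults at a time.

```python
from itertools import product
from allpairspy import AllPairs  # assumed third-party pair-wise generator

def build_test_samples(per_item_values, benchmark_params):
    """per_item_values: the lists V_1 ... V_N'; benchmark_params: lists of
    candidate values for each performance-test-tool parameter."""
    v_cartesian = product(*per_item_values)                            # V_Cartesian
    commands = [tuple(combo) for combo in AllPairs(benchmark_params)]  # B
    return [(b, v) for v in v_cartesian for b in commands]            # samples t_a
```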
Fifth step: the performance defect detection module detects the performance defects of the executable file of the software to be detected according to T and L:
5.1, the performance defect detection module executes the test samples in T to obtain their performance values (see the sketch after step 5.1.5), as follows:
5.1.1 initializing variable a=1;
5.1.2 to prevent performance fluctuations caused by an unstable test environment, the performance defect detection module executes each test sample A times, A being a positive integer, preferably A = 10; let variable repeat = 1 (repeat records how many executions have been performed so far);
5.1.3 the performance defect detection module inputs test sample t_a into the software to be detected, runs the software, and records the performance value R_a^repeat obtained by the repeat-th run with input t_a; the default performance index of the recorded performance value is the software data throughput;
5.1.4 judge whether repeat equals A; if so, the set of performance values of test sample t_a is obtained, denoted R_a = {R_a^1, …, R_a^repeat, …, R_a^A}, go to 5.1.5; if not, let repeat = repeat + 1 and go to 5.1.3;
5.1.5 judge whether a equals W; if so, record the output as Out = {[t_1, R_1], …, [t_a, R_a], …, [t_W, R_W]} (where in the tuple [t_a, R_a] the first element is a test sample and the second element is the set of performance values obtained by executing that test sample A times), go to 5.2; otherwise let a = a + 1 and go to 5.1.2;
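Step 5.1 in miniature: each sample is executed A times and the throughput of each run recorded. run_benchmark is a placeholder assumption standing in for however the software under test and its benchmark tool are actually launched.

```python
A = 10  # repetitions per test sample, as recommended in step 5.1.2

def measure(sample, run_benchmark):
    """sample: a (command, config_values) pair; run_benchmark: assumed
    callable returning the throughput of one run."""
    command, config_values = sample
    return [run_benchmark(command, config_values) for _ in range(A)]
```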
5.2 the performance defect detection module groups Out by test sample (see the sketch after step 5.2.4), as follows:
5.2.1 initialize variable a = 1;
5.2.2 if [t_a, R_a] has already been grouped, let a = a + 1 and go to 5.2.2; otherwise go to 5.2.3;
5.2.3 group [t_a, R_a] according to the configuration item values and test command in t_a; that is, if within {[t_1, R_1], …, [t_a, R_a], …, [t_W, R_W]} a pair t_a and t_a' simultaneously satisfies the following 3 conditions, then [t_a, R_a] and [t_a', R_a'] belong to the same group:
condition 1: t_a and t_a' differ in the value of exactly one configuration item c_z (1 ≤ z ≤ N');
condition 2: t_a and t_a' contain the same test command b_y;
condition 3: [t_a, R_a] and [t_a', R_a'] have not been grouped;
let the total number of tuples satisfying the above conditions, including [t_a, R_a], be Num_a; these Num_a tuples are put into one group, denoted Group_(z,y) = {[t_a, R_a], [t_a', R_a'], [t_a'', R_a''], …, [t_a*, R_a*]} (where 1 ≤ a', a'', …, a* ≤ W; Num_a is a positive integer whose size is related to the type of c_z: if c_z is Boolean, Num_a = 2; if an enumeration, Num_a = K_z; if numeric, Num_a = 6; if a string, Num_a = 1). For example, when t_a is (b_y, (…, v_z^h, …)) and t_a' is (b_y, (…, v_z^h', …)) with all other values equal, [t_a, R_a] and [t_a', R_a'] form a group;
5.2.4 if a = W, the grouping is completed and the grouped test result set G = {Group_(1,1), Group_(1,2), …, Group_(1,Y), …, Group_(z,y), …, Group_(N',Y)} is obtained; go to 5.3; otherwise let a = a + 1 and go to 5.2.2;
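A simplified sketch of the grouping of step 5.2: two measured samples belong together when they share the test command and differ in exactly one configuration item. Unlike the patent's procedure, which marks tuples as grouped so that each appears exactly once, this sketch simply indexes every sample once per configuration item.

```python
from collections import defaultdict

def group_results(out):
    """out: list of ((command, config_values), runs) pairs, i.e. Out."""
    groups = defaultdict(list)
    for (command, values), runs in out:
        for z, v in enumerate(values):
            # Key fixes the command and every value except item z, so
            # group members differ only in configuration item z.
            key = (z, command, values[:z] + values[z + 1:])
            groups[key].append((v, runs))
    return groups
```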
5.3 the performance defect detection module judges, according to the performance expectation label set L of configuration item set C and the grouped test result set G, whether the software to be detected has defects, using hypothesis testing (hypothesis testing, also called statistical hypothesis testing, is a statistical inference method for judging whether differences between samples, or between a sample and the population, are caused by sampling error or by intrinsic differences; the significance parameter β is a positive real number less than 1, preferably β = 0.05). The principle is as follows: if the expectation label of any configuration item c_z is Label_1, Label_2 or Label_3, adjusting c_z is expected to improve the performance of the software, so if the actual test result is that performance decreases, the software has a performance defect; if the expectation label of c_z is Label_4, adjusting c_z is expected to reduce performance moderately, so if the actual test result is that performance decreases sharply, the software has a performance defect; if the expectation label of c_z is Label_5, adjusting c_z is expected to leave performance unchanged, so if the actual test result is that performance decreases, the software has a performance defect. The method is: traverse every group in G and judge with hypothesis testing whether the software to be detected has defects (see the sketch after step 5.3.10):
5.3.1 initializing variable z=1;
5.3.2 initializing variable y=1;
5.3.3 if Lab_z = Label_1 (where Lab_z is the expectation label of c_z), set the hypothesis to be tested H_0: R_a ≤ R_a' (where c_z takes the value 0 in t_a and 1 in t_a'). Go to 5.3.8;
5.3.4 if Lab_z = Label_2, set the hypothesis to be tested H_0: R_a ≤ R_a' (where the value of c_z in t_a is greater than the value of c_z in t_a'). Go to 5.3.8;
5.3.5 if Lab_z = Label_3, set the hypothesis to be tested H_0: R_a ≤ R_a' (where the value of c_z in t_a is less than the value of c_z in t_a'). Go to 5.3.8;
5.3.6 if Lab_z = Label_4, set the hypothesis to be tested H_0: 5·R_a ≤ R_a' (where c_z takes the value 1 in t_a and 0 in t_a'). Go to 5.3.8;
5.3.7 if Lab_z = Label_5, set the hypothesis to be tested H_0: R_a ≠ R_a'. Go to 5.3.8;
5.3.8 when the hypothesis test result shows that H_0 is rejected (i.e., the rejection probability calculated by the hypothesis test method is 1 − β), the software has a performance defect related to configuration item c_z, and the test command triggering the defect is b_y;
5.3.9 if y = Y, go to 5.3.10; if not, let y = y + 1 and go to 5.3.3;
5.3.10 if z = N', the detection ends; if not, let z = z + 1 and go to 5.3.2.
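Finally, a sketch of the statistical check behind step 5.3. The patent fixes only the significance parameter β = 0.05, not a particular test; the one-sided Mann-Whitney U test below is an illustrative choice for deciding whether one group of runs is significantly slower than another.

```python
from scipy.stats import mannwhitneyu

BETA = 0.05  # significance parameter recommended in step 5.3

def significantly_slower(runs_tuned, runs_baseline):
    """True when the tuned runs' throughput is significantly lower than
    the baseline runs', i.e. H0 'tuned >= baseline' is rejected."""
    _, p_value = mannwhitneyu(runs_tuned, runs_baseline, alternative="less")
    return p_value < BETA
```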
Compared with the prior art, the invention has the following beneficial effects:
1. the invention can effectively detect software performance defects. For 61 historical performance defects in 12 large open-source software systems (MySQL, MariaDB, Apache-httpd, Apache-Tomcat, Apache-Derby, H2, PostgreSQL, GCC, Clang, MongoDB, RocksDB, Squid), 23,418 test samples were run based on the expectations of 52 configuration items, consuming 178 hours; 54 performance defects were successfully detected with only 7 false positives. By contrast, prior work ("Toddler: Detecting Performance Problems via Similar Memory-Access Patterns", Adrian Nistor et al., ICSE 2013, which detects performance defects through similar memory read/write patterns) detected only 6.
2. the invention has detected 11 new performance defects for the software community, preventing potential economic and user losses that software performance problems might cause. The defect IDs are: Clang-43576, Clang-43084, Clang-44359, Clang-44518, GCC-93521, GCC-93037, GCC-91895, GCC-91852, GCC-91817, GCC-91875, GCC-93535.
3. the second step of the invention provides a detailed classification of configuration item performance expectations and a method for automatically predicting them, together with a data set containing a large number of configuration items annotated with their expectations; based on configuration item expectations, the invention can effectively judge the performance difference between defect-free and defective software, and has good application prospects.
Drawings
FIG. 1 is a general flow chart of the present invention;
FIG. 2 is a logical block diagram of the performance defect detection system constructed in the first step of the present invention;
FIG. 3 is a table of configuration item performance expectations for use in the second step of the present invention.
Detailed Description
The present invention will be described below with reference to the accompanying drawings.
As shown in fig. 1, the present invention includes the steps of:
First step: as shown in FIG. 2, construct the performance defect detection system consisting of the configuration item expectation prediction module, the test sample generation module and the performance defect detection module, connected as described in the first step above.
Second step: read in the manually annotated configuration items and their official document descriptions, and train the configuration item expectation prediction module following steps 2.1–2.8 above; FIG. 3 tabulates the configuration item performance expectation label categories used for the manual annotation of step 2.2.
Third step: generate the performance expectation label set L for the software to be detected with the trained configuration item expectation prediction module, and send L to the test sample generation module and the performance defect detection module.
Fourth step: the test sample generation module generates the test sample set T following steps 4.1–4.5 above and sends T to the performance defect detection module.
Fifth step: the performance defect detection module detects the performance defects of the software to be detected from T and L following steps 5.1–5.3 above.
Claims (13)
1. The software performance defect detection method based on the configuration item performance expectation is characterized by comprising the following steps:
firstly, constructing a performance defect detection system, wherein the performance defect detection system consists of a configuration item expected prediction module, a test sample generation module and a performance defect detection module;
the configuration item expected prediction module is a weighted voting classifier and is connected with the test sample generation module and the performance defect detection module, the description and the value range of the configuration item are read from a configuration item user manual of software to be detected, the performance expected label of the configuration item is obtained by predicting the performance expected of the configuration item to be predicted, and the performance expected label of the configuration item is sent to the test sample generation module and the performance defect detection module;
The test sample generating module is connected with the configuration item expected predicting module and the performance defect detecting module, receives the performance expected label of the configuration item from the configuration item expected predicting module, reads a test command from a test set of the software to be detected, and generates a test sample set T according to the performance expected label of the configuration item and the test set of the software to be detected;
the performance defect detection module is connected with the configuration item expected prediction module and the test sample generation module, receives a test sample set T from the test sample generation module, receives a performance expected label of the configuration item from the configuration item expected prediction module, executes the test sample in the test sample set T, detects whether expected performance and actual performance corresponding to the performance expected label of the configuration item are consistent, and if not, outputs the performance defect of the software to be detected;
and a second step of: reading in the manually annotated performance expectations of configuration items and the official document descriptions of the configuration items, and training the configuration item expectation prediction module of the performance defect detection system, the method being:
2.1, constructing a training set, wherein the number N of configuration items in the training set is equal to or more than 500;
2.2, according to the official document descriptions of the N configuration items, manually labeling the configuration items with performance expectation labels, the method being: according to the document description d of configuration item c, if the purpose of adjusting the configuration item is to turn on an optimization switch, the performance expectation label of the configuration item is Label_1; if the purpose of adjusting the configuration item is to enhance performance at the cost of reliability, the performance expectation label is Label_2; if the purpose of adjusting the configuration item is to allocate more computer resources, the performance expectation label is Label_3; if the purpose of adjusting the configuration item is to turn on an additional software function, the performance expectation label is Label_4; if adjusting the configuration item is unrelated to software performance, the performance expectation label is Label_5; finally the training set is obtained, recorded as the document descriptions d_(l,i_l), where d_(l,i_l) is the document description of the i_l-th configuration item whose performance expectation label is Label_l, 1 ≤ l ≤ 5, 0 ≤ i_1 ≤ N_1, 0 ≤ i_2 ≤ N_2, 0 ≤ i_3 ≤ N_3, 0 ≤ i_4 ≤ N_4, 0 ≤ i_5 ≤ N_5, and N_1 + N_2 + N_3 + N_4 + N_5 = N; N_1, N_2, N_3, N_4, N_5 are the numbers of configuration item document descriptions with performance expectation labels Label_1, Label_2, Label_3, Label_4, Label_5; let M_(l,i_l) be the total number of words in d_(l,i_l), so that d_(l,i_l) consists of the words word_1, word_2, ..., word_m, ..., word_(M_(l,i_l));
2.3 the configuration item expected prediction module preprocesses the training set by:
2.3.1 initializing variable l=1;
2.3.2 initializing variable i l =1;
2.3.3.2 transform the word word_m into <POS_m, DS_m>, where POS_m is the part-of-speech tag of word_m and DS_m is the computer-domain synonym of word_m;
2.3.3.3 if m < M_(l,i_l), let m = m + 1 and turn to 2.3.3.2; if m = M_(l,i_l), the preprocessed form of d_(l,i_l) is obtained: <POS_1, DS_1>, <POS_2, DS_2>, ..., <POS_m, DS_m>, ..., <POS_(M_(l,i_l)), DS_(M_(l,i_l))>, abbreviated as s_(l,i_l); turn to 2.3.4;
2.3.4 judge whether i_l equals N_l; if yes, turn to 2.3.5; otherwise let i_l = i_l + 1 and turn to 2.3.3;
2.3.5 judge whether l equals 5; if yes, turn to 2.4; otherwise let l = l + 1 and turn to 2.3.2;
2.4 the configuration item expectation prediction module mines frequent subsequences: the PrefixSpan algorithm is applied separately to the five sets S_1, S_2, ..., S_l, ..., S_5, where S_l collects the preprocessed descriptions whose label is Label_l, obtaining 5 frequent subsequence sets P_1, P_2, ..., P_l, ..., P_5 with P_l = {p_(l,1), p_(l,2), ..., p_(l,q), ..., p_(l,Q_l)}; Q_1, Q_2, ..., Q_l, ..., Q_5 are positive integers, Q_l being the number of frequent subsequences the PrefixSpan algorithm mines from the set S_l for l = 1, 2, ..., 5; 1 ≤ q ≤ Q_l;
2.5 calculate the confidence of all frequent subsequences in P_1, P_2, ..., P_5, by:
2.5.1 initializing variable l=1;
2.5.2 initializing variable q=1;
2.5.3 calculate the confidence Confidence_(l,q) of frequent subsequence p_(l,q):
Confidence_(l,q) = (number of matches of p_(l,q) in the set S_l) / (sum of the numbers of matches of p_(l,q) in the five sets S_1, S_2, ..., S_5), where, if p_(l,q) is a subsequence of a preprocessed description s_(l,i_l), p_(l,q) and s_(l,i_l) are determined to match once;
2.5.4 judge whether q equals Q_l; if yes, turn to 2.5.5; if not, let q = q + 1 and turn to 2.5.3;
2.5.5 judge whether l equals 5; if so, the confidence of all frequent subsequences in P_1, P_2, ..., P_5 has been obtained, turn to 2.6; if not, let l = l + 1 and turn to 2.5.2;
2.6 filter the frequent subsequences in P_1, P_2, ..., P_5 according to their confidence, obtaining the filtered frequent subsequence sets P_1′, P_2′, P_3′, P_4′, P_5′;
2.7 train the configuration item expectation prediction module using P_1′, P_2′, P_3′, P_4′, P_5′, the method being:
2.7.1 initializing: randomly select 100 frequent subsequences from each of P_1′, P_2′, P_3′, P_4′, P_5′ to form the sets of randomly selected frequent subsequences P_1″, P_2″, P_3″, P_4″, P_5″; P_1″, P_2″, P_3″, P_4″, P_5″ contain 500 frequent subsequences in total, namely:
{p_(1,1), p_(1,2), ..., p_(1,r), ..., p_(1,100)}, ..., {p_(l,1), p_(l,2), ..., p_(l,r), ..., p_(l,100)}, ..., {p_(5,1), p_(5,2), ..., p_(5,r), ..., p_(5,100)}, 1 ≤ r ≤ 100;
2.7.2 calculate, for P_1″, P_2″, P_3″, P_4″, P_5″ on the training dataset, the precision Precision, the recall Recall, and the F-score, the harmonic mean of precision and recall: F-score = 2·Precision·Recall / (Precision + Recall);
2.7.3 determine whether the estimated cumulative distribution function value of the maximum F-score is greater than the threshold δ; if so, turn to 2.8; if it is less than or equal to the threshold δ, turn to 2.7.1;
2.8 the configuration item expectation prediction module selects the P_1″, P_2″, P_3″, P_4″, P_5″ corresponding to the maximum F-score to construct a weighted voting classifier, and turns to the third step;
thirdly, generating a performance expected label set L for the software to be detected by using the trained configuration item expected prediction module, and sending the L to the test sample generation module and the performance defect detection module, wherein the method comprises the following steps:
the trained configuration item expectation prediction module reads the descriptions of configuration items from the configuration item user manual of the software to be detected, and the weighted voting classifier predicts the performance expectation of all configuration items to be detected C = {c_1, c_2, ..., c_z, ..., c_N′}, obtaining the performance expectation label set L = [Lab_1, Lab_2, ..., Lab_z, ..., Lab_N′], where 1 ≤ z ≤ N′, Lab_z ∈ {Label_1, Label_2, Label_3, Label_4, Label_5, null}, and N′ is the number of configuration items in the configuration item user manual; L is sent to the test sample generation module and the performance defect detection module;
fourth, the test sample generating module generates a test sample set T for the software to be detected, and sends the T to the performance defect detecting module, and the method is as follows:
4.1, the test sample generating module extracts the grammar types and the value ranges of the software configuration items in the C, and the extracted grammar types are divided into four types: numerical type, boolean type, enumeration type, string type;
4.2 the test sample generation module generates a set of values to be measured V for the configuration item set C = {c_1, c_2, ..., c_z, ..., c_N′}: V = {V_1, V_2, ..., V_z, ..., V_N′}, where V_z is the set of values of configuration item c_z and K_z is the number of values the test sample generation module generates for c_z;
4.3 take the Cartesian product of V_1, V_2, ..., V_z, ..., V_N′ to obtain V_Cartesian = V_1 × V_2 × ... × V_N′;
4.4 the test sample generation module generates test commands based on the performance test tool, the method being: sample the parameters of the performance test tool using the pair-wise method, input the sampled parameters into the performance test tool, and output test commands, obtaining the test command set B = {b_1, b_2, b_3, ..., b_y, ..., b_Y}, 1 ≤ y ≤ Y, where Y is the number of test commands in B;
4.5 the test sample generation module generates the test sample set T = B × V_Cartesian = {t_1, t_2, t_3, ..., t_a, ..., t_W}, 1 ≤ a ≤ W, where t_a is a two-tuple whose first element is a test command and whose second element assigns each configuration item one of its possible values; W is the number of test samples in T; K_1, K_z, K_N′ denote the numbers of possible values of configuration items c_1, c_z, c_N′ extracted by the Spex algorithm, all positive integers, with 1 ≤ u ≤ K_1, 1 ≤ h ≤ K_z, 1 ≤ j ≤ K_N′ indexing the u-th, h-th, and j-th possible values of c_1, c_z, c_N′ respectively; the test sample set T is sent to the performance defect detection module (a sketch of this test-sample generation appears after this claim);
fifth step: the performance defect detection module detects the performance defect of the executable file of the software to be detected according to T and L:
5.1, the performance defect detection module executes the test sample in the T to obtain the performance value of the test sample, and the method comprises the following steps:
5.1.1 initializing variable a=1;
5.1.2 the performance defect detection module repeatedly executes each test sample A times; let the variable repeat = 1, A being a positive integer;
5.1.3 the performance defect detection module inputs test sample t_a into the software to be detected, runs the software to be detected, and records the performance value obtained from the repeat-th run of t_a; the default performance index of the measured performance value is set to the software data throughput;
5.1.4 judge whether repeat equals A; if so, the set of performance values of test sample t_a is obtained, recorded as R_a, and turn to 5.1.5; otherwise let repeat = repeat + 1 and turn to 5.1.3;
5.1.5 judge whether a equals W; if so, record the output as Out = {[t_1, R_1], ..., [t_a, R_a], ..., [t_W, R_W]}, where the two-tuple [t_a, R_a] has a test sample as its first element and the set of performance values obtained by executing that test sample A times as its second element, and turn to 5.2; otherwise let a = a + 1 and turn to 5.1.2;
5.2 the performance defect detection module groups Out according to the test samples to obtain the grouped test result set G = {Group_(1,1), Group_(1,2), ..., Group_(1,Y), ..., Group_(z,y), ..., Group_(N′,Y)};
5.3, the performance defect detection module judges whether the software to be detected has defects by using a hypothesis test method according to the performance expectation label set L of the configuration item set C and the grouped test result set G:
5.3.1 initializing variable z=1;
5.3.2 initializing variable y=1;
5.3.3 if Lab_z = Label_1, set the hypothesis to be tested H_0: R_a ≤ R_a′, where the value of c_z in t_a is 0, the value of c_z in t_a′ is 1, and 1 ≤ a′ ≤ W; turn to 5.3.8;
5.3.4 if Lab_z = Label_2, set the hypothesis to be tested H_0: R_a ≤ R_a′, where the value of c_z in t_a is greater than the value of c_z in t_a′; turn to 5.3.8;
5.3.5 if Lab_z = Label_3, set the hypothesis to be tested H_0: R_a ≤ R_a′, where the value of c_z in t_a is less than the value of c_z in t_a′; turn to 5.3.8;
5.3.6 if Lab_z = Label_4, set the hypothesis to be tested H_0: 5·R_a ≤ R_a′, where the value of c_z in t_a is 1 and the value of c_z in t_a′ is 0; turn to 5.3.8;
5.3.7 if Lab_z = Label_5, set the hypothesis to be tested H_0: R_a ≠ R_a′; turn to 5.3.8;
5.3.8 when the hypothesis test result shows that H_0 is rejected, i.e., the rejection probability is greater than or equal to 1 − β, it indicates that the software has a performance defect related to configuration item c_z, and the test command triggering the defect is b_y; β is the hypothesis testing parameter, a positive real number smaller than 1;
5.3.9 if y = Y, turn to 5.3.10; if not, let y = y + 1 and turn to 5.3.3;
5.3.10 if z=n', the detection is ended; if not, let z=z+1, turn 5.3.2.
2. The method for detecting software performance defects based on configuration item performance expectation as claimed in claim 1, wherein the method for constructing the training set in step 2.1 is: N configuration items are randomly selected from 10,000 configuration items of 12 kinds of software, namely MySQL, MariaDB, Apache-httpd, Apache-Tomcat, Apache-Derby, H2, PostgreSQL, GCC, Clang, MongoDB, RocksDB, and Squid.
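Steps 2.3 and 2.4 of claim 1 preprocess the training descriptions and mine frequent subsequences from them; a minimal sketch follows. The third-party prefixspan package, the NLTK tagger, and the toy synonym map are all assumptions, not parts of the claimed method.

```python
# Sketch of steps 2.3-2.4: preprocess descriptions into <POS, synonym>
# tokens, then mine frequent subsequences per label with PrefixSpan.
# Assumes `pip install prefixspan nltk` and that the NLTK
# averaged-perceptron tagger data has been downloaded.
from prefixspan import PrefixSpan
import nltk

DOMAIN_SYNONYMS = {"cache": "buffer", "speed": "performance"}  # toy map

def preprocess(description):
    """Turn a raw description into a sequence of (POS, synonym) pairs."""
    tokens = description.lower().split()
    return [(pos, DOMAIN_SYNONYMS.get(tok, tok))
            for tok, pos in nltk.pos_tag(tokens)]

# S_1: preprocessed descriptions whose label is Label_1 (toy data)
S1 = [preprocess("enable the query cache to improve speed"),
      preprocess("turn on the cache optimization switch")]

# P_1: frequent subsequences of S_1 with minimum support 2
for support, pattern in PrefixSpan(S1).frequent(2):
    print(support, pattern)
```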
3. A method for detecting software performance defects based on configuration item performance expectation as claimed in claim 1, wherein the method of step 2.6 for filtering the frequent subsequences in P_1, P_2, ..., P_5 according to their confidence is:
2.6.1 initializing variable l=1;
2.6.2 initializing variable q=1;
2.6.3 if Confidence_(l,q) > 1/5, where 5 is the number of expectation label kinds, put p_(l,q) into the set P_l′;
2.6.4 judge whether q equals Q_l; if yes, turn to 2.6.5; if not, let q = q + 1 and turn to 2.6.3;
2.6.5 judge whether l equals 5; if so, the filtered frequent subsequence sets P_1′, P_2′, P_3′, P_4′, P_5′ have been obtained; if not, let l = l + 1 and turn to 2.6.2.
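A minimal sketch of the confidence computation of step 2.5 and the 1/5 filter of step 2.6.3, assuming descriptions and subsequences are stored as token lists; the helper names and the data layout are assumptions.

```python
# Sketch of confidence computation (step 2.5) and filtering (step 2.6).
# S[l] holds the preprocessed descriptions labelled Label_l; P[l] holds the
# frequent subsequences mined for Label_l.

def is_subsequence(pattern, seq):
    """True if pattern occurs in seq as an order-preserving subsequence."""
    it = iter(seq)
    return all(token in it for token in pattern)  # `in` consumes the iterator

def matches(p, descriptions):
    """Number of descriptions that p matches (each counts as one match)."""
    return sum(1 for d in descriptions if is_subsequence(p, d))

NUM_LABELS = 5  # number of expectation label kinds

def filtered_subsequences(P, S):
    """Keep p in P[l] iff its confidence exceeds 1/NUM_LABELS (step 2.6.3)."""
    P_filtered = {}
    for l in range(1, NUM_LABELS + 1):
        kept = []
        for p in P[l]:
            own = matches(p, S[l])
            total = sum(matches(p, S[k]) for k in range(1, NUM_LABELS + 1))
            confidence = own / total if total else 0.0
            if confidence > 1 / NUM_LABELS:
                kept.append((p, confidence))
        P_filtered[l] = kept
    return P_filtered
```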
4. A method of software performance defect detection based on configuration item performance expectations according to claim 1, wherein the threshold δ in step 2.7.3 is 99%-99.9%.
5. The method for detecting software performance defects based on configuration item performance expectations according to claim 1, wherein the method by which the configuration item expectation prediction module constructs the weighted voting classifier in step 2.8 is: the input of the weighted voting classifier is the preprocessed description <POS_1, DS_1>, <POS_2, DS_2>, ... of any configuration item whose expectation label is to be predicted, abbreviated as x; the output is the votes obtained by the 5 expectation labels, and the performance expectation label of x is the expectation label that obtains the highest vote; the vote of category l is the sum of the confidences of the frequent subsequences in P_l″ that are subsequences of x, indexed by r_x with 1 ≤ r_x ≤ 100; the weighted voting classifier outputs a vote five-tuple, recorded as Votes(x) = [Votes_1(x), Votes_2(x), Votes_3(x), Votes_4(x), Votes_5(x)];
where Votes_l(x) means: the sum of the confidences of the frequent subsequences in P_l″ that satisfy "is a subsequence of x"; if the elements in Votes(x) are not all 0, find the element with the maximum value in Votes(x); the index l of that element is the index of the performance expectation label of x, namely Label_l; if Votes(x) = [0, 0, 0, 0, 0], the performance expectation label of x is empty.
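A sketch of this weighted voting classifier, assuming each P_l″ is stored as a list of (subsequence, confidence) pairs; the names and data layout are illustrative.

```python
# Sketch of the weighted voting classifier of claim 5. The data layout
# (P2[l] as a list of (subsequence, confidence) pairs) is an assumption.

def is_subsequence(pattern, seq):
    """True if pattern occurs in seq as an order-preserving subsequence."""
    it = iter(seq)
    return all(token in it for token in pattern)

def classify(x, P2, num_labels=5):
    """Predict the expectation-label index for preprocessed description x.

    Votes(x) collects, per label l, the summed confidence of the frequent
    subsequences in P_l'' that are subsequences of x; the label with the
    maximum vote wins, and an all-zero vote tuple yields an empty label.
    """
    votes = [sum(conf for p, conf in P2[l] if is_subsequence(p, x))
             for l in range(1, num_labels + 1)]
    if all(v == 0 for v in votes):
        return None  # Votes(x) = [0, 0, 0, 0, 0]: label is empty
    return 1 + votes.index(max(votes))
```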
6. The method of claim 1, wherein the test sample generation module extracts the grammar type and the range of values of the software configuration item in C by using a Spex algorithm in step 4.1.
7. The method for detecting software performance defects based on configuration item performance expectations as recited in claim 1, wherein the method by which the test sample generation module of step 4.2 generates the set of values to be measured V for the configuration item set C = {c_1, c_2, ..., c_z, ..., c_N′} is:
4.2.1 initializing variable z=1;
4.2.3 if c_z is Boolean type, let V_z = {0, 1}; turn to 4.2.7;
4.2.4 if c_z is enumeration type, let V_z be the set of all possible values of c_z extracted by the Spex algorithm; turn to 4.2.7;
4.2.6 if c_z is numerical type, sample the value range of c_z as follows: record the minimum and maximum values of c_z as Min and Max, and let V_z = {Min, 10·Min, 10²·Min, Max, 10⁻¹·Max, 10⁻²·Max}; turn to 4.2.7;
4.2.7 if z = N′, end; otherwise let z = z + 1 and turn to 4.2.2.
8. The method for detecting software performance defects based on configuration item performance expectations according to claim 1, wherein the performance test tool in step 4.4 is sysbench or apache-benchmark, and the parameters of the performance test tool include concurrency, load type, data table size, number of data tables, read operation proportion, and write operation proportion.
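A sketch of how the test commands of step 4.4 could be assembled for sysbench. The option names follow common sysbench OLTP usage and, like the tiny hand-written pair-wise sample, are assumptions; a real implementation would generate the combinations with a covering-array tool.

```python
# Sketch of step 4.4 for sysbench (claim 8): turn pair-wise sampled
# parameter combinations into test command strings.

param_combos = [  # assumed pair-wise sample of (concurrency, load, table size)
    (8,  "oltp_read_only",  10_000),
    (8,  "oltp_write_only", 100_000),
    (64, "oltp_read_only",  100_000),
    (64, "oltp_write_only", 10_000),
]

B = [
    f"sysbench {load} --threads={threads} --table-size={size} run"
    for threads, load, size in param_combos
]
for cmd in B:
    print(cmd)
```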
9. A method for detecting software performance defects based on configuration item performance expectation as claimed in claim 1, wherein said A in the fifth step is 10.
10. The method for detecting software performance defect based on configuration item performance expectation as claimed in claim 1, wherein the method for grouping Out by the performance defect detection module according to the test sample in step 5.2 is as follows:
5.2.1 initializing variable a=1;
5.2.2 if [t_a, R_a] has been grouped, let a = a + 1 and turn to 5.2.2; otherwise turn to 5.2.3;
5.2.3 group [t_a, R_a] according to the configuration item values and the test command in t_a: for [t_a, R_a] and [t_a′, R_a′] in {[t_1, R_1], ..., [t_a, R_a], ..., [t_W, R_W]}, if t_a and t_a′ simultaneously satisfy the following 3 conditions, then [t_a, R_a] and [t_a′, R_a′] are put into the same group:
condition 1: t_a and t_a′ differ in the value of only one configuration item c_z, 1 ≤ z ≤ N′;
condition 2: t_a and t_a′ contain the same test command b_y, 1 ≤ y ≤ Y;
condition 3: [t_a, R_a] and [t_a′, R_a′] have not been grouped;
let the number of [t_a′, R_a′] that together with [t_a, R_a] satisfy the above conditions be Num_a, and divide them, together with [t_a, R_a], into one group, denoted Group_(z,y), where 1 ≤ a′ ≤ W and a′, a are positive integers;
5.2.4 if a = W, the grouping is completed and the grouped test result set G = {Group_(1,1), Group_(1,2), ..., Group_(1,Y), ..., Group_(z,y), ..., Group_(N′,Y)} is obtained; turn to 5.3; otherwise let a = a + 1 and turn to 5.2.2.
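A sketch of the grouping of step 5.2 (claim 10): two results fall into the same Group_(z,y) when their test samples share the test command and differ only in the value of a single configuration item c_z. The dictionary key replaces the claim's sequential "not yet grouped" bookkeeping; the data layout is an assumption.

```python
# Sketch of the grouping of step 5.2 (claim 10).
from collections import defaultdict

def group_results(out, config_names):
    """out: list of ((command, values_dict), R_a) pairs -> grouped dict."""
    groups = defaultdict(list)
    for (cmd, values), r in out:
        for z, name in enumerate(config_names):
            # Key: the command plus all values except c_z, so entries that
            # differ only in c_z (and share cmd) collide into Group_(z,y).
            rest = tuple(sorted((k, v) for k, v in values.items() if k != name))
            groups[(z, cmd, rest)].append(((cmd, values), r))
    return groups
```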
11. The method for detecting software performance defects based on configuration item performance expectation as claimed in claim 1, wherein the size of Num_a is related to the type of c_z: if c_z is Boolean type, Num_a = 2; if enumeration type, Num_a = K_z; if numerical type, Num_a = 6; if string type, Num_a = 1.
12. The method for detecting software performance defects based on configuration item performance expectations according to claim 1, wherein the verification principle of the hypothesis testing used by said performance defect detection module in step 5.3 is: if the expectation label of any configuration item c_z is Label_1, Label_2, or Label_3, it is specified that the expected effect of c_z on the software is a performance improvement, so if the actual test result is a performance drop, the software has a performance defect; if the expectation label of c_z is Label_4, it is specified that the expected effect is a reasonable performance drop, so if the actual test result is a severe performance drop, the software has a performance defect; if the expectation label of c_z is Label_5, it is specified that the expected effect is unchanged performance, so if the actual test result is a performance drop, the software has a performance defect.
13. A method for detecting software performance defects based on configuration item performance expectation according to claim 1, wherein the hypothesis test parameter β in step 5.3 is 0.05.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010610996.3A CN111611177B (en) | 2020-06-29 | 2020-06-29 | Software performance defect detection method based on configuration item performance expectation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111611177A CN111611177A (en) | 2020-09-01 |
CN111611177B true CN111611177B (en) | 2023-06-09 |
Family
ID=72200573
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010610996.3A Active CN111611177B (en) | 2020-06-29 | 2020-06-29 | Software performance defect detection method based on configuration item performance expectation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111611177B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112131108B (en) * | 2020-09-18 | 2024-04-02 | 电信科学技术第十研究所有限公司 | Feature attribute-based test strategy adjustment method and device |
CN114756865B (en) * | 2022-04-24 | 2024-08-13 | 安天科技集团股份有限公司 | RDP file security detection method and device, electronic equipment and storage medium |
CN114780411B (en) * | 2022-04-26 | 2023-04-07 | 中国人民解放军国防科技大学 | Software configuration item preselection method oriented to performance tuning |
CN115562645B (en) * | 2022-09-29 | 2023-06-09 | 中国人民解放军国防科技大学 | Configuration fault prediction method based on program semantics |
CN116225965B (en) * | 2023-04-11 | 2023-10-10 | 中国人民解放军国防科技大学 | IO size-oriented database performance problem detection method |
CN116561002B (en) * | 2023-05-16 | 2023-10-10 | 中国人民解放军国防科技大学 | Database performance problem detection method for I/O concurrency |
CN116560998B (en) * | 2023-05-16 | 2023-12-01 | 中国人民解放军国防科技大学 | I/O (input/output) sequence-oriented database performance problem detection method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104407971A (en) * | 2014-11-18 | 2015-03-11 | 中国电子科技集团公司第十研究所 | Method for automatically testing embedded software |
CN106201871A (en) * | 2016-06-30 | 2016-12-07 | 重庆大学 | Based on the Software Defects Predict Methods that cost-sensitive is semi-supervised |
CN106528417A (en) * | 2016-10-28 | 2017-03-22 | 中国电子产品可靠性与环境试验研究所 | Intelligent detection method and system of software defects |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7398469B2 (en) * | 2004-03-12 | 2008-07-08 | United Parcel Of America, Inc. | Automated test system for testing an application running in a windows-based environment and related methods |
US8494832B2 (en) * | 2007-06-20 | 2013-07-23 | Sanjeev Krishnan | Method and apparatus for software simulation |
US8140319B2 (en) * | 2008-02-05 | 2012-03-20 | International Business Machines Corporation | Method and system for predicting system performance and capacity using software module performance statistics |
CN107066389A (en) * | 2017-04-19 | 2017-08-18 | 西安交通大学 | The Forecasting Methodology that software defect based on integrated study is reopened |
Also Published As
Publication number | Publication date |
---|---|
CN111611177A (en) | 2020-09-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||