CN110442514A - Method for realizing defect repair recommendation based on learning algorithm - Google Patents

Method for realizing defect repair recommendation based on learning algorithm

Info

Publication number
CN110442514A
CN110442514A CN201910623765.3A CN201910623765A CN110442514A CN 110442514 A CN110442514 A CN 110442514A CN 201910623765 A CN201910623765 A CN 201910623765A CN 110442514 A CN110442514 A CN 110442514A
Authority
CN
China
Prior art keywords
defect
ast
sentence
code
repair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910623765.3A
Other languages
Chinese (zh)
Other versions
CN110442514B (en)
Inventor
孙小兵
朱轩锐
李斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou University
Original Assignee
Yangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangzhou University
Priority to CN201910623765.3A
Publication of CN110442514A
Application granted
Publication of CN110442514B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for realizing defect repair recommendation based on a learning algorithm, comprising the following steps: for the collected source code before and after each bug fix, extracting the abstract syntax tree (AST) with GumTree to obtain the AST edit operation sequences of the pre-fix and post-fix code; screening and filtering the AST edit operation sequences; using the filtered AST edit operation sequences, abstracting the pre-fix and post-fix source code with a parser and mapping each to a vector feature representation; training a neural network on the mapped vector feature representations to obtain a defective-statement identification model and thereby identify defective statements; and, based on the semantic features of the source code, recommending a repair scheme for the identified defective statements. Based on the AST edit operations between code versions, the method builds the model's feature representation through fine-grained code analysis and locates the defective code from its context, so it can produce fine-grained repair recommendations and make the repair more accurate.

Description

Method for realizing defect repair recommendation based on learning algorithm
Technical field
The invention belongs to the field of software maintenance, and in particular relates to a method for realizing defect repair recommendation based on a learning algorithm.
Background
Repairing defects is a very time-consuming part of software work. As software products grow in scale and complexity, defects become unavoidable. Deviations in understanding the requirements, an unsound development process, or a developer's lack of experience can all introduce software defects. When developers face a large number of defects, recommending a repair scheme based on the defective code can greatly improve the efficiency with which they repair those defects.
Automated program repair is a research hotspot in the software maintenance field, and researchers in China and abroad have studied it in depth. Existing automated program repair methods can be broadly divided into test-case-based methods and other methods. The other methods assess the correctness of candidate patches using information such as precondition/postcondition contracts or defect reports. When developers do not have enough time to repair all defects manually, automated program repair can generate temporary patches for some defective programs; developers can then consult these temporary patches and manually improve their quality. Automated repair methods are currently quite limited and demand considerable expertise from developers, and confirming an acceptable repair or transformation takes a great deal of time. The patches they generate may also be unacceptable to programmers: according to [Qi, Z., Long, F., Achour, S., and Rinard, M. An analysis of patch plausibility and correctness for generate-and-validate patch generation systems. ISSTA '15], most of the patches reported by techniques that realize repair by deleting functional blocks or by overloading in the test cases are incorrect.
Deep learning is now widely used in defect localization, defect prediction, and defect repair. Deep-learning-based automated defect repair commonly generates patches by learning code transformations or by learning from historical commits. Although this can produce sufficiently correct code-transformation variants or patches without manual selection, repair accuracy is only about 45%, which is still low, and repairing a large volume of defect data remains difficult. For the repair of real defective code, automated repair through transformation of the faulty code still needs further study.
Summary of the invention
The purpose of the present invention is to provide a defect repair recommendation method that offers developers repair schemes for defects and improves the efficiency and quality of their defect repair.
The technical solution that achieves this purpose is a method for realizing defect repair recommendation based on a learning algorithm, comprising the following steps:
Step 1: for the collected source code before and after each bug fix, extract the abstract syntax tree (AST) with GumTree and obtain the AST edit operation sequences of the pre-fix and post-fix code;
Step 2: screen and filter the AST edit operation sequences;
Step 3: using the filtered AST edit operation sequences, abstract the pre-fix and post-fix source code with a parser and map each to a vector feature representation;
Step 4: train a neural network on the mapped vector feature representations to obtain a defective-statement identification model, and use it to identify defective statements;
Step 5: based on the semantic features of the source code, recommend a repair scheme for the defective statements identified in step 4.
Compared with the prior art, the present invention has the following notable advantages: 1) extracting the AST edit operation sequences of the code with GumTree provides an accurate data source and ensures that every training sample is valid for the model; 2) an RNN encoder-decoder model is trained jointly: the input sequence is converted into a vector representation and then decoded into an output sequence, and training the whole structure end to end lets the model imitate many different AST operations and generate candidate patches, giving the method wide applicability; 3) when a repair scheme is recommended, the defective code is located from the context of the statement's semantics, which yields fine-grained repair recommendations and makes the repair more accurate; 4) working at the AST level makes the granularity of the suggested repair finer.
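The RNN encoder-decoder named in advantage 2) can be pictured with a minimal sketch. It is an illustration only, not the patented model: the class name `TinySeq2Seq`, the vocabulary and hidden sizes, and the random untrained weights are all assumptions, and plain Python lists stand in for a deep learning framework. The sketch shows how an input sequence of token IDs is folded into one hidden vector and how a decoder unrolls that vector into an output sequence.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; untrained, used only to illustrate the data flow."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    """One-hot encoding of a token ID."""
    return [1.0 if i == index else 0.0 for i in range(size)]


class TinySeq2Seq:
    """Hypothetical Elman-style RNN encoder-decoder with tanh units."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.enc_in = make_matrix(hidden_size, vocab_size, rng)
        self.enc_rec = make_matrix(hidden_size, hidden_size, rng)
        self.dec_in = make_matrix(hidden_size, vocab_size, rng)
        self.dec_rec = make_matrix(hidden_size, hidden_size, rng)
        self.out = make_matrix(vocab_size, hidden_size, rng)

    def _step(self, w_in, w_rec, token, hidden):
        """One recurrent step: h' = tanh(W_in * x + W_rec * h)."""
        from_input = matvec(w_in, one_hot(token, self.vocab_size))
        from_state = matvec(w_rec, hidden)
        return [math.tanh(a + b) for a, b in zip(from_input, from_state)]

    def encode(self, tokens):
        """Fold the whole input sequence into a single hidden vector."""
        hidden = [0.0] * self.hidden_size
        for token in tokens:
            hidden = self._step(self.enc_in, self.enc_rec, token, hidden)
        return hidden

    def decode(self, context, start_token, max_len):
        """Greedily unroll the context vector into an output token sequence."""
        hidden, token, output = context, start_token, []
        for _ in range(max_len):
            hidden = self._step(self.dec_in, self.dec_rec, token, hidden)
            scores = matvec(self.out, hidden)
            token = max(range(self.vocab_size), key=scores.__getitem__)
            output.append(token)
        return output


model = TinySeq2Seq(vocab_size=6, hidden_size=4)
context = model.encode([1, 3, 2])  # vector representation of the input sequence
candidate = model.decode(context, start_token=0, max_len=3)
```

With trained weights, `candidate` would be read back as a sequence of abstracted code tokens; here the weights are random, so only the shapes and the flow of data are meaningful.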
The present invention is described in further detail below with reference to the accompanying drawings.
Description of the drawings
Fig. 1 is a flow chart of the method of the present invention for realizing defect repair recommendation based on a learning algorithm.
Fig. 2 shows the AST edit operation sequence of the pre-fix code in an embodiment of the present invention.
Fig. 3 shows the AST edit operation sequence of the post-fix code in an embodiment of the present invention.
Detailed description
With reference to Fig. 1, the method of the present invention for realizing defect repair recommendation based on a learning algorithm comprises the following steps:
Step 1: for the collected source code before and after each bug fix, extract the abstract syntax tree (AST) with GumTree and obtain the AST edit operation sequences of the pre-fix and post-fix code;
Step 2: screen and filter the AST edit operation sequences;
Step 3: using the filtered AST edit operation sequences, abstract the pre-fix and post-fix source code with a parser and map each to a vector feature representation;
Step 4: train a neural network on the mapped vector feature representations to obtain a defective-statement identification model, and use it to identify defective statements;
Step 5: based on the semantic features of the source code, recommend a repair scheme for the defective statements identified in step 4.
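Step 1 can be illustrated with a rough sketch. GumTree is a tree differencing tool and the embodiment works on Java code; the sketch below does not call GumTree. As a stated simplification, it parses Python snippets with the standard `ast` module, flattens each tree into a pre-order list of node type names, and diffs the two lists with `difflib.SequenceMatcher`. A flat sequence diff is weaker than a real tree diff (it cannot detect moved subtrees, for example), and the function names `node_types` and `edit_operations` are hypothetical.

```python
import ast
import difflib


def node_types(source):
    """Pre-order list of AST node type names for a Python snippet."""
    names = []

    def visit(node):
        names.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return names


def edit_operations(buggy_source, fixed_source):
    """Approximate edit operations between the pre-fix and post-fix ASTs.

    Each operation is (kind, removed node types, added node types), where
    kind is 'replace', 'delete', or 'insert' as reported by difflib.
    """
    before = node_types(buggy_source)
    after = node_types(fixed_source)
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    operations = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            operations.append((kind, before[i1:i2], after[j1:j2]))
    return operations


BUGGY = "if x > 0:\n    y = f(x)\n"
FIXED = "if x >= 0:\n    y = f(x)\n"

for operation in edit_operations(BUGGY, FIXED):
    print(operation)
```

For this pair the only difference is the comparison operator node, which is the kind of fine-grained change the edit operation sequences are meant to capture.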
Preferably, the node types of the AST extracted from the source code in step 1 include:
(1) method invocation and class instance creation nodes;
(2) method declaration, type declaration, and enum declaration nodes;
(3) control-flow nodes, including while statements, catch clauses, if statements, and throw statements.
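The node categories listed above can be sketched as a simple filter over an AST. The patent targets Java; purely as an analogy, the sketch below uses Python's standard `ast` module, where `ast.Call` covers both calls and instance creation, `ast.Raise` plays the role of throw, and `ast.ExceptHandler` plays the role of catch. Python has no dedicated enum declaration node, so that category is not represented. The mapping `KEPT_NODE_TYPES` and the function `select_nodes` are hypothetical names.

```python
import ast

# Python stand-ins for the Java node categories (an assumption, see above).
KEPT_NODE_TYPES = {
    "invocation": (ast.Call,),
    "declaration": (ast.FunctionDef, ast.ClassDef),
    "control_flow": (ast.While, ast.If, ast.Raise, ast.ExceptHandler),
}


def select_nodes(source):
    """Return (category, node type, line) for every kept node, in source order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        for category, types in KEPT_NODE_TYPES.items():
            if isinstance(node, types):
                found.append(
                    (node.lineno, node.col_offset, category, type(node).__name__)
                )
    return [(category, name, line) for line, _, category, name in sorted(found)]


SAMPLE = """\
def check(x):
    if x < 0:
        raise ValueError(x)
    return abs(x)
"""

for row in select_nodes(SAMPLE):
    print(row)
```

Restricting the tree to these node types keeps the later edit operation sequences short and focused on calls, declarations, and control flow.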
Preferably, the screening and filtering in step 2 specifically filters out AST edit operation sequences that contain syntax errors and those whose frequency of occurrence is below a set threshold.
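A minimal sketch of this filtering step follows. The data layout is an assumption: each sample is a pair of an edit operation sequence (a tuple, so it can be counted) and a flag saying whether the underlying code had a syntax error. The default threshold of 3 follows the value given in the embodiment later in the description; the function name `filter_sequences` is hypothetical.

```python
from collections import Counter


def filter_sequences(samples, min_count=3):
    """Keep sequences that parse cleanly and occur at least `min_count` times.

    samples: list of (edit_sequence, has_syntax_error) pairs, where
    edit_sequence is a hashable tuple of edit operations.
    """
    valid = [sequence for sequence, has_error in samples if not has_error]
    counts = Counter(valid)
    return [sequence for sequence in valid if counts[sequence] >= min_count]


samples = (
    [(("UPDATE", "Gt", "GtE"), False)] * 3   # frequent and valid: kept
    + [(("DELETE", "Call"), False)]          # valid but rare: dropped
    + [(("INSERT", "If"), True)] * 3         # frequent but has syntax errors: dropped
)
kept = filter_sequences(samples)
```

Counting is done after the syntax-error check, so erroneous samples do not inflate the frequency of a sequence.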
Further, abstracting the pre-fix and post-fix source code with a parser and mapping each to a vector feature representation in step 3 specifically comprises:
Step 3-1: generate a token stream from the source code using a lexer;
Step 3-2: feed the token stream into the parser, which generates a unique ID for each identifier or literal in the source code and maps it.
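Steps 3-1 and 3-2 can be sketched as follows. This is an assumption-laden illustration, not the parser used in the patent: a regular expression stands in for a real Java lexer, `JAVA_KEYWORDS` is only a partial keyword list, the `ID_n` naming scheme is invented for the example, and only identifiers (not literals) are abstracted. The last helper turns the abstracted tokens into an integer vector through a growing vocabulary.

```python
import re

# Partial keyword list, for illustration only.
JAVA_KEYWORDS = {
    "if", "else", "while", "for", "return", "new", "int", "null",
    "try", "catch", "throw", "class", "void", "public", "private", "static",
}
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||\S")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def tokenize(source):
    """Step 3-1 (simplified): split source text into a token stream."""
    return TOKEN_PATTERN.findall(source)


def abstract(tokens, mapping=None):
    """Step 3-2 (simplified): replace each identifier with a unique ID.

    Passing the same `mapping` for the pre-fix and post-fix code keeps the
    IDs consistent between the two versions.
    """
    mapping = {} if mapping is None else mapping
    abstracted = []
    for token in tokens:
        if IDENTIFIER.fullmatch(token) and token not in JAVA_KEYWORDS:
            if token not in mapping:
                mapping[token] = f"ID_{len(mapping) + 1}"
            abstracted.append(mapping[token])
        else:
            abstracted.append(token)
    return abstracted, mapping


def to_vector(tokens, vocabulary):
    """Map tokens to integer indices, extending the vocabulary as needed."""
    return [vocabulary.setdefault(token, len(vocabulary)) for token in tokens]


shared_ids, vocabulary = {}, {}
buggy_tokens, _ = abstract(tokenize("if (count > limit) return count;"), shared_ids)
fixed_tokens, _ = abstract(tokenize("if (count >= limit) return count;"), shared_ids)
buggy_vector = to_vector(buggy_tokens, vocabulary)
fixed_vector = to_vector(fixed_tokens, vocabulary)
```

Abstracting identifiers this way shrinks the vocabulary the neural network has to learn, since project-specific names collapse into a small set of IDs.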
Further, recommending a repair scheme for the defective statement based on the semantic features of the code in step 5 specifically comprises:
According to the context of the defective statement in the source code and the fine-grained repair modes of Table 1 below, recommend the corresponding repair scheme: either the X of the defective statement has a problem and needs repair, or the Y of the X of the defective statement has a problem and needs repair, where X and Y are each any one of the numbered items of Table 1;
Table 1. Bug repair modes
The present invention is described in further detail below with reference to an embodiment.
Embodiment
With reference to Fig. 1, the method of the present invention for realizing defect repair recommendation based on a learning algorithm comprises the following steps:
1. Collect the code before the bug fix (buggy files) and after the fix (fixed files) from GitHub, extract the AST of each with GumTree, and obtain the AST edit operation sequences of the pre-fix and post-fix code. In this embodiment, the AST edit operation sequence extracted for the pre-fix code is shown in Fig. 2 and that for the post-fix code in Fig. 3.
2. Screen and filter the AST edit operation sequences, filtering out those that contain syntax errors and those whose frequency of occurrence is below the set threshold (in this embodiment, fewer than 3 occurrences).
3. Using the filtered AST edit operation sequences, abstract the pre-fix and post-fix source code with a Java parser and map each to a vector feature representation, as shown in Table 2 below:
Table 2. Code mapping results before and after the fix
4. Train a neural network on the mapped vector feature representations to obtain a defective-statement identification model, and use it to identify defective statements, as shown in Table 3 below:
Table 3. Input and output example of the defective-statement identification model
5. Based on the semantic features of the source code, recommend a repair scheme for the defective statements identified in step 4 above. The repair scheme in this embodiment is:
This defect has a problem in the argument (item 9) of the if body (item 2); modification is suggested.
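The way a recommendation is phrased from the repair modes can be sketched as below. Table 1 itself is not reproduced in this text, so the dictionary `REPAIR_MODES` contains only the two items that the embodiment names (item 2, the if body, and item 9, the argument); every other entry is unknown here and is deliberately left out. The function name `recommend` and the wording of the messages are hypothetical.

```python
# Only the two repair-mode items mentioned in the embodiment are listed;
# the remaining rows of Table 1 are not available in this text.
REPAIR_MODES = {
    2: "if body",
    9: "argument",
}


def recommend(x, y=None):
    """Build a recommendation of the form 'the X' or 'the Y of the X'."""
    outer = REPAIR_MODES[x]
    if y is None:
        return f"The defect lies in the {outer}; modification is suggested."
    inner = REPAIR_MODES[y]
    return f"The defect lies in the {inner} of the {outer}; modification is suggested."


message = recommend(2, 9)
print(message)
```

In a fuller version, `x` and `y` would be derived from where the identified defective statement sits in the AST rather than passed in by hand.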
Based on the AST edit operations between code versions, the method of the present invention builds the model's feature representation through fine-grained code analysis and locates the defective code from its context, so it can produce fine-grained repair recommendations and make the repair more accurate.

Claims (6)

1. A method for realizing defect repair recommendation based on a learning algorithm, characterized by comprising the following steps:
Step 1: for the collected source code before and after each bug fix, extract the abstract syntax tree (AST) with GumTree and obtain the AST edit operation sequences of the pre-fix and post-fix code;
Step 2: screen and filter the AST edit operation sequences;
Step 3: using the filtered AST edit operation sequences, abstract the pre-fix and post-fix source code with a parser and map each to a vector feature representation;
Step 4: train a neural network on the mapped vector feature representations to obtain a defective-statement identification model, and use it to identify defective statements;
Step 5: based on the semantic features of the source code, recommend a repair scheme for the defective statements identified in step 4.
2. The method for realizing defect repair recommendation based on a learning algorithm according to claim 1, characterized in that the node types of the AST extracted from the source code in step 1 include:
(1) method invocation and class instance creation nodes;
(2) method declaration, type declaration, and enum declaration nodes;
(3) control-flow nodes, including while statements, catch clauses, if statements, and throw statements.
3. The method for realizing defect repair recommendation based on a learning algorithm according to claim 1, characterized in that the screening and filtering in step 2 specifically filters out AST edit operation sequences that contain syntax errors and those whose frequency of occurrence is below a set threshold.
4. The method for realizing defect repair recommendation based on a learning algorithm according to claim 1, characterized in that abstracting the pre-fix and post-fix source code with a parser and mapping each to a vector feature representation in step 3 specifically comprises:
Step 3-1: generate a token stream from the source code using a lexer;
Step 3-2: feed the token stream into the parser, which generates a unique ID for each identifier or literal in the source code and maps it.
5. The method for realizing defect repair recommendation based on a learning algorithm according to claim 1, characterized in that the neural network in step 4 is specifically a recurrent neural network (RNN).
6. The method for realizing defect repair recommendation based on a learning algorithm according to claim 1, characterized in that recommending a repair scheme for the defective statement based on the semantic features of the code in step 5 specifically comprises:
According to the context of the defective statement in the source code and the fine-grained repair modes of Table 1 below, recommend the corresponding repair scheme: either the X of the defective statement has a problem and needs repair, or the Y of the X of the defective statement has a problem and needs repair, where X and Y are each any one of the numbered items of Table 1;
Table 1. Bug repair modes
CN201910623765.3A 2019-07-11 2019-07-11 Method for realizing defect repair recommendation based on learning algorithm Active CN110442514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910623765.3A CN110442514B (en) 2019-07-11 2019-07-11 Method for realizing defect repair recommendation based on learning algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910623765.3A CN110442514B (en) 2019-07-11 2019-07-11 Method for realizing defect repair recommendation based on learning algorithm

Publications (2)

Publication Number Publication Date
CN110442514A 2019-11-12
CN110442514B 2024-01-12

Family

ID=68430178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910623765.3A Active CN110442514B (en) 2019-07-11 2019-07-11 Method for realizing defect repair recommendation based on learning algorithm

Country Status (1)

Country Link
CN (1) CN110442514B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104412327A (en) * 2013-01-02 2015-03-11 默思股份有限公司 Built in self-testing and repair device and method
CN105045719A (en) * 2015-08-24 2015-11-11 中国科学院软件研究所 Method and device for predicting regression test failure on basis of repair deficiency change
CN106445804A (en) * 2016-08-24 2017-02-22 北京奇虎测腾安全技术有限公司 Source code cloud detection system and method based on serialization intermediate representation
US20170315968A1 (en) * 2016-04-27 2017-11-02 Melissa Boucher Unified document surface
CN109299007A (en) * 2018-09-18 2019-02-01 哈尔滨工程大学 A kind of defect repair person's auto recommending method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111459491A (en) * 2020-03-17 2020-07-28 南京航空航天大学 Code recommendation method based on tree neural network
CN111897946A (en) * 2020-07-08 2020-11-06 扬州大学 Vulnerability patch recommendation method, system, computer equipment and storage medium
CN111897946B (en) * 2020-07-08 2023-09-19 扬州大学 Vulnerability patch recommendation method, vulnerability patch recommendation system, computer equipment and storage medium
CN114416421A (en) * 2022-01-24 2022-04-29 北京航空航天大学 Automatic positioning and repairing method for code defects
CN114416421B (en) * 2022-01-24 2024-05-31 北京航空航天大学 Automatic positioning and repairing method for code defects
CN115951892A (en) * 2022-11-08 2023-04-11 北京交通大学 Program patch generating method based on expression

Also Published As

Publication number Publication date
CN110442514B (en) 2024-01-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant