CN115017035A - Method for constructing deep learning defect data set by using Docker container - Google Patents

Info

Publication number
CN115017035A
Authority
CN
China
Prior art keywords
defect
version
repair
commit
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210535022.2A
Other languages
Chinese (zh)
Inventor
梁云凯
冯志勇
李晓红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202210535022.2A
Publication of CN115017035A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3664 Environments for testing or debugging software
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3684 Test management for test design, e.g. generating new test cases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to a method for constructing a deep learning defect data set by using a Docker container. The method constructs gDefects4dl, a defect data set of deep learning applications, which aims to help developers better understand defects in deep learning applications and to evaluate the effectiveness of deep learning defect localization and defect repair techniques developed by others.

Description

Method for constructing deep learning defect data set by using Docker container
Technical Field
The invention belongs to the technical field of software engineering and deep learning, and particularly relates to a method for constructing a deep learning defect data set by using a Docker container.
Background
Deep learning models and programs are widely used in various software systems, which inevitably leads to an increasing demand for testing, debugging and repairing techniques thereof. Evaluating automated testing, debugging, and even repair techniques requires a known set of defects. To ensure study reproducibility, studies should be validated on similar, published data. In the absence of a carefully constructed defect data set, researchers must manually collect reproducible defects from an open source repository, which is a very time consuming process.
At present, researchers have constructed defect data sets for deep learning models or application studies, in order to perform empirical studies and develop methods and tools for automatically locating or repairing defects.
Despite this work, existing deep learning defect data sets lack a detailed description of the cause of each defect. For example, some repairs in a data set load additional pre-trained models or add more activation-function layers. Without a well-defined specification, it is difficult to know whether such repairs and the root causes of the defects generalize to other deep learning projects.
Moreover, the defects in existing data sets generally cannot be reproduced, or the reproduction process is too tedious, which makes the data sets harder to use. The repairs of some defects are also too complicated and irregular, so collecting and analyzing them does not help research on automatic defect localization and repair.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a method for constructing a deep learning defect data set by using a Docker container, which allows developers to study defects in deep learning and to evaluate the effectiveness of the defect localization and defect repair techniques they propose. The invention constructs the data set gDefects4dl, a defect data set of deep learning applications, which aims to help developers better understand defects in deep learning applications and to evaluate the effectiveness of deep learning defect localization and defect repair techniques developed by others.
The technical problem to be solved by the invention is realized by the following technical scheme:
A method for constructing a deep learning defect data set by using a Docker container, characterized by comprising the following steps:
1) commit tracking and screening
Issues and commits are in a many-to-many relationship. To obtain the true repair, all commits related to an issue need to be tracked and screened: if only one commit corresponds to the issue, that commit is considered to be the repair for the issue; if several commits correspond to the issue, the commit that is the true repair is identified manually;
2) test case construction and enhancement
Once the commit that repairs the issue has been determined, a test case is constructed manually that passes on the commit version and raises an exception on the version before the commit; the version before the commit is denoted c-1 and the commit version is denoted c; from the two commits c and c-1, two branches are created with Git, where the branch containing c-1 is called the buggy branch and the branch containing c is called the fix branch, and a test case is added to each branch;
where the test case of the bug version reports an error, the traceback module is added to the test case; Python's traceback module provides a standard interface for extracting, formatting and printing the stack traces of a Python program;
where the test case of the bug version does not report an error, an assert is added to the test case to distinguish the bug version from the fix version; a Python assert evaluates an expression and raises an exception when the expression is false, so the error is reported as soon as the condition required by the program is not met, rather than after the program has run and crashed;
for the test case of the fix version, in order to show the difference from the bug version more intuitively, the test case of the fix version of each defect additionally outputs "success";
3) environmental isolation
Generally, the operating environment of each defect is different, and some defects can only be reproduced in a specific environment, such as a specific data set, a specific library version, or a specific operating system configuration; therefore, after test case enhancement, environment isolation is required, and it is provided in two versions;
Conda version:
in this version, a folder is built for each defect, which provides scripts to create the Conda environment, download the data set and dependency libraries (including the integrated tools), and execute the test cases; a user who wants to reproduce a defect only needs to run the script;
Docker version:
in this version, a separate Docker image is built for each defect, with all the environments the defect requires pre-installed in the image, so that a user can reproduce the defect in Docker; after the container is started, the project is stored in the metadata folder under the home directory, and the shell scripts for running the test cases are stored in the script directory;
4) meta information extraction and augmentation
The defect meta-information provides the basic information of a defect, through which a user can conveniently obtain information related to the defect; therefore, meta-information is attached to each defect, such as the defect name (named as "library name + issue number"), the manually defined defect type, the exception information, the issue URL, the commit URL, the defect description, the support degree, and the reason why the defect generalizes, where the support degree indicates how many defects in the data set have an error cause and repair pattern similar or identical to those of the defect; the support degree directly or indirectly reflects the generalization ability of a class of defects, and can also be regarded as a degree of universality;
meanwhile, two repair indexes, the localization difficulty and the repair difficulty, and one repair operator are defined for each defect, where the localization difficulty is the distance between the code position at which running the bug-version test case raises the exception and the position of the repaired code;
if the file to which the exception information points differs from the file containing the modification, the tree directory structure of the files is obtained, the nearest common ancestor (LCA) of the two files is found, and the smaller of the distances from the two files to that ancestor is taken as the localization difficulty; the nearest common ancestor is computed with the Tarjan algorithm;
if the file to which the exception information points is the same as the file containing the modification, the localization difficulty lscore is calculated by formula 4-1, where line is the difference between the line number of the code raising the exception and the line number of the repair code:
[Formula 4-1 appears as an image in the original publication and is not reproduced in this text]
the repair difficulty is mainly determined by the number of modified strings and the length of the modified strings, and the repair difficulty rscore is calculated by formula 4-2,
[Formula 4-2 appears as an image in the original publication and is not reproduced in this text]
wherein: clines is the number of modified strings;
slength is the length of the modified strings;
the repair operator indicates how many operation steps the repair requires; if only one step needs to be modified, the value of the operator is 1;
5) tool integration
The defect data set gDefects4dl of deep learning applications is constructed to provide a benchmark data set for research on deep learning defect localization and defect repair techniques; to give the data set wider application, gDefects4dl provides extensible interfaces and integrates three defect localization tools, ShapeFlow, DEBAR and GRIST, to support the evaluation of defects in the data set; the Docker image of each defect integrates the three tools, and the scripts of the Conda environment download them automatically;
ShapeFlow is a dynamic abstract interpreter for TensorFlow that can quickly capture tensor shape mismatch errors; ShapeFlow shares the same API as TensorFlow, but it only captures and tracks the shapes of tensors; ShapeFlow constructs a custom shape computation graph, similar to the computation graph used by TensorFlow; ShapeFlow does not require the programmer to annotate or modify code;
DEBAR is a static analysis method based on abstract interpretation, which is used for detecting numerical errors in neural network architectures and comprises two abstraction techniques, one for tensors and one for numerical values;
GRIST is a method for exposing numerical errors in deep learning through gradient back propagation; it uses a dynamic technique to automatically generate a small input that exposes numerical errors in a deep learning program, and uses the built-in gradient computation of deep learning frameworks to identify the numerical errors.
Moreover, in step 1), whether a commit is a true repair is identified in two ways:
first, the discussion in the issue is analyzed manually: if information related to the repair is mentioned, the commits are screened by comparing the code with the problem description; if no information related to the code repair is available, the judgment relies on professional knowledge;
second, different branches are created from the commits through the version control mechanism, an operating environment is constructed for each branch in Docker or Conda, and test cases are added; if the commit is not a true repair, the test case of the branch raises an exception, and if the test case of the branch executes normally, the commit is shown to be a true repair.
The invention has the advantages and beneficial effects that:
1. The method for constructing a deep learning defect data set by using a Docker container allows developers to study defects in deep learning and to evaluate the effectiveness of the defect localization and defect repair techniques they propose.
2. The method constructs the data set gDefects4dl, a defect data set of deep learning applications, which aims to help developers better understand defects in deep learning applications and to evaluate the effectiveness of deep learning defect localization and defect repair techniques developed by others.
Drawings
FIG. 1 is a flow chart of the overall construction of the data set gDefects4dl according to the present invention;
FIG. 2 is a diagram illustrating a file structure of a Conda version defect according to the present invention;
FIG. 3 is a schematic diagram of a defect list in the Docker warehouse defects4dl according to the present invention;
FIG. 4 is a schematic view of the internal structure of the container of the present invention;
FIG. 5 is a schematic diagram of the naming mode of the Shell script of the present invention;
FIG. 6 is a diagram illustrating the writing form of the Shell script according to the present invention;
FIG. 7 is a flow chart of the present invention for calculating the difficulty of positioning;
FIG. 8 is an overall architectural view of the data set gDefects4dl tool of the present invention;
FIG. 9 is a diagram of a main page and common commands of the command line tool of the present invention;
FIG. 10 is a diagram of the main page of the JavaWeb version tool of the present invention;
FIG. 11 is a functional diagram of a command line tool for retrieving defect lists in accordance with the present invention;
FIG. 12 is a functional diagram of a keyword search for a command line tool according to the present invention;
FIG. 13 is a diagram of the Docker image download function of the command line tool of the present invention;
FIG. 14 is a functional diagram of a command line tool for checking defect details according to the present invention;
FIG. 15 is a schematic diagram of defect detail information of JavaWeb version of the present invention;
FIG. 16 is a similar defect jump function diagram of the present invention;
FIG. 17 is a schematic diagram of a GitHub issue page of the present invention;
FIG. 18 is a functional diagram of a command line version diff of the present invention;
FIG. 19 is a JavaWeb version diff interface diagram of the present invention;
FIG. 20 is a functional diagram of a command line version test case execution according to the present invention;
FIG. 21 is a JavaWeb version test case execution function diagram of the present invention;
FIG. 22 is a functional diagram of the use of the integration tool in the data set tool of the present invention.
Detailed Description
The present invention is further illustrated by the following specific examples, which are intended to be illustrative, not limiting and are not intended to limit the scope of the invention.
The gDefects4dl data set constructed by the method focuses on collecting defects with generalization ability, that is, defects whose repair patterns are universal. Whether a defect generalizes is judged in two ways. First, the defect class is specified according to the specification it violates, such as an implicit API precondition or type/shape compatibility. Second, a support degree is provided for each defect, indicating how many defects in the data set have similar or identical error causes and repair patterns. In summary, the goals of gDefects4dl are as follows:
General applicability (Generality): the defects in the data set gDefects4dl are selected based on what deep learning application defects have in common. When the data set is constructed, defects that depend on a specific application are avoided, which ensures the generalization ability of the defects in the data set.
Authenticity (Authenticity): each defect in the data set gDefects4dl is selected from closed issues in popular GitHub repositories. Thus, each defect naturally records all the discussion and code evolution history associated with it.
Reproducible (Reproducible): each defect in the data set gDefects4dl is reproducible, the results of multiple runs do not differ fundamentally, and a user can view the repair process of the code, the details of the repair, and the evolution history of the code.
Isolated Environment (Isolated Environment): an independent operating environment is built for each defect in two ways, Docker and Conda, so that the user does not have to build the environment through a tedious procedure and environment configuration errors are avoided. In addition, a test case is provided for each defect and enhanced to distinguish the bug version from the fix version.
Tool Extension (Tool Extension): the data set gDefects4dl supports various programmable interfaces and integrates various tools. Currently gDefects4dl integrates three defect detection tools, ShapeFlow [10], DEBAR [13] and GRIST [12]. More deep learning defect localization and defect repair tools may be integrated in the future.
The invention provides a method for constructing a deep learning defect data set by using a Docker container. FIG. 1 shows the overall construction process of the data set gDefects4dl. The process starts from closed issues in popular deep learning repositories and isolates reproducible defects of deep learning applications. Each defect consists of meta-information (including defect type, exception information, defect description and the like) and a runtime environment (in two forms, Conda and Docker, for reproducing the defect). The specific implementation details are as follows:
1. Commit tracking and screening
There is a many-to-many relationship between issues and commits. To obtain the true repair (i.e., the commit that repairs the issue), all the commits associated with the issue must be tracked and screened. If only one commit corresponds to the issue, that commit is considered to be the repair for the issue. If several commits correspond to the issue, the commit that is the true repair must be identified manually.
There are two main ways to identify whether a commit is a true repair. The first is to analyze the discussion in the issue manually. If information related to the repair is mentioned, the commits are screened by comparing the code with the problem description. If no information related to the code repair is available, the judgment relies on professional knowledge. The second is to create different branches from the commits through the version control mechanism, construct an operating environment for each branch in Docker or Conda, and add test cases. If the commit is not a true repair, the test case of the branch raises an exception; if the test case of the branch executes normally, the commit is shown to be a true repair.
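The screening rule and the test-based check described above can be sketched as follows. This is a minimal illustration, not the tooling used to build gDefects4dl: the `Issue` class, the function names, and the idea of representing a branch test case as a Python callable are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Issue:
    """Hypothetical record of an issue and the commits linked to it."""
    number: int
    linked_commits: List[str]


def screen_commits(issue: Issue) -> Optional[str]:
    """Return the repair commit when it can be decided automatically.

    Exactly one linked commit: it is taken as the repair.
    Zero or several linked commits: None, meaning manual identification is needed.
    """
    if len(issue.linked_commits) == 1:
        return issue.linked_commits[0]
    return None


def is_true_repair(branch_test: Callable[[], None]) -> bool:
    """Second way: a candidate is accepted if its branch test runs without an exception."""
    try:
        branch_test()
    except Exception:
        return False
    return True
```

A candidate that `screen_commits` cannot decide would then be checked by hand, or by running its branch test case through `is_true_repair`.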
To ensure that the defects in the data set generalize, this is verified by manually checking the issue and the commit (the commit that repairs the issue). If another defect in the library has a similar or identical error cause and repair pattern, the defect is considered to generalize. If there is no similar defect in the library but the defect has a definite repair pattern, three deep learning researchers were asked to help with the identification.
2. Test case construction and enhancement
Once the commit that repairs the issue (denoted here as c) has been determined, a test case is constructed manually that passes on version c and raises an exception on version c-1 (the commit before c). For the extensibility of gDefects4dl, we backed up the entire Git repository of each defect to facilitate more software engineering tasks in the future. From the two commits c and c-1, two branches are created with Git: the branch containing c-1 is called the buggy branch (named "library name + issue number + -buggy" in GitHub), and the branch containing c is called the fix branch (named "library name + issue number + -fix" in GitHub). A test case is added to each branch. Not every bug version raises an exception when executed: depending on how the test case is constructed, some bug versions run without an exception but produce a result that differs from that of the fix version. Therefore, the bug-version test cases are enhanced in two ways.
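The branch creation step can be illustrated with a small helper that only builds the Git command lines. It is a sketch under assumptions: the function name is hypothetical, the library name and issue number are concatenated without a separator, and c-1 is taken to be the first parent of c (written `c~1` in Git revision syntax).

```python
from typing import List


def branch_commands(library: str, issue: int, fix_commit: str) -> List[List[str]]:
    """Build `git branch` commands for the buggy and fix branches of one defect.

    The buggy branch starts at the first parent of the repair commit (c~1),
    the fix branch starts at the repair commit itself (c).
    """
    prefix = f"{library}{issue}"  # assumed form of "library name + issue number"
    return [
        ["git", "branch", f"{prefix}-buggy", f"{fix_commit}~1"],
        ["git", "branch", f"{prefix}-fix", fix_commit],
    ]


for command in branch_commands("tensorflow", 12345, "abc1234"):
    print(" ".join(command))
```

Each command list could be passed to `subprocess.run(command, check=True, cwd=repository_path)` inside a clone of the repository; the example values above are placeholders, not defects from the data set.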
Where the bug-version test case reports an error, the traceback module is added to the test case to make the exception information clearer. Python's traceback module provides a standard interface to extract, format and print the stack traces of a Python program. It exactly mimics the behavior of the Python interpreter when it prints a stack trace. With the traceback module, more detailed exception information can be obtained.
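A minimal sketch of this enhancement is shown below. The wrapper `run_bug_test` and the toy failing test are hypothetical; only `traceback.format_exc()` is taken from the Python standard library.

```python
import traceback
from typing import Callable, Optional


def run_bug_test(test: Callable[[], None]) -> Optional[str]:
    """Run a bug-version test case and return the formatted stack trace, if any."""
    try:
        test()
    except Exception:
        # format_exc() renders the active exception the way the interpreter would.
        report = traceback.format_exc()
        print(report)
        return report
    return None


def shape_mismatch_test() -> None:
    """Toy stand-in for a bug-version test case that raises an exception."""
    rows = [[1, 2, 3], [4, 5]]
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError(f"inconsistent row widths: {sorted(widths)}")
```

Calling `run_bug_test(shape_mismatch_test)` prints the full trace, including the file and line where the exception was raised, instead of only the final error message.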
Where the bug-version test case does not report an error, an assert is added to the test case to distinguish the bug version from the fix version. A Python assert evaluates an expression and raises an exception if the expression is false. The assertion reports the error as soon as the condition required by the program is not met, instead of waiting until the program has run and crashed, which saves time and resources.
For the test case of the fix version, in order to show the difference from the bug version more intuitively, the test case of the fix version of each defect additionally outputs "success".
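The two enhancements above can be sketched together. The functions `normalize` and `buggy_normalize` are invented toy examples of a silent defect (a wrong result without an exception); they are not taken from the data set.

```python
from typing import Callable, List


def normalize(values: List[float]) -> List[float]:
    """Fix version (toy example): scale the values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def buggy_normalize(values: List[float]) -> List[float]:
    """Bug version (toy example): divides by the count, so no exception is raised."""
    return [v / len(values) for v in values]


def run_test(fn: Callable[[List[float]], List[float]]) -> str:
    """Shared test case: the assert separates the two versions."""
    result = fn([1.0, 1.0, 2.0])
    assert abs(sum(result) - 1.0) < 1e-9, "normalized values must sum to 1"
    print("success")  # printed only when the assertion holds, i.e. on the fix version
    return "success"
```

`run_test(normalize)` prints "success", while `run_test(buggy_normalize)` raises `AssertionError`. Note that Python skips assert statements when run with the `-O` option, so such test cases should be run without it.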
3. Environmental isolation
Typically, the operating environment for each defect is different. Some defects can only be reproduced under specific circumstances, such as a specific data set, a specific version of a library, a specific operating system configuration, and the like. Thus, after test case enhancement, the present study provides two ways to isolate the environment.
The Conda version:
In this version, we build a folder for each defect, which provides scripts to create the Conda environment, download the data set and dependency libraries (including the integrated tools), and execute the test cases. A user who wants to reproduce a defect only needs to run the script. As shown in FIG. 2, each defect has a folder named after the defect (library name + issue number). The folder contains a bug folder, a fix folder, and a txt file named commit address, which holds source information related to the defect, such as the issue URL and the commit URL. The bug folder has the same structure as the fix folder and contains a requirements file and a Python script file. The Python script automatically constructs the Conda environment, clones the related repository from GitHub, downloads the dependencies of the test case, and runs the test case.
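The steps such a script performs might look like the following sketch, which only assembles the command lines. Everything specific here is an assumption for illustration: the function name, the environment naming, the Python version, and the file names `requirements.txt` and `test_case.py` are not taken from the data set.

```python
from typing import List


def conda_repro_commands(defect: str, version: str, repo_url: str, branch: str,
                         python_version: str = "3.7") -> List[List[str]]:
    """Build the commands that reproduce one version (bug or fix) of a defect."""
    env = f"{defect}-{version}"  # hypothetical environment and checkout name
    return [
        # 1. create an isolated Conda environment
        ["conda", "create", "--yes", "--name", env, f"python={python_version}"],
        # 2. clone the branch that holds this version of the code
        ["git", "clone", "--branch", branch, repo_url, env],
        # 3. install the dependencies of the test case inside the environment
        ["conda", "run", "--name", env, "pip", "install", "-r", "requirements.txt"],
        # 4. run the test case inside the environment
        ["conda", "run", "--name", env, "python", "test_case.py"],
    ]
```

Keeping the commands as data makes them easy to log or inspect before running them one by one, for example with `subprocess.run(command, check=True)`.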
Docker version:
In this version, we build a separate Docker image for each defect, with all the environments the defect requires pre-installed in the image, so that the user can reproduce the defect in Docker. As shown in FIG. 3, all Docker images are stored in the defects4dl repository. Each Docker image is named in the form "library name + issue number", so the user can tell the source of the defect from the name. The image can be pulled with the following command (the defect name here is the image name):
docker pull defects4dl/<defect name>
After the image has been pulled, an interactive shell is entered with the following command (the defect name here is also the image name):
docker run -it <defect name> /bin/bash
Alternatively, the image can be run with a custom container name by the following command (the defect name here is the image name):
docker run --name <new container name> -it -d defects4dl/<defect name>
After the container is started, its internal file directory is as shown in FIG. 4. The project is stored in the metadata folder under the home directory. The shell scripts for running the test cases are stored in the script directory and, as shown in FIG. 5, are named "library name + issue number + -version.sh" (where version is either bug or fix). The form in which the scripts are written is shown in FIG. 6.
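The Docker commands and the script naming rule can be generated from a defect name, as in the sketch below. The helper names are hypothetical, the repository-qualified image name is used for every command as an assumption, and the accepted version strings ("buggy" and "fix") are a guess at the exact suffixes.

```python
from typing import Dict, List


def docker_commands(defect: str, container_name: str) -> Dict[str, List[str]]:
    """Build the pull and run commands for the image of one defect."""
    image = f"defects4dl/{defect}"
    return {
        "pull": ["docker", "pull", image],
        "shell": ["docker", "run", "-it", image, "/bin/bash"],
        "detached": ["docker", "run", "--name", container_name, "-it", "-d", image],
    }


def script_name(defect: str, version: str) -> str:
    """Name of the test script in the script directory: <defect>-<version>.sh."""
    if version not in ("buggy", "fix"):  # assumed suffixes
        raise ValueError(f"unknown version: {version}")
    return f"{defect}-{version}.sh"


commands = docker_commands("keras42", "my-defect")
print(" ".join(commands["pull"]))
print(script_name("keras42", "fix"))
```

The defect name "keras42" is a placeholder; real image names are those listed in the defects4dl repository.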
Isolating the environment with Docker containers occupies more hard disk space, so the Conda approach is provided for users with less disk capacity. The Docker approach is intended to ease installation and future maintenance for defects that depend on the system configuration.
4. Meta information extraction and augmentation
The defect meta-information provides the basic information of a defect, through which a user can conveniently obtain information related to the defect. We therefore attach meta-information to each defect, such as the defect name (named "library name + issue number"), the manually defined defect type, the exception information, the issue URL, the commit URL, the defect description, the support degree, and the reason why the defect generalizes.
The support degree indicates how many defects in the data set have an error cause and repair pattern similar or identical to those of the defect. The support degree directly or indirectly reflects the generalization ability of a class of defects, and can also be regarded as a degree of universality.
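One way to picture the support degree is as a count over labelled defects. The sketch below is an assumption-laden illustration: the `DefectMeta` fields are a subset of the meta-information listed above, "similar or identical" is approximated by exact equality of manually assigned labels, and the defect itself is excluded from its own count.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DefectMeta:
    """Hypothetical subset of the meta-information of one defect."""
    name: str
    defect_type: str
    error_cause: str
    repair_pattern: str


def support_degrees(defects: List[DefectMeta]) -> Dict[str, int]:
    """For each defect, count the other defects with the same cause and repair pattern."""
    groups = Counter((d.error_cause, d.repair_pattern) for d in defects)
    return {d.name: groups[(d.error_cause, d.repair_pattern)] - 1 for d in defects}
```

In practice the judgment of similarity is made by people, as the text describes; the exact-match grouping here only shows how the count is derived once defects have been labelled.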
Meanwhile, two repair indexes, the localization difficulty and the repair difficulty, and one repair operator are defined for each defect. The localization difficulty is the distance between the code position at which the bug-version test case raises the exception and the position of the repaired code. Note that an exception may point to more than one location, and a repair may modify more than one location. For convenience of calculation, only the smallest distance is taken as the localization difficulty of the defect. The localization difficulty is calculated as follows, and the flow is shown in FIG. 7.
If the file to which the exception information points differs from the file containing the modification, the tree directory structure of the files is obtained, the nearest common ancestor (LCA) of the two files is found, and the smaller of the distances from the two files to that ancestor is taken as the localization difficulty. The nearest common ancestor is computed with the Tarjan algorithm.
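The cross-file case can be sketched for a single pair of files as follows. This example does not use Tarjan's algorithm, which the text names; for two paths in one directory tree it finds the same ancestor by comparing path prefixes. The function name and the choice to measure distance in path components are assumptions.

```python
from pathlib import PurePosixPath


def lca_distance(exception_file: str, fix_file: str) -> int:
    """Smaller of the two distances from each file to their nearest common ancestor.

    Both paths are taken relative to the repository root, and distance is
    counted in path components (tree edges).
    """
    a = PurePosixPath(exception_file).parts
    b = PurePosixPath(fix_file).parts
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    # len(parts) - common is the number of edges from a file up to the ancestor
    return min(len(a) - common, len(b) - common)


print(lca_distance("lib/ops/conv.py", "lib/utils/shape/check.py"))
```

In this example the nearest common ancestor is `lib`; `conv.py` is two levels below it and `check.py` three, so the function returns 2. Tarjan's offline algorithm becomes worthwhile when many file pairs are queried against the same tree at once.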
If the file to which the exception information points is the same as the file containing the modification, the localization difficulty lscore is calculated by formula 4-1, where line is the difference between the line number of the code raising the exception and the line number of the repair code.
[Formula 4-1 appears as an image in the original publication and is not reproduced in this text]
The repair difficulty is mainly determined by the number of modified strings and the length of the modified strings. The repair difficulty rscore is calculated by formula 4-2, where clines is the number of modified strings and slength is the length of the modified strings.
[Formula 4-2 appears as an image in the original publication and is not reproduced in this text]
The repair operator indicates how many steps the repair requires; if only one step needs to be modified, the value of the operator is 1.
5. Tool integration
The defect data set gDefects4dl of the deep learning application is constructed to provide a reference data set for the study of deep learning defect localization and defect repair technology. To make the data set gDefects4dl more versatile, gDefects4dl provides some extensible interfaces and integrates three defect localization tools, shareflow, DEBAR, GRIST, to support the evaluation of defects in the data set. Each defective Docker image integrates these three tools. Scripts of the Conda environment will also automatically download these defect location tools.
ShapeFlow is a dynamic abstract interpreter of TensorFlow that can quickly capture tensor shape mismatch errors. ShapeFlow shares the same API as TensorFlow, but it only captures and recognizes the shape of the tensor. ShapeFlow constructs custom shape computation graphs similar to the computation graphs used by TensorFlow. ShapeFlow does not require code annotation or code modification by a programmer and is therefore convenient to use.
DEBAR is a static analysis method based on abstract interpretation that detects numerical errors in neural network architectures. It comprises two abstraction techniques, one for tensors and one for numerical values.
GRIST is a method that exposes numerical errors in deep learning programs through gradient back-propagation. It uses a dynamic technique to automatically generate a small input that exposes a numerical error in the deep learning program, and it identifies numerical errors by using the built-in gradient computation of deep learning frameworks.
To facilitate the use of the defect data set gDefects4dl, two tools are provided: a command line version and a Java Web version. FIG. 8 illustrates the overall architecture of the data set gDefects4dl tool, which consists of a user interface layer, an abstract interface layer, and an isolation environment layer.
At the isolation environment layer, an isolated environment is constructed for each defect by using a Docker container and Conda. Both environments support a version control mechanism (implemented via Git) and the integrated detection tools (ShapeFlow, GRIST, and DEBAR).
At the abstract interface layer, the data set gDefects4dl tool provides a set of programming interfaces for downloading the image of a defect, finding and viewing specific defects, viewing the repair history, reproducing specific defects, and running the detection tools that support a defect.
At the user interface layer, the data set gDefects4dl tool provides two user interfaces, a command line version and a Java Web version. The Java Web version lets the user browse the overall information of each defect. In the Java Web version, the user can retrieve defects by error type (e.g., API misuse, shape mismatch), framework (e.g., TensorFlow or PyTorch), error message, dynamic/static classification, and supporting tool (e.g., DEBAR or ShapeFlow). The user can obtain the information of a defect by setting up the Conda environment or by pulling the corresponding Docker image (the Docker images are stored on DockerHub). The command line version provides the same functions as the Java Web version. Each function of the data set tool is packaged as an API that can be called by the user. The Javadoc of these APIs can be viewed at http://47.93.14.147:9000/javadoc. Details of the installation, deployment, and use of the tool are as follows:
1. gDefects4dl deployment and startup
The user can download the project from https://github.com/llmhyy/defects4dl to a local directory and import it into IntelliJ IDEA or another code editor, then wait for Maven to download the dependency packages. Alternatively, the user can download the defects4dl.zip package. Running the project has two main environment requirements: the JDK version must be 1.8 or later, and a Docker environment must be installed.
If the user runs the project through a code editor, the entry point of the project is AppENTER; running AppENTER starts the command line version and the Java Web version at the same time. If the user has the defects4dl.zip package, the project can be started with java -jar defects4dl.jar.
After the project has started, the main interface of the command line version is entered; entering the help command lists the commands of the defect data set gDefects4dl tool, as shown in FIG. 9. Entering http://127.0.0.1:9000/bugList in a browser opens the home page of the Java Web version, as shown in FIG. 10. The home page is divided into an upper part and a lower part. The upper part gives a general overview of the defects, including the total number of defects in the data set, the total number of defect categories, and the number of defects in each category. The lower part contains three drop-down boxes and the defect information. The three drop-down boxes filter the defects by defect type, error message, and dynamic/static classification, respectively. Below the drop-down boxes is the summary information of each defect, mainly the ID (the defect name, in the form "library name + issue number"), the error message, the category, the defect description, and two operation buttons. Clicking the Detail button displays the details of a defect, and clicking the Pull button pulls the Docker image of the defect.
2. Defect name retrieval
In the command line version, the user can list all defects in the data set with the ls command, as shown in FIG. 11. Each defect is named by combining the name of the project it belongs to with the number of its issue. As shown in FIG. 12, the user can search for defects containing a given keyword with the ls grep command. In the Java Web version, the summary information, such as the defect names shown in FIG. 10, is visible on the home page.
3. gDefects4dl image pull
The user can download the Docker image of a single defect through the pullOneBug command, as shown in FIG. 13. All Docker images can be downloaded through the pullBug command. In the Java Web version, each entry of the defect list on the home page has a Pull button (see FIG. 10); clicking it downloads the Docker image of the corresponding defect.
The pullOneBug and pullBug commands also start the images automatically, since both commands encapsulate the start function. If the user has already downloaded the Docker images, a single Docker container can be started through the startOneBug command, or all Docker containers can be started at once through the runBug command.
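As a rough illustration of what a pull-and-start command might wrap, the sketch below builds the two Docker CLI invocations for one defect. The repository name `example/gdefects4dl`, the use of the defect name as the image tag, and the function name `pull_and_start` are assumptions made for this example; the actual image naming on DockerHub is not stated in the text.

```python
import subprocess


def pull_and_start(defect_name, repository="example/gdefects4dl", dry_run=True):
    """Return (and optionally run) the Docker commands for one defect image.

    The repository name and the tag scheme are hypothetical placeholders.
    """
    image = f"{repository}:{defect_name}"
    commands = [
        ["docker", "pull", image],                              # download the image
        ["docker", "run", "-d", "--name", defect_name, image],  # start a container
    ]
    if not dry_run:
        for argv in commands:
            subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return commands


for argv in pull_and_start("keras2672"):
    print(" ".join(argv))
```

With `dry_run=True` the function only returns the command lists, which makes the behavior easy to inspect on a machine without Docker.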
4. Defect detail information query
After the Docker image has been downloaded, the detailed information of a defect can be viewed. As shown in FIG. 14, the command line version displays the details of a defect through the info command: the defect name (library name + issue number), the defect type, the dynamic/static classification (1 denotes dynamic, 0 denotes static), the issue URL, the commit URL, the error message, the positioning difficulty, the repair difficulty (shown in the form "repair difficulty + operator"), the support degree (i.e., the number of similar or identical defects in the data set), the names of similar defects, the tools capable of detecting the defect, the defect description, and the commit times of the bug version and the fix version.
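The fields listed above suggest a simple record type. The following sketch shows one possible in-memory representation together with a small filter; the class `DefectInfo`, the function `search`, the field names, and all sample values (including the placeholder URLs) are invented for illustration and do not come from the data set.

```python
from dataclasses import dataclass, field


@dataclass
class DefectInfo:
    name: str                      # library name + issue number
    defect_type: str               # manually defined defect type
    dynamic: int                   # 1 = dynamic, 0 = static
    issue_url: str
    commit_url: str
    error_message: str
    positioning_difficulty: float
    repair_difficulty: float
    operator: int                  # number of repair steps
    support: int                   # similar or identical defects in the data set
    similar: list = field(default_factory=list)
    tools: list = field(default_factory=list)
    description: str = ""


def search(defects, defect_type=None, tool=None, keyword=None):
    """Return the defects matching every filter that is not None."""
    matches = []
    for defect in defects:
        if defect_type is not None and defect.defect_type != defect_type:
            continue
        if tool is not None and tool not in defect.tools:
            continue
        if keyword is not None and keyword.lower() not in defect.name.lower():
            continue
        matches.append(defect)
    return matches


# Made-up sample records, used only to exercise the filter.
SAMPLE = [
    DefectInfo("libA101", "shape mismatch", 1, "https://example.invalid/issue/101",
               "https://example.invalid/commit/a", "ValueError", 1.0, 2.0, 1, 3,
               tools=["ShapeFlow"]),
    DefectInfo("libB7", "numerical error", 0, "https://example.invalid/issue/7",
               "https://example.invalid/commit/b", "nan loss", 2.0, 1.0, 2, 1,
               tools=["DEBAR", "GRIST"]),
]

print([d.name for d in search(SAMPLE, tool="DEBAR")])
```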
In the Java Web version, clicking the Detail button in FIG. 10 opens the defect detail page. As shown in FIG. 15, the page displays the defect details, including the defect name (library name + issue number), the defect type, the error message, the support degree, the positioning difficulty, the repair difficulty (shown in the form "repair difficulty + operator"), the names of similar defects, the description, and several buttons. The blue text box at the bottom outputs the execution results of the test cases and the detection tools.
The Java Web version also provides a function for viewing similar defects. As shown in FIG. 15, the defect details have a Similar Bugs attribute, and the blue text in the pane that follows lists the defects similar to the current defect (i.e., those with similar or identical error causes and repair patterns). Clicking the name of a similar defect jumps to its detail page. FIG. 16 shows the page reached after clicking keras2672.
5. Defect source information review
The source information refers to the origin of the defect. To make it easy for the user to view it, the tool provides a defect source information viewing function. As shown in FIG. 15 or FIG. 16, the detail page has several buttons, among them GitHub_issue and GitHub_commit. The defects in the data set gDefects4dl actually exist on GitHub, which holds their source information. The Java Web version provides two jump functions: clicking GitHub_issue jumps to the GitHub issue page, and clicking GitHub_commit jumps to the GitHub commit page. For example, clicking the GitHub_issue button leads to the page shown in FIG. 17.
6. Diff queries
In the data set gDefects4dl tool, the Diff function is implemented with the version control mechanism Git. Through the Diff function, a user can view the version of the defect before repair, the version after repair, the changes to the code, the evolution history of the repair, and the core modification associated with the defect. Some repair commits address several different problems at once; in that case the change is large, and merely showing the modified content is far from enough, so the tool also identifies the core modification. In the command line version, the user views this information through the diff command, as shown in FIG. 18. The output is marked in different colors: red marks modified content (any code that changed between the pre-repair and post-repair versions is shown in red), and yellow marks the core modification associated with the defect. In the Java Web version, clicking the Diff button jumps to the Diff interface shown in FIG. 19, which displays, from top to bottom, the number of modified files, the names of the modified files, the number of modified code lines in each file, and the detailed modifications. Red boxes represent deleted portions and green boxes represent added portions. The marker "× corefix ×" denotes the core modification.
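To make the before/after comparison concrete, the sketch below produces a unified diff between a bug version and a fix version of a small snippet. It uses Python's standard `difflib` module as a stand-in for Git, which is what the tool itself relies on; the function name `diff_versions`, the file name `model.py`, and the two code snippets are made up for the example.

```python
import difflib


def diff_versions(bug_source, fix_source, filename="model.py"):
    """Return unified-diff lines between the bug and fix versions of one file."""
    return list(
        difflib.unified_diff(
            bug_source.splitlines(),
            fix_source.splitlines(),
            fromfile=f"bug/{filename}",
            tofile=f"fix/{filename}",
            lineterm="",
        )
    )


# Invented example: the fix guards a division against an empty batch.
BUG = "total = sum(losses)\nmean = total / len(losses)\n"
FIX = "total = sum(losses)\nmean = total / max(len(losses), 1)\n"

for line in diff_versions(BUG, FIX):
    print(line)
```

Lines starting with `-` correspond to deleted portions and lines starting with `+` to added portions, matching the red and green boxes described for the Java Web version.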
7. Test case execution
An independent running environment is constructed for each defect through Docker and Conda, and each defect has an enhanced test case, so the user can directly run the test cases of the bug version and the fix version in the gDefects4dl tool. The command line version executes the test case through the test command; as shown in FIG. 20, once the test case is started, the result is output to the command line in real time. In the Java Web version, as shown in FIG. 21, clicking the fail test button executes the bug version test case, and clicking the pass test button executes the fix version test case. After being clicked, a button becomes non-clickable and does not return to the clickable state until the test case has finished executing. The execution result of the Java Web version test case is also obtained in real time: the page polls, reading the running result every second and displaying it on the front-end page.
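The polling behavior can be sketched as follows. This is only an illustration of the idea (run the test case in a child process, write its output to a log file, and read any new output at a fixed interval); the function name `run_and_poll`, the log file name, and the callback parameter `on_update` are assumptions, and the actual tool is a Java Web application whose implementation is not shown in the text.

```python
import os
import subprocess
import sys
import tempfile
import time


def run_and_poll(argv, interval=1.0, on_update=print):
    """Run `argv`, polling its combined output every `interval` seconds."""
    chunks = []
    with tempfile.TemporaryDirectory() as workdir:
        log_path = os.path.join(workdir, "output.log")
        # Separate write and read handles keep the two file positions independent.
        with open(log_path, "wb") as sink, open(log_path, "rb") as source:
            process = subprocess.Popen(argv, stdout=sink, stderr=subprocess.STDOUT)
            while True:
                finished = process.poll() is not None
                data = source.read()  # everything written since the last poll
                if data:
                    text = data.decode("utf-8", errors="replace")
                    chunks.append(text)
                    on_update(text)
                if finished:
                    break
                time.sleep(interval)
    return process.returncode, "".join(chunks)


code, output = run_and_poll([sys.executable, "-c", "print('success')"], interval=0.1)
```

Checking whether the process has finished before the read, rather than after it, ensures that the final poll still collects any output written just before exit.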
8. Use of detection tools
The gDefects4dl tool currently integrates three defect localization tools to evaluate these defects: ShapeFlow, which detects shape mismatch defects, and the static tool DEBAR and the dynamic tool GRIST, both of which detect numerical errors. As shown in FIG. 22, taking DEBAR as an example, clicking the corresponding button executes the corresponding detection tool. If the defect is supported by a tool, that tool's button is green; otherwise it is gray. As with test case execution, the latest output is obtained in real time by polling; the button is non-clickable during execution and is restored after execution finishes.
Although the embodiments of the present invention and the accompanying drawings are disclosed for illustrative purposes, those skilled in the art will appreciate that: various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims, and therefore the scope of the invention is not limited to the disclosure of the embodiments and the accompanying drawings.

Claims (2)

1. A method for constructing a deep learning defect data set by using a Docker container, characterized in that the method comprises the following steps:
1) commit tracking and screening
Issues and commits are in a many-to-many relationship, so all commits related to an issue need to be tracked and screened in order to obtain the true repair; if only one commit corresponds to the issue, that commit is considered to be the repair for the issue; if several commits correspond to the issue, which commit is the true repair is identified manually;
2) test case construction and enhancement
Once the commit that repairs the issue has been determined, a test case needs to be constructed manually that passes on the commit version and behaves abnormally on the version before the commit; the version before the commit is defined as c-1 and the commit version as c; from the two commits c and c-1, two branches are created using Git, the branch where c-1 is located being called the bug branch and the branch where c is located being called the fix branch, and a test case is added to each branch;
for the case where the test case of the bug version reports an error, the traceback module is added to the test case; the traceback module of Python provides a standard interface to extract, format and print stack traces of Python programs;
for the case where the test case of the bug version does not report an error, an assert is added to the test case to distinguish the bug version from the fix version; the Python assert statement evaluates an expression and raises an exception when the expression is false, so that an error is returned directly when a condition required by the program is not met, without waiting for the program to crash later in its run;
for the test case of the fix version, in order to show the difference from the bug version more intuitively, the fix-version test case of each defect additionally outputs a 'success' message;
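The three enhancement rules of step 2) can be illustrated with a small self-contained sketch. The functions `mean_bug`, `mean_fix`, `normalize_bug`, and `normalize_fix` are invented stand-ins for a bug version and a fix version (one pair crashes, the other fails silently), and the helper names `run_crash_case` and `run_silent_case` are likewise hypothetical; only the use of `traceback`, `assert`, and the 'success' output follows the text.

```python
import traceback


def mean_bug(values):
    return sum(values) / len(values)  # crashes on an empty list


def mean_fix(values):
    return sum(values) / len(values) if values else 0.0


def normalize_bug(values):
    return [v / len(values) for v in values]  # wrong divisor, no exception


def normalize_fix(values):
    total = sum(values)
    return [v / total for v in values]


def run_crash_case(mean):
    """Bug version reports an error: print the stack trace with traceback."""
    try:
        mean([])
    except Exception:
        traceback.print_exc()  # formatted stack trace goes to stderr
        return False
    print("success")
    return True


def run_silent_case(normalize):
    """Bug version does not report an error: an assert exposes the wrong result."""
    total = sum(normalize([1.0, 3.0]))
    assert abs(total - 1.0) < 1e-9, f"weights sum to {total}, expected 1.0"
    print("success")
    return True


run_crash_case(mean_fix)        # fix version passes
run_silent_case(normalize_fix)  # fix version passes
```

Running the same helpers against `mean_bug` prints a `ZeroDivisionError` stack trace, and against `normalize_bug` raises an `AssertionError`, which is what separates the bug version from the fix version.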
3) environmental isolation
Generally, the running environment of each defect is different, and some defects can only be reproduced in a specific environment, such as a specific data set, a specific library version, or a specific operating system configuration; therefore, environment isolation is required after test case enhancement,
Conda version:
in this version, a folder is built for each defect, which provides scripts to create the Conda environment, download the data sets and dependency libraries (including the integrated tools), and execute the test cases; a user who wants to reproduce a defect only needs to run the scripts;
Docker version:
in this version, a separate Docker image is constructed for each defect, with all the environment the defect needs pre-installed in the image, so that a user can reproduce the defect in Docker; after the container is started, the project files inside the container are stored in the metadata folder under the home directory, and the shell scripts for running the test cases are stored in the script directory;
4) meta information extraction and augmentation
The defect meta information provides the basic information of a defect, through which a user can conveniently obtain related information; therefore meta information is attached to each defect, such as the defect name (named as 'library name + issue number'), the manually defined defect type, the exception information, the issue URL, the commit URL, the defect description, the support degree, and the reason why the defect can be generalized, wherein the support degree indicates how many defects in the data set have error causes and repair modes similar or identical to those of the defect; the support degree directly or indirectly reflects how well a defect generalizes, that is, how common it is;
meanwhile, two repair indexes, positioning difficulty and repair difficulty, and one repair operator are defined for each defect, wherein the positioning difficulty refers to the distance between the position of the abnormal code produced by running the bug version test case and the position of the repaired code;
if the file containing the location indicated by the exception information differs from the file containing the modified location, the directory tree containing both files needs to be obtained, the nearest common ancestor (lowest common ancestor, LCA) of the two files is found, and the smaller of the two files' distances to that ancestor is taken as the positioning difficulty, the LCA being computed with the Tarjan algorithm;
if the file containing the location indicated by the exception information is the same as the file containing the modified location, the positioning difficulty lscore is calculated by formula 4-1, wherein line is the difference between the line number of the abnormal code and the line number of the repair code:
[Formula 4-1: equation image FDA0003647530370000021, not reproduced in the text]
the repair difficulty is mainly determined by the number of modified lines and the length of the modified strings, and the repair difficulty rscore is calculated by formula 4-2,
[Formula 4-2: equation image FDA0003647530370000022, not reproduced in the text]
wherein: clines is the number of modified lines;
slength is the length of the modified strings;
the repair operator indicates how many operation steps the repair requires, and if the repair requires only one modification step, the value of the operator is 1;
5) tool integration
The defect data set gDefects4dl of deep learning applications is constructed to provide a benchmark data set for research on deep learning defect localization and defect repair techniques; in order to make the data set gDefects4dl more widely applicable, gDefects4dl provides extensible interfaces and integrates three defect localization tools, ShapeFlow, DEBAR and GRIST, to support the evaluation of the defects in the data set; the Docker image of each defect integrates the three tools, and the scripts of the Conda environment download them automatically;
ShapeFlow is a dynamic abstract interpreter of TensorFlow that can quickly capture tensor shape mismatch errors; ShapeFlow shares the same API as TensorFlow, but it only captures and recognizes the shape of the tensor, ShapeFlow constructs a custom shape computation graph, similar to the computation graph used by TensorFlow; ShapeFlow does not require a programmer to make code annotations or code modifications;
DEBAR is a static analysis method based on abstract interpretation, which is used for detecting numerical errors in neural network architectures and comprises two abstraction techniques, one for tensors and one for numerical values;
GRIST is a method for exposing numerical errors in deep learning through gradient back propagation, and the method adopts a dynamic technology to automatically generate a small input capable of exposing the numerical errors in a deep learning program, and utilizes a built-in gradient calculation function of deep learning to identify the numerical errors.
2. The method for constructing a deep learning defect data set by using a Docker container according to claim 1, wherein whether a commit is a true repair is identified in step 1) in two ways:
first, the discussion in the issue is analyzed manually: if information related to the repair is mentioned, the code is compared with the problem description to screen the commits; if there is no information related to the code repair, the judgment relies on professional knowledge;
second, different branches are created from the commits through the version control mechanism, a running environment is then constructed for each branch in Docker or Conda, and test cases are added; if a commit is not a true repair, the test case on its branch behaves abnormally, and if the test case executes normally, the commit is shown to be a true repair.
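Creating the branches from a repair commit c can be sketched with two `git branch` invocations, where the bug branch starts at the first parent of c (written `c~1` in Git revision syntax). The branch names `bug` and `fix`, the function names, and the placeholder commit id are assumptions for the example; for a merge commit, taking the first parent as "the version before the commit" is also an assumption.

```python
import subprocess


def branch_commands(fix_commit, bug_branch="bug", fix_branch="fix"):
    """Build the Git commands that create the bug and fix branches."""
    return [
        ["git", "branch", bug_branch, f"{fix_commit}~1"],  # version before the commit
        ["git", "branch", fix_branch, fix_commit],         # the repair commit itself
    ]


def create_branches(repo_dir, fix_commit):
    """Run the commands inside an existing clone located at `repo_dir`."""
    for argv in branch_commands(fix_commit):
        subprocess.run(argv, cwd=repo_dir, check=True)


# "abc1234" is a placeholder commit id, not a real commit.
for argv in branch_commands("abc1234"):
    print(" ".join(argv))
```

Once both branches exist, the same test case can be run on each of them inside the Docker or Conda environment to check whether the commit is a true repair.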
CN202210535022.2A 2022-05-17 2022-05-17 Method for constructing deep learning defect data set by using Docker container Pending CN115017035A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210535022.2A CN115017035A (en) 2022-05-17 2022-05-17 Method for constructing deep learning defect data set by using Docker container

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210535022.2A CN115017035A (en) 2022-05-17 2022-05-17 Method for constructing deep learning defect data set by using Docker container

Publications (1)

Publication Number Publication Date
CN115017035A true CN115017035A (en) 2022-09-06

Family

ID=83069820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210535022.2A Pending CN115017035A (en) 2022-05-17 2022-05-17 Method for constructing deep learning defect data set by using Docker container

Country Status (1)

Country Link
CN (1) CN115017035A (en)

Similar Documents

Publication Publication Date Title
US11907107B2 (en) Auto test generator
US7490319B2 (en) Testing tool comprising an automated multidimensional traceability matrix for implementing and validating complex software systems
US8752001B2 (en) System and method for developing a rule-based named entity extraction
US5956513A (en) System and method for automated software build control
US7752501B2 (en) Dynamic generation and implementation of globalization verification testing for user interface controls
Hammoudi et al. Why do record/replay tests of web applications break?
US8473915B2 (en) Coverage analysis tool for testing database-aware software applications
US20080148235A1 (en) Runtime inspection of user interfaces
US20020091968A1 (en) Object-oriented data driven software GUI automated test harness
US20110022551A1 (en) Methods and systems for generating software quality index
Coppola et al. Mobile GUI testing fragility: a study on open-source android applications
Chen et al. Extracting and studying the Logging-Code-Issue-Introducing changes in Java-based large-scale open source software systems
Li et al. Effective software test automation: developing an automated software testing tool
CN110716874B (en) Domestic operating system hardware compatibility testing method
Henkel et al. Shipwright: A human-in-the-loop system for dockerfile repair
Huang et al. Characterizing and detecting configuration compatibility issues in android apps
CN112256271A (en) Block chain intelligent contract security detection system based on static analysis
Nickel et al. Ibm ilog cplex optimization studio—a primer
CN114385491A (en) JS translator defect detection method based on deep learning
Ostrand et al. A Tool for Mining Defect-Tracking Systems to Predict Fault-Prone Files.
CN115017035A (en) Method for constructing deep learning defect data set by using Docker container
CN112597037B (en) Java and Python combined automatic script development method and device
Aranega et al. Rotten green tests in Java, Pharo and Python: An empirical study
US10762211B2 (en) Source code diagnostic instrument
CN113849814A (en) Configurable system bug reproduction system and reproduction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination