CN114791886B - Software problem tracking method and system - Google Patents


Info

Publication number
CN114791886B
CN114791886B
Authority
CN
China
Prior art keywords
module
feature
model
embedded
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210704988.4A
Other languages
Chinese (zh)
Other versions
CN114791886A (en)
Inventor
杨玉翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weichuang Software (Dalian) Co.,Ltd.
Original Assignee
Weichuang Software Wuhan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weichuang Software Wuhan Co ltd
Priority to CN202210704988.4A
Publication of CN114791886A
Application granted
Publication of CN114791886B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/36 Preventing errors by testing or debugging software
    • G06F 11/3604 Software analysis for verifying properties of programs
    • G06F 11/3608 Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/36 Preventing errors by testing or debugging software
    • G06F 11/362 Software debugging
    • G06F 11/3636 Software debugging by tracing the execution of the program
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

Embodiments of this specification provide a software problem tracking method and system. The method comprises: acquiring software problem records from a database; determining, based on the software problem records, the correspondence between software problem information and code modules; and determining a key review module based on the correspondence and module features.

Description

Software problem tracking method and system
Technical Field
The present disclosure relates to the field of software development technologies, and in particular, to a method and a system for tracking software problems.
Background
With the popularization of computers and intelligent terminals, application software has grown steadily in scale, its functional requirements have become richer, and quality requirements have risen accordingly. Software problems are an important index of software quality, so tracking them is an important link throughout the software development cycle. This is especially true for highly complex software with numerous functional modules, where different developers on a team, influenced by factors such as ability and experience, produce software problems that differ in both quantity and severity. However, locating, analyzing, tracing, and repairing software problems consumes substantial manpower and time, and such problems may even pose unpredictable risks to software operation.
Therefore, a software problem tracking method and system are provided.
Disclosure of Invention
One or more embodiments of the present specification provide a software problem tracking method. The method comprises the following steps: acquiring a software problem record through a database; determining the corresponding relation between the software problem information and the code module based on the software problem record; and determining a key auditing module based on the corresponding relation and the module characteristics.
In some embodiments, determining the key review module comprises: acquiring a first module, a second module, and a module to be checked, wherein the first module is a code module in which problems have been found and the second module is a code module in which no problems have been found; extracting a first feature of the first module, a second feature of the second module, and a third feature of the module to be checked, wherein the first feature comprises functional features, code features, and programmer features of the first module; clustering based on the first features and determining at least one cluster center; classifying the second module into the clusters based on the second feature, and determining the problem rate corresponding to the at least one cluster center; determining key clusters based on that problem rate, wherein the key clusters comprise key cluster features; and determining, as the key review module, the module to be checked whose third feature has a similarity to the key cluster features higher than a threshold.
In some embodiments, the code features include one or more of sub-module length, number of comments, and number of jump statements.
In some embodiments, the code features include development features obtained based on the development environment, and the development features include one or more of the number of test cases executed by the module, the number and length of pasted content, and the editing time of the module.
In some embodiments, the first feature is an embedded feature obtained by a neural network-based embedding model obtained by training in conjunction with a programming quality model.
In some embodiments, the software problem record is obtained based on a test record, and the key review module is determined by a machine learning based problem prediction model.
One or more embodiments of the present specification provide a software problem tracking system, the system comprising: a software problem record acquisition module configured to acquire a software problem record from a database; a correspondence determination module configured to determine, based on the software problem record, the correspondence between software problem information and code modules; and a key review module determination module configured to determine a key review module based on the correspondence and module features.
One or more embodiments of the present specification provide a software problem tracking apparatus including a processor for executing the above software problem tracking method.
One or more embodiments of the present specification provide a computer-readable storage medium storing computer instructions that, when read by a computer, cause the computer to perform the software problem tracking method described above.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of a software problem tracking system in accordance with some embodiments of the present description;
FIG. 2 is a block diagram of a software problem tracking system shown in accordance with some embodiments of the present description;
FIG. 3 is an exemplary flow diagram of a software problem tracking method, shown in accordance with some embodiments of the present description;
FIG. 4 is an exemplary flow diagram of a key review module determination method according to some embodiments of the present description;
FIG. 5 is a schematic diagram of a model structure of a programming quality model in accordance with some embodiments described herein.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system," "device," "unit," and/or "module" as used herein is a method for distinguishing between different components, elements, parts, portions, or assemblies of different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the terms "a," "an," and/or "the" are not intended to be singular and may include the plural unless the context clearly dictates otherwise. In general, the terms "comprise" and "comprising" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the operations are not necessarily performed in the exact order presented. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to, or removed from, these processes.
FIG. 1 is a schematic diagram of an application scenario of a software problem tracking system according to some embodiments of the present description.
In some embodiments, the software problem tracking system 100 may be used in a variety of application scenarios in the software development process. In some application scenarios, the application scenario 100 of the software problem tracking system may include a processing device 110, a network 120, a storage device 130, a collection terminal 140, and a user terminal 150. In some embodiments, the components in the software problem tracking system 100 may be connected and/or communicate with each other via the network 120 (e.g., a wireless connection, a wired connection, or a combination thereof). For example, the processing device 110 may be connected to the storage device 130 through the network 120. As another example, the user terminal 150 may be connected to the processing device 110 and the storage device 130 through the network 120. In some embodiments, the software problem tracking system 100 may detect and analyze software modules through the collection terminal 140, determine a key review module, and send it to the user terminal 150 so that the user may perform a key review of that module.
The processing device 110 may be used to process information and/or data related to the application scenario 100, for example, the processing device 110 may obtain a record of software problem information, obtain individual software modules of a software product, associate the software problem information with the software modules, and so on. In some embodiments, processing device 110 may include one or more processing engines (e.g., a single chip processing engine or a multi-chip processing engine). For example only, the processing device 110 may include a Central Processing Unit (CPU). Processing device 110 may process data, information, and/or processing results obtained from other devices or system components and execute program instructions based on the data, information, and/or processing results to perform one or more functions described herein.
The network 120 may connect the components of the application scenario 100 and/or connect the application scenario 100 with external resource components. The network enables communication between the components and with other components outside of the application scenario 100, facilitating the exchange of data and/or information. The network may be a local area network, a wide area network, the internet, etc., and may be a combination of various network architectures.
Storage device 130 may be used to store data and/or instructions. In some embodiments, storage device 130 may store data and/or instructions for use by processing device 110 in performing or using the exemplary methods described in this specification. For example, the storage device 130 may be a database for storing software module information, software problem information records for software modules, and the like. As another example, the storage device 130 may be a hard disk, storing code files for software modules, and the like. In some embodiments, a storage device 130 may be connected to the network 120 to communicate with one or more components of the application scenario 100 (e.g., processing device 110, user terminal 150).
The collection terminal 140 may be used to acquire data and/or information. For example, the collection terminal 140 may be used to collect problem records for the code module 141. For another example, the collection terminal 140 may be an integrated development tool used in software development, detecting software problems in the code module 141 through its compiler, analyzer, debugger, and the like. As another example, the collection terminal 140 may be a terminal device (e.g., a computer or a smartphone) on which the software is installed, used to collect software problems fed back by users while the software is running. In some embodiments, the collection terminal 140 may send the collected data and/or information to the processing device over the network.
User terminal 150 may include one or more terminal devices. In some embodiments, the user terminal 150 may include a mobile phone 150-1, a tablet 150-2, a laptop 150-3, and the like. In some embodiments, a user may view information and/or enter data and/or instructions through a user terminal. For example, a user may view information related to a software module (e.g., information of a programmer writing software module code, software problem information, etc.) through the user terminal 150. For another example, the user includes a program code auditor, and the program code auditor can audit the code of the software module through the user terminal.
It should be noted that the application scenarios are provided for illustrative purposes only and are not intended to limit the scope of the present specification. It will be apparent to those skilled in the art that various modifications and variations can be made in light of the description herein. For example, the application scenario may also include a database. As another example, the application scenarios may be implemented on other devices to implement similar or different functionality. However, variations and modifications may be made without departing from the scope of the present description.
FIG. 2 is an exemplary block diagram of a software problem tracking system shown in accordance with some embodiments of the present description. As shown in FIG. 2, the software problem tracking system 200 includes a software problem record acquisition module 210, a correspondence determination module 220, and a key review module determination module 230. In some embodiments, the software problem tracking system 200 may be implemented by the processing device 110 or a portion thereof.
The software problem record obtaining module 210 may be configured to obtain software problem record information of the code module. In some embodiments, the software problem record acquisition module 210 may acquire the software problem record from a database.
The correspondence determining module 220 may be configured to determine a correspondence of the software problem information to the code module based on the software problem record.
The key review module determination module 230 may be configured to determine a key review module based on the correspondence and module features. In some embodiments, the key review module determination module 230 may be used to determine whether a module to be detected is a key review module.
Reference is made to the description of fig. 3-5 for more on the above modules.
It should be understood that the system and its modules shown in FIG. 2 may be implemented in a variety of ways.
It should be noted that the above description of the system and its modules is for convenience only and should not limit the present disclosure to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the present system, any combination of modules or sub-system configurations may be used to connect to other modules without departing from such teachings. In some embodiments, the modules disclosed in fig. 2 may be different modules in a system, or may be a module that implements the functions of two or more of the modules described above. For example, each module may share one memory module, and each module may have its own memory module. Such variations are within the scope of the present disclosure.
FIG. 3 is an exemplary flow diagram of a software problem tracking method, shown in accordance with some embodiments of the present description. As shown in FIG. 3, the process 300 includes the following steps. In some embodiments, flow 300 may be performed by processing device 110.
At step 310, a software problem record is obtained.
A software problem record refers to defect information found in software that has gone live. In some embodiments, a software problem record may include information about errors or potential functional defects in a software program that affect its ability to function properly. For example, a software problem record may include the version of the software in which the problem occurred, the hardware environment in which it was running, and the problem's type, severity, and time of occurrence. In some embodiments, the software problem record may include problems where the program does not meet preset criteria, such as code normativity problems (e.g., naming convention problems), readability problems (e.g., missing code comments), and execution efficiency problems (e.g., excessive running time). In some embodiments, the software problem record also includes programmer information such as the programmer's number, name, job title, work experience, and time of code submission.
The software problem record may be stored by the storage device 130, for example, in a database. In some embodiments, the software problem record may include a textual description of the software problem and may also include related screenshots. In some embodiments, the software problem record may be obtained through software testing. For example, a software tester tests the functions of the software through preset test cases based on a software testing tool, and determines and records software problem information based on the tool's feedback. In some embodiments, software problem records may be recorded by program code reviewers during code review, such as records of code naming irregularities or missing necessary comments. In some embodiments, the software problem record may be feedback information submitted from a user's experience of the software, such as a record of certain functions not responding for a long time, or a record of software crashes.
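As a concrete illustration of what such a record might hold, the sketch below models a software problem record as a small data structure. The field names and values are invented for illustration; the patent does not prescribe a schema.

```python
from dataclasses import dataclass

# Hypothetical shape of one software problem record; field names are
# illustrative assumptions, not taken from the patent.
@dataclass
class ProblemRecord:
    software_version: str   # version in which the problem occurred
    problem_type: str       # e.g. "crash", "naming convention", "performance"
    severity: str           # e.g. "critical", "minor"
    occurred_at: str        # time of occurrence
    programmer_id: str      # author of the implicated code
    description: str = ""   # textual description of the problem

record = ProblemRecord(
    software_version="2.1.0",
    problem_type="crash",
    severity="critical",
    occurred_at="2022-06-01T10:00:00",
    programmer_id="P001",
    description="Login times out after password submission",
)
```

In practice such records would be rows in the database held by storage device 130; the dataclass simply makes the fields explicit.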
And step 320, determining the corresponding relation between the software problem information and the code module based on the software problem record.
A code module refers to one or more code files written in a computer programming language for implementing a particular software function. The code module is a code package which can be executed after compiling, and can also be an algorithm code segment which can realize specific functions. In some embodiments, multiple code modules may be included in an online application. In some embodiments, the code modules may be stored via storage device 130, such as a physical file of code modules may be stored on a hard disk. In some embodiments, a code module may contain sub-code modules based on dependencies.
In some embodiments, information about the code module may be stored in a database. For example, the related information of the code module, including the function information, may be preset various classification information, such as an algorithm class (an iterative algorithm, a sorting algorithm, an encryption and decryption algorithm, etc.), a service class (such as a login function, a registration function, etc.), an application interface class, an image processing class, a database processing class, etc. Information associated with a code module may also include program code information, such as the number of lines of code, time of submission, programming language, code normality, and the like. Information about the programmer may also be included, such as the programmer's name, age, work experience (e.g., 3 years, 5 years), job title (e.g., primary, secondary, advanced development engineers), etc.
The correspondence refers to the association between the software problems each code module exhibits in actual operation and the code module itself. Specifically, different code modules exhibit different software problems while running; through the software problem records, the corresponding code modules can be determined, establishing the correspondence. For example, if the login function of the software fails, the corresponding code module related to the login function may have a problem (e.g., a flaw in the logic of the algorithm that encrypts the entered login password). In some embodiments, the correspondence between software problem records and code modules can be obtained through an association query on the database; for example, the code module corresponding to a software problem may be determined from a data table that records the correspondence. In some embodiments, one code module may correspond to multiple software problem records: for example, several software problem records obtained from the database may be associated with the same code module.
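The many-records-to-one-module association described above can be sketched as a simple grouping step. The record contents and module names below are invented; in a real system this would be a database join rather than an in-memory dictionary.

```python
from collections import defaultdict

# Invented sample records: two problems implicate the "login" module,
# one implicates "image_resize".
problem_records = [
    {"id": 1, "module": "login"},
    {"id": 2, "module": "login"},
    {"id": 3, "module": "image_resize"},
]

def records_by_module(records):
    """Group problem-record IDs by the code module they implicate."""
    mapping = defaultdict(list)
    for r in records:
        mapping[r["module"]].append(r["id"])
    return dict(mapping)

correspondence = records_by_module(problem_records)
# One code module may correspond to several problem records,
# e.g. correspondence["login"] holds both record 1 and record 2.
```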
And step 330, determining a key auditing module based on the corresponding relation and the module characteristics.
In some embodiments, processing device 110 may determine module characteristics based on information about the code module. The module features comprise function features, code features and programmer features. In some embodiments, the functional feature may be a feature determined based on functional information of the code module, and the code feature may be a feature determined based on code information of the code module; the programmer features may be features determined based on programmer information for the code module.
The key review module is a code module whose code the user needs to check more carefully and comprehensively. In some embodiments, the code module designated for key review has a higher error rate. In some embodiments, the key review module may be a code module written by a programmer whose code modules have a high error rate. In some embodiments, at least one associated code module is determined from the plurality of software problem records based on the correspondence; the responsible programmers are identified from the programmer features in the module features of those code modules; and the error rate of each programmer is counted. The code modules for which high-error-rate programmers are responsible are then subjected to key review. In some embodiments, an error rate threshold may be preset, such as 30%; modules written by a programmer whose error rate exceeds the threshold are determined to be key review modules. For example, suppose programmer A is responsible for the program code of 10 code modules, and 6 of the current software problem records relate to 6 code modules written by A. A's error rate is then 60%, higher than the preset threshold, so the other 4 code modules A is responsible for are treated as key review modules. In some embodiments, based on the correspondence, the commonality (similarity of module features) among multiple code modules with software problems may be determined, and key review modules may subsequently be identified through that similarity.
For example, a plurality of code modules having problems are determined based on the correspondence; if those code modules all relate to the image processing class, then other code modules that also relate to the image processing class are treated as key review modules. See the description of FIG. 4 for more.
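The per-programmer error-rate rule can be sketched as follows. The 30% threshold and the 6-of-10 numbers follow the worked example above; the module identifiers and data layout are invented for illustration.

```python
# Sketch of the per-programmer error-rate rule, assuming a mapping from
# each programmer to the modules they wrote and a set of faulty modules.
def error_rate(modules_by_author, faulty_modules, author):
    """Fraction of an author's modules that appear in problem records."""
    authored = modules_by_author[author]
    faulty = sum(1 for m in authored if m in faulty_modules)
    return faulty / len(authored)

def modules_to_review(modules_by_author, faulty_modules, threshold=0.3):
    """Flag the not-yet-faulty modules of programmers above the threshold."""
    flagged = []
    for author, authored in modules_by_author.items():
        if error_rate(modules_by_author, faulty_modules, author) > threshold:
            flagged.extend(m for m in authored if m not in faulty_modules)
    return flagged

# Programmer A wrote 10 modules; 6 of them were implicated in problems,
# so A's error rate is 60% > 30% and the remaining 4 need key review.
modules_by_author = {"A": [f"m{i}" for i in range(10)]}
faulty = {"m0", "m1", "m2", "m3", "m4", "m5"}
assert sorted(modules_to_review(modules_by_author, faulty)) == ["m6", "m7", "m8", "m9"]
```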
In some embodiments, the processing device 110 may push the determined key review module to the user terminal for code review. For example, the code file of the key auditing module can be pushed to the terminal device of the auditor, and the reminding information can also be sent to the relevant auditor.
In some embodiments, the software problem records are obtained based on test records, and the key review module is determined by a machine-learning-based problem prediction model. In some embodiments, the problem prediction model may be composed of an embedding model and a judgment model. In some embodiments, the input of the problem prediction model includes software problem records, and the output includes a determination of whether a module is a key review module. In some embodiments, the software problem record includes code features. For the relevant content of the code features, the embedding model, and the judgment model, refer to FIGS. 4 to 5. In other embodiments, the key review module may also be determined based on a programming quality model, as described in greater detail with reference to FIG. 5.
In some embodiments of the present description, by tracking the programmers associated with software problems, potentially faulty software modules can be quickly located for key review, improving the efficiency of investigating potentially faulty modules and avoiding the time and effort consumed by excessive manual analysis and investigation.
FIG. 4 is an exemplary flow diagram of a key review module determination method according to some embodiments of the present description. In some embodiments, flow 400 may be performed by processing device 110. As shown in FIG. 4, the process 400 includes the following steps.
Step 410, a first module, a second module and a module to be detected are obtained.
The first module refers to a code module which is found to have a software problem in the software which is on line. In some embodiments, at least one of the first modules may be determined based on reported software problem records.
The second module refers to a code module which is not found to have a software problem in the on-line software.
The module to be detected is a candidate module needing to be detected. In some embodiments, whether the module to be detected has a software problem or not needs to be further detected and determined.
Step 420, extracting a first feature of the first module, a second feature of the second module and a third feature of the module to be checked.
The first feature refers to a module feature of the first module. The first features may include functional features, code features, programmer features of the first module. For more, refer to the description of fig. 1, which is not repeated herein.
In some embodiments, the code characteristics further include one or more of sub-module length, number of annotations, number of jump statements.
In some embodiments, the code features further include development features obtained based on the development environment, the development features including one or more of the number of test cases executed by the module, the number and length of content stickers, and the module editing time.
The development features may be obtained from the compiler, analyzer, debugger, and other tools of the integrated development environment. In some embodiments, the more test cases a software module's code has executed, the more adequate its testing and the lower its probability of problems. In some embodiments, when the number and length of pasted content do not satisfy preset requirements, the code quality may be considered negatively affected: for example, if there is too much repeated code, or pasted content was not given the necessary modifications, the probability of software problems may be high. In some embodiments, the longer the editing time of the code, the more deliberation went into writing it and the higher its quality may be; a threshold range of code editing time may be preset based on development experience. If the actual editing time falls outside this range, below the minimum or above the maximum, the probability of a problem is considered higher.
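The development-feature heuristics above can be sketched as a small scoring function. All thresholds (minimum test-case count, maximum paste length, edit-time window) are invented defaults, not values from the patent.

```python
# Hedged sketch of the development-feature heuristics: each heuristic
# that fails raises the module's risk count. Thresholds are assumptions.
def development_risk(num_test_cases, paste_length, edit_minutes,
                     min_tests=5, max_paste=200, edit_window=(30, 480)):
    """Count how many development-feature heuristics flag this module."""
    flags = 0
    if num_test_cases < min_tests:          # under-tested code
        flags += 1
    if paste_length > max_paste:            # large pasted blocks
        flags += 1
    low, high = edit_window
    if not (low <= edit_minutes <= high):   # edited too quickly or too long
        flags += 1
    return flags
```

For instance, a module with 10 executed test cases, 50 characters of pasted content, and 120 minutes of editing passes all three heuristics, while one with 2 test cases, 300 pasted characters, and 10 minutes of editing is flagged by all three.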
In some embodiments, the first feature is an embedded feature, which may be obtained based on an embedded model. For more, see the description of fig. 5, which is not repeated here.
The second feature refers to a module feature of the second module. The third characteristic is a module characteristic of the module to be detected. For the module features, reference is made to the description of fig. 1, and details are not repeated here.
And 430, clustering based on the first characteristics, and determining at least one clustering center.
In some embodiments, the processing device 110 may cluster the plurality of first features through various clustering algorithms. For example, a plurality of cluster centers may be determined based on clustering algorithms such as K-Means or density-based algorithms (e.g., DBSCAN), and the code modules corresponding to each cluster center may be grouped into one class. In some embodiments, clustering may be performed on the functional features of the first features: for example, code modules whose functional features are algorithm classes such as iteration, sorting, encryption, and decryption may be grouped into class A, code modules of the image processing class into class B, code modules of the interactive interface class into class C, and so on. In some embodiments, clustering may also be performed on the code features of the first features: for example, among first modules belonging to the algorithm class, those with a large amount of code and complex logic have a relatively high error rate while the others have a low error rate, yielding cluster centers A1 and A2. In some embodiments, clustering may also be performed on the programmer features of the first features: for example, among first modules belonging to the algorithm class, those written by junior programmers have a higher error rate and those written by senior programmers a lower one, yielding cluster centers A3 and A4.
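A minimal pure-Python K-Means stands in for the clustering step above; a production system would more likely use a library implementation, and the feature vectors here are invented two-dimensional stand-ins for the first features.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-Means: returns (centers, clusters) over tuples of floats."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each feature vector to its nearest center
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # recompute each center as the mean of its cluster
        centers = [
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

# Two obvious groups of invented first features: low-complexity modules
# near (1, 1) and high-complexity modules near (8, 8).
features = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9)]
centers, clusters = kmeans(features, k=2)
```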
Step 440: classify the second module into a cluster based on the second feature, and determine the problem rate corresponding to the at least one cluster center.
In some embodiments, the processing device 110 may determine, based on the second feature of the second module, the closest cluster by a vector distance (e.g., Euclidean distance) to the cluster centers determined in step 430, and take that cluster as the classification of the second module.
The problem rate is the proportion of first modules among all the code modules of a cluster. For example, if the number of first modules assigned to cluster A in step 420 is 4 and the number of second modules classified into cluster A is 6, the problem rate corresponding to cluster center A is 40%.
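A minimal sketch of step 440, mirroring the 40% example above with toy centers and hypothetical second-feature vectors:

```python
import numpy as np

def assign_cluster(feature, centers):
    """Index of the nearest cluster center by Euclidean distance."""
    return int(np.linalg.norm(centers - feature, axis=1).argmin())

# hypothetical cluster centers from step 430
centers = np.array([[0.0, 0.0], [1.0, 1.0]])

# first (problem) modules already assigned per cluster in step 420:
# cluster 1 holds the 4 first modules of the example in the text
first_counts = [1, 4]
second_counts = [0, 0]

# classify problem-free (second) modules into their nearest cluster;
# 6 of these toy vectors fall into cluster 1, as in the example
second_features = np.array([[0.1, 0.0], [0.9, 1.1], [1.1, 0.8],
                            [1.0, 1.0], [0.8, 0.9], [1.2, 1.0], [0.9, 0.9]])
for f in second_features:
    second_counts[assign_cluster(f, centers)] += 1

# problem rate: proportion of first modules among all modules of the cluster
problem_rate = [fc / (fc + sc) for fc, sc in zip(first_counts, second_counts)]
# cluster 1: 4 / (4 + 6) = 40%, matching the worked example
```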
Step 450: determine a key cluster based on the problem rate corresponding to the at least one cluster center, where the key cluster has a key cluster feature.
A key cluster is a class of code modules with a high problem rate. In some embodiments, a problem rate threshold may be preset; when a cluster's problem rate exceeds the threshold, that cluster is determined to be a key cluster. For example, if the preset problem rate threshold is 30% and the problem rates of cluster centers A, B, and C determined in step 430 are 40%, 20%, and 10% respectively, then cluster A is the key cluster.
Step 460: determine, as the key auditing module, the module to be checked whose third feature has a similarity with the key cluster feature higher than a threshold.
In some embodiments, the processing device 110 may determine the similarity between the module to be checked and the key cluster by computing a vector distance (e.g., Euclidean distance) between the third feature vector of the module to be checked and each key cluster feature vector; the smaller the vector distance, the higher the similarity. A similarity threshold may be set, and when the similarity exceeds the threshold, the module to be checked is determined to be a key auditing module.
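Steps 450 and 460 can be sketched together. The problem rates reproduce the example in the text; the key cluster feature, the third-feature vectors, the distance-to-similarity mapping, and both thresholds are illustrative assumptions:

```python
import numpy as np

# problem rates per cluster center from step 440 (the example in the text)
problem_rates = {"A": 0.40, "B": 0.20, "C": 0.10}
rate_threshold = 0.30
key_clusters = {c for c, r in problem_rates.items() if r > rate_threshold}

# hypothetical key cluster feature and third features of modules to be checked
key_feature = np.array([0.0, 0.0])
third_features = {"m1": np.array([0.1, 0.1]), "m2": np.array([2.0, 2.0])}

def similarity(v):
    # one simple mapping: smaller Euclidean distance -> higher similarity
    return 1.0 / (1.0 + np.linalg.norm(v - key_feature))

sim_threshold = 0.5
focus_modules = sorted(m for m, v in third_features.items()
                       if similarity(v) > sim_threshold)
```

Here only cluster A exceeds the 30% problem rate threshold, and only module m1 lies close enough to the key cluster feature to be flagged for key auditing.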
In some embodiments of the present description, by introducing multidimensional features of software modules and performing similarity analysis between the module to be checked and modules with a higher problem rate, the decision of whether a module to be checked is a key auditing module becomes more accurate.
It should be noted that the above description of the process of determining the key auditing module is only for illustration and does not limit the scope of the present specification. Those skilled in the art may make various modifications and alterations to the above process in light of the present disclosure; such modifications and alterations remain within the scope of the present specification.
FIG. 5 is a schematic diagram of model structures of an embedding model and a programming quality model, shown in accordance with some embodiments of the present description. The model structure 500 is shown in fig. 5.
In some embodiments, information of the first module is input into the embedding model, and the embedded features are determined by the embedding model.
The embedding model refers to a model used to determine the first feature. The embedding model may be a trained machine learning model, for example, a deep neural network model.
In some embodiments, the processing device 110 may input the first module information into the embedding model, which outputs the embedded features of the first module. The first module information refers to information related to the first module. In some embodiments, the embedding model may be obtained through joint training with the programming quality model.
In some embodiments, the input of the embedding model includes one or more of the sub-module length, the number of annotations, and the number of jump statements of the first module. In some embodiments, the input of the embedding model may further include development features obtained from the development environment, including one or more of the number of test cases executed by the module, the number of times and length of content pasting, and the editing time of the module.
In some embodiments, the embedding model may also be obtained through joint training with a judgment model.
In some embodiments, the judgment model may include two embedding layers (i.e., embedding layer 1 and embedding layer 2 in the figure) and one judgment layer. The two embedding layers may be two identical embedding models: they may have the same initial parameters and share parameters, so that when parameters are iteratively updated during training, the parameters of the two embedding models are updated synchronously. The judgment layer may be a deep neural network model.
In some embodiments, the training sample data of the judgment model may be feature data of a plurality of first modules of the same category, and the training label may indicate whether two first modules were written by the same programmer, determined from their programmer features, e.g., 1 for yes and 0 for no. For example, during training, the information of two first modules of the same classification (for example, both having algorithm-class functional features) written by the same programmer may be input into embedding layer 1 and embedding layer 2, respectively; embedded feature 1 and embedded feature 2, output by embedding layer 1 and embedding layer 2, are then input into the judgment layer as its sample data, yielding a result of 1 or 0. The output of the judgment layer is verified against the training label 1 or 0. During training, a loss function is established based on the output of the judgment layer and the outputs of embedding layer 1 and embedding layer 2, and the parameters of the embedding layers and the judgment layer are updated.
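The Siamese structure above (two parameter-sharing embedding layers feeding one judgment layer) can be sketched as a forward pass. Everything here is a toy stand-in: the dimensions, the single shared weight matrix, the tanh nonlinearity, and the threshold-at-zero decision are illustrative assumptions, not the patent's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# ONE shared weight matrix serves as BOTH embedding layers:
# embedding layer 1 and embedding layer 2 share parameters by construction
W_embed = rng.normal(scale=0.5, size=(4, 3))   # module-info dim 4 -> embedding dim 3
w_judge = rng.normal(scale=0.5, size=6)        # judgment layer over both embeddings

def embed(module_info):
    # identical function for both branches -> parameters stay synchronized
    return np.tanh(module_info @ W_embed)

def judge(info1, info2):
    """Predict 1 if the two first modules share a programmer, else 0."""
    e1, e2 = embed(info1), embed(info2)        # embedded feature 1 and 2
    score = np.concatenate([e1, e2]) @ w_judge # judgment layer (linear toy)
    return int(score > 0), e1, e2

m1 = np.array([1.0, 0.0, 2.0, 0.5])            # hypothetical first-module info
m2 = np.array([1.1, 0.1, 1.9, 0.4])
label, e1, e2 = judge(m1, m2)
```

Because both branches call the same `embed`, a gradient update to `W_embed` automatically updates "both" embedding layers at once, which is the parameter-sharing behavior the text describes.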
Training the embedding model jointly with the judgment model optimizes the parameters of the embedding model, so that its output is more accurate.
A programming quality model refers to a model used to determine the quality of the programming of a software module. The programming quality model may be a trained machine learning model. For example, the programming quality model may be a deep neural network model.
In some embodiments, the embedded features are input into the programming quality model, and the programming quality is determined by the programming quality model.
In some embodiments, the processing device 110 may input the embedded features into the programming quality model, which outputs the programming quality of the corresponding first module. In some embodiments, the programming quality may be used in determining the key auditing module.
In some embodiments, the embedding model and the programming quality model may be obtained through joint training.
In some embodiments, sample data of the first module information is input into the embedding model to obtain a vector representation of the embedded features, and the embedded features are then input into the programming quality model as its sample data to obtain the programming quality of the first module. For example, the training sample data may include functional information of the first module (such as algorithm class or business class), code information (such as the number of code lines and logic complexity), and programmer information (such as working experience and level); this is input into the embedding model, and the output of the embedding model serves as the input of the programming quality model. In some embodiments, the training labels may be obtained from the development environment, and the prediction result of the programming quality model is a quality classification of the first module under test. For example, the labels may classify quality as excellent, good, normal, or bad, determined according to the number of test cases required in actual testing, whether software problems occur within a preset testing time, and the number of such problems. During training, a loss function is established based on the programming quality prediction corresponding to the sample data and the output of the embedding model, and the parameters of the embedding model and the programming quality model are updated until a preset condition is met, such as the loss function falling below a threshold, convergence, or the number of training epochs reaching a threshold; the trained embedding model and programming quality model are then obtained.
Through the programming quality model described in some embodiments of the present specification, the code quality of a module to be checked can be predicted, reducing the time and effort consumed by manual analysis. In addition, the joint training of the models can reduce the number of training samples required, simplify the training process, and improve training efficiency.
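The joint training described above can be sketched end to end with toy linear stand-ins for both models: one loss drives gradient updates of the embedding model and the quality model together. The data, dimensions, binary (good vs. bad) quality label, and learning rate are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# toy sample data: 20 first-module info vectors with a binary quality label
X = rng.normal(size=(20, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # synthetic "quality" signal

W_embed = rng.normal(scale=0.1, size=(3, 5))   # embedding model: info -> embedded feature
w_quality = rng.normal(scale=0.1, size=3)      # programming quality model (linear head)

def loss():
    z = X @ W_embed.T                          # embedded features
    q = 1 / (1 + np.exp(-(z @ w_quality)))     # predicted quality probability
    q = np.clip(q, 1e-9, 1 - 1e-9)             # numerical safety for the log
    return -np.mean(y * np.log(q) + (1 - y) * np.log(1 - q))

lr = 0.5
initial = loss()
for _ in range(500):
    z = X @ W_embed.T
    q = 1 / (1 + np.exp(-(z @ w_quality)))
    err = (q - y) / len(X)                     # dLoss/dScore, averaged
    # ONE loss updates BOTH models' parameters: this is the joint training
    grad_w_quality = z.T @ err                 # gradient for the quality model
    grad_W_embed = np.outer(w_quality, err @ X)  # gradient for the embedding model
    w_quality -= lr * grad_w_quality
    W_embed -= lr * grad_W_embed
final = loss()
```

Training stops here after a fixed number of epochs; the text's alternative stopping conditions (loss below a threshold, convergence) would replace the fixed loop count.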
Some embodiments of the present description also provide a software problem tracking apparatus comprising at least one processor and at least one memory for storing computer instructions; the at least one processor is configured to execute at least some of the computer instructions to implement the software problem tracking method described above.
Some embodiments of the present specification further disclose a computer-readable storage medium storing computer instructions, and when the computer reads the computer instructions in the storage medium, the computer executes the software problem tracking method.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such alterations, modifications, and improvements are intended to be suggested in this specification, and are intended to be within the spirit and scope of the exemplary embodiments of this specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While certain presently contemplated useful embodiments of the invention have been discussed in the foregoing disclosure by way of various examples, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein described. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to imply that more features are required than are expressly recited in the claims. Indeed, the embodiments may be characterized as having less than all of the features of a single embodiment disclosed above.
Where numerals describing the number of components, attributes or the like are used in some embodiments, it is to be understood that such numerals used in the description of the embodiments are modified in some instances by the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit-preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
For each patent, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this specification, the entire contents are hereby incorporated by reference. If any incorporated material is inconsistent with or conflicts with the contents of this specification, the present specification shall control; likewise, if the descriptions, definitions, and/or uses of terms in materials accompanying this specification are inconsistent with those herein, the descriptions, definitions, and/or uses of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments described herein. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (7)

1. A software problem tracking method, comprising:
acquiring a software problem record through a database;
determining the corresponding relation between the software problem information and the code module based on the software problem record;
determining a key auditing module based on the corresponding relation and the module characteristics, comprising:
acquiring a first module, a second module and a module to be checked, wherein the first module is a code module with problems, and the second module is a code module without problems;
extracting a first feature of the first module, a second feature of the second module and a third feature of the module to be checked, wherein the first feature comprises a functional feature, a code feature and a programmer feature of the first module;
wherein the first feature is an embedded feature, the embedded feature is obtained through an embedded model, and the input of the embedded model comprises one or more of first module information, development features obtained based on a development environment, the sub-module length of the first module, the number of annotations, and the number of jump statements; the development features comprise one or more of the number of test cases executed by the module, the number of times and length of content pasting, and the editing time of the module; the output of the embedded model comprises the embedded feature;
the embedded model is obtained by joint training with a programming quality model and comprises the following steps: inputting sample data of the first module information into the embedding model to obtain vector representation of the embedding characteristics; inputting the embedded features into the programming quality model as sample data of the programming quality model; the prediction result of the programming quality model is the quality condition classification of the first module during testing;
the embedded model is also obtained through joint training with a judgment model; the judgment model comprises two embedded layers and a judgment layer, wherein the two embedded layers are two identical embedded models; the two embedded models have the same initial parameters and share parameters, and when parameters are iteratively updated in training, the parameters of the two embedded models are updated synchronously; the input of the judgment model comprises the information of two first modules of the same classification, and the output of the judgment model comprises a judgment result of whether the first modules were written by the same programmer;
clustering based on the first characteristics, and determining at least one clustering center;
based on the second characteristic, the second module is classified into the cluster, and the problem rate corresponding to the at least one cluster center is determined;
determining key clusters based on the problem rate corresponding to the at least one cluster center, wherein the key clusters comprise key cluster features;
and determining, as the key auditing module, the module to be checked whose third feature has a similarity with the key cluster feature higher than a threshold.
2. The method of claim 1, the code characteristics comprising one or more of sub-module length, number of annotations, number of jump statements.
3. The method of claim 1, wherein the code features comprise development features obtained based on the development environment, and the development features comprise one or more of the number of test cases executed by a module, the number of times and length of content pasting, and the editing time of the module.
4. The method of claim 1, wherein the software problem record is obtained based on a test record, and the key review module is determined by a machine learning based problem prediction model.
5. A software problem tracking system, comprising:
the software problem record acquisition module acquires a software problem record through a database;
the corresponding relation determining module is used for determining the corresponding relation between the software problem information and the code module based on the software problem record;
the key auditing module determining module is used for determining a key auditing module based on the corresponding relation and the module characteristics, and comprises:
acquiring a first module, a second module and a module to be checked, wherein the first module is a code module with problems, and the second module is a code module without problems;
extracting a first feature of the first module, a second feature of the second module and a third feature of the module to be checked, wherein the first feature comprises a functional feature, a code feature and a programmer feature of the first module;
wherein the first feature is an embedded feature obtained through an embedded model, and the input of the embedded model comprises one or more of first module information, development features obtained based on a development environment, the sub-module length of the first module, the number of annotations, and the number of jump statements; the development features comprise one or more of the number of test cases executed by the module, the number of times and length of content pasting, and the editing time of the module; the output of the embedded model comprises the embedded feature;
the embedded model is obtained by joint training with a programming quality model, and comprises the following steps: inputting sample data of the first module information into the embedding model to obtain vector representation of the embedding characteristics; inputting the embedded features into the programming quality model as sample data of the programming quality model; the prediction result of the programming quality model is the quality condition classification of the first module during testing;
the embedded model is also obtained through joint training with a judgment model; the judgment model comprises two embedded layers and a judgment layer, wherein the two embedded layers are two identical embedded models; the two embedded models have the same initial parameters and share parameters, and when parameters are iteratively updated in training, the parameters of the two embedded models are updated synchronously; the input of the judgment model comprises the information of two first modules of the same classification, and the output of the judgment model comprises a judgment result of whether the first modules were written by the same programmer;
clustering based on the first characteristics, and determining at least one clustering center;
based on the second characteristic, the second module is classified into the cluster, and the problem rate corresponding to the at least one cluster center is determined;
determining key clusters based on the problem rate corresponding to the at least one cluster center, wherein the key clusters comprise key cluster features;
and determining, as the key auditing module, the module to be checked whose third feature has a similarity with the key cluster feature higher than a threshold.
6. A software problem tracking apparatus, said apparatus comprising at least one processor and at least one memory; the at least one memory is for storing computer instructions;
the at least one processor is configured to execute at least some of the computer instructions to implement the method of any of claims 1-4.
7. A computer-readable storage medium storing computer instructions which, when read by a computer, cause the computer to perform the method of any one of claims 1 to 4.
CN202210704988.4A 2022-06-21 2022-06-21 Software problem tracking method and system Active CN114791886B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210704988.4A CN114791886B (en) 2022-06-21 2022-06-21 Software problem tracking method and system


Publications (2)

Publication Number Publication Date
CN114791886A CN114791886A (en) 2022-07-26
CN114791886B true CN114791886B (en) 2022-09-23

Family

ID=82462956

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210704988.4A Active CN114791886B (en) 2022-06-21 2022-06-21 Software problem tracking method and system

Country Status (1)

Country Link
CN (1) CN114791886B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104899135A (en) * 2015-05-14 2015-09-09 工业和信息化部电子第五研究所 Software defect prediction method and system
CN106708738A (en) * 2016-12-23 2017-05-24 上海斐讯数据通信技术有限公司 Method and system for predicting software testing defects
CN107133176A (en) * 2017-05-09 2017-09-05 武汉大学 A kind of spanned item mesh failure prediction method based on semi-supervised clustering data screening
CN107577605A (en) * 2017-09-04 2018-01-12 南京航空航天大学 A kind of feature clustering system of selection of software-oriented failure prediction
CN109032916A (en) * 2017-06-08 2018-12-18 阿里巴巴集团控股有限公司 The method, apparatus and system that functional module in application program is evaluated and tested
CN109522192A (en) * 2018-10-17 2019-03-26 北京航空航天大学 A kind of prediction technique of knowledge based map and complex network combination

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160004627A1 (en) * 2014-07-06 2016-01-07 International Business Machines Corporation Utilizing semantic clusters to Predict Software defects
US10922218B2 (en) * 2019-03-25 2021-02-16 Aurora Labs Ltd. Identifying software interdependencies using line-of-code behavior and relation models
JP7041281B2 (en) * 2019-07-04 2022-03-23 浙江大学 Address information feature extraction method based on deep neural network model
US11288168B2 (en) * 2019-10-14 2022-03-29 Paypal, Inc. Predictive software failure discovery tools
CN114238100A (en) * 2021-12-10 2022-03-25 国家电网有限公司客户服务中心 Java vulnerability detection and positioning method based on GGNN and layered attention network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on the Application of Cluster Analysis in Software Defect Measurement; 上火的丫头; https://www.docin.com/p-776455497.html; 2014-03-11; full text, pp. 1-9 *
Research on the Application of Cluster Analysis in Software Defect Measurement; Wei Youming et al.; Sciencepaper Online (中国科技论文在线); 2009-12-30; full text, pp. 1-9 *


Similar Documents

Publication Publication Date Title
Fan et al. The impact of mislabeled changes by szz on just-in-time defect prediction
Yan et al. Automating change-level self-admitted technical debt determination
US10572374B2 (en) System and method for automated software testing based on machine learning (ML)
Bissyandé et al. Empirical evaluation of bug linking
Hemmati et al. Prioritizing manual test cases in traditional and rapid release environments
Hemmati et al. Prioritizing manual test cases in rapid release environments
US6269457B1 (en) Technology regression and verification acceptance method
CN107862327B (en) Security defect identification system and method based on multiple features
CN114116496A (en) Automatic testing method, device, equipment and medium
Yang et al. Vuldigger: A just-in-time and cost-aware tool for digging vulnerability-contributing changes
Felderer et al. Using defect taxonomies for requirements validation in industrial projects
CN115328784A (en) Agile interface-oriented automatic testing method and system
Gholamian et al. A comprehensive survey of logging in software: From logging statements automation to log mining and analysis
CN115952081A (en) Software testing method, device, storage medium and equipment
Shatnawi et al. An Assessment of Eclipse Bugs' Priority and Severity Prediction Using Machine Learning
CN107301120A (en) Method and device for handling unstructured daily record
US20200319992A1 (en) Predicting defects using metadata
Polaczek et al. Exploring the software repositories of embedded systems: An industrial experience
CN114791886B (en) Software problem tracking method and system
CN109800147B (en) Test case generation method and terminal equipment
Quach et al. Evaluating the impact of falsely detected performance bug-inducing changes in JIT models
CN113791980A (en) Test case conversion analysis method, device, equipment and storage medium
Goyal et al. Bug handling in service sector software
CN116383834B (en) Detection method for source code vulnerability detection tool abnormality and related equipment
CN112487269B (en) Method and device for detecting automation script of crawler

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231024

Address after: Room 2-13-20110, 12A Gangwan Street, Zhongshan District, Dalian City, Liaoning Province, 116000

Patentee after: Weichuang Software (Dalian) Co.,Ltd.

Address before: 430000 building C21, group 1, phase III, Wuhan Software New Town, No. 8, Huacheng Avenue, Donghu New Technology Development Zone, Wuhan, Hubei

Patentee before: Weichuang software (Wuhan) Co.,Ltd.