CN109615241A - A software bug assignment method based on convolutional and recurrent neural networks - Google Patents

A software bug assignment method based on convolutional and recurrent neural networks

Info

Publication number
CN109615241A
Authority
CN
China
Prior art keywords
developer
feature
bug
training
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811528908.4A
Other languages
Chinese (zh)
Inventor
陈荣
王林辉
王芝
李辉
郭世凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Maritime University
Priority to CN201811528908.4A
Publication of CN109615241A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311Scheduling, planning or task assignment for a person or group
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Educational Administration (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a software bug assignment method based on convolutional and recurrent neural networks, comprising the following steps. S1: obtain a raw bug report data set from a selected open-source project and preprocess it into a training set and a test set. S2: feed the samples in the training set into the CLBT model one by one and train all parameters of the CLBT model until they converge, which completes the training of the model. S3: feed the samples in the test set one by one into the trained CLBT model; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability. The method first performs quantitative feature extraction on the hierarchical relationships within a whole sentence and on the long- and short-range dependencies between words, taking into account the word-order information used in the prior art. It additionally extracts the semantic and contextual features of words for use in bug report assignment, so that the text information is mined more fully and effectively.

Description

A software bug assignment method based on convolutional and recurrent neural networks
Technical field
The present invention relates to the field of software testing technology, and in particular to a software bug assignment method based on convolutional and recurrent neural networks.
Background
A software bug, that is, a software fault, is an unavoidable by-product of software development. Fixing bugs promptly is a prerequisite for software quality and for keeping the system correct under maintenance. To make bugs easier to collect and manage, software developers designed the bug repository to store and maintain them. A bug repository administrator reads each bug report and assigns a suitable developer to fix the bug. As software development technology has matured, the number of bugs has grown sharply, and manual bug assignment, which is time-consuming and inefficient, falls far short of current needs. Researchers have therefore proposed automating bug assignment with machine learning, which turns bug assignment into a text classification problem; this has become an active research topic. Many studies, however, do not mine the text information thoroughly and often ignore the word order and contextual features of the text. In addition, existing techniques perform poorly when they have to distinguish between similar developers.
Summary of the invention
To address the problems in the prior art, the invention discloses a software bug assignment method based on convolutional and recurrent neural networks, comprising the following steps:
S1: obtain a raw bug report data set from a selected open-source project and preprocess it into a training set and a test set;
S2: feed the samples in the training set into the CLBT model one by one and train all parameters of the CLBT model until they converge, which completes the training of the model;
S3: feed the samples in the test set one by one into the trained CLBT model; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability;
Further, S1 is specifically carried out as follows:
S11: screen the bug reports: keep the reports that have been confirmed and fixed, and delete developers who have fixed too few bug reports together with the reports they fixed;
S12: extract text information: tokenize the text of each bug report, stem the words and remove stop words, and delete words whose frequency of occurrence is too high or too low;
S13: extract developer activity information: look back over a period of time from the timestamp of each bug report, collect the historical bug reports in that period that belong to the same category as the current report, and extract the fixers of those historical reports in order to form the developer fix sequence of the current bug report;
S14: split the preprocessed data set into a training set and a test set;
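The screening and text-extraction steps S11 and S12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields ("status", "fixer", "text"), the "FIXED" status value, the thresholds, the stop-word list and the crude suffix stripper are all hypothetical stand-ins, since the patent does not specify them.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "in", "on", "to", "and", "of", "when"}

def crude_stem(word):
    """Rough suffix stripping; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def filter_reports(reports, min_fixed=2):
    """S11: keep fixed reports whose fixer resolved at least min_fixed reports."""
    fixed = [r for r in reports if r["status"] == "FIXED"]
    counts = Counter(r["fixer"] for r in fixed)
    return [r for r in fixed if counts[r["fixer"]] >= min_fixed]

def tokenize(text):
    """S12: split into words, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]

def prune_vocabulary(token_lists, min_df=2, max_df_ratio=0.9):
    """S12: drop words that occur in too few or too many reports."""
    df = Counter(w for tokens in token_lists for w in set(tokens))
    n = len(token_lists)
    keep = {w for w, c in df.items() if c >= min_df and c / n <= max_df_ratio}
    return [[w for w in tokens if w in keep] for tokens in token_lists]
```

The frequency cut-offs are expressed here as document frequencies; the patent only says that overly frequent and overly rare words are deleted, so other definitions of frequency would fit the text equally well.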
Further, the CLBT model in S2 is built and trained as follows:
S21: encode the text information and the developer activity information: convert every word into a vector of equal length using one-hot encoding, and convert the developers into equal-length vectors in the same way;
S22: feed the encoded text vectors into a bidirectional recurrent neural network to extract the word-order features between words;
S23: extract the semantic and contextual features of words: feed the encoded text vectors into a convolutional neural network, slide several convolution kernels of different sizes over the word sequence to obtain a feature map for each kernel, apply max pooling to each feature map to reduce the output dimension and retain the salient features, and use the retained salient features as the extracted high-level features;
S24: feed the encoded developer activity information into a unidirectional recurrent neural network to extract the high-level features of developer activity;
S25: fuse the word-order features, the semantic and contextual features, and the high-level developer activity features produced in S22 to S24 by element-wise multiplication, and feed the fused features to the output layer;
S26: the output layer applies the softmax function to obtain the recommendation probability for each developer;
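The encoding, fusion and output steps (S21, S25 and S26) reduce to simple vector operations, which the following plain-Python sketch illustrates. It assumes that the three feature vectors from S22 to S24 already have the same length, and the dense layer in front of the softmax (its weights and bias) is a hypothetical detail that the patent does not spell out.

```python
import math

def one_hot(index, size):
    """S21: encode an item as a fixed-length vector with a single 1."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def fuse(*features):
    """S25: element-wise product of equally sized feature vectors."""
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")
    fused = [1.0] * length
    for f in features:
        fused = [a * b for a, b in zip(fused, f)]
    return fused

def softmax(scores):
    """S26: turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def output_layer(fused, weights, bias):
    """Hypothetical dense layer (one weight row per developer) plus softmax."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(scores)
```

Element-wise multiplication requires the three branches to produce vectors of identical size, so in a real model each branch would presumably end in a layer that projects to a common dimension.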
Further, S3 is specifically carried out as follows:
S31: load the trained neural network model, keep all parameters fixed, and input the preprocessed test set;
S32: for each sample in the test set, return a developer recommendation list.
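Once the model has produced a probability for every developer, steps S3 and S32 amount to sorting. The sketch below shows one way to build the recommendation list and to pick the assignee; the function names and the list length k are illustrative assumptions, not values taken from the patent.

```python
def recommend(probabilities, developers, k=3):
    """S32: rank developers by predicted probability, highest first."""
    ranked = sorted(
        zip(developers, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

def assign(probabilities, developers):
    """S3: assign the report to the developer with the highest probability."""
    return recommend(probabilities, developers, k=1)[0]
```

For example, with developers ["alice", "bob", "carol"] and probabilities [0.2, 0.5, 0.3], the top-2 list is ["bob", "carol"] and the report is assigned to "bob".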
With the above technical solution, the software bug assignment method based on convolutional and recurrent neural networks provided by the invention first performs quantitative feature extraction on the hierarchical relationships within a whole sentence and on the long- and short-range dependencies between words, taking into account the word-order information used in the prior art. It then uses convolutional neural networks (CNN) to further extract the semantic and contextual features of words for use in bug report assignment, so that the text information is mined more fully and effectively.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments recorded in this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the method of the present invention.
Detailed description of the embodiments
To make the technical solution and advantages of the present invention clearer, the technical solution in the embodiments is described clearly and completely below with reference to the drawings in the embodiments:
As shown in Fig. 1, a software bug assignment method based on convolutional and recurrent neural networks (Convolution LSTM Bug Triage, CLBT) comprises the following steps:
S1: obtain a raw bug report data set from a selected open-source project and preprocess it into a training set and a test set;
S2: feed the samples in the training set into the CLBT model one by one and train all parameters of the model until they converge, which completes the training of the model;
S3: feed the samples in the test set one by one into the trained CLBT model; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability;
Further, S1 is specifically carried out as follows:
S11: screen the bug reports: keep the reports that have been confirmed and fixed, and delete developers who have fixed too few bug reports together with the reports they fixed;
S12: extract text information: tokenize the text of each bug report, stem the words and remove stop words, and delete words whose frequency of occurrence is too high or too low;
S13: extract developer activity information: look back over a period of time from the timestamp of each bug report, collect the historical bug reports in that period that belong to the same category as the current report, and extract the fixers of those historical reports in order to form the developer fix sequence of the current bug report;
S14: split the preprocessed data set into a training set and a test set;
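Step S13 can be read as a windowed lookup over the fix history. The sketch below is one possible interpretation, not the patented implementation: the field names ("category", "created", "fixed_at", "fixer"), the 30-day window, and the choice to order fixers by fix time are all assumptions, because the patent leaves the window length and the meaning of "category" open.

```python
from datetime import datetime, timedelta

def activity_sequence(current, history, window_days=30):
    """S13: fixers of same-category reports fixed in the window before `current`.

    current: dict with "category" and "created" (datetime)
    history: list of dicts with "category", "fixer" and "fixed_at" (datetime)
    """
    start = current["created"] - timedelta(days=window_days)
    in_window = [
        r for r in history
        if r["category"] == current["category"]
        and start <= r["fixed_at"] < current["created"]
    ]
    # Oldest fix first, so the sequence reflects the order of activity.
    in_window.sort(key=lambda r: r["fixed_at"])
    return [r["fixer"] for r in in_window]
```

The resulting list of developer names is what S21 would then encode and S24 would feed into the unidirectional recurrent network.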
Further, the CLBT model in S2 is built and trained as follows:
S21: encode the text information and the developer activity information: convert every word into a vector of equal length using one-hot encoding, and convert the developers into equal-length vectors in the same way;
S22: feed the encoded text vectors into a bidirectional recurrent neural network to extract the word-order features between words;
S23: extract the semantic and contextual features of words: feed the encoded text vectors into a convolutional neural network, slide several convolution kernels of different sizes over the word sequence to obtain a feature map for each kernel, apply max pooling to each feature map to reduce the output dimension and retain the salient features, and use the retained salient features as the extracted high-level features;
S24: feed the encoded developer activity information into a unidirectional recurrent neural network to extract the high-level features of developer activity;
S25: fuse the word-order features, the semantic and contextual features, and the high-level developer activity features produced in S22 to S24 by element-wise multiplication, and feed the fused features to the output layer;
S26: the output layer applies the softmax function to obtain the recommendation probability for each developer;
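The convolution and pooling in S23 can be illustrated without any deep learning library. The sketch below slides kernels of different window sizes over a sequence of word vectors and keeps the maximum of each feature map. The ReLU activation, the absence of padding and the hand-written kernel weights are assumptions made for the example; in the actual model the kernel weights would be learned during training.

```python
def conv1d(sequence, kernel, bias=0.0):
    """Slide one kernel over a sequence of word vectors (no padding, ReLU).

    sequence: list of word vectors, each of length d
    kernel:   list of h weight vectors of length d, covering a window of h words
    """
    h = len(kernel)
    outputs = []
    for start in range(len(sequence) - h + 1):
        window = sequence[start:start + h]
        total = bias
        for word_vec, weight_vec in zip(window, kernel):
            total += sum(x * w for x, w in zip(word_vec, weight_vec))
        outputs.append(max(0.0, total))  # ReLU is an assumption
    return outputs

def extract_features(sequence, kernels):
    """S23: max pooling keeps one salient value per kernel."""
    features = []
    for kernel in kernels:
        feature_map = conv1d(sequence, kernel)
        # A sequence shorter than the kernel yields an empty map.
        features.append(max(feature_map) if feature_map else 0.0)
    return features
```

Because each kernel contributes exactly one pooled value, the length of the output depends only on the number of kernels and not on the length of the bug report, which is what lets reports of different lengths share one output layer.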
Further, S3 is specifically carried out as follows:
S31: load the trained neural network model, keep all parameters fixed, and input the preprocessed test set;
S32: for each sample in the test set, return a developer recommendation list.
The above is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited to it. Any equivalent substitution or modification that a person skilled in the art makes within the technical scope disclosed by the invention, according to its technical solution and inventive concept, shall fall within the scope of protection of the invention.

Claims (4)

1. A software bug assignment method based on convolutional and recurrent neural networks, characterized by comprising the following steps:
S1: obtain a raw bug report data set from a selected open-source project and preprocess it into a training set and a test set;
S2: feed the samples in the training set into the CLBT model one by one and train all parameters of the CLBT model until they converge, which completes the training of the model;
S3: feed the samples in the test set one by one into the trained CLBT model; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability.
2. The software bug assignment method based on convolutional and recurrent neural networks according to claim 1, further characterized in that S1 is specifically carried out as follows:
S11: screen the bug reports: keep the reports that have been confirmed and fixed, and delete developers who have fixed too few bug reports together with the reports they fixed;
S12: extract text information: tokenize the text of each bug report, stem the words and remove stop words, and delete words whose frequency of occurrence is too high or too low;
S13: extract developer activity information: look back over a period of time from the timestamp of each bug report, collect the historical bug reports in that period that belong to the same category as the current report, and extract the fixers of those historical reports in order to form the developer fix sequence of the current bug report;
S14: split the preprocessed data set into a training set and a test set.
3. The software bug assignment method based on convolutional and recurrent neural networks according to claim 1, further characterized in that the CLBT model in S2 is built and trained as follows:
S21: encode the text information and the developer activity information: convert every word into a vector of equal length using one-hot encoding, and convert the developers into equal-length vectors in the same way;
S22: feed the encoded text vectors into a bidirectional recurrent neural network to extract the word-order features between words;
S23: extract the semantic and contextual features of words: feed the encoded text vectors into a convolutional neural network, slide several convolution kernels of different sizes over the word sequence to obtain a feature map for each kernel, apply max pooling to each feature map to reduce the output dimension and retain the salient features, and use the retained salient features as the extracted high-level features;
S24: feed the encoded developer activity information into a unidirectional recurrent neural network to extract the high-level features of developer activity;
S25: fuse the word-order features, the semantic and contextual features, and the high-level developer activity features produced in S22 to S24 by element-wise multiplication, and feed the fused features to the output layer;
S26: the output layer applies the softmax function to obtain the recommendation probability for each developer.
4. The software bug assignment method based on convolutional and recurrent neural networks according to claim 1, further characterized in that S3 is specifically carried out as follows:
S31: load the trained neural network model, keep all parameters fixed, and input the preprocessed test set;
S32: for each sample in the test set, return a developer recommendation list.
CN201811528908.4A 2018-12-13 2018-12-13 A software bug assignment method based on convolutional and recurrent neural networks Pending CN109615241A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811528908.4A CN109615241A (en) 2018-12-13 2018-12-13 A software bug assignment method based on convolutional and recurrent neural networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811528908.4A CN109615241A (en) 2018-12-13 2018-12-13 A software bug assignment method based on convolutional and recurrent neural networks

Publications (1)

Publication Number Publication Date
CN109615241A true CN109615241A (en) 2019-04-12

Family

ID=66008330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811528908.4A Pending CN109615241A (en) 2018-12-13 2018-12-13 A software bug assignment method based on convolutional and recurrent neural networks

Country Status (1)

Country Link
CN (1) CN109615241A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472246A (en) * 2019-08-16 2019-11-19 上海掌学教育科技有限公司 Work order classification method, device and storage medium
CN113138920A (en) * 2021-04-20 2021-07-20 中国科学院软件研究所 Software defect report allocation method and device based on knowledge graph and semantic role labeling

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951512A (en) * 2017-03-17 2017-07-14 深圳市唯特视科技有限公司 An end-to-end dialogue control method based on hybrid code networks
CN107480141A (en) * 2017-08-29 2017-12-15 南京大学 A software defect assisted assignment method based on text and developer activity
WO2018153265A1 (en) * 2017-02-23 2018-08-30 腾讯科技(深圳)有限公司 Keyword extraction method, computer device, and storage medium
US20180261213A1 (en) * 2017-03-13 2018-09-13 Baidu Usa Llc Convolutional recurrent neural networks for small-footprint keyword spotting

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018153265A1 (en) * 2017-02-23 2018-08-30 腾讯科技(深圳)有限公司 Keyword extraction method, computer device, and storage medium
US20180261213A1 (en) * 2017-03-13 2018-09-13 Baidu Usa Llc Convolutional recurrent neural networks for small-footprint keyword spotting
CN106951512A (en) * 2017-03-17 2017-07-14 深圳市唯特视科技有限公司 An end-to-end dialogue control method based on hybrid code networks
CN107480141A (en) * 2017-08-29 2017-12-15 南京大学 A software defect assisted assignment method based on text and developer activity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
席圣渠 (Xi Shengqu) et al.: "Bug report assignment method based on recurrent neural networks", Journal of Software (《软件学报》) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472246A (en) * 2019-08-16 2019-11-19 上海掌学教育科技有限公司 Work order classification method, device and storage medium
CN113138920A (en) * 2021-04-20 2021-07-20 中国科学院软件研究所 Software defect report allocation method and device based on knowledge graph and semantic role labeling
CN113138920B (en) * 2021-04-20 2022-09-06 中国科学院软件研究所 Software defect report allocation method and device based on knowledge graph and semantic role labeling

Similar Documents

Publication Publication Date Title
CN110597735B (en) Software defect prediction method for open-source software defect feature deep learning
CN108509969B (en) Data labeling method and terminal
CN110457260A (en) Document handling method, device, equipment and computer readable storage medium
CN112000771B (en) Judicial public service-oriented sentence pair intelligent semantic matching method and device
CN103455896B (en) With no paper assembling Quality Control method based on Internet of Things
CN111860981A (en) Enterprise national industry category prediction method and system based on LSTM deep learning
CN110766438B (en) Method for analyzing user behavior of power grid user through artificial intelligence
CN106201472A (en) The method for scheduling task of software development and device
CN113010635B (en) Text error correction method and device
CN115757124A (en) Test case generation method based on neural network
CN109615241A (en) A software bug assignment method based on convolutional and recurrent neural networks
CN106355303A (en) Data model automatic evaluation system
CN105426419A (en) System and method for data promotion among heterogeneous systems
CN109375904A (en) A kind of computer software development approach based on model
CN113705215A (en) Meta-learning-based large-scale multi-label text classification method
CN112699235A (en) Method, equipment and system for analyzing and evaluating resume sample data
CN113672732A (en) Method and device for classifying business data
CN116150404A (en) Educational resource multi-modal knowledge graph construction method based on joint learning
CN110852076B (en) Method and device for automatic disease code conversion
CN109828750A (en) Auto-configuration data buries method, apparatus, electronic equipment and storage medium a little
CN116663540A (en) Financial event extraction method based on small sample
CN109977128A (en) Electric Power Network Planning data fusion method based on tense dimension
Visalli et al. ESG Data Collection with Adaptive AI.
CN115345600B (en) RPA flow generation method and device
CN111651960A (en) Optical character joint training and recognition method for moving from contract simplified form to traditional form

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190412
