CN109615241A - A software bug assignment method based on convolutional and recurrent neural networks - Google Patents
- Publication number
- CN109615241A (application number CN201811528908.4A)
- Authority
- CN
- China
- Prior art keywords
- developer
- feature
- bug
- training
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06311—Scheduling, planning or task assignment for a person or group
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention discloses a software bug assignment method based on convolutional and recurrent neural networks, comprising the following steps: S1: obtain an original bug report data set from a selected open-source project and preprocess it into a training set and a test set; S2: feed the samples in the training set into the CLBT model in turn and train all parameters of the CLBT model until they converge, completing the training of the model; S3: feed the samples in the test set into the trained CLBT model in turn; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability. The method first extracts quantified features of the hierarchical structure of the whole sentence and of the long- and short-range dependencies between words. While taking into account the word-order information used in the prior art, it further extracts the semantics and contextual features of words for the assignment of bug reports, and thus mines and exploits the text information more fully and effectively.
Description
Technical field
The present invention relates to the technical field of software testing, and in particular to a software bug assignment method based on convolutional and recurrent neural networks.
Background
Software bugs, i.e. software faults, are an inevitable product of the software development process. Repairing the bugs in software promptly is a prerequisite for guaranteeing software quality and keeping the system correct. To make it easier to collect and manage software bugs, software developers designed the bug repository (Bug Repository) to store and maintain them; the administrators of the repository read the bug reports and assign suitable developers to repair the bugs. As software development technology has matured, the number of software bugs has grown greatly, and the traditional manual approach to bug assignment is too time-consuming and inefficient to meet current needs. Researchers have therefore proposed using machine learning to automate bug assignment, which turns the bug assignment problem into a text classification problem, and this has become a research hotspot. Many studies, however, do not mine the text information fully: they often ignore the word order and the contextual features of the text. In addition, the related techniques perform poorly when they have to distinguish between similar developers.
Summary of the invention
To address the problems in the prior art, the invention discloses a software bug assignment method based on convolutional and recurrent neural networks, comprising the following steps:
S1: obtain an original bug report data set from a selected open-source project and preprocess it into a training set and a test set;
S2: feed the samples in the training set into the CLBT model in turn and train all parameters of the CLBT model until they converge, completing the training of the model;
S3: feed the samples in the test set into the trained CLBT model in turn; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability.
Further, S1 specifically proceeds as follows:
S11: screen the bug reports: keep the bug reports that have been confirmed and repaired, and delete the developers who have repaired too few bug reports together with the bug reports they repaired;
S12: extract the text information: tokenize the text of each bug report, stem the words and remove stop words, and delete the words whose frequency of occurrence is too high or too low;
S13: extract the developer activity information: starting from the timestamp of each bug report, look back over a period of time, collect the historical bug reports in that period that belong to the same class as the current bug report, and extract the fixers of those historical bug reports in order to form the developer repair sequence of the current bug report;
S14: divide the preprocessed data set into a training set and a test set.
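The screening and text extraction of S11 and S12 can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names `status`, `fixer` and `text`, the `FIXED` status value, the stop-word list, the thresholds `min_fixes`, `min_count` and `max_ratio`, and the toy `naive_stem` function (a stand-in for a real stemmer) are all hypothetical and do not come from the patent.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"the", "a", "an", "is", "in", "on", "to", "and", "of", "when"}


def naive_stem(word):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def screen_reports(reports, min_fixes=2):
    """S11: keep confirmed-and-fixed reports, then drop rarely active fixers."""
    fixed = [r for r in reports if r["status"] == "FIXED"]
    fixes_per_dev = Counter(r["fixer"] for r in fixed)
    return [r for r in fixed if fixes_per_dev[r["fixer"]] >= min_fixes]


def extract_tokens(reports, min_count=2, max_ratio=0.9):
    """S12: tokenize, remove stop words, stem, drop too-rare/too-common words."""
    tokenized = []
    for r in reports:
        words = re.findall(r"[a-z]+", r["text"].lower())
        tokenized.append([naive_stem(w) for w in words if w not in STOP_WORDS])
    # Document frequency: in how many reports does each word occur?
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    max_count = max_ratio * len(tokenized)
    keep = {w for w, c in doc_freq.items() if min_count <= c <= max_count}
    return [[w for w in toks if w in keep] for toks in tokenized]
```

Calling `screen_reports` first and `extract_tokens` on its result mirrors the order of S11 and S12; how the thresholds should be chosen for a particular project is left open by the source.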
Further, the CLBT model is built and trained in S2 as follows:
S21: encode the text information and the developer activity information: use one-hot encoding to turn every word into a vector of equal length, and likewise turn every developer into a vector of equal length;
S22: feed the encoded text vectors into a bidirectional recurrent neural network to extract the word-order features between words;
S23: extract the semantics of words and their contextual features: feed the encoded text vectors into a convolutional neural network, slide several convolution kernels of different sizes over the word sequence to obtain the feature maps under the different kernels, apply max pooling to these feature maps to reduce the output dimension while retaining the salient features, and take the retained salient features as the extracted high-level features;
S24: feed the encoded developer activity information into a unidirectional recurrent neural network to extract the high-level features of developer activity;
S25: fuse the word-order features, the semantic and contextual features, and the high-level developer activity features produced in S22 to S24 by element-wise multiplication, and feed the fused features into the output layer;
S26: the output layer applies the softmax function to obtain the recommendation probability for each developer.
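To make S21 and S23 more concrete, the following sketch one-hot encodes a word sequence, slides kernels of different widths over it and applies max pooling over time. It is a simplified illustration, not the patent's implementation: the function names, the ReLU activation, the zero returned for sequences shorter than a kernel, and the randomly initialised kernels (in a real model they would be learned in S2) are all assumptions.

```python
import random


def one_hot(index, size):
    """S21: equal-length vector with a single 1 at the given index."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def conv_max_pool(sequence, kernel):
    """S23: slide one kernel over the word sequence, then max-pool over time.

    sequence: list of word vectors (length T, each of dimension D)
    kernel:   list of `width` weight vectors, each of dimension D
    """
    width = len(kernel)
    feature_map = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Dot product between the kernel and the current window of words.
        activation = sum(
            w * x
            for k_row, x_row in zip(kernel, window)
            for w, x in zip(k_row, x_row)
        )
        feature_map.append(max(0.0, activation))  # ReLU, an assumed choice
    if not feature_map:  # sequence shorter than the kernel (assumed handling)
        return 0.0
    return max(feature_map)  # max pooling keeps the most salient response


def extract_cnn_features(sequence, kernels):
    """Collect the pooled response of every kernel into one feature vector."""
    return [conv_max_pool(sequence, k) for k in kernels]


def random_kernels(widths, per_width, dim, seed=0):
    """Untrained kernels of several widths, for illustration only."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
        for width in widths
        for _ in range(per_width)
    ]
```

Because each kernel contributes exactly one pooled value, the length of the resulting feature vector equals the number of kernels and does not depend on the length of the bug report.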
Further, S3 specifically proceeds as follows:
S31: load the trained neural network model, keep all parameters fixed, and input the preprocessed test set;
S32: for each sample in the test set, return a developer recommendation list.
With the above technical solution, the software bug assignment method based on convolutional and recurrent neural networks provided by the invention first extracts quantified features of the hierarchical structure of the whole sentence and of the long- and short-range dependencies between words. While taking into account the word-order information used in the prior art, it further uses convolutional neural networks (Convolutional Neural Networks, CNN) to extract the semantics and contextual features of words for the assignment of bug reports, and thus mines and exploits the text information more fully and effectively.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the method of the present invention.
Detailed description
To make the technical solution and advantages of the present invention clearer, the technical solution in the embodiments of the present invention is described clearly and completely below with reference to the drawings in the embodiments:
As shown in Fig. 1, a software bug assignment method based on convolutional and recurrent neural networks (Convolution LSTM Bug Triage, CLBT) comprises the following steps:
S1: obtain an original bug report data set from a selected open-source project and preprocess it into a training set and a test set;
S2: feed the samples in the training set into the CLBT model in turn and train all parameters of the model until they converge, completing the training of the model;
S3: feed the samples in the test set into the trained CLBT model in turn; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability.
Further, S1 specifically proceeds as follows:
S11: screen the bug reports: keep the bug reports that have been confirmed and repaired, and delete the developers who have repaired too few bug reports together with the bug reports they repaired;
S12: extract the text information: tokenize the text of each bug report, stem the words and remove stop words, and delete the words whose frequency of occurrence is too high or too low;
S13: extract the developer activity information: starting from the timestamp of each bug report, look back over a period of time, collect the historical bug reports in that period that belong to the same class as the current bug report, and extract the fixers of those historical bug reports in order to form the developer repair sequence of the current bug report;
S14: divide the preprocessed data set into a training set and a test set.
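The developer repair sequence of S13 might be built along the following lines. The sketch assumes that two reports belong to the same class when they share a `component` field, that the look-back period is a fixed number of days, and that each historical report records a `fixed_at` time and a `fixer`; none of these details is fixed by the source.

```python
from datetime import datetime, timedelta


def activity_sequence(current, history, window_days=30):
    """S13: fixers of same-class reports in the window before `current`.

    `window_days` and the use of `component` as the class of a report are
    assumptions made for this sketch.
    """
    start = current["created"] - timedelta(days=window_days)
    recent = [
        r for r in history
        if r["component"] == current["component"]
        and start <= r["fixed_at"] < current["created"]
    ]
    recent.sort(key=lambda r: r["fixed_at"])  # oldest fix first
    return [r["fixer"] for r in recent]
```

The returned list keeps repeated names, so a developer who fixed several recent reports of the same class appears several times, which is what lets a recurrent network in S24 pick up on how active that developer has been.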
Further, the CLBT model is built and trained in S2 as follows:
S21: encode the text information and the developer activity information: use one-hot encoding to turn every word into a vector of equal length, and likewise turn every developer into a vector of equal length;
S22: feed the encoded text vectors into a bidirectional recurrent neural network to extract the word-order features between words;
S23: extract the semantics of words and their contextual features: feed the encoded text vectors into a convolutional neural network, slide several convolution kernels of different sizes over the word sequence to obtain the feature maps under the different kernels, apply max pooling to these feature maps to reduce the output dimension while retaining the salient features, and take the retained salient features as the extracted high-level features;
S24: feed the encoded developer activity information into a unidirectional recurrent neural network to extract the high-level features of developer activity;
S25: fuse the word-order features, the semantic and contextual features, and the high-level developer activity features produced in S22 to S24 by element-wise multiplication, and feed the fused features into the output layer;
S26: the output layer applies the softmax function to obtain the recommendation probability for each developer.
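The fusion and output steps S25 and S26 can be illustrated with a small sketch. It assumes that the three feature vectors have already been brought to the same length (the source does not say how), and that the output layer is a single linear layer with one weight row per developer; the names `fuse`, `softmax` and `output_layer` are hypothetical.

```python
import math


def fuse(*feature_vectors):
    """S25: element-wise product of equally long feature vectors."""
    length = len(feature_vectors[0])
    if any(len(v) != length for v in feature_vectors):
        raise ValueError("all feature vectors must have the same length")
    fused = [1.0] * length
    for vec in feature_vectors:
        fused = [f * x for f, x in zip(fused, vec)]
    return fused


def softmax(scores):
    """S26: turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def output_layer(fused, weights, biases):
    """Assumed linear layer followed by softmax; one weight row per developer."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)
```

One consequence of multiplying rather than concatenating is that a feature dimension close to zero in any one of the three inputs suppresses that dimension in the fused vector.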
Further, S3 specifically proceeds as follows:
S31: load the trained neural network model, keep all parameters fixed, and input the preprocessed test set;
S32: for each sample in the test set, return a developer recommendation list.
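Given the per-developer probabilities produced by the output layer, the recommendation list of S32 and the final assignment of S3 could be derived as in the sketch below. The list length `k` and the function names are assumptions; the source only states that a recommendation list is returned and that the report goes to the developer with the highest probability.

```python
def recommend(probabilities, developers, k=3):
    """S32: the k developers with the highest recommendation probability."""
    ranked = sorted(
        zip(developers, probabilities), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]


def assign(probabilities, developers):
    """S3: dispatch the report to the single most probable developer."""
    return recommend(probabilities, developers, k=1)[0][0]
```

For example, with developers `["alice", "bob", "carol"]` and probabilities `[0.2, 0.7, 0.1]`, `assign` picks `"bob"`, and `recommend(..., k=2)` lists `bob` ahead of `alice`.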
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.
Claims (4)
1. A software bug assignment method based on convolutional and recurrent neural networks, characterized by comprising the following steps:
S1: obtain an original bug report data set from a selected open-source project and preprocess it into a training set and a test set;
S2: feed the samples in the training set into the CLBT model in turn and train all parameters of the CLBT model until they converge, completing the training of the model;
S3: feed the samples in the test set into the trained CLBT model in turn; for each sample the model returns a recommendation probability for every developer, and the sample is assigned to the developer with the highest probability.
2. The software bug assignment method based on convolutional and recurrent neural networks according to claim 1, further characterized in that S1 specifically proceeds as follows:
S11: screen the bug reports: keep the bug reports that have been confirmed and repaired, and delete the developers who have repaired too few bug reports together with the bug reports they repaired;
S12: extract the text information: tokenize the text of each bug report, stem the words and remove stop words, and delete the words whose frequency of occurrence is too high or too low;
S13: extract the developer activity information: starting from the timestamp of each bug report, look back over a period of time, collect the historical bug reports in that period that belong to the same class as the current bug report, and extract the fixers of those historical bug reports in order to form the developer repair sequence of the current bug report;
S14: divide the preprocessed data set into a training set and a test set.
3. The software bug assignment method based on convolutional and recurrent neural networks according to claim 1, further characterized in that the CLBT model is built and trained in S2 as follows:
S21: encode the text information and the developer activity information: use one-hot encoding to turn every word into a vector of equal length, and likewise turn every developer into a vector of equal length;
S22: feed the encoded text vectors into a bidirectional recurrent neural network to extract the word-order features between words;
S23: extract the semantics of words and their contextual features: feed the encoded text vectors into a convolutional neural network, slide several convolution kernels of different sizes over the word sequence to obtain the feature maps under the different kernels, apply max pooling to these feature maps to reduce the output dimension while retaining the salient features, and take the retained salient features as the extracted high-level features;
S24: feed the encoded developer activity information into a unidirectional recurrent neural network to extract the high-level features of developer activity;
S25: fuse the word-order features, the semantic and contextual features, and the high-level developer activity features produced in S22 to S24 by element-wise multiplication, and feed the fused features into the output layer;
S26: the output layer applies the softmax function to obtain the recommendation probability for each developer.
4. The software bug assignment method based on convolutional and recurrent neural networks according to claim 1, further characterized in that S3 specifically proceeds as follows:
S31: load the trained neural network model, keep all parameters fixed, and input the preprocessed test set;
S32: for each sample in the test set, return a developer recommendation list.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811528908.4A CN109615241A (en) | 2018-12-13 | 2018-12-13 | A software bug assignment method based on convolutional and recurrent neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811528908.4A CN109615241A (en) | 2018-12-13 | 2018-12-13 | A software bug assignment method based on convolutional and recurrent neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109615241A (en) | 2019-04-12 |
Family
ID=66008330
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811528908.4A Pending CN109615241A (en) | A software bug assignment method based on convolutional and recurrent neural networks | 2018-12-13 | 2018-12-13 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109615241A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951512A (en) * | 2017-03-17 | 2017-07-14 | 深圳市唯特视科技有限公司 | An end-to-end dialogue control method based on a hybrid code network |
CN107480141A (en) * | 2017-08-29 | 2017-12-15 | 南京大学 | A software defect assisted assignment method based on text and developer activity |
WO2018153265A1 (en) * | 2017-02-23 | 2018-08-30 | 腾讯科技(深圳)有限公司 | Keyword extraction method, computer device, and storage medium |
US20180261213A1 (en) * | 2017-03-13 | 2018-09-13 | Baidu Usa Llc | Convolutional recurrent neural networks for small-footprint keyword spotting |
Non-Patent Citations (1)
Title |
---|
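Xi Shengqu (席圣渠) et al.: "Bug report assignment method based on recurrent neural networks" (基于循环神经网络的缺陷报告分派方法), Journal of Software (《软件学报》) * |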
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110472246A (en) * | 2019-08-16 | 2019-11-19 | 上海掌学教育科技有限公司 | Work order classification method, device and storage medium |
CN113138920A (en) * | 2021-04-20 | 2021-07-20 | 中国科学院软件研究所 | Software defect report allocation method and device based on knowledge graph and semantic role labeling |
CN113138920B (en) * | 2021-04-20 | 2022-09-06 | 中国科学院软件研究所 | Software defect report allocation method and device based on knowledge graph and semantic role labeling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190412 |