CN113326062A - Software defect-oriented multi-round automatic question and answer method, system, computer equipment and storage medium - Google Patents
Software defect-oriented multi-round automatic question and answer method, system, computer equipment and storage medium
- Publication number
- CN113326062A
- Authority
- CN
- China
- Prior art keywords
- defect
- software
- user
- maintainer
- question
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/72—Code refactoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3624—Software debugging by performing operations on the source code, e.g. via a compiler
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a software defect-oriented multi-round automatic question answering method, which belongs to the field of software maintenance and comprises the following steps: crawling defect reports from an open source Bug management library, extracting the information in each report that helps defect understanding, extracting entities and relations from long texts, performing knowledge fusion and quality detection, and constructing a software defect knowledge graph; recording the multiple exchanges between a software developer or maintainer and the system, and constructing a multi-round dialogue memory module; constructing a user portrait of the software developer or maintainer according to the software defect questions that the software developer or maintainer asks; and constructing a multi-turn question-answering module according to the dialogue memory and the user portrait of the software developer or maintainer.
Description
Technical Field
The invention belongs to the field of software maintenance, and particularly relates to a software defect-oriented multi-round automatic question answering method and system.
Background
Software defects are also often called bugs. A software defect is a problem, error or hidden functional flaw in computer software or a program that impairs its normal operation. The presence of defects may cause a software product to fall short of user needs to some extent. IEEE 729-1983 gives a standard definition of a defect: viewed from inside the product, a defect is any of various problems, such as errors and faults, that arise during the development or maintenance of the software product; viewed from outside the product, a defect is a failure or violation of some function that the system needs to implement.
The main purpose of defect repair technology in industry is to assist developers in repairing defects and to improve the efficiency and quality of defect repair. When a new defect is assigned, the developer has only the defect report submitted by a user, which describes the defect symptoms briefly, carries little information and is of low quality, so defect analysis becomes very difficult. According to statistics, it takes developers an average of 200 days to repair a defect, and more than 50% of that time is spent understanding the defect. Therefore, helping developers analyze defects in depth and understand them accurately is the key to accurate defect repair.
When a software developer retrieves answers directly from a conventional search engine or a software defect library, the system understands the user's question ambiguously and the returned answers are not accurate enough, so it is difficult to help the developer repair defects effectively.
At present, some automatic question-answering methods for software defects are based on template matching or deep learning. They interact with the user a limited number of times, typically one question and one answer, so they understand the user's real intention insufficiently, perform poorly, and cannot provide accurate and satisfactory answers to software developers and maintainers.
Disclosure of Invention
The invention aims to provide a multi-round automatic question and answer method which can carry out multiple rounds of interaction with a software developer and a maintainer and fully understand the real intention of a questioner.
The invention also provides a software defect-oriented multi-turn automatic question answering system.
In order to realize the purpose of the invention, the technical solution of the provided software defect-oriented multi-round automatic question-answering method is as follows:
a software defect-oriented multi-round automatic question answering method comprises the following steps:
step 1, crawling defect reports from an open source Bug management library, extracting the information in each report that helps defect understanding, extracting entities and relations from long texts, performing knowledge fusion and quality detection, and constructing a software defect knowledge graph;
step 2, recording the multiple exchanges between a software developer or maintainer and the system, and constructing a multi-round dialogue memory module;
step 3, constructing a user portrait of the software developer or maintainer according to the software defect questions asked by the software developer or maintainer;
step 4, constructing a guided multi-turn question-answering module according to the dialogue memory and the user portrait.
Further, the construction of the software defect knowledge graph module comprises the following three stages:
the first stage: crawling defect reports from an open source Bug management library and preprocessing them, namely extracting the information important for defect analysis and understanding from the large number of attributes and Description texts in the crawled defect reports, wherein the extracted defect report information comprises a defect number (Bug ID), a defect title (Title), a product (Product), a component (Component), a severity (Severity), a modified state (Modified), a defect handler (identifier), a defect reporter (Reporter) and defect description information (Description);
the second stage: information extraction, namely extracting entities, attributes and the interrelations among entities from the Title and Description information extracted in the first stage, and forming an ontological knowledge representation on this basis;
the third stage: knowledge fusion and quality detection, namely performing knowledge fusion on the acquired defect knowledge to eliminate contradictions and ambiguities, then performing quality evaluation and adding the qualified parts, namely defect knowledge entities with complete semantic expression and no ambiguity or contradiction, to the knowledge base.
Further, the knowledge fusion specifically comprises the following steps: calculating the similarity sim1 between two entities by using the Levenshtein distance, namely the minimum edit distance; calculating the similarity sim2 between the two entities by using the Dice coefficient; and calculating the simple arithmetic mean sim3 of sim1 and sim2:
sim3=(sim1+sim2)/2
If sim3 is greater than the set threshold, the two entities are regarded as the same entity.
Further, constructing the multi-turn dialogue memory module in step 2 specifically includes: the module comprises a user dialogue state tracking module and a dialogue strategy module, wherein the user dialogue state tracking module predicts the user's goal in each round of interaction, manages the input and the question-answer history of each round, and outputs the current dialogue state; and the dialogue strategy module takes the optimal action according to the dialogue state to assist the user in completing the task of obtaining an answer.
Further, in step 3, constructing the user portrait of the software developer or maintainer according to the software defect questions asked by the software developer or maintainer specifically includes collecting the everyday questions of the software developer or maintainer through the system and compiling statistics on their needs when analyzing software defect problems.
Further, step 4, constructing a guided multi-turn question-answering module according to the dialogue memory and the user portrait of the software developer or maintainer specifically includes:
step 4-1, user question preprocessing: analyzing and completing the question sentence by combining the question-answer context information in the multi-turn dialogue memory module so as to standardize the user question, extracting the defect entities and relations in the question sentence, and deleting words that are meaningless for defect understanding;
step 4-2, graph search and reasoning: mapping the extracted defect entities and relations into Cypher, the structured query language of the Neo4j graph database, to perform a subgraph search;
step 4-3, answer ranking: combining the user characteristics obtained from the user portrait with the candidate answer list, scoring the candidate answers through a LambdaRank model, and ranking the candidate answers by score.
Correspondingly, the software defect-oriented multi-round automatic question answering system provided by the invention can adopt the following technical scheme:
a software defect-oriented multi-round automatic question-answering system comprises:
a first module, used for crawling defect reports from an open source Bug management library, extracting the information in each report that helps defect understanding, extracting entities and relations from long texts, performing knowledge fusion and quality detection, and constructing a software defect knowledge graph;
a second module, used for recording the multiple exchanges between a software developer or maintainer and the system and constructing a multi-round dialogue memory module;
a third module, used for constructing a user portrait of the software developer or maintainer according to the software defect questions asked by the software developer or maintainer;
and a fourth module, used for constructing a guided multi-turn question-answering module according to the dialogue memory and the user portrait.
Compared with the prior art, the invention has the following notable advantages: 1) a knowledge graph of the software defect field is constructed to support question answering, which is more effective and more reliable than the traditional approach; 2) by building on the software defect knowledge graph and generating database query statements, questions can be matched and answers queried efficiently; 3) through multiple rounds of question answering, the user's question can be understood accurately and an accurate answer given, which improves user satisfaction.
Drawings
FIG. 1 is a flow chart of a software bug-oriented multi-round automatic question-answering method in one embodiment.
FIG. 2 is a flow diagram illustrating specific steps in a multi-round question answering system in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, with reference to fig. 1, the present invention provides a software defect-oriented multi-round automatic question-answering method, which includes the following steps:
step 1, crawling defect reports from an open source Bug management library, extracting the information in each report that helps defect understanding, extracting entities and relations from long texts, performing knowledge fusion and quality detection, and constructing a software defect knowledge graph;
step 2, recording the multiple interactions between a software developer or maintainer and the system, and constructing a multi-round dialogue memory module;
step 3, constructing a user portrait of the software developer or maintainer according to the software defect questions asked by the software developer or maintainer;
step 4, constructing a guided multi-turn question-answering module according to the dialogue memory and the user portrait of the software developer or maintainer.
Further, the specific process of constructing the software defect knowledge graph in the step 1 comprises the following steps:
step 1-1, crawling defect reports from an open source Bug management library and preprocessing them: the crawled defect reports contain a large number of attributes and Description texts, from which the information important for defect analysis and understanding is extracted, the extracted defect report information comprising a defect number (Bug ID), a defect title (Title), a product (Product), a component (Component), a severity (Severity), a modified state (Modified), a defect handler (identifier), a defect reporter (Reporter) and defect description information (Description);
step 1-2, information extraction: extracting entities, attributes and the interrelations among entities from the Title and Description information extracted in step 1-1, and forming an ontological knowledge representation on this basis;
step 1-3, knowledge fusion and quality detection: after the defect knowledge is acquired, knowledge fusion is performed to eliminate contradictions and ambiguities; quality evaluation is then carried out, and the qualified parts, namely defect knowledge entities with complete semantic expression and no ambiguity or contradiction, are added to the knowledge base.
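The preprocessing in step 1-1 can be pictured as reducing a crawled record with many attributes to the nine fields listed above. The following is only a minimal sketch: the `DefectReport` class, the `preprocess` function and the snake_case keys of the raw record are hypothetical names chosen for illustration, not part of the described method.

```python
from dataclasses import dataclass

# Hypothetical snake_case keys for the nine attributes named in step 1-1.
KEPT_FIELDS = (
    "bug_id", "title", "product", "component", "severity",
    "modified", "identifier", "reporter", "description",
)


@dataclass
class DefectReport:
    bug_id: str
    title: str
    product: str
    component: str
    severity: str
    modified: str
    identifier: str  # the defect handler
    reporter: str
    description: str


def preprocess(raw: dict) -> DefectReport:
    """Keep only the nine attributes; missing ones become empty strings."""
    kept = {name: str(raw.get(name, "")).strip() for name in KEPT_FIELDS}
    return DefectReport(**kept)
```

Any attribute outside `KEPT_FIELDS` is silently dropped, which mirrors the idea of keeping only the information that matters for defect understanding.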
Further, the information extraction in step 1-2 specifically includes: applying a Bi-LSTM deep neural network combined with an attention mechanism to perform defect entity classification and entity relation recognition.
Further, the knowledge fusion and quality detection in step 1-3 specifically include:
step 1-3-1, calculating the similarity between every two entities by using Levenshtein distance, namely the minimum edit distance, wherein the calculation formula is as follows:
$$\mathrm{sim1} = 1 - \frac{\mathrm{lev}_{a,b}(|a|,|b|)}{\max(|a|,|b|)}$$
in the formula, sim1 is a similarity value between every two entities calculated by using a Levenshtein distance, and a and b are two entity character strings;
step 1-3-2, calculating the similarity sim2 between every two entities by using the Dice coefficient, wherein the calculation formula is as follows:
$$\mathrm{sim2} = \frac{2\,|A \cap B|}{|A| + |B|}$$
in the formula, sim2 is the entity similarity calculated by using the Dice coefficient, and A and B respectively represent the two entities;
step 1-3-3, calculating the simple arithmetic mean sim3 of sim1 and sim2, namely sim3=(sim1+sim2)/2; if sim3 is greater than the set threshold, the two entities are regarded as the same entity.
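Steps 1-3-1 to 1-3-3 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the text does not say how an entity is turned into a set for the Dice coefficient, so character bigrams are assumed here, and the threshold value 0.8 is a placeholder rather than a value given in the source.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def sim1(a: str, b: str) -> float:
    """Edit-distance similarity: 1 - lev(a, b) / max(|a|, |b|)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigrams(s: str) -> set:
    # Assumption: entities are compared as sets of character bigrams.
    return {s[i:i + 2] for i in range(len(s) - 1)}


def sim2(a: str, b: str) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) over the bigram sets."""
    A, B = bigrams(a), bigrams(b)
    if not A and not B:
        return 1.0 if a == b else 0.0
    return 2 * len(A & B) / (len(A) + len(B))


def same_entity(a: str, b: str, threshold: float = 0.8) -> bool:
    """Merge two entities when sim3 = (sim1 + sim2) / 2 exceeds the threshold."""
    return (sim1(a, b) + sim2(a, b)) / 2 > threshold
```

Averaging the two scores lets an order-sensitive measure (edit distance) and an overlap-based measure (Dice) compensate for each other before the merge decision.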
Further, constructing the multi-round dialogue memory module in step 2 specifically includes: the module comprises a user dialogue state tracking module and a dialogue strategy module, wherein the user dialogue state tracking module predicts the user's goal in each round of interaction, manages the input and the question-answer history of each round, and outputs the current dialogue state; and the dialogue strategy module takes the optimal action (such as providing results or confirming requirements) according to the dialogue state, thereby effectively assisting the user in completing the task of obtaining an answer.
Further, in step 3, constructing the user portrait of the software developer or maintainer according to the software defect questions asked by the software developer or maintainer specifically includes collecting the everyday questions of the software developer or maintainer through the system, so as to learn their needs when analyzing software defect problems.
Further, step 4, constructing a guided multi-turn question-answering module according to the dialogue memory and the user portrait of the software developer or maintainer specifically includes:
step 4-1, user question preprocessing: analyzing and completing the question sentence by combining the question-answer context information in the multi-turn dialogue memory module, standardizing the user question, extracting the defect entities and relations in the question sentence, and deleting stop words, prepositions and other words that are meaningless for defect understanding;
step 4-2, graph search and reasoning: mapping the extracted defect entities and relations into Cypher, the structured query language of the Neo4j graph database, to perform a subgraph search;
step 4-3, answer ranking: combining the user characteristics obtained from the user portrait with the candidate answer list, scoring the candidate answers through a LambdaRank model, and ranking the candidate answers by score.
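The mapping in step 4-2 from an extracted entity and relation to a Cypher statement might look like the sketch below. The node label `Defect`, the `name` property and the example relation type are assumptions made for illustration; the source does not specify the graph schema. Because Cypher does not allow a relationship type to be passed as a query parameter, the relation is validated and interpolated, while the entity value and the limit are passed as parameters.

```python
import re

# A relationship type is interpolated into the query text, so restrict it
# to plain identifiers to rule out query injection.
_REL_TYPE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_cypher(entity: str, relation: str, limit: int = 10):
    """Return (query, params) for a one-hop subgraph search.

    The `Defect` label and `name` property are hypothetical schema names.
    """
    if not _REL_TYPE.match(relation):
        raise ValueError(f"invalid relationship type: {relation!r}")
    query = (
        "MATCH (d:Defect {name: $entity})-[r:" + relation + "]->(n) "
        "RETURN n.name AS answer LIMIT $limit"
    )
    return query, {"entity": entity, "limit": limit}
```

The returned pair would then be handed to whatever Neo4j client the system uses; the rows returned by the query form the candidate answer list that step 4-3 ranks.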
As a specific example, in one embodiment, the software defect-oriented multi-round automatic question answering method of the present invention is further verified and explained with reference to fig. 1, and includes the following contents:
1. Crawling defect reports from an open source Bug management library, extracting the information in each report that helps defect understanding, extracting entities and relations from long texts, performing knowledge fusion and quality detection, and constructing a software defect knowledge graph. In this embodiment, for a crawled bug report whose Title is "Fix dimensions used by XUL syntax change" and whose Description is "This is a clinical bug location I has data loss less cause I can be found to be more and less than find. when the term I is not the Page and Properties are found, the term com up and documents the term search is not the same as that of the term book with third view and property on machine and tool work", the entities and relations obtained after entity and relation extraction, knowledge fusion and quality detection are shown in the following table:
2. Recording the multiple interactions between a software developer or maintainer and the system, and constructing a multi-turn dialogue memory module. The multi-turn dialogue management module takes triples of the software defect knowledge graph as input. It consists mainly of two parts, a state tracking module and a dialogue strategy module. The state tracking module estimates the user's goal in each dialogue turn, manages the dialogue input and history of each turn, and generates the current dialogue state. The main function of the dialogue strategy module is to determine the best action according to the most recent dialogue state, so as to help the user complete the task of acquiring information or services; given the semantic representation of the user input and the current dialogue state, it outputs the next system behavior and the updated dialogue state. Dialogue management includes the following tasks:
Dialogue state maintenance: the dialogue state at time t+1 depends on the state at the previous time t, the system behavior at the previous time t, and the user behavior at the current time t+1.
System decision generation: based on the state produced by dialogue state maintenance, the system behavior is generated, that is, the system decides what to do next in response to the observed user input and its own previous feedback. After receiving the user's question, the system interacts with the knowledge graph and asks follow-up questions about the parts that are not specific or clear, so that the user can progressively complete and refine the question. The system can obtain the context information of the multi-round dialogue from the multi-round dialogue management module, including the context of the defect knowledge graph, the context of the question entities and relations, the semantic context of the question, and so on. Based on this information, the answer to the user's question can be found accurately.
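The two dialogue management tasks above can be illustrated with a small slot-based sketch. The slot names `entity` and `relation`, the action strings and the `DialogState` class are hypothetical; the source only states that the state at time t+1 is a function of the state at t, the system behavior at t and the user behavior at t+1, and that unclear parts of a question trigger a follow-up question.

```python
from dataclasses import dataclass, field

# Hypothetical slots that must be known before the graph can be searched.
REQUIRED_SLOTS = ("entity", "relation")


@dataclass
class DialogState:
    slots: dict = field(default_factory=dict)
    history: list = field(default_factory=list)
    last_action: str = "none"


def update_state(state: DialogState, system_action: str,
                 user_slots: dict) -> DialogState:
    """State maintenance: s(t+1) from s(t), system action a(t), user input u(t+1)."""
    return DialogState(
        slots={**state.slots, **user_slots},
        history=state.history + [(system_action, user_slots)],
        last_action=system_action,
    )


def choose_action(state: DialogState) -> str:
    """Decision generation: ask about the first missing slot, else answer."""
    missing = [s for s in REQUIRED_SLOTS if s not in state.slots]
    return f"ask:{missing[0]}" if missing else "provide_results"
```

A short run: starting from an empty `DialogState()`, `choose_action` asks for the entity; after the user supplies it, the next call asks for the relation; once both slots are filled the action becomes `provide_results`.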
3. According to the software defect questions asked by the software developer or maintainer, the user's emotion is analyzed to find the relevant characteristics of the user, and the user portrait of the software developer or maintainer is constructed so that the candidate answers in question answering can be ranked.
Emotion computation can be represented by a triple, as follows:
ST=<T,C,I>
where T denotes the set of user information, namely $T = \{t_1, t_2, \ldots, t_n\}$, i.e., the software defect questions posed by the user.
C denotes the set of emotion categories or of different tendency categories, namely $C = \{c_1, c_2, \ldots, c_n\}$. It can express discrete emotion features, and more complex emotions can be composed from basic emotions; the emotion features can therefore be divided into two or more categories according to the application purpose, so as to create different emotion classification models. The model directly reflects the basic understanding of emotion granularity.
I denotes the set of intensities of the different emotion features, namely $I = \{i_1, i_2, \ldots, i_n\}$. Intensity can generally be divided into 3 grades (high, medium and low) or into 5 grades (extremely high, high, medium, low and extremely low). The intensity features combined with the emotion features form the core and foundation of emotion computation.
According to this definition, computing the user's emotion can be expressed as acquiring and identifying the software defect knowledge in the question entered by the user, so that the user emotion function is computed over the different dimensions. The emotion computation can thus be expressed as the state space formed by the Cartesian product of the three elements described above, namely:
ST=T×C×I
Through emotion computation, the system can extract the user's emotion features and build the user portrait through qualitative and quantitative analysis and behavior modeling, in preparation for ranking the candidate answers in question answering.
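The state space ST = T×C×I is an ordinary Cartesian product, which the following sketch enumerates. The example questions, the two emotion categories and the three intensity grades are placeholder values, and `state_space` is a hypothetical helper name.

```python
from itertools import product


def state_space(T, C, I):
    """Enumerate ST = T x C x I as (question, category, intensity) triples."""
    return list(product(T, C, I))


# Placeholder values for illustration only.
questions = ["Why does the page crash on save?", "How do I reproduce the crash?"]
categories = ["positive", "negative"]
intensities = ["high", "medium", "low"]

ST = state_space(questions, categories, intensities)
```

With 2 questions, 2 categories and 3 intensity grades the space holds 2 × 2 × 3 = 12 triples; an emotion classifier would select one triple per user question.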
4. The construction of the guided multi-turn question answering based on the dialogue memory and the user portrait of the software developer or maintainer is further described in conjunction with FIG. 2:
(1) question input: a software developer or maintainer inputs the software defect question to be queried;
(2) obtaining the multi-round dialogue context: the system acquires the multi-round dialogue context information from the multi-round dialogue memory module, including the context of the constructed knowledge graph, the software defect question entity and relation context, the semantic context, the user emotion context, and the like;
(3) user emotion analysis: calculating the emotion value of the user question based on the emotion calculation model of the dominant-predicate mode and the user's emotion context information, adding the emotion value to the user's emotion context information, and using it for constructing the user portrait and generating subsequent answers;
(4) user question preprocessing: preprocessing the question input by the user, including performing coreference resolution and sentence completion according to the multi-round dialogue context information, performing automatic grammatical error correction based on a Bi-LSTM + CRF model, and extracting the defect entities and relations in the question sentence, in preparation for the subsequent semantic full-text search and knowledge graph search;
(5) knowledge graph search: taking the entities and relations extracted from the question as conditions, performing a graph search in the Neo4j graph database with Cypher query statements, matching the node information of the defect knowledge graph, and obtaining the candidate answer list for the question;
(6) answer generation: inputting the candidate answer list and the preprocessed user question into a trained deep learning ranking model, the LambdaRank model, to obtain a ranking of the candidate answers by similarity to the user question; if the similarity of a candidate answer is higher than a specified threshold, outputting the candidate answer as the answer to the question; otherwise, prompting the user to ask again with a different type of question.
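The decision at the end of step (6) can be sketched as follows, assuming the ranking model has already produced a score for every candidate. The function name `select_answer`, the `(answer, score)` pair format and the threshold 0.5 are illustrative assumptions; the LambdaRank scoring itself is not reproduced here.

```python
from typing import Optional


def select_answer(scored_candidates, threshold: float = 0.5) -> Optional[str]:
    """Return the best-scored answer if it clears the threshold.

    `scored_candidates` is a list of (answer, score) pairs as a ranking
    model might produce them. None means no candidate is good enough and
    the user should be prompted to rephrase the question.
    """
    if not scored_candidates:
        return None
    best_answer, best_score = max(scored_candidates, key=lambda pair: pair[1])
    return best_answer if best_score > threshold else None
```

Returning `None` rather than a weak answer corresponds to the fallback described in step (6), where the user is asked to pose a different type of question.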
In addition, the present invention further provides an embodiment of a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the software defect-oriented multi-round automatic question-answering method when executing the computer program.
The present invention also provides an embodiment of a readable storage medium, on which a computer program is stored, which is characterized in that the computer program, when being executed by a processor, implements the steps of the above software defect-oriented multi-round automatic question-answering method.
Corresponding to the above software defect-oriented multi-round automatic question-answering method, this embodiment provides a software defect-oriented multi-round automatic question-answering system, which includes:
a first module, used for crawling defect reports from an open source Bug management library, extracting the information in each report that helps defect understanding, extracting entities and relations from long texts, performing knowledge fusion and quality detection, and constructing a software defect knowledge graph;
a second module, used for recording the multiple exchanges between a software developer or maintainer and the system and constructing a multi-round dialogue memory module;
a third module, used for constructing a user portrait of the software developer or maintainer according to the software defect questions asked by the software developer or maintainer;
and a fourth module, used for constructing a guided multi-turn question-answering module according to the dialogue memory and the user portrait.
The foregoing shows and describes the basic principles, main features, and advantages of the present invention. Those skilled in the art will understand that the present invention is not limited to the embodiments described above; the embodiments and the description merely illustrate the principle of the invention. Various changes and modifications may be made without departing from the spirit and scope of the invention, and all such changes and modifications fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.
Claims (9)
1. A software defect-oriented multi-round automatic question answering method is characterized by comprising the following steps:
step 1, crawling defect reports from an open-source bug management repository, extracting the information in the reports that helps in understanding the defects, extracting entities and relations from the long text, performing knowledge fusion and quality detection, and constructing a software defect knowledge graph;
step 2, recording the multiple rounds of communication between a software developer or maintainer and the system, and constructing a multi-round dialogue memory module;
step 3, constructing a user portrait of the software developer or maintainer from the software-defect-related questions that the developer or maintainer asks;
step 4, constructing a guided multi-round question-answering module from the dialogue memory and the user portrait.
2. The software defect-oriented multi-round automatic question-answering method according to claim 1, wherein the construction of the software defect knowledge graph module comprises the following three stages:
the first stage: crawling defect reports from an open-source bug management repository and preprocessing them, extracting the information important for defect analysis and understanding from the many attributes and the description text in the crawled reports, where the defect report information comprises a defect number (Bug ID), a defect title (Title), a product (Product), a component (Component), severity (Severity), a modified state (Modified), a defect handler (identifier), a defect reporter (Reporter), and defect description information (Description);
the second stage: information extraction, namely extracting entities, attributes, and the interrelations among entities from the Title and Description information extracted in the first stage, and forming an ontology-based knowledge representation on this basis;
the third stage: knowledge fusion and quality detection: after the defect knowledge is acquired, knowledge fusion is performed to eliminate contradictions and ambiguity; after quality evaluation, the qualified parts, namely defect knowledge entities with complete semantic expression and no ambiguity or contradiction, are added to the knowledge base.
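The first-stage preprocessing above can be sketched as keeping only the nine listed fields of each crawled report. The example below is a minimal sketch, not the patented implementation: the `DefectReport` class, the snake_case key names, and the rule that a report missing any field is rejected are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class DefectReport:
    """The nine report fields named in the claim (key names are hypothetical)."""
    bug_id: str
    title: str
    product: str
    component: str
    severity: str
    modified: str
    handler: str
    reporter: str
    description: str


def parse_report(raw: Dict[str, str]) -> Optional[DefectReport]:
    """Keep only the listed fields and drop everything else in the raw record.

    Returning None for an incomplete report is an assumed policy, standing in
    for whatever preprocessing rules the real system applies.
    """
    values = {}
    for f in fields(DefectReport):
        value = (raw.get(f.name) or "").strip()
        if not value:
            return None
        values[f.name] = value
    return DefectReport(**values)
```

The Title and Description of each parsed report would then feed the second-stage entity and relation extraction.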
3. The software defect-oriented multi-round automatic question answering method according to claim 2, wherein the knowledge fusion specifically comprises: calculating a similarity sim1 between two entities using the Levenshtein distance (the minimum edit distance), calculating a similarity sim2 between the two entities using the Dice coefficient, and taking the arithmetic mean sim3 of sim1 and sim2:
sim3 = (sim1 + sim2) / 2
If sim3 is greater than the set threshold, the two entities are treated as the same entity.
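The entity-matching rule above can be sketched in a few lines. The claim does not say how the edit distance is turned into a similarity or what units the Dice coefficient is computed over, so this sketch assumes a length-normalized distance for sim1 and character bigrams for sim2; the function names and the 0.8 threshold are likewise hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def sim_levenshtein(a: str, b: str) -> float:
    """sim1: edit distance normalized by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigrams(s: str) -> set:
    return {s[i:i + 2] for i in range(len(s) - 1)}


def sim_dice(a: str, b: str) -> float:
    """sim2: Dice coefficient over character-bigram sets (assumed unit)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def same_entity(a: str, b: str, threshold: float = 0.8) -> bool:
    """Merge decision: sim3 = (sim1 + sim2) / 2 compared with a hypothetical threshold."""
    sim3 = (sim_levenshtein(a, b) + sim_dice(a, b)) / 2
    return sim3 > threshold
```

Averaging the two measures tempers the weaknesses of each: edit distance is sensitive to word order, while bigram overlap ignores it.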
4. The software defect-oriented multi-round automatic question answering method according to claim 2, wherein the multi-round dialogue memory module constructed in step 2 is specifically as follows: the system comprises a user dialogue state tracking module and a dialogue policy module; the user dialogue state tracking module predicts the user's goal in each round of interaction, manages the input of each round and the interactive question-and-answer history, and outputs the current dialogue state; the dialogue policy module takes the optimal action according to the dialogue state to help the user complete the task of obtaining an answer.
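The split between state tracking and dialogue policy can be illustrated with a small rule-based sketch. The claim does not specify how the user's goal is predicted or how the optimal action is chosen, so the slot names (`product`, `symptom`), the action strings, and the "ask for the first missing slot" rule below are assumptions, not the patented method.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DialogueState:
    """What the system remembers about the conversation so far."""
    history: List[str] = field(default_factory=list)     # user inputs, one per round
    slots: Dict[str, str] = field(default_factory=dict)  # facts gathered so far


class StateTracker:
    """Accumulates each round's input and extracted facts into one state."""

    def __init__(self) -> None:
        self.state = DialogueState()

    def update(self, user_input: str, extracted: Dict[str, str]) -> DialogueState:
        # Facts from earlier rounds persist unless a later round overwrites them.
        self.state.history.append(user_input)
        self.state.slots.update(extracted)
        return self.state


def choose_action(state: DialogueState,
                  required: Tuple[str, ...] = ("product", "symptom")) -> str:
    """Toy policy: ask for the first missing slot, otherwise answer."""
    for slot in required:
        if slot not in state.slots:
            return f"ask:{slot}"
    return "answer"
```

In use, each user turn is passed through `StateTracker.update`, and the returned state is handed to `choose_action` to decide whether to ask a guiding question or to query the knowledge graph.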
5. The software defect-oriented multi-round automatic question-answering method according to claim 4, wherein constructing the user portrait of the software developer or maintainer in step 3 from the software-defect-related questions asked by the developer or maintainer specifically comprises: collecting, through the system, the questions that the software developer or maintainer asks day to day, and compiling statistics on the developer's or maintainer's needs in analyzing software defect problems.
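One simple reading of this statistical step is a frequency count over the topics of a user's past questions. The sketch below assumes each question has already been labeled with a topic string; the function name `build_portrait` and the choice of a top-N summary are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def build_portrait(question_topics: Iterable[str], top_n: int = 3) -> List[Tuple[str, int]]:
    """Summarize a user's interests as the most frequently asked topics.

    `question_topics` is assumed to hold one topic label per past question.
    """
    return Counter(question_topics).most_common(top_n)
```

The resulting topic counts could then serve as the user features that are combined with the candidate answers during ranking.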
6. The software defect-oriented multi-round automatic question-answering method according to claim 5, wherein the step 4 of constructing a guided multi-round question-answering module according to the dialogue memory and the user portrait of the software developer or maintainer specifically comprises:
step 4-1, preprocessing the user question: parsing and completing the question by combining it with the question-and-answer context in the multi-round question-and-answer memory module so as to standardize the user question, extracting the defect entities and relations in the user question, and deleting words that are meaningless for defect understanding;
step 4-2, graph search and reasoning: mapping the extracted defect entities and relations into Cypher, the structured query language of the Neo4j graph database, to perform a subgraph search;
step 4-3, answer ranking: scoring the candidate answers with a LambdaRank model that combines the user features obtained from the user portrait with the candidate answer list, and ranking the candidate answers by score.
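The mapping in step 4-2 from an extracted entity and relation to a Cypher statement can be sketched as a small query builder. The node label `Defect`, the `name` property, the `LIMIT 10`, and the function name `build_cypher` are assumptions about a hypothetical graph schema; the patent does not describe the schema. The entity value is passed as a query parameter, while the relationship type is placed in the query text and is therefore validated first.

```python
from typing import Dict, Tuple


def build_cypher(entity: str, relation: str) -> Tuple[str, Dict[str, str]]:
    """Build a one-hop subgraph query for a defect entity and a relation type.

    Returns the query text and its parameters. The schema (label `Defect`,
    property `name`) is hypothetical.
    """
    # The relation name is interpolated into the query text, so only plain
    # identifiers are accepted to keep arbitrary text out of the statement.
    if not relation.isidentifier():
        raise ValueError(f"invalid relation type: {relation!r}")
    query = (
        f"MATCH (d:Defect {{name: $name}})-[:{relation}]->(n) "
        "RETURN n.name AS answer LIMIT 10"
    )
    return query, {"name": entity}
```

The returned pair would be handed to a Neo4j driver session for execution, and the rows it returns would form the candidate answer list for step 4-3.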
7. A software defect-oriented multi-turn automatic question-answering system is characterized by comprising:
a first module, which crawls defect reports from an open-source bug management repository, extracts the information in the reports that helps in understanding the defects, extracts entities and relations from the long text, performs knowledge fusion and quality detection, and constructs a software defect knowledge graph;
a second module, which records the multiple rounds of communication between a software developer or maintainer and the system and constructs a multi-round dialogue memory module;
a third module, which constructs a user portrait of the software developer or maintainer from the software-defect-related questions that the developer or maintainer asks;
and a fourth module, which constructs a guided multi-round question-answering module from the dialogue memory and the user portrait.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented when the computer program is executed by the processor.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110569649.5A CN113326062A (en) | 2021-05-25 | 2021-05-25 | Software defect-oriented multi-round automatic question and answer method, system, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113326062A (en) | 2021-08-31 |
Family
ID=77416739
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110569649.5A Pending CN113326062A (en) | 2021-05-25 | 2021-05-25 | Software defect-oriented multi-round automatic question and answer method, system, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113326062A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160357663A1 (en) * | 2010-04-14 | 2016-12-08 | International Business Machines Corporation | Software defect reporting |
CN110209787A (en) * | 2019-05-29 | 2019-09-06 | 袁琦 | A kind of intelligent answer method and system based on pet knowledge mapping |
CN110377715A (en) * | 2019-07-23 | 2019-10-25 | 天津汇智星源信息技术有限公司 | Reasoning type accurate intelligent answering method based on legal knowledge map |
CN110413732A (en) * | 2019-07-16 | 2019-11-05 | 扬州大学 | The knowledge searching method of software-oriented defect knowledge |
CN110555153A (en) * | 2019-08-20 | 2019-12-10 | 暨南大学 | Question-answering system based on domain knowledge graph and construction method thereof |
CN111125309A (en) * | 2019-12-23 | 2020-05-08 | 中电云脑(天津)科技有限公司 | Natural language processing method and device, computing equipment and storage medium |
CN111597347A (en) * | 2020-04-24 | 2020-08-28 | 扬州大学 | Knowledge embedded defect report reconstruction method and device |
CN111666395A (en) * | 2020-05-18 | 2020-09-15 | 扬州大学 | Interpretable question answering method and device oriented to software defects, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110765257B (en) | Intelligent consulting system of law of knowledge map driving type | |
CN111026842B (en) | Natural language processing method, natural language processing device and intelligent question-answering system | |
CN117271767B (en) | Operation and maintenance knowledge base establishing method based on multiple intelligent agents | |
CN117743315B (en) | Method for providing high-quality data for multi-mode large model system | |
CN109145301B (en) | Information classification method and device and computer readable storage medium | |
CN109145168A (en) | A kind of expert service robot cloud platform | |
CN110008308B (en) | Method and device for supplementing information for user question | |
CN108829682A (en) | Computer readable storage medium, intelligent answer method and intelligent answer device | |
CN112416778A (en) | Test case recommendation method and device and electronic equipment | |
CN114647713A (en) | Knowledge graph question-answering method, device and storage medium based on virtual confrontation | |
CN115858807A (en) | Question-answering system based on aviation equipment fault knowledge map | |
CN117520522B (en) | Intelligent dialogue method and device based on combination of RPA and AI and electronic equipment | |
CN115438142B (en) | Conversational interactive data analysis report system | |
WO2019183517A1 (en) | Systems and methods using artificial intelligence to analyze natural language sources based on intelligent agent models | |
CN117891826A (en) | Method and device for constructing large vertical field model based on 12345 data | |
CN115878818B (en) | Geographic knowledge graph construction method, device, terminal and storage medium | |
Li et al. | A review of quality assurance research of dialogue systems | |
CN115757720A (en) | Project information searching method, device, equipment and medium based on knowledge graph | |
Wang | Construction of Data Mining Analysis Model in English Teaching Based on Apriori Association Rule Algorithm | |
CN113326062A (en) | Software defect-oriented multi-round automatic question and answer method, system, computer equipment and storage medium | |
CN110502675B (en) | Voice dialing user classification method based on data analysis and related equipment | |
CN109299381B (en) | Software defect retrieval and analysis system and method based on semantic concept | |
CN114186974A (en) | Multi-model fusion development task association method, device, equipment and medium | |
CN110046234B (en) | Question-answering model optimization method and device and question-answering robot system | |
Dikshit et al. | Automating Questions and Answers of Good and Services Tax system using clustering and embeddings of queries |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210831 |