Disclosure of Invention
The invention provides a method, a system and a computer storage medium for detecting expression defects in software requirement text, in order to solve the technical problem that conventional detection of expression errors in software requirement text is inefficient and inaccurate.
In order to solve the technical problems, the technical scheme provided by the invention is as follows:
A method for detecting expression defects in software requirement text, applied to a software requirement text detection system, comprises the following steps:
S1, reading a software requirement text;
S2, matching the words in the software requirement text one by one against the expression defect words in a preset expression defect database by using a dictionary matching algorithm;
S3, if a word in the software requirement text is successfully matched with an expression defect word in the expression defect database, judging that the word is an expression defect word;
S4, returning the expression defect sentence in the software requirement text in which the expression defect word is located, and the expression defect category corresponding to the expression defect word.
As a further improvement of the method of the invention:
in step S4, returning the expression defect category corresponding to the expression defect word includes: matching the expression defect word against a pre-stored expression defect word-expression defect category mapping table, and finding the expression defect category corresponding to the expression defect word.
As a further improvement of the method of the invention:
step S4 further includes returning a defect modification opinion corresponding to the expression defect category; the defect modification opinions are also stored in the expression defect word-expression defect category mapping table.
As a further improvement of the method of the invention:
the expression defect category comprises any one or a combination of: an ambiguous-reference defect, an ambiguous-degree defect, an ambiguous-range defect, an unverifiable expression defect, a comparative expression defect, a highest-level expression defect, a command expression defect, a request expression defect, a personal subjective expression defect, a hypothetical expression defect and a speculative expression defect.
As a further improvement of the method of the invention:
step S4 further includes generating an item-by-item list of expression defect sentence, defect category and modification opinion in the order in which the defect words appear in the text, and returning that list.
A computer system comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods described above when executing the computer program.
A computer storage medium having a computer program stored thereon, which when executed by a processor, performs the steps of any of the methods described above.
The invention has the following beneficial effects:
1. The method, system and computer storage medium of the invention compare the words in the software requirement text that has been read with the defect words stored in the expression defect database, detect whether expression defect words exist in the software requirement text, and thereby judge whether the software requirement text has expression defects. Compared with the existing practice of detecting expression defects in software requirement text manually, the method detects them quickly and saves a large amount of the labor and time spent on finding defects.
2. The invention can also provide a corresponding defect modification opinion according to the category of the detected expression defect word, which gives guidance for correcting the defect and speeds up correction.
In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages. The present invention will be described in further detail below with reference to the accompanying drawings.
Detailed Description
The embodiments of the invention will be described in detail below with reference to the drawings, but the invention can be implemented in many different ways as defined and covered by the claims.
The invention provides a method for detecting expression defects in software requirement text, which is applied to a software requirement text detection system and comprises the following steps:
S1, reading a software requirement text;
S2, matching the words in the software requirement text one by one against the expression defect words in a preset expression defect database by using a dictionary matching algorithm;
S3, if a word in the software requirement text is successfully matched with an expression defect word in the expression defect database, judging that the word is an expression defect word and that the software requirement text may have an expression defect;
S4, returning the expression defect sentence in the software requirement text in which the expression defect word is located, and the expression defect category corresponding to the expression defect word.
An expression defect word is a word that can make the semantics of the software requirement text unclear or ambiguous, or make the requirement impossible to verify.
An expression defect sentence is a sentence in the software requirement text that contains an expression defect word.
The method compares the words in the software requirement text that has been read with the defect words stored in the expression defect database, detects whether expression defect words exist in the text, and thereby judges whether the text has expression defects. Compared with the conventional practice of detecting expression defects in software requirement text manually, the method detects them quickly and saves a large amount of the labor and time spent on finding defects.
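Steps S1 to S4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the miniature `DEFECT_WORDS` table, the function name `detect_defects`, and the regex-based sentence and word splitting are all hypothetical choices made for the example, and a text without spaces between words would need a word segmenter instead.

```python
import re

# Hypothetical miniature defect database: defect word -> defect category.
DEFECT_WORDS = {
    "must": "command expression",
    "higher": "comparative expression",
    "perhaps": "speculative expression",
}

def detect_defects(text):
    """Return (sentence, defect word, category) triples for a requirement text."""
    findings = []
    # S1: the text has been read; split it into sentences after ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # S2: compare each word of the sentence with the defect database.
        for word in re.findall(r"[A-Za-z']+", sentence.lower()):
            # S3: a successful match marks the word as an expression defect word.
            if word in DEFECT_WORDS:
                # S4: return the sentence and the corresponding defect category.
                findings.append((sentence, word, DEFECT_WORDS[word]))
    return findings

sample = ("The system must respond quickly. "
          "Updating a document requires a higher authority.")
for sentence, word, category in detect_defects(sample):
    print(f"{category}: '{word}' in: {sentence}")
```

A plain dictionary lookup is used here because every entry in the sample table is a single word; multi-word entries would need phrase matching.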
The expression defect database contains most of the words that cause errors in requirement expression. The expression defect words were extracted from 50 requirement documents, and the dictionary is continuously updated.
The dictionary matching algorithm in this scheme can be one or more of a forward maximum matching algorithm, a reverse maximum matching algorithm, a maximum probability word segmentation algorithm, a bidirectional scanning method, a word-by-word traversal method, an N-shortest path method and a word-based N-gram language model. Any other algorithm that can find particular words in a given text may also be used.
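As an illustration of one of the algorithms named above, forward maximum matching can be sketched as follows. The function name `forward_max_match`, the sample dictionary, and the use of unspaced English to stand in for unsegmented text are assumptions made for this example only.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text by repeatedly taking the longest dictionary entry at the cursor."""
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical dictionary holding ordinary words and one defect phrase.
words = {"if", "ifpossible", "the", "system", "sleeps"}
print(forward_max_match("ifpossiblethesystemsleeps", words))
```

Because the longest candidate is tried first, the entry "ifpossible" is preferred over its prefix "if", which is the behaviour that distinguishes maximum matching from a plain word-by-word scan.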
In practical application, on the basis of the above steps, in order to detect the expression defects in the software requirement text more quickly and more completely, this embodiment may also search for expression defects with a machine learning text classification method, so as to address the problem that dictionary matching alone is not accurate enough.
The specific method of machine learning text classification is as follows:
Firstly, 10 categories of data are defined: ambiguous expression sentences, unverifiable expression sentences, comparative expression sentences, highest-level expression sentences, command expression sentences, request expression sentences, personal subjective expression sentences, hypothetical expression sentences, speculative expression sentences, and other expression sentences, i.e. sentences that do not belong to any of the 9 categories of expression defects. These categories serve as the correct data labels. The data are sentences that can appear in a requirement text rather than general sentences, and the data set is formed by manually labelling the categories of sentences taken from 50 requirement documents.
Then, the data documents are segmented into words and preprocessed, for example by stop-word removal, and the dimension of the document matrix is reduced.
In the feature extraction stage, word-frequency features are extracted with a bag-of-words model: the defect sentences of each category are segmented into words, and the number of occurrences of each word is counted to obtain its word frequency. A text classifier is then trained with a logistic regression algorithm.
Finally, each sentence read in is classified by the trained text classifier. If the sentence belongs to one of the 9 defect categories, the sentence and the modification opinion for the corresponding category are returned.
Besides logistic regression on bag-of-words features, the text classifier can also be implemented with a naive Bayes algorithm or a support vector machine algorithm based on the bag-of-words model.
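The classification route can be sketched with the standard library alone. Note the swap: this example uses the naive Bayes variant mentioned above rather than logistic regression, purely to keep the code short and dependency-free. The stop-word list, the six labelled sentences, and the function names `bag_of_words`, `train` and `classify` are hypothetical stand-ins for the manually labelled data set and the real preprocessing.

```python
import math
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "an", "is", "to", "of"}  # illustrative stop-word list

def bag_of_words(sentence):
    """Tokenize on whitespace, drop stop words, and count word frequencies."""
    return Counter(w for w in sentence.lower().split() if w not in STOP_WORDS)

def train(labelled):
    """Fit a multinomial naive Bayes model from (sentence, label) pairs."""
    word_counts, label_counts, vocab = defaultdict(Counter), Counter(), set()
    for sentence, label in labelled:
        bag = bag_of_words(sentence)
        word_counts[label].update(bag)
        label_counts[label] += 1
        vocab.update(bag)
    return word_counts, label_counts, vocab

def classify(model, sentence):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for word, freq in bag_of_words(sentence).items():
            score += freq * math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical hand-labelled sentences standing in for the annotated data set.
data = [
    ("the system must switch instantly", "command"),
    ("the user must not delete records", "command"),
    ("perhaps the system asks again", "speculative"),
    ("the system may possibly restart", "speculative"),
    ("the system stores the file", "other"),
    ("the report lists all orders", "other"),
]
model = train(data)
print(classify(model, "the operator must confirm"))
```

With so little training data the result only shows the mechanics; a usable classifier would need the full labelled corpus and proper word segmentation.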
Compared with the conventional practice of detecting expression defects in software requirement text manually, the method detects them quickly and saves a large amount of the labor and time spent on finding defects. In addition, the invention combines different detection methods to detect the expression defects in the software requirement text, which improves detection efficiency and accuracy.
In practical application, on the basis of the steps, the software requirement text expression defect detection method can be optimized by adding the following steps:
in step S4, returning the expression defect category corresponding to the expression defect word includes: matching the expression defect word against a pre-stored expression defect word-expression defect category mapping table, and finding the expression defect category corresponding to the expression defect word.
For step S4, the defect words in the expression defect database are divided in this embodiment into ambiguous expression words, unverifiable expression words, comparative expression words, highest-level expression words, command expression words, request expression words, personal subjective expression words, hypothetical expression words and speculative expression words. These 9 types of defect vocabulary are summarized according to the requirement language standard in 29148-; the ambiguous expressions are further subdivided into three categories: ambiguous-reference expressions, ambiguous-degree expressions and ambiguous-range expressions.
The expression defects are classified into: ambiguous reference, ambiguous degree, ambiguous range, unverifiable expression, comparative expression, highest-level expression, command expression, request expression, personal subjective expression, hypothetical expression and speculative expression;
establishing a mapping relation between the ambiguous-reference words and the ambiguous-reference expression defects;
establishing a mapping relation between the ambiguous-degree words and the ambiguous-degree expression defects;
establishing a mapping relation between the ambiguous-range words and the ambiguous-range expression defects;
establishing a mapping relation between the unverifiable expression words and the unverifiable expression defects;
establishing a mapping relation between the comparative expression words and the comparative expression defects;
establishing a mapping relation between the highest level expression words and the highest level expression defects;
establishing a mapping relation between the command expression words and the command expression defects;
establishing a mapping relation between the request expression words and the request expression defects;
establishing a mapping relation between the personal subjective expression words and the personal subjective expression defects;
establishing a mapping relation between the hypothetical expression words and the hypothetical expression defects;
establishing a mapping relation between the speculative expression words and the speculative expression defects.
An ambiguous-reference word is a pronoun used in the text for which it cannot be determined which of the previously mentioned things it refers to. The corresponding detection dictionary includes pronouns such as: this, those, that, then, that person, etc. When a pronoun in the dictionary is detected in the text, it is determined that an expression that may have an ambiguous-reference defect has been detected.
An ambiguous-degree word is an adjective or adverb of indefinite degree used in the text, such as: slightly larger, slightly smaller, mostly (exactly how much larger, and exactly how much counts as most, is not stated). The corresponding detection dictionary includes words of indefinite degree such as: more or less, too much, slightly, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have an ambiguous-degree defect has been detected.
An ambiguous-range word is a word or phrase used in the text that leaves a range undefined, such as: including but not limited to, over, and other open-ended wording that does not bound the range. The corresponding detection dictionary includes wording that leaves the range undefined, such as: including but not limited to, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have an ambiguous-range defect has been detected.
An unverifiable expression word is a word used in the text whose fulfilment cannot be judged or verified, such as: reasonable (when is something reasonable, and who decides?). The corresponding detection dictionary includes unverifiable vocabulary such as: reasonable, applicable, when appropriate, if possible, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have an unverifiable expression defect has been detected.
A comparative expression word is a word used when two things or entities are compared in the text. Commonly used words include: faster, better, higher quality, improved, etc. This type of detection is intended to let the inspector confirm whether the comparison has been specifically quantified. The corresponding detection dictionary includes comparative words such as: weaker, stronger, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have a comparative expression defect has been detected.
A highest-level expression word is a highest-level (superlative) adjective or adverb used in the text, for example: most efficient, best performance, optimal, etc. This type of detection is intended to let the inspector confirm whether the acceptable maximum and minimum values of the parameter have been defined. The corresponding detection dictionary includes: maximize, minimize, optimize, shortest response time, longest response time, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have a highest-level expression defect has been detected.
A command expression word marks a command sentence appearing in the text. For example: "The system must switch instantaneously between displaying and hiding the correct answer." This command is not feasible, because a computer cannot complete any work "instantaneously"; in addition, it does not state what causes the state switch, so the requirement is incomplete. The command sentences in a requirement text should be clear, concise and correct. This type of detection is intended to let the inspector re-analyze whether the command sentence is concise, clear, correct and unambiguous. The corresponding detection dictionary includes command vocabulary such as: must, must not, not allow, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have a command expression defect has been detected.
A request expression word marks a request sentence appearing in the text. For example: "Please enter the content to be filled in the mail field and click send." This request does not state clearly what exactly must be filled in, for example whether the mail address is mandatory and whether the mail is sent when no address is given. A request sentence should make each request step clear. This type of detection is intended to let the inspector re-analyze whether the request sentence is clear, unambiguous and complete. The corresponding detection dictionary includes: please input, please click, please select, please move, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have a request expression defect has been detected.
A personal subjective expression word is a word appearing in the text that expresses personal opinions and feelings. For example: "We think that giving the customer a response within 2 seconds will feel very appealing to the user." Both "think" and "feel" express subjective feelings, and subjective feelings differ from person to person, so personal subjective expressions should not appear in a requirement text. The corresponding detection dictionary includes words that express personal opinions and feelings, such as: feel, think, believe, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have a personal subjective expression defect has been detected.
A hypothetical expression word marks a hypothetical sentence used in the text to express an implicit requirement. For example: "The staff member's number should be verified online against the head office staff number list if feasible." In this hypothetical expression, "if feasible" is ambiguous. It may mean "if technically feasible" (to be considered by the developer) or "if feasible in terms of data" (whether the head office employee list can be accessed at run time). This type of detection is intended to let the inspector confirm whether the assumption is reasonable, accurate and clear. The corresponding detection dictionary includes hypothetical vocabulary such as: if, even if, assume, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have a hypothetical expression defect has been detected.
A speculative expression word marks a sentence in the text that guesses at an uncertain result. A requirement document should be clear and accurate, so no speculative expression should appear in it. The corresponding detection dictionary includes speculative vocabulary such as: perhaps, presumably, guess, etc. When a word in the dictionary is detected in the text, it is determined that an expression that may have a speculative expression defect has been detected.
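The per-category detection dictionaries and the word-to-category mapping table built from them can be sketched as follows. The sample entries are drawn from the dictionaries listed above, but the data layout, the function names `build_mapping_table` and `find_defect_words`, and the longest-phrase-first matching rule are assumptions made for this example.

```python
import re

# Per-category detection dictionaries with a few sample entries each.
CATEGORY_DICTIONARIES = {
    "ambiguous reference": ["this", "those", "that"],
    "ambiguous degree": ["more or less", "too much", "slightly"],
    "ambiguous range": ["including but not limited to"],
    "unverifiable expression": ["reasonable", "applicable", "when appropriate"],
    "comparative expression": ["weaker", "stronger"],
    "highest-level expression": ["maximize", "minimize", "optimize"],
    "command expression": ["must", "must not"],
    "request expression": ["please input", "please click", "please select"],
    "hypothetical expression": ["even if", "assume"],
    "speculative expression": ["perhaps", "presumably", "guess"],
}

def build_mapping_table(dictionaries):
    """Invert {category: [words]} into {defect word: category}."""
    table = {}
    for category, words in dictionaries.items():
        for word in words:
            table[word] = category
    return table

def find_defect_words(sentence, table):
    """Return (defect word, category) pairs found in a sentence, longest phrases first."""
    remaining = sentence.lower()
    found = []
    for phrase in sorted(table, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, remaining):
            found.append((phrase, table[phrase]))
            # Blank out the match so shorter entries inside it are not reported again.
            remaining = re.sub(pattern, " ", remaining)
    return found

table = build_mapping_table(CATEGORY_DICTIONARIES)
print(find_defect_words("The system must not allow those changes.", table))
```

Matching longer phrases first means that "must not" is reported once instead of also triggering the shorter entry "must".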
In practical application, on the basis of the steps, the software requirement text expression defect detection method can be optimized by adding the following steps:
step S4 further includes returning a defect modification opinion corresponding to the expression defect category; the defect modification opinions are also stored in the expression defect word-expression defect category mapping table.
The expression defect word-expression defect category mapping table also stores, for each expression defect word in the expression defect database, the defect modification opinion that addresses the expression defect caused by that word, together with the mapping between the word and the opinion. When an expression defect is detected in the software requirement text, the corresponding modification opinion is looked up through this mapping, so that a modification opinion is provided at the same time as the defect is detected. This gives guidance for correcting the defect and improves the efficiency of correction.
For example, when an ambiguous expression is detected, take an ambiguous reference as an example. Suppose the sentence is "That button controls the system to enter the sleep state". The word "that" is detected, the sentence is returned, and the modification opinion for this category is returned: please ensure that the pronoun refers exactly to something mentioned earlier.
When an unverifiable expression is detected, for example: "We select the menus that are popular at present to build the navigation bar". "Popular at present" is detected, the sentence containing it is returned, and the modification opinion for this category is returned: please define the specific meaning of the requirement with the project stakeholders.
When a comparative expression is detected, for example: "Updating a document requires a higher authority to operate". "Higher" is detected, the sentence containing it is returned, and the modification opinion for this category is returned: please check whether the better or faster improvement in the specific functional area or quality has been quantified.
When a highest-level expression is detected, for example: "Let the system reach the shortest response time". "Shortest" is detected, the sentence containing it is returned, and the modification opinion for this category is returned: please check whether the acceptable maximum and minimum values have been explicitly set.
When a command expression is detected, for example: "Ideally, the system must implement a transparent merge operation and save the results". "Must" is detected, which indicates that the sentence is a command, and the sentence containing it is returned. In this command, "ideally" is a description with no specific meaning that is difficult to judge, and it is unclear what "transparent" means to the user. The modification opinion returned for this category is: please re-analyze whether the command sentence is clear, concise and correct, and whether the command has been translated into specific and observable product features.
When a request expression is detected, for example: "Please input the employee number, shop number and department number to be queried, and click query to obtain the detailed information". "Please input" is detected, and the sentence containing it is returned. The request steps in the sentence are not clear: it does not say whether the employee number, shop number and department number are entered separately to obtain the corresponding details of each, or entered together to obtain one query result. The modification opinion returned for this category is: please check whether each request step is clear, unambiguous and complete.
When a personal subjective expression is detected, for example: "We think that if a response is provided within 5 seconds, the user will feel good". "Think" and "feel" are detected, and the sentence containing them is returned. A requirement document describes requirements from the user's perspective, so personal subjective expressions are inappropriate in it and easily lead to ambiguity. The modification opinion returned for this category is: it is recommended to delete the subjective expression.
When a hypothetical expression is detected, for example: "The system can be forced to sleep if necessary". "If" is detected, and the sentence containing it is returned. The sentence does not say who judges what is "necessary" or when, and the assumption is not explained in detail. The modification opinion returned for this category is therefore: please check whether the described case is reasonable and clear.
When a speculative expression is detected, for example: "Click delete, and the system may ask again whether to delete". "May" is detected, and the sentence containing it is returned. A requirement document should express the requirement clearly and accurately, so the modification opinion returned for this category is: please delete the speculative description and state clearly what the system does and does not do.
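The lookup of a modification opinion from a detected defect word can be sketched as follows. The opinion texts paraphrase the examples above; the two tables, their layout, and the function name `review` are hypothetical and only show one way the mapping could be stored.

```python
# Category -> modification opinion, worded after the examples in the description.
MODIFICATION_OPINIONS = {
    "ambiguous reference":
        "Please ensure that each pronoun refers exactly to something mentioned earlier.",
    "comparative expression":
        "Please check whether the comparison has been quantified.",
    "highest-level expression":
        "Please check whether the acceptable maximum and minimum values have been set.",
    "command expression":
        "Please re-analyze whether the command sentence is clear, concise and correct.",
    "request expression":
        "Please check whether each request step is clear, unambiguous and complete.",
    "personal subjective expression":
        "It is recommended to delete the subjective expression.",
    "hypothetical expression":
        "Please check whether the described case is reasonable and clear.",
    "speculative expression":
        "Please delete the speculative description.",
}

# Hypothetical rows of the defect word-defect category mapping table.
WORD_TO_CATEGORY = {
    "that": "ambiguous reference",
    "higher": "comparative expression",
    "must": "command expression",
    "may": "speculative expression",
}

def review(sentence, defect_word):
    """Return the sentence with its defect category and modification opinion."""
    category = WORD_TO_CATEGORY[defect_word]
    return {
        "sentence": sentence,
        "category": category,
        "opinion": MODIFICATION_OPINIONS[category],
    }

result = review("That button controls the system to enter the sleep state.", "that")
print(result["category"], "->", result["opinion"])
```

Keying the opinions by category rather than by word keeps the table small, since every word of a category shares the same opinion in the examples above.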
In practical application, on the basis of the steps, the software requirement text expression defect detection method can be optimized by adding the following steps:
As shown in fig. 2, step S4 further includes generating an item-by-item list of expression defect sentence, defect category and modification opinion in the order in which the defect words appear in the text, and returning that list.
The expression defect sentences caused by defect words of the same type, together with the corresponding defect modification opinions, are returned to the user in this item-by-item list, which makes browsing and statistics easier.
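The item-by-item list can be sketched as follows. The record layout with a `position` field (the character offset of the defect word in the text), the row format, and the function name `itemized_list` are assumptions made for this example.

```python
def itemized_list(findings):
    """Format findings as numbered 'sentence | category | opinion' rows in text order."""
    ordered = sorted(findings, key=lambda f: f["position"])
    rows = []
    for number, f in enumerate(ordered, start=1):
        rows.append(f"{number}. {f['sentence']} | {f['category']} | {f['opinion']}")
    return rows

# Hypothetical findings, deliberately supplied out of order.
findings = [
    {"position": 48, "sentence": "The system may ask again.",
     "category": "speculative expression",
     "opinion": "Please delete the speculative description."},
    {"position": 11, "sentence": "The system must switch instantaneously.",
     "category": "command expression",
     "opinion": "Please re-analyze whether the command sentence is clear."},
]
for row in itemized_list(findings):
    print(row)
```

Sorting by the offset of the defect word, rather than by category, reproduces the order of appearance in the text that the step calls for.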
A computer system comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the computer program. A computer storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the above-mentioned method.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.