JP2016024477A - Software defect prediction device, software defect prediction method, and software defect prediction program

Info

Publication number: JP2016024477A
Application number: JP2014145786A
Authority: JP
Inventors: Akiko Yoshida; Kiyotaka Kasubuchi; Takatoshi Tanaka; Yoshio Yamashita
Original assignee: Screen Holdings Co Ltd
Current assignee: Screen Holdings Co Ltd
Priority date: 2014-07-16
Filing date: 2014-07-16
Publication date: 2016-02-08

Abstract

PROBLEM TO BE SOLVED: To realize a system with which it is possible to predict the number of defects occurring in each development process of software development with high accuracy and present the information needed for narrowing down a process to be improved more efficiently.

SOLUTION: A software defect prediction device is provided with: a mixed-in defect count table 221 for holding the number of mixed-in defects per development process; a detected defect count table 222 for holding the number of detected defects per development process; a scale value table 223 for holding a scale value per development process; similarity calculation means 72 for calculating the similarity of a project to be predicted to other projects by using the scale value held in the scale value table 223; and prediction value calculation means 73 for calculating the predicted value of the number of mixed-in defects and number of detected defects per development process with regard to the project to be predicted, on the basis of the similarity of the project to be predicted to a high-similarity project and the number of mixed-in defects and the number of detected defects per development process with regard to the high-similarity project.

SELECTED DRAWING: Figure 6

Description

The present invention relates to a software defect prediction device, a software defect prediction method, and a software defect prediction program that predict the number of defects arising in software development.

Defects, commonly called “malfunctions” or “bugs”, frequently arise in software systems during or after development. Once such a defect is detected (discovered), work is required to correct it. In particular, when a defect that was mixed in at an early stage of development is detected only after the user release (that is, after the developed system has been installed on the user's computer), correcting it causes major rework in the development effort. Defects mixed into a system (software) should therefore be detected as early as possible so that major rework is avoided.

To improve software quality by reducing defects, process improvement (for example, conducting thorough reviews or assigning more developers) would ideally be carried out in every development process of software development. In practice, because of constraints such as cost and time, process improvement is in many cases carried out on a limited set of target processes. The processes to be improved are preferably narrowed down on the basis of quantitative data on the quality of each development process. One example of such quantitative quality data is the number of defects. This specification uses the terms “number of mixed-in defects” and “number of detected defects”. The “number of mixed-in defects” is the number of defects actually mixed into the deliverables in each development process of software development. The “number of detected defects” is the number of those mixed-in defects that have been detected (discovered) by developers or users. The number of detected defects therefore never exceeds the number of mixed-in defects.
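
The relationship between the two counts can be made concrete with a small record type. The following is only an illustrative sketch: the class name `ProcessDefects`, its field names, and the `undetected` helper are hypothetical and do not appear in the specification; the only rule taken from the text is that the detected count never exceeds the mixed-in count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessDefects:
    """Defect counts for one development process of one project (hypothetical type)."""

    process: str   # illustrative process name, e.g. "design"
    mixed_in: int  # defects actually mixed into the deliverable
    detected: int  # the part of mixed_in that has been found so far

    def __post_init__(self) -> None:
        if self.mixed_in < 0 or self.detected < 0:
            raise ValueError("defect counts must be non-negative")
        # Rule stated in the text: detected defects are a subset of mixed-in defects.
        if self.detected > self.mixed_in:
            raise ValueError("detected defects cannot exceed mixed-in defects")

    @property
    def undetected(self) -> int:
        """Mixed-in defects that have not been found yet."""
        return self.mixed_in - self.detected
```

For example, `ProcessDefects("design", mixed_in=10, detected=7).undetected` evaluates to 3, while constructing a record with `detected` larger than `mixed_in` raises `ValueError`.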

As noted above, the processes to be improved are preferably narrowed down on the basis of quantitative data on the quality of each development process. For this reason, it has been conventional practice to predict the number of detected defects in the deliverables of each development process of software development. For example, the information processing device described in Patent Document 1 below predicts the detected defect density for each development process of software development by regression analysis using the concept of a cumulative index. The predicted value calculation device described in Patent Document 2 below uses collaborative filtering to predict evaluation values of various evaluation items (for example, man-hours) related to software development. A method of predicting the man-hours of a software development project using collaborative filtering is described in Non-Patent Document 1 below.

Patent Document 1: JP 2011-76411 A
Patent Document 2: JP 2011-197839 A

Non-Patent Document 1: Masateru Tsunoda et al., “Software Development Effort Prediction Method Using Collaborative Filtering”, Transactions of Information Processing Society of Japan, May 2005, Vol. 46, No. 5, pp. 1155-1164

However, the number of detected defects differs from the number of defects actually mixed into the system (the number of mixed-in defects), so narrowing down the processes to be improved on the basis of the number of detected defects does not necessarily remove defects efficiently. Moreover, software development has the characteristic that, for example, the development company or the workers differ from one development process to another, yet the conventional techniques do not take such characteristics of software development into account when predicting evaluation values of evaluation items such as the number of defects. The accuracy of the predicted values is therefore insufficient.

An object of the present invention is therefore to realize a system that can predict, with high accuracy, the number of defects arising in each development process of software development and can present information for narrowing down the processes to be improved more efficiently.

A first invention is a software defect prediction device that predicts the number of defects arising in software development, comprising:
mixed-in defect count holding means for holding the number of mixed-in defects per development process for each project;
detected defect count holding means for holding the number of detected defects per development process for each project;
scale value holding means for holding a scale value representing the scale of each development process for each project;
similarity calculation means for calculating, using the scale values per development process for each project held in the scale value holding means, the similarity between a prediction target project, which is the project whose numbers of mixed-in defects and detected defects are to be predicted, and each project other than the prediction target project; and
predicted value calculation means for calculating predicted values of the number of mixed-in defects and the number of detected defects per development process for the prediction target project, on the basis of (i) the number of mixed-in defects per development process held in the mixed-in defect count holding means and the number of detected defects per development process held in the detected defect count holding means for a high-similarity project, which is a project having a relatively high similarity to the prediction target project, and (ii) the similarity between the prediction target project and the high-similarity project.
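
The two calculation means of the first invention can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the formulas of the specification: cosine similarity and a similarity-weighted average over the `k` most similar projects are assumed here as one common form of collaborative filtering, and the names `cosine_similarity`, `predict_defects`, `history`, and `k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors of per-process scale values."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_defects(target_scale, history, k=2):
    """Predict per-process defect counts for the prediction target project.

    history maps a project id to (scale_values, defect_counts); both lists
    hold one entry per development process. Call once with mixed-in counts
    and once with detected counts to obtain both predictions.
    """
    # Assumed similarity step: score every other project by its scale values.
    scored = sorted(
        ((cosine_similarity(target_scale, scale), defects)
         for scale, defects in history.values()),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]  # keep the k high-similarity projects
    total = sum(sim for sim, _ in scored)
    if total == 0:
        raise ValueError("no similar project available")
    # Assumed prediction step: similarity-weighted average of the neighbours' counts.
    n = len(target_scale)
    return [sum(sim * d[i] for sim, d in scored) / total for i in range(n)]
```

Note that cosine similarity compares the proportions between processes rather than absolute size, so a project twice as large with the same proportions counts as fully similar; a real implementation would likely need to account for that.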

A second invention is the first invention, wherein
the similarity calculation means, when calculating the similarity, further uses the number of mixed-in defects per development process for each project held in the mixed-in defect count holding means and the number of detected defects per development process for each project held in the detected defect count holding means.

A third invention is the second invention, wherein
the similarity calculation means, when calculating the similarity, further uses the number of mixed-in defects in the already-performed development processes of the prediction target project held in the mixed-in defect count holding means and the number of detected defects in the already-performed development processes of the prediction target project held in the detected defect count holding means.
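
When the prediction target project is only partly finished, its defect counts exist for the already-performed processes only. One way to handle this, shown purely as an assumption-laden sketch, is to compute the similarity over the metrics that are known for both projects. The function name `similarity_over_known` and the use of `None` for a not-yet-available value are hypothetical choices, and cosine similarity is again an assumed measure.

```python
import math


def similarity_over_known(target, other):
    """Cosine similarity restricted to metrics known for both projects.

    Each argument is a list with one entry per metric (scale values and
    defect counts); None marks a value that is not yet available.
    """
    pairs = [(t, o) for t, o in zip(target, other)
             if t is not None and o is not None]
    dot = sum(t * o for t, o in pairs)
    norm = (math.sqrt(sum(t * t for t, _ in pairs))
            * math.sqrt(sum(o * o for _, o in pairs)))
    return dot / norm if norm else 0.0
```

In this sketch a target vector such as `[100, 200, 5, None]` (two scale values, one actual defect count, one pending) is compared with a finished project using the first three entries only.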

A fourth invention is any one of the first to third inventions, wherein
the predicted value calculation means further calculates a predicted value of the scale value per development process for the prediction target project.

A fifth invention is any one of the first to fourth inventions,
further comprising technical factor/environmental factor holding means for holding, per development process for each project, an evaluation value obtained on the basis of a technical indicator and an evaluation value obtained on the basis of an environmental indicator, wherein
the similarity calculation means, when calculating the similarity, further uses the evaluation value obtained on the basis of the technical indicator and the evaluation value obtained on the basis of the environmental indicator, per development process for each project, held in the technical factor/environmental factor holding means.

A sixth invention is any one of the first to fifth inventions,
further comprising difficulty holding means for holding the difficulty per development process for each project, wherein
the similarity calculation means, when calculating the similarity, further uses the difficulty per development process for each project held in the difficulty holding means.

A seventh invention is the sixth invention, wherein
the difficulty holding means holds, as the difficulty, a source code metrics value, which is an index representing the difficulty of the source code created in software development, and
the predicted value calculation means further calculates a predicted value of the source code metrics value for the prediction target project on the basis of the source code metrics values of the high-similarity project held in the difficulty holding means.

An eighth invention is the sixth or seventh invention, wherein
the difficulty holding means holds, as the difficulty, the ambiguity of documents created in software development.

A ninth invention is any one of the first to eighth inventions,
further comprising development man-hour holding means for holding the development man-hours per development process for each project, wherein
the similarity calculation means, when calculating the similarity, further uses the development man-hours per development process for each project held in the development man-hour holding means.

A tenth invention is the ninth invention, wherein
the predicted value calculation means further calculates a predicted value of the development man-hours per development process for the prediction target project on the basis of the development man-hours per development process of the high-similarity project held in the development man-hour holding means.

An eleventh invention is a software defect prediction method for predicting the number of defects arising in software development, comprising:
a mixed-in defect count storing step of storing the number of mixed-in defects per development process for each project in mixed-in defect count holding means prepared in advance;
a detected defect count storing step of storing the number of detected defects per development process for each project in detected defect count holding means prepared in advance;
a scale value storing step of storing a scale value representing the scale of each development process for each project in scale value holding means prepared in advance;
a similarity calculation step of calculating, using the scale values per development process for each project held in the scale value holding means, the similarity between a prediction target project, which is the project whose numbers of mixed-in defects and detected defects are to be predicted, and each project other than the prediction target project; and
a predicted value calculation step of calculating predicted values of the number of mixed-in defects and the number of detected defects per development process for the prediction target project, on the basis of (i) the number of mixed-in defects per development process held in the mixed-in defect count holding means and the number of detected defects per development process held in the detected defect count holding means for a high-similarity project, which is a project having a relatively high similarity to the prediction target project, and (ii) the similarity between the prediction target project and the high-similarity project.

A twelfth invention is a software defect prediction program for predicting the number of defects arising in software development, the program causing a CPU of a computer, using a memory, to execute:
a mixed-in defect count storing step of storing the number of mixed-in defects per development process for each project in mixed-in defect count holding means prepared in advance;
a detected defect count storing step of storing the number of detected defects per development process for each project in detected defect count holding means prepared in advance;
a scale value storing step of storing a scale value representing the scale of each development process for each project in scale value holding means prepared in advance;
a similarity calculation step of calculating, using the scale values per development process for each project held in the scale value holding means, the similarity between a prediction target project, which is the project whose numbers of mixed-in defects and detected defects are to be predicted, and each project other than the prediction target project; and
a predicted value calculation step of calculating predicted values of the number of mixed-in defects and the number of detected defects per development process for the prediction target project, on the basis of (i) the number of mixed-in defects per development process held in the mixed-in defect count holding means and the number of detected defects per development process held in the detected defect count holding means for a high-similarity project, which is a project having a relatively high similarity to the prediction target project, and (ii) the similarity between the prediction target project and the high-similarity project.

A thirteenth invention is the twelfth invention, wherein
in the similarity calculation step, when the similarity is calculated, the number of mixed-in defects per development process for each project held in the mixed-in defect count holding means and the number of detected defects per development process for each project held in the detected defect count holding means are further used.

A fourteenth invention is the thirteenth invention, wherein
in the similarity calculation step, when the similarity is calculated, the number of mixed-in defects in the already-performed development processes of the prediction target project held in the mixed-in defect count holding means and the number of detected defects in the already-performed development processes of the prediction target project held in the detected defect count holding means are further used.

A fifteenth invention is any one of the twelfth to fourteenth inventions, wherein
in the predicted value calculation step, a predicted value of the scale value per development process for the prediction target project is further calculated.

A sixteenth invention is any one of the twelfth to fifteenth inventions,
further including a technical factor/environmental factor storing step of storing, per development process for each project, an evaluation value obtained on the basis of a technical indicator and an evaluation value obtained on the basis of an environmental indicator in technical factor/environmental factor holding means prepared in advance, wherein
in the similarity calculation step, when the similarity is calculated, the evaluation value obtained on the basis of the technical indicator and the evaluation value obtained on the basis of the environmental indicator, per development process for each project, held in the technical factor/environmental factor holding means are further used.

A seventeenth invention is any one of the twelfth to sixteenth inventions,
further including a difficulty storing step of storing the difficulty per development process for each project in difficulty holding means prepared in advance, wherein
in the similarity calculation step, when the similarity is calculated, the difficulty per development process for each project held in the difficulty holding means is further used.

An eighteenth invention is the seventeenth invention, wherein
in the difficulty storing step, a source code metrics value, which is an index representing the difficulty of the source code created in software development, is stored in the difficulty holding means as the difficulty, and
in the predicted value calculation step, a predicted value of the source code metrics value for the prediction target project is further calculated on the basis of the source code metrics values of the high-similarity project held in the difficulty holding means.

A nineteenth invention is the seventeenth or eighteenth invention, wherein
in the difficulty storing step, the ambiguity of documents created in software development is stored in the difficulty holding means as the difficulty.

A twentieth invention is any one of the twelfth to nineteenth inventions,
further including a development man-hour storing step of storing the development man-hours per development process for each project in development man-hour holding means prepared in advance, wherein
in the similarity calculation step, when the similarity is calculated, the development man-hours per development process for each project held in the development man-hour holding means are further used.

A twenty-first invention is the twentieth invention, wherein
in the predicted value calculation step, a predicted value of the development man-hours per development process for the prediction target project is further calculated on the basis of the development man-hours per development process of the high-similarity project held in the development man-hour holding means.

According to the first invention, the software defect prediction device is provided with, as means for holding the data to be predicted, mixed-in defect count holding means that holds the number of mixed-in defects per development process, in addition to detected defect count holding means that holds the number of detected defects per development process. This makes it possible to predict not only the number of detected defects but also the number of mixed-in defects in each development process of software development. Because the number of defects actually mixed into the system in each development process can thus be predicted, the processes to be improved can be narrowed down more accurately as a software development project proceeds.
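
To illustrate why predicting both counts helps, the sketch below ranks processes by the predicted number of defects that would remain undetected. The ranking rule itself is an assumption made for illustration; the specification states only that the predictions support narrowing down the processes to be improved. The function name `rank_processes` and the process names are hypothetical.

```python
def rank_processes(predicted_mixed_in, predicted_detected):
    """Order process names by predicted undetected defects, largest first.

    Both arguments map a process name to a predicted count. The gap
    (mixed-in minus detected) is an assumed prioritisation criterion.
    """
    gap = {
        name: predicted_mixed_in[name] - predicted_detected.get(name, 0.0)
        for name in predicted_mixed_in
    }
    return sorted(gap, key=gap.get, reverse=True)
```

With predicted mixed-in counts `{"design": 12.0, "coding": 30.0, "test": 4.0}` and predicted detected counts `{"design": 5.0, "coding": 28.0, "test": 3.0}`, the function returns `["design", "coding", "test"]`: coding has the most detected defects, but design leaves the most defects undetected, which a ranking by detected counts alone would not show.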

According to the second invention, the similarity between the prediction target project and other projects is calculated using not only the scale value data but also the defect count data. The similarity is thereby obtained more accurately. Because prediction is based on this more accurately obtained similarity, the predicted values of the numbers of defects (mixed-in and detected) are obtained with high accuracy.
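
When scale values and defect counts are combined into one vector, metrics with large numeric ranges can dominate the similarity. One common remedy, assumed here and not stated in this passage, is to rescale every metric to the range [0, 1] before computing the similarity. The function name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(matrix):
    """Rescale each column of a project-by-metric matrix to the range [0, 1].

    matrix is a list of rows, one row per project and one column per metric.
    A column whose values are all equal is mapped to 0.0.
    """
    columns = list(zip(*matrix))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in matrix
    ]
```

For instance, three projects with rows `[1000, 2]`, `[3000, 6]`, and `[2000, 4]` (a scale value and a defect count) become `[0.0, 0.0]`, `[1.0, 1.0]`, and `[0.5, 0.5]`, so both metrics contribute on the same footing.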

According to the third invention, for the number of defects, predicted values for development processes not yet performed are obtained using the actual values of development processes already performed. The number of defects arising in each development process can therefore be predicted with higher accuracy than before. As a result, the processes to be improved can be narrowed down more accurately as a software development project proceeds.

According to the fourth invention, the scale value of each development process is predicted. This provides more information on which to base the choice of processes to be improved, so the processes to be improved can be narrowed down more accurately.

According to the fifth invention, technical factor and environmental factor data are used when calculating the similarity between the prediction target project and other projects. By using these data, the number of defects arising in each development process is predicted with the characteristics of software development taken into account. As a result, the processes to be improved can be narrowed down very effectively as a software development project proceeds.

According to the sixth invention, the similarity between the prediction target project and other projects is calculated with the difficulty of each development process taken into account, and the number of defects in each development process of the prediction target project is predicted on the basis of that similarity. The processes to be improved can thereby be narrowed down more effectively.

According to the seventh invention, the source code metrics value, which is an index representing the difficulty of the source code, is predicted. This makes it possible to judge more accurately whether the implementation process should be a process to be improved.

According to the eighth invention, the similarity between the prediction target project and other projects is calculated using data on the ambiguity of the documents created in each development process of software development, and the number of defects in each development process of the prediction target project is predicted on the basis of that similarity. In general, the ambiguity of documents correlates relatively strongly with the number of defects, so using the ambiguity data makes it possible to predict the number of defects with higher accuracy. The processes to be improved can thereby be narrowed down more effectively.

According to the ninth invention, data on the development man-hours of each development process are used when calculating the similarity between the prediction target project and other projects. In general, for a project of a given scale, software quality improves as more development man-hours are spent. That is, for a fixed scale, the number of defects tends to decrease as the development man-hours increase. Because development man-hours thus correlate relatively strongly with the number of defects, using the development man-hour data makes it possible to predict the number of defects with higher accuracy. The processes to be improved can thereby be narrowed down more effectively.

According to the tenth invention, the development man-hours of each development process are predicted. This provides more information on which to base the choice of processes to be improved, so the processes to be improved can be narrowed down more accurately.

According to the eleventh invention, the same effect as that of the first invention is obtained in the invention of a software defect prediction method.

According to the twelfth to twenty-first inventions, the same effects as those of the first to tenth inventions, respectively, are obtained in the invention of a software defect prediction program.

本発明の第１の実施形態に係るソフトウェア欠陥予測装置を含むシステム全体の概略構成図である。It is a schematic block diagram of the whole system containing the software defect prediction apparatus which concerns on the 1st Embodiment of this invention. 上記第１の実施形態において、ソフトウェア欠陥予測装置のハードウェア構成を示すブロック図である。In the said 1st Embodiment, it is a block diagram which shows the hardware constitutions of a software defect prediction apparatus. 上記第１の実施形態において、混入欠陥数テーブルのレコードフォーマットの一例を示す図である。In the said 1st Embodiment, it is a figure which shows an example of the record format of a mixing defect number table. 上記第１の実施形態において、検出欠陥数テーブルのレコードフォーマットの一例を示す図である。In the said 1st Embodiment, it is a figure which shows an example of the record format of a detected defect number table. 上記第１の実施形態において、規模値テーブルのレコードフォーマットの一例を示す図である。In the said 1st Embodiment, it is a figure which shows an example of the record format of a scale value table. 上記第１の実施形態におけるソフトウェア欠陥予測装置の機能構成を示す機能ブロック図である。It is a functional block diagram which shows the function structure of the software defect prediction apparatus in the said 1st Embodiment. 上記第１の実施形態において、各テーブルへのデータの登録の手順を示すフローチャートである。4 is a flowchart showing a procedure for registering data in each table in the first embodiment. 上記第１の実施形態において、協調フィルタリングの手法を用いて欠陥予測処理を行う際の入力データについて説明するための図である。In the said 1st Embodiment, it is a figure for demonstrating the input data at the time of performing a defect prediction process using the method of collaborative filtering. 上記第１の実施形態において、ソフトウェア欠陥予測装置で行われる欠陥予測処理の処理手順を示すフローチャートである。It is a flowchart which shows the process sequence of the defect prediction process performed with the software defect prediction apparatus in the said 1st Embodiment. 上記第１の実施形態において、高類似プロジェクトの個数について説明するための図である。It is a figure for demonstrating the number of highly similar projects in the said 1st Embodiment. 上記第１の実施形態において、高類似プロジェクトの個数について説明するための図である。It is a figure for demonstrating the number of highly similar projects in the said 1st Embodiment. 
FIG. 12 is a diagram showing the structure of the project-metrics table in the first embodiment.
FIG. 13 is a diagram showing the contents of records in the scale value table, for explaining the defect prediction processing in the first embodiment.
FIG. 14 is a diagram showing the contents of records in the scale value table, for explaining the defect prediction processing in the first embodiment.
FIG. 15 is a diagram showing the contents of the project-metrics table, for explaining the defect prediction processing in the first embodiment.
FIG. 16 is a diagram showing the contents of records in the detected defect count table, for explaining the defect prediction processing in the first embodiment.
FIG. 17 is a diagram showing the contents of records in the detected defect count table, for explaining the defect prediction processing in the first embodiment.
FIG. 18 is a diagram showing the contents of records in the detected defect count table, for explaining the defect prediction processing in the first embodiment.
FIG. 19 is a diagram showing the contents of records in the detected defect count table, for explaining the defect prediction processing in the first embodiment.
FIG. 20 is a diagram showing the contents of records in the detected defect count table, for explaining the defect prediction processing in the first embodiment.
FIG. 21 is a block diagram showing the hardware configuration of a software defect prediction apparatus according to a second embodiment of the present invention.
FIG. 22 is a diagram showing an example of the record format of the development man-hour table in the second embodiment.
FIG. 23 is a diagram showing an example of the record format of the source code metrics table in the second embodiment.
FIG. 24 is a diagram showing an example of the record format of the technical factor and environmental factor table in the second embodiment.
FIG. 25 is a functional block diagram showing the functional configuration of the software defect prediction apparatus in the second embodiment.
FIG. 26 is a flowchart showing the procedure for registering data in each table in the second embodiment.
FIG. 27 is a diagram showing the structure of the project-metrics table in the second embodiment.
FIG. 28 is a diagram showing an example of the record format of the ambiguity table in a modification of the second embodiment.

Embodiments of the present invention are described below with reference to the accompanying drawings. As one example, a software development process consists of seven processes: "requirements definition", "basic design", "detailed design", "implementation (programming and the like)", "unit test", "integration test", and "comprehensive test". To simplify the explanation, however, each embodiment below assumes that the development process consists of four processes: "requirements definition", "design", "implementation", and "test". It is further assumed that, after these four processes are completed, there is a phase called "user release" in which the developed system is installed on the user's computers.

<1. First Embodiment>
<1.1 Schematic configuration>
FIG. 1 is a schematic configuration diagram of an entire system including a software defect prediction apparatus according to the first embodiment of the present invention. The system consists of a server machine 7 and a plurality of personal computers 8, which are connected to one another by a LAN 9. The server machine 7 executes processing in response to requests from the personal computers 8 and stores files and databases shared by the personal computers 8. The server machine 7 also performs processing that predicts, among other things, the number of defects arising in each development process of software development. The server machine 7 is therefore referred to below as the "software defect prediction apparatus".

FIG. 2 is a block diagram showing the hardware configuration of the software defect prediction apparatus 7. The software defect prediction apparatus 7 includes a CPU 10, an auxiliary storage device 20, a display unit 30, an input unit 40, a memory 50, and a network interface unit 60. The CPU 10 performs arithmetic processing according to given instructions. The auxiliary storage device 20 stores various data and contains a program storage unit 21 and a database 22. The program storage unit 21 stores a software defect prediction program 210. The database 22 stores a mixed-in defect count table 221, a detected defect count table 222, and a scale value table 223. The display unit 30 displays, for example, the various screens on which an operator works. The input unit 40 accepts input from the operator through a mouse or keyboard. The memory 50 temporarily stores the data needed for the arithmetic processing of the CPU 10. The network interface unit 60 enables data communication between the software defect prediction apparatus 7 and the personal computers 8 over the LAN 9.

When execution of the software defect prediction program 210 is instructed, the program stored in the auxiliary storage device 20 is read into the memory 50, and the CPU 10 executes the program read into the memory 50. This carries out processing that predicts, among other things, the number of defects arising in each development process of software development (hereinafter, "defect prediction processing"). The software defect prediction program 210 is provided either recorded on a recording medium such as a CD-ROM, DVD-ROM, or flash memory, or as a download over a network.

The defect prediction processing predicts not only defect counts (mixed-in defect counts and detected defect counts) but also the values of various evaluation items relating to software development. In the following, these evaluation items, including the defect counts, are referred to as "metrics". In this embodiment, a collaborative filtering technique is used when the defect prediction processing predicts the value of each metric in each development process of a prediction target project (typically, a project under development). Collaborative filtering is generally used to estimate the preferences of users (consumers): the preferences of a given user are estimated from information about users similar to that user. Accordingly, in this embodiment, the value (predicted value) of each metric for the prediction target project is predicted using information about other projects similar to the prediction target project.

<1.2 Tables>
Next, the tables held in the database 22 in the auxiliary storage device 20 (the mixed-in defect count table 221, the detected defect count table 222, and the scale value table 223) are described.

FIG. 3 is a diagram showing an example of the record format of the mixed-in defect count table 221. The table contains items named "name", "version", "prediction process", "requirements definition mixed-in defect count", "predicted/actual flag (a1)", "design mixed-in defect count", "predicted/actual flag (a2)", "implementation mixed-in defect count", and "predicted/actual flag (a3)". Each predicted/actual flag carries a reference sign to distinguish it from the other predicted/actual flags.

The field of each item in the mixed-in defect count table 221 (the area in which an individual piece of data is stored) holds the following data. "Name" stores the name of the software development project. "Version" stores the version value of the project. "Prediction process" stores the name of the process at which the mixed-in defect counts were predicted. For example, if the mixed-in defect counts were predicted when implementation was about to start, the "prediction process" field of the resulting record stores the name "implementation". "Requirements definition mixed-in defect count" stores the number of defects that were mixed into the system in the requirements definition process (actual value) or the number of defects predicted to be mixed into the system in that process (predicted value). "Predicted/actual flag (a1)" stores a flag indicating whether the value stored in "requirements definition mixed-in defect count" is an actual value or a predicted value. "Design mixed-in defect count" stores the number of defects that were mixed into the system in the design process (actual value) or the number of defects predicted to be mixed into the system in that process (predicted value). "Predicted/actual flag (a2)" stores a flag indicating whether the value stored in "design mixed-in defect count" is an actual value or a predicted value. "Implementation mixed-in defect count" stores the number of defects that were mixed into the system in the implementation process (actual value) or the number of defects predicted to be mixed into the system in that process (predicted value). "Predicted/actual flag (a3)" stores a flag indicating whether the value stored in "implementation mixed-in defect count" is an actual value or a predicted value.

In the tables other than the mixed-in defect count table 221 as well, "name" stores the name of the software development project, "version" stores the version value of the project, "prediction process" stores the name of the process at which the values of the items in the table were predicted, and each "predicted/actual flag" stores a flag indicating whether the value in the field of the immediately preceding item is an actual value or a predicted value. Descriptions of these items are therefore omitted for the tables described below. In FIGS. 13 to 20, described later, the word "actual" or "predicted" is shown in place of the flag for convenience of explanation.

FIG. 4 is a diagram showing an example of the record format of the detected defect count table 222. The table contains items named "name", "version", "prediction process", "requirements definition detected defect count", "predicted/actual flag (b1)", "design detected defect count", "predicted/actual flag (b2)", "implementation detected defect count", and "predicted/actual flag (b3)".

The field of each item in the detected defect count table 222 (other than the items already described, such as "name") holds the following data. "Requirements definition detected defect count" stores the number of defects detected in the requirements definition process (actual value) or the number of defects predicted to be detected in that process (predicted value). "Design detected defect count" stores the number of defects detected in the design process (actual value) or the number of defects predicted to be detected in that process (predicted value). "Implementation detected defect count" stores the number of defects detected in the implementation process (actual value) or the number of defects predicted to be detected in that process (predicted value).

FIG. 5 is a diagram showing an example of the record format of the scale value table 223. The table contains items named "name", "version", "prediction process", "requirements definition scale", "predicted/actual flag (c1)", "design scale", "predicted/actual flag (c2)", "implementation scale", "predicted/actual flag (c3)", "test scale", and "predicted/actual flag (c4)".

The field of each item in the scale value table 223 (other than the items already described, such as "name") holds the following data. "Requirements definition scale" stores a value (actual or predicted) specifying the scale of the requirements definition process. "Design scale" stores a value (actual or predicted) specifying the scale of the design process. "Implementation scale" stores a value (actual or predicted) specifying the scale of the implementation process. "Test scale" stores a value (actual or predicted) specifying the scale of the test process.
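The record formats above pair each metric value with a predicted/actual flag. As a minimal sketch of how one record of the scale value table might be represented, the following uses Python dataclasses. The class names, field names, and sample values are hypothetical and are not taken from the figures; the flag is assumed here to be a boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricValue:
    """One metric field together with its predicted/actual flag."""
    value: Optional[float]   # None when nothing has been registered yet
    is_actual: bool = False  # assumed encoding: True = actual, False = predicted


@dataclass
class ScaleValueRecord:
    """Hypothetical in-memory form of one scale value table record."""
    name: str                # project name
    version: str             # project version value
    prediction_process: str  # process at which the values were predicted
    requirements_scale: MetricValue
    design_scale: MetricValue
    implementation_scale: MetricValue
    test_scale: MetricValue


# Hypothetical record: requirements definition is finished (actual value),
# the remaining processes still hold predicted values.
record = ScaleValueRecord(
    name="SAMPLE-PRJ",
    version="1.0",
    prediction_process="design",
    requirements_scale=MetricValue(12.0, is_actual=True),
    design_scale=MetricValue(40.0),
    implementation_scale=MetricValue(35.0),
    test_scale=MetricValue(20.0),
)
```

The mixed-in and detected defect count tables could be modeled the same way, with three metric fields each instead of four.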

<1.3 Functional configuration>
FIG. 6 is a functional block diagram showing the functional configuration of the software defect prediction apparatus 7 in this embodiment. The software defect prediction apparatus 7 includes data registration means 71, similarity calculation means 72, predicted value calculation means 73, the mixed-in defect count table 221 functioning as mixed-in defect count holding means, the detected defect count table 222 functioning as detected defect count holding means, and the scale value table 223 functioning as scale value holding means. The similarity calculation means 72 includes normalization means 721 and similarity computation means 722.

The data registration means 71 registers data in the mixed-in defect count table 221, the detected defect count table 222, and the scale value table 223, that is, it adds, changes, and deletes records. To enable the defect prediction processing described later, the data registration means 71 registers data in each table, for example before the start or after the end of each development process. In this embodiment, as shown in FIG. 7, the mixed-in defect counts (step S110), the detected defect counts (step S120), and the scale values (step S130) are registered in that order.

When defect prediction processing using the collaborative filtering technique is performed, the similarity calculation means 72 and the predicted value calculation means 73 operate as follows. The similarity calculation means 72 calculates the similarity between the prediction target project and other projects based on the data stored in the mixed-in defect count table 221, the detected defect count table 222, and the scale value table 223. In doing so, the normalization means 721 normalizes the value of each metric (for example, the requirements definition scale value stored in the scale value table 223), and the similarity computation means 722 obtains the similarity between the prediction target project and the other projects using a predetermined formula described later. The predicted value calculation means 73 calculates the predicted value of each metric using a predetermined formula described later, based on the data stored in the three tables and the similarity calculated by the similarity calculation means 72.

<1.4 Defect prediction processing>
<1.4.1 Process flow>
The defect prediction processing in this embodiment is described next. When defect prediction processing is performed, that is, when the value of each metric in each development process is predicted, the data of a virtual table with m rows and n columns (m is the number of projects and n is the number of metrics), as shown in FIG. 8, is used as input data. In FIG. 8, p_i (i is an integer from 1 to m) denotes the i-th project, m_j (j is an integer from 1 to n) denotes the j-th metric, and v_{i,j} denotes the value of metric m_j in project p_i. For convenience of explanation, the virtual table shown in FIG. 8 is referred to below as the "project-metrics table" and is given reference numeral 230.

Below, the prediction target project is denoted p_a, and the way of obtaining the value (predicted value) v_{a,b} of the b-th metric m_b in the prediction target project p_a is described with reference to FIG. 9. FIG. 9 is a flowchart showing the procedure of the defect prediction processing performed by the software defect prediction apparatus 7 according to this embodiment. The software defect prediction apparatus 7 first accepts a request to calculate a predicted value from a personal computer (client) 8 (step S210). The metric for which calculation of a predicted value is requested in step S210 is referred to below as the "prediction target metric".

Next, it is determined whether any project has a measured value for the prediction target metric m_b (step S220). For this determination, the table holding the data of m_b is consulted. For example, if m_b is the implementation mixed-in defect count, the mixed-in defect count table 221 is consulted; if m_b is the requirements definition detected defect count, the detected defect count table 222 is consulted. If the determination in step S220 finds a project with a measured value, processing proceeds to step S230. If no such project exists, processing ends without obtaining a predicted value for m_b.
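The virtual m-by-n table of FIG. 8 and the check in step S220 can be sketched as follows. This is only an illustration: the nested-dictionary layout, the use of `None` for a missing value, the function names, and all sample values are assumptions, not details given in the description.

```python
from typing import Dict, List, Optional

# Rows are projects p_i, columns are metrics m_j; None marks a missing value.
ProjectMetricsTable = Dict[str, Dict[str, Optional[float]]]

# Hypothetical contents for three projects and three metrics.
table: ProjectMetricsTable = {
    "p1": {"m1": 12.0, "m2": 300.0, "m3": 7.0},
    "p2": {"m1": 8.0, "m2": None, "m3": 3.0},
    "p3": {"m1": None, "m2": 150.0, "m3": None},
}


def projects_with_measurement(table: ProjectMetricsTable, metric: str) -> List[str]:
    """Return the projects that hold a value for the given metric."""
    return [p for p, row in table.items() if row.get(metric) is not None]


def has_measured_project(table: ProjectMetricsTable, metric: str) -> bool:
    """Step S220 as sketched here: is there any project with a value?"""
    return len(projects_with_measurement(table, metric)) > 0
```

With this layout, a metric that no project has measured (for example a column that does not exist yet) simply yields an empty list, which corresponds to ending the processing without a prediction.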

In the example shown in FIG. 8, n metrics are provided as indices for evaluating each software development project, but the range of values differs from metric to metric. For example, one metric may have a minimum of 0 and a maximum of 10 while another has a minimum of 0 and a maximum of 1000. In such a case, computing the similarity between projects from the raw metric values may not yield a correct similarity. The metric values are therefore normalized so that every metric value falls within the range from 0 to 1 (step S230). Specifically, the normalized value nrm(v_{i,j}) of the metric value v_{i,j} (the value of the j-th metric m_j of the i-th project p_i) is obtained by the following equation (1).

$$\mathrm{nrm}(v_{i,j}) = \frac{v_{i,j} - \min(P_j)}{\max(P_j) - \min(P_j)} \qquad (1)$$

In equation (1), max(P_j) denotes the maximum value of the j-th metric m_j, and min(P_j) denotes its minimum value. The maximum and minimum are taken over the data of the projects that have a measured value for the j-th metric m_j.
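The normalization of step S230 can be sketched as min-max scaling over the projects that have a measured value. The function name and the handling of a constant column (all measured values equal) are assumptions; the description does not cover that case.

```python
from typing import List, Optional


def normalize_metric(values: List[Optional[float]]) -> List[Optional[float]]:
    """Scale one metric column into [0, 1]; None (missing) is passed through."""
    measured = [v for v in values if v is not None]
    if not measured:
        return list(values)
    lo, hi = min(measured), max(measured)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - lo) / (hi - lo) for v in values]


# One metric column with a missing value for the third project.
print(normalize_metric([0.0, 500.0, None, 1000.0]))  # [0.0, 0.5, None, 1.0]
```

Only measured values contribute to the minimum and maximum, which matches the statement that both are taken from projects having a measured value for the metric.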

After the metric values have been normalized, the similarity is calculated (step S240). Specifically, the normalized metric values obtained in step S230 are used to obtain the similarity sim(P_a, P_i) between the prediction target project p_a and another project p_i. The similarity sim(P_a, P_i) is obtained by the following equation (2).

In equation (2), j∈Ma∩Mi indicates that the data of the metrics for which both project p_a and project p_i have measured values is used.
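Equation (2) itself is not reproduced in this text. As an illustration only, the sketch below assumes cosine similarity over the normalized values of the metrics that both projects have measured, which is a common choice in collaborative filtering; the actual formula of the embodiment may differ. The function name and the zero-vector handling are likewise assumptions.

```python
import math
from typing import Dict, Optional


def similarity(a: Dict[str, Optional[float]], i: Dict[str, Optional[float]]) -> float:
    """Cosine similarity over metrics measured in both projects (assumed form)."""
    # j in Ma ∩ Mi: metrics for which both projects have a value.
    shared = [m for m in a if a[m] is not None and i.get(m) is not None]
    dot = sum(a[m] * i[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_i = math.sqrt(sum(i[m] ** 2 for m in shared))
    if norm_a == 0.0 or norm_i == 0.0:
        # Assumption: no shared metrics or an all-zero vector gives similarity 0.
        return 0.0
    return dot / (norm_a * norm_i)


# Normalized metric values for two hypothetical projects.
p_a = {"req_scale": 0.5, "design_scale": 1.0, "impl_defects": None}
p_i = {"req_scale": 0.25, "design_scale": 0.5, "impl_defects": 0.9}
score = similarity(p_a, p_i)  # only req_scale and design_scale are shared
```

Restricting the sums to the shared metrics is the part that follows directly from the description; the cosine form is the assumed part.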

After the similarity sim(P_a, P_i) between the prediction target project p_a and each other project p_i has been obtained in step S240, the predicted value of the prediction target metric m_b is calculated (step S250). This calculation uses the data of projects having a high similarity to the prediction target project p_a (hereinafter, "highly similar projects"). There need not be exactly one highly similar project; there may be several. The number of highly similar projects is described later. Specifically, the predicted value v_{a,b} of the prediction target metric m_b for the prediction target project p_a is obtained by the following equation (3).

In equation (3), j∈k-nearestProjects indicates that the data of the prediction target metric m_b for the highly similar projects is used, and amp(P_a, P_i) denotes a correction coefficient calculated by the following equation (4).

In equation (4), h denotes the number of metrics for which both the prediction target project p_a and project p_i have measured values, and r_j denotes the value defined by nrm(v_{a,j})/nrm(v_{i,j}).
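Equations (3) and (4) are not reproduced in this text, so the following is a sketch under stated assumptions rather than the embodiment's formulas. It assumes that the prediction is a similarity-weighted average of the neighbours' values, each scaled by the correction coefficient, and that the correction coefficient is the mean of the ratios r_j; the source may aggregate the ratios differently. Function names and sample numbers are hypothetical.

```python
from typing import List, Tuple


def amplifier(ratios: List[float]) -> float:
    """Correction coefficient from the h ratios r_j (assumed: their mean)."""
    if not ratios:
        raise ValueError("no shared metrics")
    return sum(ratios) / len(ratios)


def predict(neighbours: List[Tuple[float, float, float]]) -> float:
    """Predict v_{a,b} from (v_{i,b}, amp, sim) triples of highly similar projects.

    Assumed form: sum(v * amp * sim) / sum(sim).
    """
    weight_sum = sum(sim for _, _, sim in neighbours)
    if weight_sum == 0.0:
        raise ValueError("no neighbour with non-zero similarity")
    return sum(v * amp * sim for v, amp, sim in neighbours) / weight_sum


# Two hypothetical highly similar projects with correction coefficient 1.0.
estimate = predict([(5.0, 1.0, 0.9), (4.0, 1.0, 0.6)])
```

The correction coefficient compensates for a neighbour being systematically larger or smaller than the target project, so that a neighbour twice the size contributes a value scaled down accordingly.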

Once the predicted value v_{a,b} of the prediction target metric m_b for the prediction target project p_a has been calculated in this way, the defect prediction processing ends. In this embodiment, steps S230 and S240 implement the similarity calculation step, and step S250 implements the predicted value calculation step.

<1.4.2 Number of highly similar projects>
The number of highly similar projects in this embodiment is described here. After the similarity between the prediction target project and each of the other projects has been calculated, a predicted value is calculated for each candidate number of projects treated as highly similar projects, while that number is varied. The predicted value for the number of highly similar projects that minimizes the mean of the residual sum of squares between the "predicted value" and the "actual values for the highly similar projects" is then presented as the result of the calculation in step S250 described above.

For example, suppose that there are three projects with a high similarity to the prediction target project, as shown in FIG. 10 (call them "PRJ-1", "PRJ-2", and "PRJ-3"). In this situation, a predicted value is obtained for each of the following cases: the case where only the data of PRJ-1, the project with the highest similarity, is used as the highly similar project data (case A in FIG. 11); the case where the data of PRJ-1 and PRJ-2, the projects with the highest and second-highest similarity, is used (case B in FIG. 11); and the case where the data of PRJ-1, PRJ-2, and PRJ-3, the projects with the highest, second-highest, and third-highest similarity, is used (case C in FIG. 11). The mean of the residual sum of squares described above is then obtained for each case. Taking case C in FIG. 11 as an example, the predicted value is 6.0. According to FIG. 10, the actual value is 5.0 for PRJ-1, 4.0 for PRJ-2, and 10.0 for PRJ-3. The residual sum of squares is the sum of the square of the difference between 6.0 and 5.0, the square of the difference between 6.0 and 4.0, and the square of the difference between 6.0 and 10.0, that is, 1.0 + 4.0 + 16.0 = 21.0. Because the number of highly similar projects in case C is 3, the mean of the residual sum of squares is 7.0. The mean of the residual sum of squares is obtained for each case in this way. In the example shown in FIG. 11, case B has the lowest "mean of the residual sum of squares between the actual values and the predicted value". The predicted value of the prediction target metric for the prediction target project is therefore 3.0, the predicted value of case B.
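The selection rule can be sketched as follows. The figures for case C (predicted value 6.0, actual values 5.0, 4.0, and 10.0) and the predicted value 3.0 for case B come from the example above; case A is left out because its predicted value is not stated in the text. The function names are hypothetical.

```python
from typing import List, Tuple


def mean_squared_residual(predicted: float, actuals: List[float]) -> float:
    """Residual sum of squares against the neighbours' actual values, divided by their count."""
    return sum((predicted - a) ** 2 for a in actuals) / len(actuals)


def choose_prediction(candidates: List[Tuple[float, List[float]]]) -> float:
    """Pick the predicted value whose mean squared residual is smallest.

    Each candidate is (predicted value, actual values of the neighbours used).
    """
    best = min(candidates, key=lambda c: mean_squared_residual(c[0], c[1]))
    return best[0]


# Case C from the example: (1.0 + 4.0 + 16.0) / 3
print(mean_squared_residual(6.0, [5.0, 4.0, 10.0]))  # 7.0

# Cases B and C compared.
case_b = (3.0, [5.0, 4.0])
case_c = (6.0, [5.0, 4.0, 10.0])
print(choose_prediction([case_b, case_c]))  # 3.0
```

Dividing by the number of neighbours keeps cases with different numbers of highly similar projects comparable, since a raw residual sum of squares would tend to grow as more projects are included.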

<1.4.3 How the data in each table is used>
Next, how the data in each table is used during the defect prediction process in the present embodiment will be described. In the present embodiment, as described above, the predicted value of each metric (the various evaluation items relating to software development) is obtained using the collaborative filtering technique. In doing so, the combination of "name" and "version" in each table (the mixed-in defect count table 221, the detected defect count table 222, and the scale value table 223) is regarded as one project. Therefore, even if two records have the same project name, they are treated as different projects in the defect prediction process if their versions differ.

In the present embodiment, the following items are treated as metrics: the "requirement definition mixed-in defect count", "design mixed-in defect count", and "implementation mixed-in defect count" in the mixed-in defect count table 221; the "requirement definition detected defect count", "design detected defect count", and "implementation detected defect count" in the detected defect count table 222; and the "requirement definition scale", "design scale", "implementation scale", and "test scale" in the scale value table 223. Accordingly, the defect prediction process is performed using a project-metrics table 230, such as that shown in FIG. 12, as a virtual table.
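A minimal sketch of how such a virtual project-metrics table could be assembled is shown below. It assumes each table is a list of records keyed by "name" and "version". The English field names and the "A-PRJ" values are hypothetical; only the E-PRJ scale values (10, 65, 55, 15) are taken from the example in this section.

```python
def build_project_metrics(*tables):
    """Merge records from several tables into one row per (name, version) project."""
    merged = {}
    for table in tables:
        for record in table:
            # The combination of name and version identifies one project.
            key = (record["name"], record["version"])
            row = merged.setdefault(key, {})
            for field, value in record.items():
                if field not in ("name", "version"):
                    row[field] = value
    return merged

# Hypothetical contents of the scale value table and the detected defect count table.
scale_table = [
    {"name": "A-PRJ", "version": "1.0", "design_scale": 40},  # hypothetical values
    {"name": "A-PRJ", "version": "2.0", "design_scale": 50},  # hypothetical values
    {"name": "E-PRJ", "version": "1.0", "req_scale": 10, "design_scale": 65,
     "impl_scale": 55, "test_scale": 15},
]
detected_table = [
    {"name": "A-PRJ", "version": "1.0", "design_detected": 30},  # hypothetical values
]

project_metrics = build_project_metrics(scale_table, detected_table)
```

Here "A-PRJ" versions "1.0" and "2.0" end up as two separate rows, which mirrors the rule that records with the same name but different versions are different projects.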

Incidentally, in order to perform the defect prediction process for a given project, some data relating to that project must be registered in the tables. In the present embodiment, therefore, predicted values of the scale in each development process are registered in the scale value table 223 at the start of the requirement definition process of each project. Here, assume that the project identified by name = "E-PRJ" and version = "1.0" is the prediction target project. At the start of the requirement definition process of the prediction target project, the user registers data in the scale value table 223 using the data registration means 71, whereby the contents of the records in the scale value table 223 change from those shown in FIG. 13 to those shown in FIG. 14. In this example, the values "10", "65", "55", and "15" are registered as the predicted values of the "requirement definition scale", "design scale", "implementation scale", and "test scale", respectively, for the prediction target project. Once the scale value data for the prediction target project has been registered in this way, the contents of the project-metrics table 230 become, for example, as shown in FIG. 15. This makes it possible to obtain the similarity between the prediction target project and the other projects.

Note that in the project-metrics table 230 shown in FIG. 15, some metric value cells are blank. A blank cell means that the metric value is missing; in collaborative filtering, however, the calculation is performed using only the values that are not missing. The defect prediction process is therefore performed even if some data is missing. Because the similarity is calculated using only the metrics for which both of the two projects have values, a predicted value can be obtained even when many missing values are present.
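The handling of missing values can be sketched as follows. The similarity formula actually used is defined in the description of the first embodiment, outside this section; cosine similarity is used here purely as an assumption, and the function name, field names, and the values of the second project are hypothetical. The point illustrated is only that the calculation is restricted to metrics present in both projects.

```python
import math

def similarity(a, b):
    """Cosine similarity over the metrics that both projects have (None means missing)."""
    shared = [m for m in a if a.get(m) is not None and b.get(m) is not None]
    if not shared:
        return 0.0  # no metric in common, so nothing can be compared
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Prediction target: only scale values are registered; the defect count is still missing.
target = {"req_scale": 10, "design_scale": 65, "impl_scale": 55,
          "test_scale": 15, "req_detected": None}

# Hypothetical past project with every metric filled in.
past = {"req_scale": 20, "design_scale": 130, "impl_scale": 110,
        "test_scale": 30, "req_detected": 90}

sim = similarity(target, past)
```

Because `req_detected` is missing for the target, it is skipped, and the similarity is computed from the four scale values alone.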

At the start of the requirement definition process of the prediction target project, the records in the detected defect count table 222 are, for example, as shown in FIG. 16. That is, the detected defect count table 222 contains no record for the prediction target project. In this state, the similarity between the prediction target project and the other projects is calculated using the scale value data, as described above. Then, based on the similarity between the prediction target project and the highly similar projects and on the data for the highly similar projects held in the detected defect count table 222, the predicted value of the detected defect count in each development process is obtained for the prediction target project. The user then registers the predicted values in the detected defect count table 222 using the data registration means 71. As a result, the records in the detected defect count table 222 become, for example, as shown in FIG. 17. Similarly, the predicted value of the mixed-in defect count in each development process is obtained for the prediction target project, and the predicted values are registered in the mixed-in defect count table 221. In other words, in the present embodiment, the defect prediction process predicts not only the detected defect count in each development process but also the mixed-in defect count in each development process.

When the requirement definition process ends, the actual values of the metrics relating to requirement definition (for example, the requirement definition detected defect count in the detected defect count table 222) are obtained. The user registers these actual values in the respective tables. As a result, at the start of the design process, which follows the requirement definition process, the records in the detected defect count table 222 are, for example, as shown in FIG. 18. The detected defect count table 222 shown in FIG. 18 stores the actual value of the requirement definition detected defect count for the prediction target project. From the example in FIG. 18, it can be seen that, for the prediction target project, the requirement definition detected defect count was predicted to be "100" at the start of the requirement definition process but was actually "90". Thus, at the start of the design process, the actual value of the requirement definition detected defect count for the prediction target project is already registered in the detected defect count table 222. Therefore, at the start of the design process, the similarity between the prediction target project and the other projects can be obtained using not only the scale value data but also the actual value of the requirement definition detected defect count. By predicting the values of the metrics relating to the development processes not yet carried out based on the similarity obtained in this way, the predicted values are obtained with higher accuracy than before.
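The way predicted and actual values accumulate across development processes can be sketched as below. The record layout (`Entry`, its field names, and one entry per metric) is a hypothetical simplification of the table format, not the format shown in the figures; the numbers 100, 90, 65, and 55 are the ones quoted in this section for FIGS. 18 and 19.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    prediction_process: str  # process at whose start the entry was written
    metric: str
    value: float
    is_actual: bool          # True = actual value, False = predicted value

# History for the prediction target project (hypothetical layout).
history = [
    Entry("requirement_definition", "req_detected", 100, False),
    Entry("requirement_definition", "design_detected", 65, False),
    Entry("design", "req_detected", 90, True),      # actual value after the process ended
    Entry("design", "design_detected", 55, False),  # re-predicted value
]

def actuals_for_similarity(entries):
    """Actual values only; these can be used alongside the scale values for similarity."""
    return {e.metric: e.value for e in entries if e.is_actual}

def latest(entries, metric):
    """Most recently written entry (predicted or actual) for one metric."""
    matches = [e for e in entries if e.metric == metric]
    return matches[-1] if matches else None
```

At the start of the design process, `actuals_for_similarity(history)` contains the requirement definition detected defect count of 90, which is the additional input the text describes.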

When data is registered in the detected defect count table 222 based on the result of the defect prediction process performed at the start of the design process, the records in the detected defect count table 222 become, for example, as shown in FIG. 19. From the example in FIG. 19, it can be seen that, for the prediction target project, the design detected defect count was predicted to be "65" at the start of the requirement definition process but is predicted to be "55" at the start of the design process. In this way, each time a development process ends, the values of the metrics relating to the development processes not yet carried out can be predicted again.

When the prediction target project has been completed through user release, the records in the detected defect count table 222 are, for example, as shown in FIG. 20. FIG. 20 shows that, each time a development process ends, actual values are registered for the completed development processes and predicted values are calculated for the development processes not yet carried out.

<1.5 Effects>
According to the present embodiment, the software defect prediction device 7 is provided with the mixed-in defect count table 221, which holds the mixed-in defect count for each development process, in addition to the detected defect count table 222, which holds the detected defect count for each development process, as tables storing the values of the metrics whose predicted values are calculated using the collaborative filtering technique. This makes it possible to predict not only the detected defect count but also the mixed-in defect count in each development process of software development. Because the number of defects actually mixed into the system in each development process can thus be predicted, the processes to be improved can be narrowed down more accurately as a software development project proceeds.

Further, according to the present embodiment, the mixed-in defect count table 221 and the detected defect count table 222 hold defect count data for each development process rather than for each project as a whole. Therefore, partway through software development, the predicted values for the development processes not yet carried out can be obtained using the actual values of the completed development processes. This makes it possible to predict the number of defects arising in each development process with higher accuracy than before. From this viewpoint as well, the processes to be improved can be narrowed down more accurately as a software development project proceeds.

As described above, the present embodiment realizes a system that can predict the number of defects arising in each development process of software development with high accuracy and present the information needed to narrow down the processes to be improved more efficiently.

<2. Second Embodiment>
<2.1 Schematic configuration>
The overall configuration of the system including the software defect prediction device according to the second embodiment of the present invention is the same as in the first embodiment (see FIG. 1), so its description is omitted. FIG. 21 is a block diagram showing the hardware configuration of the software defect prediction device 7 according to the present embodiment. As can be seen from FIG. 2 and FIG. 21, the hardware configuration itself is the same in the present embodiment and the first embodiment. In the present embodiment, however, the database 22 in the auxiliary storage device 20 is provided with a development man-hour table 224, a source code metrics table 225, and a technical factor/environmental factor table 226 in addition to the tables provided in the first embodiment.

<2.2 Tables>
Next, the tables held in the database 22 in the auxiliary storage device 20 will be described. The mixed-in defect count table 221, the detected defect count table 222, and the scale value table 223 are the same as in the first embodiment, so their description is omitted.

FIG. 22 is a diagram showing an example of the record format of the development man-hour table 224. The development man-hour table 224 includes a plurality of items whose item names are "name", "version", "prediction process", "requirement definition man-hours", "prediction/actual flag (d1)", "design man-hours", "prediction/actual flag (d2)", "implementation man-hours", "prediction/actual flag (d3)", "test man-hours", and "prediction/actual flag (d4)".

The fields of the items in the development man-hour table 224 (other than items already described, such as "name") store the following data. "Requirement definition man-hours" stores the development man-hours required for the requirement definition process (actual value) or the development man-hours predicted to be required for it (predicted value). "Design man-hours" stores the development man-hours required for the design process (actual value) or predicted to be required for it (predicted value). "Implementation man-hours" stores the development man-hours required for the implementation process (actual value) or predicted to be required for it (predicted value). "Test man-hours" stores the development man-hours required for the test process (actual value) or predicted to be required for it (predicted value).

FIG. 23 is a diagram showing an example of the record format of the source code metrics table 225. The source code metrics table 225 includes a plurality of items whose item names are "name", "version", "prediction process", "number of control statements", "prediction/actual flag (e1)", "complexity", and "prediction/actual flag (e2)". Source code metrics are indices for quantitatively evaluating the quality of the source code of a program to be executed by a computer. In the present embodiment, the number of control statements and the complexity are used as the source code metrics.

The fields of the items in the source code metrics table 225 (other than items already described, such as "name") store the following data. "Number of control statements" stores the number of control statements (for example, GOTO statements) contained in the source code (actual value) or the number of control statements predicted to be contained in the source code (predicted value). "Complexity" stores a value (actual or predicted) that indicates the complexity of the source code (for example, how many branches and loops it contains).

FIG. 24 is a diagram showing an example of the record format of the technical factor/environmental factor table 226. The technical factor/environmental factor table 226 includes a plurality of items whose item names are "name", "version", "prediction process", "technical factor", and "environmental factor".

The fields of the items in the technical factor/environmental factor table 226 (other than items already described, such as "name") store the following data. "Technical factor" stores a value obtained from the evaluation values of a plurality of indices relating to the technical difficulty of the software development. "Environmental factor" stores a value obtained from the evaluation values of indices relating to the experience and ability of the personnel (workers) involved in the software development.

Examples of technical factors include "ease of reusing source code", "complexity of internal processing", and "degree of portability". A weighting coefficient is defined for each of these factors. The sum of the products of each factor's evaluation value and its weighting coefficient is stored in the "technical factor" field of the technical factor/environmental factor table 226.

Examples of environmental factors include "presence or absence of development experience", "motivation", and "ability of the leader". A weighting coefficient is defined for each of these factors. The sum of the products of each factor's evaluation value and its weighting coefficient is stored in the "environmental factor" field of the technical factor/environmental factor table 226.
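The weighted sum described for the technical and environmental factors can be sketched as follows. The function name, the factor keys, the evaluation values, and the weighting coefficients are all hypothetical; the text specifies only that the stored value is the sum of evaluation value times coefficient over the factors.

```python
def factor_score(evaluations, weights):
    """Sum of (evaluation value x weighting coefficient) over all factors."""
    return sum(evaluations[name] * weights[name] for name in evaluations)

# Hypothetical evaluation values and weighting coefficients for the technical factors.
technical_evaluations = {"reusability": 3, "internal_complexity": 4, "portability": 2}
technical_weights = {"reusability": 1.0, "internal_complexity": 2.0, "portability": 0.5}

# Value that would be stored in the "technical factor" field.
technical_factor = factor_score(technical_evaluations, technical_weights)
```

With these illustrative numbers the result is 3 x 1.0 + 4 x 2.0 + 2 x 0.5 = 12.0. The "environmental factor" field would be computed the same way from its own evaluation values and coefficients.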

<2.3 Functional configuration>
FIG. 25 is a functional block diagram showing the functional configuration of the software defect prediction device 7 in the present embodiment. The software defect prediction device 7 includes data registration means 71, similarity calculation means 72, prediction value calculation means 73, the mixed-in defect count table 221 functioning as mixed-in defect count holding means, the detected defect count table 222 functioning as detected defect count holding means, the scale value table 223 functioning as scale value holding means, the development man-hour table 224 functioning as development man-hour holding means, the source code metrics table 225 functioning as difficulty level holding means, and the technical factor/environmental factor table 226 functioning as technical factor/environmental factor holding means. The similarity calculation means 72 includes normalization means 721 and similarity computation means 722.

The data registration means 71 registers data (adds, changes, and deletes records) in the mixed-in defect count table 221, the detected defect count table 222, the scale value table 223, the development man-hour table 224, the source code metrics table 225, and the technical factor/environmental factor table 226. As in the first embodiment, data is registered in each table by the data registration means 71, for example, before the start or after the end of each development process. In the present embodiment, as shown in FIG. 26, registration of the mixed-in defect counts (step S310), registration of the detected defect counts (step S320), registration of the scale values (step S330), registration of the development man-hours (step S340), registration of the source code metrics (step S350), and registration of the technical and environmental factors (step S360) are performed in sequence.

When the defect prediction process using the collaborative filtering technique is performed, the similarity calculation means 72 and the prediction value calculation means 73 perform the following processing. The similarity calculation means 72 calculates the similarity between the prediction target project and the other projects based on the data stored in the mixed-in defect count table 221, the detected defect count table 222, the scale value table 223, the development man-hour table 224, the source code metrics table 225, and the technical factor/environmental factor table 226. In doing so, the normalization means 721 normalizes the value of each metric, and the similarity computation means 722 obtains the similarity between the prediction target project and the other projects using the same calculation formula as in the first embodiment. The prediction value calculation means 73 calculates the predicted value of each metric, using the same calculation formula as in the first embodiment, based on the data stored in the mixed-in defect count table 221, the detected defect count table 222, the scale value table 223, the development man-hour table 224, and the source code metrics table 225 and on the similarity calculated by the similarity calculation means 72.
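Normalization matters here because the metrics now mix very different units (defect counts, man-hours, statement counts, factor scores). The normalization formula itself is defined in the first embodiment, outside this section; the sketch below assumes min-max scaling to the range 0 to 1 purely for illustration, and the function name is hypothetical.

```python
def normalize_column(values):
    """Min-max normalize one metric across projects, leaving missing values (None) as they are."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to normalize
    lo, hi = min(present), max(present)
    if hi == lo:
        # All present values are identical, so map them to 0.0 to avoid division by zero.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - lo) / (hi - lo) for v in values]

# One metric (for example, design man-hours) across four projects; the second is missing.
normalized = normalize_column([10, None, 30, 20])
```

After such a step, a metric measured in hundreds of man-hours would no longer dominate a metric measured in single-digit factor scores when similarities are computed.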

<2.4 Defect prediction process>
The flow of the defect prediction process and the number of highly similar projects are the same as in the first embodiment, so their description is omitted. The following describes how the data in each table is used during the defect prediction process in the present embodiment.

In the present embodiment, the following items are regarded as metrics: the "requirement definition mixed-in defect count", "design mixed-in defect count", and "implementation mixed-in defect count" in the mixed-in defect count table 221; the "requirement definition detected defect count", "design detected defect count", and "implementation detected defect count" in the detected defect count table 222; the "requirement definition scale", "design scale", "implementation scale", and "test scale" in the scale value table 223; the "requirement definition man-hours", "design man-hours", "implementation man-hours", and "test man-hours" in the development man-hour table 224; the "number of control statements" and "complexity" in the source code metrics table 225; and the "technical factor" and "environmental factor" in the technical factor/environmental factor table 226. Accordingly, the defect prediction process is performed using a project-metrics table 230, such as that shown in FIG. 27, as a virtual table.

In the present embodiment, data is registered in the scale value table 223 and the technical factor/environmental factor table 226 at the start of the requirement definition process of each project. At the start of the requirement definition process of the prediction target project, the mixed-in defect count table 221, the detected defect count table 222, the development man-hour table 224, and the source code metrics table 225 contain no record for the prediction target project. In this state, the similarity between the prediction target project and the other projects is calculated using the scale value data and the technical factor/environmental factor data. Based on the similarity obtained in this way, the value of each metric in the mixed-in defect count table 221, the detected defect count table 222, the development man-hour table 224, and the source code metrics table 225 is predicted.

When the requirement definition process ends, the actual values of the metrics relating to requirement definition are obtained. The user registers these actual values in the respective tables. As a result, at the start of the design process, the actual values of the metrics relating to requirement definition are already stored in the corresponding tables. Therefore, at the start of the design process, the similarity between the prediction target project and the other projects is obtained using not only the scale value data and the technical factor/environmental factor data but also the actual values of the metrics relating to requirement definition. The value of each metric is then predicted based on that similarity. In this way, at the start of each development process, the predicted values for the development processes not yet carried out are obtained using the actual values of the completed development processes.

When the similarity between the prediction target project and a project that has already been released to users is obtained, the final actual value of each metric is used for the released project. However, because of the nature of software development (for example, the development company or the workers may differ from one development process to another), the technical factor and environmental factor data may differ greatly between development processes. Therefore, for the data of a released project, the actual values from the requirement definition process may be used when predicting the values of the metrics relating to requirement definition, and the actual values from the design process may be used when predicting the values of the metrics relating to design, for example.

<2.5 Effects>
According to the present embodiment, in addition to the same effects as in the first embodiment, the following effects are obtained because the development man-hour table 224, the source code metrics table 225, and the technical factor/environmental factor table 226 are provided.

According to the present embodiment, the development man-hours, the source code metrics, and the technical and environmental factors are taken into account when the similarity between the prediction target project and the other projects is obtained. The similarity is thus calculated using a variety of indices, and the value of each metric is predicted based on that similarity. As a result, the value of each metric in each development process is predicted with remarkably high accuracy. In particular, the use of the technical factor and environmental factor data allows the prediction to take the characteristics of the software development into account. Because the number of defects arising in each development process of software development (the mixed-in defect count and the detected defect count) is thus predicted in consideration of those characteristics, the processes to be improved can be narrowed down very effectively as a software development project proceeds.

Further, according to the present embodiment, the development man-hours in each development process are predicted. This provides more material for deciding which processes to improve, so the processes to be improved can be narrowed down more accurately. In general, for a project of a given scale, the more development man-hours are spent, the higher the quality of the software. That is, if the scale is constant, the number of defects tends to decrease as the development man-hours increase. Because there is thus a relatively high correlation between development man-hours and the number of defects, predicting the number of defects based on a similarity calculated using the development man-hour data makes it possible to predict the number of defects with higher accuracy.

Furthermore, according to the present embodiment, source code metrics are predicted. Source code metrics are indices for quantitatively evaluating the quality of the source code created in the implementation process, so predicting them makes it possible to judge more accurately whether the implementation process should be an improvement target process.

<2.6 Modification>
A modification of the second embodiment will be described. In the second embodiment, the source code metrics table 225 is provided as the table that functions as the difficulty level holding means. In this modification, an ambiguity table 227 is provided instead as the table that functions as the difficulty level holding means. Note that a configuration in which both the source code metrics table 225 and the ambiguity table 227 are provided may also be employed.

Here, “ambiguity” will be described. In software development, a document is generally created in each development process. A document is referred to by multiple workers, and if a document created by one worker uses difficult vocabulary or complex syntax, other workers may misunderstand its meaning. As a simple example, if the expression “A and B or C” appears in a document, it may be understood to mean “(A and B) or C” or to mean “A and (B or C)”. Such ambiguity in the expressions used in documents is a factor that causes defects in the system. In other words, the more ambiguous expressions a document contains, the more defects are expected to occur in the system. For this reason, software that measures the “ambiguity”, which represents the degree of ambiguity of a document, has been developed in recent years in order to reduce defects caused by ambiguous expressions in documents. In this modification, ambiguity data for the documents created in each development process of software development is held in the ambiguity table 227. Based on that ambiguity data, the similarity calculation means 72 calculates the similarity and the predicted value calculation means 73 calculates the predicted values.
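
The two readings of “A and B or C” mentioned above can be compared mechanically. The following is a minimal sketch, not part of the described apparatus; the function names `reading_1` and `reading_2` are illustrative assumptions. It enumerates all truth assignments and collects those on which the two readings disagree.

```python
from itertools import product


def reading_1(a: bool, b: bool, c: bool) -> bool:
    # Interpretation "(A and B) or C"
    return (a and b) or c


def reading_2(a: bool, b: bool, c: bool) -> bool:
    # Interpretation "A and (B or C)"
    return a and (b or c)


# Truth assignments (A, B, C) on which the two interpretations give different results.
disagreements = [
    (a, b, c)
    for a, b, c in product([False, True], repeat=3)
    if reading_1(a, b, c) != reading_2(a, b, c)
]

print(disagreements)
```

The readings differ exactly when A is false and C is true, which is the kind of case where two workers relying on the same sentence could implement different behavior.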

FIG. 28 is a diagram illustrating an example of the record format of the ambiguity table 227. The ambiguity table 227 includes a plurality of items named “name”, “version”, “prediction process”, “request definition document ambiguity”, “predictive flag (f1)”, “design document ambiguity”, “predictive flag (f2)”, “program specification ambiguity”, “predictive flag (f3)”, “test specification ambiguity”, and “predictive flag (f4)”.

The fields of the items of the ambiguity table 227 (other than items already described above, such as “name”) store the following data. “Request definition document ambiguity” stores the ambiguity (actual value or predicted value) of the request definition document created in the request definition process. “Design document ambiguity” stores the ambiguity (actual value or predicted value) of the design document created in the design process. “Program specification ambiguity” stores the ambiguity (actual value or predicted value) of the program specification created in the implementation process. “Test specification ambiguity” stores the ambiguity (actual value or predicted value) of the test specification created in the test process.
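
One way to picture a record of the ambiguity table 227 is as a simple data class. The sketch below is only an illustration under stated assumptions: the Python field names, the `ValueKind` enum, and the helper `actual_values` are hypothetical, and each flag f1 to f4 is assumed to mark whether the adjacent ambiguity is an actual value or a predicted value, as the “(actual value or predicted value)” wording suggests. The sample values are made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ValueKind(Enum):
    """Hypothetical encoding of a predictive flag (f1 to f4)."""
    ACTUAL = "actual"
    PREDICTED = "predicted"


@dataclass
class AmbiguityRecord:
    """One record of the ambiguity table (field names are illustrative)."""
    name: str
    version: str
    prediction_process: str
    request_definition_ambiguity: Optional[float] = None
    f1: Optional[ValueKind] = None
    design_ambiguity: Optional[float] = None
    f2: Optional[ValueKind] = None
    program_spec_ambiguity: Optional[float] = None
    f3: Optional[ValueKind] = None
    test_spec_ambiguity: Optional[float] = None
    f4: Optional[ValueKind] = None

    def actual_values(self) -> Dict[str, float]:
        """Return only the ambiguity values whose flag marks them as actual."""
        pairs = [
            ("request_definition_ambiguity", self.request_definition_ambiguity, self.f1),
            ("design_ambiguity", self.design_ambiguity, self.f2),
            ("program_spec_ambiguity", self.program_spec_ambiguity, self.f3),
            ("test_spec_ambiguity", self.test_spec_ambiguity, self.f4),
        ]
        return {
            key: value
            for key, value, flag in pairs
            if value is not None and flag is ValueKind.ACTUAL
        }


# Made-up sample: the request definition value is measured, the design value is a prediction.
record = AmbiguityRecord(
    name="ProjectX",
    version="1.0",
    prediction_process="design",
    request_definition_ambiguity=0.32,
    f1=ValueKind.ACTUAL,
    design_ambiguity=0.41,
    f2=ValueKind.PREDICTED,
)
```

Keeping the flag next to each value lets a caller separate measured ambiguities from predicted ones without a second table.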

In this modification, the similarity between the prediction target project and other projects is calculated using the ambiguity data held in the ambiguity table 227 described above, and the number of defects (the number of mixed defects and the number of detected defects) in each development process of the prediction target project is predicted based on that similarity. In general, there is a relatively high correlation between the ambiguity of documents and the number of defects, so using ambiguity data makes it possible to predict the number of defects with higher accuracy. This makes it possible to narrow down the improvement target processes even more accurately.
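
The flow of this paragraph (normalize the ambiguity data, compute similarities, then predict defect counts from the most similar projects) can be sketched as follows. This is not the calculation formula of the embodiments: min-max normalization, the distance-based similarity 1 / (1 + Euclidean distance), the similarity-weighted average, the parameter `k`, and all function names are stand-in assumptions chosen for illustration, and the project data is made up.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def min_max_normalize(rows: Sequence[Vector]) -> List[List[float]]:
    """Scale every column to [0, 1]; a constant column becomes all zeros."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def similarity(u: Vector, v: Vector) -> float:
    """Stand-in similarity in (0, 1]; 1.0 means the vectors are identical."""
    return 1.0 / (1.0 + math.dist(u, v))


def predict_defects(
    target: Vector, past: Sequence[Tuple[Vector, Vector]], k: int = 2
) -> List[float]:
    """Similarity-weighted average of defect counts over the k most similar projects.

    `past` holds (ambiguity vector, defect counts per development process) pairs.
    """
    if not past:
        raise ValueError("at least one past project is required")
    rows = min_max_normalize([list(target)] + [list(amb) for amb, _ in past])
    sims = [similarity(rows[0], row) for row in rows[1:]]
    top = sorted(range(len(past)), key=lambda i: sims[i], reverse=True)[:k]
    total = sum(sims[i] for i in top)
    n_processes = len(past[0][1])
    return [
        sum(sims[i] * past[i][1][j] for i in top) / total
        for j in range(n_processes)
    ]


# Made-up data: four document ambiguities per project, two defect counts per project.
PAST = [
    ([10, 20, 30, 40], [5, 8]),
    ([12, 22, 28, 41], [6, 9]),
    ([90, 80, 70, 60], [30, 40]),
]
TARGET = [11, 21, 29, 40]

prediction = predict_defects(TARGET, PAST, k=2)
```

With this data the two projects whose ambiguities are close to the target dominate the prediction, so each predicted count falls between their recorded counts rather than near those of the dissimilar third project.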

<3. Others>
The configuration of each table in the above embodiments is an example, and the present invention is not limited to it. For example, each table may include items other than those described in the above embodiments. Also, for example, the mixed defect number table 221 and the detected defect number table 222 may be combined into a single table. Further, calculation formulas other than those described above may be used to calculate the similarity and the predicted values.

DESCRIPTION OF SYMBOLS
7 ... Server machine (software defect prediction device)
8 ... Personal computer
20 ... Auxiliary storage device
21 ... Program storage part
22 ... Database
71 ... Data registration means
72 ... Similarity calculation means
73 ... Predicted value calculation means
210 ... Software defect prediction program
221 ... Mixed defect number table
222 ... Detected defect number table
223 ... Scale value table
224 ... Development man-hour table
225 ... Source code metrics table
226 ... Technical factor / environment factor table
227 ... Ambiguity table
230 ... Project-metrics table
721 ... Normalization means
722 ... Similarity calculation means

Claims

A software defect prediction apparatus that predicts the number of defects that occur in software development, comprising:
mixed defect number holding means for holding the number of mixed defects for each development process for each project;
detected defect number holding means for holding the number of detected defects for each development process for each project;
scale value holding means for holding a scale value representing the scale of each development process for each project;
similarity calculation means for calculating, using the scale value for each development process for each project held in the scale value holding means, the similarity between a prediction target project, which is a project for which the number of mixed defects and the number of detected defects are to be predicted, and each project other than the prediction target project; and
predicted value calculation means for calculating predicted values of the number of mixed defects and the number of detected defects for each development process for the prediction target project, based on the number of mixed defects for each development process held in the mixed defect number holding means and the number of detected defects for each development process held in the detected defect number holding means for a high-similarity project, which is a project having a relatively high similarity to the prediction target project, and on the similarity between the prediction target project and the high-similarity project.

The software defect prediction apparatus according to claim 1, wherein, when calculating the similarity, the similarity calculation means further uses the number of mixed defects for each development process for each project held in the mixed defect number holding means and the number of detected defects for each development process for each project held in the detected defect number holding means.

The software defect prediction apparatus according to claim 2, wherein, when calculating the similarity, the similarity calculation means further uses the number of mixed defects in already-performed development processes for the prediction target project held in the mixed defect number holding means and the number of detected defects in already-performed development processes for the prediction target project held in the detected defect number holding means.

The software defect prediction apparatus according to any one of claims 1 to 3, wherein the predicted value calculation means further calculates a predicted value of the scale value for each development process for the prediction target project.

The software defect prediction apparatus according to any one of claims 1 to 4, further comprising technical factor / environment factor holding means for holding, for each development process for each project, an evaluation value obtained based on a technical index and an evaluation value obtained based on an environmental index,
wherein, when calculating the similarity, the similarity calculation means further uses the evaluation value obtained based on the technical index and the evaluation value obtained based on the environmental index for each development process for each project held in the technical factor / environment factor holding means.

The software defect prediction apparatus according to any one of the preceding claims, further comprising difficulty level holding means for holding a difficulty level for each development process for each project,
wherein, when calculating the similarity, the similarity calculation means further uses the difficulty level for each development process for each project held in the difficulty level holding means.

The software defect prediction apparatus according to claim 6, wherein the difficulty level holding means holds, as the difficulty level, a source code metric value that is an index representing the difficulty level of the source code created in software development, and
the predicted value calculation means further calculates a predicted value of the source code metric value for the prediction target project based on the source code metric value for the high-similarity project held in the difficulty level holding means.

The software defect prediction apparatus according to claim 6 or 7, wherein the difficulty level holding means holds the ambiguity level of a document created by software development as the difficulty level.

The software defect prediction apparatus according to any one of the preceding claims, further comprising development man-hour holding means for holding the development man-hours for each development process for each project,
wherein, when calculating the similarity, the similarity calculation means further uses the development man-hours for each development process for each project held in the development man-hour holding means.

The software defect prediction apparatus according to claim 9, wherein the predicted value calculation means further calculates a predicted value of the development man-hours for each development process for the prediction target project based on the development man-hours for each development process for the high-similarity project held in the development man-hour holding means.

A software defect prediction method for predicting the number of defects that occur in software development, comprising:
a mixed defect number storing step of storing the number of mixed defects for each development process for each project in mixed defect number holding means prepared in advance;
a detected defect number storing step of storing the number of detected defects for each development process for each project in detected defect number holding means prepared in advance;
a scale value storing step of storing a scale value representing the scale of each development process for each project in scale value holding means prepared in advance;
a similarity calculation step of calculating, using the scale value for each development process for each project held in the scale value holding means, the similarity between a prediction target project, which is a project for which the number of mixed defects and the number of detected defects are to be predicted, and each project other than the prediction target project; and
a predicted value calculation step of calculating predicted values of the number of mixed defects and the number of detected defects for each development process for the prediction target project, based on the number of mixed defects for each development process held in the mixed defect number holding means and the number of detected defects for each development process held in the detected defect number holding means for a high-similarity project, which is a project having a relatively high similarity to the prediction target project, and on the similarity between the prediction target project and the high-similarity project.

A software defect prediction program for predicting the number of defects that occur in software development, the program causing a CPU of a computer to execute, using a memory:
a mixed defect number storing step of storing the number of mixed defects for each development process for each project in mixed defect number holding means prepared in advance;
a detected defect number storing step of storing the number of detected defects for each development process for each project in detected defect number holding means prepared in advance;
a scale value storing step of storing a scale value representing the scale of each development process for each project in scale value holding means prepared in advance;
a similarity calculation step of calculating, using the scale value for each development process for each project held in the scale value holding means, the similarity between a prediction target project, which is a project for which the number of mixed defects and the number of detected defects are to be predicted, and each project other than the prediction target project; and
a predicted value calculation step of calculating predicted values of the number of mixed defects and the number of detected defects for each development process for the prediction target project, based on the number of mixed defects for each development process held in the mixed defect number holding means and the number of detected defects for each development process held in the detected defect number holding means for a high-similarity project, which is a project having a relatively high similarity to the prediction target project, and on the similarity between the prediction target project and the high-similarity project.

The software defect prediction program according to claim 12, wherein, in the similarity calculation step, when the similarity is calculated, the number of mixed defects for each development process for each project held in the mixed defect number holding means and the number of detected defects for each development process for each project held in the detected defect number holding means are further used.

The software defect prediction program according to claim 13, wherein, in the similarity calculation step, when the similarity is calculated, the number of mixed defects in already-performed development processes for the prediction target project held in the mixed defect number holding means and the number of detected defects in already-performed development processes for the prediction target project held in the detected defect number holding means are further used.

The software defect prediction program according to any one of claims 12 to 14, wherein, in the predicted value calculation step, a predicted value of the scale value for each development process for the prediction target project is further calculated.

The software defect prediction program according to any one of claims 12 to 15, further causing the CPU to execute a technical factor / environment factor storing step of storing, in technical factor / environment factor holding means prepared in advance, an evaluation value obtained based on a technical index and an evaluation value obtained based on an environmental index for each development process for each project,
wherein, in the similarity calculation step, when the similarity is calculated, the evaluation value obtained based on the technical index and the evaluation value obtained based on the environmental index for each development process for each project held in the technical factor / environment factor holding means are further used.

The software defect prediction program according to any one of the preceding claims, further causing the CPU to execute a difficulty level storing step of storing a difficulty level for each development process for each project in difficulty level holding means prepared in advance,
wherein, in the similarity calculation step, when the similarity is calculated, the difficulty level for each development process for each project held in the difficulty level holding means is further used.

The software defect prediction program according to claim 17, wherein, in the difficulty level storing step, a source code metric value that is an index representing the difficulty level of the source code created in software development is stored in the difficulty level holding means as the difficulty level, and
in the predicted value calculation step, a predicted value of the source code metric value for the prediction target project is further calculated based on the source code metric value for the high-similarity project held in the difficulty level holding means.

The software defect prediction program according to claim 17 or 18, wherein, in the difficulty level storing step, the ambiguity level of a document created in software development is stored in the difficulty level holding means as the difficulty level.

The software defect prediction program according to any one of the preceding claims, further causing the CPU to execute a development man-hour storing step of storing the development man-hours for each development process for each project in development man-hour holding means prepared in advance,
wherein, in the similarity calculation step, when the similarity is calculated, the development man-hours for each development process for each project held in the development man-hour holding means are further used.

The software defect prediction program according to claim 20, wherein, in the predicted value calculation step, a predicted value of the development man-hours for each development process for the prediction target project is further calculated based on the development man-hours for each development process for the high-similarity project held in the development man-hour holding means.