JP3004067B2

JP3004067B2 - Record classification method for information processing device

Info

Publication number: JP3004067B2
Application number: JP3066042A
Authority: JP
Inventors: 祐司本多; 茂生立花
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1991-03-29
Filing date: 1991-03-29
Publication date: 2000-01-31
Anticipated expiration: 2015-01-31
Also published as: JPH05303597A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a record classification method for an information processing apparatus, and is particularly suitable for methods that use discriminant analysis processing.

[0002]

[Prior Art] Some information processing apparatuses process information in units of records, each of which is a set of several kinds of data (fields). In such cases, multiple records may be classified from a given viewpoint, and processing may be performed for each class or on the records of one particular class. For example, records consisting of fields such as branch name, branch location, sales floor area, sales, profit, and number of employees may be classified by sales, and it may then be checked whether profit per employee differs between the resulting groups. In this example, sales is the classification scale field. When a new record is input, the group to which it belongs is determined. If the classification scale field that defines the destination groups contains data, the record can simply be classified by that value. If the data of the classification scale field is unknown, the group is determined (the record is classified) from the data of the other fields. Discriminant analysis processing, a multivariate analysis technique, is often used for this determination. Discriminant analysis processing is also used to judge the validity of the choice of classification scale and of the classification conditions.

[0003] The concept of discriminant analysis processing (strictly speaking, discriminant analysis with training data) is briefly described below with reference to FIG. 2. FIG. 2 shows an example in which the groups G1 and G2 are discriminated from two field data items x1 and x2.

[0004] From records R11, R12, ... and R21, R22, ... whose group (G1 or G2) is known in advance, a discriminant function D, a plane that defines the boundary between the groups, is determined. As FIG. 2 shows, the discriminant function D is expressed as a linear combination of the field data x1 and x2, so that it appears as a straight line in a two-dimensional coordinate system. It is the projection surface for the coordinates of the records that maximizes the difference between groups and minimizes the variation within each group.
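The source does not give a formula for the discriminant function D. One standard way to make "maximize the difference between groups and minimize the variation within groups" precise is Fisher's criterion, sketched below. The symbols w, c, m_g, S_B, and S_W are assumptions introduced for illustration and do not appear in the source.

```latex
\begin{align}
D(\mathbf{x}) &= w_1 x_1 + w_2 x_2 - c, \\
J(\mathbf{w}) &= \frac{\mathbf{w}^{\top} S_B\, \mathbf{w}}{\mathbf{w}^{\top} S_W\, \mathbf{w}}, \\
S_B &= (\mathbf{m}_1 - \mathbf{m}_2)(\mathbf{m}_1 - \mathbf{m}_2)^{\top}, \qquad
S_W = \sum_{g=1}^{2} \sum_{\mathbf{x} \in G_g} (\mathbf{x} - \mathbf{m}_g)(\mathbf{x} - \mathbf{m}_g)^{\top}, \\
\mathbf{w} &\propto S_W^{-1} (\mathbf{m}_1 - \mathbf{m}_2).
\end{align}
```

Here m_g is the mean of the records known to belong to group G_g, S_B measures the separation between the group means, and S_W measures the scatter within the groups. The direction w that maximizes J is the last line; c is a threshold, for example the projection of the midpoint between the two group means.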

[0005] The group (G1 or G2) of each of the records R11, R12, ... and R21, R22, ... is determined by the data (x3) of a classification scale field that is distinct from the field data x1 and x2 described above.

[0006] Now suppose that a record Rx whose classification scale field value x3 is unknown is input. The position of record Rx in the coordinate system is obtained from its field data x1 and x2, and its group (G1 in FIG. 2) is determined from the position DRx obtained by projecting that position onto the surface defined by the discriminant function D.
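The two-field case described in [0004] to [0006] can be sketched in a few lines. The following is a minimal illustration, assuming Fisher's linear discriminant with a midpoint threshold; the source does not specify the calculation, and the function names, the sign convention, and the sample records are all hypothetical.

```python
from statistics import mean


def fit_discriminant(group1, group2):
    """Fit a two-group linear discriminant D(x) = w1*x1 + w2*x2 - c.

    group1, group2: lists of (x1, x2) tuples whose group is already known.
    Returns (w, c). D(x) > 0 suggests group G1, otherwise G2 (assumed convention).
    """
    m1 = (mean(p[0] for p in group1), mean(p[1] for p in group1))
    m2 = (mean(p[0] for p in group2), mean(p[1] for p in group2))

    # Pooled within-group scatter matrix [[s11, s12], [s12, s22]].
    s11 = s12 = s22 = 0.0
    for points, m in ((group1, m1), (group2, m2)):
        for x1, x2 in points:
            d1, d2 = x1 - m[0], x2 - m[1]
            s11 += d1 * d1
            s12 += d1 * d2
            s22 += d2 * d2

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("within-group scatter matrix is singular")

    # w = Sw^-1 (m1 - m2), using the closed-form inverse of a 2x2 matrix.
    dm1, dm2 = m1[0] - m2[0], m1[1] - m2[1]
    w = ((s22 * dm1 - s12 * dm2) / det, (-s12 * dm1 + s11 * dm2) / det)

    # Threshold: projection of the midpoint between the two group means.
    c = w[0] * (m1[0] + m2[0]) / 2 + w[1] * (m1[1] + m2[1]) / 2
    return w, c


def classify(record, w, c):
    """Project a record (x1, x2) onto w and compare with the threshold."""
    score = w[0] * record[0] + w[1] * record[1] - c
    return "G1" if score > 0 else "G2"


# Hypothetical records with known groups (not taken from the source).
g1 = [(1.0, 2.0), (2.0, 3.0), (2.0, 1.5), (3.0, 3.0)]
g2 = [(6.0, 7.0), (7.0, 8.5), (8.0, 7.0), (7.0, 6.0)]
w, c = fit_discriminant(g1, g2)

# A record Rx whose classification scale value x3 is unknown.
print(classify((2.5, 2.0), w, c))
```

The midpoint threshold implicitly treats both groups as equally likely; a different prior would shift c.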

[0007] Accordingly, the conventional series of processes for determining the group of a record whose group is unknown is as shown in FIG. 3. First, the discriminant analysis processing is started. Next, some records are selected from, for example, the records already stored in an auxiliary storage device, and the designation of the group to which each selected record belongs is captured (steps 101 to 103). Various calculations are then executed to determine the discriminant function and related results, and the results are displayed to the user (steps 104 and 105). When this processing is finished, the records whose group is unknown are read in, their groups are determined, and the results are displayed, for example in table form, which ends the series of processes (steps 106 and 107).

[0008] Thus, to obtain a discriminant function, multiple sets of records whose groups are already known must be selected and designated. In practice, this has been done either by creating a separate data file for each group from an existing data file or by attaching group identification information to the records.

[0009]

[Problems to Be Solved by the Invention] Conventionally, however, when designating the groups of the records that are input for calculating the discriminant function and the like, whichever of these methods was adopted, the user had to examine the classification scale field value of each record and designate its group one record at a time. This required a great deal of time and effort.

[0010] Improving the classification accuracy requires a large number of input records with designated groups, so the problem of performing the designation operation record by record becomes serious.

[0011] The present invention has been made in view of the above, and aims to provide a record classification method for an information processing apparatus in which the groups of the input records for discriminant analysis processing can be designated easily and the overall processing time can be shortened.

[0012]

[Means for Solving the Problems] To solve this problem, the present invention concerns a record classification method for an information processing apparatus that executes discriminant analysis processing for classifying records whose group is unknown into one of the groups, or for calculating the probability that each record whose group has been designated belongs to that group. In this method, the group of each record that is input data for the discriminant analysis processing is designated by the following processes.

[0013] Namely, the group of each record that is input data for the discriminant analysis processing is designated by: a selection process in which one or more of the fields constituting a record are selected as classification scale fields; a condition capture process that captures a condition to be applied to the data of the selected classification scale fields in order to determine which group each record belongs to; and a determination process that determines the group of each record based on the captured condition.
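The three processes in [0013] can be sketched for the simplest case of a single classification scale field. This is an illustration only: the function name `designate_groups`, the dictionary record layout, and the sample values are hypothetical and are not taken from the figures.

```python
def designate_groups(records, scale_field, condition):
    """Designate the group of each record from one classification scale field.

    records:     list of dicts, one per record
    scale_field: name of the selected classification scale field (selection process)
    condition:   predicate on that field's value (condition capture process)
    Returns (matching, non_matching) record lists (determination process).
    """
    matching, non_matching = [], []
    for rec in records:
        target = matching if condition(rec[scale_field]) else non_matching
        target.append(rec)
    return matching, non_matching


# Hypothetical records with fields "A" and "B".
records = [
    {"no": 1, "A": 350, "B": 12},
    {"no": 2, "A": 720, "B": 30},
    {"no": 3, "A": 600, "B": 25},
    {"no": 4, "A": 910, "B": 41},
]

group_hi, group_lo = designate_groups(records, "A", lambda v: v > 600)
print([r["no"] for r in group_hi])
print([r["no"] for r in group_lo])
```

The user supplies only the field name and the condition; no per-record designation is needed, which is the saving the method aims for.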

[0014]

[Operation] In an information processing apparatus, discriminant analysis processing is used to classify records whose group is unknown into one of the groups and to calculate the probability that each record whose group has been designated belongs to that group. When discriminant analysis processing is performed, the group of each record that serves as its input data must be known.

[0015] The present invention concerns the designation of the groups of such input records. First, one or more of the fields constituting a record are selected as the classification scale fields that define the groups. Next, a condition that is applied to the data of the selected classification scale fields to determine which group each record belongs to is captured. Finally, the group of each record is determined based on the captured condition. Once the groups have been determined, processing proceeds immediately to the discriminant analysis processing.

[0016]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings. Like a general computer system, the information processing apparatus of this embodiment consists, in hardware terms, of a CPU, memory, a keyboard, a CRT display, and so on (not shown). In terms of its record classification function, the apparatus is configured as shown in FIG. 4. The functional configuration shown in FIG. 4 is in practice implemented in software.

[0017] As shown in FIG. 4, the functional configuration (conceptual program structure) is divided into a control module 11, an input processing module 12, an output processing module 13, a table management processing module 14, an emphasis processing module 15, and a statistical processing module 16.

[0018] The control module 11 is responsible for overall control. The modules 12 to 16 execute their respective processing under the management of the control module 11.

[0019] The input processing module 12 captures records and commands entered from an input device such as a keyboard. For example, it captures records that have data in the classification scale field that determines the classification, as well as records whose classification scale field data is unknown and whose group is therefore unknown. It also accepts the designation (selection) of the classification scale field and captures the conditions that determine the groups of the input records.

[0020] The output processing module 13 controls output to a CRT display, a printer, and the like. For example, it outputs the screen used for designating the groups of the input records, the discriminant function obtained, and the group information obtained for records whose group was unknown.

[0021] The table management processing module 14 manages table data containing the multiple records stored in an auxiliary storage device, main memory, or the like. In this embodiment, emphasis information, which indicates that some of the records are emphasized relative to the others, may be attached to the table data managed by module 14.

[0022] The emphasis processing module 15 emphasizes those records in the table data managed by the table management processing module 14 that satisfy a given condition, distinguishing them from the other records. In other words, it divides the records into an emphasized group and a non-emphasized group, and thereby designates the groups of the input records needed to determine the discriminant function and the like. In detail, the emphasis processing module 15 consists of a condition definition unit 15a, a determination unit 15b, and an emphasis processing unit 15c. The condition definition unit 15a invokes the input processing module 12 and captures the designation (selection) of the classification scale field and the condition that decides, from the data value of the classification scale field, whether a record is to be emphasized. The determination unit 15b determines, according to the condition, whether each record should be emphasized. The emphasis processing unit 15c attaches emphasis information to the table managed by the table management processing module 14 based on the determination results.

[0023] The statistical processing module 16 calculates the discriminant function and the like and determines the group of records whose group is unknown. It consists of a discriminant analysis unit 16a and a probability calculation unit 16b. The discriminant analysis unit 16a calculates information such as the discriminant function using both the emphasized and the non-emphasized record groups. When given a record whose group is unknown, the probability calculation unit 16b determines its group using the discriminant function and other information calculated by the discriminant analysis unit 16a, and expresses the result as the probability of belonging to one of the groups. The probability calculation unit 16b also calculates this membership probability for input records whose group has been designated.
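The source does not say how the probability calculation unit 16b turns a discriminant result into a probability. One common choice, shown here purely as an assumption, is the logistic function of the discriminant score. This is the exact posterior when both groups are Gaussian with a shared covariance matrix and equal priors, and the score is the resulting log-odds. The function name and the sample scores are hypothetical.

```python
import math


def membership_probability(score):
    """Probability of belonging to the first group, given a log-odds score.

    Assumes score = log(P(group 1 | x) / P(group 2 | x)). The two branches
    avoid overflow in math.exp for scores of large magnitude.
    """
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# Hypothetical discriminant scores for three records.
for name, score in [("Rx", 2.2), ("Ry", 0.4), ("Rz", -0.1)]:
    p = membership_probability(score)
    print(f"{name}: P(group 1) = {p:.3f}, P(group 2) = {1 - p:.3f}")
```

A score near zero gives a probability near 0.5, which corresponds to the situation described for a record that is designated to one group but has a fairly high probability of belonging to the other.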

[0024] Next, the series of record classification processes for determining the group of a record whose group is unknown (that is, classifying the record) is described with reference to FIG. 1 and FIGS. 5 to 7. FIG. 1 is a flowchart showing the flow of processing. FIGS. 5 and 6 are explanatory diagrams showing table data made up of the input records used for determining the discriminant function and the like. FIG. 5 shows the data before emphasis processing, and FIG. 6 shows it after emphasis processing. FIG. 7 is an explanatory diagram showing the result of determining the groups of records whose group was unknown. In the example of FIGS. 5 to 7, each record consists of four fields, A, B, C, and D.

[0025] First, the classification scale field selected by the user is captured (step 201). More than one field may be designated as a classification scale field. Next, the condition on the classification scale field for dividing the input records into two groups is captured (step 202). If several fields have been designated as classification scale fields, a condition for each field and an overall condition on those results are captured. A determination is then made based on the captured conditions, and information indicating whether each record is an emphasized record or a non-emphasized record is attached to it (step 203).
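When several classification scale fields are designated, [0025] calls for one condition per field plus an overall condition on the per-field results. A minimal sketch follows, assuming the overall condition is either "all hold" or "any holds". The source does not say which forms of overall condition are supported, and the names `judge`, `conditions`, and `combine` as well as the sample values are hypothetical.

```python
def judge(record, field_conditions, combine=all):
    """Return True if the record is to be emphasized.

    field_conditions: dict mapping a field name to a predicate on its value
    combine:          overall condition applied to the per-field results,
                      e.g. the built-ins all (AND) or any (OR)
    """
    results = [cond(record[field]) for field, cond in field_conditions.items()]
    return combine(results)


# Hypothetical records and per-field conditions.
records = [
    {"no": 1, "A": 720, "B": 15},
    {"no": 2, "A": 810, "B": 40},
    {"no": 3, "A": 450, "B": 55},
]
conditions = {"A": lambda v: v > 600, "B": lambda v: v >= 30}

# Step 203: attach emphasis information to each record.
for rec in records:
    rec["emphasized"] = judge(rec, conditions, combine=all)

both = [r["no"] for r in records if r["emphasized"]]
either = [r["no"] for r in records if judge(r, conditions, combine=any)]
print(both, either)
```

Passing the overall condition as a function keeps the per-field conditions independent of how their results are combined.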

[0026] FIG. 6 shows the emphasis result when field A is selected as the classification scale field and "value of field A > 600" is specified as the condition for emphasized records.

[0027] The records are then automatically divided into two groups based on the emphasis result, without waiting for any designation by the user (step 204). In the example of FIG. 6, the records are divided into a group with record numbers 1, 2, 4, 5, and 6 and a group with record numbers 3, 7, 8, 9, and 10.

[0028] In this state, the discriminant analysis processing is started, and the record groups produced by the division are first automatically designated as the input record groups for the discriminant analysis (steps 205 and 206). The discriminant analysis processing is then executed to determine the discriminant function and the like, and the results are displayed to the user (steps 207 and 208). When this preprocessing for classification is finished, the records whose group is unknown are read in, their groups are determined based on the discriminant function and the like (for example, membership probabilities are calculated), and the results are displayed, for example in table form, which ends the series of processes (steps 209 and 210).

[0029] FIG. 7 shows the result when two records whose group is unknown because the value of classification scale field A is unknown (record numbers 11 and 12) are given. In this example, for both records the probability of belonging to the emphasized group (the group whose field A value is greater than 600), called the emphasis probability, is greater than the probability of belonging to the non-emphasized group.

[0030] FIG. 7 also shows the probabilities for the input records whose groups were designated. For example, the record with record number 6 is designated as belonging to the non-emphasized group, but according to the discriminant analysis result its probability of belonging to the emphasized group is also fairly high. Such results can also be used to analyze the choice of classification scale field and the validity of the classification conditions.

[0031] Thus, according to the above embodiment, the process of dividing multiple input records into two or more groups, which is needed to execute the discriminant analysis processing that classifies records whose group is unknown or calculates the probability that a record with a designated group belongs to that group, is performed automatically through the selection of the classification scale field and the setting of the condition. The processing time and the user's effort can therefore be greatly reduced. That is, the user need not sort the records into groups before entering them. Records can be entered without regard to groups, after which only the classification scale field and the condition have to be specified, which improves usability. Moreover, even when records already stored in a storage device are used, their groups can be designated easily.

[0032] In practice, information processing apparatuses such as database systems and spreadsheet systems process large numbers of records and must, as needed, classify records and judge the validity of the classification. Since use as in the above embodiment arises frequently, the above effect is significant.

[0033] Incidentally, some conventional information processing apparatuses such as database systems and spreadsheet systems do have a function for emphasizing some records so that they differ from the others. However, this emphasis processing is provided separately from the discriminant analysis processing and therefore cannot be used by it directly. After running the emphasis processing, the user has to rework the result so that the discriminant analysis processing can use it, which again takes more time and effort than in the embodiment.

[0034] Although the above embodiment discriminates between two groups, the invention can also be applied to classification (discrimination) into three or more groups. Likewise, although each record was shown as consisting of four fields of data, the invention is not limited to this. Furthermore, the invention can also be applied to discriminant analysis processing that does not classify records whose group is unknown. That is, record classification as used in the claims also includes calculating membership probabilities for records whose groups have been designated.

[0035]

[Effects of the Invention] As described above, according to the present invention, the designation of the multiple groups of input records required for the discriminant analysis processing that performs record classification is carried out automatically by selecting the classification scale field and deciding the condition on its field data. The user's operations can therefore be greatly reduced, and the processing time, including the designation of the input records, can be made much shorter than before.

[Brief description of the drawings]

FIG. 1 is a processing flowchart of the method of the embodiment.

FIG. 2 is a schematic explanatory diagram of discriminant analysis processing.

FIG. 3 is a processing flowchart of the conventional method.

FIG. 4 is a functional block diagram of the method of the embodiment.

FIG. 5 is an explanatory diagram of table data before emphasis processing.

FIG. 6 is an explanatory diagram of table data after emphasis processing.

FIG. 7 is an explanatory diagram of the final processing result.

[Explanation of symbols]

11: control module; 12: input processing module; 13: output processing module; 14: table management processing module; 15: emphasis processing module; 16: statistical processing module.

Continuation of the front page

(56) References: JP-A-1-288961 (JP, A); JP-A-2-24779 (JP, A); "Program Product VOS3 Statistical Calculation Library HISTATE2 Analysis Program Volume, Grammar Manual", pp. 277-291 (March 1990)

(58) Fields searched (Int. Cl.7, DB name): G06F 17/30; G06F 12/00 510

Claims

(57) [Claims]

A record classification method for an information processing apparatus that executes discriminant analysis processing for classifying records whose group is unknown into one of the groups, or for calculating the probability that each record whose group has been designated belongs to that group, wherein the group of each record that is input data for the discriminant analysis processing is designated by: a selection process of having one or more of the plurality of fields constituting a record selected as a classification scale field; a condition capture process of capturing a condition that is applied to the data of the selected classification scale field to determine which group each record belongs to; and a determination process of determining the group of each record based on the captured condition; and wherein the method thereafter proceeds to the discriminant analysis processing.