JP6610334B2

JP6610334B2 - Leakage risk providing apparatus, leakage risk providing method, and leakage risk providing program

Info

Publication number: JP6610334B2
Application number: JP2016037904A
Authority: JP
Inventors: 裕司山岡
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2016-02-29
Filing date: 2016-02-29
Publication date: 2019-11-27
Anticipated expiration: 2036-02-29
Also published as: JP2017156878A

Description

The present invention relates to a leakage risk providing apparatus, a leakage risk providing method, and a leakage risk providing program.

There is a demand for converting individual record data (microdata), in which each row stores information about one individual, so that as much information as possible is retained while privacy is respected. This kind of conversion is called anonymization. Anonymized individual record data can be put to wider use, for example by being sold to third parties with due regard for privacy. However, because anonymity and usefulness are in a trade-off relationship, highly useful data with a wide range of uses carries a higher privacy leakage risk (hereinafter also simply called risk).

To ensure sufficient anonymity while keeping usefulness as high as possible, the provider of individual record data evaluates the risk and decides how much information the data should retain. One risk evaluation method uses the JO model, a model for calculating damages in personal information leakage incidents, to express the value of the leaked personal information as a monetary amount. Another method, called k-anonymity, builds a frequency distribution and uses the minimum frequency k as the risk index, with a larger k indicating a smaller risk.
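The k-anonymity index mentioned above can be illustrated with a short sketch. It assumes the data is a list of dictionaries and that the caller chooses which attributes to combine; the function name, attribute names, and sample rows are hypothetical and are not taken from the patent.

```python
from collections import Counter


def k_anonymity(rows, attributes):
    """Return k, the smallest frequency of any value combination of `attributes`."""
    counts = Counter(tuple(row[a] for a in attributes) for row in rows)
    return min(counts.values())


# Hypothetical sample rows (not taken from the patent figures).
ROWS = [
    {"birth_year": "4?", "nationality": "Japan"},
    {"birth_year": "4?", "nationality": "Japan"},
    {"birth_year": "4?", "nationality": "China"},
]

# All three rows share the same generalized birth year.
k_birth_year = k_anonymity(ROWS, ["birth_year"])
# Adding nationality makes the third row unique.
k_combined = k_anonymity(ROWS, ["birth_year", "nationality"])
```

A single number such as k = 1 signals that some row is unique, but it does not say which attributes or rows are responsible.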

NPO Japan Network Security Association, "Survey Report on Information Security Incidents, Fiscal 2003", March 31, 2004, [Online], [retrieved February 5, 2016], Internet <http://www.jnsa.org/active/2003/active2003_1a.html>
Latanya Sweeney. k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl.-Based Syst., Vol. 10, October 2002, pp. 557-570.

With these risk evaluation methods, however, a provider who judges from the evaluation result that the risk is too high will consider further anonymization, but a monetary amount or a risk index gives little guidance on how the data should be anonymized. It is therefore difficult to obtain concrete information for anonymizing the data so that sufficient anonymity is ensured while usefulness is kept high.

In one aspect, an object of the present invention is to provide a leakage risk providing apparatus, a leakage risk providing method, and a leakage risk providing program that can improve the accuracy of leakage risk determination.

In one aspect, a leakage risk providing apparatus includes a storage unit, an attribute extraction unit, an entry extraction unit, an influence degree calculation unit, and a risk information creation unit. The storage unit stores the sensitivity of each attribute in individual record data, which is a set of entries each given a plurality of attributes. The attribute extraction unit calculates a leakage risk for each combination of the attributes based on the sensitivity, and extracts the combinations of attributes whose leakage risk satisfies a predetermined condition. For each extracted combination, the entry extraction unit extracts the entries for which the number of entries identified by the values of the attributes in the combination satisfies a predetermined condition. The influence degree calculation unit calculates, based on the sensitivity, the degree of influence of the attributes of the extracted entries other than those included in the extracted combination. The risk information creation unit associates the extracted combination with the attributes whose degree of influence satisfies a predetermined condition.

The accuracy of leakage risk determination can be improved.

FIG. 1 is a block diagram illustrating an example of the configuration of the leakage risk providing apparatus according to the embodiment.
FIG. 2 is a diagram illustrating an example of individual record data.
FIG. 3 is a diagram illustrating an example of anonymization of individual record data.
FIG. 4 is a diagram illustrating an example of risk information.
FIG. 5 is a diagram illustrating another example of risk information.
FIG. 6 is a diagram illustrating an example of the individual record data storage unit.
FIG. 7 is a diagram illustrating an example of the priority information storage unit.
FIG. 8 is a diagram illustrating an example of the risk information storage unit.
FIG. 9 is a flowchart illustrating an example of the risk providing process according to the embodiment.
FIG. 10 is a diagram illustrating an example of the risk information storage unit partway through processing.
FIG. 11 is a diagram illustrating an example of a computer that executes the leakage risk providing program.

Hereinafter, embodiments of the leakage risk providing apparatus, the leakage risk providing method, and the leakage risk providing program disclosed in the present application will be described in detail with reference to the drawings. The disclosed technology is not limited by these embodiments. The following embodiments may be combined as appropriate to the extent that they do not contradict one another.

FIG. 1 is a block diagram illustrating an example of the configuration of the leakage risk providing apparatus according to the embodiment. The leakage risk providing apparatus 100 shown in FIG. 1 accepts individual record data, priority information, and a threshold group, and stores them in the storage unit 120. That is, based on the priority information, the storage unit 120 of the leakage risk providing apparatus 100 stores a table that associates a sensitivity with each attribute of the individual record data, which is a set of entries each given a plurality of attributes. The leakage risk providing apparatus 100 calculates a leakage risk for each combination of attributes based on the sensitivity, and extracts the combinations of attributes whose leakage risk satisfies a predetermined condition. For each extracted combination, the leakage risk providing apparatus 100 extracts the entries for which the number of entries identified by the values of the attributes in the combination satisfies a predetermined condition. Based on the sensitivity, the leakage risk providing apparatus 100 calculates the degree of influence of the attributes of the extracted entries other than those included in the extracted combination. The leakage risk providing apparatus 100 then associates the extracted combination with the attributes whose degree of influence satisfies a predetermined condition. In this way, the leakage risk providing apparatus 100 can improve the accuracy of leakage risk determination.

Anonymization of individual record data and risk information will now be described with reference to FIGS. 2 to 5. FIG. 2 is a diagram illustrating an example of individual record data. As shown in FIG. 2, the individual record data 20 is a two-dimensional table in which each row stores personal information. In the example of FIG. 2, the individual record data 20 has items indicating attributes such as "patient ID (IDentifier)", "birth year", "gender", "nationality", and "diagnosis result". The row numbers are added for explanation only. To anonymize the individual record data 20, the information of each attribute is, for example, generalized. For the attribute "birth year", for example, the value "11" may be converted to "<20", indicating a number less than 20; to "1?", where "?" stands for one unknown digit; or to "*", where "*" stands for an unknown character string. In this embodiment, the data is assumed to contain no false values, such as those produced by converting the value "11" to "2?". Because generalizing every row of an attribute to "*" can be regarded as substantially the same as deleting that attribute, attribute deletion is also treated as a form of generalization.

FIG. 3 is a diagram illustrating an example of anonymization of individual record data. The individual record data 21 shown in FIG. 3 is obtained from the individual record data 20 shown in FIG. 2 by deleting the attribute "patient ID" and generalizing the attribute "birth year". For "birth year", a value less than 20 is generalized to "<20", and for a value of 20 or more the ones digit is generalized to "?". As a result, the "patient ID" information is no longer available and the "birth year" is known only to within ten years, so the individual record data 21 carries less information than the individual record data 20. In general, anonymization reduces the amount of information and also reduces the risk.
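The generalization used to derive the individual record data 21 can be sketched as follows. This is a minimal illustration that assumes birth years are stored as integers and rows as dictionaries; the function names, key names, and the sample row (including the patient ID value) are hypothetical.

```python
def generalize_birth_year(value: int) -> str:
    """Generalize a birth year: below 20 becomes "<20", otherwise the ones digit becomes "?"."""
    if value < 20:
        return "<20"
    return f"{value // 10}?"


def anonymize(row: dict) -> dict:
    """Drop the patient ID and generalize the birth year, leaving other attributes as they are."""
    result = {key: value for key, value in row.items() if key != "patient_id"}
    result["birth_year"] = generalize_birth_year(row["birth_year"])
    return result


# Hypothetical row in the spirit of FIG. 2 (the patient ID value is made up).
original = {
    "patient_id": 101,
    "birth_year": 11,
    "gender": "male",
    "nationality": "Japan",
    "diagnosis_result": "hepatitis",
}
anonymized = anonymize(original)
```

The mapping never produces a false value: "11" becomes "<20" and "45" becomes "4?", both of which remain true statements about the original birth year.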

As an example, compare the risks of the individual record data 20 and the individual record data 21. For the individual record data 20, suppose that an attacker knows that only one person in the population has {birth year: 11} and knows who that person is, but does not know that person's "diagnosis result". In this case, there is a risk that the attacker, by looking at the individual record data 20, learns that the person corresponds to row number "1" and therefore that the person has {diagnosis result: hepatitis}. Here, {birth year: 11} indicates that the value of the attribute "birth year" is "11", and {diagnosis result: hepatitis} indicates that the value of the attribute "diagnosis result" is "hepatitis". Attributes and values may be written in the same way below.

In the individual record data 21, by contrast, the attribute "birth year" has been generalized, so a person is less likely to be uniquely identified. For example, if the population contains several people with {birth year: <20}, the overall risk is reduced. At the same time, using the individual record data 21 does not require knowing the information of any specific individual, so deleting the attribute "patient ID" causes no problem. The "birth year" information may also yield useful insights at ten-year granularity, so there can be situations in which the individual record data 21 is still valuable. The anonymization that minimizes the risk is to delete every attribute, but although this eliminates the risk, it also eliminates any possibility of secondary use.

FIG. 4 is a diagram illustrating an example of risk information. The risk information 22 shown in FIG. 4 is an example of the risk information obtained when the individual record data 20 shown in FIG. 2 is the input. As shown in FIG. 4, the risk information 22 has the items "attribute combination", "number of rows", "row number", "identifying information", and "influence information". "Attribute combination" indicates a combination of attributes that can identify rows of the individual record data 20. "Number of rows" indicates the number of rows that the attribute combination can identify. "Row number" gives, as examples, the row numbers of fewer than a predetermined number of the identifiable rows. "Identifying information" indicates the attribute values that identify an exemplified row; in other words, it is the minimum attribute information needed to identify the row with that row number. "Influence information" indicates, for the exemplified row, the values of the attributes other than those used in the attribute combination that become known once the row is identified. The influence information may be displayed so that the values of attributes with high sensitivity are given priority.

In the example of the first and second rows of FIG. 4, the attribute combination is "patient ID" and the number of rows is "1000", meaning that the value of the attribute "patient ID" alone identifies 1000 rows. Since the individual record data 20 is assumed to have 1000 rows in total, every row can be identified by the value of "patient ID" alone. Row number "3" is identified by the identifying information {patient ID: 103}, and its influence information is {diagnosis result: pica, nationality: China}.

In the example of the third row, the values of the attribute combination "birth year" and "gender" identify one row. The identified row number "1" has the identifying information {birth year: 11, gender: male} and the influence information {diagnosis result: hepatitis, nationality: Japan}. If an attacker knows that exactly one person in the population has {birth year: 11, gender: male}, the attacker can learn that person's influence information {diagnosis result: hepatitis, nationality: Japan}. In other words, for the individual record data 20 shown in FIG. 2, the risk information 22 shown in FIG. 4 suggests that the attributes "patient ID" and "birth year" are highly identifying and are therefore effective targets for anonymization.
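The relationship between an attribute combination, the row it singles out, and the resulting identifying and influence information can be sketched as below. This is an illustration under stated assumptions, not the patented procedure: it treats only rows that are unique under the combination as identified, omits the patient ID for brevity, and uses hypothetical names (`RiskRecord`, `risk_record`) and sample rows in which only row 1 follows the text.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RiskRecord:
    attribute_combination: tuple
    row_count: int         # number of rows the combination singles out
    row_number: int        # one exemplified row (1-based)
    identifying_info: dict  # the "specific information": values that identify the row
    influence_info: dict    # values revealed once the row is identified


def risk_record(rows, combination):
    """Return a RiskRecord for the rows that are unique under `combination`, or None."""
    def key(row):
        return tuple(row[a] for a in combination)

    counts = Counter(key(row) for row in rows)
    unique = [(n, row) for n, row in enumerate(rows, start=1) if counts[key(row)] == 1]
    if not unique:
        return None
    number, row = unique[0]
    return RiskRecord(
        attribute_combination=tuple(combination),
        row_count=len(unique),
        row_number=number,
        identifying_info={a: row[a] for a in combination},
        influence_info={a: v for a, v in row.items() if a not in combination},
    )


# Row 1 follows the example in the text; rows 2 and 3 are made up for illustration.
ROWS = [
    {"birth_year": 11, "gender": "male", "nationality": "Japan", "diagnosis_result": "hepatitis"},
    {"birth_year": 45, "gender": "female", "nationality": "Japan", "diagnosis_result": "cold"},
    {"birth_year": 45, "gender": "female", "nationality": "China", "diagnosis_result": "pica"},
]
record = risk_record(ROWS, ("birth_year", "gender"))
```

With these rows, the combination of birth year and gender singles out row 1 and exposes its nationality and diagnosis result, mirroring the third row of FIG. 4 as described.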

FIG. 5 is a diagram illustrating another example of risk information. The risk information 23 shown in FIG. 5 is an example of the risk information obtained when the individual record data 21 shown in FIG. 3 is the input. Compared with the risk information 22, the number of identifiable rows in the risk information 23 has fallen to two. In the example of the first row of the risk information 23, a row can be identified by the identifying information {birth year: 7?, gender: male, nationality: Japan}. However, if the provider judges, for example, that sufficiently many people in the population have these attribute values, the provider may decide that this risk is acceptable. Similarly, in the example of the second row, a row can be identified by the identifying information {diagnosis result: pica}, but if the provider judges, for example, that the influence information {nationality: China} has little impact, the provider may decide that this risk is acceptable.

Returning to FIG. 1, the configuration of the leakage risk providing apparatus 100 will now be described. As shown in FIG. 1, the leakage risk providing apparatus 100 includes an input unit 110, a display unit 111, an operation unit 112, a storage unit 120, and a control unit 130. In addition to the functional units shown in FIG. 1, the leakage risk providing apparatus 100 may include various functional units found in known computers, such as various input devices and audio output devices. A stationary personal computer is one example of the leakage risk providing apparatus 100. A portable personal computer can also be used as the leakage risk providing apparatus 100, as can other portable terminals such as a tablet terminal.

The input unit 110 is realized by, for example, a medium access device for an external storage medium such as an optical disk, a USB (Universal Serial Bus) memory, or an SD memory card. The input unit 110 reads the individual record data T, the priority information S, and the threshold group P stored in the external storage medium, and outputs them to the control unit 130.

The display unit 111 is a display device for displaying various kinds of information, and is realized by, for example, a liquid crystal display. The display unit 111 displays various screens, such as an output screen, input from the control unit 130.

The operation unit 112 is an input device that accepts various operations from the user of the leakage risk providing apparatus 100, and is realized by, for example, a keyboard or a mouse. The operation unit 112 outputs the operation entered by the user to the control unit 130 as operation information. The operation unit 112 may also be realized by a touch panel or the like, in which case the display device of the display unit 111 and the input device of the operation unit 112 may be integrated.

The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 120 includes an individual record data storage unit 121, a priority information storage unit 122, a threshold storage unit 123, and a risk information storage unit 124. The storage unit 120 also stores information used for processing in the control unit 130.

The individual record data storage unit 121 stores the individual record data for which the leakage risk is to be provided. The individual record data is a set of entries each given a plurality of attributes. FIG. 6 is a diagram illustrating an example of the individual record data storage unit. As shown in FIG. 6, the individual record data storage unit 121 has items such as "row number", "birth year", "nationality", and "diagnosis result". If the original individual record data has no "row number", one may be added; alternatively, the item may be omitted if the data structure allows the row number to be determined. The individual record data storage unit 121 stores, for example, one record per entry.

"Row number" is information indicating the number of the entry in the individual record data, that is, the record number. "Birth year" is information indicating the birth year of the person corresponding to the entry. "Nationality" is information indicating the nationality of the person corresponding to the entry. "Diagnosis result" is information indicating the diagnosis result of the person corresponding to the entry. In the example of the first row of FIG. 6, the person with row number "1" has birth year "4?", nationality "Japan", and diagnosis result "lung cancer".

Returning to FIG. 1, the priority information storage unit 122 stores information on the sensitivity of the attributes of the individual record data, which is used for prioritization. In other words, the priority information storage unit 122 is a table that associates a sensitivity with each attribute of the individual record data, which is a set of entries each given a plurality of attributes. FIG. 7 is a diagram illustrating an example of the priority information storage unit. As shown in FIG. 7, the priority information storage unit 122 includes an attribute table 122a indicating the sensitivity of attributes and an attribute value table 122b indicating the sensitivity of attribute values. The attribute table 122a has the items "attribute", "sensitivity", and "attribute value sensitivity information".

"Attribute" is information indicating an attribute of the individual record data. "Sensitivity" is information indicating the sensitivity of that attribute. It is a positive number, and a larger value indicates greater sensitivity. The sensitivity is quantified, for example, in the range of 1 to 100, although any numerical range may be used. "Attribute value sensitivity information" is, when the attribute values themselves have sensitivities, a pointer to the attribute value table 122b that gives the sensitivity of each attribute value. When the attribute values have no sensitivities of their own, "attribute value sensitivity information" is indicated by, for example, "x".

The attribute value table 122b has the items "attribute value" and "sensitivity". "Attribute value" is information indicating an attribute value of the corresponding attribute. "Sensitivity" is information indicating the sensitivity of that attribute value. To keep the sensitivities of attributes and of attribute values on the same scale, the priority information storage unit 122 uses the maximum of the attribute value sensitivities as the sensitivity of the attribute. In the example of FIG. 7, the attribute table 122a gives the attribute "birth year" a sensitivity of "2" and attribute value sensitivity information of "x". The sensitivity of the attribute "diagnosis result" is "100", which is the maximum of its attribute value sensitivities, and its attribute value sensitivity information is a pointer to the attribute value table 122b. In the attribute value table 122b of FIG. 7, for example, the attribute value "cold" has a sensitivity of "2" and the attribute value "hepatitis" has a sensitivity of "80".
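The two tables of the priority information can be pictured as nested dictionaries. The sketch below is illustrative only: the dictionary layout, the function names, and the fallback to the attribute-level sensitivity for values missing from the value table are assumptions, and the "lung cancer" entry with sensitivity 100 is made up so that the maximum matches the attribute's sensitivity as the text requires.

```python
# Attribute table 122a: attribute -> sensitivity (larger means more sensitive).
ATTRIBUTE_TABLE = {"birth_year": 2, "diagnosis_result": 100}

# Attribute value table 122b, reached from attributes whose values have their own sensitivity.
VALUE_TABLES = {
    "diagnosis_result": {
        "cold": 2,
        "hepatitis": 80,
        "lung cancer": 100,  # assumed value, chosen so the maximum equals the attribute's 100
    },
}


def sensitivity(attribute, value):
    """Look up the sensitivity of a value, falling back to the attribute's sensitivity."""
    value_table = VALUE_TABLES.get(attribute)
    if value_table is not None and value in value_table:
        return value_table[value]
    return ATTRIBUTE_TABLE[attribute]


def tables_are_consistent():
    """Check that each attribute's sensitivity equals the maximum of its value sensitivities."""
    return all(
        ATTRIBUTE_TABLE[attribute] == max(values.values())
        for attribute, values in VALUE_TABLES.items()
    )
```

Attributes marked "x" in the attribute table, such as "birth year" here, simply have no entry in `VALUE_TABLES`, so every value of that attribute takes the attribute-level sensitivity.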

Returning to FIG. 1, the threshold storage unit 123 stores the threshold group P. The threshold group P consists of, for example, thresholds p1, p2, and p3. The threshold p1 relates to the frequency distribution of an attribute combination and is used to determine the entries (rows) whose frequency is less than p1. The threshold p2 determines the number of entries (rows) to output when an attribute combination in the risk information identifies many rows. The threshold p3 is used, when an attribute combination in the risk information identifies many rows, to determine whether attribute combinations that contain an already processed attribute combination should be excluded from processing. The threshold group P can be set to, for example, p1 = 2, p2 = 3, and p3 = 2.
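How p1 and p2 might act on a frequency distribution can be sketched as follows. This is an assumption-laden illustration: the class and function names are hypothetical, the "fewer than p2 example rows" cut-off is one reading of the description of the risk information storage unit, and p3 is only carried as a field because its exact use is not spelled out in this part of the text. Rows 1 and 3 follow the values quoted for FIG. 6 and FIG. 8; row 2 is made up.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    p1: int = 2  # rows whose frequency is below p1 count as isolated
    p2: int = 3  # fewer than p2 example rows are kept per combination
    p3: int = 2  # relates to skipping combinations that contain a processed one (unused here)


def isolated_rows(rows, combination, thresholds):
    """Return (number of isolated rows, example row numbers) for one attribute combination."""
    def key(row):
        return tuple(row[a] for a in combination)

    counts = Counter(key(row) for row in rows)
    hits = [n for n, row in enumerate(rows, start=1) if counts[key(row)] < thresholds.p1]
    return len(hits), hits[: thresholds.p2 - 1]


ROWS = [
    {"birth_year": "4?", "nationality": "Japan", "diagnosis_result": "lung cancer"},
    {"birth_year": "4?", "nationality": "Japan", "diagnosis_result": "cold"},  # made-up row
    {"birth_year": "4?", "nationality": "China", "diagnosis_result": "hepatitis"},
]
by_year_and_nationality = isolated_rows(ROWS, ("birth_year", "nationality"), Thresholds())
by_diagnosis = isolated_rows(ROWS, ("diagnosis_result",), Thresholds())
```

With p1 = 2, an isolated row is one whose value combination occurs exactly once, so the first call reports row 3 alone, while the second finds all three rows isolated but lists only two of them as examples.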

The risk information storage unit 124 stores risk information. FIG. 8 is a diagram illustrating an example of the risk information storage unit. As shown in FIG. 8, the risk information storage unit 124 has the items "attribute combination", "number of rows", "row number", "identifying information", and "influence information". The risk information storage unit 124 stores, for example, fewer than p2 entries (rows) in association with each attribute combination.

"Attribute combination" is information indicating a combination of attributes that can identify rows of the accepted individual record data. "Number of rows" is information indicating the number of rows that the attribute combination can identify. "Row number" is information giving, as examples, the row numbers of fewer than a predetermined number of the identifiable rows. "Identifying information" is information indicating the attribute values that identify an exemplified row. "Influence information" is information indicating, for the exemplified row, the values of the attributes other than those used in the attribute combination that become known once the row is identified. The example of the first row of FIG. 8 shows that, for the attribute combination {birth year, nationality}, the number of identifiable rows is "1", that row number "3" is identified by the identifying information {birth year: 4?, nationality: China}, and that the influence information is {diagnosis result: hepatitis}.

図１の説明に戻って、制御部１３０は、例えば、ＣＰＵ（Central Processing Unit）やＭＰＵ（Micro Processing Unit）等によって、内部の記憶装置に記憶されているプログラムがＲＡＭを作業領域として実行されることにより実現される。また、制御部１３０は、例えば、ＡＳＩＣ（Application Specific Integrated Circuit）やＦＰＧＡ（Field Programmable Gate Array）等の集積回路により実現されるようにしてもよい。制御部１３０は、受付部１３１と、高リスク属性抽出部１３２と、孤立エントリ抽出部１３３と、影響度算出部１３４と、リスク情報抽出部１３５と、出力制御部１３６とを有し、以下に説明する情報処理の機能や作用を実現または実行する。なお、制御部１３０の内部構成は、図１に示した構成に限られず、後述する情報処理を行う構成であれば他の構成であってもよい。 Returning to the description of FIG. 1, the control unit 130 is realized, for example, by a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or the like executing a program stored in an internal storage device, using a RAM as a work area. The control unit 130 may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The control unit 130 includes a reception unit 131, a high risk attribute extraction unit 132, an isolated entry extraction unit 133, an influence degree calculation unit 134, a risk information extraction unit 135, and an output control unit 136, and realizes or executes the functions and operations of the information processing described below. Note that the internal configuration of the control unit 130 is not limited to the configuration illustrated in FIG. 1, and may be another configuration as long as it performs the information processing described below.

受付部１３１は、入力部１１０から個票データＴ、優先度情報Ｓおよび閾値群Ｐが入力されると、入力された個票データＴ、優先度情報Ｓおよび閾値群Ｐを受け付ける。受付部１３１は、受け付けた個票データＴ、優先度情報Ｓおよび閾値群Ｐを、それぞれ個票データ記憶部１２１、優先度情報記憶部１２２および閾値記憶部１２３に記憶するとともに、属性組み合わせ抽出指示を高リスク属性抽出部１３２に出力する。 When the individual vote data T, the priority information S, and the threshold value group P are input from the input unit 110, the reception unit 131 accepts them. The reception unit 131 stores the accepted individual vote data T, priority information S, and threshold value group P in the individual vote data storage unit 121, the priority information storage unit 122, and the threshold value storage unit 123, respectively, and outputs an attribute combination extraction instruction to the high risk attribute extraction unit 132.

高リスク属性抽出部１３２は、受付部１３１から属性組み合わせ抽出指示が入力されると、優先度情報記憶部１２２から優先度情報Ｓに含まれる各属性の機微度を読み出す。高リスク属性抽出部１３２は、読み出した各属性の機微度に基づいて、個票データ記憶部１２１を参照し、個票データＴの属性について、全属性の組み合わせを生成する。高リスク属性抽出部１３２は、生成した全属性の組み合わせについて、組み合わせ内の属性の機微度の積に基づいて優先度を算出する。すなわち、優先度は、機微度の積が小さい順に高リスクであるとする。つまり、優先度は、漏洩リスクに相当する。 When the attribute combination extraction instruction is input from the reception unit 131, the high risk attribute extraction unit 132 reads the sensitivity of each attribute included in the priority information S from the priority information storage unit 122. The high risk attribute extraction unit 132 refers to the individual vote data storage unit 121 based on the sensitivity of each read attribute, and generates a combination of all attributes for the attributes of the individual vote data T. The high-risk attribute extraction unit 132 calculates a priority for the generated combination of all attributes based on the product of the sensitivity of the attributes in the combination. That is, the priority is assumed to be higher risk in ascending order of product of sensitivity. That is, the priority corresponds to a leakage risk.

高リスク属性抽出部１３２は、算出した優先度に基づいて、優先度が所定の条件を満たす属性組み合わせを抽出する。すなわち、高リスク属性抽出部１３２は、処理すべき属性組み合わせを抽出する。なお、所定の条件は、例えば、機微度の積が２００未満である等とすることができる。つまり、余りにも知られにくい属性の組み合わせは処理から除外する。また、高リスク属性抽出部１３２は、リスク情報抽出部１３５から除外指示が入力されると、除外指示に含まれる属性組み合わせＡを包含する属性組み合わせを、処理すべき属性組み合わせから除外する。 The high risk attribute extraction unit 132 extracts attribute combinations whose priority satisfies a predetermined condition based on the calculated priority. That is, the high risk attribute extraction unit 132 extracts attribute combinations to be processed. The predetermined condition may be, for example, that the product of sensitivity is less than 200. That is, combinations of attributes that are too difficult to be known are excluded from the processing. Further, when the exclusion instruction is input from the risk information extraction unit 135, the high risk attribute extraction unit 132 excludes the attribute combination including the attribute combination A included in the exclusion instruction from the attribute combination to be processed.

高リスク属性抽出部１３２は、まだ処理すべき属性組み合わせがあるか否かを判定する。なお、高リスク属性抽出部１３２は、孤立エントリ抽出部１３３から選択指示が入力された場合にも同様に、まだ処理すべき属性組み合わせがあるか否かを判定する。高リスク属性抽出部１３２は、処理すべき属性組み合わせがない場合には、出力制御部１３６に出力指示を出力する。 The high risk attribute extraction unit 132 determines whether there is an attribute combination to be processed yet. Note that the high-risk attribute extraction unit 132 also determines whether there is an attribute combination that should still be processed when a selection instruction is input from the isolated entry extraction unit 133. The high risk attribute extraction unit 132 outputs an output instruction to the output control unit 136 when there is no attribute combination to be processed.

高リスク属性抽出部１３２は、処理すべき属性組み合わせがある場合には、処理すべき属性組み合わせのうち、最も優先度が高い属性組み合わせを属性組み合わせＡとして選択する。高リスク属性抽出部１３２は、選択した属性組み合わせＡを孤立エントリ抽出部１３３に出力する。なお、高リスク属性抽出部１３２は、最も優先度が高い属性組み合わせが複数ある場合には、属性の数が少ない属性組み合わせを最も優先度が高いものとする。 When there is an attribute combination to be processed, the high risk attribute extraction unit 132 selects the attribute combination having the highest priority among the attribute combinations to be processed as the attribute combination A. The high risk attribute extraction unit 132 outputs the selected attribute combination A to the isolated entry extraction unit 133. Note that, when there are a plurality of attribute combinations with the highest priority, the high risk attribute extraction unit 132 sets the attribute combination with the smallest number of attributes to have the highest priority.
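
The ordering rule described above (product of sensitivities, smaller product first, fewer attributes first on a tie, combinations at or above a limit dropped) can be sketched as follows. This is a minimal sketch, not the patented implementation: the function name `ranked_combinations` is hypothetical, the sensitivities 2 and 10 for birth year and nationality are those used in the worked example of this description, and the value 30 for the diagnosis result is an assumption made only so the example runs.

```python
from itertools import combinations
from math import prod


def ranked_combinations(sensitivity, limit=200):
    """Return (combination, product) pairs, highest priority first."""
    attrs = list(sensitivity)
    ranked = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            product = prod(sensitivity[a] for a in combo)
            # Example condition from the text: keep products below 200.
            if product < limit:
                ranked.append((product, size, combo))
    # Smaller product means higher priority; ties go to fewer attributes.
    ranked.sort(key=lambda item: (item[0], item[1]))
    return [(combo, product) for product, _, combo in ranked]


# "diagnosis result": 30 is an assumed value, not taken from FIG. 7.
order = ranked_combinations(
    {"birth year": 2, "nationality": 10, "diagnosis result": 30}
)
```

With these inputs the first three combinations are {birth year}, {nationality}, and {birth year, nationality}, which matches the selection order in the worked example.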

孤立エントリ抽出部１３３は、高リスク属性抽出部１３２から属性組み合わせＡが入力されると、属性組み合わせＡの値の度数分布Ｆを算出する。孤立エントリ抽出部１３３は、算出した度数分布Ｆに閾値ｐ１未満の度数があるか否かを判定する。孤立エントリ抽出部１３３は、度数分布Ｆに閾値ｐ１未満の度数がない場合には、次の属性組み合わせＡを選択する旨の選択指示を高リスク属性抽出部１３２に出力する。 When the attribute combination A is input from the high risk attribute extraction unit 132, the isolated entry extraction unit 133 calculates the frequency distribution F of the value of the attribute combination A. The isolated entry extraction unit 133 determines whether or not the calculated frequency distribution F has a frequency less than the threshold value p1. The isolated entry extraction unit 133 outputs a selection instruction for selecting the next attribute combination A to the high risk attribute extraction unit 132 when the frequency distribution F does not have a frequency less than the threshold p1.

孤立エントリ抽出部１３３は、度数分布Ｆに閾値ｐ１未満の度数がある場合には、個票データ記憶部１２１を参照し、当該度数の属性値の組み合わせで特定される行数（エントリ数）を算出する。すなわち、孤立エントリ抽出部１３３は、度数が閾値ｐ１未満の行について総行数を算出する。また、孤立エントリ抽出部１３３は、度数が閾値ｐ１未満の行を影響度算出部１３４に出力する。 When the frequency distribution F has a frequency less than the threshold p1, the isolated entry extraction unit 133 refers to the individual vote data storage unit 121 and calculates the number of rows (number of entries) identified by the combinations of attribute values having such a frequency. That is, the isolated entry extraction unit 133 calculates the total number of rows whose frequency is less than the threshold p1. The isolated entry extraction unit 133 also outputs the rows whose frequency is less than the threshold p1 to the influence degree calculation unit 134.
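
The frequency distribution F and the rows whose frequency falls below the threshold p1 can be sketched with a counter. The function name `isolated_rows` is hypothetical. The five sample rows are an assumption: they are chosen only to reproduce the counts given in the worked example of this description (row 3 being {4?, China}); the actual table of FIG. 6 is not reproduced here.

```python
from collections import Counter


def isolated_rows(rows, combo, p1):
    """Return the frequency distribution F and the 1-based numbers of
    rows whose attribute-value combination occurs fewer than p1 times."""
    keys = [tuple(row[a] for a in combo) for row in rows]
    freq = Counter(keys)
    hits = [n for n, key in enumerate(keys, start=1) if freq[key] < p1]
    return freq, hits


# Assumed sample data consistent with the counts in the worked example.
rows = [
    {"birth year": "4?", "nationality": "Japan"},
    {"birth year": "4?", "nationality": "Japan"},
    {"birth year": "4?", "nationality": "China"},
    {"birth year": "5?", "nationality": "China"},
    {"birth year": "5?", "nationality": "China"},
]

freq, hits = isolated_rows(rows, ("birth year", "nationality"), p1=2)
total_rows = len(hits)  # the total number of rows with frequency below p1
```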

孤立エントリ抽出部１３３は、影響度算出部１３４から影響度が入力されると、個票データ記憶部１２１を参照し、度数が閾値ｐ１未満の行のうち、影響度が高い順に閾値ｐ２行未満の行（エントリ）を抽出する。言い換えると、孤立エントリ抽出部１３３は、抽出された組み合わせごとに、当該組み合わせに含まれる属性の値で特定されるエントリ数が所定の条件を満たすエントリを抽出する。孤立エントリ抽出部１３３は、属性組み合わせＡ、算出した総行数、および、抽出した行をリスク情報抽出部１３５に出力する。 When the degree of influence is input from the influence degree calculation unit 134, the isolated entry extraction unit 133 refers to the individual vote data storage unit 121 and, from the rows whose frequency is less than the threshold p1, extracts fewer than threshold p2 rows (entries) in descending order of the degree of influence. In other words, for each extracted combination, the isolated entry extraction unit 133 extracts the entries for which the number of entries identified by the values of the attributes in that combination satisfies a predetermined condition. The isolated entry extraction unit 133 outputs the attribute combination A, the calculated total number of rows, and the extracted rows to the risk information extraction unit 135.

影響度算出部１３４には、孤立エントリ抽出部１３３から度数が閾値ｐ１未満の行が入力される。影響度算出部１３４は、個票データ記憶部１２１および優先度情報記憶部１２２を参照し、入力された度数が閾値ｐ１未満の行について、属性組み合わせＡに含まれる属性以外の属性の影響度を算出する。言い換えると、影響度算出部１３４は、機微度に基づいて、抽出されたエントリにおける抽出された組み合わせに含まれる属性以外の属性の影響度を算出する。つまり、影響度は、影響情報に含まれる属性値の機微度であり、数値が大きいほど影響度は高くなる。影響度算出部１３４は、算出した影響度を孤立エントリ抽出部１３３に出力する。 The influence degree calculation unit 134 receives the rows whose frequency is less than the threshold p1 from the isolated entry extraction unit 133. The influence degree calculation unit 134 refers to the individual vote data storage unit 121 and the priority information storage unit 122 and, for each input row, calculates the degree of influence of the attributes other than those included in the attribute combination A. In other words, the influence degree calculation unit 134 calculates, based on the sensitivity, the degree of influence of the attributes in the extracted entry other than those included in the extracted combination. That is, the degree of influence is the sensitivity of the attribute values included in the influence information, and a larger value means a higher degree of influence. The influence degree calculation unit 134 outputs the calculated degree of influence to the isolated entry extraction unit 133.
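
A minimal sketch of the degree-of-influence calculation and of the selection of fewer than p2 rows is given below. The names `influence` and `top_rows` and the layout of `value_sensitivity` (keyed by attribute and value) are hypothetical. Taking the maximum sensitivity is one of the options this description mentions for rows with several attribute values; `sum` could be passed instead. The sensitivity 80 for hepatitis is the value used in the worked example.

```python
def influence(row, combo, value_sensitivity, combine=max):
    """Degree of influence of the attributes of `row` outside `combo`."""
    scores = [
        value_sensitivity.get((attr, value), 0)  # unknown values count as 0
        for attr, value in row.items()
        if attr not in combo
    ]
    return combine(scores) if scores else 0


def top_rows(candidates, degrees, p2):
    """Pick fewer than p2 rows (at most p2 - 1), highest influence first."""
    ranked = sorted(candidates, key=lambda n: degrees[n], reverse=True)
    return ranked[: max(p2 - 1, 0)]


row3 = {"birth year": "4?", "nationality": "China",
        "diagnosis result": "hepatitis"}
degree = influence(
    row3,
    ("birth year", "nationality"),
    {("diagnosis result", "hepatitis"): 80},
)
```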

リスク情報抽出部１３５は、孤立エントリ抽出部１３３から属性組み合わせＡ、算出した総行数、および、抽出した行が入力されると、個票データ記憶部１２１を参照し、抽出した行について行番号、特定情報および影響情報を抽出する。リスク情報抽出部１３５は、属性組み合わせＡと、算出した総行数と、抽出した行番号、特定情報および影響情報とを優先度順にリスク情報記憶部１２４に記憶する。言い換えると、リスク情報抽出部１３５は、抽出された組み合わせと、影響度が所定の条件を満たす属性とを対応付ける。 When the attribute combination A, the calculated total number of rows, and the extracted rows are input from the isolated entry extraction unit 133, the risk information extraction unit 135 refers to the individual vote data storage unit 121 and extracts the row number, specific information, and influence information for each extracted row. The risk information extraction unit 135 stores the attribute combination A, the calculated total number of rows, and the extracted row numbers, specific information, and influence information in the risk information storage unit 124 in order of priority. In other words, the risk information extraction unit 135 associates the extracted combination with the attributes whose degree of influence satisfies a predetermined condition.

リスク情報抽出部１３５は、リスク情報記憶部１２４を参照し、リスク情報の総行数が閾値ｐ３未満であるか否かを判定する。リスク情報抽出部１３５は、リスク情報の総行数が閾値ｐ３未満である場合には、次の属性組み合わせＡを選択する旨の選択指示を高リスク属性抽出部１３２に出力する。リスク情報抽出部１３５は、リスク情報の総行数が閾値ｐ３未満でない場合には、属性組み合わせＡを包含する属性組み合わせを、処理すべき属性組み合わせから除外する旨の除外指示を高リスク属性抽出部１３２に出力する。なお、属性組み合わせＡで特定できる行は、属性組み合わせＡを包含する属性組み合わせでも当然特定できる。このため、属性組み合わせＡで特定できる行が十分多く、そのリスクが出力されていれば、属性組み合わせＡに関するそれ以外の情報は必要性が低くなるため、除外しても構わない。属性組み合わせＡを包含する属性組み合わせを除外することで、処理量を低減できる。また、出力されるリスク情報が簡潔になり、ユーザが理解しやすくなる。 The risk information extraction unit 135 refers to the risk information storage unit 124 and determines whether or not the total number of rows of the risk information is less than the threshold p3. When the total number of rows is less than the threshold p3, the risk information extraction unit 135 outputs a selection instruction for selecting the next attribute combination A to the high risk attribute extraction unit 132. When the total number of rows is not less than the threshold p3, the risk information extraction unit 135 outputs, to the high risk attribute extraction unit 132, an exclusion instruction to exclude the attribute combinations that include the attribute combination A from the attribute combinations to be processed. A row that can be identified by the attribute combination A can naturally also be identified by any attribute combination that includes the attribute combination A. For this reason, if the attribute combination A identifies a sufficiently large number of rows and that risk has been output, further information relating to the attribute combination A is less necessary and may be excluded. Excluding the attribute combinations that include the attribute combination A reduces the amount of processing. It also makes the output risk information more concise and easier for the user to understand.
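
The exclusion of attribute combinations that include the attribute combination A amounts to a superset test. The sketch below is illustrative only: the function names are hypothetical, and reading the "total number of rows" as the count computed for the attribute combination A is an assumption about the text.

```python
def exclude_supersets(pending, selected):
    """Drop every pending combination that contains all attributes of
    `selected` (the attribute combination A)."""
    base = set(selected)
    return [combo for combo in pending if not base <= set(combo)]


def next_pending(pending, selected, total_rows, p3):
    """Keep the pending list as is while total_rows < p3; otherwise
    prune the combinations that include `selected`."""
    if total_rows < p3:
        return list(pending)
    return exclude_supersets(pending, selected)


pending = [
    ("diagnosis result",),
    ("birth year", "diagnosis result"),
    ("nationality", "diagnosis result"),
    ("birth year", "nationality", "diagnosis result"),
]
kept = next_pending(pending, ("birth year", "nationality"), total_rows=1, p3=2)
pruned = next_pending(pending, ("birth year", "nationality"), total_rows=2, p3=2)
```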

出力制御部１３６は、高リスク属性抽出部１３２から出力指示が入力されると、リスク情報記憶部１２４からリスク情報を読み出す。出力制御部１３６は、読み出したリスク情報を属性組み合わせにおける優先度が高い順に並び替える。出力制御部１３６は、並び替えたリスク情報に基づいて出力画面を生成する。出力制御部１３６は、生成した出力画面を表示部１１１に出力して表示させる。つまり、出力制御部１３６は、リスク情報を表示部１１１に表示させる。すなわち、出力制御部１３６は、属性組み合わせと、行の数と、行番号と、特定情報と、影響情報とを有するリスク情報を表示部１１１に表示させる。言い換えると、出力制御部１３６は、抽出された組み合わせにおける漏洩リスクが高い順に、影響度が所定の条件を満たす属性と対応付けられた抽出された組み合わせをリスク情報として出力する。 When an output instruction is input from the high risk attribute extraction unit 132, the output control unit 136 reads risk information from the risk information storage unit 124. The output control unit 136 rearranges the read risk information in descending order of priority in the attribute combination. The output control unit 136 generates an output screen based on the rearranged risk information. The output control unit 136 outputs the generated output screen to the display unit 111 for display. That is, the output control unit 136 causes the display unit 111 to display risk information. That is, the output control unit 136 causes the display unit 111 to display risk information including an attribute combination, the number of rows, a row number, specific information, and influence information. In other words, the output control unit 136 outputs, as risk information, extracted combinations that are associated with attributes whose influence degree satisfies a predetermined condition in descending order of leakage risk in the extracted combinations.

すなわち、出力制御部１３６は、孤立エントリ抽出部１３３で属性組み合わせに対応して抽出された行に対応する、リスク情報抽出部１３５で抽出された行番号、特定情報および影響情報をリスク情報に含めて出力する。言い換えると、出力制御部１３６は、抽出された組み合わせに対応する抽出されたエントリの情報をリスク情報に含めて出力する。 That is, the output control unit 136 includes in the risk information, and outputs, the row numbers, specific information, and influence information that the risk information extraction unit 135 extracted for the rows that the isolated entry extraction unit 133 extracted for each attribute combination. In other words, the output control unit 136 includes the information of the extracted entries corresponding to the extracted combination in the risk information and outputs it.

また、出力制御部１３６は、影響情報の機微度が高い行を優先してリスク情報を出力する。すなわち、出力制御部１３６は、影響度が高い、つまり、抽出された組み合わせに対応付ける属性の機微度が高い抽出されたエントリを優先してリスク情報を出力する。 Further, the output control unit 136 outputs the risk information giving priority to rows whose influence information has a high sensitivity. That is, the output control unit 136 outputs the risk information giving priority to extracted entries with a high degree of influence, in other words, entries for which the attribute associated with the extracted combination has a high sensitivity.

さらに、出力制御部１３６は、１つの属性組み合わせに対応付けられる影響情報について、複数の属性または属性値のうち、当該属性値の機微度が高い属性または属性値を優先して対応付けてリスク情報を出力する。すなわち、出力制御部１３６は、抽出された組み合わせに、機微度が高い属性を優先して対応付けてリスク情報を出力する。 Further, for the influence information associated with one attribute combination, the output control unit 136 preferentially associates, from among a plurality of attributes or attribute values, those whose attribute value has a high sensitivity, and outputs the risk information. In other words, the output control unit 136 outputs the risk information by preferentially associating attributes with high sensitivity with the extracted combination.

また、出力制御部１３６は、それぞれの属性組み合わせに対応する行の数、つまり、抽出された組み合わせに対応する抽出されたエントリ数をリスク情報に含めて出力する。 Further, the output control unit 136 includes in the risk information, and outputs, the number of rows corresponding to each attribute combination, that is, the number of extracted entries corresponding to the extracted combination.

次に、実施例の漏洩リスク提供装置１００の動作について説明する。図９は、実施例のリスク提供処理の一例を示すフローチャートである。 Next, the operation of the leakage risk providing apparatus 100 according to the embodiment will be described. FIG. 9 is a flowchart illustrating an example of the risk providing process according to the embodiment.

受付部１３１は、入力部１１０から個票データＴ、優先度情報Ｓおよび閾値群Ｐが入力されると、入力された個票データＴ、優先度情報Ｓおよび閾値群Ｐを受け付ける（ステップＳ１）。受付部１３１は、受け付けた個票データＴ、優先度情報Ｓおよび閾値群Ｐを、それぞれ個票データ記憶部１２１、優先度情報記憶部１２２および閾値記憶部１２３に記憶するとともに、属性組み合わせ抽出指示を高リスク属性抽出部１３２に出力する。 When the individual vote data T, the priority information S, and the threshold value group P are input from the input unit 110, the reception unit 131 accepts them (step S1). The reception unit 131 stores the accepted individual vote data T, priority information S, and threshold value group P in the individual vote data storage unit 121, the priority information storage unit 122, and the threshold value storage unit 123, respectively, and outputs an attribute combination extraction instruction to the high risk attribute extraction unit 132.

高リスク属性抽出部１３２は、受付部１３１から属性組み合わせ抽出指示が入力されると、優先度情報記憶部１２２から優先度情報Ｓに含まれる各属性の機微度を読み出す。高リスク属性抽出部１３２は、読み出した各属性の機微度に基づいて、個票データ記憶部１２１を参照し、個票データＴの属性について、全属性の組み合わせを生成する。高リスク属性抽出部１３２は、生成した全属性の組み合わせについて、組み合わせ内の属性の機微度の積に基づいて優先度を算出する。 When the attribute combination extraction instruction is input from the reception unit 131, the high risk attribute extraction unit 132 reads the sensitivity of each attribute included in the priority information S from the priority information storage unit 122. The high risk attribute extraction unit 132 refers to the individual vote data storage unit 121 based on the sensitivity of each read attribute, and generates a combination of all attributes for the attributes of the individual vote data T. The high-risk attribute extraction unit 132 calculates a priority for the generated combination of all attributes based on the product of the sensitivity of the attributes in the combination.

高リスク属性抽出部１３２は、算出した優先度に基づいて、優先度が所定の条件を満たす属性組み合わせを抽出する。高リスク属性抽出部１３２は、まだ処理すべき属性組み合わせがあるか否かを判定する（ステップＳ２）。高リスク属性抽出部１３２は、処理すべき属性組み合わせがある場合には（ステップＳ２：肯定）、処理すべき属性組み合わせのうち、最も優先度が高い属性組み合わせを属性組み合わせＡとして選択する（ステップＳ３）。高リスク属性抽出部１３２は、選択した属性組み合わせＡを孤立エントリ抽出部１３３に出力する。 The high risk attribute extraction unit 132 extracts attribute combinations whose priority satisfies a predetermined condition based on the calculated priority. The high risk attribute extraction unit 132 determines whether there is an attribute combination to be processed yet (step S2). When there is an attribute combination to be processed (step S2: affirmative), the high risk attribute extraction unit 132 selects the attribute combination with the highest priority among the attribute combinations to be processed as the attribute combination A (step S3). ). The high risk attribute extraction unit 132 outputs the selected attribute combination A to the isolated entry extraction unit 133.

孤立エントリ抽出部１３３は、高リスク属性抽出部１３２から属性組み合わせＡが入力されると、属性組み合わせＡの値の度数分布Ｆを算出する（ステップＳ４）。孤立エントリ抽出部１３３は、算出した度数分布Ｆに閾値ｐ１未満の度数があるか否かを判定する（ステップＳ５）。孤立エントリ抽出部１３３は、度数分布Ｆに閾値ｐ１未満の度数がない場合には（ステップＳ５：否定）、次の属性組み合わせＡを選択する旨の選択指示を高リスク属性抽出部１３２に出力してステップＳ２に戻る。 When the attribute combination A is input from the high risk attribute extraction unit 132, the isolated entry extraction unit 133 calculates the frequency distribution F of the value of the attribute combination A (step S4). The isolated entry extraction unit 133 determines whether or not the calculated frequency distribution F has a frequency less than the threshold value p1 (step S5). The isolated entry extraction unit 133 outputs a selection instruction for selecting the next attribute combination A to the high risk attribute extraction unit 132 when the frequency distribution F does not have a frequency less than the threshold value p1 (No at Step S5). The process returns to step S2.

孤立エントリ抽出部１３３は、度数分布Ｆに閾値ｐ１未満の度数がある場合には（ステップＳ５：肯定）、個票データ記憶部１２１を参照し、度数が閾値ｐ１未満の行について総行数を算出する（ステップＳ６）。また、孤立エントリ抽出部１３３は、度数が閾値ｐ１未満の行を影響度算出部１３４に出力する。 When the frequency distribution F has a frequency less than the threshold p1 (step S5: Yes), the isolated entry extraction unit 133 refers to the individual vote data storage unit 121 and calculates the total number of rows whose frequency is less than the threshold p1 (step S6). The isolated entry extraction unit 133 also outputs the rows whose frequency is less than the threshold p1 to the influence degree calculation unit 134.

影響度算出部１３４には、孤立エントリ抽出部１３３から度数が閾値ｐ１未満の行が入力される。影響度算出部１３４は、個票データ記憶部１２１および優先度情報記憶部１２２を参照し、入力された度数が閾値ｐ１未満の行について、属性組み合わせＡに含まれる属性以外の属性の影響度を算出する（ステップＳ７）。影響度算出部１３４は、算出した影響度を孤立エントリ抽出部１３３に出力する。 The influence degree calculation unit 134 receives the rows whose frequency is less than the threshold p1 from the isolated entry extraction unit 133. The influence degree calculation unit 134 refers to the individual vote data storage unit 121 and the priority information storage unit 122 and, for each input row, calculates the degree of influence of the attributes other than those included in the attribute combination A (step S7). The influence degree calculation unit 134 outputs the calculated degree of influence to the isolated entry extraction unit 133.

孤立エントリ抽出部１３３は、影響度算出部１３４から影響度が入力されると、個票データ記憶部１２１を参照し、度数が閾値ｐ１未満の行のうち、影響度が高い順に閾値ｐ２行未満の行を抽出する（ステップＳ８）。孤立エントリ抽出部１３３は、属性組み合わせＡ、算出した総行数、および、抽出した行をリスク情報抽出部１３５に出力する。 When the degree of influence is input from the influence degree calculation unit 134, the isolated entry extraction unit 133 refers to the individual vote data storage unit 121 and, from the rows whose frequency is less than the threshold p1, extracts fewer than threshold p2 rows in descending order of the degree of influence (step S8). The isolated entry extraction unit 133 outputs the attribute combination A, the calculated total number of rows, and the extracted rows to the risk information extraction unit 135.

リスク情報抽出部１３５は、孤立エントリ抽出部１３３から属性組み合わせＡ、算出した総行数、および、抽出した行が入力されると、個票データ記憶部１２１を参照し、抽出した行について行番号、特定情報および影響情報を抽出する（ステップＳ９）。リスク情報抽出部１３５は、属性組み合わせＡと、算出した総行数と、抽出した行番号、特定情報および影響情報とを優先度順にリスク情報記憶部１２４に記憶する（ステップＳ１０）。 When the attribute combination A, the calculated total number of rows, and the extracted rows are input from the isolated entry extraction unit 133, the risk information extraction unit 135 refers to the individual vote data storage unit 121 and extracts the row number, specific information, and influence information for each extracted row (step S9). The risk information extraction unit 135 stores the attribute combination A, the calculated total number of rows, and the extracted row numbers, specific information, and influence information in the risk information storage unit 124 in order of priority (step S10).

リスク情報抽出部１３５は、リスク情報記憶部１２４を参照し、リスク情報の総行数が閾値ｐ３未満であるか否かを判定する（ステップＳ１１）。リスク情報抽出部１３５は、リスク情報の総行数が閾値ｐ３未満である場合には（ステップＳ１１：肯定）、次の属性組み合わせＡを選択する旨の選択指示を高リスク属性抽出部１３２に出力してステップＳ２に戻る。 The risk information extraction unit 135 refers to the risk information storage unit 124 and determines whether or not the total number of rows of the risk information is less than the threshold p3 (step S11). When the total number of rows is less than the threshold p3 (step S11: Yes), the risk information extraction unit 135 outputs a selection instruction for selecting the next attribute combination A to the high risk attribute extraction unit 132, and the process returns to step S2.

リスク情報抽出部１３５は、リスク情報の総行数が閾値ｐ３未満でない場合には（ステップＳ１１：否定）、属性組み合わせＡを包含する属性組み合わせを、処理すべき属性組み合わせから除外する。すなわち、リスク情報抽出部１３５は、除外指示を高リスク属性抽出部１３２に出力する（ステップＳ１２）。高リスク属性抽出部１３２は、リスク情報抽出部１３５から除外指示が入力されると、除外指示に含まれる属性組み合わせＡを包含する属性組み合わせを、処理すべき属性組み合わせから除外して、ステップＳ２に戻る。 When the total number of rows of the risk information is not less than the threshold p3 (step S11: No), the risk information extraction unit 135 excludes the attribute combinations that include the attribute combination A from the attribute combinations to be processed. That is, the risk information extraction unit 135 outputs an exclusion instruction to the high risk attribute extraction unit 132 (step S12). When the exclusion instruction is input from the risk information extraction unit 135, the high risk attribute extraction unit 132 excludes the attribute combinations that include the attribute combination A named in the exclusion instruction from the attribute combinations to be processed, and the process returns to step S2.

高リスク属性抽出部１３２は、処理すべき属性組み合わせがない場合には（ステップＳ２：否定）、出力制御部１３６に出力指示を出力する。出力制御部１３６は、高リスク属性抽出部１３２から出力指示が入力されると、リスク情報記憶部１２４からリスク情報を読み出す。出力制御部１３６は、読み出したリスク情報を属性組み合わせにおける優先度が高い順に並び替える。出力制御部１３６は、並び替えたリスク情報を含む出力画面を表示部１１１に表示させる（ステップＳ１３）。これにより、漏洩リスク提供装置１００は、漏洩リスクの判断の精度を高めることができる。 If there is no attribute combination to be processed (No at Step S2), the high risk attribute extraction unit 132 outputs an output instruction to the output control unit 136. When an output instruction is input from the high risk attribute extraction unit 132, the output control unit 136 reads risk information from the risk information storage unit 124. The output control unit 136 rearranges the read risk information in descending order of priority in the attribute combination. The output control unit 136 displays an output screen including the rearranged risk information on the display unit 111 (step S13). Thereby, the leakage risk providing apparatus 100 can improve the accuracy of the determination of the leakage risk.
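
The flow of steps S2 to S13 can be summarized in a single loop. The following is a rough sketch under stated assumptions, not the apparatus itself: the function name `provide_risk`, the dictionary keys, and the sample data are hypothetical; the degree of influence is taken as the maximum value sensitivity; the influence information lists every attribute outside the combination; and the step S11 test uses the total number of rows computed for the current combination.

```python
from collections import Counter
from itertools import combinations
from math import prod


def provide_risk(rows, attr_sens, value_sens, p1, p2, p3, limit=200):
    attrs = list(attr_sens)
    pending = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            product = prod(attr_sens[a] for a in combo)
            if product < limit:
                pending.append((product, size, combo))
    pending.sort(key=lambda t: (t[0], t[1]))  # highest priority first

    report = []
    while pending:                                    # S2
        product, _, combo = pending.pop(0)            # S3
        keys = [tuple(r[a] for a in combo) for r in rows]
        freq = Counter(keys)                          # S4
        rare = [i for i, k in enumerate(keys) if freq[k] < p1]  # S5
        if not rare:
            continue
        total = len(rare)                             # S6

        def degree(i):                                # S7
            return max((value_sens.get((a, v), 0)
                        for a, v in rows[i].items() if a not in combo),
                       default=0)

        chosen = sorted(rare, key=degree, reverse=True)[: max(p2 - 1, 0)]  # S8
        for i in chosen:                              # S9, S10
            report.append({
                "attribute_combination": combo,
                "priority": product,
                "row_count": total,
                "row_number": i + 1,
                "specific_info": {a: rows[i][a] for a in combo},
                "influence_info": {a: v for a, v in rows[i].items()
                                   if a not in combo},
            })
        if total >= p3:                               # S11: No, so S12
            pending = [t for t in pending if not set(combo) <= set(t[2])]
    report.sort(key=lambda e: e["priority"])          # S13
    return report


# Assumed sample data; only row 3 and the counts follow the worked example.
sample = [
    {"birth year": "4?", "nationality": "Japan", "diagnosis result": "cold"},
    {"birth year": "4?", "nationality": "Japan", "diagnosis result": "cold"},
    {"birth year": "4?", "nationality": "China", "diagnosis result": "hepatitis"},
    {"birth year": "5?", "nationality": "China", "diagnosis result": "cold"},
    {"birth year": "5?", "nationality": "China", "diagnosis result": "cold"},
]
report = provide_risk(
    sample,
    {"birth year": 2, "nationality": 10, "diagnosis result": 30},  # 30 assumed
    {("diagnosis result", "hepatitis"): 80},
    p1=2, p2=3, p3=2,
)
```

Under these assumptions the first reported entry is the {birth year, nationality} combination with row number 3, in line with the first row of FIG. 8 described earlier.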

続いて、リスク提供処理の具体例について説明する。なお、以下の説明では、個票データＴとして図６に示す個票データ記憶部１２１に記憶された個票データを用い、優先度情報Ｓとして図７に示す優先度情報記憶部１２２に記憶された優先度情報を用いる。また、閾値群Ｐとして、閾値ｐ１＝２、閾値ｐ２＝３、および、閾値ｐ３＝２を用いる。なお、以下の説明では、リスク提供処理の各ステップにおける具体例に着目して説明し、各部の動作の詳細については省略する。 Next, a specific example of risk provision processing will be described. In the following description, the individual form data stored in the individual form data storage unit 121 shown in FIG. 6 is used as the individual form data T, and the priority information S is stored in the priority information storage unit 122 shown in FIG. Priority information is used. Further, as the threshold group P, threshold p1 = 2, threshold p2 = 3, and threshold p3 = 2 are used. In the following description, description will be made by paying attention to specific examples in each step of the risk providing process, and details of the operation of each unit will be omitted.

ステップＳ１において、受付部１３１は、個票データＴ、優先度情報Ｓおよび閾値群Ｐを受け付ける。また、高リスク属性抽出部１３２は、各属性の機微度に基づいて、個票データＴの属性について、全属性の組み合わせを生成する。高リスク属性抽出部１３２は、生成した全属性の組み合わせについて、組み合わせ内の属性の機微度の積に基づいて優先度を算出する。 In step S 1, the receiving unit 131 receives the individual slip data T, the priority information S, and the threshold value group P. Further, the high risk attribute extraction unit 132 generates a combination of all attributes for the attributes of the individual vote data T based on the sensitivity of each attribute. The high-risk attribute extraction unit 132 calculates a priority for the generated combination of all attributes based on the product of the sensitivity of the attributes in the combination.

ステップＳ２において、高リスク属性抽出部１３２は、算出した優先度に基づいて、優先度が所定の条件を満たす属性組み合わせを抽出する。抽出される属性組み合わせは、｛生年｝、｛国籍｝、｛診察結果｝、｛生年，国籍｝、｛生年，診察結果｝、｛国籍，診察結果｝、｛生年，国籍，診察結果｝の７つとなる。高リスク属性抽出部１３２は、処理すべき属性組み合わせがあると判定する。 In step S2, the high risk attribute extraction unit 132 extracts the attribute combinations whose priority satisfies a predetermined condition, based on the calculated priority. Seven attribute combinations are extracted: {birth year}, {nationality}, {diagnosis result}, {birth year, nationality}, {birth year, diagnosis result}, {nationality, diagnosis result}, and {birth year, nationality, diagnosis result}. The high risk attribute extraction unit 132 determines that there is an attribute combination to be processed.

ステップＳ３において、７つの属性組み合わせのうち、最も優先度が高い属性組み合わせは、機微度の積が「２」の｛生年｝であるので、高リスク属性抽出部１３２は、｛生年｝を属性組み合わせＡとして選択する。 In step S3, among the seven attribute combinations, the one with the highest priority is {birth year}, whose product of sensitivities is “2”, so the high risk attribute extraction unit 132 selects {birth year} as the attribute combination A.

ステップＳ４において、孤立エントリ抽出部１３３は、属性組み合わせＡの値の個票データＴにおける度数分布Ｆとして、個票データＴの｛生年｝の値を計数する。すなわち、孤立エントリ抽出部１３３は、属性値「４？」の度数が「３」、属性値「５？」の度数が「２」であるので、Ｆ＝｛（４？）：３，（５？）：２｝と計数する。 In step S4, the isolated entry extraction unit 133 counts the values of {birth year} in the individual vote data T as the frequency distribution F of the values of the attribute combination A in the individual vote data T. That is, since the frequency of the attribute value “4?” is “3” and the frequency of the attribute value “5?” is “2”, the isolated entry extraction unit 133 obtains F = {(4?): 3, (5?): 2}.

ステップＳ５において、孤立エントリ抽出部１３３は、算出した度数分布Ｆに閾値ｐ１＝２未満の度数があるか否かを判定すると、閾値ｐ１＝２未満の度数はないため、ステップＳ２に戻る。 In step S5, the isolated entry extraction unit 133 determines whether or not the calculated frequency distribution F has a frequency less than the threshold p1 = 2. Since there is no such frequency, the process returns to step S2.

ステップＳ２において、｛生年｝は処理済みであるので、処理すべき属性組み合わせは、｛国籍｝、｛診察結果｝、｛生年，国籍｝、｛生年，診察結果｝、｛国籍，診察結果｝、｛生年，国籍，診察結果｝の６つとなる。高リスク属性抽出部１３２は、処理すべき属性組み合わせがあると判定する。 In step S2, since {birth year} has already been processed, six attribute combinations remain to be processed: {nationality}, {diagnosis result}, {birth year, nationality}, {birth year, diagnosis result}, {nationality, diagnosis result}, and {birth year, nationality, diagnosis result}. The high risk attribute extraction unit 132 determines that there is an attribute combination to be processed.

ステップＳ３において、６つの属性組み合わせのうち、最も優先度が高い属性組み合わせは、機微度の積が「１０」の｛国籍｝であるので、高リスク属性抽出部１３２は、｛国籍｝を属性組み合わせＡとして選択する。 In step S3, among the six attribute combinations, the one with the highest priority is {nationality}, whose product of sensitivities is “10”, so the high risk attribute extraction unit 132 selects {nationality} as the attribute combination A.

ステップＳ４において、孤立エントリ抽出部１３３は、属性組み合わせＡの値の個票データＴにおける度数分布Ｆとして、個票データＴの｛国籍｝の値を計数する。すなわち、孤立エントリ抽出部１３３は、属性値「日本」の度数が「２」、属性値「中国」の度数が「３」であるので、Ｆ＝｛（日本）：２，（中国）：３｝と計数する。 In step S4, the isolated entry extraction unit 133 counts the value of {nationality} of the individual data T as the frequency distribution F in the individual data T of the attribute combination A value. That is, since the frequency of the attribute value “Japan” is “2” and the frequency of the attribute value “China” is “3”, the isolated entry extraction unit 133 has F = {(Japan): 2, (China): 3. }.

ステップＳ２において、｛生年｝、｛国籍｝は処理済みであるので、処理すべき属性組み合わせは、｛診察結果｝、｛生年，国籍｝、｛生年，診察結果｝、｛国籍，診察結果｝、｛生年，国籍，診察結果｝の５つとなる。高リスク属性抽出部１３２は、処理すべき属性組み合わせがあると判定する。 In step S2, since {birth year} and {nationality} have already been processed, five attribute combinations remain to be processed: {diagnosis result}, {birth year, nationality}, {birth year, diagnosis result}, {nationality, diagnosis result}, and {birth year, nationality, diagnosis result}. The high risk attribute extraction unit 132 determines that there is an attribute combination to be processed.

ステップＳ３において、５つの属性組み合わせのうち、最も優先度が高い属性組み合わせは、機微度の積が「２×１０＝２０」の｛生年，国籍｝であるので、高リスク属性抽出部１３２は、｛生年，国籍｝を属性組み合わせＡとして選択する。 In step S3, among the five attribute combinations, the one with the highest priority is {birth year, nationality}, whose product of sensitivities is “2 × 10 = 20”, so the high risk attribute extraction unit 132 selects {birth year, nationality} as the attribute combination A.

ステップＳ４において、孤立エントリ抽出部１３３は、属性組み合わせＡの値の個票データＴにおける度数分布Ｆとして、個票データＴの｛生年，国籍｝の値を計数する。すなわち、孤立エントリ抽出部１３３は、Ｆ＝｛（４？，日本）：２，（４？，中国）：１，（５？，中国）：２｝と計数する。 In step S4, the isolated entry extraction unit 133 counts the values of {birth year, nationality} in the individual vote data T as the frequency distribution F of the values of the attribute combination A in the individual vote data T. That is, the isolated entry extraction unit 133 obtains F = {(4?, Japan): 2, (4?, China): 1, (5?, China): 2}.

In step S5, the isolated entry extraction unit 133 determines whether the calculated frequency distribution F contains a frequency below the threshold p1 = 2. Since it does, the process proceeds to step S6.

In step S6, because the only entry in F whose frequency is below the threshold p1 is {(4?, China): 1}, the isolated entry extraction unit 133 calculates the total number of rows as {1}.

In step S7, the influence degree calculation unit 134 calculates, for row number "3" (the row matching {(4?, China): 1}), the degree of influence of the attributes other than {birth year, nationality}. It takes the sensitivity "80" of the attribute value "hepatitis" of the attribute "diagnosis result" as the degree of influence. When there are several attribute values, the degree of influence may be the sum or the product of their sensitivities, or the largest sensitivity may be used.

In step S8, the isolated entry extraction unit 133 extracts, from the rows whose frequency is below the threshold p1, fewer than p2 = 3 rows in descending order of degree of influence. Here, row number "3" is the only row whose frequency is below p1, so that row is extracted.

In step S9, the risk information extraction unit 135 extracts, for the extracted row number "3", the row number {3}, the specific information {birth year: 4?, nationality: China}, and the influence information {diagnosis result: hepatitis}.

In step S10, the risk information extraction unit 135 stores the attribute combination {birth year, nationality}, the total number of rows {1}, the row number {3}, the specific information {birth year: 4?, nationality: China}, and the influence information {diagnosis result: hepatitis} in the risk information storage unit 124 in order of priority. The priority of the attribute combination {birth year, nationality} is "20", but because only the single row with row number {3} was extracted, that one row is stored in the risk information storage unit 124. FIG. 10 shows an example of the risk information storage unit partway through processing. The risk information storage unit 124a shown in FIG. 10 represents the state after the risk information has been stored in the risk information storage unit 124 in step S10. As a result of steps S6 to S9, the risk information is arranged in order of priority.

In step S11, the risk information extraction unit 135 determines that the total number of rows of risk information, "1", is below the threshold p3 = 2, so it outputs a selection instruction to the high-risk attribute extraction unit 132, and the process returns to step S2.

In step S2, {birth year}, {nationality}, and {birth year, nationality} have already been processed, so four attribute combinations remain to be processed: {diagnosis result}, {birth year, diagnosis result}, {nationality, diagnosis result}, and {birth year, nationality, diagnosis result}. The high-risk attribute extraction unit 132 determines that there is an attribute combination to be processed.

In step S3, the attribute combination with the highest priority among the four is {diagnosis result}, whose product of sensitivities is "100", so the high-risk attribute extraction unit 132 selects {diagnosis result} as attribute combination A.

In step S4, the isolated entry extraction unit 133 counts the values of {diagnosis result} in the individual record data T to obtain the frequency distribution F of the values of attribute combination A. That is, the isolated entry extraction unit 133 obtains F = {(lung cancer): 1, (HIV): 1, (hepatitis): 1, (cold): 2}.

In step S6, because the entries in F whose frequency is below the threshold p1 are {(lung cancer): 1, (HIV): 1, (hepatitis): 1}, the isolated entry extraction unit 133 calculates the total number of rows as {3}.

In step S7, the influence degree calculation unit 134 calculates, for row numbers "1", "2", and "3" (the rows matching {(lung cancer): 1, (HIV): 1, (hepatitis): 1}), the degree of influence of the attributes other than {diagnosis result}. For every one of these rows, the largest sensitivity among the other attributes is "10", for the attribute "nationality". In that case the next-largest sensitivities are compared, but these are also equal: "2", for the attribute "birth year", in every row. Because this example has no further attributes, rows with smaller row numbers are given priority.

In step S8, the isolated entry extraction unit 133 extracts, from the rows whose frequency is below the threshold p1, fewer than p2 = 3 rows in descending order of degree of influence. Three rows have a frequency below p1, but only two are extracted, starting from the smallest row number. That is, the isolated entry extraction unit 133 extracts the rows with row numbers "1" and "2".
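Steps S5 to S8 can be sketched as one function that finds the isolated rows and ranks them. The following is a minimal sketch under stated assumptions, not the patent's implementation. The table `T` is reconstructed from the walkthrough, the names `sensitivity_of` and `extract_isolated` are hypothetical, and the value-level table contains only the one value the text quotes (hepatitis, 80); every other value falls back to its attribute-level sensitivity, which is an assumption. Rows are ranked by comparing the sensitivities of their remaining attributes from largest to smallest, with the row number as the final tie-breaker.

```python
from collections import Counter

# Individual record data T keyed by row number, reconstructed from the walkthrough (assumption).
T = {
    1: {"birth_year": "4?", "nationality": "Japan", "diagnosis_result": "lung cancer"},
    2: {"birth_year": "4?", "nationality": "Japan", "diagnosis_result": "HIV"},
    3: {"birth_year": "4?", "nationality": "China", "diagnosis_result": "hepatitis"},
    4: {"birth_year": "5?", "nationality": "China", "diagnosis_result": "cold"},
    5: {"birth_year": "5?", "nationality": "China", "diagnosis_result": "cold"},
}
ATTRIBUTE_SENSITIVITY = {"birth_year": 2, "nationality": 10, "diagnosis_result": 100}
# Only the value-level sensitivity quoted in the text; other values are unknown here.
VALUE_SENSITIVITY = {("diagnosis_result", "hepatitis"): 80}


def sensitivity_of(attribute, value):
    """Value-level sensitivity if known, otherwise the attribute-level one (assumption)."""
    return VALUE_SENSITIVITY.get((attribute, value), ATTRIBUTE_SENSITIVITY[attribute])


def extract_isolated(table, combo, p1, p2):
    """Return (total isolated rows, row numbers kept after ranking by influence)."""
    def key_of(row):
        return tuple(row[a] for a in combo)

    freq = Counter(key_of(row) for row in table.values())            # step S4
    isolated = [n for n, row in table.items() if freq[key_of(row)] < p1]  # steps S5, S6

    def rank(n):
        # Step S7: sensitivities of the attributes outside combo, largest first.
        others = sorted(
            (sensitivity_of(a, v) for a, v in table[n].items() if a not in combo),
            reverse=True,
        )
        # Negate so that larger sensitivities sort first; row number breaks ties.
        return ([-s for s in others], n)

    ranked = sorted(isolated, key=rank)
    return len(isolated), ranked[: p2 - 1]                           # step S8: fewer than p2 rows


total, kept = extract_isolated(T, ("diagnosis_result",), p1=2, p2=3)
```

For {diagnosis result} this yields a total of 3 isolated rows and keeps rows 1 and 2, matching the walkthrough; for {birth year, nationality} it keeps only row 3.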

In step S9, the risk information extraction unit 135 extracts the row number, the specific information, and the influence information for each of the extracted rows "1" and "2". For row number "1", it extracts the row number {1}, the specific information {diagnosis result: lung cancer}, and the influence information {nationality: Japan, birth year: 4?}. For row number "2", it extracts the row number {2}, the specific information {diagnosis result: HIV}, and the influence information {nationality: Japan, birth year: 4?}.

In step S10, the risk information extraction unit 135 stores the attribute combination {diagnosis result}, the total number of rows {3}, and the specific information and influence information corresponding to row numbers {1} and {2} in the risk information storage unit 124 in order of priority. As described above, row numbers {1} and {2} have the same priority, so they are stored in ascending order of row number. The attribute values within the influence information are ordered by decreasing degree of influence (sensitivity) of each attribute value, that is, nationality with a degree of influence of "10" followed by birth year with a degree of influence of "2". In this state, the risk information storage unit 124 is in the state shown in FIG. 8.
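The shape of one stored risk record, including the ordering of the influence information by sensitivity, can be illustrated as follows. The record layout, the field names, and the function `build_risk_record` are hypothetical; only the attribute-level sensitivities and the example row come from the walkthrough.

```python
# Attribute-level sensitivities quoted in the walkthrough.
SENSITIVITY = {"birth_year": 2, "nationality": 10, "diagnosis_result": 100}


def build_risk_record(row_number, row, combo):
    """Split a row into specific information (combo) and influence information (the rest)."""
    specific = {a: row[a] for a in combo}
    # Remaining attributes, most sensitive first, as described for step S10.
    others = sorted(
        (a for a in row if a not in combo),
        key=lambda a: SENSITIVITY[a],
        reverse=True,
    )
    influence = {a: row[a] for a in others}  # dicts keep insertion order
    return {"row_number": row_number, "specific": specific, "influence": influence}


row1 = {"birth_year": "4?", "nationality": "Japan", "diagnosis_result": "lung cancer"}
record = build_risk_record(1, row1, ("diagnosis_result",))
```

Here `record["influence"]` lists nationality before birth year, mirroring the order stated in the text.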

In step S11, the risk information extraction unit 135 determines that the total number of rows of risk information, "3", is not below the threshold p3 = 2.

In step S12, the risk information extraction unit 135 outputs an exclusion instruction to the high-risk attribute extraction unit 132. The high-risk attribute extraction unit 132 excludes every attribute combination that contains the attribute combination {diagnosis result} from the attribute combinations to be processed, and the process returns to step S2.

In step S2, {birth year}, {nationality}, and {birth year, nationality} have already been processed, and the high-risk attribute extraction unit 132 excludes the attribute combinations that contain {diagnosis result} from the remaining combinations {birth year, diagnosis result}, {nationality, diagnosis result}, and {birth year, nationality, diagnosis result}. The high-risk attribute extraction unit 132 then determines that no attribute combination remains to be processed and outputs an output instruction to the output control unit 136.
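The exclusion in step S12 is a superset test: any pending combination that contains an excluded combination is dropped. A minimal sketch, with the hypothetical helper `pending_combinations` and illustrative attribute names, could look like this.

```python
from itertools import combinations

ATTRIBUTES = ["birth_year", "nationality", "diagnosis_result"]


def pending_combinations(processed, excluded):
    """List combinations that are neither processed nor supersets of an excluded one."""
    result = []
    for size in range(1, len(ATTRIBUTES) + 1):
        for combo in combinations(ATTRIBUTES, size):
            candidate = frozenset(combo)
            if candidate in processed:
                continue
            # frozenset <= tests "is a subset of": drop supersets of excluded combinations.
            if any(e <= candidate for e in excluded):
                continue
            result.append(candidate)
    return result


processed = {
    frozenset({"birth_year"}),
    frozenset({"nationality"}),
    frozenset({"birth_year", "nationality"}),
    frozenset({"diagnosis_result"}),
}
before = pending_combinations(processed, excluded=[])
after = pending_combinations(processed, excluded=[frozenset({"diagnosis_result"})])
```

Before the exclusion, three combinations are still pending; after it, none remain, which corresponds to the point where the walkthrough moves on to output.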

In step S13, the output control unit 136 reads the risk information from the risk information storage unit 124 and sorts it in descending order of the priority of the attribute combinations. The output control unit 136 then causes the display unit 111 to display an output screen containing the sorted risk information. As noted above, if the risk information is already in priority order, it is output as is. In this way, the leakage risk providing apparatus 100 makes it possible to grasp concrete examples of high-risk privacy violations efficiently, which improves the accuracy of leakage risk judgments. In addition, because the leakage risk providing apparatus 100 outputs attribute combinations starting from those with the smallest product of sensitivities, effective anonymization methods are easier to analyze. The reason is that effective anonymization means preventing rows from being identified by attributes that are easy to learn, and attributes with a small product of sensitivities are considered easy to learn in many cases.
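Putting the steps together, the whole walkthrough from step S2 to step S13 can be approximated by the loop below. This is a simplified sketch under several assumptions rather than the disclosed implementation: the table `T` is reconstructed from the walkthrough, the function name `leakage_report` is hypothetical, a combination with no frequency below p1 is assumed to be skipped, and only attribute-level sensitivities are used, so all isolated rows tie on influence and the row number decides which rows are kept. The thresholds p1 = 2, p2 = 3, and p3 = 2 are the ones used in the text.

```python
from collections import Counter
from itertools import combinations
from math import prod

# Individual record data T keyed by row number, reconstructed from the walkthrough (assumption).
T = {
    1: {"birth_year": "4?", "nationality": "Japan", "diagnosis_result": "lung cancer"},
    2: {"birth_year": "4?", "nationality": "Japan", "diagnosis_result": "HIV"},
    3: {"birth_year": "4?", "nationality": "China", "diagnosis_result": "hepatitis"},
    4: {"birth_year": "5?", "nationality": "China", "diagnosis_result": "cold"},
    5: {"birth_year": "5?", "nationality": "China", "diagnosis_result": "cold"},
}
SENSITIVITY = {"birth_year": 2, "nationality": 10, "diagnosis_result": 100}


def leakage_report(table, sens, p1, p2, p3):
    """Return (attribute combination, total isolated rows, kept row numbers) in priority order."""
    attrs = list(sens)
    pending = [
        frozenset(c) for k in range(1, len(attrs) + 1) for c in combinations(attrs, k)
    ]
    # Step S3: a smaller product of sensitivities is treated as a higher priority.
    pending.sort(key=lambda c: prod(sens[a] for a in c))
    excluded, report = [], []
    for combo in pending:                                    # step S2
        if any(e <= combo for e in excluded):                # effect of step S12
            continue
        cols = [a for a in attrs if a in combo]
        freq = Counter(tuple(r[a] for a in cols) for r in table.values())  # step S4
        isolated = [
            n for n, r in table.items() if freq[tuple(r[a] for a in cols)] < p1
        ]                                                    # steps S5, S6
        if not isolated:
            continue
        # Steps S7, S8 (simplified): rows tie on attribute-level influence,
        # so the smallest row numbers are kept, fewer than p2 of them.
        kept = sorted(isolated)[: p2 - 1]
        report.append((tuple(cols), len(isolated), kept))    # steps S9, S10
        if len(isolated) >= p3:                              # step S11
            excluded.append(combo)                           # step S12
    return report                                            # step S13: already in priority order


report = leakage_report(T, SENSITIVITY, p1=2, p2=3, p3=2)
```

With the reconstructed table, the report contains {birth year, nationality} with one isolated row (row 3), followed by {diagnosis result} with three isolated rows, of which rows 1 and 2 are kept.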

The leakage risk providing apparatus 100 also outputs entries starting from those with the largest degree of influence, that is, the largest sensitivity of the influence information, which makes judging the need for anonymization more efficient. Judging that need requires an analysis of the maximum risk, and the risk is considered to be proportional to the sensitivity of the influence information.

As described above, the leakage risk providing apparatus 100 has a storage unit 120 that stores the sensitivity of each attribute in the individual record data, which is a set of entries to which a plurality of attributes are given. The leakage risk providing apparatus 100 calculates a leakage risk for each combination of attributes based on the sensitivities and extracts the combinations of attributes whose leakage risk satisfies a predetermined condition. For each extracted combination, the leakage risk providing apparatus 100 extracts the entries for which the number of entries identified by the values of the attributes in the combination satisfies a predetermined condition. Based on the sensitivities, the leakage risk providing apparatus 100 calculates, for each extracted entry, the degree of influence of the attributes other than those included in the extracted combination. The leakage risk providing apparatus 100 then associates the extracted combination with the attributes whose degree of influence satisfies a predetermined condition. As a result, the accuracy of leakage risk judgments can be improved.

The leakage risk providing apparatus 100 calculates the leakage risk using the product of the sensitivities of the attributes included in a combination of attributes. As a result, risk information can be calculated in order of how easily the attributes become known.

The leakage risk providing apparatus 100 outputs, as risk information, the extracted combinations associated with the attributes whose degree of influence satisfies the predetermined condition, in descending order of the leakage risk of the extracted combinations. As a result, high-risk privacy violations can be grasped efficiently.

The leakage risk providing apparatus 100 includes, in the risk information it outputs, information on the extracted entries corresponding to the extracted combination. As a result, the accuracy of leakage risk judgments can be improved.

The leakage risk providing apparatus 100 outputs the risk information giving priority to extracted entries for which the sensitivity of the attribute associated with the extracted combination is high. As a result, entries can be output in order of the attributes whose disclosure would have the greatest impact.

The leakage risk providing apparatus 100 outputs the risk information by associating attributes with the extracted combination, giving priority to attributes with high sensitivity. As a result, within each entry, the influence information can be output in order of the attributes whose disclosure would have the greatest impact.

The leakage risk providing apparatus 100 includes, in the risk information it outputs, the number of extracted entries corresponding to the extracted combination. As a result, it is possible to grasp how much of the individual record data can be identified.

When a combination of attributes contains another combination of attributes, the leakage risk providing apparatus 100 excludes the combination that contains the other combination when extracting the combinations of attributes whose leakage risk satisfies the predetermined condition. As a result, the amount of processing can be reduced. The output risk information also becomes more concise, which makes it easier for the user to understand.

The embodiment above uses examination results from a medical institution as an example of individual record data, but the data is not limited to this. For example, the technique may be applied to membership registration information at stores or to the results of various questionnaires.

The components of each illustrated unit do not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of the units is not limited to the illustrated one, and all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. For example, the isolated entry extraction unit 133 and the influence degree calculation unit 134 may be integrated. The illustrated processes are also not limited to the order described above. As long as the processing contents do not conflict, they may be performed simultaneously or in a different order.

Furthermore, all or any part of the processing functions performed by each device may be executed on a CPU (or a microcomputer such as an MPU or an MCU (Micro Controller Unit)). It goes without saying that all or any part of the processing functions may also be executed by a program analyzed and executed by a CPU (or a microcomputer such as an MPU or MCU), or on hardware using wired logic.

The various processes described in the above embodiment can be realized by executing a program prepared in advance on a computer. The following therefore describes an example of a computer that executes a program having the same functions as the above embodiment. FIG. 11 shows an example of a computer that executes the leakage risk providing program.

As shown in FIG. 11, the computer 200 has a CPU 201 that executes various arithmetic processes, an input device 202 that accepts data input, and a monitor 203. The computer 200 also has a medium reading device 204 that reads programs and the like from a storage medium, an interface device 205 for connecting to various devices, and a communication device 206 for connecting to other information processing devices and the like by wire or wirelessly. The computer 200 further has a RAM 207 that temporarily stores various kinds of information and a hard disk device 208. The devices 201 to 208 are connected to a bus 209.

The hard disk device 208 stores a leakage risk providing program having the same functions as the processing units shown in FIG. 1: the reception unit 131, the high-risk attribute extraction unit 132, the isolated entry extraction unit 133, the influence degree calculation unit 134, the risk information extraction unit 135, and the output control unit 136. The hard disk device 208 also stores the individual record data storage unit 121, the priority information storage unit 122, the threshold storage unit 123, the risk information storage unit 124, and various data for realizing the leakage risk providing program. The input device 202 accepts, for example, input of various information such as operation information from the user of the computer 200. The monitor 203 displays, for example, various screens such as the output screen to the user of the computer 200. The medium reading device 204 reads the individual record data T, the priority information S, and the threshold group P from a storage medium. The interface device 205 is connected to, for example, a printing device. The communication device 206 is connected to, for example, a network (not shown) and exchanges various information with other information processing devices.

The CPU 201 performs various processes by reading each program stored in the hard disk device 208, loading it into the RAM 207, and executing it. These programs can cause the computer 200 to function as the reception unit 131, the high-risk attribute extraction unit 132, the isolated entry extraction unit 133, the influence degree calculation unit 134, the risk information extraction unit 135, and the output control unit 136 shown in FIG. 1.

The leakage risk providing program does not necessarily have to be stored in the hard disk device 208. For example, the computer 200 may read and execute a program stored in a storage medium readable by the computer 200. Storage media readable by the computer 200 include, for example, portable recording media such as CD-ROMs, DVD disks, and USB memories, semiconductor memories such as flash memories, and hard disk drives. Alternatively, the leakage risk providing program may be stored in a device connected to a public line, the Internet, a LAN, or the like, and the computer 200 may read the leakage risk providing program from that device and execute it.

The following supplementary notes are further disclosed with respect to the embodiments, including the example described above.

(Supplementary note 1) A leakage risk providing apparatus comprising:
a storage unit that stores a sensitivity of each of a plurality of attributes in individual record data, which is a set of entries to which the attributes are given;
an attribute extraction unit that calculates, based on the sensitivities, a leakage risk for each combination of the attributes and extracts a combination of attributes whose leakage risk satisfies a predetermined condition;
an entry extraction unit that extracts, for each extracted combination, an entry for which the number of entries identified by the values of the attributes included in the combination satisfies a predetermined condition;
an influence degree calculation unit that calculates, based on the sensitivities, a degree of influence of an attribute other than the attributes included in the extracted combination in the extracted entry; and
a risk information creation unit that associates the extracted combination with an attribute whose degree of influence satisfies a predetermined condition.

(Supplementary note 2) The leakage risk providing apparatus according to supplementary note 1, wherein the attribute extraction unit calculates the leakage risk using the product of the sensitivities of the attributes included in the combination of attributes.

(Supplementary note 3) The leakage risk providing apparatus according to supplementary note 1 or 2, further comprising an output control unit that outputs, as risk information, the extracted combinations associated with the attributes whose degree of influence satisfies the predetermined condition, in descending order of the leakage risk of the extracted combinations.

(Supplementary note 4) The leakage risk providing apparatus according to supplementary note 3, wherein the output control unit includes, in the risk information it outputs, information on the extracted entry corresponding to the extracted combination.

(Supplementary note 5) The leakage risk providing apparatus according to supplementary note 4, wherein the output control unit outputs the risk information giving priority to the extracted entry for which the sensitivity of the attribute associated with the extracted combination is high.

(Supplementary note 6) The leakage risk providing apparatus according to any one of supplementary notes 3 to 5, wherein the output control unit outputs the risk information by associating attributes with the extracted combination, giving priority to attributes with high sensitivity.

(Supplementary note 7) The leakage risk providing apparatus according to any one of supplementary notes 3 to 6, wherein the output control unit includes, in the risk information it outputs, the number of extracted entries corresponding to the extracted combination.

(Supplementary note 8) The leakage risk providing apparatus according to any one of supplementary notes 1 to 7, wherein, when a combination of attributes contains another combination of attributes, the attribute extraction unit excludes the combination that contains the other combination and extracts a combination of attributes whose leakage risk satisfies the predetermined condition.

(Supplementary note 9) A leakage risk providing method in which a computer executes a process comprising:
calculating a leakage risk for each combination of a plurality of attributes, based on sensitivities stored in a storage unit that stores the sensitivity of each of the attributes in individual record data, which is a set of entries to which the attributes are given, and extracting a combination of attributes whose leakage risk satisfies a predetermined condition;
extracting, for each extracted combination, an entry for which the number of entries identified by the values of the attributes included in the combination satisfies a predetermined condition;
calculating, based on the sensitivities, a degree of influence of an attribute other than the attributes included in the extracted combination in the extracted entry; and
associating the extracted combination with an attribute whose degree of influence satisfies a predetermined condition.

(Supplementary note 10) The leakage risk providing method according to supplementary note 9, wherein the process of extracting the combination of attributes calculates the leakage risk using the product of the sensitivities of the attributes included in the combination of attributes.

(Supplementary note 11) The leakage risk providing method according to supplementary note 9 or 10, wherein the computer further executes a process of outputting, as risk information, the extracted combinations associated with the attributes whose degree of influence satisfies the predetermined condition, in descending order of the leakage risk of the extracted combinations.

(Supplementary note 12) The leakage risk providing method according to supplementary note 11, wherein the outputting process includes, in the risk information it outputs, information on the extracted entry corresponding to the extracted combination.

(Supplementary note 13) The leakage risk providing method according to supplementary note 12, wherein the outputting process outputs the risk information giving priority to the extracted entry for which the sensitivity of the attribute associated with the extracted combination is high.

(Supplementary note 14) The leakage risk providing method according to any one of supplementary notes 11 to 13, wherein the outputting process outputs the risk information by associating attributes with the extracted combination, giving priority to attributes with high sensitivity.

(Supplementary note 15) The leakage risk providing method according to any one of supplementary notes 11 to 14, wherein the outputting process includes, in the risk information it outputs, the number of extracted entries corresponding to the extracted combination.

(Supplementary note 16) The leakage risk providing method according to any one of supplementary notes 9 to 15, wherein, when a combination of attributes contains another combination of attributes, the process of extracting the combination of attributes excludes the combination that contains the other combination and extracts a combination of attributes whose leakage risk satisfies the predetermined condition.

(Supplementary Note 17) A leakage risk providing program that causes a computer to execute a process comprising:
calculating a leakage risk for each combination of attributes based on sensitivities stored in a storage unit that stores the sensitivity of each attribute in individual vote data, the individual vote data being a set of entries to which a plurality of attributes are given, and extracting combinations of attributes for which the leakage risk satisfies a predetermined condition;
extracting, for each extracted combination, entries for which the number of entries identified by the values of the attributes included in the combination satisfies a predetermined condition;
calculating, based on the sensitivities, a degree of influence of each attribute in the extracted entries other than the attributes included in the extracted combination; and
associating the extracted combination with an attribute for which the degree of influence satisfies a predetermined condition.
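As a non-authoritative illustration of the entry extraction step in Supplementary Note 17, the following Python sketch counts how many entries share the same values for the attributes of one extracted combination and keeps the entries whose count is small. The function name `isolated_entries`, the parameter `max_count`, the sample records, and the reading of the "predetermined condition" as "count at most `max_count`" are assumptions made for this example and are not taken from the note.

```python
from collections import Counter


def isolated_entries(entries, combination, max_count=1):
    """Return entries whose value tuple for `combination` occurs at most `max_count` times.

    Assumption: the "predetermined condition" on the number of entries is
    modeled here as an upper bound on how many entries share the same values.
    """
    def key(entry):
        return tuple(entry[attr] for attr in combination)

    counts = Counter(key(entry) for entry in entries)
    return [entry for entry in entries if counts[key(entry)] <= max_count]


# Hypothetical individual vote data (not from the source document).
records = [
    {"age": "30s", "address": "Tokyo", "disease": "flu"},
    {"age": "30s", "address": "Tokyo", "disease": "cold"},
    {"age": "40s", "address": "Osaka", "disease": "cancer"},
]
isolated = isolated_entries(records, ("age", "address"))
```

In this sketch, only the third record is unique for the pair (age, address), so it is the only entry returned for that combination.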

(Supplementary Note 18) The leakage risk providing program according to Supplementary Note 17, wherein the process of extracting the combinations of attributes calculates the leakage risk using a product of the sensitivities of the attributes included in the combination of attributes.
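The product-based risk calculation of Supplementary Note 18 can be sketched as follows. This is a minimal example under stated assumptions: the function name `high_risk_combinations`, the sample sensitivity values, and the interpretation of the "predetermined condition" as "risk greater than or equal to a threshold" are hypothetical choices for illustration only.

```python
from itertools import combinations
from math import prod


def high_risk_combinations(sensitivity, threshold):
    """Return (combination, risk) pairs whose risk is at least `threshold`.

    The risk of a combination is the product of the sensitivities of its
    attributes. Assumption: the extraction condition is `risk >= threshold`.
    Results are sorted from highest to lowest risk.
    """
    attrs = sorted(sensitivity)
    results = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            risk = prod(sensitivity[attr] for attr in combo)
            if risk >= threshold:
                results.append((combo, risk))
    results.sort(key=lambda item: -item[1])
    return results


# Hypothetical sensitivities per attribute (not from the source document).
sample_sensitivity = {"age": 2, "address": 3, "disease": 5}
extracted = high_risk_combinations(sample_sensitivity, threshold=10)
```

Enumerating every combination grows exponentially with the number of attributes, so this sketch is only suitable for a small attribute set.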

(Supplementary Note 19) The leakage risk providing program according to Supplementary Note 17 or 18, further causing the computer to execute a process of outputting, as risk information, the extracted combinations associated with the attributes whose degree of influence satisfies the predetermined condition, in descending order of the leakage risk of the extracted combinations.

(Supplementary Note 20) The leakage risk providing program according to Supplementary Note 19, wherein the outputting process outputs the risk information including information on the extracted entries corresponding to the extracted combination.

(Supplementary Note 21) The leakage risk providing program according to Supplementary Note 20, wherein the outputting process outputs the risk information giving priority to those extracted entries for which the attribute associated with the extracted combination has a high sensitivity.

(Supplementary Note 22) The leakage risk providing program according to any one of Supplementary Notes 19 to 21, wherein the outputting process outputs the risk information with attributes having a high sensitivity preferentially associated with the extracted combination.

(Supplementary Note 23) The leakage risk providing program according to any one of Supplementary Notes 19 to 22, wherein the outputting process outputs the risk information including the number of extracted entries corresponding to the extracted combination.

(Supplementary Note 24) The leakage risk providing program according to any one of Supplementary Notes 17 to 23, wherein, when a combination of attributes contains another combination of attributes, the process of extracting the combinations of attributes excludes the combination that contains the other combination, and extracts the combinations of attributes for which the leakage risk satisfies the predetermined condition.
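One possible reading of the exclusion in Supplementary Note 24 is that only minimal combinations are kept: a combination is dropped when it strictly contains another candidate combination. The Python sketch below illustrates that reading. The function name `minimal_combinations` and the sample input are hypothetical, and the note itself does not say at which point of the extraction the exclusion is applied.

```python
def minimal_combinations(combos):
    """Drop every combination that strictly contains another combination in `combos`.

    Assumption: "contains" is read as a proper-superset relation between the
    attribute sets of two candidate combinations.
    """
    sets = [frozenset(combo) for combo in combos]
    return [
        combo
        for combo, attrs in zip(combos, sets)
        if not any(other < attrs for other in sets)
    ]


# Hypothetical candidate combinations (not from the source document).
candidates = [
    ("address", "disease"),
    ("age", "disease"),
    ("address", "age", "disease"),
]
kept = minimal_combinations(candidates)
```

Here the three-attribute combination is removed because it contains both two-attribute combinations, which avoids reporting a risk that a smaller combination already explains.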

DESCRIPTION OF SYMBOLS
100 Leakage risk providing apparatus
110 Input unit
111 Display unit
112 Operation unit
120 Storage unit
121 Individual vote data storage unit
122 Priority information storage unit
123 Threshold storage unit
124 Risk information storage unit
130 Control unit
131 Reception unit
132 High-risk attribute extraction unit
133 Isolated entry extraction unit
134 Influence degree calculation unit
135 Risk information extraction unit
136 Output control unit

Claims

A leakage risk providing apparatus comprising:
a storage unit that stores a sensitivity of each attribute in individual vote data, the individual vote data being a set of entries to which a plurality of attributes are given;
an attribute extraction unit that calculates, based on the sensitivities, a leakage risk for each combination of the attributes and extracts combinations of attributes for which the leakage risk satisfies a predetermined condition;
an entry extraction unit that extracts, for each extracted combination, entries for which the number of entries identified by the values of the attributes included in the combination satisfies a predetermined condition;
an influence degree calculation unit that calculates, based on the sensitivities, a degree of influence of each attribute in the extracted entries other than the attributes included in the extracted combination; and
a risk information creation unit that associates the extracted combination with an attribute for which the degree of influence satisfies a predetermined condition.

The leakage risk providing apparatus according to claim 1, wherein the attribute extraction unit calculates the leakage risk using a product of the sensitivities of the attributes included in the combination of attributes.

The leakage risk providing apparatus according to claim 1, further comprising an output control unit that outputs, as risk information, the extracted combinations associated with the attributes whose degree of influence satisfies the predetermined condition, in descending order of the leakage risk of the extracted combinations.

The leakage risk providing apparatus according to claim 3, wherein the output control unit outputs the risk information including information on the extracted entries corresponding to the extracted combination.

The leakage risk providing apparatus according to claim 4, wherein the output control unit outputs the risk information giving priority to those extracted entries for which the attribute associated with the extracted combination has a high sensitivity.

The leakage risk providing apparatus according to any one of claims 3 to 5, wherein the output control unit outputs the risk information with attributes having a high sensitivity preferentially associated with the extracted combination.

The leakage risk providing apparatus according to any one of claims 3 to 6, wherein the output control unit outputs the risk information including the number of extracted entries corresponding to the extracted combination.

The leakage risk providing apparatus according to any one of claims 1 to 7, wherein, when a combination of attributes contains another combination of attributes, the attribute extraction unit excludes the combination that contains the other combination, and extracts the combinations of attributes for which the leakage risk satisfies the predetermined condition.

A leakage risk providing method in which a computer executes a process comprising:
calculating a leakage risk for each combination of attributes based on sensitivities stored in a storage unit that stores the sensitivity of each attribute in individual vote data, the individual vote data being a set of entries to which a plurality of attributes are given, and extracting combinations of attributes for which the leakage risk satisfies a predetermined condition;
extracting, for each extracted combination, entries for which the number of entries identified by the values of the attributes included in the combination satisfies a predetermined condition;
calculating, based on the sensitivities, a degree of influence of each attribute in the extracted entries other than the attributes included in the extracted combination; and
associating the extracted combination with an attribute for which the degree of influence satisfies a predetermined condition.

A leakage risk providing program that causes a computer to execute a process comprising:
calculating a leakage risk for each combination of attributes based on sensitivities stored in a storage unit that stores the sensitivity of each attribute in individual vote data, the individual vote data being a set of entries to which a plurality of attributes are given, and extracting combinations of attributes for which the leakage risk satisfies a predetermined condition;
extracting, for each extracted combination, entries for which the number of entries identified by the values of the attributes included in the combination satisfies a predetermined condition;
calculating, based on the sensitivities, a degree of influence of each attribute in the extracted entries other than the attributes included in the extracted combination; and
associating the extracted combination with an attribute for which the degree of influence satisfies a predetermined condition.