JP6313944B2 - Anonymization system, anonymization method and anonymization program

Info

Publication number: JP6313944B2
Application number: JP2013204823A
Authority: JP
Inventors: 秀暢小栗
Original assignee: 富士通クラウドテクノロジーズ株式会社
Priority date: 2013-09-30
Filing date: 2013-09-30
Publication date: 2018-04-18
Anticipated expiration: 2033-09-30
Also published as: JP2015069532A

Description

The present invention relates to a technique for anonymizing or diversifying personal information so that it can be used.

With the development of information processing technology, information is collected in many everyday situations, and the collected information is then processed. For example, when a consumer becomes a member of a store and purchases products, the consumer's name, age, sex, address, e-mail address, and so on are often registered at the time of membership registration. When the consumer purchases a product, the store's system records the purchased product in association with that consumer. Accumulating and analyzing this purchase information makes it possible to estimate the consumer's preferences and to offer services such as sending direct mail when a new product that the consumer is likely to prefer is released. Furthermore, analyzing the information of many consumers yields information such as which products are preferred by women in their 20s or in the Kanto area, which is used for marketing and similar purposes.

Such information is highly valuable not only to the store itself but also to product manufacturers and other business operators, and there has been demand to use it, for example, for recommendations such as advertisements and coupons.

However, a store cannot provide consumers' personal information to others without each consumer's consent. Therefore, when information about consumers is provided to others, it must be anonymized so that individuals cannot be identified.

For example, if a member list that includes ages contains only one 25-year-old, that person can be identified as soon as someone learns that a 25-year-old acquaintance is a member. In other words, when only one person has the attribute of being a 25-year-old member, the individual is likely to be identifiable by cross-referencing other information.

Therefore, if the ages in the member list are abstracted into 10-year bands so that multiple people share the same attribute, for example three people in their 20s, it is no longer possible to tell which of the three a record belongs to. A state in which k or more people share the same attributes is said to satisfy "k-anonymity," and processing data to reach that state is called "k-anonymization."

JP 2003-196391 A
JP 2003-233551 A
JP 2005-100408 A
JP 2004-086383 A
JP 2005-346248 A

Where multiple stores or businesses operate or exhibit together, as in shopping malls and exhibitions, there is demand for each store or business to use visitor information for marketing and the like, and providing the personal information collected by each store or business to the others after k-anonymization is being considered.

However, k-anonymization requires first deciding the anonymization conditions: which items, such as age, sex, and occupation, to include in the anonymized data, and how to abstract them. After the anonymization is performed under those conditions, the result is checked for k-anonymity, and if k-anonymity is not satisfied, the conditions are changed to raise the degree of abstraction and the process is repeated. The k-anonymization process therefore takes time and effort.

Meanwhile, the stores and businesses that use anonymous information need it at different times. For example, business A needs anonymous information every 10 minutes, business B every hour, and business C every day. Ideally, anonymization would be performed at the timing each business requires, but when the timing differs, suitable anonymization conditions have to be determined each time. For example, even if data with the attribute values "20s, female, purchased coffee" satisfies k-anonymity when anonymization is performed hourly, the same data may not satisfy k-anonymity when anonymization is performed every 10 minutes. In that case the conditions must be changed, either by raising the degree of abstraction, as in "under 30, female, purchased coffee," or by using fewer items, as in "20s, female." That is, when the target period is short and the target data is sparse, the k-anonymized data (anonymous information) becomes coarse. Conversely, when anonymization is performed daily, k-anonymity may still be satisfied even if the conditions are changed to lower the degree of abstraction, as in "early 20s, female, purchased regular coffee," or to use more items, as in "early 20s, female, purchased coffee and milk." That is, when the target period is long and the target data is plentiful, the k-anonymized data (anonymous information) becomes detailed.

Because k-anonymization takes this much time and effort, it is not realistic to determine anonymization conditions and anonymize at each timing requested by each business. Instead, anonymization is performed at a fixed timing, for example hourly, and the resulting anonymous information is provided to every business. In that case, business A, which needs anonymous information every 10 minutes, cannot obtain it when needed, and business C, which needs it daily, cannot obtain detailed anonymous information.

One conceivable alternative is to have an information processing apparatus determine the anonymization conditions mechanically and anonymize at the timing each business requires.

However, when a conventional information processing apparatus abstracts the value of each item to satisfy k-anonymity, it simply partitions the data mechanically so that the same attribute value occurs multiple times, and the resulting data may have no practical value even if it satisfies k-anonymity. For example, when data is used to understand fashion trends, the age item is important, and abstracting it too far for the sake of anonymization destroys its value. Likewise, if the age item is mechanically split into a range such as 17 or older and under 22, adults and minors, or high school students and working adults, end up in the same group, which makes the data hard to use for marketing and the like and destroys its value.

The present invention therefore provides a technique for anonymizing the data within a target period based on its value after abstraction.

To solve the above problem, the anonymization system of the present invention comprises:
a period acquisition unit that acquires a period to be anonymized;
a data acquisition unit that acquires, from target data that is associated with individuals and includes a plurality of items, the target data falling within the period;
an abstraction unit that generates abstraction candidate data by replacing words that are item values in the target data with abstracted words;
a value determination unit that receives the values of the words included in the abstraction candidate data and obtains the value of the abstraction candidate data based on them;
a test unit that tests the abstraction candidate data on the condition that no combination of item values is limited to a single individual in the target data;
a selection unit that selects abstraction candidate data based on the value of the abstraction candidate data that satisfies the test condition; and
an output unit that outputs the abstraction candidate data selected based on the value as anonymous information.
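The units listed above can be read as a small pipeline. The following is a minimal sketch of that flow, not the patented implementation: the generalization levels in `AGE_LEVELS`, the scores in `LEVEL_VALUE`, the record layout, and the function names are all assumptions made for illustration, and the test uses plain k-anonymity only.

```python
from collections import Counter
from datetime import datetime

# Hypothetical generalization levels for the age item, finest first.
AGE_LEVELS = {
    "age": lambda a: str(a),
    "decade": lambda a: f"{a // 10 * 10}s",
    "under/over 30": lambda a: "under 30" if a < 30 else "30 or over",
}
# Hypothetical value per level; the source receives word values from outside.
LEVEL_VALUE = {"age": 3.0, "decade": 2.0, "under/over 30": 1.0}


def anonymize(records, start, end, k):
    """Return (level, rows) for the most valuable k-anonymous candidate, or None."""
    # Data acquisition: keep only records inside the target period.
    target = [r for r in records if start <= r["time"] < end]
    passed = []
    for level, generalize in AGE_LEVELS.items():
        # Abstraction: one candidate per generalization level.
        rows = [(generalize(r["age"]), r["sex"]) for r in target]
        # Test: every combination of item values must occur at least k times.
        if rows and all(n >= k for n in Counter(rows).values()):
            # Value determination: attach the candidate's score.
            passed.append((LEVEL_VALUE[level], level, rows))
    if not passed:
        return None
    # Selection: keep the candidate with the highest value.
    _, level, rows = max(passed, key=lambda c: c[0])
    return level, rows


def at(hour, minute):
    return datetime(2013, 9, 30, hour, minute)


# Made-up purchase records for illustration.
purchases = [
    {"time": at(10, 2), "age": 21, "sex": "F"},
    {"time": at(10, 7), "age": 27, "sex": "F"},
    {"time": at(10, 25), "age": 21, "sex": "F"},
    {"time": at(10, 48), "age": 27, "sex": "F"},
]

ten_minutes = anonymize(purchases, at(10, 0), at(10, 10), k=2)
one_hour = anonymize(purchases, at(10, 0), at(11, 0), k=2)
print(ten_minutes[0])  # level chosen for the short window
print(one_hour[0])     # level chosen for the long window
```

With this made-up data, the 10-minute window only passes the test at the coarser decade level, while the one-hour window contains enough records to pass at the exact-age level, which mirrors the relationship between period length and detail described earlier.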

The anonymization system may further comprise:
a dictionary acquisition unit that acquires a plurality of anonymization dictionaries, each of which stores words in association with abstracted words so that words included in the target data can be anonymized by replacing them with the abstracted words;
an integration unit that, based on the correspondences between the words in the plurality of anonymization dictionaries, treats each abstracted word as superordinate and each word before abstraction as subordinate, associates each word in the dictionaries with its superordinate and subordinate words found in the dictionaries, links the words in a tree from a topmost word, which has no superordinate word, as the root down to the bottommost words, which have no subordinate words, takes that tree of correspondences as one dimension, obtains such a dimension for each of the topmost words in the dictionaries, and uses the resulting plurality of dimensions as an integrated anonymization dictionary;
a priority determination unit that determines, for each dimension, a priority based on the words included in that dimension; and
a dimension selection unit that selects, based on the priority, which of the dimensions are adopted into the integrated anonymization dictionary and which are not,
wherein the abstraction unit may refer to the integrated anonymization dictionary and generate anonymization candidate data by replacing words that are item values in the target data with abstracted words.
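The integration of several anonymization dictionaries into per-root trees ("dimensions") can be sketched as follows. This is an illustrative reading only: the dictionary contents, the per-word weights, the use of a summed weight as the priority, and the function names `build_dimensions` and `select_dimensions` are assumptions, not details confirmed by this passage.

```python
from collections import defaultdict


def build_dimensions(dictionaries):
    """Merge word -> abstracted-word dictionaries into one tree per topmost word."""
    edges = {(word, upper) for d in dictionaries for word, upper in d.items()}
    children = defaultdict(set)
    for word, upper in edges:
        children[upper].add(word)
    has_upper = {word for word, _ in edges}
    # A topmost word (root) appears as an abstracted word but has none of its own.
    roots = {upper for _, upper in edges} - has_upper
    dimensions = {}
    for root in roots:
        members, stack = set(), [root]
        while stack:  # walk down from the root to the bottommost words
            word = stack.pop()
            if word not in members:
                members.add(word)
                stack.extend(children.get(word, ()))
        dimensions[root] = members
    return dimensions


def select_dimensions(dimensions, weights, keep):
    """Rank dimensions by the summed weight of their words and adopt the top ones."""
    priority = {root: sum(weights.get(w, 0) for w in words)
                for root, words in dimensions.items()}
    ranked = sorted(priority, key=lambda r: (-priority[r], r))
    return ranked[:keep], priority


# Hypothetical dictionaries from three stores.
store_a = {"espresso": "coffee", "latte": "coffee", "coffee": "beverage"}
store_b = {"green tea": "tea", "tea": "beverage", "coffee": "beverage"}
store_c = {"T-shirt": "clothing"}
weights = {"coffee": 3, "tea": 2, "T-shirt": 1}  # made-up word weights

dimensions = build_dimensions([store_a, store_b, store_c])
adopted, priority = select_dimensions(dimensions, weights, keep=1)
print(adopted, priority)
```

Storing the correspondences as a set of edges lets the same word appear in more than one dictionary without duplication; dictionaries that contain cycles would produce no root and are not handled in this sketch.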

The output unit may distribute the anonymous information when it satisfies a predetermined condition.

To solve the above problem, in the anonymization method of the present invention, a computer executes the steps of:
acquiring a period to be anonymized;
acquiring, from target data that is associated with individuals and includes a plurality of items, the target data falling within the period;
generating abstraction candidate data by replacing words that are item values in the target data with abstracted words;
receiving the values of the words included in the abstraction candidate data and obtaining the value of the abstraction candidate data based on them;
testing the abstraction candidate data on the condition that no combination of item values is limited to a single individual in the target data;
selecting abstraction candidate data based on the value of the abstraction candidate data that satisfies the test condition; and
outputting the abstraction candidate data selected based on the value as anonymous information.

The present invention may also be an anonymization program that causes a computer to execute the above anonymization method. The anonymization program may be recorded on a computer-readable storage medium.

Here, a computer-readable storage medium is a storage medium that stores information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action and that can be read by a computer. Storage media of this kind that are removable from the computer include, for example, flexible disks, magneto-optical disks, CD-ROMs, CD-R/Ws, DVDs, DATs, 8 mm tapes, and memory cards. Storage media fixed to the computer include hard disks and ROM (read-only memory).

The present invention can provide a technique for anonymizing the data within a target period based on its value after abstraction.

FIG. 1 is an explanatory diagram of anonymization.
FIG. 2 is an explanatory diagram of diversification.
FIG. 3 is a functional block diagram of the anonymization system.
FIG. 4 is a diagram showing the hardware configuration of the information processing apparatus.
FIG. 5 shows an example of the personal information that the anonymization system acquires from a terminal.
FIG. 6 is a diagram showing an example of the target period.
FIG. 7 is an explanatory diagram of distribution conditions.
FIG. 8 is a diagram showing a specific example of the setting information DB that stores target periods and distribution conditions.
FIG. 9 is a diagram showing the process of receiving and storing personal information.
FIG. 10 is a diagram showing the process of distributing at a distribution timing based on a predetermined time interval and history.
FIG. 11 is a diagram showing the process of distributing in real time.
FIG. 12 is a diagram showing the anonymization process.
FIG. 13 is an explanatory diagram of candidate patterns.
FIG. 14 is a diagram showing an example of part of the age item in the target data.
FIG. 15 is a diagram showing an example of the value data acquired for age.
FIG. 16 is a diagram showing the value of the age item.
FIG. 17 is a table showing the value of the age item.
FIG. 18 is a diagram showing an example of part of the age item in the abstraction candidate data.
FIG. 19 is a diagram showing the value data of each word acquired for the age group.
FIG. 20 is a diagram showing the value of the age-group item.
FIG. 21 is a diagram showing the value of the age item.
FIG. 22 is a functional block diagram of the anonymization system.
FIG. 23 is a diagram showing an example of the dictionary DB.
FIG. 24 is a diagram showing an example of the priority DB.
FIG. 25 is a diagram showing an example of the common DB.
FIG. 26 is a diagram showing an example of the personal information DB.
FIG. 27 is a diagram showing the hardware configuration of the management server 20.
FIG. 28 is a diagram showing the hardware configuration of the anonymization device.
FIG. 29 is a diagram showing the process of distributing at a distribution timing based on a predetermined time interval and history.
FIG. 30 is a diagram showing the process of distributing in real time.
FIG. 31 is a diagram showing the anonymization process.
FIG. 32 is an explanatory diagram of the process by which the management server 20 creates the integrated anonymization dictionary.
FIG. 33 is an explanatory diagram of the process of integrating anonymization dictionaries.
FIG. 34 is an explanatory diagram of the dimensions created by the process of FIG. 33.
FIG. 35 is an explanatory diagram of a plurality of dimensions.
FIG. 36 is a diagram showing an example in which each word included in the dimensions shown in FIG. 35 is weighted.
FIG. 37 is an explanatory diagram of the process of totaling the weights of the words to obtain the priority of each dimension.
FIG. 38 is an explanatory diagram of the anonymization method executed by the anonymization device using the integrated anonymization dictionary.
FIG. 39 is a diagram showing an example of anonymization at company A.
FIG. 40 is a diagram showing an example of anonymization at company B.
FIG. 41 is a diagram showing an example of anonymizing personal information collected at each store of a shopping mall.
FIG. 42 is a diagram showing an example of anonymizing the position information (personal information) of a navigation system.
FIG. 43 is a functional block diagram of the anonymization system.
FIG. 44 is a diagram showing the hardware configuration of the anonymization server.
FIG. 45 is a diagram showing the hardware configuration of the business operator terminal.
FIG. 46 is an explanatory diagram of the anonymization method executed by the anonymization server using the integrated anonymization dictionary.
FIG. 47 is an explanatory diagram of the anonymization system.
FIG. 48 is a diagram showing the hardware configuration of the anonymization system.

Embodiments of the present invention are described below with reference to the drawings. The configurations of the following embodiments are examples, and the present invention is not limited to them.

<Embodiment 1>
§1. Anonymization
FIG. 1 is an explanatory diagram of k-anonymization. FIG. 1(A) shows an example in which the surname item has been deleted from member information that contains surname, age, and sex items.

As shown in FIG. 1(A), if the member information, which includes ages, contains only one 16-year-old female, that person can be identified as soon as it becomes known that a certain 16-year-old female is a member. In other words, when only one person has the attributes of being 16 years old and female, the individual may be identifiable by cross-referencing other information.

In FIG. 1(B), the ages in the member list are abstracted into age groups: under 10, teens, 20s, and so on. Even so, there is only one female in her teens, so the individual can still be identified just as in FIG. 1(A), and the anonymization is insufficient.

In FIG. 1(C), the data is therefore abstracted further by changing the age-group boundaries to "teens or younger" (19 or younger) and "20s." In FIG. 1(C) there are two females in the teens-or-younger group, so the attribute combination [teens or younger] and [female] is no longer unique. Consequently, even if it becomes known that a 16-year-old female is a member, it cannot be determined which of the two records is hers. A state in which k or more people (two in this example) share the same attributes is said to satisfy "k-anonymity," and processing data to reach that state is called "k-anonymization."
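The three stages of FIG. 1 can be checked mechanically. The sketch below uses a made-up four-person member list (the actual values of FIG. 1 are not reproduced in the text), and the helper names are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, items, k):
    """True if every combination of the given item values occurs at least k times."""
    counts = Counter(tuple(r[i] for i in items) for r in records)
    return all(n >= k for n in counts.values())


def by_decade(age):
    return f"{age // 10 * 10}s"


def teens_or_younger(age):
    return "19 or younger" if age < 20 else by_decade(age)


def generalize(records, age_fn):
    return [{"age": age_fn(r["age"]), "sex": r["sex"]} for r in records]


# Hypothetical member list with the surname item already removed.
members = [
    {"age": 8, "sex": "F"},
    {"age": 16, "sex": "F"},
    {"age": 23, "sex": "M"},
    {"age": 27, "sex": "M"},
]

raw_ok = is_k_anonymous(members, ["age", "sex"], k=2)
decade_ok = is_k_anonymous(generalize(members, by_decade), ["age", "sex"], k=2)
coarse_ok = is_k_anonymous(generalize(members, teens_or_younger), ["age", "sex"], k=2)
print(raw_ok, decade_ok, coarse_ok)
```

In this example the raw ages and the 10-year bands both leave the 16-year-old female in a group of one, while merging the two youngest bands puts her in a group of two, so only the last form satisfies 2-anonymity.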

FIG. 2 is an explanatory diagram of l-diversification. It shows an example in which the data on the stations each user uses is abstracted into data on the wards to which those stations belong.

In the data before abstraction, the stations are specified, so a user may be identified by cross-referencing data such as a residence near Shinjuku Station and a workplace near Tokyo Station. Abstracting each station to the ward it belongs to means that multiple users use a station in Shinjuku Ward and a station in Chiyoda Ward, so the user can no longer be identified. A state in which an attribute value, such as "uses a station in Shinjuku Ward and a station in Chiyoda Ward," admits l different possibilities is said to satisfy "l-diversity," and processing data to reach that state is called "l-diversification."
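One way to read this definition is that each abstracted value must cover at least l distinct original values. The sketch below checks that reading; it is an interpretation, not the patent's stated algorithm, and the station-to-ward table and trip data are illustrative assumptions.

```python
from collections import defaultdict

# Illustrative station -> ward mapping (an assumption for this sketch).
STATION_TO_WARD = {
    "Shinjuku": "Shinjuku Ward",
    "Takadanobaba": "Shinjuku Ward",
    "Tokyo": "Chiyoda Ward",
    "Akihabara": "Chiyoda Ward",
}


def is_l_diverse(trips, mapping, l):
    """True if every abstracted (home ward, work ward) pair hides at least
    l distinct original (home station, work station) pairs."""
    originals = defaultdict(set)
    for home, work in trips:
        originals[(mapping[home], mapping[work])].add((home, work))
    return all(len(pairs) >= l for pairs in originals.values())


# Hypothetical users: (station near home, station near work).
trips = [
    ("Shinjuku", "Tokyo"),
    ("Takadanobaba", "Akihabara"),
    ("Shinjuku", "Akihabara"),
]

two_ok = is_l_diverse(trips, STATION_TO_WARD, l=2)
four_ok = is_l_diverse(trips, STATION_TO_WARD, l=4)
print(two_ok, four_ok)
```

Here all three users collapse to the pair (Shinjuku Ward, Chiyoda Ward), which covers three different station pairs, so the data passes for l = 2 but not for l = 4.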

The anonymization system 10 of Embodiment 1 anonymizes the target data by abstracting it so that it satisfies this "k-anonymity" or "l-diversity," that is, so that no combination of item values is limited to a single individual in the target data.

§2. System Configuration
FIG. 3 is a functional block diagram of the anonymization system. The anonymization system 10 of Embodiment 1 operates in a shopping mall where multiple businesses (also called stores in this example) have shops. It aggregates and anonymizes the personal information collected at each store and distributes the anonymous information to the terminals 30 of the stores and other users.

As shown in FIG. 3, the anonymization system 10 includes a data acquisition unit 101, an abstraction unit 102, a test unit 103, a selection unit 104, a value determination unit 105, an anonymous information registration unit 106, a value data acquisition unit 107, a word category analysis unit 108, a word value calculation unit 109, a data output unit 120, a setting information registration unit 121, a period acquisition unit 122, a prediction unit 123, an anonymization DB (database) 144, a search information storage DB 145, and a setting information DB 146.

The setting information registration unit 121 receives setting information, such as the period to be anonymized and the distribution conditions, from the terminal 30 of a user or business (hereinafter also simply called a user) and registers it in the setting information DB 146. In this embodiment, the setting information DB 146 is one form of a setting information storage unit.

The period acquisition unit 122 acquires the period to be anonymized. For example, it acquires the period by receiving it together with a request from a user or by reading a preset period from the setting information DB 146.

The prediction unit 123 obtains a prediction result based on the personal information and other data received from the terminals 30. For example, it obtains, as the prediction result, an attribute value a predetermined time ahead based on how the attribute value has changed within a predetermined period. It may also produce a warning as the prediction result when the attribute value the predetermined time ahead exceeds a threshold. Specifically, based on the number of visitors arriving by car per hour (that is, the number of vehicles entering the parking lot) and the average length of stay, it predicts the number of vehicles in the lot two and three hours ahead; if that number exceeds 70% (the threshold) of the parking lot's capacity, the prediction result consists of a warning urging that traffic guides be deployed together with the predicted numbers for two and three hours ahead. The prediction unit also has a prediction DB that stores combinations of values of multiple attributes in association with prediction results, and obtains from the prediction DB the prediction result corresponding to the combination of attribute values in the personal information received from the terminals 30. Specifically, if restaurants become crowded one hour after the share of female visitors who arrived by means other than car exceeds 40% and it starts to rain, a congestion-relief message such as "hold a time-limited sale targeting women one hour from now" is stored in the prediction DB in association with the attribute value combination "40% or more female visitors arriving by means other than car, rain," and the message corresponding to these attribute values is obtained as the prediction result.
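The two kinds of prediction described here can be sketched as follows. The text does not give the formula used to predict the number of vehicles, so the model below is an assumption chosen for simplicity: arrivals are steady, the lot starts empty, and every vehicle stays exactly the average time. The function names, the rule-table layout, and the sample numbers are likewise hypothetical.

```python
def predict_parked(arrivals_per_hour, avg_stay_hours, hours_ahead):
    """Estimate vehicles in the lot under the simplifying assumptions above."""
    return arrivals_per_hour * min(hours_ahead, avg_stay_hours)


def parking_forecast(arrivals_per_hour, avg_stay_hours, capacity, threshold=0.7):
    """Forecast two and three hours ahead and warn if the threshold is exceeded."""
    forecast = {h: predict_parked(arrivals_per_hour, avg_stay_hours, h) for h in (2, 3)}
    warn = any(n > capacity * threshold for n in forecast.values())
    return {"forecast": forecast, "warning": "deploy traffic guides" if warn else None}


# Hypothetical prediction DB: a condition on attribute values and its stored message.
PREDICTION_DB = [
    (lambda obs: obs["female_non_car_share"] >= 0.4 and obs["raining"],
     "hold a time-limited sale targeting women one hour from now"),
]


def lookup_prediction(observation):
    """Return the stored message for the first matching attribute combination."""
    for condition, message in PREDICTION_DB:
        if condition(observation):
            return message
    return None


result = parking_forecast(arrivals_per_hour=120, avg_stay_hours=2.5, capacity=400)
message = lookup_prediction({"female_non_car_share": 0.45, "raining": True})
print(result)
print(message)
```

With these made-up inputs the forecast stays below 70% of capacity at two hours and exceeds it at three hours, so a warning is attached to the result.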

The data acquisition unit 101 acquires, as target data, the personal information that falls within the target period from among personal information that is associated with individuals and includes a plurality of items. For example, the data acquisition unit 101 receives personal information from the terminal 30 of each store, stores it in the anonymization DB 144, and reads the data that falls within the target period from the anonymization DB 144 as the target data.

When anonymizing or diversifying the target data, the abstraction unit 102 generates abstraction candidate data by replacing words that are item values in the target data with abstracted words. In this embodiment, a word is a unit of language such as a single word or a phrase, and may also include numerical values such as location information and telephone numbers, identifiers such as e-mail addresses and IP addresses, and symbols that carry meaning in the same way as words.

The value determination unit 105 obtains the value of the abstraction candidate data based on the values of the words it contains.

The test unit 103 tests the abstraction candidate data on the condition that the combination of item values corresponding to any one individual is not unique within the abstraction candidate data. For example, the test unit 103 tests whether the abstraction candidate data satisfies k-anonymity or l-diversity.

選択部１０４は、前記検定の条件を満たした抽象化候補データの価値に基づいて抽象化候補データを選択する。例えば、選択部１０４は、ｋ−匿名性やｌ−多様性を満たした抽象化候補データを価値が高い順に所定数選択する。また、選択部１０４は、ｋ−匿名性やｌ−多様性を満たした抽象化候補データのうち、最も価値が高い抽象化候補データを選択しても良い。 The selection unit 104 selects abstraction candidate data based on the value of the abstraction candidate data that satisfies the test condition. For example, the selection unit 104 selects a predetermined number of abstraction candidate data satisfying k-anonymity and l-diversity in descending order of value. The selection unit 104 may select abstraction candidate data having the highest value among the abstraction candidate data satisfying k-anonymity and l-diversity.
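The selection step can be sketched as a filter followed by a sort. The sketch below is only an illustration under stated assumptions: the candidate dictionaries, the `ok` flag standing in for the result of the test unit 103, and the numeric `value` standing in for the result of the value determination unit 105 are all hypothetical.

```python
def select_candidates(candidates, passes_test, value_of, top_n=1):
    """Keep candidates that pass the test, then return the top_n in descending order of value."""
    passing = [c for c in candidates if passes_test(c)]
    return sorted(passing, key=value_of, reverse=True)[:top_n]

# Hypothetical candidates: name, test result, and value are illustrative only.
candidates = [
    {"name": "A',B,C", "ok": False, "value": 90},
    {"name": "A',B',C", "ok": True, "value": 70},
    {"name": "A',B',C'", "ok": True, "value": 40},
]

best = select_candidates(candidates, lambda c: c["ok"], lambda c: c["value"])
top_two = select_candidates(candidates, lambda c: c["ok"], lambda c: c["value"], top_n=2)
print([c["name"] for c in best])     # ["A',B',C"]
print([c["name"] for c in top_two])  # ["A',B',C", "A',B',C'"]
```

The most valuable candidate overall is discarded because it fails the test, so the highest-value candidate among those that pass is selected.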

匿名情報登録部１０６は、選択部１０４で選択された抽象化候補データを匿名情報として匿名化ＤＢ１４４に登録する。 The anonymous information registration unit 106 registers the abstraction candidate data selected by the selection unit 104 in the anonymization DB 144 as anonymous information.

価値データ取得部１０７は、抽象化候補データに含まれるワードの価値データを検索情報蓄積ＤＢ１４５から取得（受信）する。また、価値データ取得部１０７は、検索情報蓄積ＤＢ１４５に前記ワードの価値データが登録されていない場合に、他の装置にリクエストし、取得した価値データを検索情報蓄積ＤＢ１４５に登録する機能（データリクエスト）や、定期的に他の装置を巡回して最新の価値データを取得し、検索情報蓄積ＤＢ１４５に登録されている価値データを更新する機能（データクローラ）を有する。本実施形態では、この価値データとして検索エンジン７０から各ワードの統計情報を受信する。ここで、各ワードの統計情報は、例えばＳＥＭの広告単価（クリック単価）や、クリック率、平均掲載順位、１日の表示回数、１日のクリック数等である。なお、価値の取得先は、検索エンジンに限らず、ウェブページやＳＮＳ等であっても良い。この場合、例えばウェブページやＳＮＳにおける各ワードの使用頻度を価値としても良い。 The value data acquisition unit 107 acquires (receives) the value data of the words included in the abstraction candidate data from the search information storage DB 145. The value data acquisition unit 107 also has a function (data request) that, when the value data of a word is not registered in the search information storage DB 145, requests it from another device and registers the acquired value data in the search information storage DB 145, and a function (data crawler) that periodically visits other devices to acquire the latest value data and updates the value data registered in the search information storage DB 145. In this embodiment, statistical information on each word is received from the search engine 70 as this value data. Here, the statistical information on each word is, for example, the SEM advertising unit price (cost per click), the click rate, the average ad position, the number of impressions per day, and the number of clicks per day. The source of the value is not limited to a search engine and may be a web page, an SNS, or the like. In that case, for example, the frequency with which each word is used on web pages or the SNS may be used as the value.

ワードカテゴリ分析部１０８は、ウェブサイト等のデータを分析して、新規のワードや、当該ワードを抽象化したワード（カテゴリ）を求め、検索情報蓄積ＤＢに登録する。 The word category analysis unit 108 analyzes data on a website or the like to obtain a new word or a word (category) obtained by abstracting the word and registers it in the search information storage DB.

価値計算部１０９は、価値データ取得部１０７で取得したワードの価値に基づき、ワードの価値の年平均や月平均、週平均など、ワードの価値の統計情報を求める。特に、対象期間内の価値、例えば最高値や最低値、平均値であっても良い。 Based on the word values acquired by the value data acquisition unit 107, the value calculation unit 109 obtains statistical information on the value of each word, such as its annual, monthly, or weekly average. In particular, the statistic may be a value within the target period, for example the maximum, minimum, or average value.

データ出力部１２０は、匿名化ＤＢ１４４から匿名情報を読み出して出力する。ここで、匿名情報の出力とは、表示装置による表示出力や、プリンタによる印刷出力、他のコンピュータへの送信、記憶媒体への書き込み等である。 The data output unit 120 reads and outputs anonymous information from the anonymization DB 144. Here, the output of anonymous information includes display output by a display device, print output by a printer, transmission to another computer, writing to a storage medium, and the like.

また、本実施形態１のデータ出力部１２０は、配信条件に基づいて匿名情報を店舗等の端末３０へ情報の配信を行う。例えば、女性の割合が６０％を超えた場合や、２０代の来店者が１００人を超えた場合、３０代男性の割合が２０％未満となった場合等、所定の属性値を持つ人数（当該属性値と一致する個人の数）が所定値に達することを配信条件とし、所定値に達した場合に情報の配信を行う。この配信条件を満たした場合に配信する情報は、単に配信条件を満たした旨の通知であっても良いし、配信条件を満たした状態の匿名情報であっても良い。 In addition, the data output unit 120 of the first embodiment distributes anonymous information to a terminal 30 of a store or the like based on the distribution conditions. For example, the distribution condition is that the number of people having a predetermined attribute value (the number of individuals matching that attribute value) reaches a predetermined value, such as when the proportion of women exceeds 60%, when the number of visitors in their 20s exceeds 100, or when the proportion of men in their 30s falls below 20%, and the information is distributed when the predetermined value is reached. The information distributed when the distribution condition is satisfied may simply be a notification that the condition has been satisfied, or may be the anonymous information as of the time the condition was satisfied.
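The three example conditions in this paragraph can be evaluated with a few lines of Python. This is a minimal sketch, assuming that anonymous information is available as a list of attribute dictionaries; the helper names `count_matching` and `share_matching` and the sample records are hypothetical and not part of the specification.

```python
def count_matching(records, **attrs):
    """Number of records whose attributes equal all the given values."""
    return sum(all(r.get(k) == v for k, v in attrs.items()) for r in records)

def share_matching(records, **attrs):
    """Fraction of records matching the given attribute values (0.0 if there are no records)."""
    return count_matching(records, **attrs) / len(records) if records else 0.0

# Hypothetical anonymized visitor records for one target period.
visitors = [{"age": "20s", "sex": "F"}] * 7 + [{"age": "30s", "sex": "M"}] * 3

conditions = {
    "women > 60%": share_matching(visitors, sex="F") > 0.60,
    "visitors in 20s > 100": count_matching(visitors, age="20s") > 100,
    "men in 30s < 20%": share_matching(visitors, age="30s", sex="M") < 0.20,
}
triggered = [name for name, met in conditions.items() if met]
print(triggered)  # ['women > 60%']
```

Only the conditions listed in `triggered` would lead to a distribution, either as a plain notification or together with the anonymous information itself.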

匿名化ＤＢ１４４は、個人情報（対象データ）を記憶し、当該個人情報を検定用に供すると共に、個人情報を対象期間毎に匿名化した匿名情報を保持する。 The anonymization DB 144 stores personal information (target data), provides the personal information for verification, and holds anonymous information obtained by anonymizing personal information for each target period.

検索情報蓄積ＤＢ１４５は、価値データ取得部１０７で取得したワードの価値や、ワードカテゴリ分析部１０８で求めたワードやカテゴリの情報、価値計算部１０９で求めた価値の統計情報などを記憶する。 The search information storage DB 145 stores the word values acquired by the value data acquisition unit 107, the word and category information obtained by the word category analysis unit 108, the statistical information on values obtained by the value calculation unit 109, and the like.

設定情報ＤＢ１４６は、ユーザの端末３０から匿名化を行う対象の期間や、配信条件などの設定情報を受信して、当該ユーザの識別情報（ユーザＩＤ）と対応付けて記憶する。 The setting information DB 146 receives, from the user's terminal 30, setting information such as the target period for anonymization and the distribution conditions, and stores it in association with the identification information (user ID) of that user.

また、図１中、検索エンジン７０は、インターネット等のネットワーク上に存在する情報の検索機能を提供するサイト（コンピュータ）である。即ち、検索エンジン７０は、ユーザ端末から検索するキーワードを受信すると、このキーワードを含むウェブページのＵＲＬ等のリストを検索結果として提供し、ユーザ端末に表示させる。 In FIG. 1, a search engine 70 is a site (computer) that provides a search function for information existing on a network such as the Internet. That is, when the search engine 70 receives a keyword to be searched from the user terminal, the search engine 70 provides a list such as a URL of a web page including the keyword as a search result, and displays the list on the user terminal.

また、検索エンジン７０は、この検索機能を利用し、検索結果にキーワードと連動した広告を表示させることや、キーワードに応じた広告料を支払ったスポンサーサイトへのリンクを表示させることも行う。このため、検索エンジン７０は、検索されたワード毎に、１日の検索回数（表示回数）、検索結果の広告がクリックされた回数（クリック数）、１クリック当たりの広告料（クリック単価）等をワードの統計情報として記憶する。 The search engine 70 also uses this search function to display advertisements linked to a keyword in the search results and to display links to sponsor sites that have paid an advertising fee according to the keyword. For this purpose, the search engine 70 stores, for each searched word, the number of searches per day (number of impressions), the number of times an advertisement in the search results was clicked (number of clicks), the advertising fee per click (cost per click), and the like as statistical information on the word.

また、これらの情報に基づき、検索エンジン７０は、クリック数を表示回数で除したクリック率や、１日のクリック数にクリック単価を乗じた値（１日の費用）、広告の申し込み時（広告オークション時）に提示した費用に応じた広告の掲載順位等も求める。 Based on this information, the search engine 70 also obtains the click rate (the number of clicks divided by the number of impressions), the value obtained by multiplying the number of clicks per day by the cost per click (cost per day), the ad position determined according to the cost offered when the advertisement was applied for (at the ad auction), and the like.
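As a worked illustration of these derived statistics, the following Python sketch computes the click rate and the cost per day for one word. It assumes the usual definition of click rate as clicks divided by impressions; the function names and the sample figures are hypothetical.

```python
def click_rate(clicks, impressions):
    """Click rate: clicks divided by impressions (0.0 when there are no impressions)."""
    return clicks / impressions if impressions else 0.0

def daily_cost(clicks_per_day, cost_per_click):
    """Cost per day: number of clicks per day multiplied by the cost per click."""
    return clicks_per_day * cost_per_click

# Hypothetical daily statistics for one word.
stats = {"impressions": 2000, "clicks": 50, "cost_per_click": 120}

rate = click_rate(stats["clicks"], stats["impressions"])
cost = daily_cost(stats["clicks"], stats["cost_per_click"])
print(rate, cost)  # 0.025 6000
```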

検索エンジン７０は、匿名化システム１０に対し、上記クリック数、表示回数、掲載順位、１日の費用、クリック率、クリック単価等の情報を提供するデータ出力部７１や、これらワードに関する情報を記憶する検索ワード蓄積ＤＢ７２、検索結果と共に配信する広告の情報を記憶する検索広告配信ＤＢ７３を備える。 The search engine 70 includes a data output unit 71 that provides the anonymization system 10 with information such as the number of clicks, the number of impressions, the ad position, the cost per day, the click rate, and the cost per click described above, a search word storage DB 72 that stores this information on words, and a search advertisement distribution DB 73 that stores information on the advertisements distributed together with search results.

図４は匿名化システム１０のハードウェア構成を示す図である。匿名化システム１０は、ＣＰＵ１、メモリ２、通信制御部３、記憶装置４、入出力インタフェース５を有する所謂コンピュータである。 FIG. 4 is a diagram illustrating a hardware configuration of the anonymization system 10. The anonymization system 10 is a so-called computer having a CPU 1, a memory 2, a communication control unit 3, a storage device 4, and an input / output interface 5.

ＣＰＵ１は、記憶装置４からプログラムを読み出し、メモリ２に展開して実行し、前述の抽象化部１０２、価値判定部１０５、検定部１０３、選択部１０４、匿名情報登録部１０６、データ取得部１０１、価値データ取得部１０７、ワードカテゴリ分析部１０８、ワード価値計算部１０９、データ出力部１２０、期間取得部１２２の機能を提供する。 The CPU 1 reads a program from the storage device 4, loads it into the memory 2, and executes it, thereby providing the functions of the above-described abstraction unit 102, value determination unit 105, test unit 103, selection unit 104, anonymous information registration unit 106, data acquisition unit 101, value data acquisition unit 107, word category analysis unit 108, word value calculation unit 109, data output unit 120, and period acquisition unit 122.

メモリ２は、主記憶装置ということもできる。メモリ２は、例えば、ＣＰＵ１が実行するプログラムや、通信制御部３を介して受信したデータ、記憶装置４から読み出したデータ、その他のデータ等を記憶する。 The memory 2 can also be called a main storage device. The memory 2 stores, for example, a program executed by the CPU 1, data received via the communication control unit 3, data read from the storage device 4, other data, and the like.

通信制御部３は、ネットワークを介して他の装置と接続し、当該装置との通信を制御する。入出力インタフェース５は、表示装置やプリンタ等の出力手段や、キーボードやポインティングデバイス等の入力手段、ドライブ装置等の入出力手段が適宜接続される。ドライブ装置は、着脱可能な記憶媒体の読み書き装置であり、例えば、フラッシュメモリカードの入出力装置、ＵＳＢメモリを接続するＵＳＢのアダプタ等である。また、着脱可能な記憶媒体は、例えば、ＣＤ（Compact Disc）、ＤＶＤ（Digital Versatile Disk）、ブルーレイディスク（Blu-ray(登録商標) Disc）等のディスク媒体であってもよい。ドライブ装置は、着脱可能な記憶媒体からプログラムを読み出し、記憶装置４に格納する。 The communication control unit 3 is connected to another device via a network and controls communication with the device. The input / output interface 5 is appropriately connected to output means such as a display device and a printer, input means such as a keyboard and pointing device, and input / output means such as a drive device. The drive device is a removable storage medium read / write device, such as an input / output device for a flash memory card, a USB adapter for connecting a USB memory, or the like. The removable storage medium may be a disk medium such as a CD (Compact Disc), a DVD (Digital Versatile Disk), or a Blu-ray (registered trademark) disc. The drive device reads the program from the removable storage medium and stores it in the storage device 4.

記憶装置４は、外部記憶装置ということもできる。記憶装置４としては、ＳＳＤ（Solid State Drive）やＨＤＤ等であってもよい。記憶装置４は、ドライブ装置との間で、データを授受する。例えば、記憶装置４は、ドライブ装置からインストールされる情報処理プログラム等を記憶する。また、記憶装置４は、プログラムを読み出し、メモリ２に引き渡す。本実施形態では、記憶装置４が前述の匿名化ＤＢ１４４や検索情報蓄積ＤＢ１４５を格納している。 The storage device 4 can also be called an external storage device. The storage device 4 may be an SSD (Solid State Drive), an HDD, or the like. The storage device 4 exchanges data with the drive device. For example, the storage device 4 stores an information processing program or the like installed from the drive device. The storage device 4 also reads out the program and passes it to the memory 2. In this embodiment, the storage device 4 stores the anonymization DB 144 and the search information storage DB 145 described above.

また、端末３０は、ＣＰＵ、メモリ、通信制御部、記憶装置、入出力インタフェースを有する所謂コンピュータである。 The terminal 30 is a so-called computer having a CPU, a memory, a communication control unit, a storage device, and an input / output interface.

端末３０のＣＰＵは、記憶装置からプログラムを読み出し、メモリに展開して実行し、前述の個人情報や設定情報を匿名化システム１０へ送信する機能や、匿名化システム１０から配信情報を受信し、表示等の出力を行う機能を提供する。 The CPU of the terminal 30 reads a program from the storage device, loads it into the memory, and executes it, thereby providing a function of transmitting the above-described personal information and setting information to the anonymization system 10 and a function of receiving distribution information from the anonymization system 10 and outputting it, for example by display.

図５は、匿名化システム１０が各店舗の端末３０から取得する個人情報の一例である。図５の例では、各店舗の端末３０から利用日時と、年齢、性別、購入商品等の情報を取得し、利用店舗名と共に記憶している。 FIG. 5 is an example of personal information that the anonymization system 10 acquires from the terminal 30 of each store. In the example of FIG. 5, information such as the use date and time, age, sex, purchased product, etc. is acquired from the terminal 30 of each store and stored together with the use store name.

利用日時は、消費者が当該店舗を利用した時刻を示している。なお、図５の例では、利用日時として、利用した日付と時刻を記録したが、日付が必須でない場合には、利用した時刻（利用時刻）のみとしても良い。 The use date and time indicates when the consumer used the store. In the example of FIG. 5, the date and the time of use are recorded as the use date and time, but when the date is not essential, only the time of use (use time) may be recorded.

図６は、対象期間の一例を示す図である。対象期間としては、例えば、１０分毎、１時間毎、１日毎、１週間毎のように所定の間隔で対象期間が設定される。また、１０：００〜１２：００、１２：００〜１４：００等のように所定の開始時刻と所定の終了時刻によって対象期間が設定される。更に、曜日、祝祭日、平日、休日、祝祭日、週末、月初、月末等によって対象期間が設定されても良い。なお、これらは、平日の１０：００〜１８：００、休日の１２：００〜２０：００、月末の３日間のように組み合わせて設定しても良い。 FIG. 6 is a diagram illustrating an example of the target period. The target period is set, for example, at predetermined intervals such as every 10 minutes, every hour, every day, or every week. The target period may also be set by a predetermined start time and a predetermined end time, such as 10:00 to 12:00 or 12:00 to 14:00. Furthermore, the target period may be set by day of the week, public holiday, weekday, day off, weekend, beginning of the month, end of the month, or the like. These may also be combined, such as 10:00 to 18:00 on weekdays, 12:00 to 20:00 on days off, or the last three days of the month.

図７は、配信条件の説明図である。図７の例では、配信条件として、配信パターンや、配信種別、配信タイミング、配信情報、配信先を有し、配信タイミングと配信情報に応じた６つの配信パターンを示している。配信種別は、配信タイミングが所定時間間隔で定期的に配信を行うこととなっている定期配信と、配信タイミングが所定情報の取得等のイベントを契機に配信を行うこととなっているイベント配信とを示す。図７の例では、配信パターン１，２が定期配信であり、配信パターン３−６がイベント配信である。 FIG. 7 is an explanatory diagram of the distribution conditions. In the example of FIG. 7, the distribution conditions comprise a distribution pattern, a distribution type, a distribution timing, distribution information, and a distribution destination, and six distribution patterns that differ in distribution timing and distribution information are shown. The distribution type indicates either periodic distribution, in which distribution is performed regularly at predetermined time intervals, or event distribution, in which distribution is triggered by an event such as the acquisition of predetermined information. In the example of FIG. 7, distribution patterns 1 and 2 are periodic distribution, and distribution patterns 3 to 6 are event distribution.

配信タイミングは、配信の契機を示す情報であり、所定時間間隔やイベント等が設定されている。所定時間間隔は、前述の対象期間と同様に１０分毎、１時間毎、１日毎、１週間毎のように所定の間隔が設定される。また、所定時間間隔は、１０：００〜１２：００、１２：００〜１４：００等のように所定の開始時刻と所定の終了時刻によって設定されても良い。更に、曜日、祝祭日、平日、休日、祝祭日、週末、月初、月末等によって所定時間間隔が設定されても良い。なお、これらは、平日の１０：００〜１８：００、休日の１２：００〜２０：００、月末の３日間のように組み合わせて設定しても良い。 The distribution timing is information indicating the trigger for distribution, and a predetermined time interval, an event, or the like is set in it. As with the target period described above, the predetermined time interval is set as an interval such as every 10 minutes, every hour, every day, or every week. The predetermined time interval may also be set by a predetermined start time and a predetermined end time, such as 10:00 to 12:00 or 12:00 to 14:00. Furthermore, the predetermined time interval may be set by day of the week, public holiday, weekday, day off, weekend, beginning of the month, end of the month, or the like. These may also be combined, such as 10:00 to 18:00 on weekdays, 12:00 to 20:00 on days off, or the last three days of the month.

また、配信の契機とするイベントは、例えば、所定の情報を取得した場合(パターン３)や、匿名情報が基準とする時点から所定値以上乖離した場合(パターン４)、匿名情報における所定の属性値（所定情報）の出現数や出現率が所定値に達した場合(パターン５)、匿名情報の順位が変動した場合(パターン６)などである。 Events that trigger distribution include, for example, the acquisition of predetermined information (pattern 3), the anonymous information deviating by a predetermined value or more from its state at a reference point in time (pattern 4), the number or rate of occurrences of a predetermined attribute value (predetermined information) in the anonymous information reaching a predetermined value (pattern 5), and a change in the ranking of the anonymous information (pattern 6).

配信パターン１の場合、配信タイミングに設定した時間間隔で、匿名情報や匿名化した情報の順位を配信情報として配信する。 In the case of distribution pattern 1, anonymous information or a ranking of the anonymized information is distributed as distribution information at the time interval set as the distribution timing.

配信パターン２の場合、匿名情報に基づく予測情報を求め、この予測情報を配信情報として配信タイミングに設定した時間間隔で配信する。 In the case of distribution pattern 2, prediction information based on the anonymous information is obtained, and this prediction information is distributed as distribution information at the time interval set as the distribution timing.

配信パターン３の場合、所定情報の取得時に、匿名情報や匿名化した情報の順位を配信情報として配信する。 In the case of distribution pattern 3, anonymous information or a ranking of the anonymized information is distributed as distribution information when the predetermined information is acquired.

配信パターン４の場合、配信タイミングを満たした場合に、配信情報配信タイミングを満たした旨の通知、即ち履歴情報との差分が所定値以上乖離した旨の通知や、匿名情報、匿名化した情報の順位を配信情報として配信する。 In the case of distribution pattern 4, when the distribution timing is satisfied, a notification to that effect, that is, a notification that the difference from the history information has reached a predetermined value or more, the anonymous information, or a ranking of the anonymized information is distributed as distribution information.

配信パターン５の場合、配信タイミングを満たした場合に、配信情報配信タイミングを満たした旨の通知、即ち匿名情報の出現数や出現率が所定値に達した旨の通知や、匿名情報、匿名化した情報の順位を配信情報として配信する。 In the case of distribution pattern 5, when the distribution timing is satisfied, a notification to that effect, that is, a notification that the number or rate of occurrences in the anonymous information has reached a predetermined value, the anonymous information, or a ranking of the anonymized information is distributed as distribution information.

配信パターン６の場合、配信タイミングを満たした場合に、配信情報配信タイミングを満たした旨の通知、即ち匿名情報の順位が変動した旨の通知や、匿名情報、匿名化した情報の順位を配信情報として配信する。 In the case of distribution pattern 6, when the distribution timing is satisfied, a notification to that effect, that is, a notification that the ranking of the anonymous information has changed, the anonymous information, or a ranking of the anonymized information is distributed as distribution information.

配信先は、端末のメールアドレスなど、配信情報の宛先を示す情報である。なお、配信手法は、電子メールに限らず、ショートメッセージサービスや、メッセンジャーソフト間のメッセージ等、電子的に情報を送信できるものであれば良い。 The distribution destination is information indicating a destination of distribution information such as a mail address of a terminal. Note that the delivery method is not limited to e-mail, and any method can be used as long as information can be transmitted electronically, such as a short message service or a message between messenger software.

なお、配信タイミングや配信情報の組み合わせは、この６パターンに限定されるものではなく、コンピュータによる判定が可能な配信タイミングや電子的に配信可能な配信情報であれば任意に採用できる。 The combination of distribution timing and distribution information is not limited to these six patterns, and any distribution timing that can be determined by a computer or distribution information that can be distributed electronically can be arbitrarily adopted.

図８は、対象期間及び配信条件を記憶した設定情報ＤＢ１４６の具体例を示す図である。図８の例では、設定情報ＤＢ１４６が、ユーザＩＤと、対象期間と、配信条件（配信パターン、配信タイミング、配信情報、配信先）とを対応付けて記憶している。 FIG. 8 is a diagram illustrating a specific example of the setting information DB 146 that stores the target period and distribution conditions. In the example of FIG. 8, the setting information DB 146 stores a user ID, a target period, and distribution conditions (distribution pattern, distribution timing, distribution information, distribution destination) in association with each other.

ユーザＩＤは、設定情報（対象期間や配信条件）毎に付した識別情報であり、本例では当該設定情報を設定したユーザの識別情報としている。なお、設定情報の識別情報は、ユーザＩＤに限らず、リクエスト毎に付した識別情報（リクエストＩＤ）や、シリアル番号等の任意の識別情報であっても良い。 The user ID is identification information given for each setting information (target period and distribution condition). In this example, the user ID is the identification information of the user who set the setting information. Note that the identification information of the setting information is not limited to the user ID, but may be arbitrary identification information such as identification information (request ID) attached to each request or a serial number.

対象期間は、１０分毎や１週間毎など、ユーザによって設定された任意の期間が記憶されている。 As the target period, an arbitrary period set by the user such as every 10 minutes or every week is stored.

配信条件は、以下のように配信パターン１〜６の例を示した。 Examples of the distribution conditions are shown below for distribution patterns 1 to 6.
配信パターン１の例では、１時間毎の配信タイミングで、２０代２５人、３０代４８人・・・のような匿名情報が、配信情報としてＢ店舗端末へ配信される。 In the example of distribution pattern 1, anonymous information such as “25 people in their 20s, 48 people in their 30s, ...” is distributed as distribution information to the store B terminal at an hourly distribution timing.

配信パターン２の例では、１５分毎に予測処理が行われ、来場者の推移等から１３時に駐車場が混雑することが予測された場合に、この予測情報が配信情報としてＣ店舗端末へ配信される。 In the example of distribution pattern 2, prediction processing is performed every 15 minutes, and when it is predicted from the trend in visitor numbers or the like that the parking lot will be crowded at 13:00, this prediction information is distributed as distribution information to the store C terminal.

配信パターン３の例では、商品Ａを販売したという所定情報が取得された場合に、商品Ａを購入した人の年代と性別の順位が配信情報としてＡ店舗の端末へ配信される。 In the example of the distribution pattern 3, when predetermined information indicating that the product A has been sold is acquired, the age and gender ranking of the person who purchased the product A is distributed as distribution information to the terminal of the store A.

配信パターン４の例では、先週の土曜日を基準として４０代女性が２０％減少した場合に、「先週の土曜日と比べて女性の来店者が減少しています。」といった通知が配信情報として全店舗の端末へ配信される。 In the example of distribution pattern 4, when the number of women in their 40s has decreased by 20% relative to last Saturday, a notification such as “The number of female visitors has decreased compared with last Saturday.” is distributed as distribution information to the terminals of all stores.

配信パターン５の例では、２０代のコーヒー購入者が１００人以上となった場合に、２０代女性２５人、２０代男性７６人・・・といった匿名情報が配信情報としてＡ店舗端末へ配信される。 In the example of distribution pattern 5, when the number of coffee purchasers in their 20s reaches 100 or more, anonymous information such as “25 women in their 20s, 76 men in their 20s, ...” is distributed as distribution information to the store A terminal.

配信パターン６の例では、年代と性別の順位が変動した場合に、１位２０代女性、２位３０代男性といった順位が配信情報としてＢ店舗端末へ配信される。 In the example of distribution pattern 6, when the ranking by age group and gender changes, a ranking such as “1st: women in their 20s, 2nd: men in their 30s” is distributed as distribution information to the store B terminal.
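One way to picture a record of the setting information DB 146 is as a small typed structure. The Python sketch below is an illustration only: the class and field names, the enum member names, and the user IDs and target periods in the sample records are assumptions, while the pattern numbers, timings, and destinations follow the examples given above for patterns 1 and 5.

```python
from dataclasses import dataclass
from enum import IntEnum

class Pattern(IntEnum):
    """Distribution patterns 1 to 6 (member names are hypothetical labels)."""
    PERIODIC_ANONYMOUS = 1
    PERIODIC_PREDICTION = 2
    ON_INFO_ACQUIRED = 3
    ON_DEVIATION = 4
    ON_THRESHOLD = 5
    ON_RANK_CHANGE = 6

@dataclass(frozen=True)
class Setting:
    user_id: str
    target_period: str
    pattern: Pattern
    timing: str
    info: str
    destination: str

    @property
    def is_periodic(self) -> bool:
        # Patterns 1 and 2 are periodic distribution; 3 to 6 are event distribution.
        return self.pattern in (Pattern.PERIODIC_ANONYMOUS, Pattern.PERIODIC_PREDICTION)

# User IDs and target periods below are made up for the example.
settings = [
    Setting("u001", "every 1 hour", Pattern.PERIODIC_ANONYMOUS,
            "every 1 hour", "anonymous information", "store B terminal"),
    Setting("u002", "every 10 minutes", Pattern.ON_THRESHOLD,
            "coffee purchasers in their 20s >= 100", "anonymous information", "store A terminal"),
]
print([s.is_periodic for s in settings])  # [True, False]
```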

〈匿名化方法〉 <Anonymization method>
図９−図１２は、匿名化システム１０が、匿名化プログラムに従って実行する匿名化方法の説明図であり、図９は、個人情報を受信して蓄積する処理を示す図、図１０は、所定時間間隔及び履歴を用いた配信タイミングで配信を行う場合の処理を示す図、図１１は、リアルタイムに配信を実行する処理を示す図、図１２は、匿名化処理を示す図である。 FIGS. 9 to 12 are explanatory diagrams of the anonymization method that the anonymization system 10 executes in accordance with the anonymization program. FIG. 9 shows the process of receiving and accumulating personal information, FIG. 10 shows the process for distributing at a distribution timing based on a predetermined time interval or on history, FIG. 11 shows the process for distributing in real time, and FIG. 12 shows the anonymization process.

匿名化システム１０は、図９に示す蓄積処理を定期的に起動させ、端末３０から個人情報を受信したか否かを確認する（ステップＳ１０１０）。ここで匿名化システム１０は、個人情報を受信していなければ（ステップＳ１０１０，Ｎｏ）、図９の処理を終了させ、個人情報を受信していれば（ステップＳ１０１０，Ｙｅｓ）、受信した個人情報を匿名化ＤＢ１４４に登録し（ステップＳ１０２０）、蓄積処理を終了する。このように匿名化システム１０は、図９の蓄積処理を繰り返し実行し、受信した個人情報を随時匿名化ＤＢ１４４に登録する。 The anonymization system 10 periodically starts the accumulation process shown in FIG. 9 and checks whether personal information has been received from the terminal 30 (step S1010). If no personal information has been received (step S1010, No), the anonymization system 10 ends the process of FIG. 9; if personal information has been received (step S1010, Yes), it registers the received personal information in the anonymization DB 144 (step S1020) and ends the accumulation process. In this way, the anonymization system 10 repeatedly executes the accumulation process of FIG. 9 and registers received personal information in the anonymization DB 144 as it arrives.

また、匿名化システム１０は、図１０に示す処理を定期的に起動させ、先ず、各ユーザの設定情報を設定情報ＤＢ１４６から読み出す（ステップＳ１０３０）。 Moreover, the anonymization system 10 starts the process shown in FIG. 10 periodically, and first reads the setting information of each user from the setting information DB 146 (step S1030).

次に、匿名化システム１０は、読み出した設定情報に処理を行っていない対象期間があるか否かを判定する（ステップＳ１０３５）。匿名化システム１０は、例えば、対象期間に該当する個人情報の取得を完了し、当該個人情報の匿名化を行っていない場合に、未処理の対象期間があると判定し、対象期間に該当する個人情報の取得が完了していない、及び当該個人情報の匿名化が完了しているものだけの場合に、未処理の対象期間が無いと判定する。未処理の対象期間が無ければ（ステップＳ１０３５，Ｎｏ）、匿名化システム１０は、図１０の処理を終了し、未処理の対象期間が有れば（ステップＳ１０３５，Ｙｅｓ）、当該対象期間と対応付けられた配信情報が予測情報か否か、即ち当該対象期間と対応付けられた配信パターンが２か否かを判定する（ステップＳ１０４０）。ここで配信パターン２と判定した場合（ステップＳ１０４０，Ｙｅｓ）、匿名化システム１０は、対象期間の個人情報を対象データとして求め、この対象データ等に基づいて予測処理を行い、予測結果を求める（ステップＳ１０４５）。 Next, the anonymization system 10 determines whether the read setting information contains a target period that has not yet been processed (step S1035). For example, the anonymization system 10 determines that there is an unprocessed target period when acquisition of the personal information for a target period has been completed but that personal information has not yet been anonymized, and determines that there is no unprocessed target period when every target period is either one for which acquisition of the personal information has not been completed or one for which anonymization has already been completed. If there is no unprocessed target period (step S1035, No), the anonymization system 10 ends the process of FIG. 10. If there is an unprocessed target period (step S1035, Yes), it determines whether the distribution information associated with that target period is prediction information, that is, whether the distribution pattern associated with that target period is pattern 2 (step S1040). If it determines that the pattern is distribution pattern 2 (step S1040, Yes), the anonymization system 10 obtains the personal information of the target period as target data, performs prediction processing based on this target data and the like, and obtains a prediction result (step S1045).

そして、匿名化システム１０は、当該対象期間に該当する個人情報を対象情報として匿名化する（ステップＳ１０５０）。例えば、予測情報が個人情報（対象データ）を含む場合には、この予測情報に含まれる個人情報を匿名化する。また、予測情報と共に配信する場合には、予測の対象とした対象データを匿名化する。 Then, the anonymization system 10 anonymizes the personal information that falls within the target period as the target information (step S1050). For example, when the prediction information includes personal information (target data), the personal information included in the prediction information is anonymized. When the target data is to be distributed together with the prediction information, the target data used for the prediction is anonymized.

一方、ステップＳ１０４０で配信パターン２では無いと判定した場合（ステップＳ１０４０，Ｎｏ）、予測処理を行わずに対象データの匿名化を行う（ステップＳ１０５０）。なお、匿名化の処理の詳細については、後述する。 On the other hand, if it is determined in step S1040 that the pattern is not distribution pattern 2 (step S1040, No), the target data is anonymized without performing prediction processing (step S1050). The anonymization process is described in detail later.

匿名化後、匿名化システム１０は、現在時刻が所定期間毎に定められた配信タイミングに該当するか否か、即ち配信パターン１，２の配信タイミングか否かを判定する（ステップＳ１０５５）。所定期間毎の配信タイミングに該当する場合（ステップＳ１０５５，Ｙｅｓ）、匿名化システム１０は、配信情報を配信先へ配信する（ステップＳ１０６０）。匿名化システム１０は、例えば、配信パターン１の場合、配信情報として順位や匿名情報を配信し、配信パターン２の場合、予測情報や匿名情報を配信する。 After anonymization, the anonymization system 10 determines whether or not the current time corresponds to the distribution timing determined for each predetermined period, that is, whether or not it is the distribution timing of the distribution patterns 1 and 2 (step S1055). When it corresponds to the delivery timing for every predetermined period (step S1055, Yes), the anonymization system 10 delivers delivery information to a delivery destination (step S1060). For example, in the case of distribution pattern 1, the anonymization system 10 distributes rank and anonymous information as distribution information, and in the case of distribution pattern 2, it distributes prediction information and anonymous information.

また、ステップＳ１０５５で、配信パターン１，２の配信タイミングでは無いと判定した場合（ステップＳ１０５５，Ｎｏ）、匿名化システム１０は、履歴情報を用いた配信タイミングか否か、即ち配信パターン４〜６か否かを判定する（ステップＳ１０６５）。ここで匿名化システム１０は、配信パターン４〜６でないと判定した場合には（ステップＳ１０６５，Ｎｏ）、図１０の処理を終了し、配信パターン４〜６であると判定した場合には（ステップＳ１０６５，Ｙｅｓ）、履歴情報（過去の匿名情報）を匿名化ＤＢ１４４から読み出し（ステップＳ１０７０）、匿名情報と比較して配信条件を満たしているか否かを判定する（ステップＳ１０７５）。 If it is determined in step S1055 that it is not the distribution timing of distribution patterns 1 and 2 (No in step S1055), the anonymization system 10 determines whether or not it is a distribution timing using history information, that is, distribution patterns 4-6. It is determined whether or not (step S1065). If the anonymization system 10 determines that the distribution patterns are not 4 to 6 (step S1065, No), the process of FIG. 10 is terminated, and if it is determined that the distribution patterns are 4 to 6 (step). (S1065, Yes), the history information (past anonymous information) is read from the anonymization DB 144 (step S1070), and compared with the anonymous information, it is determined whether the distribution condition is satisfied (step S1075).

配信条件を満たしていなければ（ステップＳ１０７５，Ｎｏ）、匿名化システム１０は、図１０の処理を終了させ、配信条件を満たしていれば（ステップＳ１０７５，Ｙｅｓ）、通知や順位、匿名情報を配信情報として配信する（ステップＳ１０８０）。 If the distribution condition is not satisfied (step S1075, No), the anonymization system 10 ends the process of FIG. 10, and if the distribution condition is satisfied (step S1075, Yes), distributes notification, ranking, and anonymous information. Distribute as information (step S1080).
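The branching of FIG. 10 for one unprocessed target period can be summarized as a short trace function. This Python sketch only mirrors the step order described above; the function name, the dictionary layout of a setting, and the two boolean inputs (which stand in for the real timing check and the comparison against history) are assumptions.

```python
def figure10_steps(setting, is_periodic_timing, condition_met):
    """Return the labels of the steps executed for one unprocessed target period."""
    steps = []
    if setting["pattern"] == 2:                 # S1040: is the distribution pattern 2?
        steps.append("S1045 predict")
    steps.append("S1050 anonymize")
    if is_periodic_timing:                      # S1055: periodic timing of patterns 1 and 2
        steps.append("S1060 distribute")
    elif setting["pattern"] in (4, 5, 6):       # S1065: timing that uses history
        steps.append("S1070 read history")
        if condition_met:                       # S1075: distribution condition satisfied?
            steps.append("S1080 distribute")
    return steps

print(figure10_steps({"pattern": 2}, True, False))
# ['S1045 predict', 'S1050 anonymize', 'S1060 distribute']
print(figure10_steps({"pattern": 4}, False, True))
# ['S1050 anonymize', 'S1070 read history', 'S1080 distribute']
```

Pattern 3 is not distributed by this flow; it is handled by the real-time process of FIG. 11.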

また、匿名化システム１０は、図１１に示すリアルタイム配信の処理を定期的に起動させ、先ず、各ユーザの設定情報を設定情報ＤＢ１４６から読み出す（ステップＳ１０８２）。また、匿名化システム１０は、リアルタイム配信を行う所定情報を取得したか否かを判定する（ステップＳ１０８５）。 Further, the anonymization system 10 periodically activates the real-time distribution process shown in FIG. 11 and first reads the setting information of each user from the setting information DB 146 (step S1082). Further, the anonymization system 10 determines whether or not predetermined information for performing real-time distribution has been acquired (step S1085).

匿名化システム１０は、所定情報を取得していない場合（ステップＳ１０８５，Ｎｏ）、図１１の処理を終了し、所定情報を取得した場合（ステップＳ１０８５，Ｙｅｓ）、対象データの匿名化処理を行う（ステップＳ１０９０）。 When the anonymization system 10 has not acquired the predetermined information (step S1085, No), the process of FIG. 11 is terminated, and when the predetermined information is acquired (step S1085, Yes), the target data is anonymized. (Step S1090).

匿名化後、匿名化システム１０は、匿名情報をリアルタイムに配信先の端末３０へ配信する（ステップＳ１０９５）。 After anonymization, the anonymization system 10 distributes anonymous information to the delivery destination terminal 30 in real time (step S1095).

図１２は、ステップＳ１０５０における匿名化の処理の説明図である。匿名化システム１０は、ステップＳ１０３５で未処理と判定した対象期間に該当する個人情報を匿名化ＤＢ１４４から読み出して取得する（ステップＳ１１３０）。例えば、１０分毎、１時間毎、１日毎のように所定の間隔で対象期間が設定されている場合には、前回の匿名化の処理から当該間隔毎に対象データを読み出す。また、８：００〜１３：００、１３：００〜１８：００等のように所定の開始時刻と所定の終了時刻によって対象期間が設定されている場合には、当該期間に取得された個人情報を対象データとして読み出す。 FIG. 12 is an explanatory diagram of the anonymization process in step S1050. The anonymization system 10 reads from the anonymization DB 144 the personal information that falls within the target period determined to be unprocessed in step S1035 (step S1130). For example, when the target period is set at predetermined intervals such as every 10 minutes, every hour, or every day, the target data is read for each such interval since the previous anonymization process. When the target period is set by a predetermined start time and a predetermined end time, such as 8:00 to 13:00 or 13:00 to 18:00, the personal information acquired during that period is read as target data.

Next, for each word in the target data, the anonymization system 10 determines whether value data exists in the search information storage DB 145 (step S1140). If value data exists for every word, the process moves to step S1160 (step S1140, Yes). If some value data is missing (step S1140, No), the value data for those words is acquired from an external device, in this example the search engine 70 (step S1150). Value data for the remaining words, that is, the words already present in the search information storage DB 145, is acquired from the search information storage DB 145 (step S1160).
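The lookup in steps S1140 to S1160 can be sketched as follows. This is only an illustration under stated assumptions: `collect_value_data`, `cache`, and `fetch_external` are hypothetical names, the dictionary stands in for the search information storage DB 145, and writing a fetched value back into the store is an assumption that the text does not state.

```python
from typing import Callable, Dict, Iterable


def collect_value_data(
    words: Iterable[str],
    cache: Dict[str, float],
    fetch_external: Callable[[str], float],
) -> Dict[str, float]:
    """Return value data for every word, fetching only the missing ones."""
    values: Dict[str, float] = {}
    for word in words:
        if word not in cache:                    # S1140, No: value data is missing
            cache[word] = fetch_external(word)   # S1150: ask the external source
        values[word] = cache[word]               # S1160: read from the store
    return values


# Usage with a stand-in for the external lookup (prices are made up).
calls = []


def fake_fetch(word: str) -> float:
    calls.append(word)
    return 10.0


cache = {"teens": 25.0}
result = collect_value_data(["teens", "twenties"], cache, fake_fetch)
```

Only the word missing from the store triggers an external request, which is the point of checking the store first.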

To satisfy anonymity, the anonymization system 10 also creates abstraction candidate data by replacing items of the target data with abstracted words (categories) (step S1170). When several items can be abstracted, it creates every pattern of abstracting or not abstracting each item. For example, if the target data contains three items A, B, and C, all of which can be abstracted into A', B', and C', seven candidate patterns can be created as shown in FIG. 13, such as A', B, C when only item A is abstracted and A', B', C when items A and B are abstracted. Candidate patterns that omit some of the items A, B, and C may also be created, for example: A, B; A', B; A, B'; A', B'; B, C; B', C; B, C'; B', C'; A, C; A', C; A, C'; A', C'. In this case, items that must not be omitted (essential items) may be set in advance, and candidate patterns may be created by omitting only items other than the essential ones. The anonymization system 10 may also set the number of items in a candidate pattern according to the length of the target period acquired in step S1125. For example, a range for the number of items may be defined for each target period and stored in the storage unit, such as 1 to 3 items for a 10-minute period, 2 to 5 items for a 1-hour period, and 5 to 8 items for a period of one day or longer. The range corresponding to the target period acquired in step S1125 is then read out, and candidate patterns are created within that range.
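A minimal sketch of the pattern enumeration for the three-item example, assuming the seven patterns are the 2^3 = 8 keep-or-abstract combinations minus the unabstracted original (FIG. 13 is not reproduced here, so this reading is an assumption). The function name `candidate_patterns` is hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def candidate_patterns(abstractions: Dict[str, str]) -> List[Tuple[str, ...]]:
    """Enumerate every keep/abstract choice per item, except the all-original case."""
    items = list(abstractions)
    # For each item, the two choices are (original word, abstracted word).
    choices = [(item, abstractions[item]) for item in items]
    original = tuple(items)
    return [combo for combo in product(*choices) if combo != original]


patterns = candidate_patterns({"A": "A'", "B": "B'", "C": "C'"})
for pattern in patterns:
    print(pattern)
```

Patterns that omit items, essential items, and the per-period item-count range described in the text are left out of this sketch.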

Next, the anonymization system 10 calculates the value of each pattern of abstraction candidate data from the value data of the words it contains (step S1180), and determines the order of testing based on these values (step S1190), for example in descending order of value. Testing every candidate pattern is desirable, but candidates whose value is too low may be dropped from the order. For example, candidates ranked below a predetermined position, or falling below a predetermined proportion such as the lower half, may be dropped. Candidates whose value is less than a predetermined ratio of the value of the target data may also be dropped. This reduces the number of tests and shortens the processing time.

Following this order, the anonymization system 10 tests the anonymity of the abstraction candidate data (step S1200). For example, to test k-anonymity, it counts how many times each combination of values of different items associated with one individual occurs in the abstraction candidate data (the occurrence count). Alternatively, to test l-diversity, it counts how many times each combination of values of the same item associated with one individual occurs in the abstraction candidate data. The smallest of these occurrence counts is taken as the minimum occurrence count (k value / l value) (step S1210), and the system determines whether it exceeds 1 (step S1220). If the k value exceeds 1, k-anonymity is satisfied; if it is 1, k-anonymity is not satisfied. Likewise, if the l value exceeds 1, l-diversity is satisfied; if it is 1, it is not.
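The k-anonymity branch of this test can be illustrated with a short sketch. It assumes each record is a tuple of item values and that the record list is non-empty; `minimum_occurrence` and `satisfies_k_anonymity` are hypothetical names. The text's condition "exceeds 1" corresponds to `k=2` here.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple

Record = Tuple[Hashable, ...]


def minimum_occurrence(records: Sequence[Record]) -> int:
    """Smallest number of times any combination of item values occurs (S1210)."""
    counts = Counter(records)
    return min(counts.values())


def satisfies_k_anonymity(records: Sequence[Record], k: int = 2) -> bool:
    """True when every combination occurs at least k times (S1220 with k = 2)."""
    return minimum_occurrence(records) >= k


raw = [("18", "F"), ("19", "F"), ("19", "F")]
abstracted = [("teens", "F"), ("teens", "F"), ("teens", "F")]
print(satisfies_k_anonymity(raw), satisfies_k_anonymity(abstracted))
```

In the raw data the combination ("18", "F") occurs once and so identifies one person; after ages are abstracted to "teens" every combination occurs three times.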

If the minimum occurrence count (k value / l value) does not exceed 1 (step S1220, No), the anonymization system 10 further abstracts the value of at least one item in the abstraction candidate data, that is, replaces it with a more abstract word (step S1230), and returns to step S1200.

If the minimum occurrence count (k value / l value) exceeds 1 (step S1220, Yes), the anonymization system 10 calculates the difference between the value of the abstraction candidate data and the value of the original target data (step S1240). It then adopts this difference, or a value derived from it, such as the ratio of the difference to the value of the target data or the ratio of the value of the abstraction candidate data to the value of the target data, as the value of the abstraction candidate data (step S1250).

The anonymization system 10 then determines whether any candidate pattern remains untested (step S1260). If so (step S1260, Yes), it identifies the next abstraction candidate data according to the order determined in step S1190 (step S1270) and returns to step S1200 to test it.

The test is repeated in this way for each pattern of abstraction candidate data. When no candidate pattern remains (step S1260, No), the anonymization system 10 selects the abstraction candidate data to adopt based on the values determined for each candidate in step S1250 (step S1280), registers the selected data in the anonymization DB 144 as anonymous information associated with the target period acquired in step S1125 (step S1290), and ends the anonymization process.

In this selection, for example, the abstraction candidate data with the highest value among all candidate patterns is chosen. Alternatively, the anonymization system 10 may output several abstraction candidates in descending order of value, let the operator designate the one judged appropriate, and select the designated candidate.

Next, the value of data in this embodiment is described with reference to FIGS. 14 to 21. FIG. 14 shows part of the age item in the target data. As shown in FIG. 14, the target data holds a head count c_i for each age s_i. For example, there are 30 people (c_1) aged 18 (s_1) and 10 people (c_2) aged 19 (s_2).

FIG. 15 shows an example of the value data acquired for each age s_i. The value data in FIG. 15 holds an SEM unit price e_i for each age s_i.

The value of an age s_i is the SEM unit price e_i multiplied by the head count c_i, as given by Formula 1.

s_i = c_i × e_i ... (Formula 1)

As shown in FIG. 16, the value of the age item S(e) is the sum of the values of the individual ages s_i, as given by Formula 2, where n is 5 in FIG. 16. The value of the age item S(e) is therefore 2446 yen, as shown in FIG. 17. The value of the target data is the sum of the values of all of its items.
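The item valuation can be sketched in a few lines: each word's value is head count times unit price (Formula 1), and the item value is the sum over the item's words, as the prose describes for Formula 2. The head counts below are the two given for FIG. 14; the unit prices are invented placeholders because the FIG. 15 values do not appear in the text, so the result does not reproduce the 2446 yen total. The name `item_value` is hypothetical.

```python
from typing import Dict


def item_value(counts: Dict[str, int], unit_prices: Dict[str, float]) -> float:
    """Sum of count x unit price over every word of one item."""
    return sum(counts[word] * unit_prices[word] for word in counts)


counts = {"18": 30, "19": 10}               # head counts quoted for FIG. 14
unit_prices = {"18": 20.0, "19": 15.0}      # hypothetical SEM unit prices (yen)
partial_value = item_value(counts, unit_prices)  # 30*20 + 10*15
```

The same function applies unchanged to the abstracted item, with age groups as the words.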

FIG. 18 shows part of the age item in the abstraction candidate data. As shown in FIG. 18, the abstraction candidate data holds a head count c_i for each age group k_i. For example, there are 40 people (c_1) in their teens (k_1) and 22 people (c_2) in their 20s (k_2).

FIG. 19 shows an example of the per-word value data acquired for each age group k_i. The value data in FIG. 19 holds an SEM unit price e_i for each age group k_i.

The value of an age group k_i is the SEM unit price e_i multiplied by the head count c_i, as given by Formula 3.

k_i = c_i × e_i ... (Formula 3)

As shown in FIG. 20, the value of the age-group item S(k) is the sum of the values of the individual age groups k_i, as given by Formula 4, where n is 2 in FIG. 20. The value of the age-group item S(k) is therefore 2134 yen, as shown in FIG. 21. In other words, abstracting the age item into age groups reduced its value by 312 yen (2446 - 2134). The value of the abstraction candidate data is the sum of the values of all of its items.

As the value of the abstraction candidate data determined in step S1250, an impairment rate M(k) is calculated, for example as in Formula 5, by dividing the value of the abstraction candidate data by the sum of the value of the abstraction candidate data and the value of the target data.

M(k) = S(k) / (S(k) + S(e)) ... (Formula 5)

In this way, the anonymization system 10 of the first embodiment evaluates each abstraction candidate based on the value of its abstracted words. This allows the value of each candidate to be evaluated accurately, and a candidate that retains high value after abstraction to be selected.
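As a worked check of the figures quoted above, the sketch below computes the value loss and Formula 5 for S(e) = 2446 yen and S(k) = 2134 yen. The function names are hypothetical.

```python
def value_loss(target_value: float, candidate_value: float) -> float:
    """Value lost by abstraction: target value minus candidate value."""
    return target_value - candidate_value


def impairment_rate(candidate_value: float, target_value: float) -> float:
    """Formula 5: M(k) = S(k) / (S(k) + S(e))."""
    return candidate_value / (candidate_value + target_value)


S_e = 2446  # value of the age item before abstraction (yen)
S_k = 2134  # value of the age-group item after abstraction (yen)

loss = value_loss(S_e, S_k)
rate = impairment_rate(S_k, S_e)
print(loss, round(rate, 4))
```

By construction, Formula 5 yields 0.5 when abstraction loses no value and moves toward 0 as more value is lost, so under this definition a larger M(k) indicates a better candidate.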

Because the anonymization system 10 of the first embodiment anonymizes the personal information within the target period designated by the user, the degree of abstraction needed for anonymization varies with the length of the target period. For example, when the target period is short, such as 10 minutes, there are few personal information records, and a higher degree of abstraction is needed to satisfy k-anonymity or l-diversity. When the target period is relatively long, such as one day, there are many records, and k-anonymity or l-diversity can be satisfied even with a low degree of abstraction. Even when the degree of abstraction changes with the user-designated target period in this way, the anonymization system 10 of the first embodiment can select an abstraction candidate based on its value after abstraction, and can therefore perform anonymization appropriate to both the target period and the post-abstraction value.

<Embodiment 2>
FIG. 22 is a functional block diagram of the anonymization system 100 according to the second embodiment. The anonymization system 100 anonymizes personal information that each business operator collects from visitors at an exhibition where multiple operators exhibit. It includes an anonymization device 50 for each operator and a management server 20 that manages the anonymous information produced by the operators. In the second embodiment, elements identical to those of the first embodiment are given the same reference numerals and their description is partly omitted.

In the anonymization system 100 of the second embodiment, the management server 20 acquires an anonymization dictionary from each operator's anonymization device 50, integrates these dictionaries into an integrated anonymization dictionary, and distributes it to each operator's anonymization device 50. The management server 20 also receives setting information from each operator's anonymization device 50 and distributes the received setting information to the operators' anonymization devices 50. Each operator's anonymization device 50 then acquires the target data that falls within the target period in the setting information, anonymizes it using the integrated anonymization dictionary shared by all operators, and distributes the resulting anonymous information according to the distribution conditions. This lets other operators use, under the conditions they want, the anonymous information that each operator has anonymized.

As shown in FIG. 22, the management server 20 includes a dictionary acquisition unit 201, an integration unit 202, a priority determination unit 203, a dictionary management unit 204, an anonymous information registration unit 205, an anonymous information control unit 206, a dimension selection unit 207, a setting information management unit 208, a distribution unit 209, a dictionary DB 231, a priority DB 232, a common DB 233, and a setting information DB 236. The management server 20 of the second embodiment therefore also serves as a dictionary creation device comprising the dictionary acquisition unit 201, the integration unit 202, the priority determination unit 203, and the dimension selection unit 207.

The dictionary acquisition unit 201 acquires, from the operators' anonymization devices 50, a plurality of anonymization dictionaries, each of which stores words in association with their abstracted words so that words in the target data can be anonymized by replacing them with the abstracted words. In this embodiment, the dictionary acquisition unit 201 receives the anonymization dictionary transmitted from each operator's anonymization device 50 and registers it in the dictionary DB 231.

The integration unit 202 integrates the anonymization dictionaries acquired from the operators' anonymization devices 50 to create an integrated anonymization dictionary. For example, based on the correspondences between words in the dictionaries, the integration unit 202 treats an abstracted word as upper and the word before abstraction as lower, and links each word to its upper and lower words found in any of the dictionaries. For each top-level word, that is, a word with no corresponding upper word, it generates a dimension: a tree of linked words running from that root down to the lowest-level words, which have no corresponding lower word. These dimensions are stored in the dictionary DB 231 as the integrated anonymization dictionary. The tree-shaped dimensions rooted at the top-level words together constitute the integrated anonymization dictionary.
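One way to picture the integration is the sketch below, which merges several lower-to-upper word maps and returns one nested tree per root word. It is an illustration only: `integrate` is a hypothetical name, each dictionary is assumed to be a plain mapping from a lower word to its upper word, the merged relation is assumed to contain no cycles, and the sample words are invented.

```python
from typing import Dict, List, Set


def integrate(dictionaries: List[Dict[str, str]]) -> Dict[str, dict]:
    """Merge lower->upper maps and return {root word: nested subtree}."""
    children: Dict[str, Set[str]] = {}
    has_upper: Set[str] = set()
    for dictionary in dictionaries:
        for lower, upper in dictionary.items():
            children.setdefault(upper, set()).add(lower)
            children.setdefault(lower, set())
            has_upper.add(lower)

    def subtree(word: str) -> dict:
        # Recurse down to the lowest words, which have no children.
        return {child: subtree(child) for child in sorted(children[word])}

    # A root is a word that never appears as a lower word.
    roots = sorted(word for word in children if word not in has_upper)
    return {root: subtree(root) for root in roots}


operator_a = {"18": "teens", "19": "teens"}
operator_b = {"teens": "young", "section chief": "manager"}
dimensions = integrate([operator_a, operator_b])
```

Because operator B's dictionary supplies an upper word for "teens", the two dictionaries combine into a single three-level dimension rooted at "young", alongside a separate dimension rooted at "manager".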

The priority determination unit 203 determines a priority for each dimension of the integrated anonymization dictionary based on the words it contains. For example, it determines the priority from at least one of the number of words in the dimension, the number of upper/lower levels among those words, and the value of those words. Predetermined values for the words are stored, for example, in the priority DB 232, and the priority determination unit 203 refers to the priority DB 232 to determine the priority.

The dimension selection unit 207 selects, based on the priority, which of the dimensions generated by the integration unit 202 are adopted into the integrated anonymization dictionary and which are not.
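Priority-based selection of dimensions could be sketched as below. The scoring rule (word count plus level count, equally weighted) and the `top_n` cutoff are assumptions chosen for illustration; the text only says that the priority uses at least one of word count, level count, and word value. All names and sample words are hypothetical, and each dimension is represented as a nested dictionary.

```python
from typing import Dict, List


def word_count(tree: dict) -> int:
    """Number of words in a nested tree, counting every node."""
    return sum(1 + word_count(sub) for sub in tree.values())


def level_count(tree: dict) -> int:
    """Number of upper/lower levels in a nested tree."""
    if not tree:
        return 0
    return 1 + max(level_count(sub) for sub in tree.values())


def priority(dimension: dict) -> int:
    """Assumed scoring rule: more words and more levels rank higher."""
    return word_count(dimension) + level_count(dimension)


def select_dimensions(dimensions: Dict[str, dict], top_n: int) -> List[str]:
    """Return the root words of the top_n dimensions by priority."""
    ranked = sorted(
        dimensions,
        key=lambda root: priority({root: dimensions[root]}),
        reverse=True,
    )
    return ranked[:top_n]


dimensions = {
    "young": {"teens": {"18": {}, "19": {}}},
    "manager": {"section chief": {}},
}
adopted = select_dimensions(dimensions, top_n=1)
```

With these sample trees the "young" dimension has four words over three levels and outranks "manager", which has two words over two levels.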

The dictionary management unit 204 manages the integrated anonymization dictionary created by the integration unit 202. For example, it reads the integrated anonymization dictionary from the dictionary DB 231 and distributes it to each operator's anonymization device 50.

The anonymous information registration unit 205 acquires anonymous information from each operator's anonymization device 50 and registers it in the common DB 233.

The anonymous information control unit 206 controls output and related processing of the anonymous information registered in the common DB 233. For example, on receiving a request for anonymous information from an information processing device such as an anonymization device 50, it delivers the corresponding anonymous information to the requesting device. In the second embodiment, the anonymous information control unit 206 is one form of output unit.

The setting information management unit 208 receives setting information from each operator's anonymization device 50, stores it in the setting information DB 236, and distributes the received setting information (target period) to the other operators' anonymization devices 50. The data stored in the setting information DB 236 is structured in substantially the same way as in FIG. 8 and includes a user ID, target period, and distribution conditions (distribution pattern, distribution timing, distribution information, distribution destination); the specific distribution conditions are set as appropriate for the personal information being processed. Once the management server 20 has distributed a target period to the other operators, it no longer needs to retain it, so the target period need not be stored in the setting information DB 236.

The distribution unit 209 distributes distribution information, such as anonymous information, to destinations such as the operators' anonymization devices 50 according to the distribution conditions in the setting information DB 236.

FIG. 23 shows an example of the dictionary DB 231. The dictionary DB 231 stores each word before abstraction (hereinafter also called a lower word) in association with the word after abstraction (hereinafter also called an upper word).

FIG. 24 shows an example of the priority DB 232. The priority DB 232 stores, for each word, the values used to determine priority. In the example of FIG. 24, each word has values used in SEM, such as clicks per day, impressions per day, number of participating companies, cost per day, click-through rate, and SEM price (acquisition price).

FIG. 25 shows an example of the common DB 233. The common DB 233 stores the anonymous information that each operator's anonymization device 50 produced using the integrated anonymization dictionary. In the example of FIG. 25, it stores items such as booth visited, age, gender, affiliated company, job title, products of interest, and status. Which items are stored and how far each is abstracted depend on the integrated anonymization dictionary, the test results, and so on, as described later.

As shown in FIG. 22, each operator's anonymization device 50 includes a data acquisition unit 101, an abstraction unit 102, a test unit 103, a selection unit 104, an anonymous information registration unit 106, a value data acquisition unit 107, a word category analysis unit 108, a word value calculation unit 109, an output control unit 110, a setting information registration unit 121, a period acquisition unit 122, a prediction unit 123, a personal information DB 131, a search information storage DB 145, and a setting information DB 146.

The data acquisition unit 101 acquires data containing multiple items associated with an individual, that is, personal information, and takes the personal information that falls within the target period as the target data. For example, personal information written on questionnaires by visitors or obtained from them verbally is entered via a keyboard or similar device and stored in the personal information DB 131; the data acquisition unit 101 then reads from the personal information DB 131 the personal information that falls within a target period such as 10 minutes, 1 hour, or 1 day. The data acquisition unit 101 may also read the content of visitors' business cards or questionnaires by OCR (Optical Character Recognition) and store it as personal information, or acquire and store a visitor's personal information from the visitor's RF-ID tag, IC chip, or the like.

The abstraction unit 102 refers to the integrated anonymization dictionary composed of the dimensions and generates abstraction candidate data by replacing words that are item values in the target data with abstracted words chosen based on the priority.

The test unit 103 tests the abstraction candidate data on the condition that no combination of its item values is limited to a single individual in the target data. For example, it tests whether the combinations of item values in the abstraction candidate data satisfy k-anonymity or l-diversity.

The output control unit 110 outputs abstraction candidate data that satisfies the test condition as anonymous information. For example, it transmits the anonymous information to the management server 20.

The setting information registration unit 121 stores setting information entered by the operator (user) in the setting information DB 146 and transmits it to the management server 20. It also receives other operators' setting information (target periods) distributed from the management server 20 and registers it in the setting information DB 146. The setting information (target period) stored in the setting information DB 146 is the same as in the example of FIG. 6.

The period acquisition unit 122 acquires the period to be anonymized. For example, it acquires the target period by reading it from the setting information DB 146.

The prediction unit 123 derives a prediction result from the acquired personal information and other data. For example, it predicts an attribute value after a predetermined time based on how that attribute value has changed within a predetermined period. A warning may also be issued as the prediction result when the attribute value after the predetermined time exceeds a threshold.
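The text does not specify how the prediction is computed. As one possible illustration, the sketch below extrapolates linearly from the first and last observations in the period and raises a warning flag when the predicted value exceeds a threshold. The linear model, the function names, and the sample numbers are all assumptions.

```python
from typing import Sequence


def predict(values: Sequence[float], steps_ahead: int) -> float:
    """Extrapolate linearly from the average change per step in the period."""
    if len(values) < 2:
        raise ValueError("at least two observations are needed")
    slope = (values[-1] - values[0]) / (len(values) - 1)
    return values[-1] + slope * steps_ahead


def should_warn(values: Sequence[float], steps_ahead: int, threshold: float) -> bool:
    """True when the predicted value exceeds the threshold."""
    return predict(values, steps_ahead) > threshold


visitors_per_interval = [10.0, 14.0, 18.0]   # hypothetical attribute values
forecast = predict(visitors_per_interval, steps_ahead=2)
warning = should_warn(visitors_per_interval, steps_ahead=2, threshold=25.0)
```

A real deployment would likely need a model suited to the attribute being tracked; the sketch only shows where the trend and the threshold enter.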

FIG. 26 shows an example of the personal information DB 131. The personal information DB 131 stores the personal information acquired by the data acquisition unit 101. In the example of FIG. 26, it stores name, e-mail address, company name, job title, interests, status, and so on.

FIG. 27 shows the hardware configuration of the management server 20. The management server 20 is a computer having a CPU 21, a memory 22, a communication control unit 23, a storage device 24, and an input/output interface 25.

The CPU 21 executes programs loaded into the memory 22 in executable form and thereby provides the functions of the dictionary acquisition unit 201, integration unit 202, priority determination unit 203, dictionary management unit 204, anonymous information registration unit 205, anonymous information control unit 206, dimension selection unit 207, setting information management unit 208, and distribution unit 209 described above.

The memory 22 can also be called a main storage device. It stores, for example, programs executed by the CPU 21, data received via the communication control unit 23, data read from the storage device 24, and other data.

The communication control unit 23 connects to other devices via a network and controls communication with them. Output means such as a display or printer, input means such as a keyboard or pointing device, and input/output means such as a drive device are connected to the input/output interface 25 as needed. The drive device reads and writes removable storage media and is, for example, a flash memory card reader/writer or a USB adapter for connecting USB memory. The removable storage medium may also be a disc medium such as a CD (Compact Disc), DVD (Digital Versatile Disk), or Blu-ray Disc. The drive device reads a program from the removable storage medium and stores it in the storage device 24.

The storage device 24 can also be called an external storage device. The storage device 24 may be an SSD (Solid State Drive), an HDD, or the like. The storage device 24 exchanges data with the drive device. For example, the storage device 24 stores the information processing program and the like installed from the drive device. The storage device 24 also reads out the program and passes it to the memory 22. In this embodiment, the storage device 24 stores the dictionary DB 231, the priority DB 232, and the common DB 233 described above.

図２８は匿名化装置５０のハードウェア構成を示す図である。匿名化装置５０は、ＣＰＵ５１、メモリ５２、通信制御部５３、記憶装置５４、入出力インタフェース５５を有する所謂コンピュータである。 FIG. 28 is a diagram illustrating a hardware configuration of the anonymization device 50. The anonymization device 50 is a so-called computer having a CPU 51, a memory 52, a communication control unit 53, a storage device 54, and an input / output interface 55.

The CPU 51 executes the program loaded in executable form into the memory 52 and thereby provides the functions of the data acquisition unit 101, the abstraction unit 102, the test unit 103, the selection unit 104, the anonymous information registration unit 106, the value data acquisition unit 107, the word category analysis unit 108, the word value calculation unit 109, the output control unit 110, the setting information registration unit 121, the period acquisition unit 122, and the prediction unit 123 described above.

メモリ５２は、主記憶装置ということもできる。メモリ５２は、例えば、ＣＰＵ５１が実行するプログラムや、通信制御部５３を介して受信したデータ、記憶装置５４から読み出したデータ、その他のデータ等を記憶する。 The memory 52 can also be called a main storage device. The memory 52 stores, for example, a program executed by the CPU 51, data received via the communication control unit 53, data read from the storage device 54, other data, and the like.

通信制御部５３は、ネットワークを介して他の装置と接続し、当該装置との通信を制御する。入出力インタフェース５５は、表示装置やプリンタ等の出力手段や、キーボードやポインティングデバイス等の入力手段、ドライブ装置等の入出力手段が適宜接続される。ドライブ装置は、着脱可能な記憶媒体の読み書き装置であり、例えば、フラッシュメモリカードの入出力装置、ＵＳＢメモリを接続するＵＳＢのアダプタ等である。また、着脱可能な記憶媒体は、例えば、ＣＤ（Compact Disc）、ＤＶＤ（Digital Versatile Disk）、ブルーレイディスク（Blu-ray Disc）等のディスク媒体であってもよい。ドライブ装置は、着脱可能な記憶媒体からプログラムを読み出し、記憶装置５４に格納する。 The communication control unit 53 is connected to another device via a network and controls communication with the device. The input / output interface 55 is appropriately connected to output means such as a display device and a printer, input means such as a keyboard and pointing device, and input / output means such as a drive device. The drive device is a removable storage medium read / write device, such as an input / output device for a flash memory card, a USB adapter for connecting a USB memory, or the like. The removable storage medium may be a disk medium such as a CD (Compact Disc), a DVD (Digital Versatile Disk), or a Blu-ray Disc. The drive device reads the program from the removable storage medium and stores it in the storage device 54.

The storage device 54 can also be called an external storage device. The storage device 54 may be an SSD (Solid State Drive), an HDD, or the like. The storage device 54 exchanges data with the drive device. For example, the storage device 54 stores programs and the like installed from the drive device. The storage device 54 also reads out the program and passes it to the memory 52. In this embodiment, the storage device 54 stores the personal information DB 131 described above.

§3. Anonymization method
Next, the anonymization method of the second embodiment is described with reference to FIGS. 29 to 32.

(3-1) Processing based on setting information
The anonymization device 50 periodically starts the process shown in FIG. 29 and first reads the setting information (target periods) from the setting information DB 146 (step S1030).

Next, the anonymization device 50 determines whether any of the read target periods is unprocessed (step S1035). If there is no unprocessed target period (step S1035, No), the process of FIG. 29 ends. If there is an unprocessed target period (step S1035, Yes), the device determines whether the distribution information associated with that target period is prediction information, that is, whether the distribution pattern associated with the target period is 2 (step S1040). If distribution pattern 2 is determined here (step S1040, Yes), the anonymization system 10 obtains the personal information of the target period as target data, performs prediction processing based on the target data and the like, and obtains a prediction result (step S1045).

当該対象期間に該当する個人情報を対象情報として匿名化ＤＢ１４４から読み出す（ステップＳ１０４０）。そして、匿名化システム１０は、予測情報や対象データを匿名化する（ステップＳ１０５０）。例えば、予測情報が個人情報（対象データ）を含む場合には、この予測情報に含まれる個人情報を匿名化する。また、予測情報と共に配信する場合には、予測の対象とした対象データを匿名化する。 The personal information corresponding to the target period is read out from the anonymization DB 144 as target information (step S1040). And the anonymization system 10 anonymizes prediction information and object data (step S1050). For example, when the prediction information includes personal information (target data), the personal information included in the prediction information is anonymized. Moreover, when delivering with prediction information, the object data made into the object of prediction are anonymized.

After anonymization, the anonymization device 50 transmits the anonymous information to the management server 20 (step S1052). When prediction information has been generated in step S45 separately from the anonymous information, the anonymization device 50 transmits the prediction information to the management server 20 together with the anonymous information. On receiving the anonymous information and the prediction information, the management server 20 stores them in the common DB 233.

また、匿名化装置５０は、図３０に示すリアルタイム配信の処理を定期的に起動させ、先ず、各ユーザの設定情報を設定情報ＤＢ１４６から読み出す（ステップＳ１０８２）。また、匿名化装置５０は、リアルタイム配信を行う所定情報を取得したか否かを判定する（ステップＳ１０８５）。 Further, the anonymization device 50 periodically starts the real-time distribution process shown in FIG. 30 and first reads the setting information of each user from the setting information DB 146 (step S1082). Further, the anonymization device 50 determines whether or not predetermined information for performing real-time distribution has been acquired (step S1085).

匿名化装置５０は、所定情報を取得していない場合（ステップＳ１０８５，Ｎｏ）、図３０の処理を終了し、所定情報を取得した場合（ステップＳ１０８５，Ｙｅｓ）、対象データの匿名化処理を行う（ステップＳ１０９０）。 When the anonymization apparatus 50 has not acquired the predetermined information (step S1085, No), the process of FIG. 30 is terminated, and when the predetermined information is acquired (step S1085, Yes), the anonymization process of the target data is performed. (Step S1090).

匿名化後、匿名化装置５０は、リアルタイム配信を示す情報を匿名情報と共に管理サーバ２０へ送信する（ステップＳ１０９２）。 After anonymization, the anonymization device 50 transmits information indicating real-time delivery to the management server 20 together with the anonymous information (step S1092).

一方、管理サーバ２０は、図３１に示す配信の処理を定期的に起動させ、先ず、各ユーザの設定情報を設定情報ＤＢ２３６から読み出す（ステップＳ１０３２）。また、管理サーバ２０は、リアルタイム配信を示す情報を取得したか否かを判定する（ステップＳ１０３７）。リアルタイム配信を示す情報を取得した場合（ステップＳ１０３７，Ｙｅｓ）、管理サーバ２０は、当該情報と共に受信した匿名情報を設定情報に基づいて匿名化装置５０等の配信先に配信する（ステップＳ１０９７）。 On the other hand, the management server 20 periodically starts the distribution process shown in FIG. 31, and first reads the setting information of each user from the setting information DB 236 (step S1032). Also, the management server 20 determines whether or not information indicating real-time delivery has been acquired (step S1037). When information indicating real-time distribution is acquired (step S1037, Yes), the management server 20 distributes the anonymous information received together with the information to a distribution destination such as the anonymization device 50 based on the setting information (step S1097).

また、ステップＳ１０３７でリアルタイム配信を示す情報を取得していないと判定した場合（ステップＳ１０３７，Ｎｏ）、管理サーバ２０は、設定情報に基づき、現在時刻が所定期間毎に定められた配信タイミングに該当するか否か、即ち配信パターン１，２の配信タイミングか否かを判定する（ステップＳ１０５５）。所定期間毎の配信タイミングに該当する場合（ステップＳ１０５５，Ｙｅｓ）、管理サーバ２０は、ステップＳ１０５０で匿名化装置５０から受信した匿名情報等の配信情報を配信先へ配信する（ステップＳ１０６０）。管理サーバ２０は、例えば、配信パターン１の場合、配信情報として順位や匿名情報を配信し、配信パターン２の場合、予測情報や匿名情報を配信する。 If it is determined in step S1037 that information indicating real-time delivery has not been acquired (step S1037, No), the management server 20 corresponds to the delivery timing in which the current time is determined for each predetermined period based on the setting information. It is determined whether or not it is distribution timing of distribution patterns 1 and 2 (step S1055). When it corresponds to the delivery timing for every predetermined period (step S1055, Yes), the management server 20 delivers the delivery information such as anonymous information received from the anonymization device 50 in step S1050 to the delivery destination (step S1060). For example, in the case of the distribution pattern 1, the management server 20 distributes rank and anonymous information as distribution information, and in the case of the distribution pattern 2, the management server 20 distributes prediction information and anonymous information.

また、ステップＳ１０５５で、配信パターン１，２の配信タイミングでは無いと判定した場合（ステップＳ１０５５，Ｎｏ）、匿名化システム１０は、履歴情報を用いた配信タイミングか否か、即ち配信パターン４〜６か否かを判定する（ステップＳ１０６５）。ここで管理サーバ２０は、配信パターン４〜６でないと判定した場合には（ステップＳ１０６５，Ｎｏ）、図３１の処理を終了し、配信パターン４〜６であると判定した場合には（ステップＳ１０６５，Ｙｅｓ）、履歴情報を共通ＤＢ２３３から読み出し（ステップＳ１０７０）、匿名情報と比較して配信条件を満たしているか否かを判定する（ステップＳ１０７５）。 If it is determined in step S1055 that it is not the distribution timing of distribution patterns 1 and 2 (No in step S1055), the anonymization system 10 determines whether or not it is a distribution timing using history information, that is, distribution patterns 4-6. It is determined whether or not (step S1065). If the management server 20 determines that the distribution pattern is not 4 to 6 (No at Step S1065), the management server 20 ends the process of FIG. 31 and determines that the distribution pattern is 4 to 6 (Step S1065). , Yes), the history information is read from the common DB 233 (step S1070), and compared with the anonymous information, it is determined whether or not the distribution condition is satisfied (step S1075).

If the distribution condition is not satisfied (step S1075, No), the anonymization system 10 ends the process of FIG. 31. If the distribution condition is satisfied (step S1075, Yes), it distributes the notification, the ranking, and the anonymous information as distribution information to a distribution destination such as the anonymization device 50 (step S1080).

FIG. 32 is an explanatory diagram of the process, executed by the management server 20 according to a program, of creating the integrated anonymization dictionary.
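Returning to the distribution process of FIG. 31 described above, its branching (steps S1037, S1055, S1065, and S1075) can be summarized as a small decision function. This is only a sketch: the names `Action` and `decide_delivery` and the four boolean inputs are hypothetical and do not appear in the specification, and the actual checks against the setting information and the history information are reduced to flags.

```python
from enum import Enum


class Action(Enum):
    """Outcome of one run of the FIG. 31 flow (labels are illustrative)."""
    REALTIME = "S1097"    # deliver the anonymous information immediately
    SCHEDULED = "S1060"   # periodic delivery (distribution patterns 1 and 2)
    HISTORY = "S1080"     # history-based delivery (distribution patterns 4 to 6)
    NONE = "end"          # nothing to deliver in this run


def decide_delivery(realtime_received: bool, scheduled_timing: bool,
                    history_pattern: bool, condition_met: bool) -> Action:
    """Mirror the order of the checks described for FIG. 31."""
    if realtime_received:        # step S1037, Yes
        return Action.REALTIME
    if scheduled_timing:         # step S1055, Yes
        return Action.SCHEDULED
    if not history_pattern:      # step S1065, No
        return Action.NONE
    # steps S1070 and S1075: compare the history with the anonymous information
    return Action.HISTORY if condition_met else Action.NONE
```

Because the real-time check comes first, a run that has received real-time information never reaches the periodic or history-based branches, which matches the order given in the text.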

(3-2) Creation of the integrated anonymization dictionary
First, the management server 20 receives the anonymization dictionary of each business operator from that operator's anonymization device 50 (step S10).

次に管理サーバ２０は、各事業者の匿名化辞書を統合する（ステップＳ２０）。なお、匿名化辞書を統合する際の具体的な処理については後述する。 Next, the management server 20 integrates the anonymization dictionary of each business operator (step S20). In addition, the specific process at the time of integrating an anonymization dictionary is mentioned later.

また、管理サーバ２０は、統合匿名化辞書を構成するワードの次元について、優先度を決定し（ステップＳ３０）、この優先度に基づいて統合匿名化辞書に採用する次元と採用しない次元とを選択する（ステップＳ４０）。 In addition, the management server 20 determines priorities for the dimensions of the words constituting the integrated anonymization dictionary (step S30), and selects a dimension to be adopted for the integrated anonymization dictionary and a dimension not to be adopted based on this priority. (Step S40).

そして、管理サーバ２０は、ステップＳ４０で選択した次元から構成される統合匿名化辞書を各匿名化装置５０へ配信する（ステップＳ５０）。 And the management server 20 delivers the integrated anonymization dictionary comprised from the dimension selected by step S40 to each anonymization apparatus 50 (step S50).

図３３は、ステップＳ２０における匿名化辞書を統合する処理の説明図である。管理サーバ２０は、先ず、各事業者の匿名化辞書を記憶した辞書ＤＢ２３１から最下位のワードを抽出する（ステップＳ１１０）。例えば各事業者の匿名化辞書には、図２３に示すように「ソフトＡ」を抽象化した語が「伝票ソフト」と記憶されており、「ソフトＡ」に対して一段階上位のワードが「伝票ソフト」であることがわかる。同様に、「ソフトＺ」を抽象化した語が「伝票ソフト」であり、「ソフトＢ」を抽象化した語が「会計ソフト」である。 FIG. 33 is an explanatory diagram of the process of integrating the anonymization dictionary in step S20. The management server 20 first extracts the lowest word from the dictionary DB 231 storing the anonymization dictionary of each business operator (step S110). For example, in the anonymization dictionary of each company, as shown in FIG. 23, an abstract word of “soft A” is stored as “slip software”, and a word one level higher than “soft A” is stored. It turns out that it is "slip software". Similarly, a word that abstracts “soft Z” is “slip software”, and a word that abstracts “soft B” is “accounting software”.

更に、「ソフトＡ」や「ソフトＺ」に対して一段階上位のワードである「伝票ソフト」についても一段階上位のワードが「業務ソフト」と記憶されている。 In addition, for “slip software” which is a word one level higher than “soft A” and “soft Z”, the word one level higher is stored as “business software”.

このように辞書ＤＢ２３１に上位・下位の関係と共に記憶されているワードのうち、下位のワードと対応付けられていないワード、即ち最も下位のワードを一つ抽出する。 In this way, out of the words stored in the dictionary DB 231 together with the upper / lower relationship, one word that is not associated with the lower word, that is, the lowest word is extracted.

次に管理サーバ２０は、ステップＳ１１０で抽出したワードより一つ上位のワードを求め、一つ上位の段階（抽象化レベル）を設定する（ステップＳ１２０）。例えば、ステップＳ１１０で抽出したワードが「ソフトＡ」であれば、「伝票ソフト」を一段階上位のワードとして抽出する。 Next, the management server 20 obtains a word one higher than the word extracted in step S110, and sets one higher level (abstraction level) (step S120). For example, if the word extracted in step S110 is “soft A”, “slip software” is extracted as the upper word.

管理サーバ２０は、ステップＳ１２０で抽出したワードと対応する一つ下位のワードと同じ段階（抽象化レベル）のワードを抽出する（ステップＳ１３０）。例えば、ステップＳ１２０で抽出したワードが「伝票ソフト」であれば、「ソフトＡ」と同じ段階の「ソフトＺ」が抽出される。 The management server 20 extracts a word at the same stage (abstraction level) as the one lower word corresponding to the word extracted in step S120 (step S130). For example, if the word extracted in step S120 is “slip software”, “soft Z” at the same stage as “soft A” is extracted.

更に、管理サーバ２０は、ステップＳ１３０で抽出したワードと対応する下位のワードがあれば抽出し、対応する下位のワードが無くなるまで下位のワードの抽出を繰り返す（ステップＳ１４０）。 Furthermore, the management server 20 extracts the lower word corresponding to the word extracted in step S130, and repeats the extraction of the lower word until there is no corresponding lower word (step S140).

ステップＳ１４０で、下位のワードが出尽くした場合に、管理サーバ２０は、直前のステップＳ１２０又はステップＳ１６０で設定した段階が最上位か否か、即ち更に上位のワードが存在するか否かを判定し、最上位でなければ（ステップＳ１５０，Ｎｏ）、一つ上位のワードを求め、一つ上位の段階（抽象化レベル）を設定してステップＳ１３０に戻る（ステップＳ１６０）。例えば、ステップＳ１２０で設定したワードが「伝票ソフト」であった場合、一つ上位のワード「業務ソフト」を求め、一つ上位の段階として設定する。 When the lower word is exhausted in step S140, the management server 20 determines whether or not the stage set in the immediately preceding step S120 or step S160 is the highest, that is, whether there is a higher word. If it is not the most significant (No in step S150), the word one higher is obtained, the one higher level (abstraction level) is set, and the process returns to step S130 (step S160). For example, if the word set in step S120 is “slip software”, the word “business software” that is one higher level is obtained and set as one level higher.

Then, after returning to step S130 and performing steps S130 and S140, if it is determined in step S150 that the stage set in the immediately preceding step S120 or step S160 is the highest (step S150, Yes), the management server 20 determines whether all the words contained in the plurality of anonymization dictionaries have been processed (step S170). If words remain (step S170, No), the process returns to step S110 and repeats; once all words have been processed (step S170, Yes), the process of FIG. 33 ends.
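The result of the integration described above can be illustrated with a short sketch. It does not follow the step order S110 to S170; it only shows the outcome, namely that words from several operators' dictionaries end up grouped under a common highest word. The function name `integrate` and the representation of each dictionary as a child-to-parent map are assumptions made for illustration, and the sketch further assumes that each word has a single parent.

```python
from collections import defaultdict


def integrate(dictionaries):
    """Merge child -> parent maps and group the words under each root word.

    Assumption: every word has one parent; if two dictionaries disagree,
    the later one wins. The specification does not state this rule.
    """
    parent = {}
    for d in dictionaries:
        parent.update(d)

    children = defaultdict(set)
    for child, par in parent.items():
        children[par].add(child)

    # A root is a word that has lower words but no upper word.
    roots = sorted(w for w in children if w not in parent)

    dimensions = {}
    for root in roots:
        words, stack = [], [root]
        while stack:                      # depth-first walk from the root
            w = stack.pop()
            words.append(w)
            stack.extend(sorted(children.get(w, ()), reverse=True))
        dimensions[root] = words
    return dimensions


operator_1 = {"Soft A": "slip software", "slip software": "business software"}
operator_2 = {"Soft Z": "slip software", "Soft B": "accounting software",
              "accounting software": "business software"}
dims = integrate([operator_1, operator_2])
```

Here both operators' words are collected under the single root "business software", so "Soft A" and "Soft Z" become siblings under "slip software" even though they came from different dictionaries.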

(3-3) Description of dimensions
FIG. 34 is an explanatory diagram of the dimensions created by the process of FIG. 33. The example of FIG. 34 shows a dimension whose root is “IT product”; that is, in the dimension of FIG. 34, “IT product” is the word at the highest stage.

「ＩＴ製品」は、その一つ下位の段階（図３４の例では段階４）のワードとして「ソフト」「ハード」が対応付けられている。そして、「ソフト」は、その一つ下位の段階（図３４の例では段階３）のワードとして「業務ソフト」「個人ソフト」が対応付けられている。 In “IT product”, “software” and “hardware” are associated as words in the next lower stage (stage 4 in the example of FIG. 34). “Software” is associated with “business software” and “individual software” as the words at the next lower level (step 3 in the example of FIG. 34).

また、「業務ソフト」は、その一つ下位の段階（図３４の例では段階２）のワードとして「伝票ソフト」「会計ソフト」「顧客管理ソフト」が対応付けられ、「伝票ソフト」は、その一つ下位の段階（図３４の例では段階１、最下位の段階）のワードとして「ソフトＡ」「ソフトＺ」が対応付けられている。なお、「個人ソフト」は、その一つ下位の段階のワードとして「ソフトＶ」「ソフトＵ」と対応付けられ、「ハード」は、その一つ下位の段階のワードとして「サーバＤ」「サーバＥ」と対応付けられている。 In addition, “business software” is associated with “slip software”, “accounting software”, and “customer management software” as the words of the next lower stage (stage 2 in the example of FIG. 34). “Soft A” and “Soft Z” are associated with the words at one lower level (step 1 in the example of FIG. 34, the lowest level). The “individual software” is associated with “soft V” and “soft U” as the words in the lower level, and “hard” is “server D” and “server” as the words in the lower level. E ”.

このように本実施形態の統合部は、各事業者の匿名化辞書に基づいて図３４に示すような次元を複数作成する。ここで次元は、最上位のワードをルートとし、最下位のワードにかけて樹状に対応付けられた対応関係であり、最上位のワード毎に生成される。即ち統合部は、各事業者の匿名化辞書に含まれる全てのワードをまとめて樹状に対応つけて複数の次元とすることにより匿名化辞書を統合化している。そして、この複数の次元が、統合匿名化辞書である。 As described above, the integration unit of the present embodiment creates a plurality of dimensions as shown in FIG. 34 based on the anonymization dictionary of each business operator. Here, the dimension is a correspondence relationship in which the highest word is rooted and associated with the lowest word in a tree form, and is generated for each highest word. That is, the integration unit integrates the anonymization dictionary by combining all words included in the anonymization dictionary of each business operator into a plurality of dimensions by associating them with a tree. The plurality of dimensions is an integrated anonymization dictionary.

図３５は複数の次元の説明図である。図３５に示すように、あるワードを抽象化する次元は複数存在し得る。例えば、図３５の次元ａでは、「ソフトウェアＡ」を「会計ソフト」、「業務ソフト」に抽象化し、次元ｃでは、「ソフトウェアＡ」を「ａ社製品」、「パッケージ」に抽象化する。また、次元ｂや次元ｄでもそれぞれ異なるワードに抽象化する。 FIG. 35 is an explanatory diagram of a plurality of dimensions. As shown in FIG. 35, there can be multiple dimensions for abstracting a word. For example, in dimension a in FIG. 35, “software A” is abstracted into “accounting software” and “business software”, and in dimension c, “software A” is abstracted into “a company product” and “package”. Also, the dimension b and dimension d are abstracted into different words.

In particular, because the integrated anonymization dictionary of this embodiment integrates the anonymization dictionaries of many business operators, it may contain, for example, tens to hundreds of dimensions, and abstracting with every dimension would make the amount of data enormous. For this reason, this embodiment determines, for each dimension of the integrated anonymization dictionary, the priority with which that dimension is adopted for abstraction.

(3-4) Description of priority
Next, the priority determination process in step S30 is described in detail with reference to FIGS. 35 to 37. FIG. 36 shows an example in which each word included in the dimensions of FIG. 35 is weighted. In the example of FIG. 36, each word in each dimension is stored in association with its stage, and three kinds of weighting are applied: weighting 1 records the presence or absence of an important flag, weighting 2 records the number of searches, and weighting 3 records the SEM (Search Engine Marketing) price. The important flag is a value entered by the user to indicate whether a word is important; an important word, that is, a word the user wants to use for abstraction, is recorded as important (the important flag is set).

また、優先度決定部２０３は、図２４に示す優先度ＤＢ２３２からワードの価値を読み出し、図３５に示すように対応するワードに重み付けとして付加する。 Also, the priority determination unit 203 reads the value of the word from the priority DB 232 shown in FIG. 24 and adds it to the corresponding word as a weight as shown in FIG.

そして図３５に示した次元のワードの数や、段階の和、各ワードの重み付けを次元毎に集計して、優先度を決定する。 Then, the number of words in the dimension shown in FIG. 35, the sum of steps, and the weight of each word are totaled for each dimension, and the priority is determined.

図３７は、各ワードの重みを集計して各次元の優先度を求める処理の説明図である。図３７において、次元ａの各ワードについて、ワード数、段階数の和、重み付け１、重み付け２、重み付け３を集計したものが表５１Ａである。同様に次元ｂを集計した表が５１Ｂ、次元ｃを集計した表が５１Ｃである。 FIG. 37 is an explanatory diagram of processing for calculating the priority of each dimension by adding up the weights of the respective words. In FIG. 37, for each word of dimension a, Table 51A is a summation of the number of words, the sum of the number of steps, weight 1, weight 2, and weight 3. Similarly, a table summarizing dimension b is 51B, and a table summing dimension c is 51C.

The number of words is the total number of words included in each dimension; in the example of FIG. 37, dimension a has 25, dimension b has 50, and dimension c has 9. A large number of words means many abstraction variations, which may make l-diversity harder to satisfy, that is, may lower safety. However, because the data is then more detailed, dimensions with more words are given priority.

The sum of stage numbers is the total obtained by multiplying each stage number by the number of words belonging to that stage, for example (stage 5 × 1 word) + (stage 4 × 2 words) + (stage 3 × 2 words) + (stage 2 × 3 words) + (stage 1 × 9 words) = 34. A large sum means that there are many upper stages and therefore many highly abstract options, so abstraction at an appropriate level is possible and safety is high; dimensions with a larger sum are therefore given priority.
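The arithmetic of the example above can be checked with a few lines of code. The function name `stage_sum` and the dictionary layout (stage number mapped to the number of words at that stage) are assumptions chosen for illustration.

```python
def stage_sum(words_per_stage):
    """Sum of (stage number x number of words at that stage)."""
    return sum(stage * count for stage, count in words_per_stage.items())


# The example from the text: stage 5 has 1 word, stage 4 has 2, and so on.
example = {5: 1, 4: 2, 3: 2, 2: 3, 1: 9}
total = stage_sum(example)  # 5 + 8 + 6 + 6 + 9 = 34
```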

同様に、重み付け１〜３についても、重要フラグの数や、検索回数、ＳＥＭ価格の総計を求め、この値の高い、即ち価値の高いものを優先する。 Similarly, for the weights 1 to 3, the total number of important flags, the number of searches, and the SEM price are obtained, and a higher value, that is, a higher value is given priority.

Then, for the number of words, the sum of stage numbers, and weightings 1 to 3, the overall appearance rate (the ratio to the total) is obtained from the following formula.

Overall appearance rate = tf / idf
= (value of dimension a) / (value of dimension a + value of dimension b + value of dimension c + ...)

Table 52 compares this overall appearance rate across the dimensions. For each dimension in Table 52, the overall priority is determined by summing the overall appearance rates of the number of words, the sum of stage numbers, and weightings 1 to 3.
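As a rough sketch of this calculation, the following code computes each dimension's share of the total for every metric and then adds the shares per dimension. The word counts 25, 50, and 9 are taken from the FIG. 37 example; the stage-sum values, the metric names, and the function names are hypothetical and serve only to make the example runnable.

```python
def appearance_rates(values):
    """Share of each dimension in the total of one metric."""
    total = sum(values.values())
    if total == 0:
        # Assumption: a metric with no counts contributes nothing.
        return {dim: 0.0 for dim in values}
    return {dim: v / total for dim, v in values.items()}


def overall_priority(metrics):
    """Add up the appearance rates of all metrics for each dimension."""
    priority = {}
    for per_dimension in metrics.values():
        for dim, rate in appearance_rates(per_dimension).items():
            priority[dim] = priority.get(dim, 0.0) + rate
    return priority


metrics = {
    "word_count": {"a": 25, "b": 50, "c": 9},   # from the FIG. 37 example
    "stage_sum": {"a": 34, "b": 60, "c": 12},   # illustrative values only
}
priority = overall_priority(metrics)
```

With these inputs, dimension b receives the highest overall priority because it holds the largest share in both metrics.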

このように各次元について全体優先度を求め、この全体優先度に基づいて次元選択部２０７が統合匿名化辞書に採用する次元と採用しない次元とを選択する。例えば、次元選択部２０７が表５２の全体優先度を参照し、全体優先度が高い順に所定数の次元を採用し、これ以外の全体優先度が低い次元は採用しない。 In this way, the overall priority is obtained for each dimension, and the dimension that the dimension selection unit 207 adopts to the integrated anonymization dictionary and the dimension that is not adopted are selected based on the overall priority. For example, the dimension selection unit 207 refers to the overall priorities in Table 52, adopts a predetermined number of dimensions in descending order of overall priorities, and does not employ other dimensions with lower overall priorities.

なお、選択の基準は、全体優先度の順だけでなく、重要フラグを含む次元は採用し、重要フラグを含まない次元については全体優先度が高い順に所定数の次元を採用するといったように選択条件を設定しても良い。 The selection criteria are not only the order of the overall priority, but the dimension including the important flag is adopted, and the dimension not including the important flag is selected such that a predetermined number of dimensions are adopted in descending order of the overall priority. Conditions may be set.

また、選択の対象は、例えば統合匿名化辞書に含まれる全ての次元を選択の対象とし、全体優先度に基づいて所定数の次元を採用しても良いし、同じワードを含む次元毎に選択の対象とし、全体優先度に基づいて所定数の次元を採用しても良い。 The selection target may be, for example, all dimensions included in the integrated anonymization dictionary, and a predetermined number of dimensions may be adopted based on the overall priority, or may be selected for each dimension including the same word. And a predetermined number of dimensions may be adopted based on the overall priority.
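The two selection rules described above (taking a predetermined number of dimensions in descending order of overall priority, and always adopting dimensions that contain an important flag) might be sketched as follows. The function name `select_dimensions`, its parameters, and the sample priorities are hypothetical.

```python
def select_dimensions(priority, limit, flagged=frozenset()):
    """Adopt every flagged dimension, plus the top `limit` unflagged ones."""
    ranked = sorted(priority, key=priority.get, reverse=True)
    always = [d for d in ranked if d in flagged]
    rest = [d for d in ranked if d not in flagged][:limit]
    return always + rest


sample_priority = {"a": 0.9, "b": 1.4, "c": 0.3, "d": 0.4}  # illustrative values
top_two = select_dimensions(sample_priority, 2)
with_flag = select_dimensions(sample_priority, 1, flagged={"c"})
```

With an empty `flagged` set the function reduces to the plain top-N rule; passing `flagged={"c"}` keeps dimension c even though its overall priority is the lowest.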

(3-5) Anonymization method
FIG. 38 is an explanatory diagram of the anonymization process that the anonymization device executes in step S1050 using the integrated anonymization dictionary. The anonymization device 50 reads and acquires, from the personal information DB 131, the personal information corresponding to the target period determined to be unprocessed in step S1035 (step S210), and determines, for each word in the target data, whether value data exists in the search information storage DB 142 (step S220). If value data for all words exists in the search information storage DB 142, the anonymization device 50 proceeds to step S230 (step S220, Yes); if some value data is missing (step S220, No), it acquires the value data of those words from an external device, in this example a search engine (step S240). The anonymization device 50 then acquires from the search information storage DB 142 the value information of the words that exist there (step S230).

また、匿名化装置５０は、匿名性を満たすため対象データの各項目を抽象化したワード（カテゴリ）に置き換えて抽象化候補データを作成する（ステップＳ２５０）。なお、抽象化可能な項目が複数存在する場合には、各項目を抽象化した場合と抽象化しない場合の全てのパターンを作成する。 Further, the anonymization device 50 creates abstraction candidate data by replacing each item of the target data with an abstracted word (category) in order to satisfy anonymity (step S250). When there are a plurality of items that can be abstracted, all patterns are created when each item is abstracted and when it is not abstracted.

例えば対象データに三つの項目Ａ，Ｂ，Ｃが含まれ、全項目について抽象化が可能で、抽象化した項目をＡ´，Ｂ´，Ｃ´とした場合、図１３に示すように、項目Ａだけを抽象化した場合Ａ´，Ｂ，Ｃや、項目Ａ，Ｂを抽象化した場合Ａ´，Ｂ´，Ｃなど、七つの候補パターンが作成できる。また、全項目を用いるものに限らず、Ａ´，ＢやＢ´，Ｃなど、一部の項目を用いた候補パターンを作成しても良い。例えば、項目Ａ，Ｂ、項目Ａ´，Ｂ、項目Ａ，Ｂ´、項目Ａ´，Ｂ´や、項目Ｂ，Ｃ、項目Ｂ´，Ｃ、項目Ｂ，Ｃ´、項目Ｂ´，Ｃ´、項目Ａ，Ｃ、項目Ａ´，Ｃ、項目Ａ，Ｃ´、項目Ａ´，Ｃ´のような候補を作成しても良い。このとき省略しない項目（必須項目）を予め設定しておき、この必須項目以外の項目を省略した候補パターンを作成しても良い。なお、匿名化システム１０は、候補パターンの項目数をステップＳ１１２５で取得した対象期間の長さに応じて定めても良い。例えば、対象期間が１０分であれば項目数１〜３、１時間であれば項目数２〜５、１日以上であれば項目数５〜８のように、対象期間毎に候補パターンの項目数の範囲を定めて記憶部に記憶しておき、ステップＳ１１２５で取得した対象期間と対応する項目数の範囲を読み出し、この項目数の範囲で候補パターンを作成しても良い。 For example, if the target data includes three items A, B, and C and all items can be abstracted, and the abstracted items are A ′, B ′, and C ′, as shown in FIG. Seven candidate patterns can be created, such as A ′, B, and C when only A is abstracted, and A ′, B ′, and C when items A and B are abstracted. Moreover, you may create the candidate pattern using some items, such as A ', B, B', and C, without using all items. For example, item A, B, item A ′, B, item A, B ′, item A ′, B ′, item B, C, item B ′, C, item B, C ′, item B ′, C ′ Candidates such as item A, C, item A ′, C, item A, C ′, item A ′, C ′ may be created. At this time, items that are not omitted (essential items) may be set in advance, and a candidate pattern may be created in which items other than the essential items are omitted. Note that the anonymization system 10 may determine the number of candidate pattern items according to the length of the target period acquired in step S1125. For example, if the target period is 10 minutes, the number of items is 1 to 3, and if it is 1 hour, the number of items is 2 to 5, and if it is more than one day, the number of items is 5 to 8 for each target period. A range of numbers may be determined and stored in the storage unit, a range of the number of items corresponding to the target period acquired in step S1125 may be read, and a candidate pattern may be created within the range of the number of items.
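The count of seven candidate patterns for three items can be reproduced by enumerating every abstract/keep combination and dropping the one in which nothing is abstracted. In this sketch the function name `candidate_patterns` is hypothetical, and an ASCII apostrophe stands in for the prime mark used in the text.

```python
from itertools import product


def candidate_patterns(items):
    """All combinations of abstracted/unabstracted items, except 'none abstracted'."""
    patterns = []
    for mask in product((False, True), repeat=len(items)):
        if not any(mask):
            continue  # the unabstracted original is not a candidate
        patterns.append(tuple(f"{item}'" if abstracted else item
                              for item, abstracted in zip(items, mask)))
    return patterns


patterns = candidate_patterns(["A", "B", "C"])
```

For three items this yields 2^3 - 1 = 7 patterns, including ("A'", "B", "C") and ("A'", "B'", "C") from the text. Candidates that omit some items, as also mentioned above, would need an additional subset step that is not shown here.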

When the integrated anonymization dictionary contains several dimensions in which one item can be abstracted, that item is expanded into several items and abstracted in each dimension. For example, if the affiliated-company item has a dimension p that abstracts to listed or unlisted company and a dimension q that abstracts to an industry such as IT-related company, education-related company, or publishing, the item is abstracted into an affiliated company (listed/unlisted) item by dimension p and an affiliated company (industry) item by dimension q. In other words, when two abstractable dimensions exist for item B among the three items A, B, and C, the data is abstracted into four items such as A, B1′, B2′, C. That is, in the example of FIG. 13, seven candidate patterns can be created in which item B′ is replaced by items B1′ and B2′.

Next, the anonymization device 50 calculates the value of the abstraction candidate data for each pattern from the value data of each word contained in that data (step S260), and determines the testing order based on these values (step S270). For example, the candidates are tested in descending order of value. Although it is desirable to test all candidate patterns, abstraction candidate data whose value is too low may be removed from the order. For example, candidates whose rank falls below a predetermined proportion of the total number of candidate patterns may be removed, such as those ranked at or after a predetermined position in descending order of value, or those in the lower half. Abstraction candidate data whose value is less than a predetermined proportion of the value of the target data may also be removed. This reduces the number of tests and shortens the processing time.

Following this order, the anonymization device 50 tests the anonymity of the abstraction candidate data (step S280). For example, to test k-anonymity, it counts how many times each combination of values of different items associated with one individual occurs in the abstraction candidate data (the occurrence count). Alternatively, to test l-diversity, it counts how many times each combination of values of the same item associated with one individual occurs in the abstraction candidate data. It then takes the smallest of these occurrence counts as the minimum occurrence count (k value / l value) (step S290) and determines whether this minimum exceeds 1 (step S300). That is, k-anonymity is satisfied if the k value exceeds 1 and is not satisfied if it is 1. Likewise, l-diversity is satisfied if the l value exceeds 1 and is not satisfied if it is 1.
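The k-anonymity check of steps S280 to S300 can be sketched in Python as below. The record layout and the function names are assumptions for illustration; the l-diversity check described in the text would count occurrences in an analogous way.

```python
from collections import Counter

def min_occurrence(records, items):
    """Smallest number of records sharing the same combination of item values
    (the k value of step S290)."""
    counts = Counter(tuple(record[item] for item in items) for record in records)
    return min(counts.values())

def passes_test(records, items):
    """Step S300: the test passes only if the minimum occurrence count exceeds 1."""
    return min_occurrence(records, items) > 1

# Hypothetical abstraction candidate data.
fine_grained = [
    {"age group": "20s", "gender": "F"},
    {"age group": "20s", "gender": "F"},
    {"age group": "30s", "gender": "M"},   # unique combination, so k = 1
]
coarser = [
    {"age group": "20s-30s", "gender": "any"},
    {"age group": "20s-30s", "gender": "any"},
    {"age group": "20s-30s", "gender": "any"},
]
items = ["age group", "gender"]
print(min_occurrence(fine_grained, items), passes_test(fine_grained, items))
print(min_occurrence(coarser, items), passes_test(coarser, items))
```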

If the minimum occurrence count (k value / l value) does not exceed 1 (step S300, No), the anonymization device 50 further abstracts the value of at least one item in the abstraction candidate data, that is, replaces it with a more abstract word (step S310), and returns to step S280.

If, on the other hand, the minimum occurrence count (k value / l value) exceeds 1 (step S300, Yes), the anonymization device 50 obtains the difference between the value of the abstraction candidate data and the value of the original target data (step S320). It then determines, as the value of the abstraction candidate data, either this difference or a quantity derived from it, such as the ratio of the difference to the value of the target data or the ratio of the value of the abstraction candidate data to the value of the target data (step S330).

The anonymization device 50 then determines whether any candidate pattern remains untested (step S340). If so (step S340, Yes), it identifies the next abstraction candidate data in the order determined in step S270 (step S350) and returns to step S280 to test it.

The test is repeated in this way for the abstraction candidate data of each pattern. When no candidate pattern remains (step S340, No), the anonymization device 50 selects the abstraction candidate data to adopt based on the value obtained for each abstraction candidate data in step S320 (step S360), transmits the selected abstraction candidate data to the management server 20 as anonymous information (step S370), and ends the process of FIG. 38.

The abstraction candidate data may be selected, for example, by choosing the candidate with the highest value among all candidate patterns. Alternatively, the anonymization device 50 may output several abstraction candidate data in descending order of value from among all candidate patterns, and select the one that an operator designates as appropriate.
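The loop over steps S280 to S360 can be summarized in a short Python sketch. Several details are assumptions rather than part of the described method: the score is taken to be the ratio of the candidate's value to the target data's value (one of the options named for step S330), the `generalize` callback and `max_rounds` cap are hypothetical, and all items of a record are treated as one combination.

```python
from collections import Counter

def min_occurrence(records):
    """Smallest number of records sharing an identical combination of values."""
    counts = Counter(tuple(sorted(record.items())) for record in records)
    return min(counts.values())

def select_candidate(ordered_candidates, target_value, generalize, max_rounds=3):
    """ordered_candidates: (records, value) pairs in testing order (step S270)."""
    passed = []
    for records, value in ordered_candidates:
        rounds = 0
        # Steps S280-S310: abstract further until the minimum count exceeds 1.
        # max_rounds is an assumed safety cap, not part of the source.
        while min_occurrence(records) <= 1 and rounds < max_rounds:
            records, value = generalize(records, value)
            rounds += 1
        if min_occurrence(records) > 1:
            # Steps S320-S330: score relative to the original target data.
            passed.append((value / target_value, records))
    if not passed:
        return None
    # Step S360: adopt the candidate with the highest score.
    return max(passed, key=lambda entry: entry[0])[1]

def generalize_age(records, value):
    """Hypothetical further abstraction: collapse every age group to 'adult'."""
    return [dict(record, age="adult") for record in records], value - 1.0

first = [{"age": "20s", "gender": "F"}, {"age": "30s", "gender": "F"}]
second = [{"age": "adult", "gender": "F"}, {"age": "adult", "gender": "F"}]
chosen = select_candidate([(first, 10.0), (second, 6.0)], 12.0, generalize_age)
print(chosen)
```

In this toy run the first candidate fails the test once, is abstracted further at a small loss of value, and is still preferred over the second candidate.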

§4. Specific Example of Anonymous Information
Next, specific examples of anonymous information are described with reference to FIGS. 39 and 40. FIG. 39 shows an example of anonymization at company A: FIG. 39(a) shows the personal information collected by company A, FIG. 39(b) shows an example of the anonymous information obtained by anonymizing the personal information of FIG. 39(a) with company A's own anonymization dictionary, and FIG. 39(c) shows an example of the anonymous information obtained by anonymizing the personal information of FIG. 39(a) with the integrated anonymization dictionary.

When the anonymization device 50 of company A anonymizes the personal information of FIG. 39(a) with its own anonymization dictionary, it deletes the name and e-mail address items and abstracts age into age group, affiliated company into listed or unlisted company, and job title into manager, employee, or part-time worker, as shown in FIG. 39(b).

In contrast, when the anonymization device 50 of company A anonymizes the personal information of FIG. 39(a) with the integrated anonymization dictionary, it deletes the name and e-mail address items and abstracts age into age group and affiliated company into both listed or unlisted company and industry, as shown in FIG. 39(c). With the integrated anonymization dictionary, the anonymization device 50 of company A also abstracts job title into values such as manager or staff and the product of interest into values such as voucher software or server, and adds a visited-booth item containing the value "Company A", which indicates that the data concerns a person who visited company A.

FIG. 40 shows an example of anonymization at company B: FIG. 40(a) shows the personal information collected by company B, FIG. 40(b) shows an example of the anonymous information obtained by anonymizing the personal information of FIG. 40(a) with company B's own anonymization dictionary, and FIG. 40(c) shows an example of the anonymous information obtained by anonymizing the personal information of FIG. 40(a) with the integrated anonymization dictionary.

When the anonymization device 50 of company B anonymizes the personal information of FIG. 40(a) with its own anonymization dictionary, it deletes the name and e-mail address items and abstracts age into age group, affiliated company into industry, and occupation into values such as development or general affairs, as shown in FIG. 40(b).

In contrast, when the anonymization device 50 of company B anonymizes the personal information of FIG. 40(a) with the integrated anonymization dictionary, it deletes the name and e-mail address items and abstracts age into age group and affiliated company into both listed or unlisted company and industry, as shown in FIG. 40(c). With the integrated anonymization dictionary, the anonymization device 50 of company B also abstracts occupation into values such as technical or clerical work and the product of interest into values such as accounting software or server, and adds a visited-booth item containing the value "Company B", which indicates that the data concerns a person who visited company B.

In this way, the anonymization device 50 of each business operator abstracts the affiliated-company item along multiple dimensions based on the integrated anonymization dictionary. Because the integrated anonymization dictionary adopts high-priority dimensions, as described above, abstracting along the dimensions it contains yields abstractions that are useful to each business operator.

In addition, integrating the anonymization dictionaries as described above reorganizes the word correspondences used for abstraction. Even operator-specific items, such as company A's job title and company B's occupation, are therefore abstracted along a common dimension and can be compared with data from other companies that have similar items.

§5. Effects of the Embodiment
As described above, according to the second embodiment, personal information within the target period specified by the user is anonymized as the target data. The degree of abstraction used for anonymization can therefore be varied with the length of the target period, and appropriate anonymization can be performed based on the target period and the value after abstraction. Furthermore, because the anonymization devices of multiple business operators anonymize using a common integrated anonymization dictionary into which their anonymization dictionaries have been merged, each business operator can use the resulting anonymous information in a unified manner.

In particular, even if the words anonymized by each business operator differ in abstraction level, the use of a common anonymization dictionary allows the abstraction levels to be aligned for purposes such as aggregation, which is highly convenient.

Moreover, anonymous information anonymized with the integrated anonymization dictionary, that is, the anonymous information registered in the common DB 233, reflects the anonymization practices of each business operator, so the trends of other companies can be learned from it. For example, when a dimension is used in common across the business operators' anonymization dictionaries, the words and levels belonging to that dimension are gathered from every operator's dictionary when the integrated anonymization dictionary is created. Their number is therefore inevitably large, the priority of the dimension rises, and the dimension is adopted in the integrated anonymization dictionary. Anonymous information anonymized along a dimension adopted in the integrated anonymization dictionary can thus be understood to be information that the business operators use heavily.

Also, by distributing anonymous information when it satisfies a predetermined condition, the system can provide a trigger for starting predetermined processing based on the anonymous information as a whole, including that of other business operators. For example, processing that concerns all of the business operators participating in an exhibition, such as running a campaign or changing the content displayed on digital signage, can be carried out appropriately based on the anonymous information collected by multiple business operators.

§7. Modifications
The examples above concern the anonymization of personal information at an exhibition, but the anonymization system 100 of the second embodiment may also be applied to the anonymization of personal information collected at individual stores in a shopping mall or shopping street.

FIG. 41 shows an example of anonymizing personal information collected at the stores of a shopping mall. In the example of FIG. 41, the anonymization devices 50 of stores (business operators) D to F, which are located in the same shopping mall, each anonymize personal information and register it in the common DB of the management server 20. In the system of FIG. 41 as well, the method of creating the integrated anonymization dictionary, the method of anonymizing with it, and the distribution method are the same as in the second embodiment described above.

Store D is a restaurant. It registers each customer's e-mail address, gender, and age in the storage unit in advance, accepts reservations by e-mail, and sends e-mail coupons. When a customer presents a reservation e-mail or an e-mail coupon on visiting the store, the anonymization device 50 obtains the customer's ID or e-mail address from that e-mail, reads the corresponding customer information from the storage unit, and stores it in the storage unit as visitor information. The customer's ID or e-mail address may be obtained, for example, by including a two-dimensional barcode or the like in the e-mail, reading the barcode with a reader, and transmitting the result to the anonymization device 50. Alternatively, store staff may ask the customer for the ID or e-mail address and enter it into the anonymization device 50. In addition to the customer information, the visitor information may include the number of visitors (number of adults and number of children), the products purchased, the amount paid, and so on.

In FIG. 41, D1 denotes the personal information acquired by store D, and D2 denotes the anonymous information obtained by anonymizing the personal information D1 with the integrated anonymization dictionary. As shown in FIG. 41, the anonymous information D2 contains age group, gender, purchased product, and purchase amount.

Store E is a dry-cleaning shop. It distributes stamp cards to customers, stamps the card according to the cleaning charge, and gives a gift when ten stamps have been collected. When the gift is provided, staff ask the customer for information such as gender and age and enter it into the anonymization device 50. The information need not be collected only when the gift is provided: the customer's name, gender, and age may instead be registered in the storage unit when the stamp card is issued, and when the customer presents the stamp card on visiting the store, the anonymization device 50 obtains the customer's ID or name from the card, reads the corresponding customer information from the storage unit, and stores it in the storage unit as visitor information. The customer's ID or name may be obtained, for example, by attaching a two-dimensional barcode or the like to the stamp card, reading the barcode with a reader, and transmitting the result to the anonymization device 50. Alternatively, store staff may ask the customer for the ID or name and enter it into the anonymization device 50. In addition to the customer information, the visitor information may include the number of visitors (number of adults and number of children), the type of cleaning, the amount paid, and so on.

In FIG. 41, E1 denotes the personal information acquired by store E, and E2 denotes the anonymous information obtained by anonymizing the personal information E1 with the integrated anonymization dictionary.

Store F is a supermarket. It registers each customer's e-mail address, name, gender, and age in the storage unit in advance and issues the customer a membership card. When the customer presents the membership card at checkout, the price is discounted by 5%; at this time the anonymization device 50 obtains the customer's ID or name from the membership card, reads the corresponding customer information from the storage unit, and stores it in the storage unit as visitor information. The customer's ID or name may be obtained, for example, by attaching a two-dimensional barcode, an RFID tag, or the like to the membership card, reading it with a reader, and transmitting the result to the anonymization device 50. In addition to the customer information, the visitor information may include whether the customer is accompanied by children, the products purchased, the amount paid, and so on.

In FIG. 41, F1 denotes the personal information acquired by store F, and F2 denotes the anonymous information obtained by anonymizing the personal information F1 with the integrated anonymization dictionary.

In a shopping mall as well, the personal information that each store collects is thus anonymized with the common integrated anonymization dictionary and registered in the common DB 233 as anonymous information, so that this anonymous information can be used in a unified manner.

For example, if a campaign offering preferential treatment to women at one store increases the number of female visitors, other stores can learn of the increase in female customers from the anonymous information in the common DB and use it for marketing, for instance by adding menu items aimed at women to benefit from the synergy.

In addition, processing that concerns the shopping mall as a whole can be carried out appropriately, such as switching the content displayed on digital signage to content for children when the proportion of customers accompanied by children reaches or exceeds a predetermined value.
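Such a threshold-based trigger can be sketched in a few lines of Python. The record field `children`, the threshold of 0.3, and the content labels are hypothetical values chosen for illustration only.

```python
def children_ratio(visits):
    """Proportion of visit records in which at least one child is present."""
    if not visits:
        return 0.0
    with_children = sum(1 for visit in visits if visit["children"] > 0)
    return with_children / len(visits)

def signage_content(visits, threshold=0.3):
    """Switch to children's content once the ratio reaches the threshold."""
    return "children" if children_ratio(visits) >= threshold else "default"

# Hypothetical anonymous information pooled from several stores.
visits = [
    {"store": "D", "adults": 2, "children": 1},
    {"store": "E", "adults": 1, "children": 0},
    {"store": "F", "adults": 1, "children": 2},
    {"store": "F", "adults": 2, "children": 0},
]
print(children_ratio(visits), signage_content(visits))
```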

FIG. 42 shows an example of anonymizing the position information (personal information) of navigation systems. In the example of FIG. 42, position information is transmitted from the navigation system 61 of each vehicle to the anonymization devices 50 of business operators G to I, and the anonymization device 50 of each business operator anonymizes the personal information and registers it in the common DB of the management server 20. In the system of FIG. 42 as well, the method of creating the integrated anonymization dictionary, the method of anonymizing with it, and the distribution method are the same as in the second embodiment described above.

Business operator G assigns an ID to each driver in advance and registers the driver's e-mail address, gender, age, and so on in the storage unit as driver information, together with this driver ID. The navigation system 61 mounted on the vehicle then periodically transmits the vehicle's position information, together with the driver ID, to the anonymization device 50 of business operator G.

On receiving the position information and the driver ID, the anonymization device 50 reads the driver information corresponding to the driver ID from the storage unit and stores it together with the position information. The anonymization device 50 then anonymizes this position information and driver information using the integrated anonymization dictionary. In FIG. 42, G1 denotes the personal information acquired by business operator G, and G2 denotes the anonymous information obtained by anonymizing the personal information G1 with the integrated anonymization dictionary. In the example of FIG. 42, the position information is abstracted into a mesh code and an administrative district based on the integrated anonymization dictionary. For example, when latitude and longitude are received from the navigation system 61 as the vehicle's position information, they are abstracted into the mesh code of the standard area mesh containing the point they indicate and into the administrative district containing that point. In FIG. 42, the anonymous information G2 contains age group, gender, mesh code, and administrative district.
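The abstraction of latitude and longitude into a standard area mesh code can be illustrated with the following Python sketch. It assumes the third-level (roughly 1 km) mesh of the Japanese grid square code scheme; the patent does not state which mesh level is used, and the function name `mesh_code` is hypothetical.

```python
import math

def mesh_code(lat, lon):
    """Third-level standard area mesh code for a point given in decimal degrees."""
    # First level: cells of 40 minutes of latitude by 1 degree of longitude.
    p, lat_rem = divmod(lat * 60.0, 40.0)
    u = math.floor(lon - 100.0)
    lon_rem = (lon - 100.0 - u) * 60.0
    # Second level: 5 minutes of latitude by 7.5 minutes of longitude.
    q, lat_rem = divmod(lat_rem, 5.0)
    v, lon_rem = divmod(lon_rem, 7.5)
    # Third level: 30 seconds of latitude by 45 seconds of longitude.
    r = math.floor(lat_rem * 60.0 / 30.0)
    w = math.floor(lon_rem * 60.0 / 45.0)
    return f"{int(p):02d}{u:02d}{int(q)}{int(v)}{r}{w}"

# A point near Tokyo Station.
code = mesh_code(35.681236, 139.767125)
print(code)  # 53394611
```

Because every point inside the same cell maps to the same code, reporting the mesh code instead of the coordinates coarsens the position to the cell size.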

Business operators H and I, like business operator G, register driver information in advance. When the position information and driver ID are received from the navigation system 61 of a vehicle, the anonymization device 50 reads the driver information corresponding to the driver ID from the storage unit and stores it together with the position information. The anonymization device 50 then anonymizes this position information and driver information using the integrated anonymization dictionary. In FIG. 42, H1 denotes the personal information acquired by business operator H, and H2 denotes the anonymous information obtained by anonymizing the personal information H1 with the integrated anonymization dictionary. Likewise, I1 denotes the personal information acquired by business operator I, and I2 denotes the anonymous information obtained by anonymizing the personal information I1 with the integrated anonymization dictionary.

For vehicle position information as well, the position information that each business operator collects is thus anonymized with the common integrated anonymization dictionary and registered in the common DB 233 as anonymous information, so that this anonymous information can be used in a unified manner.

〈Embodiment 3〉
The second embodiment described above showed an example in which multiple business operators each acquire and anonymize personal information. The invention is not limited to this; the exhibition organizer may instead accumulate the personal information collectively in a storage unit and anonymize it using the integrated anonymization dictionary. The third embodiment differs from the second embodiment in that personal information is anonymized by a device on the organizer side; the rest of the configuration is the same. The description below therefore focuses on the configuration that differs from the first and second embodiments, and identical elements are given the same reference numerals and are not described again.

FIG. 43 is a functional block diagram of the anonymization system 300 of the third embodiment. The anonymization system 300 anonymizes the personal information that each business operator collects from visitors at an exhibition where multiple business operators exhibit; it creates the integrated anonymization dictionary, anonymizes personal information, and distributes anonymous information.

In the third embodiment, the personal information acquired by each business operator is transmitted from the operator terminal 30 to the anonymization system 300, and the anonymization system 300 stores the personal information acquired by the business operators collectively in its storage unit. That is, the organizer shares each piece of personal information with the business operator that acquired it, and provides it to other business operators in anonymized form. For example, personal information acquired by business operator A is anonymized before it is provided to other business operators, including business operators B and C.
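This sharing rule can be sketched as a small access function in Python. The field name `collected_by`, the placeholder `anonymize` function, and the sample record are assumptions for illustration; real anonymization would use the integrated anonymization dictionary described earlier.

```python
def anonymize(record):
    """Placeholder anonymization: drop directly identifying items (hypothetical)."""
    return {key: value for key, value in record.items()
            if key not in ("name", "email")}

def view_for(requester, record):
    """Return the raw record to the operator that collected it,
    and an anonymized copy to every other operator."""
    if record["collected_by"] == requester:
        return record
    return anonymize(record)

record = {
    "collected_by": "A",
    "name": "Taro Yamada",           # made-up sample data
    "email": "taro@example.com",
    "age group": "30s",
}
print(view_for("A", record))
print(view_for("B", record))
```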

As shown in FIG. 43, the anonymization system 300 includes a dictionary acquisition unit 201, an integration unit 202, a priority determination unit 203, an anonymous information control unit 206, a dimension selection unit 207, a distribution unit 209, a data acquisition unit 101, an abstraction unit 102, a test unit 103, a selection unit 104, an anonymous information registration unit 106, a value data acquisition unit 107, a word category analysis unit 108, a word value calculation unit 109, a setting information registration unit 121, a period acquisition unit 122, a prediction unit 123, a search information accumulation DB 145, a setting information DB 146, a dictionary DB 231, a priority DB 232, a common DB 233, and a personal information DB 234. That is, the anonymization system 300 of the third embodiment is both a dictionary creation device that includes the dictionary acquisition unit 201, the integration unit 202, the priority determination unit 203, and the dimension selection unit 207, and an anonymization device that includes the abstraction unit 102, the test unit 103, and the anonymous information control unit 206.

The dictionary acquisition unit 201 acquires, from the operator terminal 30 of each business operator, a plurality of anonymization dictionaries. Each dictionary stores words in association with their abstracted words, so that a word contained in the target data can be anonymized by replacing it with the abstracted word. In this embodiment, the dictionary acquisition unit 201 receives the anonymization dictionary transmitted from the operator terminal 30 of each business operator and registers it in the dictionary DB 231.

The integration unit 202 integrates the plurality of anonymization dictionaries acquired from the operator terminals 30 of the business operators to create the integrated anonymization dictionary, and stores it in the dictionary DB 231.

The data acquisition unit 101 acquires personal information from the operator terminal 30 and stores it in the personal information DB 234. At the time of anonymization, the data acquisition unit 101 reads, as the target data, the personal information in the personal information DB 234 that falls within the target period.

抽象化部１０２は、記憶部から個人情報を読み出して対象データとし、統合匿名化辞書を参照して前記対象データ中の項目の値である語を前記優先度に基づいて抽象化した語に替えて匿名化候補データを生成する。 The abstraction unit 102 reads the personal information from the storage unit as target data, and refers to the integrated anonymization dictionary and replaces the word that is the value of the item in the target data with the word abstracted based on the priority. To generate anonymization candidate data.
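The dictionary lookup performed by the abstraction unit 102 can be pictured with a minimal Python sketch. The dictionary contents, the item names, and the function `abstract_record` are hypothetical illustrations that do not appear in the specification, and the priority-based choice among dimensions is omitted.

```python
# Hypothetical anonymization dictionary: word -> abstracted word (category).
ANONYMIZATION_DICTIONARY = {
    "Tokyo": "Kanto",
    "Yokohama": "Kanto",
    "Osaka": "Kansai",
}

def abstract_record(record, items, dictionary):
    """Return a copy of `record` with the listed items replaced by abstracted words.

    Words that are missing from the dictionary are left unchanged.
    """
    result = dict(record)
    for item in items:
        if item in result:
            word = result[item]
            result[item] = dictionary.get(word, word)
    return result

candidate = abstract_record(
    {"address": "Tokyo", "product": "camera"}, ["address"], ANONYMIZATION_DICTIONARY
)
```

Here only the `address` item is abstracted, so the `product` value passes through untouched.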

匿名情報制御部２０６は、匿名情報の出力処理等を制御する。例えば、匿名情報制御部２０６は、前記検定の条件を満たした抽象化候補データを匿名情報として共通ＤＢ２３３に登録する、即ち共通ＤＢ２３３に記憶させる。また、匿名情報制御部２０６は、事業者端末３０等の情報処理装置から匿名情報の取得要求を受けた場合に、該当する匿名情報を要求元の情報処理装置へ配信する。本実施形態３において、匿名情報制御部２０６は、出力部の一形態である。 The anonymous information control unit 206 controls an anonymous information output process and the like. For example, the anonymous information control unit 206 registers the abstraction candidate data satisfying the test condition as anonymous information in the common DB 233, that is, stores the abstract candidate data in the common DB 233. Moreover, when the anonymous information control part 206 receives the acquisition request of anonymous information from information processing apparatuses, such as the provider terminal 30, it distributes applicable anonymous information to the information processing apparatus of the request source. In the third embodiment, the anonymous information control unit 206 is a form of an output unit.

また、事業者端末３０は、図４３に示すように、データ入力部３１１や、匿名情報取得部３１２、設定情報送信部３１３を備えている。 Further, as shown in FIG. 43, the provider terminal 30 includes a data input unit 311, an anonymous information acquisition unit 312, and a setting information transmission unit 313.

The data input unit 311 acquires data including a plurality of items associated with an individual, that is, personal information. For example, it receives, via a keyboard or the like, input of a questionnaire filled in by a visitor or personal information heard from a visitor, and transmits it to the anonymization system 300. The data input unit 311 may also read the items written on a visitor's business card or questionnaire and obtain them as electronic data by OCR (Optical Character Recognition), or acquire the visitor's information from the visitor's RF-ID tag, IC chip, or the like, and transmit it to the anonymization system 300.

匿名情報取得部３１２は、匿名化システム３００から匿名情報を取得し、表示部への表示や記憶部への記憶といった出力を行う。 The anonymous information acquisition unit 312 acquires anonymous information from the anonymization system 300 and performs output such as display on a display unit or storage in a storage unit.

設定情報送信部３１３は、ユーザによって入力された対象期間や配信条件等の設定情報を匿名化システム３００へ送信する。 The setting information transmission unit 313 transmits setting information such as a target period and distribution conditions input by the user to the anonymization system 300.

図４４は匿名化システム３００のハードウェア構成を示す図である。管理サーバ２０は、ＣＰＵ２１、メモリ２２、通信制御部２３、記憶装置２４、入出力インタフェース２５を有する所謂コンピュータである。 FIG. 44 is a diagram illustrating a hardware configuration of the anonymization system 300. The management server 20 is a so-called computer having a CPU 21, a memory 22, a communication control unit 23, a storage device 24, and an input / output interface 25.

ＣＰＵ２１は、メモリ２２に実行可能に展開されたプログラムを実行し、前述の辞書取得部２０１や、統合部２０２、優先度決定部２０３、匿名情報制御部２０６、次元選択部２０７、配信部２０９、データ取得部１０１、抽象化部１０２、検定部１０３、選択部１０４、匿名情報登録部１０６、価値データ取得部１０７、ワードカテゴリ分析部１０８、ワード価値計算部１０９、設定情報登録部１２１、期間取得部１２２、予測部１２３の機能を提供する。 The CPU 21 executes the program expanded in the memory 22 so as to be executable, and the above-described dictionary acquisition unit 201, integration unit 202, priority determination unit 203, anonymous information control unit 206, dimension selection unit 207, distribution unit 209, Data acquisition unit 101, abstraction unit 102, test unit 103, selection unit 104, anonymous information registration unit 106, value data acquisition unit 107, word category analysis unit 108, word value calculation unit 109, setting information registration unit 121, period acquisition The functions of the unit 122 and the prediction unit 123 are provided.

The memory 22 can also be called a main storage device. The memory 22 stores, for example, a program executed by the CPU 21, data received via the communication control unit 23, data read from the storage device 24, and other data.

通信制御部２３は、ネットワークを介して他の装置と接続し、当該装置との通信を制御する。入出力インタフェース２５は、表示装置やプリンタ等の出力手段や、キーボードやポインティングデバイス等の入力手段、ドライブ装置等の入出力手段が適宜接続される。 The communication control unit 23 is connected to another device via a network and controls communication with the device. The input / output interface 25 is appropriately connected to output means such as a display device and a printer, input means such as a keyboard and pointing device, and input / output means such as a drive device.

記憶装置２４は、プログラム、個人情報、設定情報等の情報を記憶する。また、記憶装置２４は、ドライブ装置やメモリ２２との間で、データを授受する。本実施形態３では、記憶装置２４が前述の辞書ＤＢ２３１、優先度ＤＢ２３２、共通ＤＢ２３３、個人情報ＤＢ２３４、検索情報蓄積ＤＢ１４５、設定情報ＤＢ１４６を格納している。 The storage device 24 stores information such as programs, personal information, and setting information. Further, the storage device 24 exchanges data with the drive device and the memory 22. In the third embodiment, the storage device 24 stores the aforementioned dictionary DB 231, priority DB 232, common DB 233, personal information DB 234, search information accumulation DB 145, and setting information DB 146.

図４５は事業者端末３０のハードウェア構成を示す図である。事業者端末３０は、ＣＰＵ３１、メモリ３２、通信制御部３３、記憶装置３４、入出力インタフェース３５を有する所謂コンピュータである。 FIG. 45 is a diagram illustrating a hardware configuration of the business entity terminal 30. The business entity terminal 30 is a so-called computer having a CPU 31, a memory 32, a communication control unit 33, a storage device 34, and an input / output interface 35.

ＣＰＵ３１は、メモリ３２に実行可能に展開されたプログラムを実行し、前述のデータ入力部３１１や匿名情報取得部３１２、設定情報送信部３１３の機能を提供する。 The CPU 31 executes the program expanded in the memory 32 and provides the functions of the data input unit 311, the anonymous information acquisition unit 312, and the setting information transmission unit 313.

メモリ３２は、主記憶装置ということもできる。メモリ３２は、例えば、ＣＰＵ３１が実行するプログラムや、通信制御部３３を介して受信したデータ、記憶装置３４から読み出したデータ、その他のデータ等を記憶する。 The memory 32 can also be called a main storage device. The memory 32 stores, for example, a program executed by the CPU 31, data received via the communication control unit 33, data read from the storage device 34, other data, and the like.

通信制御部３３は、ネットワークを介して他の装置と接続し、当該装置との通信を制御する。入出力インタフェース３５は、表示装置やプリンタ等の出力手段や、キーボードやポインティングデバイス等の入力手段、ドライブ装置等の入出力手段が適宜接続される。ドライブ装置は、着脱可能な記憶媒体の読み書き装置であり、例えば、フラッシュメモリカードの入出力装置、ＵＳＢメモリを接続するＵＳＢのアダプタ等である。また、着脱可能な記憶媒体は、例えば、ＣＤ（Compact Disc）、ＤＶＤ（Digital Versatile Disk）、ブルーレイディスク（Blu-ray Disc）等のディスク媒体であってもよい。ドライブ装置は、着脱可能な記憶媒体からプログラムを読み出し、記憶装置３４に格納する。 The communication control unit 33 is connected to another device via a network and controls communication with the device. The input / output interface 35 is appropriately connected to output means such as a display device and a printer, input means such as a keyboard and pointing device, and input / output means such as a drive device. The drive device is a removable storage medium read / write device, such as an input / output device for a flash memory card, a USB adapter for connecting a USB memory, or the like. The removable storage medium may be a disk medium such as a CD (Compact Disc), a DVD (Digital Versatile Disk), or a Blu-ray Disc. The drive device reads the program from the removable storage medium and stores it in the storage device 34.

The storage device 34 can also be called an external storage device. The storage device 34 may be an SSD (Solid State Drive), an HDD, or the like. The storage device 34 exchanges data with the drive device. For example, the storage device 34 stores an information processing program installed from the drive device.

Next, the anonymization method according to the third embodiment will be described. The anonymization system 300 periodically starts the accumulation process shown in FIG. 9 and checks whether personal information has been received from the operator terminal 30 (step S1010). If no personal information has been received (step S1010, No), the anonymization system 300 ends the process of FIG. 9. If personal information has been received (step S1010, Yes), it registers the received personal information in the personal information DB 234 (step S1020) and ends the accumulation process. In this way, the anonymization system 300 repeatedly executes the accumulation process of FIG. 9 and registers the personal information received from the terminal 30 of each operator in the personal information DB 234 as it arrives.
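One run of the accumulation process can be sketched as follows. This is only an illustration: the list `received` standing in for newly arrived records, the list `personal_info_db` standing in for the personal information DB 234, and the function name `accumulate` are assumptions made for the example.

```python
def accumulate(received, personal_info_db):
    """Run the accumulation process once and return the number of records registered.

    received         -- records that arrived since the last run (assumed queue)
    personal_info_db -- list standing in for the personal information DB 234
    """
    if not received:                      # S1010, No: nothing received, end the run
        return 0
    count = len(received)
    personal_info_db.extend(received)     # S1020: register the received records
    received.clear()                      # the queue is now empty for the next run
    return count
```

Calling `accumulate` on a timer would correspond to the periodic start of the process described above.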

また、匿名化システム３００は、図１０に示す処理を定期的に起動させ、先ず、各ユーザの設定情報を設定情報ＤＢ１４６から読み出す（ステップＳ１０３０）。 Moreover, the anonymization system 300 starts the process shown in FIG. 10 periodically, and first reads the setting information of each user from the setting information DB 146 (step S1030).

Next, the anonymization system 300 determines whether the read setting information contains a target period that has not yet been processed (step S1035). If there is no unprocessed target period (step S1035, No), the anonymization system 300 ends the process of FIG. 10. If there is an unprocessed target period (step S1035, Yes), it determines whether the distribution information associated with that target period is prediction information, that is, whether the distribution pattern associated with that target period is 2 (step S1040). If the distribution pattern is determined to be 2 (step S1040, Yes), the anonymization system 300 obtains the personal information of the target period as target data, performs prediction processing based on this target data and the like, and obtains a prediction result (step S1045).

The anonymization system 300 reads the personal information corresponding to the target period from the personal information DB 234 as target data (step S1040). The anonymization system 300 then anonymizes the prediction information and the target data (step S1050). For example, when the prediction information includes personal information (target data), the personal information included in the prediction information is anonymized. When target data is distributed together with the prediction information, the target data used for the prediction is anonymized.

匿名化後、匿名化システム３００は、現在時刻が所定期間毎に定められた配信タイミングに該当するか否か、即ち配信パターン１，２の配信タイミングか否かを判定する（ステップＳ１０５５）。所定期間毎の配信タイミングに該当する場合（ステップＳ１０５５，Ｙｅｓ）、匿名化システム３００は、配信情報を配信先へ配信する（ステップＳ１０６０）。匿名化システム３００は、例えば、配信パターン１の場合、配信情報として順位や匿名情報を配信し、配信パターン２の場合、予測情報や匿名情報を配信する。 After anonymization, the anonymization system 300 determines whether or not the current time corresponds to the distribution timing determined for each predetermined period, that is, whether or not it is the distribution timing of distribution patterns 1 and 2 (step S1055). When it corresponds to the delivery timing for every predetermined period (step S1055, Yes), the anonymization system 300 delivers delivery information to a delivery destination (step S1060). For example, in the case of distribution pattern 1, the anonymization system 300 distributes rank and anonymous information as distribution information, and in the case of distribution pattern 2, it distributes prediction information and anonymous information.

また、ステップＳ１０５５で、配信パターン１，２の配信タイミングでは無いと判定した場合（ステップＳ１０５５，Ｎｏ）、匿名化システム３００は、履歴情報を用いた配信タイミングか否か、即ち配信パターン４〜６か否かを判定する（ステップＳ１０６５）。ここで匿名化システム３００は、配信パターン４〜６でないと判定した場合には（ステップＳ１０６５，Ｎｏ）、図１０の処理を終了し、配信パターン４〜６であると判定した場合には（ステップＳ１０６５，Ｙｅｓ）、履歴情報（過去の匿名情報）を共通ＤＢ２３３から読み出し（ステップＳ１０７０）、匿名情報と比較して配信条件を満たしているか否かを判定する（ステップＳ１０７５）。 If it is determined in step S1055 that it is not the distribution timing of distribution patterns 1 and 2 (step S1055, No), the anonymization system 300 determines whether or not it is a distribution timing using history information, that is, distribution patterns 4-6. It is determined whether or not (step S1065). Here, when the anonymization system 300 determines that the distribution patterns are not 4 to 6 (No at Step S1065), the process of FIG. 10 is terminated, and when it is determined that the distribution patterns are 4 to 6 (Step S1065). (S1065, Yes), the history information (past anonymous information) is read from the common DB 233 (step S1070), and compared with the anonymous information, it is determined whether the delivery condition is satisfied (step S1075).

If the distribution condition is not satisfied (step S1075, No), the anonymization system 300 ends the process of FIG. 10. If the distribution condition is satisfied (step S1075, Yes), it distributes the notification, the rank, and the anonymous information as distribution information (step S1080).
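The branching from step S1055 to step S1080 can be summarized in a short sketch. The function `decide_delivery`, its parameters, and the string labels are hypothetical, and how the timing and the distribution condition are actually evaluated is left to the callers.

```python
def decide_delivery(pattern, timing_due, condition_met):
    """Return the kinds of information to distribute, or None for no delivery.

    pattern       -- distribution pattern number from the setting information
    timing_due    -- True when the current time is a periodic distribution timing
    condition_met -- callable that compares history with the new anonymous information
    """
    if pattern in (1, 2) and timing_due:                 # S1055, Yes
        if pattern == 1:
            return ["rank", "anonymous information"]     # S1060, pattern 1
        return ["prediction information", "anonymous information"]  # S1060, pattern 2
    if pattern in (4, 5, 6):                             # S1065, Yes
        if condition_met():                              # S1070 and S1075
            return ["notification", "rank", "anonymous information"]  # S1080
    return None                                          # end of the process
```

Passing the condition as a callable means the history is only read when the pattern actually relies on it, which mirrors the order of steps S1065 and S1070.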

また、匿名化システム３００は、図１１に示すリアルタイム配信の処理を定期的に起動させ、先ず、各ユーザの設定情報を設定情報ＤＢ１４６から読み出す（ステップＳ１０８２）。次に匿名化システム３００は、リアルタイム配信を行う所定情報を取得したか否かを判定する（ステップＳ１０８５）。 Further, the anonymization system 300 periodically activates the real-time distribution process shown in FIG. 11, and first reads the setting information of each user from the setting information DB 146 (step S1082). Next, the anonymization system 300 determines whether or not predetermined information for performing real-time distribution has been acquired (step S1085).

匿名化システム３００は、所定情報を取得していない場合（ステップＳ１０８５，Ｎｏ）、図１１の処理を終了し、所定情報を取得した場合（ステップＳ１０８５，Ｙｅｓ）、対象データの匿名化処理を行う（ステップＳ１０９０）。 When the anonymization system 300 has not acquired the predetermined information (step S1085, No), the process of FIG. 11 is terminated, and when the predetermined information is acquired (step S1085, Yes), the target data is anonymized. (Step S1090).

匿名化後、匿名化システム３００は、匿名情報をリアルタイムに配信先の端末３０へ配信する（ステップＳ１０９５）。 After anonymization, the anonymization system 300 delivers anonymous information to the delivery destination terminal 30 in real time (step S1095).

Next, the anonymization process in step S1050 will be described with reference to FIG. 46. The anonymization system 300 of the third embodiment receives the anonymization dictionary of each operator from the operator terminal 30 of that operator and creates an integrated anonymization dictionary. This integration process is the same as the process of FIG. 32, so its description is not repeated.

The anonymization system 300 reads and acquires, from the personal information DB 131, the personal information corresponding to the target period determined to be unprocessed in step S1035 (step S610), and determines, for each word in the target data, whether value data exists in the search information storage DB 142 (step S620). If value data for all words exists in the search information storage DB 142 (step S620, Yes), the anonymization system 300 proceeds to step S630. If value data is missing for some word (step S620, No), it acquires the value data of that word from an external device, in this example a search engine (step S640). The anonymization system 300 then acquires, from the search information storage DB 142, the value data of the words that exist in the search information storage DB 142 (step S630).
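The lookup with a fallback to an external source behaves like a simple cache. In the sketch below, the dictionary `value_cache` stands in for the search information storage DB and `fetch_value` is a hypothetical callable representing the query to a search engine; neither name comes from the specification.

```python
def get_word_values(words, value_cache, fetch_value):
    """Return {word: value}, filling gaps in the cache from an external source.

    value_cache -- dict standing in for the search information storage DB
    fetch_value -- hypothetical callable that asks an external search engine
    """
    for word in words:
        if word not in value_cache:                  # S620, No: value data is missing
            value_cache[word] = fetch_value(word)    # S640: fetch and keep it
    return {word: value_cache[word] for word in words}   # S630: read from the store
```

Storing the fetched value in the cache is an assumption of this sketch; it keeps later runs from repeating the external query.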

また、匿名化システム３００は、匿名性を満たすため対象データの各項目を抽象化したワード（カテゴリ）に置き換えて抽象化候補データを作成する（ステップＳ６５０）。なお、抽象化可能な項目が複数存在する場合には、各項目を抽象化した場合と抽象化しない場合の全てのパターンを作成する。 Further, the anonymization system 300 creates abstraction candidate data by replacing each item of the target data with an abstracted word (category) in order to satisfy anonymity (step S650). When there are a plurality of items that can be abstracted, all patterns are created when each item is abstracted and when it is not abstracted.
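Creating every pattern of abstracted and non-abstracted items amounts to enumerating all subsets of the abstractable items. A minimal sketch, with the item names and the function `candidate_patterns` chosen only for illustration:

```python
from itertools import product

def candidate_patterns(abstractable_items):
    """Yield every subset of items to abstract, as a tuple of item names.

    With n abstractable items this yields 2**n patterns, including the
    pattern in which nothing is abstracted.
    """
    for flags in product([False, True], repeat=len(abstractable_items)):
        yield tuple(item for item, flag in zip(abstractable_items, flags) if flag)

patterns = list(candidate_patterns(["age", "address"]))
# patterns == [(), ("address",), ("age",), ("age", "address")]
```

Because the number of patterns doubles with each additional item, the ordering of the tests by value described next matters when many items can be abstracted.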

次に匿名化システム３００は、抽象化候補データに含まれる各ワードの価値データに基づいて各パターンの抽象化候補データの価値を算出し（ステップＳ６６０）、この抽象化候補データの価値に基づいて検定の順番を決定する（ステップＳ６７０）。 Next, the anonymization system 300 calculates the value of the abstraction candidate data of each pattern based on the value data of each word included in the abstraction candidate data (step S660), and based on the value of this abstraction candidate data. The order of testing is determined (step S670).
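The text does not state how word values are combined or in which direction the candidates are ordered. The sketch below assumes, purely for illustration, that the value of a candidate is the sum of its word values and that candidates are tested from the highest value to the lowest; the names `candidate_value` and `order_for_testing` are hypothetical.

```python
def candidate_value(records, items, word_values):
    """Sum the value of every word in the given items across all records (assumed rule)."""
    return sum(word_values.get(rec[item], 0.0) for rec in records for item in items)

def order_for_testing(candidates, items, word_values):
    """Return candidate names sorted from highest to lowest value (assumed order)."""
    return sorted(
        candidates,
        key=lambda name: candidate_value(candidates[name], items, word_values),
        reverse=True,
    )

word_values = {"Tokyo": 3.0, "Kanto": 1.0, "camera": 2.0}
candidates = {
    "abstracted": [{"address": "Kanto", "product": "camera"}],
    "raw": [{"address": "Tokyo", "product": "camera"}],
}
order = order_for_testing(candidates, ["address", "product"], word_values)
```

In this toy data the unabstracted candidate keeps the more valuable word "Tokyo", so it is tested first.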

Following this order, the anonymization system 300 tests the anonymity of the abstraction candidate data (step S680). For example, to test k-anonymity, it obtains the number of times (the existence count) that each combination of values of different items associated with one individual occurs in the abstraction candidate data. Alternatively, to test l-diversity, it obtains the number of times (the existence count) that each combination of values of the same item associated with one individual occurs in the abstraction candidate data. The smallest of these existence counts is taken as the minimum appearance count (k value / l value) (step S690), and it is determined whether this minimum appearance count exceeds 1 (step S700). That is, if the k value exceeds 1, k-anonymity is satisfied, and if it is 1, k-anonymity is not satisfied. Similarly, if the l value exceeds 1, l-diversity is satisfied, and if it is 1, l-diversity is not satisfied.
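The minimum appearance count can be computed as in the following sketch. It uses the commonly cited definitions of k-anonymity and l-diversity, which may differ in detail from the counting described above; the item names, the sample records, and the functions `k_value` and `l_value` are assumptions, and both functions expect a non-empty list of records.

```python
from collections import Counter

def k_value(records, quasi_identifiers):
    """Smallest number of records sharing one combination of quasi-identifier values."""
    counts = Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)
    return min(counts.values())

def l_value(records, quasi_identifiers, sensitive_item):
    """Smallest number of distinct sensitive values within one group of records."""
    groups = {}
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        groups.setdefault(key, set()).add(rec[sensitive_item])
    return min(len(values) for values in groups.values())

records = [
    {"age": "20s", "area": "Kanto", "product": "camera"},
    {"age": "20s", "area": "Kanto", "product": "lens"},
    {"age": "30s", "area": "Kansai", "product": "camera"},
]
passes = k_value(records, ["age", "area"]) > 1   # False: one record is unique
```

The third record is the only one with the combination ("30s", "Kansai"), so the k value is 1 and the candidate would need further abstraction.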

If the minimum appearance count (k value / l value) does not exceed 1 (step S700, No), the anonymization system 300 further abstracts the value of at least one item of the abstraction candidate data, that is, replaces it with an abstracted word (step S710), and returns to step S680.

On the other hand, if the minimum appearance count (k value / l value) exceeds 1 (step S700, Yes), the anonymization system 300 obtains the difference between the value of the abstraction candidate data and the value of the original target data (step S720), and determines, as the value of the abstraction candidate data, this difference or a value based on it, for example the ratio of the difference to the value of the target data or the ratio of the value of the abstraction candidate data to the value of the target data (step S730).
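The three quantities mentioned for steps S720 and S730 are simple arithmetic. The sketch below assumes the difference is taken as the original value minus the candidate value, since the sign is not specified; the function name `value_loss` is hypothetical.

```python
def value_loss(original_value, abstracted_value):
    """Return (difference, loss ratio, retained ratio) for one candidate.

    difference     -- original value minus candidate value (assumed sign)
    loss ratio     -- difference divided by the original value
    retained ratio -- candidate value divided by the original value
    """
    if original_value <= 0:
        raise ValueError("original_value must be positive")
    difference = original_value - abstracted_value
    return difference, difference / original_value, abstracted_value / original_value
```

For an original value of 10.0 and a candidate value of 8.0, the difference is 2.0, the loss ratio is 0.2, and the retained ratio is 0.8.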

また、匿名化システム３００は、検定していない候補パターンがあるか否かを判定し（ステップＳ７４０）、検定していない候補パターンがあれば（ステップＳ７４０，Ｙｅｓ）、ステップＳ６７０で決定した順番に従って、次の順番の抽象化候補データを特定し（ステップＳ７５０）、ステップＳ６８０に戻って次の抽象化候補データについて検定を行う。 Further, the anonymization system 300 determines whether there is a candidate pattern that has not been verified (step S740). If there is a candidate pattern that has not been verified (step S740, Yes), the anonymization system 300 follows the order determined in step S670. Next, the abstraction candidate data in the next order is specified (step S750), and the process returns to step S680 to test the next abstraction candidate data.

The test is repeated in this way for the abstraction candidate data of each pattern. When no further candidate pattern remains (step S740, No), the anonymization system 300 selects the abstraction candidate data to be adopted on the basis of the value of each abstraction candidate data obtained in step S720 (step S760), and registers the selected abstraction candidate data in the common DB 233 as anonymous information (step S770).
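The loop over steps S680 to S760 can be outlined as follows. This is a sketch under stated assumptions: the three callables are hypothetical stand-ins for the test, the further abstraction, and the value calculation; `generalize` is assumed to return None when nothing is left to abstract; and the candidate with the highest value is assumed to be the one adopted.

```python
def select_candidate(ordered_candidates, min_count, generalize, value_of):
    """Test each candidate in order and return the passing one with the highest value.

    min_count  -- callable returning the minimum appearance count (k or l value)
    generalize -- callable returning a further-abstracted candidate, or None (assumption)
    value_of   -- callable returning the value of a candidate
    """
    passed = []
    for candidate in ordered_candidates:                      # S740 and S750
        while candidate is not None and min_count(candidate) <= 1:   # S680 to S700
            candidate = generalize(candidate)                 # S710
        if candidate is not None:
            passed.append(candidate)                          # S720 and S730
    return max(passed, key=value_of) if passed else None      # S760
```

The None convention for `generalize` is there only to guarantee that the inner loop terminates in this sketch.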

As described above, according to the third embodiment, anonymization is performed using the personal information within the target period specified by the user as target data. The degree of abstraction needed for anonymization therefore varies with the length of the target period, and appropriate anonymization based on the target period and the value after abstraction can be performed. In addition, because the anonymization system 300 anonymizes using an integrated anonymization dictionary that combines the anonymization dictionaries of a plurality of operators, it can provide anonymous information suited to each operator.

Furthermore, by distributing when the anonymous information satisfies a predetermined condition, the system can provide a trigger for starting predetermined processing based on the anonymous information as a whole, including the anonymous information of other operators.

<Embodiment 4>
In the second embodiment described above, each of a plurality of operators has an anonymization device and the exhibition organizer has a management server (dictionary creation device). The configuration is not limited to this; each of the operators may instead have an anonymization system that includes both a dictionary creation device and an anonymization device. The fourth embodiment differs from the second embodiment in that each operator has its own anonymization system 400, and the other configurations are the same. Therefore, mainly the configurations that differ from the first and second embodiments are described, and the same elements are given the same reference numerals and their description is not repeated.

As shown in FIG. 47, the anonymization system 400 includes a dictionary acquisition unit 201, an integration unit 202, a priority determination unit 203, an anonymous information control unit 206, a dimension selection unit 207, a distribution unit 209, a data acquisition unit 101, an abstraction unit 102, a test unit 103, a selection unit 104, an anonymous information registration unit 106, a value data acquisition unit 107, a word category analysis unit 108, a word value calculation unit 109, a setting information registration unit 121, a period acquisition unit 122, a prediction unit 123, a search information storage DB 145, a setting information DB 146, a dictionary DB 231, a priority DB 232, a common DB 233, and a personal information DB 234. That is, the anonymization system 400 of the fourth embodiment is a dictionary creation device that includes the dictionary acquisition unit 201, the integration unit 202, the priority determination unit 203, and the dimension selection unit 207, and is also an anonymization device that includes the data acquisition unit 101, the abstraction unit 102, the test unit 103, and the output control unit 110.

The dictionary acquisition unit 201 acquires, from the anonymization systems 400 of other operators, a plurality of anonymization dictionaries, each of which stores words in association with their abstracted words so that a word included in the target data can be anonymized by replacing it with an abstracted word. In the fourth embodiment, the dictionary acquisition unit 201 receives the anonymization dictionary transmitted from the anonymization system 400 of another operator and registers it in the dictionary DB 231.

統合部２０２は、各事業者の匿名化装置５０から取得した匿名化辞書及び自社の匿名化辞書を統合して統合匿名化辞書を作成する。 The integration unit 202 integrates the anonymization dictionary acquired from each company's anonymization device 50 and its own anonymization dictionary to create an integrated anonymization dictionary.

優先度決定部２０３は、前記統合匿名化辞書を構成する次元の夫々について、当該次元に含まれる語に基づいて優先度を決定する。 The priority determination unit 203 determines a priority for each dimension constituting the integrated anonymization dictionary based on words included in the dimension.

The dictionary management unit 204 manages the integrated anonymization dictionary created by the integration unit 202. For example, the dictionary management unit 204 reads the integrated anonymization dictionary from the dictionary DB 231 and transmits it to the anonymization system 400 of another operator.

匿名情報制御部２０６は、匿名情報の出力処理等を制御する。例えば、匿名情報を他の事業者の匿名化システム４００へ送信する。本実施形態４において、匿名情報制御部２０６は、出力部の一形態である。 The anonymous information control unit 206 controls an anonymous information output process and the like. For example, the anonymous information is transmitted to the anonymization system 400 of another business. In the fourth embodiment, the anonymous information control unit 206 is a form of an output unit.

データ取得部１０１は、個人と対応付けられた複数の項目を含むデータ、即ち個人情報を対象データとして取得する。 The data acquisition unit 101 acquires data including a plurality of items associated with an individual, that is, personal information as target data.

The test unit 103 performs a test whose condition is that a combination of item values in the abstraction candidate data is not limited to a single individual in the target data.

図４８は匿名化システム４００のハードウェア構成を示す図である。匿名化システム４００は、ＣＰＵ２１、メモリ２２、通信制御部２３、記憶装置２４、入出力インタフェース２５を有する所謂コンピュータである。 FIG. 48 is a diagram illustrating a hardware configuration of the anonymization system 400. The anonymization system 400 is a so-called computer having a CPU 21, a memory 22, a communication control unit 23, a storage device 24, and an input / output interface 25.

In the anonymization system 400 of the fourth embodiment, an operator passes its anonymization dictionary to the operators with which it exchanges anonymous information so that an integrated anonymization dictionary is created, anonymizes its own personal information with the integrated anonymization dictionary, and provides the result to the other operators.

なお、統合匿名化辞書の作成の処理や、統合匿名化辞書を用いた匿名化の処理は、前述の実施形態と同様である。 Note that the process of creating the integrated anonymization dictionary and the process of anonymization using the integrated anonymization dictionary are the same as in the above-described embodiment.

Because each operator has the anonymization system 400 in this way, operators can exchange anonymous information with one another even when no party that coordinates a plurality of operators, such as an exhibition organizer, exists.

これにより複数の事業者間で夫々の顧客の個人情報を匿名化して交換し、業務提携やマーケティング等の分析に用いることができる。 Thereby, the personal information of each customer can be anonymized and exchanged between a plurality of business operators, and can be used for business tie-ups and marketing analysis.

<Others>
The present invention is not limited to the illustrated examples described above, and various modifications can of course be made without departing from the gist of the present invention.

Description of Symbols
1 CPU
2 Memory
4 Storage device
5 Input/output interface
10 Anonymization system
19 Value calculation unit
20 Management server
22 Memory
23 Communication control unit
24 Storage device
25 Input/output interface
30 Terminal
31 Data input unit
32 Memory
33 Communication control unit
34 Storage device
35 Input/output interface
50 Anonymization device
52 Memory
53 Communication control unit
54 Storage device
55 Input/output interface
61 Navigation system
70 Search engine
71 Data output unit
72 Search word storage DB
73 Search advertisement distribution DB
100 Anonymization system
101 Data acquisition unit
102 Abstraction unit
103 Test unit
104 Selection unit
105 Value determination unit
106 Anonymous information registration unit
107 Value data acquisition unit
108 Word category analysis unit
109 Word value calculation unit
110 Output control unit
120 Data output unit
121 Setting information registration unit
122 Period acquisition unit
123 Prediction unit
131 Personal information DB
142 Search information storage DB
144 Anonymization DB
145 Search information storage DB
146 Setting information DB
231 Dictionary DB
232 Priority DB
233 Common DB
234 Personal information DB
236 Setting information DB
201 Dictionary acquisition unit
202 Integration unit
203 Priority determination unit
204 Dictionary management unit
205 Anonymous information registration unit
206 Anonymous information control unit
207 Dimension selection unit
208 Setting information management unit
209 Distribution unit
300 Anonymization system
311 Data input unit
312 Anonymous information acquisition unit
313 Setting information transmission unit
400 Anonymization system

Claims

An anonymization system comprising:
a period acquisition unit that acquires a period for which anonymization is to be performed;
a data acquisition unit that acquires, from target data including a plurality of items associated with individuals, the target data corresponding to the period;
an abstraction unit that generates abstraction candidate data by replacing a word that is a value of an item in the target data with an abstracted word;
a value determination unit that receives the value of each word included in the abstraction candidate data and obtains the value of the abstraction candidate data based on the values of those words;
a test unit that tests the abstraction candidate data on the condition that a combination of values of items of the abstraction candidate data is not limited to one individual in the target data;
a selection unit that selects, based on the value of the abstraction candidate data, abstraction candidate data that satisfies the test condition; and
an output unit that treats the abstraction candidate data selected based on the value as anonymous information, and outputs the anonymous information when the timing of distributing the anonymous information, prediction information based on the target data, the degree of deviation from a reference in the anonymous information, the number of appearances of an attribute value in the anonymous information, the appearance rate of an attribute value in the anonymous information, or the rank of the anonymous information satisfies a condition set by a user.
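As an illustrative, non-limiting sketch, the abstraction, test, value determination and selection units recited above might interact as follows. The dictionary `ABSTRACTION`, the scores in `WORD_VALUE`, the choice to abstract whole items at a time, the threshold `k`, and all function names are assumptions made for this example only and are not taken from the specification. The "value" of a word is read here as a numeric score, and the value of a candidate as the sum of the scores of its words.

```python
from collections import Counter
from itertools import product

# Hypothetical abstraction dictionary: word -> abstracted word.
ABSTRACTION = {"Tokyo": "Kanto", "Chiba": "Kanto", "23": "20s", "27": "20s"}
# Hypothetical value (score) of each word; abstracted words score lower.
WORD_VALUE = {"Tokyo": 3, "Chiba": 3, "Kanto": 1, "23": 3, "27": 3, "20s": 2}


def make_candidates(records, items):
    """Yield one candidate per subset of items whose words are abstracted."""
    for mask in product([False, True], repeat=len(items)):
        chosen = {item for item, flag in zip(items, mask) if flag}
        yield [
            {k: ABSTRACTION.get(v, v) if k in chosen else v for k, v in r.items()}
            for r in records
        ]


def passes_test(candidate, k=2):
    """Every combination of item values must be shared by at least k records."""
    counts = Counter(tuple(sorted(r.items())) for r in candidate)
    return all(n >= k for n in counts.values())


def candidate_value(candidate):
    """Value of a candidate: sum of the values of the words it contains."""
    return sum(WORD_VALUE.get(v, 0) for r in candidate for v in r.values())


def select_best(records, items, k=2):
    """Return the highest-value candidate that passes the test, or None."""
    passing = [c for c in make_candidates(records, items) if passes_test(c, k)]
    return max(passing, key=candidate_value, default=None)


records = [
    {"area": "Tokyo", "age": "23"},
    {"area": "Tokyo", "age": "27"},
    {"area": "Chiba", "age": "23"},
    {"area": "Chiba", "age": "27"},
]
best = select_best(records, ["area", "age"])
```

With these sample scores, abstracting only the age item passes the test while keeping the most value, so `best` retains the original area words and replaces each age with "20s".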

An anonymization system comprising:
a period acquisition unit that acquires a period for which anonymization is to be performed;
a data acquisition unit that acquires, from target data including a plurality of items associated with individuals, the target data corresponding to the period;
an abstraction unit that generates abstraction candidate data by replacing a word that is a value of an item in the target data with an abstracted word;
a value determination unit that receives the value of each word included in the abstraction candidate data and obtains the value of the abstraction candidate data based on the values of those words;
a test unit that tests the abstraction candidate data on the condition that a combination of values of items of the abstraction candidate data is not limited to one individual in the target data;
a selection unit that selects, based on the value of the abstraction candidate data, abstraction candidate data that satisfies the test condition;
an output unit that outputs the abstraction candidate data selected based on the value as anonymous information;
a dictionary acquisition unit that acquires a plurality of anonymization dictionaries, each storing a word in association with an abstracted word, for anonymizing a word included in the target data by replacing it with the abstracted word;
an integration unit that, based on the correspondence between the words included in the plurality of anonymization dictionaries, treats an abstracted word as a higher-level word and the word before abstraction as a lower-level word, associates each word included in the plurality of anonymization dictionaries with its higher-level and lower-level words present in the anonymization dictionaries, links the words in a tree shape from a top-level word having no corresponding higher-level word, as the root, down to bottom-level words having no corresponding lower-level word, sets each such tree-shaped correspondence as one dimension, obtains a dimension for each of the plurality of top-level words included in the plurality of anonymization dictionaries, and forms an integrated anonymization dictionary from the plurality of dimensions thus obtained;
a priority determination unit that determines, for each of the dimensions, a priority based on the words included in the dimension; and
a dimension selection unit that selects, based on the priority, which of the plurality of dimensions to adopt in the integrated anonymization dictionary and which not to adopt,
wherein the abstraction unit refers to the integrated anonymization dictionary and generates the abstraction candidate data by replacing the word that is the value of the item in the target data with an abstracted word.
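By way of a hedged example, the integration of several anonymization dictionaries into tree-shaped dimensions, followed by priority-based dimension selection, could be sketched as below. The function names, the sample dictionaries, and the priority rule (a dimension containing more words ranks higher) are hypothetical. The sketch further assumes that each word has at most one higher-level word and that the dictionaries contain no cycles.

```python
from collections import defaultdict


def integrate(dictionaries):
    """Merge word -> abstracted-word dictionaries into one tree per top-level word."""
    parent = {}
    for d in dictionaries:
        parent.update(d)  # assumption: a word has a single higher-level word
    children = defaultdict(set)
    for word, upper in parent.items():
        children[upper].add(word)
    # Top-level words appear as a higher-level word but have none themselves.
    roots = sorted(w for w in children if w not in parent)

    def subtree(word):
        # Recurse down to bottom-level words, which map to an empty dict.
        return {c: subtree(c) for c in sorted(children.get(word, ()))}

    return {root: subtree(root) for root in roots}


def count_words(tree):
    """Number of words below the root of a dimension."""
    return sum(1 + count_words(sub) for sub in tree.values())


def select_dimensions(dimensions, top_n):
    """Adopt the top_n dimensions by priority; return (adopted, rejected roots)."""
    # Hypothetical priority: total word count of the dimension, root included.
    ranked = sorted(dimensions, key=lambda r: (-(1 + count_words(dimensions[r])), r))
    return {r: dimensions[r] for r in ranked[:top_n]}, ranked[top_n:]


geo = {"Shinjuku": "Tokyo", "Tokyo": "Kanto", "Chiba": "Kanto"}
age = {"23": "20s", "27": "20s"}
dims = integrate([geo, age])
adopted, rejected = select_dimensions(dims, top_n=1)
```

Here `dims` holds two dimensions rooted at "Kanto" and "20s". Under the assumed word-count priority, the "Kanto" dimension is adopted and "20s" is not.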

An anonymization method in which a computer executes:
a step of acquiring a period for which anonymization is to be performed;
a step of acquiring, from target data including a plurality of items associated with individuals, the target data corresponding to the period;
a step of generating abstraction candidate data by replacing a word that is a value of an item in the target data with an abstracted word;
a step of receiving the value of each word included in the abstraction candidate data and obtaining the value of the abstraction candidate data based on the values of those words;
a step of testing the abstraction candidate data on the condition that a combination of values of items of the abstraction candidate data is not limited to one individual in the target data;
a step of selecting, based on the value of the abstraction candidate data, abstraction candidate data that satisfies the test condition; and
a step of treating the abstraction candidate data selected based on the value as anonymous information, and outputting the anonymous information when the timing of distributing the anonymous information, prediction information based on the target data, the degree of deviation from a reference in the anonymous information, the number of appearances of an attribute value in the anonymous information, the appearance rate of an attribute value in the anonymous information, or the rank of the anonymous information satisfies a condition set by a user.
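The period-acquisition and conditional-output steps of the method can be illustrated with the following minimal sketch. It covers only two of the recited output conditions (number of appearances and appearance rate of an attribute value). The record layout with a `date` field, the parameter names `min_count` and `min_rate`, and the sample data are assumptions for illustration and do not come from the specification.

```python
from collections import Counter
from datetime import date


def in_period(records, start, end):
    """Keep only the target data whose date falls within the acquired period."""
    return [r for r in records if start <= r["date"] <= end]


def should_output(anonymous, item, value, min_count=None, min_rate=None):
    """True when the count or rate of an attribute value meets the user's condition."""
    counts = Counter(r[item] for r in anonymous)
    n = counts[value]
    rate = n / len(anonymous) if anonymous else 0.0
    if min_count is not None and n >= min_count:
        return True
    if min_rate is not None and rate >= min_rate:
        return True
    return False


records = [
    {"date": date(2013, 9, 1), "area": "Kanto"},
    {"date": date(2013, 9, 15), "area": "Kanto"},
    {"date": date(2013, 9, 20), "area": "Kansai"},
    {"date": date(2013, 10, 2), "area": "Kanto"},
]
september = in_period(records, date(2013, 9, 1), date(2013, 9, 30))
by_count = should_output(september, "area", "Kanto", min_count=2)
by_rate = should_output(september, "area", "Kanto", min_rate=0.8)
```

In this sample, three records fall within September. "Kanto" appears twice, so the count condition is met, while its appearance rate of two thirds does not reach the assumed 0.8 threshold.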

An anonymization program that causes a computer to execute:
a step of acquiring a period for which anonymization is to be performed;
a step of acquiring, from target data including a plurality of items associated with individuals, the target data corresponding to the period;
a step of generating abstraction candidate data by replacing a word that is a value of an item in the target data with an abstracted word;
a step of receiving the value of each word included in the abstraction candidate data and obtaining the value of the abstraction candidate data based on the values of those words;
a step of testing the abstraction candidate data on the condition that a combination of values of items of the abstraction candidate data is not limited to one individual in the target data;
a step of selecting, based on the value of the abstraction candidate data, abstraction candidate data that satisfies the test condition; and
a step of treating the abstraction candidate data selected based on the value as anonymous information, and outputting the anonymous information when the timing of distributing the anonymous information, prediction information based on the target data, the degree of deviation from a reference in the anonymous information, the number of appearances of an attribute value in the anonymous information, the appearance rate of an attribute value in the anonymous information, or the rank of the anonymous information satisfies a condition set by a user.