JP6693135B2 - Information processing apparatus, information processing method, and program

Info

Publication number: JP6693135B2
Application number: JP2016004610A
Authority: JP
Inventors: 裕司山岡
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2016-01-13
Filing date: 2016-01-13
Publication date: 2020-05-13
Anticipated expiration: 2036-01-13
Also published as: JP2017126170A

Description

The present invention relates to an information processing device, an information processing method, and a program.

A k-anonymization method that processes original data so that it has k-anonymity has been proposed (Patent Document 1). A method has also been proposed that calculates rules for anonymizing previously acquired stream data so as to satisfy k-anonymity, and then anonymizes newly acquired stream data (Patent Document 2).

In addition, an anonymization method has been proposed in which the data recorded in each item of the individual records in the original data is organized hierarchically in a tree structure, and the items of each individual record are generalized based on counts of appearance frequency (Patent Document 3).

Patent Document 1: US Patent Application Publication No. 2002/169793
Patent Document 2: JP 2013-80375 A
Patent Document 3: International Publication No. 2011/145401

The methods disclosed in Patent Document 1 and Patent Document 3 anonymize individual records that have been collected in advance. To properly process additional individual records that arrive sequentially, the initially processed records and the additional records must be processed together, which takes time. Anonymization is therefore difficult when additional individual records arrive sequentially.

Patent Document 2 discloses a method of processing additional individual records that arrive sequentially. However, when the additional records have little in common with the previously processed records, anonymity cannot be ensured, and it may be possible to identify an individual from the additional records.

In one aspect, an object is to provide an information processing device and the like that reduce the processing load required to anonymize additional individual records that arrive sequentially.

The information processing device includes an acquisition unit that sequentially acquires input individual records, each having data associated with a plurality of items, and a conversion unit that sequentially converts, based on a rule, each input individual record acquired by the acquisition unit into an output individual record identical to one of a plurality of anonymized individual records, each of which has data associated with the plurality of items. When the conversion unit cannot convert the input individual record into an output individual record identical to an anonymized individual record, it converts the input individual record into a replacement individual record in which all the data associated with the items of the anonymized individual record is replaced with other data.

In one aspect, the processing load required to anonymize additional individual records that arrive sequentially can be reduced.

FIG. 1 is an explanatory diagram showing the configuration of the information processing system.
FIG. 2 is an explanatory diagram showing an example of original data.
FIG. 3 is an explanatory diagram showing the record layout of the anonymized data DB.
FIG. 4 is an explanatory diagram showing the process of creating the anonymization automaton.
FIG. 5 is an explanatory diagram showing the process of creating the anonymization automaton.
FIG. 6 is an explanatory diagram showing the process of creating the anonymization automaton.
FIG. 7 is an explanatory diagram showing the anonymization automaton.
FIG. 8 is an explanatory diagram showing an example of streaming data.
FIG. 9 is an explanatory diagram showing an example of streaming data after anonymization processing.
FIG. 10 is a flowchart showing the flow of processing of the program.
FIG. 11 is a flowchart showing the flow of processing of the hierarchical tree creation subroutine.
FIG. 12 is a flowchart showing the flow of processing of the node reduction subroutine.
FIG. 13 is a flowchart showing the flow of processing of the bypass addition subroutine.
FIG. 14 is a flowchart showing the flow of processing of the bypass addition subroutine.
FIG. 15 is a flowchart showing the flow of processing of the anonymization subroutine.
FIG. 16 is an explanatory diagram showing an example of streaming data after anonymization processing according to Embodiment 2.
FIG. 17 is an explanatory diagram showing the anonymization automaton of Embodiment 2.
FIG. 18 is a flowchart showing the flow of processing of the program of Embodiment 2.
FIG. 19 is an explanatory diagram showing an example of streaming data according to Embodiment 3.
FIG. 20 is an explanatory diagram showing an example of streaming data after anonymization processing according to Embodiment 3.
FIG. 21 is a flowchart showing the flow of processing of the anonymization subroutine of Embodiment 3.
FIG. 22 is an explanatory diagram showing the process of creating the hierarchical tree of Embodiment 4.
FIG. 23 is an explanatory diagram showing the process of creating the hierarchical tree of Embodiment 4.
FIG. 24 is a flowchart showing the flow of processing of the hierarchical tree creation subroutine of Embodiment 4.
FIG. 25 is an explanatory diagram showing the configuration of the information processing system of Embodiment 5.
FIG. 26 is a flowchart showing the flow of processing of the program of Embodiment 5.
FIG. 27 is a flowchart showing the flow of processing of the program of Embodiment 6.
FIG. 28 is a functional block diagram showing the operation of the information processing device of Embodiment 7.
FIG. 29 is an explanatory diagram showing the configuration of the information processing system of Embodiment 8.

[Embodiment 1]
FIG. 1 is an explanatory diagram showing the configuration of the information processing system 10. The information processing system 10 includes a server 11, a first client 21, and a second client 25, which are connected via a network.

The server 11 includes a server CPU (Central Processing Unit) 12, a main storage device 13, an auxiliary storage device 14, a communication unit 15, and a bus. The server 11 of the present embodiment is an information processing device that anonymizes streaming data. The server 11 is an information device such as a general-purpose personal computer or a large-scale computer. The server 11 may also be a virtual machine running on a large-scale computer.

The server CPU 12 is an arithmetic and control unit that executes the program according to the present embodiment. One or more CPUs, a multi-core CPU, or the like is used as the server CPU 12. The server CPU 12 is connected via the bus to each hardware part constituting the simulation device 10.

The main storage device 13 is a storage device such as an SRAM (Static Random Access Memory), a DRAM (Dynamic Random Access Memory), or a flash memory. The main storage device 13 temporarily stores information needed during processing by the server CPU 12 and the program being executed by the server CPU 12.

The auxiliary storage device 14 is a storage device such as an SRAM, a flash memory, a hard disk, or a magnetic tape. The auxiliary storage device 14 stores the program to be executed by the server CPU 12, the anonymized data DB (Data Base) 31, the anonymization automaton 32, and various information needed to execute the program. The communication unit 15 is an interface that communicates with the network.

The second client 25 is a client that sequentially acquires input individual records and sends them to the network. The second client 25 is an information terminal installed in a store or the like, or an information device such as a personal computer, tablet, or smartphone used by a general consumer or the like. In the following description, a user who operates the second client 25 is referred to as a second user.

Input individual records are acquired from, for example, website membership registrations, mail-order transaction records, and point-card point-award records. Each input individual record is sent sequentially from the second client 25 to the network in a predetermined format containing data corresponding to a plurality of items.

The first client 21 is a client used by a user who makes use of anonymized data, for example for marketing. Anonymized data is described later. The first client 21 of the present embodiment is an information device such as a general-purpose personal computer or a large-scale computer. In the following description, a user who operates the first client 21 is referred to as a first user. The first client 21 sequentially receives the output individual records that the server CPU 12 produces by anonymizing the input individual records.

Like the server 11, the first client 21 and the second client 25 each include a CPU, a main storage device, an auxiliary storage device, and a communication unit. The first client 21 and the second client 25 further include input interfaces such as a keyboard, touch panel, and mouse, and output interfaces such as a liquid crystal display, organic EL display, and printer. The internal configurations of the first client 21 and the second client 25 are not illustrated.

FIG. 2 is an explanatory diagram showing an example of original data. The original data is stored in an information processing device separate from the information processing system 10 of the present embodiment. The original data is, for example, data such as website membership registrations, mail-order transaction records, and point-card point-award records. One row of FIG. 2 represents one individual record contained in the original data. An individual record is data handled as a single unit, such as one person's membership registration information or the record of one transaction. As an example, FIG. 2 shows individual records containing three items: postal code, year of birth, and sex. Note that FIG. 2 shows an example in which only the first three digits of the postal code are recorded.

In addition to these, the individual records of the original data may contain items such as address, name, telephone number, email address, family structure, purchased products, and payment method. The original data may therefore contain personal information. Secondary use of original data containing personal information is legally restricted. The original data shown in FIG. 2 contains from several tens to several million individual records. In the following description, an individual record of the original data is referred to as an original data record.

FIG. 3 is an explanatory diagram showing the record layout of the anonymized data DB 31. The anonymized data DB 31 is a DB that records anonymized data.

Anonymized data is data that has been processed so that an individual cannot be identified from an individual record. Anonymized data can be used, for example, for marketing analysis and service optimization. In the following description, making use of anonymized data is referred to as secondary use.

k-anonymity is the state in which, for every individual record, at least k records with identical content exist, counting the record itself. Here k is an integer of 2 or more. If k-anonymity is ensured, the individuals linked to an individual record cannot be narrowed down to fewer than k. The anonymized data DB 31 of the present embodiment has k-anonymity.
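The k-anonymity condition defined above can be checked by counting identical records. The following is a minimal sketch, not part of the patent: the function name `is_k_anonymous` and the sample records are assumptions made for illustration.

```python
from collections import Counter


def is_k_anonymous(records, k):
    """Return True if every record occurs at least k times, itself included."""
    if k < 2:
        raise ValueError("k must be an integer of 2 or more")
    counts = Counter(tuple(record) for record in records)
    return all(count >= k for count in counts.values())


# Hypothetical anonymized records: (postal code, year, sex).
records = [
    ("198", "50", "male"),
    ("198", "50", "male"),
    ("198", "20", "*"),
    ("198", "20", "*"),
]

print(is_k_anonymous(records, 2))  # True
print(is_k_anonymous(records, 3))  # False
```

Each distinct record appears exactly twice in the sample, so the data satisfies the condition for k = 2 but not for k = 3.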

The anonymized data DB 31 of the present embodiment may instead have, for example, Pk-anonymity. Pk-anonymity means a state in which the probability that an individual can be identified from an individual record is less than 1/k.

The anonymized data DB 31 has a postal code field, a birth year field, and a sex field. The postal code field records the postal code. Note that FIG. 3 shows an example in which only the first three digits of the postal code are recorded. The birth year field records the age. The sex field records the sex. The "*" recorded in the postal code field, age field, and sex field is described later.

The anonymized data DB 31 has one record for each anonymized individual record. In addition to the fields above, the anonymized data DB 31 may have fields that record anonymized information such as address, name, telephone number, email address, family structure, purchased products, and payment method.

Anonymization is performed by an information processing device separate from the information processing system 10 of the present embodiment. The anonymized data DB 31 in FIG. 3 shows an example in which the original data shown in FIG. 2 has been anonymized. The data obtained by anonymizing the original data record in the first row of the original data is the record in the first row of the anonymized data DB 31.

The relationship between the original data and the anonymized data is as follows. For the original data record in the first row of FIG. 2, all three items (postal code, age, and sex) are recorded in the first record of the anonymized data DB 31 in FIG. 3. When the original data contains multiple original data records with identical content and no individual can be identified from them, the anonymized data DB 31 may contain an anonymized individual record identical to the original data record, as in this case.

The original data record in the second row of FIG. 2 is recorded in the second record of the anonymized data DB 31 in FIG. 3 with the sex replaced by "*". Here "*" means that the sex has been hidden. Hiding the sex converts multiple original data records that differ only in sex into the same anonymized individual record. Processing the original data records in this way converts the original data into anonymized data that has anonymity meeting a predetermined condition.

In the following description, replacing the data of an item with "*" is referred to as generalization. Generalization is one method of anonymizing original data records. For the third row of the original data, an anonymized individual record in which the year of birth is replaced with "*" is recorded in the anonymized data DB 31. Here "*" means that the age has been generalized.
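Generalization as described above amounts to overwriting selected items with the generalization symbol. The sketch below is illustrative only: the function name `generalize`, the use of item positions, and the sample records are assumptions, not the patent's implementation.

```python
def generalize(record, positions, code="*"):
    """Return a copy of record with the items at the given positions replaced by code."""
    return tuple(code if i in positions else value for i, value in enumerate(record))


# Hypothetical original data records: (postal code, year, sex).
a = ("198", "20", "male")
b = ("198", "20", "female")

print(generalize(a, {2}))                        # ('198', '20', '*')
print(generalize(a, {2}) == generalize(b, {2}))  # True
```

The two sample records differ only in sex, so hiding the sex maps both to the same anonymized record.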

Any value or symbol that is not used in the data, such as "-", "/", or "999", can be used in place of "*". In the following description, the value or symbol used for generalization is referred to as the generalization code. Different generalization codes may be used for different items.

The method of anonymizing the original data is not limited to generalization. For example, the original data can be anonymized by replacing an address with the prefecture name, replacing a name with initials, swapping the data of some items with the data of the same items in other individual records, or deleting some items. These methods may also be combined with generalization. The information processing device 10 of the present embodiment can use an anonymized data DB 31 anonymized by any method.

The anonymized data DB 31 has from several tens to several million records. That is, from several tens to several million anonymized individual records are recorded in the anonymized data DB 31.

FIGS. 4 to 6 are explanatory diagrams showing the process of creating the anonymization automaton 32. FIG. 7 is an explanatory diagram showing the anonymization automaton 32. The anonymization automaton 32 is a virtual machine that accepts an input individual record, processes it according to predetermined rules, and converts it into an anonymized individual record. In other words, the anonymization automaton 32 is a virtual machine that expresses the rules for anonymizing input individual records. The anonymization automaton 32 is a finite automaton created from the anonymized individual records recorded in the anonymized data DB 31. The procedure by which the anonymization automaton 32 processes an input individual record is described later.

FIG. 4 shows the hierarchical tree, which is the first stage of creating the anonymization automaton 32. The server CPU 12 takes as input the anonymized individual records recorded in the anonymized data DB 31 described with reference to FIG. 3 and creates the hierarchical tree shown in FIG. 4. The hierarchical tree has a parent node 60, intermediate nodes 61, and accepting nodes 63. The parent node 60 and the intermediate nodes 61 are drawn as circles. The accepting nodes 63 are drawn as double ellipses. In the following description, the parent node 60, the intermediate nodes 61, and the accepting nodes 63 may be collectively referred to as nodes.

The parent node 60, the intermediate nodes 61, and the accepting nodes 63 are connected by transition branches 65, drawn as solid lines. Exactly one transition branch 65 enters each intermediate node 61 and each accepting node 63.

One or more transition branches 65 exit each intermediate node 61. The condition for selecting a transition branch 65 is shown in the label near its arrow. The accepting nodes 63 correspond to the anonymized individual records recorded in the anonymized data DB 31. No transition branch 65 exits an accepting node 63.

The hierarchical tree has a four-level structure, from the parent node 60 at the first level to the accepting nodes 63 at the fourth level. The number of levels equals the number of fields in the anonymized data DB 31 plus one. The label of a transition branch 65 that connects an intermediate node 61 at one level to an intermediate node 61 or accepting node 63 at the next level is data contained in one field of the anonymized data DB 31.

The number following "n" inside the circle or double ellipse of the parent node 60, an intermediate node 61, or an accepting node 63 indicates the order in which the server CPU 12 generates the nodes when the records of the anonymized data DB 31 described with reference to FIG. 3 are processed in order from the top.

In the following description, when a particular intermediate node 61 needs to be identified, it is written using the symbol shown in its circle, for example intermediate node 61n2. Similarly, when a particular accepting node 63 needs to be identified, it is written using the symbol shown in its double ellipse, for example accepting node 63n4.

In the following figures and description, accepting nodes 63 are shown with the item names omitted. For example, "*, 20, female" written in the accepting node 63n16 in FIG. 4 means that the postal code is "*", the age is "20", and the sex is "female".

The method by which the server CPU 12 creates the hierarchical tree shown in FIG. 4 is as follows. The server CPU 12 creates the parent node 60. The server CPU 12 assigns the number 1 to the parent node 60, shown as "n1" in the figure.

The server CPU 12 acquires the first record, "198, 50, male", from the anonymized data DB 31. The server CPU 12 adds a transition branch 65 exiting the parent node 60 and an intermediate node 61n2. The server CPU 12 gives the added transition branch 65 the label "postal code: 198", corresponding to the data recorded in the first field of the first record. The added transition branch 65 means that when an input individual record whose postal code is "198" enters the parent node 60, the processing performed by the server CPU 12 transitions to the intermediate node 61n2.

The server CPU 12 adds a transition branch 65 exiting the intermediate node 61n2 and an intermediate node 61n3. The server CPU 12 gives the added transition branch 65 the label "year: 50", corresponding to the data recorded in the second field of the first record.

Because the next field, the third, is the last field of the first record, the server CPU 12 adds a transition branch 65 exiting the intermediate node 61n3 and an accepting node 63n4. The server CPU 12 gives the added transition branch 65 the label "sex: male", corresponding to the data recorded in the third field of the first record. The server CPU 12 associates the anonymized individual record "198, 50, male" recorded in the first record with the accepting node 63n4.

The server CPU 12 acquires the second record, "198, 20, *", from the anonymized data DB 31. The server CPU 12 determines whether a transition branch 65 corresponding to the data recorded in the first field of the second record exits the parent node 60 of the hierarchical tree being created. A transition branch 65 whose postal code is "198" already exists. The server CPU 12 therefore makes the intermediate node 61n2 at the end of that transition branch 65 the next processing target.

The server CPU 12 determines whether a transition branch 65 corresponding to the data recorded in the second field of the second record exits the intermediate node 61n2 of the hierarchical tree being created. No transition branch 65 whose birth year is "20" exists. The server CPU 12 therefore adds a transition branch 65 exiting the intermediate node 61n2 and an intermediate node 61n5. The server CPU 12 gives the added transition branch 65 the label "year: 20", corresponding to the data recorded in the second field of the second record. The server CPU 12 makes the created intermediate node 61n5 the next processing target.

The server CPU 12 determines whether a transition branch 65 corresponding to the data recorded in the third field of the second record exits the intermediate node 61n5 of the hierarchical tree being created. No transition branch 65 exits the intermediate node 61n5 yet. Because the third field is the last field of the second record, the server CPU 12 adds a transition branch 65 exiting the intermediate node 61n5 and an accepting node 63n6. The server CPU 12 gives the added transition branch 65 the label "sex: *", corresponding to the data recorded in the third field of the second record. The server CPU 12 associates the anonymized individual record "198, 20, *" recorded in the second record with the accepting node 63n6.

The server CPU 12 continues in the same way: it acquires a record from the anonymized data DB 31, checks in order from the parent node 60 whether the corresponding transition branch 65 exists, and, where it does not, adds a transition branch 65 and a node to the hierarchical tree. When the server CPU 12 has processed all the records contained in the anonymized data DB 31, the hierarchical tree shown in FIG. 4 is complete.
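The tree construction walked through above behaves like inserting records into a trie keyed by (item, value) labels. The following is a minimal sketch under that reading; the `Node` class, the `build_tree` function, and the English item names are assumptions introduced for illustration.

```python
class Node:
    def __init__(self, number):
        self.number = number   # creation order, i.e. the number after "n" in the figure
        self.edges = {}        # transition branches: (item, value) label -> child Node
        self.record = None     # anonymized record, set only on accepting nodes


def build_tree(records, items):
    """Build the hierarchical tree by processing the records from the top."""
    counter = 1
    root = Node(counter)       # parent node, numbered 1
    for record in records:
        node = root
        for item, value in zip(items, record):
            label = (item, value)
            if label not in node.edges:      # no matching branch yet: add branch and node
                counter += 1
                node.edges[label] = Node(counter)
            node = node.edges[label]         # follow the existing or newly added branch
        node.record = tuple(record)          # the last node reached is the accepting node
    return root


items = ("postal code", "year", "sex")
records = [("198", "50", "male"), ("198", "20", "*")]
root = build_tree(records, items)

n2 = root.edges[("postal code", "198")]
n5 = n2.edges[("year", "20")]
n6 = n5.edges[("sex", "*")]
print(n2.number, n5.number, n6.number)  # 2 5 6
print(n6.record)                        # ('198', '20', '*')
```

With only the first two records from the walkthrough, the node numbers come out as n2, n5, and n6, matching the order described in the text.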

FIG. 5 shows the state after the second stage of creating the anonymization automaton 32. In the second stage, the server CPU 12 deletes unnecessary nodes from the hierarchical tree described with reference to FIG. 4. Specifically, the intermediate node 61n5 and the intermediate node 61n9 have been deleted from the hierarchical tree of FIG. 4.

The deletion of intermediate nodes 61 works as follows. An intermediate node 61 from which only a transition branch 65 with the generalization code "*" exits does not contribute to the processing performed by the anonymization automaton 32. The server CPU 12 deletes such intermediate nodes 61 from the hierarchical tree. The server CPU 12 reconnects the transition branch 65 that entered the deleted intermediate node 61 to the intermediate node 61 or accepting node 63 at the end of the transition branch 65 that exited the deleted intermediate node 61.

さらに具体的に説明する。サーバＣＰＵ１２は、各中間ノード６１から出力している遷移枝６５のラベルが一般化符号「＊」のみであるか否かを順次判定する。たとえば中間ノード６１ｎ５からは、「性：＊」の遷移枝６５のみが出力している。サーバＣＰＵ１２は、中間ノード６１ｎ５および「性：＊」の遷移枝６５を消去する。サーバＣＰＵ１２は、中間ノード６１ｎ５に入力していた「年：２０」のラベルが付与された遷移枝６５を、「性：＊」の遷移枝６５と接続していた受容ノード６３ｎ６に接続する。 A more specific description will be given. The server CPU 12 checks each intermediate node 61 in turn to determine whether the only label on its outgoing transition branches 65 is the generalized code “*”. For example, only the transition branch 65 labeled “sex: *” leaves the intermediate node 61n5. The server CPU 12 deletes the intermediate node 61n5 and the “sex: *” transition branch 65. The server CPU 12 connects the transition branch 65 labeled “age: 20”, which entered the intermediate node 61n5, to the accepting node 63n6 to which the “sex: *” transition branch 65 was connected.
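
The node deletion of the second stage can be sketched as a bottom-up pass that splices out every intermediate node whose single outgoing branch is labeled “*”. This is an illustrative sketch under assumptions: the `Node` class and the `prune` function are hypothetical names, branch labels are modeled as (field name, value) pairs, and the sketch handles intermediate nodes only, not a change of the parent node 60.

```python
from dataclasses import dataclass, field
from typing import Optional

WILDCARD = "*"  # the generalized code


@dataclass
class Node:
    name: str
    edges: dict = field(default_factory=dict)  # (field name, value) -> child Node
    record: Optional[tuple] = None             # set only on accepting nodes


def prune(node):
    """Remove intermediate children whose only outgoing branch is labeled '*'."""
    for label, child in list(node.edges.items()):
        prune(child)  # handle the subtree first, so chains collapse bottom-up
        if child.record is None and len(child.edges) == 1:
            (_, value), grandchild = next(iter(child.edges.items()))
            if value == WILDCARD:
                # Reconnect the incoming branch to the node behind the '*' branch.
                node.edges[label] = grandchild


# The example from the text: 61n2 --"age: 20"--> 61n5 --"sex: *"--> 63n6
n6 = Node("63n6", record=("198", "20", "*"))
n5 = Node("61n5", {("sex", "*"): n6})
n2 = Node("61n2", {("age", "20"): n5})
root = Node("60", {("zip", "198"): n2})
prune(root)
```

After `prune(root)`, the “age: 20” branch of 61n2 leads directly to the accepting node 63n6, and 61n5 is no longer reachable.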

図６は、匿名化オートマトン３２を作成する第３の段階を説明する説明図である。第３の段階では、サーバＣＰＵ１２はバイパス枝６７（図７参照）を作成する。バイパス枝６７は、一部の中間ノード６１から出力し、別の中間ノード６１または受容ノード６３に入力する。 FIG. 6 is an explanatory diagram illustrating a third stage of creating the anonymized automaton 32. In the third stage, the server CPU 12 creates the bypass branch 67 (see FIG. 7). The bypass branch 67 outputs from one of the intermediate nodes 61 and inputs to another intermediate node 61 or the receiving node 63.

サーバＣＰＵ１２は、図５を使用して説明した作成途中の匿名化オートマトン３２からノード列６９を作成する。サーバＣＰＵ１２は、ノード列６９に１から始まる連番で番号を付与する。また、サーバＣＰＵ１２は、各ノード列６９内のノードに、親ノード６０側を１として連番で番号を付与する。 The server CPU 12 creates the node sequence 69 from the anonymized automaton 32 in the process of creation described with reference to FIG. The server CPU 12 assigns a serial number starting from 1 to the node sequence 69. In addition, the server CPU 12 assigns serial numbers to the nodes in each node row 69, with the parent node 60 side being 1.

１つのノード列６９は、親ノード６０と、一つの受容ノード６３と、両者の間の中間ノード６１とを含む。サーバＣＰＵ１２は、受容ノード６３と同じ数のノード列６９を作成する。サーバＣＰＵ１２は、一のノード列６９に含まれる中間ノード６１から出力し、他のノード列６９に含まれる中間ノード６１に入力するバイパス枝６７を作成する。 One node sequence 69 includes a parent node 60, one accepting node 63, and an intermediate node 61 between them. The server CPU 12 creates the same number of node arrays 69 as the reception nodes 63. The server CPU 12 creates a bypass branch 67 that outputs from the intermediate node 61 included in one node string 69 and inputs to the intermediate node 61 included in another node string 69.

以後の説明では、サーバＣＰＵ１２が、左端のノード列６９から連番で番号を付与した場合を例にして説明する。 In the following description, the case where the server CPU 12 assigns serial numbers from the leftmost node column 69 will be described as an example.

サーバＣＰＵ１２は、第１ノード列６９の第１ノードである親ノード６０から処理を開始する。サーバＣＰＵ１２は、親ノード６０が受容ノード６３であるか否かを判定する。サーバＣＰＵ１２は、親ノード６０は受容ノード６３では無いと判定する。サーバＣＰＵ１２は、親ノード６０から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。親ノード６０から、一般化符号「＊」のラベルを付与された遷移枝６５が出力しているので、サーバＣＰＵ１２は第１ノード列６９の第２ノードである、中間ノード６１ｎ２を次の処理対象とする。 The server CPU 12 starts processing from the parent node 60, which is the first node of the first node sequence 69. The server CPU 12 determines whether the parent node 60 is an accepting node 63, and determines that it is not. The server CPU 12 then determines whether a transition branch 65 or bypass branch 67 labeled with the generalized code “*” leaves the parent node 60. Because a transition branch 65 labeled with the generalized code “*” leaves the parent node 60, the server CPU 12 sets the intermediate node 61n2, which is the second node of the first node sequence 69, as the next processing target.

サーバＣＰＵ１２は、中間ノード６１ｎ２が受容ノード６３であるか否かを判定する。サーバＣＰＵ１２は、中間ノード６１ｎ２は受容ノード６３では無いと判定する。サーバＣＰＵ１２は、中間ノード６１ｎ２から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。中間ノード６１ｎ２から、一般化符号「＊」のラベルを付与された遷移枝６５が出力しているので、サーバＣＰＵ１２は第１ノード列６９の第３ノードである、中間ノード６１ｎ３を次の処理対象とする。 The server CPU 12 determines whether the intermediate node 61n2 is an accepting node 63, and determines that it is not. The server CPU 12 then determines whether a transition branch 65 or bypass branch 67 labeled with the generalized code “*” leaves the intermediate node 61n2. Because a transition branch 65 labeled with the generalized code “*” leaves the intermediate node 61n2, the server CPU 12 sets the intermediate node 61n3, which is the third node of the first node sequence 69, as the next processing target.

サーバＣＰＵ１２は、中間ノード６１ｎ３が受容ノード６３であるか否かを判定する。サーバＣＰＵ１２は、中間ノード６１ｎ３は受容ノード６３では無いと判定する。サーバＣＰＵ１２は、中間ノード６１ｎ３から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。中間ノード６１ｎ３から、一般化符号「＊」のラベルを付与された遷移枝６５は出力していないので、サーバＣＰＵ１２は中間ノード６１ｎ３から出力するバイパス枝６７の作成可否を判定する処理を開始する。 The server CPU 12 determines whether the intermediate node 61n3 is the receiving node 63. The server CPU 12 determines that the intermediate node 61n3 is not the receiving node 63. The server CPU 12 determines whether the transition branch 65 or the bypass branch 67 labeled with the generalized code “*” is output from the intermediate node 61n3. Since the transition branch 65 labeled with the generalized code “*” is not output from the intermediate node 61n3, the server CPU 12 starts the process of determining whether or not to create the bypass branch 67 output from the intermediate node 61n3.

中間ノード６１ｎ３から出力するバイパス枝６７の作成可否を判定する処理について説明する。処理中の中間ノード６１ｎ３は親ノード６０では無いので、サーバＣＰＵ１２は第１ノード列６９内の一つ前のノードである中間ノード６１ｎ２を判定対象とする。 A process of determining whether or not to create the bypass branch 67 output from the intermediate node 61n3 will be described. Since the intermediate node 61n3 being processed is not the parent node 60, the server CPU 12 sets the intermediate node 61n2, which is the immediately preceding node in the first node sequence 69, as the determination target.

サーバＣＰＵ１２は、中間ノード６１ｎ２から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。中間ノード６１ｎ２から、一般化符号「＊」のラベルを付与された遷移枝６５が出力しているので、サーバＣＰＵ１２は、その遷移枝６５が入力する中間ノード６１ｎ７が、第１ノード列６９に含まれているか否かを判定する。中間ノード６１ｎ７は、第１ノード列６９に含まれていないので、サーバＣＰＵ１２は、中間ノード６１ｎ３から出力して、中間ノード６１ｎ７に向かうバイパス枝６７を作成する。以上によりサーバＣＰＵ１２は、中間ノード６１ｎ３の処理を終了する。サーバＣＰＵ１２は第１ノード列６９の第４ノードである、受容ノード６３ｎ４を次の処理対象とする。 The server CPU 12 determines whether a transition branch 65 or bypass branch 67 labeled with the generalized code “*” leaves the intermediate node 61n2. Because a transition branch 65 labeled with the generalized code “*” leaves the intermediate node 61n2, the server CPU 12 determines whether the intermediate node 61n7, which that transition branch 65 enters, is included in the first node sequence 69. Because the intermediate node 61n7 is not included in the first node sequence 69, the server CPU 12 creates a bypass branch 67 that leaves the intermediate node 61n3 and leads to the intermediate node 61n7. This completes the processing of the intermediate node 61n3. The server CPU 12 sets the accepting node 63n4, which is the fourth node of the first node sequence 69, as the next processing target.

受容ノード６３ｎ４は、第１ノード列６９の最後のノードであるので、サーバＣＰＵ１２は、第２ノード列６９の処理に移動する。サーバＣＰＵ１２は、第２ノード列６９の第１ノードである親ノード６０から処理を開始する。親ノード６０および中間ノード６１ｎ２の処理は、第１ノード列６９で説明した処理と同一であるので、説明を省略する。なお、サーバＣＰＵ１２は、一度処理したノードを記憶しておき、２回目以降の処理を省略しても良い。サーバＣＰＵ１２は第２ノード列６９の第３ノードである、受容ノード６３ｎ６を次の処理対象とする。 Because the accepting node 63n4 is the last node of the first node sequence 69, the server CPU 12 moves on to the second node sequence 69. The server CPU 12 starts processing from the parent node 60, which is the first node of the second node sequence 69. The processing of the parent node 60 and the intermediate node 61n2 is the same as that described for the first node sequence 69, so its description is omitted. Note that the server CPU 12 may remember the nodes it has already processed and skip them on the second and subsequent visits. The server CPU 12 sets the accepting node 63n6, which is the third node of the second node sequence 69, as the next processing target.

受容ノード６３ｎ６は、第２ノード列６９の最後のノードであるので、サーバＣＰＵ１２は、第３ノード列６９の処理に移動する。サーバＣＰＵ１２は、第３ノード列６９の第１ノードである親ノード６０から処理を開始する。親ノード６０および中間ノード６１ｎ２の処理は、第１ノード列６９で説明した処理と同一であるので、説明を省略する。サーバＣＰＵ１２は第３ノード列６９の第３ノードである、中間ノード６１ｎ７を次の処理対象とする。 Because the accepting node 63n6 is the last node of the second node sequence 69, the server CPU 12 moves on to the third node sequence 69. The server CPU 12 starts processing from the parent node 60, which is the first node of the third node sequence 69. The processing of the parent node 60 and the intermediate node 61n2 is the same as that described for the first node sequence 69, so its description is omitted. The server CPU 12 sets the intermediate node 61n7, which is the third node of the third node sequence 69, as the next processing target.

サーバＣＰＵ１２は、中間ノード６１ｎ７が受容ノード６３であるか否かを判定する。サーバＣＰＵ１２は、中間ノード６１ｎ７は受容ノード６３では無いと判定する。サーバＣＰＵ１２は、中間ノード６１ｎ７から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。中間ノード６１ｎ７から、一般化符号「＊」のラベルを付与された遷移枝６５が出力していないので、サーバＣＰＵ１２は中間ノード６１ｎ７から出力するバイパス枝６７の作成可否を判定する処理を開始する。 The server CPU 12 determines whether the intermediate node 61n7 is the receiving node 63. The server CPU 12 determines that the intermediate node 61n7 is not the receiving node 63. The server CPU 12 determines whether the transition branch 65 or the bypass branch 67 labeled with the generalized code “*” is output from the intermediate node 61n7. Since the transition branch 65 labeled with the generalized code “*” is not output from the intermediate node 61n7, the server CPU 12 starts the process of determining whether or not the bypass branch 67 output from the intermediate node 61n7 can be created.

中間ノード６１ｎ７から出力するバイパス枝６７の作成可否を判定する処理について説明する。処理中の中間ノード６１ｎ７は親ノード６０では無いので、サーバＣＰＵ１２は第３ノード列６９内の一つ前のノードである中間ノード６１ｎ２を判定対象とする。 A process of determining whether or not the bypass branch 67 output from the intermediate node 61n7 can be created will be described. Since the intermediate node 61n7 being processed is not the parent node 60, the server CPU 12 sets the intermediate node 61n2, which is the immediately preceding node in the third node sequence 69, as the determination target.

サーバＣＰＵ１２は、中間ノード６１ｎ２から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。中間ノード６１ｎ２から、一般化符号「＊」のラベルを付与された遷移枝６５が出力しているので、サーバＣＰＵ１２は、その遷移枝６５が入力する中間ノード６１ｎ７が、第３ノード列６９に含まれているか否かを判定する。中間ノード６１ｎ７は、第３ノード列６９に含まれているので、サーバＣＰＵ１２は、第３ノード列６９内のもう一つ前のノードである親ノード６０を判定対象とする。 The server CPU 12 determines whether a transition branch 65 or bypass branch 67 labeled with the generalized code “*” leaves the intermediate node 61n2. Because a transition branch 65 labeled with the generalized code “*” leaves the intermediate node 61n2, the server CPU 12 determines whether the intermediate node 61n7, which that transition branch 65 enters, is included in the third node sequence 69. Because the intermediate node 61n7 is included in the third node sequence 69, the server CPU 12 moves one node further back in the third node sequence 69 and sets the parent node 60 as the determination target.

サーバＣＰＵ１２は、親ノード６０から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。親ノード６０から、一般化符号「＊」のラベルを付与された遷移枝６５が出力しているので、サーバＣＰＵ１２は、その遷移枝６５が入力する中間ノード６１ｎ１２が、第３ノード列６９に含まれているか否かを判定する。中間ノード６１ｎ１２は、第３ノード列６９に含まれていないので、サーバＣＰＵ１２は、中間ノード６１ｎ７から出力して、中間ノード６１ｎ１２に向かうバイパス枝６７を作成する。以上によりサーバＣＰＵ１２は、中間ノード６１ｎ７の処理を終了する。サーバＣＰＵ１２は第３ノード列６９の第４ノードである、受容ノード６３ｎ８を次の処理対象とする。 The server CPU 12 determines whether a transition branch 65 or bypass branch 67 labeled with the generalized code “*” leaves the parent node 60. Because a transition branch 65 labeled with the generalized code “*” leaves the parent node 60, the server CPU 12 determines whether the intermediate node 61n12, which that transition branch 65 enters, is included in the third node sequence 69. Because the intermediate node 61n12 is not included in the third node sequence 69, the server CPU 12 creates a bypass branch 67 that leaves the intermediate node 61n7 and leads to the intermediate node 61n12. This completes the processing of the intermediate node 61n7. The server CPU 12 sets the accepting node 63n8, which is the fourth node of the third node sequence 69, as the next processing target.

バイパス枝６７の作成を断念する例について、第５ノード列６９を例にして説明する。サーバＣＰＵ１２は、第５ノード列６９の第１ノードである親ノード６０から処理を開始する。親ノード６０の処理は、第１ノード列６９で説明した処理と同一であるので、説明を省略する。サーバＣＰＵ１２は第５ノード列６９内の第２ノードである、中間ノード６１ｎ１２を次の処理対象とする。 An example of giving up the creation of the bypass branch 67 will be described by taking the fifth node sequence 69 as an example. The server CPU 12 starts the process from the parent node 60 which is the first node of the fifth node array 69. Since the processing of the parent node 60 is the same as the processing described in the first node sequence 69, the description thereof will be omitted. The server CPU 12 sets the intermediate node 61n12, which is the second node in the fifth node array 69, as the next processing target.

サーバＣＰＵ１２は、中間ノード６１ｎ１２が受容ノード６３であるか否かを判定する。サーバＣＰＵ１２は、中間ノード６１ｎ１２は受容ノード６３では無いと判定する。サーバＣＰＵ１２は、中間ノード６１ｎ１２から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。中間ノード６１ｎ１２から、一般化符号「＊」のラベルを付与された遷移枝６５が出力していないので、サーバＣＰＵ１２は中間ノード６１ｎ１２から出力するバイパス枝６７の作成可否を判定する処理を開始する。 The server CPU 12 determines whether the intermediate node 61n12 is the receiving node 63. The server CPU 12 determines that the intermediate node 61n12 is not the receiving node 63. The server CPU 12 determines whether the transition branch 65 or the bypass branch 67 labeled with the generalized code “*” is output from the intermediate node 61n12. Since the transition branch 65 labeled with the generalized code “*” is not output from the intermediate node 61n12, the server CPU 12 starts the process of determining whether or not the bypass branch 67 output from the intermediate node 61n12 can be created.

中間ノード６１ｎ１２から出力するバイパス枝６７の作成可否を判定する処理について説明する。処理中の中間ノード６１ｎ１２は親ノード６０では無いので、サーバＣＰＵ１２は第５ノード列６９内の一つ前のノードである親ノード６０を判定対象とする。 A process of determining whether or not to create the bypass branch 67 output from the intermediate node 61n12 will be described. Since the intermediate node 61n12 being processed is not the parent node 60, the server CPU 12 sets the parent node 60, which is the immediately preceding node in the fifth node sequence 69, as the determination target.

サーバＣＰＵ１２は、親ノード６０から一般化符号「＊」のラベルを付与された遷移枝６５またはバイパス枝６７が出力しているか否かを判定する。親ノード６０から、一般化符号「＊」のラベルを付与された遷移枝６５が出力しているので、サーバＣＰＵ１２は、その遷移枝６５が入力する中間ノード６１ｎ１２が、第５ノード列６９に含まれているか否かを判定する。中間ノード６１ｎ１２は、第５ノード列６９に含まれているので、サーバＣＰＵ１２は、第５ノード列６９から中間ノード６１ｎ１２に向かうバイパス枝６７を作成できないと判定する。 The server CPU 12 determines whether a transition branch 65 or bypass branch 67 labeled with the generalized code “*” leaves the parent node 60. Because a transition branch 65 labeled with the generalized code “*” leaves the parent node 60, the server CPU 12 determines whether the intermediate node 61n12, which that transition branch 65 enters, is included in the fifth node sequence 69. Because the intermediate node 61n12 is included in the fifth node sequence 69, the server CPU 12 determines that a bypass branch 67 from the fifth node sequence 69 to the intermediate node 61n12 cannot be created.

判定対象は親ノード６０であるので、サーバＣＰＵ１２はバイパス枝６７を作成せずに中間ノード６１ｎ１２の処理を終了する。 Since the determination target is the parent node 60, the server CPU 12 ends the processing of the intermediate node 61n12 without creating the bypass branch 67.

以上の処理を繰り返すことにより、サーバＣＰＵ１２は一のノード列６９から他のノード列６９に遷移するバイパス枝６７を作成する。すべてのノードの処理が完了することにより、匿名化オートマトン３２が完成する。 By repeating the above processing, the server CPU 12 creates the bypass branch 67 that transits from one node sequence 69 to another node sequence 69. The anonymized automaton 32 is completed by completing the processing of all the nodes.
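
The bypass creation walked through above can be sketched as follows. This is an illustrative reading of the procedure, not a definitive implementation. The names `Node`, `wildcard_target`, `node_sequences`, and `add_bypasses` are hypothetical, and the sample tree (including nodes 61n13 and 63n14 and the record “198, 10, male”) is assumed for illustration. One further assumption: when a preceding node has no “*” branch at all, the sketch simply moves one node further back, a case the walkthrough does not cover.

```python
from dataclasses import dataclass, field
from typing import Optional

WILDCARD = "*"  # the generalized code


@dataclass(eq=False)  # identity comparison, so "in" tests for the very same node
class Node:
    name: str
    edges: dict = field(default_factory=dict)  # (field name, value) -> child Node
    bypass: Optional["Node"] = None
    record: Optional[tuple] = None             # set only on accepting nodes

    def wildcard_target(self):
        """Node reached by a '*' transition branch or by the bypass branch, if any."""
        for (_, value), child in self.edges.items():
            if value == WILDCARD:
                return child
        return self.bypass


def node_sequences(node, prefix=()):
    """Yield one parent-to-accepting-node path per accepting node, left to right."""
    path = prefix + (node,)
    if node.record is not None:
        yield path
    for child in node.edges.values():
        yield from node_sequences(child, path)


def add_bypasses(root):
    for seq in node_sequences(root):
        for i, node in enumerate(seq):
            if node.record is not None:
                break                          # accepting node: sequence finished
            if node.wildcard_target() is not None:
                continue                       # a '*' branch already leaves this node
            for prev in reversed(seq[:i]):     # walk back toward the parent node
                target = prev.wildcard_target()
                if target is not None and target not in seq:
                    node.bypass = target
                    break                      # otherwise: give up at the parent node


# Hypothetical tree after the second stage.
n4 = Node("63n4", record=("198", "10", "male"))
n6 = Node("63n6", record=("198", "20", "*"))
n8 = Node("63n8", record=("198", "*", "male"))
n14 = Node("63n14", record=("*", "20", "female"))
n3 = Node("61n3", {("sex", "male"): n4})
n7 = Node("61n7", {("sex", "male"): n8})
n2 = Node("61n2", {("age", "10"): n3, ("age", "20"): n6, ("age", "*"): n7})
n13 = Node("61n13", {("sex", "female"): n14})
n12 = Node("61n12", {("age", "20"): n13})
root = Node("60", {("zip", "198"): n2, ("zip", "*"): n12})
add_bypasses(root)
```

On this sample tree the sketch reproduces the three outcomes described in the text: 61n3 receives a bypass branch to 61n7, 61n7 receives one to 61n12, and no bypass branch is created for 61n12.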

図７は、完成した匿名化オートマトン３２を示す。匿名化オートマトン３２は、親ノード６０、中間ノード６１および受容ノード６３を有する。親ノード６０、中間ノード６１および受容ノード６３は実線で示す遷移枝６５および破線で示すバイパス枝６７で接続されている。中間ノード６１は、入力個票を処理する途中過程を示す。遷移枝６５およびバイパス枝６７は、入力個票を処理する経路を示す。遷移枝６５およびバイパス枝６７は、本実施の形態の判定枝の例である。 FIG. 7 shows the completed anonymized automaton 32. The anonymized automaton 32 has a parent node 60, an intermediate node 61, and a reception node 63. The parent node 60, the intermediate node 61 and the accepting node 63 are connected by a transition branch 65 shown by a solid line and a bypass branch 67 shown by a broken line. The intermediate node 61 indicates an intermediate process of processing the input individual number. The transition branch 65 and the bypass branch 67 indicate paths for processing the input individual number. The transition branch 65 and the bypass branch 67 are examples of the determination branch of the present embodiment.

親ノード６０からは１本以上の遷移枝６５が出力する。中間ノード６１には、１本の遷移枝６５が入力する。中間ノード６１からは１本以上の遷移枝６５が出力する。受容ノード６３には、１本の遷移枝６５が入力する。受容ノード６３から出力する遷移枝６５は存在しない。バイパス枝６７は、一部の中間ノード６１から出力し、別の中間ノード６１または受容ノード６３に入力する。 One or more transition branches 65 are output from the parent node 60. One transition branch 65 is input to the intermediate node 61. One or more transition branches 65 are output from the intermediate node 61. One transition branch 65 is input to the reception node 63. There is no transition branch 65 output from the acceptance node 63. The bypass branch 67 outputs from one of the intermediate nodes 61 and inputs to another intermediate node 61 or the receiving node 63.

匿名化オートマトン３２を使用して、サーバＣＰＵ１２が入力個票を処理する方法の概略を説明する。外部から親ノード６０に入力個票が入力される。入力個票は、遷移枝６５、バイパス枝６７および中間ノード６１を経由して、受容ノード６３に到達する。遷移枝６５およびバイパス枝６７を選択する条件は、各々の矢印の近傍のラベルに表示されている。到達した受容ノード６３は、入力個票を匿名化した結果を示す。受容ノード６３は、匿名化済データＤＢ３１に記録されている匿名化済個票のいずれか一つと対応している。 An outline of how the server CPU 12 processes an input record using the anonymization automaton 32 will be described. An input record is supplied to the parent node 60 from outside. The input record passes through transition branches 65, bypass branches 67, and intermediate nodes 61 and reaches an accepting node 63. The condition for selecting each transition branch 65 or bypass branch 67 is shown in the label next to the corresponding arrow. The accepting node 63 that is reached indicates the result of anonymizing the input record. Each accepting node 63 corresponds to one of the anonymized individual records recorded in the anonymized data DB 31.

図８は、ストリーミングデータの例を示す説明図である。ストリーミングデータは、第２クライアント２５が逐次生成してネットワークに送出する入力個票である。図８の１行が、１個の入力個票を示す。サーバＣＰＵ１２は、通信部１５を介して入力個票を逐次受信する。サーバＣＰＵ１２が入力個票を受信するタイミングおよび受信する入力個票の数は定まっていない。 FIG. 8 is an explanatory diagram showing an example of streaming data. The streaming data consists of input records that the second client 25 generates one after another and sends to the network. Each row of FIG. 8 represents one input record. The server CPU 12 receives the input records one after another via the communication unit 15. Neither the timing at which the server CPU 12 receives input records nor the number of input records it receives is fixed in advance.

図９は、匿名化処理後のストリーミングデータの例を示す説明図である。図９は、図８のデータを、図７を使用して説明した匿名化オートマトン３２を使用して匿名化処理した例を示す。サーバ１１が匿名化オートマトン３２を使用して行う具体的な処理の例を説明する。 FIG. 9 is an explanatory diagram showing an example of streaming data after anonymization processing. FIG. 9 shows an example in which the data of FIG. 8 is anonymized by using the anonymization automaton 32 described using FIG. 7. An example of a specific process performed by the server 11 using the anonymized automaton 32 will be described.

サーバＣＰＵ１２が、図８の１行目に示す「１９８，２０，女」の入力個票を第２クライアント２５から取得した場合を例にして説明する。サーバＣＰＵ１２は、入力個票を親ノード６０に入力する。親ノード６０からは、郵便番号によって分かれる遷移枝６５が３本出ている。 As an example, consider the case where the server CPU 12 acquires from the second client 25 the input record “198, 20, female” shown in the first row of FIG. 8. The server CPU 12 supplies the input record to the parent node 60. Three transition branches 65, distinguished by postal code, leave the parent node 60.

入力個票の郵便番号は「１９８」であるので、サーバＣＰＵ１２は、郵便番号が１９８の遷移枝６５に沿って遷移する中間ノード６１ｎＢおよび郵便番号を特定しない一般化符号「＊」の遷移枝６５に沿って遷移する中間ノード６１ｎＦに遷移可能であると判定する。中間ノード６１ｎＢに遷移可能なデータの範囲の方が、中間ノード６１ｎＦに遷移可能なデータの範囲よりも狭いので、サーバＣＰＵ１２は、中間ノード６１ｎＢに向かう遷移枝６５を選択して、中間ノード６１ｎＢに処理を移す。なお、以後の説明では、一般化符号「＊」の遷移枝６５と、それ以外の遷移枝６５とを選択可能である場合の判定については記載を省略する。 Because the postal code of the input record is “198”, the server CPU 12 determines that two transitions are possible: to the intermediate node 61nB along the transition branch 65 for postal code 198, and to the intermediate node 61nF along the transition branch 65 labeled with the generalized code “*”, which does not specify a postal code. Because the range of data that can transition to the intermediate node 61nB is narrower than the range that can transition to the intermediate node 61nF, the server CPU 12 selects the transition branch 65 leading to the intermediate node 61nB and moves processing to the intermediate node 61nB. In the following description, this determination is not repeated for cases where both a transition branch 65 labeled with the generalized code “*” and another transition branch 65 can be selected.

中間ノード６１ｎＢからは、年齢によって分かれる遷移枝６５が３本出ている。入力個票の年齢は２０であるので、サーバＣＰＵ１２は「１９８，２０，＊」の受容ノード６３に向かう遷移枝６５を選択する。受容ノード６３に到達したので、サーバＣＰＵ１２は、「１９８，２０，女」の入力個票を匿名化処理した結果は、「１９８，２０，＊」であると判定する。 From the intermediate node 61nB, three transition branches 65 that are divided according to age are output. Since the age of the input vote is 20, the server CPU 12 selects the transition branch 65 toward the acceptance node 63 of "198, 20, *". Since it has reached the acceptance node 63, the server CPU 12 determines that the result of anonymizing the input individual votes of "198, 20, woman" is "198, 20, *".

サーバＣＰＵ１２が、図８の２行目に示す「１９８，３０，男」の入力個票を第２クライアント２５から取得した場合を例にして説明する。サーバＣＰＵ１２は、入力個票を親ノード６０に入力する。親ノード６０からは、郵便番号によって分かれる遷移枝６５が３本出ている。入力個票の郵便番号は「１９８」であるので、サーバＣＰＵ１２は中間ノード６１ｎＢに向かう遷移枝６５を選択して、中間ノード６１ｎＢに処理を移す。 An example will be described in which the server CPU 12 acquires the input individual number “198, 30, male” shown in the second line of FIG. 8 from the second client 25. The server CPU 12 inputs the input number into the parent node 60. From the parent node 60, three transition branches 65 that are divided according to the postal code are output. Since the postal code of the input number is "198", the server CPU 12 selects the transition branch 65 toward the intermediate node 61nB and moves the process to the intermediate node 61nB.

中間ノード６１ｎＢからは、年齢によって分かれる遷移枝６５が３本出ている。入力個票の年齢は３０であるので、一般化符号「＊」以外の２本の遷移枝６５のいずれのラベルとも一致しない。サーバＣＰＵ１２は、一般化符号である「＊」にあてはまると判定して、中間ノード６１ｎＤに向かう遷移枝６５を選択する。 From the intermediate node 61nB, three transition branches 65 that are divided according to age are output. Since the age of the input piece is 30, it does not match any label of the two transition branches 65 other than the generalized code “*”. The server CPU 12 determines that the generalized code “*” is applicable, and selects the transition branch 65 toward the intermediate node 61nD.

中間ノード６１ｎＤからは、性別によって分かれる遷移枝６５が２本出ている。入力個票の性別は男であるので、サーバＣＰＵ１２は、「１９８，＊，男」の受容ノード６３に向かう遷移枝６５を選択する。受容ノード６３に到達したので、サーバＣＰＵ１２は、「１９８，３０，男」の入力個票を匿名化処理した結果は、「１９８，＊，男」であると判定する。 From the intermediate node 61nD, two transition branches 65 that are divided according to sex are output. Since the sex of the input vote is male, the server CPU 12 selects the transition branch 65 toward the accepting node 63 of "198, *, male". Since it has reached the acceptance node 63, the server CPU 12 determines that the result of anonymizing the input individual votes of "198, 30, male" is "198, *, male".

サーバＣＰＵ１２が、図８の５行目に示す「１９８，６０，女」の入力個票を第２クライアント２５から取得した場合を例にして説明する。サーバＣＰＵ１２は、入力個票を親ノード６０に入力する。親ノード６０からは、郵便番号によって分かれる遷移枝６５が３本出ている。入力個票の郵便番号は「１９８」であるので、サーバＣＰＵ１２は中間ノード６１ｎＢに向かう遷移枝６５を選択して、中間ノード６１ｎＢに処理を移す。 An example will be described in which the server CPU 12 obtains the input individual number “198, 60, woman” shown in the fifth line of FIG. 8 from the second client 25. The server CPU 12 inputs the input number into the parent node 60. From the parent node 60, three transition branches 65 that are divided according to the postal code are output. Since the postal code of the input number is "198", the server CPU 12 selects the transition branch 65 toward the intermediate node 61nB and moves the process to the intermediate node 61nB.

中間ノード６１ｎＢからは、年齢によって分かれる遷移枝６５が３本出ている。入力個票の年齢は６０であるので、３本の遷移枝６５のいずれのラベルとも一致しない。サーバＣＰＵ１２は、一般化符号「＊」にあてはまると判定して、中間ノード６１ｎＤに向かう遷移枝６５を選択する。 From the intermediate node 61nB, three transition branches 65 that are divided according to age are output. Since the age of the input vote is 60, it does not match any of the labels of the three transition branches 65. The server CPU 12 determines that the generalized code “*” applies, and selects the transition branch 65 toward the intermediate node 61nD.

中間ノード６１ｎＤからは、性別によって分かれる遷移枝６５が２本出ている。入力個票の性別は女であるので、２本の遷移枝６５のいずれのラベルとも一致しない。サーバＣＰＵ１２は、一般化符号「＊」にあてはまると判定して、中間ノード６１ｎＦに向かうバイパス枝６７を選択する。 From the intermediate node 61nD, two transition branches 65 that are divided according to sex are output. Since the sex of the input vote is female, it does not match any label of the two transition branches 65. The server CPU 12 determines that the generalized code “*” applies, and selects the bypass branch 67 toward the intermediate node 61nF.

中間ノード６１ｎＦからは、年齢によって分かれる遷移枝６５が２本出ている。入力個票の年齢は６０であるので、２本の遷移枝６５のいずれのラベルとも一致しない。また、中間ノード６１ｎＦからは、「＊」のラベルの遷移枝６５およびバイパス枝６７は出力していない。したがって、サーバＣＰＵ１２は、入力個票を受容ノード６３に当てはめることはできない。サーバＣＰＵ１２は、「１９８，６０，女」の入力個票を匿名化処理した結果は、すべての項目を一般化符号に置換した「＊，＊，＊」であると判定する。以後の説明では、すべての項目を一般化符号に置換した個票を置換個票という。 Two transition branches 65, distinguished by age, leave the intermediate node 61nF. Because the age in the input record is 60, it matches neither label of the two transition branches 65. In addition, neither a transition branch 65 labeled “*” nor a bypass branch 67 leaves the intermediate node 61nF. The server CPU 12 therefore cannot map the input record to any accepting node 63. The server CPU 12 determines that the result of anonymizing the input record “198, 60, female” is “*, *, *”, in which every item is replaced with the generalized code. In the following description, a record in which every item has been replaced with the generalized code is called a replacement record.

以上に説明した匿名化オートマトン３２による処理を簡単にまとめる。サーバＣＰＵ１２は、取得した入力個票を親ノード６０に入力する。サーバＣＰＵ１２は、匿名化オートマトン３２の遷移枝６５およびバイパス枝６７に沿ってノードを遷移させる。以後の説明では、遷移枝６５およびバイパス枝６７をまとめて判定枝と記載する。 The processing by the anonymized automaton 32 described above will be briefly summarized. The server CPU 12 inputs the acquired input individual number into the parent node 60. The server CPU 12 makes a node transition along the transition branch 65 and the bypass branch 67 of the anonymized automaton 32. In the following description, the transition branch 65 and the bypass branch 67 will be collectively referred to as a determination branch.

入力個票と一致する判定枝が無く、「＊」の判定枝が出ている場合には、サーバＣＰＵ１２は「＊」の判定枝を選択する。入力個票と一致する判定枝が無く、「＊」の判定枝も無い場合には、サーバＣＰＵ１２は置換個票が匿名化処理の結果であると判定する。処理が受容ノード６３に到達した場合には、サーバＣＰＵ１２は受容ノード６３の内容が入力個票を匿名化処理した結果であると判定する。 When no determination branch matches the input record but a determination branch labeled “*” is present, the server CPU 12 selects the “*” determination branch. When no determination branch matches the input record and there is no “*” determination branch either, the server CPU 12 determines that the replacement record is the result of the anonymization process. When processing reaches an accepting node 63, the server CPU 12 determines that the content of that accepting node 63 is the result of anonymizing the input record.
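
The selection rules summarized above can be sketched as a short traversal loop. This is a minimal sketch under assumptions, not the patented implementation. The `Node` class, the `anonymize` function, and the hand-built automaton are hypothetical and do not reproduce FIG. 7. Each node is assumed to store the index of the field that its outgoing branches test, and the variable name `nX` has no counterpart in the description.

```python
from dataclasses import dataclass, field
from typing import Optional

WILDCARD = "*"  # the generalized code


@dataclass(eq=False)
class Node:
    field_index: Optional[int] = None          # field tested by the outgoing branches
    edges: dict = field(default_factory=dict)  # value -> child Node
    bypass: Optional["Node"] = None
    record: Optional[tuple] = None             # set only on accepting nodes


def anonymize(root, rec):
    """Follow determination branches until an accepting node or a dead end."""
    node = root
    while node.record is None:
        value = rec[node.field_index]
        if value in node.edges:                # exact label: the narrowest branch
            node = node.edges[value]
        elif WILDCARD in node.edges:           # '*' transition branch
            node = node.edges[WILDCARD]
        elif node.bypass is not None:          # '*' bypass branch
            node = node.bypass
        else:                                  # no branch applies: replacement record
            return (WILDCARD,) * len(rec)
    return node.record


def accept(*rec):
    return Node(record=rec)


# Hypothetical automaton; fields are (postal code, age, sex).
nF = Node(1, {"20": Node(2, {"female": accept("*", "20", "female")}),
              "40": accept("*", "40", "*")})
nD = Node(2, {"male": accept("198", "*", "male")}, bypass=nF)
nX = Node(2, {"male": accept("198", "10", "male")}, bypass=nD)
nB = Node(1, {"10": nX, "20": accept("198", "20", "*"), "*": nD})
root = Node(0, {"198": nB, "*": nF})

results = [anonymize(root, r) for r in
           [("198", "20", "female"), ("198", "30", "male"), ("198", "60", "female")]]
```

For the three input records discussed in the text, `results` holds “198, 20, *”, “198, *, male”, and the replacement record “*, *, *”. Checking an exact label before the “*” label mirrors the rule that the branch covering the narrower range of data is preferred.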

サーバＣＰＵ１２は、第２クライアント２５から逐次送信される入力個票を受信して、以上に説明した匿名化処理を行う。サーバＣＰＵ１２は匿名化処理の結果である、出力個票を第１クライアント２１に逐次送信する。 The server CPU 12 receives the input records transmitted one after another from the second client 25 and performs the anonymization process described above. The server CPU 12 transmits the output records, which are the results of the anonymization process, one after another to the first client 21.

図１０は、プログラムの処理の流れを示すフローチャートである。図１０に示すプログラムは、匿名化済みのデータを取得して匿名化オートマトン３２を作成し、ストリーミングデータを匿名化して出力するプログラムである。図１０を使用して、本実施の形態のプログラムの処理の流れを説明する。 FIG. 10 is a flowchart showing the flow of processing of the program. The program shown in FIG. 10 is a program that acquires anonymized data, creates the anonymized automaton 32, and anonymizes the streaming data for output. The processing flow of the program according to the present embodiment will be described with reference to FIG.

サーバＣＰＵ１２は、通信部１５を介してネットワークから匿名化済データＤＢ３１を取得して（ステップＳ５０１）、補助記憶装置１４に記憶する。なお、匿名化済データＤＢ３１は、あらかじめ補助記憶装置１４に記憶されていても良い。 The server CPU 12 acquires the anonymized data DB 31 from the network via the communication unit 15 (step S501) and stores it in the auxiliary storage device 14. The anonymized data DB 31 may be stored in the auxiliary storage device 14 in advance.

サーバＣＰＵ１２は、階層木作成のサブルーチンを起動する（ステップＳ５０２）。階層木作成のサブルーチンは、匿名化済データＤＢ３１に記録された匿名化済個票に基づいて階層を有する木構造である階層木を作成するサブルーチンである。階層木作成のサブルーチンは、匿名化オートマトン３２を作成する第１の段階のサブルーチンである。階層木作成のサブルーチンの処理の流れは後述する。 The server CPU 12 activates a subroutine for creating a hierarchical tree (step S502). The hierarchical tree creating subroutine is a subroutine for creating a hierarchical tree that is a tree structure having a hierarchy based on the anonymized individual records recorded in the anonymized data DB 31. The subroutine for creating a hierarchical tree is the first stage subroutine for creating the anonymized automaton 32. The processing flow of the hierarchical tree creation subroutine will be described later.

サーバＣＰＵ１２は、ノード削減のサブルーチンを起動する（ステップＳ５０３）。ノード削減のサブルーチンは、階層を有する木構造から不要な中間ノード６１および親ノード６０を削除して、中間ノード６１の削減および親ノード６０の変更を行うサブルーチンである。ノード削減のサブルーチンは、匿名化オートマトン３２を作成する第２の段階のサブルーチンである。ノード削減のサブルーチンの処理の流れは後述する。 The server CPU 12 activates a node reduction subroutine (step S503). The node reduction subroutine is a subroutine for deleting unnecessary intermediate nodes 61 and parent nodes 60 from a tree structure having a hierarchy, and reducing intermediate nodes 61 and changing parent nodes 60. The node reduction subroutine is the second stage subroutine for creating the anonymized automaton 32. The process flow of the node reduction subroutine will be described later.

サーバＣＰＵ１２は、バイパス追加のサブルーチンを起動する（ステップＳ５０４）。バイパス追加のサブルーチンは、作成途中の匿名化オートマトン３２にバイパス枝６７を追加して、匿名化オートマトン３２を完成させるサブルーチンである。バイパス追加のサブルーチンは、匿名化オートマトン３２を作成する第３の段階のサブルーチンである。バイパス追加のサブルーチンの処理の流れは後述する。 The server CPU 12 starts a bypass addition subroutine (step S504). The bypass addition subroutine is a subroutine that completes the anonymization automaton 32 by adding the bypass branch 67 to the anonymization automaton 32 being created. The bypass addition subroutine is a third stage subroutine for creating the anonymized automaton 32. The process flow of the bypass addition subroutine will be described later.

サーバＣＰＵ１２は、完成したオートマトン３２を補助記憶装置１４に保存する（ステップＳ５０５）。 The server CPU 12 saves the completed automaton 32 in the auxiliary storage device 14 (step S505).

第２クライアント２５のＣＰＵは、第２ユーザが入力等を行うことにより生成した入力個票を取得する（ステップＳ６０１）。第２クライアント２５のＣＰＵは取得した入力個票をサーバ１１に送信する（ステップＳ６０２）。 The CPU of the second client 25 acquires the input individual number generated by the second user performing input and the like (step S601). The CPU of the second client 25 transmits the acquired input form to the server 11 (step S602).

サーバＣＰＵ１２は、入力個票を受信する（ステップＳ５１０）。サーバＣＰＵ１２は、匿名化のサブルーチンを起動する（ステップＳ５１１）。匿名化のサブルーチンは、入力個票をステップＳ５０５で保存したオートマトン３２に入力して、匿名化するサブルーチンである。匿名化のサブルーチンの処理の流れは後述する。 The server CPU 12 receives the input vote (step S510). The server CPU 12 activates an anonymization subroutine (step S511). The anonymization subroutine is a subroutine for anonymizing by inputting the input pieces into the automaton 32 saved in step S505. The process flow of the anonymization subroutine will be described later.

サーバＣＰＵ１２は、匿名化した出力個票を第１クライアント２１のＣＰＵに送信する（ステップＳ５１２）。第１クライアント２１のＣＰＵは、受信した出力個票を保存する（ステップＳ７０１）。 The server CPU 12 sends the anonymized output form to the CPU of the first client 21 (step S512). The CPU of the first client 21 saves the received output form (step S701).

サーバＣＰＵ１２は、処理を終了するか否かを判定する（ステップＳ５１５）。処理を終了する場合とは、たとえばネットワークを介してストリーミングデータの処理を終了するという入力を受け付けた場合である。 The server CPU 12 determines whether to end the process (step S515). The case of ending the processing is, for example, the case of receiving an input for ending the processing of the streaming data via the network.

処理を終了しないと判定した場合（ステップＳ５１５でＮＯ）、サーバＣＰＵ１２はステップＳ５１０に戻る。処理を終了すると判定した場合（ステップＳ５１５でＹＥＳ）、サーバＣＰＵ１２は処理を終了する。 When it is determined that the processing is not to be ended (NO in step S515), the server CPU 12 returns to step S510. When it is determined that the process is to be ended (YES in step S515), the server CPU 12 ends the process.

図１１は、階層木作成のサブルーチンの処理の流れを示すフローチャートである。階層木作成のサブルーチンは、匿名化済データＤＢ３１に記録された匿名化済個票に基づいて階層を有する木構造を作成するサブルーチンである。図４を使用して説明した階層木は、階層木作成のサブルーチンが図３を使用して説明した匿名化済データＤＢ３１を処理することにより作成した階層木である。 FIG. 11 is a flowchart showing the flow of processing of a subroutine for creating a hierarchical tree. The subroutine for creating a hierarchical tree is a subroutine for creating a tree structure having a hierarchy based on the anonymized individual records recorded in the anonymized data DB 31. The hierarchical tree described with reference to FIG. 4 is a hierarchical tree created by the hierarchical tree creation subroutine by processing the anonymized data DB 31 described with reference to FIG.

図１１を使用して、階層木作成の処理の流れを説明する。なお、サーバＣＰＵ１２は、補助記憶装置１４または主記憶装置１３上に階層木を作成および記憶する。 The processing flow of hierarchical tree creation will be described with reference to FIG. The server CPU 12 creates and stores a hierarchical tree on the auxiliary storage device 14 or the main storage device 13.

サーバＣＰＵ１２は、親ノード６０を作成する（ステップＳ８０１）。サーバＣＰＵ１２は、カウンタＩを初期値１に設定する（ステップＳ８０２）。サーバＣＰＵ１２は、変数Ｎを初期値１に設定する（ステップＳ８０３）。サーバＣＰＵ１２は、親ノード６０に１番の番号を付与する。 The server CPU 12 creates the parent node 60 (step S801). The server CPU 12 sets the counter I to the initial value 1 (step S802). The server CPU 12 sets the variable N to the initial value 1 (step S803). The server CPU 12 assigns the number 1 to the parent node 60.

サーバＣＰＵ１２は、匿名化済データＤＢ３１からＩ番目のレコードを取得する（ステップＳ８０４）。サーバＣＰＵ１２はカウンタＪを初期値１に設定する（ステップＳ８０５）。 The server CPU 12 acquires the I-th record from the anonymized data DB 31 (step S804). The server CPU 12 sets the counter J to the initial value 1 (step S805).

サーバＣＰＵ１２は、作成中の階層木の第Ｎノードから、ステップＳ８０４で取得したレコードの第Ｊフィールドに記録された階層を示す遷移枝６５が出ているか否かを判定する（ステップＳ８０６）。該当する遷移枝６５が無いと判定した場合には（ステップＳ８０６でＮＯ）、サーバＣＰＵ１２は作成中の階層木に、処理中の匿名化済個票に対応する遷移枝６５およびノードを追加する（ステップＳ８０７）。サーバＣＰＵ１２は、追加したノードに連番の番号を付与する。 The server CPU 12 determines whether a transition branch 65 indicating the level recorded in the J-th field of the record acquired in step S804 leaves the N-th node of the hierarchical tree being created (step S806). When it determines that there is no such transition branch 65 (NO in step S806), the server CPU 12 adds the transition branch 65 and the node corresponding to the anonymized record being processed to the hierarchical tree being created (step S807). The server CPU 12 assigns the next serial number to the added node.

該当する遷移枝６５があると判定した場合（ステップＳ８０６でＹＥＳ）、サーバＣＰＵ１２は変数Ｎを該当する遷移枝６５の先のノードの番号に設定する（ステップＳ８０８）。またステップＳ８０７でノード６１を追加した場合、サーバＣＰＵ１２は変数Ｎを追加したノードの番号に設定する（ステップＳ８０８）。 When it is determined that there is the corresponding transition branch 65 (YES in step S806), the server CPU 12 sets the variable N to the number of the node ahead of the corresponding transition branch 65 (step S808). When the node 61 is added in step S807, the server CPU 12 sets the variable N to the number of the added node (step S808).

サーバＣＰＵ１２は、ステップＳ８０４で取得したレコードのすべてのフィールドの処理を終了したか否かを判定する（ステップＳ８０９）。処理が終了していないと判定した場合（ステップＳ８０９でＮＯ）、サーバＣＰＵ１２はカウンタＪに１を加算する（ステップＳ８１２）。その後、サーバＣＰＵ１２はステップＳ８０６に戻る。 The server CPU 12 determines whether processing of all fields of the record acquired in step S804 has been completed (step S809). When it is determined that the processing has not ended (NO in step S809), the server CPU 12 adds 1 to the counter J (step S812). After that, the server CPU 12 returns to step S806.

すべてのフィールドの処理が終了したと判定した場合（ステップＳ８０９でＹＥＳ）、サーバＣＰＵ１２は、匿名化済データＤＢ３１のすべてのレコードの処理が終了したか否かを判定する（ステップＳ８１０）。処理が終了していないと判定した場合（ステップＳ８１０でＮＯ）、サーバＣＰＵ１２はカウンタＩに１を加算する（ステップＳ８１１）。その後、サーバＣＰＵ１２はステップＳ８０３に戻る。 When it is determined that the processing of all fields is completed (YES in step S809), the server CPU 12 determines whether the processing of all records in the anonymized data DB 31 is completed (step S810). When it is determined that the processing has not ended (NO in step S810), the server CPU 12 adds 1 to the counter I (step S811). After that, the server CPU 12 returns to step S803.

すべてのレコードの処理が完了したと判定した場合（ステップＳ８１０でＹＥＳ）、サーバＣＰＵ１２は処理を終了する。なお、サーバＣＰＵ１２がステップＳ８０７で追加したノードのうち、出力する遷移枝６５を有するノードは中間ノード６１であり、出力する遷移枝６５を有さないノードは受容ノード６３である。 When it is determined that the processing of all records is completed (YES in step S810), the server CPU 12 ends the processing. Among the nodes added by the server CPU 12 in step S807, the node that has the transition branch 65 that outputs is the intermediate node 61, and the node that does not have the transition branch 65 that outputs is the accepting node 63.
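
The hierarchical tree creation flow of FIG. 11 can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names, the dictionary representation of the tree, and the sample records are assumptions introduced here. Each anonymized record is taken to be a tuple with one value per field.

```python
def build_hierarchy_tree(records):
    """Build a tree of transition branches from anonymized records.

    `records` is a list of equal-length tuples (one value per field).
    Returns {node number: {branch label: child node number}}.
    Node 1 is the parent node.
    """
    edges = {1: {}}                  # S801: create the parent node, numbered 1
    next_number = 2                  # serial number for the next added node
    for record in records:           # outer loop over I (S804, S810, S811)
        n = 1                        # S803: restart from the parent node
        for value in record:         # inner loop over J (S805, S809, S812)
            if value not in edges[n]:            # S806: no matching branch
                edges[n][value] = next_number    # S807: add branch and node
                edges[next_number] = {}
                next_number += 1
            n = edges[n][value]                  # S808: follow the branch
    return edges


def accepting_nodes(edges):
    """Nodes without outgoing transition branches are accepting nodes."""
    return sorted(n for n, out in edges.items() if not out)


# Hypothetical anonymized records (zip code, age, sex); not taken from FIG. 3.
sample = [("198", "20", "*"), ("198", "50", "male"), ("*", "20", "female")]
tree = build_hierarchy_tree(sample)
print(tree[1], accepting_nodes(tree))
```

Records that share a prefix share the corresponding nodes, so the number of nodes grows with the number of distinct prefixes rather than with the number of records.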

図１２は、ノード削減のサブルーチンの処理の流れを示すフローチャートである。ノード削減のサブルーチンは、階層を有する木構造から不要な中間ノード６１および親ノード６０を削除して、中間ノード６１の削減および親ノード６０の変更を行うサブルーチンである。図１２を使用して、ノード削減のサブルーチンの処理の流れを説明する。ノード削減のサブルーチンは、たとえば図４を使用して説明した階層木を、図５を使用して説明したように変更する。 FIG. 12 is a flowchart showing the flow of processing of a node reduction subroutine. The node reduction subroutine is a subroutine for deleting unnecessary intermediate nodes 61 and parent nodes 60 from a tree structure having a hierarchy, and reducing intermediate nodes 61 and changing parent nodes 60. The process flow of the node reduction subroutine will be described with reference to FIG. The node reduction subroutine changes, for example, the hierarchical tree described with reference to FIG. 4 as described with reference to FIG.

サーバＣＰＵ１２はカウンタＮを初期値１に設定する（ステップＳ８２１）。サーバＣＰＵ１２は、第Ｎノードから出力する遷移枝６５のラベルが一般化符号「＊」のみであるか否かを判定する（ステップＳ８２２）。 The server CPU 12 sets the counter N to the initial value 1 (step S821). The server CPU 12 determines whether the label of the transition branch 65 output from the Nth node is only the generalized code “*” (step S822).

一般化符号「＊」の遷移枝６５のみであると判定した場合（ステップＳ８２２でＹＥＳ）、サーバＣＰＵ１２は第Ｎノードを削除する（ステップＳ８２３）。サーバＣＰＵ１２は、削除した中間ノード６１に入力していた遷移枝６５を、削除した中間ノード６１の後ろの中間ノード６１または受容ノード６３に接続する。 When it is determined that there is only the transition branch 65 of the generalized code “*” (YES in step S822), the server CPU 12 deletes the Nth node (step S823). The server CPU 12 connects the transition branch 65 that has been input to the deleted intermediate node 61 to the intermediate node 61 or the accepting node 63 behind the deleted intermediate node 61.

一般化符号「＊」の遷移枝６５のみでは無いと判定した場合（ステップＳ８２２でＮＯ）およびステップＳ８２３の終了後、サーバＣＰＵ１２はすべての中間ノード６１の処理を終了したか否かを判定する（ステップＳ８２４）。終了していないと判定した場合（ステップＳ８２４でＮＯ）、サーバＣＰＵ１２はカウンタＮに１を加算する（ステップＳ８２５）。その後、サーバＣＰＵ１２はステップＳ８２２に戻る。 When it determines that the outgoing branches are not limited to the transition branch 65 of the generalized code “*” (NO in step S822), and after step S823 ends, the server CPU 12 determines whether all intermediate nodes 61 have been processed (step S824). When it determines that they have not (NO in step S824), the server CPU 12 adds 1 to the counter N (step S825). After that, the server CPU 12 returns to step S822.

終了したと判定した場合（ステップＳ８２４でＹＥＳ）、サーバＣＰＵ１２は処理を終了する。 If it is determined that the processing is completed (YES in step S824), the server CPU 12 ends the processing.
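
The node reduction flow of FIG. 12 can be sketched as follows. This is a simplified illustration under stated assumptions: the tree is the hypothetical dictionary `{node: {label: child}}`, the function name is invented here, and the sketch tracks only the shape of the graph. It does not record which item each remaining node examines, which a complete automaton would also need.

```python
GENERALIZED = "*"


def reduce_nodes(edges, root=1):
    """Delete every node whose only outgoing branch is labeled '*'.

    Branches that entered a deleted node are reconnected to the node
    behind it. Returns (new_edges, new_root); the input is not modified.
    """
    edges = {n: dict(out) for n, out in edges.items()}   # work on a copy
    for n in sorted(edges):                               # counter N (S821, S825)
        out = edges.get(n)
        if out is None or set(out) != {GENERALIZED}:      # S822
            continue
        successor = out[GENERALIZED]
        del edges[n]                                      # S823: delete node N
        if n == root:                                     # the parent node changes
            root = successor
        for other in edges.values():                      # reconnect incoming branches
            for label, target in other.items():
                if target == n:
                    other[label] = successor
    return edges, root


# Hypothetical tree: node 2 has only a '*' branch and is removed.
example = {1: {"198": 2, "*": 3}, 2: {"*": 4}, 3: {"20": 5},
           4: {"male": 6}, 5: {}, 6: {}}
print(reduce_nodes(example))
```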

図１３および図１４は、バイパス追加のサブルーチンの処理の流れを示すフローチャートである。バイパス追加のサブルーチンは、作成途中の匿名化オートマトン３２にバイパス枝６７を追加して、匿名化オートマトン３２を完成させるサブルーチンである。図６、図１３および図１４を使用して、バイパス追加のサブルーチンの処理の流れを説明する。 FIGS. 13 and 14 are flowcharts showing the processing flow of the bypass addition subroutine. The bypass addition subroutine completes the anonymization automaton 32 by adding bypass branches 67 to the anonymization automaton 32 being created. The processing flow of the bypass addition subroutine will be described with reference to FIGS. 6, 13 and 14.

サーバＣＰＵ１２はカウンタＭを初期値１に設定する（ステップＳ８３１）。サーバＣＰＵ１２はカウンタＫを初期値１に設定する（ステップＳ８３２）。 The server CPU 12 sets the counter M to the initial value 1 (step S831). The server CPU 12 sets the counter K to the initial value 1 (step S832).

サーバＣＰＵ１２は、第Ｍノード列６９の第Ｋノードが受容ノード６３であるか否かを判定する（ステップＳ８３３）。受容ノード６３では無いと判定した場合（ステップＳ８３３でＮＯ）、第Ｍノード列の第Ｋノードから一般化符号「＊」の遷移枝６５またはバイパス枝６７が出力しているか否かを判定する（ステップＳ８３４）。一般化符号「＊」の遷移枝６５またはバイパス枝６７が出力していないと判定した場合（ステップＳ８３４でＮＯ）、サーバＣＰＵ１２は変数ＡにＫを代入する（ステップＳ８３５）。 The server CPU 12 determines whether the K-th node of the M-th node sequence 69 is the accepting node 63 (step S833). When it determines that the node is not the accepting node 63 (NO in step S833), the server CPU 12 determines whether a transition branch 65 of the generalized code “*” or a bypass branch 67 leaves the K-th node of the M-th node sequence (step S834). When it determines that neither a transition branch 65 of the generalized code “*” nor a bypass branch 67 leaves that node (NO in step S834), the server CPU 12 assigns K to the variable A (step S835).

サーバＣＰＵ１２は、変数Ａが１であるか否かを判定する（ステップＳ８３６）。ここで、変数Ａが１である場合は、第Ｍノード列６９の第Ａノードが親ノード６０であることを意味する。変数Ａが１では無いと判定した場合（ステップＳ８３６でＮＯ）、サーバＣＰＵ１２は変数Ａから１を減算する（ステップＳ８３７）。 The server CPU 12 determines whether the variable A is 1 (step S836). Here, when the variable A is 1, it means that the A-th node of the M-th node sequence 69 is the parent node 60. When it is determined that the variable A is not 1 (NO in step S836), the server CPU 12 subtracts 1 from the variable A (step S837).

サーバＣＰＵ１２は、第Ｍノード列６９の第Ａノードから、一般化符号「＊」の遷移枝６５またはバイパス枝６７が出力しているか否かを判定する（ステップＳ８３８）。一般化符号「＊」の遷移枝６５またはバイパス枝６７が出力していないと判定した場合（ステップＳ８３８でＮＯ）、サーバＣＰＵ１２はステップＳ８３６に戻る。 The server CPU 12 determines whether the transition branch 65 or the bypass branch 67 of the generalized code “*” is output from the A-th node of the M-th node sequence 69 (step S838). When it is determined that the transition branch 65 or the bypass branch 67 having the generalized code “*” is not output (NO in step S838), the server CPU 12 returns to step S836.

一般化符号「＊」の遷移枝６５またはバイパス枝６７が出力していると判定した場合（ステップＳ８３８でＹＥＳ）、サーバＣＰＵ１２は変数Ｎを一般化符号「＊」の遷移枝６５が入力する中間ノード６１の番号に設定する（ステップＳ８３９）。 When it determines that a transition branch 65 of the generalized code “*” or a bypass branch 67 leaves that node (YES in step S838), the server CPU 12 sets the variable N to the number of the intermediate node 61 that the transition branch 65 of the generalized code “*” enters (step S839).

サーバＣＰＵ１２は、第Ｍノード列６９が第Ｎ中間ノード６１を含むか否かを判定する（ステップＳ８４０）。第Ｍノード列６９が第Ｎ中間ノード６１を含むと判定した場合（ステップＳ８４０でＹＥＳ）、サーバＣＰＵ１２はステップＳ８３６に戻る。 The server CPU 12 determines whether the Mth node sequence 69 includes the Nth intermediate node 61 (step S840). When it is determined that the Mth node sequence 69 includes the Nth intermediate node 61 (YES in step S840), the server CPU 12 returns to step S836.

第Ｍノード列６９が第Ｎ中間ノード６１を含まないと判定した場合（ステップＳ８４０でＮＯ）、サーバＣＰＵ１２は第Ｍノード列６９の第Ｋノードから、第Ｎ中間ノード６１に向かうバイパス枝６７を作成中のオートマトン３２に追加する（ステップＳ８４１）。バイパス枝６７が選択される条件は、第Ｋノードから出力している遷移枝６５と同じ項目のデータが、一般化符号「＊」であることである。 When it determines that the M-th node sequence 69 does not include the N-th intermediate node 61 (NO in step S840), the server CPU 12 adds, to the automaton 32 being created, a bypass branch 67 from the K-th node of the M-th node sequence 69 to the N-th intermediate node 61 (step S841). The bypass branch 67 is selected when the data of the same item as the transition branches 65 leaving the K-th node is the generalized code “*”.

第Ｍノード列６９の第Ｋノードが受容ノード６３である場合（ステップＳ８３３でＹＥＳ）、サーバＣＰＵ１２は第Ｍノード列６９の最終ノードまでの処理が完了したか否かを判定する（ステップＳ８４２）。第Ｍノード列６９の第Ｋノードから一般化符号「＊」の遷移枝６５またはバイパス枝６７が出力している場合（ステップＳ８３４でＹＥＳ）も、サーバＣＰＵ１２は第Ｍノード列６９の最終ノードまでの処理が完了したか否かを判定する（ステップＳ８４２）。変数Ａが１である場合（ステップＳ８３６でＹＥＳ）およびステップＳ８４１の終了後も、サーバＣＰＵ１２は第Ｍノード列６９の最終ノードまでの処理が完了したか否かを判定する（ステップＳ８４２）。 When the K-th node of the M-th node sequence 69 is the accepting node 63 (YES in step S833), the server CPU 12 determines whether processing has reached the last node of the M-th node sequence 69 (step S842). The server CPU 12 makes the same determination (step S842) when a transition branch 65 of the generalized code “*” or a bypass branch 67 leaves the K-th node of the M-th node sequence 69 (YES in step S834), when the variable A is 1 (YES in step S836), and after step S841 ends.

処理が完了していないと判定した場合（ステップＳ８４２でＮＯ）、サーバＣＰＵ１２はカウンタＫに１を加算する（ステップＳ８４３）。その後、サーバＣＰＵ１２はステップＳ８３３に戻る。 When it is determined that the processing is not completed (NO in step S842), the server CPU 12 increments the counter K by 1 (step S843). After that, the server CPU 12 returns to step S833.

処理が完了したと判定した場合（ステップＳ８４２でＹＥＳ）、サーバＣＰＵ１２はすべてのノード列６９の処理が終了したか否かを判定する（ステップＳ８４４）。終了していないと判定した場合（ステップＳ８４４でＮＯ）、サーバＣＰＵ１２はカウンタＭに１を加算する（ステップＳ８４５）。その後、サーバＣＰＵ１２はステップＳ８３２に戻る。 When it is determined that the processing is completed (YES in step S842), the server CPU 12 determines whether the processing of all the node rows 69 is completed (step S844). When it is determined that the processing has not ended (NO in step S844), the server CPU 12 adds 1 to the counter M (step S845). After that, the server CPU 12 returns to step S832.

終了していると判定した場合（ステップＳ８４４でＹＥＳ）、サーバＣＰＵ１２は処理を終了する。 When it is determined that the processing is completed (YES in step S844), the server CPU 12 ends the processing.
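
The backward search of FIGS. 13 and 14 can be sketched as follows. This is a simplified reading, not the patented implementation, and it rests on several assumptions: the tree is the hypothetical dictionary `{node: {label: child}}`, a node sequence is taken to be a path from the parent node to an accepting node, each node holds at most one bypass branch, and step S838 is narrowed to ancestors that have a “*” transition branch (an ancestor that has only a bypass branch is skipped).

```python
GENERALIZED = "*"


def add_bypasses(edges, root=1):
    """Return {node: bypass target} for a tree {node: {label: child}}."""

    def sequences(n, path):
        # Enumerate node sequences: paths from the root to accepting nodes.
        path = path + [n]
        if not edges[n]:
            yield path
        for child in edges[n].values():
            yield from sequences(child, path)

    bypass = {}
    for seq in sequences(root, []):              # loop over M (S831, S844, S845)
        for k, node in enumerate(seq):           # loop over K (S832, S842, S843)
            if not edges[node]:                  # S833: accepting node
                continue
            if GENERALIZED in edges[node] or node in bypass:   # S834
                continue
            a = k                                # S835
            while a > 0:                         # S836: stop at the parent node
                a -= 1                           # S837
                earlier = seq[a]
                if GENERALIZED not in edges[earlier]:   # S838 (simplified)
                    continue
                target = edges[earlier][GENERALIZED]    # S839
                if target in seq:                # S840: already on this sequence
                    continue
                bypass[node] = target            # S841: add the bypass branch
                break
    return bypass


# Hypothetical tree: nodes 2 and 4 lack a '*' branch and receive bypasses.
example = {1: {"198": 2, "*": 3}, 2: {"20": 4}, 3: {"20": 5, "50": 6},
           4: {"male": 7}, 5: {"*": 8}, 6: {"female": 9}, 7: {}, 8: {}, 9: {}}
print(add_bypasses(example))
```

In the example, both bypass branches lead to node 3, the node reached from the parent node through the “*” branch, so an input that fails on the “198” sequence can be retried on the more general sequence.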

図１５は、匿名化のサブルーチンの処理の流れを示すフローチャートである。匿名化のサブルーチンは、入力個票をステップＳ５０５で保存したオートマトン３２に入力して、匿名化するサブルーチンである。図１５を使用して、匿名化のサブルーチンの処理の流れを説明する。 FIG. 15 is a flowchart showing the flow of processing of the anonymization subroutine. The anonymization subroutine is a subroutine for anonymizing by inputting the input pieces into the automaton 32 saved in step S505. The process flow of the anonymization subroutine will be described with reference to FIG.

サーバＣＰＵ１２は、変数Ｎを初期値１に設定する（ステップＳ８６１）。サーバＣＰＵ１２は、第Ｎノードから入力個票に対応する遷移枝６５またはバイパス枝６７が出力しているか否かを判定する（ステップＳ８６２）。 The server CPU 12 sets the variable N to the initial value 1 (step S861). The server CPU 12 determines whether a transition branch 65 or a bypass branch 67 corresponding to the input record leaves the N-th node (step S862).

出力していないと判定した場合（ステップＳ８６２でＮＯ）、サーバＣＰＵ１２はすべての項目が一般化符号「＊」である置換個票を出力データに設定する（ステップＳ８６３）。サーバＣＰＵ１２は、その後処理を終了する。 When it determines that no such branch exists (NO in step S862), the server CPU 12 sets, as the output data, a replacement record in which every item is the generalized code “*” (step S863). The server CPU 12 then ends the process.

出力していると判定した場合（ステップＳ８６２でＹＥＳ）、サーバＣＰＵ１２は変数Ｎを出力している遷移枝６５の先のノードの番号に設定する（ステップＳ８７１）。この際、サーバＣＰＵ１２は、一般化符号「＊」が指定されている遷移枝６５よりも具体的なデータが指定されている遷移枝６５を優先する。 When it determines that such a branch exists (YES in step S862), the server CPU 12 sets the variable N to the number of the node at the end of that transition branch 65 (step S871). In doing so, the server CPU 12 gives priority to a transition branch 65 that specifies concrete data over a transition branch 65 that specifies the generalized code “*”.

サーバＣＰＵ１２は、第Ｎノードが受容ノード６３であるか否かを判定する（ステップＳ８７２）。受容ノード６３では無いと判定した場合（ステップＳ８７２でＮＯ）、サーバＣＰＵ１２はステップＳ８６２に戻る。 The server CPU 12 determines whether the Nth node is the accepting node 63 (step S872). When it is determined that it is not the receiving node 63 (NO in step S872), the server CPU 12 returns to step S862.

受容ノード６３であると判定した場合（ステップＳ８７２でＹＥＳ）、サーバＣＰＵ１２はその受容ノード６３の内容を有する個票を出力データに設定する（ステップＳ８７３）。その後、サーバＣＰＵ１２は処理を終了する。 When it determines that the node is the accepting node 63 (YES in step S872), the server CPU 12 sets a record having the content of that accepting node 63 as the output data (step S873). After that, the server CPU 12 ends the process.
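
The traversal of FIG. 15 can be sketched as follows. This is a minimal illustration under stated assumptions: the node table, the `make_node` helper, the `field` index stored on each node, and the sample automaton are hypothetical and were introduced only for this example. A bypass branch is tried only when neither a concrete branch nor a “*” branch matches.

```python
GENERALIZED = "*"


def make_node(field=None, edges=None, bypass=None, output=None):
    """Hypothetical node layout: examined field, branches, bypass, output."""
    return {"field": field, "edges": edges or {}, "bypass": bypass, "output": output}


def anonymize(record, nodes, root=1):
    """Run one input record (a tuple) through the automaton."""
    n = root                                      # S861
    while True:
        node = nodes[n]
        if node["output"] is not None:            # S872: accepting node
            return node["output"]                 # S873
        value = record[node["field"]]
        if value in node["edges"]:                # S871: concrete data first
            n = node["edges"][value]
        elif GENERALIZED in node["edges"]:        # then the '*' branch
            n = node["edges"][GENERALIZED]
        elif node["bypass"] is not None:          # then the bypass branch
            n = node["bypass"]
        else:                                     # S862 NO -> S863
            return (GENERALIZED,) * len(record)


# Hypothetical automaton over (zip code, age, sex).
automaton = {
    1: make_node(0, {"198": 2, "*": 3}),
    2: make_node(1, {"20": 4}, bypass=3),
    3: make_node(1, {"20": 5, "50": 6}),
    4: make_node(2, {"male": 7}, bypass=3),
    5: make_node(2, {"*": 8}),
    6: make_node(2, {"female": 9}),
    7: make_node(output=("198", "20", "male")),
    8: make_node(output=("*", "20", "*")),
    9: make_node(output=("*", "50", "female")),
}
print(anonymize(("198", "20", "female"), automaton))
```

Every returned tuple is either the content of an accepting node or the all-“*” replacement, so the sketch never emits a combination that is absent from the accepting nodes.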

本実施の形態によると、匿名化オートマトン３２を使用することにより、ストリーミングデータである入力個票を１個ずつ逐次匿名化することができる。匿名化オートマトン３２の出力は、既存の匿名化済データに含まれている匿名化済個票と一致しているので、入力個票は匿名化される。 According to the present embodiment, using the anonymization automaton 32 makes it possible to anonymize input records, which arrive as streaming data, one at a time. Because the output of the anonymization automaton 32 matches an anonymized record contained in the existing anonymized data, the input record is anonymized.

ステップＳ８７１で、一般化符号「＊」が指定されている遷移枝６５よりも具体的なデータが指定されている遷移枝６５を優先することにより、匿名性を確保できる範囲で狭い定義域に入力個票を一般化する。また、バイパス枝６７を使用することにより、適切な受容ノード６３に到達しない場合に別のノード列６９を用いた一般化を試みる。 In step S871, giving priority to a transition branch 65 that specifies concrete data over a transition branch 65 that specifies the generalized code “*” generalizes the input record to the narrowest domain that still preserves anonymity. In addition, when the appropriate accepting node 63 cannot be reached, the bypass branch 67 allows generalization to be attempted along another node sequence 69.

本実施の形態によると、過度な一般化を避けながら匿名性を確保することができる。したがって、ストリーミングデータの入力個票に対して、二次利用に適した匿名化を行うことができる。 According to this embodiment, anonymity can be secured while avoiding excessive generalization. Therefore, it is possible to anonymize the input data of the streaming data suitable for secondary use.

なお、匿名化済データＤＢ３１および匿名化オートマトン３２は、ネットワークで接続された外部の記憶装置に記憶されていても良い。 The anonymized data DB 31 and the anonymized automaton 32 may be stored in an external storage device connected via a network.

匿名化済データＤＢ３１を作成する際に匿名化する手法は一般化に限定しない。また、匿名化済データＤＢ３１は、ｋ−匿名性を有するＤＢに限定しない。任意の匿名化手法を用いて、任意の水準の匿名性を有する匿名化済データＤＢ３１を使用することができる。 The method of anonymizing when creating the anonymized data DB 31 is not limited to generalization. The anonymized data DB 31 is not limited to the DB having k-anonymity. The anonymized data DB 31 having an arbitrary level of anonymity can be used by using an arbitrary anonymization method.

［実施の形態２］ [Embodiment 2]
実施の形態２は、匿名化する際の項目の優先順位を変更することが可能な情報処理システム１０に関する。なお、実施の形態１と共通する部分については、説明を省略する。 The second embodiment relates to an information processing system 10 capable of changing the priority order of items when anonymizing. Note that the description of the same parts as those in the first embodiment will be omitted.

図１６は、実施の形態２の匿名化処理後のストリーミングデータの例を示す説明図である。図１７は、実施の形態２の匿名化オートマトン３２を示す説明図である。図１６は、図８に示したストリーミングデータを図１７に示す匿名化オートマトン３２を使用して匿名化した結果を示す。 FIG. 16 is an explanatory diagram showing an example of streaming data after anonymization processing according to the second embodiment. FIG. 17 is an explanatory diagram showing the anonymization automaton 32 of the second embodiment. FIG. 16 shows the result of anonymizing the streaming data shown in FIG. 8 using the anonymization automaton 32 shown in FIG. 17.

図１６に示す本実施の形態により匿名化したストリーミングデータと、図９に示す実施の形態１により匿名化したストリーミングデータとの相違点について説明する。本実施の形態では、元データの性別のデータを一般化せずに出力することを優先している。一方、実施の形態１では郵便番号のデータを一般化せずに出力することを優先している。このように、匿名化オートマトン３２を変更することにより、どの項目を優先的に出力するかを変更することができる。なお、性別と郵便番号のどちらを優先することが望ましいかは、匿名化済データを二次利用する目的によって異なる。 The differences between the streaming data anonymized by the present embodiment, shown in FIG. 16, and the streaming data anonymized by the first embodiment, shown in FIG. 9, are as follows. The present embodiment gives priority to outputting the sex data of the original data without generalization, whereas the first embodiment gives priority to outputting the zip code data without generalization. Changing the anonymization automaton 32 in this way changes which item is preferentially output. Whether sex or zip code should take priority depends on the purpose for which the anonymized data is put to secondary use.

図１８は、実施の形態２のプログラムの処理の流れを示すフローチャートである。図１８を使用して、本実施の形態のプログラムの処理の流れを説明する。 FIG. 18 is a flowchart showing the flow of processing of the program according to the second embodiment. The processing flow of the program according to the present embodiment will be described with reference to FIG.

サーバＣＰＵ１２は、通信部１５を介してネットワークから匿名化済データＤＢ３１を取得して（ステップＳ５０１）、補助記憶装置１４に記憶する。サーバＣＰＵ１２は、あらかじめ補助記憶装置１４に記憶されている項目の優先順位を取得する（ステップＳ５３１）。なお、サーバＣＰＵ１２はネットワークを介して第１クライアント２１等から項目の優先順位を取得しても良い。 The server CPU 12 acquires the anonymized data DB 31 from the network via the communication unit 15 (step S501) and stores it in the auxiliary storage device 14. The server CPU 12 acquires the priority order of the items stored in the auxiliary storage device 14 in advance (step S531). The server CPU 12 may acquire the priority order of items from the first client 21 or the like via the network.

サーバＣＰＵ１２は、ステップＳ５３１で取得した優先順位にしたがって、匿名化済データＤＢ３１のフィールドを入れ替える（ステップＳ５３２）。たとえば、サーバＣＰＵ１２は、最も優先順位が高い項目が記録されたフィールドを第１フィールドに、二番目に優先順位が高い項目が記録されたフィールドを第２フィールドにする。 The server CPU 12 replaces the fields of the anonymized data DB 31 according to the priority order acquired in step S531 (step S532). For example, the server CPU 12 sets the field in which the item with the highest priority is recorded as the first field and the field in which the item with the second highest priority is recorded as the second field.
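
The field reordering of step S532 can be sketched as follows. This is a minimal illustration: the function name, the field names, and the sample record are assumptions made for the example. Because the hierarchical tree is built field by field, placing the highest-priority field first puts it closest to the parent node.

```python
def reorder_fields(records, field_names, priority):
    """Reorder every record so that fields follow `priority` (S532).

    `field_names` gives the current field order; `priority` lists the same
    names from highest to lowest priority.
    """
    index = [field_names.index(name) for name in priority]
    return [tuple(record[i] for i in index) for record in records]


# Hypothetical anonymized data in (zip code, age, sex) order.
rows = [("198", "20", "male"), ("*", "50", "female")]
print(reorder_fields(rows, ["zip", "age", "sex"], ["sex", "age", "zip"]))
```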

サーバＣＰＵ１２は、階層木作成のサブルーチンを起動する（ステップＳ５０２）。以後の処理は、図１０を使用して説明した実施の形態１と同一であるので、説明を省略する。 The server CPU 12 activates a subroutine for creating a hierarchical tree (step S502). Subsequent processing is the same as that of the first embodiment described with reference to FIG. 10, and thus description thereof will be omitted.

前述の図１７は、サーバＣＰＵ１２がステップＳ５３１で性別、年齢、郵便番号の順序の優先順位を取得して作成したオートマトンを示す。 FIG. 17 described above shows the automaton created by the server CPU 12 in step S531 by obtaining the priority order of sex, age, and zip code.

本実施の形態によると、二次利用の目的に合わせてストリーミングデータを匿名化する情報処理装置を提供することができる。 According to the present embodiment, it is possible to provide an information processing device that anonymizes streaming data according to the purpose of secondary usage.

［実施の形態３］ [Embodiment 3]
実施の形態３は、一部の項目のみを匿名化する情報処理システム１０に関する。なお、実施の形態１と共通する部分については、説明を省略する。 The third embodiment relates to an information processing system 10 that anonymizes only some items. Note that the description of the same parts as those in the first embodiment will be omitted.

図１９は、実施の形態３のストリーミングデータの例を示す説明図である。図２０は、実施の形態３の匿名化処理後のストリーミングデータの例を示す説明図である。図１９に示すように、本実施の形態のストリーミングデータは、性別、生年、郵便番号に加えて、疾患の項目を含む入力個票である。図２０に示すように、性別、生年および郵便番号の項目は匿名化を行うが、疾患の項目は匿名化せずに出力する。以後の説明では、匿名化せずに出力する項目を、非匿名化項目と記載する。 FIG. 19 is an explanatory diagram showing an example of streaming data according to the third embodiment. FIG. 20 is an explanatory diagram showing an example of streaming data after anonymization processing according to the third embodiment. As shown in FIG. 19, the streaming data of the present embodiment consists of input records that include a disease item in addition to sex, year of birth, and zip code. As shown in FIG. 20, the sex, year of birth, and zip code items are anonymized, but the disease item is output without anonymization. In the following description, items that are output without anonymization are referred to as non-anonymized items.

また、本実施の形態においては、匿名化済データＤＢ３１のフィールドの配列順序と、ストリーミングデータの入力個票の項目の配列順序とが異なっている。ストリーミングデータの個票の項目の配列順序は、あらかじめ補助記憶装置１４に記憶されているか、または入力個票に含まれている。 Further, in the present embodiment, the order of the fields in the anonymized data DB 31 differs from the order of the items in the input records of the streaming data. The order of the items in the streaming data records is either stored in the auxiliary storage device 14 in advance or included in the input record.

図２１は、実施の形態３の匿名化のサブルーチンの処理の流れを示すフローチャートである。図２１に示すサブルーチンは、図７を使用して説明したサブルーチンの代わりに使用するサブルーチンである。図２１を使用して、本実施の形態の匿名化のサブルーチンの処理の流れを説明する。 FIG. 21 is a flowchart showing the flow of processing of the anonymization subroutine of the third embodiment. The subroutine shown in FIG. 21 is a subroutine used in place of the subroutine described with reference to FIG. The process flow of the anonymization subroutine of this embodiment will be described with reference to FIG.

サーバＣＰＵ１２は、変数Ｎを初期値１に設定する（ステップＳ８６１）。サーバＣＰＵ１２は、第Ｎ中間ノード６１から入力個票に対応する遷移枝６５またはバイパス枝６７が出力しているか否かを判定する（ステップＳ８６２）。 The server CPU 12 sets the variable N to the initial value 1 (step S861). The server CPU 12 determines whether a transition branch 65 or a bypass branch 67 corresponding to the input record leaves the N-th intermediate node 61 (step S862).

出力していないと判定した場合（ステップＳ８６２でＮＯ）、サーバＣＰＵ１２は匿名化する対象の項目すべてを一般化符号「＊」で置換したデータを出力データに設定する（ステップＳ８７５）。 When it is determined that the output has not been performed (NO in step S862), the server CPU 12 sets, as output data, data obtained by replacing all items to be anonymized with the generalized code “*” (step S875).

出力していると判定した場合（ステップＳ８６２でＹＥＳ）、サーバＣＰＵ１２は変数Ｎを出力している遷移枝６５の先の中間ノード６１の番号に設定する（ステップＳ８７１）。この際、サーバＣＰＵ１２は、一般化符号「＊」が指定されている遷移枝６５よりも具体的なデータが指定されている遷移枝６５を優先する。 When it determines that such a branch exists (YES in step S862), the server CPU 12 sets the variable N to the number of the intermediate node 61 at the end of that transition branch 65 (step S871). In doing so, the server CPU 12 gives priority to a transition branch 65 that specifies concrete data over a transition branch 65 that specifies the generalized code “*”.

受容ノード６３であると判定した場合（ステップＳ８７２でＹＥＳ）、サーバＣＰＵ１２はその受容ノード６３の内容を出力データに設定する（ステップＳ８７３）。ステップＳ８７３およびステップＳ８７５の終了後、サーバＣＰＵ１２は出力データに非匿名化項目を追加する（ステップＳ８７６）。その後、サーバＣＰＵ１２は処理を終了する。 When it determines that the node is the accepting node 63 (YES in step S872), the server CPU 12 sets the content of that accepting node 63 as the output data (step S873). After step S873 or step S875 ends, the server CPU 12 adds the non-anonymized items to the output data (step S876). After that, the server CPU 12 ends the process.
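
The handling of non-anonymized items in FIG. 21 can be sketched as follows. This is a minimal illustration under stated assumptions: the input record is represented as a dictionary keyed by item name, the function and parameter names are hypothetical, and `toy_anonymize` is a stand-in lookup used only to make the example runnable. It is not the automaton itself.

```python
GENERALIZED = "*"


def anonymize_partial(record, db_fields, anonymize):
    """Anonymize only the items in `db_fields`; pass the other items through.

    `record` maps item names to values in any order, `db_fields` is the field
    order of the anonymized data, and `anonymize` maps a tuple to a tuple.
    """
    key = tuple(record[name] for name in db_fields)     # reorder to DB field order
    output = dict(zip(db_fields, anonymize(key)))       # result of S873 or S875
    for name, value in record.items():                  # S876: add non-anonymized items
        if name not in db_fields:
            output[name] = value
    return output


# Stand-in for the automaton: accept known tuples, otherwise generalize fully.
known = {("198", "1990", "male")}


def toy_anonymize(key):
    return key if key in known else (GENERALIZED,) * len(key)


row = {"sex": "male", "birth_year": "1990", "zip": "198", "disease": "flu"}
print(anonymize_partial(row, ("zip", "birth_year", "sex"), toy_anonymize))
```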

本実施の形態によると、たとえば疾患名等の二次利用の際に重要度の高い項目のデータを完全に残した状態で、入力個票を匿名化することができる。また、本実施の形態によると匿名化済データＤＢ３１を作成する際の元データとは異なる項目を含むストリーミングデータの入力個票を匿名化することができる。 According to the present embodiment, an input record can be anonymized while the data of items that are highly important for secondary use, such as disease names, is left fully intact. In addition, according to the present embodiment, input records of streaming data can be anonymized even when they include items that differ from the original data used to create the anonymized data DB 31.

なお、サーバＣＰＵ１２はステップＳ８７６の後、本サブルーチンの処理を終了する前に、匿名化済データＤＢ３１に記録されたデータに合わせて出力個票の項目の順番を変更しても良い。このようにすることにより、たとえば第１ユーザが匿名化済データＤＢ３１とストリーミングデータとを組み合わせて使用することが容易に行える。 It should be noted that the server CPU 12 may change the order of the items of the output individual items in accordance with the data recorded in the anonymized data DB 31 after the step S876 and before ending the processing of this subroutine. By doing so, for example, the first user can easily use the anonymized data DB 31 and the streaming data in combination.

［実施の形態４］ [Embodiment 4]
実施の形態４は、大きな階層木を作成した後に、不要な遷移枝６５を削除して匿名化オートマトン３２を作成する情報処理システム１０に関する。なお、実施の形態１と共通する部分については、説明を省略する。 The fourth embodiment relates to an information processing system 10 that creates an anonymized automaton 32 by deleting an unnecessary transition branch 65 after creating a large hierarchical tree. Note that the description of the same parts as those in the first embodiment will be omitted.

図２２および図２３は、実施の形態４の階層木の作成過程を示す説明図である。図２２について説明する。図２２は、階層木作成の初期段階を示す。図２２では、各階層の中間ノード６１は、それぞれ同一の種類および数の遷移枝６５を有する。なお、図２２および図２３では、受容ノード６３の番号は図示を省略する。 FIGS. 22 and 23 are explanatory diagrams showing the process of creating a hierarchical tree according to the fourth embodiment. FIG. 22 is described first. FIG. 22 shows the initial stage of creating the hierarchical tree. In FIG. 22, the intermediate nodes 61 of each level all have transition branches 65 of the same types and number. In FIGS. 22 and 23, the numbers of the accepting nodes 63 are not shown.

具体的には、第２階層の中間ノード６１はすべて、年齢が「２０」「５０」「＊」の３本の遷移枝６５を有する。第３階層の中間ノード６１はすべて、性別が「男」「女」「＊」の３本の遷移枝６５を有する。各階層間の遷移枝６５の数は、各フィールドに含まれるデータの種類の数と一致している。各受容ノード６３の内容は、親ノード６０から受容ノード６３に到達するまでに経由する遷移枝６５に対応する。 Specifically, all the intermediate nodes 61 of the second layer have three transition branches 65 whose ages are “20”, “50”, and “*”. All the intermediate nodes 61 of the third layer have three transition branches 65 having genders of “male”, “female” and “*”. The number of transition branches 65 between layers is the same as the number of types of data included in each field. The content of each reception node 63 corresponds to the transition branch 65 through which the reception node 63 is reached from the parent node 60.

図２３は、匿名化済データＤＢ３１に含まれる匿名化済個票と一致しない受容ノード６３を削除する過程を示す。たとえば、図２３の左端に記載した「１９８，２０，男」という個票は、匿名化済データＤＢ３１に含まれていない。サーバＣＰＵ１２は、このような受容ノード６３を階層木から削除することにより、匿名化の処理が不十分な個票を出力することを防止する。 FIG. 23 shows the process of deleting accepting nodes 63 that do not match any anonymized record contained in the anonymized data DB 31. For example, the record “198, 20, male” shown at the left end of FIG. 23 is not contained in the anonymized data DB 31. By deleting such accepting nodes 63 from the hierarchical tree, the server CPU 12 prevents the output of insufficiently anonymized records.

サーバＣＰＵ１２は、受容ノード６３を削除する場合には、その受容ノード６３に入力する遷移枝６５も削除する。この際、出力する遷移枝６５を有さない中間ノード６１が発生する場合には、サーバＣＰＵ１２はその中間ノード６１も削除する。以上の処理が完了することにより、図４を使用して説明した階層木と同様の階層木が完成する。 When deleting the acceptance node 63, the server CPU 12 also deletes the transition branch 65 input to the acceptance node 63. At this time, when an intermediate node 61 that does not have the output transition branch 65 occurs, the server CPU 12 also deletes the intermediate node 61. By completing the above processing, a hierarchical tree similar to the hierarchical tree described using FIG. 4 is completed.

図２４は、実施の形態４の階層木作成のサブルーチンの処理の流れを示すフローチャートである。図２４に示すサブルーチンは、図１１を使用して説明したサブルーチンの代わりに使用するサブルーチンである。図２４を使用して、本実施の形態の階層木作成のサブルーチンの処理の流れを説明する。 FIG. 24 is a flow chart showing the flow of processing of the hierarchical tree creating subroutine of the fourth embodiment. The subroutine shown in FIG. 24 is a subroutine used instead of the subroutine described with reference to FIG. The processing flow of the subroutine for creating a hierarchical tree according to the present embodiment will be described with reference to FIG.

サーバＣＰＵ１２は、親ノード６０を作成する（ステップＳ９０１）。サーバＣＰＵ１２は、カウンタＪを初期値１に設定する（ステップＳ９０２）。 The server CPU 12 creates the parent node 60 (step S901). The server CPU 12 sets the counter J to the initial value 1 (step S902).

サーバＣＰＵ１２は、匿名化済データＤＢ３１の第Ｊフィールドに含まれるデータの種類を抽出する（ステップＳ９０３）。図３に示す匿名化済データＤＢ３１を例にして説明する。第１フィールドである郵便番号のフィールドには「１９８」「９９８」「＊」の３種類のデータが含まれている。したがって、サーバＣＰＵ１２はＪ＝１である場合には、「１９８」「９９８」「＊」の３個のデータを抽出する。 The server CPU 12 extracts the type of data included in the Jth field of the anonymized data DB 31 (step S903). The anonymized data DB 31 shown in FIG. 3 will be described as an example. The zip code field, which is the first field, contains three types of data, "198", "998", and "*". Therefore, when J = 1, the server CPU 12 extracts three pieces of data “198”, “998”, and “*”.

サーバＣＰＵ１２は、第Ｊ＋１階層にステップＳ９０３で抽出した各データに対応する遷移枝６５および中間ノード６１を作成する（ステップＳ９０４）。たとえば、Ｊ＝１である場合には、サーバＣＰＵ１２は中間ノード６１ｎ２、中間ノード６１ｎ３、中間ノード６１ｎ４の３個の中間ノード６１および遷移枝６５を作成する。なお、ステップＳ９０４で作成するノードのうち、出力する遷移枝６５を有さないノードは、受容ノード６３である。 The server CPU 12 creates, at level J+1, a transition branch 65 and an intermediate node 61 for each piece of data extracted in step S903 (step S904). For example, when J = 1, the server CPU 12 creates three intermediate nodes 61, namely the intermediate node 61n2, the intermediate node 61n3, and the intermediate node 61n4, together with their transition branches 65. Of the nodes created in step S904, a node that has no outgoing transition branch 65 is an accepting node 63.

サーバＣＰＵ１２は、匿名化済データＤＢ３１のすべてのフィールドの処理が終了したか否かを判定する（ステップＳ９０５）。処理が終了していないと判定した場合（ステップＳ９０５でＮＯ）、サーバＣＰＵ１２はカウンタＪに１を加算する（ステップＳ９０６）。その後、サーバＣＰＵ１２はステップＳ９０３に戻る。 The server CPU 12 determines whether or not processing of all fields of the anonymized data DB 31 has been completed (step S905). When it is determined that the processing has not ended (NO in step S905), the server CPU 12 adds 1 to the counter J (step S906). After that, the server CPU 12 returns to step S903.

処理が終了したと判定した場合（ステップＳ９０５でＹＥＳ）、サーバＣＰＵ１２は、変数Ｎを初期値１に設定する（ステップＳ９１０）。サーバＣＰＵ１２は、Ｎ番目の受容ノード６３に対応する個票が匿名化済データＤＢ３１内に存在するか否かを判定する（ステップＳ９１１）。存在しないと判定した場合（ステップＳ９１１でＮＯ）、サーバＣＰＵ１２は受容ノード６３を作成中の階層木から削除する（ステップＳ９１２）。サーバＣＰＵ１２は、削除した受容ノード６３に入力する遷移枝６５も削除する。この際、出力する遷移枝６５を有さない中間ノード６１が発生する場合には、サーバＣＰＵ１２は遷移枝６５を有さない中間ノード６１も削除する。 When it is determined that the process is completed (YES in step S905), the server CPU 12 sets the variable N to the initial value 1 (step S910). The server CPU 12 determines whether or not the individual form corresponding to the Nth accepting node 63 exists in the anonymized data DB 31 (step S911). When it is determined that the node does not exist (NO in step S911), the server CPU 12 deletes the reception node 63 from the hierarchical tree being created (step S912). The server CPU 12 also deletes the transition branch 65 input to the deleted acceptance node 63. At this time, when the intermediate node 61 that does not have the output transition branch 65 occurs, the server CPU 12 also deletes the intermediate node 61 that does not have the transition branch 65.

受容ノード６３が存在すると判定した場合（ステップＳ９１１でＹＥＳ）およびステップＳ９１２の終了後、サーバＣＰＵ１２はすべての受容ノード６３の処理が終了したか否かを判定する（ステップＳ９１３）。終了していないと判定した場合には（ステップＳ９１３でＮＯ）、サーバＣＰＵ１２はカウンタＮに１を加算する（ステップＳ９１４）。その後、サーバＣＰＵ１２はステップＳ９１１に戻る。 When it is determined that a record corresponding to the accepting node 63 exists (YES in step S911), or after step S912 ends, the server CPU 12 determines whether all accepting nodes 63 have been processed (step S913). When it is determined that they have not (NO in step S913), the server CPU 12 adds 1 to the counter N (step S914) and then returns to step S911.

終了したと判定した場合には（ステップＳ９１３でＹＥＳ）、サーバＣＰＵ１２は処理を終了する。 If it is determined that the processing is completed (YES in step S913), the server CPU 12 ends the processing.
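
The two phases described above (layer-by-layer tree construction in steps S903 to S906, followed by pruning in steps S910 to S914) can be illustrated with a minimal Python sketch. It is only an approximation under stated assumptions: the tree is modeled as nested dictionaries, every layer receives one branch per distinct field value, and the function name `build_pruned_tree` and the sample field names are hypothetical and do not appear in this specification.

```python
from itertools import product


def build_pruned_tree(records, fields):
    """Build a layered tree over `fields`, then prune paths with no record.

    Assumption: a node is a dict mapping a field value (transition branch)
    to the child node; an empty dict at the last layer is an accepting node.
    """
    # Steps S903-S906 (sketch): one layer per field, one branch per value.
    layers = [sorted({r[f] for r in records}) for f in fields]
    tree = {}
    for path in product(*layers):
        node = tree
        for value in path:
            node = node.setdefault(value, {})

    # Steps S910-S914 (sketch): drop accepting nodes that match no record,
    # then drop intermediate nodes left without outgoing branches.
    existing = {tuple(r[f] for f in fields) for r in records}

    def prune(node, prefix, depth):
        if depth == len(fields):
            return prefix in existing          # keep leaf only if a record exists
        for value in list(node):
            if not prune(node[value], prefix + (value,), depth + 1):
                del node[value]                # delete branch and its subtree
        return bool(node)                      # False if no branch survived

    prune(tree, (), 0)
    return tree


records = [
    {"age": "20-29", "zip": "123**"},
    {"age": "30-39", "zip": "456**"},
]
tree = build_pruned_tree(records, ["age", "zip"])
```

In this sketch the full tree initially contains four leaves (two age values times two zip values); pruning leaves only the two paths that correspond to records in the sample data.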

本実施の形態によると、実施の形態１とは異なるアルゴリズムで匿名化オートマトン３２を作成することができる。なお、匿名化オートマトン３２の作成方法は、本実施の形態および実施の形態１に記載した方法に限定しない。親ノード６０および匿名化済データＤＢ３１に含まれる匿名化済個票に対応する受容ノード６３を備えれば、任意の方法で作成した匿名化オートマトン３２を使用することができる。 According to this embodiment, the anonymization automaton 32 can be created by an algorithm different from that of the first embodiment. The method of creating the anonymization automaton 32 is not limited to the methods described in this embodiment and the first embodiment. An anonymization automaton 32 created by any method can be used, provided that it has a parent node 60 and accepting nodes 63 corresponding to the anonymized records contained in the anonymized data DB 31.

図７に示す匿名化オートマトン３２は、匿名化オートマトン３２の機能を説明するためのイメージである。匿名化オートマトン３２は、状態遷移表その他任意の形式で表現されて、補助記憶装置１４に記憶されていても良い。 The anonymization automaton 32 shown in FIG. 7 is a conceptual illustration for explaining the function of the anonymization automaton 32. The anonymization automaton 32 may be represented as a state transition table or in any other format and stored in the auxiliary storage device 14.

［実施の形態５］ [Fifth Embodiment]
実施の形態５は、ストリーミングデータを蓄積して、匿名化オートマトン３２を再作成する情報処理システム１０に関する。なお、実施の形態１と共通する部分については、説明を省略する。 The fifth embodiment relates to an information processing system 10 that accumulates streaming data and re-creates the anonymization automaton 32. Description of the parts common to the first embodiment is omitted.

図２５は、実施の形態５の情報処理システム１０の構成を示す説明図である。情報処理システム１０は、サーバ１１、第１クライアント２１、第２クライアント２５および匿名化サーバ１８を備える。サーバ１１、第１クライアント２１、第２クライアント２５および匿名化サーバ１８は、ネットワークを介して接続している。 FIG. 25 is an explanatory diagram showing the configuration of the information processing system 10 according to the fifth embodiment. The information processing system 10 includes a server 11, a first client 21, a second client 25, and an anonymization server 18. The server 11, the first client 21, the second client 25, and the anonymization server 18 are connected via a network.

匿名化サーバ１８は、元データ個票を取得して匿名化済データＤＢ３１を作成するサーバである。本実施の形態の匿名化サーバ１８は、汎用のパーソナルコンピューター、大型計算機等の情報機器等である。また、本実施の形態のサーバ１１と匿名化サーバ１８とは、同一のハードウェア上で動作する仮想マシンでも良い。匿名化サーバ１８は、本実施の形態の作成部の一例である。 The anonymization server 18 is a server that acquires original data records and creates the anonymized data DB 31. The anonymization server 18 of this embodiment is an information device such as a general-purpose personal computer or a mainframe computer. The server 11 and the anonymization server 18 of this embodiment may also be virtual machines running on the same hardware. The anonymization server 18 is an example of the creation unit of this embodiment.

匿名化サーバ１８は、サーバ１１と同様にＣＰＵ、主記憶装置、補助記憶装置および通信部を備える。匿名化サーバ１８の補助記憶装置には、匿名化前の元データが記憶される。匿名化サーバ１８の内部構成については、図示を省略する。 Like the server 11, the anonymization server 18 includes a CPU, a main storage device, an auxiliary storage device, and a communication unit. Original data before anonymization is stored in the auxiliary storage device of the anonymization server 18. Illustration of the internal configuration of the anonymization server 18 is omitted.

図２６は、実施の形態５のプログラムの処理の流れを示すフローチャートである。図２６を使用して、本実施の形態の処理の流れを説明する。 FIG. 26 is a flowchart showing the flow of processing of the program according to the fifth embodiment. The processing flow of this embodiment will be described with reference to FIG. 26.

匿名化サーバ１８のＣＰＵは、多数の元データ個票を取得する（ステップＳ９５１）。なお、元データ個票は、第２クライアント２５から逐次発生する個票を蓄積したものであっても良い。 The CPU of the anonymization server 18 acquires a large number of original data records (step S951). The original data records may be an accumulation of records generated sequentially by the second client 25.

匿名化サーバ１８のＣＰＵは、取得した元データ個票を匿名化して、匿名化済データＤＢ３１を作成する（ステップＳ９５２）。元データ個票を匿名化するには、たとえばｋ−匿名化、Ｐｋ−匿名化、米国ＨＩＰＡＡ（Health Insurance Portability and Accountability Act）プライバシールールなどの法令やガイドラインに基づいた匿名化等、任意の手法を使用することができる。匿名化サーバ１８のＣＰＵは、匿名化済データＤＢ３１をサーバ１１に送信する。 The CPU of the anonymization server 18 anonymizes the acquired original data records to create the anonymized data DB 31 (step S952). Any technique can be used to anonymize the original data records, for example k-anonymization, Pk-anonymization, or anonymization based on laws and guidelines such as the privacy rule of the US HIPAA (Health Insurance Portability and Accountability Act). The CPU of the anonymization server 18 transmits the anonymized data DB 31 to the server 11.
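
As an illustration of the k-anonymity property named above, the following minimal Python sketch checks whether a set of already anonymized records is k-anonymous, that is, whether every combination of quasi-identifier values occurs in at least k records. It does not perform the anonymization of step S952 itself; the function name `is_k_anonymous`, the field names, and the sample data are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if each quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


anonymized = [
    {"age": "20-29", "zip": "123**", "disease": "flu"},
    {"age": "20-29", "zip": "123**", "disease": "cold"},
    {"age": "30-39", "zip": "456**", "disease": "flu"},
    {"age": "30-39", "zip": "456**", "disease": "asthma"},
]
ok_for_2 = is_k_anonymous(anonymized, ["age", "zip"], 2)
ok_for_3 = is_k_anonymous(anonymized, ["age", "zip"], 3)
```

Here each (age, zip) combination appears exactly twice, so the sample satisfies the check for k = 2 but not for k = 3.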

サーバＣＰＵ１２は、通信部１５を介して匿名化済データＤＢ３１を取得して（ステップＳ５０１）、補助記憶装置１４に記憶する。以後、ステップＳ５１２までのサーバＣＰＵ１２が行う処理および第１クライアント２１のＣＰＵが行う処理は実施の形態１と同一であるので説明を省略する。 The server CPU 12 acquires the anonymized data DB 31 via the communication unit 15 (step S501) and stores it in the auxiliary storage device 14. The subsequent processing performed by the server CPU 12 up to step S512 and the processing performed by the CPU of the first client 21 are the same as in the first embodiment, so their description is omitted.

第２クライアント２５のＣＰＵは、第２ユーザが入力等を行うことにより生成した入力個票を取得する（ステップＳ６０１）。第２クライアント２５のＣＰＵは取得した入力個票をサーバ１１および匿名化サーバ１８に送信する（ステップＳ６１１）。 The CPU of the second client 25 acquires an input record generated by, for example, input from the second user (step S601). The CPU of the second client 25 transmits the acquired input record to the server 11 and the anonymization server 18 (step S611).

匿名化サーバ１８のＣＰＵは、入力個票を受信し（ステップＳ９５３）、補助記憶装置に設けた入力個票記憶部に保存する。匿名化サーバ１８は、第２クライアント２５が入力個票を送信する都度、ステップＳ９５３を繰り返す。 The CPU of the anonymization server 18 receives the input record (step S953) and saves it in an input record storage unit provided in the auxiliary storage device. The anonymization server 18 repeats step S953 each time the second client 25 transmits an input record.

匿名化サーバ１８のＣＰＵは、ステップＳ９５１で取得した元データ個票とステップＳ９５３で受信した入力個票をまとめて元データとして、新たな匿名化済データＤＢ３１を作成する（ステップＳ９５４）。この際、匿名化サーバ１８のＣＰＵは、本実施の形態の再作成部の機能を実現する。匿名化サーバ１８のＣＰＵがステップＳ９５４を実施するタイミングは、任意に定めることができる。たとえば匿名化サーバ１８は、１週間ごと、１ヶ月ごと等、所定の時期にステップＳ９５４を実施する。また、匿名化サーバ１８のＣＰＵは、ステップＳ９５３で受信した入力個票が所定の数を超えた場合にステップＳ９５４を実行しても良い。 The CPU of the anonymization server 18 combines the original data records acquired in step S951 with the input records received in step S953, treats the combined set as the original data, and creates a new anonymized data DB 31 (step S954). In doing so, the CPU of the anonymization server 18 realizes the function of the re-creation unit of this embodiment. The timing at which the CPU of the anonymization server 18 performs step S954 can be chosen freely. For example, the anonymization server 18 performs step S954 at predetermined intervals, such as weekly or monthly. Alternatively, the CPU of the anonymization server 18 may perform step S954 when the number of input records received in step S953 exceeds a predetermined number.
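
The two triggers for step S954 described above (a fixed period, or the number of received input records exceeding a threshold) could be combined as in the following minimal Python sketch. The class name `RecreationBuffer`, its fields, and the concrete threshold values are hypothetical illustrations; the specification leaves the timing open.

```python
from dataclasses import dataclass, field


@dataclass
class RecreationBuffer:
    """Accumulates input records (step S953) and decides when to re-create."""
    max_pending: int                 # hypothetical "predetermined number"
    max_age_seconds: float           # hypothetical period, e.g. one week
    last_rebuild: float = 0.0        # time of the previous re-creation
    pending: list = field(default_factory=list)

    def add(self, record):
        self.pending.append(record)

    def should_recreate(self, now):
        # Trigger when the count exceeds the threshold or the period elapsed.
        return (len(self.pending) > self.max_pending
                or now - self.last_rebuild >= self.max_age_seconds)

    def take(self, now):
        # Hand over the accumulated records to be merged with the original data.
        batch, self.pending = self.pending, []
        self.last_rebuild = now
        return batch


WEEK = 7 * 24 * 3600
buf = RecreationBuffer(max_pending=2, max_age_seconds=WEEK)
buf.add({"age": 25})
buf.add({"age": 31})
before = buf.should_recreate(now=10)   # two records do not exceed the threshold
buf.add({"age": 47})
after = buf.should_recreate(now=10)    # three records exceed it
batch = buf.take(now=10)
```

The caller would pass `batch` together with the original data records to whatever anonymization routine implements step S954.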

匿名化サーバ１８のＣＰＵは、新たな匿名化済データＤＢ３１を作成したことを、サーバ１１に通知する（ステップＳ９５５）。 The CPU of the anonymization server 18 notifies the server 11 that the new anonymized data DB 31 has been created (step S955).

サーバＣＰＵ１２は、処理を終了するか否かを判定する（ステップＳ５１５）。処理を終了すると判定した場合（ステップＳ５１５でＹＥＳ）、サーバＣＰＵ１２は処理を終了する。 The server CPU 12 determines whether to end the process (step S515). When it is determined that the process is to be ended (YES in step S515), the server CPU 12 ends the process.

処理を終了しないと判定した場合（ステップＳ５１５でＮＯ）、サーバＣＰＵ１２は新たな匿名化済データＤＢ３１が存在するか否かを判定する（ステップＳ５５２）。新たな匿名化済データＤＢ３１の存在の有無は、匿名化サーバ１８がステップＳ９５５で送信した通知を受信したか否かにより判定する。 When it is determined that the processing is not to be ended (NO in step S515), the server CPU 12 determines whether a new anonymized data DB 31 exists (step S552). Whether a new anonymized data DB 31 exists is determined by whether the server 11 has received the notification that the anonymization server 18 transmitted in step S955.

存在しないと判定した場合（ステップＳ５５２でＮＯ）、サーバＣＰＵ１２はステップＳ５１０に戻る。存在すると判定した場合（ステップＳ５５２でＹＥＳ）、サーバＣＰＵ１２はステップＳ５０１に戻る。 When it is determined that no new anonymized data DB 31 exists (NO in step S552), the server CPU 12 returns to step S510. When it is determined that one exists (YES in step S552), the server CPU 12 returns to step S501.

本実施の形態によると、ストリーミングデータを反映して、随時匿名化オートマトン３２を更新する情報処理システム１０を提供することができる。入力個票データが時間の経過とともに変化するトレンドを有する場合にも、陳腐化せず、有用な匿名化済データを提供することが可能である。 According to this embodiment, it is possible to provide an information processing system 10 that updates the anonymization automaton 32 as needed to reflect streaming data. Even when the input record data follows a trend that changes over time, the system can continue to provide useful anonymized data that does not become obsolete.

［実施の形態６］ [Sixth Embodiment]
実施の形態６は、匿名化オートマトン３２の作成とストリーミングデータの匿名化とを異なるサーバで行う情報処理システム１０に関する。なお、実施の形態５と共通する部分については、説明を省略する。 The sixth embodiment relates to an information processing system 10 in which the creation of the anonymization automaton 32 and the anonymization of streaming data are performed by different servers. Description of the parts common to the fifth embodiment is omitted.

図２７は、実施の形態６のプログラムの処理の流れを示すフローチャートである。図２７を使用して、本実施の形態の処理の流れを説明する。 FIG. 27 is a flowchart showing the flow of processing of the program according to the sixth embodiment. The processing flow of this embodiment will be described with reference to FIG. 27.

匿名化サーバ１８のＣＰＵは、多数の元データ個票を取得する（ステップＳ９５１）。なお、元データは、第２クライアント２５から逐次発生する入力個票を蓄積したデータであっても良い。 The CPU of the anonymization server 18 acquires a large number of original data records (step S951). The original data may be data in which input records generated sequentially by the second client 25 have been accumulated.

匿名化サーバ１８のＣＰＵは、取得した元データ個票を匿名化して、匿名化済データＤＢ３１を作成する（ステップＳ９５２）。匿名化サーバ１８のＣＰＵは、階層木作成のサブルーチンを起動する（ステップＳ９６１）。階層木作成のサブルーチンには、図１１を使用して説明したサブルーチンまたは図２４を使用して説明したサブルーチンを使用することができる。 The CPU of the anonymization server 18 anonymizes the acquired original data records to create the anonymized data DB 31 (step S952). The CPU of the anonymization server 18 starts the hierarchical tree creation subroutine (step S961). Either the subroutine described with reference to FIG. 11 or the subroutine described with reference to FIG. 24 can be used as the hierarchical tree creation subroutine.

匿名化サーバ１８のＣＰＵは、ノード削減のサブルーチンを起動する（ステップＳ９６２）。ノード削減のサブルーチンには、図１２を使用して説明したサブルーチンを使用することができる。 The CPU of the anonymization server 18 starts a node reduction subroutine (step S962). The subroutine described with reference to FIG. 12 can be used for the node reduction subroutine.

匿名化サーバ１８のＣＰＵは、バイパス追加のサブルーチンを起動する（ステップＳ９６３）。バイパス追加のサブルーチンには、図１３および図１４を使用して説明したサブルーチンを使用することができる。 The CPU of the anonymization server 18 activates a bypass addition subroutine (step S963). As the bypass addition subroutine, the subroutine described with reference to FIGS. 13 and 14 can be used.

匿名化サーバ１８のＣＰＵは、完成した匿名化オートマトン３２をサーバ１１に送信する（ステップＳ９６４）。サーバＣＰＵ１２は、匿名化オートマトン３２を取得する（ステップＳ５６１）。 The CPU of the anonymization server 18 transmits the completed anonymization automaton 32 to the server 11 (step S964). The server CPU 12 acquires the anonymized automaton 32 (step S561).

サーバＣＰＵ１２は、第２クライアント２５からストリーミングデータの入力個票を受信する（ステップＳ５１０）。サーバＣＰＵ１２は、匿名化のサブルーチンを起動する（ステップＳ５１１）。匿名化のサブルーチンには、図１５を使用して説明したサブルーチンまたは図２１を使用して説明したサブルーチンを使用することができる。なお、サーバＣＰＵ１２は、ステップＳ５１１の終了後、ステップＳ５１０で受信した入力個票を補助記憶装置１４等から削除することが望ましい。 The server CPU 12 receives an input record of streaming data from the second client 25 (step S510). The server CPU 12 starts the anonymization subroutine (step S511). Either the subroutine described with reference to FIG. 15 or the subroutine described with reference to FIG. 21 can be used as the anonymization subroutine. After step S511 ends, the server CPU 12 desirably deletes the input record received in step S510 from the auxiliary storage device 14 and any other storage.

サーバＣＰＵ１２は、匿名化した出力個票を第１クライアント２１のＣＰＵに送信する（ステップＳ５１２）。第１クライアントのＣＰＵは、受信した出力個票を保存する（ステップＳ７０１）。 The server CPU 12 transmits the anonymized output record to the CPU of the first client 21 (step S512). The CPU of the first client 21 saves the received output record (step S701).

サーバＣＰＵ１２は、処理を終了するか否かを判定する（ステップＳ５１５）。処理を終了しないと判定した場合（ステップＳ５１５でＮＯ）、サーバＣＰＵ１２はステップＳ５６１に戻る。処理を終了すると判定した場合（ステップＳ５１５でＹＥＳ）、サーバＣＰＵ１２は処理を終了する。 The server CPU 12 determines whether to end the process (step S515). When it is determined that the processing is not to be ended (NO in step S515), the server CPU 12 returns to step S561. When it is determined that the process is to be ended (YES in step S515), the server CPU 12 ends the process.

匿名化サーバ１８のＣＰＵは、入力個票を受信して（ステップＳ９５３）、主記憶装置１３または補助記憶装置に記憶する。匿名化サーバ１８は、第２クライアント２５が入力個票を送信する都度、ステップＳ９５３を繰り返す。 The CPU of the anonymization server 18 receives the input record (step S953) and stores it in the main storage device 13 or the auxiliary storage device. The anonymization server 18 repeats step S953 each time the second client 25 transmits an input record.

匿名化サーバ１８のＣＰＵは、匿名化済データＤＢ３１を再作成するか否かを判定する（ステップＳ９５４）。再作成を行う条件は、任意に設定することができる。たとえば匿名化サーバ１８のＣＰＵは、１週間ごと、１ヶ月ごと等に再作成する（ステップＳ９５４でＹＥＳ）と判定しても良い。また、匿名化サーバ１８のＣＰＵは、ステップＳ９５３で受信した入力個票が所定の数を超えた場合に再作成する（ステップＳ９５４でＹＥＳ）と判定しても良い。 The CPU of the anonymization server 18 determines whether to re-create the anonymized data DB 31 (step S954). The condition for re-creation can be set freely. For example, the CPU of the anonymization server 18 may determine that re-creation is to be performed (YES in step S954) every week or every month. Alternatively, the CPU of the anonymization server 18 may determine that re-creation is to be performed (YES in step S954) when the number of input records received in step S953 exceeds a predetermined number.

匿名化済データＤＢ３１を再作成する（ステップＳ９５４でＹＥＳ）と判定した場合、匿名化サーバ１８のＣＰＵは、ステップＳ９５２に戻る。匿名化済データＤＢ３１を再作成しない（ステップＳ９５４でＮＯ）と判定した場合、匿名化サーバ１８のＣＰＵは、ステップＳ９５３に戻る。 When it is determined to recreate the anonymized data DB 31 (YES in step S954), the CPU of the anonymization server 18 returns to step S952. When it is determined that the anonymized data DB 31 is not recreated (NO in step S954), the CPU of the anonymization server 18 returns to step S953.

本実施の形態によると、匿名化オートマトン３２の作成とストリーミングデータの入力個票の匿名化とを異なるハードウェアで行う情報処理システム１０を提供することができる。大量の個人情報を取り扱う匿名化サーバ１８のセキュリティレベルを高く設定することにより、個人情報を保護することが可能である。 According to this embodiment, it is possible to provide an information processing system 10 in which the creation of the anonymization automaton 32 and the anonymization of input records of streaming data are performed on different hardware. Personal information can be protected by setting a high security level for the anonymization server 18, which handles a large amount of personal information.

［実施の形態７］ [Seventh Embodiment]
図２８は、実施の形態７の情報処理装置１１の動作を示す機能ブロック図である。情報処理装置１１は、サーバＣＰＵ１２による制御に基づいて以下のように動作する。 FIG. 28 is a functional block diagram showing the operation of the information processing apparatus 11 according to the seventh embodiment. The information processing apparatus 11 operates as follows under the control of the server CPU 12.

取得部５１は、複数の項目にそれぞれ関連付けられたデータを有する入力個票を取得する。変換部５２は、取得部５１が取得した入力個票を、規則に基づいて、前記複数の項目にそれぞれ関連付けられたデータを有する複数の匿名化済個票のいずれか一つと同一の出力個票に変換する。 The acquisition unit 51 acquires an input record having data associated with each of a plurality of items. The conversion unit 52 converts, based on a rule, the input record acquired by the acquisition unit 51 into an output record identical to any one of a plurality of anonymized records having data associated with each of the plurality of items.
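
The behavior of the conversion unit 52 can be pictured with the following minimal Python sketch. It is not the anonymization automaton 32 itself; it assumes a much simpler rule in which each item of an anonymized record is either an exact value, an inclusive numeric range written as a tuple `(lo, hi)`, or the wildcard `"*"`. The function names `covers` and `convert`, the wildcard notation, and the sample data are all hypothetical. When no anonymized record matches, the sketch returns a record with every item replaced, in the spirit of the replacement record described in Appendix 2.

```python
def covers(pattern, value):
    """Return True if an anonymized item `pattern` covers a raw `value`."""
    if pattern == "*":                  # item replaced by a wildcard
        return True
    if isinstance(pattern, tuple):      # inclusive numeric range (lo, hi)
        lo, hi = pattern
        return lo <= value <= hi
    return pattern == value             # item left unchanged


def convert(input_record, anonymized_records):
    """Return a copy of the first anonymized record that covers the input.

    Assumption: the input record has every item the anonymized records have.
    """
    for anon in anonymized_records:
        if all(covers(anon[item], input_record[item]) for item in anon):
            return dict(anon)
    # No anonymized record matches: replace the data of every item.
    return {item: "*" for item in input_record}


anonymized_records = [
    {"age": (20, 29), "sex": "F"},
    {"age": (30, 39), "sex": "*"},
]
out1 = convert({"age": 25, "sex": "F"}, anonymized_records)
out2 = convert({"age": 35, "sex": "M"}, anonymized_records)
out3 = convert({"age": 50, "sex": "M"}, anonymized_records)
```

Because every output is a copy of an existing anonymized record (or the fully replaced record), the output never reveals a combination of values that is absent from the anonymized data.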

［実施の形態８］ [Eighth Embodiment]
実施の形態８は、汎用のコンピュータとプログラム４７とを組み合わせて動作させることにより、本実施の形態のサーバ１１を実現する形態に関する。図２９は、実施の形態８の情報処理システム１０の構成を示す説明図である。図２９を使用して、本実施の形態の構成を説明する。なお、実施の形態１と共通する部分の説明は省略する。 The eighth embodiment relates to a mode in which the server 11 of this embodiment is realized by operating a general-purpose computer in combination with a program 47. FIG. 29 is an explanatory diagram showing the configuration of the information processing system 10 according to the eighth embodiment. The configuration of this embodiment will be described with reference to FIG. 29. Description of the parts common to the first embodiment is omitted.

本実施の形態の情報処理システム１０は、サーバコンピュータ４５、第１クライアント２１および第２クライアント２５を備える。サーバコンピュータ４５、第１クライアント２１および第２クライアント２５は、ネットワークを介して接続している。 The information processing system 10 of the present embodiment includes a server computer 45, a first client 21 and a second client 25. The server computer 45, the first client 21 and the second client 25 are connected via a network.

サーバコンピュータ４５は、サーバＣＰＵ１２、主記憶装置１３、補助記憶装置１４、通信部１５、読取部１７およびバスを備える。サーバコンピュータ４５は、汎用のパソコン等の情報処理装置である。 The server computer 45 includes a server CPU 12, a main storage device 13, an auxiliary storage device 14, a communication unit 15, a reading unit 17, and a bus. The server computer 45 is an information processing device such as a general-purpose personal computer.

プログラム４７は、可搬型記録媒体４８に記録されている。サーバＣＰＵ１２は、読取部１７を介してプログラム４７を読み込み、補助記憶装置１４に保存する。またサーバＣＰＵ１２は、サーバコンピュータ４５内に実装されたフラッシュメモリ等の半導体メモリ４９に記憶されたプログラム４７を読出しても良い。さらに、サーバＣＰＵ１２は、通信部１５および図示しないネットワークを介して接続される図示しない他のサーバコンピュータからプログラム４７をダウンロードして補助記憶装置１４に保存しても良い。 The program 47 is recorded in the portable recording medium 48. The server CPU 12 reads the program 47 via the reading unit 17 and saves it in the auxiliary storage device 14. Further, the server CPU 12 may read the program 47 stored in the semiconductor memory 49 such as a flash memory installed in the server computer 45. Further, the server CPU 12 may download the program 47 from another server computer (not shown) connected via the communication unit 15 and a network (not shown) and store the program 47 in the auxiliary storage device 14.

プログラム４７は、サーバコンピュータ４５の制御プログラムとしてインストールされ、主記憶装置１３にロードして実行される。これにより、サーバコンピュータ４５は上述したサーバ１１として機能する。 The program 47 is installed as a control program of the server computer 45, loaded into the main storage device 13 and executed. Thereby, the server computer 45 functions as the server 11 described above.

各実施例で記載されている技術的特徴（構成要件）はお互いに組合せ可能であり、組み合わせすることにより、新しい技術的特徴を形成することができる。 The technical features (constituent elements) described in the respective embodiments can be combined with one another, and combining them can form new technical features.
今回開示された実施の形態はすべての点で例示であって、制限的なものでは無いと考えられるべきである。本発明の範囲は、上記した意味では無く、特許請求の範囲によって示され、特許請求の範囲と均等の意味および範囲内でのすべての変更が含まれることが意図される。 The embodiments disclosed herein are to be considered illustrative in all respects and not restrictive. The scope of the present invention is indicated not by the above description but by the claims, and is intended to include all modifications within the meaning and scope equivalent to the claims.

（付記１） (Appendix 1)
複数の項目にそれぞれ関連付けられたデータを有する入力個票を取得する取得部と、
前記取得部が取得した前記入力個票を、規則に基づいて、前記複数の項目にそれぞれ関連付けられたデータを有する複数の匿名化済個票のいずれか一つと同一の出力個票に変換する変換部とを備える
情報処理装置。
An information processing apparatus comprising:
an acquisition unit that acquires an input record having data associated with each of a plurality of items; and
a conversion unit that converts, based on a rule, the input record acquired by the acquisition unit into an output record identical to any one of a plurality of anonymized records having data associated with each of the plurality of items.

（付記２） (Appendix 2)
前記変換部は、前記入力個票を前記匿名化済個票と同一の出力個票に変換することができない場合に、該匿名化済個票が有する項目に関連付けられたすべてのデータを他のデータに置換した置換個票に変換する
付記１に記載の情報処理装置。
The information processing apparatus according to Appendix 1, wherein, when the input record cannot be converted into an output record identical to one of the anonymized records, the conversion unit converts the input record into a replacement record in which all data associated with the items of the anonymized records is replaced with other data.

（付記３） (Appendix 3)
前記規則は、前記入力個票を、各項目に関連付けられたデータが該入力個票の各項目に関連付けられたデータまたは該データを置換した他のデータと同一である出力個票に変換する規則である
付記１または付記２に記載の情報処理装置。
The information processing apparatus according to Appendix 1 or Appendix 2, wherein the rule converts the input record into an output record in which the data associated with each item is identical either to the data associated with the corresponding item of the input record or to other data that replaces that data.

（付記４） (Appendix 4)
前記規則は、前記匿名化済個票が有する項目およびデータに対応する複数階層の判定枝を有し、各階層の該判定枝により出力個票が定められ、
前記変換部は、前記入力個票の項目に関連付けられたデータが前記判定枝に対応する場合に該判定枝を選択し、選択した該判定枝によって定められた出力個票に入力個票を変換する
付記１から付記３のいずれか一つに記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 3, wherein
the rule has decision branches in a plurality of layers corresponding to the items and data of the anonymized records, and the decision branches of the layers determine an output record, and
the conversion unit selects a decision branch when the data associated with an item of the input record corresponds to that decision branch, and converts the input record into the output record determined by the selected decision branches.

（付記５） (Appendix 5)
前記変換部は、取得した前記入力個票を逐次変換する
付記１から付記４のいずれか一つに記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 4, wherein the conversion unit converts the acquired input records sequentially.

（付記６） (Appendix 6)
前記変換部は、前記匿名化済個票が有さない項目に関連付けられた前記入力個票のデータを変換しない
付記１から付記５のいずれか一つに記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 5, wherein the conversion unit does not convert data of the input record that is associated with an item the anonymized records do not have.

（付記７） (Appendix 7)
前記匿名化済個票は、一部または全部の項目に関連付けられたデータを他のデータに置換した個票である
付記１から付記６のいずれか一つに記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 6, wherein each anonymized record is a record in which the data associated with some or all of the items has been replaced with other data.

（付記８） (Appendix 8)
前記匿名化済個票は、ｋ−匿名性を有する
付記１から付記７に記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 7, wherein the anonymized records have k-anonymity.

（付記９） (Appendix 9)
前記規則は、
入力ノードと、
前記匿名化済個票に関連付けられており、前記入力ノードから遷移可能な複数の受容ノードと、
前記入力ノードから前記受容ノードに至る遷移の条件を示す複数の判定枝と、
前記判定枝に接続された複数のノードとを備える
有限オートマトンである付記１から付記８のいずれか一つに記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 8, wherein the rule is a finite automaton comprising:
an input node;
a plurality of accepting nodes that are associated with the anonymized records and can be reached by transition from the input node;
a plurality of decision branches indicating conditions for transitions from the input node to the accepting nodes; and
a plurality of nodes connected to the decision branches.

（付記１０） (Appendix 10)
前記有限オートマトンは、一のノードから前記入力個票に基づいて遷移可能な複数のノードが存在する場合には、遷移対象となるデータの範囲が狭いノードに遷移する付記９に記載の情報処理装置。
The information processing apparatus according to Appendix 9, wherein, when a plurality of nodes can be reached from one node based on the input record, the finite automaton transitions to the node whose range of data subject to the transition is narrower.
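
The selection described in Appendix 10 can be illustrated with a short Python sketch. It assumes, purely for illustration, that each decision branch is an inclusive numeric range `(lo, hi)` paired with a target node name, and that "narrow" is measured as `hi - lo`; the function name `choose_branch` and the node names are hypothetical.

```python
def choose_branch(branches, value):
    """Return the target of the narrowest branch whose range contains `value`.

    `branches` is a list of (lo, hi, target) tuples; returns None if no
    branch accepts the value.
    """
    candidates = [b for b in branches if b[0] <= value <= b[1]]
    if not candidates:
        return None
    # Among all reachable nodes, prefer the one with the narrowest range.
    return min(candidates, key=lambda b: b[1] - b[0])[2]


branches = [
    (0, 99, "any_age"),
    (20, 29, "twenties"),
    (20, 39, "twenties_or_thirties"),
]
t25 = choose_branch(branches, 25)
t35 = choose_branch(branches, 35)
t70 = choose_branch(branches, 70)
```

Preferring the narrowest matching range keeps the output as specific as the anonymized records allow, and falls back to broader ranges only when necessary.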

（付記１１） (Appendix 11)
前記有限オートマトンは、
入力ノードから一の受容ノードにつながる一のノード列に含まれるノードから、他のノード列に含まれるノードに遷移するバイパス枝を備える付記９または付記１０に記載の情報処理装置。
The information processing apparatus according to Appendix 9 or Appendix 10, wherein the finite automaton comprises a bypass branch that transitions from a node included in one node sequence, which leads from the input node to one accepting node, to a node included in another node sequence.

（付記１２） (Appendix 12)
前記バイパス枝は、前記一のノード列に含まれるノードから、該ノードから他のいずれのノードにも遷移できない個票を遷移させる付記１１に記載の情報処理装置。
The information processing apparatus according to Appendix 11, wherein the bypass branch carries, from a node included in the one node sequence, a record that cannot transition from that node to any other node.
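
A possible reading of the bypass branch in Appendices 11 and 12 is sketched below in Python. The data layout is an assumption made for illustration only: the automaton is a dictionary of nodes, each with a list of `(accepted_values, target)` decision branches and an optional `"bypass"` target that is taken only when no decision branch accepts the record. The function name `run`, the node names, and the sample automaton are hypothetical and are not taken from the figures of this specification.

```python
def run(automaton, start, record, items):
    """Consume one item per layer; return the accepting node reached, or None."""
    node = start
    for item in items:
        state = automaton[node]
        # Ordinary decision branches: take the first one that accepts the data.
        nxt = next((target for accepted, target in state["branches"]
                    if record[item] in accepted), None)
        if nxt is None:
            # No decision branch applies: use the bypass branch, if any,
            # to move to a node that belongs to another node sequence.
            nxt = state.get("bypass")
        if nxt is None:
            return None
        node = nxt
    return node


automaton = {
    "root": {"branches": [({"F"}, "f1"), ({"M"}, "m1")]},
    "f1": {"branches": [({"20s"}, "acc_F_20s")], "bypass": "acc_any"},
    "m1": {"branches": [({"30s"}, "acc_M_30s")], "bypass": "acc_any"},
}
items = ["sex", "age"]
r1 = run(automaton, "root", {"sex": "F", "age": "20s"}, items)
r2 = run(automaton, "root", {"sex": "F", "age": "40s"}, items)
r3 = run(automaton, "root", {"sex": "X", "age": "20s"}, items)
```

In this sketch a record that would otherwise be stuck at `f1` still reaches an accepting node through the bypass branch, whereas a record rejected at a node without a bypass branch yields no accepting node.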

（付記１３） (Appendix 13)
複数の項目にそれぞれ関連付けられたデータを有する複数の元データ個票から匿名化済個票を作成する作成部を備える
付記１から付記１２のいずれか一つに記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 12, further comprising a creation unit that creates the anonymized records from a plurality of original data records having data associated with each of a plurality of items.

（付記１４） (Appendix 14)
複数の項目にそれぞれ関連付けられたデータを有する元データ個票を記憶する元データ記憶部と、
前記元データ記憶部に記憶された元データ個票から匿名化済個票を作成する作成部と、
取得した入力個票を記憶する入力個票記憶部と、
前記元データ記憶部に記憶された元データ個票と前記入力個票記憶部に記憶された入力個票とから前記匿名化済個票を再作成する再作成部とを備える
付記１から付記１２のいずれか一つに記載の情報処理装置。
The information processing apparatus according to any one of Appendices 1 to 12, further comprising:
an original data storage unit that stores original data records having data associated with each of a plurality of items;
a creation unit that creates anonymized records from the original data records stored in the original data storage unit;
an input record storage unit that stores acquired input records; and
a re-creation unit that re-creates the anonymized records from the original data records stored in the original data storage unit and the input records stored in the input record storage unit.

（付記１５） (Appendix 15)
複数の項目にそれぞれ関連付けられたデータを有する入力個票を取得し、
取得した入力個票を、規則に基づいて、前記複数の項目にそれぞれ関連付けられたデータを有する複数の匿名化済個票のいずれか一つと同一の出力個票に変換する
処理をコンピュータに実行させる情報処理方法。
An information processing method that causes a computer to execute processing comprising:
acquiring an input record having data associated with each of a plurality of items; and
converting, based on a rule, the acquired input record into an output record identical to any one of a plurality of anonymized records having data associated with each of the plurality of items.

（付記１６） (Appendix 16)
複数の項目にそれぞれ関連付けられたデータを有する入力個票を取得し、
取得した入力個票を、規則に基づいて、前記複数の項目にそれぞれ関連付けられたデータを有する複数の匿名化済個票のいずれか一つと同一の出力個票にする
処理をコンピュータに実行させるプログラム。
A program that causes a computer to execute processing comprising:
acquiring an input record having data associated with each of a plurality of items; and
converting, based on a rule, the acquired input record into an output record identical to any one of a plurality of anonymized records having data associated with each of the plurality of items.

１０ 情報処理システム 10 Information processing system
１１ サーバ（情報処理装置） 11 Server (information processing apparatus)
１２ サーバＣＰＵ 12 Server CPU
１３ 主記憶装置 13 Main storage device
１４ 補助記憶装置 14 Auxiliary storage device
１５ 通信部 15 Communication unit
１７ 読取部 17 Reading unit
１８ 匿名化サーバ 18 Anonymization server
２１ 第１クライアント 21 First client
２５ 第２クライアント 25 Second client
３１ 匿名化済データＤＢ 31 Anonymized data DB
３２ 匿名化オートマトン（規則） 32 Anonymization automaton (rule)
４５ サーバコンピュータ 45 Server computer
４７ プログラム 47 Program
４８ 可搬型記憶媒体 48 Portable storage medium
４９ 半導体メモリ 49 Semiconductor memory
５１ 取得部 51 Acquisition unit
５２ 変換部 52 Conversion unit
６０ 親ノード 60 Parent node
６１ 中間ノード 61 Intermediate node
６３ 受容ノード 63 Accepting node
６５ 遷移枝 65 Transition branch
６７ バイパス枝 67 Bypass branch
６９ ノード列 69 Node sequence

Claims

An information processing apparatus comprising:
an acquisition unit that sequentially acquires input records each having data associated with each of a plurality of items; and
a conversion unit that sequentially converts, based on a rule, the input records acquired by the acquisition unit into output records each identical to any one of a plurality of anonymized records having data associated with each of the plurality of items,
wherein, when an input record cannot be converted into an output record identical to one of the anonymized records, the conversion unit converts the input record into a replacement record in which all data associated with the items of the anonymized records is replaced with other data.

The information processing apparatus according to claim 1, wherein the rule converts the input record into an output record in which the data associated with each item is identical either to the data associated with the corresponding item of the input record or to other data that replaces that data.

The information processing apparatus according to claim 1 or 2, wherein
the rule has decision branches in a plurality of layers corresponding to the items and data of the anonymized records, and the decision branches of the layers determine an output record, and
the conversion unit selects a decision branch when the data associated with an item of the input record corresponds to that decision branch, and converts the input record into the output record determined by the selected decision branches.

The information processing apparatus according to any one of claims 1 to 3, wherein
the input record, the anonymized records, and the output record each include anonymized items and non-anonymized items, and
the conversion unit converts the input record into an output record that has the same anonymized items as the anonymized items of any one of the anonymized records and the same non-anonymized items as the non-anonymized items of the input record.

The information processing apparatus according to any one of claims 1 to 4, wherein the rule is a finite automaton comprising:
an input node;
a plurality of accepting nodes that are associated with the anonymized records and can be reached by transition from the input node;
a plurality of decision branches indicating conditions for transitions from the input node to the accepting nodes; and
a plurality of nodes connected to the decision branches.

The information processing apparatus according to claim 5, wherein the finite automaton comprises a bypass branch that transitions from a node included in one node sequence, which leads from the input node to one accepting node, to a node included in another node sequence.

An information processing method that causes a computer to execute processing comprising:
sequentially acquiring, via a network line, input records having data associated with each of a plurality of items;
sequentially converting, based on a rule, the acquired input records into output records each identical to any one of a plurality of anonymized records having data associated with each of the plurality of items;
when an input record cannot be converted into an output record identical to one of the anonymized records, converting the input record into a replacement record in which all data associated with the items of the anonymized records is replaced with other data; and
transmitting the converted output records.

A program that causes a computer to execute processing comprising:
sequentially acquiring input records having data associated with each of a plurality of items;
sequentially converting, based on a rule, the acquired input records into output records each identical to any one of a plurality of anonymized records having data associated with each of the plurality of items; and
when an input record cannot be converted into an output record identical to one of the anonymized records, converting the input record into a replacement record in which all data associated with the items of the anonymized records is replaced with other data.