JP7100563B2 - Anonymization system and anonymization method

Info

Publication number: JP7100563B2
Application number: JP2018210777A
Authority: JP
Inventors: 啓成藤原; 尚宜佐藤; 健太高橋
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2018-11-08
Filing date: 2018-11-08
Publication date: 2022-07-13
Anticipated expiration: 2038-11-08
Also published as: WO2020095662A1; JP2020077256A

Description

The present invention relates to an anonymization system and an anonymization method, and in particular to an anonymization system and an anonymization method suitable for detecting tampering and verifying the validity of provided information when personal information such as medical information is anonymized and provided for use in medical research and the like.

In recent years, following the full enforcement of the amended Act on the Protection of Personal Information in May 2017, the use of anonymously processed information, premised on appropriate protection of personal information, has been advancing. In May 2018, the Next-Generation Medical Infrastructure Act came into force. It establishes a framework for anonymously processing citizens' medical information so that it can be used for research and development at universities, pharmaceutical companies, and the like. As a result of these regulations, anonymized data is becoming usable for research and development in the medical field.

In the medical field, medical data used in research undergoes a verification of validity called "validation". Similar validity verification is expected to become an issue for anonymization processing as well.

One technique for anonymizing and providing personal information such as clinical information is disclosed in, for example, Patent Document 1. According to the information management system described in Patent Document 1, after subject information (personal information) such as clinical information has been anonymized, the owner of the subject information or a person with viewing authority can identify the information accumulated in association with the anonymized information.

Patent Document 1: International Publication No. 2008/069011

In Patent Document 1, however, the validity guarantee covers only part of the original data (identifiers or combinations of quasi-identifiers), so the validity of anonymization cannot be guaranteed for the data as a whole.

Digital signatures are a general technique for guaranteeing the validity of data. However, if a digital signature is simply applied, validity can no longer be verified once anonymization processing such as deletion or replacement has been performed on the data. If illegitimate anonymized data is used, research results based on it may be invalid.
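The problem described above can be illustrated with a minimal Python sketch. A SHA-256 digest over the whole serialized table stands in for the value that would be signed. The table contents and the `digest` helper are illustrative assumptions, not taken from the patent.

```python
import hashlib
import json

def digest(rows):
    """Digest of the whole table; a real system would sign this value."""
    serialized = json.dumps(rows, ensure_ascii=False, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

original = [{"name": "Hitachi Taro", "address": "Tokyo", "gender": "male"}]

# Anonymize by deleting the "name" attribute from every record.
anonymized = [{k: v for k, v in row.items() if k != "name"} for row in original]

# The digest of the anonymized table no longer matches the original one,
# so a verifier cannot tell legitimate anonymization from tampering.
print(digest(original) == digest(anonymized))  # prints False
```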

An object of the present invention is to provide an anonymization system and an anonymization method with which the validity of anonymization applied to data can be verified even after anonymization processing such as deletion or replacement has been performed.

The anonymization system of the present invention is preferably an anonymization system in which an information processing device anonymizes secret information and provides it to a user, comprising: secret information storage means for storing the secret information; abstraction candidate group information storage means for storing a group of candidate values used to abstract the secret information; extended secret data storage means for storing extended secret data obtained by adding the abstraction candidates to the secret information; anonymized data storage means for storing anonymized data obtained by deleting or replacing part of the secret information; extended secret data generation means for generating the extended secret data from the secret information and the abstraction candidate group information; signature generation means for generating, from the extended secret data or the anonymized data, a digital signature that uses hash values of the secret information as intermediate values; anonymization means for generating the anonymized data from the extended secret data; and anonymized data validity verification means for verifying the validity of given anonymized data.
The extended secret data generation means generates the extended secret data by referring to the secret information stored in the secret information storage means and the abstraction candidates stored in the abstraction candidate group information storage means. The anonymization means generates the anonymized data by replacing parts of the extended secret data with hash values identical to the intermediate values of the signature generation performed by the signature generation means. The signature generation means generates a first signature value from the extended secret data generated by the extended secret data generation means and a second signature value from the anonymized data generated by the anonymization means. The anonymized data validity verification means verifies the validity of the given anonymized data by comparing the first signature value with the second signature value.
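The core idea of this configuration is that, if the signature is computed over per-attribute hash values, an attribute can later be replaced by its hash without changing the final value. The Python sketch below is a simplified illustration and not the procedure of FIGS. 12A and 12B: it assumes SHA-256, omits the random numbers, and uses a root digest as a stand-in for the signature value. All function names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def root_digest(fields):
    """Combine per-attribute hashes into one digest.

    Each field is ("value", text) for a disclosed attribute or
    ("hash", digest_bytes) for an attribute already replaced by its hash.
    """
    leaves = [h(p.encode("utf-8")) if kind == "value" else p for kind, p in fields]
    return h(b"".join(leaves))

extended = [("value", "Hitachi Taro"), ("value", "Tokyo"), ("value", "male")]

# Anonymization: replace the name by the same intermediate hash the signer used.
anonymized = [("hash", h("Hitachi Taro".encode("utf-8")))] + extended[1:]

first = root_digest(extended)     # computed by the provider before anonymization
second = root_digest(anonymized)  # recomputed by the user from the anonymized data
print(first == second)            # prints True

# Changing a disclosed value breaks the match.
tampered = [anonymized[0], ("value", "Osaka"), anonymized[2]]
print(root_digest(tampered) == first)  # prints False
```

Without random numbers, a short value such as a prefecture name could be guessed from its hash by trying candidates, which is the weakness the random numbers in the extended data are meant to address.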

According to the present invention, it is possible to provide an anonymization system and an anonymization method with which the validity of anonymization applied to data can be verified even after anonymization processing such as deletion or replacement has been performed.

FIG. 1 is an overall configuration diagram of the anonymization system.
FIG. 2 is a functional configuration diagram of the anonymized data providing server.
FIG. 3 is a functional configuration diagram of the anonymized data user terminal.
FIG. 4 is a hardware and software configuration diagram of the anonymized data providing server and the anonymized data user terminal.
FIG. 5 is a diagram showing an example of the data structure of the patient data table.
FIG. 6 is a diagram showing an example of the data structure of the abstraction pattern group.
FIG. 7 is a diagram showing an example of the data structure of the extended patient data table.
FIG. 8 is a diagram showing an example of the data structure of the signature data table.
FIG. 9 is a diagram showing an example of the data structure of the anonymized data.
FIG. 10 is a sequence diagram showing the series of processing steps from a user acquiring anonymized data from the anonymized data providing server to verifying it.
FIG. 11 is a flowchart showing the patient data record extension processing.
FIG. 12A is a flowchart showing the signature generation processing (part 1).
FIG. 12B is a flowchart showing the signature generation processing (part 2).
FIG. 13 is a flowchart showing the verifiable anonymization processing.
FIG. 14 is a flowchart showing the record deletion processing.
FIG. 15 is a flowchart showing the attribute deletion processing.
FIG. 16 is a flowchart showing the attribute replacement processing.
FIG. 17 is a flowchart showing the signature verification processing.

An embodiment of the present invention is described below with reference to FIGS. 1 to 17.
This embodiment describes an example in which a hospital or similar institution anonymizes patients' medical information and provides it for use in medical research, statistical materials, and the like.
First, the configuration of the anonymization system is described with reference to FIGS. 1 to 4.

First, the overall configuration of the anonymization system is described with reference to FIG. 1.
The anonymized data providing system is a system with which a data owner (data holder) holding information that includes personal information anonymizes the information and provides it to data users. As shown in FIG. 1, it consists of an anonymized data providing server 1 and an anonymized data user terminal 2, which are connected via a network 3.

The anonymized data providing server 1 is a server on which the data holder stores personal information and which provides the function of anonymizing it for provision. The anonymized data user terminal 2 is a client terminal on which a data user downloads anonymized data and verifies its validity. The network 3 may be a global network such as the Internet, or a LAN (Local Area Network) installed on the premises.

Next, the functional configuration of the anonymized data providing server is described with reference to FIG. 2.
As shown in FIG. 2, the anonymized data providing server 1 consists of a Web server function unit 101, a record extension function unit 102, an anonymization processing function unit 103, a random number generation unit 104, a hash value generation unit 105, a signature generation unit 106, and a storage unit 110.

The Web server function unit 101 publishes patient data names and digital signature values to data users on a Web page. The record extension function unit 102 performs preprocessing that extends the patient data prior to anonymization. The anonymization processing function unit 103 performs verifiable anonymization processing. The random number generation unit 104 generates the random numbers that are added to strengthen the security of the hash values. The hash value generation unit 105 generates hash values using a one-way function or the like. The signature generation unit 106 generates a digital signature from extended patient data. The storage unit 110 stores the data used by the anonymized data providing server 1.

The storage unit 110 stores a patient data table 301, an abstraction pattern group 302, an extended patient data table 303, a signature data table 304, and anonymized data 305.

The patient data table 301 stores patient data that includes personal information. The abstraction pattern group 302 is the set of replacement patterns used when anonymizing patient data. The extended patient data table 303 stores patient data on which the extension processing has been performed as preprocessing for anonymization. The signature data table 304 stores pairs of a patient data name and the digital signature value of the corresponding extended patient data. The anonymized data 305 is the patient data after anonymization.
The specific data structures are described in detail later.

Next, the functional configuration of the anonymized data user terminal is described with reference to FIG. 3.
As shown in FIG. 3, the anonymized data user terminal 2 consists of a Web browser function unit 201, a signature verification processing unit 202, a hash value generation unit 203, a signature generation unit 204, and a storage unit 210. The Web browser function unit 201 accesses Web pages. The signature verification processing unit 202 verifies the validity of the anonymization by comparing the signature value obtained from the Web page with the signature value of the received anonymized data. The hash value generation unit 203 generates hash values using a one-way function or the like. The signature generation unit 204 has the same function as the signature generation unit 106 of the anonymized data providing server 1 and generates a digital signature from anonymized data. The storage unit 210 stores the data used by the anonymized data user terminal 2.

The storage unit 210 stores the anonymized data 305 and signature values 311. The anonymized data 305 is anonymized patient data received from the anonymized data providing server 1 via the network 3. The signature values 311 are the digital signature value for verification obtained from the Web page and the digital signature value generated from the anonymized data 305.
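The comparison performed by the signature verification processing unit 202 can be sketched as follows. This is a minimal illustration that assumes both signature values are available as hex strings. The function name is hypothetical, and `hmac.compare_digest` from the Python standard library is used only as a convenient comparison that does not short-circuit on the first differing character.

```python
import hmac

def verify_anonymized_data(published_signature: str, recomputed_signature: str) -> bool:
    """Return True when the two hex-encoded signature values match."""
    return hmac.compare_digest(published_signature.lower(), recomputed_signature.lower())

# Signature value downloaded from the Web page vs. value regenerated locally
# from the received anonymized data (placeholder strings for illustration).
print(verify_anonymized_data("22891F", "22891f"))  # prints True
print(verify_anonymized_data("22891F", "465FC4"))  # prints False
```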

Next, the hardware and software configurations of the anonymized data providing server and the anonymized data user terminal are described with reference to FIG. 4.
The anonymized data providing server 1 is implemented on a general-purpose information processing device such as the server device shown in FIG. 4. It may also be a virtual machine built on a physical computer.

In the anonymized data providing server 1, a CPU (Central Processing Unit) 401, a main memory 402, a network interface 403, a display device 410, and an input device 420 are connected by a bus.

The CPU 401 controls each part of the anonymized data providing server 1, and loads the required programs into the main memory 402 and executes them.

The main memory 402 is usually a volatile memory such as RAM, and stores the programs executed by the CPU 401 and the data they refer to.

The network interface 403 is an interface for connecting to the network 3.

The display device 410 is a device that displays information, such as an LCD (Liquid Crystal Display).

The input device 420 is a device for entering information such as commands and data and for controlling the device, for example a keyboard or a pointing device such as a mouse.

The hard disk drive (HDD) 430 has a large storage capacity and stores the programs for carrying out this embodiment. A Web server function program 601, a record extension function program 602, an anonymization processing function program 603, a random number generation program 604, a hash value generation program 605, and a signature generation program 606 are installed on the hard disk drive 430 of the anonymized data providing server 1. These programs implement the functions of the Web server function unit 101, the record extension function unit 102, the anonymization processing function unit 103, the random number generation unit 104, the hash value generation unit 105, and the signature generation unit 106, respectively.

The hard disk drive 430 also stores the patient data table 301, the abstraction pattern group 302, the extended patient data table 303, the signature data table 304, and the anonymized data 305.

The anonymized data user terminal 2 is implemented on a general-purpose information processing device such as the personal computer shown in FIG. 4. It may also be a smartphone or a dedicated terminal.

The hardware components of the anonymized data user terminal 2 are the same as those of the anonymized data providing server 1.

A browser function program 701, a signature verification processing program 702, a hash value generation program 703, and a signature generation program 704 are installed on the hard disk drive 530 of the anonymized data user terminal 2. These programs implement the functions of the Web browser function unit 201, the signature verification processing unit 202, the hash value generation unit 203, and the signature generation unit 204, respectively.

The hard disk drive 530 also stores the anonymized data 305 and the signature values 311.

Next, the data structures used in the anonymization system are described with reference to FIGS. 5 to 9.
The patient data table 301 stores patients' personal information and related data and, as shown in FIG. 5, consists of the columns "name", "address (prefecture name)", and "gender". For example, record 3011 indicates a patient whose name is "Hitachi Taro", whose address is "Tokyo", and whose gender is "male".

The abstraction pattern group 302 is data used to abstract the attributes represented by the columns of the patient data table 301. As shown in FIG. 6(a), it is expressed, for example, as tree-structured data called a generalization hierarchy tree 302a.

In the generalization hierarchy tree 302a, the least abstract values are placed at the leaf nodes and the most abstract value at the root node. Among the intermediate nodes, values become more abstract the closer a node is to the root. For example, in the generalization hierarchy tree 302a for the attribute "address" shown in FIG. 6(a), the leaf nodes are prefecture names such as "Tokyo" and "Kanagawa", the intermediate nodes are more abstract region names such as "Kanto region" and "Kinki region", and the root node is the country name "Japan", the most abstract of these values.

The abstraction pattern group 302 may instead be expressed in the form of the abstraction pattern table 302b shown in FIG. 6(b). For example, the abstraction pattern table 302b for the attribute "address" is table-structured data with the columns "address" and "abstraction pattern". For each prefecture name that is a current value of the "address" column, the table associates the more abstract region name and country name that are candidate replacement values in anonymization. For example, the abstraction patterns for the "address" value "Tokyo" are the region name "Kanto region" and the country name "Japan".
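The relationship between the two representations can be sketched in Python. The generalization hierarchy is held as child-to-parent edges, and the abstraction pattern table is derived by walking from each leaf to the root. The dictionary layout and function name are assumptions for illustration, and the "Osaka" entry is an added example value that does not appear in the text.

```python
# child -> parent edges of the generalization hierarchy for "address"
PARENT = {
    "Tokyo": "Kanto region",
    "Kanagawa": "Kanto region",
    "Osaka": "Kinki region",   # added example leaf, not from the patent text
    "Kanto region": "Japan",
    "Kinki region": "Japan",
}

def abstraction_patterns(value, parent=PARENT):
    """Return the increasingly abstract replacement candidates for a value."""
    patterns = []
    while value in parent:
        value = parent[value]
        patterns.append(value)
    return patterns

# Leaves are nodes that are never anyone's parent (the prefecture names).
leaves = [node for node in PARENT if node not in PARENT.values()]

# Abstraction pattern table: one row per leaf value.
pattern_table = {leaf: abstraction_patterns(leaf) for leaf in leaves}
print(pattern_table["Tokyo"])  # prints ['Kanto region', 'Japan']
```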

The extended patient data table 303 is a table obtained by preprocessing the patient data in the patient data table 301 so that verifiable anonymization can be performed. It has a data structure in which each value of the abstraction pattern group 302 that is a replacement candidate for anonymization by replacement is added to the patient data, together with random numbers that strengthen the security of the hashing performed in place of deletion or replacement. In the example shown in FIG. 7, the values "address 3" and "address 4", which are the abstraction patterns of the attribute "address", are added to the patient data table 301, and the random values "name 1", "address 1", and "gender 1" are added, one per attribute, to strengthen the security of the hashing.

This data structure, which adds one random number per attribute, is combined with the algorithm described later that generates hash values step by step for each attribute. Together they apply the security improvement that random numbers bring to hashing to the entire data, while keeping the data size smaller than if a random number were attached to every element.
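A minimal Python sketch of this extension step follows. It assumes a nested dictionary layout and uses `secrets.token_hex` for the random numbers. Both choices, and the function name `extend_record`, are illustrative assumptions rather than the layout of FIG. 7.

```python
import secrets

def extend_record(record, pattern_tables):
    """Add abstraction candidates and one random number per attribute."""
    extended = {}
    for attr, value in record.items():
        candidates = pattern_tables.get(attr, {}).get(value, [])
        extended[attr] = {
            "random": secrets.token_hex(16),  # one random number per attribute
            "values": [value, *candidates],   # original value, then abstractions
        }
    return extended

record = {"name": "Hitachi Taro", "address": "Tokyo", "gender": "male"}
pattern_tables = {"address": {"Tokyo": ["Kanto region", "Japan"]}}

extended = extend_record(record, pattern_tables)
print(extended["address"]["values"])  # prints ['Tokyo', 'Kanto region', 'Japan']
```

In this example the extended record holds five values but only three random numbers, which reflects the size saving of one random number per attribute over one per element.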

The signature data table 304 holds the signature values for patient data and, as shown in FIG. 8, consists of the columns "target data" and "signature value". The "target data" column stores the patient data name, and the "signature value" column stores the signature value generated for each set of patient data by the signature generation processing.

In this embodiment, the data in the signature data table 304 is read by the Web server function unit 101 of the anonymized data providing server 1 and published as a Web page, so that it can be viewed and downloaded from the anonymized data user terminal 2 via the network 3 using a Web browser or the like.

The anonymized data 305 is the result of performing anonymization processing on the extended patient data table 303. FIG. 9 shows an example of the result of anonymizing the data in the extended patient data table 303 shown in FIG. 7. For example, record 3051, which has been deleted, consists of the label "Delete_R" indicating a deleted record, the record hash value "22891F" generated by the hashing performed in place of deletion, and "-" (Null) indicating empty values. Record 3052, on which attribute deletion and attribute replacement have been performed, consists of the label "Delete_A" indicating that the attribute "name" has been deleted, the attribute hash value "465FC4" generated by the hashing performed in place of attribute deletion, the label "Replace" indicating that the attribute "address" has been replaced, and the attribute hash value "B0D8C7" generated by the hashing performed in place of attribute replacement.
The details of the record deletion, attribute deletion, and attribute replacement processing are described later.
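The labelled rows described above could be represented roughly as in the following Python sketch. The hashing scheme (SHA-256 over the random number and the value), the `_label` key, and the function names are assumptions made for illustration. The actual procedures are those of the record deletion and attribute deletion flowcharts, which are not reproduced here.

```python
import hashlib

def attribute_hash(random_hex: str, value: str) -> str:
    """Hash of an attribute value together with its random number (assumed scheme)."""
    return hashlib.sha256(f"{random_hex}|{value}".encode("utf-8")).hexdigest().upper()

def delete_attribute(row: dict, attr: str, random_hex: str) -> dict:
    """Return a copy of row with attr replaced by a Delete_A label and its hash."""
    out = dict(row)
    out[attr] = ("Delete_A", attribute_hash(random_hex, row[attr]))
    return out

def delete_record(row: dict, randoms: dict) -> dict:
    """Return a row holding only a Delete_R label, a record hash, and nulls."""
    joined = "".join(attribute_hash(randoms[a], row[a]) for a in row)
    record_hash = hashlib.sha256(joined.encode("utf-8")).hexdigest().upper()
    out = {a: None for a in row}
    out["_label"] = ("Delete_R", record_hash)
    return out

row = {"name": "Hitachi Taro", "address": "Tokyo", "gender": "male"}
randoms = {"name": "a1", "address": "b2", "gender": "c3"}  # placeholder values

print(delete_attribute(row, "name", randoms["name"]))
print(delete_record(row, randoms))
```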

These hash values are intermediate values of the signature generation processing. Because the anonymized data carries them, the anonymized data user terminal 2 has less data to process when it regenerates the signature during signature verification, which speeds up verification.

Next, the processing of the anonymization system is described with reference to FIGS. 10 to 17.

First, the series of processing steps from a user acquiring anonymized data from the anonymized data providing server to verifying it is described with reference to FIG. 10.
First, the record extension function unit 102 of the anonymized data providing server 1 takes the patient data table 301 and the abstraction pattern group 302 as input, performs the patient data record extension processing as preprocessing for anonymization, and stores the result in the extended patient data table 303 (S01). The patient data record extension processing is described in detail later with reference to FIG. 11.

Next, the signature generation unit 106 of the anonymized data providing server 1 takes the data in the extended patient data table 303 as input, performs the signature generation processing to generate a digital signature, and stores the generated signature value in the signature data table 304 together with the name of the patient data from which it was generated (S02). The signature generation processing is described in detail later with reference to FIGS. 12A and 12B.

Next, the Web server function unit 101 of the anonymized data providing server 1 reads the patient data names and signature values stored in the signature data table 304, generates a Web page accessible via the network 3, and notifies the Web browser function of the anonymized data user terminal 2 of the URL (Uniform Resource Locator) of the Web page (S03).

Next, the Web browser function unit 201 of the anonymized data user terminal 2 acquires the Web page containing the patient data names and signature values, and displays the list of patient data names and signature values to the data user on a display device 510 such as a display (S04).

Next, the data user operates an input device 520 of the anonymized data user terminal 2, such as a mouse, downloads from the Web page the signature value corresponding to the name of the patient data to be used, and saves it in the main memory 502 or the hard disk drive 530 of the anonymized data user terminal 2 as the signature value 311 for verification (S05).

次に、データ利用者は、利用する患者データ名および匿名化条件を匿名化データ利用者端末２のＷｅｂブラウザ機能部２０１に入力する。本実施形態では、以下の三つの匿名化条件が入力されるものとする。 Next, the data user inputs the patient data name to be used and the anonymization condition into the Web browser function unit 201 of the anonymization data user terminal 2. In this embodiment, the following three anonymization conditions are input.

匿名化条件１：属性「住所」が“日本”以外のレコードを削除 Anonymization condition 1: Delete records whose attribute "address" is other than "Japan"
匿名化条件２：属性「氏名」を削除 Anonymization condition 2: Delete the attribute "name"
匿名化条件３：属性「住所」を“都道府県名”から“地方名”に置換 Anonymization condition 3: Replace the attribute "address" from "prefecture name" to "region name"

そして、Ｗｅｂブラウザ機能部２０１は、患者データ名および匿名化条件からなる匿名化データ取得依頼を匿名化データ提供サーバ１のＷｅｂサーバ機能部１０１に通知する（Ｓ０６）。 Then, the Web browser function unit 201 notifies the Web server function unit 101 of the anonymization data providing server 1 of the anonymization data acquisition request consisting of the patient data name and the anonymization condition (S06).
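
The acquisition request sent in S06 consists of a patient data name and a list of anonymization conditions. The following is a minimal sketch of how such a request might be represented and serialized. The class name `Condition`, the action strings, the JSON layout, and the data name `patient_data_A` are all assumptions made for illustration; the description does not specify a wire format.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass(frozen=True)
class Condition:
    """One anonymization condition (hypothetical representation)."""
    action: str                     # "delete_record", "delete_attribute" or "replace"
    attribute: str                  # attribute the condition applies to
    argument: Optional[str] = None  # value to keep, or abstraction level to replace with


def build_request(patient_data_name: str, conditions: List[Condition]) -> str:
    """Serialize the patient data name and the conditions as a JSON request body."""
    payload = {
        "patient_data_name": patient_data_name,
        "conditions": [asdict(c) for c in conditions],
    }
    return json.dumps(payload, ensure_ascii=False)


# The three anonymization conditions of this embodiment.
CONDITIONS = [
    Condition("delete_record", "address", "Japan"),  # condition 1: drop records outside Japan
    Condition("delete_attribute", "name"),           # condition 2: drop the attribute "name"
    Condition("replace", "address", "region"),       # condition 3: prefecture -> region
]

# "patient_data_A" is a hypothetical patient data name.
request_body = build_request("patient_data_A", CONDITIONS)
```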

次に、匿名化データ提供サーバ１のＷｅｂサーバ機能部１０１は、通知された患者データ名および匿名化条件を匿名化処理機能部１０３に送信する。匿名化処理機能部１０３は、患者データ名および匿名化条件を入力として検証可能匿名化処理を実行して、匿名化データ３０５を生成し、Ｗｅｂサーバ機能部１０１に送信する。そして、Ｗｅｂサーバ機能部１０１は、匿名化データ３０５をデータ利用者向けのダウンロード用Ｗｅｂページに登録し、そのＵＲＬを匿名化データ利用者端末２のＷｅｂブラウザ機能へ通知する（Ｓ０７）。なお、検証可能匿名化処理の詳細は、後に、図１３を用いて詳説する。 Next, the Web server function unit 101 of the anonymized data providing server 1 sends the notified patient data name and anonymization conditions to the anonymization processing function unit 103. The anonymization processing function unit 103 executes the verifiable anonymization process with the patient data name and anonymization conditions as input, generates the anonymized data 305, and sends it to the Web server function unit 101. The Web server function unit 101 then registers the anonymized data 305 on a download Web page for the data user and notifies the Web browser function of the anonymized data user terminal 2 of its URL (S07). The verifiable anonymization process is described in detail later with reference to FIG. 13.

次に、匿名化データ利用者端末２のＷｅｂブラウザ機能部２０１は、通知されたＵＲＬを入力としてデータ利用者向けのダウンロード用Ｗｅｂページにアクセスし、匿名化データをダウンロードにより取得し、匿名化データ利用者端末２の主メモリ５０２またはハードディスクドライブ５３０に匿名化データ３０５として格納する（Ｓ０８）。 Next, the Web browser function unit 201 of the anonymized data user terminal 2 accesses the download Web page for the data user by inputting the notified URL, acquires the anonymized data by downloading, and anonymizes the data. It is stored as anonymized data 305 in the main memory 502 or the hard disk drive 530 of the user terminal 2 (S08).

最後に、匿名化データ利用者端末２の署名検証処理部２０２は、匿名化データ３０５および検証用の署名値３１１を入力として、匿名化の正当性を検証する署名検証処理を実行し、正当性が検証された場合“ＯＫ”を、正当でない場合“ＮＧ”を、匿名化データ利用者端末２のディスプレイ等の表示装置５１０により、データ利用者に表示する（Ｓ０９）。なお、署名検証処理の詳細は、後に、図１７を用いて詳説する。 Finally, the signature verification processing unit 202 of the anonymized data user terminal 2 takes the anonymized data 305 and the verification signature value 311 as input and executes a signature verification process that verifies the validity of the anonymization. It displays "OK" to the data user if the validity is verified and "NG" if it is not, on the display device 510 such as the display of the anonymized data user terminal 2 (S09). The signature verification process is described in detail later with reference to FIG. 17.

上記の利用者が匿名化データ提供サーバから匿名化データを取得して検証するまでの一連の処理により、データ利用者は、入手した匿名化データが正当な匿名化処理で匿名化されたか否かを確認できる。これにより、例えば、医療分野では研究に用いる患者、被験者等の匿名化データの匿名化の正当性を検証できるので、不正な匿名化データにより研究結果が誤る事態を避けることができる。 Through the above series of processes, in which the data user acquires anonymized data from the anonymized data providing server and verifies it, the data user can confirm whether the obtained anonymized data was anonymized by a legitimate anonymization process. In the medical field, for example, this makes it possible to verify the validity of the anonymization of data on patients, subjects, and the like used in research, and thus to avoid erroneous research results caused by improperly anonymized data.

次に、図１１を用いて患者データレコード拡張処理について説明する。これは、図１０のＳ０１に該当する処理である。 Next, the patient data record extension process will be described with reference to FIG. 11. This is the process corresponding to S01 in FIG. 10.

先ず、匿名化データ提供サーバ１のレコード拡張機能部１０２は、患者データテーブルのレコード数をカウントし、レコード数Ｎとする（Ｓ１０１）。なお、図１１に示した処理で使用する変数ｍ（ｍは、レコードのカウンタ）の初期値は１とする。 First, the record extension function unit 102 of the anonymized data providing server 1 counts the number of records in the patient data table and sets the number of records to N (S101). The initial value of the variable m (m is a record counter) used in the process shown in FIG. 11 is 1.

次に、レコード拡張機能部１０２は、患者データテーブルのレコードを一つ読出す（Ｓ１０２）。 Next, the record extension function unit 102 reads one record of the patient data table (S102).

例えば、読み出したレコードが図５に示すレコード３０１１の場合、当該レコードは属性「氏名」の要素が「日立太郎」、属性「住所」の要素が「東京都」、属性「性別」の要素が「男性」となる。 For example, when the read record is the record 3011 shown in FIG. 5, the element of the attribute "name" is "Hitachi Taro", the element of the attribute "address" is "Tokyo", and the element of the attribute "gender" is "male".

次に、レコード拡張機能部１０２は、抽象化パタン群３０２から、読出した当該レコードの抽象化パタン群を読み出す（Ｓ１０３）。例えば、当該レコードの属性「住所」の「東京都」の抽象化パタンを、図６（ｂ）に示す属性「住所」の抽象化パタン群の抽象化パタンテーブル３０２ｂから読み出す場合、抽象化パタンとして、図６（ｂ）における属性「住所」の要素が「東京都」であるレコード３０２１の「関東地方」「日本」を読み出す。 Next, the record extension function unit 102 reads, from the abstraction pattern group 302, the abstraction patterns for the record that was read (S103). For example, to read the abstraction pattern for "Tokyo" in the attribute "address" of the record from the abstraction pattern table 302b of the attribute "address" shown in FIG. 6(b), it reads "Kanto region" and "Japan" from the record 3021, whose element for the attribute "address" in FIG. 6(b) is "Tokyo".

次に、レコード拡張機能部１０２は、当該レコードに、取得した抽象化パタンの各要素を追加する（Ｓ１０４）。例えば、図５のレコード３０１１が当該レコードである場合、Ｓ１０３で抽象化パタンのレコード３０２１から読出した要素である「関東地方」「日本」を追加する。これにより、置換による匿名化時の置換先の値の候補を含むレコードが生成される。 Next, the record extension function unit 102 adds each element of the acquired abstraction pattern to the record (S104). For example, when the record 3011 in FIG. 5 is the record, the elements "Kanto region" and "Japan" read from the record 3021 of the abstraction pattern in S103 are added. As a result, a record containing candidates for the replacement destination value at the time of anonymization by replacement is generated.

次に、レコード拡張機能部１０２は、レコードの各属性に対して乱数生成部１０４から取得したそれぞれ乱数を一つずつ追加する（Ｓ１０５）。例えば、図５のレコード３０１１が当該レコードである場合、三つの属性に対して、異なる三つの乱数を取得し、そのレコードに追加する。このように属性毎に一つ乱数を付与することにより、すべての要素（カラム）に乱数を付与する場合に比べてレコードのデータサイズを小さくすることができる。 Next, the record extension function unit 102 adds one random number acquired from the random number generation unit 104 to each attribute of the record (S105). For example, when the record 3011 in FIG. 5 is the record, three different random numbers are acquired for the three attributes and added to the record. By assigning one random number to each attribute in this way, the data size of the record can be reduced as compared with the case where random numbers are assigned to all the elements (columns).

次に、レコード拡張機能部１０２は、属性の置換、乱数の付与をおこなったレコードを拡張患者データテーブル３０３のレコードとして出力する（Ｓ１０６）。例えば、図５のレコード３０１１が当該レコードである場合、Ｓ１０３～Ｓ１０５により生成したレコードは、図７の３０３１に示すように、属性「氏名」に追加された乱数を示す属性「氏名１」の値が「５ＥＦ４ＢＥ」、属性「氏名」の元の値を示す属性「氏名２」の値が「日立太郎」、属性「住所」に追加された乱数を示す属性「住所１」の値が「Ａ７５４Ｂ９」、属性「住所」の元の値を示す属性「住所２」の値が「東京都」、属性「住所」に追加された抽象化パタンの第一の要素を示す属性「住所３」の値が「関東地方」、属性「住所」に追加された抽象化パタンの第二の要素を示す属性「住所４」の値が「日本」、属性「性別」に追加された乱数を示す属性「性別１」の値が「７７０Ｅ６７」、属性「性別」の元の値を示す属性「性別２」の値が「男性」となる。 Next, the record extension function unit 102 outputs the record, with its attributes replaced and random numbers assigned, as a record of the extended patient data table 303 (S106). For example, when the record 3011 in FIG. 5 is the record in question, the record generated by S103 to S105 is as shown in 3031 of FIG. 7: the attribute "name 1", which holds the random number added to the attribute "name", is "5EF4BE"; the attribute "name 2", which holds the original value of the attribute "name", is "Hitachi Taro"; the attribute "address 1", which holds the random number added to the attribute "address", is "A754B9"; the attribute "address 2", which holds the original value of the attribute "address", is "Tokyo"; the attribute "address 3", which holds the first element of the abstraction pattern added to the attribute "address", is "Kanto region"; the attribute "address 4", which holds the second element of that abstraction pattern, is "Japan"; the attribute "gender 1", which holds the random number added to the attribute "gender", is "770E67"; and the attribute "gender 2", which holds the original value of the attribute "gender", is "male".

次に、レコード拡張機能部１０２は、変数ｍの値を１インクリメントする（Ｓ１０７）。 Next, the record extension function unit 102 increments the value of the variable m by 1 (S107).

最後に、レコード拡張機能部１０２は、変数ｍの値と患者データのレコード数Ｎを比較し、ｍがＮ以下の場合には（Ｓ１０８：Ｎｏ）、Ｓ１０２～Ｓ１０７の処理を実行し、ｍがＮより大きい場合には（Ｓ１０８：Ｙｅｓ）、処理を終了する。 Finally, the record extension function unit 102 compares the value of the variable m with the number of records N of the patient data, and if m is N or less (S108: No), executes the processes of S102 to S107, and m is If it is larger than N (S108: Yes), the process ends.
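
The record extension loop (S101 to S108) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the dictionary layout, the function names `extend_record` and `extend_table`, the English attribute names, and the use of `secrets.token_hex` as the random number source are all hypothetical. Each attribute becomes a list holding one random number, the original value, and the elements of its abstraction pattern.

```python
import secrets
from typing import Callable, Dict, List

Record = Dict[str, str]
ExtendedRecord = Dict[str, List[str]]

# Abstraction patterns per attribute: original value -> progressively coarser values.
ABSTRACTION_PATTERNS: Dict[str, Dict[str, List[str]]] = {
    "address": {"Tokyo": ["Kanto region", "Japan"]},
}


def new_random() -> str:
    """Return a 6-hex-digit random string (format chosen to resemble the figures)."""
    return secrets.token_hex(3).upper()


def extend_record(record: Record, rand: Callable[[], str] = new_random) -> ExtendedRecord:
    """S103 to S105: one random number per attribute, then the value and its pattern."""
    extended: ExtendedRecord = {}
    for attribute, value in record.items():
        pattern = ABSTRACTION_PATTERNS.get(attribute, {}).get(value, [])
        extended[attribute] = [rand(), value, *pattern]
    return extended


def extend_table(table: List[Record], rand: Callable[[], str] = new_random) -> List[ExtendedRecord]:
    """S101 to S108: extend every record of the patient data table in order."""
    return [extend_record(record, rand) for record in table]
```

Because only one random number is drawn per attribute rather than per element, the extended record for "address" has four elements instead of six, which is the size saving noted in S105.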

次に、図１２Ａおよび図１２Ｂを用いて署名生成処理について説明する。これは、図１０のＳ０２とＳ０９に該当する処理であり、匿名化データ提供サーバ１の署名生成部１０６および匿名化データ利用者端末２の署名生成部２０４で実施される両者の共通の処理である。本実施形態においては、匿名化データ提供サーバ１の署名生成部１０６は、拡張患者データを入力としてデジタル署名値を出力し、一方、匿名化データ利用者端末２の署名生成部２０４は、匿名化データを入力としてデジタル署名値を出力する。 Next, the signature generation process will be described with reference to FIGS. 12A and 12B. This process corresponds to S02 and S09 in FIG. 10 and is common to the signature generation unit 106 of the anonymized data providing server 1 and the signature generation unit 204 of the anonymized data user terminal 2. In the present embodiment, the signature generation unit 106 of the anonymized data providing server 1 takes the extended patient data as input and outputs a digital signature value, while the signature generation unit 204 of the anonymized data user terminal 2 takes the anonymized data as input and outputs a digital signature value.

以下では、署名生成処理が匿名化データ提供サーバ１の署名生成部１０６で行われるものとするが、匿名化データ利用者端末２の署名生成部２０４での署名生成処理も同様である。 In the following, the signature generation process is assumed to be performed by the signature generation unit 106 of the anonymized data providing server 1; the signature generation process in the signature generation unit 204 of the anonymized data user terminal 2 is the same.

先ず、署名生成部１０６は、対象データから、レコード数Ｎ、属性数ｎを取得する（Ｓ２０１）。なお、本フローチャートで使用する変数ｉ，ｊ，ｔ，ｑ，ｋ，ｌ，ｍの初期値はすべて１とする。 First, the signature generation unit 106 acquires the number of records N and the number of attributes n from the target data (S201). The initial values of the variables i, j, t, q, k, l, and m used in this flowchart are all set to 1.

次に、署名生成部１０６は、対象データから、レコードを一つ読み出す（Ｓ２０２）。例えば、対象データが図７の拡張患者データテーブル３０３の場合、読み出すレコードは、レコード３０３１となる。また、対象データが図９の匿名化データ３０５の場合、読み出す対象レコードはレコード３０５２となる。 Next, the signature generation unit 106 reads one record from the target data (S202). For example, when the target data is the extended patient data table 303 of FIG. 7, the record to be read is the record 3031. When the target data is the anonymized data 305 of FIG. 9, the target record to be read is the record 3052.

次に、署名生成部１０６は、読み出した当該レコードの１番目の要素が、当該レコードが削除されていることを示す“Ｄｅｌｅｔｅ＿Ｒ”である場合（Ｓ２０３：Ｙｅｓ）には、Ｓ２０４の処理を実行し、それ以外の場合（Ｓ２０３：Ｎｏの場合）は、Ｓ２０５の処理を実行する（Ｓ２０３）。例えば、当該レコードが匿名化データ３０５のレコード３０５１である場合には、レコードの１番目の要素が“Ｄｅｌｅｔｅ＿Ｒ”であるため、次にＳ２０４の処理を実行する。一方、当該レコードが拡張患者データテーブル３０３のレコード３０３１の場合には、レコードの１番目の要素が“５ＥＦ４ＢＥ”であるため、次にＳ２０５の処理を実行する。 Next, when the first element of the read record is “Delete_R” indicating that the record has been deleted (S203: Yes), the signature generation unit 106 executes the process of S204. In other cases (S203: No), the process of S205 is executed (S203). For example, when the record is the record 3051 of the anonymized data 305, since the first element of the record is "Delete_R", the process of S204 is executed next. On the other hand, when the record is the record 3031 of the extended patient data table 303, the first element of the record is "5EF4BE", so the process of S205 is executed next.

次に、署名生成部１０６は、当該レコードの２番目の要素の値をｉ番目（ｉは、属性のカウンタ）のレコードのハッシュ値Ｈｒｉの値とし（Ｓ２０４）、次にＳ２１６の処理を実行する。例えば、当該レコードが匿名化データ３０５のレコード３０５１である場合、その２番目の要素の値“２２８９１Ｆ”をＨｒｉの値とする。 Next, the signature generation unit 106 sets the value of the second element of the record as the value of the hash value Hri of the i-th record (i is the attribute counter) (S204), and then executes the process of S216. For example, when the record is the record 3051 of the anonymized data 305, the value "22891F" of its second element is set as the value of Hri.

次に、署名生成部１０６は、当該レコードのｉ番目の属性Ａｉの要素数を取得しＥｉｎとする（Ｓ２０５）。例えば、拡張患者データテーブル３０３のレコード３０３１の１番目の属性「氏名」をＡ１とした場合、拡張された二つの属性「氏名１」「氏名２」から構成されるため、Ｅｉｎの値は“２”となる。 Next, the signature generation unit 106 acquires the number of elements of the i-th attribute Ai of the record and sets it as Ein (S205). For example, when the first attribute "name" of the record 3031 of the extended patient data table 303 is A1, the value of Ein is "2" because it is composed of two extended attributes "name 1" and "name 2". ".

次に、署名生成部１０６は、属性Ａｉの１番目の要素が、“Ｄｅｌｅｔｅ＿Ａ”である場合には（Ｓ２０６：“Ｄｅｌｅｔｅ＿Ａ”）、次にＳ２０７の処理を実行し、“Ｒｅｐｌａｃｅ”である場合には（Ｓ２０６：“Ｒｅｐｌａｃｅ”）、次にＳ２０８の処理を実行し、いずれでもない場合（Ｓ２０６：Ｏｔｈｅｒｗｉｓｅ）には、次にＳ２０９の処理を実行する（Ｓ２０６）。例えば、当該レコードが匿名化データ３０５のレコード３０５２、当該属性Ａｉが「氏名」である場合、１番目の要素が属性「氏名１」の“Ｄｅｌｅｔｅ＿Ａ”であるため、次にＳ２０７の処理を実行する。また例えば、当該レコードが匿名化データ３０５のレコード３０５２、当該属性Ａｉが「住所」である場合、１番目の要素が属性「住所１」の“Ｒｅｐｌａｃｅ”であるため、次にＳ２０８の処理を実行する。また例えば、当該レコードが拡張患者データテーブル３０３の３０３１、当該属性Ａｉが「氏名」である場合、１番目の要素が属性「氏名１」の“５ＥＦ４ＢＥ”であるため、次にＳ２０９の処理を実行する。 Next, the signature generation unit 106 executes the process of S207 when the first element of the attribute Ai is “Delete_A” (S206: “Delete_A”), and when it is “Replace”. (S206: "Replace"), then the process of S208 is executed, and if neither is the case (S206: Otherwise), the process of S209 is executed next (S206). For example, when the record is the record 3052 of the anonymized data 305 and the attribute Ai is the "name", the first element is the "Delete_A" of the attribute "name 1", so the process of S207 is executed next. .. Further, for example, when the record is the record 3052 of the anonymized data 305 and the attribute Ai is the "address", the first element is the "Replace" of the attribute "address 1", so the process of S208 is executed next. do. Further, for example, when the record is 3031 of the extended patient data table 303 and the attribute Ai is "name", the first element is "5EF4BE" of the attribute "name 1", so the process of S209 is executed next. do.

次に、署名生成部１０６は、当該属性Ａｉの２番目の要素の値をハッシュ値Ｈｒｉの値とし（Ｓ２０７）、次にＳ２１６の処理を実行する。例えば、当該レコードが匿名化データ３０５のレコード３０５２、当該属性Ａｉが「氏名」である場合、２番目の要素の属性「氏名２」の値である“４６５ＦＣ４”をＨｒｉの値とする。 Next, the signature generation unit 106 sets the value of the second element of the attribute Ai as the value of the hash value Hri (S207), and then executes the process of S216. For example, when the record is the record 3052 of the anonymized data 305 and the attribute Ai is the "name", the value of "465FC4" which is the value of the attribute "name 2" of the second element is set as the value of Hri.

次に、署名生成部１０６は、当該属性Ａｉの２番目の要素の値をハッシュ値Ｈｒｉの値とし、次にＳ２１０の処理を実行する（Ｓ２０８）。例えば、当該レコードが匿名化データ３０５のレコード３０５２、当該属性Ａｉが「住所」である場合、２番目の要素の属性「住所２」の値である“Ｂ０Ｄ８Ｃ７”をＨｒｉの値とする。 Next, the signature generation unit 106 sets the value of the second element of the attribute Ai as the value of the hash value Hri, and then executes the process of S210 (S208). For example, when the record is the record 3052 of the anonymized data 305 and the attribute Ai is the "address", the value of "B0D8C7" which is the value of the attribute "address 2" of the second element is set as the value of Hri.

次に、署名生成部１０６は、変数ｊ（ｊは、要素のカウンタ）の値を１インクリメントする（Ｓ２１０）。 Next, the signature generation unit 106 increments the value of the variable j (j is an element counter) by 1 (S210).

次に、署名生成部１０６は、当該属性Ａｉのｊ番目の要素Ａｉｊと（ｊ＋１）番目の要素Ａｉ（ｊ＋１）を入力として、ハッシュ値生成部１０５からハッシュ値を取得し、当該レコードのハッシュ値Ｈｒｉとする（Ｓ２０９）。例えば、当該レコードが拡張患者データテーブル３０３のレコード３０３１、当該属性Ａｉが「氏名」、ｉ＝１、ｊ＝１である場合、当該属性Ａｉのｊ番目の要素は属性「氏名１」の値“５ＥＦ４ＢＥ”、（ｊ＋１）番目の要素は属性「氏名２」の値“日立太郎”となり、この二つの値をハッシュ値生成部１０５に入力して得たハッシュ値“Ａ８Ｅ０Ｃ２”をＨｒｉの値とする。 Next, the signature generation unit 106 obtains a hash value from the hash value generation unit 105 by inputting the j-th element Aij and the (j + 1) -th element Ai (j + 1) of the attribute Ai, and the hash value of the record. Let it be Hri (S209). For example, when the record is the record 3031 of the extended patient data table 303 and the attribute Ai is "name", i = 1, j = 1, the jth element of the attribute Ai is the value of the attribute "name 1". 5EF4BE ”, the (j + 1) th element is the value“ Hitachi Taro ”of the attribute“ name 2 ”, and the hash value“ A8E0C2 ”obtained by inputting these two values into the hash value generation unit 105 is used as the Hri value. ..

次に、署名生成部１０６は、当該属性Ａｉのｊ番目の要素の値が“－”（Ｎｕｌｌ）である場合（Ｓ２１１：Ｙｅｓ）には、次にＳ２１３の処理を実行し、それ以外の場合（Ｓ２１１：Ｎｏ）には、次にＳ２１２の処理を実行する。例えば、当該レコードが匿名化データ３０５のレコード３０５１、当該属性Ａｉが「住所」、ｊ番目の要素が属性「住所１」の値“－”である場合、次にＳ２１３の処理を実行する。 Next, when the value of the j-th element of the attribute Ai is "-" (Null) (S211: Yes), the signature generation unit 106 executes the process of S213 next; otherwise (S211: No), it executes the process of S212 next. For example, when the record is the record 3051 of the anonymized data 305, the attribute Ai is "address", and the j-th element is the value "-" of the attribute "address 1", the process of S213 is executed next.

次に、署名生成部１０６は、変数ｊの値を１インクリメントし、次に、Ｓ２１３の処理を実行する（Ｓ２１２）。 Next, the signature generation unit 106 increments the value of the variable j by 1, and then executes the process of S213 (S212).

次に、署名生成部１０６は、当該属性Ａｉの要素数Ｅｉｎと変数ｊの値を比較し、Ｅｉｎがｊより大きい場合（Ｓ２１３：Ｙｅｓ）には、次にＳ２１４の処理を実行し、Ｅｉｎがｊ以下の場合（Ｓ２１３：Ｎｏ）には、次にＳ２１６の処理を実行する。 Next, the signature generation unit 106 compares the number of elements Ein of the attribute Ai with the value of the variable j. If Ein is larger than j (S213: Yes), elements remain to be hashed and the process of S214 is executed next; if Ein is j or less (S213: No), the process of S216 is executed next.

次に、署名生成部１０６は、当該属性Ａｉのｊ番目の要素Ａｉｊおよびその時点の当該レコードのハッシュ値Ｈｒｉを入力として、ハッシュ値生成部１０５から新たなハッシュ値を取得して、新たなＨｒｉとする（Ｓ２１４）。例えば、当該レコードが拡張患者データテーブル３０３のレコード３０３１、当該属性Ａｉが「住所」、ｊ＝３であり、当該属性Ａｉの３番目の要素は属性「住所３」の値、“関東地方”、その時点のＨｒｉの値を“Ｂ７ＥＦ１４”である場合、“関東地方”および“Ｂ７ＥＦ１４”をハッシュ値生成部１０５に入力して得た“７Ｃ６４８Ｂ”を新たなＨｒｉの値とする。 Next, the signature generation unit 106 obtains a new hash value from the hash value generation unit 105 by inputting the j-th element Aij of the attribute Ai and the hash value Hri of the record at that time, and the new Hri. (S214). For example, the record is record 3031 of the extended patient data table 303, the attribute Ai is "address", j = 3, and the third element of the attribute Ai is the value of the attribute "address 3", "Kanto region". When the value of Hri at that time is "B7EF14", "7C648B" obtained by inputting "Kanto region" and "B7EF14" into the hash value generation unit 105 is used as a new Hri value.

次に、署名生成部１０６は、変数ｊの値を１インクリメントし、次に、Ｓ２１１の処理を実行する（Ｓ２１５）。 Next, the signature generation unit 106 increments the value of the variable j by 1, and then executes the process of S211 (S215).

次に、署名生成部１０６は、変数ｉの値を１インクリメントし、次にＳ２１７の処理を実行する（Ｓ２１６）。 Next, the signature generation unit 106 increments the value of the variable i by 1 and then executes the process of S217 (S216).

次に、署名生成部１０６は、属性数ｎと変数ｉを比較し、ｉがｎ以下の場合（Ｓ２１７：Ｙｅｓ）には、次にＳ２０３の処理を実行し、ｉがｎより大きい場合（Ｓ２１７：Ｎｏ）には、次にＳ２１８の処理を実行する。 Next, the signature generation unit 106 compares the number of attributes n with the variable i, and if i is n or less (S217: Yes), then executes the process of S203, and if i is larger than n (S217). : No), the process of S218 is executed next.

次に、ｎとｑ（ｑは、属性のカウンタ）を比較し、ｎがｑより大きい場合（Ｓ２１８：Ｙｅｓ）には、次にＳ２１９を実行し、ｎがｑ以下の場合（Ｓ２１８：Ｎｏ）には、次にＳ２２１を実行する。 Next, n and q (q is an attribute counter) are compared, and if n is larger than q (S218: Yes), then S219 is executed, and if n is q or less (S218: No). Next, S221 is executed.

次に、署名生成部１０６は、ハッシュ値Ｈｒｑとハッシュ値Ｈｒ（ｑ＋１）を入力として、ハッシュ値を生成し、新たなＨｒ（ｑ＋１）とする（Ｓ２１９）。 Next, the signature generation unit 106 takes the hash value Hrq and the hash value Hr (q + 1) as inputs, generates a hash value, and sets it as a new Hr (q + 1) (S219).

次に、ｑを１インクリメントし（Ｓ２２０）、Ｓ２１８に戻る。 Next, q is incremented by 1 (S220), and the process returns to S218.

次に、署名生成部１０６は、ハッシュ値ＨＲｔ（ｔは、レコードのカウンタ）に、ハッシュ値Ｈｒｎ（これは、ｎ≧２のとき、Ｈｒ（ｑ＋１）の値に等しいことに注意）の値を代入する（Ｓ２２１）。 Next, the signature generator 106 sets the hash value HRt (t is the counter of the record) to the value of the hash value Hrn (note that this is equal to the value of Hr (q + 1) when n ≧ 2). Substitute (S221).

次に、署名生成部１０６は、変数ｔの値を、１インクリメントし、変数ｉおよび変数ｊの値を１として、次にＳ２２３の処理を実行する（Ｓ２２２）。 Next, the signature generation unit 106 increments the value of the variable t by 1, resets the values of the variables i and j to 1, and then executes the process of S223 (S222).

次に、署名生成部１０６は、対象データのレコード数Ｎと変数ｔの値を比較し、ｔがＮ以下の場合（Ｓ２２３：Ｎｏ）には、次にＳ２０２～Ｓ２２２の処理を実行し、ｔがＮより大きい場合（Ｓ２２３：Ｙｅｓ）には、次に図１２ＢのＳ２５１の処理を実行する。 Next, the signature generation unit 106 compares the number of records N of the target data with the value of the variable t, and if t is N or less (S223: No), then executes the processes of S202 to S222, t. If is greater than N (S223: Yes), then the process of S251 in FIG. 12B is executed.
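
The per-record hashing of S202 to S223 can be sketched as below. This is one reading of the flowchart under stated assumptions, not a definitive implementation: SHA-256 with a separator byte stands in for the unspecified hash value generation unit 105, a record is modeled as a list of attributes (each a list of elements), a deleted record is modeled as the flat list `["Delete_R", digest]`, and "-" elements are simply skipped. The function names `h`, `attribute_hash`, and `record_hash` are hypothetical.

```python
import hashlib
from typing import List


def h(a: str, b: str) -> str:
    """Two-input hash (assumption: SHA-256 over the inputs joined by a separator)."""
    data = (a + "\x1f" + b).encode("utf-8")
    return hashlib.sha256(data).hexdigest().upper()


def attribute_hash(elements: List[str]) -> str:
    """Compute Hri for one attribute (S205 to S215)."""
    if elements[0] == "Delete_A":
        return elements[1]           # S207: stored digest of the deleted attribute
    if elements[0] == "Replace":
        digest = elements[1]         # S208: stored digest of the hidden elements
    else:
        digest = h(elements[0], elements[1])  # S209: random number and original value
    for element in elements[2:]:
        if element == "-":           # null element, nothing to hash
            continue
        digest = h(element, digest)  # S214: chain in the next abstraction element
    return digest


def record_hash(record: list) -> str:
    """Compute HRt for one record (S203 to S221)."""
    if record[0] == "Delete_R":
        return record[1]             # S204: stored digest of the deleted record
    digests = [attribute_hash(attribute) for attribute in record]
    combined = digests[0]
    for digest in digests[1:]:
        combined = h(combined, digest)  # S219: fold attribute digests left to right
    return combined
```

With this structure, replacing an attribute by `["Delete_A", attribute_hash(attribute)]`, or hiding its leading elements behind `["Replace", digest, ...]`, leaves the record hash unchanged, while altering any visible value changes it.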

以降の過程は、レコードハッシュ値からなる二分木構造のハッシュ木（Hash Tree）からルートのハッシュ値を生成する過程である。 The subsequent process is the process of generating the hash value of the root from the hash tree of the binary tree structure consisting of the record hash values.

次に、署名生成部１０６は、レコードハッシュ値ＨＲｉの残項目数としてｍ（ｍは、レコードの処理の残項目数のカウンタ）の値を対象データのレコード数Ｎとし、次にＳ２５２の処理を実行する（図１２ＢのＳ２５１）。 Next, the signature generation unit 106 sets the value of m (m is a counter of the number of remaining items in the record processing) as the number of remaining items in the record hash value HRi as the number of records N of the target data, and then performs the processing of S252. Execute (S251 in FIG. 12B).

次に、署名生成部１０６は、残項目数ｍと変数ｌの値を比較し、等しくない場合（Ｓ２５２：Ｎｏ）には、次にＳ２５３の処理を実行し、等しい場合（Ｓ２５２：Ｙｅｓ）には、次にＳ２６０の処理を実行する。 Next, the signature generation unit 106 compares the number of remaining items m with the value of the variable l, and if they are not equal (S252: No), then the process of S253 is executed, and if they are equal (S252: Yes). Next executes the process of S260.

次に、署名生成部１０６は、Ｓ２５３では、ｋ番目（ｋは、レコードのカウンタ）のレコードのハッシュ値ＨＲｋと（ｋ＋１）番目のハッシュ値ＨＲ（ｋ＋１）を入力として、ハッシュ値生成部１０５から新たなハッシュ値を取得し、ｌ番目（ｌは、レコードのカウンタ）のレコードハッシュ値ＨＲｌの新たな値とする（Ｓ２５３）。例えば、ｋ＝１、ｌ＝１の場合、ＨＲ１とＨＲ２を入力としてハッシュ値生成部１０５から新たに取得したハッシュ値を、ＨＲ１の新たな値とする。 Next, in S253, the signature generation unit 106 inputs the hash value HRk of the kth (k is a record counter) record and the (k + 1) th hash value HR (k + 1) from the hash value generation unit 105. A new hash value is acquired and used as a new value of the l-th (l is a record counter) record hash value HRl (S253). For example, when k = 1 and l = 1, the hash value newly acquired from the hash value generation unit 105 with HR1 and HR2 as inputs is used as the new value of HR1.

次に、署名生成部１０６は、残項目数ｍの値が２以外の場合（Ｓ２５４：Ｎｏ）には、次にＳ２５５の処理を実行し、残項目数ｍの値が２である場合（Ｓ２５４：Ｙｅｓ）には、次にＳ２６０の処理を実行する。 Next, when the value of the number of remaining items m is other than 2 (S254: No), the signature generation unit 106 then executes the process of S255, and when the value of the number of remaining items m is 2 (S254: No). : Yes), then the process of S260 is executed.

次に、署名生成部１０６は、ｋ＋２を新たなｋの値とし、ｌの値を１インクリメントする（Ｓ２５５）。 Next, the signature generation unit 106 sets k + 2 as the new value of k, so that the next adjacent pair of record hash values is processed, and increments the value of l by 1 (S255).

次に、署名生成部１０６は、残項目数ｍと変数ｋの値を比較し、ｍがｋよりも大きい場合（Ｓ２５６：Ｙｅｓ）には、次にＳ２５３の処理を実行し、ｍがｋ以下の場合（Ｓ２５６：Ｎｏ）には、次にＳ２５７の処理を実行する。 Next, the signature generation unit 106 compares the number of remaining items m with the value of the variable k, and if m is larger than k (S256: Yes), then executes the process of S253, and m is k or less. In the case of (S256: No), the process of S257 is executed next.

次に、署名生成部１０６は、残項目数ｍと変数ｋの値を比較し、ｍとｋが異なる場合（Ｓ２５７：Ｎｏ）には、次にＳ２５８の処理を実行し、ｍとｋが等しい場合（Ｓ２５７：Ｙｅｓ）には、次にＳ２５９の処理を実行する。 Next, the signature generation unit 106 compares the number of remaining items m with the value of the variable k, and if m and k are different (S257: No), then executes the process of S258, and m and k are equal. In the case (S257: Yes), the process of S259 is executed next.

署名生成部１０６は、（ｍ／２＋ｍ％２）の値を新たな残項目数ｍの値とし、ｋおよびｌの値を１として、次にＳ２５３の処理を実行する（Ｓ２５８）。ここで、ｍ％２は、ｍを２で割った剰余を表す。 The signature generation unit 106 sets the value of (m / 2 + m% 2) as the value of the new number of remaining items m, sets the values of k and l to 1, and then executes the process of S253 (S258). Here, m% 2 represents a remainder obtained by dividing m by 2.

次に、署名生成部１０６は、ＨＲｌの新しい値としてＨＲｋの値を代入する（Ｓ２５９）。 Next, the signature generation unit 106 assigns the value of HRk as the new value of HRl (S259).

次に、署名生成部１０６は、その時点のＨＲｌの値を対象データの全体ハッシュ値Ｈｄの値とする（Ｓ２６０）。 Next, the signature generation unit 106 sets the value of HRl at that time as the value of the overall hash value Hd of the target data (S260).

最後に、署名生成部１０６は、ＨｄからＲＳＡなどのデジタル署名アルゴリズムにより署名値δを生成して出力し（Ｓ２６１）、処理を終了する。なお、デジタル署名アルゴリズムは、既存のアルゴリズムを利用することができる。 Finally, the signature generation unit 106 generates and outputs a signature value δ from Hd by a digital signature algorithm such as RSA (S261), and ends the process. As the digital signature algorithm, an existing algorithm can be used.
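
The reduction of the record hash values to the overall hash value Hd (S251 to S260) amounts to computing the root of a binary hash tree. The sketch below assumes that adjacent record hashes are paired at each level and that an unpaired last hash is carried up unchanged, so each level has m/2 + m%2 entries. SHA-256 again stands in for the unspecified hash function, and `h` and `root_hash` are hypothetical names. The final signing step (S261) is omitted; Hd would be passed to an existing digital signature algorithm such as RSA from a cryptographic library.

```python
import hashlib
from typing import List


def h(a: str, b: str) -> str:
    """Two-input hash (assumption: SHA-256 over the inputs joined by a separator)."""
    data = (a + "\x1f" + b).encode("utf-8")
    return hashlib.sha256(data).hexdigest().upper()


def root_hash(record_hashes: List[str]) -> str:
    """Reduce the record hash values HR1..HRN to the overall hash value Hd."""
    if not record_hashes:
        raise ValueError("at least one record hash is required")
    level = list(record_hashes)
    while len(level) > 1:
        next_level = []
        # S253: hash each adjacent pair (HRk, HRk+1).
        for k in range(0, len(level) - 1, 2):
            next_level.append(h(level[k], level[k + 1]))
        # S259: an unpaired last hash is carried up as is.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level  # S258: m becomes m/2 + m%2
    return level[0]         # S260: the remaining value is Hd
```

Because only record hash values enter the tree, a record replaced by its stored digest contributes exactly the same leaf as the original record.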

以上説明した署名生成処理により、図１３～図１６を用いて後述する検証可能匿名化処理が施された匿名化データに対しては、匿名化処理後であっても、匿名化前の拡張患者データが同一である場合は、拡張患者データに対して生成される署名の値と匿名化データに対して生成される署名の値が同一となるため、匿名化の正当性を検証することが可能となる。 With respect to the anonymized data subjected to the verifiable anonymization process described later using FIGS. 13 to 16 by the signature generation process described above, the extended patient before the anonymization even after the anonymization process. If the data are the same, the value of the signature generated for the extended patient data and the value of the signature generated for the anonymized data will be the same, so it is possible to verify the validity of the anonymization. It becomes.

次に、図１３を用いて検証可能匿名化処理について説明する。これは、図１０のＳ０７に該当する処理である。 Next, the verifiable anonymization process will be described with reference to FIG. 13. This is the process corresponding to S07 in FIG. 10.

先ず、匿名化データ提供サーバ１の匿名化処理機能部１０３は、Ｗｅｂサーバ機能部１０１から送信された患者データ名に対応する拡張患者データを拡張患者データテーブル３０３から読み出す（Ｓ３０１）。例えば、図１０のＳ０６において、対応する拡張患者データとして、図７の拡張患者データテーブル３０３のデータを取得する。 First, the anonymization processing function unit 103 of the anonymization data providing server 1 reads out the extended patient data corresponding to the patient data name transmitted from the Web server function unit 101 from the extended patient data table 303 (S301). For example, in S06 of FIG. 10, the data of the extended patient data table 303 of FIG. 7 is acquired as the corresponding extended patient data.

次に、匿名化処理機能部１０３は、Ｗｅｂサーバ機能部１０１から送信された匿名化条件を読出す（Ｓ３０２）。例えば、本実施形態では、上述の匿名化条件１～３の三つの匿名化条件を読み出すものとする。 Next, the anonymization processing function unit 103 reads the anonymization conditions sent from the Web server function unit 101 (S302). For example, in this embodiment, the three anonymization conditions 1 to 3 described above are read.

次に、匿名化処理機能部１０３は、レコードを削除する匿名化条件がある場合には、単にレコードを削除するのではなく、対象レコード全体のハッシュ化を拡張患者データに対して行うレコード削除処理を実行する（Ｓ３０３）。例えば、本実施形態の場合、匿名化条件１がレコードを削除する匿名化条件であるため、拡張患者データテーブル３０３から読み出した拡張患者データに対して、レコード削除処理を実行する。なお、レコード削除処理の詳細は、図１４を用いて後述する。 Next, when there is an anonymization condition that deletes records, the anonymization processing function unit 103 does not simply delete the record but executes a record deletion process that hashes the entire target record in the extended patient data (S303). For example, in the present embodiment, anonymization condition 1 deletes records, so the record deletion process is executed on the extended patient data read from the extended patient data table 303. The record deletion process is described in detail later with reference to FIG. 14.

次に、匿名化処理機能部１０３は、属性を削除する匿名化条件がある場合、単に属性を削除するのでなく、対象属性全体のハッシュ化を拡張患者データに対して行う属性削除処理を実行する（Ｓ３０４）。例えば、本実施形態の場合、匿名化条件２が属性「氏名」を削除する匿名化条件であるため、拡張患者データに対して属性削除処理を実行する。なお、属性削除処理の詳細は、図１５を用いて後述する。 Next, when there is an anonymization condition that deletes an attribute, the anonymization processing function unit 103 does not simply delete the attribute but executes an attribute deletion process that hashes the entire target attribute in the extended patient data (S304). For example, in the present embodiment, anonymization condition 2 deletes the attribute "name", so the attribute deletion process is executed on the extended patient data. The attribute deletion process is described in detail later with reference to FIG. 15.

次に、匿名化処理機能部１０３は、属性の要素を置換する匿名化条件がある場合、単に属性の要素を置換するのでなく、対象属性の一部の要素のハッシュ化を拡張患者データに対して行う属性置換処理を実行する（Ｓ３０５）。例えば、本実施形態の場合、匿名化条件３が属性「住所」の要素を“都道府県名”から“地方名”に置換する匿名化条件であるため、拡張患者データに対して属性置換処理を実行する。なお、属性置換処理の詳細は、図１６を用いて後述する。 Next, when there is an anonymization condition that replaces an element of an attribute, the anonymization processing function unit 103 does not simply replace the element but executes an attribute replacement process that hashes some of the elements of the target attribute in the extended patient data (S305). For example, in the present embodiment, anonymization condition 3 replaces the element of the attribute "address" from "prefecture name" to "region name", so the attribute replacement process is executed on the extended patient data. The attribute replacement process is described in detail later with reference to FIG. 16.

最後に、匿名化処理機能部１０３は、Ｓ３０３～Ｓ３０５の処理を実行後の拡張患者データを匿名化データ３０５としてファイルに出力し（Ｓ３０６）、Ｗｅｂサーバ機能部１０１へ送信し、処理を終了する。 Finally, the anonymization processing function unit 103 outputs the extended patient data after executing the processing of S303 to S305 to a file as anonymization data 305 (S306), sends it to the Web server function unit 101, and ends the processing. ..

次に、図１４を用いてレコード削除処理について説明する。これは、図１３のＳ３０３に該当する処理である。 Next, the record deletion process will be described with reference to FIG. 14. This is the process corresponding to S303 in FIG. 13.

先ず、匿名化データ提供サーバ１の匿名化処理機能部１０３は、レコード削除の匿名化条件を読み出す。例えば、本実施形態の場合、匿名化条件１を読み出す（Ｓ４０１）。なお、変数ｉ，ｊ，ｔの初期値を１とする（匿名化条件１：属性「住所」が“日本”以外のレコードを削除）。 First, the anonymization processing function unit 103 of the anonymization data providing server 1 reads out the anonymization condition for record deletion. For example, in the case of this embodiment, the anonymization condition 1 is read out (S401). The initial values of the variables i, j, and t are set to 1 (anonymization condition 1: records whose attribute "address" is other than "Japan" are deleted).

次に、匿名化処理機能部１０３は、拡張患者データから削除対象のレコード群Ｒｄを特定し、Ｒｄのレコード数をＮ，Ｒｄの属性の属性数をｎとする。例えば、本実施形態では、図７の拡張患者データテーブル３０３から読み出した拡張患者データから、属性「住所」が“日本”以外の“アメリカ”であるレコード３０３２をＲｄとし、Ｎ＝１、ｎ＝３（属性「氏名」「住所」「性別」）とする（Ｓ４０２）。 Next, the anonymization processing function unit 103 identifies the record group Rd to be deleted from the extended patient data, sets the number of records of Rd to N, and sets the number of attributes of the Rd attribute to n. For example, in the present embodiment, from the extended patient data read from the extended patient data table 303 of FIG. 7, the record 3032 whose attribute “address” is “America” other than “Japan” is set as Rd, and N = 1, n =. 3 (attributes "name", "address", "gender") (S402).

次に、匿名化処理機能部１０３は、削除対象のレコード群Ｒｄからｔ番目（ｔは、レコードのカウンタ）のレコードを一つ読み出す。例えば、本実施形態では、Ｓ４０２でＲｄとして特定したレコード３０３２を読み出す（Ｓ４０３）。 Next, the anonymization processing function unit 103 reads one t-th (t is a record counter) record from the record group Rd to be deleted. For example, in the present embodiment, the record 3032 specified as Rd in S402 is read out (S403).

Next, the anonymization processing function unit 103 acquires the number of elements Ein of the i-th attribute Ai (i is the attribute counter) of the record that was read. For example, when the record is record 3032 and i = 1, Ein is set to 2, the number of elements of the first attribute "name" (S404).

Next, the anonymization processing function unit 103 inputs the j-th element Aij and the (j+1)-th element Ai(j+1) of the attribute Ai to the hash value generation unit 105, acquires a hash value, and sets it as the record hash value Hri. For example, when the record is record 3032, the attribute Ai is the attribute "name", and j = 1, the first element "741DC3" and the second element "Tom" of the attribute "name" are input to the hash value generation unit 105, and the resulting hash value is set as Hri (S405).

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j (S406). If Ein is greater than j (S406: Yes), the process of S407 is executed next; if Ein is less than or equal to j (S406: No), the process of S410 is executed next.

Next, the anonymization processing function unit 103 increments the value of j by 1 (S407).

Next, the anonymization processing function unit 103 inputs the j-th element Aij of the attribute Ai and the current value of Hri to the hash value generation unit 105, acquires a new hash value, and sets it as the new value of Hri (S408).

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j. If Ein is greater than j (S409: Yes), the process of S407 is executed next; if Ein is less than or equal to j (S409: No), the process of S410 is executed next.

Next, the anonymization processing function unit 103 increments the value of the variable i by 1 and then executes the process of S411 (S410).

Next, in S411, the anonymization processing function unit 103 compares the number of attributes n with the value of the variable i. If n is smaller than i (S411: No), the process of S404 is executed next; if n is greater than or equal to i (S411: Yes), the process of S412 is executed next.

Next, n is compared with k (k is an attribute counter). If n is greater than k (S412: Yes), S413 is executed next; if n is not greater than k (S412: No), S415 is executed next.

Next, the anonymization processing function unit 103 generates a hash value from the hash values Hrk and Hr(k+1) as inputs and sets it as the new Hr(k+1) (S413).

Next, k is incremented by 1 (S414), and the process returns to S412.

Next, the anonymization processing function unit 103 increments the value of the variable t (t is the record counter) by 1 and sets the values of the variables i and j to 1 (S415).

Next, the anonymization processing function unit 103 sets the value of the first element of the record to be deleted to "Delete_R", a label indicating that the record has been deleted, sets the value of the second element of the record to the current record hash value Hrn (note that, when n ≥ 2, this equals the value of Hr(k+1)), and sets the values of all third and subsequent elements of the record to "-" (Null), indicating no value; the process of S414 is then executed (S416). For example, when the record is record 3032 of FIG. 7, the result of this processing is record 3051 of FIG. 9, in which the first element is "Delete_R", the second element is the record hash value "22891f", and all third and subsequent elements are "-".

Finally, the anonymization processing function unit 103 compares the number of records N in Rd with the value of the variable t. If N is greater than or equal to t (S417: No), the processing from S403 onward is executed again; if N is less than t (S417: Yes), the process ends.
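The record deletion flow above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the helper `h` (SHA-256 over the two inputs joined by "|") is a hypothetical stand-in for the hash value generation unit 105, the list-of-lists record layout is assumed, each element is assumed to enter the chain exactly once, and the sample values other than "741DC3" and "Tom" are invented for illustration.

```python
import hashlib
from functools import reduce


def h(left: str, right: str) -> str:
    """Hypothetical stand-in for hash value generation unit 105 (assumption)."""
    return hashlib.sha256(f"{left}|{right}".encode("utf-8")).hexdigest()


def attribute_hash(elements: list[str]) -> str:
    """Chain one attribute's elements, random number first, into Hri (S404 to S409)."""
    return reduce(h, elements)


def record_hash(record: list[list[str]]) -> str:
    """Chain the per-attribute hashes Hr1..Hrn into the record hash Hrn (S412 to S414)."""
    return reduce(h, [attribute_hash(attr) for attr in record])


def delete_record(record: list[list[str]]) -> list[str]:
    """Replace a record with the label, its record hash, and '-' padding (S416)."""
    width = sum(len(attr) for attr in record)  # total number of elements in the record
    return ["Delete_R", record_hash(record)] + ["-"] * (width - 2)


# Hypothetical record: each attribute is [random number, value].
record_3032 = [["741DC3", "Tom"], ["9F2A10", "America"], ["C41B77", "male"]]
deleted = delete_record(record_3032)
```

The deleted record keeps the same number of elements as the original, so the table shape is unchanged while only the chained hash of the removed content remains.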

Next, the attribute deletion process will be described with reference to FIG. 15. This process corresponds to S304 in FIG. 13.

First, the anonymization processing function unit 103 of the anonymized data providing server 1 reads the anonymization condition for attribute deletion (S501). In the present embodiment, for example, it reads anonymization condition 2 (anonymization condition 2: delete the attribute "name"). The initial values of the variables i, j, and t are set to 1.

Next, the anonymization processing function unit 103 sets N to the number of records of the extended patient data, n to the number of attributes, Ad to the group of attributes to be deleted, and an to the number of attributes in Ad. For example, when the extended patient data is the data of the extended patient data table 303 shown in FIG. 7, N = 4, n = 3, the element of Ad is the attribute "name", and the number of attributes in Ad is an = 1 (S502).

Next, the anonymization processing function unit 103 reads the t-th record (t is the record counter) of the extended patient data (S503).

Next, the anonymization processing function unit 103 reads one attribute from Ad and denotes it Adi (i is the attribute counter) (S504).

Next, the anonymization processing function unit 103 sets Ein to the number of elements of the attribute Adi (S505).

Next, the anonymization processing function unit 103 inputs the j-th element Adij and the (j+1)-th element Adi(j+1) of the attribute Adi to the hash value generation unit 105, acquires a new hash value, and sets it as the attribute hash value Ha (S506). For example, when the record is record 3031 of FIG. 7, the attribute Adi is the attribute "name", and j = 1, the first element "5EF4BE" and the second element "Hitachi Taro" are input to the hash value generation unit 105, and the new hash value "465FC4" is acquired and set as Ha.

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j. If Ein is greater than j (S507: Yes), the process of S508 is executed next; if Ein is less than or equal to j (S507: No), the process of S510 is executed next.

Next, the anonymization processing function unit 103 increments the value of the variable j by 1 (S508).

Next, the anonymization processing function unit 103 inputs the j-th element Adij of the attribute Adi and Ha to the hash value generation unit 105, acquires a new hash value, and sets it as the new value of Ha (S509).

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j. If Ein is greater than j (S510: Yes), the process of S508 is executed next; if Ein is less than or equal to j (S510: No), the process of S511 is executed next.

Next, in S511, the anonymization processing function unit 103 sets the value of the first element of the attribute Adi of the record to "Delete_A", a label indicating that the attribute has been deleted, sets the value of the second element to the current attribute hash value Ha, and sets the values of the third and subsequent elements to "-" (Null) (S511). For example, when the record is record 3031 of FIG. 7 and the attribute to be deleted is "name", the result of this processing is as shown in record 3052 of FIG. 9: the value of "name 1", the first element of the attribute "name", is "Delete_A", and the value of "name 2", the second element, is "465FC4", the current value of the attribute hash value Ha. Because the attribute "name" has no third or subsequent element, the "-" value is not used.

Next, the anonymization processing function unit 103 increments the value of the variable i by 1 (S512).

Next, the anonymization processing function unit 103 compares the number of attributes an in Ad with the value of the variable i. If an is greater than or equal to i (S513: No), the process of S504 is executed next; if an is smaller than i (S513: Yes), the process of S514 is executed next.

Next, the anonymization processing function unit 103 increments the value of the variable t by 1 and sets the values of the variables i and j to 1 (S514).

Finally, the anonymization processing function unit 103 compares the number of records N of the extended patient data with the value of the variable t. If N is greater than or equal to t (S515: No), the processing from S503 onward is executed again; if N is less than t (S515: Yes), the attribute deletion process ends.
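As a rough illustration of the attribute deletion flow, the Python sketch below applies the same chained hashing to a single attribute in every record. The hash helper `h`, the dictionary-based record layout, and the "gender" values are assumptions made for this sketch; only "5EF4BE" and "Hitachi Taro" come from the example in the text, and the resulting hash will not match "465FC4" because the actual hash function of the embodiment is not reproduced here. Each attribute is assumed to hold at least two elements (a random number and a value).

```python
import hashlib
from functools import reduce


def h(left: str, right: str) -> str:
    """Hypothetical stand-in for hash value generation unit 105 (assumption)."""
    return hashlib.sha256(f"{left}|{right}".encode("utf-8")).hexdigest()


def delete_attribute(elements: list[str]) -> list[str]:
    """Label the attribute, store the chained hash Ha, pad the rest with '-' (S506 to S511)."""
    ha = reduce(h, elements)  # chain every element of the attribute into Ha
    return ["Delete_A", ha] + ["-"] * (len(elements) - 2)


def delete_attribute_in_table(table: list[dict], attr_name: str) -> list[dict]:
    """Apply the deletion to one attribute of every record (loop over t, S503 to S515)."""
    return [
        {name: delete_attribute(elems) if name == attr_name else elems
         for name, elems in record.items()}
        for record in table
    ]


# Hypothetical one-record table; each attribute is [random number, value].
table = [{"name": ["5EF4BE", "Hitachi Taro"], "gender": ["0A1B2C", "male"]}]
anonymized = delete_attribute_in_table(table, "name")
```

Attributes that are not listed for deletion are passed through unchanged, which mirrors the flow's iteration over only the attribute group Ad.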

Next, the attribute replacement process will be described with reference to FIG. 16. This process corresponds to S305 in FIG. 13.

First, the anonymization processing function unit 103 of the anonymized data providing server 1 reads the anonymization condition for attribute replacement (S601). In the present embodiment, for example, it reads anonymization condition 3 (anonymization condition 3: replace the "prefecture name" (address 2) of the attribute "address" with the "region name" (address 3)). The initial values of the variables i, j, and t are set to 1.

Next, the anonymization processing function unit 103 sets N to the number of records of the extended patient data, n to the number of attributes, Ar to the group of attributes to be replaced, and rn to the number of attributes in Ar. For example, when the extended patient data is the data of the extended patient data table 303, N = 4, n = 3, the element of Ar is the attribute "address", and the number of attributes in Ar is rn = 1 (S602).

Next, the anonymization processing function unit 103 reads the t-th record (t is the record counter) of the extended patient data (S603).

Next, the anonymization processing function unit 103 reads one attribute from Ar and denotes it Ari (i is the attribute counter) (S604).

Next, the anonymization processing function unit 103 sets Eir to the number of elements of the attribute Ari that are subject to replacement (S605). For example, when the attribute Ar is "address", Eir = 2 because the "prefecture name" (address 2) is to be replaced.

Next, the anonymization processing function unit 103 inputs the j-th element Arij (j is the element counter) and the (j+1)-th element Ari(j+1) of the attribute Ari to the hash value generation unit 105, acquires a new hash value, and sets it as the attribute hash value Ha (S606). For example, when the record is record 3031 of FIG. 7, the attribute Ari is the attribute "address", and j = 1, the value "A754B9" of the first element "address 1" and the value "Tokyo" of the second element "address 2" are input to the hash value generation unit 105, and the resulting new hash value "B0D8C7" is set as Ha.

Next, the anonymization processing function unit 103 compares Eir with the value of j + 1. If Eir is greater than j + 1 (S607: Yes), the process of S608 is executed next; if Eir is less than or equal to j (S607: No), the process of S611 is executed next.

Next, the anonymization processing function unit 103 increments the value of the variable j by 1 (S608).

Next, the anonymization processing function unit 103 inputs the j-th element Arij of the attribute Ari and Ha to the hash value generation unit 105, acquires a new hash value, and sets it as the new value of Ha (S609).

Next, the anonymization processing function unit 103 compares Eir with the value of the variable j. If Eir is greater than j + 1 (S610: Yes), the process of S608 is executed next; if Eir is less than or equal to j + 1 (S610: No), the process of S611 is executed next.

Next, the anonymization processing function unit 103 sets the value of the first element of the attribute Ari of the record to "Replace", a label indicating that the attribute has been replaced, and sets the value of the second element to the current attribute hash value Ha. If Eir is greater than 3, it sets the values of the third through Eir-th elements of the attribute Ari to "-" (Null). The process of S612 is then executed (S611). For example, when the record is record 3031 of FIG. 7 and the attribute to be replaced is "address", the result of this processing is as shown in record 3052 of FIG. 9: the value of "address 1", the first element of the attribute "address", is "Replace", and the value of "address 2", the second element, is "B0D8C7", the current value of the attribute hash value Ha.

Next, the anonymization processing function unit 103 increments the value of the variable i by 1 (S612).

Next, the anonymization processing function unit 103 compares the number of attributes rn in Ar with the value of the variable i (S613). If rn is greater than or equal to i (S613: No), the process of S604 is executed next; if rn is smaller than i (S613: Yes), the process of S614 is executed next.

Next, the anonymization processing function unit 103 increments the value of the variable t (t is the record counter) by 1 and sets the values of the variables i and j to 1 (S614).

Finally, the anonymization processing function unit 103 compares the number of records N of the extended patient data with the value of the variable t. If N is greater than or equal to t (S615: No), the processing from S603 onward is executed again; if N is less than t (S615: Yes), the attribute replacement process ends.
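The attribute replacement flow differs from deletion in that only the first Eir elements are folded into the hash, while the coarser candidate values after them stay readable. A minimal Python sketch follows, under these assumptions: `h` is a hypothetical stand-in for the hash value generation unit 105, the attribute is a flat list with the random number first, positions 3 through Eir are padded with "-", and "Kanto" is an assumed region name for the third element (FIG. 7 is not reproduced in this section).

```python
import hashlib
from functools import reduce


def h(left: str, right: str) -> str:
    """Hypothetical stand-in for hash value generation unit 105 (assumption)."""
    return hashlib.sha256(f"{left}|{right}".encode("utf-8")).hexdigest()


def replace_attribute(elements: list[str], eir: int) -> list[str]:
    """Hash the first `eir` elements into Ha and keep the coarser values (S606 to S611).

    `eir` is the number of leading elements subject to replacement and is
    assumed to be at least 2 (the random number plus one value).
    """
    ha = reduce(h, elements[:eir])   # chained hash of the replaced elements
    padding = ["-"] * (eir - 2)      # positions 3..Eir carry no value
    return ["Replace", ha] + padding + elements[eir:]


# Random number, prefecture name (address 2), assumed region name (address 3).
address = ["A754B9", "Tokyo", "Kanto"]
replaced = replace_attribute(address, eir=2)
```

With Eir = 2, the prefecture name is no longer visible, but the region name remains available to the data user.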

Next, the signature verification process will be described with reference to FIG. 17. This process corresponds to S09 in FIG. 10.

First, the signature verification processing unit 202 of the anonymized data user terminal 2 reads the anonymized data 305 and the signature value 311, which is the signature value for verification, both of which were stored in the main memory 502 or the hard disk drive 530 of the anonymized data user terminal 2 in S108 of FIG. 10 (S701, S702).

Next, the signature verification processing unit 202 inputs the anonymized data 305 to the signature generation unit 204 of the anonymized data user terminal 2 and acquires the generated signature value δ (S703).

Finally, the signature verification processing unit 202 compares the signature value 311, which is the signature value for verification, with δ. If the two values are identical (S704: Yes), the anonymized data is determined to be valid, and "OK" is displayed to the data user on the display device 510 or the like of the anonymized data user terminal 2 (S705). If the two values differ (S704: No), the anonymized data is determined to be invalid, for reasons such as falsification or a mix-up of data, "NG" is displayed to the data user on the display device 510 or the like of the anonymized data user terminal 2 (S706), and the process ends.
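The comparison in S703 to S706 can be sketched as below. The signature generation itself is described elsewhere in the specification, so it is passed in as a callable; `toy_signature` is only a hypothetical placeholder (a plain SHA-256 digest) used to exercise the comparison, and the function and parameter names are assumptions of this sketch.

```python
import hashlib
from typing import Callable


def verify_anonymized_data(
    anonymized_data: bytes,
    signature_311: str,
    generate_signature: Callable[[bytes], str],
) -> str:
    """Regenerate the signature value (delta) and compare it with the stored one."""
    delta = generate_signature(anonymized_data)       # S703
    return "OK" if delta == signature_311 else "NG"   # S704 to S706


def toy_signature(data: bytes) -> str:
    """Placeholder for signature generation unit 204 (assumption, not the real scheme)."""
    return hashlib.sha256(data).hexdigest()


data = b"Replace,B0D8C7,Kanto"
stored_signature = toy_signature(data)
result_valid = verify_anonymized_data(data, stored_signature, toy_signature)
result_tampered = verify_anonymized_data(data + b"x", stored_signature, toy_signature)
```

Passing the signature function as a parameter keeps the comparison logic independent of whichever signature scheme the terminal actually uses.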

As described above, in the anonymized data providing system of the embodiment, the values of replacement candidates for the target data are added to the target data in advance, and the data is then not simply deleted or replaced but is replaced with a hash value obtained by hashing, which is an intermediate step of the signature generation process. As a result, the validity of the anonymization can be verified by means of the signature value even after anonymization processing such as deletion or replacement has been applied.
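The reason verification survives anonymization can be illustrated with a small Python sketch. It assumes, without confirmation from this section, that signature generation chains each attribute's elements through the hash function and, when it meets an attribute labeled "Replace" or "Delete_A", resumes the chain from the stored hash value. The helper `h`, the function `attribute_digest`, and the values "Kanto" and "Kansai" are hypothetical.

```python
import hashlib
from functools import reduce


def h(left: str, right: str) -> str:
    """Hypothetical stand-in for hash value generation unit 105 (assumption)."""
    return hashlib.sha256(f"{left}|{right}".encode("utf-8")).hexdigest()


LABELS = ("Delete_A", "Replace")


def attribute_digest(elements: list[str]) -> str:
    """Chained hash of an attribute, resuming from the stored hash if anonymized."""
    if elements[0] in LABELS:
        # elements[1] already holds the intermediate hash of the hidden values.
        remaining = [e for e in elements[2:] if e != "-"]
        return reduce(h, remaining, elements[1])
    return reduce(h, elements)


original = ["A754B9", "Tokyo", "Kanto"]
anonymized = ["Replace", h("A754B9", "Tokyo"), "Kanto"]
tampered = ["Replace", h("A754B9", "Tokyo"), "Kansai"]

same_digest = attribute_digest(original) == attribute_digest(anonymized)
tamper_detected = attribute_digest(original) != attribute_digest(tampered)
```

Because the stored value is exactly the intermediate result that the original chain would have produced, the digest of the anonymized attribute matches that of the original, while a change to any remaining visible element alters it.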

In addition, in the hashing process, a random number is added to the original data, and hashing is then performed with the value of each attribute and the random number as input values. Because this makes the amount of computation required to identify the original data from a hash value enormous, the risk of the original data being restored from the anonymized data can be reduced.

Further, when random numbers are added to the original data, one random number is added to each attribute, and the algorithms shown in FIGS. 12A to 16 then perform hashing that includes the random number step by step within each attribute. The data size of the anonymized data can therefore be reduced compared with the case where a random number is added to every element of the original data.

Further, because the hashing process is identical to the intermediate processing of the signature generation process, the amount of signature generation processing required in the signature verification of the anonymized data can be reduced, and the signature verification process can be sped up.

As described above, according to the present embodiment, when anonymized data is used for research and development in the medical field and the like, the validity of the anonymization can be verified even after anonymization processing such as deletion or replacement of data has been applied. This makes it possible to avoid situations such as research results being invalidated by the use of improper anonymized data.

1…anonymized data providing server, 2…anonymized data user terminal, 3…network, 102…record extension function unit, 103…anonymization processing function unit, 105…hash value generation unit, 106…signature generation unit, 110…storage unit, 301…patient data table, 302…abstraction pattern group, 303…extended patient data table, 304…signature data table, 305…anonymized data, 201…browser function unit, 202…signature verification processing unit, 203…hash value generation unit, 204…signature generation unit, 210…storage unit, 311…signature value

Claims

An anonymization system that anonymizes secret information by means of an information processing device and provides it to users, comprising:
secret information storage means for storing secret information;
abstraction candidate group information storage means for storing abstraction candidate group information, which is a candidate group of information that abstracts the secret information;
extended secret data storage means for storing extended secret data in which the candidate group of abstracting information is added to the secret information;
anonymized data storage means for storing anonymized data in which part of the information in the secret information has been deleted or replaced;
extended secret data generation means for generating the extended secret data using the secret information and the abstraction candidate group information;
signature generation means for generating, using the extended secret data or the anonymized data, a digital signature having a hash value of the secret information as an intermediate value;
anonymization means for generating anonymized data using the extended secret data; and
anonymized data validity verification means for verifying the validity of given anonymized data,
wherein the extended secret data generation means generates the extended secret data by referring to the secret information stored by the secret information storage means and the candidate group of information that abstracts the secret information stored by the abstraction candidate group information storage means,
the anonymization means generates the anonymized data by executing a process of replacing the extended secret data with a hash value identical to the intermediate value of the signature generation by the signature generation means,
the signature generation means generates a first signature value from the extended secret data generated by the extended secret data generation means and a second signature value from the anonymized data generated by the anonymization means, and
the anonymized data validity verification means verifies the validity of the given anonymized data by comparing the first signature value with the second signature value.

The anonymization system according to claim 1, further comprising random number generation means for adding a random number to the extended secret data,
wherein the signature generation means generates the hash value serving as the intermediate value from the extended secret data to which the random number generated by the random number generation means has been added.

The anonymization system according to claim 2, wherein the random number generation means adds one random number to each attribute of the extended secret data, and
the signature generation means generates, for each attribute, the hash value serving as the intermediate value with the random number included in its input.

An anonymization method for anonymizing secret information by means of an information processing device and providing it to users, comprising:
a secret information storage step of storing secret information;
an abstraction candidate group information storage step of storing abstraction candidate group information, which is a candidate group of information that abstracts the secret information;
an extended secret data storage step of storing extended secret data in which the candidate group of abstracting information is added to the secret information;
an anonymized data storage step of storing anonymized data in which part of the information in the secret information has been deleted or replaced;
an extended secret data generation step of generating the extended secret data using the secret information and the abstraction candidate group information;
a signature generation step of generating, using the extended secret data or the anonymized data, a digital signature having a hash value of the secret information as an intermediate value;
an anonymization step of generating anonymized data using the extended secret data; and
an anonymized data validity verification step of verifying the validity of given anonymized data,
wherein, in the extended secret data generation step, the extended secret data is generated by referring to the secret information stored in the secret information storage step and the candidate group of information that abstracts the secret information stored in the abstraction candidate group information storage step,
in the anonymization step, the anonymized data is generated by replacing the extended secret data with a hash value identical to the intermediate value of the signature generation by the signature generation means,
in the signature generation step, a first signature value is generated from the extended secret data generated by the extended secret data generation means and a second signature value is generated from the anonymized data generated by the anonymization means, and
in the anonymized data validity verification step, the validity of the given anonymized data is verified by comparing the first signature value with the second signature value.

The anonymization method according to claim 4, further comprising a random number generation step of adding a random number to the extended secret data,
wherein, in the signature generation step, the hash value serving as the intermediate value is generated from the extended secret data to which the random number generated by the random number generation means has been added.

The anonymization method according to claim 5, wherein, in the random number generation step, one random number is added to each attribute of the extended secret data, and
the signature generation means generates, for each attribute, the hash value serving as the intermediate value with the random number included in its input.