JP2020077256A - Anonymization system and anonymization method

Info

Publication number: JP2020077256A
Application number: JP2018210777A
Authority: JP
Inventors: Hiroshige Fujiwara; Hisanobu Sato; Kenta Takahashi
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2018-11-08
Filing date: 2018-11-08
Publication date: 2020-05-21
Anticipated expiration: 2038-11-08
Also published as: WO2020095662A1; JP7100563B2

Abstract

PROBLEM TO BE SOLVED: To provide an anonymization system and an anonymization method that can verify the correctness of anonymization processing applied to data, even after anonymization operations such as deletion and substitution have been performed.
SOLUTION: An anonymized data providing server performs patient data record expansion processing (S01) and stores the result in an extended patient data table. The server then performs signature generation processing (S02), which generates a digital signature from the data in the extended patient data table. Through Web server registration (S03), the server lets an anonymized data user terminal acquire (S05) the generated signature value together with the name of the original patient data. On receiving an anonymized data acquisition request (S06), the server performs verifiable anonymization processing (S07), taking the patient data name and the anonymization conditions as input, generates anonymized data, and lets the anonymized data user terminal acquire it (S08). The anonymized data user terminal performs signature verification processing (S09) using the anonymized data and the signature value to verify the validity of the anonymized data.
SELECTED DRAWING: Figure 10

Description

The present invention relates to an anonymization system and an anonymization method, and in particular to an anonymization system and an anonymization method suitable for detecting tampering and verifying the validity of provided information when personal information such as medical information is anonymized and provided for use in medical research and the like.

In recent years, following the full enforcement of the amended Act on the Protection of Personal Information in May 2017, the use of anonymously processed information, premised on appropriate protection of personal information, has been advancing. In addition, the Next-Generation Medical Infrastructure Act came into force in May 2018. It establishes a framework under which citizens' medical information is anonymously processed so that it can be used for research and development at universities, pharmaceutical companies, and the like. Under these regulations, anonymized data is becoming usable for research and development in the medical field.

In the medical field, medical data used for research undergoes a verification of validity called "validation". Similar verification of validity is expected to become an issue for anonymization processing as well.

A technique for anonymizing and providing personal information such as clinical information is disclosed, for example, in Patent Document 1. According to the information management system described in Patent Document 1, after subject information (personal information) such as clinical information has been anonymized, the owner of the subject information or a person with viewing authority can identify the information accumulated in association with the anonymized information.

Patent Document 1: International Publication No. 2008/069011

In Patent Document 1, the validity guarantee covers only part of the original data (an identifier or a combination of quasi-identifiers), so the validity of anonymization cannot be guaranteed for the data as a whole.

Digital signatures are a common technique for guaranteeing the validity of data. However, if a digital signature is simply applied, validity can no longer be verified once anonymization processing such as deletion or replacement has been applied to the data. If invalid anonymized data is used, research results based on it may in turn become invalid.

An object of the present invention is to provide an anonymization system and an anonymization method with which the validity of anonymization processing applied to data can be verified even after anonymization operations such as deletion and replacement have been performed.

The anonymization system of the present invention is preferably a system in which an information processing device anonymizes secret information and provides it to a user. The system comprises: secret information storage means for storing the secret information; abstraction candidate group information storage means for storing a candidate group of information for abstracting the secret information; extended secret data storage means for storing extended secret data obtained by adding the candidate group of abstracting information to the secret information; anonymized data storage means for storing anonymized data obtained by deleting or replacing part of the secret information; extended secret data generation means for generating the extended secret data from the secret information and the abstraction candidate group information; signature generation means for generating, from the extended secret data or the anonymized data, a digital signature that uses hash values of the secret information as intermediate values; anonymization means for generating the anonymized data from the extended secret data; and anonymized data validity verification means for verifying the validity of given anonymized data.

The extended secret data generation means generates the extended secret data by referring to the secret information stored by the secret information storage means and to the candidate group of abstracting information stored by the abstraction candidate group information storage means. The anonymization means generates the anonymized data by replacing parts of the extended secret data with hash values identical to the intermediate values used in signature generation by the signature generation means. The signature generation means generates a first signature value from the extended secret data generated by the extended secret data generation means, and a second signature value from the anonymized data generated by the anonymization means. The anonymized data validity verification means verifies the validity of the given anonymized data by comparing the first signature value with the second signature value.

According to the present invention, it is possible to provide an anonymization system and an anonymization method with which the validity of anonymization processing applied to data can be verified even after anonymization operations such as deletion and replacement have been performed.

FIG. 1 is an overall configuration diagram of the anonymization system.
FIG. 2 is a functional configuration diagram of the anonymized data providing server.
FIG. 3 is a functional configuration diagram of the anonymized data user terminal.
FIG. 4 is a hardware and software configuration diagram of the anonymized data providing server and the anonymized data user terminal.
FIG. 5 is a diagram showing an example of the data structure of the patient data table.
FIG. 6 is a diagram showing an example of the data structure of the abstraction pattern group.
FIG. 7 is a diagram showing an example of the data structure of the extended patient data table.
FIG. 8 is a diagram showing an example of the data structure of the signature data table.
FIG. 9 is a diagram showing an example of the data structure of the anonymized data.
FIG. 10 is a sequence diagram showing the series of processes from when a user acquires anonymized data from the anonymized data providing server until the user verifies it.
FIG. 11 is a flowchart showing the patient data record expansion processing.
FIG. 12A is a flowchart showing the signature generation processing (part 1).
FIG. 12B is a flowchart showing the signature generation processing (part 2).
FIG. 13 is a flowchart showing the verifiable anonymization processing.
FIG. 14 is a flowchart showing the record deletion processing.
FIG. 15 is a flowchart showing the attribute deletion processing.
FIG. 16 is a flowchart showing the attribute replacement processing.
FIG. 17 is a flowchart showing the signature verification processing.

An embodiment of the present invention will be described below with reference to FIGS. 1 to 17.
The present embodiment describes an example in which a hospital or the like anonymizes patients' medical information and provides it for use in medical research, statistical materials, and the like.
First, the configuration of the anonymization system will be described with reference to FIGS. 1 to 4.

First, the overall configuration of the anonymization system will be described with reference to FIG. 1.
The anonymized data providing system is a system with which a data owner (data holder) holding information that includes personal information anonymizes that information and provides it to data users. As shown in FIG. 1, the anonymized data providing system consists of an anonymized data providing server 1 and an anonymized data user terminal 2, connected by a network 3.

The anonymized data providing server 1 is a server on which the data holder stores personal information and which performs the anonymization processing for provision. The anonymized data user terminal 2 is a client terminal on which a data user downloads anonymized data and verifies its validity. The network 3 may be a global network such as the Internet, or a LAN (Local Area Network) installed on the premises.

Next, the functional configuration of the anonymized data providing server will be described with reference to FIG. 2.
As shown in FIG. 2, the anonymized data providing server 1 comprises a Web server function unit 101, a record expansion function unit 102, an anonymization processing function unit 103, a random number generation unit 104, a hash value generation unit 105, a signature generation unit 106, and a storage unit 110.

The Web server function unit 101 publishes patient data names and digital signature values to data users on a Web page. The record expansion function unit 102 performs preprocessing that expands the patient data prior to anonymization. The anonymization processing function unit 103 performs verifiable anonymization processing. The random number generation unit 104 generates the random numbers that are added to strengthen the security of the hash values. The hash value generation unit 105 generates hash values using a one-way function or the like. The signature generation unit 106 generates a digital signature from the extended patient data. The storage unit 110 stores the data used by the anonymized data providing server 1.

The storage unit 110 stores a patient data table 301, an abstraction pattern group 302, an extended patient data table 303, a signature data table 304, and anonymized data 305.

The patient data table 301 stores patient data that includes personal information. The abstraction pattern group 302 holds the group of patterns used for replacement when patient data is anonymized. The extended patient data table 303 stores patient data that has undergone expansion processing as preprocessing for anonymization. The signature data table 304 stores pairs consisting of a patient data name and the digital signature value of the extended patient data for that patient data. The anonymized data 305 is patient data that has been anonymized.
The specific structure of each of these is described in detail later.

Next, the functional configuration of the anonymized data user terminal will be described with reference to FIG. 3.
As shown in FIG. 3, the anonymized data user terminal 2 comprises a Web browser function unit 201, a signature verification processing unit 202, a hash value generation unit 203, a signature generation unit 204, and a storage unit 210. The Web browser function unit 201 performs the processing for viewing Web pages. The signature verification processing unit 202 compares the signature value acquired from the Web page with the signature value of the received anonymized data to verify the validity of the anonymization. The hash value generation unit 203 generates hash values using a one-way function or the like. The signature generation unit 204 has the same function as the signature generation unit 106 of the anonymized data providing server 1 and generates a digital signature from the anonymized data. The storage unit 210 stores the data used by the anonymized data user terminal 2.

The storage unit 210 stores anonymized data 305 and signature values 311. The anonymized data 305 is patient data that has been anonymized, received from the anonymized data providing server 1 via the network 3. The signature values 311 are the digital signature value for verification acquired from the Web page and the digital signature value generated from the anonymized data 305.

Next, the hardware and software configurations of the anonymized data providing server and the anonymized data user terminal will be described with reference to FIG. 4.
The anonymized data providing server 1 is realized, for example, by a general information processing device such as the server device shown in FIG. 4. It may also be a virtual machine built on a physical computer.

In the anonymized data providing server 1, a CPU (Central Processing Unit) 401, a main memory 402, a network interface 403, a display device 410, and an input device 420 are connected by a bus.

The CPU 401 controls each unit of the anonymized data providing server 1, loads the necessary programs into the main memory 402, and executes them.

The main memory 402 is usually composed of volatile memory such as RAM, and stores the programs executed by the CPU 401 and the data they refer to.

The network interface 403 is an interface for connecting to the network 3.

The display device 410 is a device that displays information, such as an LCD (Liquid Crystal Display).

The input device 420 is a device for entering information such as commands and data and for entering input that controls the device, for example a keyboard or a pointing device such as a mouse.

The hard disk drive (HDD) 430 has a large storage capacity and stores the programs for carrying out the present embodiment. A Web server function program 601, a record expansion function program 602, an anonymization processing function program 603, a random number generation program 604, a hash value generation program 605, and a signature generation program 606 are installed on the hard disk drive 430 of the anonymized data providing server 1. These programs implement the functions of the Web server function unit 101, the record expansion function unit 102, the anonymization processing function unit 103, the random number generation unit 104, the hash value generation unit 105, and the signature generation unit 106, respectively.

The hard disk drive 430 also stores the patient data table 301, the abstraction pattern group 302, the extended patient data table 303, the signature data table 304, and the anonymized data 305.

The anonymized data user terminal 2 is realized, for example, by a general information processing device such as the personal computer shown in FIG. 4. It may also be a smartphone or a dedicated terminal.

The hardware components of the anonymized data user terminal 2 are the same as those of the anonymized data providing server 1.

A browser function program 701, a signature verification processing program 702, a hash value generation program 703, and a signature generation program 704 are installed on the hard disk drive 530 of the anonymized data user terminal 2. These programs implement the functions of the Web browser function unit 201, the signature verification processing unit 202, the hash value generation unit 203, and the signature generation unit 204, respectively.

The hard disk drive 530 also stores the anonymized data 305 and the signature values 311.

Next, the data structures used in the anonymization system will be described with reference to FIGS. 5 to 9.
The patient data table 301 stores patients' personal information and related data and, as shown in FIG. 5, consists of the columns "name", "address (prefecture name)", and "sex". For example, the record 3011 indicates that a certain patient's name is "Hitachi Taro", the address is "Tokyo", and the sex is "male".

The abstraction pattern group 302 is the data used when abstracting the attributes represented by the columns of the patient data table 301. As shown in FIG. 6(a), it is expressed, for example, as tree-structured data called a generalization hierarchy tree 302a.

In the generalization hierarchy tree 302a, the leaf nodes hold the least abstract values, the root node holds the most abstract value, and the intermediate nodes hold values whose degree of abstraction increases from the leaf side toward the root. For example, in the generalization hierarchy tree 302a for the attribute "address" shown in FIG. 6(a), the leaf nodes are prefecture names such as "Tokyo" and "Kanagawa", the intermediate nodes are more abstract region names such as "Kanto region" and "Kinki region", and the root node is the country name "Japan", the most abstract of these values.

The abstraction pattern group 302 may instead be expressed in the form of the abstraction pattern table 302b shown in FIG. 6(b). For example, the abstraction pattern table 302b for the attribute "address" is table-structured data consisting of the columns "address" and "abstraction pattern". For each prefecture name that is a current value of the "address" column, the abstraction pattern table 302b associates the more abstract region name and country name values that are candidates for replacement in anonymization processing. For example, the abstraction patterns for the "address" value "Tokyo" indicate that the region name "Kanto region" and the country name "Japan" are the more abstract candidates for replacement in anonymization processing.
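The relationship between the tree form and the table form can be sketched as follows. This is a minimal illustration, not the data layout of the embodiment: the parent-pointer dictionary `PARENT`, the function `abstraction_patterns`, and the "Osaka" entry are assumptions introduced here, and only "Tokyo", "Kanagawa", "Kanto region", "Kinki region", and "Japan" come from the description.

```python
# Hypothetical parent pointers (child -> parent) for the "address"
# generalization hierarchy. "Osaka" is an illustrative extra leaf.
PARENT = {
    "Tokyo": "Kanto region",
    "Kanagawa": "Kanto region",
    "Osaka": "Kinki region",
    "Kanto region": "Japan",
    "Kinki region": "Japan",
}


def abstraction_patterns(value):
    """Return the ancestors of `value`, ordered from least to most abstract."""
    patterns = []
    while value in PARENT:
        value = PARENT[value]
        patterns.append(value)
    return patterns


# Leaves are nodes that are never a parent of another node.
LEAVES = sorted(set(PARENT) - set(PARENT.values()))

# Flatten the tree into an abstraction pattern table (leaf -> candidates).
PATTERN_TABLE = {leaf: abstraction_patterns(leaf) for leaf in LEAVES}
```

Under these assumptions, each row of `PATTERN_TABLE` plays the role of one row of the abstraction pattern table: the key is the current "address" value and the list holds its replacement candidates.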

The extended patient data table 303 is obtained by applying, to the patient data in the patient data table 301, preprocessing that makes verifiable anonymization possible. Its data structure adds to the patient data each value of the abstraction pattern group 302 that is a replacement candidate when anonymization by replacement is performed, and also adds random numbers that strengthen the security of the hashing performed in place of deletion or replacement. In the example shown in FIG. 7, the values "address 3" and "address 4", which are abstraction patterns for the attribute "address", are added to the patient data table 301, and the random number values "name 1", "address 1", and "sex 1" are added for the respective attributes to strengthen the security of hashing.

This data structure, which adds one random number per attribute, combined with the algorithm described later that generates hash values stepwise for each attribute, reduces the data size compared with assigning a random number to every element, while still extending the security benefit of randomized hashing to the entire data.
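A record expansion of this kind might look like the following sketch. The nested dictionary layout, the function name `expand_record`, and the 16-byte random length are assumptions made for illustration; the description only states that abstraction candidates and one random number per attribute are added.

```python
import secrets

# Hypothetical abstraction pattern table: attribute -> value -> candidates.
PATTERNS = {"address": {"Tokyo": ["Kanto region", "Japan"]}}


def expand_record(record, patterns=PATTERNS, rand_bytes=16):
    """Add one random number per attribute plus any abstraction candidates."""
    expanded = {}
    for attr, value in record.items():
        candidates = patterns.get(attr, {}).get(value, [])
        expanded[attr] = {
            # One random value per attribute, not one per element.
            "random": secrets.token_hex(rand_bytes),
            # Original value first, then increasingly abstract candidates.
            "values": [value, *candidates],
        }
    return expanded


example = expand_record({"name": "Hitachi Taro", "address": "Tokyo", "sex": "male"})
```

Here the "address" attribute carries three values and a single random number, which is the size saving the paragraph above refers to: three values share one random number instead of each having its own.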

The signature data table 304 holds the signature values for the patient data and, as shown in FIG. 8, consists of the columns "target data" and "signature value". The "target data" column stores the patient data name, and the "signature value" column stores the signature value generated for each set of patient data by the signature generation processing.

In the present embodiment, the data in the signature data table 304 is read by the Web server function unit 101 of the anonymized data providing server 1 and made available as a Web page, so that it can be viewed and downloaded from the anonymized data user terminal 2 via the network 3 using a Web browser or the like.

The anonymized data 305 is the result of executing anonymization processing on the extended patient data table 303. FIG. 9 shows an example of the result of anonymizing the data of the extended patient data table 303 shown in FIG. 7. For example, the record 3051, which has been deleted, consists of the label "Delete_R" indicating a deleted record, the record hash value "22891F" generated by the hashing performed in place of deletion, and "-" (Null) indicating empty values. The record 3052, which has undergone attribute deletion and attribute replacement, consists of the label "Delete_A" indicating that the attribute "name" has been deleted, the attribute hash value "465FC4" generated by the hashing performed in place of attribute deletion, the label "Replace" indicating that the attribute "address" has been replaced, and the attribute hash value "B0D8C7" generated by the hashing performed in place of attribute replacement.
Record deletion, attribute deletion, and attribute replacement are each described in detail later.

These hash values are intermediate values of the signature generation processing. With this data structure, the anonymized data user terminal 2 has less data to process when it regenerates the signature during signature verification, so signature verification can be performed faster.
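Why substituting intermediate hash values leaves the final digest unchanged can be illustrated with the sketch below. The concrete scheme is an assumption, since the signature generation algorithm is only described later in the document: here each attribute is hashed as a chain starting from its random number, attribute hashes are combined into a record hash, record hashes are combined into a table digest, and SHA-256 stands in for the unspecified hash function. The actual signing of the digest is omitted, and the tuple layouts and the second patient record are hypothetical.

```python
import hashlib


def h(*parts):
    """SHA-256 over length-prefixed UTF-8 parts, returned as hex."""
    m = hashlib.sha256()
    for p in parts:
        b = p.encode("utf-8")
        m.update(len(b).to_bytes(4, "big") + b)
    return m.hexdigest()


def cell_hash(cell):
    kind = cell[0]
    if kind == "Delete_A":        # ("Delete_A", attribute_hash)
        return cell[1]
    if kind == "Replace":         # ("Replace", intermediate_hash, remaining_values)
        digest, values = cell[1], cell[2]
    else:                         # ("Plain", random, values)
        digest, values = h(cell[1], cell[2][0]), cell[2][1:]
    for v in values:              # continue the chain over more abstract values
        digest = h(digest, v)
    return digest


def record_hash(record):
    if record[0] == "Delete_R":   # ("Delete_R", record_hash)
        return record[1]
    return h(*(cell_hash(c) for c in record))


def table_digest(records):
    return h(*(record_hash(r) for r in records))


extended = [
    [("Plain", "r1", ["Hitachi Taro"]),
     ("Plain", "r2", ["Tokyo", "Kanto region", "Japan"]),
     ("Plain", "r3", ["male"])],
    [("Plain", "r4", ["Jane Doe"]),
     ("Plain", "r5", ["Paris", "Ile-de-France", "France"]),
     ("Plain", "r6", ["female"])],
]

anonymized = [
    [("Delete_A", cell_hash(extended[0][0])),                  # name removed
     ("Replace", h("r2", "Tokyo"), ["Kanto region", "Japan"]), # prefecture hidden
     extended[0][2]],
    ("Delete_R", record_hash(extended[1])),                    # whole record removed
]

tampered = [
    [anonymized[0][0],
     ("Replace", h("r2", "Tokyo"), ["Kinki region", "Japan"]),
     anonymized[0][2]],
    anonymized[1],
]
```

In this sketch `table_digest(anonymized)` equals `table_digest(extended)`, while `table_digest(tampered)` does not, and the verifier skips every hashing step that the supplied intermediate values already cover. A deterministic signature computed over the digest would therefore also match, which is the comparison the verification relies on.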

Next, the processing of the anonymization system will be described with reference to FIGS. 10 to 17.

First, the series of processes from when a user acquires anonymized data from the anonymized data providing server until the user verifies it will be described with reference to FIG. 10.
First, the record expansion function unit 102 of the anonymized data providing server 1 takes the patient data table 301 and the abstraction pattern group 302 as input, performs patient data record expansion processing as preprocessing for anonymization, and stores the result in the extended patient data table 303 (S01). The patient data record expansion processing is described in detail later with reference to FIG. 11.

次に、匿名化データ提供サーバ１の署名生成部１０６は、拡張患者データテーブル３０３のデータを入力とし、デジタル署名を生成する署名生成処理を行い、生成した署名値を元となった患者データ名とともに署名データテーブル３０４に格納する（Ｓ０２）。なお、署名生成処理は、後に、図１２Ａおよび図１２Ｂを用いて詳説する。 Next, the signature generation unit 106 of the anonymized data providing server 1 receives the data of the extended patient data table 303 as input, performs a signature generation process of generating a digital signature, and uses the generated signature value as the source of the patient data name. It is also stored in the signature data table 304 (S02). The signature generation process will be described later in detail with reference to FIGS. 12A and 12B.

Next, the Web server function unit 101 of the anonymized data providing server 1 reads the patient data names and signature values stored in the signature data table 304, generates a Web page accessible from the network 3, and notifies the Web browser function of the anonymized data user terminal 2 of the URL (Uniform Resource Locator) of that Web page (S03).

Next, the Web browser function unit 201 of the anonymized data user terminal 2 acquires the Web page containing the patient data names and signature values, and displays a list of the patient data names and signature values to the data user on the display device 510 such as a display (S04).

Next, the data user of the anonymized data operates the input device 520, such as a mouse, of the anonymized data user terminal 2, downloads from the Web page the signature value corresponding to the patient data name of the patient data to be used, and saves it in the main memory 502 or the hard disk drive 530 of the anonymized data user terminal 2 as the signature value 311 for verification (S05).

Next, the data user inputs the name of the patient data to be used and the anonymization conditions into the Web browser function unit 201 of the anonymized data user terminal 2. In this embodiment, the following three anonymization conditions are assumed to be input.

Anonymization condition 1: Delete records whose attribute "address" is other than "Japan"
Anonymization condition 2: Delete the attribute "name"
Anonymization condition 3: Replace the attribute "address" from "prefecture name" with "region name"

The Web browser function unit 201 then sends an anonymized data acquisition request, consisting of the patient data name and the anonymization conditions, to the Web server function unit 101 of the anonymized data providing server 1 (S06).

Next, the Web server function unit 101 of the anonymized data providing server 1 passes the received patient data name and anonymization conditions to the anonymization processing function unit 103. The anonymization processing function unit 103 executes the verifiable anonymization processing with the patient data name and the anonymization conditions as input, generates the anonymized data 305, and sends it to the Web server function unit 101. The Web server function unit 101 then registers the anonymized data 305 on a download Web page for data users and notifies the Web browser function of the anonymized data user terminal 2 of its URL (S07). The verifiable anonymization processing will be described in detail later with reference to FIG. 13.

Next, the Web browser function unit 201 of the anonymized data user terminal 2 accesses the download Web page for data users using the notified URL, downloads the anonymized data, and stores it as the anonymized data 305 in the main memory 502 or the hard disk drive 530 of the anonymized data user terminal 2 (S08).

Finally, the signature verification processing unit 202 of the anonymized data user terminal 2 takes the anonymized data 305 and the signature value 311 for verification as input and executes signature verification processing to verify the validity of the anonymization. It displays "OK" to the data user on the display device 510, such as a display, of the anonymized data user terminal 2 if the validity is confirmed, and "NG" if it is not (S09). The signature verification processing will be described in detail later with reference to FIG. 17.

Through the above series of processes, from acquiring the anonymized data from the anonymized data providing server to verifying it, the data user can confirm whether the obtained anonymized data was anonymized by legitimate anonymization processing. In the medical field, for example, this makes it possible to verify the validity of the anonymization of anonymized data on patients, subjects, and so on used in research, and thus to avoid research results being distorted by improperly anonymized data.

Next, the patient data record expansion processing will be described with reference to FIG. 11.
This processing corresponds to S01 of FIG. 10.

First, the record expansion function unit 102 of the anonymized data providing server 1 counts the records in the patient data table and sets the count as the number of records N (S101). The initial value of the variable m (m is the record counter) used in the processing shown in FIG. 11 is 1.

Next, the record expansion function unit 102 reads one record from the patient data table (S102).

For example, if the record read is the record 3011 shown in FIG. 5, the element of the attribute "name" is "Hitachi Taro", the element of the attribute "address" is "Tokyo", and the element of the attribute "sex" is "male".

Next, the record expansion function unit 102 reads the abstraction patterns for the record from the abstraction pattern group 302 (S103). For example, to read the abstraction pattern for "Tokyo" in the attribute "address" of the record from the abstraction pattern table 302b for the attribute "address" shown in FIG. 6(b), it reads "Kanto region" and "Japan" from the record 3021 of FIG. 6(b), whose "address" element is "Tokyo".

Next, the record expansion function unit 102 adds each element of the acquired abstraction pattern to the record (S104). For example, if the record is the record 3011 of FIG. 5, it adds "Kanto region" and "Japan", the elements read from the abstraction pattern record 3021 in S103. This produces a record that contains the candidate replacement values for anonymization by replacement.

Next, the record expansion function unit 102 adds to each attribute of the record one random number obtained from the random number generation unit 104 (S105). For example, if the record is the record 3011 of FIG. 5, it obtains three different random numbers for the three attributes and adds them to the record. Assigning one random number per attribute in this way keeps the record smaller than assigning a random number to every element (column).

Next, the record expansion function unit 102 outputs the record, now carrying the replacement candidates and the random numbers, as a record of the extended patient data table 303 (S106). For example, if the record is the record 3011 of FIG. 5, the record generated in S103 to S105 is as shown at 3031 in FIG. 7: the attribute "name 1", which holds the random number added to the attribute "name", has the value "5EF4BE"; the attribute "name 2", which holds the original value of "name", has the value "Hitachi Taro"; the attribute "address 1", which holds the random number added to the attribute "address", has the value "A754B9"; the attribute "address 2", which holds the original value of "address", has the value "Tokyo"; the attribute "address 3", which holds the first element of the abstraction pattern added to "address", has the value "Kanto region"; the attribute "address 4", which holds the second element of that abstraction pattern, has the value "Japan"; the attribute "sex 1", which holds the random number added to the attribute "sex", has the value "770E67"; and the attribute "sex 2", which holds the original value of "sex", has the value "male".

Next, the record expansion function unit 102 increments the value of the variable m by 1 (S107).

Finally, the record expansion function unit 102 compares the value of the variable m with the number of patient data records N. If m is N or less (S108: No), it executes the processing of S102 to S107 again; if m is greater than N (S108: Yes), it ends the processing.
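The record expansion loop above can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary layout, the function names `expand_record` and `expand_table`, and the use of `secrets.token_hex` as a stand-in for the random number generation unit 104 are all hypothetical and do not come from the specification.

```python
import secrets

# Hypothetical stand-in for the abstraction pattern group 302:
# attribute -> original value -> progressively coarser values.
ABSTRACTION_PATTERNS = {
    "address": {"Tokyo": ["Kanto region", "Japan"]},
}


def new_random() -> str:
    """Stand-in for the random number generation unit 104 (6 hex digits, assumed)."""
    return secrets.token_hex(3).upper()


def expand_record(record, patterns=ABSTRACTION_PATTERNS):
    """Return {attribute: [random number, original value, *abstractions]}.

    Mirrors S103 to S105: look up the abstraction pattern, append its
    elements, and prepend one random number per attribute.
    """
    expanded = {}
    for attribute, value in record.items():
        abstractions = patterns.get(attribute, {}).get(value, [])
        expanded[attribute] = [new_random(), value, *abstractions]
    return expanded


def expand_table(table):
    """Expand every record in turn (the loop of S102 to S108)."""
    return [expand_record(record) for record in table]


record_3011 = {"name": "Hitachi Taro", "address": "Tokyo", "sex": "male"}
expanded = expand_record(record_3011)
```

In this sketch the attribute "address" ends up with four elements (random number, "Tokyo", "Kanto region", "Japan"), while "name" and "sex" each have two, matching the shape of the record 3031 described above.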

Next, the signature generation processing will be described with reference to FIGS. 12A and 12B.
This processing corresponds to S02 and S09 of FIG. 10 and is common to the signature generation unit 106 of the anonymized data providing server 1 and the signature generation unit 204 of the anonymized data user terminal 2. In this embodiment, the signature generation unit 106 of the anonymized data providing server 1 takes the extended patient data as input and outputs a digital signature value, whereas the signature generation unit 204 of the anonymized data user terminal 2 takes the anonymized data as input and outputs a digital signature value.

In the following, the signature generation processing is assumed to be performed by the signature generation unit 106 of the anonymized data providing server 1, but the processing in the signature generation unit 204 of the anonymized data user terminal 2 is the same.

First, the signature generation unit 106 obtains the number of records N and the number of attributes n from the target data (S201). The initial values of the variables i, j, t, q, k, l, and m used in this flowchart are all 1.

Next, the signature generation unit 106 reads one record from the target data (S202). For example, if the target data is the extended patient data table 303 of FIG. 7, the record read is the record 3031. If the target data is the anonymized data 305 of FIG. 9, the record read is the record 3052.

Next, if the first element of the record read is "Delete_R", which indicates that the record has been deleted (S203: Yes), the signature generation unit 106 executes the process of S204; otherwise (S203: No), it executes the process of S205 (S203). For example, if the record is the record 3051 of the anonymized data 305, its first element is "Delete_R", so the process of S204 is executed next. If the record is the record 3031 of the extended patient data table 303, its first element is "5EF4BE", so the process of S205 is executed next.

Next, the signature generation unit 106 sets the value of the second element of the record as the hash value Hri of the i-th record (i is the attribute counter), executes the process of S215 (S204), and then executes the process of S216. For example, if the record is the record 3051 of the anonymized data 305, the value of its second element, "22891F", becomes the value of Hri.

Next, the signature generation unit 106 obtains the number of elements of the i-th attribute Ai of the record and sets it as Ein (S205). For example, if the first attribute "name" of the record 3031 of the extended patient data table 303 is A1, it consists of the two expanded attributes "name 1" and "name 2", so the value of Ein is "2".

Next, if the first element of the attribute Ai is "Delete_A" (S206: "Delete_A"), the signature generation unit 106 executes the process of S207; if it is "Replace" (S206: "Replace"), it executes the process of S208; if it is neither (S206: Otherwise), it executes the process of S209 (S206). For example, if the record is the record 3052 of the anonymized data 305 and the attribute Ai is "name", the first element is "Delete_A" in the attribute "name 1", so the process of S207 is executed next. If the record is the record 3052 of the anonymized data 305 and the attribute Ai is "address", the first element is "Replace" in the attribute "address 1", so the process of S208 is executed next. If the record is the record 3031 of the extended patient data table 303 and the attribute Ai is "name", the first element is "5EF4BE" in the attribute "name 1", so the process of S209 is executed next.

Next, the signature generation unit 106 sets the value of the second element of the attribute Ai as the hash value Hri (S207), and then executes the process of S216. For example, if the record is the record 3052 of the anonymized data 305 and the attribute Ai is "name", "465FC4", the value of the second element (the attribute "name 2"), becomes the value of Hri.

Next, the signature generation unit 106 sets the value of the second element of the attribute Ai as the hash value Hri, and then executes the process of S210 (S208). For example, if the record is the record 3052 of the anonymized data 305 and the attribute Ai is "address", "B0D8C7", the value of the second element (the attribute "address 2"), becomes the value of Hri.

Next, the signature generation unit 106 increments the value of the variable j (j is the element counter) by 1 (S210).

Next, the signature generation unit 106 takes the j-th element Aij and the (j+1)-th element Ai(j+1) of the attribute Ai as input, obtains a hash value from the hash value generation unit 105, and sets it as the hash value Hri of the record (S209). For example, if the record is the record 3031 of the extended patient data table 303, the attribute Ai is "name", i = 1, and j = 1, the j-th element of the attribute Ai is "5EF4BE", the value of the attribute "name 1", and the (j+1)-th element is "Hitachi Taro", the value of the attribute "name 2". The hash value "A8E0C2" obtained by inputting these two values to the hash value generation unit 105 becomes the value of Hri.

Next, if the value of the j-th element of the attribute Ai is "-" (Null) (S211: Yes), the signature generation unit 106 executes the process of S213; otherwise (S211: No), it executes the process of S212. For example, if the record is the record 3051 of the anonymized data 305, the attribute Ai is "address", and the j-th element is the value "-" of the attribute "address 1", the process of S213 is executed next.

Next, the signature generation unit 106 increments the value of the variable j by 1, and then executes the process of S213 (S212).

Next, the signature generation unit 106 compares the number of elements Ein of the attribute Ai with the value of the variable j. If Ein is greater than j (S213: Yes), it executes the process of S216; if Ein is j or less (S213: No), it executes the process of S214.

Next, the signature generation unit 106 takes the j-th element Aij of the attribute Ai and the current hash value Hri of the record as input, obtains a new hash value from the hash value generation unit 105, and sets it as the new Hri (S214). For example, suppose the record is the record 3031 of the extended patient data table 303, the attribute Ai is "address", and j = 3, so that the third element of the attribute Ai is "Kanto region", the value of the attribute "address 3". If the value of Hri at that point is "B7EF14", then "7C648B", obtained by inputting "Kanto region" and "B7EF14" to the hash value generation unit 105, becomes the new value of Hri.

Next, the signature generation unit 106 increments the value of the variable j by 1, and then executes the process of S211 (S215).

Next, the signature generation unit 106 increments the value of the variable i by 1, and then executes the process of S217 (S216).

Next, the signature generation unit 106 compares the number of attributes n with the variable i. If i is n or less (S217: Yes), it executes the process of S203; if i is greater than n (S217: No), it executes the process of S218.

Next, n is compared with q (q is the attribute counter). If n is greater than q (S218: Yes), S219 is executed next; if n is q or less (S218: No), S221 is executed next.

Next, the signature generation unit 106 takes the hash value Hrq and the hash value Hr(q+1) as input, generates a hash value, and sets it as the new Hr(q+1) (S219).

Next, q is incremented by 1 (S220), and the processing returns to S218.

Next, the signature generation unit 106 assigns the value of the hash value Hrn to the hash value HRt (t is the record counter) (S221). Note that when n ≥ 2, Hrn is equal to the value of Hr(q+1).

Next, the signature generation unit 106 increments the value of the variable t by 1, resets the variables i and j to 1, and then executes the process of S223 (S222).

Next, the signature generation unit 106 compares the number of records N of the target data with the value of the variable t. If t is N or less (S223: No), it executes the processing of S202 to S222 again; if t is greater than N (S223: Yes), it executes the process of S251 of FIG. 12B.
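The per-record part of this flow (S203 to S221) can be sketched as below. This is a simplified reading under stated assumptions, not the claimed implementation: SHA-256 over two strings joined by "|" stands in for the hash value generation unit 105, a record is modelled as a list of per-attribute element lists, a deleted record is modelled as a single `["Delete_R", hash]` entry, and the handling of "-" (Null) elements in S211 is omitted. The names `h`, `attribute_hash`, and `record_hash` are hypothetical.

```python
import hashlib


def h(a: str, b: str) -> str:
    """Assumed stand-in for the hash value generation unit 105 (two inputs)."""
    return hashlib.sha256(f"{a}|{b}".encode("utf-8")).hexdigest()


def attribute_hash(elements):
    """Hash of one attribute, following the three branches of S206."""
    if elements[0] == "Delete_A":      # deleted attribute: reuse the stored hash (S207)
        return elements[1]
    if elements[0] == "Replace":       # finer values were already hashed (S208)
        value, rest = elements[1], elements[2:]
    else:                              # random number and original value (S209)
        value, rest = h(elements[0], elements[1]), elements[2:]
    for element in rest:               # fold in each remaining element (S214)
        value = h(element, value)
    return value


def record_hash(record):
    """Hash HRt of one record given as a list of per-attribute element lists."""
    if record[0][0] == "Delete_R":     # deleted record: reuse the stored hash (S204)
        return record[0][1]
    hashes = [attribute_hash(attribute) for attribute in record]
    value = hashes[0]
    for nxt in hashes[1:]:             # chain the attribute hashes (S219)
        value = h(value, nxt)
    return value


# Record 3031 before anonymization, and one anonymized form of it.
original = [
    ["5EF4BE", "Hitachi Taro"],
    ["A754B9", "Tokyo", "Kanto region", "Japan"],
    ["770E67", "male"],
]
anonymized = [
    ["Delete_A", h("5EF4BE", "Hitachi Taro")],
    ["Replace", h("A754B9", "Tokyo"), "Kanto region", "Japan"],
    ["770E67", "male"],
]
same_hash = record_hash(original) == record_hash(anonymized)
```

Under these assumptions `same_hash` is true: deleting or coarsening an attribute replaces values with the very intermediate hashes that the chain would have computed anyway, so the record hash does not change.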

The steps that follow generate the root hash value of a binary hash tree built from the record hash values.

Next, the signature generation unit 106 sets m (m is the counter of record hash values HRi remaining to be processed) to the number of records N of the target data, and then executes the process of S252 (S251 of FIG. 12B).

Next, the signature generation unit 106 compares the number of remaining items m with the value of the variable l. If they are not equal (S252: No), it executes the process of S253; if they are equal (S252: Yes), it executes the process of S260.

Next, the signature generation unit 106 takes the hash value HRk of the k-th record (k is a record counter) and the (k+1)-th hash value HR(k+1) as input, obtains a new hash value from the hash value generation unit 105, and sets it as the new value of the l-th record hash value HRl (l is a record counter) (S253). For example, if k = 1 and l = 1, the hash value newly obtained from the hash value generation unit 105 with HR1 and HR2 as input becomes the new value of HR1.

Next, if the number of remaining items m is not 2 (S254: No), the signature generation unit 106 executes the process of S255; if m is 2 (S254: Yes), it executes the process of S260.

Next, the signature generation unit 106 sets 2k+1 as the new value of k and increments the value of l by 1 (S255).

Next, the signature generation unit 106 compares the number of remaining items m with the value of the variable k. If m is greater than k (S256: Yes), it executes the process of S253; if m is k or less (S256: No), it executes the process of S257.

Next, the signature generation unit 106 compares the number of remaining items m with the value of the variable k. If m and k differ (S257: No), it executes the process of S258; if m and k are equal (S257: Yes), it executes the process of S259.

The signature generation unit 106 sets (m/2 + m%2) as the new number of remaining items m, resets k and l to 1, and then executes the process of S253 (S258). Here, m%2 denotes the remainder of dividing m by 2.

Next, the signature generation unit 106 assigns the value of Hrk as the new value of Hrl (S259).

Next, the signature generation unit 106 sets the value of Hrl at that point as the overall hash value Hd of the target data (S260).

Finally, the signature generation unit 106 generates the signature value δ from Hd using a digital signature algorithm such as RSA, outputs it (S261), and ends the processing. Any existing digital signature algorithm can be used.
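The reduction of the record hash values to a single root (S251 to S260) can be sketched as a level-by-level pairing. This is a simplified reading, assuming that adjacent hashes are paired, that an unpaired last hash is carried up unchanged (as the assignment in S259 suggests), and that each level has m/2 + m%2 entries as in S258. SHA-256 and the names `h` and `hash_tree_root` are assumptions; the final signing step (S261) is left out because it depends on the chosen signature library.

```python
import hashlib


def h(a: str, b: str) -> str:
    """Assumed stand-in for the hash value generation unit 105 (two inputs)."""
    return hashlib.sha256(f"{a}|{b}".encode("utf-8")).hexdigest()


def hash_tree_root(record_hashes):
    """Reduce the record hashes HR1..HRN to the overall hash value Hd."""
    level = list(record_hashes)
    if not level:
        raise ValueError("at least one record hash is required")
    while len(level) > 1:
        # Pair neighbours: (HR1, HR2), (HR3, HR4), ...
        nxt = [h(level[k], level[k + 1]) for k in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])   # odd one out moves up unchanged
        level = nxt                 # new length is m // 2 + m % 2
    return level[0]


root = hash_tree_root(["HR1", "HR2", "HR3"])
```

With three record hashes, the sketch first combines HR1 and HR2, carries HR3 up, and then combines the two remaining values into the root that would be passed to the signature algorithm.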

With the signature generation processing described above, anonymized data produced by the verifiable anonymization processing described below with reference to FIGS. 13 to 16 yields the same signature value as the extended patient data from which it was derived, provided that the extended patient data before anonymization is the same. This holds even after the anonymization processing, so the validity of the anonymization can be verified.

Next, the verifiable anonymization processing will be described with reference to FIG. 13.
This processing corresponds to S07 of FIG. 10.

First, the anonymization processing function unit 103 of the anonymized data providing server 1 reads from the extended patient data table 303 the extended patient data corresponding to the patient data name sent from the Web server function unit 101 (S301). For example, for the request of S06 of FIG. 10, it obtains the data of the extended patient data table 303 of FIG. 7 as the corresponding extended patient data.

Next, the anonymization processing function unit 103 reads the anonymization conditions sent from the Web server function unit 101 (S302). In this embodiment, for example, it reads the three anonymization conditions 1 to 3 given above.

Next, if there is an anonymization condition that deletes records, the anonymization processing function unit 103 does not simply delete the records. Instead, it executes record deletion processing, which hashes each target record as a whole within the extended patient data (S303). In this embodiment, for example, anonymization condition 1 is a condition that deletes records, so the record deletion processing is executed on the extended patient data read from the extended patient data table 303. The record deletion processing will be described in detail later with reference to FIG. 14.

Next, if there is an anonymization condition that deletes an attribute, the anonymization processing function unit 103 does not simply delete the attribute. Instead, it executes attribute deletion processing, which hashes the target attribute as a whole within the extended patient data (S304). In this embodiment, for example, anonymization condition 2 is a condition that deletes the attribute "name", so the attribute deletion processing is executed on the extended patient data. The attribute deletion processing will be described in detail later with reference to FIG. 15.

Next, if there is an anonymization condition that replaces the elements of an attribute, the anonymization processing function unit 103 does not simply replace the elements. Instead, it executes attribute replacement processing, which hashes some of the elements of the target attribute within the extended patient data (S305). In this embodiment, for example, anonymization condition 3 is a condition that replaces the element of the attribute "address" from "prefecture name" with "region name", so the attribute replacement processing is executed on the extended patient data. The attribute replacement processing will be described in detail later with reference to FIG. 16.

Finally, the anonymization processing function unit 103 outputs the extended patient data resulting from the processing of S303 to S305 to a file as the anonymized data 305 (S306), sends it to the Web server function unit 101, and ends the processing.
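The idea shared by S304 and S305, replacing values with hashes rather than removing them, can be sketched for a single attribute as follows. This is a guess at the shape of the output based on the "Delete_A" and "Replace" markers described for the signature generation processing; the exact layouts are defined in FIGS. 15 and 16, which this sketch does not reproduce. SHA-256, the "|" separator, the parameter `keep_from`, and the function names are all assumptions.

```python
import hashlib


def h(a: str, b: str) -> str:
    """Assumed stand-in for the hash value generation unit 105 (two inputs)."""
    return hashlib.sha256(f"{a}|{b}".encode("utf-8")).hexdigest()


def delete_attribute(elements):
    """[random, value, ...] -> ["Delete_A", hash of the whole attribute]."""
    value = h(elements[0], elements[1])
    for element in elements[2:]:
        value = h(element, value)
    return ["Delete_A", value]


def replace_attribute(elements, keep_from):
    """Hash the elements before index keep_from and keep the coarser ones.

    keep_from=2 keeps everything after the original value, for example
    the region name and the country name of an address.
    """
    value = h(elements[0], elements[1])
    for element in elements[2:keep_from]:
        value = h(element, value)
    return ["Replace", value, *elements[keep_from:]]


name = ["5EF4BE", "Hitachi Taro"]
address = ["A754B9", "Tokyo", "Kanto region", "Japan"]

name_deleted = delete_attribute(name)                     # anonymization condition 2
address_replaced = replace_attribute(address, keep_from=2)  # anonymization condition 3
```

Here `address_replaced` no longer contains "Tokyo" in the clear, yet it still carries the hash that a verifier needs in order to continue the hash chain with "Kanto region" and "Japan". Record deletion (S303) follows the same pattern one level up, storing the hash of the whole record.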

Next, the record deletion processing will be described with reference to FIG. 14.
This processing corresponds to S303 of FIG. 13.

First, the anonymization processing function unit 103 of the anonymized data providing server 1 reads the anonymization condition for record deletion (S401). In this embodiment, for example, it reads anonymization condition 1 (delete records whose attribute "address" is other than "Japan"). The initial values of the variables i, j, and t are 1.

次に、匿名化処理機能部１０３は、拡張患者データから削除対象のレコード群Ｒｄを特定し、Ｒｄのレコード数をＮ，Ｒｄの属性の属性数をｎとする。例えば、本実施形態では、図７の拡張患者データテーブル３０３から読み出した拡張患者データから、属性「住所」が“日本”以外の“アメリカ”であるレコード３０３２をＲｄとし、Ｎ＝１、ｎ＝３（属性「氏名」「住所」「性別」）とする（Ｓ４０２）。 Next, the anonymization processing function unit 103 identifies the record group Rd to be deleted from the extended patient data, and sets the number of records of Rd to N and the number of attributes of the attribute of Rd to n. For example, in the present embodiment, from the extended patient data read from the extended patient data table 303 of FIG. 7, the record 3032 whose attribute “address” is “America” other than “Japan” is Rd, and N = 1, n = 3 (attribute “name” “address” “sex”) (S402).

次に、匿名化処理機能部１０３は、削除対象のレコード群Ｒｄからｔ番目（ｔは、レコードのカウンタ）のレコードを一つ読み出す。例えば、本実施形態では、Ｓ４０２でＲｄとして特定したレコード３０３２を読み出す（Ｓ４０３）。 Next, the anonymization processing function unit 103 reads one t-th (t is a record counter) record from the record group Rd to be deleted. For example, in the present embodiment, the record 3032 identified as Rd in S402 is read (S403).

次に、匿名化処理機能部１０３は、読み出した当該レコードのｉ番目（ｉは、属性のカウンタ）の属性Ａｉの要素数Ｅｉｎを取得する。例えば、当該レコードがレコード３０３２、ｉ＝１の場合、１番目の属性「氏名」の要素数２をＥｉｎの値とする（Ｓ４０４）。 Next, the anonymization processing function unit 103 acquires the number of elements Ein of the i-th (i is an attribute counter) attribute Ai of the read record. For example, when the record is the record 3032 and i = 1, the number of elements 2 of the first attribute “name” is set as the value of Ein (S404).

次に、匿名化処理機能部１０３は、属性Ａｉのｊ番目の要素Ａｉｊと（ｊ＋１）番目の要素Ａｉ（ｊ＋１）をハッシュ値生成部１０５に入力してハッシュ値を取得し、レコードハッシュ値Ｈｒｉの値とする。例えば、当該レコードがレコード３０３２、属性Ａｉが属性「氏名」かつｊ＝１の場合、属性「氏名」の１番目の要素“７４１ＤＣ３”と２番目の要素“Ｔｏｍ”をハッシュ値生成部１０５に入力してハッシュ値を取得し、Ｈｒｉの値とする（Ｓ４０５）。 Next, the anonymization processing function unit 103 inputs the jth element Aij and the (j + 1) th element Ai (j + 1) of the attribute Ai to the hash value generation unit 105 to acquire the hash value, and then the record hash value Hri. Value of. For example, when the record is the record 3032, the attribute Ai is the attribute “name” and j = 1, the first element “741DC3” and the second element “Tom” of the attribute “name” are input to the hash value generation unit 105. Then, the hash value is acquired and used as the value of Hri (S405).

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j. If Ein is greater than j (S406: Yes), it proceeds to S407; if Ein is less than or equal to j (S406: No), it proceeds to S410.

Next, the anonymization processing function unit 103 increments the value of j by 1 (S407).

Next, the anonymization processing function unit 103 inputs the j-th element Aij of attribute Ai and the current value of Hri to the hash value generation unit 105, obtains a new hash value, and sets it as the new value of Hri (S408).

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j. If Ein is greater than j (S409: Yes), it proceeds to S407; if Ein is less than or equal to j (S409: No), it proceeds to S410.

Next, the anonymization processing function unit 103 increments the value of the variable i by 1 and proceeds to S411 (S410).

Next, in S411, the anonymization processing function unit 103 compares the number of attributes n with the value of the variable i. If n is greater than or equal to i (S411: No), it returns to S404; if n is less than i (S411: Yes), it proceeds to S412.

Next, n is compared with k (k is an attribute counter). If n is greater than k (S412: Yes), the process proceeds to S413; if n is not greater than k (S412: No), the process proceeds to S415.

Next, the anonymization processing function unit 103 generates a hash value from the hash values Hrk and Hr(k+1) as input and sets it as the new Hr(k+1) (S413).

Next, k is incremented by 1 (S414), and the process returns to S412.

Next, the anonymization processing function unit 103 increments the value of the variable t (t is the record counter) by 1 and resets the variables i and j to 1 (S415).

Next, the anonymization processing function unit 103 sets the first element of the record to be deleted to "Delete_R", a label indicating that the record has been deleted; sets the second element to the current record hash value Hrn (note that, when n ≥ 2, this equals the value of Hr(k+1)); sets every element from the third onward to "-" (Null), indicating no value; and then proceeds to S417 (S416). For example, when the record is record 3032 of FIG. 7, the result is record 3051 of FIG. 9: the first element is "Delete_R", the second element is the record hash value "22891f", and all elements from the third onward are "-".

Finally, the anonymization processing function unit 103 compares the number of records N in Rd with the value of the variable t. If N is greater than or equal to t (S417: No), it repeats the process from S403; if N is less than t (S417: Yes), the process ends.
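
The record deletion flow of FIG. 14 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: SHA-256 stands in for the hash value generation unit 105, whose algorithm is not specified in this passage, and the length-prefixed input framing, the helper names `h2`, `attribute_hash`, `record_hash`, and `delete_record`, and the nested-list record layout are all hypothetical. The fold hashes elements 1 and 2 once and then folds in any later elements, which matches the worked example for the attribute "name".

```python
import hashlib


def h2(a: str, b: str) -> str:
    # Assumption: SHA-256 over a length-prefixed encoding of the two inputs.
    data = f"{len(a)}:{a}|{len(b)}:{b}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def attribute_hash(elements):
    """Fold the elements of one attribute into a single hash (S404 to S409)."""
    acc = h2(elements[0], elements[1])   # S405: elements 1 and 2
    for elem in elements[2:]:            # S407 to S409: remaining elements
        acc = h2(elem, acc)              # S408: next element with running hash
    return acc


def record_hash(record):
    """Fold the attribute hashes of one record left to right (S412 to S414)."""
    hashes = [attribute_hash(attr) for attr in record]
    acc = hashes[0]
    for nxt in hashes[1:]:
        acc = h2(acc, nxt)               # S413: Hrk with Hr(k+1)
    return acc


def delete_record(record):
    """Return a placeholder record: 'Delete_R', the record hash, then '-' (S416)."""
    width = sum(len(attr) for attr in record)
    flat = ["Delete_R", record_hash(record)] + ["-"] * (width - 2)
    # Re-split the flat row into the original per-attribute shape.
    out, pos = [], 0
    for attr in record:
        out.append(flat[pos:pos + len(attr)])
        pos += len(attr)
    return out
```

For instance, `delete_record([["741DC3", "Tom"], ["0A1B2C", "New York", "East Coast", "America"], ["9F8E7D", "Male"]])` yields a record whose first two elements are `"Delete_R"` and the record hash, with every other element set to `"-"`. Only "741DC3" and "Tom" come from the description; the other values are made up for illustration.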

Next, the attribute deletion process is described with reference to FIG. 15. This process corresponds to S304 in FIG. 13.

First, the anonymization processing function unit 103 of the anonymized data providing server 1 reads the anonymization condition for attribute deletion (S501). In this embodiment, for example, it reads anonymization condition 2, which deletes the attribute "name". The variables i, j, and t are initialized to 1.

Next, the anonymization processing function unit 103 lets N be the number of records in the extended patient data, n the number of attributes, Ad the group of attributes to be deleted, and an the number of attributes in Ad. For example, when the extended patient data is the data of the extended patient data table 303 shown in FIG. 7, N = 4, n = 3, Ad consists of the attribute "name", and an = 1 (S502).

Next, the anonymization processing function unit 103 reads the t-th record (t is the record counter) of the extended patient data (S503).

Next, the anonymization processing function unit 103 reads one attribute from Ad and takes it as Adi (i is the attribute counter) (S504).

Next, the anonymization processing function unit 103 sets Ein to the number of elements of attribute Adi (S505).

Next, the anonymization processing function unit 103 inputs the j-th element Adij and the (j+1)-th element Adi(j+1) of attribute Adi to the hash value generation unit 105, obtains a new hash value, and sets it as the attribute hash value Ha (S506). For example, when the record is record 3031 of FIG. 7, attribute Adi is the attribute "name", and j = 1, the first element "5EF4BE" and the second element "Hitachi Taro" are input to the hash value generation unit 105, and the resulting hash value "465FC4" becomes the value of Ha.

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j. If Ein is greater than j (S507: Yes), it proceeds to S508; if Ein is less than or equal to j (S507: No), it proceeds to S510.

Next, the anonymization processing function unit 103 increments the value of the variable j by 1 (S508).

Next, the anonymization processing function unit 103 inputs the j-th element Adij of attribute Adi and Ha to the hash value generation unit 105, obtains a new hash value, and sets it as the new value of Ha (S509).

Next, the anonymization processing function unit 103 compares Ein with the value of the variable j. If Ein is greater than j (S510: Yes), it proceeds to S508; if Ein is less than or equal to j (S510: No), it proceeds to S511.

Next, in S511, the anonymization processing function unit 103 sets the first element of attribute Adi of the record to "Delete_A", a label indicating that the attribute has been deleted; sets the second element to the current attribute hash value Ha; and sets every element from the third onward to "-" (Null) (S511). For example, when the record is record 3031 of FIG. 7 and the attribute to be deleted is "name", the result is as shown in record 3052 of FIG. 9: "name 1", the first element of the attribute "name", becomes "Delete_A", and "name 2", the second element, becomes "465FC4", the current value of the attribute hash value Ha. Because the attribute "name" has no third or later element, the value "-" is not used.

Next, the anonymization processing function unit 103 increments the value of the variable i by 1 (S512).

Next, the anonymization processing function unit 103 compares the number of attributes an in Ad with the value of the variable i. If an is greater than or equal to i (S513: No), it returns to S504; if an is less than i (S513: Yes), it proceeds to S514.

Next, the anonymization processing function unit 103 increments the value of the variable t by 1 and resets the variables i and j to 1 (S514).

Finally, the anonymization processing function unit 103 compares the number of records N in the extended patient data with the value of the variable t. If N is greater than or equal to t (S515: No), it repeats the process from S503; if N is less than t (S515: Yes), the attribute deletion process ends.
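
The attribute deletion steps above can be sketched in Python as follows. This is an illustrative sketch, not the embodiment's code: SHA-256 stands in for the unspecified hash of the hash value generation unit 105, and the input framing, the dictionary-based record layout, and the names `h2`, `delete_attribute`, and `delete_attributes` are assumptions introduced here. The sketch does not treat records already marked as deleted in any special way.

```python
import hashlib


def h2(a: str, b: str) -> str:
    # Assumption: SHA-256 over a length-prefixed encoding of the two inputs.
    data = f"{len(a)}:{a}|{len(b)}:{b}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def delete_attribute(elements):
    """Replace one attribute's elements with ['Delete_A', Ha, '-', ...]."""
    ha = h2(elements[0], elements[1])    # S506: elements 1 and 2
    for elem in elements[2:]:            # S508 to S510: remaining elements
        ha = h2(elem, ha)                # S509: next element with running Ha
    return ["Delete_A", ha] + ["-"] * (len(elements) - 2)   # S511


def delete_attributes(table, targets):
    """Apply delete_attribute to every target attribute of every record.

    `table` is a list of records; each record maps an attribute name to its
    list of elements (hypothetical layout). The input table is not modified.
    """
    result = []
    for record in table:                 # S503, S514, S515: loop over records
        result.append({
            name: delete_attribute(elems) if name in targets else list(elems)
            for name, elems in record.items()
        })
    return result
```

Called with `targets={"name"}` on a record whose "name" attribute is `["5EF4BE", "Hitachi Taro"]`, the sketch returns `["Delete_A", <hash>]` for that attribute and leaves the other attributes unchanged. The hash will not equal the "465FC4" of the description, because the actual hash function is not given.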

Next, the attribute replacement process is described with reference to FIG. 16. This process corresponds to S305 in FIG. 13.

First, the anonymization processing function unit 103 of the anonymized data providing server 1 reads the anonymization condition for attribute replacement (S601). In this embodiment, for example, it reads anonymization condition 3, which replaces the attribute "address" from "prefecture name" (address 2) with "region name" (address 3). The variables i, j, and t are initialized to 1.

Next, the anonymization processing function unit 103 lets N be the number of records in the extended patient data, n the number of attributes, Ar the group of attributes to be replaced, and rn the number of attributes in Ar. For example, when the extended patient data is the data of the extended patient data table 303, N = 4, n = 3, Ar consists of the attribute "address", and rn = 1 (S602).

Next, the anonymization processing function unit 103 reads the t-th record (t is the record counter) of the extended patient data (S603).

Next, the anonymization processing function unit 103 reads one attribute from Ar and takes it as Ari (i is the attribute counter) (S604).

Next, the anonymization processing function unit 103 sets Eir to the number of elements of attribute Ari that are subject to replacement (S605). For example, when the attribute Ar is "address", "prefecture name" (address 2) is replaced, so Eir = 2.

Next, the anonymization processing function unit 103 inputs the j-th element Arij (j is the element counter) and the (j+1)-th element Ari(j+1) of attribute Ari to the hash value generation unit 105, obtains a new hash value, and sets it as the attribute hash value Ha (S606). For example, when the record is record 3031 of FIG. 7, attribute Ari is the attribute "address", and j = 1, the value "A754B9" of the first element "address 1" and the value "Tokyo" of the second element "address 2" are input to the hash value generation unit 105, and the resulting hash value "B0D8C7" becomes the value of Ha.

Next, the anonymization processing function unit 103 compares Eir with the value of j+1. If Eir is greater than j+1 (S607: Yes), it proceeds to S608; if Eir is less than or equal to j+1 (S607: No), it proceeds to S611.

Next, the anonymization processing function unit 103 increments the value of the variable j by 1 (S608).

Next, the anonymization processing function unit 103 inputs the j-th element Arij of attribute Ari and Ha to the hash value generation unit 105, obtains a new hash value, and sets it as the new value of Ha (S609).

Next, the anonymization processing function unit 103 compares Eir with the value of the variable j. If Eir is greater than j+1 (S610: Yes), it proceeds to S608; if Eir is less than or equal to j+1 (S610: No), it proceeds to S611.

Next, the anonymization processing function unit 103 sets the first element of attribute Ari of the record to "Replace", a label indicating that the attribute has been replaced; sets the second element to the current attribute hash value Ha; if Eir is greater than 3, sets the third through Eir-th elements of attribute Ari to "-" (Null); and then proceeds to S612 (S611). For example, when the record is record 3031 of FIG. 7 and the attribute to be replaced is "address", the result is as shown in record 3052 of FIG. 9: "address 1", the first element of the attribute "address", becomes "Replace", and "address 2", the second element, becomes "B0D8C7", the current value of the attribute hash value Ha.

Next, the anonymization processing function unit 103 increments the value of the variable i by 1 (S612).

Next, the anonymization processing function unit 103 compares the number of attributes rn in Ar with the value of the variable i. If rn is greater than or equal to i (S613: No), it returns to S604; if rn is less than i (S613: Yes), it proceeds to S614 (S613).

Next, the anonymization processing function unit 103 increments the value of the variable t (t is the record counter) by 1 and resets the variables i and j to 1 (S614).

Finally, the anonymization processing function unit 103 compares the number of records N in the extended patient data with the value of the variable t. If N is greater than or equal to t (S615: No), it repeats the process from S603; if N is less than t (S615: Yes), the attribute replacement process ends.
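
The attribute replacement steps above can be sketched in Python as follows, together with a check that the hash chain can be resumed from the stored intermediate value. This is a sketch under assumptions, not the embodiment's code: SHA-256 stands in for the unspecified hash, and the names `h2`, `replace_attribute`, and `attribute_hash`, the way the "Replace" label is interpreted when recomputing the hash, and the sample values "Kanto" and "Japan" are all introduced here for illustration. The sketch writes "-" into every position from the third through the Eir-th.

```python
import hashlib


def h2(a: str, b: str) -> str:
    # Assumption: SHA-256 over a length-prefixed encoding of the two inputs.
    data = f"{len(a)}:{a}|{len(b)}:{b}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def replace_attribute(elements, eir):
    """Collapse the first `eir` elements into ['Replace', Ha, '-', ...].

    Elements after position `eir` (the coarser replacement values) are kept.
    """
    if not 2 <= eir <= len(elements):
        raise ValueError("eir must be between 2 and the number of elements")
    ha = h2(elements[0], elements[1])    # S606: elements 1 and 2
    for elem in elements[2:eir]:         # S608 to S610: up to the Eir-th element
        ha = h2(elem, ha)                # S609
    return ["Replace", ha] + ["-"] * (eir - 2) + list(elements[eir:])  # S611


def attribute_hash(elements):
    """Recompute the attribute hash from either the original or replaced form.

    Assumption: a verifier resumes the chain from the stored intermediate hash
    and skips '-' placeholders, so '-' cannot be used as a real data value.
    """
    if elements[0] == "Replace":
        acc = elements[1]
        rest = [e for e in elements[2:] if e != "-"]
    else:
        acc = h2(elements[0], elements[1])
        rest = elements[2:]
    for elem in rest:
        acc = h2(elem, acc)
    return acc
```

With `original = ["A754B9", "Tokyo", "Kanto", "Japan"]` and `eir = 2`, `replace_attribute` hides the random number and "Tokyo" behind one hash while keeping "Kanto" and "Japan" readable, and `attribute_hash` returns the same value for the original and the replaced form. That equality is what lets a signature computed over the hashes stay valid after the replacement.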

Next, the signature verification process is described with reference to FIG. 17. This process corresponds to S09 in FIG. 10.

First, the signature verification processing unit 202 of the anonymized data user terminal 2 reads the anonymized data 305 and the signature value 311 used for verification, which were stored in the main memory 502 or the hard disk drive 530 of the anonymized data user terminal 2 in S108 of FIG. 10 (S701, S702).

Next, the signature verification processing unit 202 inputs the anonymized data 305 to the signature generation unit 204 of the anonymized data user terminal 2 and obtains the generated signature value δ (S703).

Finally, the signature verification processing unit 202 compares the signature value 311 used for verification with δ. If the two values are identical (S704: Yes), the anonymized data is judged to be valid, and "OK" is displayed to the data user on the display device 510 of the anonymized data user terminal 2 or the like (S705). If the two values differ (S704: No), the anonymized data is judged not to be valid, for reasons such as tampering or a mix-up, "NG" is displayed to the data user on the display device 510 of the anonymized data user terminal 2 or the like (S706), and the process ends.
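
The verification flow of S701 to S706 can be sketched in Python as follows. The sketch rests on several assumptions and is not the embodiment's signature scheme: the signature value is modeled as a plain SHA-256 digest over the record hashes (an actual digital signature would involve a key), and the function names `h2`, `attribute_hash`, `record_hash`, `signature_value`, and `verify`, as well as the handling of the labels "Delete_R", "Delete_A", and "Replace", are hypothetical. The point illustrated is only that a value recomputed from anonymized data can equal the value computed from the extended data.

```python
import hashlib
import hmac


def h2(a: str, b: str) -> str:
    # Assumption: SHA-256 over a length-prefixed encoding of the two inputs.
    data = f"{len(a)}:{a}|{len(b)}:{b}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def attribute_hash(elems):
    if elems[0] == "Delete_A":           # deleted attribute: stored hash is final
        return elems[1]
    if elems[0] == "Replace":            # replaced attribute: resume the chain
        acc, rest = elems[1], [e for e in elems[2:] if e != "-"]
    else:                                # untouched attribute: hash from scratch
        acc, rest = h2(elems[0], elems[1]), elems[2:]
    for e in rest:
        acc = h2(e, acc)
    return acc


def record_hash(record):
    if record[0][0] == "Delete_R":       # deleted record: stored hash is final
        return record[0][1]
    hashes = [attribute_hash(a) for a in record]
    acc = hashes[0]
    for nxt in hashes[1:]:
        acc = h2(acc, nxt)
    return acc


def signature_value(table):
    # Stand-in for the signature generation unit 204: digest of record hashes.
    digest = hashlib.sha256()
    for record in table:
        digest.update(record_hash(record).encode("ascii"))
    return digest.hexdigest()


def verify(anonymized_table, published_signature):
    delta = signature_value(anonymized_table)                  # S703
    same = hmac.compare_digest(delta, published_signature)     # S704
    return "OK" if same else "NG"                              # S705 / S706
```

In this sketch, a table in which one attribute is replaced by `["Delete_A", hash]`, another by `["Replace", hash, ...]`, and a whole record by `["Delete_R", hash]` followed by `"-"` placeholders verifies as "OK" against the value computed from the original extended table, while changing any remaining plaintext element yields "NG".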

As described above, the anonymized data providing system of the embodiment first adds the replacement candidate values to the target data. Then, instead of simply deleting or replacing values, it substitutes hash values produced by hashing, which is an intermediate step of the signature generation process. As a result, the validity of the anonymization can be verified with the signature value even after anonymization processing such as deletion or replacement has been applied.

In the hashing process, random numbers are added to the original data, and hashing takes the value of each attribute together with a random number as input. This makes the amount of computation needed to identify the original data from a hash value enormous, which reduces the risk that the original data is restored from the anonymized data.

When random numbers are added to the original data, one random number is added to each attribute, and the algorithms shown in FIGS. 12A to 16 then hash within each attribute step by step with the random number included. Compared with adding a random number to every element of the original data, this reduces the data size of the anonymized data.
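
The size argument can be illustrated with a short Python sketch. It is only an illustration under assumptions: the helper names `expand_attribute` and `stored_elements`, the dictionary record layout, and the sample values are hypothetical, and the 3-byte random length is chosen merely to mirror the six-hex-digit examples such as "741DC3"; the description does not state how the random numbers are generated.

```python
import secrets


def expand_attribute(values, salt_bytes=3):
    """Prepend one random hex string to an attribute's values.

    Assumption: the random number is stored as the attribute's first element,
    as in the examples where 'name 1' holds a value like '741DC3'.
    """
    return [secrets.token_hex(salt_bytes).upper()] + list(values)


def stored_elements(record, per_element_random=False):
    """Count the elements stored for one record.

    `record` maps each attribute name to its list of data values.
    With one random number per attribute the overhead is len(record);
    with one per element it is the total number of data values.
    """
    n_values = sum(len(values) for values in record.values())
    n_randoms = n_values if per_element_random else len(record)
    return n_values + n_randoms
```

For a record with the attributes "name" (1 value), "address" (3 values), and "sex" (1 value), the sketch counts 8 stored elements with one random number per attribute and 10 with one per element; the gap widens as attributes carry more abstraction candidates.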

Because the hashing process is identical to the intermediate processing of signature generation, the amount of signature generation processing in the signature verification of anonymized data is reduced, and signature verification can be performed faster.

As described above, according to this embodiment, when anonymized data is used for research and development in the medical field, the validity of the anonymization can be verified even after anonymization processing such as deletion or replacement of data has been applied. This helps avoid situations such as research results being invalidated by the use of improper anonymized data.

DESCRIPTION OF SYMBOLS
1: anonymized data providing server, 2: anonymized data user terminal, 3: network, 102: record expansion function unit, 103: anonymization processing function unit, 105: hash value generation unit, 106: signature generation unit, 110: storage unit, 301: patient data table, 302: abstraction pattern group, 303: extended patient data table, 304: signature data table, 305: anonymized data, 201: browser function unit, 202: signature verification processing unit, 203: hash value generation unit, 204: signature generation unit, 210: storage unit, 311: signature value

Claims

An anonymization system in which confidential information is anonymized by an information processing device and provided to a user.
Secret information storage means for storing secret information,
An abstraction candidate group information storage unit which is a candidate group of information for abstracting the secret information,
Extended secret data storage means for storing extended secret data in which a candidate group of information to be abstracted is added to the secret information,
Anonymized data storage means for storing anonymized data obtained by deleting or replacing some information from the secret information,
Extended secret data generation means for generating the extended secret data using the secret information and the abstraction candidate group information;
Signature generation means for generating a digital signature using the extended secret data or anonymized data as an intermediate value of the hash value of the secret information,
Anonymizing means for generating anonymized data using the extended secret data,
Anonymized data validity verification means for verifying the validity of given anonymized data,
The extended secret data generation means refers to the secret information stored in the secret information storage means and to the candidate group of information for abstracting the secret information stored in the abstraction candidate group information storage means, and generates the extended secret data,
The anonymization means performs a process of replacing the extended secret data with the same hash value as the intermediate value of the signature generation of the signature generation means to generate anonymized data,
The signature generation means generates a first signature value from the extended secret data generated by the extended secret data generation means and a second signature value from the anonymized data generated by the anonymization means,
The anonymization data validity verification means verifies the validity of given anonymization data by comparing the first signature value and the second signature value.

Furthermore, a random number generating means for adding a random number to the extended secret data is provided,
The anonymization system according to claim 1, wherein the signature generation unit generates a hash value as an intermediate value from the extended secret data to which the random number generated by the random number generation unit is added.

The random number generation means adds one random number to each attribute of the extended secret data,
The anonymization system according to claim 2, wherein the signature generation unit generates an intermediate value of hash values including random numbers in the input for each attribute.

A method of anonymizing confidential information by an information processing device and providing it to a user,
A secret information storing step of storing secret information,
An abstraction candidate group information storage step, which is a candidate group of information for abstracting the secret information,
An extended secret data storing step of storing extended secret data in which a candidate group of information to be abstracted is added to the secret information,
Anonymized data storage step of storing anonymized data obtained by deleting or replacing some information from the secret information,
An extended secret data generating step of generating the extended secret data using the secret information and the abstraction candidate group information;
A signature generation step of generating a digital signature using the extended secret data or anonymized data as an intermediate value of the hash value of the secret information,
Anonymizing step of generating anonymized data using the extended secret data,
Anonymized data validity verification step for verifying the validity of given anonymized data,
In the extended secret data generation step, the extended secret data is generated by referring to the secret information stored in the secret information storing step and the candidate group of information for abstracting the secret information stored in the abstraction candidate group information storage step,
In the anonymizing step, a process of replacing the extended secret data with the same hash value as the intermediate value of the signature generation of the signature generation means is executed to generate anonymized data,
In the signature generation step, a first signature value is generated from the extended secret data generated by the extended secret data generation means, and a second signature value is generated from the anonymized data generated by the anonymization means,
In the anonymization data validity verification step, the validity of the given anonymization data is verified by comparing the first signature value and the second signature value.

Further, a random number generation step of adding a random number to the extended secret data,
The anonymization method according to claim 4, wherein in the signature generation step, a hash value as an intermediate value is generated from the extended secret data to which the random number generated by the random number generation means is added.

In the random number generating step, one random number is added to each attribute of the extended secret data,
The anonymization method according to claim 5, wherein the signature generation means generates an intermediate value of hash values including random numbers in the input for each attribute.