JP7234763B2

JP7234763B2 - SAME EVENT DETERMINATION PROGRAM, SAME EVENT DETERMINATION METHOD AND SAME EVENT DETERMINATION SYSTEM

Info

Publication number: JP7234763B2
Application number: JP2019075954A
Authority: JP
Inventors: 達也森; 一穂前田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2019-04-11
Filing date: 2019-04-11
Publication date: 2023-03-08
Anticipated expiration: 2039-04-11
Also published as: JP2020173675A

Description

The present invention relates to a same-event determination program, a same-event determination method, and a same-event determination system for determining whether different words denote the same event.

Multiple organizations each maintain databases that store similar types of information. When the databases of such organizations are integrated, it is necessary to determine accurately whether a word recorded in the database master of one organization denotes the same event (an object, an action, or the like) as a word in the master of another organization. For example, the electronic medical records used in hospitals are databases in which the same kinds of medical information are stored at each of multiple hospitals. If a drug (drug name) recorded in the electronic medical records of hospital 1 differs from a drug recorded at another hospital 2, it must be determined at database integration time whether the two are the same drug.

Techniques for estimating synonyms exist as prior art. One approach estimates that words sharing many characters are synonyms. Another is word2vec, which relies on the observation that synonyms are used in similar ways within sentences. Specifically, words whose preceding and following words in text are shared are regarded as synonyms.

As a technique related to synonym extraction, a technique has been disclosed that extracts synonym candidates from a group of documents based on similarity such as co-occurrence relationships, and excludes the non-fixed portions of extracted fixed-form sentences from the synonym candidates (see, for example, Patent Document 1 below). Another disclosed technique extracts short-term sessions for a given word from the session information of browsed products and, among the extracted words, excludes from the synonym candidates those whose co-occurrence frequency is at or above a predetermined value and whose word IDF (word importance) is at or below a threshold (see, for example, Patent Document 2 below).

Patent Document 1: JP 2014-132406 A
Patent Document 2: JP 2013-164751 A

In the prior art, words that are used in the same way are judged to have the same meaning. However, even when the usage is the same, the actual events may differ. For example, suppose that hospital 1 prescribes drug A to patients diagnosed with hay fever, while another hospital 2 prescribes drug B. In this example, drugs A and B share the use "prescribed to hay fever patients", but they are different drugs. With the prior art, if the product names of drug A and drug B are similar, for example because they contain several characters in common, the two are judged likely to be the same drug. If such an extraction result is used as-is to integrate the databases, drug A and drug B, which are actually different drugs, end up registered in the integrated database as the same drug "prescribed to hay fever patients".

In one aspect, an object of the present invention is to make it possible to determine accurately whether two different words denote the same event.

According to one aspect of the present invention, the following is required: accessing each of the master databases held respectively by a plurality of organizations; checking, for each master database, whether any two given words both exist within that single master database; calculating, based on the collation results for the respective master databases, a same-event index value indicating whether the two words denote the same event; and recording the calculated same-event index value in the same storage unit.

According to one aspect of the present invention, it is possible to determine accurately whether two different words denote the same event.

FIG. 1 is a diagram outlining the same-event determination processing according to the present invention.
FIG. 2 is a diagram showing a hardware configuration example of the same-event determination device according to the embodiment.
FIG. 3 is a block diagram showing configuration example 1 of the same-event determination device according to the embodiment.
FIG. 4 is a block diagram showing configuration example 2 of the same-event determination device according to the embodiment.
FIG. 5 is a flowchart showing same-event determination example 1 performed by the same-event determination device according to the embodiment.
FIG. 6 is a diagram explaining specific example 1 of data processing by the same-event determination device according to the embodiment.
FIG. 7 is a flowchart showing same-event determination example 2 performed by the same-event determination device according to the embodiment.
FIG. 8 is a diagram explaining specific example 2 of data processing by the same-event determination device according to the embodiment.
FIG. 9 is a flowchart showing same-event determination example 3 performed by the same-event determination device according to the embodiment.
FIG. 10 is a diagram explaining specific example 3 of data processing by the same-event determination device according to the embodiment.
FIG. 11 is a diagram explaining specific example 4 of data processing by the same-event determination device according to the embodiment.

Embodiments of the disclosed same-event determination program, same-event determination method, and same-event determination system are described in detail below with reference to the drawings.

FIG. 1 is a diagram outlining the same-event determination processing according to the present invention. As shown in FIG. 1, a plurality of different organizations 1 to n each have databases in which similar types of information are stored. For example, each of organizations 1 to n has a master 1 holding a similar type of information. A master is static information recorded in a database. Likewise, each of organizations 1 to n holds, as a database, a master 2 of a similar type of information. Each organization thus has master databases of similar types of information. An organization is a given collective body such as a company or an association.

In the same-event determination system, the same-event determination device 100 accesses the master databases of organizations 1 to n. For example, the same-event determination device 100 connects over a network to the server of each of organizations 1 to n and accesses its master database.

The same-event determination device 100 determines accurately whether, for example, a word recorded in the master of one organization denotes the same event (an object, an action, or the like) as a word in the master of another organization. For example, for two different words X and Y, the same-event determination device 100 determines whether word X recorded in master 1 of organization 1 denotes the same event as word Y in the masters of organizations 2 to n.

For example, when the organizations are hospitals, master 1 is the "drug" information used in the electronic medical records of each hospital, and master 2 is the "disease name" information used in those records. The electronic medical records are databases in which the same kinds of medical information are stored at each of a plurality of hospitals (organizations 1 to n).

When performing same-event determination for "drugs", for example, the same-event determination device 100 of the embodiment accesses the "drug" master 1 of each of organizations 1 to n. The same-event determination device 100 then determines whether "drug X" recorded in master 1 of organization 1 ("hospital 1") and "drug Y" recorded in master 1 of organization 2 ("hospital 2") are the same drug, and outputs a same-event index as the determination result.

The embodiment determines synonyms accurately by exploiting a characteristic property of database masters. In any of organizations 1 to n, the same event (for example, a drug) is not recorded in that organization's master under separate words at the same time (the probability is low). This property holds because databases are generally normalized (data duplication is eliminated and data is handled consistently), so a master does not hold the same event as separate words. Focusing on this property, the same-event determination device 100 judges whether two different words denote the same event as follows.

1. If two different words denote the same event, the two words do not appear together in the master of any organization (the probability is low).
2. If two different words do not denote the same event, it is likely that some organization has a master in which both words are recorded.

For example, organization 1 ("hospital 1") prescribes "drug A" to patients diagnosed with hay fever, while organization 2 ("hospital 2") prescribes "drug B". In such a case, the way the drugs are used ("prescribed to hay fever patients") is common to hospitals 1 and 2, but "drug A" and "drug B" are different drugs. The prior art would nevertheless judge that "drug A" and "drug B" are likely to be the same drug.

In contrast, the same-event determination device 100 of the embodiment determines whether the two words, word X ("drug A") and word Y ("drug B"), denote the same event. For example, it calculates a "same-event index" indicating whether the two words X and Y, "drug A" and "drug B", denote the same event. The "same-event index" is an index indicating whether two different words denote the same event (an object, an action, or the like). According to the embodiment, the fact that master 1 of some organization ("hospital") contains the distinct notations "drug A" and "drug B" is presented through the same-event index. Although FIG. 1 depicts the two words X and Y as being input externally, the same-event determination device 100 may instead acquire these two words when it accesses master 1.

Using the same-event determination device 100 of the embodiment in this way broadens how data can be utilized; for example, the data of different organizations 1 to n can be integrated efficiently. For example, the masters (databases) of the many hospitals 1 to n located throughout the country can be integrated and used to develop insurance products, streamline pharmaceutical processes, develop diagnosis-support AI, and so on.

The electronic medical record system within organization 1 ("hospital 1") tends to use words that are convenient for hospital 1. As a result, even for the same event, "hospital 2" to "hospital n" often use words with notations that differ from those of hospital 1. For example, hospital 1 manages events under their formal names, while hospital 2 manages them under abbreviations. To integrate and utilize the data of the different hospitals 1 to n, the notations must therefore be aligned so that "same event = same notation". Integrating the data requires a dictionary (equivalent to a synonym correspondence table) for judging whether words with different notations denote the same event. Building this dictionary takes considerable effort, so there is a demand to automate it as far as possible, and hence a demand for automatic, high-accuracy determination of whether given words denote the same event.

In the embodiment, whether given words denote the same event is determined automatically and with high accuracy, for example for the purpose of integrating master data. Using this same-event index makes it possible to integrate the data of different organizations, from organization 1 ("hospital 1") to organization n ("hospital n"), efficiently.

To make the above judgment, the same-event determination device 100 includes a master collation unit 101, which judges whether the two words are both recorded in the master of each organization, and a same-event index calculation unit 102, which considers the collation results of the master collation unit 101 (recorded or not) as a whole and calculates a "same-event index" indicating whether the two words denote the same event.

The master collation unit 101 judges whether word X ("drug A") and word Y ("drug B") are both recorded in master 1 of organization 1 ("hospital 1") (collation 1). Similarly, it judges whether word X ("drug A") and word Y ("drug B") are both recorded in master 1 of organization 2 ("hospital 2") (collation 2). By the same processing, the master collation unit 101 judges whether word X ("drug A") and word Y ("drug B") are both recorded in master 1 of organization n ("hospital n") (collation n).

Based on the N collation results "collation 1" to "collation n" produced by the master collation unit 101, the same-event index calculation unit 102 calculates the "same-event index" indicating whether the two words denote the same event. In doing so, the same-event index calculation unit 102 calculates the number of masters in which words X and Y both exist (both_use) and the number of organizations in which either word X or word Y exists (Share).
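The two counts can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each organization's master has already been loaded as a set of strings; the names `collate` and `count_usage` and the sample drug names are hypothetical and do not appear in the patent.

```python
from typing import Iterable, Set, Tuple


def collate(master: Set[str], x: str, y: str) -> Tuple[bool, bool]:
    """Collate one master: return (both words present, at least one present)."""
    has_x, has_y = x in master, y in master
    return has_x and has_y, has_x or has_y


def count_usage(masters: Iterable[Set[str]], x: str, y: str) -> Tuple[int, int]:
    """Return (both_use, share) over all organizations' masters.

    both_use: number of masters containing both x and y.
    share:    number of masters containing x or y (or both).
    """
    both_use = 0
    share = 0
    for master in masters:
        both, either = collate(master, x, y)
        both_use += int(both)
        share += int(either)
    return both_use, share


# Hypothetical masters of four hospitals (assumed data, for illustration only).
masters = [
    {"drug A", "drug C"},
    {"drug B", "drug C"},
    {"drug A", "drug B"},
    {"drug D"},
]
both_use, share = count_usage(masters, "drug A", "drug B")
print(both_use, share)
```

Modeling a master as an in-memory set keeps the membership test trivial; a real system would presumably query each organization's database instead.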

The same-event index calculation unit 102 calculates the same-event index value based on process A alone, or on process B or process C, described below. Process A is the basic process for calculating the same-event index value. Processes B and C are extensions of process A, and the same-event index calculation unit 102 performs process B or process C in place of process A.

In process A, the value "1" is output as the same-event index if the N collation results "collation 1" to "collation n" of the master collation unit 101 show that the two words were not contained together in the master of any organization; otherwise the value "0" is output.

In process B, based on the N collation results "collation 1" to "collation n" of the master collation unit 101, a smaller same-event index is calculated as the number of organizations whose master contains both words increases. For example, same-event index = 1 - (number of organizations whose master contains both words) / (total number of organizations). Process A can give a wrong determination if even one organization records the same event in its master under different words; process B prevents this misjudgment. For example, it can handle a case in which some organizations manage words with the same meaning in duplicate for historical reasons (such as a system migration).

Process C takes into account the number of masters in which the two words are registered: a smaller same-event index is calculated as the number of organizations whose master contains both words increases, and as the number of organizations whose master contains either of the two words decreases. For example, same-event index = 1 - (number of organizations whose master contains both words) / (number of organizations whose master contains either of the two words). This prevents the same-event index from becoming high when few organizations use the two words X and Y.
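Processes A to C can be written down as three small functions. This is a sketch under stated assumptions rather than the patented implementation: the function names are hypothetical, the inputs are the both_use and Share counts described above, and the handling of Share = 0 in process C (returning 0.0) is an assumption, since the text does not cover that case.

```python
def index_a(both_use: int) -> int:
    """Process A: 1 if no master contains both words, otherwise 0."""
    return 1 if both_use == 0 else 0


def index_b(both_use: int, n_orgs: int) -> float:
    """Process B: 1 - both_use / (total number of organizations)."""
    if n_orgs <= 0:
        raise ValueError("n_orgs must be positive")
    return 1.0 - both_use / n_orgs


def index_c(both_use: int, share: int) -> float:
    """Process C: 1 - both_use / share.

    share is the number of organizations whose master contains either word.
    Returning 0.0 when share == 0 is an assumption made for this sketch.
    """
    if share == 0:
        return 0.0
    return 1.0 - both_use / share


# Illustrative values: 10 organizations, 2 use either word, 1 uses both.
print(index_a(1), index_b(1, 10), index_c(1, 2))
```

With these illustrative values, process B still yields an index close to 1 because most organizations use neither word, while process C normalizes by the organizations that actually use the words and so yields a noticeably lower index.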

The two words X and Y handled by the master collation unit 101 of the same-event determination device 100 may be words that have been judged to be used in similar ways, for example by an existing technique (such as synonym search).

When the word similarity of the two words X and Y is input to the same-event index calculation unit 102, processes A to C become as follows.

In process A, if the N collation results "collation 1" to "collation n" of the master collation unit 101 show that the two words are not contained together in the master of any organization, the word similarity value, for example, is output as the same-event index; otherwise the value "0" is output. A value of "0" indicates that the master of some organization contains both words X and Y.

In process B, based on the N collation results "collation 1" to "collation n" of the master collation unit 101, the index is calculated as, for example, same-event index = word similarity × (1 - (number of organizations whose master contains both words) / (total number of organizations)).

In process C, the number of masters in which the two words are registered is taken into account, and the index is calculated as, for example, same-event index = word similarity × (1 - (number of organizations whose master contains both words) / (number of organizations whose master contains either of the two words)).
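The similarity-weighted variants differ from the basic processes only by a multiplicative factor, as the following sketch shows. The function names are hypothetical, `similarity` is whatever word-similarity score is supplied to the same-event index calculation unit 102, and the treatment of Share = 0 is again an assumption not found in the text.

```python
def weighted_index_a(similarity: float, both_use: int) -> float:
    """Process A with similarity: the similarity if no master holds both words, else 0."""
    return similarity if both_use == 0 else 0.0


def weighted_index_b(similarity: float, both_use: int, n_orgs: int) -> float:
    """Process B with similarity: similarity * (1 - both_use / n_orgs)."""
    if n_orgs <= 0:
        raise ValueError("n_orgs must be positive")
    return similarity * (1.0 - both_use / n_orgs)


def weighted_index_c(similarity: float, both_use: int, share: int) -> float:
    """Process C with similarity: similarity * (1 - both_use / share).

    Returning 0.0 when share == 0 is an assumption made for this sketch.
    """
    if share == 0:
        return 0.0
    return similarity * (1.0 - both_use / share)


# Illustrative values: similarity 0.5, 10 organizations, 4 use either word, 1 uses both.
print(weighted_index_a(0.5, 1), weighted_index_b(0.5, 1, 10), weighted_index_c(0.5, 1, 4))
```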

FIG. 2 is a diagram showing a hardware configuration example of the same-event determination device of the present invention. The same-event determination device 100 can be configured, for example, as a general-purpose server made up of the hardware shown in FIG. 2.

The same-event determination device 100 includes a CPU (Central Processing Unit) 201, a memory 202, a network interface (IF) 203, a recording medium IF 204, and a recording medium 205. Reference numeral 200 denotes a bus that connects these units.

The CPU 201 is an arithmetic processing unit that functions as a control unit governing overall control of the same-event determination device 100. The memory 202 includes non-volatile memory and volatile memory. The non-volatile memory is, for example, a ROM (Read Only Memory) that stores programs for the CPU 201. The volatile memory is, for example, a DRAM (Dynamic Random Access Memory) or an SRAM (Static Random Access Memory) used as a work area for the CPU 201.

The network IF 203 is a communication interface to a network 210 such as a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet. The same-event determination device 100 connects to the network 210 via the network IF 203. For example, the same-event determination device 100 accesses, via the network 210, the server of an organization (hospital) that holds a target master database.

The recording medium IF 204 is an interface for reading and writing information processed by the CPU 201 from and to the recording medium 205. The recording medium 205 is a storage device that supplements the memory 202; an HDD (Hard Disk Drive), an SSD (Solid State Drive), a USB (Universal Serial Bus) flash drive, or the like can be used.

The functions of the same-event determination device 100 shown in FIG. 1 (the master collation unit 101 and the same-event index calculation unit 102) are realized by the CPU 201 executing a program recorded in the memory 202 or on the recording medium 205. The memory 202 and the recording medium 205 also record and hold the information handled by the same-event determination device 100.

FIG. 3 is a block diagram showing configuration example 1 of the same-event determination device according to the embodiment. The same-event determination device 100 shown in FIG. 3 includes the functions of the master collation unit 101 and the same-event index calculation unit 102 shown in FIG. 1.

The master collation unit 101 judges whether the two input words X and Y are both recorded in the master (for example, master 1) of each of organizations 1 to n. The same-event index calculation unit 102 then calculates the number of masters in which words X and Y both exist (both_use). By executing process A described above, it obtains from the N collation results of the master collation unit (recorded or not) the same-event index "0/1" indicating whether the two words denote the same event, and outputs it.

FIG. 4 is a block diagram showing configuration example 2 of the same-event determination device according to the embodiment. In addition to the master collation unit 101 and the same-event index calculation unit 102, which are configured as in FIG. 3, the same-event determination device 100 shown in FIG. 4 has the function of a similar word extraction unit 401.

The similar word extraction unit 401 has the function of extracting, from at least a word set, candidate pairs of words X and Y that may denote the same event, and outputting the words X and Y to the master collation unit 101. In doing so, it computes combinations of words X and Y from the externally input word set.

The similar word extraction unit 401 may also calculate the similarity between different words X and Y and extract only those pairs whose similarity is high. This makes processing by the master collation unit 101 and the same-event index calculation unit 102 unnecessary for combinations of words X and Y that are unlikely to denote the same event, improving the efficiency of the device as a whole. The similarity between words X and Y may be calculated using, for example, the Levenshtein distance, taking the reciprocal of the Levenshtein distance as the similarity.
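The candidate extraction described above can be sketched as follows: enumerate word pairs, score each pair by the reciprocal of its Levenshtein distance, and keep the pairs above a cutoff. The function names, the threshold value, and the sample words are assumptions made for illustration. Because the words in a pair are different, their distance is at least 1 and the reciprocal is well defined; identical strings are rejected explicitly.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via two-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Reciprocal of the Levenshtein distance; defined only for different words."""
    distance = levenshtein(a, b)
    if distance == 0:
        raise ValueError("similarity is undefined for identical words")
    return 1.0 / distance


def candidate_pairs(words: Iterable[str], threshold: float) -> List[Tuple[str, str, float]]:
    """Return (x, y, similarity) for every distinct pair at or above the threshold.

    The threshold is a hypothetical tuning parameter, not a value from the patent.
    """
    unique = sorted(set(words))
    pairs = []
    for x, y in combinations(unique, 2):
        score = similarity(x, y)
        if score >= threshold:
            pairs.append((x, y, score))
    return pairs


print(candidate_pairs(["aspirin", "asprin", "ibuprofen"], threshold=0.5))
```

Only the surviving pairs would then be passed on for master collation, which is where the efficiency gain described above comes from.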

In addition to the words themselves, the similar word extraction unit 401 may acquire features related to the words (for example, sentences containing words X and Y) and use these features in calculating the similarity.

The description above covers an example in which two different words X and Y are extracted, but the same-event index may also be calculated using two or more sets of words. In this case, the similar word extraction unit 401 extracts a plurality of word pairs, the master collation unit 101 calculates N collation results for each pair, and the same-event index calculation unit 102 calculates a same-event index for each pair. In doing so, the same-event index calculation unit 102 also takes the word similarity into account.

The externally input word set and the word-related features may be obtained from each of the N organizations 1 to n that hold the masters and then merged. Alternatively, they may be obtained independently of those N organizations, for example from text taken from papers published on the Internet. The data need not be text; time-series data may also be used. Using such text or time-series data, the words used together with a given word, and the circumstances concurrent with or before and after the word, can be obtained as features (for example, a time of day). This allows the similar word extraction unit 401 to find words with "similar usage" accurately.

(Determination examples for the same event)
Next, examples of how the same-event determination device 100 determines whether two words X and Y denote the same event are described. The control unit (CPU 201) of the same-event determination device 100 performs the same-event determination processing by executing a program. Based on configuration example 1 or configuration example 2, the two words X and Y for which the same-event index is to be calculated are input to the master collation unit 101. In configuration example 1, the two words X and Y are input to the master collation unit 101 manually, for example by a user operation; in configuration example 2, the similar word extraction unit 401 extracts them and inputs them to the master collation unit 101. The control unit then performs, on the two words X and Y, the same-event index calculation described in processes A to C above.

FIG. 5 is a flowchart showing same-event determination example 1 performed by the same-event determination device according to the embodiment. First, the control unit (master collation unit 101) waits for input of a word pair (two words X and Y) (step S501: No loop). When a word pair is input (step S501: Yes), the control unit determines whether collation against all required masters has finished (step S502). This collation corresponds to collations 1 to n against the masters 1 to n of the organizations 1 to n shown in FIG. 1.

If collation against all required masters has not finished (step S502: No), the control unit acquires a master that has not yet been collated (step S503), checks whether word X and word Y both exist in the acquired master (step S504), and returns to step S502.

When collation against all masters has finished (step S502: Yes), the control unit (same-event index calculation unit 102) calculates the number of masters in which words X and Y both exist (both_use) (step S505).

The control unit (same-event index calculation unit 102) then calculates the same-event index by process A: it outputs "1" as the same-event index when the number of masters in which words X and Y both exist (both_use) is 0, and "0" otherwise (a positive count) (step S506). The processing then ends.
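The flow of FIG. 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: each master is assumed to be a set of strings, and the function names `both_use_count` and `index_process_a` as well as the sample hospital data are hypothetical.

```python
from typing import Sequence, Set


def both_use_count(masters: Sequence[Set[str]], x: str, y: str) -> int:
    """Count the masters that contain both words (steps S502 to S505)."""
    return sum(1 for master in masters if x in master and y in master)


def index_process_a(masters: Sequence[Set[str]], x: str, y: str) -> int:
    """Process A: 1 when no master holds both words, otherwise 0 (step S506)."""
    return 1 if both_use_count(masters, x, y) == 0 else 0


# Hypothetical drug masters of two hospitals (illustrative data only).
hospital_masters = [
    {"ピペラジンアジピン酸塩", "アトロピン硫酸塩水和物"},
    {"アジピン酸ピペラジン", "アトロピン硫酸塩"},
]
print(index_process_a(hospital_masters, "ピペラジンアジピン酸塩", "アジピン酸ピペラジン"))
```

With this sample data neither master lists both notations, so both_use is 0 and the sketch returns 1.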

FIG. 6 is a diagram explaining specific example 1 of data processing by the same-event determination device according to the embodiment. It shows a concrete data processing example corresponding to same-event determination example 1 of FIG. 5. In FIG. 6, the words X and Y for which the same-event index is calculated under configuration example 1 or 2 are drug names: word X is "ピペラジンアジピン酸塩" and word Y is "アジピン酸ピペラジン", two Japanese notations that both translate as "piperazine adipate". Words X and Y are drugs that are similar to each other (used in similar ways).

Based on the input words X and Y, the master collation unit 101 of the same-event determination device 100 accesses the drug masters 600 of "hospitals 1 to n", which correspond to the organizations 1 to n. For example, the drug master 600a of "hospital 1" is assumed to store the words "ピペラジンアジピン酸塩" (piperazine adipate), "アトロピン硫酸塩水和物" (atropine sulfate hydrate), and so on.

The master collation unit 101 then checks, for each master 600 of "hospitals 1 to n", whether both word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン" exist in it, and obtains the collation result 601. The collation result 601 shows that the master 600a of "hospital 1" does not contain both words X and Y (symbol: ×) and that the master 600b of "hospital 2" does not contain both either (symbol: ×).

The same-event index calculation unit 102 then calculates the same-event index calculation result 602 for words X and Y from the collation result 601 of the master collation unit 101. In the example of FIG. 6, it calculates the number of organizations that use words X and Y together (both_use) as "0".

The same-event index calculation unit 102 also calculates the same-event index value "1" by process A. The same-event determination device 100 (control unit) outputs the same-event index calculation result 602 calculated by the same-event index calculation unit 102 externally.

This processing yields, as the same-event index calculation result 602, the number of masters in which words X and Y both exist (both_use) and the same-event index value. In the example above, the result shows that none of the masters 600 of "hospitals 1 to n" contains words X and Y together.

The same-event index value "1" in the calculation result 602 indicates that the condition "no hospital 1 to n has both words X and Y in its master" is satisfied. In other words, a value of "1" states unambiguously that the two words never appear together at any hospital. For example, the value is "1" only when the collation result 601 is × for every hospital (hospital A = ×, hospital B = ×, hospital C = ×); if even one of hospitals A to C has 〇, the value is "0". The calculation result 602 thus makes it possible to present clear cautions about words when, for example, creating a common master that integrates the masters of all "hospitals 1 to n" or creating a common dictionary across masters.

FIG. 7 is a flowchart showing same-event determination example 2 performed by the same-event determination device according to the embodiment. First, the control unit (master collation unit 101) waits for input of a word pair (two words X and Y) (step S701: No loop). When a word pair is input (step S701: Yes), the control unit determines whether collation against all required masters has finished (step S702).

If collation against all required masters has not finished (step S702: No), the control unit acquires a master that has not yet been collated (step S703), checks whether word X and word Y both exist in the acquired master (step S704), and returns to step S702.

When collation against all masters has finished (step S702: Yes), the control unit (same-event index calculation unit 102) calculates the number of masters in which words X and Y both exist (both_use) (step S705).

The control unit (same-event index calculation unit 102) then calculates the same-event index by process B as 1 - (number of organizations in which both words exist in the same master) / (total number of organizations) (step S706). The processing then ends.

FIG. 8 is a diagram explaining specific example 2 of data processing by the same-event determination device according to the embodiment. It shows a concrete data processing example corresponding to same-event determination example 2 of FIG. 7. As in the previous example, under configuration example 1 or 2 the word X for which the same-event index is calculated is "ピペラジンアジピン酸塩" and the word Y is "アジピン酸ピペラジン" (both translate as "piperazine adipate").

Based on the input words X and Y, the master collation unit 101 of the same-event determination device 100 accesses the drug masters 600 of "hospitals 1 to n", which correspond to the organizations 1 to n. It then checks, for each master 600, whether both word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン" exist in it, and obtains the collation result 601. The collation result 601 shows that the master 600a of "hospital 1" does not contain both words X and Y (symbol: ×), whereas the master 600b of "hospital 2" contains both (symbol: 〇).

The same-event index calculation unit 102 then calculates the same-event index calculation result 602 for words X and Y from the collation result of the master collation unit 101. In the example of FIG. 8, it calculates the number of organizations that use words X and Y together (both_use) as "1".

The same-event index calculation unit 102 also calculates the same-event index value by process B as "0.95", from 1 - (number of organizations in which both words exist in the same master) / (total number of organizations). This value is for N (total number of masters) = 20, that is, 1 - 1/20. The same-event determination device 100 (control unit) outputs the same-event index calculation result 602 calculated by the same-event index calculation unit 102 externally.

This processing yields, as the same-event index calculation result 602, the number of masters in which words X and Y both exist (both_use) and the same-event index. In the example above, the result shows that, among the masters 600 of all "hospitals 1 to n", the master 600b of one hospital (hospital 2) contains words X and Y together.

The same-event index calculation result 602 also expresses, as the same-event index value "0.95", the degree to which the condition "no hospital 1 to n has both words X and Y in its master" holds. Because the value is 1 - both_use/N, it equals "1" when no master contains both words and moves away from "1" as more hospitals hold both words in their masters. Calculating the index with process B therefore presents the condition across all masters 600 in finer gradations than a binary value.
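The arithmetic of process B for the N = 20 case can be sketched as below. This is an illustrative sketch only: masters are assumed to be sets of strings, the function name `index_process_b` is hypothetical, the 20 sample masters are invented, and the handling of an empty master list is an assumption that the source does not address.

```python
from typing import Sequence, Set


def index_process_b(masters: Sequence[Set[str]], x: str, y: str) -> float:
    """Process B: 1 - (masters containing both words) / (all masters)."""
    if not masters:
        # Assumption: the source does not define the index for zero masters.
        raise ValueError("at least one master is required")
    both_use = sum(1 for master in masters if x in master and y in master)
    return 1 - both_use / len(masters)


X, Y = "ピペラジンアジピン酸塩", "アジピン酸ピペラジン"
# Hypothetical data: 20 hospitals, and only hospital 2 lists both notations.
masters = [{X} for _ in range(20)]
masters[1] = {X, Y}
print(round(index_process_b(masters, X, Y), 2))  # 0.95
```

The value reaches 1 only when both_use is 0, which is the case in which process A would also output "1".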

FIG. 9 is a flowchart showing same-event determination example 3 performed by the same-event determination device according to the embodiment. First, the control unit (master collation unit 101) waits for input of a word pair (two words X and Y) (step S901: No loop). When a word pair is input (step S901: Yes), the control unit determines whether collation against all required masters has finished (step S902).

If collation against all required masters has not finished (step S902: No), the control unit acquires a master that has not yet been collated (step S903). It then checks whether either word X or word Y exists in the acquired master (step S904), checks whether word X and word Y both exist in the acquired master (step S905), and returns to step S902.

When collation against all masters has finished (step S902: Yes), the control unit (same-event index calculation unit 102) calculates the number of organizations in which either word X or word Y exists (Share) (step S906). It also calculates the number of masters in which words X and Y both exist (both_use) (step S907).

By process B, the control unit (same-event index calculation unit 102) then calculates the same-event index as 1 - (number of organizations in which both words exist in the same master) / (total number of organizations). By process C, it calculates same-event index = 1 - (number of organizations in which both words exist in the same master) / (number of organizations in which either of the two words exists in the master) (step S908). The control unit then ends the series of processing.

FIG. 10 is a diagram explaining specific example 3 of data processing by the same-event determination device according to the embodiment. It shows data processing example 1 corresponding to same-event determination example 3 of FIG. 9. Here too, under configuration example 1 or 2 the word X for which the same-event index is calculated is "ピペラジンアジピン酸塩" and the word Y is "アジピン酸ピペラジン" (both translate as "piperazine adipate").

Based on the input words X and Y, the master collation unit 101 of the same-event determination device 100 accesses the drug masters 600 of "hospitals 1 to n", which correspond to the organizations 1 to n. It then obtains the collation result 601a by checking whether both word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン" exist together in each acquired master. The collation result 601a shows that the master 600a of "hospital 1" does not contain both words X and Y (symbol: ×), whereas the master 600b of "hospital 2" contains both (symbol: 〇).

The master collation unit 101 also obtains the collation result 601b by checking whether at least one of word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン" exists in each acquired master. The collation result 601b shows that the master 600a of "hospital 1" contains at least one of words X and Y (symbol: 〇) and that the master 600b of "hospital 2" also contains at least one of them (symbol: 〇).

The same-event index calculation unit 102 then calculates the same-event index calculation result 602 for words X and Y from the collation results 601a and 601b of the master collation unit 101. In the example of FIG. 10, process C based on the collation result 601b gives the number of organizations that use either word X or word Y (Share) as "13", and process B based on the collation result 601a gives the number of organizations that use words X and Y together (both_use) as "1". From these, the same-event index calculation unit 102 calculates the same-event index value "0.92" (1 - 1/13, rounded), which expresses the degree to which the condition "no hospital 1 to n has both words X and Y in its master" holds. The same-event determination device 100 (control unit) outputs the same-event index calculation result 602 externally.

This processing yields, as the same-event index calculation result 602, the number of organizations that use either word X or word Y (Share), the number of masters in which words X and Y both exist (both_use), and the same-event index. In the example above, the result shows that, among the masters 600 of all "hospitals 1 to n", the master 600b of one hospital (hospital 2) contains words X and Y together. It also shows that the masters 600a and 600b of some hospitals (hospitals 1 and 2) contain at least one of words X and Y.

The same-event index calculation unit 102 thus expresses, as the same-event index value "0.92" in the calculation result 602, the degree to which the condition "no hospital 1 to n has both words X and Y in its master" holds. Calculating the index with processes B and C in this way presents the condition across all masters 600 in finer gradations.
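The Share = 13, both_use = 1 case of FIG. 10 can be reproduced with a short sketch of process C. It is an illustration under assumptions rather than the embodiment's code: masters are modelled as sets of strings, `index_process_c` is a hypothetical name, the 20 sample masters are invented so that the counts match the figure, and raising an error when Share is 0 is an assumption because the source does not cover that case.

```python
from typing import Sequence, Set, Tuple


def index_process_c(masters: Sequence[Set[str]], x: str, y: str) -> Tuple[int, int, float]:
    """Process C: return (Share, both_use, 1 - both_use / Share)."""
    share = sum(1 for master in masters if x in master or y in master)
    both_use = sum(1 for master in masters if x in master and y in master)
    if share == 0:
        # Assumption: the source does not define the index when neither word is used.
        raise ValueError("neither word appears in any master")
    return share, both_use, 1 - both_use / share


X, Y = "ピペラジンアジピン酸塩", "アジピン酸ピペラジン"
OTHER = "アトロピン硫酸塩水和物"
# Hypothetical data: hospital 1 lists X only, hospital 2 lists both,
# 11 more hospitals list one of the two, and 7 list neither.
masters = [{X}, {X, Y}] + [{X}] * 6 + [{Y}] * 5 + [{OTHER}] * 7
share, both_use, index = index_process_c(masters, X, Y)
print(share, both_use, round(index, 2))  # 13 1 0.92
```

Dividing by Share rather than by the total number of masters keeps hospitals that use neither word from pulling the index toward 1.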

FIG. 11 is a diagram explaining specific example 4 of data processing by the same-event determination device according to the embodiment. It shows data processing example 2 corresponding to same-event determination example 3 of FIG. 9. Data processing example 1 of FIG. 10 took a single pair (two words X and Y) as input. In data processing example 2 of FIG. 11, by contrast, the similar word extraction unit 401 described in configuration example 2 is provided; it obtains multiple pairs of words X and Y from a large word set and outputs them to the master collation unit 101.

In FIG. 11, the word set input to the same-event determination device 100 is a collection of drug names, including different notations of the same drug as well as different drugs. The similar word extraction unit 401 obtains the similarity of arbitrary pairs of words X and Y from this word set (for example, every pair, by round robin). In the example of FIG. 11, the similarity of word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン" is calculated as 0.1. The similarity of word X "ピペラジンアジピン酸塩" and word Y "アトロピン硫酸塩水和物" (atropine sulfate hydrate) is calculated as 0.1. The similarity of word X "アトロピン硫酸塩水和物" and word Y "アトロピン硫酸塩" (atropine sulfate) is calculated as 0.333.

For each pair of words X and Y extracted by the similar word extraction unit 401, the master collation unit 101 of the same-event determination device 100 accesses the drug masters 600 of "hospitals 1 to n", which correspond to the organizations 1 to n.

The master collation unit 101 then obtains the collation result 601a by checking whether both words of the pair, word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン", exist together in each acquired master. The collation result 601a shows that the master 600a of "hospital 1" does not contain both word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン" (symbol: ×), whereas the master 600b of "hospital 2" contains both (symbol: 〇).

The master collation unit 101 likewise obtains the collation result 601a for another pair, word X "ピペラジンアジピン酸塩" and word Y "アトロピン硫酸塩水和物", by checking whether both exist together in each acquired master. The collation result 601a shows that both words exist together in the master 600a of "hospital 1" and in the master 600b of "hospital 2" (symbol: 〇). The master collation unit 101 obtains the collation result 601a for each remaining pair of words X and Y in the same way.

The master collation unit 101 also obtains the collation result 601b by checking whether at least one word of the pair, word X "ピペラジンアジピン酸塩" or word Y "アジピン酸ピペラジン", exists in each acquired master. The collation result 601b shows that the master 600a of "hospital 1" and the master 600b of "hospital 2" each contain at least one of the two words (symbol: 〇).

The master collation unit 101 likewise obtains the collation result 601b for the other pair, word X "ピペラジンアジピン酸塩" and word Y "アトロピン硫酸塩水和物", by checking whether at least one of them exists in each acquired master. The collation result 601b shows that the master 600a of "hospital 1" and the master 600b of "hospital 2" each contain at least one of the two words (symbol: 〇). The master collation unit 101 obtains the collation result 601b for each remaining pair of words X and Y in the same way.

The same-event index calculation unit 102 then calculates a same-event index calculation result 602 for each pair of words X and Y from the collation results 601a and 601b of the master collation unit 101. In the example of FIG. 11, process C based on the collation result 601b gives, for each pair, the number of organizations that use either word X or word Y (Share). For the pair of word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン", for example, Share is calculated as "13".

Process B based on the collation result 601a gives the number of organizations that use words X and Y together (both_use). For the pair of word X "ピペラジンアジピン酸塩" and word Y "アジピン酸ピペラジン", for example, both_use is calculated as "1". The same-event index calculation unit 102 then calculates the same-event index value for this pair as "0.092". This value is the index "0.92", which expresses the degree to which the condition "no hospital 1 to n has both words X and Y in its master" holds, multiplied by the similarity "0.1".
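The similarity weighting for this pair can be checked with a small sketch. It assumes, as the passage indicates, that the process C index is simply multiplied by the word similarity; the function name `weighted_same_event_index` and the input validation are hypothetical additions.

```python
def weighted_same_event_index(both_use: int, share: int, similarity: float) -> float:
    """Multiply the process C index (1 - both_use / Share) by the word similarity."""
    if share <= 0:
        # Assumption: the source does not define the index when Share is 0.
        raise ValueError("share must be positive")
    if not 0 <= both_use <= share:
        raise ValueError("both_use must lie between 0 and share")
    return (1 - both_use / share) * similarity


# Values for the pair discussed above: both_use = 1, Share = 13, similarity = 0.1.
print(round(weighted_same_event_index(1, 13, 0.1), 3))  # 0.092
```

A pair whose words are never listed together but whose similarity is low therefore still receives a low weighted index.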

The same-event index calculation unit 102 calculates the same-event index calculation result 602 for each of the other pairs of words X and Y in the same way, from the collation results 601a and 601b of the master collation unit 101. The same-event determination device 100 (control unit) then outputs the same-event index calculation results 602 externally.

This processing yields, as the same-event index calculation result 602 for each pair of words X and Y, the number of organizations that use either word (Share), the number of masters in which both words exist (both_use), and the same-event index. The degree to which the condition "no hospital 1 to n has both words X and Y in its master" holds can thus be shown clearly as a same-event index value for each pair.

According to the embodiment described above, the masters 1 held by the organizations 1 to n are accessed, and it is checked whether any two words X and Y both exist in the one master 1 of one organization 1. Based on the collation results, a same-event index value is calculated that indicates whether the two words X and Y are both contained in each of the masters 1 of all organizations 1 to n. The premise is that, in any organization 1 to n, the same event (for example, a drug) is never recorded under two different words at the same time in one master 1 of that organization. Under this premise, if the two words never appear together in the master of any organization, the two different words are judged to denote the same event. If a master in which both words are recorded exists in any organization, the two different words are judged not to denote the same event.

Put differently, two words are acquired, the masters 1 held by the organizations 1 to n are accessed, and if both words X and Y exist in at least one of the masters 1, the two words X and Y are determined to have different meanings.

As a result, according to the embodiment, it is possible to determine accurately whether two different words are the same event. When the data in the masters (databases) of many organizations are integrated and utilized, the preparatory work of unifying notations so that "same event = same notation" holds can be performed efficiently. For example, a dictionary (corresponding to a synonym correspondence table) for determining whether words with different notations are the same event can be created easily.

The number of masters in which the two words X and Y both exist may also be calculated. This makes it possible to show concretely how many of the masters of the plurality of organizations contain both of the two words X and Y.

Based on the collation results, the same event index value may also be calculated as "1" when the two words X and Y are not contained together in any of the organizations, and as "0" otherwise. This makes it possible to indicate, with the simplest and clearest numerical value, whether the two words X and Y are contained together in the masters 1 of some of the target organizations 1 to n.

Based on the collation results, 1 - (the number of organizations in which both of the two words exist in the same master) / (the number of all organizations) may also be calculated as the same event index value. This makes it possible to express, as a concrete numerical value, the proportion of organizations in which the two words X and Y both exist relative to the total number of organizations, for the masters 1 of all the target organizations 1 to n. It also prevents an erroneous determination when even one organization lists the same event in its master under different words.
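To illustrate how the ratio-based index differs from the binary "1"/"0" index, the following sketch computes both from the same collation results. The function names `binary_index` and `ratio_index`, the list-of-booleans representation of the collation results, and the sample figures are assumptions for illustration only; the list is assumed to be non-empty.

```python
def binary_index(both_flags):
    """1 if no organization has both words in one master, otherwise 0."""
    return 0 if any(both_flags) else 1


def ratio_index(both_flags):
    """1 - (organizations with both words in one master) / (all organizations)."""
    return 1 - sum(both_flags) / len(both_flags)


# Hypothetical collation results for 10 organizations: only one of them
# lists both words in the same master.
both_flags = [True] + [False] * 9

print(binary_index(both_flags))  # 0
print(ratio_index(both_flags))   # 0.9
```

With a single exceptional organization, the binary index drops to 0, while the ratio-based index stays close to 1, so one organization that happens to list the same event under two words does not by itself overturn the result.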

The number of organizations 1 to n in which either of the two words X and Y exists may also be calculated. This makes it possible to show concretely how many of the plurality of organizations have either of the two words X and Y.

Based on the collation results, 1 - (the number of organizations in which both of the two words exist in the same master) / (the number of organizations in which either of the two words exists in the master) may also be calculated as the same event index value. This makes it possible to express, as a concrete numerical value, the proportion of organizations in which both of the two words X and Y exist relative to the number of organizations in which either of the two words X and Y exists, for the masters 1 of all the target organizations 1 to n. It also prevents the same event index from becoming high when only a few organizations use the two words X and Y.
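The effect of the denominator can be seen in a small worked example. The sketch below is illustrative only: the function names `index_over_all` and `index_over_share` and the sample counts are assumptions introduced here, and the handling of the case where no organization uses either word is a choice made for this sketch, not something the text specifies.

```python
def index_over_all(both_use, total):
    """1 - both_use / (number of all organizations)."""
    return 1 - both_use / total


def index_over_share(both_use, share):
    """1 - both_use / (number of organizations using either word)."""
    if share == 0:
        # Assumption of this sketch: the index is undefined when no
        # organization uses either word.
        raise ValueError("neither word is used by any organization")
    return 1 - both_use / share


# Hypothetical counts: 100 organizations in total, only 2 of them use
# either word, and 1 of those 2 lists both words in the same master.
print(index_over_all(1, 100))
print(index_over_share(1, 2))
```

With these counts the first index is about 0.99, which looks like strong evidence of a same event even though half of the organizations that actually use the words list them side by side. The second index is 0.5, which reflects that evidence more directly.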

The two target words may also be obtained by extracting pairs of two similar words from an arbitrary large set of words. A general-purpose technique can be used for the similarity between two words, and by targeting the two words of each pair obtained using the similarity, the overall processing for the same event index can be performed efficiently. For example, a large set of words is obtained by accessing the plurality of masters 1, pairs of two similar words can be extracted from this set, and the subsequent processing for calculating the same event index can be performed efficiently.
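As one possible sketch of this pre-filtering step, the code below extracts candidate pairs using the character-based similarity of `difflib.SequenceMatcher` from the Python standard library. The text does not name a particular similarity measure, so this choice, the function name `similar_pairs`, the threshold of 0.8, and the sample words are all assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(words, threshold=0.8):
    """Return (x, y, score) for every pair whose similarity >= threshold."""
    pairs = []
    # Deduplicate and sort so that the output order is deterministic.
    for x, y in combinations(sorted(set(words)), 2):
        score = SequenceMatcher(None, x, y).ratio()
        if score >= threshold:
            pairs.append((x, y, score))
    return pairs


# Hypothetical word set gathered from several masters.
words = ["aspirin 100mg", "aspirin 100 mg", "ibuprofen"]

for x, y, score in similar_pairs(words):
    print(x, y, round(score, 2))
```

Only the pairs that pass the threshold would then be collated against the masters, which avoids computing the same event index for every possible pair of words.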

The system that calculates the same event index (same event determination device 100) may also include a network interface 203 for communication connection to the masters 1 of the target organizations. With communication connections to the masters 1 of many organizations, the processing of calculating the same event index value, which indicates whether the two words are contained in the masters 1 of these organizations, can be performed efficiently.

From the above, according to the embodiment, it is possible to determine accurately whether two different words are listed as the same event (thing, action, etc.) across all masters of the same type (for example, drugs) in different organizations. In this respect, conventional synonym determination processing alone would, for example, erroneously judge that the two words "drug A" and "drug B" are highly likely to be the same drug. In contrast, the embodiment makes it possible to present clearly that the words for these two different drugs are listed as the same event in one master.

Note that the same event determination method described in the embodiment of the present invention can be realized by causing a processor of a server or the like to execute a program prepared in advance. The method is recorded on a computer-readable recording medium such as a hard disk, flexible disk, CD-ROM (Compact Disc-Read Only Memory), or flash memory, and is executed by being read from the recording medium by a computer. The method may also be distributed via a network such as the Internet.

The following appendices are further disclosed with respect to the embodiment described above.

(Appendix 1) A same event determination program characterized by causing a computer to execute processing comprising:
accessing masters respectively held by a plurality of organizations;
collating, for each of the masters, whether two arbitrary words exist in that one master;
calculating, for each of the masters and based on the collation results, a same event index value indicating whether the two words are contained in one of the masters; and
recording the calculated same event index value in the same storage unit.

(Appendix 2) The same event determination program according to Appendix 1, characterized by further calculating the number of the masters in which the two words both exist.

(Appendix 3) The same event determination program according to Appendix 1, characterized by calculating, based on the collation results and as the same event index value, a value according to the number of masters in which both of the two words exist relative to the total number of masters to be processed.

(Appendix 4) The same event determination program according to any one of Appendices 1 to 3, characterized by extracting, as the two words, two similar words from an arbitrary large set of words.

(Appendix 5) A same event determination method characterized in that a computer executes processing comprising:
accessing masters respectively held by a plurality of organizations;
collating, for each of the masters, whether two arbitrary words exist in that one master;
calculating, for each of the masters and based on the collation results, a same event index value indicating whether the two words are contained in one of the masters; and
recording the calculated same event index value in the same storage unit.

(Appendix 6) A same event determination system characterized by comprising:
a master collation unit that accesses masters respectively held by a plurality of organizations and collates, for each of the masters, whether two arbitrary words exist in that one master;
a same event index calculation unit that calculates, for each of the masters and based on the collation results, a same event index value indicating whether the two words are contained in one of the masters; and
a storage unit that records the calculated same event index value.

(Appendix 7) The same event determination system according to Appendix 6, characterized by comprising a similar word extraction unit that extracts a plurality of pairs of two similar words from a large set of words and outputs the two words of each pair to the master collation unit.

(Appendix 8) The same event determination system according to Appendix 6 or 7, characterized by comprising a network interface for communication connection to the masters of the organizations.

(Appendix 9) A same event determination program characterized by causing a computer to execute processing comprising:
acquiring two words;
accessing masters respectively held by a plurality of organizations; and
determining that the two words have different meanings when both of the two words exist in at least one of the plurality of masters.

100 Same event determination device
101 Master collation unit
102 Same event index calculation unit
201 CPU (control unit)
202 Memory
203 Network interface (IF)
205 Recording medium
210 Network
401 Similar word extraction unit
600 Master
601 (601a, 601b) Collation result
602 Same event index calculation result
X, Y (A pair of) words

Claims

1. A same event determination program characterized by causing a computer to execute processing comprising:
accessing master databases respectively owned by a plurality of organizations;
collating, for each of the master databases, whether two arbitrary words exist in that one master database;
calculating, based on the collation results for each of the master databases, a same event index value indicating whether the two words are the same event; and
recording the calculated same event index value in the same storage unit.

2. The same event determination program according to claim 1, characterized by calculating, based on the collation results for each of the master databases, the number of the master databases in which the two words both exist, and calculating, as the same event index value, a value according to the number of the master databases in which the two words both exist.

3. The same event determination program according to claim 1, characterized by calculating, based on the collation results for each of the master databases and as the same event index value, a value according to the ratio of the number of master databases in which both of the two words exist to the total number of the master databases to be collated.

4. The same event determination program according to any one of claims 1 to 3, characterized by extracting, as the two words, two similar words from an arbitrary large set of words.

5. A same event determination method characterized in that a computer executes processing comprising:
accessing master databases respectively owned by a plurality of organizations;
collating, for each of the master databases, whether two arbitrary words exist in that one master database;
calculating, based on the collation results for each of the master databases, a same event index value indicating whether the two words are the same event; and
recording the calculated same event index value in the same storage unit.

6. A same event determination system characterized by comprising:
a master collation unit that accesses master databases respectively owned by a plurality of organizations and collates, for each of the master databases, whether two arbitrary words exist in that one master database;
a same event index calculation unit that calculates, based on the collation results for each of the master databases, a same event index value indicating whether the two words are the same event; and
a storage unit that records the calculated same event index value.

7. The same event determination system according to claim 6, characterized by comprising a similar word extraction unit that extracts a plurality of pairs of two similar words from a large set of words and outputs the two words of each pair to the master collation unit.

8. A same event determination program characterized by causing a computer to execute processing comprising:
acquiring two words;
accessing master databases respectively owned by a plurality of organizations; and
determining that the two words are not the same event when both of the two words exist in at least one of the plurality of master databases.