JP7234622B2

JP7234622B2 - Document retrieval program, document retrieval method and document retrieval system

Info

Publication number: JP7234622B2
Application number: JP2018240065A
Authority: JP
Inventors: 潮柏木; 亜紀山下
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-12-21
Filing date: 2018-12-21
Publication date: 2023-03-08
Anticipated expiration: 2038-12-21
Also published as: JP2020102021A

Description

The present invention relates to a document retrieval program, a document retrieval method, and a document retrieval system.

Conventionally, when treating a patient, a doctor may consider the future treatment policy by referring to past medical records. For example, for a patient with a long history of hospital visits, the doctor may refer to that patient's past medical records to decide which treatment or drug to choose. Medical records are searched using, for example, a keyword search.

As prior art, there is, for example, a technique that performs a negative/positive determination on contact information based on positive words and negative words held in a storage unit that stores a list of positive words and a list of negative words. There is also a technique that analyzes whether each classified element is positive or negative, accumulates the elements in a positive dictionary and a negative dictionary, compares the elements of the positive dictionary with those of the negative dictionary, and, when an element appears in both, deletes that element from both dictionaries.

JP 2016-212925 A
JP 2008-204355 A

However, the conventional techniques may be unable to determine the positive or negative connotation of a word when retrieving documents. If the positive or negative connotation of a word cannot be determined, it is difficult to perform a document retrieval that takes positive or negative conditions into account.

In one aspect, an object of the present invention is to make it possible to identify the tendency of the positive or negative connotation of a word and, in turn, to realize a document retrieval that takes positive or negative conditions into account.

In one embodiment, a document retrieval program is provided that refers to a first storage unit in which documents classifiable as positive or negative are registered, calculates, for a word to be processed, an index value indicating the tendency of the word's positive or negative connotation based on the appearance frequency of the word in each of the documents classifiable as positive or negative, and registers the calculated index value in a second storage unit in association with the word.

According to one aspect of the present invention, it is possible to identify the tendency of the positive or negative connotation of a word and, in turn, to realize a document retrieval that takes positive or negative conditions into account.

FIG. 1 is an explanatory diagram showing an example of a document retrieval method according to an embodiment.
FIG. 2 is an explanatory diagram showing a system configuration example of the document retrieval system 200.
FIG. 3 is a block diagram showing a hardware configuration example of the information processing apparatus 101.
FIG. 4 is an explanatory diagram showing an example of the contents stored in the disease name DB 220.
FIG. 5 is an explanatory diagram showing an example of the contents stored in the chart DB 230.
FIG. 6 is an explanatory diagram showing an example of the contents stored in the negative/positive DB 240.
FIG. 7 is a block diagram showing a functional configuration example of the information processing apparatus 101.
FIG. 8 is an explanatory diagram showing an example of the contents stored in the negative/positive count table 800.
FIG. 9 is an explanatory diagram showing an example of screen transitions of the client device 201.
FIG. 10 is an explanatory diagram showing another example of the search result screen.
FIG. 11 is an explanatory diagram showing another example of the keyword search screen.
FIG. 12 is a flowchart (part 1) showing an example of the negative/positive registration processing procedure of the information processing apparatus 101.
FIG. 13 is a flowchart (part 2) showing an example of the negative/positive registration processing procedure of the information processing apparatus 101.
FIG. 14 is a flowchart (part 3) showing an example of the negative/positive registration processing procedure of the information processing apparatus 101.
FIG. 15 is a flowchart showing an example of a specific processing procedure of the negative count calculation processing.
FIG. 16 is a flowchart showing an example of a specific processing procedure of the positive count calculation processing.
FIG. 17 is a flowchart (part 1) showing an example of the document retrieval processing procedure of the information processing apparatus 101.
FIG. 18 is a flowchart (part 2) showing an example of the document retrieval processing procedure of the information processing apparatus 101.

Embodiments of a document retrieval program, a document retrieval method, and a document retrieval system according to the present invention are described below in detail with reference to the drawings.

(Embodiment)
FIG. 1 is an explanatory diagram showing an example of a document retrieval method according to an embodiment. In FIG. 1, an information processing apparatus 101 is a computer that supports document retrieval. A document is electronic information that can be searched by keyword. Keyword search is a function that retrieves the documents corresponding to a keyword, that is, the documents containing the character or character string entered as the keyword.

Documents to be retrieved are, for example, articles in medical charts (medical records) created at medical institutions, or reports created in research and development. A medical chart describes the patient's chief complaint, the doctor's findings (assessment), the diagnosis, the treatment plan (treatment policy), and the like. A research and development report describes the researcher's findings, the research results, and the like.

Some of these documents have positive content and others have negative content. Positive means, for example, affirmative, proactive, forward-looking, or improving. Negative means, for example, disapproving, passive, backward-looking, or worsening. For example, medical chart articles include articles written when symptoms improved (positive) and articles written when symptoms worsened (negative).

When retrieving documents, there are cases where one wants the retrieval to take a negative or positive condition into account. For example, when treating a patient with a "fever" symptom, a doctor may want the patient's past chart articles from times when the symptoms worsened, in order to judge which treatments or drugs are unsuitable for that patient.

However, document retrieval by keyword search simply checks whether the character string entered as the keyword matches, so the information actually wanted may not be hit, or finding it may take time. For example, a document retrieval with the keyword "fever" also hits unneeded information such as "the fever has subsided", and the wanted information may be buried among such hits.
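The limitation described above can be illustrated with a minimal Python sketch of a plain substring-based keyword search. The article texts and the function name `keyword_search` are hypothetical and chosen only for illustration; they are not taken from the patent's figures.

```python
# Hypothetical chart articles (illustrative only).
articles = [
    "fever persists and is getting worse",
    "the fever has subsided",
    "pain in the lower abdomen",
]


def keyword_search(docs, keyword):
    """Return every document that contains the keyword as a substring."""
    return [doc for doc in docs if keyword in doc]


# Both the worsening article and the recovery article are hit,
# because plain matching ignores the positive/negative context.
hits = keyword_search(articles, "fever")
```

A doctor looking only for worsening cases would still have to read through every hit, which is the problem the embodiment sets out to address.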

One conceivable approach is therefore to judge the positive or negative connotation of a document from the tendency of the positive or negative connotation of the words it contains, and to perform a document retrieval that takes a negative or positive condition into account. In this case, information identifying the tendency of positive or negative connotation must be held in advance for a wide variety of words.

However, manually checking the connotation of each of these words one by one and registering information indicating its positive or negative tendency in a database or the like in advance takes time and effort and is not realistic. In addition, for some words the tendency of positive or negative connotation is difficult to judge manually.

The present embodiment therefore describes a document retrieval method that makes it possible to identify the tendency of the positive or negative connotation of a word and, in turn, realizes a document retrieval that takes positive or negative conditions into account. A processing example of the information processing apparatus 101 is described below.

(1) The information processing apparatus 101 receives a word to be processed. Here, a word to be processed is a word for which an index value indicating the tendency of its positive or negative connotation is to be calculated. The tendency of positive connotation expresses how positive a meaning the word has. The tendency of negative connotation expresses how negative a meaning the word has.

Specifically, for example, the information processing apparatus 101 may receive the word to be processed through a user's operation input or by receiving it from an external computer. Alternatively, the information processing apparatus 101 may extract words from a document and accept the extracted words as words to be processed.

The example of FIG. 1 assumes that the word "xxx" is received as the word to be processed.

(2) The information processing apparatus 101 refers to the first storage unit 110 and, for the word to be processed, calculates an index value indicating the tendency of the word's positive or negative connotation based on the appearance frequency of the word in each of the documents classifiable as positive or negative.

The first storage unit 110 is a storage unit in which documents classifiable as positive or negative are registered. A document classifiable as positive or negative is a document that can be classified as having either positive content or results, or negative content or results. The first storage unit 110 stores, for example, in association with each document, information for classifying that document as positive or negative.

A word that appears more frequently in documents classified as positive can be said to have a stronger positive connotation. Conversely, a word that appears more frequently in documents classified as negative can be said to have a stronger negative connotation. However, a word that appears in documents classified as positive may also appear in documents classified as negative, and vice versa.

For this reason, the information processing apparatus 101 calculates the index value of the word to be processed based on both its appearance frequency in documents classified as positive and its appearance frequency in documents classified as negative. The appearance frequency of a word in a document can be obtained, for example, by dividing the number of times the word appears in the document by the total number of words contained in the document.

In the example of FIG. 1, the first storage unit 110 stores a document 111 classified as positive and a document 112 classified as negative. The appearance frequency of the word "xxx" in the document 111 classified as positive is denoted "appearance frequency f1", and the appearance frequency of the word "xxx" in the document 112 classified as negative is denoted "appearance frequency f2".

In this case, the information processing apparatus 101 calculates the index value v1 of the word "xxx" based on the appearance frequency f1 in the document 111 classified as positive and the appearance frequency f2 in the document 112 classified as negative. More specifically, for example, the index value v1 may be expressed by the difference between the appearance frequency f1 and the appearance frequency f2 (for example, f1 - f2).

Here, the index value v1 is the value obtained by subtracting the appearance frequency f2 from the appearance frequency f1. In this case, the larger the index value v1, the stronger the tendency of the word "xxx" toward a positive connotation, and the smaller the index value v1, the stronger its tendency toward a negative connotation.

Although the case of a single document classified as positive (or negative) has been described here as an example, when there are multiple such documents, the information processing apparatus 101 may calculate the appearance frequency of the word over those documents as a whole.
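The calculation in step (2) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names `appearance_frequency` and `index_value` are hypothetical, documents are assumed to be already split into word lists (tokenizing Japanese text would need a morphological analyzer, which is outside this sketch), and the sample words are invented.

```python
from collections import Counter


def appearance_frequency(docs, word):
    """Occurrences of `word` over all docs, divided by the total word count of those docs."""
    counts = Counter()
    total_words = 0
    for tokens in docs:
        counts.update(tokens)
        total_words += len(tokens)
    return counts[word] / total_words if total_words else 0.0


def index_value(positive_docs, negative_docs, word):
    """Index value v = f1 - f2 (positive frequency minus negative frequency)."""
    f1 = appearance_frequency(positive_docs, word)
    f2 = appearance_frequency(negative_docs, word)
    return f1 - f2


# Invented, pre-tokenized sample documents.
positive_docs = [["pain", "relieved", "today"], ["no", "pain"]]  # "pain": 2 of 5 words
negative_docs = [["pain", "worse", "pain", "severe"]]            # "pain": 2 of 4 words

v = index_value(positive_docs, negative_docs, "pain")  # 0.4 - 0.5, i.e. about -0.1
```

With this sign convention a negative result, as for "pain" here, indicates a tendency toward a negative connotation, matching the interpretation of v1 given above.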

(3) The information processing apparatus 101 registers the calculated index value in the second storage unit 120 in association with the word. Here, the second storage unit 120 is a storage unit that stores, in association with each word, an index value indicating the tendency of that word's positive or negative connotation.

In the example of FIG. 1, the information processing apparatus 101 registers the calculated index value v1 in the second storage unit 120 in association with the word "xxx".

In this way, the information processing apparatus 101 can obtain and register an index value from which the tendency of the positive or negative connotation of the word to be processed can be identified. This makes it possible to estimate, from the tendency of a word's positive or negative connotation, the tendency of the positive or negative connotation of a document containing that word and, in turn, to realize a document retrieval that takes positive or negative conditions into account.

In the example of FIG. 1, the index value v1, from which the tendency of the positive or negative connotation of the word "xxx" can be identified, can be registered. As a result, when performing a document retrieval, the index value v1 of the word "xxx" can be used to estimate the tendency of the positive or negative connotation of a document containing the word "xxx".
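Step (3) and the later use of the registered values can be sketched as below. This is only an illustration under stated assumptions: the second storage unit is modeled as an in-memory dictionary, and the way word-level index values are combined into a document-level estimate (a plain sum, with unregistered words counted as 0) is an assumption made for this sketch, since this part of the description does not specify the aggregation. All names and numbers are hypothetical.

```python
# Hypothetical stand-in for the second storage unit 120: word -> index value.
second_storage = {}


def register(word, value):
    """Register the index value in association with the word."""
    second_storage[word] = value


def estimate_document_tendency(tokens):
    """Assumed aggregation: sum the registered index values of the document's words."""
    return sum(second_storage.get(token, 0.0) for token in tokens)


register("improved", 0.2)   # invented value: leans positive
register("worsened", -0.3)  # invented value: leans negative

positive_leaning = estimate_document_tendency(["fever", "improved"])
negative_leaning = estimate_document_tendency(["fever", "worsened"])
```

Under this assumed scoring, a retrieval restricted to negative documents could keep only the hits whose estimate is below zero.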

(System Configuration Example of Document Retrieval System 200)
Next, a system configuration example of the document retrieval system 200 including the information processing apparatus 101 shown in FIG. 1 is described. The document retrieval system 200 is applied, for example, to the electronic medical record system of a medical institution. Medical institutions include hospitals, clinics, and the like.

FIG. 2 is an explanatory diagram showing a system configuration example of the document retrieval system 200. In FIG. 2, the document retrieval system 200 includes the information processing apparatus 101 and a client device 201. In the document retrieval system 200, the information processing apparatus 101 and the client device 201 are connected via a wired or wireless network 210. The network 210 is, for example, a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet.

Here, the information processing apparatus 101 has a disease name DB (Database) 220, a chart DB 230, and a negative/positive DB 240. The information processing apparatus 101 is, for example, a server of an electronic medical record system. The contents stored in the DBs 220, 230, and 240 are described later with reference to FIGS. 4 to 6.

The client device 201 is a computer used by a user of the document retrieval system 200. A user of the document retrieval system 200 is, for example, a medical worker such as a doctor or a nurse at a medical institution. The client device 201 is, for example, a PC (Personal Computer) or a tablet PC.

In the above description, the information processing apparatus 101 and the client device 201 are provided as separate units, but the configuration is not limited to this. For example, the information processing apparatus 101 may be implemented by the client device 201.

(Hardware Configuration Example of Information Processing Apparatus 101)
FIG. 3 is a block diagram showing a hardware configuration example of the information processing apparatus 101. In FIG. 3, the information processing apparatus 101 has a CPU (Central Processing Unit) 301, a memory 302, a disk drive 303, a disk 304, a communication I/F (Interface) 305, a portable recording medium I/F 306, and a portable recording medium 307. These components are connected to one another by a bus 300.

Here, the CPU 301 controls the entire information processing apparatus 101. The CPU 301 may have multiple cores. The memory 302 has, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), and a flash ROM. Specifically, for example, the flash ROM stores the OS (Operating System) program, the ROM stores application programs, and the RAM is used as the work area of the CPU 301. A program stored in the memory 302 is loaded into the CPU 301 and causes the CPU 301 to execute the processing coded in it.

The disk drive 303 controls reading and writing of data on the disk 304 under the control of the CPU 301. The disk 304 stores data written under the control of the disk drive 303. Examples of the disk 304 include a magnetic disk and an optical disk.

The communication I/F 305 is connected to the network 210 through a communication line and, via the network 210, to an external computer (for example, the client device 201 shown in FIG. 2). The communication I/F 305 serves as the interface between the network 210 and the inside of the apparatus, and controls the input and output of data to and from the external computer. For example, a modem or a LAN adapter can be used as the communication I/F 305.

The portable recording medium I/F 306 controls reading and writing of data on the portable recording medium 307 under the control of the CPU 301. The portable recording medium 307 stores data written under the control of the portable recording medium I/F 306. Examples of the portable recording medium 307 include a CD (Compact Disc)-ROM, a DVD (Digital Versatile Disk), and a USB (Universal Serial Bus) memory.

In addition to the components described above, the information processing apparatus 101 may have, for example, an SSD (Solid State Drive), an input device, and a display. Of the components described above, the information processing apparatus 101 may omit, for example, the disk drive 303, the disk 304, the portable recording medium I/F 306, and the portable recording medium 307. The client device 201 shown in FIG. 2 can be realized by the same hardware configuration as the information processing apparatus 101, except that the client device 201 additionally has an input device, a display, and the like.

(Stored Contents of DBs 220, 230, and 240)
Next, the contents stored in the DBs 220, 230, and 240 of the information processing apparatus 101 are described with reference to FIGS. 4 to 6. The DBs 220, 230, and 240 are realized, for example, by storage devices such as the memory 302 and the disk 304 shown in FIG. 3.

FIG. 4 is an explanatory diagram showing an example of the contents stored in the disease name DB 220. In FIG. 4, the disease name DB 220 has fields for patient ID, disease name code, disease name, main disease name category, disease name start date, outcome, and outcome date. Setting information in each field stores disease name information (for example, disease name information 400-1 to 400-8) as records.

Here, the patient ID is an identifier that uniquely identifies a patient. A patient is, for example, a patient of the medical institution where the document retrieval system 200 is introduced. The disease name code is an identifier that uniquely identifies the disease name given to the patient. The disease name is the name of the disease. Disease names include main disease names and suspected disease names. A main disease name is the name of the patient's principal disease. A suspected disease name is the name of a disease the patient is suspected of having.

The main disease name category is a flag indicating whether the disease name is a main disease name. The category "1" indicates a main disease name, and the category "0" indicates that it is not a main disease name. The disease name start date is the date on which the disease name was given. The outcome indicates the state reached as a result of the disease progressing or improving. The outcome date is the date on which the outcome was recorded.

For example, the disease name information 400-1 indicates, for the patient with patient ID "P1", the disease name code "12345", the disease name "stomach tumor", the main disease name category "1", the disease name start date "2018/10/1", the outcome "death", and the outcome date "2018/11/1".
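The record layout of the disease name DB 220 can be pictured with the following Python sketch. The class name, the field names, and the choice of types (a boolean for the "1"/"0" category, `Optional` for outcome fields that may not yet be filled in) are assumptions made for illustration; the description only names the fields.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DiseaseNameRecord:
    """One record of the disease name DB 220 (field names are hypothetical)."""
    patient_id: str
    disease_code: str
    disease_name: str
    is_main_disease: bool          # main disease name category: "1" -> True, "0" -> False
    start_date: date               # date the disease name was given
    outcome: Optional[str]         # assumed to be empty while no outcome is recorded
    outcome_date: Optional[date]


# Disease name information 400-1 as described in the text.
record_400_1 = DiseaseNameRecord(
    patient_id="P1",
    disease_code="12345",
    disease_name="stomach tumor",
    is_main_disease=True,
    start_date=date(2018, 10, 1),
    outcome="death",
    outcome_date=date(2018, 11, 1),
)
```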

FIG. 5 is an explanatory diagram showing an example of the contents stored in the chart DB 230. In FIG. 5, the chart DB 230 has fields for patient ID, entry date, SOAP category, and article. Setting information in each field stores chart information (for example, chart information 500-1 to 500-8) as records.

Here, the patient ID is an identifier that uniquely identifies a patient. The entry date is the date on which the chart article was written. The SOAP category indicates which part of a chart written in SOAP format the article concerns. The SOAP category "S" indicates the patient's chief complaint. "S" is the initial letter of Subjective.

The SOAP category "O" indicates the patient's test results and the like. "O" is the initial letter of Objective. The SOAP category "A" indicates the doctor's findings (assessment). "A" is the initial letter of Assessment. The SOAP category "P" indicates the treatment plan. "P" is the initial letter of Plan. The article is the content written in the chart.

For example, the chart information 500-1 indicates the patient ID "P1", the entry date "2018/10/1", the SOAP category "S", and the article "I have had pain in my lower abdomen for a week".
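A chart DB 230 record and its SOAP category can be sketched in Python as follows. The class and field names are hypothetical, and modeling the category as an enumeration is an illustrative choice rather than something the description prescribes.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class SoapCategory(Enum):
    """SOAP categories, keyed by the single letter stored in the DB."""
    S = "Subjective"   # patient's chief complaint
    O = "Objective"    # test results and the like
    A = "Assessment"   # doctor's findings
    P = "Plan"         # treatment plan


@dataclass(frozen=True)
class ChartRecord:
    """One record of the chart DB 230 (field names are hypothetical)."""
    patient_id: str
    entry_date: date
    soap: SoapCategory
    article: str


# Chart information 500-1 as described in the text.
record_500_1 = ChartRecord(
    patient_id="P1",
    entry_date=date(2018, 10, 1),
    soap=SoapCategory["S"],  # look up by the stored letter
    article="I have had pain in my lower abdomen for a week",
)
```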

FIG. 6 is an explanatory diagram showing an example of the contents stored in the negative/positive DB 240. In FIG. 6, the negative/positive DB 240 has fields for element, disease name code, disease name, positive count, positive appearance frequency, negative count, negative appearance frequency, and negative-positive degree. Setting information in each field stores negative/positive information (for example, negative/positive information 600-1 to 600-10) as records.

Here, the element is a word used in a document. A document is, for example, an article in the chart information. The disease name code is an identifier that uniquely identifies a disease name. The disease name is the name of the disease. The positive count is the number of times the word appeared in documents classified as positive. The positive appearance frequency is the appearance frequency of the word in documents classified as positive.

The negative count is the number of times the word appeared in documents classified as negative. The negative appearance frequency is the appearance frequency of the word in documents classified as negative. The negative-positive degree is an index value indicating the tendency of the word's positive or negative connotation. Here, the negative-positive degree is the value obtained by subtracting the negative appearance frequency from the positive appearance frequency.

例えば、ネガポジ情報６００－１は、要素「痛み」、病名コード「１２３４５」、病名「胃腫瘍」、ポジティブ件数「５」、ポジティブ出現頻度「０．１１３６３６３６４」、ネガティブ件数「１０」、ネガティブ出現頻度「０．２６３１５７８９５」およびネガポジ度数「－０．１４９５２１５３１」を示す。 For example, the negative-positive information 600-1 indicates the element "pain", the disease name code "12345", the disease name "stomach tumor", the positive number "5", the positive appearance frequency "0.113636364", the negative number "10", the negative appearance frequency "0.263157895", and the negative-positive frequency "-0.149521531".
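As an illustration of the record layout described above, the following minimal Python sketch models one record of the negative-positive DB 240 and checks that the values given for 600-1 agree with the stated definition (positive appearance frequency minus negative appearance frequency). The class name and field names are assumptions made for this sketch and do not appear in the source.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class NegPosRecord:
    """One record of the negative-positive DB 240 (names are illustrative)."""
    element: str               # word used in a document
    disease_code: str          # identifier of the disease name
    disease_name: str
    positive_count: int        # occurrences in positively classified documents
    positive_frequency: float
    negative_count: int        # occurrences in negatively classified documents
    negative_frequency: float
    negpos_frequency: float    # positive_frequency - negative_frequency


# Values of negative-positive information 600-1 as given in the text.
record_600_1 = NegPosRecord(
    element="pain",
    disease_code="12345",
    disease_name="stomach tumor",
    positive_count=5,
    positive_frequency=0.113636364,
    negative_count=10,
    negative_frequency=0.263157895,
    negpos_frequency=-0.149521531,
)

# The stored index value should equal the difference of the two frequencies.
difference = record_600_1.positive_frequency - record_600_1.negative_frequency
is_consistent = math.isclose(difference, record_600_1.negpos_frequency, abs_tol=1e-9)
```

A negative value such as this one indicates that, for this disease, the word tends to appear more often in negatively classified documents.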

（情報処理装置１０１の機能的構成例） (Example of functional configuration of information processing apparatus 101)
図７は、情報処理装置１０１の機能的構成例を示すブロック図である。図７において、情報処理装置１０１は、第１の受付部７０１と、算出部７０２と、登録部７０３と、第２の受付部７０４と、検索部７０５と、決定部７０６と、出力部７０７と、第１の記憶部７１０と、第２の記憶部７２０と、を含む。具体的には、例えば、第１の受付部７０１～出力部７０７は、図３に示したメモリ３０２、ディスク３０４などの記憶装置に記憶されたプログラムをＣＰＵ３０１に実行させることにより、または、通信Ｉ／Ｆ３０５により、その機能を実現する。各機能部の処理結果は、例えば、メモリ３０２、ディスク３０４などの記憶装置に記憶される。第１の記憶部７１０および第２の記憶部７２０は、例えば、メモリ３０２、ディスク３０４などの記憶装置により実現される。 FIG. 7 is a block diagram showing a functional configuration example of the information processing apparatus 101. In FIG. 7, the information processing apparatus 101 includes a first reception unit 701, a calculation unit 702, a registration unit 703, a second reception unit 704, a search unit 705, a determination unit 706, an output unit 707, a first storage unit 710, and a second storage unit 720. Specifically, for example, the first reception unit 701 through the output unit 707 realize their functions by causing the CPU 301 to execute a program stored in a storage device such as the memory 302 or the disk 304 shown in FIG. 3, or by means of the communication I/F 305. The processing results of each functional unit are stored in a storage device such as the memory 302 or the disk 304, for example. The first storage unit 710 and the second storage unit 720 are realized by storage devices such as the memory 302 and the disk 304, for example.

第１の記憶部７１０は、ポジティブまたはネガティブに分類可能な文書を記憶する。具体的には、例えば、第１の記憶部７１０は、図４に示した病名ＤＢ２２０および図５に示したカルテＤＢ２３０を記憶する。図１に示した第１の記憶部１１０は、例えば、第１の記憶部７１０に相当する。 The first storage unit 710 stores documents that can be classified as positive or negative. Specifically, for example, the first storage unit 710 stores the disease name DB 220 shown in FIG. 4 and the chart DB 230 shown in FIG. The first storage unit 110 illustrated in FIG. 1 corresponds to the first storage unit 710, for example.

第２の記憶部７２０は、単語と対応付けて、当該単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値を記憶する。具体的には、例えば、第２の記憶部７２０は、図６に示したネガポジＤＢ２４０を記憶する。図１に示した第２の記憶部１２０は、例えば、第２の記憶部７２０に相当する。 The second storage unit 720 stores an index value indicating the tendency of the word to have a positive or negative connotation in association with the word. Specifically, for example, the second storage unit 720 stores the negative-positive DB 240 shown in FIG. The second storage unit 120 illustrated in FIG. 1 corresponds to the second storage unit 720, for example.

第１の受付部７０１は、処理対象となる単語を受け付ける。ここで、処理対象となる単語は、単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値を算出する対象の単語である。処理対象となる単語は、特定の品詞の単語であってもよい。特定の品詞は、任意に設定可能であり、例えば、名詞、動詞、形容詞、形容動詞、副詞などである。 The first reception unit 701 receives a word to be processed. Here, the word to be processed is a word for which an index value indicating the tendency of the word to have a positive or negative connotation is to be calculated. The word to be processed may be a word of a specific part of speech. The specific part of speech can be set arbitrarily and is, for example, a noun, verb, adjective, adjectival noun (na-adjective), or adverb.

具体的には、例えば、第１の受付部７０１は、文書から抽出された単語を、処理対象となる単語として受け付ける。文書は、例えば、第１の記憶部７１０に記憶されたポジティブまたはネガティブに分類可能な文書であってもよい。より詳細に説明すると、例えば、第１の受付部７０１は、図５に示したカルテＤＢ２３０内のカルテ情報の記事から抽出された特定の品詞の単語を、処理対象となる単語として受け付けることにしてもよい。 Specifically, for example, the first reception unit 701 receives words extracted from a document as words to be processed. The document may be, for example, a document stored in the first storage unit 710 that can be classified as positive or negative. More specifically, for example, the first reception unit 701 may receive, as words to be processed, words of a specific part of speech extracted from articles of medical record information in the medical record DB 230 shown in FIG. 5.

算出部７０２は、第１の記憶部７１０を参照して、処理対象となる単語について、ポジティブまたはネガティブに分類可能な文書それぞれにおける単語の出現頻度に基づき、単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値を算出する。 The calculation unit 702 refers to the first storage unit 710 and, for the word to be processed, calculates an index value indicating the tendency of the word to have a positive or negative connotation, based on the appearance frequency of the word in each of the documents that can be classified as positive or negative.

以下の説明では、ポジティブまたはネガティブな意味合いの傾向を示す指標値を「ネガポジ度数」と表記する場合がある。 In the following description, an index value indicating a tendency of positive or negative connotations may be referred to as "negative-positive frequency".

具体的には、例えば、算出部７０２は、第１の記憶部７１０に記憶された各文書を、ポジティブまたはネガティブのいずれかに分類する。つぎに、算出部７０２は、ポジティブに分類された文書における単語の出現頻度を算出する。また、算出部７０２は、ネガティブに分類された文書における単語の出現頻度を算出する。そして、算出部７０２は、算出したポジティブに分類された文書における単語の出現頻度と、ネガティブに分類された文書における単語の出現頻度とに基づいて、単語のネガポジ度数を算出する。 Specifically, for example, the calculation unit 702 classifies each document stored in the first storage unit 710 as either positive or negative. Next, the calculation unit 702 calculates the appearance frequency of words in positively classified documents. The calculation unit 702 also calculates the frequency of appearance of words in negatively classified documents. Then, the calculating unit 702 calculates the negative-positive frequency of the word based on the calculated appearance frequency of the word in the positively classified document and the calculated appearance frequency of the word in the negatively classified document.

より詳細に説明すると、例えば、算出部７０２は、図４に示した病名ＤＢ２２０から、病名情報を選択する。つぎに、算出部７０２は、選択した病名情報の病名開始日から転帰日までの期間Ｔを特定する。そして、算出部７０２は、図５に示したカルテＤＢ２３０から、選択した病名情報の患者ＩＤに対応し、かつ、特定した期間Ｔに記載日が含まれるカルテ情報を選択する。 More specifically, for example, the calculation unit 702 selects disease name information from the disease name DB 220 shown in FIG. Next, the calculation unit 702 specifies a period T from the disease name start date to the outcome date of the selected disease name information. Then, the calculation unit 702 selects medical chart information that corresponds to the patient ID of the selected disease name information and that includes the date of description in the specified period T from the medical chart DB 230 shown in FIG.

ここで、選択した病名情報の転帰が「死亡」の場合、算出部７０２は、選択したカルテ情報の記事をネガティブに分類する。一方、選択した病名情報の転帰が「寛解または治癒」の場合、算出部７０２は、選択したカルテ情報の記事をポジティブに分類する。 Here, if the outcome of the selected disease name information is "death", the calculation unit 702 classifies the article of the selected medical chart information as negative. On the other hand, when the outcome of the selected disease name information is "remission or cure", the calculation unit 702 classifies the article of the selected medical chart information as positive.

これにより、患者の転帰が「寛解または治癒」であるか、または、「死亡」であるかに応じて、病気の治療中に記載されたカルテ情報の記事を、ポジティブまたはネガティブに分類することができる。 This makes it possible to classify articles of medical chart information written during treatment of a disease as positive or negative, depending on whether the patient's outcome is "remission or cure" or "death".
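The classification step described above can be sketched as follows. This is a minimal illustration, not the implementation of the calculation unit 702: the dictionary keys, the sample disease records, and two of the three chart entries are invented for the example, and outcomes other than "death" and "remission or cure" are simply skipped because the text does not say how they are handled.

```python
from datetime import date

# Hypothetical stand-ins for the disease name DB 220 and the chart DB 230.
disease_records = [
    {"patient_id": "P1", "disease_name": "stomach tumor",
     "start_date": date(2018, 9, 20), "outcome_date": date(2018, 12, 1),
     "outcome": "remission or cure"},
    {"patient_id": "P2", "disease_name": "stomach tumor",
     "start_date": date(2018, 8, 1), "outcome_date": date(2018, 11, 15),
     "outcome": "death"},
]
chart_records = [
    {"patient_id": "P1", "entry_date": date(2018, 10, 1),
     "article": "lower abdominal pain since 1 week ago"},
    {"patient_id": "P1", "entry_date": date(2019, 1, 10),
     "article": "follow-up visit"},            # outside period T
    {"patient_id": "P2", "entry_date": date(2018, 9, 5),
     "article": "pain is getting worse"},
]

OUTCOME_TO_LABEL = {"remission or cure": "positive", "death": "negative"}


def classify_articles(diseases, charts):
    """Label chart articles written within period T of each disease record."""
    labeled = []
    for disease in diseases:
        label = OUTCOME_TO_LABEL.get(disease["outcome"])
        if label is None:
            continue  # assumption: other outcomes are left unclassified
        for chart in charts:
            same_patient = chart["patient_id"] == disease["patient_id"]
            in_period = (disease["start_date"] <= chart["entry_date"]
                         <= disease["outcome_date"])
            if same_patient and in_period:
                labeled.append((label, disease["disease_name"], chart["article"]))
    return labeled


labeled_articles = classify_articles(disease_records, chart_records)
```

The disease name is carried along with each labeled article because the counts described later are kept per combination of word and disease name.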

なお、病名ＤＢ２２０内の病名情報のうち、病名開始日が所定期間内の病名情報だけを処理対象とすることにしてもよい。所定期間は、任意に設定可能であり、例えば、直近５年などの期間に設定される。これにより、古いカルテ情報の記事を処理対象から除外することができる。 Of the disease name information in the disease name DB 220, only disease name information whose disease name start date is within a predetermined period may be processed. The predetermined period can be set arbitrarily, and is set to a period such as the last five years, for example. This makes it possible to exclude articles of old medical record information from being processed.

また、病名ＤＢ２２０内の病名情報のうち、病名が主病名である病名情報、すなわち、主病名区分「１」の病名情報だけを処理対象とすることにしてもよい。これにより、例えば、病名が確定していないときに記載されたカルテ情報の記事を処理対象から除外することができる。 Further, among the disease name information in the disease name DB 220, only the disease name information whose disease name is the main disease name, that is, only the disease name information of the main disease name classification "1" may be processed. As a result, for example, medical record information articles written when the name of the disease has not been determined can be excluded from the processing targets.
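The two optional restrictions above (a recent disease name start date, and main disease names only) can be sketched as a simple filter. The function names, dictionary keys, and sample records are assumptions for illustration; the five-year window and the main disease classification value "1" come from the text.

```python
from datetime import date


def years_before(day, years):
    """Return the same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def select_target_diseases(diseases, today, years=5, main_only=True):
    """Keep disease records that started within the window and, optionally, main diagnoses only."""
    cutoff = years_before(today, years)
    selected = []
    for disease in diseases:
        if disease["start_date"] < cutoff:
            continue  # too old: its chart articles are excluded
        if main_only and disease["main_disease_class"] != "1":
            continue  # not the main disease name
        selected.append(disease)
    return selected


# Hypothetical sample records.
sample_diseases = [
    {"id": "A", "start_date": date(2018, 9, 20), "main_disease_class": "1"},
    {"id": "B", "start_date": date(2012, 5, 1), "main_disease_class": "1"},
    {"id": "C", "start_date": date(2017, 3, 3), "main_disease_class": "0"},
]
targets = select_target_diseases(sample_diseases, today=date(2018, 12, 21))
target_ids = [d["id"] for d in targets]
```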

つぎに、算出部７０２は、ポジティブに分類されたカルテ情報の記事を形態素解析することにより、当該記事を単語（形態素）単位に分解する。つぎに、算出部７０２は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する。そして、算出部７０２は、抽出した各単語について、ポジティブに分類されたカルテ情報の記事における出現回数を算出する。 Next, the calculation unit 702 morphologically analyzes the article of the medical record information classified as positive, thereby breaking down the article into words (morphemes). Next, the calculation unit 702 extracts words of a specific part of speech from the decomposed words (morphemes). Then, the calculation unit 702 calculates the number of appearances of each extracted word in articles of medical record information classified as positive.

なお、ポジティブに分類されたカルテ情報の記事における出現回数は、例えば、ポジティブに分類されたカルテ情報の記事のうち、全てのカルテ情報の記事を対象として算出されてもよく、また、一部のカルテ情報の記事を対象として算出されてもよい。 Note that the number of appearances in articles of medical record information classified as positive may be calculated, for example, over all of the positively classified articles, or over only some of them.

また、算出部７０２は、ネガティブに分類されたカルテ情報の記事を形態素解析することにより、当該記事を単語（形態素）単位に分解する。つぎに、算出部７０２は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する。そして、算出部７０２は、抽出した各単語について、ネガティブに分類されたカルテ情報の記事における出現回数を算出する。 In addition, the calculation unit 702 performs morphological analysis on articles of medical record information classified as negative, thereby decomposing the articles into word (morpheme) units. Next, the calculation unit 702 extracts words of a specific part of speech from the decomposed words (morphemes). Then, the calculation unit 702 calculates the number of appearances of each extracted word in articles of medical record information classified as negative.

なお、ネガティブに分類されたカルテ情報の記事における出現回数は、例えば、ネガティブに分類されたカルテ情報の記事のうち、全てのカルテ情報の記事を対象として算出されてもよく、また、一部のカルテ情報の記事を対象として算出されてもよい。 Note that the number of appearances in articles of medical record information classified as negative may be calculated, for example, over all of the negatively classified articles, or over only some of them.

以下の説明では、ポジティブに分類された文書（例えば、カルテ情報の記事）における単語の出現回数を「ポジティブ件数」と表記する場合がある。また、ネガティブに分類された文書（例えば、カルテ情報の記事）における単語の出現回数を「ネガティブ件数」と表記する場合がある。 In the following description, the number of occurrences of a word in positively classified documents (for example, articles of medical chart information) may be referred to as "positive number". Also, the number of occurrences of a word in a negatively classified document (for example, an article in medical record information) may be expressed as "negative number".
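The counting step can be sketched as below. Morphological analysis itself is outside the scope of this sketch: the input is assumed to be already decomposed into (word, part of speech) pairs by some analyzer, and the part-of-speech labels, function name, and sample tokens are hypothetical.

```python
from collections import Counter

# Assumed set of target parts of speech (the text says this is configurable).
TARGET_POS = {"noun", "verb", "adjective", "adverb"}


def count_words(tokenized_articles, target_pos=TARGET_POS):
    """Count occurrences of target-part-of-speech words over a set of articles."""
    counts = Counter()
    for tokens in tokenized_articles:
        counts.update(word for word, pos in tokens if pos in target_pos)
    return counts


# Hypothetical, pre-tokenized articles that were classified as positive.
positive_articles = [
    [("pain", "noun"), ("since", "particle"), ("week", "noun")],
    [("pain", "noun"), ("improve", "verb")],
]
positive_counts = count_words(positive_articles)
```

Running the same function over the negatively classified articles would give the negative number for each word.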

算出された単語のポジティブ件数およびネガティブ件数は、例えば、図８に示すようなネガポジ件数テーブル８００に記憶される。ネガポジ件数テーブル８００は、例えば、メモリ３０２、ディスク３０４などの記憶装置により実現される。 The positive counts and negative counts of the calculated words are stored, for example, in a negative-positive count table 800 as shown in FIG. The negative/positive number table 800 is realized by a storage device such as the memory 302 and the disk 304, for example.

図８は、ネガポジ件数テーブル８００の記憶内容の一例を示す説明図である。図８において、ネガポジ件数テーブル８００は、要素、病名コード、病名、ネガティブ件数およびポジティブ件数のフィールドを有し、各フィールドに情報を設定することで、件数情報（例えば、件数情報８００－１～８００－６）をレコードとして記憶する。 FIG. 8 is an explanatory diagram showing an example of the contents stored in the negative/positive number table 800. In FIG. 8, the negative/positive number table 800 has fields for element, disease name code, disease name, negative number, and positive number. By setting information in each field, number information (for example, number information 800-1 to 800-6) is stored as a record.

ここで、要素は、カルテ情報の記事で使用された単語である。病名コードは、病名を一意に識別する識別子である。病名は、カルテ情報に対応する病名情報、すなわち、カルテ情報を選択するにあたり参照された病名情報から特定される。病名は、病気の名称である。ネガティブ件数は、ネガティブに分類されたカルテ情報の記事における単語の出現回数である。ポジティブ件数は、ポジティブに分類されたカルテ情報の記事における単語の出現回数である。 Here, the elements are the words used in the medical record information article. A disease name code is an identifier that uniquely identifies a disease name. The disease name is specified from the disease name information corresponding to the medical chart information, ie, the disease name information referred to when selecting the medical chart information. A disease name is a name of a disease. The number of negatives is the number of occurrences of a word in articles of medical record information classified as negative. The number of positives is the number of occurrences of a word in articles of medical record information classified as positive.

例えば、件数情報８００－１は、要素「痛み」、病名コード「１２３４５」、病名「胃腫瘍」、ネガティブ件数「５」およびポジティブ件数「１０」を示す。 For example, the number information 800-1 indicates the element "pain", disease name code "12345", disease name "stomach tumor", negative number "5" and positive number "10".

例えば、算出部７０２は、ネガポジ件数テーブル８００を参照して、各単語のポジティブ件数を積算することにより、合計ポジティブ件数を算出する。合計ポジティブ件数は、ポジティブに分類されたカルテ情報の記事に含まれる単語（例えば、特定の品詞の単語）の総数に相当する。つぎに、算出部７０２は、各単語のポジティブ件数を合計ポジティブ件数で除算することにより、各単語のポジティブ出現頻度を算出する。ポジティブ出現頻度は、ポジティブに分類されたカルテ情報の記事における単語の出現頻度である。 For example, the calculation unit 702 refers to the negative/positive number table 800 and calculates the total number of positives by accumulating the number of positives for each word. The total number of positive cases corresponds to the total number of words (for example, words of a specific part of speech) included in articles of medical record information classified as positive. Next, the calculation unit 702 calculates the frequency of occurrence of positives for each word by dividing the number of positives for each word by the total number of positives. The positive appearance frequency is the appearance frequency of a word in articles of medical record information classified as positive.

また、算出部７０２は、ネガポジ件数テーブル８００を参照して、各単語のネガティブ件数を積算することにより、合計ネガティブ件数を算出する。合計ネガティブ件数は、ネガティブに分類されたカルテ情報の記事に含まれる単語（例えば、特定の品詞の単語）の総数に相当する。つぎに、算出部７０２は、各単語のネガティブ件数を合計ネガティブ件数で除算することにより、各単語のネガティブ出現頻度を算出する。ネガティブ出現頻度は、ネガティブに分類されたカルテ情報の記事における単語の出現頻度である。 The calculation unit 702 also refers to the negative/positive number table 800 and calculates the total number of negatives by accumulating the number of negatives of each word. The total number of negative cases corresponds to the total number of words (for example, words of a specific part of speech) included in articles of medical record information classified as negative. Next, the calculation unit 702 calculates the negative appearance frequency of each word by dividing the number of negative occurrences of each word by the total number of negative occurrences. Negative appearance frequency is the appearance frequency of a word in articles of medical record information classified as negative.

そして、算出部７０２は、算出した各単語のポジティブ出現頻度とネガティブ出現頻度とに基づいて、各単語のネガポジ度数を算出する。例えば、算出部７０２は、各単語のポジティブ出現頻度とネガティブ出現頻度との差分を、各単語のネガポジ度数として算出することにしてもよい。 Then, the calculation unit 702 calculates the negative-positive frequency of each word based on the calculated positive appearance frequency and negative appearance frequency of each word. For example, the calculation unit 702 may calculate the difference between the positive appearance frequency and the negative appearance frequency of each word as the negative-positive frequency of each word.

ネガポジ度数を、ポジティブ出現頻度からネガティブ出現頻度を引いた値とすると、ネガポジ度数が大きいほど、単語が持つポジティブな意味合いの傾向が強いといえる。一方、ネガポジ度数を、ネガティブ出現頻度からポジティブ出現頻度を引いた値とすると、ネガポジ度数が大きいほど、単語が持つネガティブな意味合いの傾向が強いといえる。ここでは、ネガポジ度数を、ポジティブ出現頻度からネガティブ出現頻度を引いた値とする。 Assuming that the negative-positive frequency is a value obtained by subtracting the negative-appearance frequency from the positive frequency, it can be said that the higher the negative-positive frequency, the stronger the tendency of the word to have a positive connotation. On the other hand, if the negative-positive frequency is a value obtained by subtracting the positive-appearance frequency from the negative frequency, it can be said that the larger the negative-positive frequency, the stronger the negative connotation of the word. Here, the negative-positive frequency is a value obtained by subtracting the negative appearance frequency from the positive appearance frequency.
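The calculation described above (per-word count divided by the total count on each side, then positive minus negative) can be sketched as follows. The function name and the sample counts are assumptions: the totals of 44 and 38 are not stated in the source and were chosen only so that the word "pain" reproduces the frequencies shown for 600-1.

```python
def negpos_frequencies(positive_counts, negative_counts):
    """Return {word: positive appearance frequency - negative appearance frequency}."""
    total_positive = sum(positive_counts.values())
    total_negative = sum(negative_counts.values())
    result = {}
    for word in set(positive_counts) | set(negative_counts):
        pos_freq = positive_counts.get(word, 0) / total_positive if total_positive else 0.0
        neg_freq = negative_counts.get(word, 0) / total_negative if total_negative else 0.0
        result[word] = pos_freq - neg_freq
    return result


# Hypothetical counts for one disease name; only "pain" (5 and 10) follows the text.
sample_positive = {"pain": 5, "improve": 39}   # total 44 (assumed)
sample_negative = {"pain": 10, "worsen": 28}   # total 38 (assumed)
sample_result = negpos_frequencies(sample_positive, sample_negative)
```

With these counts, "pain" receives a negative value, "improve" a positive one, and "worsen" a negative one, matching the convention that larger values indicate a stronger positive tendency.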

登録部７０３は、算出された単語のネガポジ度数を、単語と対応付けて第２の記憶部７２０に登録する。具体的には、例えば、登録部７０３は、算出された単語のネガポジ度数を、単語（要素）と対応付けて、図６に示したネガポジＤＢ２４０に登録する。この際、登録部７０３は、病名コード、病名、ポジティブ件数、ポジティブ出現頻度、ネガティブ件数およびネガティブ出現頻度をさらに対応付けて、ネガポジＤＢ２４０に登録する。 The registration unit 703 registers the calculated negative-positive frequency of the word in the second storage unit 720 in association with the word. Specifically, for example, the registration unit 703 registers the calculated negative-positive frequency of the word in the negative-positive DB 240 shown in FIG. 6 in association with the word (element). At this time, the registration unit 703 further associates the disease name code, disease name, positive number, positive appearance frequency, negative number, and negative appearance frequency, and registers them in the negative/positive DB 240 .

病名は、単語のネガポジ度数を算出する際に参照されたカルテ情報に対応する病名情報から特定される。ただし、ポジティブ件数、ポジティブ出現頻度、ネガティブ件数およびネガティブ出現頻度については、ネガポジＤＢ２４０に登録しないようにしてもよい。 The disease name is specified from the disease name information corresponding to the medical record information referred to when calculating the negative-positive frequency of the word. However, the positive number, positive appearance frequency, negative number, and negative appearance frequency may not be registered in the negative/positive DB 240 .

第２の受付部７０４は、検索文字列を受け付ける。ここで、検索文字列は、文書を検索する際の検索条件となるキーワード（文字または文字列）を指定するものである。検索文字列には、複数のキーワードが含まれていてもよい。具体的には、例えば、第２の受付部７０４は、図２に示したクライアント装置２０１から検索文字列を受信することにより、受信した検索文字列を受け付ける。 A second reception unit 704 receives a search string. Here, the search character string designates a keyword (character or character string) as a search condition when searching for documents. A search string may include multiple keywords. Specifically, for example, the second accepting unit 704 accepts the received search character string by receiving the search character string from the client device 201 shown in FIG.

より詳細に説明すると、例えば、医師が、ある患者の診療を行うにあたり、後述の図９に示すようなキーワード検索画面９１０において、過去のカルテ記事を検索するために検索文字列を入力する。この場合、クライアント装置２０１は、入力された検索文字列を、情報処理装置１０１に送信する。検索文字列には、診療対象の患者の病名を特定する情報が対応付けられていてもよい。 More specifically, for example, when treating a patient, a doctor enters a search character string on a keyword search screen 910, shown in FIG. 9 and described later, in order to search for past medical record articles. In this case, the client device 201 transmits the input search character string to the information processing apparatus 101. The search character string may be associated with information specifying the disease name of the patient to be treated.

検索部７０５は、文書群から、受け付けた検索文字列に対応する文書を検索する。ここで、文書群は、例えば、第１の記憶部７１０に記憶された文書群である。また、文書群は、インターネット上のウェブサイトから取得可能な文書群であってもよい。 A search unit 705 searches for a document corresponding to the received search character string from the document group. Here, the document group is, for example, a document group stored in the first storage unit 710 . Also, the document group may be a document group that can be obtained from a website on the Internet.

具体的には、例えば、検索部７０５は、カルテＤＢ２３０から、検索文字列に対応するカルテ情報の記事を検索する。より詳細に説明すると、例えば、検索部７０５は、検索文字列に含まれる文字や文字列を含むカルテ情報の記事を検索する。この際、検索部７０５は、ＳＯＡＰ区分が「Ｓ」または「Ａ」のカルテ情報の記事を検索対象としてもよい。 Specifically, for example, the search unit 705 searches the medical record DB 230 for an article of medical record information corresponding to the search character string. More specifically, for example, the search unit 705 searches for articles in medical record information that include characters or character strings included in the search character string. At this time, the search unit 705 may search articles of medical record information with the SOAP classification of "S" or "A".

また、検索部７０５は、現在の病名（診断対象の患者の病名）と同じ病名の病気を治療中に記載されたカルテ情報の記事を検索対象としてもよい。カルテ情報に対応する病名は、例えば、カルテ情報の患者ＩＤと記載日をもとに病名ＤＢ２２０から特定可能である。ただし、検索アルゴリズムは、既存のいかなるものを用いることにしてもよい。 The search unit 705 may also limit the search to articles of medical record information written during treatment of a disease with the same name as the current disease name (the disease name of the patient being diagnosed). The disease name corresponding to medical record information can be identified from the disease name DB 220 based on, for example, the patient ID and entry date of the medical record information. Note that any existing search algorithm may be used.
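Since the text leaves the search algorithm open, the following is only one possible sketch: a plain substring match restricted to SOAP classifications "S" and "A". Splitting the search character string on whitespace and treating a match on any keyword as a hit are assumptions of this example, as are the function name, dictionary keys, and two of the three sample records.

```python
def search_chart_articles(chart_records, search_string, soap_filter=("S", "A")):
    """Return chart records whose article contains any keyword of the search string."""
    keywords = search_string.split()
    hits = []
    for record in chart_records:
        if soap_filter and record["soap"] not in soap_filter:
            continue  # e.g. skip "O" and "P" entries
        if any(keyword in record["article"] for keyword in keywords):
            hits.append(record)
    return hits


# Hypothetical chart records (the first article text is the example from the text).
sample_charts = [
    {"soap": "S", "article": "lower abdominal pain since 1 week ago"},
    {"soap": "O", "article": "abdominal pain on palpation"},
    {"soap": "A", "article": "suspected gastritis"},
]
pain_hits = search_chart_articles(sample_charts, "pain")
two_keyword_hits = search_chart_articles(sample_charts, "pain gastritis")
```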

以下の説明では、カルテＤＢ２３０内のカルテ情報の記事を、単に「カルテ記事」と表記する場合がある。 In the following description, articles of medical chart information in the medical chart DB 230 may simply be referred to as "medical chart articles".

決定部７０６は、検索文字列に対応する１または複数の文書が検索された場合、検索された１または複数の文書の各文書の優先度を決定する。ここで、文書の優先度とは、検索文字列に対する検索結果として文書を出力する際の優先度である。例えば、優先度が高い文書ほど、検索結果として他の文書よりも上位に表示される。 When one or more documents corresponding to the search character string are retrieved, the determination unit 706 determines the priority of each of the retrieved one or more documents. Here, the priority of a document is the priority when outputting a document as a search result for a search character string. For example, a document with a higher priority is displayed higher than other documents in the search results.

具体的には、例えば、決定部７０６は、第２の記憶部７２０を参照して、検索された１または複数の文書の各文書に含まれる単語に対応するネガポジ度数に基づき、各文書のネガポジ度数を算出する。また、決定部７０６は、第２の記憶部７２０を参照して、検索文字列に含まれる単語に対応するネガポジ度数に基づき、検索文字列のネガポジ度数を算出する。そして、決定部７０６は、算出した各文書のネガポジ度数と検索文字列のネガポジ度数とに基づいて、各文書の優先度を決定する。 Specifically, for example, the determination unit 706 refers to the second storage unit 720 and determines the negative-positive value of each of the retrieved one or more documents based on the negative-positive frequency corresponding to the word included in each document. Calculate the frequency. The determining unit 706 also refers to the second storage unit 720 and calculates the negative-positive frequency of the search character string based on the negative-positive frequency corresponding to the word included in the search character string. Then, the determination unit 706 determines the priority of each document based on the calculated negative-positive frequency of each document and the negative-positive frequency of the search character string.

より詳細に説明すると、例えば、決定部７０６は、検索されたカルテ記事を形態素解析することにより、当該カルテ記事を単語（形態素）単位に分解する。つぎに、決定部７０６は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する。 More specifically, for example, the determining unit 706 morphologically analyzes the retrieved medical record article to break it down into words (morphemes). Next, the determining unit 706 extracts words of a specific part of speech from the decomposed words (morphemes).

また、決定部７０６は、診療対象の患者の病名を特定する。診療対象の患者の病名は、例えば、検索文字列に対応付けられた情報から特定されてもよく、また、クライアント装置２０１や電子カルテシステムに問い合わせることにより特定されることにしてもよい。 The determination unit 706 also identifies the disease name of the patient to be treated. The disease name of the patient to be treated may be identified from information associated with the search character string, or may be identified by querying the client device 201 or the electronic medical record system.

つぎに、決定部７０６は、抽出した各単語について、ネガポジＤＢ２４０を参照して、特定した病名に対応するネガポジ度数を特定する。すなわち、決定部７０６は、単語と病名との組み合わせに対応するネガポジ度数を特定する。例えば、単語「痛み」と病名「胃腫瘍」との組み合わせに対応するネガポジ度数は「－０．１４９５２１５３１」である。 Next, the determination unit 706 refers to the negative-positive DB 240 for each extracted word, and specifies the negative-positive frequency corresponding to the specified disease name. That is, the determining unit 706 identifies the negative-positive frequency corresponding to the combination of the word and the disease name. For example, the negative-positive frequency corresponding to the combination of the word "pain" and the disease name "stomach tumor" is "-0.149521531".

そして、決定部７０６は、特定した各単語のネガポジ度数を積算することにより、カルテ記事のネガポジ度数を算出する。この際、決定部７０６は、例えば、各単語のネガポジ度数を積算した値を、カルテ記事から抽出した単語数（あるいは、分解した単語数）で割ることにより正規化することにしてもよい。 Then, the determining unit 706 calculates the negative-positive frequency of the medical record article by accumulating the negative-positive frequency of each specified word. At this time, the determining unit 706 may normalize the sum of the negative-positive frequencies of each word by dividing it by the number of words extracted from the medical record article (or the number of decomposed words).

これにより、検索文字列に対応するカルテ記事が持つポジティブまたはネガティブな意味合いの傾向を示す指標値（ネガポジ度数）を得ることができる。 As a result, it is possible to obtain an index value (negative-positive frequency) that indicates the tendency of the positive or negative connotation of the medical record article corresponding to the search character string.
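The per-article calculation above can be sketched as a lookup keyed by the combination of word and disease name, followed by a sum and an optional normalization. The function name and the sample values other than the one for "pain" are hypothetical, and treating words missing from the DB as contributing 0 is an assumption, since the text does not specify that case.

```python
def text_negpos_frequency(words, disease_name, negpos_db, normalize=True):
    """Sum the per-word values for the given disease; optionally divide by the word count."""
    total = sum(negpos_db.get((word, disease_name), 0.0) for word in words)
    if normalize and words:
        return total / len(words)
    return total


# Hypothetical excerpt of the negative-positive DB 240, keyed by (word, disease name).
sample_db = {
    ("pain", "stomach tumor"): -0.149521531,   # value given in the text
    ("improve", "stomach tumor"): 0.2,         # invented for illustration
}
article_words = ["pain", "improve", "week"]    # "week" is not in the sample DB
article_value = text_negpos_frequency(article_words, "stomach tumor", sample_db)
raw_value = text_negpos_frequency(article_words, "stomach tumor", sample_db, normalize=False)
```

The same function could be applied to the words extracted from a search character string, since the text describes that calculation in the same terms.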

また、決定部７０６は、検索文字列を形態素解析することにより、当該検索文字列を単語（形態素）単位に分解する。つぎに、決定部７０６は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する。そして、決定部７０６は、抽出した各単語について、ネガポジＤＢ２４０を参照して、特定した診療対象の患者の病名に対応するネガポジ度数を特定する。 The determination unit 706 also performs morphological analysis on the search string to decompose the search string into words (morphemes). Next, the determining unit 706 extracts words of a specific part of speech from the decomposed words (morphemes). Then, the determination unit 706 refers to the negative-positive DB 240 for each extracted word, and specifies the negative-positive frequency corresponding to the disease name of the specified patient to be treated.

つぎに、決定部７０６は、特定した各単語のネガポジ度数を積算することにより、検索文字列のネガポジ度数を算出する。この際、決定部７０６は、例えば、各単語のネガポジ度数を積算した値を、検索文字列から抽出した単語数（あるいは、分解した単語数）で割ることにより正規化することにしてもよい。 Next, the determination unit 706 calculates the negative-positive frequency of the search character string by summing the negative-positive frequencies of the identified words. At this time, the determination unit 706 may normalize the summed value by, for example, dividing it by the number of words extracted from the search character string (or the number of words into which it was decomposed).

これにより、検索文字列が持つポジティブまたはネガティブな意味合いの傾向を示す指標値（ネガポジ度数）を得ることができる。 As a result, it is possible to obtain an index value (negative-positive frequency) that indicates the tendency of the search character string to have a positive or negative connotation.

そして、決定部７０６は、算出した各カルテ記事のネガポジ度数と検索文字列のネガポジ度数とに基づいて、各カルテ記事の優先度を決定する。ここで、各カルテ記事のネガポジ度数と検索文字列のネガポジ度数とに基づいて、カルテ記事の優先度を決定する第１および第２の決定例について説明する。 Then, the determination unit 706 determines the priority of each medical chart article based on the calculated negative-positive frequency of each medical chart article and the negative-positive frequency of the search character string. Here, first and second determination examples for determining the priority of medical chart articles based on the negative-positive frequency of each medical chart article and the negative-positive frequency of the search character string will be described.

・第１の決定例 First Determination Example
決定部７０６は、各カルテ記事のネガポジ度数と検索文字列のネガポジ度数とに基づいて、検索された１または複数のカルテ記事のうち、ネガポジ度数が、検索文字列に近いカルテ記事ほど優先度が高くなるように、各カルテ記事の優先度を決定する。これにより、ポジティブまたはネガティブな意味合いの傾向の強さが検索文字列と同程度であるカルテ記事ほど高い優先度とすることができる。 Based on the negative-positive frequency of each medical chart article and that of the search character string, the determination unit 706 determines the priority of each medical chart article so that, among the one or more retrieved medical chart articles, an article whose negative-positive frequency is closer to that of the search character string has a higher priority. As a result, a medical chart article whose positive or negative tendency is closer in strength to that of the search character string can be given a higher priority.

この際、決定部７０６は、算出した検索文字列のネガポジ度数に基づいて、検索文字列をポジティブまたはネガティブに分類することにしてもよい。例えば、検索文字列のネガポジ度数が０以上の場合、決定部７０６は、検索文字列をポジティブに分類する。一方、検索文字列のネガポジ度数が０未満の場合、決定部７０６は、検索文字列をネガティブに分類する。ただし、ネガポジ度数は、値が大きいほど、ポジティブな意味合いの傾向が強いものとする。 At this time, the determining unit 706 may classify the search string as positive or negative based on the calculated negative-positive frequency of the search string. For example, when the negative-positive frequency of the search string is 0 or more, the determining unit 706 classifies the search string as positive. On the other hand, when the negative-positive frequency of the search string is less than 0, the determining unit 706 classifies the search string as negative. However, it is assumed that the larger the value of the negative-positive frequency, the stronger the tendency of the positive connotation.

ここで、検索文字列がポジティブに分類された場合、検索者は、ネガティブな意味合いの検索結果よりも、ポジティブな意味合いの検索結果を必要としているといえる。一方、検索文字列がネガティブに分類された場合、検索者は、ポジティブな意味合いの検索結果よりも、ネガティブな意味合いの検索結果を必要としているといえる。 Here, when the search character string is classified as positive, it can be said that the searcher needs search results with a positive connotation rather than search results with a negative connotation. On the other hand, if the search character string is classified as negative, it can be said that the searcher needs search results with a negative connotation rather than search results with a positive connotation.

このため、検索文字列がポジティブに分類された場合、決定部７０６は、ネガポジ度数が０未満のカルテ記事よりも、ネガポジ度数が０以上のカルテ記事の優先度が高くなるように、各カルテ記事の優先度を決定することにしてもよい。これにより、検索文字列がポジティブに分類された場合には、ネガポジ度数が、０以上であって、検索文字列に近いカルテ記事ほど高い優先度とすることができる。 Therefore, when the search character string is classified as positive, the determination unit 706 assigns each medical chart article a priority such that a medical chart article with a negative-positive frequency of 0 or more has a higher priority than a medical chart article with a negative-positive frequency of less than 0. may be determined. As a result, when the search character string is classified as positive, the chart article whose negative-positive frequency is 0 or more and which is closer to the search character string can be given higher priority.

一方、検索文字列がネガティブに分類された場合、決定部７０６は、ネガポジ度数が０以上のカルテ記事よりも、ネガポジ度数が０未満のカルテ記事の優先度が高くなるように、各カルテ記事の優先度を決定することにしてもよい。これにより、検索文字列がネガティブに分類された場合には、ネガポジ度数が、０未満であって、検索文字列に近いカルテ記事ほど高い優先度とすることができる。 On the other hand, if the search character string is classified as negative, the determining unit 706 assigns a higher priority to medical chart articles with a negative-positive frequency of less than 0 than to medical chart articles with a negative-positive frequency of 0 or more. Priority may be determined. As a result, when the search character string is classified as negative, the chart article whose negative-positive count is less than 0 and which is closer to the search character string can be given higher priority.
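The first determination example, including the optional preference for articles on the same side of 0 as the search character string, can be sketched as a two-part sort key. The function name, article identifiers, and values are hypothetical.

```python
def rank_by_closeness(article_values, query_value):
    """Order article IDs: same sign class as the query first, then by closeness to it."""
    query_is_positive = query_value >= 0   # 0 or more is treated as positive

    def sort_key(item):
        _, value = item
        same_class = (value >= 0) == query_is_positive
        # False sorts before True, so same-class articles come first.
        return (not same_class, abs(value - query_value))

    ordered = sorted(article_values.items(), key=sort_key)
    return [article_id for article_id, _ in ordered]


# Hypothetical negative-positive frequencies of four retrieved chart articles.
sample_articles = {"a": 0.30, "b": 0.05, "c": -0.10, "d": -0.02}
ranking_for_positive_query = rank_by_closeness(sample_articles, 0.10)
ranking_for_negative_query = rank_by_closeness(sample_articles, -0.08)
```

For the positive query, article "d" is numerically closer to the query than "a" but is ranked below it because it falls on the negative side.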

・第２の決定例 Second Determination Example
決定部７０６は、算出した検索文字列のネガポジ度数に基づいて、検索文字列をポジティブまたはネガティブに分類する。ここで、検索文字列がポジティブに分類された場合、決定部７０６は、ネガポジ度数が大きいカルテ記事ほど優先度が高くなるように、各カルテ記事の優先度を決定することにしてもよい。これにより、検索文字列がポジティブに分類された場合には、ポジティブな意味合いの傾向が強いカルテ記事ほど高い優先度とすることができる。 The determination unit 706 classifies the search character string as positive or negative based on its calculated negative-positive frequency. If the search character string is classified as positive, the determination unit 706 may determine the priority of each medical chart article so that an article with a larger negative-positive frequency has a higher priority. As a result, when the search character string is classified as positive, a medical chart article with a stronger positive tendency can be given a higher priority.

一方、ここで、検索文字列がネガティブに分類された場合、決定部７０６は、ネガポジ度数が小さいカルテ記事ほど優先度が高くなるように、各カルテ記事の優先度を決定することにしてもよい。これにより、検索文字列がネガティブに分類された場合には、ネガティブな意味合いの傾向が強いカルテ記事ほど高い優先度とすることができる。 On the other hand, if the search character string is classified as negative here, the determination unit 706 may determine the priority of each medical chart article so that the medical chart article with a smaller negative-positive frequency number has a higher priority. . As a result, when the search character string is classified as negative, the chart article that tends to have a stronger negative connotation can be given a higher priority.
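The second determination example reduces to sorting by the article value, with the direction chosen by the class of the search character string. The sketch below uses hypothetical names and sample values.

```python
def rank_by_strength(article_values, query_value):
    """Order article IDs by value: descending for a positive query, ascending otherwise."""
    query_is_positive = query_value >= 0   # 0 or more is treated as positive
    ordered = sorted(article_values.items(),
                     key=lambda item: item[1],
                     reverse=query_is_positive)
    return [article_id for article_id, _ in ordered]


# Hypothetical negative-positive frequencies of four retrieved chart articles.
strength_sample = {"a": 0.30, "b": 0.05, "c": -0.10, "d": -0.02}
strongest_positive_first = rank_by_strength(strength_sample, 0.10)
strongest_negative_first = rank_by_strength(strength_sample, -0.08)
```

Unlike the first determination example, the magnitude of the query value does not affect the order here; only its sign does.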

なお、各カルテ記事の優先度をどのように決定するかは、任意に設定可能である。例えば、情報処理装置１０１は、クライアント装置２０１からの指示に応じて、上述した第１または第２の決定例のいずれかの方法により、各カルテ記事の優先度を決定することにしてもよい。 It is possible to arbitrarily set how to determine the priority of each medical chart article. For example, the information processing apparatus 101 may determine the priority of each medical record article according to an instruction from the client apparatus 201 by either the first or second determination method described above.

また、決定部７０６は、算出した各カルテ記事のネガポジ度数に基づいて、各カルテ記事の表示態様を決定することにしてもよい。ここで、カルテ記事の表示態様とは、検索結果としてカルテ記事を表示する際の態様であり、カルテ記事の背景色、文字の色、大きさなどによって表される。 Further, the determining unit 706 may determine the display mode of each medical chart article based on the calculated negative-positive frequency of each medical chart article. Here, the display mode of the medical chart article is the mode when the medical chart article is displayed as a search result, and is represented by the background color, character color, size, and the like of the medical chart article.

具体的には、例えば、決定部７０６は、ネガポジ度数が０以上のカルテ記事の表示態様が、ネガポジ度数が０未満のカルテ記事の表示態様と異なるように、各カルテ記事の表示態様を決定する。これにより、ポジティブな意味合いの検索結果と、ネガティブな意味合いの検索結果とを判別可能にすることができる。 Specifically, for example, the determining unit 706 determines the display mode of each medical chart article such that the display mode of a medical chart article with a negative-positive frequency of 0 or more is different from the display mode of a medical chart article with a negative-positive frequency of less than 0. . This makes it possible to distinguish between search results with a positive connotation and search results with a negative connotation.

なお、決定部７０６によって決定された表示態様によるカルテ記事の表示例については、図９を用いて後述する。 A display example of medical record articles in the display mode determined by the determination unit 706 will be described later with reference to FIG. 9 .

出力部７０７は、決定された各文書の優先度に基づいて、検索文字列に対応する検索結果を出力する。具体的には、例えば、出力部７０７は、検索された１または複数の文書を、決定された優先度が高い順に並べた文書リスト（あるいは、文書の名称等のリスト）を検索結果として出力することにしてもよい。 The output unit 707 outputs search results corresponding to the search character string based on the determined priority of each document. Specifically, for example, the output unit 707 may output, as the search result, a document list (or a list of document names, etc.) in which the retrieved one or more documents are arranged in descending order of the determined priority.

この際、出力部７０７は、検索された１または複数の文書のうち、決定された優先度が高い順に上位Ｎ個の文書を検索結果として出力することにしてもよい。Ｎは、任意に設定可能である。また、出力部７０７は、各文書（あるいは、文書の名称等）を検索結果として出力する際に、決定された表示態様で各文書を出力することにしてもよい。 At this time, the output unit 707 may output, as search results, the top N documents among the retrieved one or more documents in descending order of the determined priority. N can be set arbitrarily. The output unit 707 may output each document (or the name of the document, etc.) as a search result in the determined display mode.

また、出力部７０７は、各文書（あるいは、文書の名称等）を出力する際に、各文書と対応付けて、算出された各文書の指標値を出力することにしてもよい。出力部７０７の出力形式としては、例えば、通信Ｉ／Ｆ３０５による他のコンピュータへの送信、メモリ３０２、ディスク３０４などの記憶装置への記憶、不図示のディスプレイへの表示、不図示のプリンタへの印刷出力などがある。 Further, when outputting each document (or the name of the document, etc.), the output unit 707 may output the calculated index value of each document in association with that document. The output formats of the output unit 707 include, for example, transmission to another computer via the communication I/F 305, storage in a storage device such as the memory 302 or the disk 304, display on a display (not shown), and print output to a printer (not shown).
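The output behaviour described above (ordering by priority, an optional top-N cut, a display mode chosen from the sign of the negative-positive frequency, and the index value attached to each result) can be sketched as follows. This is a minimal illustration, not the implementation of the output unit 707; the `Result` class, the `build_result_list` function, and the style labels are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Result:
    doc_id: str      # hypothetical identifier of a retrieved document
    score: float     # negative-positive frequency of the document
    priority: float  # larger value = shown earlier (assumed convention)


def build_result_list(results: List[Result], top_n: Optional[int] = None) -> List[Dict]:
    """Order results by priority, keep the top N, and attach a display mode."""
    ordered = sorted(results, key=lambda r: r.priority, reverse=True)
    if top_n is not None:
        ordered = ordered[:top_n]
    return [
        {
            "doc_id": r.doc_id,
            "score": r.score,  # index value output together with the document
            # Display mode differs for scores of 0 or more versus below 0.
            "style": "positive" if r.score >= 0 else "negative",
        }
        for r in ordered
    ]
```

The style labels stand in for whatever background colour, character colour, or size the display actually uses.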

より詳細に説明すると、例えば、出力部７０７は、検索文字列に対応するカルテ記事をリスト化して示す検索結果画面を、クライアント装置２０１に表示することにしてもよい。検索結果画面の画面例については、図９および図１０を用いて後述する。 More specifically, for example, the output unit 707 may display, on the client device 201, a search result screen showing a list of medical chart articles corresponding to the search character string. Screen examples of the search result screen will be described later with reference to FIGS. 9 and 10.

なお、上述した説明では、検索文字列のネガポジ度数に基づいて、検索文字列をポジティブまたはネガティブに分類することにしたが、これに限らない。例えば、第２の受付部７０４は、検索文字列とともに、ポジティブまたはネガティブのいずれかの選択を受け付けることにしてもよい。 In the above description, the search character string is classified into positive or negative based on the negative-positive frequency of the search character string, but the classification is not limited to this. For example, the second reception unit 704 may receive selection of either positive or negative along with the search character string.

ポジティブまたはネガティブの選択は、例えば、クライアント装置２０１に表示される、後述の図１１に示すようなキーワード検索画面１１００において行われる。この場合、クライアント装置２０１は、入力された検索文字列とともに、ポジティブまたはネガティブの選択結果を、情報処理装置１０１に送信する。 Selection of positive or negative is performed, for example, on a keyword search screen 1100 displayed on the client device 201 and shown in FIG. 11 described later. In this case, the client device 201 transmits the input search character string and the positive or negative selection result to the information processing device 101 .

そして、第２の受付部７０４は、クライアント装置２０１から検索文字列および選択結果を受信することにより、検索文字列およびポジティブまたはネガティブの選択を受け付ける。決定部７０６は、ポジティブの選択を受け付けた場合、検索文字列をポジティブに分類する。一方、ネガティブの選択を受け付けた場合、決定部７０６は、検索文字列をネガティブに分類する。 Then, the second reception unit 704 receives the search character string and the selection result from the client device 201, thereby accepting the search character string and the selection of positive or negative. The determining unit 706 classifies the search character string as positive when positive selection is received. On the other hand, when negative selection is accepted, the determining unit 706 classifies the search character string as negative.
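The two ways of classifying the search character string (from its calculated negative-positive frequency, or from an explicit selection sent by the client) can be sketched as one function. The name `classify_query` and the string labels are hypothetical, and the threshold of 0 is an assumption that mirrors the "0 or more / less than 0" rule used for display modes.

```python
from typing import Optional


def classify_query(score: Optional[float] = None,
                   selection: Optional[str] = None) -> str:
    """Return "positive" or "negative" for a search character string.

    An explicit selection, if given, takes precedence; otherwise the sign of
    the calculated negative-positive frequency is used (assumed threshold: 0).
    """
    if selection is not None:
        if selection not in ("positive", "negative"):
            raise ValueError("selection must be 'positive' or 'negative'")
        return selection
    if score is None:
        raise ValueError("either a score or a selection is required")
    return "positive" if score >= 0 else "negative"
```

With an explicit selection, the score never has to be calculated for the search character string.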

また、上述した説明では、文書検索時に、検索された各文書について、当該各文書のネガポジ度数を算出することにしたが、これに限らない。例えば、情報処理装置１０１は、第１の記憶部７１０に記憶された文書について、当該文書のネガポジ度数を予め算出しておくことにしてもよい。 Further, in the above description, the negative-positive frequency of each document is calculated for each document when searching for documents, but the present invention is not limited to this. For example, the information processing apparatus 101 may calculate in advance the negative-positive frequency of the document stored in the first storage unit 710 .

具体的には、例えば、情報処理装置１０１は、第２の記憶部７２０を参照して、第１の記憶部７１０に記憶された各文書に含まれる単語に対応するネガポジ度数に基づき、各文書のネガポジ度数を算出する。そして、情報処理装置１０１は、算出した各文書のネガポジ度数を、各文書と対応付けて第１の記憶部７１０に登録しておくことにしてもよい。 Specifically, for example, the information processing apparatus 101 refers to the second storage unit 720 and calculates the negative-positive frequency of each document stored in the first storage unit 710 based on the negative-positive frequencies corresponding to the words included in that document. Then, the information processing apparatus 101 may register the calculated negative-positive frequency of each document in the first storage unit 710 in association with that document.

より詳細に説明すると、例えば、情報処理装置１０１は、カルテＤＢ２３０に記憶された各カルテ記事に含まれる単語について、ネガポジＤＢ２４０を参照して、各カルテ記事に対応する病名に対応するネガポジ度数を特定する。つぎに、情報処理装置１０１は、特定した各カルテ記事に含まれる単語のネガポジ度数に基づき、各カルテ記事のネガポジ度数を算出する。そして、情報処理装置１０１は、算出した各カルテ記事のネガポジ度数を、各カルテ記事と対応付けて、メモリ３０２、ディスク３０４などの記憶装置に登録しておくことにしてもよい。 More specifically, for example, for the words included in each medical chart article stored in the medical chart DB 230, the information processing apparatus 101 refers to the negative-positive DB 240 and specifies the negative-positive frequency of each word for the disease name associated with that medical chart article. Next, the information processing apparatus 101 calculates the negative-positive frequency of each medical chart article based on the specified negative-positive frequencies of the words it contains. Then, the information processing apparatus 101 may register the calculated negative-positive frequency of each medical chart article in a storage device such as the memory 302 or the disk 304 in association with that medical chart article.

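これにより、文書検索時に、予め登録した文書（例えば、カルテ記事）のネガポジ度数を読み出すだけでよい、すなわち、各文書のネガポジ度数を算出しなくてもよくなるため、文書検索時の処理負荷を削減して応答性能を高めることができる。 As a result, at document search time it is only necessary to read out the negative-positive frequencies of documents registered in advance (for example, medical chart articles). That is, the negative-positive frequency of each document no longer has to be calculated at search time, so the processing load of a document search can be reduced and response performance improved.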
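The precomputation described above can be sketched as a one-off pass that stores a score per document, so that a search only performs a lookup. The names `document_score` and `precompute_scores` are hypothetical, documents are assumed to be already split into words, the word scores are assumed to be those for one disease name, and treating a word that is missing from the negative-positive DB as 0 is also an assumption.

```python
from typing import Dict, List, Mapping


def document_score(words: List[str], word_scores: Mapping[str, float]) -> float:
    """Sum the negative-positive frequencies of the words in one document."""
    # Assumption: words without a registered score contribute 0.
    return sum(word_scores.get(word, 0.0) for word in words)


def precompute_scores(documents: Mapping[str, List[str]],
                      word_scores: Mapping[str, float]) -> Dict[str, float]:
    """Calculate every document's score once, ahead of any search."""
    return {doc_id: document_score(words, word_scores)
            for doc_id, words in documents.items()}
```

At search time, the stored dictionary would be consulted instead of calling `document_score` for each hit.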

また、情報処理装置１０１は、例えば、研修医などが記載したカルテ記事などを対象として、カルテ記事のネガポジ度数を求めることにしてもよい。これにより、カルテ記事のネガポジ度数から、カルテ記事の内容がポジティブすぎる、あるいは、ネガティブすぎるといった評価を行うことができ、新人教育に役立てることができる。 Further, the information processing apparatus 101 may obtain the negative-positive frequency of medical chart articles, for example, for medical chart articles written by trainees or the like. As a result, it is possible to evaluate whether the content of the medical record article is too positive or too negative based on the negative-positive frequency of the medical record article, which can be used for training new employees.

また、情報処理装置１０１の各機能部は、文書検索システム２００内の情報処理装置１０１とは異なる他のコンピュータ、例えば、クライアント装置２０１で実現することにしてもよい。また、情報処理装置１０１の各機能部は、文書検索システム２００内の複数のコンピュータ（例えば、情報処理装置１０１とクライアント装置２０１）により実現されることにしてもよい。 Also, each functional unit of the information processing apparatus 101 may be realized by another computer different from the information processing apparatus 101 in the document search system 200, for example, the client device 201. Also, each functional unit of the information processing apparatus 101 may be realized by a plurality of computers (for example, the information processing apparatus 101 and the client device 201) in the document search system 200.

（クライアント装置２０１に表示される各種画面の画面例） (Screen examples of various screens displayed on the client device 201)
つぎに、クライアント装置２０１のディスプレイ（不図示）に表示される各種画面の画面例について説明する。ここでは、医師が、患者の診療を行うにあたり、その患者の過去のカルテ記事を参考にして、今後の治療方針を検討する場合を想定する。 Next, screen examples of various screens displayed on the display (not shown) of the client device 201 will be described. Here, it is assumed that a doctor, when treating a patient, considers a future treatment policy by referring to that patient's past medical chart articles.

以下の説明では、操作画面に表示されているボタン等をユーザが選択する操作として、クリック操作を行う場合を例に挙げて説明する。 In the following description, as an operation for the user to select a button or the like displayed on the operation screen, a case of performing a click operation will be described as an example.

図９は、クライアント装置２０１の画面遷移例を示す説明図である。図９において、キーワード検索画面９１０は、検索文字列の入力を受け付ける操作画面である。キーワード検索画面９１０において、不図示の入力装置を用いたユーザ（医師）の操作入力により、ボックス９１１をクリックすると、検索文字列を入力することができる。図９の例では、ボックス９１１に検索文字列「下腹部の痛み」が入力されている。 FIG. 9 is an explanatory diagram showing an example of screen transition of the client device 201. In FIG. 9, a keyword search screen 910 is an operation screen for accepting input of a search character string. On the keyword search screen 910, a user (doctor) can input a search character string by clicking a box 911 using an input device (not shown). In the example of FIG. 9, the search character string "pain in the lower abdomen" is entered in the box 911.

また、キーワード検索画面９１０において、ユーザ（医師）の操作入力により、ボタン９１２をクリックすると、ボックス９１１に入力された検索文字列が情報処理装置１０１に送信される。図９の例では、ボックス９１１に入力された検索文字列「下腹部の痛み」が情報処理装置１０１に送信される。 When a user (doctor) clicks a button 912 on the keyword search screen 910 , the search character string input in the box 911 is transmitted to the information processing apparatus 101 . In the example of FIG. 9 , the search character string “pain in lower abdomen” input in box 911 is transmitted to information processing apparatus 101 .

情報処理装置１０１は、クライアント装置２０１から検索文字列「下腹部の痛み」を受け付けると、カルテＤＢ２３０から検索文字列「下腹部の痛み」に対応するカルテ記事を検索する。つぎに、情報処理装置１０１は、検索した各カルテ記事の優先度を決定する。ここでは、上述した第１の決定例により、各カルテ記事の優先度を決定する場合を想定する。 When the information processing apparatus 101 receives the search character string “lower abdominal pain” from the client device 201, it searches the medical record DB 230 for a medical record article corresponding to the search character string “lower abdominal pain”. Next, the information processing apparatus 101 determines the priority of each retrieved medical record article. Here, it is assumed that the priority of each medical record article is determined according to the first determination example described above.

具体的には、例えば、情報処理装置１０１は、検索文字列「下腹部の痛み」のネガポジ度数を算出する。また、情報処理装置１０１は、検索した各カルテ記事のネガポジ度数を算出する。ここで、検索文字列「下腹部の痛み」のネガポジ度数を算出する場合を例に挙げて、具体的な処理内容について説明する。 Specifically, for example, the information processing apparatus 101 calculates the negative-positive frequency of the search character string “pain in the lower abdomen”. In addition, the information processing apparatus 101 calculates the negative-positive frequency of each searched medical record article. Here, a case of calculating the negative-positive frequency of the search character string "pain in the lower abdomen" will be taken as an example to describe specific processing contents.

より詳細に説明すると、例えば、情報処理装置１０１は、検索文字列「下腹部の痛み」を形態素解析して、単語（形態素）単位に分解する。ここでは、「下腹部」と「の」と「痛み」の３単語に分解される。つぎに、情報処理装置１０１は、ネガポジＤＢ２４０を参照して、分解した各単語に対応するネガポジ度数を特定する。 More specifically, for example, the information processing apparatus 101 morphologically analyzes the search character string "pain in the lower abdomen" (下腹部の痛み) and breaks it down into word (morpheme) units. Here, it is decomposed into three words: "下腹部" (lower abdomen), "の" (the particle "no"), and "痛み" (pain). Next, the information processing apparatus 101 refers to the negative-positive DB 240 to specify the negative-positive frequency corresponding to each decomposed word.

ただし、患者の病名を「胃腫瘍」とする。また、単語「の」は、助詞のため除外する。すなわち、情報処理装置１０１は、ネガポジＤＢ２４０を参照して、分解した３単語のうち、単語「下腹部」と単語「痛み」について、病名「胃腫瘍」に対応するネガポジ度数を特定する。 However, the disease name of the patient is assumed to be "stomach tumor". Also, the word "no" is excluded because it is a particle. That is, the information processing apparatus 101 refers to the negative-positive DB 240 and specifies the negative-positive frequency corresponding to the disease name "stomach tumor" for the word "lower abdomen" and the word "pain" among the three decomposed words.

ここでは、単語「下腹部」のネガポジ度数として、病名「胃腫瘍」に対応する「－０．０１７９４２５８４」が特定された場合を想定する。また、単語「痛み」のネガポジ度数として、病名「胃腫瘍」に対応する「－０．１４９５２１５３１」が特定された場合を想定する。 Here, it is assumed that "-0.017942584" corresponding to the disease name "stomach tumor" is specified as the negative-positive frequency of the word "lower abdomen". It is also assumed that "-0.149521531" corresponding to the disease name "stomach tumor" is specified as the negative-positive frequency of the word "pain".

つぎに、情報処理装置１０１は、特定した各単語のネガポジ度数を積算することにより、検索文字列「下腹部の痛み」のネガポジ度数を算出する。ここでは、検索文字列「下腹部の痛み」のネガポジ度数は「－０．１６７４６４１１５」となる。このため、検索文字列「下腹部の痛み」は、ネガティブに分類される。 Next, the information processing apparatus 101 calculates the negative-positive frequency of the search character string “pain in the lower abdomen” by accumulating the negative-positive frequency of each specified word. Here, the negative-positive frequency of the search character string "pain in the lower abdomen" is "-0.167464115". Therefore, the search character string “pain in the lower abdomen” is classified as negative.
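The calculation above can be reproduced with a short sketch. The morphological analysis itself is not implemented here: the tokens and their part-of-speech labels are supplied by hand, and the function name `query_score`, the label "particle", and the dictionary layout are hypothetical. `Decimal` is used only so that the sum matches the figures in the text exactly.

```python
from decimal import Decimal
from typing import Dict, Iterable, Tuple

# Word scores for the disease name "胃腫瘍" (stomach tumor), taken from the text.
WORD_SCORES: Dict[Tuple[str, str], Decimal] = {
    ("胃腫瘍", "下腹部"): Decimal("-0.017942584"),
    ("胃腫瘍", "痛み"): Decimal("-0.149521531"),
}


def query_score(tokens: Iterable[Tuple[str, str]],
                disease: str,
                word_scores: Dict[Tuple[str, str], Decimal],
                skipped_pos: Tuple[str, ...] = ("particle",)) -> Decimal:
    """Sum the per-disease word scores, skipping excluded parts of speech."""
    total = Decimal("0")
    for surface, pos in tokens:
        if pos in skipped_pos:
            continue  # e.g. the particle "の" is excluded
        # Assumption: a word without a registered score contributes 0.
        total += word_scores.get((disease, surface), Decimal("0"))
    return total


# Hand-written result of the morphological analysis of "下腹部の痛み".
TOKENS = [("下腹部", "noun"), ("の", "particle"), ("痛み", "noun")]
SCORE = query_score(TOKENS, "胃腫瘍", WORD_SCORES)  # Decimal("-0.167464115")
```

Because the result is below 0, the search character string falls on the negative side.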

この場合、情報処理装置１０１は、検索したカルテ記事のうち、ネガポジ度数が０未満であって、ネガポジ度数が検索文字列に近いカルテ記事ほど高い優先度とする。そして、情報処理装置１０１は、決定した各カルテ記事の優先度に従って、検索したカルテ記事を並べた検索結果画面９２０をクライアント装置２０１に表示する。 In this case, the information processing apparatus 101 assigns a higher priority to a medical chart article having a negative-positive frequency of less than 0 and a negative-positive frequency closer to the search character string among the retrieved medical chart articles. Then, the information processing apparatus 101 displays on the client device 201 a search result screen 920 in which the searched medical chart articles are arranged according to the determined priority of each medical chart article.

検索結果画面９２０は、検索文字列「下腹部の痛み」に対応するカルテ記事９２１～９２４をリスト化して示す操作画面である。検索結果画面９２０では、ネガポジ度数が０未満であって、ネガポジ度数が検索文字列「下腹部の痛み」に近いカルテ記事が優先して表示（上位に表示）されている。 The search result screen 920 is an operation screen showing a list of medical record articles 921 to 924 corresponding to the search character string "pain in the lower abdomen". On the search result screen 920, chart articles having a negative-positive frequency of less than 0 and having a negative-positive frequency close to the search character string "lower abdominal pain" are preferentially displayed (displayed at the top).

ここで、カルテ記事９２１のネガポジ度数は、「－０．１６７５５５２２」である。また、カルテ記事９２２のネガポジ度数は、「－０．１６８９９９９９」である。カルテ記事９２３のネガポジ度数は、「－０．１４７８５６３２１」である。カルテ記事９２４のネガポジ度数は、「＋０．１５７８５６５６１」である。 Here, the negative-positive frequency of the medical record article 921 is "-0.16755522". Also, the negative-positive frequency of the medical record article 922 is "-0.16899999". The negative-positive frequency of the medical record article 923 is "-0.147856321". The negative-positive frequency of the medical record article 924 is "+0.157856561".

例えば、カルテ記事９２１は、ネガポジ度数が０未満であって、ネガポジ度数が検索文字列「下腹部の痛み」に最も近いため、最上位に表示されている。また、カルテ記事９２４は、ネガポジ度数が０以上のため、ネガポジ度数が０未満である他のカルテ記事９２１～９２３よりも下位に表示されている。 For example, the medical record article 921 has a negative-positive frequency of less than 0 and is closest to the search character string "lower abdominal pain", so it is displayed at the top. Further, since the medical chart article 924 has a negative-positive frequency of 0 or more, it is displayed below the other medical chart articles 921 to 923 whose negative-positive frequency is less than zero.
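The ordering on the search result screen 920 can be checked with a small sketch of the first determination example. The function names are hypothetical. Articles on the same side of 0 as the search character string come first, ordered by how close their negative-positive frequency is to that of the search character string; ordering the remaining articles by closeness as well is an assumption, since the text only says that they are displayed lower.

```python
from typing import Dict, List


def is_positive(score: float) -> bool:
    return score >= 0


def rank_by_closeness(query_score: float,
                      article_scores: Dict[str, float]) -> List[str]:
    """Return article IDs, highest priority first (first determination example)."""
    def key(item):
        _, score = item
        other_side = is_positive(score) != is_positive(query_score)
        # False sorts before True, so same-side articles come first.
        return (other_side, abs(score - query_score))

    return [article_id for article_id, _ in sorted(article_scores.items(), key=key)]


ARTICLES = {
    "921": -0.16755522,
    "922": -0.16899999,
    "923": -0.147856321,
    "924": 0.157856561,
}
ORDER = rank_by_closeness(-0.167464115, ARTICLES)
```

With the values given in the text, `ORDER` places article 921 first and article 924 last.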

検索結果画面９２０によれば、検索文字列「下腹部の痛み」に応じて、ネガティブの条件を加味した検索結果を提示することができる。これにより、医師は、病名「胃腫瘍」の患者の診療を行うにあたり、その患者の過去のカルテ記事から必要な情報を容易に見つけ出すことができる。 According to the search result screen 920, it is possible to present search results with negative conditions added according to the search character string "pain in the lower abdomen". As a result, the doctor can easily find the necessary information from the patient's past medical records when treating the patient with the disease name "stomach tumor".

また、検索結果画面９２０では、ネガポジ度数が０未満のカルテ記事９２１～９２３と、ネガポジ度数が０以上のカルテ記事９２４とが異なる態様（背景の模様が異なる）で表示されている。これにより、医師は、ポジティブな意味合いのカルテ記事と、ネガティブな意味合いのカルテ記事とを容易に判別することができる。 In addition, on the search result screen 920, the medical chart articles 921 to 923 with a negative-positive frequency of less than 0 and the medical chart article 924 with a negative-positive frequency of 0 or more are displayed in different manners (different background patterns). This allows the doctor to easily discriminate between medical chart articles with positive connotations and medical chart articles with negative connotations.

また、検索結果画面９２０では、カルテ記事９２１～９２４と対応付けて、ネガポジ度数情報９３１～９３４がそれぞれ表示されている。各ネガポジ度数情報９３１～９３４は、各カルテ記事９２１～９２４のネガポジ度数を示す情報である。これにより、医師は、各カルテ記事９２１～９２４のネガポジ度数から、各カルテ記事９２１～９２４のポジティブまたはネガティブの傾向の強さを判断することができる。 Further, on the search result screen 920, negative-positive frequency information 931-934 are displayed in association with medical record articles 921-924, respectively. Each of the negative-positive frequency information 931-934 is information indicating the negative-positive frequency of each medical record article 921-924. This allows the doctor to determine the strength of the positive or negative tendency of each of the medical chart articles 921-924 from the negative-positive frequency of each of the medical chart articles 921-924.

ただし、検索結果画面９２０において、ネガポジ度数情報９３１～９３４は表示されていなくてもよい。 However, the negative-positive frequency information 931 to 934 may not be displayed on the search result screen 920 .

つぎに、図１０を用いて、検索結果画面の他の画面例について説明する。ここでは、上述した第２の決定例により、各カルテ記事の優先度が決定された場合を想定する。この場合、検索文字列「下腹部の痛み」がネガティブに分類されると、ネガティブな意味合いの傾向が強いカルテ記事ほど高い優先度となる。 Next, another screen example of the search result screen will be described with reference to FIG. Here, it is assumed that the priority of each medical record article is determined according to the second determination example described above. In this case, if the search character string “pain in the lower abdomen” is classified as negative, the chart article with a stronger negative connotation tends to have a higher priority.

図１０は、検索結果画面の他の画面例を示す説明図である。図１０において、検索結果画面１０００は、検索文字列「下腹部の痛み」に対応するカルテ記事９２１～９２４をリスト化して示す操作画面である。検索結果画面１０００では、ネガティブな意味合いの傾向が強いカルテ記事が優先して表示（上位に表示）されている。 FIG. 10 is an explanatory diagram showing another screen example of the search result screen. In FIG. 10, a search result screen 1000 is an operation screen showing a list of medical record articles 921 to 924 corresponding to the search character string "lower abdominal pain". On the search result screen 1000, medical chart articles that tend to have a strong negative connotation are preferentially displayed (displayed at the top).

これにより、医師は、上位に表示されたものほど、ネガティブな意味合いの傾向が強いことがわかり、過去のカルテ記事から必要な情報を容易に見つけ出すことができる。 As a result, the doctor can see that the higher an article is displayed, the stronger its negative connotation, and can easily find the necessary information in the past medical chart articles.
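The second determination example reduces to sorting by the negative-positive frequency itself: ascending when the search character string is negative, descending when it is positive. The sketch below applies that rule to the four article scores given earlier for the medical chart articles 921 to 924. The function name is hypothetical, and the resulting order is derived from the stated rule rather than read from FIG. 10.

```python
from typing import Dict, List


def rank_by_strength(query_is_positive: bool,
                     article_scores: Dict[str, float]) -> List[str]:
    """Return article IDs, highest priority first (second determination example)."""
    ordered = sorted(article_scores.items(),
                     key=lambda item: item[1],
                     reverse=query_is_positive)  # negative query: smallest first
    return [article_id for article_id, _ in ordered]


ARTICLES = {
    "921": -0.16755522,
    "922": -0.16899999,
    "923": -0.147856321,
    "924": 0.157856561,
}
NEGATIVE_ORDER = rank_by_strength(False, ARTICLES)
POSITIVE_ORDER = rank_by_strength(True, ARTICLES)
```

Under this rule article 922, which has the smallest value, would be placed above article 921 for a negative search character string.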

つぎに、図１１を用いて、キーワード検索画面の他の画面例について説明する。 Next, another screen example of the keyword search screen will be described with reference to FIG.

図１１は、キーワード検索画面の他の画面例を示す説明図である。図１１において、キーワード検索画面１１００は、検索文字列の入力を受け付ける操作画面である。キーワード検索画面１１００において、ユーザ（医師）の操作入力により、ボックス１１０１をクリックすると、検索文字列を入力することができる。図１１の例では、ボックス１１０１に検索文字列「下腹部の痛み」が入力されている。 FIG. 11 is an explanatory diagram showing another screen example of the keyword search screen. In FIG. 11, a keyword search screen 1100 is an operation screen for accepting input of a search character string. In the keyword search screen 1100, a user (doctor) can input a search character string by clicking a box 1101 through an operation input. In the example of FIG. 11, a search string “pain in lower abdomen” is entered in box 1101 .

また、キーワード検索画面１１００において、ユーザの操作入力により、ボタン１１０２，１１０３のいずれかをクリックすると、ポジティブまたはネガティブのいずれかを選択することができる。図１１の例では、ボタン１１０３がクリックされ、ネガティブが選択されている。 Also, on the keyword search screen 1100, the user can click either button 1102 or 1103 to select either positive or negative. In the example of FIG. 11, button 1103 is clicked to select negative.

また、キーワード検索画面１１００において、ユーザの操作入力により、ボタン１１０４をクリックすると、ボックス１１０１に入力された検索文字列、および、ポジティブまたはネガティブの選択結果が情報処理装置１０１に送信される。図１１の例では、ボックス１１０１に入力された検索文字列「下腹部の痛み」および選択結果「ネガティブ」が情報処理装置１０１に送信される。 When a user clicks a button 1104 on the keyword search screen 1100 , the search character string entered in the box 1101 and the positive or negative selection result are sent to the information processing apparatus 101 . In the example of FIG. 11 , the search character string “lower abdominal pain” input in the box 1101 and the selection result “negative” are transmitted to the information processing apparatus 101 .

これにより、情報処理装置１０１において、検索文字列「下腹部の痛み」をポジティブまたはネガティブに分類するにあたり、検索文字列「下腹部の痛み」のネガポジ度数を算出することなく、選択結果「ネガティブ」から精度よく分類することができる。 As a result, when classifying the search character string "pain in the lower abdomen" as positive or negative, the information processing apparatus 101 can classify it accurately from the selection result "negative" without calculating the negative-positive frequency of the search character string.

（情報処理装置１０１のネガポジ登録処理手順） (Negative-Positive Registration Processing Procedure of Information Processing Apparatus 101)
つぎに、図１２～図１４を用いて、情報処理装置１０１のネガポジ登録処理手順について説明する。 Next, the negative-positive registration processing procedure of the information processing apparatus 101 will be described with reference to FIGS. 12 to 14.

図１２～図１４は、情報処理装置１０１のネガポジ登録処理手順の一例を示すフローチャートである。図１２のフローチャートにおいて、まず、情報処理装置１０１は、病名ＤＢ２２０から選択されていない未選択の病名情報を選択する（ステップＳ１２０１）。つぎに、情報処理装置１０１は、選択した病名情報の主病名区分を参照して、主病名であるか否かを判断する（ステップＳ１２０２）。 FIGS. 12 to 14 are flowcharts showing an example of the negative-positive registration processing procedure of the information processing apparatus 101. In the flowchart of FIG. 12, the information processing apparatus 101 first selects unselected disease name information from the disease name DB 220 (step S1201). Next, the information processing apparatus 101 refers to the main disease name category of the selected disease name information and determines whether or not it is the main disease name (step S1202).

ここで、主病名ではない場合（ステップＳ１２０２：Ｎｏ）、情報処理装置１０１は、ステップＳ１２０８に移行する。一方、主病名の場合には（ステップＳ１２０２：Ｙｅｓ）、情報処理装置１０１は、選択した病名情報の病名開始日が５年以内であるか否かを判断する（ステップＳ１２０３）。 Here, if it is not the main disease name (step S1202: No), the information processing apparatus 101 proceeds to step S1208. On the other hand, if it is the main disease name (step S1202: Yes), the information processing apparatus 101 determines whether the disease name start date of the selected disease name information is within five years (step S1203).

ここで、病名開始日が５年以内ではない場合（ステップＳ１２０３：Ｎｏ）、情報処理装置１０１は、ステップＳ１２０８に移行する。一方、病名開始日が５年以内の場合（ステップＳ１２０３：Ｙｅｓ）、情報処理装置１０１は、選択した病名情報の転帰が「死亡」であるか否かを判断する（ステップＳ１２０４）。 Here, if the disease name start date is not within five years (step S1203: No), the information processing apparatus 101 proceeds to step S1208. On the other hand, if the disease name start date is within five years (step S1203: Yes), the information processing apparatus 101 determines whether the outcome of the selected disease name information is "death" (step S1204).

ここで、転帰が「死亡」の場合（ステップＳ１２０４：Ｙｅｓ）、情報処理装置１０１は、ネガティブ件数算出処理を実行して（ステップＳ１２０５）、ステップＳ１２０８に移行する。ネガティブ件数算出処理の具体的な処理手順については、図１５を用いて後述する。 Here, if the outcome is "death" (step S1204: Yes), the information processing apparatus 101 executes negative number calculation processing (step S1205), and proceeds to step S1208. A specific processing procedure of the negative number calculation processing will be described later with reference to FIG. 15 .

一方、転帰が「死亡」ではない場合（ステップＳ１２０４：Ｎｏ）、情報処理装置１０１は、選択した病名情報の転帰が「寛解または治癒」であるか否かを判断する（ステップＳ１２０６）。ここで、転帰が「寛解または治癒」ではない場合（ステップＳ１２０６：Ｎｏ）、情報処理装置１０１は、ステップＳ１２０８に移行する。 On the other hand, if the outcome is not "death" (step S1204: No), the information processing apparatus 101 determines whether the outcome of the selected disease name information is "remission or cure" (step S1206). If the outcome is not "remission or cure" (step S1206: No), the information processing apparatus 101 proceeds to step S1208.

一方、転帰が「寛解または治癒」の場合（ステップＳ１２０６：Ｙｅｓ）、情報処理装置１０１は、ポジティブ件数算出処理を実行する（ステップＳ１２０７）。ポジティブ件数算出処理の具体的な処理手順については、図１６を用いて後述する。そして、情報処理装置１０１は、病名ＤＢ２２０から選択されていない未選択の病名情報があるか否かを判断する（ステップＳ１２０８）。 On the other hand, if the outcome is "remission or cure" (step S1206: Yes), the information processing apparatus 101 executes positive number calculation processing (step S1207). A specific processing procedure of the number-of-positives calculation processing will be described later with reference to FIG. 16 . Then, the information processing apparatus 101 determines whether there is unselected disease name information from the disease name DB 220 (step S1208).

ここで、未選択の病名情報がある場合（ステップＳ１２０８：Ｙｅｓ）、情報処理装置１０１は、ステップＳ１２０１に戻る。一方、未選択の病名情報がない場合には（ステップＳ１２０８：Ｎｏ）、情報処理装置１０１は、図１３に示すステップＳ１３０１に移行する。 If there is unselected disease name information (step S1208: Yes), the information processing apparatus 101 returns to step S1201. On the other hand, if there is no unselected disease name information (step S1208: No), the information processing apparatus 101 proceeds to step S1301 shown in FIG.
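The branching in steps S1202 to S1207 amounts to labelling each disease name record as a source of negative counts, a source of positive counts, or neither. A minimal sketch follows; the `DiseaseRecord` fields, the outcome strings, and the function names are hypothetical, and measuring "within five years" back from a reference date passed in as `today` is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DiseaseRecord:
    patient_id: str
    disease: str
    is_main: bool            # main disease name category
    start: date              # disease name start date
    outcome: Optional[str]   # e.g. "death", "remission", "cure", or None


def years_before(day: date, years: int) -> date:
    """Return the same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def label_record(rec: DiseaseRecord, today: date, max_years: int = 5) -> Optional[str]:
    """Return "negative", "positive", or None (record skipped)."""
    if not rec.is_main:                               # S1202: No
        return None
    if rec.start < years_before(today, max_years):    # S1203: No
        return None
    if rec.outcome == "death":                        # S1204: Yes -> S1205
        return "negative"
    if rec.outcome in ("remission", "cure"):          # S1206: Yes -> S1207
        return "positive"
    return None                                       # S1206: No
```

Records labelled "negative" would then go through the negative number calculation processing, and records labelled "positive" through the positive number calculation processing.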

図１３のフローチャートにおいて、まず、情報処理装置１０１は、ネガポジ件数テーブル８００を参照して、各単語のネガティブ件数を積算することにより、合計ネガティブ件数を算出する（ステップＳ１３０１）。つぎに、情報処理装置１０１は、ネガポジ件数テーブル８００を参照して、各単語のポジティブ件数を積算することにより、合計ポジティブ件数を算出する（ステップＳ１３０２）。 In the flowchart of FIG. 13, the information processing apparatus 101 first refers to the negative/positive number table 800 and calculates the total number of negatives by accumulating the number of negatives of each word (step S1301). Next, the information processing apparatus 101 refers to the negative/positive number table 800 and calculates the total number of positives by accumulating the number of positives of each word (step S1302).

そして、情報処理装置１０１は、ネガポジ件数テーブル８００から選択されていない未選択の件数情報を選択する（ステップＳ１３０３）。つぎに、情報処理装置１０１は、選択した件数情報のネガティブ件数が未設定であるか否かを判断する（ステップＳ１３０４）。 Then, the information processing apparatus 101 selects unselected number information that has not been selected from the negative/positive number table 800 (step S1303). Next, the information processing apparatus 101 determines whether or not the number of negatives in the selected number of cases information has not been set (step S1304).

ここで、ネガティブ件数が設定済みの場合（ステップＳ１３０４：Ｎｏ）、情報処理装置１０１は、選択した件数情報のネガティブ件数を、算出した合計ネガティブ件数で除算して、ネガティブ出現頻度を算出し（ステップＳ１３０５）、ステップＳ１３０７に移行する。 Here, if the number of negatives has already been set (step S1304: No), the information processing apparatus 101 divides the number of negatives in the selected number of cases information by the calculated total number of negatives to calculate the appearance frequency of negatives (step S1305), and proceeds to step S1307.

一方、ネガティブ件数が未設定の場合（ステップＳ１３０４：Ｙｅｓ）、情報処理装置１０１は、ネガティブ出現頻度を「０」とする（ステップＳ１３０６）。そして、情報処理装置１０１は、選択した件数情報の要素と病名に対応付けて、ネガティブ出現頻度をネガポジＤＢ２４０に記憶して（ステップＳ１３０７）、図１４に示すステップＳ１４０１に移行する。 On the other hand, if the number of negatives is not set (step S1304: Yes), the information processing apparatus 101 sets the frequency of appearance of negatives to "0" (step S1306). Then, the information processing apparatus 101 associates the element of the selected number of cases information with the disease name, stores the negative appearance frequency in the negative/positive DB 240 (step S1307), and proceeds to step S1401 shown in FIG.

図１４のフローチャートにおいて、まず、情報処理装置１０１は、選択した件数情報のポジティブ件数が未設定であるか否かを判断する（ステップＳ１４０１）。ここで、ポジティブ件数が設定済みの場合（ステップＳ１４０１：Ｎｏ）、情報処理装置１０１は、選択した件数情報のポジティブ件数を、算出した合計ポジティブ件数で除算して、ポジティブ出現頻度を算出し（ステップＳ１４０２）、ステップＳ１４０４に移行する。 In the flowchart of FIG. 14, first, the information processing apparatus 101 determines whether or not the positive number of the selected number of cases information has not been set (step S1401). Here, if the number of positives has already been set (step S1401: No), the information processing apparatus 101 divides the number of positives in the selected number of information by the calculated total number of positives to calculate the appearance frequency of positives (step S1402), and proceeds to step S1404.

一方、ポジティブ件数が未設定の場合（ステップＳ１４０１：Ｙｅｓ）、情報処理装置１０１は、ポジティブ出現頻度を「０」とする（ステップＳ１４０３）。そして、情報処理装置１０１は、選択した件数情報の要素と病名に対応付けて、ポジティブ出現頻度をネガポジＤＢ２４０に記憶する（ステップＳ１４０４）。 On the other hand, if the number of positives has not been set (step S1401: YES), the information processing apparatus 101 sets the appearance frequency of positives to "0" (step S1403). Then, the information processing apparatus 101 stores the positive frequency of appearance in the negative-positive DB 240 in association with the element of the selected number of cases information and the disease name (step S1404).

つぎに、情報処理装置１０１は、ポジティブ出現頻度からネガティブ出現頻度を引くことにより、ネガポジ度数を算出する（ステップＳ１４０５）。そして、情報処理装置１０１は、選択した件数情報の要素と病名に対応付けて、算出したネガポジ度数をネガポジＤＢ２４０に記憶する（ステップＳ１４０６）。 Next, the information processing apparatus 101 calculates the negative-positive frequency by subtracting the negative appearance frequency from the positive appearance frequency (step S1405). Then, the information processing apparatus 101 stores the calculated negative-positive frequency in the negative-positive DB 240 in association with the element of the selected number of cases information and the disease name (step S1406).

つぎに、情報処理装置１０１は、ネガポジ件数テーブル８００から選択されていない未選択の件数情報があるか否かを判断する（ステップＳ１４０７）。ここで、未選択の件数情報がある場合（ステップＳ１４０７：Ｙｅｓ）、情報処理装置１０１は、ステップＳ１３０３に戻る。 Next, the information processing apparatus 101 determines whether or not there is unselected number information that has not been selected from the negative/positive number table 800 (step S1407). Here, if there is unselected number information (step S1407: Yes), the information processing apparatus 101 returns to step S1303.

一方、未選択の件数情報がない場合（ステップＳ１４０７：Ｎｏ）、情報処理装置１０１は、本フローチャートによる一連の処理を終了する。なお、情報処理装置１０１は、選択した件数情報の要素と病名に対応付けて、選択した件数情報のネガティブ件数およびポジティブ件数をネガポジＤＢ２４０に記憶することにしてもよい。 On the other hand, if there is no unselected number information (step S1407: No), the information processing apparatus 101 terminates the series of processes according to this flowchart. The information processing apparatus 101 may store the negative number of cases and the positive number of cases of the selected number of cases information in the negative-positive DB 240 in association with the selected number of cases information element and disease name.

これにより、カルテで使用される単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値（ネガポジ度数）を登録することができる。 Thereby, it is possible to register an index value (negative-positive frequency) indicating the tendency of the words used in the medical record to have positive or negative connotations.
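Steps S1301 to S1406 can be summarised as: negative appearance frequency = negative count / total negative count, positive appearance frequency = positive count / total positive count, and negative-positive frequency = positive appearance frequency minus negative appearance frequency, with an unset count treated as a frequency of 0. The sketch below assumes a table for a single disease name, represents an unset count as `None`, and returns 0 when a total is 0; these choices and the function name are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# word -> (negative count, positive count); None means "not set"
CountTable = Dict[str, Tuple[Optional[int], Optional[int]]]


def compute_scores(counts: CountTable) -> Dict[str, float]:
    """Derive each word's negative-positive frequency from its counts."""
    total_neg = sum(neg or 0 for neg, _ in counts.values())   # S1301
    total_pos = sum(pos or 0 for _, pos in counts.values())   # S1302

    scores: Dict[str, float] = {}
    for word, (neg, pos) in counts.items():
        # S1304 to S1306: an unset count gives an appearance frequency of 0.
        neg_freq = 0.0 if neg is None or total_neg == 0 else neg / total_neg
        # S1401 to S1403: the same rule for the positive side.
        pos_freq = 0.0 if pos is None or total_pos == 0 else pos / total_pos
        scores[word] = pos_freq - neg_freq                    # S1405
    return scores
```

For example, a word with 3 of 4 negative occurrences and 1 of 4 positive occurrences gets 0.25 - 0.75 = -0.5.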

つぎに、図１５を用いて、図１２に示したステップＳ１２０５のネガティブ件数算出処理の具体的な処理手順について説明する。 Next, a specific processing procedure of the negative count calculation processing in step S1205 shown in FIG. 12 will be described with reference to FIG.

図１５は、ネガティブ件数算出処理の具体的処理手順の一例を示すフローチャートである。図１５のフローチャートにおいて、まず、ステップＳ１２０１において選択した病名情報の病名開始日から転帰日までの期間Ｔを特定する（ステップＳ１５０１）。つぎに、情報処理装置１０１は、カルテＤＢ２３０から、選択した病名情報の患者ＩＤに対応するカルテ情報を抽出する（ステップＳ１５０２）。 FIG. 15 is a flow chart showing an example of a specific processing procedure of negative number calculation processing. In the flowchart of FIG. 15, first, a period T from the disease name start date to the outcome date of the disease name information selected in step S1201 is specified (step S1501). Next, the information processing apparatus 101 extracts medical record information corresponding to the patient ID of the selected disease name information from the medical record DB 230 (step S1502).

そして、情報処理装置１０１は、抽出したカルテ情報のうち、特定した期間Ｔに記載日が含まれるカルテ情報の中から選択されていない未選択のカルテ情報を選択する（ステップＳ１５０３）。つぎに、情報処理装置１０１は、選択したカルテ情報の記事を形態素解析して、当該記事を単語（形態素）単位に分解する（ステップＳ１５０４）。 Then, the information processing apparatus 101 selects unselected medical chart information from among the extracted medical chart information whose description date is included in the specified period T (step S1503). Next, the information processing apparatus 101 morphologically analyzes the article of the selected medical record information, and decomposes the article into word (morpheme) units (step S1504).

そして、情報処理装置１０１は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する（ステップＳ１５０５）。つぎに、情報処理装置１０１は、抽出した各単語について、選択したカルテ情報の記事における出現回数を算出する（ステップＳ１５０６）。 Then, the information processing apparatus 101 extracts words of a specific part of speech from the decomposed words (morphemes) (step S1505). Next, the information processing apparatus 101 calculates the number of appearances of each extracted word in the article of the selected medical record information (step S1506).

そして、情報処理装置１０１は、算出した各単語の出現回数を、選択した病名情報の病名に対応するネガポジ件数テーブル８００内の各単語のネガティブ件数に加算する（ステップＳ１５０７）。なお、ネガポジ件数テーブル８００内に、病名情報の病名に対応する単語のレコードが存在しない場合は、情報処理装置１０１は、病名情報の病名に対応する単語のレコードを新規作成する。 Then, the information processing apparatus 101 adds the calculated appearance count of each word to the number of negatives of each word in the negative-positive number table 800 corresponding to the disease name of the selected disease name information (step S1507). If there is no record of words corresponding to the disease name in the disease name information in the negative-positive number table 800, the information processing apparatus 101 newly creates a record of words corresponding to the disease name in the disease name information.

つぎに、情報処理装置１０１は、期間Ｔに記載日が含まれるカルテ情報の中から選択されていない未選択のカルテ情報があるか否かを判断する（ステップＳ１５０８）。ここで、未選択のカルテ情報がある場合（ステップＳ１５０８：Ｙｅｓ）、情報処理装置１０１は、ステップＳ１５０３に戻る。 Next, the information processing apparatus 101 determines whether or not there is unselected medical chart information that has not been selected from the medical chart information that includes the entry date in the period T (step S1508). Here, if there is unselected medical chart information (step S1508: Yes), the information processing apparatus 101 returns to step S1503.

一方、未選択のカルテ情報がない場合（ステップＳ１５０８：Ｎｏ）、情報処理装置１０１は、ネガティブ件数算出処理を呼び出したステップに戻る。これにより、ネガティブに分類されたカルテ情報（転帰が「死亡」のカルテ情報）の記事における単語の出現回数をカウントすることができる。 On the other hand, if there is no unselected medical record information (step S1508: No), the information processing apparatus 101 returns to the step that called the negative number calculation process. As a result, it is possible to count the number of occurrences of words in articles of medical record information classified as negative (medical record information with the outcome of "death").
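
The counting loop of steps S1501 to S1508 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `toy_tokenize` function stands in for a real morphological analyzer, the part-of-speech tags in `TARGET_POS`, the nested-dictionary layout of the count table, and the inclusive treatment of the period T boundaries are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date

# Assumption: the source only says "a specific part of speech"; these tags are illustrative.
TARGET_POS = {"noun", "adjective"}


def toy_tokenize(article):
    """Hypothetical stand-in for a morphological analyzer.

    Expects whitespace-separated 'word/pos' tokens and returns (word, pos) pairs.
    """
    return [tuple(token.split("/", 1)) for token in article.split()]


def new_count_table():
    """table[disease][word] -> {'negative': n, 'positive': n}; records are created on first use."""
    return defaultdict(lambda: defaultdict(lambda: {"negative": 0, "positive": 0}))


def add_counts(table, disease, records, start, end, column, tokenize=toy_tokenize):
    """Add word occurrence counts from records dated within [start, end] to `column`."""
    for written_on, article in records:
        if not (start <= written_on <= end):  # skip records outside period T
            continue
        words = [word for word, pos in tokenize(article) if pos in TARGET_POS]
        for word, occurrences in Counter(words).items():
            table[disease][word][column] += occurrences
    return table


# Usage: one record falls inside period T, the other does not.
table = new_count_table()
records = [
    (date(2018, 1, 5), "pain/noun severe/adjective pain/noun"),
    (date(2018, 3, 1), "pain/noun"),  # outside the period, ignored
]
add_counts(table, "influenza", records, date(2018, 1, 1), date(2018, 1, 31), "negative")
```

Passing `"positive"` as the `column` argument gives the corresponding sketch for the positive count calculation, since the two procedures differ only in which count is incremented.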

つぎに、図１６を用いて、図１２に示したステップＳ１２０７のポジティブ件数算出処理の具体的な処理手順について説明する。 Next, the specific processing procedure of the positive count calculation process in step S1207 shown in FIG. 12 will be described with reference to FIG. 16.

図１６は、ポジティブ件数算出処理の具体的処理手順の一例を示すフローチャートである。図１６のフローチャートにおいて、まず、ステップＳ１２０１において選択した病名情報の病名開始日から転帰日までの期間Ｔを特定する（ステップＳ１６０１）。つぎに、情報処理装置１０１は、カルテＤＢ２３０から、選択した病名情報の患者ＩＤに対応するカルテ情報を抽出する（ステップＳ１６０２）。 FIG. 16 is a flow chart showing an example of a specific processing procedure of the number-of-positives calculation process. In the flowchart of FIG. 16, first, a period T from the disease name start date to the outcome date of the disease name information selected in step S1201 is specified (step S1601). Next, the information processing apparatus 101 extracts medical record information corresponding to the patient ID of the selected disease name information from the medical record DB 230 (step S1602).

そして、情報処理装置１０１は、抽出したカルテ情報のうち、特定した期間Ｔに記載日が含まれるカルテ情報の中から選択されていない未選択のカルテ情報を選択する（ステップＳ１６０３）。つぎに、情報処理装置１０１は、選択したカルテ情報の記事を形態素解析して、当該記事を単語（形態素）単位に分解する（ステップＳ１６０４）。 Then, the information processing apparatus 101 selects unselected medical chart information from among the extracted medical chart information whose entry date is included in the specified period T (step S1603). Next, the information processing apparatus 101 morphologically analyzes the article of the selected medical record information, and decomposes the article into word (morpheme) units (step S1604).

そして、情報処理装置１０１は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する（ステップＳ１６０５）。つぎに、情報処理装置１０１は、抽出した各単語について、選択したカルテ情報の記事における出現回数を算出する（ステップＳ１６０６）。 Then, the information processing apparatus 101 extracts words of a specific part of speech from the decomposed words (morphemes) (step S1605). Next, the information processing apparatus 101 calculates the number of appearances of each extracted word in the article of the selected medical record information (step S1606).

そして、情報処理装置１０１は、算出した各単語の出現回数を、選択した病名情報の病名に対応するネガポジ件数テーブル８００内の各単語のポジティブ件数に加算する（ステップＳ１６０７）。 Then, the information processing apparatus 101 adds the calculated number of appearances of each word to the positive number of each word in the negative-positive number table 800 corresponding to the disease name of the selected disease name information (step S1607).

つぎに、情報処理装置１０１は、期間Ｔに記載日が含まれるカルテ情報の中から選択されていない未選択のカルテ情報があるか否かを判断する（ステップＳ１６０８）。ここで、未選択のカルテ情報がある場合（ステップＳ１６０８：Ｙｅｓ）、情報処理装置１０１は、ステップＳ１６０３に戻る。 Next, the information processing apparatus 101 determines whether or not there is unselected medical chart information that has not been selected from among the medical chart information whose description date is included in the period T (step S1608). Here, if there is unselected medical record information (step S1608: Yes), the information processing apparatus 101 returns to step S1603.

一方、未選択のカルテ情報がない場合（ステップＳ１６０８：Ｎｏ）、情報処理装置１０１は、ポジティブ件数算出処理を呼び出したステップに戻る。これにより、ポジティブに分類されたカルテ情報（転帰が「寛解または治癒」のカルテ情報）の記事における単語の出現回数をカウントすることができる。 On the other hand, if there is no unselected medical chart information (step S1608: No), the information processing apparatus 101 returns to the step that called the positive number calculation process. This makes it possible to count the number of occurrences of words in articles of medical record information classified positively (medical record information whose outcome is “remission or cure”).

（情報処理装置１０１の文書検索処理手順） (Document search processing procedure of the information processing apparatus 101)
つぎに、図１７および図１８を用いて、情報処理装置１０１の文書検索処理手順について説明する。 Next, the document search processing procedure of the information processing apparatus 101 will be described with reference to FIGS. 17 and 18.

図１７および図１８は、情報処理装置１０１の文書検索処理手順の一例を示すフローチャートである。図１７のフローチャートにおいて、まず、情報処理装置１０１は、クライアント装置２０１からの検索文字列を受け付けたか否かを判断する（ステップＳ１７０１）。 FIGS. 17 and 18 are flowcharts showing an example of the document search processing procedure of the information processing apparatus 101. In the flowchart of FIG. 17, the information processing apparatus 101 first determines whether or not a search character string has been received from the client device 201 (step S1701).

ここで、情報処理装置１０１は、検索文字列を受け付けるのを待つ（ステップＳ１７０１：Ｎｏ）。そして、情報処理装置１０１は、検索文字列を受け付けた場合（ステップＳ１７０１：Ｙｅｓ）、診療対象の患者の病名を特定する（ステップＳ１７０２）。なお、診療対象の患者の病名を特定する情報は、検索文字列に対応付けられていてもよく、また、クライアント装置２０１や電子カルテシステムに問い合わせることにしてもよい。 Here, the information processing apparatus 101 waits for reception of the search character string (step S1701: No). Then, when the search character string is received (step S1701: Yes), the information processing apparatus 101 identifies the disease name of the patient to be treated (step S1702). The information specifying the disease name of the patient to be treated may be associated with the search character string, or may be inquired of the client device 201 or the electronic medical record system.

つぎに、情報処理装置１０１は、検索文字列を形態素解析して、当該検索文字列を単語（形態素）単位に分解する（ステップＳ１７０３）。そして、情報処理装置１０１は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する（ステップＳ１７０４）。つぎに、情報処理装置１０１は、抽出した各単語について、ネガポジＤＢ２４０を参照して、特定した診療対象の患者の病名に対応するネガポジ度数を特定する（ステップＳ１７０５）。 Next, the information processing apparatus 101 morphologically analyzes the search character string to decompose the search character string into word (morpheme) units (step S1703). Then, the information processing apparatus 101 extracts words of a specific part of speech from the decomposed words (morphemes) (step S1704). Next, the information processing apparatus 101 refers to the negative-positive DB 240 for each extracted word, and specifies the negative-positive frequency corresponding to the disease name of the specified patient to be treated (step S1705).

そして、情報処理装置１０１は、特定した各単語のネガポジ度数を積算することにより、検索文字列のネガポジ度数を算出する（ステップＳ１７０６）。つぎに、情報処理装置１０１は、カルテＤＢ２３０から、検索文字列に対応するカルテ情報の記事を検索して（ステップＳ１７０７）、図１８に示すステップＳ１８０１に移行する。 Then, the information processing apparatus 101 calculates the negative-positive frequency of the search character string by summing the negative-positive frequencies of the specified words (step S1706). Next, the information processing apparatus 101 searches the medical record DB 230 for articles of medical record information corresponding to the search character string (step S1707), and proceeds to step S1801 shown in FIG. 18.
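
As a rough sketch of step S1706, the negative-positive frequency of a search character string can be computed as a plain sum over its extracted words. The dictionary keyed by `(disease, word)`, the sign convention (positive values meaning a positive tendency), and the choice to treat unregistered words as 0 are assumptions for illustration; the source does not state how unregistered words are handled.

```python
def query_score(words, score_db, disease):
    """Sum the per-word scores registered for `disease`.

    `score_db` is a hypothetical in-memory stand-in for the negative-positive DB,
    mapping (disease, word) to a score. Unregistered words contribute 0 (assumption).
    """
    return sum(score_db.get((disease, word), 0.0) for word in words)


# Usage with made-up scores.
score_db = {
    ("influenza", "recovery"): 0.5,
    ("influenza", "fever"): -0.25,
}
result = query_score(["recovery", "fever", "unknown"], score_db, "influenza")
```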

なお、ステップＳ１７０７において、カルテ情報の記事が検索されなかった場合には、情報処理装置１０１は、例えば、検索文字列に対応する記事が検索されなかったことを示す検索結果を出力して、本フローチャートによる一連の処理を終了する。 Note that if no article of medical record information is found in step S1707, the information processing apparatus 101 outputs, for example, a search result indicating that no article corresponding to the search character string was found, and ends the series of processes according to this flowchart.

図１８のフローチャートにおいて、まず、情報処理装置１０１は、検索された１または複数のカルテ情報の記事のうち選択されていない未選択の記事を選択する（ステップＳ１８０１）。つぎに、情報処理装置１０１は、選択した記事を形態素解析して、当該記事を単語（形態素）単位に分解する（ステップＳ１８０２）。 In the flowchart of FIG. 18, first, the information processing apparatus 101 selects an unselected article from among articles of one or a plurality of retrieved medical record information (step S1801). Next, the information processing apparatus 101 morphologically analyzes the selected article and decomposes the article into words (morphemes) (step S1802).

そして、情報処理装置１０１は、分解した単語（形態素）のうち、特定の品詞の単語を抽出する（ステップＳ１８０３）。つぎに、情報処理装置１０１は、抽出した各単語について、ネガポジＤＢ２４０を参照して、特定した診療対象の患者の病名に対応するネガポジ度数を特定する（ステップＳ１８０４）。 Then, the information processing apparatus 101 extracts words of a specific part of speech from the decomposed words (morphemes) (step S1803). Next, the information processing apparatus 101 refers to the negative-positive DB 240 for each extracted word, and specifies the negative-positive frequency corresponding to the disease name of the specified patient to be treated (step S1804).

そして、情報処理装置１０１は、特定した各単語のネガポジ度数を積算した値を、抽出した単語数で割ることにより、選択した記事のネガポジ度数を算出する（ステップＳ１８０５）。つぎに、情報処理装置１０１は、検索された１または複数のカルテ情報の記事のうち選択されていない未選択の記事があるか否かを判断する（ステップＳ１８０６）。 Then, the information processing apparatus 101 calculates the negative-positive frequency of the selected article by dividing the value obtained by integrating the negative-positive frequency of each specified word by the number of extracted words (step S1805). Next, the information processing apparatus 101 determines whether or not there is an unselected article that has not been selected among the retrieved articles of the one or more pieces of medical chart information (step S1806).
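
Step S1805 differs from the query-side calculation in that the sum is divided by the number of extracted words, so that long articles are not favored merely for their length. A minimal sketch follows; it assumes that "the number of extracted words" counts every extracted token (repeats included), that unregistered words score 0, and that an article with no extracted words scores 0.0, none of which the source specifies.

```python
def article_score(words, score_db, disease):
    """Average of the per-word scores for `disease` over all extracted words.

    `score_db` maps (disease, word) to a score (hypothetical layout).
    Returns 0.0 for an article with no extracted words (assumption).
    """
    if not words:
        return 0.0
    total = sum(score_db.get((disease, word), 0.0) for word in words)
    return total / len(words)


# Usage with made-up scores.
score_db = {
    ("influenza", "recovery"): 0.5,
    ("influenza", "fever"): -0.25,
    ("influenza", "stable"): 0.25,
}
result = article_score(["recovery", "fever", "fever", "stable"], score_db, "influenza")
```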

ここで、未選択の記事がある場合（ステップＳ１８０６：Ｙｅｓ）、情報処理装置１０１は、ステップＳ１８０１に戻る。一方、未選択の記事がない場合（ステップＳ１８０６：Ｎｏ）、情報処理装置１０１は、算出した各記事のネガポジ度数と検索文字列のネガポジ度数とに基づいて、各記事の優先度を決定する（ステップＳ１８０７）。 Here, if there is an unselected article (step S1806: Yes), the information processing apparatus 101 returns to step S1801. On the other hand, if there is no unselected article (step S1806: No), the information processing apparatus 101 determines the priority of each article based on the calculated negative-positive frequency of each article and the negative-positive frequency of the search character string (step S1807).

そして、情報処理装置１０１は、検索した１または複数のカルテ情報の記事を、決定した優先度が高い順に並べた検索結果をクライアント装置２０１に出力して（ステップＳ１８０８）、本フローチャートによる一連の処理を終了する。これにより、検索文字列に応じて、ポジティブまたはネガティブの条件を加味した検索結果を出力することができる。 Then, the information processing apparatus 101 outputs to the client device 201 a search result in which the one or more retrieved articles of medical record information are arranged in descending order of the determined priority (step S1808), and ends the series of processes according to this flowchart. As a result, it is possible to output search results that take positive or negative conditions into account according to the search character string.

なお、情報処理装置１０１は、ステップＳ１８０７において、さらに、各記事のネガポジ度数に基づいて、各記事の表示態様を決定することにしてもよい。この場合、情報処理装置１０１は、ステップＳ１８０８において、決定した表示態様で各記事を出力する。 In step S1807, the information processing apparatus 101 may further determine the display mode of each article based on the negative-positive frequency of each article. In this case, the information processing apparatus 101 outputs each article in the determined display mode in step S1808.

以上説明したように、実施の形態にかかる情報処理装置１０１によれば、ポジティブまたはネガティブに分類可能な文書を登録した第１の記憶部７１０を参照して、処理対象となる単語について、ポジティブまたはネガティブに分類可能な文書それぞれにおける単語の出現頻度に基づき、単語のネガポジ度数を算出し、算出したネガポジ度数を単語と対応付けて第２の記憶部７２０に登録することができる。単語のネガポジ度数は、単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値である。 As described above, the information processing apparatus 101 according to the embodiment can refer to the first storage unit 710, in which documents classifiable as positive or negative are registered, calculate the negative-positive frequency of a word to be processed based on the appearance frequency of the word in each of the documents classifiable as positive or negative, and register the calculated negative-positive frequency in the second storage unit 720 in association with the word. The negative-positive frequency of a word is an index value indicating the tendency of the word to have positive or negative connotations.

これにより、単語が持つポジティブまたはネガティブの意味合いの傾向を特定することが可能となり、ひいては、ポジティブまたはネガティブの条件を加味した文書検索を実現することができる。 This makes it possible to identify the tendency of positive or negative connotations of words, and furthermore realizes document retrieval taking into account positive or negative conditions.

また、情報処理装置１０１によれば、検索文字列を受け付け、文書群から検索文字列に対応する文書を検索し、検索文字列に対応する１または複数の文書が検索された場合、第２の記憶部７２０を参照して、検索された各文書に含まれる単語に対応するネガポジ度数に基づき、各文書のネガポジ度数を算出することができる。また、情報処理装置１０１によれば、第２の記憶部７２０を参照して、検索文字列に含まれる単語に対応するネガポジ度数に基づき、検索文字列のネガポジ度数を算出することができる。そして、情報処理装置１０１によれば、算出した各文書のネガポジ度数と検索文字列のネガポジ度数とに基づいて、検索文字列に対応する検索結果を出力することができる。文書群は、例えば、第１の記憶部７１０に記憶された文書群である。 Further, the information processing apparatus 101 can accept a search character string, search a document group for documents corresponding to the search character string, and, when one or more documents corresponding to the search character string are retrieved, refer to the second storage unit 720 and calculate the negative-positive frequency of each document based on the negative-positive frequencies corresponding to the words contained in each retrieved document. The information processing apparatus 101 can also refer to the second storage unit 720 and calculate the negative-positive frequency of the search character string based on the negative-positive frequencies corresponding to the words included in the search character string. The information processing apparatus 101 can then output a search result corresponding to the search character string based on the calculated negative-positive frequency of each document and the negative-positive frequency of the search character string. The document group is, for example, the document group stored in the first storage unit 710.

これにより、単語が持つポジティブまたはネガティブの意味合いの傾向から、その単語を含む文書が持つポジティブまたはネガティブの意味合いの傾向を推定して、ポジティブまたはネガティブの条件を加味した文書検索を行うことができる。 As a result, it is possible to estimate the tendency of positive or negative connotations of documents containing the words from the tendency of positive or negative connotations of words, and perform document retrieval with positive or negative conditions added.

また、情報処理装置１０１によれば、各文書のネガポジ度数と検索文字列のネガポジ度数とに基づいて、検索した１または複数の文書のうち、検索文字列とネガポジ度数が近い文書を優先して検索結果として出力することができる。 Further, based on the negative-positive frequency of each document and the negative-positive frequency of the search character string, the information processing apparatus 101 can preferentially output, as search results, those documents among the one or more retrieved documents whose negative-positive frequency is close to that of the search character string.

これにより、ポジティブまたはネガティブな意味合いの傾向の強さが検索文字列と同程度である文書（例えば、カルテ記事）を優先して検索結果として出力することができる。 As a result, it is possible to preferentially output documents (for example, medical record articles) that tend to have positive or negative connotations at the same level as the search character string as search results.
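
One way to realize "documents whose negative-positive frequency is close to that of the search character string come first" is to sort by the absolute difference between the two values. The sketch below assumes this distance measure and a plain dictionary of article scores; the passage above does not fix a particular priority formula.

```python
def rank_by_closeness(article_scores, query_score):
    """Return article IDs ordered by |article score - query score|, closest first.

    `article_scores` maps a hypothetical article ID to its score.
    """
    return sorted(article_scores, key=lambda a: abs(article_scores[a] - query_score))


# Usage with made-up scores: a3 is closest to the query score, a2 is farthest.
scores = {"a1": 0.75, "a2": -0.5, "a3": 0.375}
ranking = rank_by_closeness(scores, 0.5)
```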

また、情報処理装置１０１によれば、検索文字列のネガポジ度数に基づいて、検索文字列をポジティブまたはネガティブに分類し、検索文字列をポジティブに分類した場合、各文書のネガポジ度数に基づいて、検索した１または複数の文書のうち、ポジティブな意味合いの傾向が強い文書を優先して出力することができる。また、情報処理装置１０１によれば、検索文字列をネガティブに分類した場合、各文書のネガポジ度数に基づいて、検索した１または複数の文書のうち、ネガティブな意味合いの傾向が強い文書を優先して出力することができる。 Further, the information processing apparatus 101 can classify the search character string as positive or negative based on its negative-positive frequency. When the search character string is classified as positive, the apparatus can preferentially output, based on the negative-positive frequency of each document, those documents among the one or more retrieved documents that have a strong positive connotation. When the search character string is classified as negative, the apparatus can preferentially output, based on the negative-positive frequency of each document, those documents among the one or more retrieved documents that have a strong negative connotation.

これにより、検索文字列がポジティブまたはネガティブのいずれに分類されたかに応じて、ポジティブまたはネガティブのいずれの傾向が強い文書を優先して出力するのかを決めることができる。 This makes it possible to determine whether documents with a strong positive or negative tendency are preferentially output according to whether the search character string is classified as positive or negative.
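
The classification-based ordering can be sketched as a sort whose direction depends on the class of the search character string. The threshold of 0 and the convention that higher scores are more positive are assumptions for this example; the passage above does not give the actual classification rule.

```python
def rank_by_polarity(article_scores, query_score, threshold=0.0):
    """Order article IDs by score, most positive first for a positive query,
    most negative first for a negative query.

    Assumption: a query scoring at or above `threshold` is classified as positive.
    """
    query_is_positive = query_score >= threshold
    return sorted(article_scores, key=article_scores.get, reverse=query_is_positive)


# Usage with made-up scores.
scores = {"a1": 0.75, "a2": -0.5, "a3": 0.25}
positive_ranking = rank_by_polarity(scores, 0.5)
negative_ranking = rank_by_polarity(scores, -0.25)
```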

また、情報処理装置１０１によれば、各文書のネガポジ度数に基づいて、各文書の表示態様を決定し、各文書を検索結果として出力する際に、決定した表示態様で各文書を出力することができる。 Further, the information processing apparatus 101 can determine the display mode of each document based on the negative-positive frequency of each document and, when outputting each document as a search result, output it in the determined display mode.

これにより、検索文字列に対応する検索結果を表示する際に、ポジティブな意味合いの文書と、ネガティブな意味合いの文書とを判別可能に表示することができる。 As a result, when displaying search results corresponding to a search character string, it is possible to distinguish between documents with positive connotations and documents with negative connotations.

また、情報処理装置１０１によれば、第１の記憶部７１０（カルテＤＢ２３０）に記憶された文書である患者のカルテ情報の記事を、その患者の転帰に応じてポジティブまたはネガティブに分類することができる。 Further, the information processing apparatus 101 can classify the articles of a patient's medical record information, which are the documents stored in the first storage unit 710 (medical record DB 230), as positive or negative according to the outcome of the patient.

これにより、例えば、患者の転帰が「寛解または治癒」であるか、または、「死亡」であるかに応じて、病気の治療中に記載されたカルテ記事を、ポジティブまたはネガティブに分類することができる。 This makes it possible, for example, to classify medical record articles written during treatment of a disease as positive or negative depending on whether the patient's outcome is "remission or cure" or "death".

また、情報処理装置１０１によれば、第１の記憶部７１０（カルテＤＢ２３０）に記憶された文書（カルテ情報の記事）のうち、各病名に対応する文書それぞれにおける単語の出現頻度に基づき、単語のネガポジ度数を算出することができる。そして、情報処理装置１０１によれば、算出したネガポジ度数を、単語と各病名とに対応付けて第２の記憶部７２０（ネガポジＤＢ２４０）に登録することができる。 Further, the information processing apparatus 101 can calculate the negative-positive frequency of a word based on the appearance frequency of the word in each of the documents corresponding to each disease name among the documents (articles of medical record information) stored in the first storage unit 710 (medical record DB 230). The information processing apparatus 101 can then register the calculated negative-positive frequency in the second storage unit 720 (negative-positive DB 240) in association with the word and each disease name.

これにより、カルテ記事で使用された単語のネガポジ度数を、病名ごとの単語の使われ方を考慮して求めることができる。例えば、ある病気ではネガティブな意味合いで使う単語であっても、他の病気ではポジティブな意味合いで使うことがある。このような病名ごとの使われ方の特性を考慮することで、単語が持つポジティブまたはネガティブの意味合いの傾向を精度よく特定することが可能となる。 As a result, the negative-positive frequency of words used in medical record articles can be obtained by considering how the words are used for each disease name. For example, a word that is used with a negative connotation in one disease may be used with a positive connotation in another disease. By considering the usage characteristics of each disease name, it is possible to accurately identify the tendency of positive or negative connotations of words.

また、情報処理装置１０１によれば、第２の記憶部７２０（ネガポジＤＢ２４０）を参照して、検索された各文書（カルテ情報の記事）に含まれる単語について、診療対象の患者の病名に対応するネガポジ度数を特定し、特定した各文書に含まれる単語のネガポジ度数に基づき、各文書のネガポジ度数を算出することができる。また、情報処理装置１０１によれば、第２の記憶部７２０を参照して、検索文字列に含まれる単語について、診療対象の患者の病名に対応するネガポジ度数を特定し、特定した検索文字列に含まれる単語のネガポジ度数に基づき、検索文字列のネガポジ度数を算出することができる。 Further, the information processing apparatus 101 can refer to the second storage unit 720 (negative-positive DB 240), specify, for the words contained in each retrieved document (article of medical record information), the negative-positive frequency corresponding to the disease name of the patient to be treated, and calculate the negative-positive frequency of each document based on the specified negative-positive frequencies of the words contained in that document. The information processing apparatus 101 can likewise refer to the second storage unit 720, specify, for the words included in the search character string, the negative-positive frequency corresponding to the disease name of the patient to be treated, and calculate the negative-positive frequency of the search character string based on the specified negative-positive frequencies of those words.

これにより、検索文字列やカルテ記事のネガポジ度数を、病名ごとの単語の使われ方を考慮して求めることができ、検索文字列やカルテ記事が持つポジティブまたはネガティブの意味合いの傾向を精度よく特定することが可能となる。 As a result, the negative-positive frequencies of search character strings and medical record articles can be obtained in consideration of how words are used for each disease name, which makes it possible to accurately identify the tendency of positive or negative connotations of search character strings and medical record articles.

これらのことから、実施の形態にかかる情報処理装置１０１および文書検索システム２００によれば、ポジティブまたはネガティブの条件を加味した文書検索を実現することで、ユーザにとって必要な情報をより早く簡単に見つけ出せるようにして、文書検索を行う際の作業効率を向上させることができる。 For these reasons, according to the information processing apparatus 101 and the document search system 200 according to the embodiment, by realizing a document search with positive or negative conditions added, the user can quickly and easily find necessary information. In this way, it is possible to improve work efficiency when searching for documents.

なお、本実施の形態で説明した文書検索方法は、予め用意されたプログラムをパーソナル・コンピュータやワークステーション等のコンピュータで実行することにより実現することができる。本文書検索プログラムは、ハードディスク、フレキシブルディスク、ＣＤ－ＲＯＭ、ＤＶＤ、ＵＳＢメモリ等のコンピュータで読み取り可能な記録媒体に記録され、コンピュータによって記録媒体から読み出されることによって実行される。また、本文書検索プログラムは、インターネット等のネットワークを介して配布してもよい。 The document retrieval method described in this embodiment can be realized by executing a prepared program on a computer such as a personal computer or a workstation. The document search program is recorded in a computer-readable recording medium such as a hard disk, flexible disk, CD-ROM, DVD, USB memory, etc., and executed by being read from the recording medium by a computer. Also, the document search program may be distributed via a network such as the Internet.

また、本実施の形態で説明した情報処理装置１０１は、スタンダードセルやストラクチャードＡＳＩＣ（ＡｐｐｌｉｃａｔｉｏｎＳｐｅｃｉｆｉｃＩｎｔｅｇｒａｔｅｄＣｉｒｃｕｉｔ）などの特定用途向けＩＣやＦＰＧＡなどのＰＬＤ（ＰｒｏｇｒａｍｍａｂｌｅＬｏｇｉｃＤｅｖｉｃｅ）によっても実現することができる。 Further, the information processing apparatus 101 described in the present embodiment can also be realized by application specific ICs such as standard cells and structured ASICs (Application Specific Integrated Circuits), and PLDs (Programmable Logic Devices) such as FPGAs.

上述した実施の形態に関し、さらに以下の付記を開示する。 Further, the following additional remarks are disclosed with respect to the above-described embodiment.

（付記１）ポジティブまたはネガティブに分類可能な文書を登録した第１の記憶部を参照して、処理対象となる単語について、前記ポジティブまたはネガティブに分類可能な文書それぞれにおける前記単語の出現頻度に基づき、前記単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値を算出し、
算出した前記指標値を前記単語と対応付けて第２の記憶部に登録する、
処理をコンピュータに実行させることを特徴とする文書検索プログラム。 (Appendix 1) A document search program that causes a computer to execute a process comprising:
referring to a first storage unit in which documents classifiable as positive or negative are registered and, for a word to be processed, calculating, based on the appearance frequency of the word in each of the documents classifiable as positive or negative, an index value indicating the tendency of the word to have a positive or negative connotation; and
registering the calculated index value in a second storage unit in association with the word.

（付記２）検索文字列を受け付け、
文書群から前記検索文字列に対応する文書を検索し、
前記検索文字列に対応する１または複数の文書が検索された場合、前記第２の記憶部を参照して、前記１または複数の文書の各文書に含まれる単語に対応する指標値に基づき、前記各文書の指標値を算出し、
前記第２の記憶部を参照して、前記検索文字列に含まれる単語に対応する指標値に基づき、前記検索文字列の指標値を算出し、
算出した前記各文書の指標値と前記検索文字列の指標値とに基づいて、前記検索文字列に対応する検索結果を出力する、
処理を前記コンピュータに実行させることを特徴とする付記１に記載の文書検索プログラム。 (Appendix 2) Receiving a search string,
searching for a document corresponding to the search string from the group of documents;
when one or more documents corresponding to the search character string are retrieved, referring to the second storage unit, based on index values corresponding to words contained in each of the one or more documents, calculating an index value for each of the documents;
referring to the second storage unit and calculating the index value of the search character string based on the index value corresponding to the word included in the search character string;
outputting a search result corresponding to the search character string based on the calculated index value of each document and the index value of the search character string;
The document search program according to appendix 1, which causes the computer to execute processing.

（付記３）前記出力する処理は、
前記各文書の指標値と前記検索文字列の指標値とに基づいて、前記１または複数の文書のうち、前記検索文字列と指標値が近い文書を優先して検索結果として出力する、ことを特徴とする付記２に記載の文書検索プログラム。 (Appendix 3) The document search program according to appendix 2, wherein the outputting preferentially outputs, as a search result, a document among the one or more documents whose index value is close to that of the search character string, based on the index value of each document and the index value of the search character string.

（付記４）前記検索文字列の指標値に基づいて、前記検索文字列をポジティブまたはネガティブに分類する、処理を前記コンピュータに実行させ、
前記出力する処理は、
前記検索文字列をポジティブに分類した場合、前記各文書の指標値に基づいて、前記１または複数の文書のうち、ポジティブな意味合いの傾向が強い文書を優先して出力し、
前記検索文字列をネガティブに分類した場合、前記各文書の指標値に基づいて、前記１または複数の文書のうち、ネガティブな意味合いの傾向が強い文書を優先して出力する、
ことを特徴とする付記２に記載の文書検索プログラム。 (Appendix 4) The document search program according to appendix 2, further causing the computer to execute a process of classifying the search character string as positive or negative based on the index value of the search character string, wherein
the outputting,
when the search character string is classified as positive, preferentially outputs a document having a strong positive connotation among the one or more documents based on the index value of each document, and
when the search character string is classified as negative, preferentially outputs a document having a strong negative connotation among the one or more documents based on the index value of each document.

（付記５）前記各文書の指標値に基づいて、前記各文書の表示態様を決定する、処理を前記コンピュータに実行させ、
前記出力する処理は、
前記各文書を検索結果として出力する際に、決定した前記表示態様で前記各文書を出力する、ことを特徴とする付記２～４のいずれか一つに記載の文書検索プログラム。 (Appendix 5) The document search program according to any one of appendices 2 to 4, further causing the computer to execute a process of determining a display mode of each document based on the index value of each document, wherein
the outputting outputs each document in the determined display mode when outputting each document as a search result.

（付記６）前記第１の記憶部に記憶された文書は、患者のカルテ情報であり、前記患者の転帰に応じてポジティブまたはネガティブに分類可能である、ことを特徴とする付記２～５のいずれか一つに記載の文書検索プログラム。 (Appendix 6) The document search program according to any one of appendices 2 to 5, wherein the documents stored in the first storage unit are medical record information of patients and are classifiable as positive or negative according to the outcome of each patient.

（付記７）前記第１の記憶部に記憶された文書は、前記患者の病名に対応付けられており、
前記単語の指標値を算出する処理は、
前記第１の記憶部に記憶された文書のうち、各病名に対応する文書それぞれにおける前記単語の出現頻度に基づき、前記単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値を算出し、
前記登録する処理は、
算出した前記指標値を、前記単語と前記各病名とに対応付けて前記第２の記憶部に登録する、ことを特徴とする付記６に記載の文書検索プログラム。 (Appendix 7) The document search program according to appendix 6, wherein the documents stored in the first storage unit are each associated with the disease name of the patient,
the calculating of the index value of the word calculates, based on the appearance frequency of the word in each of the documents corresponding to each disease name among the documents stored in the first storage unit, an index value indicating the tendency of the word to have a positive or negative connotation, and
the registering registers the calculated index value in the second storage unit in association with the word and each disease name.

（付記８）前記各文書の指標値を算出する処理は、
前記第２の記憶部を参照して、検索された前記各文書に含まれる単語について、診療対象の患者の病名に対応する指標値を特定し、
特定した前記各文書に含まれる単語の指標値に基づき、前記各文書の指標値を算出し、
前記検索文字列の指標値を算出する処理は、
前記第２の記憶部を参照して、前記検索文字列に含まれる単語について、前記診療対象の患者の病名に対応する指標値を特定し、
特定した前記検索文字列に含まれる単語の指標値に基づき、前記検索文字列の指標値を算出する、ことを特徴とする付記７に記載の文書検索プログラム。 (Appendix 8) The process of calculating the index value of each document is
identifying an index value corresponding to a disease name of a patient to be treated for a word contained in each of the retrieved documents by referring to the second storage unit;
calculating an index value of each of the documents based on the index values of the words contained in each of the identified documents;
The process of calculating the index value of the search character string includes:
referring to the second storage unit to specify an index value corresponding to the disease name of the patient to be treated for the word included in the search character string;
The document search program according to appendix 7, wherein the index value of the search character string is calculated based on the index value of the specified word included in the search character string.

（付記９）前記文書群は、前記第１の記憶部に記憶された文書群である、ことを特徴とする付記２～８のいずれか一つに記載の文書検索プログラム。 (Appendix 9) The document search program according to any one of Appendices 2 to 8, wherein the document group is a document group stored in the first storage unit.

（付記１０）前記処理対象となる単語は、前記第１の記憶部に記憶された文書に含まれる単語である、ことを特徴とする付記１～９のいずれか一つに記載の文書検索プログラム。 (Appendix 10) The document search program according to any one of Appendices 1 to 9, wherein the words to be processed are words included in documents stored in the first storage unit. .

（付記１１）ポジティブまたはネガティブに分類可能な文書を登録した第１の記憶部を参照して、処理対象となる単語について、前記ポジティブまたはネガティブに分類可能な文書それぞれにおける前記単語の出現頻度に基づき、前記単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値を算出し、
算出した前記指標値を前記単語と対応付けて第２の記憶部に登録する、
処理をコンピュータが実行することを特徴とする文書検索方法。 (Appendix 11) A document search method in which a computer executes a process comprising:
referring to a first storage unit in which documents classifiable as positive or negative are registered and, for a word to be processed, calculating, based on the appearance frequency of the word in each of the documents classifiable as positive or negative, an index value indicating the tendency of the word to have a positive or negative connotation; and
registering the calculated index value in a second storage unit in association with the word.

（付記１２）ポジティブまたはネガティブに分類可能な文書を登録した第１の記憶部を参照して、処理対象となる単語について、前記ポジティブまたはネガティブに分類可能な文書それぞれにおける前記単語の出現頻度に基づき、前記単語が持つポジティブまたはネガティブな意味合いの傾向を示す指標値を算出する算出部と、
前記算出部によって算出された前記指標値を前記単語と対応付けて第２の記憶部に登録する登録部と、
を含むことを特徴とする文書検索システム。 (Appendix 12) A document search system comprising:
a calculation unit that refers to a first storage unit in which documents classifiable as positive or negative are registered and, for a word to be processed, calculates, based on the appearance frequency of the word in each of the documents classifiable as positive or negative, an index value indicating the tendency of the word to have a positive or negative connotation; and
a registration unit that registers the index value calculated by the calculation unit in a second storage unit in association with the word.

１０１情報処理装置
１１０，７１０第１の記憶部
１２０，７２０第２の記憶部
１１１，１１２文書
２００文書検索システム
２０１クライアント装置
２１０ネットワーク
２２０病名ＤＢ
２３０カルテＤＢ
２４０ネガポジＤＢ
３００バス
３０１ＣＰＵ
３０２メモリ
３０３ディスクドライブ
３０４ディスク
３０５通信Ｉ／Ｆ
３０６可搬型記録媒体Ｉ／Ｆ
３０７可搬型記録媒体
７０１第１の受付部
７０２算出部
７０３登録部
７０４第２の受付部
７０５検索部
７０６決定部
７０７出力部
８００ネガポジ件数テーブル
９１０，１１００キーワード検索画面
９２０，１０００検索結果画面
９２１，９２２，９２３，９２４カルテ記事
９３１，９３２，９３３，９３４ネガポジ度数情報

101 Information processing apparatus
110, 710 First storage unit
120, 720 Second storage unit
111, 112 Document
200 Document search system
201 Client device
210 Network
220 Disease name DB
230 Medical record DB
240 Negative-positive DB
300 Bus
301 CPU
302 Memory
303 Disk drive
304 Disk
305 Communication I/F
306 Portable recording medium I/F
307 Portable recording medium
701 First reception unit
702 Calculation unit
703 Registration unit
704 Second reception unit
705 Search unit
706 Determination unit
707 Output unit
800 Negative-positive number table
910, 1100 Keyword search screen
920, 1000 Search result screen
921, 922, 923, 924 Medical record article
931, 932, 933, 934 Negative-positive frequency information

Claims

1. A document retrieval program characterized by causing a computer to execute a process comprising:
referring to a first storage unit in which positively classified documents and negatively classified documents are registered, and calculating, for a word included in at least one of the positively classified documents and the negatively classified documents, an index value indicating the tendency of the word to have a positive or negative connotation, based on the appearance frequency of the word in each of the positively classified documents and the negatively classified documents;
registering the calculated index value in a second storage unit in association with the word;
accepting a search character string;
searching a group of documents for a document corresponding to the search character string;
when one or more documents corresponding to the search character string are retrieved, referring to the second storage unit and calculating an index value of each of the one or more documents based on the index values corresponding to the words contained in that document;
referring to the second storage unit and calculating an index value of the search character string based on the index values corresponding to the words included in the search character string; and
outputting a search result corresponding to the search character string based on the calculated index value of each document and the calculated index value of the search character string.
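The search-time steps of this claim (accept a search character string, retrieve matching documents, score each document and the search character string from the registered word index values) might be sketched as follows. The claim does not say how word values are combined, so the simple mean, the all-terms match rule, whitespace tokenization, and the names `score_text` and `search` are assumptions.

```python
def score_text(text, word_index):
    """Mean of the registered index values of the words in `text` (assumed rule)."""
    values = [word_index[w] for w in text.split() if w in word_index]
    return sum(values) / len(values) if values else 0.0


def search(query, documents, word_index):
    """Return the query's index value and (document, index value) pairs for hits."""
    terms = query.split()
    hits = [d for d in documents if all(t in d.split() for t in terms)]
    query_score = score_text(query, word_index)
    results = [(d, score_text(d, word_index)) for d in hits]
    return query_score, results


# Hypothetical contents of the second storage unit.
word_index = {"fever": -0.25, "resolved": 0.75, "worsened": -0.75}
documents = ["fever resolved", "fever worsened", "cough resolved"]

query_score, results = search("fever", documents, word_index)
print(query_score)  # -0.25
print(results)      # [('fever resolved', 0.25), ('fever worsened', -0.5)]
```

The output step can then order or filter `results` using both `query_score` and the per-document values.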

2. The document retrieval program according to claim 1, wherein the outputting process preferentially outputs, as a search result, a document among the one or more documents whose index value is close to the index value of the search character string, based on the index value of each document and the index value of the search character string.
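One possible reading of "close" in this claim is the absolute difference between the two index values. The sketch below orders retrieved documents on that basis; the distance measure and the name `rank_by_closeness` are assumptions, since the claim does not define closeness.

```python
def rank_by_closeness(query_score, scored_docs):
    """Order (document, index value) pairs by |document value - query value|."""
    return sorted(scored_docs, key=lambda item: abs(item[1] - query_score))


# Hypothetical scored search hits.
scored_docs = [("record A", 0.25), ("record B", -0.5), ("record C", -0.25)]
ranked = rank_by_closeness(-0.25, scored_docs)
print([name for name, _ in ranked])  # ['record C', 'record B', 'record A']
```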

3. The document retrieval program according to claim 1, further causing the computer to execute a process of classifying the search character string as positive or negative based on the index value of the search character string, wherein the outputting process:
when the search character string is classified as positive, preferentially outputs a document having a strong positive connotation among the one or more documents, based on the index value of each document; and
when the search character string is classified as negative, preferentially outputs a document having a strong negative connotation among the one or more documents, based on the index value of each document.
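The polarity-based output of this claim might look like the sketch below. It assumes index values are signed, with a search character string counted as positive when its value is at or above a threshold of 0.0; that threshold and the names `classify` and `rank_by_polarity` are hypothetical.

```python
def classify(query_score, threshold=0.0):
    """Classify a search character string by its index value (assumed threshold)."""
    return "positive" if query_score >= threshold else "negative"


def rank_by_polarity(query_score, scored_docs):
    """Strongly positive documents first for a positive query, strongly negative first otherwise."""
    descending = classify(query_score) == "positive"
    return sorted(scored_docs, key=lambda item: item[1], reverse=descending)


# Hypothetical scored search hits.
scored_docs = [("record A", 0.25), ("record B", -0.5), ("record C", 0.75)]
print(rank_by_polarity(0.5, scored_docs)[0])   # ('record C', 0.75)
print(rank_by_polarity(-0.5, scored_docs)[0])  # ('record B', -0.5)
```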

4. The document retrieval program according to any one of claims 1 to 3, further causing the computer to execute a process of determining a display mode of each document based on the index value of that document, wherein the outputting process, when outputting each document as a search result, outputs the document in the determined display mode.
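As one illustration of choosing a display mode from a document's index value, the sketch below maps the value to a style label. The three labels, the 0.5 cut-off, and the name `display_mode` are hypothetical; the claim only requires that the mode depend on the index value.

```python
def display_mode(doc_score, strong=0.5):
    """Pick a display style from a document's index value (assumed thresholds)."""
    if doc_score >= strong:
        return "highlight-positive"
    if doc_score <= -strong:
        return "highlight-negative"
    return "plain"


# Hypothetical scored search hits rendered with their display mode.
for name, score in [("record A", 0.75), ("record B", -0.5), ("record C", 0.25)]:
    print(name, display_mode(score))
```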

5. The document retrieval program according to ..., wherein the documents stored in the first storage unit are patient medical chart information, each classified as positive or negative according to the outcome of the patient.

6. The document retrieval program according to claim 5, wherein:
each document stored in the first storage unit is associated with a disease name of a patient;
the process of calculating the index value of the word calculates, for each disease name, an index value indicating the tendency of the word to have a positive or negative connotation, based on the appearance frequency of the word in each document corresponding to that disease name among the documents stored in the first storage unit; and
the registering process registers the calculated index value in the second storage unit in association with the word and the disease name.
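Registering index values per disease name can be sketched by keying the second storage unit on (word, disease name) pairs. The record layout, the (p - n) / (p + n) formula, and the name `build_index_by_disease` are assumptions for illustration.

```python
from collections import defaultdict


def build_index_by_disease(records):
    """records: iterable of (disease_name, label, text), label 'positive' or 'negative'."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for disease, label, text in records:
        for word in set(text.split()):
            counts[(word, disease)][label] += 1
    # Assumed formula: signed share of positive versus negative documents.
    return {
        key: (c["positive"] - c["negative"]) / (c["positive"] + c["negative"])
        for key, c in counts.items()
    }


# Hypothetical medical chart entries with disease name and outcome label.
records = [
    ("pneumonia", "positive", "cough resolved"),
    ("pneumonia", "negative", "cough worsened"),
    ("asthma", "negative", "cough persistent"),
    ("asthma", "negative", "cough wheeze"),
]
index_by_disease = build_index_by_disease(records)
print(index_by_disease[("cough", "pneumonia")])  # 0.0
print(index_by_disease[("cough", "asthma")])     # -1.0
```

The same word can thus carry a different index value depending on the disease name it is registered with.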

7. The document retrieval program according to claim 6, wherein:
the process of calculating the index value of each document refers to the second storage unit to identify, for each word contained in each retrieved document, the index value corresponding to the disease name of a patient to be treated, and calculates the index value of each document based on the identified index values of the words contained in that document; and
the process of calculating the index value of the search character string refers to the second storage unit to identify, for each word included in the search character string, the index value corresponding to the disease name of the patient to be treated, and calculates the index value of the search character string based on the identified index values of the words included in the search character string.
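A disease-specific lookup of this kind might be sketched as below, where the same text receives a different index value depending on the disease name of the patient to be treated. The (word, disease name) key layout, the mean as the combining rule, and the name `score_for_disease` are assumptions.

```python
def score_for_disease(text, disease, index_by_disease):
    """Mean of the index values registered for (word, disease) pairs found in `text`."""
    values = [
        index_by_disease[(w, disease)]
        for w in text.split()
        if (w, disease) in index_by_disease
    ]
    return sum(values) / len(values) if values else 0.0


# Hypothetical second storage unit keyed by (word, disease name).
index_by_disease = {
    ("cough", "pneumonia"): 0.0,
    ("resolved", "pneumonia"): 1.0,
    ("cough", "asthma"): -1.0,
}

print(score_for_disease("cough resolved", "pneumonia", index_by_disease))  # 0.5
print(score_for_disease("cough resolved", "asthma", index_by_disease))     # -1.0
```

The same function can be applied to a retrieved document or to the search character string.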

8. A document retrieval method characterized in that a computer executes a process comprising:
referring to a first storage unit in which positively classified documents and negatively classified documents are registered, and calculating, for a word included in at least one of the positively classified documents and the negatively classified documents, an index value indicating the tendency of the word to have a positive or negative connotation, based on the appearance frequency of the word in each of the positively classified documents and the negatively classified documents;
registering the calculated index value in a second storage unit in association with the word;
accepting a search character string;
searching a group of documents for a document corresponding to the search character string;
when one or more documents corresponding to the search character string are retrieved, referring to the second storage unit and calculating an index value of each of the one or more documents based on the index values corresponding to the words contained in that document;
referring to the second storage unit and calculating an index value of the search character string based on the index values corresponding to the words included in the search character string; and
outputting a search result corresponding to the search character string based on the calculated index value of each document and the calculated index value of the search character string.

9. A document retrieval system comprising:
a calculation unit that refers to a first storage unit in which positively classified documents and negatively classified documents are registered, and calculates, for a word included in at least one of the positively classified documents and the negatively classified documents, an index value indicating the tendency of the word to have a positive or negative connotation, based on the appearance frequency of the word in each of the positively classified documents and the negatively classified documents;
a registration unit that registers the index value calculated by the calculation unit in a second storage unit in association with the word;
a reception unit that receives a search character string;
a search unit that searches a group of documents for a document corresponding to the search character string received by the reception unit;
a calculation unit that, when one or more documents corresponding to the search character string are retrieved by the search unit, refers to the second storage unit and calculates an index value of each of the one or more documents based on the index values corresponding to the words contained in that document, and refers to the second storage unit and calculates an index value of the search character string based on the index values corresponding to the words included in the search character string; and
an output unit that outputs a search result corresponding to the search character string based on the index value of each document and the index value of the search character string calculated by the calculation unit.