JPWO2016151838A1

JPWO2016151838A1 - Prior Research Survey System and Prior Research Survey Method

Info

Publication number: JPWO2016151838A1
Application number: JP2017507278A
Authority: JP
Inventors: 平林　由紀子; 由紀子平林; 芳樹丹羽; 敦牧; 真子石丸
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2015-03-26
Filing date: 2015-03-26
Publication date: 2017-07-20
Also published as: WO2016151838A1

Abstract

A survey target term dictionary (21) stores survey target terms, which represent the information a user wants to know, in association with predetermined research fields and categories. When the user specifies the research field and category to be searched in a research field/category selection unit (14), a processing device 10 acquires the survey target terms associated with that research field and category from the survey target term dictionary (21), searches the text of a document selected from a document DB (31) using the acquired terms, marks the survey target terms extracted by the search, and displays the text of the document on a display device (50).

Description

The present invention relates to a prior research survey system and a prior research survey method suitable for acquiring information that a user wants to know from a document DB (database) in which documents are accumulated.

When starting new research, it is essential to first survey prior research. In that work, the researcher must actually read the many documents obtained by a keyword search of a document DB, find the items of interest, judge their importance, and summarize the results. Doing all of this manually takes a great deal of time and effort.

For example, Patent Document 1 discloses a document search system that visualizes the appearance frequency of feature words in the retrieved documents and the distances between the positions at which the feature words appear, and that can improve the accuracy of the document search by letting the user select a visualized feature word and search again.

Patent Document 1: Japanese Patent No. 4224131

However, the document search method disclosed in Patent Document 1 is basically based on free keyword search. Although it includes a means of re-searching by feature words, the search does not necessarily yield the information the user wants to know accurately or directly. In addition, within a given research field the information users want to know is often largely the same, but the document search system of Patent Document 1 gives no particular consideration to this.

Therefore, to extract the information the user wants to know from the retrieved documents, the user has to set various keywords. Moreover, even for the documents obtained by the search, the user still has to, for example, read at least the sentences near the keywords to obtain the desired information.

In view of the above problems of the prior art, an object of the present invention is to provide a prior research survey system and a prior research survey method capable of efficiently acquiring information that a user wants to know from a document DB of prior research.

The prior research survey system according to the present invention is communicably connected to a document DB in which the texts of documents and their bibliographic data are accumulated, and comprises a storage device holding a survey target term dictionary in which survey target terms, which are the information a user wants to know, are associated with research fields and categories, and a processing device. The processing device includes: a research field/category selection unit that selects, based on information entered by the user via an input device, the research field and category containing the survey target terms to be searched; a tagged text generation unit that refers to the survey target term dictionary to acquire the survey target terms associated with the selected research field and category, searches the text of a document selected from the document DB using the acquired survey target terms, generates tagged text by attaching tags to the survey target terms contained in the text of the document, and stores the generated tagged text in the storage device; and a survey target term mark display unit that displays the tagged text on the display device and displays, on each tagged survey target term, a mark indicating that the tag has been attached.

According to the present invention, information that a user wants to know can be acquired efficiently from a document DB of prior research.

FIG. 1 is a diagram showing an example of the overall configuration of a prior research survey system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the configuration of the survey target term dictionary.
FIG. 3 is a diagram showing an example of the configuration of the search priority table.
FIG. 4 is a diagram showing an example of the search instruction screen displayed on the display device.
FIG. 5 is a diagram for explaining the operation of adding a research field and a category to the survey target term dictionary: (A) shows the part of the search instruction screen needed for the explanation, (B) is an example of a research field addition edit screen, and (C) is an example of a category addition edit screen.
FIG. 6 is a diagram showing an example of the output instruction screen displayed on the display device.
FIG. 7 is a diagram showing an example of the configuration of the survey result information generated and output by the prior research survey system according to the embodiment of the present invention.
FIG. 8 is a diagram showing an outline of the processing flow of the main processing executed in the prior research survey system according to the embodiment of the present invention.
FIG. 9 is a diagram showing an example of the detailed processing flow of the research field/category addition processing in FIG. 8.
FIG. 10 is a diagram showing an example of a method of setting the part of the text of a document to be searched preferentially.
FIG. 11 is a diagram showing another example of a method of setting the part of the text of a document to be searched preferentially.
FIG. 12 is a diagram schematically showing an example of a method of extracting, from the text of a document, the parts to be excluded from the search.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a diagram showing an example of the overall configuration of a prior research survey system 1 according to an embodiment of the present invention. As shown in FIG. 1, the prior research survey system 1 is configured by a so-called computer having a processing device 10, a storage device 20, an input device 40, a display device 50, and a communication device (not shown). The prior research survey system 1 is communicably connected, via the communication device and a communication network 2, to a document DB server 3 that holds a document DB 31 in which documents of prior research are accumulated.

The processing device 10 includes a so-called CPU (Central Processing Unit) and memories such as a ROM (Read Only Memory) and a RAM (Random Access Memory); various functions are realized by the CPU executing predetermined programs stored in the memory. In this embodiment, the processing device 10 has functional blocks such as a keyword search unit 11, a feature word extraction/display unit 12, a document text display unit 13, a research field/category selection unit 14, a research field/category addition unit 15, a tagged text generation unit 16, a survey target term mark display unit 17, a learning unit 18, and a survey result information output unit 19. The functions realized by these functional blocks are described in turn in the following description of the embodiment.

The storage device 20 is composed of a memory such as a RAM, a hard disk device, an SSD (Solid State Disk) device, or the like, and stores the information the processing device 10 needs to execute the predetermined programs. In this embodiment, the storage device 20 holds a technical term dictionary 21, a retrieved document list 22, document text 23, a survey target term dictionary 24, tagged text 25, teacher text 26, a search priority table 27, survey result information 28, and the like.

The input device 40 is composed of a keyboard, a mouse, a touch panel, and the like, and is used by the user to input information to the processing device 10. The display device 50 is composed of an LCD (Liquid Crystal Display) or the like, and is used mainly by the processing device 10 to convey processing results and the like to the user.

In this embodiment, the document DB 31 is provided in the document DB server 3, which is a computer separate from the computer of the prior research survey system 1, but it may instead be built into the computer of the prior research survey system 1. The prior research survey system 1 may also be composed of a plurality of computers.

Next, the configurations of the survey target term dictionary 24 and the search priority table 27, which are components specific to this embodiment, are described.

FIG. 2 is a diagram showing an example of the configuration of the survey target term dictionary 24. As shown in FIG. 2, the survey target term dictionary 24 is a dictionary in which terms representing the information a user wants to know (hereinafter, survey target terms) are classified into categories and registered for each research field; it has the structure of a so-called thesaurus. Specifically, below each research field are the categories of terms commonly used in that field, and below each category are the individual terms.

As in the medical field in the example of FIG. 2, a term may have a further level of subordinate terms below it, and there may be even deeper, multi-level subordinate terms. Similarly, the categories may have a multi-level structure.

Accordingly, in this embodiment, by using the survey target term dictionary 24 and specifying a research field and category, the user can search for all of the terms belonging to that research field and category at once. For example, when investigating the content of a document, if the user specifies "medicine" as the research field and "disease name" as the category, terms such as "malignant tumor", "colorectal cancer", and "breast cancer" are extracted from the text of that document in a single operation.
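The field, category, and term hierarchy described above can be sketched as a nested mapping. This is only an illustrative sketch: the layout, the function names `flatten_terms` and `terms_for`, the placement of the two cancers under "malignant tumor", and the sample terms under "subject" are assumptions, not details taken from FIG. 2.

```python
# Hypothetical layout: research field -> category -> term -> subordinate terms.
TERM_DICTIONARY = {
    "medicine": {
        "disease name": {
            # Assumed nesting; the source only lists these three terms together.
            "malignant tumor": {"colorectal cancer": {}, "breast cancer": {}},
        },
        # Sample terms invented for illustration.
        "subject": {"patient": {}, "healthy adult": {}},
    },
}


def flatten_terms(subtree):
    """Return every term in a category subtree, including subordinate terms."""
    terms = []
    for term, children in subtree.items():
        terms.append(term)
        terms.extend(flatten_terms(children))
    return terms


def terms_for(dictionary, field, category):
    """Collect all terms registered under one research field and category."""
    return flatten_terms(dictionary[field][category])


if __name__ == "__main__":
    print(terms_for(TERM_DICTIONARY, "medicine", "disease name"))
```

Because the recursion walks subordinate terms, a single field and category selection yields the whole term set, however deep the hierarchy is.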

The survey target term dictionary 24 is assumed to be created in advance and stored in the storage device 20. This reflects the fact that, in many research fields, the information users want to know, that is, the survey target terms, is often largely the same. Preparing the survey target term dictionary 24 in advance therefore improves the efficiency of the literature survey in this embodiment.

However, if only predetermined research fields and categories were available, the flexibility and extensibility of the system would be impaired. In this embodiment, therefore, the user can define new research fields and categories and add them to the survey target term dictionary 24 as needed. This is described in detail later with reference to another drawing.

FIG. 3 is a diagram showing an example of the configuration of the search priority table 27. As shown in FIG. 3, the search priority table 27 is a table that sets, for each category of each research field defined in the survey target term dictionary 24 of FIG. 2, the location priority, correlation priority, and proximity priority used in searching, together with the weights of these priorities.

The location priority is information indicating which locations in the text of a document are searched preferentially when searching for terms of the category. Here, a location is one of the parts obtained when the text is divided into a plurality of parts such as chapters. The correlation priority is information indicating the priority of the locations (chapters and the like) subject to the correlation analysis used to evaluate the accuracy of a term hit by the search. The proximity priority is information indicating which other categories' terms the terms of the category tend to appear near. The weight is information indicating which of location priority, correlation priority, proximity priority, and manual learning is weighted when searching for terms.
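One way to picture an entry of the search priority table 27 is as a small record keyed by research field and category. The sketch below is an assumption-laden illustration: the class name `PriorityEntry`, the chapter names, the numeric values, and the convention that a larger number means a higher priority are all hypothetical and are not taken from FIG. 3.

```python
from dataclasses import dataclass, field


@dataclass
class PriorityEntry:
    """Priorities for one (research field, category) pair; larger means higher (assumed)."""
    location: dict = field(default_factory=dict)     # chapter name -> priority
    correlation: dict = field(default_factory=dict)  # chapter name -> priority
    proximity: dict = field(default_factory=dict)    # other category -> priority
    weights: dict = field(default_factory=dict)      # criterion -> weight


# Hypothetical sample values for illustration only.
PRIORITY_TABLE = {
    ("medicine", "disease name"): PriorityEntry(
        location={"methods": 1, "abstract": 3, "introduction": 2},
        correlation={"abstract": 2, "results": 1},
        proximity={"subject": 1},
        weights={"manual_learning": 1, "location": 3, "correlation": 2, "proximity": 1},
    ),
}


def search_order(entry):
    """Return location names from highest to lowest location priority."""
    return sorted(entry.location, key=entry.location.get, reverse=True)


if __name__ == "__main__":
    print(search_order(PRIORITY_TABLE[("medicine", "disease name")]))
```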

The values representing these priorities are assumed to be set in advance for the predetermined research fields and categories. When adding a new research field or category to the survey target term dictionary 24, however, the user can set these values as appropriate via the display screen of FIG. 5(C), described later.

FIG. 4 is a diagram showing an example of the search instruction screen 101 displayed on the display device 50 in the prior research survey system 1. As shown in FIG. 4, the search instruction screen 101 has a pull-down menu 102 for selecting the document DB 31 to be searched. The user selects the document DB 31 to be searched from the list of document DB 31 names displayed in the pull-down menu 102. Next, the user enters the keywords needed to narrow down the documents in a text box 103. Multiple keywords can be entered, and a sentence may also be entered.

Next, when the user clicks the search button 105 on the search instruction screen 101, the processing device 10 (keyword search unit 11) starts a search of the selected document DB 31 using the entered keywords. For this search, the user may specify in advance, via a pull-down menu 104, the technical term dictionary 21 to be used. The user may also set the number of documents to be displayed in a text box 126. A default number may be preset in the text box 126.

When the search is finished, the processing device 10 displays in a document list display area 107 the titles 109 of as many documents as specified in the text box 126. The processing device 10 (feature word extraction/display unit 12) then extracts feature words 111 that appear frequently in the top-ranked or all of the retrieved documents and displays them in a feature word display area 110. The number of feature words 111 displayed in the feature word display area 110 can be specified in advance in a text box 127; alternatively, it may be a predetermined default number.

In the feature word display area 110, each feature word 111 is displayed at a position that reflects the distance of its relationship to the other feature words 111. For example, feature words 111 that appear near each other in the text of the retrieved documents are displayed close together and connected by lines. If the user looks at this arrangement and finds a feature word 111 of interest, the user can select it as a keyword and click a search button 112 to have the processing device 10 run the document search again. The user can also select a feature word 111 of interest and click a related button 113 to redisplay the relationships with the other feature words 111 with the selected word at the center. At that time, the user may change the number in the text box 127 to change the number of feature words displayed.
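The source does not say how feature words are extracted or how the distance between them is computed. As a rough sketch under assumed rules, the code below counts how often each candidate word occurs in the retrieved documents and how often two words share a paragraph; a display could then place frequently co-occurring words closer together. The function names, the candidate list, the blank-line paragraph split, and the plain substring matching are all assumptions.

```python
from collections import Counter
from itertools import combinations


def top_feature_words(documents, candidates, limit):
    """Return up to `limit` candidate words, most frequent first (substring counts)."""
    counts = Counter()
    for doc in documents:
        lowered = doc.lower()
        for word in candidates:
            counts[word] += lowered.count(word)
    return [word for word, count in counts.most_common(limit) if count > 0]


def cooccurrence(documents, words):
    """Count, for each pair of words, the paragraphs in which both appear."""
    pairs = Counter()
    for doc in documents:
        for paragraph in doc.lower().split("\n\n"):
            present = sorted(w for w in words if w in paragraph)
            pairs.update(combinations(present, 2))
    return pairs


if __name__ == "__main__":
    docs = ["gene therapy for cancer\n\ncancer screening", "gene expression in cancer"]
    words = top_feature_words(docs, ["cancer", "gene", "screening", "protein"], 3)
    print(words)
    print(cooccurrence(docs, words))
```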

In the document list display area 107, a check box 108 is displayed for each displayed document title 109. A black square indicates a checked box and a white square an unchecked one. The user can freely select documents by checking the corresponding check boxes 108. When the user selects a document in this way and clicks a text display button 106, the processing device 10 (document text display unit 13) displays the text of the selected document in a text display area 114.

When the text 122 of the selected document is displayed in the text display area 114, the user selects, from the items 116 of a pull-down menu 115, the research field to which the information the user wants to know, that is, the survey target terms, belongs. For example, when the user selects medicine as the research field, the processing device 10 (research field/category selection unit 14) refers to the survey target term dictionary 24, extracts the category names included in medicine, and displays them as the items 118 of a category pull-down menu 117. The items 118 then show category names specific to medicine, such as disease name and subject.

Next, when the user selects one of the items 118, for example disease name, the processing device 10 (survey target term mark display unit 17) extracts from the text 122 the terms belonging to the disease name category (such as colorectal cancer) and attaches a mark 123 to each extracted term, for example by changing its background color.
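The marking step can be illustrated with a small text tagger. This is a minimal sketch, not the patent's implementation: the function name `tag_terms`, the `<mark data-category="...">` markup, the case-sensitive matching, and the longest-term-first rule are assumptions chosen so that a multi-word term such as "colorectal cancer" is not split by a shorter term such as "cancer".

```python
import re


def tag_terms(text, terms, category):
    """Wrap every occurrence of a dictionary term in a <mark> tag (assumed markup)."""
    if not terms:
        return text
    # Try longer terms first so "colorectal cancer" wins over "cancer".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    return pattern.sub(
        lambda m: f'<mark data-category="{category}">{m.group(0)}</mark>', text
    )


if __name__ == "__main__":
    sample = "Breast cancer and colorectal cancer were studied."
    print(tag_terms(sample, ["colorectal cancer", "cancer"], "disease name"))
```

A display layer could then render each tagged span with a different background color, which corresponds to the mark 123 described above.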

As described above, in this embodiment, simply selecting a research field and category makes it possible to quickly find, in the displayed text 122, the survey target terms that represent the information the user wants to know. For example, if the user selects the medical field and the disease name category, the survey target terms corresponding to disease names (such as colorectal cancer) are automatically marked in the text 122 of the displayed document. This makes it easy for the user to judge, for example, which diseases the document mainly deals with.

The user operations for the case where the items 116 of the research field pull-down menu 115 contain no suitable research field, or the items 118 of the category pull-down menu 117 contain no suitable category, are described next with reference to FIG. 5.

FIG. 5 is a diagram for explaining the operation of adding a research field and a category to the survey target term dictionary 24: (A) shows the part of the search instruction screen 101 needed for the explanation, (B) is an example of a research field addition edit screen 101b, and (C) is an example of a category addition edit screen 101c.

To set a research field other than the predetermined ones, the user selects "Add" from the items 116 of the pull-down menu 115 on the search instruction screen 101. The processing device 10 (research field/category addition unit 15) then displays the research field addition edit screen 101b (see FIG. 5(B)) for entering a user-defined research field name.

When the user enters the name of the new research field to be added, "User-defined 1", in a text box 128 and clicks an add button 129, "User-defined 1" is added to the items 116 of the pull-down menu (not shown). At the same time, the new research field name "User-defined 1" is added to the survey target term dictionary 24 (see FIG. 2), which is linked to the items 116 of the pull-down menu 115.

Similarly, when the user has selected, for example, medicine from the items 116 of the pull-down menu 115 and the items 118 of the category pull-down menu 117 contain no category to which the desired information (survey target terms) should belong, the category can be added in the same way. That is, when the user selects "Add" from the items 118 of the pull-down menu 117, the processing device 10 (research field/category addition unit 15) displays the category addition edit screen 101c (see FIG. 5(C)) for entering a user-defined category.

When the user enters the name of the new category, "User-defined 2", in a text box 130 and clicks an add button 131, "User-defined 2" is added to the items 118 of the pull-down menu 117 (not shown). At the same time, the new category name "User-defined 2" is added to the survey target term dictionary 24 (see FIG. 2), which is linked to the items 118 of the pull-down menu 117.
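The link between the pull-down menus and the term dictionary 24 can be sketched as follows. The nested-mapping layout and the helper names `add_field`, `add_category`, and `menu_items` are hypothetical; the sketch only shows that deriving the menu items from the dictionary keeps the two in step once a field or category is added.

```python
def add_field(dictionary, field):
    """Register a new research field if it is not already present."""
    dictionary.setdefault(field, {})


def add_category(dictionary, field, category):
    """Register a new category under a research field, creating the field if needed."""
    add_field(dictionary, field)
    dictionary[field].setdefault(category, {})


def menu_items(dictionary, field=None):
    """Build pull-down items from the dictionary, ending with the "Add" entry."""
    source = dictionary if field is None else dictionary.get(field, {})
    return list(source) + ["Add"]


if __name__ == "__main__":
    term_dictionary = {"medicine": {"disease name": {}}}
    add_field(term_dictionary, "User-defined 1")
    add_category(term_dictionary, "medicine", "User-defined 2")
    print(menu_items(term_dictionary))
    print(menu_items(term_dictionary, "medicine"))
```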

On the category addition edit screen 101c of FIG. 5(C), the user can also set the search location priority values in the search priority table 27 for the added category "User-defined 2". That is, when the user selects a chapter name or the like indicating a search location from the items 133 of a pull-down menu 132 and enters a priority value in a text box 134, the location priority value for the category "User-defined 2" is set in the search priority table 27 (see FIG. 3) (not shown).

Similarly, the user can set the priority values of the locations (chapters and the like) used in the correlation analysis for terms belonging to the added category "User-defined 2". That is, when the user selects a chapter name or the like indicating a location from the items 136 of a pull-down menu 135 and enters a priority value in a text box 137, the correlation priority value for the category "User-defined 2" is set in the search priority table 27 (see FIG. 3) (not shown).

Similarly, the user can set the priority values for the proximity of terms belonging to the added category "User-defined 2" to terms belonging to other categories. That is, when the user selects the category to be analyzed for proximity from the items 139 of a pull-down menu 138 and enters a priority value in a text box 140, the proximity priority value for the category "User-defined 2" is set in the search priority table 27 (see FIG. 3) (not shown).

Correlation analysis and proximity analysis are performed to evaluate the accuracy of terms, for example when a search of the text 122 extracts a plurality of terms (survey target terms, that is, the information the user wants to know). For example, by analyzing the correlation between the extracted terms and the locations where they appear and choosing the terms that correlate strongly with the specified locations, more appropriate terms can be selected. Proximity here means, for example, appearing in the same paragraph; the range of proximity may, however, be made somewhat wider or narrower.
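The same-paragraph notion of proximity can be sketched as a simple filter. This is an illustrative assumption, not the analysis defined in the source: the function name `terms_near`, the blank-line paragraph split, and the substring matching are all hypothetical choices.

```python
def terms_near(text, candidate_terms, anchor_terms):
    """Return candidate terms that share a paragraph with at least one anchor term."""
    hits = set()
    for paragraph in text.split("\n\n"):
        if any(anchor in paragraph for anchor in anchor_terms):
            hits.update(term for term in candidate_terms if term in paragraph)
    return sorted(hits)


if __name__ == "__main__":
    text = (
        "Patients with colorectal cancer were enrolled.\n\n"
        "Breast cancer is discussed elsewhere."
    )
    print(terms_near(text, ["colorectal cancer", "Breast cancer"], ["Patients"]))
```

Widening or narrowing the proximity range, as the text allows, would amount to changing the unit the text is split into, for example sentences or whole chapters instead of paragraphs.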

The parameters that determine the search priority are not limited to the search location, correlation, and proximity described above; other items may also be made configurable.

The weighting that determines which of search location, correlation, proximity, and manual learning (described later) takes precedence when extracting terms can also be set on the category addition edit screen 101c of FIG. 5(C). In that case, the user enters in text boxes 142 the weighting values representing the priority of each of the items 141: manual learning, search location, correlation, and proximity.
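The source does not state how the four weights are combined. One plausible reading, shown here purely as an assumption, is a weighted average of per-criterion scores for each candidate term. The function name `combined_score`, the criterion keys, and the 0-to-1 score range are hypothetical.

```python
def combined_score(scores, weights):
    """Weighted average of per-criterion scores; criteria missing from scores count as 0."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * scores.get(name, 0.0) for name in weights)
    return weighted / total_weight


if __name__ == "__main__":
    # Hypothetical scores for one candidate term, each in the range 0..1.
    scores = {"manual_learning": 1.0, "location": 0.5, "correlation": 0.0, "proximity": 1.0}
    # Hypothetical weights as they might be entered by the user.
    weights = {"manual_learning": 2, "location": 1, "correlation": 1, "proximity": 0}
    print(combined_score(scores, weights))
```

With this reading, setting a weight to zero removes that criterion from the decision, and raising the manual learning weight lets user-marked examples dominate.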

The priority data set on the category addition edit screen 101c in this way is added to the search priority table 27 of FIG. 3 when the user clicks the add button 131.

Next, manual learning for extracting terms of a category added by the user is described. The user first selects the added category from the items 118 of the pull-down menu 117 and, using a mouse or the like, marks the terms belonging to that category in the text 122 in the text display area 114. When the user then presses the save button 119, the marked terms are added to the search target term dictionary 24 as terms belonging to that category. At this time, the processing device 10 (learning unit 18) analyzes the surrounding context, the position in the text, the part of speech, and so on, and learns the features of the added category, such as the chapters and locations in which its terms appear.

When learning on the text 122 displayed in the text display area 114 is complete, the processing device 10 (learning unit 18) prompts the user to select another document and displays the text 122 of the selected document in the text display area 114. When the user manually marks the terms in that text 122 that should belong to the category and presses the save button 119, the processing device 10 (learning unit 18) adds the marked terms to the search target term dictionary 24 as terms belonging to the category.

After the above user operations and the learning by the processing device 10 have been repeated several times, the user calls up the text 122 of a new document in the text display area 114, selects the added category from the items 118 of the pull-down menu 117, and clicks the apply button 120. The processing device 10 then marks the terms belonging to that category in the displayed text 122, using the search priority table 27 and the category features obtained through the learning. If appropriate terms have not been marked, the user can have the processing device 10 learn further by marking the appropriate terms and clicking the save button 119. The category addition edit screen 101c of FIG. 1(c) may then be displayed so that priorities such as the weighting can be reset based on the results of the learning.

Also, for example, when the disease name category is selected and the displayed text 122 contains a disease name that was not marked automatically, the user can mark that term in the text 122 manually. If the user then presses the save button 119, the manually marked disease term is added to the disease name category of the search target term dictionary 24. The search priority of the disease name category is then applied to this term.

Once the categories have been selected, clicking the apply-all button 121 applies the marking for the categories selected so far to all of the documents listed in the document list display area 107.

If the search target term (the information the user wants to know) is not a term such as a disease name but a sentence stating a conclusion, sentences that contain a term related to statistical processing, such as "significant difference" or "p value", together with a number in the same sentence may be extracted. In that case, the accuracy may be evaluated by the presence or absence of context that affirms the conclusion, such as "it was revealed" or "it was confirmed". In addition, a figure whose number appears near a sentence extracted as a conclusion is likely to represent that conclusion, so the figure and its caption may also be made extractable.
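
The conclusion-sentence heuristic above could look roughly like the following. This is a minimal sketch under stated assumptions: the English cue lists `STAT_TERMS` and `AFFIRMING`, the sentence splitter, and the choice to look for figure numbers only inside the matched sentence (rather than in its wider vicinity) are all illustrative simplifications.

```python
import re

# Illustrative cue lists; a real system would use its own vocabulary.
STAT_TERMS = ("significant difference", "p value", "p-value", "p <", "p=")
AFFIRMING = ("was confirmed", "were confirmed", "revealed", "demonstrated")


def find_conclusion_sentences(text):
    """Return sentences containing a statistical term and a number."""
    # Split after sentence-ending punctuation, but not after the abbreviation "Fig."
    sentences = re.split(r"(?<=[.!?])(?<!Fig\.)\s+", text.strip())
    results = []
    for sentence in sentences:
        lower = sentence.lower()
        has_stat_term = any(term in lower for term in STAT_TERMS)
        has_number = re.search(r"\d", sentence) is not None
        if has_stat_term and has_number:
            results.append({
                "sentence": sentence,
                # Affirming context is used as a rough confidence signal.
                "affirmed": any(cue in lower for cue in AFFIRMING),
                "figures": re.findall(r"Fig(?:ure|\.)\s*(\d+)", sentence),
            })
    return results


sample = ("We measured 20 subjects. A significant difference was confirmed "
          "(p < 0.05, Fig. 2). Further work is needed.")
found = find_conclusion_sentences(sample)
```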

The text of each document in the document DB 31 is stored in a markup language format such as XML, and the journal name, author names, publication year, abstract, and so on are marked with tags. The processing device 10 can therefore easily extract the tagged information from the document text and store it in a table or the like to create a list of bibliographic information. In addition to this bibliographic information, the information the user wants to know, that is, the information designated by category (search target terms) such as the target disease, the number of subjects, and the measurement method, is also tagged so that it can be extracted. Specifically, when a category is selected from the items 118 of the pull-down menu 117, the terms belonging to that category are marked with tags, and the marks 123, 124, and 125 are displayed on the corresponding terms in the text 122 in the text display area 114.
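
Extraction of tagged information from an XML document can be sketched with Python's standard `xml.etree.ElementTree` module. The element names (`article`, `journal`, `author`, `year`, `abstract`, `term`) and the `category` attribute are hypothetical; the text does not define the actual schema of the document DB 31.

```python
import xml.etree.ElementTree as ET

# Hypothetical document layout, for illustration only.
SAMPLE = """<article>
  <journal>Example Journal</journal>
  <author>A. Author</author>
  <year>2014</year>
  <abstract>Short summary.</abstract>
  <body>Twenty <term category="subjects">healthy adults</term> were
  examined for <term category="disease">asthma</term>.</body>
</article>"""


def extract_record(xml_text, fields=("journal", "author", "year", "abstract")):
    """Collect bibliographic fields and category-tagged terms from one document."""
    root = ET.fromstring(xml_text)
    record = {field: (root.findtext(field) or "") for field in fields}
    terms = {}
    for element in root.iter("term"):
        terms.setdefault(element.get("category"), []).append(element.text)
    record["terms"] = terms
    return record


record = extract_record(SAMPLE)
```

Each `record` then corresponds to one row of a bibliographic list, with the category-tagged terms available for the survey result columns.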

As described above, in the present embodiment the user can freely add research items and categories via the search instruction screen 101 and can set the search target terms associated with those research items and categories. Furthermore, via the category addition edit screen 101c, the user can freely set, for each research item and category, the priorities that determine the order of precedence of search locations and the like. The prior research survey system 1 according to the present embodiment is therefore extensible and flexible to a certain degree.

FIG. 6 shows an example of the output instruction screen 201 displayed on the display device 50 in the prior research survey system 1. As shown in FIG. 6, the output instruction screen 201 has a document list display area 206, an output item selection area 211, and an output file format selection area 222. The user can switch freely between the search instruction screen 101 (see FIG. 4) and the output instruction screen 201 by clicking the tab of the respective screen.

The document list display area 206 displays the document titles 208 of the documents retrieved by the search instruction on the search instruction screen 101, together with a check box 207 for each document title 208.

When the user clicks the select-all button 202, all of the check boxes 207 are checked, and all of the documents become targets for output of the search target terms (the information the user wants to know). When the user clicks the top button 203 and enters a number of documents in the text box 204, the check boxes 207 of the document titles 208 of that many top-ranked documents are checked. When the user clicks the individual selection button 205, the check boxes 207 can be checked freely, so the user can select document titles 208 by clicking the check boxes 207 as desired. After selecting all documents or the top-ranked documents, the user may also use the individual selection button 205 to check or uncheck the check boxes 207 of individual documents.

The output item selection area 211 displays the items of information the user wants to know that are to be output, such as bibliographic information. For example, the bibliographic information items 217 are displayed, and the user checks the check box 216 corresponding to each item to be output. If the user clicks the select-all button 209, all of the check boxes 216 are checked and all of the bibliographic information items 217 are output. If the user clicks the individual selection button 210 and then clicks individual check boxes 216, the bibliographic information items 217 corresponding to the clicked check boxes 216 are output.

Next, from the items 213 of the pull-down menu 212, the user selects the research field previously selected from the items 116 of the pull-down menu 115 on the search instruction screen 101 (see FIG. 5). The categories that were selected in the category pull-down menu 117 of the search instruction screen 101 and to which the terms marked in the text 122 belong (for example, the marks 123, 124, and 125; see FIG. 5) are then displayed in the output item selection area 211 as category items 219. The user checks the check boxes 218 corresponding to the category items 219 to which the search target terms (the information to be known) to be output belong.

A user-defined research field is selected from the items 213 of the same pull-down menu 212 as the predefined research fields. Alternatively, user-defined research fields may be separated into another pull-down menu 214, and an item 215 for a user-defined research field may be selected from that pull-down menu 214. In that case, the user checks the check box 220 corresponding to a user-defined category item 221 to select the category item 221 to which the search target terms to be output belong.

Next, the user selects the file format of the survey result information 28 to be output by clicking the check box 223 corresponding to a file format name 224 displayed in the output file format selection area 222. The user also specifies the destination path in the save destination path box 225 and enters a file name in the text box 226. When the user clicks the preview button 227, the survey result information 28 is displayed on the screen, and when the user then clicks the output button 228, the survey result information 28 is saved in the storage device 20.

Most of the processing that the processing device 10 performs in response to the information the user enters from the input device 40 via the output instruction screen 201 is included in the processing of the survey result information output unit 19 (see FIG. 1).

FIG. 7 shows an example of the structure of the survey result information 28 generated and output by the prior research survey system 1. As shown in FIG. 7, the research field column 301 of the survey result information 28 stores the name of the research field surveyed. The title column 302 stores the titles of the documents surveyed, and the bibliographic information column 303 stores the journal name, authors, publication year, and so on of each document. The survey result information column 304 stores information obtained from each document, such as disease names and subjects.

The survey result information column 304 is divided into the items checked with the check boxes 218 of the category items 219 in the output item selection area 211 of FIG. 3, and the names of the divided items correspond to category names such as disease name, subjects, and measurement target. For each document, the survey result information column 304 corresponding to each category item stores the information extracted from that document for that category. Here, the information extracted from a document means the terms that the user marked, after designating the category, in the text 122 displayed in the text display area 114 of the search instruction screen 101 of FIG. 6.

Accordingly, each survey result information column 304 may store a plurality of terms or pieces of information. In that case, rows or columns may be added to the file at output time so that they are shown side by side. Alternatively, at output time a screen may be displayed that lets the user choose which terms or pieces of information to display, or whether to display all of them.
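
As one possible output layout, the survey result information could be written as CSV with one row per document and one column per selected category. The sketch below joins multiple terms within a single cell, which is just one of the options mentioned above; the column names, the `"; "` separator, and the row structure are assumptions for illustration.

```python
import csv
import io


def write_survey_csv(rows, categories, joiner="; "):
    """Serialize survey results to CSV text, one column per selected category.

    Each row is a dict with hypothetical keys "field", "title",
    "bibliography", and "terms" (a dict mapping category -> list of terms).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field", "title", "bibliography"] + list(categories))
    for row in rows:
        cells = [joiner.join(row["terms"].get(c, [])) for c in categories]
        writer.writerow([row["field"], row["title"], row["bibliography"]] + cells)
    return buffer.getvalue()


rows = [{
    "field": "Medicine",
    "title": "Paper A",
    "bibliography": "J. Ex., 2014",
    "terms": {"disease": ["asthma", "diabetes"]},
}]
csv_text = write_survey_csv(rows, ["disease", "subjects"])
parsed = list(csv.reader(io.StringIO(csv_text)))
```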

As described above, in the present embodiment, simply by selecting the research field and categories to which the desired information belongs, the user can obtain, for the documents found by the keyword search, survey result information 28 in which the search target terms that appear in each document and fall under the selected categories are associated with the title and bibliographic information of that document. By browsing this survey result information 28, the user can quickly grasp the main points of each document. The present embodiment can therefore improve the efficiency of surveying prior research.

FIG. 8 outlines the processing flow of the main processing executed in the prior research survey system 1. The processing device 10 of the prior research survey system 1 first displays the search instruction screen 101 on the display device 50 and selects the document DB 31 to be searched by acquiring the item the user selects in the pull-down menu 102 (step S11).

Next, as processing of the keyword search unit 11, the processing device 10 reads the keywords entered in the text box 103 and performs a keyword search of the selected document DB 31 using those keywords (step S12). The technical term dictionary 21 selected in the pull-down menu 104 is used in this keyword search as appropriate.

Next, based on the results of the keyword search, the processing device 10 generates a retrieved document list 22 sorted in order of importance and displays it in the document list display area 107 of the search instruction screen 101 (step S13). The importance here is determined by, for example, the frequency of keyword hits.
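
Ranking by keyword hit frequency can be sketched as follows. The hit count used here (case-insensitive substring occurrences summed over all keywords) is an assumed stand-in for the importance measure, which the text only describes as being based on hit frequency "and the like".

```python
def rank_documents(docs, keywords):
    """Return titles of documents with at least one hit, most hits first.

    `docs` maps a title to its text. Ties are broken alphabetically by title.
    """
    def hit_count(text):
        lower = text.lower()
        return sum(lower.count(keyword.lower()) for keyword in keywords)

    scored = [(hit_count(text), title) for title, text in docs.items()]
    scored = [item for item in scored if item[0] > 0]
    return [title for _, title in sorted(scored, key=lambda s: (-s[0], s[1]))]


docs = {
    "Paper A": "Asthma treatment in adults with asthma.",
    "Paper B": "A study of asthma.",
    "Paper C": "An unrelated topic.",
}
ranking = rank_documents(docs, ["asthma"])
```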

Next, as processing of the feature word extraction/display unit 12, the processing device 10 extracts feature words in parallel with the keyword search and displays the extracted feature words 111 in the feature word display area 110 of the search instruction screen 101 (step S14). A feature word is, for example, a technical term that appears frequently in the document. When the processing device 10 receives input in which the user selects one or more of the displayed feature words 111 and clicks the search button 112, that is, when the search is to be repeated (Yes in step S15), it re-executes the processing from step S12.
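
Feature word extraction in the sense of "frequently appearing technical terms" could be approximated by counting dictionary terms in the text. The tokenizer, the single-word lowercase term set, and the `top_n` parameter are illustrative assumptions; the text does not describe the actual extraction algorithm.

```python
import re
from collections import Counter


def feature_words(text, technical_terms, top_n=5):
    """Return the most frequent technical terms found in `text`.

    `technical_terms` is a set of lowercase single-word terms standing in
    for a technical term dictionary.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(token for token in tokens if token in technical_terms)
    return [word for word, _ in counts.most_common(top_n)]


text = "MRI and EEG were used. MRI showed changes. EEG and MRI agree."
words = feature_words(text, {"mri", "eeg"})
```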

When, on the other hand, the search button 112 is not clicked and the processing device 10 receives input in which a specific document is checked in the retrieved document list 22 displayed in the document list display area 107 and the text display button 106 is clicked (no repeated search: No in step S15), the processing device 10, as processing of the document text display unit 13, selects the checked document and displays its text in the text display area 114 of the search instruction screen 101 (step S16).

The processing device 10 then divides the text of the document into chapters and the like (step S17) and further excludes quoted passages and other parts judged not to contain important information (step S18). The specific methods for these steps are described in detail later with reference to separate drawings.

Next, as processing of the research field/category selection unit 14, the processing device 10 selects the research field and category to which the information the user wants to know (the search target terms) belongs, based on the user's selection of the items 116 and 118 in the pull-down menus 115 and 117 (step S19). If "Add" has been selected in the items 116 and 118 (Yes in step S20), the processing device 10 executes the research field/category addition processing (step S21). The research field/category addition processing of step S21 is described in detail later with reference to a separate drawing.

If, on the other hand, a predefined research field and category are selected from the items 116 and 118 in step S19 ("Add" not selected: No in step S20), the processing device 10 acquires the search priorities corresponding to the selected research field and category from the search priority table 27 and, based on those priorities, determines the search order and the like for the text of the document selected in step S16 (step S22).

Next, as processing of the tagged text generation unit 16 and the search target term mark display unit, the processing device 10 searches the document text 23 for the terms (the information to be known) associated with the selected research field and category in the search target term dictionary 24, tags the terms that are hit, and generates and displays the tagged text 25 (step S23). In the text 122 displayed in the text display area 114, the terms hit by the search, that is, the tagged terms, are given the mark 123.
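
Dictionary-based tagging of this kind can be sketched with a regular expression built from the terms registered for the selected research field and category. The dictionary layout (keyed by a `(field, category)` tuple) and the `<term category="...">` tag syntax are hypothetical choices for illustration.

```python
import re


def tag_terms(text, dictionary, field, category):
    """Wrap every dictionary term found in `text` in a <term> tag.

    Longer terms are tried first so that "bronchial asthma" is tagged as
    a whole rather than only its substring "asthma".
    """
    terms = dictionary.get((field, category), [])
    if not terms:
        return text
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    return pattern.sub(
        lambda match: f'<term category="{category}">{match.group(0)}</term>', text
    )


dictionary = {("medicine", "disease"): ["asthma", "bronchial asthma"]}
tagged = tag_terms("Patients with bronchial asthma.", dictionary, "medicine", "disease")
```

A display layer could then render each `<term>` element with a category-specific highlight, corresponding to the marks shown in the text display area.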

Next, as processing of the survey result information output unit 19, the processing device 10 displays the output instruction screen 201, reads the information the user enters via that screen, and selects the output items based on that information (step S24). The processing device 10 then outputs survey result information 28 such as that shown in FIG. 7 to the storage device 20 or the display device 50 (step S25).

FIG. 9 shows an example of the detailed processing flow of the research field/category addition processing (step S21) in FIG. 8. The processing device 10 reads the new research field and category entered via the text box 128 of the research field addition edit screen 101b and the text box 130 of the category addition edit screen 101c, and sets them as a new research field and category in the search target term dictionary 24 (step S31). If a predefined research field was selected in step S19 (see FIG. 8), setting of a research field via the research field addition edit screen 101b is omitted.

Next, the processing device 10 reads the selections made in the items 133, 136, and 139 of the pull-down menus 132, 135, and 138 on the research field addition edit screen 101b and the values entered in the text boxes 134, 137, and 140. Based on this information, the processing device 10 sets priority values for search location, correlation, and proximity for the new category set via the text box 130 and stores them in the search priority table 27 (step S32). At this time, the processing device 10 also sets the weights for manual learning, search location, correlation, and proximity.

Next, the processing device 10 selects the document checked with a check box 108 in the document list displayed in the document list display area 107 of the search instruction screen 101 and displays the text of that document in the text display area 114 (step S33). The processing device 10 then divides the displayed text into chapters and the like (step S34) and excludes quoted passages and the like from it (step S35).

Next, the processing device 10 reads the terms the user selects in the text 122 displayed in the text display area 114 and marks them, for example by changing their background color (step S36). Here the user is assumed to select terms suitable for inclusion in the new research field and category set in step S31.

Next, the processing device 10 acquires information such as which parts (chapters and the like) of the divided text the marked terms appear in most often and which feature words they appear near; in other words, it learns and stores the features of the terms (step S37).

The processing device 10 repeats the processing of steps S33 to S37 (the processing of the learning unit 39) a predetermined number of times (No in step S38). Through this repetition, the terms belonging to the research field and category newly set in the search target term dictionary 24 are determined. Learning information for setting the priority values for search location, correlation, and proximity in the search priority table 27 is also obtained. The user can therefore change the priority values set in step S32 to more appropriate values.

Next, once the processing of the learning unit 39 has been executed the predetermined number of times (Yes in step S38), the processing device 10 determines the text search order and the like using the priority information for the new research field and category in the search priority table 27 established by the processing so far (step S39). The processing device 10 then searches the text of a suitably selected document for the terms belonging to the category set in the search target term dictionary 24, generates tagged text in which the retrieved terms are tagged, and displays it in the text display area 114 (step S40).

In the text 122 displayed in the text display area 114, the tagged terms carry marks such as the mark 123, so the user can judge whether appropriate search target terms have been extracted properly. If the user judges that learning should be repeated further (No in step S41), the processing device 10 executes the processing from step S33 (the processing of the learning unit 39) again.

FIG. 10 shows an example of a method for setting the parts of a document's text to be searched preferentially. Particularly when the document is an academic paper, its text is often divided in a standardized way into Abstract 610, Introduction 620, Materials and Method 630, Results and Discussion 640, Conclusion 650, Acknowledgement 660, and so on, as shown in FIG. 10. The Abstract summarizes the research, and the Introduction describes prior research related to it. The Materials and Method section describes the measurement methods, the measurement targets, and so on.

In the present embodiment, therefore, the text of a document is divided into chapters such as Abstract 610, Introduction 620, Materials and Method 630, Results and Discussion 640, Conclusion 650, and Acknowledgement 660, the chapters to be searched preferentially are determined for each category of terms, and the priority value of each chapter is stored in the search priority table 27.

For example, if terms of the category marked with the mark 601 appear most frequently in Abstract 610 and then in Results and Discussion 640, the priority values are set in that order and stored in the search priority table 27. Likewise, if terms of the category marked with the marks 602 and 603 appear most frequently in Materials and Method 630 and then in Abstract 610, the priority values are set in that order and stored in the search priority table 27.

Also, for example, Introduction 620 describes research that preceded the paper rather than the content of the paper itself, so it is judged not to contain the information the user wants to know, that is, the search target terms. Introduction 620 can therefore be excluded from the search. For a chapter excluded from the search in this way, the priority value may be set to 0, for example.
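
A per-category priority lookup with zero-valued chapters excluded can be sketched as below. The table layout and the convention that a larger value means "search earlier" are assumptions; the text only states that priority values are stored and that 0 may mark an excluded chapter.

```python
def search_order(priority_table, category):
    """Return chapter names to search, highest priority first.

    Chapters whose priority is 0 are treated as excluded from the search.
    Assumes (hypothetically) that larger values mean higher priority.
    """
    chapter_priorities = priority_table.get(category, {})
    active = [(priority, chapter)
              for chapter, priority in chapter_priorities.items()
              if priority > 0]
    # sorted() is stable, so equal priorities keep their table order.
    return [chapter for _, chapter in sorted(active, key=lambda item: -item[0])]


priority_table = {
    "disease": {
        "Abstract": 3,
        "Introduction": 0,
        "Materials and Method": 1,
        "Results and Discussion": 2,
    }
}
order = search_order(priority_table, "disease")
```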

FIG. 11 shows another example of a method for setting the parts of a document's text to be searched preferentially. As shown in FIG. 11, this method uses a plurality of teacher text data 701, 702, and 703 in which tags (marks) 705, 706, and 707 are attached to the information the user wants to know (terms belonging to categories prepared in advance). Each of the teacher text data 701, 702, and 703 consists of the entire text of a document selected in advance for training, arranged in a single line, and is stored in the teacher text 26 of the storage device 20.

The processing device 10 uses the teacher text data 701, 702, and 703 to analyze and learn the relationship between the places (such as sections) where the terms tagged with tags 705, 706, and 707 appear and features such as the surrounding context and parts of speech. After learning where the tagged terms of each category appear, the processing device 10 obtains, for each category, statistics such as how often its terms appear in each place. Based on these frequencies, it determines the priority values for the priority search and stores them in the search priority table 27.
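As a rough illustration of the frequency step only, the following Python sketch counts where tagged terms occur in the teacher texts and converts the counts into rank-based priority values. It does not model the learning of context or parts of speech mentioned above. The input format (one `(section, category)` pair per tagged occurrence), the category names, and the rank-based values are all assumptions.

```python
from collections import Counter


def learn_priorities(teacher_texts):
    """Derive priority values from where tagged terms occur.

    teacher_texts: list of documents; each document is a list of
    (section, category) pairs, one pair per tagged term occurrence.
    Returns category -> {section: priority}, where the most frequent
    section receives the largest value (assumed convention).
    """
    counts = {}
    for document in teacher_texts:
        for section, category in document:
            counts.setdefault(category, Counter())[section] += 1

    table = {}
    for category, counter in counts.items():
        ranked = [section for section, _ in counter.most_common()]
        table[category] = {
            section: len(ranked) - rank for rank, section in enumerate(ranked)
        }
    return table


# Hypothetical teacher data with two categories, "disease" and "method".
teacher = [
    [("Abstract", "disease"), ("Abstract", "disease"),
     ("Results and Discussion", "disease"), ("Material and Method", "method")],
    [("Abstract", "disease"), ("Material and Method", "method"),
     ("Abstract", "method")],
]
priority_table = learn_priorities(teacher)
```

With this data, "disease" terms occur three times in Abstract and once in Results and Discussion, so Abstract receives the higher value.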

When the processing device 10 then searches untagged search target text 704, it obtains the priorities of the priority search from the search priority table 27, searches the higher-priority parts first for the terms of each category, and attaches a mark (tag) to each term that is hit.
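A minimal sketch of this search-and-tag step in Python follows. The section dictionary, the `<term>` tag syntax, and the function name are hypothetical; the specification does not define a concrete tag format. Sections are visited in the order given, and sections missing from the order are left untouched.

```python
import re


def tag_terms(sections, terms, order, tag="term"):
    """Tag every occurrence of the given terms, visiting sections in `order`.

    sections: {section name: text}
    terms:    search target terms for one category
    order:    section names sorted by search priority; sections not
              listed are treated as excluded and returned unchanged
    """
    tagged = dict(sections)
    if not terms:
        return tagged
    # Longer terms first so that a short term does not pre-empt a longer one.
    alternatives = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in alternatives))
    for name in order:
        if name in tagged:
            tagged[name] = pattern.sub(
                lambda match: f"<{tag}>{match.group(0)}</{tag}>", tagged[name]
            )
    return tagged


document = {"Abstract": "We used PCR.", "Introduction": "PCR is common."}
result = tag_terms(document, ["PCR"], ["Abstract"])
```

Here only Abstract is searched, so the occurrence of "PCR" in Introduction stays unmarked.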

Although three teacher text data items 701, 702, and 703 are used in this example, the number may be larger or smaller than three. The more teacher texts there are, however, the more accurate the learning becomes. This learning process improves the accuracy of the priorities stored in the search priority table 27.

FIG. 12 schematically shows an example of a method for extracting the parts of a document's text to be excluded from the search. In each of the texts 801 to 804 shown in FIG. 12, a number enclosed in [ ] or a superscript number appearing before a punctuation mark indicates that the statement up to that punctuation mark is cited from another document. Such a statement is therefore often not a method or conclusion of the document itself, and in this embodiment it is excluded from the search.

Specifically, the processing device 10 scans the texts 801, 802, 803, and 804, and when it detects a citation marker (a number enclosed in [ ], a superscript number, or the like) before a punctuation mark, it excludes the sentence preceding that punctuation mark from the search. Here, when the citation marker precedes a period ("。"), as in texts 802 and 803, the "sentence" means the entire sentence ending with that period. When the citation marker precedes a comma ("、"), as in texts 801 and 804, it means the part of that sentence that precedes the comma.
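The exclusion rule can be sketched in Python as follows. The regular expressions, the function name, and the naive sentence splitting on periods are assumptions for illustration; in particular, decimal numbers and abbreviations would be split incorrectly by this simple approach.

```python
import re

# Assumed citation markers: bracketed numbers such as [1] or [2, 3],
# or a run of Unicode superscript digits.
CITE = r"(?:\[\d+(?:\s*[,\-]\s*\d+)*\]|[\u00b9\u00b2\u00b3\u2070\u2074-\u2079]+)"
CITE_BEFORE_PERIOD = re.compile(CITE + r"\s*[.。]$")
CITE_BEFORE_COMMA = re.compile(CITE + r"\s*[,、]")
SENTENCE = re.compile(r"[^.。]+[.。]?")  # naive split on periods


def strip_cited(text):
    """Return the text with cited statements removed."""
    kept = []
    for match in SENTENCE.finditer(text):
        sentence = match.group(0).strip()
        if CITE_BEFORE_PERIOD.search(sentence):
            continue  # marker before the period: drop the whole sentence
        hits = list(CITE_BEFORE_COMMA.finditer(sentence))
        if hits:
            # Marker before a comma: drop everything up to that comma.
            sentence = sentence[hits[-1].end():].strip()
        if sentence:
            kept.append(sentence)
    return " ".join(kept)


sample = ("Method X was reported earlier [1]. "
          "Unlike prior work [2], we used method Y. Results improved.")
cleaned = strip_cited(sample)
```

For `sample`, the first sentence is dropped entirely and the second keeps only the clause after the comma.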

As described above, the Introduction 620 of each document does not describe the content of the paper itself and can likewise be excluded from the search. By designating parts of each document's text as excluded in this way, this embodiment shortens the search time required by the processing device 10.

As described above, this embodiment lets the user quickly obtain the desired information with simple operations, which in particular improves the efficiency of prior research surveys based on literature review.

The present invention is not limited to the embodiment described above and includes various modifications. For example, the embodiment has been described in detail to explain the invention clearly, and the invention is not necessarily limited to a configuration having all of the elements described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to that of one embodiment. For part of the configuration of each embodiment, other configurations may also be added, deleted, or substituted.

Description of Symbols
1 Prior research survey system
2 Communication network
3 Document DB server
10 Processing device
11 Keyword search unit
12 Feature word extraction and display unit
13 Document text display unit
14 Research field / category selection unit
15 Research field / category addition unit
16 Tagged text generation unit
17 Search target term mark display unit
18 Learning unit
19 Survey result information output unit
20 Storage device
21 Technical term dictionary
22 Retrieved document list
23 Document text
24 Search target term dictionary
25 Tagged text
26 Teacher text
27 Search priority table
28 Survey result information
39 Learning unit
40 Input device
50 Display device
101 Search instruction screen
101b Research field addition and editing screen
101c Category addition and editing screen
201 Output instruction screen

Claims

A prior research survey system comprising:
a communication device communicably connected to a document DB that stores the texts of documents and their bibliographic data;
a storage device that holds a search target term dictionary in which search target terms, which are information the user wants to know, are associated with research fields and categories; and
a processing device comprising:
a research field / category selection unit that selects a research field and a category containing the search target terms to be searched, based on information the user enters via an input device;
a tagged text generation unit that refers to the search target term dictionary to obtain the search target terms associated with the selected research field and category, searches the text of a document selected from the document DB using the obtained search target terms, generates tagged text by attaching tags to the search target terms contained in the text of the document, and stores the generated tagged text in the storage device; and
a search target term mark display unit that displays the tagged text on a display device and displays, on each search target term to which a tag is attached, a mark clearly indicating that the tag is attached.

The prior research survey system as described, wherein the processing device further comprises a survey result information output unit that searches the texts of one or more documents selected from the document DB for the search target terms associated with each of the one or more research fields and categories selected by the research field / category selection unit, and displays on the display device a list in which the search target terms extracted from the text of each document are associated with the title and bibliographic data of that document.

The prior research survey system according to claim 1, wherein the processing device further comprises a research field / category addition unit that sets a new research field and category based on the user's operation of the input device, selects terms to be associated with the new research field and category from the texts of one or more documents selected from the document DB, and adds the selected terms to the search target term dictionary as search target terms associated with the new research field and category.

The prior research survey system according to claim 1, wherein
the storage device further holds a search priority table that stores, for each research field and category, search priorities governing the order in which the parts of a document's text are searched when the text is divided into a plurality of parts, and
the processing device, when searching the text of a document for the search target terms belonging to a research field and category, divides the text into a plurality of parts, obtains from the search priority table the search priority of each part associated with that research field and category, determines the search order within the text based on the obtained search priorities, and searches the divided text in the determined order.

The prior research survey system according to claim 4, wherein the processing device displays on the display device, for each research field and category, input fields for setting the search priority of each part of a document's text, and uses the information the user enters in the input fields to set, in the search priority table, the search priority of each part for each research field and category.

The prior research survey system according to claim 4, wherein
the storage device further holds one or more teacher texts, each being tagged text generated by searching the texts of one or more documents selected from the document DB for the search target terms corresponding to each of one or more research fields and categories selected in advance, and attaching tags that identify the respective research field and category, and
the processing device uses the one or more teacher texts to obtain, for each research field and category, frequency information indicating in which part the associated search target terms appear when the teacher text is divided into one or more parts, and sets, based on the frequency information, the search priority of each part of a document's text for that research field and category in the search priority table.

The prior research survey system according to claim 1, wherein the processing device, when searching the text of a document, upon detecting a citation marker immediately before a punctuation mark in the text, excludes from the search the part of the sentence containing that punctuation mark that precedes the punctuation mark.

A prior research survey method performed by a computer that is communicably connected to a document DB storing the texts of documents and their bibliographic data, that includes a processing device, a storage device, an input device, and a display device, and that holds in the storage device a search target term dictionary in which search target terms are associated with research fields and categories, the method comprising:
a first step of selecting a research field and a category containing the search target terms to be searched, based on information the user enters via the input device;
a second step of referring to the search target term dictionary to obtain the search target terms associated with the selected research field and category, searching the text of a document selected from the document DB using the obtained search target terms, generating tagged text by attaching tags to the search target terms contained in the text of the document, and storing the generated tagged text in the storage device; and
a third step of displaying the tagged text on the display device and displaying, on each search target term to which a tag is attached, a mark clearly indicating that the tag is attached.

The prior research survey method as described, wherein the computer further executes a fourth step of searching the texts of one or more documents selected from the document DB for the search target terms associated with each of the one or more research fields and categories selected in the first step, and displaying on the display device a list in which the search target terms extracted from the text of each document are associated with the title and bibliographic data of that document.

The prior research survey method as described, wherein the computer further executes a fifth step of setting a new research field and category based on the user's operation of the input device, selecting terms to be associated with the new research field and category from the texts of one or more documents selected from the document DB, and adding the selected terms to the search target term dictionary as search target terms associated with the new research field and category.

The prior research survey method according to claim 8, wherein
the storage device also holds a search priority table in which, for each research field and category, search priorities governing the order in which the parts of a document's text are searched when the text is divided into a plurality of parts are associated with those parts, and
the computer, when searching the text of a document for the search target terms belonging to a research field and category, divides the text into a plurality of parts, obtains from the search priority table the search priority of each part associated with that research field and category, determines the search order within the text based on the obtained search priorities, and searches the divided text in the determined order.

The prior research survey method according to claim 11, wherein the computer displays on the display device, for each research field and category, input fields for setting the search priority of each part of a document's text, and uses the information the user enters in the input fields to set, in the search priority table, the search priority of each part for each research field and category.

The prior research survey method according to claim 11, wherein
the storage device also holds one or more teacher texts, each being tagged text generated by searching the texts of one or more documents selected from the document DB for the search target terms corresponding to each of one or more research fields and categories selected in advance, and attaching tags that identify the respective research field and category, and
the computer uses the one or more teacher texts to obtain, for each research field and category, frequency information indicating in which part the associated search target terms appear when the teacher text is divided into one or more parts, and sets, based on the frequency information, the search priority of each part of a document's text for that research field and category in the search priority table.

The prior research survey method according to claim 8, wherein the computer, when searching the text of a document, upon detecting a citation marker immediately before a punctuation mark in the text, excludes from the search the part of the sentence containing that punctuation mark that precedes the punctuation mark.