JP7182585B2

JP7182585B2 - program

Info

Publication number: JP7182585B2
Application number: JP2020169393A
Authority: JP
Inventors: 澄人吉川
Original assignee: Micware Co Ltd
Current assignee: Micware Co Ltd
Priority date: 2018-02-27
Filing date: 2020-10-06
Publication date: 2022-12-02
Anticipated expiration: 2038-07-20
Also published as: JP7023821B2; JP2021009720A; JP2019149145A; JP2019149140A; JP6788637B2

Description

The present invention relates to an information retrieval device and an information retrieval system, and more particularly to an information retrieval device and an information retrieval system that search, based on a search phrase, for a search target representing a specific point.

Conventionally, in map search, an information retrieval device is known that can perform a map search not only on data in which coordinate information is explicitly registered but also on information contained in the content of atypical data (Patent Document 1).

The information retrieval device of Patent Document 1 receives data containing multiple words that have attributes, extracts the words contained in the input data, and performs searches using data to which attribute information has been added, the attribute information being obtained by converting the attribute-related words extracted for each attribute.

According to Patent Document 1, for a word such as "Hokkaido" relating to the location attribute, the coordinate information of "Hokkaido" is added to the atypical data, so the device can output predetermined search results based on a search keyword entered by the user.

JP 2014-191714 A (Japanese Unexamined Patent Application Publication No. 2014-191714)

However, with the configuration of the information retrieval device of Patent Document 1, search results cannot be output unless a word has an attribute, so a more accurate search target cannot be detected from a search phrase.

An object of the present invention is to detect search targets with higher accuracy.

An information retrieval device according to a first aspect of the present disclosure includes a position information acquisition unit, a feature word generation unit, and a search unit. The position information acquisition unit acquires position information. The position information indicates the position on a map that serves as the starting point of a search. The feature word generation unit generates a feature word from a search phrase. The feature word is related to a word representing a search target. The search unit searches an accumulation database for the search target. The accumulation database stores feature words and search targets. Each search target corresponds to a feature word and has point information. The point information indicates a specific point. Based on the position information and the feature word, the search unit searches for the search target corresponding to the feature word within a predetermined range from the position.

The information retrieval device of the first aspect can detect search targets with higher accuracy.

In an information retrieval device according to a second aspect of the present disclosure, the position information acquisition unit acquires the position information, and the position information is estimated by machine learning from words contained in the search phrase.

The information retrieval device of the second aspect can estimate position information by machine learning. Even when the search phrase contains no word that directly indicates position information, the device can estimate the position information indirectly from the words the search phrase does contain.

An information retrieval device according to a third aspect of the present disclosure further includes a category discrimination unit and a search storage device. The category discrimination unit discriminates the category of the search phrase. The search storage device stores words corresponding to categories. In the accumulation database, the feature words are classified by category. The category discrimination unit has a category judgment unit and a category estimation unit. The category judgment unit judges the category according to whether a word contained in the search phrase matches a word stored in advance in the search storage device. The category estimation unit uses machine learning to estimate the category from words in the search phrase that the category judgment unit could not judge. Based on the position information, the category, and the feature word, the search unit searches for the search target corresponding to the feature word within the predetermined range from the position and within the discriminated category.

The information retrieval device of the third aspect can estimate the category by machine learning. Even when the search phrase contains no word that directly indicates a category, the device can estimate the category indirectly from the words the search phrase does contain.
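The two-stage category discrimination of the third aspect (dictionary match first, machine-learned estimation as a fallback) can be sketched roughly as follows. This is only an illustration: the word table `CATEGORY_WORDS`, the function names, and `toy_estimator` (a stand-in for a trained model) are hypothetical and do not appear in the specification.

```python
from typing import Callable, Optional

# Hypothetical contents of the search storage device: word -> category.
CATEGORY_WORDS = {
    "ramen": "restaurant",
    "sushi": "restaurant",
    "onsen": "leisure",
    "hotel": "lodging",
}


def judge_category(words: list[str]) -> Optional[str]:
    """Category judgment: exact match against the pre-stored words."""
    for word in words:
        if word in CATEGORY_WORDS:
            return CATEGORY_WORDS[word]
    return None  # no stored word matched


def discriminate_category(
    words: list[str], estimate: Callable[[list[str]], str]
) -> str:
    """Use the dictionary result if there is one, else fall back to the estimator."""
    category = judge_category(words)
    if category is not None:
        return category
    return estimate(words)


def toy_estimator(words: list[str]) -> str:
    """Stand-in for a machine-learned estimator (assumption, not a real model)."""
    return "restaurant" if "hungry" in words else "unknown"
```

With these assumed tables, `discriminate_category(["cheap", "sushi"], toy_estimator)` is resolved by the dictionary, while `discriminate_category(["hungry", "tonight"], toy_estimator)` reaches the estimator because no stored word matches.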

In an information retrieval device according to a fourth aspect of the present disclosure, the feature word generation unit generates, from words in the search phrase including nouns, adjectives, verbs, and adverbs, a feature word having a higher degree of similarity to the word representing the search target.

By generating a feature word having a higher degree of similarity to the word representing the search target, the information retrieval device of the fourth aspect can detect search targets with still higher accuracy.

An information retrieval system according to a fifth aspect of the present disclosure includes an information processing terminal, an information retrieval device, and an information storage device. The information retrieval device is configured to be able to communicate with the information processing terminal. The information storage device is configured to be accessible by the information retrieval device. The information processing terminal includes an input device, a terminal communication device, and a display device. The input device is configured so that a search phrase can be input. The terminal communication device is configured to transmit the search phrase to the information retrieval device and to receive, from the information retrieval device, the search target found based on the search phrase. The display device is configured to display the search target.

The information retrieval device includes a search communication device, a position information acquisition unit, a feature word generation unit, and a search unit. The search communication device is configured to receive the search phrase from the information processing terminal and to transmit the search target. The position information acquisition unit is configured to acquire position information. The position information indicates the position on a map that serves as the starting point of a search. The feature word generation unit generates a feature word from the search phrase. The feature word is related to a word representing the search target. The search unit is configured to search an accumulation database for the search target.

The search target has point information. The point information indicates a specific point. The search unit is configured to search, based on the position information and the feature word, for the search target corresponding to the feature word within a predetermined range from the position. The information storage device includes an accumulation storage device. The accumulation storage device stores the accumulation database. The accumulation database stores feature words and search targets. Each search target corresponds to a feature word.

The information retrieval system of the fifth aspect can detect search targets with higher accuracy.

In an information retrieval system according to a sixth aspect of the present disclosure, the information processing terminal further includes a terminal storage device. The terminal storage device stores map information. The display device displays the specific point indicating the search target on the map of the map information.

The information retrieval system of the sixth aspect can display the point of a more accurate search target on a map.

In an information retrieval system according to a seventh aspect of the present disclosure, the feature word generation unit calculates statistical data on the search phrases. The information storage device associates the search phrases that the search communication device received from the information processing terminal within a preset period with the statistical data and stores them in the accumulation database.

The information retrieval system of the seventh aspect can detect search targets with higher accuracy based on the search phrases received within a preset period.

In an information retrieval system according to an eighth aspect of the present disclosure, the feature word generation unit generates, from the words contained in the search phrase, a feature word having a higher degree of similarity to the word representing the search target. Based on the statistical data, the feature word generation unit preferentially outputs feature words with higher frequency.

The information retrieval system of the eighth aspect can detect search targets with still higher accuracy.
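As a rough illustration of the frequency-based prioritization in the seventh and eighth aspects, the sketch below ranks candidate feature words by how often they occur in previously logged search phrases. The specification does not define the statistical data in detail, so treating it as a simple occurrence count, as well as the function name `rank_feature_words` and the pre-tokenized input format, are assumptions made here.

```python
from collections import Counter


def rank_feature_words(candidates, logged_phrases):
    """Order candidate feature words by occurrence count in logged phrases.

    `logged_phrases` is assumed to be a list of already tokenized search
    phrases (each a list of words) collected within the preset period.
    """
    counts = Counter(word for phrase in logged_phrases for word in phrase)
    # sorted() is stable, so candidates with equal counts keep their order.
    return sorted(candidates, key=lambda word: -counts[word])


logged = [["quiet", "cafe"], ["cafe", "wifi"], ["cafe", "quiet"]]
ranked = rank_feature_words(["wifi", "quiet", "cafe"], logged)
# "cafe" occurs 3 times, "quiet" twice, "wifi" once.
```

A real system would presumably combine this count with the similarity score mentioned in the eighth aspect; the sketch covers only the frequency part.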

According to the present invention, search targets can be detected with higher accuracy.

A schematic configuration diagram of the information retrieval system according to this embodiment.
A flowchart showing the processing operation of the information retrieval system according to this embodiment.
A flowchart showing an example of processing for searching for spots based on clusters.
A diagram showing a specific example in which clusters are determined from a search phrase.
A diagram showing an example of a data table stored in the category database.
A diagram showing an example of a data table stored in the spot database.
A block diagram showing an example of the main configuration of the information storage device.
A flowchart showing an example of processing for generating the spot database.
A flowchart showing an example of processing for generating the category database.

The information retrieval system 10 of this embodiment is described below. As shown in FIG. 1, the information retrieval system 10 includes an information processing terminal 1, an information retrieval device 2, and an information storage device 3. The information retrieval device 2 is configured to be able to communicate with the information processing terminal 1. The information storage device 3 is configured to be accessible by the information retrieval device 2.

The information processing terminal 1 includes an input device 11, a terminal communication device 12, and a display device 13. The input device 11 is configured so that a search phrase can be input. The terminal communication device 12 is configured to transmit the search phrase to the information retrieval device 2 and to receive, from the information retrieval device 2, the search target found based on the search phrase. The display device 13 is configured to display the search target.

The information retrieval device 2 includes a search communication device 21, a position information acquisition unit 22, a feature word generation unit 24, and a search unit 25. The search communication device 21 is configured to receive the search phrase from the information processing terminal 1 and to transmit the search target. The position information acquisition unit 22 is configured to acquire position information. The position information indicates the position on a map that serves as the starting point of a search. The feature word generation unit 24 generates a feature word from the search phrase. The feature word is related to a word representing the search target. The search unit 25 is configured to search the accumulation database for the search target.

The search target has point information. The point information indicates a specific point. The search unit 25 is configured to search, based on the position information and the feature word, for the search target corresponding to the feature word within a predetermined range from the position.

The information storage device 3 includes an accumulation storage device 31. The accumulation storage device 31 stores the accumulation database. The accumulation database stores feature words and search targets. Each search target corresponds to a feature word.
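The search performed by the search unit 25 (search targets that correspond to a feature word and lie within a predetermined range of the starting position) can be sketched as below. The `Spot` record, the use of great-circle (haversine) distance, the kilometre radius, and the sample coordinates are all assumptions for illustration; the specification does not state how the range is measured.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Spot:
    """Hypothetical accumulation-database record: a search target with point information."""
    name: str
    lat: float
    lon: float
    feature_words: frozenset


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def search_spots(spots, origin, feature_word, radius_km):
    """Return spots that carry the feature word and lie within radius_km of origin."""
    lat, lon = origin
    return [
        s for s in spots
        if feature_word in s.feature_words
        and haversine_km(lat, lon, s.lat, s.lon) <= radius_km
    ]


# Illustrative records only; names and coordinates are made up for the example.
SPOTS = [
    Spot("Spot A", 35.6812, 139.7671, frozenset({"ramen", "late-night"})),
    Spot("Spot B", 35.6800, 139.7650, frozenset({"cafe"})),
    Spot("Spot C", 34.7025, 135.4959, frozenset({"ramen"})),
]
hits = search_spots(SPOTS, origin=(35.68, 139.76), feature_word="ramen", radius_km=5.0)
```

Here only "Spot A" is returned: "Spot B" is nearby but lacks the feature word, and "Spot C" has the feature word but lies several hundred kilometres from the origin.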

The information retrieval system 10 of this embodiment can detect search targets with higher accuracy.

Each component of the information retrieval system 10 of this embodiment is described in detail below.

In the information retrieval system 10 according to this embodiment, the information processing terminal 1 used by the user and the information retrieval device 2 are configured to be able to communicate with each other. The information processing terminal 1 and the information retrieval device 2 are connected by wireless communication, for example via a public communication network 50. In FIG. 1, double-headed arrows illustrate the flow of information communicated via the public communication network 50.

As shown in FIG. 1, the information processing terminal 1 includes a terminal control device 14 and a terminal storage device 15 in addition to the input device 11, the terminal communication device 12, and the display device 13. The information processing terminal 1 is terminal equipment used by the user. Examples of the terminal equipment include a mobile phone and a smartphone.

The input device 11 is configured so that a search phrase can be input. A search phrase is wording used to search for a search target, and it is wording entered freely by the user. A search phrase may be a word, a combination of words, or a complete sentence containing a subject and a predicate. The input device 11 is configured so that it can instruct transmission of the input search phrase from the terminal communication device 12. Examples of the input device 11 include a touch panel integrated with the display device 13, a voice input device that accepts spoken input, and physical keys. The input device 11 is not limited to a touch panel integrated with the display device 13 and may consist of an input unit alone.

The terminal communication device 12 is configured to be able to communicate with the information retrieval device 2. The terminal communication device 12 transmits search phrase information and receives search target information. Because a search target has information indicating a specific point, it is also point information. The point information is POI (Point Of Interest) information. Specifically, the terminal communication device 12 is configured to communicate with the search communication device 21 of the information retrieval device 2 using the public communication network 50. The terminal communication device 12 may be configured for, for example, wired communication as well as public communication over the public communication network 50. The terminal communication device 12 may also be configured for wireless communication conforming to a communication standard such as infrared communication, Wi-Fi (registered trademark), Bluetooth (registered trademark), or BLE (Bluetooth Low Energy).

The display device 13 is configured to display various kinds of information. The display device 13 displays, for example, the search phrase. The display device 13 may display a map. The display device 13 may display the point indicated by the point information of the search target found by the information retrieval device 2, superimposed on the map. For example, the display device 13 can visualize multiple pieces of point information and display them superimposed on the map indicated by the map information stored in the terminal storage device 15. The display device 13 is not limited to a touch panel integrated with the input device 11 and may consist of a display unit alone.

The terminal control device 14 controls various information processing operations. The terminal control device 14 is electrically connected to the input device 11, the terminal communication device 12, the display device 13, and the terminal storage device 15. The terminal control device 14 is configured to control the input device 11, the terminal communication device 12, and the display device 13 based on a program stored in the terminal storage device 15. The terminal control device 14 is implemented by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit).

The terminal storage device 15 has a volatile storage device, which loses its information when power is turned off, and a non-volatile storage device, which retains its information when power is turned off. The volatile storage device temporarily stores information processed by the terminal control device 14. The volatile storage device stores, for example, map information representing a map and point information to be added to the map information. An example of the volatile storage device is RAM (Random Access Memory). The non-volatile storage device is storage capable of holding various information processing programs and various kinds of information. An example of the non-volatile storage device is ROM (Read Only Memory). As the ROM, for example, flash memory or an HDD (Hard Disk Drive) is used.

As shown in FIG. 1, the information retrieval device 2 includes the search communication device 21 and a search control device 27. In addition to these, the information retrieval device 2 includes a search storage device 28. The information retrieval device 2 is implemented by, for example, a server. The search control device 27 has the position information acquisition unit 22, a category discrimination unit 23, the feature word generation unit 24, the search unit 25, and a timekeeping unit 26.

The search communication device 21 is configured to be able to communicate with the information processing terminal 1. The search communication device 21 is configured to receive search phrase information from the information processing terminal 1 and to transmit information on the search target found by the search control device 27 to the information processing terminal 1. In other words, the search communication device 21 receives search phrase information and transmits search target information. Specifically, the search communication device 21 is configured to communicate with the terminal communication device 12 of the information processing terminal 1 using the public communication network 50.

The search communication device 21 is also configured to be able to communicate with the information storage device 3. The search communication device 21 transmits information for searching for a search target and receives information on the found search target from the information storage device 3. Specifically, the search communication device 21 is configured to communicate with the storage communication device 33 of the information storage device 3 using the public communication network 50. The search communication device 21 may be configured for, for example, wired communication as well as public communication over the public communication network 50. The terminal communication device 12 may also be configured for wireless communication conforming to a communication standard such as infrared communication, Wi-Fi (registered trademark), Bluetooth (registered trademark), or BLE.

The search control device 27 is electrically connected to the search communication device 21 and the search storage device 28. The search control device 27 is configured so that the position information acquisition unit 22, the category discrimination unit 23, the feature word generation unit 24, the search unit 25, and the timekeeping unit 26 function. For example, the search control device 27 runs a position information acquisition program, a category discrimination program, a feature word generation program, a search program, and a timekeeping program stored in the search storage device 28. The search control device 27 is implemented by, for example, a CPU or an MPU.

When executed on a computer, the position information acquisition program functions as the position information acquisition unit 22, which acquires position information. When executed on a computer, the category discrimination program functions as the category discrimination unit 23, which discriminates the category of the search phrase. When executed on a computer, the feature word generation program functions as the feature word generation unit 24, which generates feature words from the search phrase. When executed on a computer, the search program functions as the search unit 25, which searches the accumulation database stored in the information storage device 3 for the search target. When executed on a computer, the timekeeping program functions as the timekeeping unit 26, which can keep track of the year, day, and hour.

The position information acquisition unit 22 acquires position information. The position information is, for example, longitude information and latitude information. In addition to longitude and latitude, the position information may include altitude information. The position information acquisition unit 22 can acquire, for example, position information transmitted from the information processing terminal 1. An example of such position information is position information entered with the input device 11 of the information processing terminal 1. The position information acquisition unit 22 can also acquire position information from the search phrase transmitted from the information processing terminal 1.

The search phrase consists of text data. The search phrase contains a phrase. A phrase may be a single word or a group of words, and it may also be a complete sentence. The search phrase is converted from text data into a form that can be handled mechanically. The search phrase is broken down into word units by morphological analysis. Morphological analysis divides the search phrase into the smallest units that carry meaning in natural language (morphemes). Morphological analysis can also obtain part-of-speech information for each morpheme, for example whether a word contained in the search phrase is a noun, an adjective, a verb, or an adverb.
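Once a morphological analyzer has split the search phrase into morphemes with part-of-speech tags, selecting the nouns, adjectives, verbs, and adverbs is a simple filter. The sketch below assumes the analyzer output has already been converted into `(surface, part_of_speech)` pairs; the tag names and the sample tokens are hand-written for illustration and are not the output of any particular analyzer.

```python
# Parts of speech retained as candidate words (per the fourth aspect).
CONTENT_POS = {"noun", "adjective", "verb", "adverb"}


def content_words(tokens):
    """Keep only the surface forms whose part of speech is in CONTENT_POS.

    `tokens` is assumed to be a list of (surface, part_of_speech) pairs
    produced by a morphological analyzer and normalized to these tag names.
    """
    return [surface for surface, pos in tokens if pos in CONTENT_POS]


# Hand-written, romanized example tokens (hypothetical analyzer output).
tokens = [
    ("oishii", "adjective"),
    ("ramen", "noun"),
    ("wo", "particle"),
    ("taberu", "verb"),
]
words = content_words(tokens)
```

The particle is dropped and the remaining three words are passed on as candidates for position, category, and feature word processing.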

For the morphological analysis, a morphological analysis engine such as MeCab can be used. MeCab can be used together with a named entity dictionary. Examples of named entity dictionaries include system dictionaries such as Neologd. The position information acquisition unit 22 morphologically analyzes the search phrase to obtain the words it contains. However, the position information acquisition unit 22 is not limited to performing the morphological analysis itself. It may instead obtain the words produced when the category discrimination unit 23, the feature word generation unit 24, the search unit 25, or a morphological analysis unit (not shown) morphologically analyzes the search phrase. The position information acquisition unit 22 may convert the morphologically analyzed words into distributed representations, that is, numerical representations of the words. The position information acquisition unit 22 may also consolidate the inflected forms of the morphologically analyzed words. FastText, for example, may be used to consolidate the inflected forms.

位置情報取得部２２は、検索フレーズに含まれる単語が地名の場合、地名と位置情報とが対応付いた地名データベースから位置情報を取得することができる。位置情報取得部２２は、例えば、検索フレーズを形態素解析により分解した単語を、検索記憶装置２８に記憶された地名データベースと比較する。位置情報取得部２２は、検索フレーズを形態素解析により分解した単語が地名データベースの地名と一致すれば、一致した地名に対応する位置情報を取得すればよい。 When the word included in the search phrase is a place name, the position information acquisition unit 22 can acquire the position information from the place name database in which the place name and the position information are associated with each other. For example, the positional information acquiring unit 22 compares words obtained by breaking down the search phrase by morphological analysis with the place name database stored in the search storage device 28 . If a word obtained by decomposing a search phrase by morphological analysis matches a place name in the place name database, the position information acquisition unit 22 may acquire position information corresponding to the matching place name.

地名データベースは、例えば、県、市、区、町、村若しくは施設の名称と、位置情報とが対応付いて構成されていればよい。ここでは、地名は、市区町村を表す名称だけでなく、施設の名称も含む。言い換えれば、地名は、地理的情報を示す名称である。地名データベースでは、例えば、県の名称と県庁の位置情報と、市の名称と市役所の位置情報と、区の名称と区役所の位置情報、町の名称と町役場の位置情報、村の名称と村役場の位置情報、施設の名称と施設の位置情報とが対応するように構成されている。以下では、検索フレーズに含まれる単語が地名の場合、地名と位置情報とが直接対応付いた地名データベースから位置情報を取得することを、非機械学習による位置情報の取得ともいう。位置情報取得部２２は、非機械学習により、位置情報を取得する場合、データベースに含まれる単語と一致するか否かの硬い検索を行うことになる。 The place name database may, for example, associate the names of prefectures, cities, wards, towns, villages, or facilities with position information. Here, place names include not only the names of municipalities but also the names of facilities. In other words, a place name is a name that indicates geographical information. In the place name database, for example, the name of a prefecture corresponds to the position of its prefectural office, the name of a city to the position of its city hall, the name of a ward to the position of its ward office, the name of a town to the position of its town hall, the name of a village to the position of its village office, and the name of a facility to the position of that facility. Hereinafter, when a word in the search phrase is a place name, acquiring position information from the place name database, in which place names and position information are directly associated, is also referred to as acquiring position information by non-machine learning. When acquiring position information by non-machine learning, the position information acquisition unit 22 performs a strict search that checks whether or not a word matches a word contained in the database.
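
The non-machine-learning lookup described above can be sketched as a plain dictionary match. This is a minimal illustration, not the patented implementation: the name `PLACE_NAME_DB`, the function `lookup_position`, and the coordinate values (rough, illustrative figures) are all assumptions introduced here.

```python
from typing import Dict, List, Optional, Tuple

Position = Tuple[float, float]  # (latitude, longitude)

# Hypothetical place name database. The coordinates are rough,
# illustrative values and are not taken from the source document.
PLACE_NAME_DB: Dict[str, Position] = {
    "北海道": (43.06, 141.35),
    "神戸市": (34.69, 135.20),
}


def lookup_position(words: List[str]) -> Optional[Position]:
    """Return the position of the first word that matches a place name.

    This is the strict search: a word either matches a database key
    exactly or it does not. No similarity scoring is involved.
    """
    for word in words:
        if word in PLACE_NAME_DB:
            return PLACE_NAME_DB[word]
    return None
```

For example, `lookup_position(["神戸市", "ラーメン"])` returns the stored position for 神戸市, while a phrase with no place name returns `None` and would be handed to the machine-learning path.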

位置情報取得部２２は、検索フレーズに含まれる単語が地名を含まない場合、機械学習により、検索語と位置情報とが対応付いたデータベースから位置情報を取得することができる。位置情報取得部２２は、例えば、機械学習により、検索フレーズを形態素解析により分解された単語と、類似度の高い検索語に置き換える。位置情報取得部２２は、例えば、検索フレーズに形態素解析を行った上、予め記憶されている単語と検索語との類似度で、所望の検索語を選択することができる。位置情報取得部２２は、検索語と位置情報とが対応付いた地名データベースと比較する。位置情報取得部２２は、検索語が地名データベースの地名と一致すれば、一致した地名に対応する位置情報を取得すればよい。言い換えれば、位置情報取得部２２は、地理的情報を示す地名から経度及び緯度の座標値に変換するジオコーディング技術を利用している。 When the words in the search phrase do not include a place name, the position information acquisition unit 22 can use machine learning to acquire position information from a database in which search terms and position information are associated. For example, the position information acquisition unit 22 uses machine learning to replace a word obtained by morphological analysis of the search phrase with a search term that is highly similar to it. For example, after morphologically analyzing the search phrase, the position information acquisition unit 22 can select the desired search term based on the similarity between pre-stored words and search terms. The position information acquisition unit 22 then compares the search term with the place name database, in which search terms and position information are associated. If the search term matches a place name in the place name database, the position information acquisition unit 22 acquires the position information corresponding to that place name. In other words, the position information acquisition unit 22 uses geocoding, which converts a place name indicating geographical information into longitude and latitude coordinates.

位置情報取得部２２は、情報処理端末１からの位置情報、情報処理端末１からの検索フレーズに含まれる地名の単語、情報処理端末１からの検索フレーズに基づいた検索語の少なくとも１つを利用して、地図上の位置を示す位置情報を取得することができる。位置情報取得部２２は、情報処理端末１からの情報を用いて、地図上の位置を示す位置情報を取得している。本実施形態に係る情報検索装置２では、情報処理端末１から直接的に位置情報を取得できない場合でも、間接的に位置情報を取得することができる。情報検索装置２は、機械学習だけで位置情報を取得する構成と比較して、より低い処理負荷で位置情報を取得することができる。 The position information acquisition unit 22 uses at least one of the position information from the information processing terminal 1, the word of the place name included in the search phrase from the information processing terminal 1, and the search term based on the search phrase from the information processing terminal 1. Then, the position information indicating the position on the map can be acquired. The position information acquisition unit 22 uses information from the information processing terminal 1 to acquire position information indicating a position on the map. The information retrieval device 2 according to the present embodiment can indirectly acquire position information even when the position information cannot be acquired directly from the information processing terminal 1 . The information search device 2 can acquire position information with a lower processing load than a configuration that acquires position information only by machine learning.
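
The three sources of position information listed above can be sketched as a fallback chain. The text only says that at least one of the sources is used; the ordering below (terminal position first, then exact place name match, then a similarity-based search term) is an assumption, chosen so that the cheaper steps run before the machine-learning step. The function `acquire_position` and the `similar_term` callback are hypothetical names standing in for the trained model.

```python
from typing import Callable, Dict, List, Optional, Tuple

Position = Tuple[float, float]  # (latitude, longitude)


def acquire_position(
    terminal_position: Optional[Position],
    words: List[str],
    place_db: Dict[str, Position],
    similar_term: Callable[[str], Optional[str]],
) -> Optional[Position]:
    """Try the position sources from cheapest to most expensive."""
    # 1. Position sent directly by the information processing terminal.
    if terminal_position is not None:
        return terminal_position
    # 2. Non-machine-learning path: exact match against place names.
    for word in words:
        if word in place_db:
            return place_db[word]
    # 3. Machine-learning path (assumed interface): map each word to a
    #    highly similar search term, then look that term up.
    for word in words:
        term = similar_term(word)
        if term is not None and term in place_db:
            return place_db[term]
    return None
```

Because step 3 runs only when steps 1 and 2 fail, the similarity model is not invoked for phrases that already contain a place name, which is one way to read the lower processing load mentioned above.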

即ち、本実施形態に係る情報検索装置２では、位置情報取得部２２は、位置情報を取得する。位置情報は、機械学習により、検索フレーズに含まれる単語から推定される。 That is, in the information retrieval device 2 according to the present embodiment, the position information acquisition section 22 acquires position information. Location information is estimated from words included in search phrases by machine learning.

本実施形態に係る情報検索装置２は、直接的に位置情報を示す単語が検索フレーズに含まれていなくとも、機械学習により、検索フレーズに含まれる単語から間接的に位置情報を推定することができる。 Even if the search phrase does not contain a word that directly indicates position information, the information retrieval device 2 according to the present embodiment can use machine learning to estimate position information indirectly from the words in the search phrase.

カテゴリ判別部２３は、検索フレーズのカテゴリを判別することができるように構成されている。カテゴリ判別部２３は、情報処理端末１から送信される検索フレーズのカテゴリを判別する。カテゴリは、例えば、温泉、公園、美術館、神社、ライブ、ショッピング、アミューズメント、アウトドア、イベント、グルメ、ラーメン、山が挙げられる。カテゴリは、適宜の種類に分けられている。カテゴリの種類は、例えば、３００種類とすることができる。 The category determination unit 23 is configured to be able to determine the category of search phrases. Category determination unit 23 determines the category of the search phrase transmitted from information processing terminal 1 . Categories include, for example, hot springs, parks, museums, shrines, live performances, shopping, amusement, outdoors, events, gourmet, ramen, and mountains. The categories are divided into appropriate types. The types of categories can be, for example, 300 types.

カテゴリ判別部２３は、検索フレーズを形態素解析している。カテゴリ判別部２３は、検索フレーズに含まれる単語を得るために、検索フレーズを形態素解析する構成だけに限られない。カテゴリ判別部２３は、例えば、位置情報取得部２２、特徴語生成部２４、検索部２５、若しくは図示していない形態素解析部が検索フレーズを形態素解析した単語を入手する構成でもよい。カテゴリ判別部２３は、形態素解析した単語を数値表現にした分散表現を行ってもよい。カテゴリ判別部２３は、形態素解析した単語の活用形をまとめてもよい。 The category discrimination unit 23 morphologically analyzes the search phrase. The category determining unit 23 is not limited to the configuration of morphologically analyzing a search phrase in order to obtain words included in the search phrase. The category determination unit 23 may acquire words obtained by morphologically analyzing a search phrase by the position information acquisition unit 22, the feature word generation unit 24, the search unit 25, or a morphological analysis unit (not shown), for example. The category discriminating unit 23 may perform distributed representation in which the words subjected to morphological analysis are represented numerically. The category discrimination unit 23 may group conjugations of words subjected to morphological analysis.

カテゴリ判別部２３は、検索フレーズに含まれる単語がカテゴリを表すカテゴリ名の場合、カテゴリ名と単語とが対応付いたカテゴリデータベースからカテゴリを直接的に判別することができる。カテゴリ判別部２３は、例えば、検索フレーズを形態素解析により分解した単語を、検索記憶装置２８に記憶されたカテゴリデータベースと比較する。カテゴリ判別部２３は、検索フレーズを形態素解析により分解された単語がカテゴリデータベースのカテゴリ名と一致すれば、一致したカテゴリ名を検索記憶装置２８に検索フレーズと対応させて記憶させればよい。以下では、検索フレーズに含まれる単語がカテゴリ名の場合、カテゴリデータベースからカテゴリの判別を行うことを、カテゴリの判定ともいう。すなわち、カテゴリ判別部２３は、カテゴリの判定により、検索フレーズに含まれる単語がカテゴリ名の場合、カテゴリデータベースからカテゴリの判別を非機械学習で行うカテゴリ判定部２３１を備えているといえる。 When a word in the search phrase is a category name, the category discrimination unit 23 can discriminate the category directly from a category database in which category names and words are associated. For example, the category discrimination unit 23 compares the words obtained by morphological analysis of the search phrase with the category database stored in the search storage device 28. If one of those words matches a category name in the category database, the category discrimination unit 23 stores the matching category name in the search storage device 28 in association with the search phrase. Hereinafter, discriminating the category from the category database when a word in the search phrase is a category name is also referred to as category determination. That is, the category discrimination unit 23 can be said to include a category determination unit 231 that, when a word in the search phrase is a category name, discriminates the category from the category database without machine learning.

カテゴリ判別部２３は、検索フレーズに含まれる単語がカテゴリ名を含まない場合、機械学習により、検索語とカテゴリ名とが対応付いたカテゴリデータベースからカテゴリを推定することができる。カテゴリ判別部２３は、例えば、機械学習により、検索フレーズを形態素解析により分解した単語を、その単語との類似度が高い検索語に置き換える。この場合、カテゴリ判別部２３は、検索フレーズを形態素解析により分解した単語のうち、１つの単語をその単語と類似度の高い検索語に置き換えてもよいし、２つ以上の単語をそれらの単語と類似度の高い検索語に置き換えてもよい。また、カテゴリ判別部２３は、例えば、２つ以上の単語から共起する共起語を抽出し、共起語から予めカテゴリごとにクラスタリングされた単語と高い類似度の検索語に置き換えてもよい。ここで、共起とは、検索フレーズに含まれている所定の単語が、検索フレーズ中などに別の単語と同時に出現することをいう。共起語とは、所定の単語と同時に出現する別の単語をいう。カテゴリは、分類によって使用される単語が異なる場合が多いため、カテゴリ判別部２３は、入力と対応する出力が予め判明している教師データを用いた機械学習でカテゴリ判別を行うことが好ましい。 When the words in the search phrase do not include a category name, the category discrimination unit 23 can use machine learning to estimate the category from a category database in which search terms and category names are associated. For example, the category discrimination unit 23 uses machine learning to replace a word obtained by morphological analysis of the search phrase with a search term that is highly similar to that word. In this case, the category discrimination unit 23 may replace one of the words with a highly similar search term, or may replace two or more of the words with search terms highly similar to them. The category discrimination unit 23 may also, for example, extract co-occurring words from two or more words and replace them with a search term that is highly similar to words clustered in advance for each category. Here, co-occurrence means that a given word in the search phrase appears together with another word, for example within the search phrase. A co-occurring word is another word that appears together with the given word. Because the words used often differ from category to category, the category discrimination unit 23 preferably discriminates categories by machine learning with training data whose inputs and corresponding outputs are known in advance.

カテゴリ判別部２３は、例えば、検索フレーズに形態素解析を行った上、予め記憶されている単語と共起語との類似度で、所望の検索語を選択することができる。所望の検索語は、カテゴリ名と一致するように設定されている。カテゴリ判別部２３は、カテゴリ名と一致する検索語をカテゴリデータベースと比較する。カテゴリ判別部２３は、検索語がカテゴリデータベースのカテゴリ名と一致すれば、一致したカテゴリ名が検索フレーズのカテゴリであると推定すればよい。すなわち、カテゴリ判別部２３は、カテゴリの推定により、検索フレーズに含まれる単語と類似する検索語がカテゴリ名の場合、カテゴリデータベースからカテゴリの推定を機械学習で行うカテゴリ推定部２３２を備えているといえる。 For example, after morphologically analyzing the search phrase, the category discrimination unit 23 can select the desired search term based on the similarity between pre-stored words and the co-occurring words. The desired search term is set so that it matches a category name. The category discrimination unit 23 compares the search term with the category database. If the search term matches a category name in the category database, the category discrimination unit 23 estimates that the matching category name is the category of the search phrase. That is, the category discrimination unit 23 can be said to include a category estimation unit 232 that, when a search term similar to a word in the search phrase is a category name, estimates the category from the category database by machine learning.

カテゴリ判別部２３は、情報処理端末１からの検索フレーズに含まれるカテゴリを示す単語、若しくは情報処理端末１からの検索フレーズに含まれる単語に基づいた検索語を利用して、カテゴリを判別することができる。 The category discrimination unit 23 can discriminate the category by using either a word in the search phrase from the information processing terminal 1 that indicates a category, or a search term based on a word in that search phrase.

本実施形態に係る情報検索装置２では、検索フレーズから直接的にカテゴリを判別できない場合でも、検索フレーズに含まれる単語に基づいて間接的にカテゴリを判別することができる。情報検索装置２は、機械学習により、カテゴリを推定する場合、非機械学習でカテゴリを判定する場合と比較して、より精度よくカテゴリを判別することができる。 In the information retrieval device 2 according to the present embodiment, even if the category cannot be determined directly from the search phrase, the category can be determined indirectly based on the words included in the search phrase. When estimating a category by machine learning, the information retrieval device 2 can determine the category with higher accuracy than when determining the category by non-machine learning.

言い換えれば、本実施形態に係る情報検索装置２では、カテゴリ判別部２３と、検索記憶装置２８とを更に備えている。カテゴリ判別部２３は、検索フレーズのカテゴリを判別する。検索記憶装置２８には、カテゴリに対応する単語が記憶されている。カテゴリ判別部２３は、カテゴリ判定部２３１と、カテゴリ推定部２３２とを有する。カテゴリ判定部２３１は、検索フレーズに含まれる単語が、予め検索記憶装置２８に記憶された単語と一致するか否かによって、カテゴリを判別する。カテゴリ推定部２３２は、機械学習により、カテゴリ判定部２３１により判定されなかった検索フレーズに含まれる単語からカテゴリを推定する。 In other words, the information retrieval device 2 according to the present embodiment further includes the category discrimination unit 23 and the search storage device 28. The category discrimination unit 23 discriminates the category of the search phrase. The search storage device 28 stores words corresponding to the categories. The category discrimination unit 23 has a category determination unit 231 and a category estimation unit 232. The category determination unit 231 determines the category according to whether a word in the search phrase matches a word stored in advance in the search storage device 28. The category estimation unit 232 uses machine learning to estimate the category from the words in a search phrase for which the category determination unit 231 could not determine a category.

本実施形態に係る情報検索装置２は、直接的にカテゴリを示す単語が検索フレーズに含まれていなくとも、機械学習により、検索フレーズに含まれる単語から間接的にカテゴリを推定することができる。情報検索装置２は、カテゴリ判定部２３１とカテゴリ推定部２３２とを有することで、機械学習だけでカテゴリを判定する構成と比較して、より低い処理負荷でカテゴリを判定することができる。 The information search device 2 according to the present embodiment can indirectly estimate the category from the words included in the search phrase by machine learning even if the search phrase does not include a word directly indicating the category. Since the information retrieval device 2 includes the category determining unit 231 and the category estimating unit 232, the category can be determined with a lower processing load than a configuration that determines the category only by machine learning.
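
The division of labor between the category determination unit 231 and the category estimation unit 232 can be sketched as a two-stage function. This is a minimal sketch under stated assumptions: `discriminate_category` is a hypothetical name, and the `estimate` callback stands in for the trained classifier, whose actual interface the source does not specify.

```python
from typing import Callable, List, Optional, Set, Tuple


def discriminate_category(
    words: List[str],
    category_names: Set[str],
    estimate: Callable[[List[str]], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (category, how) where how is 'determined' or 'estimated'."""
    # Stage 1 (determination, no machine learning): exact match of a
    # word against the stored category names.
    for word in words:
        if word in category_names:
            return word, "determined"
    # Stage 2 (estimation): only reached when stage 1 found nothing,
    # so the model is not invoked for phrases that name a category.
    return estimate(words), "estimated"
```

Running the exact match first means the estimator is called only for phrases that do not name a category, which corresponds to the lower processing load described above.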

特徴語生成部２４は、検索対象を表す単語と関連する特徴語を生成することができるように構成されている。特徴語生成部２４は、情報処理端末１から送信される検索フレーズから特徴語を生成する。特徴語生成部２４は、検索フレーズから特徴語を生成するために、検索フレーズを形態素解析する。特徴語生成部２４は、例えば、位置情報取得部２２、カテゴリ判別部２３、検索部２５、若しくは図示していない形態素解析部が検索フレーズを形態素解析した単語を入手して、特徴語を生成する構成でもよい。特徴語生成部２４は、形態素解析した単語を数値表現にした分散表現を行う。特徴語生成部２４は、形態素解析した単語の活用形をまとめている。なお、上記「特徴語」は、１つの単語であってもよいし、複数の単語からなる群であってもよい。また、「特徴語の生成」には、特徴語を決定すること、あるいは特徴語を選択することも含まれる。 The feature word generation unit 24 is configured to generate feature words related to words that represent a search target. The feature word generation unit 24 generates feature words from the search phrase transmitted from the information processing terminal 1. To do so, the feature word generation unit 24 morphologically analyzes the search phrase. Alternatively, the feature word generation unit 24 may generate feature words from the words produced when, for example, the position information acquisition unit 22, the category discrimination unit 23, the search unit 25, or a morphological analysis unit (not shown) analyzes the search phrase. The feature word generation unit 24 converts the analyzed words into distributed representations, that is, numerical vectors. The feature word generation unit 24 groups the inflected forms of the analyzed words together. Note that a "feature word" may be a single word or a group of words. Also, "generating a feature word" includes determining or selecting a feature word.

特徴語生成部２４は、例えば、機械学習により、検索フレーズを形態素解析により分解された単語を、その単語との類似度が高い特徴語に置き換える。特徴語生成部２４は、検索フレーズを形態素解析により分解された単語のうち、１つの単語を類似度の高い特徴語に置き換えてもよいし、２つ以上の単語をそれらの単語と類似度の高い特徴語に置き換えてもよい。特徴語生成部２４は、機械学習により、検索フレーズに含まれる単語から、検索対象を表す単語と類似度のより高い特徴語を生成する。特徴語生成部２４は、例えば、検索フレーズに形態素解析を行った上、予め記憶されている単語と特徴語との類似度で、所望の特徴語を選択することができる。より具体的には、特徴語生成部２４は、特徴語の特徴ベクトルをコサイン類似で測定することにより、どのくらい離れているかを判別することができる。すなわち、特徴語生成部２４は、特徴語の特徴ベクトルをコサイン類似で測定することにより、類似度を判別することができる。 For example, the feature word generation unit 24 uses machine learning to replace a word obtained by morphological analysis of the search phrase with a feature word that is highly similar to that word. The feature word generation unit 24 may replace one of the words with a highly similar feature word, or may replace two or more of the words with feature words highly similar to them. Using machine learning, the feature word generation unit 24 generates, from the words in the search phrase, feature words that are more similar to the words representing the search target. For example, after morphologically analyzing the search phrase, the feature word generation unit 24 can select the desired feature word based on the similarity between pre-stored words and feature words. More specifically, the feature word generation unit 24 can determine how far apart feature words are by comparing their feature vectors using cosine similarity. That is, the feature word generation unit 24 can determine the degree of similarity by measuring the cosine similarity of the feature vectors.

特徴ベクトルは、例えば、形態素解析した単語を数値表現にしたデータを入力データとして、ｗｏｒｄ２ｖｅｃなどのアルゴリズムによって得ることができる。ここで用いるｗｏｒｄ２ｖｅｃなどのアルゴリズムは、入力された単語の特徴ベクトルを出力できるように、機械学習済みとされている。コサイン類似は、２つのベクトルのコサインを計算し、コサインの値を類似度としている。コサイン類似は、ベクトル同士がなす角度の近さを表し、１に近ければ近いほどより類似している。コサイン類似は、０に近ければ近いほどより類似していないことを表している。特徴語生成部２４は、類似度に応じた検索フレーズの統計データを算出してもよい。特徴語生成部２４は、複数の特徴語のうち、統計データに基づいて頻度のより高い特徴語を優先して出力するように構成されていることが好ましい。 A feature vector can be obtained, for example, with an algorithm such as word2vec, using numerical representations of the morphologically analyzed words as input data. The word2vec or similar algorithm used here has already been trained so that it outputs a feature vector for an input word. Cosine similarity computes the cosine of the angle between two vectors and uses that value as the degree of similarity. It expresses how close the directions of the two vectors are: the closer the value is to 1, the more similar they are, and the closer it is to 0, the less similar they are. The feature word generation unit 24 may calculate statistical data on search phrases according to the degree of similarity. The feature word generation unit 24 is preferably configured to give priority, among multiple feature words, to outputting those with a higher frequency according to the statistical data.
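
The cosine similarity described above is the dot product of two vectors divided by the product of their lengths. The following is a minimal standard-library sketch; the function name `cosine_similarity` is introduced here for illustration, and the feature vectors would in practice come from a trained model such as word2vec rather than being written by hand.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score 1 regardless of length, and perpendicular vectors score 0. In general the value can also be negative, down to -1 for opposite directions; the description above discusses only the range from 0 to 1.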

特徴語生成部２４は、検索フレーズに含まれる単語から共起語を抽出する。この場合、特徴語生成部２４は、共起語を特徴語としてもよく、検索部２５は、特徴語から検索対象を検索してもよい。特徴語生成部２４が共起語を利用することで、検索対象を表すのに一見無意味のように思われる、相関が付かない単語であっても、検索対象を有効に導き出すことができる。言い換えれば、特徴語生成部２４は、より検索対象を表しやすい特徴ベクトルである特徴語を、特徴ベクトルから機械学習で生成している。 The feature word generator 24 extracts co-occurring words from words included in the search phrase. In this case, the feature word generation unit 24 may use the co-occurring word as the feature word, and the search unit 25 may search for a search target from the feature word. The use of co-occurrence words by the characteristic word generation unit 24 enables effective derivation of search targets even from uncorrelated words that seem meaningless to represent search targets. In other words, the feature word generation unit 24 generates feature words, which are feature vectors that more easily represent a search target, from feature vectors by machine learning.
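
One simple way to extract co-occurring words is to count which words appear in the same phrase as a given word. The sketch below does this over a small corpus of already tokenized phrases. It is a counting heuristic for illustration only: the source does not say how co-occurrence is computed, and the function name `cooccurring_words` and the sample corpus are assumptions.

```python
from collections import Counter
from typing import Iterable, List


def cooccurring_words(
    target: str, phrases: Iterable[List[str]], top_n: int = 3
) -> List[str]:
    """Words that most often appear in the same phrase as `target`."""
    counts: Counter = Counter()
    for words in phrases:
        unique = set(words)  # count each word at most once per phrase
        if target in unique:
            counts.update(unique - {target})
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical tokenized corpus (for example, past search phrases).
corpus = [
    ["夜景", "綺麗", "レストラン"],
    ["夜景", "綺麗", "展望台"],
    ["ラーメン", "駅"],
]
top = cooccurring_words("夜景", corpus, top_n=1)
```

In this corpus 綺麗 appears together with 夜景 in two phrases and every other word at most once, so `top` contains only 綺麗.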

すなわち、特徴語生成部２４は、検索対象を表す単語と関連する特徴語を、検索フレーズから生成する。本実施形態に係る情報検索装置２では、特徴語生成部２４は、検索フレーズにおける名詞と形容詞と動詞と副詞とを含めた単語から、検索対象を表す単語と類似度のより高い特徴語を生成する。 That is, the feature word generation unit 24 generates, from the search phrase, feature words related to words that represent the search target. In the information retrieval device 2 according to the present embodiment, the feature word generation unit 24 generates feature words that are more similar to the words representing the search target from the words in the search phrase, including nouns, adjectives, verbs, and adverbs.

情報検索装置２は、検索対象を表す単語と類似度のより高い特徴語を生成することで、さらに精度の高い検索対象を検出することができる。情報検索装置２は、特に、検索フレーズに含まれる単語が形容詞の場合、少なくとも形容詞を含む単語から特徴語を生成するので、主観的若しくは感情的な意味が含まれる検索対象を検索することができる。 By generating feature words that are more similar to the words representing the search target, the information retrieval device 2 can find search targets with higher accuracy. In particular, when the search phrase contains an adjective, the information retrieval device 2 generates feature words from words that include at least that adjective, so it can retrieve search targets that carry subjective or emotional meaning.

検索部２５は、蓄積データベースから検索対象を検索できるように構成されている。蓄積データベースには、特徴語がカテゴリごとに分類されている。検索対象は、特定の地点を示す地点情報を有している。特定の地点は、地図上における検索対象の位置を表している。地点情報は、例えば、経度の情報及び緯度の情報である。地点情報は、経度の情報及び緯度の情報に加え、高度の情報が含まれていてもよい。検索部２５は、位置情報取得部２２から位置情報を取得する。検索部２５は、カテゴリ判別部２３からカテゴリの情報を取得する。検索部２５は、特徴語生成部２４から特徴語を取得する。検索部２５は、位置情報と、カテゴリと、特徴語とに基づいて、蓄積データベースから検索対象を検索する。検索部２５が検索した検索対象は、上記位置情報が示す位置から所定の範囲、かつ判別されたカテゴリで、特徴語に対応している。言い換えれば、検索部２５は、位置情報と、カテゴリと、特徴語とに基づいて、上記位置情報が示す位置から所定の範囲、かつ判別されたカテゴリで、特徴語に対応した検索対象を検索することができるように構成されている。 The search unit 25 is configured to retrieve search targets from the storage database. In the storage database, feature words are classified by category. Each search target has point information indicating a specific point. The specific point represents the position of the search target on the map. The point information is, for example, longitude and latitude information. In addition to longitude and latitude, the point information may include altitude. The search unit 25 acquires position information from the position information acquisition unit 22. The search unit 25 acquires category information from the category discrimination unit 23. The search unit 25 acquires feature words from the feature word generation unit 24. Based on the position information, the category, and the feature words, the search unit 25 retrieves search targets from the storage database. The retrieved search targets lie within a predetermined range of the position indicated by the position information, belong to the discriminated category, and correspond to the feature words. In other words, the search unit 25 is configured to retrieve, based on the position information, the category, and the feature words, search targets that lie within a predetermined range of the indicated position, belong to the discriminated category, and correspond to the feature words.
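
The three-way filter applied by the search unit 25 can be sketched as follows. Several choices here are assumptions, not taken from the source: the `SearchTarget` record layout, the use of the haversine great-circle distance for the "predetermined range", the 5 km default radius, and the rule that a target "corresponds to" the feature words when it shares at least one of them.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class SearchTarget:
    """Hypothetical record in the storage database."""
    name: str
    lat: float
    lon: float
    category: str
    feature_words: FrozenSet[str]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def search(
    db: Iterable[SearchTarget],
    position: Tuple[float, float],
    category: str,
    feature_words: Set[str],
    radius_km: float = 5.0,
) -> List[SearchTarget]:
    """Targets in range, in the category, sharing a feature word."""
    lat, lon = position
    return [
        t for t in db
        if t.category == category
        and t.feature_words & feature_words  # at least one shared word
        and haversine_km(lat, lon, t.lat, t.lon) <= radius_km
    ]
```

The category and feature word checks come before the distance check so that the trigonometric computation is skipped for targets that fail the cheaper tests.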

計時部２６は、年、日、時を計時することができるように構成されている。計時部２６は、例えば、ＣＰＵのクロック数をカウントすることで、計時できるように構成されればよい。計時部２６は、例えば、情報処理端末１から検索フレーズを検索通信装置２１が受信した日時及び時刻を特定することができる。計時部２６は、検索通信装置２１が受信した検索フレーズを、所定の期間ごとに弁別することができる。計時部２６は、検索フレーズを受信した日時及び時刻を統計データと関連付けて、検索記憶装置２８に記憶させることができる。 The clock unit 26 is configured to be able to clock the year, day, and hour. The timekeeping unit 26 may be configured to measure time by, for example, counting the number of clocks of the CPU. The clock unit 26 can specify, for example, the date and time when the search communication device 21 receives the search phrase from the information processing terminal 1 . The clock unit 26 can discriminate the search phrases received by the search communication device 21 for each predetermined period. The clock unit 26 can store the date and time when the search phrase was received in the search storage device 28 in association with the statistical data.
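
Sorting received search phrases into fixed periods, as the clock unit 26 is described as doing, can be sketched by truncating each reception timestamp. The one-hour period and the function name `bucket_by_hour` are assumptions; the source says only "a predetermined period".

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def bucket_by_hour(
    received: Iterable[Tuple[datetime, str]]
) -> Dict[datetime, List[str]]:
    """Group (timestamp, phrase) pairs by the hour they were received."""
    buckets: Dict[datetime, List[str]] = defaultdict(list)
    for timestamp, phrase in received:
        # Truncate to the start of the hour to form the bucket key.
        key = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[key].append(phrase)
    return dict(buckets)
```

Each bucket key is the start of an hour, so per-period statistics such as phrase counts can be computed from the grouped lists.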

検索記憶装置２８は、電源を切れば情報が消える揮発性記憶装置と、電源を切っても情報が消えない不揮発性記憶装置とを有している。揮発性記憶装置は、検索制御装置２７により処理される情報を一時的に記憶する。揮発性記憶装置の一例としては、例えば、ＲＡＭが挙げられる。 The search memory device 28 has a volatile memory device whose information disappears when the power is turned off, and a non-volatile memory device whose information does not disappear even when the power is turned off. Volatile storage temporarily stores information processed by search controller 27 . An example of a volatile memory device is, for example, RAM.

不揮発性記憶装置は、位置情報取得プログラム、カテゴリ判別プログラム、特徴語生成プログラム、検索プログラム及び計時プログラムを記憶可能なストレージである。不揮発性記憶装置の一例としては、例えば、ＲＯＭが挙げられる。また、不揮発性記憶装置には、地名データベース、カテゴリデータベース、統計データ、若しくはシステム辞書が記憶されていてもよい。 The nonvolatile storage device is a storage capable of storing a position information acquisition program, a category discrimination program, a feature word generation program, a search program, and a timer program. An example of a nonvolatile memory device is, for example, ROM. Also, the nonvolatile storage device may store a place name database, a category database, statistical data, or a system dictionary.

本実施形態の情報検索システム１０では、特徴語生成部２４は、例えば、検索フレーズの内容が「夜景の綺麗なレストラン」の場合、形態素解析を用いて「夜景」、「綺麗」、「レストラン」の単語を取得する。特徴語生成部２４は、「夜景」、「綺麗」、若しくは「レストラン」と共起する特徴語の類似度を機械学習で判定し、判定された特徴語に基づいて、検索部２５が蓄積データベースから検索対象が曖昧な「レストラン」の地点情報を検索することができる。 In the information retrieval system 10 of the present embodiment, when the search phrase is, for example, "restaurant with a beautiful night view", the feature word generation unit 24 uses morphological analysis to obtain the words "night view", "beautiful", and "restaurant". The feature word generation unit 24 uses machine learning to determine the similarity of feature words that co-occur with "night view", "beautiful", or "restaurant", and based on the feature words so determined, the search unit 25 can retrieve from the storage database the point information of a "restaurant", even though the search target is ambiguous.

情報蓄積装置３は、蓄積記憶装置３１に加え、蓄積制御装置３２と、蓄積通信装置３３とを備えている。情報蓄積装置３は、例えば、データベースサーバにより構成される。 The information storage device 3 includes a storage control device 32 and a storage communication device 33 in addition to the storage storage device 31 . The information storage device 3 is configured by, for example, a database server.

蓄積記憶装置３１は、情報検索装置２が検索する検索対象を蓄積した蓄積データベースを記憶している。蓄積データベースでは、カテゴリごとに検索対象が分類されている。蓄積データベースでは、検索対象が特徴語と対応付いて蓄積されている。 The accumulation storage device 31 stores an accumulation database in which search targets to be searched by the information search device 2 are accumulated. In the storage database, search targets are classified for each category. In the storage database, search targets are stored in association with feature words.

蓄積記憶装置３１は、電源を切れば情報が消える揮発性記憶装置と、電源を切っても情報が消えない不揮発性記憶装置とを有している。揮発性記憶装置は、蓄積制御装置３２により処理される情報を一時的に記憶する。揮発性記憶装置は、例えば、ＲＡＭが挙げられる。不揮発性記憶装置は、各種の情報処理プログラム及び各種の情報を記憶可能なストレージである。不揮発性記憶装置は、例えば、ＲＯＭが挙げられる。不揮発性記憶装置は、蓄積データベースを記憶している。不揮発性記憶装置には、地図を示す地図情報及び地図情報に付加する地点情報が記憶されていてもよい。 The storage storage device 31 has a volatile storage device in which information disappears when power is turned off, and a non-volatile storage device in which information does not disappear when power is turned off. The volatile storage temporarily stores information processed by the storage controller 32 . Volatile storage devices include, for example, RAM. A non-volatile storage device is a storage capable of storing various information processing programs and various types of information. Non-volatile storage devices include, for example, ROMs. A non-volatile storage stores an accumulation database. The non-volatile storage device may store map information indicating a map and point information to be added to the map information.

蓄積制御装置３２は、蓄積記憶装置３１と、蓄積通信装置３３と電気的に接続されている。蓄積制御装置３２は、蓄積記憶装置３１と、蓄積通信装置３３とを制御できるように構成されている。蓄積制御装置３２は、ＣＰＵ若しくはＭＰＵを用いて構成することができる。蓄積制御装置３２は、例えば、予め蓄積記憶装置３１に記憶されたプログラムに基づいて所定のプログラムを実装することができるように構成されている。 The accumulation control device 32 is electrically connected to the accumulation storage device 31 and the accumulation communication device 33 . The accumulation control device 32 is configured to be able to control the accumulation storage device 31 and the accumulation communication device 33 . The accumulation control device 32 can be configured using a CPU or an MPU. The accumulation control device 32 is configured, for example, to be able to implement a predetermined program based on a program stored in advance in the accumulation storage device 31 .

蓄積通信装置３３は、情報検索装置２がアクセスできるように構成されている。蓄積通信装置３３は、検索対象を検索するための情報を情報検索装置２から受信できるように構成されている。蓄積通信装置３３は、検索対象の情報を情報検索装置２へ送信できるように構成されている。 The storage communication device 33 is configured so that the information retrieval device 2 can access it. The storage communication device 33 is configured to receive information from the information search device 2 for searching for a search target. The storage communication device 33 is configured to be able to transmit information to be retrieved to the information retrieval device 2 .

また、情報蓄積装置３は、図示しない外部情報処理装置が提供するＳＮＳ（ＳｏｃｉａｌＮｅｔｗｏｒｋｉｎｇＳｅｒｖｉｃｅ）から蓄積データベースを構築できるように構成されてもよい。情報蓄積装置３は、外部情報処理装置が外部に公開した投稿記事情報に基づいて、情報検索装置２が検索フレーズから特徴語を生成するのと同様にして、生成した特徴語から蓄積データベースを構築することができる。投稿記事情報は、例えば、ユーザが外部情報処理装置へ投稿した投稿記事の情報である。 The information storage device 3 may also be configured to build the storage database from an SNS (Social Networking Service) provided by an external information processing device (not shown). Based on posted article information that the external information processing device has made public, the information storage device 3 can generate feature words in the same way that the information retrieval device 2 generates feature words from a search phrase, and can build the storage database from those feature words. Posted article information is, for example, information on articles that users have posted to the external information processing device.

次に、本実施形態に係る情報検索システム１０全体の動作について説明する。 Next, the overall operation of the information retrieval system 10 according to this embodiment will be described.

まず、情報処理端末１と情報検索装置２との関係を説明する。本実施形態に係る情報検索システム１０では、ユーザが情報処理端末１の入力装置１１で検索フレーズを入力する。情報処理端末１は、検索フレーズを送信する指示が入力装置１１で受け付けられた場合、端末通信装置１２から検索フレーズを情報検索装置２に送信する。 First, the relationship between the information processing terminal 1 and the information retrieval device 2 will be described. In the information retrieval system 10 according to this embodiment, a user inputs a search phrase with the input device 11 of the information processing terminal 1. When the input device 11 receives an instruction to transmit a search phrase, the information processing terminal 1 transmits the search phrase from the terminal communication device 12 to the information retrieval device 2.

また、情報処理端末１は、表示装置１３に表示された地図上の位置をユーザが選択することで、選択された位置を示す位置情報を端末通信装置１２から情報検索装置２に送信するように構成されてもよい。情報検索装置２では、情報処理端末１からの位置情報を受信した場合、位置情報を検索の地図上の起点とすることができる。 Further, when the user selects a position on the map displayed on the display device 13, the information processing terminal 1 transmits position information indicating the selected position from the terminal communication device 12 to the information search device 2. may be configured. When receiving the position information from the information processing terminal 1, the information search device 2 can use the position information as a starting point on the map for search.

Next, the operation of the information retrieval device 2 after it receives a search phrase from the information processing terminal 1 will be described with reference to FIG. 2.

In the flowchart shown in FIG. 2, the information retrieval device 2 receives a search phrase from the information processing terminal 1 (S101) and stores the received search phrase in the search storage device 28.

The information retrieval device 2 performs a position information acquisition step (S102), in which position information is acquired by a position information acquisition program running under the control of the search control device 27. In this step, if position information has been transmitted from the information processing terminal 1, the program acquires that position information. If no such position information is available, the program acquires position information from the search phrase.
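The fallback order described for step S102 can be illustrated with a minimal sketch. The table `PLACE_NAMES`, its rough coordinates, and the function name `acquire_position` are hypothetical and are not taken from the specification; the sketch only shows the "terminal position first, search phrase second" order.

```python
from typing import List, Optional, Tuple

Position = Tuple[float, float]  # (latitude, longitude)

# Hypothetical place-name table; coordinates are rough illustrative values.
PLACE_NAMES = {
    "Sapporo": (43.06, 141.35),
    "Osaka": (34.69, 135.50),
}


def acquire_position(terminal_position: Optional[Position],
                     phrase_words: List[str]) -> Optional[Position]:
    """Prefer the position sent by the terminal; otherwise look in the phrase."""
    if terminal_position is not None:
        return terminal_position
    for word in phrase_words:
        if word in PLACE_NAMES:
            return PLACE_NAMES[word]
    return None  # no position could be determined from either source
```

Returning `None` when neither source yields a position is a choice made for this sketch; the specification does not state what happens in that case.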

Next, the information retrieval device 2 performs a category determination step (S103), in which a category discrimination program running under the control of the search control device 27 determines the category of the search phrase. If the search phrase contains a word indicating a place name, the program determines the category name from the category database. If it does not, the program generates search terms from the search phrase by machine learning, and estimates the category name from the category database based on those generated search terms that have a high degree of similarity.

Subsequently, the information retrieval device 2 performs a feature word generation step (S104), in which a feature word generation program running under the control of the search control device 27 generates, from the search phrase, feature words related to the word that represents the search target.

Next, the information retrieval device 2 performs a search execution step (S105), in which a search program running under the control of the search control device 27 searches the accumulation database for the search target.

Finally, the information retrieval device 2 may end processing once it has retrieved the search target.

According to the information retrieval system 10 of this embodiment, even when the user enters a search phrase based on a vague image into the information processing terminal 1, the information retrieval device 2 can perform a flexible search driven by the user's subjective or emotional words. For example, even when the search phrase "super spicy ramen" is entered, the information retrieval device 2 can have the locations of shops that may serve super spicy ramen shown on the map displayed on the display device 13 of the information processing terminal 1, using markers in the color that corresponds to the ramen category.

That is, in the information retrieval system 10 of this embodiment, the information processing terminal 1 further includes a terminal storage device 15. The terminal storage device 15 stores map information. The display device 13 displays the specific point indicating the search target on a map drawn from the map information.

The information retrieval system 10 of this embodiment can display more accurate search-target points on the map.

In the information retrieval system 10 of this embodiment, the feature word generation unit 24 also calculates statistical data on search phrases. The information storage device 3 associates the search phrases that the search communication device 21 received from the information processing terminal 1 within a preset period with the statistical data, and stores them in the accumulation database.

Based on the search phrases received within the preset period, the information retrieval system 10 of this embodiment can detect search targets with higher accuracy.

Furthermore, in the information retrieval system 10 of this embodiment, the feature word generation unit 24 generates, from the words contained in the search phrase, feature words that have a higher degree of similarity to the word representing the search target. The feature word generation unit 24 is configured to output feature words with higher frequency preferentially, based on the statistical data.

The information retrieval system 10 of this embodiment can thus detect search targets with still higher accuracy.

In the information retrieval system 10 of the embodiment described above, if the information processing terminal 1 has a touch panel, for example, the user can enlarge or reduce the map displayed on the display device 13 and tap the place to be specified as the search area, thereby entering position information through the input device 11.

The information processing terminal 1 also displays the search target on the display device 13 as a marker on the map. One example of a marker is a balloon with a pictogram representing the category at its center. The information processing terminal 1 may, for example, display a map with markers whose color or shade differs by category. It may also display a map with markers whose size varies according to the degree of similarity between a word contained in the search phrase and a feature word.

Furthermore, in the information retrieval system 10 of the embodiment described above, the information processing terminal 1 may have a navigation function. With a navigation function, the information processing terminal 1 can guide the user to a point shown on the map displayed on the display device 13. The navigation function is stored in the terminal storage device 15 of the information processing terminal 1 and controlled by the terminal control device 14. Alternatively, the navigation function may be stored in the search storage device 28 of the information retrieval device 2 and controlled by the search control device 27. As long as the display device 13 of the information processing terminal 1 can display the map and the navigation, the navigation function may be installed in either the information processing terminal 1 or the information retrieval device 2.

Specifically, the navigation function may be stored in one of the information retrieval device 2 and the information processing terminal 1 and controlled by the other. Alternatively, the navigation function may be divided and stored across both the information retrieval device 2 and the information processing terminal 1, with either device, or both devices in cooperation, controlling it.

Each process or function of the above embodiment may be realized by centralized processing on a single device or system, or by distributed processing across multiple devices or systems. Each component of the above embodiment may be configured as dedicated hardware. Components that can be realized in software may instead be realized by executing a program.

Each component of the above embodiment may be realized, for example, by a CPU executing a software program recorded on a recording medium. The program may be executed after being downloaded from a server, or after being read from a predetermined recording medium. The program may be executed by a single computer or by multiple computers. The above embodiment may be configured for centralized processing or for distributed processing.

The present invention can be embodied in various other forms without departing from its spirit, gist, or essential characteristics. The above embodiment is merely illustrative in every respect and must not be construed restrictively. That is, taking the above embodiment as an example, the present invention is an information retrieval device or an information retrieval system.

[Summary of embodiment]
(1: About feature words)
According to the information retrieval device 2 described above, the functions of the units included in the search control device 27 make it possible to detect an appropriate spot and present it to the user even when the search phrase entered by the user does not directly indicate the spot the user wants to find.

Specifically, the search control device 27 includes a feature word generation unit 24 and a search unit 25. The feature word generation unit 24 generates feature words from the words obtained by morphological analysis of the search phrase. Hereinafter, a word obtained by morphological analysis of the search phrase may be called an input word. In this specification, "generate" as applied to feature words is used in the sense of "identify", "determine", "detect", or "select".

A feature word generated by the feature word generation unit 24 may be a word with a high degree of similarity to the input word, a co-occurring word of the input word, or both. Hereinafter, a word whose similarity to the input word is higher than a predetermined threshold may be called a similar word.

When similar words are used as feature words, the feature word generation unit 24 generates feature information indicating the features of the input word. Any information that indicates the features of the input word will do; for example, a feature vector of the input word may be used. Besides the word2vec mentioned above, the feature vector can also be calculated by using, for example, the GloVe, WordNet, or fastText algorithms. The feature word generation unit 24 then calculates the similarity between the feature vector of each feature word associated with a spot in the accumulation database and the feature vector calculated above, and identifies the words whose similarity is at or above a threshold. In this way, the feature word generation unit 24 can identify, among the feature words associated with spots in the accumulation database, similar words that resemble the input word. The method of calculating similarity is not particularly limited; cosine similarity, for example, can be used.
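The threshold test on cosine similarity described above can be sketched as follows. The two-function layout, the names `cosine_similarity` and `similar_words`, and the toy three-dimensional vectors in `STORED` are assumptions made for illustration; real feature vectors would come from a trained embedding model and have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_words(input_vector, stored_vectors, threshold):
    """Return (word, similarity) pairs at or above the threshold, best match first."""
    scored = [(word, cosine_similarity(input_vector, vec))
              for word, vec in stored_vectors.items()]
    hits = [(word, score) for word, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Made-up vectors standing in for feature words stored in the accumulation database.
STORED = {
    "scenery": [0.9, 0.1, 0.0],
    "landscape": [0.8, 0.2, 0.1],
    "ramen": [0.0, 0.1, 0.9],
}
```

With the input vector `[1.0, 0.0, 0.0]` and a threshold of 0.8, the sketch keeps "scenery" and "landscape" and drops "ramen".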

When co-occurring words are used as feature words, on the other hand, the feature word generation unit 24 identifies the co-occurring words of the input word. Co-occurring words can be extracted by, for example, TF-IDF (Term Frequency-Inverse Document Frequency). The feature word generation unit 24 may also use co-occurring words of the similar words described above as feature words.
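The specification does not say how TF-IDF is applied to find co-occurring words. One possible reading, shown here purely as an assumption, is to score every other word in the documents that contain the input word by TF-IDF and keep the highest-scoring ones. The function name `cooccurring_words` and the toy corpus `DOCS` are hypothetical.

```python
import math
from collections import Counter


def cooccurring_words(input_word, documents, top_n=2):
    """Rank words appearing alongside input_word by their summed TF-IDF score."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each word once per document

    scores = Counter()
    for doc in documents:
        if input_word not in doc:
            continue
        for word, count in Counter(doc).items():
            if word == input_word:
                continue
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[word])  # 0 for words in every document
            scores[word] += tf * idf
    return [word for word, _ in scores.most_common(top_n)]


# Toy corpus of already-tokenized comments (made up for illustration).
DOCS = [
    ["observatory", "view", "sunset", "the"],
    ["observatory", "view", "night", "the"],
    ["ramen", "spicy", "the"],
    ["ramen", "noodle", "the"],
]
```

Because "the" occurs in every document, its IDF is zero and it never outranks content words such as "view", which is the property that makes TF-IDF useful for this purpose.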

The search unit 25 then searches the accumulation database stored in the accumulation storage device 31 using the generated feature words, and detects the spots associated with those feature words in the accumulation database. A "spot" means, for example, a point or place, or a structure installed at a specific point or place. In the accumulation database, each spot is associated with position information indicating its geographical position. As described above, the feature words used for the search are similar words or co-occurring words of the input word, so an appropriate spot can be detected and presented to the user even if the search phrase does not directly indicate the spot the user wants to find.

Of the words obtained by morphological analysis of the search phrase, the feature word generation unit 24 may take only words of predetermined parts of speech as the basis for feature word generation. The predetermined parts of speech may be, for example, nouns, adjectives, verbs, and adverbs. This excludes parts of speech from which meaningful feature words are hard to generate, such as conjunctions, interjections, particles, and auxiliary verbs, so that highly appropriate feature words can be generated.
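The part-of-speech filter can be sketched as below. The sketch assumes the morphological analyzer has already produced `(surface, part_of_speech)` pairs; the tag names and the function name `select_input_words` are hypothetical and do not correspond to any particular analyzer's output format.

```python
# Parts of speech kept for feature word generation, per the example in the text.
CONTENT_POS = {"noun", "adjective", "verb", "adverb"}


def select_input_words(tagged_tokens, allowed=CONTENT_POS):
    """Keep only the surface forms whose part of speech is in the allowed set."""
    return [surface for surface, pos in tagged_tokens if pos in allowed]


# Hypothetical analyzer output for a phrase like "the view is great".
TAGGED = [
    ("view", "noun"),
    ("ga", "particle"),
    ("great", "adjective"),
    ("desu", "auxiliary verb"),
]
```

Applied to `TAGGED`, the filter keeps "view" and "great" and discards the particle and the auxiliary verb.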

(2: About position information)
The search control device 27 may include a position information acquisition unit 22. The position information acquisition unit 22 acquires, based on information received from the information processing terminal 1 used by the user, position information that serves as the reference for determining the spot search range. The search unit 25 then sets a predetermined range with the position indicated by the acquired position information as the reference, and detects spots within that range. This makes it possible to search for spots narrowed down to the area the searching user is interested in.
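The specification leaves the shape of the "predetermined range" open. The sketch below assumes a circle of fixed radius around the reference position and uses the haversine formula for great-circle distance; the spot names, coordinates, and function names are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def spots_within(spots, center, radius_km):
    """Names of spots whose position lies within radius_km of the center."""
    lat0, lon0 = center
    return [name for name, (lat, lon) in spots.items()
            if haversine_km(lat0, lon0, lat, lon) <= radius_km]


# Hypothetical spot positions.
SPOT_POSITIONS = {
    "Spot A": (35.00, 135.00),
    "Spot B": (35.01, 135.01),
    "Spot C": (36.00, 136.00),
}
```

A rectangular bounding box or an administrative area would serve equally well as the range; the circle is chosen here only because it needs a single parameter.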

The position information acquisition unit 22 may acquire the position information from the words obtained by morphological analysis of the search phrase. For example, if those words include a word indicating a position, the position information acquisition unit 22 may acquire position information using a place name database in which words indicating positions are associated with position information. A place name is one example of a word indicating a position.

The position information acquisition unit 22 may also acquire position information using a position information database for identifying position information. The position information database associates words with position information indicating the positions that correspond to those words. When using this database, the position information acquisition unit 22 searches it using at least one of a word obtained by morphological analysis of the search phrase, a similar word of that word, and a co-occurring word of that word. This lets the position information acquisition unit 22 acquire position information related to the words obtained by morphological analysis of the search phrase. Similar words and co-occurring words may be identified using a machine-learned model. In that case, the position information acquisition unit 22 acquires position information estimated by machine learning from the words contained in the search phrase.

The information processing terminal 1 may also acquire position information indicating its own position, for example by using GPS (Global Positioning System), and transmit the acquired position information to the information retrieval device 2. In that case, the position information acquisition unit 22 may acquire the position information transmitted from the information processing terminal 1. Using such position information also makes it possible to detect spots around the current location of the information processing terminal 1.

(3: About categories)
The search control device 27 may include a category discrimination unit 23, which in turn may include a category determination unit 231 and a category estimation unit 232. The category discrimination unit 23 determines the category of a word obtained by morphological analysis of the search phrase. In this case, each spot in the accumulation database is associated with its category. This enables the search unit 25 to detect, from among the spots that match the search phrase, the spots in the category determined by the category discrimination unit 23.

The category determination unit 231 can determine the category corresponding to a word obtained by morphological analysis of the search phrase by using, for example, a category database that associates words with their corresponding categories.

The category estimation unit 232 uses a machine-learned algorithm to identify the category corresponding to a word obtained by morphological analysis of the search phrase. The algorithm may be, for example, one that has learned the correspondence between word feature information and categories. The word feature information is, for example, a feature vector. When such an algorithm is used, the category estimation unit 232 acquires the feature information of the word obtained by morphological analysis of the search phrase and inputs it to the algorithm to identify the category. The category estimation unit 232 may also identify at least one of a similar word and a co-occurring word of the word obtained by morphological analysis of the search phrase, and identify the category by searching the category database with at least one of the identified similar words and co-occurring words. Similar words and co-occurring words may be identified using a machine-learned model. In that case, the category estimation unit 232 estimates the category by machine learning from the words contained in the search phrase. As a further example, the category estimation unit 232 may calculate the similarity between the feature vector of a word belonging to a certain category and the feature vector of a word obtained by morphological analysis of the search phrase. If the calculated similarity is at or above a threshold, the category estimation unit 232 determines that the word obtained from the search phrase belongs to that category.
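The last variant above, combined with the table lookup of the category determination unit 231, can be sketched as follows. Trying the lookup first and falling back to the similarity test is an assumption of this sketch, as are the table contents, the single representative vector per category, the default threshold, and the function names.

```python
import math

# Hypothetical category database: word -> category.
CATEGORY_DB = {"ramen": "restaurant", "temple": "tourist spot"}

# Hypothetical feature vector of one word known to belong to each category.
CATEGORY_EXAMPLES = {
    "restaurant": [0.0, 1.0],
    "tourist spot": [1.0, 0.0],
}


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def determine_category(word, vector, threshold=0.7):
    """Look the word up first; otherwise pick the most similar category, if any."""
    if word in CATEGORY_DB:
        return CATEGORY_DB[word]
    best_category, best_score = None, threshold
    for category, example in CATEGORY_EXAMPLES.items():
        score = _cosine(vector, example)
        if score >= best_score:
            best_category, best_score = category, score
    return best_category  # None when no category reaches the threshold
```

A trained classifier taking the feature vector as input would fill the same role as the similarity fallback; the sketch uses the threshold test only because it is the variant the text spells out.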

The determined category may be used for the weighting described later, or to narrow the search range of the search unit 25. In the latter case, the search unit 25 restricts the search to those spots in the accumulation database that belong to the category determined by the category discrimination unit 23, and detects among them the spots associated with the feature words generated by the feature word generation unit 24. Consider, for example, a case where the category discrimination unit 23 determines that the category of the word "scenery" (風景) in the search phrase is "tourist spot". The search unit 25 then uses the feature words generated from "scenery" to search the spots in the accumulation database that are associated with the "tourist spot" category. This narrows the search range appropriately, so that appropriate results can be obtained quickly.
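A minimal sketch of category-restricted search follows. The in-memory list `SPOTS`, its records, and the function name `search_spots` are hypothetical stand-ins for the accumulation database and the search unit.

```python
# Hypothetical accumulation-database records.
SPOTS = [
    {"name": "Observatory X", "category": "tourist spot",
     "feature_words": {"scenery", "view"}},
    {"name": "Cafe Y", "category": "restaurant",
     "feature_words": {"view", "coffee"}},
    {"name": "Park Z", "category": "tourist spot",
     "feature_words": {"picnic"}},
]


def search_spots(feature_words, category=None):
    """Names of spots sharing a feature word, optionally limited to one category."""
    wanted = set(feature_words)
    results = []
    for spot in SPOTS:
        if category is not None and spot["category"] != category:
            continue  # skip spots outside the determined category
        if wanted & spot["feature_words"]:
            results.append(spot["name"])
    return results
```

Searching for "view" without a category returns both Observatory X and Cafe Y; adding `category="tourist spot"` leaves only Observatory X.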

(4: About priority in search)
When a plurality of feature words can be generated, the feature word generation unit 24 may generate those with high priority among them. The priority may be, for example, at least one of the cumulative number of times a word has been used for searches in the information retrieval system 10, the frequency with which it was used for searches in a predetermined past period, and a rank calculated from the cumulative count, the usage frequency, or both. For example, when "view" (景色) and "restaurant" can both be generated as feature words, the feature word generation unit 24 may choose "restaurant" as the feature word if the search frequency of "restaurant" over the most recent week is higher than that of "view" over the same week.
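The "most recent week" example can be sketched as a count over a time window. The log format (a list of `(feature_word, timestamp)` pairs), the seven-day default, and the function name are assumptions for illustration; ties are resolved in favor of the earlier candidate, which the specification does not address.

```python
from datetime import datetime, timedelta


def pick_by_recent_frequency(candidates, search_log, now,
                             window=timedelta(days=7)):
    """Return the candidate used most often for searches within the window.

    candidates must be non-empty; search_log is a list of
    (feature_word, timestamp) pairs.
    """
    cutoff = now - window
    counts = {word: 0 for word in candidates}
    for word, used_at in search_log:
        if word in counts and cutoff <= used_at <= now:
            counts[word] += 1
    # max() returns the first candidate with the highest count on a tie.
    return max(candidates, key=lambda word: counts[word])
```

Widening the window turns the same function into a cumulative-count ranking, which is the other priority measure the text mentions.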

The feature word generation unit 24 may calculate information indicating such counts and frequencies for the search phrases received from the information processing terminal 1 within a preset period, and store it as statistical data. The stored statistical data is associated with the search phrases received from the information processing terminal 1.

When the feature word generation unit 24 generates a plurality of feature words, the search unit 25 may perform the search using the higher-priority ones among them. The search unit 25 may also output search results obtained with a higher-priority feature word in preference to those obtained with a lower-priority feature word. For example, the search unit 25 may output search results in descending order of the priority of the feature words that produced them. The statistical data can also be used for the weighting described later.

(5: Construction of the accumulation database)
The information storage device 3 can acquire various kinds of information related to the spots to be searched and build the accumulation database. For example, the information storage device 3 may generate feature words from articles posted on an SNS as described above, or from articles posted on websites that introduce spots or on sites that carry word-of-mouth reviews of spots. Hereinafter, an article posted on an SNS or on a site carrying word-of-mouth reviews of spots is also called a comment. A website may be abbreviated to "site".

For example, when the information storage device 3 acquires the comment "The view is the best" (眺望が最高) posted on a review site about the spot "Observatory X", it performs morphological analysis on the comment to obtain the words "view" and "best". Next, it calculates the feature vectors of these words and uses them to generate feature words whose feature vectors are similar to that of each word. It then adds the spot "Observatory X" to the accumulation database and records the words obtained by morphological analysis and the feature words generated from them in association with the spot. If the spot "Observatory X" is already recorded in the accumulation database, the words and feature words are additionally associated with the existing record.
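The add-or-extend behavior for a spot record can be sketched as below. The lookup table `SIMILAR` is a hypothetical stand-in for feature word generation from feature vectors, and `new_database` and `register_comment` are illustrative names; morphological analysis is assumed to have been done already.

```python
from collections import defaultdict

# Hypothetical stand-in for "feature words whose feature vectors are similar".
SIMILAR = {
    "view": ["scenery", "landscape"],
    "best": ["great"],
}


def new_database():
    """Empty accumulation database: spot name -> set of associated words."""
    return defaultdict(set)


def register_comment(database, spot, comment_words):
    """Associate the comment's words and their feature words with the spot.

    A spot that is not yet recorded is created; an existing one is extended.
    """
    entry = database[spot]
    for word in comment_words:
        entry.add(word)
        entry.update(SIMILAR.get(word, []))
    return database
```

Using a set per spot means that repeated comments with the same words do not create duplicate associations; whether duplicates should instead be counted for weighting is left open by the text.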

Co-occurring words, or the clusters described later, may be recorded in association with the spot instead of or in addition to feature words. On sites such as those described above, spots are sometimes rated with scores, numbers of stars, or the like. In that case, the information storage device 3 may also record the spot's rating in the accumulation database. It may likewise record, for example, the number of accesses to the site, the number of accesses to the spot's page, the number of comments on the spot, the date and time each comment was posted, and the site name. The site name may be, for example, a service name. Such information can be used, for example, to weight spots, which makes it possible to give priority to highly rated spots in a search.

As described above, the information storage device 3 can expand the accumulation database based on word-of-mouth information. This makes new spots searchable and enables searches based on the impressions of people who have actually visited a spot. For example, if feature words are recorded based on the comment "The view is the best" about the spot "Observatory X", the spot "Observatory X" can be detected even from a vague search phrase such as "the sense of openness is the best" (解放感が最高).

Building the accumulation database from spot-related information also makes it possible, as in the example above, to detect spots that match the user's intent even when a vague search phrase is entered. A general-purpose search site, by contrast, must also cover many kinds of targets other than spots, so a vague search phrase usually cannot find the spot the user has in mind. For example, searching a general search site for the phrase "the sense of openness is the best" would either fail to return the spot "Observatory X" or return it buried among miscellaneous other results where it is hard to notice.

[Search using clusters]
As described in the above embodiment, the feature word generation unit 24 may treat a single word as a "feature word", or may treat a word group consisting of multiple words as a "feature word". The latter case is described in more detail below with reference to FIG. 3.

When a word group consisting of multiple words is used as a feature word, the accumulation database associates each spot to be searched with a feature word group, that is, a set of feature words whose feature information (information indicating the characteristics of a word) is similar. As noted above, the feature information is, for example, a feature vector. Hereinafter, a feature word group is called a cluster. When a search phrase is input, the feature word generation unit 24 determines, for each word contained in the search phrase, the cluster corresponding to that word based on the word's feature information. The search unit 25 then detects, from the accumulation database, the spots associated with the clusters determined by the feature word generation unit 24.

With this configuration, an appropriate spot that matches the searching user's intent can be detected even when the input search phrase is not directly associated with any spot. For example, suppose that the accumulation database associates spot A with cluster A, which consists of similar feature words such as "風景" ("landscape") and "景観" ("scenic view"), and that a search phrase containing the word "景色" ("scenery") is input. Because "景色" is similar to "風景" and "景観", it is determined to belong to cluster A, and spot A can therefore be detected from the accumulation database. Whether two pieces of feature information are similar can be determined, for example, by whether their degree of similarity is equal to or greater than a predetermined threshold.
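The threshold test described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not name a similarity measure, so cosine similarity is used here, and the three-dimensional vectors, the 0.9 threshold, and every identifier are hypothetical.

```python
import math

# Hypothetical 3-dimensional feature vectors; a real system would use
# vectors produced by a trained word-embedding model.
VECTORS = {
    "風景": [0.90, 0.10, 0.00],
    "景観": [0.80, 0.20, 0.10],
    "景色": [0.85, 0.15, 0.05],
    "デート": [0.00, 0.20, 0.90],
}
CLUSTER_A = ["風景", "景観"]          # feature words that make up cluster A
CLUSTER_TO_SPOTS = {"A": ["spot A"]}  # spots associated with cluster A
THRESHOLD = 0.9                       # assumed similarity threshold


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def belongs_to_cluster(word, members, threshold=THRESHOLD):
    """A word joins the cluster if it is similar enough to any member word."""
    return any(cosine(VECTORS[word], VECTORS[m]) >= threshold for m in members)


def spots_for(word):
    """Return the spots reachable through cluster A, or an empty list."""
    return CLUSTER_TO_SPOTS["A"] if belongs_to_cluster(word, CLUSTER_A) else []
```

With these made-up vectors, `spots_for("景色")` reaches spot A even though "景色" is not itself registered for the spot, while `spots_for("デート")` returns nothing.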

When a word group consisting of multiple words is used as a feature word, a category database may also be used in which categories are associated with clusters, each cluster consisting of multiple feature words whose feature information is similar. In this case, the category determination unit 23 detects, from the category database, the category associated with the cluster that was determined from the feature information of the words contained in the search phrase. With this configuration, an appropriate category that matches the searching user's intent can be determined even when the input search phrase is not directly associated with any category.

FIG. 3 is a flowchart showing an example of processing for searching for spots based on clusters. A specific example of a spot search following the flow of FIG. 3 is described below with reference to FIGS. 4 to 6. FIG. 4 shows specific examples of clusters determined from search phrases, FIG. 5 shows an example of a data table stored in the category database, and FIG. 6 shows an example of a data table stored in the spot database. The category database and the spot database correspond to the accumulation database described above. The acquisition of the search phrase and the position information in FIG. 3 (S201, S202) is the same as in FIG. 2, so its description is omitted here.

In S203, the feature word generation unit 24 calculates a feature vector for each word obtained by morphological analysis of the search phrase. FIG. 4 shows four input search phrases: Example 1 "景色がきれい" ("the scenery is pretty"), Example 2 "景色きれい" (the same phrase without the particle), Example 3 "風景美しい" ("beautiful landscape"), and Example 4 "楽しいデート" ("fun date"). For example, when the search phrase of Example 1 is input, S203 calculates feature vectors for "景色" and "きれい", that is, for the words obtained by morphological analysis of "景色がきれい" with the particle "が" excluded. In the illustrated example, n-dimensional feature vectors are calculated. n may be any natural number, for example n = 300.
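The particle-removal step of S203 can be illustrated with a small sketch. The analyzer output below is hardcoded and the part-of-speech labels are simplified; both are assumptions, since the text does not say which morphological analyzer or tag set is used.

```python
# Hypothetical morphological-analysis results: (surface form, part of speech).
# A real system would obtain these from a morphological analyzer.
ANALYZED = {
    "景色がきれい": [("景色", "noun"), ("が", "particle"), ("きれい", "adjective")],
    "景色きれい": [("景色", "noun"), ("きれい", "adjective")],
}


def content_words(phrase):
    """Keep the words that get feature vectors; particles are dropped."""
    return [surface for surface, pos in ANALYZED[phrase] if pos != "particle"]
```

Under these assumptions, Example 1 and Example 2 reduce to the same word list, which is why they can lead to the same clusters in the next step.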

In S204, the feature word generation unit 24 determines the cluster of each word obtained by morphological analysis of the search phrase. Specifically, the feature word generation unit 24 inputs each feature vector calculated in S203 into a predetermined clustering model and obtains as output the cluster corresponding to that feature vector, that is, the cluster of the corresponding word. For example, in FIG. 4, "景色" and "風景" are both determined to belong to the cluster with cluster ID "15", and "きれい" and "美しい" are both determined to belong to the cluster with cluster ID "8". The clustering model is a model trained by machine learning to take the feature vector of a word as input and output the cluster corresponding to that word. The cluster corresponding to a word is a cluster whose members include feature words whose feature vectors are similar to that of the word. How the clustering model is generated is described later.
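One simple way to picture the inference step of S204 is nearest-centroid assignment. The sketch below is a stand-in, not the trained model of the text: the centroids, the two-dimensional vectors, and the function names are invented, and only the cluster IDs 15 and 8 and the four words come from FIG. 4 as described.

```python
import math

# Hypothetical cluster centroids keyed by cluster ID.
CENTROIDS = {15: [1.0, 0.0], 8: [0.0, 1.0]}

# Hypothetical 2-dimensional feature vectors for the words of FIG. 4.
WORD_VECTORS = {
    "景色": [0.9, 0.1],
    "風景": [0.8, 0.2],
    "きれい": [0.1, 0.9],
    "美しい": [0.2, 0.8],
}


def assign_cluster(vector):
    """Return the ID of the centroid closest to the vector (Euclidean distance)."""
    return min(CENTROIDS, key=lambda cid: math.dist(vector, CENTROIDS[cid]))


def clusters_for(words):
    """Map each query word to its cluster ID."""
    return [assign_cluster(WORD_VECTORS[w]) for w in words]
```

Here `clusters_for(["景色", "きれい"])` and `clusters_for(["風景", "美しい"])` both give `[15, 8]`, mirroring how Examples 1 and 3 land in the same clusters.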

In S205, the category determination unit 23 determines, based on the clusters determined in S204, the category of each word obtained by morphological analysis of the search phrase. This determination can use a category database in which cluster IDs, feature words, weights, and categories are associated with one another, as shown in FIG. 5.

Specifically, the category determination unit 23 first extracts from the category database the categories associated with the clusters determined in S204. For example, for Example 1 of FIG. 4, "景色がきれい", the clusters are determined in S204 to be ID "15" and ID "8", so the category "famous places/historic sites" and the category "restaurant" are extracted using the category database of FIG. 5.

Next, the category determination unit 23 calculates the similarity between each word obtained by morphological analysis of the search phrase and each feature word of the categories corresponding to that word, and multiplies each calculated similarity by the corresponding weight. A fastText word-similarity estimation model may be used to calculate the similarity. The resulting values are then summed for each category, and the extracted categories are ranked in descending order of the sum. Hereinafter, the sum for each category is also called the score.

For example, for Example 1 of FIG. 4, "景色がきれい", and the category "famous places/historic sites", the similarity is calculated between the word "景色", obtained by morphological analysis of the search phrase, and the feature word "風景" registered in the category database. The same is done for the word "きれい". Each calculated similarity is then multiplied by the weight registered in the category database. For example, if the similarity between the word "景色" and the feature word "風景" is 0.8 and the similarity between the word "きれい" and the feature word "美しい" is 0.7, the score of the category "famous places/historic sites" is calculated as follows.

For the word "景色": similarity 0.8 × weight 0.88 = 0.704
For the word "きれい": similarity 0.7 × weight 0.76 = 0.532
Score of the category "famous places/historic sites": 0.704 + 0.532 = 1.236
If multiple feature words with cluster ID "15" are registered in the category database, the similarity calculation and the weight multiplication described above are performed for each of those feature words. The same applies to cluster ID "8". The values obtained by the multiplication are then summed for each category to give the score, and the extracted categories are ranked in descending order of score.
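The scoring of S205 can be reproduced with a short sketch. The rows are shaped like the category database described for FIG. 5; the two "famous places/historic sites" rows use the weights and similarities given in the text, while the "restaurant" row, the lookup-table similarity function, and all identifiers are assumptions added for illustration.

```python
from collections import defaultdict

# Rows shaped like FIG. 5: (cluster ID, feature word, weight, category).
CATEGORY_DB = [
    (15, "風景", 0.88, "famous places/historic sites"),
    (8, "美しい", 0.76, "famous places/historic sites"),
    (8, "きれい", 0.30, "restaurant"),  # invented row, not from the text
]

# Hypothetical stand-in for a word-similarity model.
SIMILARITY = {
    ("景色", "風景"): 0.8,
    ("きれい", "美しい"): 0.7,
    ("きれい", "きれい"): 1.0,
}


def rank_categories(word_clusters):
    """word_clusters: (query word, cluster ID) pairs as determined in S204."""
    scores = defaultdict(float)
    for word, cluster_id in word_clusters:
        for row_cluster, feature, weight, category in CATEGORY_DB:
            if row_cluster == cluster_id:
                # similarity x weight, accumulated per category
                scores[category] += SIMILARITY.get((word, feature), 0.0) * weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_categories([("景色", 15), ("きれい", 8)])
for category, score in ranking:
    print(f"{category}: {score:.3f}")
```

The top entry is "famous places/historic sites" with a score of 1.236, matching the worked calculation; the invented restaurant row ranks second.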

In S206, the search unit 25 searches for spots based on the clusters determined in S204. This search can use a spot database in which cluster IDs, feature words, weights, and spot IDs are associated with one another, as shown in FIG. 6.

Specifically, the search unit 25 first extracts from the spot database the spots associated with the clusters determined in S204. For example, for Example 1 of FIG. 4, "景色がきれい", the clusters are determined in S204 to be ID "15" and ID "8", so the spot "famous places/historic sites" and the spot "restaurant" are extracted from the spot database of FIG. 6.

Next, the search unit 25 calculates the similarity between each word obtained by morphological analysis of the search phrase and each feature word of the spots corresponding to that word, and multiplies each calculated similarity by the corresponding weight. A fastText word-similarity estimation model may be used to calculate the similarity. The resulting values are then summed for each spot, and the extracted spots are ranked in descending order of the sum. Hereinafter, the sum for each spot calculated in this way is also called the score.

For example, for Example 1 of FIG. 4, "景色がきれい", and the extracted spot "famous place A", the similarity is calculated between the word "景色", obtained by morphological analysis of the search phrase, and the feature word "景観" registered in the spot database. The word "きれい" is identical to the feature word "きれい" registered in the spot database, so its similarity is 1. Each calculated similarity is then multiplied by the weight registered in the spot database. For example, if the similarity between the word "景色" and the feature word "景観" is 0.9, the score of the spot "famous place A" is calculated as follows.

For the word "景色": similarity 0.9 × weight 0.53 = 0.477
For the word "きれい": similarity 1.0 × weight 0.46 = 0.46
Score of the spot "famous place A": 0.477 + 0.46 = 0.937
If multiple feature words with cluster ID "15" are registered in the spot database, the similarity calculation and the weight multiplication described above are performed for each of those feature words. The same applies to cluster ID "8". The values obtained by the multiplication are then summed for each spot, and the extracted spots are ranked in descending order of that sum, the score.
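The spot score can be checked the same way. The sketch below uses the weights and the 0.9 similarity stated in the text for "famous place A"; the row layout follows the description of FIG. 6, and the lookup table and function names are hypothetical.

```python
# Rows shaped like FIG. 6: (cluster ID, feature word, weight, spot).
SPOT_DB = [
    (15, "景観", 0.53, "famous place A"),
    (8, "きれい", 0.46, "famous place A"),
]

# Hypothetical stand-in for a word-similarity model.
LOOKUP = {("景色", "景観"): 0.9}


def similarity(word, feature):
    """Identical words score 1.0; other pairs fall back to the lookup table."""
    return 1.0 if word == feature else LOOKUP.get((word, feature), 0.0)


def spot_scores(word_clusters):
    """Sum similarity x weight per spot over rows in the query's clusters."""
    scores = {}
    for word, cluster_id in word_clusters:
        for row_cluster, feature, weight, spot in SPOT_DB:
            if row_cluster == cluster_id:
                scores[spot] = scores.get(spot, 0.0) + similarity(word, feature) * weight
    return scores


scores = spot_scores([("景色", 15), ("きれい", 8)])
print(round(scores["famous place A"], 3))
```

The printed value is 0.937, the score derived above for "famous place A".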

In S207, the search unit 25 performs the final ranking of the spots detected in S206, taking the category determination result of S205 into account. For example, the search unit 25 may take the spots belonging to the category that scored highest in the category determination of S205 and rank them in descending order of the scores calculated in S206. The category to which each spot belongs is determined in advance.

The search unit 25 may then repeat this process, ranking all spots that belong to the highest-scoring category, then all spots that belong to the next-highest-scoring category, and so on, until every detected spot has been ranked. The search unit 25 may also end the repetition once ranks have been assigned down to a predetermined rank, for example the top 20.

The method of ranking based on category and spot scores is not limited to the above example. For example, for each spot detected in S206, the search unit 25 may calculate the sum of the spot's score and the score of the category to which the spot belongs, and rank the spots in order of that sum. In this case, the category score can be regarded as weighting the spot score. The ranking in S207 may also take into account the statistical data described above, so that, for example, highly rated spots are ranked higher.
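The two ranking strategies described for S207 can be compared side by side. In this sketch the scores for "famous place B" and "restaurant C", the restaurant category score, and all function names are invented for illustration; only 0.937 and 1.236 come from the worked examples.

```python
SPOT_SCORES = {"famous place A": 0.937, "famous place B": 0.5, "restaurant C": 1.1}
SPOT_CATEGORY = {
    "famous place A": "famous places/historic sites",
    "famous place B": "famous places/historic sites",
    "restaurant C": "restaurant",
}
CATEGORY_SCORES = {"famous places/historic sites": 1.236, "restaurant": 1.0}


def rank_category_first(spot_scores, spot_category, category_scores, limit=None):
    """All spots of the best category first, then the next category, and so on."""
    ordered = sorted(
        spot_scores,
        key=lambda s: (-category_scores[spot_category[s]], -spot_scores[s]),
    )
    return ordered[:limit]  # limit=None keeps every spot


def rank_by_sum(spot_scores, spot_category, category_scores, limit=None):
    """Order spots by spot score plus the score of their category."""
    ordered = sorted(
        spot_scores,
        key=lambda s: spot_scores[s] + category_scores[spot_category[s]],
        reverse=True,
    )
    return ordered[:limit]


category_first = rank_category_first(SPOT_SCORES, SPOT_CATEGORY, CATEGORY_SCORES)
by_sum = rank_by_sum(SPOT_SCORES, SPOT_CATEGORY, CATEGORY_SCORES)
```

With these made-up numbers the category-first order is A, B, C, whereas the additive order is A, C, B, because a strong spot score lets "restaurant C" overtake a weaker spot from the better category. Passing `limit=20` would correspond to stopping at the top 20.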

Finally, in S209, the search communication device 21 notifies the information processing terminal 1 of the search results. The terminal control device 14 of the information processing terminal 1 then displays the search results on the display device 13. At this point, the search communication device 21 may notify the information processing terminal 1 of only a predetermined number of the highest-ranked search results according to the ranking determined in S208, for example the top 10 results.

The search communication device 21 may also notify the information processing terminal 1 of the ranking determined in S208. In this case, the terminal control device 14 can display the search results on the display device 13 in a display mode that reflects their ranking. For example, when the search results are displayed on a map, a larger marker may be displayed for a higher-ranked spot. This lets the user see which of the detected spots are the most relevant.

As described above, determining the clusters that correspond to a search phrase and searching for spots using those clusters makes it possible to detect appropriate spots from a wide variety of search phrases. For example, the same spots can be detected whether the search phrase is Example 1 of FIG. 4, "景色がきれい", Example 2, "景色きれい", or Example 3, "風景美しい". When the search phrase is Example 4 of FIG. 4, "楽しいデート", which is entirely different from Examples 1 to 3, clusters different from those of Examples 1 to 3 are determined, and a different spot, for example amusement park F, is detected accordingly.

Clusters can also be used to identify a category, and once a category is identified, spots within that category can be detected. Furthermore, by assigning larger weights to more closely related combinations of spot and feature word, or of category and feature word, highly relevant spots can be detected preferentially.

(Modification of category determination and spot search)
In the category determination of S205, the similarity between the words obtained by morphological analysis of the search phrase and the feature words registered in the category database is calculated, but this calculation can be omitted. In that case, in S205, the category determination unit 23 may calculate the sum of the weight values of the cluster IDs associated with a category and use that sum as the score of the category.

Similarly, the similarity calculation can be omitted in the spot search of S206. In that case, in S206, the search unit 25 may calculate the sum of the weight values of the cluster IDs associated with a spot and use that sum as the score of the spot.
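A sketch of this similarity-free variant follows. It assumes that only rows whose cluster ID matches one of the query's clusters contribute to the sum, which the text does not state explicitly; the "restaurant C" row and all identifiers are hypothetical.

```python
# Rows: (cluster ID, weight, spot). The "famous place A" weights follow the
# worked example; the "restaurant C" row is invented.
SPOT_ROWS = [
    (15, 0.53, "famous place A"),
    (8, 0.46, "famous place A"),
    (8, 0.60, "restaurant C"),
]


def weight_sum_scores(query_clusters, rows):
    """Score each target by summing the weights of rows in the query's clusters."""
    scores = {}
    for cluster_id, weight, target in rows:
        if cluster_id in query_clusters:
            scores[target] = scores.get(target, 0.0) + weight
    return scores


scores = weight_sum_scores({15, 8}, SPOT_ROWS)
```

The same function could score categories instead of spots if the rows carried category names, since both variants only sum weights.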

[Generation of the category database and the spot database]
(1: Configuration)
The information retrieval system 10 may have a function of generating the category database and the spot database. The components that implement this function may be provided, for example, in the information retrieval device 2 or in the information storage device 3. Alternatively, a separate device having this function may be added to the information retrieval system 10.

An example in which the information storage device 3 has the function of generating the category database and the spot database is described below with reference to FIG. 7. FIG. 7 is a block diagram showing an example of the main configuration of the information storage device. As illustrated, the accumulation control device 32 includes a clustering model generation unit 321, a feature word extraction unit 322, a cluster determination unit 323, a spot database generation unit 324, a category classification unit 325, and a category database generation unit 326. Hereinafter, "database" may be abbreviated as "DB". The functions of these units can be implemented, for example, by a program. The category database and the spot database are assumed to be generated from word-of-mouth review data written on a predetermined site, and each review is assumed to contain at least one sentence. The predetermined site may be multiple sites or may be a social networking service (SNS). Hereinafter, word-of-mouth review data may be referred to simply as reviews.

The clustering model generation unit 321 generates a clustering model for clustering words. Clustering may be based, for example, on the similarity of feature vectors, so that a group of roughly similar words can be handled as a single cluster. Such a "group of roughly similar words" can also be described as a group of rough synonyms. A fastText vectorization model, for example, may be used to calculate the feature vectors of words. The generated clustering model is used to determine clusters in S204 of FIG. 3. Because clustering allows the search to be performed cluster by cluster, it can also make the search more efficient and faster.

The feature word extraction unit 322 extracts feature words from an input sentence. Co-occurrence analysis, for example, can be used for this extraction. Specifically, the feature word extraction unit 322 calculates a TF-IDF value for each word in the word group obtained by morphological analysis of the input sentence. It then ranks the words in the word group by their TF-IDF values and extracts the highest-ranked words as feature words. Because the TF-IDF value indicates how likely a word is to be a feature word, the feature word extraction unit 322 uses the TF-IDF value of each extracted feature word as that feature word's weight value.
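The extraction described above can be sketched with a plain TF-IDF computation. The text does not give the exact formula, so this example assumes term frequency as count divided by document length and inverse document frequency as log(N / df); the tokenized reviews and function names are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: TF-IDF value} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {w: (c / len(doc)) * math.log(n_docs / df[w]) for w, c in counts.items()}
        )
    return weights


def top_features(doc_weights, k):
    """Pick the k highest-valued words; the value doubles as the weight."""
    return sorted(doc_weights.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical reviews, already split into words by morphological analysis.
REVIEWS = [
    ["眺望", "最高", "眺望"],
    ["料理", "最高"],
    ["料理", "安い"],
]
weights = tf_idf(REVIEWS)
features = top_features(weights[0], 1)
```

For the first review, "眺望" outranks "最高" because it is both more frequent in that review and rarer across the collection, so it is selected as the feature word and its TF-IDF value becomes its weight.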

The cluster determination unit 323 determines the clusters of the feature words extracted by the feature word extraction unit 322, using the clustering model generated by the clustering model generation unit 321. Each determined cluster has as its members multiple feature words whose feature information is similar.

The spot DB generation unit 324 associates spots, feature words, their clusters, and weight values with one another to generate a spot database such as that shown in FIG. 6. The feature words and weight values are those extracted and calculated by the feature word extraction unit 322, and the clusters are those determined by the cluster determination unit 323.

The category classification unit 325 classifies an input sentence into one of a predetermined set of categories. A classification model built by supervised machine learning may be used for the classification, for example a fastText text classification model. The machine learning uses sentences whose categories are known as training data. For example, each review written on a predetermined site may be labeled with a category, manually or otherwise, and used as training data.

The category DB generation unit 326 associates categories, feature words, their clusters, and weight values with one another to generate a category database such as that shown in FIG. 5. The feature words and weight values are those extracted and calculated by the feature word extraction unit 322, and the clusters are those determined by the cluster determination unit 323.

(2: Flow of processing)
FIG. 8 shows an example of processing for generating the spot database. In S301, the clustering model generation unit 321 generates a clustering model. For example, the clustering model generation unit 321 may calculate feature vectors for all words obtained by morphological analysis of all reviews written on a predetermined site, and generate the clustering model based on the calculated feature vectors. In this case, the generated clustering model classifies an input word into the cluster whose feature vectors are similar to that of the word. fastText, for example, can be used to generate such a model.
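To make S301 concrete, the sketch below trains a tiny clustering model with plain k-means. This is a swapped-in technique chosen for illustration: the text does not name a clustering algorithm, and the two-dimensional vectors, the deterministic seeding, and all identifiers are assumptions.

```python
import math


def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k vectors seed the centroids for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: math.dist(v, centroids[i]))
            groups[nearest].append(v)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(column) / len(group) for column in zip(*group)]
    return centroids


def predict(centroids, vector):
    """Return the index of the cluster whose centroid is nearest to the vector."""
    return min(range(len(centroids)), key=lambda i: math.dist(vector, centroids[i]))


# Hypothetical 2-dimensional feature vectors of words taken from reviews.
WORDS = {
    "風景": [0.9, 0.1],
    "きれい": [0.1, 0.9],
    "景観": [0.8, 0.2],
    "美しい": [0.2, 0.8],
}
model = kmeans(list(WORDS.values()), k=2)
```

After training, `predict(model, ...)` places "風景" and "景観" in one cluster and "きれい" and "美しい" in the other, and an unseen vector such as `[0.7, 0.3]` joins the first group. The cluster indices here are arbitrary labels, unlike the cluster IDs of FIG. 4.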

In S302, the feature word extraction unit 322 extracts feature words and weights from the reviews of each spot. As described above, the feature words and weights can be extracted using TF-IDF values. The correspondence between each review and a spot is identified in advance, for example according to which spot the review was written about. For example, a review written on a page of the predetermined site that introduces famous place A is stored in association with famous place A, so that the feature word extraction unit 322 can identify it as a review of famous place A.

In S303, the cluster determination unit 323 determines the clusters of the feature words extracted in S302, using the clustering model generated in S301. Through the processing of S302 and S303, feature words, weights, and clusters are determined for each of the reviews written about each spot.

In S304, the spot DB generation unit 324 stores the feature words, weights, and clusters determined as described above in the spot database in association with the corresponding spots. As a result, for each spot, records in which a feature word, a cluster, and a weight are associated with one another are added to the spot database. If the spot database has not yet been generated, it is generated in S304.
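The record assembly of S304 can be sketched as follows. The record field names, the input structures, and the function name are hypothetical; the sample values reuse the "famous place A" weights and the cluster IDs mentioned earlier.

```python
def build_spot_records(features_by_spot, cluster_of):
    """Join each spot's (feature word, weight) pairs with the word's cluster ID."""
    records = []
    for spot_id, features in features_by_spot.items():
        for word, weight in features:
            records.append(
                {
                    "cluster_id": cluster_of[word],
                    "feature_word": word,
                    "weight": weight,
                    "spot_id": spot_id,
                }
            )
    return records


# Assumed outputs of S302 (feature words and weights per spot)
# and S303 (cluster ID per feature word).
FEATURES_BY_SPOT = {"famous place A": [("景観", 0.53), ("きれい", 0.46)]}
CLUSTER_OF = {"景観": 15, "きれい": 8}

spot_db = build_spot_records(FEATURES_BY_SPOT, CLUSTER_OF)
```

Each resulting record carries the four associated fields of the spot database: cluster ID, feature word, weight, and spot.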

Next, FIG. 9 is described. FIG. 9 is a flowchart showing an example of processing for generating the category database. In S401, the category classification unit 325 classifies each review written on a predetermined site into a category.

In S402, the feature word extraction unit 322 extracts feature words and weights from the reviews of each category. As described above, the feature words and weights can be extracted using TF-IDF values. The feature words and weights extracted in S302 may be reused in S402.

In S403, the cluster determination unit 323 determines the clusters of the feature words extracted in S402, using the clustering model generated in S301. Through the processing of S302 and S403, feature words, weights, and clusters are determined for each of the reviews in each category.

In S404, the category DB generation unit 326 stores the feature words, weights, and clusters determined as described above in the category database in association with the corresponding categories. As a result, for each category, records in which a feature word, a cluster, and a weight are associated with one another are added to the category database. If the category database has not yet been generated, it is generated in S404.

(3: Summary)
As described above, the clustering model generation unit 321 calculates a feature vector for each word obtained by morphological analysis of reviews, and generates a clustering model based on the calculated feature vectors. A clustering model generated in this way can accurately cluster search phrases that contain expressions similar to those found in reviews. The clustering model can also be generated from reviews alone.

The spot database and the category database can then be generated using the constructed clustering model and the reviews. These databases can also be generated from reviews alone, and they can be expanded by acquiring the latest reviews as they appear. Because reviews reflect the latest terms used to evaluate or describe spots and categories, databases generated in this way make it possible to handle searches that use such terms.

In addition, the cluster determination unit 323 determines the clusters of feature words using the clustering model, and the spot DB generation unit 324 stores the clusters determined by the cluster determination unit 323 in association with spots. As a result, even when the input search phrase consists of words that are not directly associated with any spot, the search unit 25 can detect an appropriate spot by using the clusters corresponding to those words.

Similarly, the category DB generation unit 326 stores the clusters determined by the cluster determination unit 323 in association with the categories classified by the category classification unit 325. As a result, even when a search phrase consisting of words that are not directly associated with any category is input, the category determination unit 23 can determine an appropriate category by using the clusters corresponding to those words.
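The cluster-based lookup described in the two preceding paragraphs can be sketched as follows. This is only an illustration under stated assumptions: the databases are modeled as plain dictionaries, and every name and entry (`WORD_TO_CLUSTER`, `SPOT_DB`, `CATEGORY_DB`, `DIRECT_SPOTS`, `lookup`, the spot names) is hypothetical.

```python
# Hypothetical tables. WORD_TO_CLUSTER stands in for the clustering model,
# SPOT_DB and CATEGORY_DB for the cluster associations held in the databases.
WORD_TO_CLUSTER = {"cozy": "c_atmosphere", "relaxing": "c_atmosphere"}
SPOT_DB = {"c_atmosphere": ["Cafe A", "Hot Spring B"]}
CATEGORY_DB = {"c_atmosphere": ["cafe", "hot spring"]}

# Words that appear directly as feature words of a spot; "cozy" is absent.
DIRECT_SPOTS = {"relaxing": ["Hot Spring B"]}

def lookup(word, db):
    """Resolve a word to its cluster, then return the entries for that cluster."""
    cluster = WORD_TO_CLUSTER.get(word)
    return db.get(cluster, []) if cluster is not None else []

spots_for_cozy = lookup("cozy", SPOT_DB)
categories_for_cozy = lookup("cozy", CATEGORY_DB)
```

Although "cozy" has no direct association with any spot in this toy data, the lookup still returns spots and categories because it goes through the shared cluster.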

Furthermore, the feature word extraction unit 322 extracts feature words and calculates their weights, and the spot DB generation unit 324 stores spots, clusters, feature words, and weights in association with one another. As a result, for each feature word included in a cluster, the degree of relevance to a spot can be identified from the weight value. The search unit 25 can therefore rank the search results in consideration of the degree of relevance between the feature words corresponding to the search phrase and each spot.

Similarly, the category DB generation unit 326 stores categories, clusters, feature words, and weights in association with one another. As a result, for each feature word associated with a category, the degree of relevance to that category can be identified from the weight value. The category determination unit 23 can therefore rank the category determination results in consideration of the degree of relevance between the feature words corresponding to the search phrase and each category.
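The weight-based ranking can be sketched as follows. In line with the ranking step recited in the claims, the score is the similarity multiplied by the stored weight. How the scores of several feature words of one spot are combined is not stated in this passage, so taking the maximum is an assumption of this sketch, as are the names `Entry` and `rank_spots` and all numeric values.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    spot: str
    feature_word: str
    weight: float  # stored relevance of the feature word to the spot

def rank_spots(similarities, entries):
    """Rank spots by similarity * weight, highest score first.

    similarities maps a feature word to its similarity with the search
    phrase. A spot keeps the best score among its feature words, which is
    an assumption made for this sketch.
    """
    scores = {}
    for e in entries:
        score = similarities.get(e.feature_word, 0.0) * e.weight
        scores[e.spot] = max(scores.get(e.spot, 0.0), score)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

entries = [
    Entry("Cafe A", "cozy", 0.9),
    Entry("Cafe A", "quiet", 0.4),
    Entry("Hot Spring B", "cozy", 0.5),
]
ranking = rank_spots({"cozy": 0.8, "quiet": 0.6}, entries)
```

The same function could rank categories if each entry held a category instead of a spot, since the category database stores the same kind of associations.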

(Regarding re-learning)
When a model generated by supervised machine learning is used in a search, teacher data (labeled training data) may be generated from the result of that search, and machine learning may be performed again with the teacher data to update the model. For example, suppose that a search phrase is input, category determination (S204) and spot search (S206) are performed, and the user selects one of the spots notified to the information processing terminal 1. In this case, the selected spot can be regarded as a spot that matched the user's intention. Accordingly, the search phrase, or the words constituting the search phrase and their feature vectors, can be associated with the selected spot or its category and used as teacher data. For example, the search phrase may be associated with the category of the selected spot to form teacher data for updating the classification model used by the category classification unit. Updating the classification model by re-learning with such teacher data can improve the classification accuracy in S401.
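The generation of teacher data from a user's selection can be sketched as follows. The record layout (`text` and `label` keys), the function name `make_teacher_data`, and the sample spots are hypothetical; the sketch only pairs a search phrase with the category of the selected spot and does not perform the re-learning itself.

```python
def make_teacher_data(search_phrase, shown_spots, selected_spot, spot_category):
    """Pair the search phrase with the category of the spot the user chose.

    Returns None when the selection is not among the spots that were shown,
    since no reliable label can be derived in that case.
    """
    if selected_spot not in shown_spots:
        return None
    return {"text": search_phrase, "label": spot_category[selected_spot]}

# Hypothetical spot-to-category table and one selection event.
spot_category = {"Cafe A": "cafe", "Hot Spring B": "hot spring"}
sample = make_teacher_data(
    "somewhere cozy", ["Cafe A", "Hot Spring B"], "Cafe A", spot_category
)
```

Records of this kind could be accumulated and passed to whatever training routine produced the classification model in the first place.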

1 Information processing terminal
2 Information retrieval device
3 Information storage device
10 Information retrieval system
11 Input device
12 Terminal communication device
13 Display device
15 Terminal storage device
21 Search communication device
22 Position information acquisition unit
23 Category determination unit
24 Feature word generation unit
25 Search unit
28 Search storage device
31 Accumulation storage device

Claims

A program for searching, using a computer as an information search device, for spots belonging to categories associated with position information indicating a geographical position, the program comprising:
an obtaining step of obtaining a search phrase;
a position information acquisition step of acquiring the position information;
a calculating step of calculating a feature vector of each word obtained by morphologically analyzing the search phrase;
a determination step of determining each cluster that contains a feature word having a high degree of similarity to each of the feature vectors calculated for the respective words;
a determination step of determining, from a category database in which the clusters, the feature words, first weighting factors, and the categories are associated with one another, each of the categories corresponding to each of the determined clusters;
a searching step of searching, from a spot database in which the clusters, the feature words, second weighting factors, and the spots are associated with one another, for each of the spots that corresponds to each of the determined clusters and is included in a predetermined range from the position indicated by the acquired position information; and
a ranking step of calculating the similarity between each word and the feature word of each category, ranking the categories in descending order of a first score obtained by multiplying the calculated similarity by the first weighting factor, calculating the similarity between each word and the feature word of each spot, and ranking the spots within each of the categories in descending order of a second score obtained by multiplying the calculated similarity by the second weighting factor.