JP4009937B2 - Document search device, document search program, and medium storing document search program - Google Patents

Info

Publication number
JP4009937B2
JP4009937B2 JP2002004805A JP2002004805A JP4009937B2 JP 4009937 B2 JP4009937 B2 JP 4009937B2 JP 2002004805 A JP2002004805 A JP 2002004805A JP 2002004805 A JP2002004805 A JP 2002004805A JP 4009937 B2 JP4009937 B2 JP 4009937B2
Authority
JP
Japan
Prior art keywords
search
phrase
document
search request
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP2002004805A
Other languages
Japanese (ja)
Other versions
JP2003208447A (en)
Inventor
大二郎 森
正之 杉崎
聡哉 栗島
博人 稲垣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP2002004805A
Publication of JP2003208447A
Application granted
Publication of JP4009937B2
Anticipated expiration
Expired - Fee Related

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

【0001】
【発明の属する技術分野】
本発明は、大量の文書情報を蓄積し、入力された文字列(語句)を含む文書を検索して提示する文書検索技術に関する。
【0002】
【従来の技術】
近年、インターネットの普及に代表される情報流通インフラの急速な整備に伴い、大量の文書情報(以下、単に文書)が流通するようになった。これらの文書を網羅的に収集し、検索要求として入力された語句を本文に含む文書を検索して提示するシステムが実現されている。特にWWW(World Wide Web)上の文書を検索するWebサーチエンジンと呼ばれるサービスとして、goo(http://www.goo.ne.jp/)やgoogle(http://www.google.com/)などが実現されている。
【0003】
これらのWebサーチエンジンでは、入力された検索要求語句に対して、数千件〜数百万件という大量の文書が検索結果として得られる場合が少なくない。このような大量の検索結果文書の全てに利用者が目を通すことは事実上不可能であるから、検索要求語句により良く適合していると考えられる文書を選りすぐって提示したり、検索結果文書の集合を分析し、サイト単位や文書の主題単位に検索結果文書をグループ化して表示し、利用者が求める文書を的確に絞り込むことを支援していた。
【0004】
自然言語では一つの語句が複数の意味を持つ、即ち多義性を持つことがしばしばあるため、ある語句によって検索を行った時、異なる語義でその語句が用いられている文書が検索結果として混在して出力されることがある。これらの文書の中から利用者が求める文書を的確に絞り込む手段として、文書の主題によって検索結果を分類し、検索結果をグループ化して提示する技術は有用である。
【0005】
検索結果の分類を行う手法としては、TF×IDF法等を用いて各文書から特徴的な単語(語句)を抽出し、これらの語句数分の次元のベクトル空間に文書を配置し、文書ベクトルのなす角の余弦によって文書間の類似度を定義して、類似度の高い文書をクラスタ分析を用いて分類する手法等が用いられる。
【0006】
なお、TF×IDF法の詳細については、例えば、G. Salton, "Automatic Text Processing" 1989, Addison-Wesley Pubrishing等に記載されているが、文書集合全体の中で出現頻度がより少ない語句を重要な語句とみなし、ある文書の中で出現頻度がより高い語句をその文書の特徴を良く表す言葉だとみなすという二つの原理に基づいて、文書と語句との適合度を算出する手法である。
【0007】
【発明が解決しようとする課題】
しかしながら、検索結果文書の数が大量である場合、全ての文書に対して上記のようなクラスタ分析を行うと、検索応答時間が長くなってしまうという問題があった。
【0008】
また、TF×IDF法を用いて各文書の特徴量を算出する場合、各語句の重要度は静的で変化しないものとみなすことになるが、実際には、ごく一般的な語句が、例えば本や映画の表題に用いられたりすると、利用者にとって重要度の高い語句とみなされるようになったり、ごくありふれた姓と名の組み合わせが有名人の名前として認識され、利用者にとって重要度の高いフレーズとなるようなケースがしばしばあるにも拘わらず、これを反映して的確に分類を行うことができなかった。
【0009】
本発明の目的は、検索応答時間を短くし、利用者にとっての語句の重要度の動的な変化に対応して的確な分類を可能とすることにある。
【0010】
【課題を解決するための手段】
前記課題を解決するため、本発明では、予め蓄積した検索対象文書の集合から所望の語句を含む文書を検索し提示する文書検索装置であって、利用者によって入力された語句を検索要求語句として受け取る検索要求入力手段と、検索対象文書の集合から前記利用者によって入力された検索要求語句を含む文書を検索し出力する文書検索手段と、文書検索手段が出力する文書のうち、前記利用者によって入力された検索要求語句と関連度の高い関連語句を含む文書を、各関連語句毎にグループ化して表示する検索結果表示手段とを備えた文書検索装置を提案する。
【0012】
ここで、検索結果表示手段として、検索要求語句の履歴(過去に入力され、使用された検索要求語句の集合)中から前記利用者によって入力された検索要求語句に隣接している語句を抽出し、その出現頻度を算出し、該頻度が高い語句を前記利用者によって入力された検索要求語句と関連度の高い関連語句とみなし、文書検索手段が出力する文書のうち、前記利用者によって入力された検索要求語句とこれらの関連語句との連接語句を含む文書を、各関連語句毎にグループ化して表示する検索結果表示手段を用いても良く、検索入力語句と隣接する語句とで構成されるフレーズのうち、検索要求語句の履歴中に高い頻度で現れるフレーズはその時点で多くの利用者が共通に求める傾向が高いフレーズであると考えることができるから、これらのフレーズを分類の基準とみなすことによって、利用者にとって重要度の高い概念によって検索結果を分類し、提示することが可能となる。
【0013】
以上の通り、本発明によれば、検索応答時間が長くなってしまうという課題と、利用者にとっての語句の重要度の動的な変化に対応して的確な分類が行えないという課題の両方を解決することが可能となる。
【0014】
【発明の実施の形態】
次に、本発明の実施の形態について図面を参照して詳細に説明する。
【0015】
図1は、本発明の文書検索装置の実施の形態の一例を示すもので、図中、1は検索要求入力処理部、2は文書検索処理部、3は検索結果表示処理部、4は予め蓄積した検索対象文書の集合(を記憶した記憶部)である。
【0016】
検索要求入力処理部1は、利用者によって入力された語句を検索要求語句として受け取り、これを文書検索処理部2に引き渡す。
【0017】
文書検索処理部2は、検索対象文書の集合4から前記検索要求語句を含む文書を検索し、これを検索結果文書として検索結果表示処理部3に出力する。
【0018】
検索結果表示処理部3は、検索結果文書のうち、前記検索要求語句と関連度の高い関連語句を含む文書を、各関連語句毎にグループ化して表示する。
【0019】
検索結果表示処理部3において、何をもって関連語句とするか(何から抽出し、どのようにして決定するか)は種々考えられるが、本発明では、検索要求語句に隣接して現れる語句を関連語句の候補として、検索要求語句の履歴から検索要求語句に隣接して現れる語句を抽出し、隣接して現れる回数が多い語句(出現頻度の高い語句)を関連語句とみなし、検索要求語句とこれらの関連語句との連接語句を含む文書を、各関連語句毎にグループ化して表示する。
【0020】
また、何をもって隣接して現れる回数が多い(出現頻度が高い)とするかについても種々考えられるが、(1)全検索結果文書における出現割合が一定以上、(2)出現回数の多い順に上位何番目まで、(3)これらの組み合わせ等が考えられる。
【0021】
なお、抽出対象とする隣接語句は検索要求語句(の品詞)によっても変わるが、通常、名詞(人名、地名等の固有名詞を含む)であり、場合によっては形容詞、副詞等を含めても良いが、接続詞、接頭語、接尾語等は対象外とする。
【0022】
図2は、文書検索及び検索結果表示処理の一例(但し、特許請求の範囲には含まれない。)、ここでは検索結果文書中の検索要求語句の隣接語句を関連語句とする場合の例を示す流れ図である。
【0023】
即ち、検索要求入力処理部1において利用者より入力された検索要求語句に基づき、文書検索処理部2において検索対象文書の集合4から該検索要求語句を含む文書を検索し、検索結果文書として出力した後(S1)、検索結果表示処理部3において検索結果文書中から検索要求語句に隣接している語句を抽出し、その出現頻度を算出し(S2)、該頻度が高い語句を関連語句と見なし、検索要求語句とこれらの関連語句との連接語句を含む文書を、各関連語句毎にグループ化し(S3)、出現頻度が高いものから順に表示する(S4)。
【0024】
図3にこの方法による検索結果表示処理の具体例を示す。
【0025】
図4は、文書検索及び検索結果表示処理の他の例、ここでは検索要求語句の履歴中の検索要求語句の隣接語句を関連語句とする場合の例を示す流れ図である。
【0026】
即ち、検索要求入力処理部1において利用者より入力された検索要求語句に基づき、文書検索処理部2において検索対象文書の集合4から該検索要求語句を含む文書を検索し、検索結果文書として出力した後(S1)、検索結果表示処理部3において検索要求語句の履歴(過去に入力され、使用された検索要求語句の集合)中から検索要求語句に隣接している語句を抽出し、その出現頻度を算出し(S11)、該頻度が高い語句を関連語句と見なし、検索要求語句とこれらの関連語句との連接語句を含む文書を、各関連語句毎にグループ化し(S3)、出現頻度が高いものから順に表示する(S4)。
【0027】
図5にこの方法による検索結果表示処理の具体例を示す。
【0028】
この際、対象とする検索要求語句の履歴の範囲を、検索要求語句が入力された時刻から遡って一定期間(例えば72時間の範囲等)内に限定し、この中から該当する語句の出現頻度を求めることにより、流行等の要因による検索要求語句の重要度の変動により的確に追従することができる。
【0029】
【発明の効果】
本発明の効果は、第1に、検索要求語句に対する関連語句を抽出し、それに関する分類を行うことにより、クラスタリング等の手法に比べ、より短時間に分類処理を行えることであり、第2に、検索要求語句の重要度が流行等を反映して変動する事象に追従でき、多くの利用者の関心に沿った語句によって分類が可能になることである。
【図面の簡単な説明】
【図1】本発明の文書検索装置の実施の形態の一例を示す構成図
【図2】文書検索及び検索結果表示処理の一例を示す流れ図
【図3】図2の処理の流れに従う検索結果表示処理の具体例を示す説明図
【図4】文書検索及び検索結果表示処理の他の例を示す流れ図
【図5】図4の処理の流れに従う検索結果表示処理の具体例を示す説明図
【符号の説明】
1:検索要求入力処理部、2:文書検索処理部、3:検索結果表示処理部、4:検索対象文書の集合。
[0001]
[Technical field of the invention]
The present invention relates to a document retrieval technique that stores a large amount of document information and retrieves and presents documents containing an input character string (phrase).
[0002]
[Prior art]
In recent years, with the rapid development of information distribution infrastructure, typified by the spread of the Internet, a large amount of document information (hereinafter simply "documents") has come into circulation. Systems have been realized that exhaustively collect these documents and retrieve and present those whose body text contains a phrase entered as a search request. In particular, services called Web search engines, which search documents on the WWW (World Wide Web), have been realized, such as goo (http://www.goo.ne.jp/) and Google (http://www.google.com/).
[0003]
With these Web search engines, a search request phrase often yields a very large number of result documents, from several thousand to several million. Since it is practically impossible for a user to read all of them, these engines have helped users narrow down to the documents they want by selecting and presenting the documents considered to best match the search request phrase, or by analyzing the set of result documents and displaying them grouped by site or by document subject.
[0004]
In natural language, a single phrase often has multiple meanings, that is, it is polysemous. Consequently, a search on a given phrase may return a mixture of documents that use the phrase in different senses. Techniques that classify search results by document subject and present them in groups are useful as a means of accurately narrowing down to the documents the user wants.
[0005]
One method used to classify search results is to extract characteristic words (phrases) from each document using the TF×IDF method or the like, place the documents in a vector space with as many dimensions as there are such phrases, define the similarity between documents as the cosine of the angle between their document vectors, and classify highly similar documents by cluster analysis.
[0006]
Details of the TF×IDF method are described in, for example, G. Salton, "Automatic Text Processing", 1989, Addison-Wesley Publishing. The method computes the degree of match between a document and a phrase based on two principles: a phrase that appears less frequently in the document set as a whole is regarded as important, and a phrase that appears more frequently within a given document is regarded as well representing that document's characteristics.
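As an illustration of the prior-art approach described in [0005] and [0006], the following is a minimal sketch, not the method of any particular system. It assumes documents are already split into tokens and uses one common weighting variant, tf × log(N/df); the function names `tfidf_vectors` and `cosine` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # term frequency within this document
        # Rare terms (low df) and locally frequent terms (high tf) weigh more.
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical search results for the polysemous query "jaguar".
docs = [
    ["jaguar", "car", "engine"],
    ["jaguar", "car", "dealer"],
    ["jaguar", "cat", "jungle"],
]
vecs = tfidf_vectors(docs)
sim_cars = cosine(vecs[0], vecs[1])   # both documents are about cars
sim_mixed = cosine(vecs[0], vecs[2])  # car document vs. animal document
```

Because "jaguar" occurs in every document, its weight is zero, so the similarity is driven by the remaining terms. A clustering step would then have to compare every pair of result documents, which is the cost the invention aims to avoid.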
[0007]
[Problems to be solved by the invention]
However, when the number of search result documents is large, performing cluster analysis as described above on all of them lengthens the search response time.
[0008]
Moreover, computing each document's features with the TF×IDF method treats the importance of each phrase as static and unchanging. In practice, a very common phrase may come to be regarded as important to users once it is used as, for example, the title of a book or film, and a very ordinary combination of surname and given name may come to be recognized as a celebrity's name and so become a phrase of high importance to users. Although such cases arise frequently, conventional classification could not reflect them accurately.
[0009]
An object of the present invention is to shorten the search response time and to enable accurate classification that tracks dynamic changes in the importance of phrases to users.
[0010]
[Means for Solving the Problems]
To solve the above problems, the present invention proposes a document search device that retrieves and presents documents containing a desired phrase from a set of search target documents stored in advance, the device comprising: search request input means for receiving a phrase entered by a user as a search request phrase; document search means for retrieving and outputting, from the set of search target documents, documents containing the search request phrase entered by the user; and search result display means for displaying, from among the documents output by the document search means, documents containing related phrases highly related to the search request phrase entered by the user, grouped by related phrase.
[0012]
Here, the search result display means may be one that extracts, from the history of search request phrases (the set of search request phrases entered and used in the past), the phrases adjacent to the search request phrase entered by the user, computes their frequency of occurrence, regards the high-frequency phrases as related phrases highly related to the entered search request phrase, and displays, from among the documents output by the document search means, the documents containing a concatenation of the entered search request phrase and one of these related phrases, grouped by related phrase. Among the combined phrases formed by the search input phrase and an adjacent phrase, those that appear frequently in the search request history can be regarded as ones that many users are commonly seeking at that point in time. Treating these combined phrases as the basis for classification therefore makes it possible to classify and present search results according to concepts that matter to users.
[0013]
As described above, the present invention makes it possible to solve both problems: the lengthening of the search response time, and the inability to classify accurately in response to dynamic changes in the importance of phrases to users.
[0014]
[Embodiments of the invention]
Next, embodiments of the present invention will be described in detail with reference to the drawings.
[0015]
FIG. 1 shows an example of an embodiment of the document search device of the present invention. In the figure, 1 is a search request input processing unit, 2 is a document search processing unit, 3 is a search result display processing unit, and 4 is a set of search target documents stored in advance (or a storage unit holding that set).
[0016]
The search request input processing unit 1 receives a phrase entered by the user as a search request phrase and passes it to the document search processing unit 2.
[0017]
The document search processing unit 2 retrieves documents containing the search request phrase from the set 4 of search target documents and outputs them as search result documents to the search result display processing unit 3.
[0018]
The search result display processing unit 3 displays, from among the search result documents, the documents containing related phrases highly related to the search request phrase, grouped by related phrase.
[0019]
In the search result display processing unit 3, there are various possible ways to decide what counts as a related phrase (what source to extract it from and how to determine it). In the present invention, phrases that appear adjacent to the search request phrase are treated as candidates: phrases appearing adjacent to the search request phrase are extracted from the history of search request phrases, those that appear adjacent most often (high-frequency phrases) are regarded as related phrases, and documents containing a concatenation of the search request phrase and one of these related phrases are displayed grouped by related phrase.
[0020]
There are likewise various ways to decide what counts as appearing adjacent many times (a high frequency of occurrence). Possible criteria include (1) an occurrence ratio across all search result documents at or above a fixed threshold, (2) the top N phrases in descending order of occurrence count, and (3) a combination of these.
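The three selection criteria in [0020] can be sketched as a single function. This is an illustrative assumption rather than the patent's implementation: the name `select_related`, its parameters `min_ratio` and `top_n`, and the sample counts are all hypothetical, and `total` stands for the number of search result documents used as the denominator of the ratio.

```python
from collections import Counter

def select_related(counts, total, min_ratio=None, top_n=None):
    """Pick related phrases from adjacency counts.

    counts: Counter mapping an adjacent phrase to its number of occurrences.
    total: number of search result documents (denominator of the ratio).
    min_ratio: criterion (1), keep phrases with count / total >= min_ratio.
    top_n: criterion (2), keep only the top_n most frequent phrases.
    Passing both applies criterion (3), the combination.
    """
    ranked = counts.most_common()  # descending order of occurrence count
    if min_ratio is not None and total > 0:
        ranked = [(p, c) for p, c in ranked if c / total >= min_ratio]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [phrase for phrase, _ in ranked]

# Hypothetical adjacency counts over 100 search result documents.
counts = Counter({"movie": 40, "review": 25, "cast": 5})
by_ratio = select_related(counts, 100, min_ratio=0.1)
by_rank = select_related(counts, 100, top_n=1)
combined = select_related(counts, 100, min_ratio=0.01, top_n=2)
```

The ratio criterion adapts to the size of the result set, while the rank criterion bounds the number of groups shown; combining them gives both properties.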
[0021]
The adjacent phrases to be extracted vary with the search request phrase (its part of speech), but they are usually nouns (including proper nouns such as personal and place names). Adjectives, adverbs, and the like may be included in some cases, but conjunctions, prefixes, suffixes, and the like are excluded.
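The part-of-speech restriction in [0021] amounts to a filter over tagged candidates. The sketch below assumes the adjacent phrases have already been tagged by some morphological analyzer, which the text does not specify; the tag names, the function `keep_candidates`, and the sample data are hypothetical.

```python
# Hypothetical tag names; a real tagger would define its own tag set.
ALLOWED_POS = {"noun", "proper_noun"}
OPTIONAL_POS = {"adjective", "adverb"}

def keep_candidates(tagged_neighbours, include_modifiers=False):
    """Filter (phrase, part_of_speech) pairs down to related-phrase candidates.

    Nouns and proper nouns are always kept. Adjectives and adverbs are kept
    only when include_modifiers is True. Everything else (conjunctions,
    prefixes, suffixes and so on) is dropped.
    """
    allowed = (ALLOWED_POS | OPTIONAL_POS) if include_modifiers else ALLOWED_POS
    return [phrase for phrase, pos in tagged_neighbours if pos in allowed]

tagged = [
    ("Tokyo", "proper_noun"),
    ("and", "conjunction"),
    ("new", "adjective"),
    ("tower", "noun"),
    ("re", "prefix"),
]
nouns_only = keep_candidates(tagged)
with_modifiers = keep_candidates(tagged, include_modifiers=True)
```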
[0022]
FIG. 2 is a flowchart showing an example of document search and search result display processing (this example, however, is not covered by the claims), here one in which phrases adjacent to the search request phrase in the search result documents are used as related phrases.
[0023]
That is, based on the search request phrase entered by the user in the search request input processing unit 1, the document search processing unit 2 retrieves documents containing that phrase from the set 4 of search target documents and outputs them as search result documents (S1). The search result display processing unit 3 then extracts the phrases adjacent to the search request phrase from the search result documents and computes their frequency of occurrence (S2), regards the high-frequency phrases as related phrases, groups the documents containing a concatenation of the search request phrase and one of these related phrases by related phrase (S3), and displays the groups in descending order of frequency (S4).
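Steps S1 to S4 of FIG. 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: documents are assumed to be pre-tokenized, "adjacent" is taken to mean the token immediately before or after the search request phrase, no part-of-speech filtering is applied, and the names `adjacent_terms`, `group_results`, and the sample documents are hypothetical.

```python
from collections import Counter

def adjacent_terms(tokens, query):
    """Return the tokens immediately before or after each occurrence of query."""
    found = []
    for i, tok in enumerate(tokens):
        if tok != query:
            continue
        if i > 0:
            found.append(tokens[i - 1])
        if i + 1 < len(tokens):
            found.append(tokens[i + 1])
    return found

def group_results(docs, query, top_n=2):
    # S1: retrieve the documents that contain the search request phrase.
    hits = {doc_id: toks for doc_id, toks in docs.items() if query in toks}
    # S2: count how often each phrase appears next to the query in the hits.
    counts = Counter()
    neighbours = {}
    for doc_id, toks in hits.items():
        adj = adjacent_terms(toks, query)
        neighbours[doc_id] = set(adj)
        counts.update(adj)
    # S3: treat the most frequent neighbours as related phrases and group the
    # documents in which the query and that phrase appear side by side.
    # S4: most_common() already yields the groups in descending frequency.
    return [
        (term, sorted(d for d, adj in neighbours.items() if term in adj))
        for term, _ in counts.most_common(top_n)
    ]

# Hypothetical document set; "d4" does not match the query at all.
docs = {
    "d1": ["apple", "pie", "recipe"],
    "d2": ["easy", "apple", "pie"],
    "d3": ["apple", "computer", "store"],
    "d4": ["orange", "juice"],
    "d5": ["apple", "pie"],
    "d6": ["apple", "computer"],
}
groups = group_results(docs, "apple")
```

Each document is scanned once, so the cost grows linearly with the size of the result set, in contrast to pairwise similarity computation for clustering.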
[0024]
FIG. 3 shows a specific example of search result display processing by this method.
[0025]
FIG. 4 is a flowchart showing another example of document search and search result display processing, here one in which phrases adjacent to the search request phrase in the history of search request phrases are used as related phrases.
[0026]
That is, based on the search request phrase entered by the user in the search request input processing unit 1, the document search processing unit 2 retrieves documents containing that phrase from the set 4 of search target documents and outputs them as search result documents (S1). The search result display processing unit 3 then extracts the phrases adjacent to the search request phrase from the history of search request phrases (the set of search request phrases entered and used in the past) and computes their frequency of occurrence (S11), regards the high-frequency phrases as related phrases, groups the documents containing a concatenation of the search request phrase and one of these related phrases by related phrase (S3), and displays the groups in descending order of frequency (S4).
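The history-based variant of FIG. 4 differs from FIG. 2 only in where the adjacency counts come from (S11). The sketch below assumes each past search request is stored as a list of tokens and that the search result documents are pre-tokenized; the names `neighbours`, `related_from_history`, `group_by_related`, and all sample data are hypothetical.

```python
from collections import Counter

def neighbours(tokens, query):
    """Return the tokens immediately before or after each occurrence of query."""
    out = []
    for i, tok in enumerate(tokens):
        if tok == query:
            out.extend(tokens[max(i - 1, 0):i])  # preceding token, if any
            out.extend(tokens[i + 1:i + 2])      # following token, if any
    return out

def related_from_history(history, query, top_n=3):
    # S11: count the phrases that users typed next to the query in past requests.
    counts = Counter()
    for past_query in history:
        counts.update(neighbours(past_query, query))
    return [term for term, _ in counts.most_common(top_n)]

def group_by_related(result_docs, query, related):
    # S3/S4: one group per related phrase, kept in descending frequency order,
    # holding the documents where the query and the phrase appear side by side.
    return [
        (term, sorted(doc_id for doc_id, toks in result_docs.items()
                      if term in neighbours(toks, query)))
        for term in related
    ]

# Hypothetical query log and search results for the query "matrix".
history = [
    ["matrix", "movie"],
    ["matrix", "movie"],
    ["matrix", "algebra"],
    ["movie", "matrix"],
    ["weather"],
]
result_docs = {
    "d1": ["the", "matrix", "movie", "review"],
    "d2": ["matrix", "algebra", "notes"],
    "d3": ["sparse", "matrix", "storage"],
}
related = related_from_history(history, "matrix")
groups = group_by_related(result_docs, "matrix", related)
```

In this example "d3" falls into no group, since neither "sparse" nor "storage" appears next to "matrix" in the query log.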
[0027]
FIG. 5 shows a specific example of search result display processing by this method.
[0028]
Here, the range of search request history considered can be limited to a fixed period going back from the time the search request phrase was entered (for example, the preceding 72 hours), and the frequency of the relevant phrases computed from within that range. This allows the grouping to track more accurately the changes in the importance of search request phrases caused by trends and similar factors.
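The time-window restriction in [0028] can be sketched by attaching a timestamp to each history entry. The 72-hour default comes from the text; the function name `recent_adjacent_counts`, the (timestamp, tokens) record layout, and the sample log are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

def recent_adjacent_counts(history, query, now, window=timedelta(hours=72)):
    """Count phrases adjacent to query, using only requests inside the window.

    history: iterable of (timestamp, tokens) pairs, one per past search request.
    now: the time at which the current search request was entered.
    """
    cutoff = now - window
    counts = Counter()
    for timestamp, tokens in history:
        if not (cutoff <= timestamp <= now):
            continue  # outside the window: ignore this request
        for i, tok in enumerate(tokens):
            if tok != query:
                continue
            if i > 0:
                counts[tokens[i - 1]] += 1
            if i + 1 < len(tokens):
                counts[tokens[i + 1]] += 1
    return counts

# Hypothetical log: the "algebra" request is older than 72 hours.
now = datetime(2002, 1, 11, 12, 0)
log = [
    (now - timedelta(hours=1), ["matrix", "movie"]),
    (now - timedelta(hours=100), ["matrix", "algebra"]),
    (now - timedelta(hours=71), ["matrix", "movie"]),
]
counts = recent_adjacent_counts(log, "matrix", now)
```

A shorter window reacts faster to trends but leaves fewer requests to count, so the window length is a tuning choice.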
[0029]
[Effects of the invention]
The effects of the present invention are, first, that by extracting phrases related to the search request phrase and classifying on that basis, classification can be performed in less time than with techniques such as clustering; and second, that the classification can follow changes in the importance of search request phrases that reflect trends and the like, so that results can be classified by phrases in line with the interests of many users.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an example of an embodiment of the document search device of the present invention.
FIG. 2 is a flowchart showing an example of document search and search result display processing.
FIG. 3 is an explanatory diagram showing a specific example of search result display processing following the processing flow of FIG. 2.
FIG. 4 is a flowchart showing another example of document search and search result display processing.
FIG. 5 is an explanatory diagram showing a specific example of search result display processing following the processing flow of FIG. 4.
[Explanation of reference numerals]
1: search request input processing unit, 2: document search processing unit, 3: search result display processing unit, 4: set of search target documents.

Claims (3)

予め蓄積した検索対象文書の集合から所望の語句を含む文書を検索し提示する文書検索装置であって、
利用者によって入力された語句を検索要求語句として受け取る検索要求入力手段と、
検索対象文書の集合から前記利用者によって入力された検索要求語句を含む文書を検索し出力する文書検索手段と、
過去に入力され、使用された検索要求語句の集合である検索要求語句の履歴中から前記利用者によって入力された検索要求語句に隣接している語句を抽出し、その出現頻度を算出し、該頻度が高い語句を前記利用者によって入力された検索要求語句と関連度の高い関連語句とみなし、文書検索手段が出力する文書のうち、前記利用者によって入力された検索要求語句とこれらの関連語句との連接語句を含む文書を、各関連語句毎にグループ化して表示する検索結果表示手段とを備えた
ことを特徴とする文書検索装置。
A document search device that retrieves and presents documents containing a desired phrase from a set of search target documents stored in advance, comprising:
search request input means for receiving a phrase entered by a user as a search request phrase;
document search means for retrieving and outputting, from the set of search target documents, documents containing the search request phrase entered by the user; and
search result display means for extracting, from a history of search request phrases that is the set of search request phrases entered and used in the past, phrases adjacent to the search request phrase entered by the user, computing their frequency of occurrence, regarding the high-frequency phrases as related phrases highly related to the search request phrase entered by the user, and displaying, from among the documents output by the document search means, documents containing a concatenation of the search request phrase entered by the user and one of these related phrases, grouped by related phrase.
予め蓄積した検索対象文書の集合から所望の語句を含む文書を検索し提示する文書検索プログラムであって、
該プログラムはコンピュータに、
利用者によって入力された語句を検索要求語句として受け取る検索要求入力工程と、
検索対象文書の集合から前記利用者によって入力された検索要求語句を含む文書を検索し出力する文書検索工程と、
過去に入力され、使用された検索要求語句の集合である検索要求語句の履歴中から前記利用者によって入力された検索要求語句に隣接している語句を抽出し、その出現頻度を算出し、該頻度が高い語句を前記利用者によって入力された検索要求語句と関連度の高い関連語句とみなし、前記検索され出力された文書のうち、前記利用者によって入力された検索要求語句とこれらの関連語句との連接語句を含む文書を、各関連語句毎にグループ化して表示する検索結果表示工程とを実行させる
ことを特徴とする文書検索プログラム。
A document search program that retrieves and presents documents containing a desired phrase from a set of search target documents stored in advance,
the program causing a computer to execute:
a search request input step of receiving a phrase entered by a user as a search request phrase;
a document search step of retrieving and outputting, from the set of search target documents, documents containing the search request phrase entered by the user; and
a search result display step of extracting, from a history of search request phrases that is the set of search request phrases entered and used in the past, phrases adjacent to the search request phrase entered by the user, computing their frequency of occurrence, regarding the high-frequency phrases as related phrases highly related to the search request phrase entered by the user, and displaying, from among the retrieved and output documents, documents containing a concatenation of the search request phrase entered by the user and one of these related phrases, grouped by related phrase.
請求項2記載の文書検索プログラムを記録したことを特徴とするコンピュータ読み取り可能な媒体。
A computer-readable medium on which the document search program according to claim 2 is recorded.
JP2002004805A 2002-01-11 2002-01-11 Document search device, document search program, and medium storing document search program Expired - Fee Related JP4009937B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2002004805A JP4009937B2 (en) 2002-01-11 2002-01-11 Document search device, document search program, and medium storing document search program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2002004805A JP4009937B2 (en) 2002-01-11 2002-01-11 Document search device, document search program, and medium storing document search program

Publications (2)

Publication Number Publication Date
JP2003208447A (en) 2003-07-25
JP4009937B2 (en) 2007-11-21

Family

ID=27644029

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2002004805A Expired - Fee Related JP4009937B2 (en) 2002-01-11 2002-01-11 Document search device, document search program, and medium storing document search program

Country Status (1)

Country Link
JP (1) JP4009937B2 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006040058A (en) * 2004-07-28 2006-02-09 Mitsubishi Electric Corp Document classification device
JP2006236221A (en) * 2005-02-28 2006-09-07 Kazuhiko Mori Management server for web page retrieval
JP2007011973A (en) * 2005-07-04 2007-01-18 Sharp Corp Information retrieval device and information retrieval program
JP4857448B2 (en) * 2006-03-10 2012-01-18 独立行政法人情報通信研究機構 Information retrieval apparatus and program using multiple meanings
JP4547500B2 (en) * 2006-07-21 2010-09-22 国立大学法人群馬大学 Search device and program
JP5252410B2 (en) * 2007-03-05 2013-07-31 公立大学法人広島市立大学 Technical term classification device, technical term classification method, and program
JP5130954B2 (en) * 2008-02-29 2013-01-30 日本電気株式会社 Electronic device, hieroglyph search method, program, and recording medium
US20130212089A1 (en) * 2012-02-10 2013-08-15 Google Inc. Search Result Categorization

Also Published As

Publication number Publication date
JP2003208447A (en) 2003-07-25

Similar Documents

Publication Publication Date Title
US10452718B1 (en) Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems
US7783644B1 (en) Query-independent entity importance in books
US7676745B2 (en) Document segmentation based on visual gaps
US9846744B2 (en) Media discovery and playlist generation
US7716216B1 (en) Document ranking based on semantic distance between terms in a document
US9323827B2 (en) Identifying key terms related to similar passages
KR101078864B1 (en) The query/document topic category transition analysis system and method and the query expansion based information retrieval system and method
US7392244B1 (en) Methods and apparatus for determining equivalent descriptions for an information need
US20080086453A1 (en) Method and apparatus for correlating the results of a computer network text search with relevant multimedia files
US20070073683A1 (en) System and method for question answering document retrieval
US20040098385A1 (en) Method for indentifying term importance to sample text using reference text
KR20070039072A (en) Results based personalization of advertisements in a search engine
CN103678576A (en) Full-text retrieval system based on dynamic semantic analysis
KR100356105B1 (en) Method and system for document classification and search using document auto-summary system
JP2005122295A (en) Relationship figure creation program, relationship figure creation method, and relationship figure generation device
JPH03172966A (en) Similar document retrieving device
JP4009937B2 (en) Document search device, document search program, and medium storing document search program
JP3921837B2 (en) Information discrimination support device, recording medium storing information discrimination support program, and information discrimination support method
KR101140724B1 (en) Method and system of configuring user profile based on a concept network and personalized query expansion system using the same
Iacobelli et al. Finding new information via robust entity detection
JP2006139484A (en) Information retrieval method, system therefor and computer program
WO2009123594A1 (en) Correlating the results of a computer network text search with relevant multimedia files
JP2004206571A (en) Method, device, and program for presenting document information, and recording medium
JP2002183195A (en) Concept retrieving system
JP4384736B2 (en) Image search device and computer-readable recording medium storing program for causing computer to function as each means of the device

Legal Events

Date Code Title Description
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20061221

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20070123

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20070320

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20070515

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20070703

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20070821

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20070823

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100914

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100914

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110914

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120914

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130914

Year of fee payment: 6

S531 Written request for registration of change of domicile

Free format text: JAPANESE INTERMEDIATE CODE: R313531

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350

LAPS Cancellation because of no payment of annual fees