JP2009116422A

JP2009116422A - Query extraction method, query extractor, and query extraction program

Info

Publication number: JP2009116422A
Application number: JP2007285707A
Authority: JP
Inventors: Motohiro Koma; 基裕小間; Hironobu Inoue; 洋信井上; Kengo Ebihara; 健吾海老原; Tatsuhiro Niwa; 達洋丹羽
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2007-11-02
Filing date: 2007-11-02
Publication date: 2009-05-28
Anticipated expiration: 2027-11-02
Also published as: JP4839295B2

Abstract

PROBLEM TO BE SOLVED: To provide a query extraction method, a query extraction device, and a query extraction program capable of extracting, with sufficient accuracy, information on the correspondence between a navigational query and its corresponding site.

SOLUTION: The query extraction method includes: a step of aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query; a step of calculating, on the basis of the click logs, a first variance value based on the number of clicks per site for each query; and a step of extracting each query whose calculated first variance value is equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

COPYRIGHT: (C)2009, JPO & INPIT

Description

The present invention relates to search technology for Internet portal sites and the like.

Because an enormous amount of information exists on the Internet, portal sites and the like provide various search services as tools for efficiently finding desired information. A user enters a query (keyword, term) to perform a search and browses a site by selecting it from the list of search results.

Search services are often used not only to find a page (article) containing desired information but also to navigate to a specific site. For example, a user may enter a company name, site name, university name, or personal name to find the site operated by that entity. A query entered in such a case is called a navigational query.

Web browsers also have a bookmark function with which the user registers the URL of a desired web page in advance. A bookmark is generally registered by performing a bookmark menu operation while the desired web page is displayed in the web browser.

However, because users browse multiple sites one after another, registering a bookmark after moving to another web page requires displaying the desired web page again before performing the registration operation, which is cumbersome. In particular, because the web page to be bookmarked is often the top page of a site, tracing back through the user's browsing history also takes time.

For this reason, a search site is sometimes used in place of bookmarks. That is, the user chooses and enters a query for which the desired site will appear in the search results, and then browses the desired site from those results. A navigational query can therefore also be described as a query entered in order to browse a site that the user has not bookmarked.

It is effective for improving a search service if the portal site providing the service can extract such navigational queries together with information on the correspondence between each navigational query and its site. For example, when a search is performed with a navigational query, the corresponding site can be placed at the top of the search results, or the user can be guided to that site.
Patent Document 1: JP 2001-312518 A

As described above, it is useful to be able to extract information on the correspondence between a navigational query and its site, but no effective method for doing so has existed. One conceivable way to extract navigational queries is to statically analyze a site's URL (Uniform Resource Locator), title, and so on; however, the site corresponding to a navigational query may change over time, and such static analysis cannot adequately handle this. For example, suppose a celebrity has an official site and a blog site. Normally, many users search with the celebrity's name and access the official site, but news or other current events may lead users to access the blog site instead. Thus, navigational queries can vary over time.

Meanwhile, Patent Document 1 discloses a search system that performs searches using a database in which one URL is stored for each predetermined keyword, and that counts the number of accesses to links from the search results so that demand for each home page can be measured. However, that system does not take into account the properties of navigational queries described above and cannot be used as a method for extracting them.

The present invention has been proposed in view of the above problems, and its object is to provide a query extraction method, a query extraction device, and a query extraction program capable of extracting, with sufficient accuracy, information on the correspondence between a navigational query and its site.

To solve the above problems, the present invention, as set forth in claim 1, provides a query extraction method comprising: a step of aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query; a step of calculating, based on the click logs, a first variance value based on the number of clicks per site for each query; and a step of extracting each query whose calculated first variance value is equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

As set forth in claim 2, the invention can also be configured as a query extraction method comprising: a step of aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query; a step of calculating, based on the click logs, a first variance value based on the number of clicks per site for each query; a step of calculating, based on the click logs, a second variance value based on the click rate per site for each query; and a step of extracting each query whose calculated first and second variance values are each equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

As set forth in claim 3, in the query extraction method according to claim 1 or 2, the step of calculating the first variance value may calculate the first variance value by subtracting the sum of the squared deviations of the other click counts from the squared deviation of the maximum click count.

As set forth in claim 4, in the query extraction method according to claim 3, a button for changing the search list may be displayed on the screen that displays the search list for the query, and the step of calculating the first variance value may further subtract the squared deviation of the number of clicks on the displayed button.

As set forth in claim 5, in the query extraction method according to claim 2, the step of calculating the second variance value may calculate the second variance value by subtracting the sum of the squared deviations of the other click rates from the squared deviation of the maximum click rate.

As set forth in claim 6, in the query extraction method according to claim 5, the search list for the query and a button for changing the displayed search list may be displayed on a single screen, and the step of calculating the second variance value may further subtract the squared deviation of the click rate of the displayed button.

As set forth in claim 7, in the query extraction method according to any one of claims 1 to 6, the step of aggregating the click logs may aggregate the click logs for each user, and the step of calculating the variance value may, based on the per-user click logs, use for the calculation the click logs corresponding to queries that the user has entered for the second time or later.
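One possible reading of the per-user filtering in claim 7 can be sketched as follows. The event layout `(user_id, query, clicked_url)`, the function name, and the assumption that events arrive in chronological order are illustrative only and are not taken from the specification.

```python
from collections import defaultdict


def repeated_query_clicks(events):
    """Keep only clicks for queries a user has entered for the second time or later.

    `events` is an iterable of (user_id, query, clicked_url) tuples in
    chronological order (a hypothetical layout, one tuple per search).
    """
    times_entered = defaultdict(int)  # (user_id, query) -> number of entries so far
    kept = []
    for user_id, query, clicked_url in events:
        times_entered[(user_id, query)] += 1
        if times_entered[(user_id, query)] >= 2:
            kept.append((user_id, query, clicked_url))
    return kept
```

The idea behind such a filter is that a query a user repeats is more likely to be used like a bookmark, so its clicks are the ones fed into the variance calculation.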

As set forth in claims 8 and 9, the invention can also be configured as a query extraction device.

As set forth in claims 10 and 11, the invention can also be configured as a query extraction program.

The query extraction method, query extraction device, and query extraction program of the present invention calculate, from the click logs of a search service, a variance value based on the number of clicks per site for each query, and extract each query whose variance value is equal to or greater than a predetermined value together with information on the site having the highest number of clicks in the search list for that query. Information on the correspondence between a navigational query and its site can therefore be extracted with sufficient accuracy.

Preferred embodiments of the present invention are described below.

<System configuration>
FIG. 1 is a diagram showing a configuration example of a system according to an embodiment of the present invention.

In FIG. 1, a plurality of client terminals 1 used by users, a plurality of Web servers 4 that provide the content to be searched and browsed, and a search server 3 to which the present invention is applied are connected to a network 2 such as the Internet.

The client terminal 1 includes a query input unit 101 that accepts a query from the user, a search execution instruction reception unit 102 that receives a search execution instruction from the user, and a query transmission unit 103 that transmits the query to the search server 3 each time a query is entered at the query input unit 101 and also transmits the query to the search server 3 when the search execution instruction reception unit 102 receives a search execution instruction.

The client terminal 1 also includes a recommended site receiving unit 104 that receives a recommended site (the summary and URL of a site specifically recommended for a navigational query) from the search server 3, a search list receiving unit 105 that receives a search list (a list of summaries and URLs of the sites hit by the search) from the search server 3, and a display control unit 106 that controls the screen display based on the recommended site received by the recommended site receiving unit 104 and the search list received by the search list receiving unit 105. So that the search server 3 can capture the history of the user's site selections from the search results as click logs, each URL included in the recommended site and the search list is not a URL pointing directly at the site but a redirect URL that first accesses the search server 3 and is then redirected to the target site. In addition to the URL that serves as the entry point to the search server 3, the redirect URL contains information identifying the query used for the search and the selected site.
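A minimal sketch of such a redirect URL, assuming the query and the selected site are carried as URL query parameters. The entry URL `https://search.example.com/r` and the parameter names `q` and `u` are hypothetical; the specification states only that the redirect URL contains the entry URL plus information identifying the query and the selected site.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical entry point on the search server; not specified in the source.
REDIRECT_ENTRY = "https://search.example.com/r"


def build_redirect_url(query: str, site_url: str) -> str:
    """Embed the query and the target site URL in a redirect URL."""
    return REDIRECT_ENTRY + "?" + urlencode({"q": query, "u": site_url})


def parse_redirect_url(redirect_url: str):
    """Recover (query, site_url) from a redirect URL built above."""
    params = parse_qs(urlsplit(redirect_url).query)
    return params["q"][0], params["u"][0]
```

With this layout the search server can log the (query, site) pair at the moment it issues the redirect response, without any extra request from the client.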

The client terminal 1 further includes a site selection receiving unit 107 that accepts the user's site selection from the recommended site or search list displayed by the display control unit 106, and a redirect processing unit 108 that accesses the search server 3 according to the URL (redirect URL) of the site selected at the site selection receiving unit 107, receives from the search server 3 a redirect response instructing it to switch access to the URL of the target site, and then accesses the Web server 4 hosting the target site.

The functional units 101 to 108 of the client terminal 1 are implemented by computer software (programs) executed on computer hardware.

Meanwhile, the search server 3 includes a query receiving unit 301 that receives a query from the client terminal 1 via the network 2 (either a query that has merely been entered by the user or a query accompanied by a search execution instruction), and a search processing unit 302 that acts on the received query as follows: for a query that has merely been entered, it searches the navigational query DB 312 and, if a matching navigational query is found, outputs the corresponding site as a recommended site; for a query accompanied by a search execution instruction, it searches the content DB 311 and outputs a search list.

The search server 3 also includes a recommended site transmission unit 303 that transmits the recommended site output by the search processing unit 302 to the client terminal 1 via the network 2, and a search list transmission unit 304 that transmits the search list output by the search processing unit 302 to the client terminal 1 via the network 2.

The search server 3 also includes a redirect processing unit 305 that accepts access by redirect URL from the client terminal 1 via the network 2 and returns a redirect response, and a click log collection unit 306 that, when the redirect processing unit 305 performs this processing, identifies the query and the selected site from the redirect URL and records a click log in the query log DB 313.

The search server 3 further includes a first score calculation unit 307 that, at a predetermined timing and based on the query logs in the query log DB 313, calculates for each query a first score (first variance value), an index of how likely the query is to be a navigational query judged by click counts, and registers it in the scoring DB 314; a second score calculation unit 308 that calculates for each query a second score (second variance value), an index of how likely the query is to be a navigational query judged by click rates, and registers it in the scoring DB 314; and a navigational query extraction unit 309 that extracts navigational queries based on the scoring results in the scoring DB 314 and registers them in the navigational query DB 312.

The functional units 301 to 309 of the search server 3 are implemented by computer software (programs) executed on computer hardware.

FIG. 2 is a diagram showing an example of the data structure of each database provided in the search server 3.

FIG. 2(a) shows an example of the data structure of the content DB 311, which includes a "query" field and a "site URL" field corresponding to that query. The "site URL" field may contain a plurality of URLs.

FIG. 2(b) shows an example of the data structure of the navigational query DB 312, which includes a "navigational query" field and a "site URL" field corresponding to that navigational query. In principle, exactly one site URL corresponds to each navigational query.

FIG. 2(c) shows an example of the data structure of the query log DB 313. It includes a "query" field; fields that follow the display order of the search list for that query, namely a "1st URL and click count" field, a "1st URL click rate" field, ..., an "N-th URL and click count" field, and an "N-th URL click rate" field; a "total click count" field; an "average click count" field; an "average click rate" field; a "'next / re-search' click count" field; and a "'next / re-search' click rate" field. Here, the click count of each of the 1st to N-th URLs is the cumulative number of times, within a predetermined period, that users selected the site corresponding to the URL at that rank from the search list for the query. The total click count is the sum of the click counts of the 1st to N-th URLs for the query. The average click count is the mean of the click counts of the 1st to N-th URLs for the query. The "next / re-search" click count is the number of times the user, without selecting any site from the first N search results displayed, switched to the next search list page or performed a new search. Each "rate" is a ratio whose denominator is the sum of the total click counts of all queries being aggregated.
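The query log record and its derived values can be sketched as below. The class name `QueryLogRecord`, its attribute names, and the helper `click_rates` are illustrative assumptions; only the field meanings (per-rank click counts, total, average, and rates taken over the total clicks of all aggregated queries) come from the description of FIG. 2(c).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueryLogRecord:
    """One row of the query log: per-rank URLs and click counts for a query."""
    query: str
    urls: List[str]                   # 1st to N-th URLs in display order
    clicks: List[int]                 # click count for each of those URLs
    next_or_research_clicks: int = 0  # "next / re-search" click count

    @property
    def total_clicks(self) -> int:
        # Sum of the click counts of the 1st to N-th URLs.
        return sum(self.clicks)

    @property
    def average_clicks(self) -> float:
        # Mean of the click counts of the 1st to N-th URLs.
        return self.total_clicks / len(self.clicks)


def click_rates(record: QueryLogRecord, all_records: List[QueryLogRecord]) -> List[float]:
    """Per-URL click rates; the denominator is the total clicks over all queries."""
    grand_total = sum(r.total_clicks for r in all_records)
    return [c / grand_total for c in record.clicks]
```

Computing the derived values on demand, rather than storing them, corresponds to the option of leaving them out of the table and calculating them when the scores are calculated.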

The 1st to N-th URLs in search list display order may be managed in a separate table. Also, the click rates of the 1st to N-th URLs, the total click count, the average click count, the average click rate, the "next / re-search" click rate, and so on may be omitted from the table and calculated when the scores described later are calculated.

FIG. 2(d) shows an example of the data structure of the scoring DB 314, which includes a "query" field, a "first score" field and a "second score" field for that query, and a corresponding "maximum click count (rate) URL" field. The maximum click count (rate) URL is the URL of the site identified as having the maximum click count when the first and second scores were calculated. The maximum click count (rate) URL may be managed in a separate table.

<Operation>
FIG. 3 is a sequence diagram showing a processing example of the embodiment described above.

In FIG. 3, when the user enters a query into the client terminal 1 through the query input unit 101 (step S101), the query transmission unit 103 of the client terminal 1 transmits a query without a search execution instruction to the search server 3 (step S102).

When the query receiving unit 301 receives a query without a search execution instruction, the search server 3 searches the navigational query DB 312 with the search processing unit 302 (step S103). If the entered query exists in the navigational query DB 312, the search processing unit 302 outputs the site corresponding to that navigational query as a recommended site. The recommended site transmission unit 303 of the search server 3 then transmits the recommended site found by the search processing unit 302 to the client terminal 1 (step S104).

When the recommended site receiving unit 104 receives the recommended site, the client terminal 1 displays it on the screen through the display control unit 106 and presents it to the user (step S105). FIG. 4(a) shows an example of the search screen, in which the query "XX Taro" has been entered in the query input field 11 and a balloon 12 indicating the recommended site is displayed. In this case, the user can access and browse the site by clicking the OK button 13.

Returning to FIG. 3, when the user then issues a search execution instruction and it is received by the search execution instruction reception unit 102 (step S106), the query transmission unit 103 of the client terminal 1 transmits a query accompanied by a search execution instruction to the search server 3 (step S107).

When the query receiving unit 301 receives a query accompanied by a search execution instruction, the search server 3 searches the content DB 311 with the search processing unit 302 and outputs a search list (step S108). Here the search processing unit 302 is assumed to perform ordinary search processing against the content DB 311. Alternatively, it may search the navigational query DB 312 together with the content DB 311 and, if the entered query is a navigational query (that is, the search hits an entry in the navigational query DB 312), change the display order so that the site corresponding to the navigational query appears at the top of the search list.
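The optional reordering described above can be sketched as follows, assuming the navigational query DB is available as a plain mapping from query to site URL. The function name and the dictionary representation are illustrative assumptions, not the specification's implementation.

```python
def promote_navigational_site(query, search_list, navigational_db):
    """Return the search list with the navigational site, if any, moved to the top.

    `search_list` is a list of site URLs in ranked order; `navigational_db`
    maps a navigational query to its single site URL (hypothetical layout).
    """
    site = navigational_db.get(query)
    if site is None:
        # Not a navigational query: keep the ordinary ranking.
        return list(search_list)
    # Put the navigational site first and drop its duplicate further down.
    return [site] + [url for url in search_list if url != site]
```

In this sketch a site registered for the query is placed first even if the ordinary search did not return it; whether that is desirable is a design choice the text leaves open.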

Next, the search list transmission unit 304 of the search server 3 transmits the search list produced by the search processing unit 302 to the client terminal 1 (step S109).

When the search list receiving unit 105 receives the search list, the client terminal 1 displays it on the screen through the display control unit 106 and presents it to the user (step S110). FIG. 4(b) shows an example of the search screen, in which the query "XX Taro" has been entered in the query input field 11, the search execution instruction button 14 has been pressed, and "1. Taro's blog 2. XX Taro official site 3. XX Taro fan club ..." is displayed as the search list 15. A page selection field 16 for other pages of the search list is also displayed at the bottom of the search screen; clicking "Next" displays the next page of the search list, and clicking a page number displays the corresponding page.

Returning to FIG. 3, when the user selects a desired site from the displayed search list or recommended site and the selection is received by the site selection receiving unit 107 (step S111), the redirect processing unit 108 of the client terminal 1 accesses the search server 3 using the redirect URL embedded in the search list or recommended site (step S112). On receiving the access, the redirect processing unit 305 of the search server 3 transmits to the client terminal 1 a redirect response instructing it to switch access to the original URL of the target site (step S113). In parallel, the click log collection unit 306 of the search server 3 identifies the query and the selected site from the redirect URL and records a click log in the query log DB 313 (step S114). The click log collection unit 306 also records a click log in the query log DB 313 when the client terminal 1 requests the search server 3 to display another page of the search list or requests a new search.

The redirect processing unit 108 of the client terminal 1 then accesses the Web server 4 according to the redirect response (step S115), and the Web server 4 transmits a response containing the page content to the client terminal 1 (step S116). The client terminal 1 displays the page based on this response (step S117), and the user browses the content.

Thereafter, through batch processing or the like at a predetermined timing, the first score calculation unit 307 of the search server 3 calculates the first score based on the query logs in the query log DB 313 and registers it in the scoring DB 314 (step S121). The second score calculation unit 308 likewise calculates the second score based on the query logs in the query log DB 313 and registers it in the scoring DB 314 (step S122). Details of the score calculation are described later.

Next, the navigational query extraction unit 309 extracts navigational queries based on the scoring results in the scoring DB 314 and registers them in the navigational query DB 312 (step S123). Details of the navigational query extraction are described later.
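As a rough illustration of step S123 that follows the threshold condition stated in claim 2, the extraction can be sketched as below. The row keys (`query`, `first_score`, `second_score`, `max_click_url`) mirror the scoring DB fields of FIG. 2(d) but are hypothetical names, as are the threshold parameters; the specification gives no concrete threshold values here.

```python
def extract_navigational_queries(scoring_rows, min_first_score, min_second_score):
    """Map each qualifying query to its most-clicked site URL.

    A query qualifies when both scores reach their (assumed) thresholds.
    """
    return {
        row["query"]: row["max_click_url"]
        for row in scoring_rows
        if row["first_score"] >= min_first_score
        and row["second_score"] >= min_second_score
    }
```

The returned mapping has the shape of the navigational query DB: one site URL per navigational query.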

Because navigational queries are extracted from the queries users enter and from their site selections in the search results, and are reflected in subsequent searches, the system can respond appropriately to navigational queries that may vary over time.

FIG. 5 is a diagram showing a processing example in which the first score calculation unit 307 and the second score calculation unit 308 calculate the first score and the second score.

図５において、第１スコアおよび第２スコアの算出の処理を開始すると（ステップＳ２０１）、第１スコア算出部３０７はクエリログＤＢ３１３から１つのクエリを選択する（ステップＳ２０２）。 In FIG. 5, when the calculation process of the first score and the second score is started (step S201), the first score calculation unit 307 selects one query from the query log DB 313 (step S202).

次いで、１件目〜Ｎ件目ＵＲＬのクリック回数の中で最大のものを変数＄ＭＡＸに設定し（ステップＳ２０３）、平均クリック回数を変数＄ＡＶＥに設定し（ステップＳ２０４）、「次へ／再検索」クリック回数を変数＄ＢＡＤに設定する（ステップＳ２０５）。 Next, the largest click count among the 1st to Nth URLs is set in the variable $MAX (step S203), the average click count is set in the variable $AVE (step S204), and the click count of "Next / Re-search" is set in the variable $BAD (step S205).

そして、クリック回数が最大のものを除く１件目〜Ｎ件目ＵＲＬのクリック回数をΣの計算において毎回、変数＄ＥＡＣＨとして、次式で第１スコアを計算する（ステップＳ２０６）。式の意味するところについては後述する。 Then, the first score is calculated by the following formula, where in each term of the Σ the variable $EACH takes the click count of one of the 1st to Nth URLs other than the one with the maximum click count (step S206). The meaning of the formula will be described later.

第１スコア＝（＄ＭＡＸ−＄ＡＶＥ）^２−Σ（＄ＥＡＣＨ−＄ＡＶＥ）^２−（＄ＢＡＤ−＄ＡＶＥ）^２
First score = ($MAX − $AVE)^2 − Σ($EACH − $AVE)^2 − ($BAD − $AVE)^2

次いで、第２スコア算出部３０８は、１件目〜Ｎ件目ＵＲＬのクリック回数率の中で最大のものを変数＄ＭＡＸ_Ｒに設定し（ステップＳ２０７）、平均クリック回数率を変数＄ＡＶＥ_Ｒに設定し（ステップＳ２０８）、「次へ／再検索」クリック回数率を変数＄ＢＡＤ_Ｒに設定する（ステップＳ２０９）。 Next, the second score calculation unit 308 sets the largest click rate among the 1st to Nth URLs in the variable $MAX_R (step S207), sets the average click rate in the variable $AVE_R (step S208), and sets the click rate of "Next / Re-search" in the variable $BAD_R (step S209).

そして、クリック回数率が最大のものを除く１件目〜Ｎ件目ＵＲＬのクリック回数率をΣの計算において毎回、変数＄ＥＡＣＨ_Ｒとして、次式で第２スコアを計算する（ステップＳ２１０）。式の意味するところについては後述する。 Then, the second score is calculated by the following formula, where in each term of the Σ the variable $EACH_R takes the click rate of one of the 1st to Nth URLs other than the one with the maximum click rate (step S210). The meaning of the formula will be described later.

第２スコア＝（＄ＭＡＸ_Ｒ−＄ＡＶＥ_Ｒ）^２−Σ（＄ＥＡＣＨ_Ｒ−＄ＡＶＥ_Ｒ）^２−（＄ＢＡＤ_Ｒ−＄ＡＶＥ_Ｒ）^２
Second score = ($MAX_R − $AVE_R)^2 − Σ($EACH_R − $AVE_R)^2 − ($BAD_R − $AVE_R)^2

次いで、処理対象のクエリと算出された第１スコア、第２スコアとクリック回数（率）が最大のＵＲＬを、スコアリングＤＢ３１４のクエリ、第１スコア、第２スコア、最大クリック回数（率）ＵＲＬに登録する（ステップＳ２１１）。 Next, the query being processed, the calculated first and second scores, and the URL with the maximum click count (rate) are registered in the query, first score, second score, and maximum-click-count (rate) URL fields of the scoring DB 314 (step S211).

次いで、対象となるクエリにつき処理済であるか否か判断し（ステップＳ２１２）、処理済でない場合（ステップＳ２１２のＮｏ）は次のクエリの選択（ステップＳ２０２）に戻り、処理済である場合（ステップＳ２１２のＹｅｓ）は第１スコアおよび第２スコアの算出の処理を終了する（ステップＳ２１３）。 Next, it is determined whether all target queries have been processed (step S212). If not (No in step S212), the process returns to the selection of the next query (step S202); if so (Yes in step S212), the calculation of the first score and the second score ends (step S213).
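As an illustration only, the calculation of FIG. 5 (steps S203 to S211) can be sketched in Python as follows. The function names, the dictionary layout, and the `searches` argument are assumptions made for this sketch and do not appear in the embodiment. In particular, the click rate is assumed here to be the click count divided by the number of times the query was searched, and $AVE is assumed to be the mean over the 1st to Nth URLs.

```python
def concentration_score(values, bad):
    """(max - ave)^2 - sum of (each - ave)^2 over the others - (bad - ave)^2."""
    ave = sum(values) / len(values)          # $AVE (assumed: mean over the N URLs)
    top = max(values)                        # $MAX
    others = list(values)
    others.remove(top)                       # drop exactly one occurrence of the maximum
    return ((top - ave) ** 2
            - sum((v - ave) ** 2 for v in others)   # sum over $EACH
            - (bad - ave) ** 2)                      # $BAD term


def score_query(url_clicks, bad_clicks, searches):
    """url_clicks: hypothetical dict mapping each of the 1st..Nth URLs to its click count.
    bad_clicks: clicks on "Next / Re-search". searches: times the query was searched
    (assumed denominator of the click rate)."""
    counts = list(url_clicks.values())
    rates = [c / searches for c in counts]
    return {
        "first_score": concentration_score(counts, bad_clicks),
        "second_score": concentration_score(rates, bad_clicks / searches),
        "top_url": max(url_clicks, key=url_clicks.get),  # URL with the maximum click count
    }


row = score_query({"a": 90, "b": 5, "c": 5}, bad_clicks=0, searches=100)
```

Under these assumptions the second score is simply the first score divided by the square of `searches`, which is why it describes the shape of the click distribution rather than its volume.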

図６はナビゲーショナルクエリ抽出部３０９によるナビゲーショナルクエリ抽出の処理例を示すフローチャートである。 FIG. 6 is a flowchart showing a processing example of navigational query extraction by the navigational query extraction unit 309.

図６において、ナビゲーショナルクエリ抽出の処理を開始すると（ステップＳ３０１）、ナビゲーショナルクエリ抽出部３０９はスコアリングＤＢ３１４から１つのクエリを選択する（ステップＳ３０２）。 In FIG. 6, when the navigational query extraction process is started (step S301), the navigational query extraction unit 309 selects one query from the scoring DB 314 (step S302).

次いで、第１スコアが所定値以上であるか否か判断し（ステップＳ３０３）、所定値以上である場合（ステップＳ３０３のＹｅｓ）、続いて第２スコアが所定値以上であるか否か判断する（ステップＳ３０４）。 Next, it is determined whether or not the first score is greater than or equal to a predetermined value (step S303). If it is greater than or equal to the predetermined value (Yes in step S303), it is subsequently determined whether or not the second score is greater than or equal to the predetermined value. (Step S304).

第２スコアが所定値以上である場合（ステップＳ３０４のＹｅｓ）、処理対象のクエリと最大クリック回数（率）のＵＲＬを、ナビゲーショナルクエリＤＢ３１２のナビゲーショナルクエリとサイトＵＲＬに登録する（ステップＳ３０５）。 If the second score is greater than or equal to the predetermined value (Yes in step S304), the query being processed and the URL with the maximum click count (rate) are registered as the navigational query and the site URL in the navigational query DB 312 (step S305).

第１スコアが所定値以上でない場合（ステップＳ３０３のＮｏ）、第２スコアが所定値以上でない場合（ステップＳ３０４のＮｏ）、もしくは、ナビゲーショナルクエリＤＢ３１２への登録（ステップＳ３０５）の後、対象となるクエリにつき処理済であるか否か判断し（ステップＳ３０６）、処理済でない場合（ステップＳ３０６のＮｏ）は次のクエリの選択（ステップＳ３０２）に戻り、処理済である場合（ステップＳ３０６のＹｅｓ）はナビゲーショナルクエリ抽出の処理を終了する（ステップＳ３０７）。 If the first score is not equal to or higher than the predetermined value (No in step S303), if the second score is not equal to or higher than the predetermined value (No in step S304), or after registration in the navigational query DB 312 (step S305), It is determined whether or not each query has been processed (step S306). If not processed (No in step S306), the process returns to the selection of the next query (step S302), and if processed (Yes in step S306). ) Ends the navigational query extraction process (step S307).
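The flow of FIG. 6 amounts to a two-stage threshold filter over the scoring DB. A minimal Python sketch follows; the row layout, the function name, and the threshold arguments are hypothetical, since the embodiment only speaks of "predetermined values".

```python
def extract_navigational(scoring_rows, first_threshold, second_threshold):
    """Return a dict mapping each navigational query to its site URL.

    scoring_rows: iterable of dicts with the assumed keys
    "query", "first_score", "second_score", and "top_url".
    """
    navigational = {}
    for row in scoring_rows:                          # S302: select one query
        if row["first_score"] < first_threshold:      # S303: No
            continue
        if row["second_score"] < second_threshold:    # S304: No
            continue
        navigational[row["query"]] = row["top_url"]   # S305: register
    return navigational


rows = [
    {"query": "q1", "first_score": 500.0, "second_score": 0.05, "top_url": "u1"},
    {"query": "q2", "first_score": 500.0, "second_score": 0.001, "top_url": "u2"},
    {"query": "q3", "first_score": -40.0, "second_score": 0.2, "top_url": "u3"},
]
result = extract_navigational(rows, first_threshold=100.0, second_threshold=0.01)
```

With these illustrative thresholds only `q1` passes both checks; `q2` fails the second score test and `q3` fails the first.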

＜第１スコアおよび第２スコアの意味＞ <Meaning of the first score and the second score>
以下、第１スコアおよび第２スコアの意味について説明する。 The meaning of the first score and the second score is described below.

本発明では、検索サービスにおけるクリックログのデータを解析し、ナビゲーショナルクエリと非ナビゲーショナルクエリの自動的な選別を行なっている。この際、上記の実施形態では、分散値の計算方法を応用して、ナビゲーショナルクエリとしての確からしさを示す指標である第１スコアおよび第２スコアを算出し（第１スコアおよび第２スコアは一種の分散値でもある。）、その両者が所定の閾値を超えるものをナビゲーショナルクエリとして抽出している。なお、ナビゲーショナルクエリを抽出する精度は若干低下するが、第１スコアのみを用い、その第１スコアが所定の閾値を超えるものをナビゲーショナルクエリとして抽出するようにしてもよい。 In the present invention, click log data from a search service is analyzed to automatically separate navigational queries from non-navigational queries. In the above embodiment, the variance calculation is adapted to compute the first score and the second score, which are indices of how likely a query is to be navigational (the first and second scores are themselves a kind of variance value), and queries for which both scores exceed predetermined thresholds are extracted as navigational queries. Alternatively, at the cost of slightly lower extraction accuracy, only the first score may be used, with queries whose first score exceeds a predetermined threshold extracted as navigational queries.

ナビゲーショナルクエリは、
（１）検索一覧における一箇所のサイトが集中してクリックされる。
（２）必ず検索一覧の１ページ目に目的のサイトが含まれる。
（３）「次へ」や「再検索」はクリックされない。
という特性を有している。 A navigational query has the following characteristics:
(1) Clicks concentrate on a single site in the search list.
(2) The target site is always included in the first page of the search list.
(3) "Next" and "Re-search" are not clicked.

ここで、ナビゲーショナルクエリの場合は、上記の「検索一覧における一箇所のサイトが集中してクリックされる」という特性があるので、ナビゲーショナルクエリであるか否かの判定には、分散値の使用が適しているとも考えられる。分散値とは、「平均値との偏差２乗和」を「要素数」で割ったものである。一つの要素が他の要素と比べて突出していれば、分散値は高くなるので、ナビゲーショナルクエリである場合は分散値が高くなる。また、全ての要素が平均値に近ければ、分散値は小さくなる。 Because a navigational query has the above characteristic that "clicks concentrate on a single site in the search list", a variance value might seem suitable for determining whether a query is navigational. The variance is the sum of squared deviations from the mean divided by the number of elements. If one element stands out from the others, the variance is high, so a navigational query yields a high variance. Conversely, if all elements are close to the mean, the variance is small.

しかし、それぞれの要素がバラバラであれば、平均値との偏差が大きくなり、分散値も大きくなる。そのため、クリックされる箇所とされない箇所が数箇所に分かれる場合も分散値が高くなってしまい、ナビゲーショナルクエリであるか否かを正確に判定することができない。 However, if the elements are widely scattered, their deviations from the mean are large and the variance is also large. The variance therefore also becomes high when clicks are split between several clicked and unclicked positions, so it cannot accurately determine whether a query is navigational.

そこで、本実施形態では、分散値をそのまま適用するのではなく、最大クリック回数の偏差２乗からその他のクリック回数の偏差２乗和を差し引くようにしている。これにより、「ある一箇所のサイトが集中してクリック」されるナビゲーショナルクエリの場合、クリック回数による偏差２乗の値は大きくなり、その他のクリック回数の偏差２乗和の値は相対的に小さくなり、全体のスコア値は大きくなる。また、クリックされる箇所とされない箇所が数箇所に分かれる場合（ナビゲーショナルクエリではない場合）、その他のクリック回数の偏差２乗和が大きくなり、スコア値を引き下げる。この手法の利点としては、クリック回数にバラつきがある場合は差分が小さくなり、さらに、均等にクリックされるような場合は、差分値が負の値となることである。その結果、一箇所のサイトのみが集中してクリックされている状態を示す指標とすることができる。 Therefore, in this embodiment, the variance is not applied as is; instead, the sum of squared deviations of the other click counts is subtracted from the squared deviation of the maximum click count. For a navigational query, in which one site receives a concentration of clicks, the squared deviation of the maximum click count is large and the sum of squared deviations of the other click counts is relatively small, so the overall score is large. When clicks are split between several clicked and unclicked positions (that is, the query is not navigational), the sum of squared deviations of the other click counts is large and pulls the score down. The advantage of this method is that the difference becomes small when the click counts are scattered and becomes negative when the sites are clicked evenly. The score can therefore serve as an index of the state in which only one site receives a concentration of clicks.
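The contrast between the plain variance and the modified difference can be checked numerically. The sketch below uses Python's standard `statistics` module. The three click distributions are made-up illustrative data, and the function `max_minus_rest` is a hypothetical name that covers only the first two terms of the score (the "Next / Re-search" term is left out here to isolate the effect).

```python
from statistics import mean, pvariance


def max_minus_rest(clicks):
    """Squared deviation of the maximum minus the sum of squared deviations of the rest."""
    ave = mean(clicks)
    top = max(clicks)
    rest = list(clicks)
    rest.remove(top)  # remove one occurrence of the maximum only
    return (top - ave) ** 2 - sum((c - ave) ** 2 for c in rest)


concentrated = [96, 1, 1, 1, 1]    # one site dominates (navigational-like)
split = [48, 48, 2, 1, 1]          # clicks split across two sites
near_even = [22, 21, 20, 19, 18]   # clicks spread almost evenly

for name, clicks in [("concentrated", concentrated),
                     ("split", split),
                     ("near_even", near_even)]:
    print(name, pvariance(clicks), max_minus_rest(clicks))
```

For this data the population variance is large for both the concentrated and the split distributions, whereas the modified difference is positive only for the concentrated one and negative for the other two.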

また、上述した「必ず検索一覧の１ページ目に目的のサイトが含まれる」というナビゲーショナルクエリの特性に基づき、計算に使用するクリック回数としては、検索一覧の最初の１ページ目に表示されるＮ件（例えば、１０件）に制限することができる。 In addition, based on the above characteristic of navigational queries that "the target site is always included in the first page of the search list", the click counts used in the calculation can be limited to those of the N results (for example, 10) displayed on the first page of the search list.

図７は、横軸に、あるクエリに対する検索一覧の表示順位順のサイトをとり、縦軸に、各サイトに対する選択クリック数を示したものである。第１スコアの算出式の第１項「（＄ＭＡＸ−＄ＡＶＥ）^２」は図７では１番目のサイトに対応するものであり（常に１番目になるとは限らない）、第２項「−Σ（＄ＥＡＣＨ−＄ＡＶＥ）^２」は図７では２番目〜Ｎ番目のサイトについての総和である。 FIG. 7 plots, on the horizontal axis, the sites in display order of the search list for a given query and, on the vertical axis, the number of selection clicks for each site. The first term "($MAX − $AVE)^2" of the first score formula corresponds to the first site in FIG. 7 (although the maximum is not always the first site), and the second term "−Σ($EACH − $AVE)^2" is the sum over the 2nd to Nth sites in FIG. 7.

一方、第１スコアの算出式における第３項「−（＄ＢＡＤ−＄ＡＶＥ）^２」は、「「次へ」や「再検索」はクリックされない」というナビゲーショナルクエリの特性に基づき、「次へ／再検索」が行なわれたことによるナビゲーショナルクエリではないとのユーザの判断を反映させたものである。 On the other hand, the third term "−($BAD − $AVE)^2" of the first score formula is based on the characteristic of navigational queries that ""Next" and "Re-search" are not clicked", and reflects the user's judgment, expressed by performing "Next / Re-search", that the query is not navigational.

第２スコアは、第１スコアがクリック回数（絶対回数）に基づいて算出するのに対し、クリック回数率に基づいて算出するものである。すなわち、「検索要求が高いクエリではスコアが高くなりがち」になることから、その影響を除去するためのものである。例えば、検索回数のうちの１％がクリックされた場合を考えたとき、人気のあるサイトであるが故に検索回数が日頃から多いクエリの場合と、そうでないクエリ（検索回数が低い）の場合、クリック回数のみでスコアリングした場合では、検索回数が多いクエリの方がスコアが高くなってしまう。 The second score is calculated from click rates, whereas the first score is calculated from click counts (absolute counts). Its purpose is to remove the tendency for frequently searched queries to score highly. For example, suppose 1% of searches result in a click. Comparing a query that is searched frequently because the site is popular with a query that is searched infrequently, scoring by click count alone gives the frequently searched query the higher score.

そこで、第１スコアの算出式と同様の式において、クリック回数をクリック回数率に置き換えて第２スコアを算出することで、「クリック分散の形状」に関してスコアリングを行なう。ただし、クリック回数率に基づいて算出する場合、インプレッションが低いクエリのスコアが高くなってしまうため、単独で用いるのではなく、第１スコアが所定の閾値より大きいものにつき、更に第２スコアが所定の閾値より大きいか否かを確認するのに用いる。これにより、ナビゲーショナルクエリの判定精度を高めることができる。 Therefore, the second score is calculated with the same formula as the first score but with click counts replaced by click rates, so that scoring reflects the "shape of the click distribution". However, a score based on click rates becomes high for queries with few impressions, so the second score is not used alone; it is used to check, for queries whose first score exceeds a predetermined threshold, whether the second score also exceeds a predetermined threshold. This improves the accuracy of navigational query determination.
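The complementary roles of the two scores can be illustrated with a small Python sketch. All figures are invented for illustration, the helper names are hypothetical, and the click rate is assumed to be the click count divided by the number of searches for the query.

```python
def shape_score(values, bad):
    """Same formula for counts and rates: (max-ave)^2 - sum(rest-ave)^2 - (bad-ave)^2."""
    ave = sum(values) / len(values)
    top = max(values)
    rest = list(values)
    rest.remove(top)
    return (top - ave) ** 2 - sum((v - ave) ** 2 for v in rest) - (bad - ave) ** 2


def both_scores(clicks, bad_clicks, searches):
    first = shape_score(clicks, bad_clicks)                       # absolute click counts
    second = shape_score([c / searches for c in clicks],          # click rates (assumed definition)
                         bad_clicks / searches)
    return first, second


popular = both_scores([9000, 500, 500], 0, 10000)  # heavily searched query
rare = both_scores([90, 5, 5], 0, 100)             # same click shape, 1/100 of the volume
tiny = both_scores([2, 0, 0], 0, 2)                # very few impressions
```

In this data `popular` and `rare` have the same second score but very different first scores, which shows the volume effect that the second score removes. `tiny` has the highest second score of the three despite a negligible first score, which is the low-impression case that the first score threshold is meant to screen out.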

＜変形例＞ <Modification>
ナビゲーショナルクエリがブックマークの代わりに用いられることに対応するため、ユーザが２回目以降に入力したクエリに対応するクリックログを第１スコアおよび第２スコアの算出に用いるようにすることができる。 Because navigational queries are used in place of bookmarks, the first and second scores may be calculated using the click logs for queries that a user has entered for the second or subsequent time.

この場合の処理の流れは次のようになる。
（１）検索サーバ３は、ユーザＩＤによるログインにより、ユーザ毎の検索履歴を管理する。
（２）クリックログ収集部３０６は、クリックログをユーザＩＤ毎にクエリログＤＢ３１３に記憶する。
（３）第１スコア算出部３０７および第２スコア算出部３０８は、ユーザが始めて入力したクエリに対するクリックログはスコアの算出には用いないで、第１スコアおよび第２スコアを算出する。 The flow of processing in this case is as follows.
(1) The search server 3 manages a search history for each user, identified by login with a user ID.
(2) The click log collection unit 306 stores the click log in the query log DB 313 for each user ID.
(3) The first score calculation unit 307 and the second score calculation unit 308 calculate the first score and the second score without using the click log for a query that the user entered for the first time.

これにより、ユーザが一度閲覧したサイトに対するナビゲーショナルクエリを抽出することができる。 This makes it possible to extract navigational queries for sites that the user has already visited once.
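The filtering in step (3) of this modification might be sketched as follows. The record layout, including the `search_id` field that groups the clicks belonging to one search, is an assumption of this sketch; the embodiment only states that click logs are stored per user ID.

```python
def repeat_query_clicks(click_log):
    """Keep only clicks from a user's second or later search for the same query.

    click_log: list of dicts with the hypothetical keys "user_id", "query",
    "search_id", and "url", assumed to be in chronological order.
    """
    first_search = {}  # (user_id, query) -> search_id of that user's first search
    kept = []
    for rec in click_log:
        key = (rec["user_id"], rec["query"])
        first = first_search.setdefault(key, rec["search_id"])
        if rec["search_id"] != first:   # skip every click from the first search
            kept.append(rec)
    return kept


log = [
    {"user_id": "u1", "query": "q", "search_id": 1, "url": "A"},
    {"user_id": "u1", "query": "q", "search_id": 1, "url": "B"},
    {"user_id": "u1", "query": "q", "search_id": 2, "url": "A"},
    {"user_id": "u2", "query": "q", "search_id": 3, "url": "A"},
]
kept = repeat_query_clicks(log)
```

The records that remain would then be aggregated per query and passed to the score calculation in the usual way.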

＜総括＞ <Summary>
以上説明したように、本発明の実施形態によれば、ナビゲーショナルクエリを精度よく自動的に抽出することができる。 As described above, according to the embodiment of the present invention, navigational queries can be extracted automatically and accurately.

そして、そのナビゲーショナルクエリを検索サービスにおいて用いることにより、例えば、
（１）ナビゲーショナルクエリに対応するサイトが検索結果の２番目以降にある場合には、表示順序を変更して先頭に移動させることで、ユーザのニーズに即応した検索結果とする。
（２）クライアント端末で入力されているクエリを取得し、そのクエリに対応するサイトをユーザにサジェストして誘導することで、ユーザの操作性を向上させる。
等の有用な用途に活用することができる。 Using the extracted navigational queries in a search service enables useful applications such as the following:
(1) When the site corresponding to a navigational query appears second or lower in the search results, the display order is changed to move it to the top, so that the search results immediately meet the user's needs.
(2) The query being entered at the client terminal is acquired, and the site corresponding to that query is suggested to the user, which improves the user's operability.
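Application (1) above can be sketched in a few lines of Python. The function name, the dictionary standing in for the navigational query DB 312, and the example URLs are all hypothetical.

```python
def promote_navigational_site(query, results, navigational_db):
    """Move the site registered for a navigational query to the top of the result list.

    results: list of result URLs in their original display order.
    navigational_db: dict mapping a navigational query to its site URL
    (a stand-in for the navigational query DB 312).
    """
    site = navigational_db.get(query)
    if site is None or site not in results:
        return list(results)            # not navigational, or site absent: keep order
    return [site] + [url for url in results if url != site]


db = {"example shop": "https://shop.example.com/"}
results = ["https://a.example.com/", "https://shop.example.com/", "https://b.example.com/"]
reordered = promote_navigational_site("example shop", results, db)
```

The relative order of the remaining results is preserved, so only the registered site changes position.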

以上、本発明の好適な実施の形態により本発明を説明した。ここでは特定の具体例を示して本発明を説明したが、特許請求の範囲に定義された本発明の広範な趣旨および範囲から逸脱することなく、これら具体例に様々な修正および変更を加えることができることは明らかである。すなわち、具体例の詳細および添付の図面により本発明が限定されるものと解釈してはならない。 The present invention has been described above by way of its preferred embodiments. Although the invention has been described with reference to specific examples, it is evident that various modifications and changes may be made to these examples without departing from the broad spirit and scope of the invention as defined in the claims. In other words, the present invention should not be construed as being limited by the details of the specific examples or by the accompanying drawings.

本発明の一実施形態にかかるシステムの構成例を示す図である。 A diagram showing a configuration example of a system according to an embodiment of the present invention.
各データベースのデータ構造例を示す図である。 A diagram showing an example of the data structure of each database.
実施形態の処理例を示すシーケンス図である。 A sequence diagram showing a processing example of the embodiment.
検索時の画面表示例を示す図である。 A diagram showing an example of the screen display during a search.
第１スコアおよび第２スコアの算出の処理例を示す図である。 A diagram showing a processing example of the calculation of the first score and the second score.
ナビゲーショナルクエリ抽出の処理例を示すフローチャートである。 A flowchart showing a processing example of navigational query extraction.
第１スコアの算出式における第１項および第２項の説明図である。 An explanatory diagram of the first and second terms in the first score formula.

Explanation of symbols

１クライアント端末
１０１クエリ入力部
１０２検索実行指示受付部
１０３クエリ送信部
１０４推奨サイト受信部
１０５検索一覧受信部
１０６表示制御部
１０７サイト選択受付部
１０８リダイレクト処理部
２ネットワーク
３検索サーバ
３０１クエリ受信部
３０２検索処理部
３０３推奨サイト送信部
３０４検索一覧送信部
３０５リダイレクト処理部
３０６クリックログ収集部
３０７第１スコア算出部
３０８第２スコア算出部
３０９ナビゲーショナルクエリ抽出部
３１１コンテンツＤＢ
３１２ナビゲーショナルクエリＤＢ
３１３クエリログＤＢ
３１４スコアリングＤＢ
４Ｗｅｂサーバ

1 Client terminal
101 Query input unit
102 Search execution instruction reception unit
103 Query transmission unit
104 Recommended site reception unit
105 Search list reception unit
106 Display control unit
107 Site selection reception unit
108 Redirect processing unit
2 Network
3 Search server
301 Query reception unit
302 Search processing unit
303 Recommended site transmission unit
304 Search list transmission unit
305 Redirect processing unit
306 Click log collection unit
307 First score calculation unit
308 Second score calculation unit
309 Navigational query extraction unit
311 Content DB
312 Navigational query DB
313 Query log DB
314 Scoring DB
4 Web server

Claims

A query extraction method comprising:
a step of aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query;
a step of calculating, based on the click logs, a first variance value based on the number of clicks on each site for each query; and
a step of extracting a query for which the calculated first variance value is equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

A query extraction method comprising:
a step of aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query;
a step of calculating, based on the click logs, a first variance value based on the number of clicks on each site for each query;
a step of calculating, based on the click logs, a second variance value based on the click rate of each site for each query; and
a step of extracting a query for which the calculated first variance value and second variance value are each equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

The query extraction method according to claim 1 or 2, wherein
the step of calculating the first variance value calculates the first variance value by subtracting the sum of the squared deviations of the other click counts from the squared deviation of the maximum click count.

The query extraction method according to claim 3, wherein
a button for changing the search list is displayed on the screen that displays the search list for the query, and
the step of calculating the first variance value further subtracts the squared deviation of the click count of the displayed button when calculating the first variance value.

The query extraction method according to claim 2, wherein
the step of calculating the second variance value calculates the second variance value by subtracting the sum of the squared deviations of the other click rates from the squared deviation of the maximum click rate.

The query extraction method according to claim 5, wherein
the search list for the query and a button for changing the displayed search list are displayed on one screen, and
the step of calculating the second variance value further subtracts the squared deviation of the click rate of the displayed button when calculating the second variance value.

The query extraction method according to any one of claims 1 to 6, wherein
the step of aggregating the click logs aggregates the click logs for each user, and
the step of calculating the variance value calculates the variance value using, from the click logs for each user, the click logs corresponding to queries that the user entered for the second or subsequent time.

A query extraction device comprising:
means for aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query;
means for calculating, based on the click logs, a first variance value based on the number of clicks on each site for each query; and
means for extracting a query for which the calculated first variance value is equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

A query extraction device comprising:
means for aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query;
means for calculating, based on the click logs, a first variance value based on the number of clicks on each site for each query;
means for calculating, based on the click logs, a second variance value based on the click rate of each site for each query; and
means for extracting a query for which the calculated first variance value and second variance value are each equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

A query extraction program that causes a computer constituting a processing device to function as:
means for aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query;
means for calculating, based on the click logs, a first variance value based on the number of clicks on each site for each query; and
means for extracting a query for which the calculated first variance value is equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.

A query extraction program that causes a computer constituting a processing device to function as:
means for aggregating, for each query, click logs indicating the history of individual selections of a plurality of sites from the search list for the query;
means for calculating, based on the click logs, a first variance value based on the number of clicks on each site for each query;
means for calculating, based on the click logs, a second variance value based on the click rate of each site for each query; and
means for extracting a query for which the calculated first variance value and second variance value are each equal to or greater than a predetermined value, together with information on the site having the highest number of clicks in the search list for that query.