JP2003248696A

JP2003248696A - Page rating/filtering method, device, and program, and computer readable recording medium recording the program

Info

Publication number: JP2003248696A
Application number: JP2002047024A
Authority: JP
Inventors: Nobuyuki Omori; 信行大森; Masayuki Sugizaki; 正之杉崎; Hiroshi Takeno; 浩竹野; Hiroto Inagaki; 博人稲垣
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2002-02-22
Filing date: 2002-02-22
Publication date: 2003-09-05
Anticipated expiration: 2022-02-22
Also published as: JP4021681B2

Abstract

PROBLEM TO BE SOLVED: To provide a page rating/filtering method, device, and program capable of efficiently and accurately rating and filtering a target page using link path information, and to provide a computer-readable recording medium on which the program is recorded.

SOLUTION: The page rating/filtering device stores, in a DB unit 101, hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard. A path search unit 102 searches the link path information stored in the DB unit 101 from the target page, a page rating unit 103 rates whether the target page satisfies a predetermined criterion with respect to the link path information stored in the database, and the target page is filtered based on the rating result.

COPYRIGHT: (C)2003,JPO

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a page rating/filtering method and device that rate and filter documents using link information between documents. More specifically, it relates to a page rating/filtering method and device that rate how closely a target page is related to a preset reference standard and filter the target page based on the rating result, and to a page rating/filtering program and a computer-readable recording medium on which the program is recorded.

[0002]

[Prior Art] When a user wants to obtain only content belonging to a certain category, or wants to avoid obtaining content belonging to a certain category, techniques called content filtering and rating have been used. Rating content means grading information according to fixed criteria. Filtering means selectively receiving information according to criteria set by the recipient.

[0003] To do this, rating computes a score for content according to fixed criteria and grades the content based on that score, while filtering decides, based on that score, whether the content is information the user wants or does not want. Examples of applying filtering to Web browsing on the Internet include a parent restricting the Web pages a child can view, and a company restricting employees from viewing Web pages unrelated to their work.

[0004] In general, however, filtering is often performed based on the result of rating, so the terms rating and filtering are used with similar meanings. That is, content is graded, for example as "not for viewers under 18" (rating), and based on that grade the content is judged to be unwanted by the user (filtering).

[0005] Conventional methods of scoring content for rating and filtering are (1) the word-based method and (2) the content designation method (URL designation method).

[0006] The word-based method scores content according to whether, or to what extent, it contains designated words.
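The word-based scoring described above can be sketched roughly as follows. This is only an illustration: the function name `word_score`, the tokenization by `\w+`, and the use of a raw occurrence count as the score are all assumptions, since the text does not specify how the score is computed.

```python
import re

def word_score(text, designated_words):
    """Count occurrences of designated words in the text (assumed scoring rule)."""
    tokens = re.findall(r"\w+", text.lower())
    wanted = {word.lower() for word in designated_words}
    return sum(1 for token in tokens if token in wanted)

# Hypothetical page text and word list, for illustration only.
page_text = "Cheap casino bonus! Visit our casino today."
score = word_score(page_text, ["casino", "poker"])
```

A page that uses a synonym not on the list scores zero, which is the kind of weakness the problems listed later in this section describe.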

[0007] The content designation method designates, as the criterion for rating and filtering, the content that falls within a certain category. For example, to restrict the content a user may view, the content the user is allowed to view is designated in advance as viewable content, and when the user tries to view some content, it is shown only if it has been designated as viewable. For a Web page on the Internet, for example, the page http://www.xxx.yyy/ is designated in advance as not to be viewed, so that the page is not displayed even if the user tries to view it.
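A minimal sketch of the URL designation method, assuming the designated pages are held as a simple set of exact URL strings. The function name `may_display` and the exact-match comparison are assumptions made for illustration.

```python
def may_display(url, blocked_urls):
    """Return False when the URL was designated in advance as not viewable."""
    return url not in blocked_urls

# The designated URL is the one given in the text.
blocked = {"http://www.xxx.yyy/"}
```

With exact matching, any URL that was not designated ahead of time passes the check, for instance a hypothetical page added to the same site later.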

[0008]

[Problems to Be Solved by the Invention] Of the conventional methods described above, the word-based method has the following problems.

[0009] (1) The user must designate words in advance.

[0010] (2) It cannot be applied to images or music data.

[0011] (3) It is applicable only to content that contains words, and the designated words must be continually updated, that is, new words must be continually registered.

[0012] (4) On the Internet Web and similar media, previously unused words (new or unknown words) come into use very frequently, and the method cannot keep up with them. For example, there are many cases on the Internet where a word such as the Japanese "kireru", conventionally used only to mean "to cut an object", begins to be used to mean "to suddenly become angry". Handling this requires continual updating, and the more frequent the updates the better.

[0013] (5) It cannot cope when a different word "A′" with the same meaning is used in place of a previously designated word "A".

[0014] The content designation method has the following problems.

[0015] (1) It cannot handle new content.

[0016] (2) Because it handles only the designated content, content that did not exist when the designation was made and appeared afterward can hardly be designated.

[0017] The present invention has been made in view of the above. Its object is to provide a page rating/filtering method and device that can rate and filter a target page efficiently and accurately using link path information, together with a page rating/filtering program and a computer-readable recording medium on which the program is recorded.

[0018]

[Means for Solving the Problems] To achieve the above object, the invention according to claim 1 is a page rating/filtering method that rates how closely a target page is related to a preset reference standard and filters the target page based on the rating result, wherein hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard, is stored in a database; the link path information stored in the database is searched from the target page; the target page is rated as to whether it satisfies a predetermined criterion with respect to the link path information stored in the database; and the target page is filtered based on the rating result.

[0019] In the invention according to claim 1, hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard, is stored in a database; the link path information stored in the database is searched from the target page; the target page is rated as to whether it satisfies a predetermined criterion with respect to that link path information; and the target page is filtered based on the rating result. Consequently, merely by designating the group of URLs of the reference pages, the degree of relevance between the designated URL group and the target page can be determined, and the target page can be filtered efficiently and accurately.

[0020] The invention according to claim 2 is the invention according to claim 1, wherein the rating is performed based on the degree of relevance of the target page to each URL constituting the link path information stored in the database.

[0021] The invention according to claim 3 is the invention according to claim 2, wherein the degree of relevance is determined by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page, and the shorter the distance along that path, the greater the degree of relevance.

[0022] The invention according to claim 4 is the invention according to claim 2, wherein the degree of relevance is determined by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page and, if such a path exists, searching for all reachable paths, and the more paths there are, the greater the degree of relevance.

[0023] The invention according to claim 5 is a page rating/filtering device that rates how closely a target page is related to a preset reference standard and filters the target page based on the rating result, comprising: a database that stores hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard; a path search unit that searches the link path information stored in the database from the target page; rating means that rates whether the target page satisfies a predetermined criterion with respect to the link path information stored in the database; and filtering means that filters the target page based on the rating result.

[0024] In the invention according to claim 5, hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard, is stored in a database; the link path information stored in the database is searched from the target page; the target page is rated as to whether it satisfies a predetermined criterion with respect to that link path information; and the target page is filtered based on the rating result. Consequently, merely by designating the group of URLs of the reference pages, the degree of relevance between the designated URL group and the target page can be determined, and the target page can be filtered efficiently and accurately.

[0025] The invention according to claim 6 is the invention according to claim 5, wherein the rating means comprises relevance calculation means that calculates the degree of relevance of the target page to each URL constituting the link path information stored in the database, and means that performs the rating based on the calculated degree of relevance.

[0026] The invention according to claim 7 is the invention according to claim 6, wherein the degree of relevance is determined by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page, and the shorter the distance along that path, the greater the degree of relevance.

[0027] The invention according to claim 8 is the invention according to claim 6, wherein the degree of relevance is determined by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page and, if such a path exists, searching for all reachable paths, and the more paths there are, the greater the degree of relevance.

[0028] The invention according to claim 9 is a page rating/filtering program that rates how closely a target page is related to a preset reference standard and filters the target page based on the rating result, wherein hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard, is stored in a database; the link path information stored in the database is searched from the target page; the target page is rated as to whether it satisfies a predetermined criterion with respect to the link path information stored in the database; and the target page is filtered based on the rating result.

[0029] In the invention according to claim 9, hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard, is stored in a database; the link path information stored in the database is searched from the target page; the target page is rated as to whether it satisfies a predetermined criterion with respect to that link path information; and the target page is filtered based on the rating result. Consequently, merely by designating the group of URLs of the reference pages, the degree of relevance between the designated URL group and the target page can be determined, and the target page can be filtered efficiently and accurately.

[0030] The invention according to claim 10 is the invention according to claim 9, wherein the rating is performed based on the degree of relevance of the target page to each URL constituting the link path information stored in the database.

[0031] The invention according to claim 11 is the invention according to claim 10, wherein the degree of relevance is determined by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page, and the shorter the distance along that path, the greater the degree of relevance.

[0032] The invention according to claim 12 is the invention according to claim 10, wherein the degree of relevance is determined by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page and, if such a path exists, searching for all reachable paths, and the more paths there are, the greater the degree of relevance.

[0033] The invention according to claim 13 is a computer-readable recording medium on which is recorded a page rating/filtering program that rates how closely a target page is related to a preset reference standard and filters the target page based on the rating result, wherein the recorded program stores, in a database, hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard; searches the link path information stored in the database from the target page; rates whether the target page satisfies a predetermined criterion with respect to the link path information stored in the database; and filters the target page based on the rating result.

[0034] In the invention according to claim 13, the recording medium holds a page rating/filtering program that stores, in a database, hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard; searches the link path information stored in the database from the target page; rates whether the target page satisfies a predetermined criterion with respect to that link path information; and filters the target page based on the rating result. The program can therefore be distributed more readily by means of the recording medium.

[0035] The invention according to claim 14 is the invention according to claim 13, wherein the page rating/filtering program recorded on the computer-readable recording medium performs the rating based on the degree of relevance of the target page to each URL constituting the link path information stored in the database.

[0036] The invention according to claim 15 is the invention according to claim 14, wherein the page rating/filtering program recorded on the computer-readable recording medium determines the degree of relevance by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page, and treats the degree of relevance as greater the shorter the distance along that path.

[0037] The invention according to claim 16 is the invention according to claim 14, wherein the page rating/filtering program recorded on the computer-readable recording medium determines the degree of relevance by following hyperlinks from each URL constituting the link path information toward the target page to search for a path that reaches the target page and, if such a path exists, searching for all reachable paths, and treats the degree of relevance as greater the more paths there are.

[0038]

[Embodiments of the Invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a page rating/filtering device according to one embodiment of the present invention. The device shown in the figure rates how closely a target page is related to a preset reference standard and filters the target page based on the rating result. It consists of a database unit (hereinafter, DB unit) 101 that stores hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the reference standard; a path search unit 102 that searches the link path information stored in the DB unit 101 from the target page; a page score calculation unit 103 that constitutes both the rating means, which rates whether the target page satisfies a predetermined criterion with respect to the link path information stored in the DB unit 101, and the filtering means, which filters the target page based on the rating result; and an input/output unit 104 that accepts input from the user and outputs the rating/filtering result.

[0039] The rating means performs the rating based on the degree of relevance of the target page to each URL constituting the link path information stored in the DB unit 101. To determine the degree of relevance, hyperlinks are followed from each URL constituting the link path information toward the target page to search for a path that reaches the target page; the shorter the distance along that path, the greater the degree of relevance. In addition, if a path reaching the target page exists, all reachable paths are searched, and the more paths there are, the greater the degree of relevance.

[0040] In this embodiment, the input is the URL of a Web page. Hereinafter, the page designated by the user is called the target page, and its URL is called the target page URL. The output is the score of the target page and whether the page matches the rule designated by the user. The score is the degree of relevance between the page and a group of URLs designated in advance. The rule describes whether or not the target page should be retrieved, and is evaluated based on the degree of relevance.

[0041] Specifically, in this embodiment a group of URLs is designated in advance as the reference standard, and whether to filter the target page is decided by computing the degree of relevance between the target page and the pages in the URL group. To compute the degree of relevance, the device searches for paths by which hyperlinks can be followed between a page in the URL group and the target page. If such a path exists, it searches for all reachable paths, and it judges the relevance to be greater the more paths there are and the shorter the paths are.
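The relevance computation described above can be sketched as follows. The text states only that more paths and shorter paths mean greater relevance; the specific formula used here (summing 1/hops over every loop-free path), the `max_hops` cutoff, the seed-to-target search direction, and all function and variable names are assumptions made for illustration, not the scoring defined by the patent.

```python
def all_paths(links, start, goal, max_hops=4):
    """Enumerate loop-free hyperlink paths from start to goal (depth-first)."""
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # assumed cutoff so the search terminates on large graphs
        for nxt in links.get(node, ()):
            if nxt not in path:  # never revisit a page within one path
                stack.append((nxt, path + [nxt]))
    return paths

def relevance(links, seed_urls, target, max_hops=4):
    """Assumed score: sum of 1/hops over every path from a seed URL to the target."""
    score = 0.0
    for seed in seed_urls:
        for path in all_paths(links, seed, target, max_hops):
            hops = len(path) - 1
            if hops > 0:  # this sketch ignores the case where the target is itself a seed
                score += 1.0 / hops
    return score

# Hypothetical link table: page "s" links to "a" and "t", page "a" links to "t".
links = {"s": ["a", "t"], "a": ["t"], "t": []}
```

In this toy graph the target "t" is reached from the seed "s" by one path of length 1 and one of length 2, so both the number of paths and their shortness raise the score.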

[0042] For example, to decide whether employees may view a WWW page, only the URLs of pages needed for work are designated as the reference standard, and a rule is defined that viewing is permitted only when the degree of relevance is at or above a preset threshold. The degree of relevance between the target page and the URL group is then computed, and the employee is permitted to view the target page only when it is at or above the threshold. The reference URL group needs to be designated only once and does not need continual updating.
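The threshold rule in this example could be expressed as below. The class name `ThresholdRule` and the numeric values are hypothetical; the only behavior taken from the text is that viewing is permitted when the degree of relevance is at or above the preset threshold.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThresholdRule:
    """Permit viewing only when relevance is at or above the threshold."""
    threshold: float

    def permits(self, relevance_score):
        return relevance_score >= self.threshold

# Hypothetical threshold chosen by an administrator.
work_rule = ThresholdRule(threshold=1.0)
```

Keeping the rule separate from the relevance computation reflects the split the text describes between the score and the user-designated rule that is evaluated on it.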

[0043] That is, the content distribution method of this embodiment uses link path information from the target page to the pages it links to, and link path information from the pages that link to the target page, to determine the degree of relevance between the designated URL group and the target page. This makes it possible to determine whether viewing of the target content is permitted without using word information as in conventional methods, that is, without inspecting the content itself. Nor is it necessary to view and judge the content in advance, as was conventionally required.

[0044] The link path referred to here is a sequence of Web pages that can be reached by following hyperlinks, that is, a chain of consecutive URLs such as URL1 → URL2 → URL3 → … → URLm. Concretely, a link path is a graph structure as shown in FIG. 4. URLs are the nodes, and nodes are connected by directed arcs (here, the arcs are the links). To identify its direction, each arc has one end designated HEAD and the other end designated TAIL, and these ends are attached to nodes. Each node can have at most one HEAD and at most one TAIL attached.
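A small sketch of this graph structure, assuming that the TAIL end of an arc sits at the linking page and the HEAD end at the linked page (the text does not say which end is which). The function names and the tuple representation `(tail_url, head_url)` are likewise illustrative.

```python
from collections import Counter

def arcs_of(link_path):
    """Turn URL1 -> URL2 -> ... -> URLm into directed arcs (tail_url, head_url)."""
    return list(zip(link_path, link_path[1:]))

def is_valid_link_path(arcs):
    """Check that each node has at most one TAIL and at most one HEAD attached."""
    tails = Counter(tail for tail, _ in arcs)
    heads = Counter(head for _, head in arcs)
    return all(n <= 1 for n in tails.values()) and all(n <= 1 for n in heads.values())

chain = arcs_of(["URL1", "URL2", "URL3"])
```

Under this reading, a simple chain satisfies the constraint, while a page that is the source of two arcs within the same link path does not.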

[0045] The degree of relevance of a target page does not need to be computed in advance. When the relevance of a new page is needed, the link relationship between that page and the URL group can be computed from hyperlinks at that time. The rationale for scoring a target page from hyperlink information in this way is that when page A contains a link to page B, the author of page A can be regarded as recommending page B. In most cases links point to pages related to the linking page, so link information can be used to judge how closely pages are related.

[0046] Next, the preparation for computing a page's score and the computation itself are described. The explanation uses the example of deciding whether a target page is an adult page, in order to prohibit viewing of pages known as adult pages or adult sites.

[0047] First, the preparation. A URL that the user designates for computing the degree of relevance to the target page is called a seed URL. More than one seed URL can be designated. Seed URLs can also be registered in separate seed URL groups. The seed URLs designated by the user are stored in the page score calculation unit 103.

[0048] In this embodiment, two seed URL groups are registered: a blacklist group (hereinafter group 1 or G1) in which the URLs of adult sites are registered, and a whitelist group (hereinafter group 2 or G2) in which the URLs of non-adult sites recommended for viewing are registered. The URLs of the Web pages in each group are registered according to the judgment of the person registering them.

【００４９】To register hyperlink information, the hyperlinks set from a WWW page on the Internet (the link source URL) to a link destination URL are extracted, and the information for each link is registered.

【００５０】For each hyperlink, the pair "link source URL : link destination URL" is registered in the link table of the DB unit 101. At registration time, a duplicate check is performed so that link information that is already registered is not registered again. That is, if the pair "link source URL : link destination URL" about to be registered is already in the table, it is not registered.
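The duplicate check described above can be sketched as follows. This is a minimal illustration, assuming the link table is held in memory as a set of (source, destination) pairs; the function name `register_link` and the example URLs are hypothetical and do not come from the patent, which stores the table in the DB unit 101.

```python
def register_link(link_table, source_url, dest_url):
    """Register (source_url, dest_url) unless that pair is already present.

    Returns True if the pair was newly registered, False if it was a duplicate.
    """
    pair = (source_url, dest_url)
    if pair in link_table:
        return False          # already registered: do nothing
    link_table.add(pair)
    return True


# Hypothetical usage: the second call is rejected as a duplicate.
link_table = set()
register_link(link_table, "http://a.example/", "http://b.example/")
register_link(link_table, "http://a.example/", "http://b.example/")
```

A set makes the duplicate check a constant-time membership test; a database implementation could get the same effect from a uniqueness constraint on the pair.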

【００５１】The link table is created in the preparation stage and represents the state of Web page links at a particular moment. Creating it in advance amounts to caching only the link information, so the necessary link information can be obtained by accessing the DB instead of connecting to the network repeatedly to fetch the same information. If Web pages were fetched during the search, the search would stall at each URL until the response from its Web server came back. The link table is prepared in advance to prevent this. If a page is updated after caching, the link information on the Web and the link information in the DB become inconsistent. However, the link information of frequently updated Web pages is considered less important than fixed links that are not updated, so acquiring link information from Web pages during path search is not considered here.

【００５２】Next, rule registration is described, that is, the setting of rules that state, based on the calculated page score, whether the target page should be acquired.

【００５３】A rule does not have to be set. If none is set, only the page score is output. Since the rating in this example is intended to prohibit browsing of adult pages, two rules are set: (1) if the page score exceeds a predetermined value, output "page acquisition prohibited"; (2) if the page score does not exceed the predetermined value, output "page acquisition permitted".

【００５４】Next, the page score calculation process is described with reference to the flowchart shown in FIG. 2.

【００５５】When the user inputs a target page URLobj from the input/output unit 104 (step S201), the page score calculation unit 103 passes URLobj and the URLs registered in the whitelist group G2 to the path search unit 102. The path search unit 102 performs path search processing on URLobj and the URLs of G2 (step S202) and passes the result to the page score calculation unit 103. The page score calculation unit 103 receives the path search result from the path search unit 102 and calculates the degree of association R(URLobj, G2) with the target page (step S203).

【００５６】Next, the degree of association R(URLobj, G1) between the URLs registered in the blacklist group G1 and the target page is calculated (steps S204, S205). That is, the path search unit 102 performs path search processing on URLobj and the URLs of G1 (step S204), and the page score calculation unit 103 receives the path search result and obtains the degree of association R(URLobj, G1) with the target page (step S205). Here, R(URLobj, G1) is the degree of association between group 1 and the target page, and R(URLobj, G2) is the degree of association between group 2 and the target page.

【００５７】Since this is an adult page determination, the page score of the target page URLobj is calculated by the following formula (step S206).

【００５８】[0058]

[Equation 1] Page score(URLobj) = R(URLobj, G1) - R(URLobj, G2)

The meaning of the page score is as follows. A larger R(URLobj, G1) indicates a closer relationship with the blacklist and a greater degree to which browsing should be prohibited. A larger R(URLobj, G2) indicates a closer relationship with the whitelist and a greater degree to which browsing should be permitted. Therefore, the larger the page score R(URLobj, G1) - R(URLobj, G2), the more strongly browsing should be prohibited. That is, browsing is prohibited when the page score exceeds the predetermined value.

【００５９】This value is the page score of the page at URLobj. Based on this score, "page acquisition prohibited" or "page acquisition permitted" is determined according to the rules above. The score and the determination made under the rules, that is, the rule application result, are passed to the input/output unit 104, which outputs both the score and the determination.
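As a rough illustration of step S206 and the rule application, the sketch below computes the page score as the blacklist relevance minus the whitelist relevance and then applies the two rules. The function names and the threshold of 0.0 are assumptions chosen for the example; the patent only speaks of "a predetermined value".

```python
def page_score(r_blacklist, r_whitelist):
    """Page score = R(URLobj, G1) - R(URLobj, G2)."""
    return r_blacklist - r_whitelist


def apply_rule(score, threshold):
    """Rule (1): score exceeds the threshold -> prohibited.
    Rule (2): score does not exceed the threshold -> permitted."""
    if score > threshold:
        return "page acquisition prohibited"
    return "page acquisition permitted"


# Hypothetical relevance values and threshold.
score = page_score(r_blacklist=0.75, r_whitelist=0.25)
decision = apply_rule(score, threshold=0.0)
```

A score exactly equal to the threshold does not "exceed" it, so under these rules the page is permitted.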

【００６０】Next, two methods of calculating the degree of association are described.

【００６１】First, calculation method 1 for the degree of association R(URLobj, Gi) is described. The page score calculation unit 103 calculates the degree of association R(URLobj, Gi) between the target page URLobj and the URL group Gi. The degree of association R is calculated from the number of link paths connecting URLobj to each page in URLg1 and from the distances of those link paths, using the following formula.

【００６２】[0062]

[Equation 2] Degree of association R(URLobj, Gi) = number of paths ÷ total path distance

The number of paths is the total number of link paths connecting URLobj to each page in URLg1. The distance of a link path is determined by how many links lie on the link path connecting URLobj to a page in Gi, that is, how many other pages lie along it. For example, a page reached from URLobj by following one link is at distance 1, and a page reached by following two links is at distance 2.
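Equation 2 can be worked through with a small sketch. It assumes the path search has already produced the distance of every link path found; returning 0.0 when no path exists is an assumption, since the text does not say what the degree of association is in that case.

```python
def relevance(path_distances):
    """Equation 2: number of paths divided by the total path distance."""
    if not path_distances:
        return 0.0                      # assumption: no path means no association
    return len(path_distances) / sum(path_distances)


# Two paths of distance 2 and 3: 2 paths / total distance 5.
r_example = relevance([2, 3])
```

Under this formula, many short paths give a high value (two paths of distance 1 give 1.0), while the same number of longer paths gives a lower one.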

【００６３】The number of paths and the total path distance needed to calculate the degree of association are computed as follows.

【００６４】[0064]

[Equation 3] n denotes the number of URLs contained in the designated URL group, that is, in the blacklist or the whitelist.

【００６５】Next, the path search process is described with reference to the flowchart shown in FIG. 3.

【００６６】The degree of association R(URLobj, Gi) between the target page URLobj and the URL group Gi is calculated from the number of link paths connecting URLobj to each page in URLg1 and from the distances of those link paths. Path search processing is performed to obtain these values.

【００６７】Concretely, to obtain the number of paths and the path distances, all link paths connecting URLobj and URLim are searched for each URLim in Gi. The path search result is represented as a tree structure whose nodes hold URL information, as shown for example in FIG. 4. The root holds URLobj as its URL information.

【００６８】In the path search process shown in FIG. 3, the root URLobj of the tree structure is first taken as the node of interest (step S301). Then the rows satisfying "link source = URL of the node of interest" are extracted from the link table of the DB unit 101, and the node of interest, that is, the node holding the URL that is the base point of the current processing, is marked as checked (step S302).

【００６９】Next, the link destination URL of each extracted row is added to the tree structure as a child node of the node of interest (step S303). If no row in the DB unit 101 matched the condition in step S302, nothing is done. It is then checked whether the level of the added nodes has reached the search limit level, and if it has (step S304), the added nodes are marked as checked (step S305).

【００７０】Next, it is checked whether there is an unchecked node at the same level as the node of interest (the level being the number of nodes from the root to the node of interest). If there is (step S306), one of them becomes the next node of interest (step S307), and the process returns to step S301 and repeats. If there is none, the process proceeds to step S308 and checks whether there is an unchecked node one level deeper. If there is (step S308), one of them becomes the node of interest (step S309), and the process returns to step S301 and repeats. If no node can be made the node of interest, the search ends. The URLs of the nodes from the root to a leaf whose URL is URLim form a link path connecting URLobj and URLim.

【００７１】FIG. 4 shows the tree structure after the search ends. route(URLobj, URLi) is the number of leaves whose URL is URLi in the tree structure after the search. dist(Lij, URLi) is the level of leaf Lij when the URL of leaf Lij is URLi, and 0 when the URL of leaf Lij is not URLi.
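The formula image for Equation 3 is not reproduced in this text. Based on the definitions of route and dist in the preceding paragraph and on n being the number of URLs in the group, one plausible reading is the following. This is an assumption for illustration, not the published formula; in particular, the range of the leaf index j (taken here to run over the leaves of the search tree) is a guess.

```latex
\[
\text{number of paths} = \sum_{i=1}^{n} \mathrm{route}(URL_{obj}, URL_i),
\qquad
\text{total path distance} = \sum_{i=1}^{n} \sum_{j} \mathrm{dist}(L_{ij}, URL_i)
\]
```

Under this reading, Equation 2 divides the first sum by the second.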

【００７２】Note that the same page may be reached by multiple routes. For example, X may be reached from A both by A→B→X and by A→C→D→X. In such cases, the routes are counted separately.
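The level-by-level tree expansion and the separate counting of routes can be sketched as below. This is a simplified illustration, not the flowchart of FIG. 3 itself: it assumes the link table is a list of (source, destination) pairs, uses a queue instead of explicit "checked" marks, and treats every tree node whose URL equals the target as the end of one path. The name `find_path_distances` and the parameter `max_depth` (the search limit level) are hypothetical.

```python
from collections import deque


def find_path_distances(link_table, url_obj, target_url, max_depth):
    """Return the distance of every link path from url_obj to target_url,
    expanding the tree level by level down to max_depth."""
    children = {}
    for src, dst in link_table:
        children.setdefault(src, []).append(dst)

    distances = []
    queue = deque([(url_obj, 0)])          # (URL held by the node, level)
    while queue:
        url, depth = queue.popleft()
        if url == target_url:
            distances.append(depth)        # one more route to the target
        if depth >= max_depth:
            continue                       # search limit level reached
        for dst in children.get(url, []):
            queue.append((dst, depth + 1)) # add child nodes to the tree
    return distances


# The example from the text: A→B→X and A→C→D→X are counted separately.
links = [("A", "B"), ("B", "X"), ("A", "C"), ("C", "D"), ("D", "X")]
found = find_path_distances(links, "A", "X", max_depth=4)
```

Because a URL can appear in the tree more than once, the tree can grow quickly; the depth limit is what keeps the expansion finite when the links contain cycles.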

【００７３】The degree of association expresses the strength of the relationship between two URLs, and it is calculated for a designated pair of URLs. The degree of association between a set of URLs and URLobj is calculated from the pairwise degrees of association between URLobj and each individual URL in the set.

【００７４】For example, the degree of association between the blacklist and URLobj is calculated as follows.

【００７５】(1) Set the total degree of association R_all = 0. The total degree of association is the sum of the degrees of association between URLobj and each URL in the blacklist.

【００７６】(2) Select one URL from the blacklist.

【００７７】(3) Calculate the degree of association between the selected URL and URLobj, and add it to the total degree of association R_all.

【００７８】(4) If the blacklist still contains a URL that has not been selected, return to (2). Otherwise, proceed to (5).

【００７９】(5) Output the total degree of association R_all as the degree of association between the blacklist and URLobj.
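Steps (1) to (5) amount to a simple accumulation loop, sketched below. The pairwise calculation is passed in as a function because the text does not fix it at this point; `pair_relevance`, the toy lookup table, and the URL names are all hypothetical.

```python
def group_relevance(url_obj, group_urls, pair_relevance):
    """Sum the pairwise degrees of association between url_obj and each URL in the group."""
    r_all = 0.0                                   # step (1)
    for url in group_urls:                        # steps (2) and (4)
        r_all += pair_relevance(url, url_obj)     # step (3)
    return r_all                                  # step (5)


# Toy pairwise relevance, used only for illustration.
toy_table = {"u1": 0.5, "u2": 0.25}
total = group_relevance("obj", ["u1", "u2", "u3"],
                        lambda url, obj: toy_table.get(url, 0.0))
```

The same loop applies unchanged to the whitelist; only the list of URLs differs.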

【００８０】In calculation method 1 described above, the degree of association between two URLs is obtained by searching for the link paths that connect them, and it is calculated so that it grows as the number of short link paths grows and as the total number of link paths grows.

【００８１】In contrast, calculation method 2, described next, finds only the shortest path. This corresponds to the case where the link path found by the path search process, that is, the link path returned by the path search unit 102, is the shortest route between the two URLs. In this case the number of link paths is always 1.

【００８２】In calculation method 2, once a URL has been searched, it is never searched again. In calculation method 1, the link path search result for two URLs may contain multiple paths that include the same URL, but this does not happen in calculation method 2. With the procedure below, calculation method 2 can compute the degree of association R(URLobj, Gi) between the target page URLobj and the URL group Gi quickly and with less computation.

【００８３】Calculation method 2 for the degree of association is described below.

【００８４】First, initialization is performed (step S1). The search list is set to ((URLobj, 0)). The search list holds the URLs that still need to be searched and is a set whose elements are (URL name, level) pairs. Its initial value is the set containing the target page URLobj together with its level.

【００８５】The searched URL list is set to empty. The searched URL list holds the URLs whose search has finished. It is a set whose elements are URLs, for example (URL1, URL2, URL3).

【００８６】The search result list is also set to empty. The search result list holds the results that reached the target blacklist or whitelist and is a set whose elements are (URL name, level) pairs, for example ((URLx, 5), (URLy, 6)).

【００８７】Next, the URL of interest is determined as follows (step S2).

【００８８】(1) Select from the search list one element with the smallest level. If the search list is ((URLa, 3), (URLb, 4)), (URLa, 3) is selected. If the search list is empty, nothing can be selected and the search ends. Elements with smaller levels are selected first so that URLs at shorter path distances are searched preferentially.

【００８９】(2) Remove the selected element from the search list. In the case of (1) above, the search list becomes ((URLb, 4)).

【００９０】(3) Add the URL of the selected element to the searched URL list. In this example, URLa is added to the searched URL list.

【００９１】(4) If the URL of the selected element is in the blacklist or the whitelist, add the element to the search result list. In this example, (URLa, 3) is added.

【００９２】Next, the search space is extended as follows (step S3).

【００９３】(1) If the level of the URL selected in step S2 has reached a fixed value, return to step S2. The purpose is to cut off the search once the level reaches that value.

【００９４】(2) For each URL of a page linked from the URL selected in step S2 (called a candidate URL), do the following.

【００９５】Check whether the candidate URL is in the searched URL list. If it is, the URL has already been examined, so nothing is done. If it is not, the URL is unexamined, so it is added to the search list with its level increased by one. For example, if (URLa, 3) links to URLc and URLc is to be added, (URLc, 4) is added to the search list.

【００９６】The elements in the search result list at the end are the search result. Each element of the list is a (URL, level) pair.
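Steps S1 to S3 can be sketched as a shortest-path search with a priority queue. This is one possible reading, with two assumptions labeled in the comments: the link table is a list of (source, destination) pairs, and an element popped from the search list is skipped if its URL was already searched (the text only checks the searched URL list when adding candidates, so the same URL could otherwise be queued twice). The names `shortest_hits`, `seed_urls`, and `max_level` are hypothetical.

```python
import heapq


def shortest_hits(link_table, url_obj, seed_urls, max_level):
    """Return (URL, level) for each seed URL reached, at its smallest level."""
    children = {}
    for src, dst in link_table:
        children.setdefault(src, []).append(dst)

    search_list = [(0, url_obj)]      # S1: heap of (level, URL)
    searched = set()                  # S1: searched URL list
    results = []                      # S1: search result list

    while search_list:
        level, url = heapq.heappop(search_list)   # S2 (1), (2): smallest level first
        if url in searched:
            continue                  # assumption: ignore a URL queued more than once
        searched.add(url)             # S2 (3)
        if url in seed_urls:
            results.append((url, level))          # S2 (4)
        if level >= max_level:
            continue                  # S3 (1): cut off the search at the limit
        for candidate in children.get(url, []):   # S3 (2)
            if candidate not in searched:
                heapq.heappush(search_list, (level + 1, candidate))
    return results


links = [("A", "B"), ("B", "X"), ("A", "C"), ("C", "D"), ("D", "X")]
hits = shortest_hits(links, "A", {"X"}, max_level=5)
```

On the A→B→X / A→C→D→X example, only the shorter route is reported, so X appears once with level 2, in line with the statement that the number of link paths is always 1 in method 2.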

【００９７】Note that the conventional content designation method (URL designation method) described above, that is, filtering a page according to whether or not it is included in a list, can be regarded as a special case of the present invention. The state "included in the whitelist" is the state in which the number of hops from the whitelist is 0. In the path search process, this corresponds to a search limit level of 0, where a whitelist URL itself matches the target page URLobj.

【００９８】When the method of the present invention is used to decide whether Internet content may be displayed, software implementing the filtering and rating method of the present invention runs between the terminal that displays the content and the server, and the terminal accesses the server through that software. The software may run on the terminal device itself, or on separately installed hardware distinct from the server used by the terminal. Either arrangement may be used.

【００９９】URLs in the DB unit 101 may have very deep directories. In the following example, everything from D1 onward is the directory part, and everything up to "jp/" is the host part.

【０１００】 Http://www.hyp.jp/D1/D2/D3/content.html
Ordinarily, a URL is handled including its directory part. A problem with the method of the present invention is that as the number of URLs grows, the search space (the amount of information in the tree structure of the path search process) and the computational cost of the path search process grow explosively. To reduce the search space and the computation, an approximate URL is used in place of the URL. The approximate URL is the portion of the original URL from its left end up to the n-th "/" below the host name, where n is specified in advance. This is the same as taking the URL from its beginning up to the n-th directory level. With n = 2, the approximate URL of the example above is as follows.

【０１０１】Http://www.hyp.jp/D1/D2/
In general, the deeper the directory level in a URL, the more finely the content is classified. The approach rests on the hypothesis that grouping URLs whose approximate URLs match at about n = 2 yields groups of pages that are largely coherent in content. For example, in one experiment, 28,000 unique links were extracted from 1,500 Web pages. When the URLs were replaced by approximate URLs (n = 1), the number of unique links fell to 3,500, reducing the number of database records by more than 80%.
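The approximate URL can be sketched with plain string handling. The function name `approximate_url` is hypothetical, and returning the URL unchanged when it has fewer than n directory levels is an assumption, since the text does not cover that case. The level counting follows the example in the text, where n = 2 keeps D1 and D2.

```python
def approximate_url(url, n):
    """Keep the scheme, the host, and the first n directory levels of url."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("expected a URL of the form scheme://host/...")
    parts = rest.split("/")            # parts[0] is the host name
    if len(parts) - 1 <= n:
        return url                     # assumption: too shallow to truncate
    return scheme + sep + "/".join(parts[: n + 1]) + "/"


original = "Http://www.hyp.jp/D1/D2/D3/content.html"
approx = approximate_url(original, 2)
```

Applying the function to both the link source and the link destination before registering a link lets the duplicate check collapse links that differ only below the n-th directory level, which is where the reduction in database records comes from.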

【０１０２】Needless to say, the processing procedure of the page rating/filtering method of the above embodiment can be recorded as a program on a recording medium such as a CD or FD. The recording medium can be built into a computer system, or the program recorded on it can be downloaded to a computer system over a communication line or installed from the medium. Running the computer system under the program makes it function as a page rating/filtering device that carries out the page rating/filtering method. Using such a recording medium also makes the program easier to distribute.

【０１０３】[0103]

[Effects of the Invention] As described above, according to the present invention, the reference link path information stored in the database is searched from the target page, the target page is rated as to whether it meets a predetermined criterion with respect to that link path information, and the target page is filtered based on the rating result. Consequently, designating only the URLs of the reference pages is enough to determine the degree of association between the designated URLs and the target page and to filter the target page efficiently and appropriately.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a page rating/filtering device according to an embodiment of the present invention.

FIG. 2 is a flowchart showing the page score calculation process of the page score calculation unit used in the page rating/filtering device of the embodiment shown in FIG. 1.

FIG. 3 is a flowchart showing the path search process of the path search unit used in the page rating/filtering device of the embodiment shown in FIG. 1.

FIG. 4 is a diagram showing the tree structure produced by the path search process of the embodiment shown in FIG. 1.

[Explanation of symbols]

101 DB unit
102 Path search unit
103 Page score calculation unit
104 Input/output unit

Continued from front page
(72) Inventor: Hiroshi Takeno, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
(72) Inventor: Hiroto Inagaki, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
F-term (reference): 5B075 NK44 PR08

Claims

[Claims]

1. A page rating/filtering method that rates how closely a target page relates to a preset standard and filters the target page based on the rating result, characterized in that hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the standard, is stored in a database, the link path information stored in the database is searched from the target page, the target page is rated as to whether it meets a predetermined criterion with respect to the link path information stored in the database, and the target page is filtered based on the rating result.

2. The page rating / filtering method according to claim 1, wherein the rating is performed based on a degree of association of a target page with each URL constituting the link path information stored in the database.

3. The page rating/filtering method according to claim 2, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, determining the distance along the route, such that the shorter the distance, the higher the degree of association.

4. The page rating/filtering method according to claim 2, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, searching for all reachable routes, such that the more routes there are, the higher the degree of association.

5. A page rating/filtering device that rates how closely a target page relates to a preset standard and filters the target page based on the rating result, characterized by comprising: a database that stores hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the standard; path search means for searching the link path information stored in the database from the target page; rating means for rating whether the target page meets a predetermined criterion with respect to the link path information stored in the database; and filtering means for filtering the target page based on the rating result.

6. The page rating/filtering device according to claim 5, wherein the rating means comprises means for calculating the degree of association of the target page with each URL constituting the link path information stored in the database, and means for performing the rating based on the calculated degree of association.

7. The page rating/filtering device according to claim 6, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, determining the distance along the route, such that the shorter the distance, the higher the degree of association.

8. The page rating/filtering device according to claim 6, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, searching for all reachable routes, such that the more routes there are, the higher the degree of association.

9. A page rating/filtering program that rates how closely a target page relates to a preset standard and filters the target page based on the rating result, characterized in that hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the standard, is stored in a database, the link path information stored in the database is searched from the target page, the target page is rated as to whether it meets a predetermined criterion with respect to the link path information stored in the database, and the target page is filtered based on the rating result.

10. The page rating / filtering program according to claim 9, wherein the rating is performed based on a degree of association of a target page with each URL constituting the link path information stored in the database.

11. The page rating/filtering program according to claim 10, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, determining the distance along the route, such that the shorter the distance, the higher the degree of association.

12. The page rating/filtering program according to claim 10, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, searching for all reachable routes, such that the more routes there are, the higher the degree of association.

13. A computer-readable recording medium recording a page rating/filtering program that rates how closely a target page relates to a preset standard and filters the target page based on the rating result, characterized in that hyperlink information consisting of link path information, which is a concatenation of the URLs of the pages serving as the standard, is stored in a database, the link path information stored in the database is searched from the target page, the target page is rated as to whether it meets a predetermined criterion with respect to the link path information stored in the database, and the target page is filtered based on the rating result.

14. The computer-readable recording medium recording the page rating/filtering program according to claim 13, wherein the rating is performed based on a degree of association of the target page with each URL constituting the link path information stored in the database.

15. The computer-readable recording medium recording the page rating/filtering program according to claim 14, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, determining the distance along the route, such that the shorter the distance, the higher the degree of association.

16. The computer-readable recording medium recording the page rating/filtering program according to claim 14, wherein the degree of association is obtained by following hyperlinks from each URL constituting the link path information toward the target page, searching for a route that reaches the target page, and, if a route exists, searching for all reachable routes, such that the more routes there are, the higher the degree of association.