JP2008250597A

JP2008250597A - Computer system

Info

Publication number: JP2008250597A
Application number: JP2007090184A
Authority: JP
Inventors: Masanori Hara; 正憲原; Ayumi Kubota; 歩窪田; Masaru Miyake; 優三宅
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2007-03-30
Filing date: 2007-03-30
Publication date: 2008-10-16

Abstract

PROBLEM TO BE SOLVED: To provide a computer system capable of reliably filtering junk mail.
SOLUTION: A harmful site determination unit 13 analyzes the character string of a URI (Uniform Resource Identifier) written in received mail and determines whether the website identified by the URI is a harmful site. A harmful site determination unit 21 analyzes the character string of a URI to which a user requests access and determines whether the website identified by that URI is a harmful site. A blacklist storage unit 3 stores the URIs that the harmful site determination units 13 and 21 have determined to identify harmful sites. A spam mail determination unit 12 determines whether received mail is junk mail based on the URIs stored in the blacklist storage unit 3.
COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a computer system that determines whether received mail is junk mail.

Junk mail known as spam, which is sent unilaterally to the computer terminals of unspecified users, has become a problem. Patent Document 1 describes a technique that retains the URIs (Uniform Resource Identifiers) written in spam mail and performs filtering based on the retained URIs.
Patent Document 1: JP 2005-128922 A

In the technique described in Patent Document 1, the system retains a URI the first time a spam mail containing it is received, and from the second reception onward it can determine whether a mail is spam based on that URI. Because no such determination is possible at the first reception, however, spam mail cannot be filtered reliably. In addition, this technique collects URIs only from received mail, and a URI written in received mail is not necessarily the URI of a harmful site, so erroneous determinations may occur. Furthermore, this technique decides whether received mail is spam based on the URI alone, which may also lead to erroneous determinations.

The present invention has been made in view of the problems described above, and its object is to provide a computer system capable of reliably filtering junk mail.

To solve the above problems, the present invention provides a computer system comprising: first harmful site determination means for analyzing the character string of a URI (Uniform Resource Identifier) written in received mail and determining whether the website identified by the URI is a harmful site; storage means for storing the URI determined by the first harmful site determination means to identify a harmful site; and junk mail determination means for determining whether received mail is junk mail based on the URI stored in the storage means.

The computer system of the present invention further comprises second harmful site determination means for analyzing the character string of a URI to which a user requests access and determining whether the website identified by the URI is a harmful site, wherein the storage means further stores the URI determined by the second harmful site determination means to identify a harmful site.

In the computer system of the present invention, the first harmful site determination means further determines whether the website is a harmful site based on at least one of the IP address of the website identified by the URI, information on the domain of the website, and information on the pages of the website. The storage means further stores at least one of the IP address, the domain information, and the page information of the website determined to be a harmful site by the first harmful site determination means. The junk mail determination means further determines whether received mail is junk mail based on at least one of the IP address, the domain information, and the page information stored in the storage means.

In the computer system of the present invention, the second harmful site determination means further determines whether the website is a harmful site based on at least one of the IP address of the website identified by the URI, information on the domain of the website, information on the pages of the website, and information indicating the result of a user's judgment of whether the website is a harmful site. The storage means further stores at least one of the IP address, the domain information, and the page information of the website determined to be a harmful site by the first harmful site determination means, and the information indicating the result of the user's judgment of whether the website is a harmful site. The junk mail determination means further determines whether received mail is junk mail based on at least one of the IP address, the domain information, the page information, and the information indicating the result of the user's judgment stored in the storage means.

According to the present invention, the character string of a URI written in mail is analyzed to determine whether it identifies a harmful site, so junk mail can be filtered reliably even when the junk mail contains a URI that is being evaluated for the first time. In addition, the character string of a URI to which a user requests access is analyzed to determine whether it identifies a harmful site, which improves the accuracy of harmful site determination and enables junk mail to be filtered reliably. Furthermore, determining whether received mail is junk mail based on at least one of the website's IP address, the website's domain information, the website's page information, and information indicating the result of a user's judgment of whether the website is a harmful site improves the accuracy of junk mail determination and enables junk mail to be filtered reliably.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows the configuration of a computer system according to one embodiment of the present invention. The computer system of this embodiment has a mail server 1 that delivers electronic mail, a proxy server 2 that accesses external networks in response to access requests from users, and a blacklist storage unit 3 that stores the URIs of harmful sites. The blacklist storage unit 3 may be placed on a database server or the like separate from the mail server 1 and the proxy server 2, or it may be placed on the mail server 1 or the proxy server 2.

In the mail server 1, the mail receiving unit 10 receives mail from an external network. The mail storage unit 11 stores the mail received by the mail receiving unit 10. The spam mail determination unit 12 determines whether received mail is spam mail (junk mail). The harmful site determination unit 13 determines whether the website identified by a URI written in received mail is a harmful site.

In the proxy server 2, the web access monitoring unit 20 monitors users' access to websites on external networks. The harmful site determination unit 21 determines whether a website to which a user requests access is a harmful site.

When determining whether a website is a harmful site, the harmful site determination units 13 and 21 check the following three types of information.
(1) URI character string information
(2) Domain information
(3) Website page information
For each of these three types of information, the harmful site determination units 13 and 21 calculate an evaluation value indicating how likely the website under examination is to be a harmful site. They then sum the evaluation values obtained for the individual types of information and compare the total with a predetermined threshold. If the total is equal to or greater than the threshold, the website is determined to be a harmful site.
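The threshold rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `is_harmful`, the numeric scale of the evaluation values, and the default threshold of 10 are hypothetical and do not come from the specification.

```python
def is_harmful(uri_score: float, domain_score: float, page_score: float,
               threshold: float = 10.0) -> bool:
    """Sum the three evaluation values and compare the total with the threshold.

    The threshold value is an assumption; the specification only says
    that it is predetermined.
    """
    total = uri_score + domain_score + page_score
    # The site is judged harmful when the total is at or above the threshold.
    return total >= threshold
```

Using a single summed total means that a strong signal from one check can be offset by weak signals from the others, which is consistent with the rule that only the total is compared with the threshold.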

First, the check based on URI character string information is described. The harmful site determination unit 13 of the mail server 1 reads received mail stored in the mail storage unit 11 and analyzes the character strings of the URIs written in that mail. In the proxy server 2, when a user requests access to an external website, the web access monitoring unit 20 notifies the harmful site determination unit 21 of the URI of the requested website, and the harmful site determination unit 21 analyzes the character string of the notified URI.

A URI can reveal signs of fraudulent sites such as phishing sites. Creators of fraudulent sites often craft the URI so that neither users nor server administrators notice the fraud. One conceivable pattern for deceiving users is to embed in the URI a character string resembling the name of a real website. In this embodiment, the names of well-known websites and character strings resembling them are listed in advance, and a URI that matches an entry in the list is given a high evaluation value. The check also considers where in the URI the matching string appears. Fraudulent sites often place such strings in a subdomain or directory name, so the evaluation value is increased when the string appears in those locations.
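A minimal sketch of this name-matching check is shown below. The list `KNOWN_NAMES`, the score values, and the function `name_score` are hypothetical. The sketch uses exact substring matching only, whereas the embodiment also lists strings that merely resemble well-known names, and it treats the last two host labels as the registered domain, which is a simplification.

```python
from urllib.parse import urlsplit

# Hypothetical list of well-known site names (and look-alike strings).
KNOWN_NAMES = ["examplebank", "exampleshop"]

def name_score(uri: str) -> int:
    """Score a URI by whether, and where, a listed name appears in it."""
    parts = urlsplit(uri)
    labels = (parts.hostname or "").lower().split(".")
    registered = ".".join(labels[-2:])   # naive: last two labels only
    subdomains = ".".join(labels[:-2])
    path = parts.path.lower()
    score = 0
    for name in KNOWN_NAMES:
        if name in subdomains or name in path:
            score += 5   # name hidden in a subdomain or directory: larger value
        elif name in registered:
            score += 2   # name matches elsewhere in the host
    return score
```

For instance, `name_score("http://examplebank.login.evil.test/index.html")` scores the name as a subdomain match, because `examplebank` appears to the left of the registered domain `evil.test`.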

The following patterns are used to deceive server administrators. One is to create a directory whose name is blank and place the harmful site inside it; the other is to create a directory whose name begins with "." and place the harmful site inside it. Both are intended to hide the location of the harmful site from the administrator. In these cases, the URI character string contains a space (written as "%20") between one "/" and the next, or a "." immediately after a "/".
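The two hiding patterns can be detected from the path of the URI alone. The following sketch assumes a hypothetical helper `has_hidden_directory`; excluding the relative segments "." and ".." is also an assumption, since the specification does not discuss them.

```python
from urllib.parse import urlsplit, unquote

def has_hidden_directory(uri: str) -> bool:
    """Return True if a path segment is blank (e.g. "%20") or starts with "."."""
    path = urlsplit(uri).path
    for segment in path.split("/")[1:]:
        decoded = unquote(segment)            # "%20" becomes " "
        if segment and decoded.strip() == "":
            return True                       # blank directory name
        if decoded.startswith(".") and decoded not in (".", ".."):
            return True                       # directory name starting with "."
    return False
```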

Other examples of harmful-site URIs include URIs that specify an IP address directly and URIs in which the IP address has been rewritten as a single decimal or hexadecimal number.
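One way to recognize such hosts is sketched below using Python's standard `ipaddress` module. The function name `host_is_ip_literal` is hypothetical, and interpreting "decimal or hexadecimal" as a single 32-bit integer such as `3232235777` or `0xC0A80101` is an assumption about the intended forms.

```python
import ipaddress
from urllib.parse import urlsplit

def host_is_ip_literal(uri: str) -> bool:
    """Return True if the host is an IP address rather than a domain name."""
    host = urlsplit(uri).hostname or ""
    try:
        ipaddress.ip_address(host)   # dotted IPv4 or IPv6 notation
        return True
    except ValueError:
        pass
    try:
        value = int(host, 0)         # plain decimal, or hexadecimal with "0x"
    except ValueError:
        return False                 # an ordinary domain name
    return 0 <= value <= 0xFFFFFFFF  # must fit in a 32-bit IPv4 address
```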

Next, the check based on domain information is described. An operator who creates a harmful site and wants to advertise it may send spam mail immediately after creating the site, so the acquisition date of the domain can be used as a criterion. In addition, to cope with cases such as early discovery, such domains are often acquired with a short validity period of one year, so the expiration date of the domain can also serve as a criterion. The IP address can be used as a criterion as well.

The harmful site determination units 13 and 21 extract the domain name from the URI character string and run the whois command to obtain information on the domain. They check the acquisition date of the domain and increase the evaluation value if the domain was acquired recently. They also check the expiration date of the domain and increase the evaluation value if the validity period is short.
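The date-based scoring can be sketched as follows. The sketch assumes that the acquisition and expiration dates have already been parsed from the whois output (whois formats vary by registry, so parsing is left out). The function name `domain_score`, the 30-day and 366-day limits, and the score values are hypothetical.

```python
from datetime import date

def domain_score(created: date, expires: date, today: date,
                 recent_days: int = 30, short_term_days: int = 366) -> int:
    """Raise the evaluation value for recently acquired, short-lived domains."""
    score = 0
    if (today - created).days <= recent_days:
        score += 3   # domain was acquired recently
    if (expires - created).days <= short_term_days:
        score += 2   # validity period of about one year or less
    return score
```

A domain registered on 2007-03-01 for one year and checked on 2007-03-20 would receive both increments, while a domain registered in 2000 for ten years would receive neither.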

Next, the check based on website page information is described. This analysis uses the information actually written on the pages of the website. A web page often contains words that characterize the site. A database of characteristic words is therefore prepared in advance, and the evaluation value is weighted according to measures such as how many of the words in the database the page uses and how frequently each word appears. The same analysis is applied not only to the web page indicated by the URI but also to the pages reachable from it through links. Including linked pages yields evaluation values for multiple pages; the closer a page is to the page indicated by the URI, the greater the weight given to its evaluation value.
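The word-based scoring with distance weighting might look like the sketch below. The word set `SUSPICIOUS_WORDS`, the tokenization by ASCII letters, and the weight `1 / (1 + depth)` are all assumptions chosen for illustration; the specification only requires that pages closer to the page indicated by the URI receive larger weights.

```python
import re
from collections import Counter

# Hypothetical stand-in for the database of characteristic words.
SUSPICIOUS_WORDS = {"password", "verify", "account"}

def page_score(text: str) -> int:
    """Count occurrences of characteristic words in one page."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return sum(words[w] for w in SUSPICIOUS_WORDS)

def site_score(pages: list[tuple[int, str]]) -> float:
    """Combine page scores; each item is (link distance from the URI's page, text)."""
    # Distance 0 is the page indicated by the URI and gets the full weight.
    return sum(page_score(text) / (1 + depth) for depth, text in pages)
```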

Apart from the above, a determination can also be made by comparing proper nouns. The proper nouns written on the page indicated by the URI are compared with those written on the page that can be regarded as the top page of the site for that domain, which is indicated by the URI with its trailing portion removed. If the two pages have no words in common, the page indicated by the URI is considered to have little relation to the top page. This is probably common on ordinary personal sites but rare on sites such as corporate sites, so when no proper noun is shared, the site can be judged to have been cracked.
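The comparison can be sketched in two small steps: deriving the top page URI by trimming the trailing portion, and testing whether two sets of proper nouns overlap. Both function names are hypothetical, and extracting proper nouns from page text is assumed to be done elsewhere (for example by a morphological analyzer), so the sketch takes the noun sets as input.

```python
from urllib.parse import urlsplit, urlunsplit

def top_page_uri(uri: str) -> str:
    """Drop the path, query, and fragment to obtain the site's top page URI."""
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))

def looks_unrelated(page_nouns: set[str], top_nouns: set[str]) -> bool:
    """Return True when the two pages share no proper noun."""
    return not (page_nouns & top_nouns)
```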

Similarly, the titles of these two pages can be compared, and the site can be judged likely to be harmful if the titles differ. Furthermore, because a phishing site has a submission form on its web page for sending passwords and the like, the presence or absence of a submission form can also be used as an evaluation item.
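A sketch of the title comparison and form check using the standard `html.parser` module follows. The class `PageFeatures`, the function `structure_score`, and the score values are hypothetical. Treating an `input` element of type `password` as the sign of a submission form is an assumption; the specification speaks only of a form for sending passwords and the like.

```python
from html.parser import HTMLParser

class PageFeatures(HTMLParser):
    """Collect the page title and whether a password input is present."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.has_password_form = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "input" and (dict(attrs).get("type") or "").lower() == "password":
            self.has_password_form = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def extract(html: str) -> PageFeatures:
    parser = PageFeatures()
    parser.feed(html)
    parser.close()
    return parser

def structure_score(page_html: str, top_html: str) -> int:
    """Add to the evaluation value for a differing title and a password form."""
    page, top = extract(page_html), extract(top_html)
    score = 0
    if page.title.strip() != top.title.strip():
        score += 1   # title differs from the top page
    if page.has_password_form:
        score += 2   # page asks for a password
    return score
```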

The harmful site determination units 13 and 21 access the website identified by the URI, download the web page information, and calculate evaluation values from that information according to the criteria described above.

Next, the operation of the mail server 1 is described with reference to FIG. 2. The mail receiving unit 10 receives mail and stores it in the mail storage unit 11 (step S100). The harmful site determination unit 13 reads the received mail from the mail storage unit 11 and extracts URIs from its text (step S110).

Next, the harmful site determination unit 13 runs the whois command to obtain information on the domain to which the website indicated by the URI in the received mail belongs. It also accesses the web page indicated by that URI and obtains the page information (step S120).

The harmful site determination unit 13 then analyzes the character string of the URI extracted from the received mail, along with the domain information and web page information obtained in step S120, and calculates the evaluation value for each type of information and their total (step S130). It compares the total with a predetermined threshold and, based on the result, determines whether the website identified by the URI written in the received mail is a harmful site (step S140).

If the website is determined to be a harmful site (YES in step S140), the harmful site determination unit 13 stores the URI in the blacklist storage unit 3 in association with the information obtained in step S120 (step S150). If the website is determined not to be a harmful site (NO in step S140), the process ends.
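Steps S110 to S150 can be sketched as a single loop. Everything here is illustrative: `process_received_mail`, the simple `URI_PATTERN`, and the `evaluate` callback (standing in for steps S120 and S130) are hypothetical, and the blacklist is modeled as a dictionary that stores only the total evaluation value, whereas the embodiment stores the information obtained in step S120.

```python
import re
from typing import Callable

# Simplified pattern: a URI is "http(s)://" followed by non-whitespace characters.
URI_PATTERN = re.compile(r"https?://\S+")

def process_received_mail(body: str,
                          evaluate: Callable[[str], float],
                          blacklist: dict[str, float],
                          threshold: float = 10.0) -> list[str]:
    """Extract URIs from a mail body and blacklist those judged harmful."""
    added = []
    for uri in URI_PATTERN.findall(body):   # step S110: extract URIs
        total = evaluate(uri)               # steps S120-S130: gather data, score
        if total >= threshold:              # step S140: compare with threshold
            blacklist[uri] = total          # step S150: store in the blacklist
            added.append(uri)
    return added
```

Passing the scoring function as a parameter keeps the flow independent of how the individual checks are implemented.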

Next, the operation of the proxy server 2 is described with reference to FIG. 3. When a user sends the proxy server 2 a request to access a website, the web access monitoring unit 20 notifies the harmful site determination unit 21 of the requested URI (step S200). The harmful site determination unit 21 performs the checks that take little time (checking the character string in the URI and the information on the top page of the website) and calculates the evaluation value for each type of information and their total (step S210).

Next, the harmful site determination unit 21 compares the total of the evaluation values with a predetermined threshold and, based on the result, determines whether the website identified by the URI to which the user requested access is a harmful site (step S220). If the website is determined to be a harmful site (YES in step S220), the harmful site determination unit 13 redirects the user's access to a warning page (step S230).

At this point, the warning page shown in FIG. 4 is displayed on the terminal operated by the user. In addition to a warning message, the warning page displays a message 400 asking the user to confirm whether they really want to access the requested website, together with buttons 410 and 420 with which the user enters a decision.

The harmful site determination unit 13 recognizes the user's judgment of the website from the operation of the buttons 410 and 420 and, based on that judgment, determines whether the website identified by the URI to which the user requested access is a harmful site (step S240). If the user selects the button 410 (YES in step S240), the unit recognizes that the user has judged the website to be a harmful site, and the harmful site determination unit 13 stores the URI, the information obtained in step S210, and the user's judgment in the blacklist storage unit 3 in association with one another (step S250).

If the user selects the button 420 (NO in step S240), the unit recognizes that the user has judged the website not to be a harmful site, and the harmful site determination unit 13 permits the user to access the website (step S260). The harmful site determination unit 13 does not store the URI under examination in the blacklist storage unit 3, and if the same URI is already stored there, it deletes the URI and the information associated with it (step S270).

If, on the other hand, the website is determined in step S220 not to be a harmful site, the harmful site determination unit 13 permits the user to access the website (step S280). The harmful site determination unit 13 then performs the checks that take more time (checking the domain information and checking pages of the website other than the top page) and calculates the evaluation value for each type of information and their total (step S290).

Next, the harmful site determination unit 21 compares the total of the evaluation values with a predetermined threshold and, based on the result, determines whether the website identified by the URI to which the user requested access is a harmful site (step S300). If the website is determined to be a harmful site (YES in step S300), the harmful site determination unit 13 redirects the user's access to the warning page the next time the user makes a request, for example a request to access another page linked from the top page of the website (step S310). The subsequent steps S320 to S350 are the same as steps S240 to S270, so their description is omitted.

If the website is determined in step S300 not to be a harmful site, the harmful site determination unit 13 permits the user to access the website (step S360). From then on, the user can access the website freely.
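The two-stage flow of FIG. 3 can be condensed into the sketch below. The function names, the string return values, and the callbacks `quick_check` and `slow_check` are hypothetical. In the embodiment the slow checks run after access has already been permitted; the sketch runs them inline for brevity and reports that case as `"allow_then_warn"`.

```python
from typing import Callable

def handle_access(uri: str,
                  quick_check: Callable[[str], float],
                  slow_check: Callable[[str], float],
                  threshold: float = 10.0) -> str:
    """Decide how the proxy responds to an access request."""
    if quick_check(uri) >= threshold:      # steps S210-S220: fast checks
        return "warn"                      # step S230: show the warning page
    # Step S280: access is permitted, then the slower checks are performed.
    if slow_check(uri) >= threshold:       # steps S290-S300
        return "allow_then_warn"           # step S310: warn on a later request
    return "allow"                         # step S360: free access

def apply_user_decision(uri: str, user_says_harmful: bool,
                        blacklist: set[str]) -> None:
    """Update the blacklist from the button the user pressed on the warning page."""
    if user_says_harmful:
        blacklist.add(uri)                 # steps S250 / S330 equivalent
    else:
        blacklist.discard(uri)             # step S270: remove a stale entry if present
```

Splitting the checks this way keeps the delay before the first response short, at the cost of letting one request through to a site that only the slower checks would flag.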

Based on the information stored in the blacklist storage unit 3 by the processes shown in FIGS. 2 and 3, the spam mail determination unit 12 of the mail server 1 determines whether received mail is spam mail. The determination depends on whether the received mail contains a URI that matches a URI stored in the blacklist storage unit 3. If it does, the spam mail determination unit 12 determines that the received mail is spam mail.
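This matching rule reduces to a membership test, as in the sketch below. The function name `is_spam`, the simplified URI pattern, and the use of a Python set for the blacklist are assumptions; exact string equality is used for the match because the specification does not describe any URI normalization.

```python
import re

# Simplified pattern: a URI is "http(s)://" followed by non-whitespace characters.
URI_PATTERN = re.compile(r"https?://\S+")

def is_spam(body: str, blacklist: set[str]) -> bool:
    """Return True if any URI in the mail body is on the blacklist."""
    return any(uri in blacklist for uri in URI_PATTERN.findall(body))
```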

Spam determination can also be performed by known methods using the information stored in the blacklist storage unit 3 together with the URI: the IP address of the website, the domain information, the website page information (information on web page titles and frequently occurring words), and the information indicating the result of the user's judgment of whether the website is a harmful site.

As described above, according to this embodiment, the mail server 1 analyzes the character string of a URI written in mail and determines whether it identifies a harmful site, so junk mail can be filtered reliably even when the junk mail contains a URI that is being evaluated for the first time. In addition, the proxy server 2 analyzes the character string of a URI to which a user requests access and determines whether it identifies a harmful site, which improves the accuracy of harmful site determination and enables junk mail to be filtered reliably.

Furthermore, determining whether received mail is spam mail based on at least one of the website's IP address, the website's domain information, the website's page information, and information indicating the result of a user's judgment of whether the website is a harmful site improves the accuracy of spam determination and enables junk mail to be filtered reliably. In particular, determining whether received mail is spam mail based on the information indicating the user's judgment makes it possible to correct erroneous results of the automatic determination.

Embodiments of the present invention have been described in detail above with reference to the drawings, but the specific configuration is not limited to these embodiments and includes design changes and the like that do not depart from the gist of the present invention.

FIG. 1 is a block diagram showing the configuration of a computer system according to one embodiment of the present invention.
FIG. 2 is a flowchart showing the operating procedure of the mail server included in the computer system according to one embodiment of the present invention.
FIG. 3 is a flowchart showing the operating procedure of the proxy server included in the computer system according to one embodiment of the present invention.
FIG. 4 is a reference diagram showing the warning page displayed in one embodiment of the present invention.

Explanation of symbols

1: mail server; 2: proxy server; 3: blacklist storage unit (storage means); 10: mail receiving unit; 11: mail storage unit; 12: spam mail determination unit (junk mail determination means); 13: harmful site determination unit (first harmful site determination means); 20: web access monitoring unit; 21: harmful site determination unit (second harmful site determination means)

Claims

A computer system comprising:
first harmful site determination means for analyzing the character string of a URI (Uniform Resource Identifier) written in received mail and determining whether the website identified by the URI is a harmful site;
storage means for storing the URI determined to be a harmful site by the first harmful site determination means; and
junk mail determination means for determining whether received mail is junk mail based on the URI stored in the storage means.

The computer system according to claim 1, further comprising second harmful site determination means for analyzing the character string of a URI to which a user requests access and determining whether the website identified by the URI is a harmful site,
wherein the storage means further stores the URI determined to be a harmful site by the second harmful site determination means.

The computer system according to claim 1, wherein the first harmful site determination means further determines whether the website is a harmful site based on at least one of the IP address of the website identified by the URI, information on the domain of the website, and information on the pages of the website,
the storage means further stores at least one of the IP address of the website determined to be a harmful site by the first harmful site determination means, the information on the domain of the website, and the information on the pages of the website, and
the junk mail determination means further determines whether received mail is junk mail based on at least one of the IP address of the website, the information on the domain of the website, and the information on the pages of the website stored in the storage means.

The computer system according to claim 2, wherein the second harmful site determination means further determines whether the website is a harmful site based on at least one of the IP address of the website identified by the URI, information on the domain of the website, information on the pages of the website, and information indicating the result of a user's judgment of whether the website is a harmful site,
the storage means further stores at least one of the IP address of the website determined to be a harmful site by the first harmful site determination means, the information on the domain of the website, the information on the pages of the website, and the information indicating the result of the user's judgment of whether the website is a harmful site, and
the junk mail determination means further determines whether received mail is junk mail based on at least one of the IP address of the website, the information on the domain of the website, the information on the pages of the website, and the information indicating the result of the user's judgment of whether the website is a harmful site stored in the storage means.