JP6531793B2 - INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM

Info

Publication number: JP6531793B2
Application number: JP2017147453A
Authority: JP
Inventors: 靖大田中
Original assignee: Canon Marketing Japan Inc; Canon IT Solutions Inc
Current assignee: Canon Marketing Japan Inc; Canon IT Solutions Inc
Priority date: 2017-07-31
Filing date: 2017-07-31
Publication date: 2019-06-19
Anticipated expiration: 2037-07-31
Also published as: JP2019028724A

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

In recent years, targeted attacks that aim to steal money or important information from specific organizations or individuals have been increasing. To achieve its goal, a targeted attack combines a variety of techniques based on information specific to the target organization or individual.

A typical means of initial intrusion in a targeted attack is an e-mail known as a targeted attack e-mail. The body of a targeted attack e-mail contains content that attracts the interest of the targeted recipient, or content that misleads the recipient into believing it concerns them, with the aim of getting the recipient to execute a malicious program planted in a file attached to the e-mail or at a link written in it. If the target executes the malicious program, a backdoor is installed and becomes an entry point for further attacks.

Defense in depth is the basic approach to countering targeted attacks, but blocking the targeted attack e-mail at the initial stage, if possible, is the most effective and desirable.

Some targeted attack e-mails not only exploit the target's interests, concerns, or misunderstandings, but also impersonate a person or organization closely related to the target. The targeted recipient, believing the e-mail to be from an acquaintance, lets down their guard, which raises the probability that the attack succeeds.

Patent Document 1 discloses a technique for detecting an e-mail that spoofs the sender address of an e-mail the target has received in the past.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2013-236308

Patent Document 1 describes storing features of mail headers extracted from e-mails received in the past and, when a new e-mail is received, computing the similarity between the header features of past e-mails having the same sender address and the header features of the new e-mail, and issuing a warning when the similarity is at or below a prescribed value. In other words, a warning is issued when an e-mail comes from the same sender but its text differs from the usual.

However, while Patent Document 1 is effective when the sender address is spoofed, it has the problem that it is ineffective against attack e-mails that are sent from an entirely different address and impersonate someone through the body text.

In general, when receiving e-mail, a user checks the subject and sender in the list of unread e-mails and then reads the body. The list often shows a display name as the sender, so if nothing in the body seems out of place, the user may neglect to check the sender's address. An attacker may also use a confusing address intended to mislead, for example one that contains a word associated with the impersonated party.

In such cases, the sender is naturally often an unknown (new) address, and a warning to that effect can be issued, but it is more desirable to warn explicitly that the e-mail is an attack.

Accordingly, an object of the present invention is to prevent damage from targeted attack e-mails by warning about an e-mail (spoofed mail) that has text similar to an e-mail received in the past but was sent from a different address, and prompting the user to verify the sender.

An information processing apparatus of the present invention comprises: determination means for determining, based on knowledge learned using e-mails received in the past, whether a newly received e-mail is a spoofed mail; and notification means for issuing a warning based on the result of the determination by the determination means.

An information processing method of the present invention is an information processing method in an information processing apparatus, comprising: a determination step of determining, based on knowledge learned using e-mails received in the past, whether a newly received e-mail is a spoofed mail; and a notification step of issuing a warning based on the result of the determination in the determination step.

A program of the present invention causes an information processing apparatus to function as: determination means for determining, based on knowledge learned using e-mails received in the past, whether a newly received e-mail is a spoofed mail; and notification means for issuing a warning based on the result of the determination by the determination means.

According to the present invention, damage caused by targeted attack e-mails can be reduced.

FIG. 1 is a diagram showing an example of the system configuration of the spoofed-mail inspection apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the hardware configuration of the spoofed-mail inspection apparatus and the mail server in the embodiment.
FIG. 3 is a diagram showing an example of the functional configuration of the spoofed-mail inspection apparatus in the embodiment.
FIG. 4 is a flowchart showing an example of the processing for inspecting an e-mail in the embodiment.
FIG. 5 is a diagram showing an example of the configuration of the mail storage area in the embodiment.
FIG. 6 is a diagram showing an example of the mail table in the embodiment.
FIG. 7 is a diagram showing an example of the screen that displays the list of e-mails in the embodiment.
FIG. 8 is a flowchart showing an example of the processing for acquiring the user's determination on an e-mail in the embodiment.
FIG. 9 is a diagram showing an example of the screen for viewing an e-mail in the embodiment.
FIG. 10 is a flowchart showing an example of the processing for constructing the knowledge used for determination in the embodiment.
FIG. 11 is a diagram showing an example of the configuration of the determination knowledge storage area in the embodiment.
FIG. 12 is a diagram showing an example of the word statistics table in the embodiment.
FIG. 13 is a diagram showing an example of the per-sender word statistics table in the embodiment.
FIG. 14 is a diagram showing an example of the result of extracting words from the fixed-form part in the embodiment.
FIG. 15 is a diagram showing an example of a document vector in the embodiment.
FIG. 16 is a diagram showing an overview of the classifier in the embodiment.
FIG. 17 is a flowchart showing an example of the fixed-form part extraction processing in the embodiment.
FIG. 18 is a diagram showing an example of the result of fixed-form part extraction in the embodiment.
FIG. 19 is a diagram showing an example of the list of line weights generated by the fixed-form part extraction processing in the embodiment.
FIG. 20 is a flowchart showing an example of the spoofing determination processing in the embodiment.
FIG. 21 is a diagram showing an example of a spoofed mail in the embodiment.
FIG. 22 is a diagram showing an example of sender candidates in the embodiment.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a diagram showing an example of the system configuration of the spoofed-mail inspection apparatus according to the embodiment of the present invention.

The spoofed-mail inspection apparatus 100 and the mail server 120 are connected via a local area network 130. The mail server 120 is also configured to be connectable to an external network 140.

The spoofed-mail inspection apparatus 100 has the functions of a general e-mail client: it can display e-mails and receive them from the mail server. In addition, when an e-mail is newly received, the apparatus determines, based on knowledge obtained from previously received e-mails, whether the new e-mail is spoofed, that is, whether its body contains text similar to that of an e-mail received in the past even though it was sent from a sender different from the sender of that past e-mail. If the e-mail is judged to be possibly spoofed, the apparatus highlights the suspicious part of the e-mail and prompts the user to check it visually.

When a newly received e-mail is judged not to be spoofed, the spoofed-mail inspection apparatus 100 updates the knowledge used to determine spoofing.

The mail server 120 is a general mail receiving server (POP3 or IMAP server) that delivers mail received from external or local senders to an e-mail client on request.

FIG. 2 is a block diagram showing an example of the hardware configuration of the spoofed-mail inspection apparatus 100 according to the embodiment of the present invention.

As shown in FIG. 2, in the information processing apparatus, a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, a RAM (Random Access Memory) 203, an input controller 205, a video controller 206, a memory controller 207, and a communication I/F controller 208 are connected via a system bus 204.

The CPU 201 centrally controls the devices and controllers connected to the system bus 204.

The ROM 202 or the external memory 211 holds the BIOS (Basic Input/Output System) and OS (Operating System), which are control programs executed by the CPU 201, as well as the computer-readable, executable program that implements the present information processing method and the various data it requires (including data tables).

The RAM 203 functions as the main memory, work area, and the like of the CPU 201. When executing processing, the CPU 201 loads the necessary programs and the like from the ROM 202 or the external memory 211 into the RAM 203 and executes the loaded programs to carry out various operations.

The input controller 205 controls input from input devices such as a keyboard 209 and a pointing device such as a mouse (not shown). When the input device is a touch panel, the user can give various instructions by pressing (touching with a finger or the like) an icon, cursor, or button displayed on the touch panel.

The touch panel may be one capable of detecting positions touched by multiple fingers, such as a multi-touch screen.

The video controller 206 controls display on an external output device such as a display 210. The display includes the built-in display of a notebook computer. The external output device is not limited to a display and may be, for example, a projector. A device that can accept the touch operations described above also serves as an input device.

The video controller 206 can control a video memory (VRAM) used for display control. Part of the RAM 203 may be used as the video memory area, or a separate dedicated video memory may be provided.

The memory controller 207 controls access to the external memory 211. Usable external memories include an external storage device (hard disk) that stores a boot program, various applications, font data, user files, editing files, and various data; a flexible disk (FD); and a CompactFlash (registered trademark) memory connected to a PCMCIA card slot via an adapter.

The communication I/F controller 208 connects to and communicates with external devices via a network and executes communication control processing on the network. For example, it supports communication using TCP/IP, communication over telephone lines such as ISDN, and communication over the 3G lines of mobile phones.

The CPU 201 enables display on the display 210 by, for example, expanding (rasterizing) outline fonts into a display information area in the RAM 203. The CPU 201 also allows the user to give instructions with a mouse cursor or the like (not shown) on the display 210.

Next, an example of the functional configuration of the devices in the embodiment of the present invention will be described with reference to FIG. 3.

The reception processing unit 301 has a function of receiving e-mails from the mail server 120.

The inspection processing unit 302 has functions including determining whether an e-mail received by the reception processing unit 301 is a spoofed mail (an e-mail whose text is similar to, or contains the same text as, an e-mail received in the past, but whose sender differs from the sender of that similar e-mail).

The mail storage area 303 is an area that stores e-mails, and has a function of storing each e-mail in association with spoofing information and determination information.

The determination knowledge storage area 304 has a function of storing the various information and learned knowledge used to determine whether an e-mail is a spoofed mail.

The viewing processing unit 305 has a function of displaying a mail viewing screen and the like, and of displaying a warning and the like indicating that an e-mail is a spoofed mail.

The determination knowledge construction processing unit 306 has a function of executing processing that constructs the knowledge used to determine whether an e-mail is a spoofed mail.

(E-mail inspection processing)
Next, the e-mail inspection processing executed by the inspection processing unit 302 in the embodiment of the present invention will be described using the flowchart of FIG. 4.

The flowchart of FIG. 4 shows processing that the CPU 201 of the spoofed-mail inspection apparatus 100 performs by reading and executing a predetermined control program. When the reception processing unit 301 receives an e-mail, this processing inspects whether the e-mail may have been spoofed.

In step S401, the inspection processing unit 302 receives the e-mail that was received by the reception processing unit 301.

In step S402, the inspection processing unit 302 determines whether the e-mail received in step S401 is spoofed (whether spoofing is possible) and acquires spoofing information. Details of the spoofing determination processing are described later with reference to FIG. 20.

In step S403, if the inspection processing unit 302 determined in step S402 that spoofing is possible, the processing proceeds to step S404. If it determined that spoofing is not possible, the processing proceeds to step S405.

In step S404, the inspection processing unit 302 associates the spoofing information acquired in step S402 with the e-mail being processed.

In step S405, the inspection processing unit 302 stores the e-mail in the mail table 501 of the mail storage area 303.
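Steps S401 to S405 can be sketched as follows. This is a minimal illustration, not the apparatus's implementation: the `Mail` record, the `inspect_mail` function, and the `detect_spoofing` callback are hypothetical names, and the actual spoofing determination (FIG. 20) is represented only by a stand-in callback that returns spoofing information or `None`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Mail:
    """Hypothetical record for one row of the mail table 501."""
    sender: str
    recipient: str
    body: str
    spoof_info: Optional[str] = None      # attached in S404 when spoofing is suspected
    determination: Optional[str] = None   # filled in later from the user's judgment


def inspect_mail(
    mail: Mail,                                           # S401: mail handed over by the receiver
    detect_spoofing: Callable[[Mail], Optional[str]],
    mail_table: List[Mail],
) -> Mail:
    spoof_info = detect_spoofing(mail)   # S402: returns None when no spoofing is suspected
    if spoof_info is not None:           # S403: branch on the result
        mail.spoof_info = spoof_info     # S404: associate the information with the mail
    mail_table.append(mail)              # S405: store the mail either way
    return mail
```

Note that the mail is stored whether or not spoofing is suspected; only the attached spoofing information differs.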

FIG. 5 shows the configuration of the mail storage area 303, which contains a mail table 501. The specific data contents are described with reference to FIG. 6.

The mail table 501 shown in FIG. 6 is a table that stores the e-mails received by the inspection processing unit 302, and is composed of the items of each e-mail (specifically, the body of the e-mail, the recipient (destination), the sender, the date and time of sending, and so on).

(Mail determination processing)
Next, the mail determination processing performed by the user will be described using the flowchart of FIG. 8.

When the user starts (or logs in to) the spoofed-mail inspection apparatus, it displays a mail list screen 701 as shown in FIG. 7, in the same way as a general mail client. The contents of the mail list screen 701 are based on the contents of the mail table 501. A warning icon 702 is displayed for each e-mail that has spoofing information but no determination information.
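The display rule for the warning icon 702 amounts to a two-condition check. The sketch below assumes a mail row is a dictionary with hypothetical keys `spoof_info` and `determination`, where `None` means the item is absent; neither the keys nor the function name come from the original.

```python
from typing import Optional


def needs_warning_icon(spoof_info: Optional[str], determination: Optional[str]) -> bool:
    """True when the mail has spoofing information but the user has not judged it yet."""
    return spoof_info is not None and determination is None


def build_list_rows(mail_table):
    """Pair each mail row (a dict) with a flag telling the list screen to draw the icon."""
    return [
        (mail, needs_warning_icon(mail.get("spoof_info"), mail.get("determination")))
        for mail in mail_table
    ]
```

Once the user records a determination for a flagged e-mail, the second condition fails and the icon is no longer shown.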

When the user gives an instruction to view any e-mail displayed in the list, the viewing processing unit 305 executes the processing of the flowchart shown in FIG. 8.

In step S801, the viewing processing unit 305 displays a mail viewing screen 901 as shown in FIG. 9 for the e-mail specified by the user, based on the information acquired from the mail table 501. If the e-mail has spoofing information, the details of the spoofing information are displayed in a warning display area 902.

When the user presses the close button 905, the mail viewing screen 901 is closed and the processing proceeds to step S802.

In step S802, the viewing processing unit 305 acquires the state of the determination instruction area 904 of the mail viewing screen 901 as determination information. If "This is a spoofed mail." is active (the user has indicated that the e-mail is a spoofed mail), "spoofed" is acquired as the determination information. If "This is not a spoofed mail." is active (the user has indicated that the e-mail is not a spoofed mail), "non-spoofed" is acquired as the determination information.

An e-mail whose determination information is "spoofed" may be hidden from the list or may be deleted automatically.

In step S803, the viewing processing unit 305 stores the determination information acquired in step S802 in the mail table 501 in association with the corresponding e-mail.
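Steps S802 and S803, together with the optional hiding of e-mails judged to be spoofed, might look like the following sketch. The dictionary-based mail table, the function names, and the boolean `user_says_spoofed` flag (standing in for the state of the determination instruction area 904) are assumptions made for illustration.

```python
SPOOFED = "spoofed"
NON_SPOOFED = "non-spoofed"


def record_determination(mail_table, mail_id, user_says_spoofed: bool) -> str:
    """S802: derive the determination from the selected option; S803: store it with the mail."""
    determination = SPOOFED if user_says_spoofed else NON_SPOOFED
    mail_table[mail_id]["determination"] = determination
    return determination


def visible_mail_ids(mail_table, hide_spoofed: bool = True):
    """Optionally leave mails judged to be spoofed out of the list screen."""
    return [
        mail_id
        for mail_id, mail in mail_table.items()
        if not (hide_spoofed and mail.get("determination") == SPOOFED)
    ]
```

Only e-mails recorded as "non-spoofed" would later feed the knowledge construction, so the stored value doubles as a training label.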

(Determination knowledge construction processing)
Next, the determination knowledge construction processing will be described using the flowchart of FIG. 10. In this embodiment, the determination knowledge construction processing is executed asynchronously with the mail determination processing. It may be configured to run periodically, or to decide whether to run based on the state of the e-mails stored since the previous run.

For ease of explanation, this embodiment rebuilds all of the determination knowledge each time, but the determination knowledge may instead be updated partially.

In step S1001, the determination knowledge construction processing unit 306 acquires from the mail table 501 the e-mails whose determination information is set to "non-spoofed". The acquired e-mails may be restricted to a subset of past e-mails to be used for knowledge construction, for example "e-mails sent in the last six months".
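The selection in step S1001 can be sketched as a filter over the mail table. The dictionary keys `determination` and `sent_at`, the function name, and the expression of "the last six months" as a number of days are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def select_training_mails(mail_table, now: datetime, max_age_days: Optional[int] = None):
    """Return mails judged 'non-spoofed', optionally limited to recent ones."""
    selected = []
    for mail in mail_table:
        if mail.get("determination") != "non-spoofed":
            continue  # undetermined or spoofed mails are not used for learning
        if max_age_days is not None and now - mail["sent_at"] > timedelta(days=max_age_days):
            continue  # older than the configured window
        selected.append(mail)
    return selected
```

Calling it with `max_age_days=183` approximates the six-month window; leaving the argument as `None` uses every e-mail judged "non-spoofed".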

In step S1002, the determination knowledge construction processing unit 306 starts a loop, running through step S1005, over the e-mails acquired in step S1001.

In step S1003, the determination knowledge construction processing unit 306 obtains the words that make up the body of the e-mail being processed, using morphological analysis or a similar technique. When obtaining the words, function words such as particles and auxiliary verbs may be excluded, and normalization, for example with a synonym dictionary, may be applied.

In step S1004, the determination knowledge construction processing unit 306 updates the word statistics table 1101 with the words obtained by the segmentation in step S1003. At the same time, it updates the per-sender word statistics table 1102 in association with the sender.

In step S1005, if e-mails remain to be processed, the determination knowledge construction processing unit 306 repeats the loop from step S1002. If no e-mails remain, the processing proceeds to step S1006.

In step S1006, the determination knowledge construction processing unit 306 calculates word appearance probabilities from the word counts obtained in step S1004 and updates the word statistics table 1101 and the per-sender word statistics table 1102.
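The counting loop (S1002 to S1005) and the probability calculation (S1006) can be sketched as follows. The input is assumed to be already segmented into words, since the morphological analyzer is outside this sketch. The text does not give the formula for the appearance probability, so the sketch assumes it is the number of e-mails containing the word divided by the number of e-mails, overall and per sender. All names are hypothetical.

```python
from collections import Counter, defaultdict


def build_word_statistics(mails):
    """mails: iterable of (sender, words) pairs, words being the segmented body."""
    doc_freq = Counter()                  # word -> number of mails containing it
    sender_doc_freq = defaultdict(Counter)
    sender_mail_count = Counter()
    mail_count = 0

    for sender, words in mails:           # S1002-S1005: loop over the mails
        mail_count += 1
        sender_mail_count[sender] += 1
        for word in set(words):           # count each word once per mail
            doc_freq[word] += 1
            sender_doc_freq[sender][word] += 1

    # S1006: turn the counts into appearance probabilities (assumed df / N)
    word_prob = {w: c / mail_count for w, c in doc_freq.items()}
    sender_word_prob = {
        s: {w: c / sender_mail_count[s] for w, c in counts.items()}
        for s, counts in sender_doc_freq.items()
    }
    return word_prob, sender_word_prob
```

Comparing a word's per-sender probability with its overall probability is one way such statistics could indicate that a word is characteristic of a particular sender.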

In step S1007, the determination knowledge construction processing unit 306 starts a loop, running through step S1010, over the e-mails acquired in step S1001.

In step S1008, the determination knowledge construction processing unit 306 extracts the fixed-form part of the e-mail being processed. The fixed-form part extraction processing is described later with reference to FIG. 17.

In step S1009, the determination knowledge construction processing unit 306 stores the words that make up the fixed-form part extracted in step S1008 in a temporary area, in association with the e-mail concerned. The word extraction may reuse the result of step S1003.

In step S1010, if e-mails remain to be processed, the determination knowledge construction processing unit 306 repeats the loop from step S1007. If no e-mails remain, the processing proceeds to step S1011.

In step S1011, the determination knowledge construction processing unit 306 vectorizes the pairs of e-mail and fixed-form-part words accumulated in the temporary area, and stores the information needed for vectorization in the vectorization information table 1103 of the determination knowledge storage area 304. Vectorization uses a general method such as tf-idf. E-mails that have no such word set, or that have fewer words than a prescribed number, may be excluded.
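As one illustration of step S1011, the sketch below vectorizes word lists with a plain tf-idf weighting. The text only names tf-idf as an example of a general method, so the exact variant here (relative term frequency multiplied by `log(N / df) + 1`) is an assumption, as are the function names. The returned vocabulary and idf weights play the role of the information needed for vectorization.

```python
import math
from collections import Counter


def fit_idf(docs, min_words=1):
    """Learn vocabulary and idf weights from word lists; short documents are skipped."""
    docs = [words for words in docs if len(words) >= min_words]
    n_docs = len(docs)
    doc_freq = Counter(w for words in docs for w in set(words))
    vocab = sorted(doc_freq)
    idf = {w: math.log(n_docs / doc_freq[w]) + 1.0 for w in vocab}
    return vocab, idf


def vectorize(words, vocab, idf):
    """Map a word list to a tf-idf vector over the learned vocabulary."""
    tf = Counter(words)
    total = len(words) or 1  # avoid division by zero for an empty word list
    return [tf[w] / total * idf[w] for w in vocab]
```

A newly received e-mail can be vectorized with the stored `vocab` and `idf`; words that never appeared during fitting are simply ignored.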

In step S1012, the determination knowledge construction processing unit 306 performs learning by associating each vector obtained in step S1011 with the sender (From:) of the corresponding e-mail, and stores the learned knowledge in the learning knowledge table 1104 of the determination knowledge storage area 304.

The learning uses a general classifier capable of multi-class classification.
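The text leaves the choice of classifier open. Purely as a stand-in, the sketch below uses a nearest-centroid classifier with cosine similarity, one of the simplest multi-class methods: it averages the vectors of each sender and assigns a new vector to the sender whose average is most similar. This is not stated to be the classifier of the embodiment, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(vectors, senders):
    """Learn one mean vector per sender (the 'learned knowledge' of this sketch)."""
    sums = {}
    counts = defaultdict(int)
    for vec, sender in zip(vectors, senders):
        if sender not in sums:
            sums[sender] = [0.0] * len(vec)
        sums[sender] = [a + b for a, b in zip(sums[sender], vec)]
        counts[sender] += 1
    return {s: [x / counts[s] for x in total] for s, total in sums.items()}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_sender(model, vec):
    """Return the sender whose centroid is most similar to the given vector."""
    return max(model, key=lambda s: cosine(model[s], vec))
```

Under this sketch, an e-mail whose fixed-form part is classified as one sender while its actual From: address is another would be a candidate for a spoofing warning.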

（判定知識構築処理具体例）
次に判定知識構築処理の具体例として、図６に示すメールテーブル５０１に対して図１０に示す処理が実施された場合について説明する。 (Specific example of judgment knowledge construction process)
Next, as a specific example of the determination knowledge construction process, a case where the process shown in FIG. 10 is performed on the mail table 501 shown in FIG. 6 will be described.

ステップＳ１００１では、判定知識構築処理部３０６は、電子メール６０１から始まる一連の電子メールを取得する。 In step S1001, the determination knowledge construction processing unit 306 acquires a series of electronic mail starting from the electronic mail 601.

ステップＳ１００２では、判定知識構築処理部３０６は、電子メール６０１に対して、ステップＳ１００５までの繰り返し処理を開始する。 In step S1002, the determination knowledge construction processing unit 306 starts repeated processing up to step S1005 for the email 601.

ステップＳ１００３では、判定知識構築処理部３０６は、電子メール６０１の本文から単語として「ニューセレクト」「近藤」「様」「お世話」「トサロジスティクス」「坂本」「納品書」「件」などを取得する（本具体例においては、助詞・助動詞などの付属語は対象外とする）。 In step S1003, the determination knowledge construction processing unit 306 acquires “New Select”, “Kondo”, “Like”, “Take Care”, “Tosa Logistics”, “Sakamoto”, “Invoice”, “Case”, etc. as words from the text of the email 601. (In this specific example, adjuncts such as particles and auxiliary verbs are excluded).

ステップＳ１００４では、判定知識構築処理部３０６は、ステップＳ１００３で分割した単語に対して単語統計テーブル１１０１を更新する。具体的には、単語がなければ単語を単語統計テーブル１１０１に追加し、出現頻度（出現文書数）を１とする。単語があれば出現頻度（出現文書数）を１加算する。また同様に送信者「ｓａｋａｍｏｔｏ＠ｔｏｓａ．ｃｏ．ｊｐ」と関連付けて送信者別単語統計テーブル１１０２も更新する。 In step S1004, the determination knowledge construction processing unit 306 updates the word statistics table 1101 for the words divided in step S1003. Specifically, if there is no word, the word is added to the word statistics table 1101, and the appearance frequency (the number of appearing documents) is set to 1. If there is a word, the appearance frequency (the number of appearing documents) is incremented by one. Similarly, the per-sender word statistics table 1102 is also updated in association with the sender “sakamoto@tosa.co.jp”.

ステップＳ１００５では、判定知識構築処理部３０６は、処理対象となる電子メール６０２があるので、ステップＳ１００２からの繰り返し処理を実施する。 In step S1005, the determination knowledge construction processing unit 306 executes the repetitive processing from step S1002 because there is the electronic mail 602 to be processed.

以下、全ての電子メールに対して同様の処理を繰り返す。 Thereafter, the same processing is repeated for all e-mails.

ステップＳ１００６では、判定知識構築処理部３０６は、ステップＳ１００１〜ステップＳ１００５で更新した単語統計テーブル１１０１に対し、単語の出現確率を算出して単語統計テーブル１１０１を更新し、結果として図１２に示す単語統計テーブルを得る。また送信者別単語統計テーブル１１０２に対しても同様に単語の出現確率を求めて更新する。 In step S1006, the determination knowledge construction processing unit 306 calculates the appearance probability of each word in the word statistics table 1101 updated in steps S1001 to S1005 and updates the table, obtaining the word statistics table shown in FIG. 12. The appearance probability of each word is likewise calculated and stored in the per-sender word statistics table 1102.
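The table updates of steps S1004 and S1006 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `build_word_statistics`, the input shape (a list of sender and word-list pairs), and in particular the assumption that the "appearance probability" is the number of mails containing the word divided by the number of mails are all hypothetical, since this section does not give the formula.

```python
from collections import Counter, defaultdict


def build_word_statistics(mails):
    """Build overall and per-sender word statistics.

    mails: iterable of (sender, words) pairs, where words is the list of
    content words already extracted from one mail body (step S1003).
    Returns (overall, per_sender), both mapping word -> appearance probability.
    """
    doc_freq = Counter()                     # word -> number of mails containing it
    sender_doc_freq = defaultdict(Counter)   # sender -> word -> number of mails
    total_mails = 0
    sender_mails = Counter()                 # sender -> number of mails

    for sender, words in mails:
        total_mails += 1
        sender_mails[sender] += 1
        # S1004: count each word once per mail (document frequency).
        for word in set(words):
            doc_freq[word] += 1
            sender_doc_freq[sender][word] += 1

    # S1006 (assumed formula): probability = document frequency / mail count.
    overall = {w: c / total_mails for w, c in doc_freq.items()}
    per_sender = {
        s: {w: c / sender_mails[s] for w, c in counts.items()}
        for s, counts in sender_doc_freq.items()
    }
    return overall, per_sender
```

Counting each word once per mail mirrors the "number of appearing documents" wording of step S1004; a term-frequency variant would be a different design.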

ステップＳ１００７では、判定知識構築処理部３０６は、電子メール６０１から始まる一連の電子メールに対して、ステップＳ１０１０までの繰り返し処理を開始する。 In step S1007, the determination knowledge construction processing unit 306 starts repeated processing up to step S1010 for a series of electronic mail starting from the electronic mail 601.

ステップＳ１００８では、判定知識構築処理部３０６は、電子メール６０１に対する定型部抽出処理の結果、メール冒頭の挨拶や署名等の定型的に記述されている部分として「ニューセレクト近藤様お世話になっております。トサロジスティクス坂本です。」と「−− トサロジスティクス株式会社Ｔｅｌ：１２３−４５６７Ｆａｘ：１２３−４５６８坂本辰雄」を抽出する。定型部抽出の具体例については後述する。 In step S1008, as a result of the fixed form part extraction process on the e-mail 601, the determination knowledge construction processing unit 306 extracts "New Select, Kondo-sama. Thank you for your continued support. This is Sakamoto of Tosa Logistics." and "-- Tosa Logistics Co., Ltd. Tel: 123-4567 Fax: 123-4568 Tatsuo Sakamoto" as the parts written in a routine manner, such as the greeting at the beginning of the mail and the signature. A specific example of fixed form part extraction will be described later.

ステップＳ１００９では、判定知識構築処理部３０６は、ステップＳ１００８で抽出した電子メールの定型部を構成する単語として「ニューセレクト」、「近藤」、「様」、「お世話」、「トサロジスティクス」、「坂本」、「以上」、「よろしく」、「お願い」、「株式」、「会社」、「ｔｅｌ」、「１２３−４５６７」、「ｆａｘ」、「１２３−４５６８」、「辰雄」を、電子メール６０１と関連付けて、一時領域に保存する。 In step S1009, the determination knowledge construction processing unit 306 stores, in a temporary area and in association with the e-mail 601, the words that make up the fixed form part extracted in step S1008: "New Select", "Kondo", "sama", "osewa", "Tosa Logistics", "Sakamoto", "that is all", "regards", "request", "stock", "company", "tel", "123-4567", "fax", "123-4568", and "Tatsuo".

ステップＳ１０１０では、判定知識構築処理部３０６は、電子メール６０２があるので、ステップＳ１００７からの繰り返し処理を実施する。 In step S1010, the determination knowledge construction processing unit 306 executes the iterative process from step S1007 because there is the electronic mail 602.

以下、同様の処理を繰り返し、図１４の結果を一時領域に得る。 Thereafter, the same process is repeated to obtain the result of FIG. 14 in the temporary area.

ステップＳ１０１１では、判定知識構築処理部３０６は、一時領域に蓄えられた、電子メールと定型の単語の組みをベクトル化して図１５に示すベクトルを得る。ベクトル化に必要な情報を判定知識保存領域３０４におけるベクトル化情報テーブル１１０３に保存する。 In step S1011, the determination knowledge construction processing unit 306 vectorizes the pairs of e-mail and fixed form words stored in the temporary area to obtain the vectors shown in FIG. 15. The information required for vectorization is stored in the vectorization information table 1103 in the determination knowledge storage area 304.

ステップＳ１０１２では、判定知識構築処理部３０６は、ステップＳ１０１１で取得したベクトルに対し、関連付けられている電子メールの送信者を関連付けて学習を行ない、学習した知識を判定知識保存領域３０４における学習知識テーブル１１０４に保存する。 In step S1012, the determination knowledge construction processing unit 306 performs learning by associating each vector acquired in step S1011 with the sender of the corresponding e-mail, and stores the learned knowledge in the learning knowledge table 1104 in the determination knowledge storage area 304.
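Steps S1011 and S1012 can be illustrated with the sketch below. The text does not name the vectorization scheme or the classifier, so everything here is an assumption chosen for brevity: a binary bag-of-words vector over a vocabulary (standing in for the vectorization information table 1103), and a per-sender mean vector scored by cosine similarity (standing in for the learning knowledge table 1104). All function names are hypothetical.

```python
import math
from collections import defaultdict


def build_vocabulary(word_lists):
    """Assumed form of the vectorization information: word -> vector index."""
    vocab = sorted({w for words in word_lists for w in words})
    return {w: i for i, w in enumerate(vocab)}


def vectorize(words, vocab):
    """Binary bag-of-words vector; words outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for w in words:
        index = vocab.get(w)
        if index is not None:
            vec[index] = 1.0
    return vec


def train_centroids(samples, vocab):
    """samples: list of (sender, fixed_part_words). Returns sender -> mean vector."""
    sums = {}
    counts = defaultdict(int)
    for sender, words in samples:
        acc = sums.setdefault(sender, [0.0] * len(vocab))
        for i, v in enumerate(vectorize(words, vocab)):
            acc[i] += v
        counts[sender] += 1
    return {s: [v / counts[s] for v in acc] for s, acc in sums.items()}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_senders(words, vocab, centroids):
    """Return (sender, score) pairs, highest score first."""
    vec = vectorize(words, vocab)
    scores = [(s, cosine(vec, c)) for s, c in centroids.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Any classifier that returns a per-sender confidence could replace the centroid scoring; the later determination steps only need ranked candidates with scores.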

結果として、詐称メール検査装置１００は、ステップＳ１０１１で取得したベクトル化情報とステップＳ１０１２で取得した学習知識を用いることで、電子メールの挨拶部に対する送信者候補を確度付きで推測することが可能となる。図１６に分類器による送信者候補推測の概要につい As a result, by using the vectorization information acquired in step S1011 and the learned knowledge acquired in step S1012, the spoofed e-mail inspection apparatus 100 can estimate sender candidates for the greeting part of an e-mail, each with a confidence value. FIG. 16 shows an overview of sender candidate estimation by the classifier.

（定型部抽出処理） (Fixed form part extraction process)
次に図１７のフローチャートを用いて、ステップＳ１００８において、判定知識構築処理部３０６が電子メールから定型部を抽出する定型部抽出処理について説明する。 Next, the fixed form part extraction process, in which the determination knowledge construction processing unit 306 extracts the fixed form part from an e-mail in step S1008, will be described with reference to the flowchart of FIG. 17.

本発明の実施形態の説明では、出現する単語の出現頻度や出現確率を用いて定型部を特定するが、他の方法によって抽出するように構成しても構わない。 In the description of the embodiment of the present invention, the fixed part is specified using the appearance frequency and the appearance probability of the appearing word, but may be configured to be extracted by another method.

ステップＳ１７０１では、判定知識構築処理部３０６は、処理対象の電子メールから本文を取得する。 In step S1701, the determination knowledge construction processing unit 306 acquires the text from the processing target electronic mail.

ステップＳ１７０２では、判定知識構築処理部３０６は、ステップＳ１７０１で取得した本文の各行に対して、ステップＳ１７０６までの繰り返し処理を開始する。 In step S1702, the determination knowledge construction processing unit 306 starts repeated processing up to step S1706 for each line of the text acquired in step S1701.

ステップＳ１７０３では、判定知識構築処理部３０６は、処理対象の行を単語に分割する。 In step S1703, the determination knowledge construction processing unit 306 divides the processing target line into words.

ステップＳ１７０４では、判定知識構築処理部３０６は、ステップＳ１７０３で取得した単語それぞれに対し単語統計テーブル１１０１からの出現確率を全文書出現確率として取得する。また送信者別単語統計テーブル１１０２から出現確率を送信者別出現確率として取得する。送信者別単語統計テーブル１１０２からは１つの単語に対して複数の出現確率が得られる。全文書出現確率と送信者別出現確率の最大のものをその単語の出現確率とする。 In step S1704, for each word acquired in step S1703, the determination knowledge construction processing unit 306 acquires the appearance probability from the word statistics table 1101 as the all-document appearance probability, and acquires the appearance probability from the per-sender word statistics table 1102 as the per-sender appearance probability. The per-sender word statistics table 1102 can yield several appearance probabilities for one word (one per sender). The largest value among the all-document appearance probability and the per-sender appearance probabilities is taken as the appearance probability of the word.

ステップＳ１７０５では、判定知識構築処理部３０６は、ステップＳ１７０４で取得した出現確率の調和平均を行の重みとして算出する。行に有効な単語が１つも出現しない場合は行の重みを０とする。 In step S1705, the determination knowledge construction processing unit 306 calculates the harmonic mean of the appearance probabilities acquired in step S1704 as the line weight. If no valid word appears in the line, the line weight is set to 0.

ステップＳ１７０６では、判定知識構築処理部３０６は、処理対象となる行がまだあれば、ステップＳ１７０２からの繰り返し処理を実施する。処理対象となる行がなければ、ステップＳ１７０７に処理を移す。 In step S1706, if there is still a row to be processed, the determination knowledge construction processing unit 306 executes the iterative process from step S1702. If there is no row to be processed, the process moves to step S1707.

ステップＳ１７０７では、判定知識構築処理部３０６は、行の重みが規定値以上である行を定型部として抽出する。全ての行が規定値に満たない場合は定型部がないものと判断する。 In step S1707, the determination knowledge construction processing unit 306 extracts the lines whose line weight is equal to or greater than a specified value as the fixed form part. If no line reaches the specified value, it determines that there is no fixed form part.
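Steps S1704 to S1707 can be sketched as below. The sketch assumes that lines arrive already split into words, that a word missing from both tables has probability 0, and that a line containing such a word gets weight 0 (the limiting value of the harmonic mean); the text itself only says that a line with no valid word gets weight 0. The function names are hypothetical, and the default threshold of 0.3 is the value used in the specific example of this embodiment.

```python
def word_probability(word, overall, per_sender):
    """S1704: the largest of the all-document and per-sender probabilities."""
    candidates = [overall.get(word, 0.0)]
    candidates += [table.get(word, 0.0) for table in per_sender.values()]
    return max(candidates)


def line_weight(words, overall, per_sender):
    """S1705: harmonic mean of the word probabilities of one line."""
    probs = [word_probability(w, overall, per_sender) for w in words]
    # Assumption: an empty line, or one with a zero-probability word, weighs 0.
    if not probs or min(probs) == 0.0:
        return 0.0
    return len(probs) / sum(1.0 / p for p in probs)


def extract_fixed_part(lines, overall, per_sender, threshold=0.3):
    """S1707: keep lines whose weight reaches the threshold.

    lines: list of (text, words) pairs. An empty result means
    "no fixed form part".
    """
    return [
        text
        for text, words in lines
        if line_weight(words, overall, per_sender) >= threshold
    ]
```

The harmonic mean is dominated by the rarest word in a line, so a single uncommon word pulls the whole line below the threshold.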

上記のように構成することで、電子メールに記述されている定型部を抽出することが可能 With the above configuration, the fixed form part written in an e-mail can be extracted.

（定型部抽出処理具体例） (Specific example of the fixed form part extraction process)
次に定型部抽出処理の具体例として、図１８に示す電子メール１８０１に対して図１７に示す処理が実施された場合について説明する。 Next, as a specific example of the fixed form part extraction process, a case where the process shown in FIG. 17 is performed on the e-mail 1801 shown in FIG. 18 will be described.

ステップＳ１７０１では、判定知識構築処理部３０６は、処理対象の電子メール１８０１から本文１８０２を取得する。 In step S1701, the determination knowledge construction processing unit 306 acquires the text 1802 from the email 1801 to be processed.

ステップＳ１７０２では、判定知識構築処理部３０６は、ステップＳ１７０１で取得した本文１８０２に対して、ステップＳ１７０６までの繰り返し処理を開始する。 In step S1702, the determination knowledge construction processing unit 306 starts repeated processing up to step S1706 for the text 1802 acquired in step S1701.

ステップＳ１７０３では、判定知識構築処理部３０６は、本文１８０２の最初の行「ニューセレクト近藤様」を単語に分割し、「ニューセレクト」「近藤」「様」を得る。 In step S1703, the determination knowledge construction processing unit 306 divides the first line of the text 1802, "New Select Kondo-sama", into words and obtains "New Select", "Kondo", and "sama".

ステップＳ１７０４では、判定知識構築処理部３０６は、ステップＳ１７０３で取得した単語「ニューセレクト」「近藤」「様」に対し、単語統計テーブル１１０１から「１．０００」「０．６３０」「０．６３０」を出現確率として取得する。同様に送信者別単語統計テーブル１１０２から「１．０００」「１．０００」「１．０００」を得る。最終的に単語「ニューセレクト」「近藤」「様」に対する出現確率として「１．０００」「１．０００」「１．０００」を得る。 In step S1704, for the words "New Select", "Kondo", and "sama" acquired in step S1703, the determination knowledge construction processing unit 306 acquires "1.000", "0.630", and "0.630" from the word statistics table 1101 as appearance probabilities. Similarly, it obtains "1.000", "1.000", and "1.000" from the per-sender word statistics table 1102. Taking the larger value for each word, the final appearance probabilities for "New Select", "Kondo", and "sama" are "1.000", "1.000", and "1.000".

ステップＳ１７０５では、判定知識構築処理部３０６は、「１．０００」「１．０００」「１．０００」の調和平均「１．００」を行の重みとして得る。 In step S1705, the determination knowledge construction processing unit 306 obtains "1.00", the harmonic mean of "1.000", "1.000", and "1.000", as the line weight.
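The arithmetic of steps S1704 and S1705 for this first line can be written out as follows. The symbols are introduced here only for illustration and do not appear in the source: p_w is the appearance probability adopted for word w, p_w^all the all-document probability, p_w^(s) the probability for sender s, and H the line weight over the n words of the line.

```latex
p_w = \max\Bigl(p_w^{\mathrm{all}},\ \max_{s} p_w^{(s)}\Bigr),
\qquad
p_{\text{Kondo}} = \max(0.630,\ 1.000) = 1.000

H = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{p_i}}
  = \frac{3}{\dfrac{1}{1.000} + \dfrac{1}{1.000} + \dfrac{1}{1.000}}
  = 1.00
```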

ステップＳ１７０６では、判定知識構築処理部３０６は、本文１８０２にまだ行があるので、ステップＳ１７０２からの繰り返し処理を実施する。以下、同様の処理を繰り返した結果として、図１９に示すような各行に対する重みを取得する。 In step S1706, since lines remain in the text 1802, the determination knowledge construction processing unit 306 repeats the processing from step S1702. By repeating the same processing, the weight of each line is obtained as shown in FIG. 19.

ステップＳ１７０７では、判定知識構築処理部３０６は、行の重みが規定値（本実施形態では０．３とする）以上である行として定型部１８０３を抽 In step S1707, the determination knowledge construction processing unit 306 extracts the fixed form part 1803 as the lines whose line weight is equal to or greater than the specified value (0.3 in this embodiment).

（詐称判定処理） (Spoofing determination process)
次に図２０のフローチャートを用いて、検査処理部３０２が実行するメール検査処理におけるステップＳ４０２の詐称判定処理の詳細について説明する。 Next, the details of the spoofing determination process of step S402 in the mail inspection process executed by the inspection processing unit 302 will be described with reference to the flowchart of FIG. 20.

ステップＳ２００１では、検査処理部３０２は、対象となる電子メールから定型部を抽出する。定型部抽出処理は、図１７に示した判定知識構築処理部３０６における定型部抽出処理と同様である。 In step S2001, the inspection processing unit 302 extracts the fixed form part from the target electronic mail. The fixed form part extraction process is the same as the fixed form part extraction process in the determination knowledge construction processing unit 306 shown in FIG.

ステップＳ２００２では、検査処理部３０２は、ステップＳ２００１で定型部がなければ、ステップＳ２００３に処理を移す。定型部があれば、ステップＳ２００４に処理を移す。 In step S2002, if there is no fixed form part in step S2001, the inspection processing unit 302 shifts the process to step S2003. If there is a fixed form part, the process moves to step S2004.

ステップＳ２００３では、検査処理部３０２は、対象となる電子メールを「定型部なし」と判定する。 In step S2003, the inspection processing unit 302 determines that the target electronic mail is "no fixed portion".

ステップＳ２００４では、検査処理部３０２は、ステップＳ２００１で抽出した定型部を単語に分割する。 In step S2004, the inspection processing unit 302 divides the fixed form part extracted in step S2001 into words.

ステップＳ２００５では、検査処理部３０２は、ステップＳ２００４で抽出した単語の集合を、ベクトル化情報テーブル１１０３を参照してベクトル化する。 In step S2005, the inspection processing unit 302 vectorizes the set of words extracted in step S2004 with reference to the vectorization information table 1103.

ステップＳ２００６では、検査処理部３０２は、ステップＳ２００５で取得したベクトルに対して、学習知識テーブル１１０４を参照した分類器を用いて確度が規定値以上の送信者を候補として取得する。 In step S2006, the inspection processing unit 302 applies a classifier that refers to the learning knowledge table 1104 to the vector acquired in step S2005, and acquires as candidates the senders whose confidence is equal to or greater than a specified value.

ステップＳ２００７では、検査処理部３０２は、ステップＳ２００６で推定した送信者候補がなければ、ステップＳ２００８に処理を移す。送信者候補があれば、ステップＳ２０１１に処理を移す。 In step S2007, if no sender candidate was estimated in step S2006, the inspection processing unit 302 proceeds to step S2008. If there is a sender candidate, it proceeds to step S2011.

ステップＳ２００８では、検査処理部３０２は、メールテーブル５０１を参照して、対象となる電子メールの送信者からの送信実績が過去にあるかを判定する。送信実績がなければ、ステップＳ２００９に処理を移す。送信実績があれば、ステップＳ２０１０に処理を移す。 In step S2008, the inspection processing unit 302 refers to the mail table 501 to determine whether the sender of the target e-mail has sent e-mail in the past. If there is no such record, the process proceeds to step S2009. If there is, the process proceeds to step S2010.

ステップＳ２００９では、検査処理部３０２は、対象となる電子メールを「新規送信者」と判定する。 In step S2009, the inspection processing unit 302 determines that the target electronic mail is a "new sender".

ステップＳ２０１０では、検査処理部３０２は、対象となる電子メールを「送信者要確認」と判定する。 In step S2010, the inspection processing unit 302 determines that the target electronic mail is “sender confirmation required”.

ステップＳ２０１１では、検査処理部３０２は、送信者候補の中に、対象となる電子メールの送信者が含まれているかを判定する。送信者候補の中に送信者が含まれていなければ、ステップＳ２０１２に処理を移す。送信者が含まれていれば、ステップＳ２０１３に処理を移す。 In step S2011, the inspection processing unit 302 determines whether the sender of the target e-mail is included in the sender candidates. If the sender is not among the candidates, the process proceeds to step S2012. If the sender is included, the process proceeds to step S2013.

ステップＳ２０１２では、検査処理部３０２は、対象となる電子メールを「詐称」と判定する。また、確度が最上位の送信者候補を正しい送信者として、警告時に提示するように構成する。 In step S2012, the inspection processing unit 302 determines that the target electronic mail is “spoofing”. In addition, the sender candidate with the highest probability is configured to be presented as a correct sender at the time of warning.

ステップＳ２０１３では、検査処理部３０２は、送信者と一致した送信者候補の確度がもっとも高いかを判定する。確度が最上位の値でなければ、ステップＳ２０１０に処理を移す。確度が最上位の値であれば、処理を終 In step S2013, the inspection processing unit 302 determines whether the sender candidate that matches the actual sender has the highest confidence. If its confidence is not the highest value, the process proceeds to step S2010. If it is the highest value, the process ends.
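The branching of steps S2002 to S2013 can be summarized in a sketch like the following. It assumes the fixed form part words and the thresholded candidate list (steps S2001 to S2006) are computed elsewhere and passed in. The function name `judge_spoofing`, the verdict strings, and the "ok" label for the case where the process simply ends are hypothetical.

```python
def judge_spoofing(sender, fixed_part_words, candidates, known_senders):
    """Return (verdict, suggested_sender).

    sender:           actual sender address of the target mail
    fixed_part_words: words of the extracted fixed form part (may be empty)
    candidates:       list of (address, confidence) pairs at or above the
                      specified confidence (result of S2006)
    known_senders:    set of addresses with a past sending record (mail table)
    """
    if not fixed_part_words:                        # S2002
        return "no fixed form part", None           # S2003
    if not candidates:                              # S2007
        if sender not in known_senders:             # S2008
            return "new sender", None               # S2009
        return "sender confirmation required", None  # S2010
    confidence_by_address = dict(candidates)
    best_address, best_confidence = max(candidates, key=lambda c: c[1])
    if sender not in confidence_by_address:         # S2011
        # S2012: spoofing; present the top candidate as the likely real sender.
        return "spoofed", best_address
    if confidence_by_address[sender] < best_confidence:  # S2013
        return "sender confirmation required", None      # S2010
    return "ok", None                               # process ends without a warning
```

For instance, with candidates for `takasugi@hagi.com` and `kogoro@hagi.com` (confidence values made up for the illustration) and an actual sender `a2b3c4@attacker.com`, the function returns the "spoofed" verdict together with the top-ranked candidate.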

（詐称判定処理具体例） (Specific example of the spoofing determination process)
次に詐称判定処理の具体例として、図２１に示す電子メール２１０１に対して図２０に示す処理が実施された場合について説明する。 Next, as a specific example of the spoofing determination process, a case where the process shown in FIG. 20 is performed on the e-mail 2101 shown in FIG. 21 will be described.

ステップＳ２００１では、検査処理部３０２は、対象となる電子メール２１０１から定型部２１０２を抽出する。 In step S2001, the inspection processing unit 302 extracts the fixed form part 2102 from the target electronic mail 2101.

ステップＳ２００２では、検査処理部３０２は、ステップＳ２００１で定型部２１０２を取得したので、ステップＳ２００４に処理を移す。 In step S2002, since the inspection processing unit 302 has acquired the fixed form unit 2102 in step S2001, the processing proceeds to step S2004.

ステップＳ２００４では、検査処理部３０２は、ステップＳ２００１で抽出した定型部２１０２を単語に分割し、「ニューセレクト」、「近藤様」、「萩産業」、「高杉」、「お世話」、「依頼」、「見積書」、「送付」、「添付」、「ファイル」、「確認」、「株」、「東日本営業部」、「新太郎」、「ｔｅｌ」、「９８７−６５４３」、「ｅｍａｉｌ」、「ｔａｋａｓｕｇｉ＠ｈａｇｉ．ｃｏｍ」を得る。 In step S2004, the inspection processing unit 302 divides the fixed form part 2102 extracted in step S2001 into words and obtains "New Select", "Kondo-sama", "Hagi Sangyo", "Takasugi", "osewa", "request", "quotation", "sending", "attachment", "file", "confirmation", "Co.", "East Japan Sales Department", "Shintaro", "tel", "987-6543", "email", and "takasugi@hagi.com".

ステップＳ２００６では、検査処理部３０２は、ステップＳ２００５で取得したベクトルに対して、学習知識テーブル１１０４を参照した分類器を用いて、図２２に示す送信者候補２２０１（ｔａｋａｓｕｇｉ＠ｈａｇｉ．ｃｏｍ）および送信者候補２２０２（ｋｏｇｏｒｏ＠ｈａｇｉ．ｃｏｍ）を取得する。 In step S2006, the inspection processing unit 302 applies the classifier that refers to the learning knowledge table 1104 to the vector acquired in step S2005, and acquires the sender candidate 2201 (takasugi@hagi.com) and the sender candidate 2202 (kogoro@hagi.com) shown in FIG. 22.

ステップＳ２００７では、検査処理部３０２は、ステップＳ２００６で推定した送信者候補があるので、ステップＳ２０１１に処理を移す。 In step S2007, since sender candidates were estimated in step S2006, the inspection processing unit 302 proceeds to step S2011.

ステップＳ２０１１では、検査処理部３０２は、送信者候補の中に、対象となる電子メールの送信者２１０３（ａ２ｂ３ｃ４＠ａｔｔａｃｋｅｒ．ｃｏｍ）が含まれていないので、ステップＳ２０１２に処理を移す。 In step S2011, since the sender 2103 (a2b3c4@attacker.com) of the target e-mail is not included in the sender candidates, the inspection processing unit 302 proceeds to step S2012.

ステップＳ２０１２では、検査処理部３０２は、対象となる電子メールを「詐称」と判定する。また、確度が最上位の送信者候補２２０１（ｔａｋａｓｕｇｉ＠ｈａｇｉ．ｃｏｍ）を正しい送信者として取得し処理を終了する。 In step S2012, the inspection processing unit 302 determines that the target e-mail is "spoofed". It also acquires the sender candidate 2201 (takasugi@hagi.com), which has the highest confidence, as the correct sender, and ends the process.

結果として、定型部２１０２に対する送信者２１０３（ａ２ｂ３ｃ４＠ａｔｔａｃｋｅｒ．ｃｏｍ）が整合せず詐称の可能性があると判定し、正しい送信者の候補として送信者候補２２０１（ｔａｋａｓｕｇｉ＠ｈａｇｉ．ｃｏｍ）を提示することが可能となる。ユーザは送信者２１０３（ａ２ｂ３ｃ４＠ａｔｔａｃｋｅｒ．ｃｏｍ）が送信者候補２２０１（ｔａｋａｓｕｇｉ＠ｈａｇｉ．ｃｏｍ）を詐称してメールを送ってきたと容易に判断することが可能となる。 As a result, it is determined that the sender 2103 (a2b3c4@attacker.com) is inconsistent with the fixed form part 2102 and that the e-mail may be spoofed, and the sender candidate 2201 (takasugi@hagi.com) can be presented as the likely correct sender. The user can then easily judge that the sender 2103 (a2b3c4@attacker.com) sent the mail while impersonating the sender candidate 2201 (takasugi@hagi.com).

上記のように構成することで、メールの定型部が似ているメールを過去に送信してきた送信者と対象となるメールの送信者が異なる場合を判定することが可能となり、詐称の可能性が高い状態を検出することが可能となる。さらに詐称と判定した場合に正しいと推測される送信者を提示することができ、容易な確認が可能となる。 With the above configuration, it is possible to detect the case where the sender of the target e-mail differs from the senders who have previously sent e-mails with a similar fixed form part, and thus to detect a state in which spoofing is likely. Furthermore, when an e-mail is judged to be spoofed, the sender presumed to be correct can be presented, which makes confirmation easy.

本発明の実施形態においては、メールクライアントとして実施する構成とした場合について説明したが、いわゆるＷｅｂメールを提供するシステムやメールフィルタリングの機能として構成してもよい。 In the embodiment of the present invention, the case of implementation as a mail client has been described, but the invention may also be configured as a system that provides so-called Web mail or as a mail filtering function.

本発明は、例えば、システム、装置、方法、プログラムもしくは記録媒体等としての実施態様をとることが可能である。具体的には、複数の機器から構成されるシステムに適用しても良いし、また、一つの機器からなる装置に適用しても良い。 The present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a recording medium. Specifically, the present invention may be applied to a system constituted by a plurality of devices, or may be applied to an apparatus comprising a single device.

また、本発明におけるプログラムは、図４、図８、図１０、図１７、図２０に示すフローチャートの処理方法をコンピュータが実行可能なプログラムであり、本発明の記憶媒体は図４、図８、図１０、図１７、図２０の処理方法をコンピュータが実行可能なプログラムが記憶されている。なお、本発明におけるプログラムは図４、図８、図１０、図１７、図２０の各装置の処理方法ごとのプログラムであってもよい。 The program of the present invention is a program that enables a computer to execute the processing methods of the flowcharts shown in FIGS. 4, 8, 10, 17, and 20, and the storage medium of the present invention stores a program that enables a computer to execute the processing methods of FIGS. 4, 8, 10, 17, and 20. The program of the present invention may also be a separate program for each processing method of each apparatus in FIGS. 4, 8, 10, 17, and 20.

以上のように、前述した実施形態の機能を実現するプログラムを記録した記録媒体を、システムあるいは装置に供給し、そのシステムあるいは装置のコンピュータ（またはＣＰＵやＭＰＵ）が記録媒体に格納されたプログラムを読み出し、実行することによっても本発明の目的が達成されることは言うまでもない。 As described above, it goes without saying that the object of the present invention is also achieved by supplying a recording medium on which a program realizing the functions of the above-described embodiment is recorded to a system or apparatus, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program stored in the recording medium.

この場合、記録媒体から読み出されたプログラム自体が本発明の新規な機能を実現することになり、そのプログラムを記録した記録媒体は本発明を構成することになる。 In this case, the program itself read out from the recording medium realizes the novel function of the present invention, and the recording medium recording the program constitutes the present invention.

プログラムを供給するための記録媒体としては、例えば、フレキシブルディスク、ハードディスク、光ディスク、光磁気ディスク、ＣＤ−ＲＯＭ、ＣＤ−Ｒ、ＤＶＤ−ＲＯＭ、磁気テープ、不揮発性のメモリカード、ＲＯＭ、ＥＥＰＲＯＭ、シリコンディスク等を用いることが出来る。 As a recording medium for supplying the program, for example, a flexible disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, DVD-ROM, magnetic tape, non-volatile memory card, ROM, EEPROM, silicon A disk etc. can be used.

また、コンピュータが読み出したプログラムを実行することにより、前述した実施形態の機能が実現されるだけでなく、そのプログラムの指示に基づき、コンピュータ上で稼働しているＯＳ（オペレーティングシステム）等が実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。 Further, by executing the program read by the computer, not only the functions of the above-described embodiment are realized, but also an operating system (OS) or the like running on the computer is actually executed based on the instructions of the program. It goes without saying that the processing is partially or entirely performed, and the processing realizes the functions of the above-described embodiments.

さらに、記録媒体から読み出されたプログラムが、コンピュータに挿入された機能拡張ボードやコンピュータに接続された機能拡張ユニットに備わるメモリに書き込まれた後、そのプログラムコードの指示に基づき、その機能拡張ボードや機能拡張ユニットに備わるＣＰＵ等が実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。 Furthermore, it goes without saying that the invention also covers the case where the program read from the recording medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiment are realized by that processing.

また、本発明は、複数の機器から構成されるシステムに適用しても、ひとつの機器から成る装置に適用しても良い。また、本発明は、システムあるいは装置にプログラムを供給することによって達成される場合にも適応できることは言うまでもない。この場合、本発明を達成するためのプログラムを格納した記録媒体を該システムあるいは装置に読み出すことによって、そのシステムあるいは装置が、本発明の効果を享受することが可能となる。 Further, the present invention may be applied to a system constituted by a plurality of devices or to an apparatus comprising a single device. It goes without saying that the present invention can also be applied to the case where it is achieved by supplying a program to a system or apparatus. In this case, by reading a recording medium storing a program for achieving the present invention into the system or apparatus, the system or apparatus can receive the effects of the present invention.

さらに、本発明を達成するためのプログラムをネットワーク上のサーバ、データベース等から通信プログラムによりダウンロードして読み出すことによって、そのシステムあるいは装置が、本発明の効果を享受することが可能となる。なお、上述した各実施形態およびその変形例を組み合わせた構成も全て本発明に含まれるものである。 Further, by downloading and reading out a program for achieving the present invention from a server on a network, a database or the like by a communication program, the system or apparatus can receive the effects of the present invention. In addition, the structure which combined each embodiment mentioned above and its modification is also contained in this invention altogether.

１００ 詐称メール検査装置（情報処理装置） 100 Spoofed e-mail inspection apparatus (information processing apparatus)
１２０ メールサーバ 120 Mail server
１３０ ＬＡＮ 130 LAN
１４０ 外部ネットワーク 140 External network

Claims

A program for causing a computer to function as:
fixed form part specifying means for specifying a fixed form part of e-mail from the text of e-mails received in the past, using the appearance frequency of character strings for each transmission source;
storage means for storing a feature of the fixed form part specified by the fixed form part specifying means in association with the transmission source of the e-mail;
estimation means for estimating the transmission source of a newly received e-mail based on the information stored in the storage means; and
notification means for issuing a warning based on the transmission source estimated by the estimation means and the actual transmission source of the newly received e-mail.

The program according to claim 1, wherein the character string is a word.

The program according to claim 1 or 2, wherein the notification means is caused to function as means for issuing a warning when the transmission source of the newly received e-mail estimated by the estimation means does not match the actual transmission source of the newly received e-mail.

The program according to any one of claims 1 to 3, wherein the notification means is caused to function as means for notifying the user of the transmission source estimated by the estimation means when the transmission source of the newly received e-mail estimated by the estimation means does not match the actual transmission source of the newly received e-mail.

The program according to any one of claims 1 to 4, wherein the fixed form part specifying means is caused to function as means for specifying the fixed form part from the text of those e-mails, among the e-mails received in the past, that the user has determined not to be spoofed e-mails.

Fixed form part specifying means for specifying a fixed form part of e-mail from the text of e-mails received in the past, using the appearance frequency of character strings for each transmission source;
Storage means for storing a feature of the fixed form part specified by the fixed form part specifying means in association with the transmission source of the e-mail;
Estimation means for estimating the transmission source of a newly received e-mail based on the information stored in the storage means;
Notification means for issuing a warning based on the transmission source estimated by the estimation means and the actual transmission source of the newly received e-mail;
An information processing apparatus comprising:

A fixed form part specifying step of specifying a fixed form part of e-mail from the text of e-mails received in the past, using the appearance frequency of character strings for each transmission source;
A storage step in which storage means of an information processing apparatus stores a feature of the fixed form part specified in the fixed form part specifying step in association with the transmission source of the e-mail;
An estimation step in which estimation means of the information processing apparatus estimates the transmission source of a newly received e-mail based on the information stored in the storage step;
A notification step in which notification means of the information processing apparatus issues a warning based on the transmission source estimated in the estimation step and the actual transmission source of the newly received e-mail;
An information processing method comprising: