JP5050724B2 - Document monitoring program, document monitoring apparatus, and document monitoring method

Info

Publication number: JP5050724B2
Application number: JP2007212542A
Authority: JP
Inventors: 剛寿安藤; 陽佐藤; 聡子志賀; 青史岡本
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2007-08-17
Filing date: 2007-08-17
Publication date: 2012-10-17
Anticipated expiration: 2027-08-17
Also published as: JP2009048340A

Description

The present invention relates to a document monitoring program, a document monitoring apparatus, and a document monitoring method for monitoring a document to be transmitted.

In recent years, user-participation media such as blogs, SNSs (Social Networking Services), and bulletin boards have risen to prominence on the Internet. These are called CGM (Consumer Generated Media) and are attracting attention as places where ordinary users can freely express their opinions and impressions.

As related art, there is a posted-information deletion request agency system that monitors published information and, when an article matching a registered keyword exists, requests deletion of that article (see, for example, Patent Document 1). There is also an information sharing system that selects and introduces persons with abundant knowledge about the information a user wants to know (see, for example, Patent Document 2).
Patent Document 1: JP 2002-109085 A
Patent Document 2: JP 2004-220177 A

However, because anyone can easily publish information, cases in which users, mainly children, post their personal information have become a problem. In response, services have been launched that patrol and monitor blogs, SNSs, bulletin boards, and the like, and that find inappropriate information and report or delete it. Because most of these services patrol and monitor manually, however, there are limits to the number of items they can cover and to the monitoring interval.

The technique of Patent Document 1 performs everything from monitoring to the deletion request automatically, so it can monitor a wider range at shorter intervals than manual operation. However, it can monitor only content that has already been published on the Internet, and it can find only articles that exactly match the registered keywords.

The present invention has been made to solve the problems described above, and an object thereof is to provide a document monitoring program, a document monitoring apparatus, and a document monitoring method that prevent the disclosure of information relating to an individual.

To solve the problems described above, one aspect of the present invention is a document monitoring program that causes a computer to monitor a document created by a user. The program causes the computer to: acquire a first character string, which is a character string representing the user's address; extract a noun from the document created by the user; acquire a second character string, which is a character string representing the address indicated by the noun, by searching for the noun in a database; calculate a risk level from the similarity between the first character string and the second character string; and perform display relating to the second character string based on the risk level.

Another aspect of the present invention is a document monitoring apparatus that monitors a document created by a user. The apparatus includes: a first acquisition unit that acquires a first character string, which is a character string representing the user's address; an extraction unit that extracts a noun from the document created by the user; a second acquisition unit that acquires a second character string, which is a character string representing the address indicated by the noun, by searching for the noun in a database; a calculation unit that calculates a risk level from the similarity between the first character string and the second character string; and a display unit that performs display relating to the second character string based on the risk level.

Another aspect of the present invention is a document monitoring method for monitoring a document created by a user. The method includes: acquiring a first character string, which is a character string representing the user's address; extracting a noun from the document created by the user; acquiring a second character string, which is a character string representing the address indicated by the noun, by searching for the noun in a database; calculating a risk level from the similarity between the first character string and the second character string; and performing display relating to the second character string based on the risk level.

According to the present invention, the disclosure of information relating to an individual can be prevented.

Embodiments of the present invention will be described below with reference to the drawings.

Embodiment 1
The following embodiment describes an example in which the document monitoring apparatus of the present invention is applied to a posting system for posting to CGM such as blogs, SNSs, and bulletin boards.

First, the configuration of the document monitoring apparatus according to the present embodiment will be described.

FIG. 1 is a block diagram showing an example of the configuration of the document monitoring apparatus according to the present embodiment. The document monitoring apparatus includes a user information registration unit 11, a user information DB (database) 12, a user authentication unit 13, a user information acquisition unit 14, a document receiving unit 21, a document DB 22, a keyword extraction unit 24, an extracted keyword DB 26, a latitude/longitude acquisition unit 31, a latitude/longitude DB 32, a distance calculation unit 33, a dangerous keyword DB 34, a determination unit 35, a warning unit 42, a correction unit 43, and a document transmission unit 44. The document monitoring apparatus is connected via a network to a user terminal 1, with which the user creates and posts documents, and to a server 2, which publishes posted articles. The document monitoring apparatus monitors a document posted from the user terminal 1 to the server 2 before it is posted.

Information about a user who posts is registered in advance as user information. The user information registration unit 11 receives user information entered by the user, such as the user's address and name, and registers it in the user information DB 12. The latitude/longitude DB 32 stores place names, which are nouns indicating places such as parks, schools, and shops, in association with the latitude and longitude of each place.

Next, the operation of the document monitoring apparatus according to the present embodiment will be described.

FIG. 2 is a flowchart showing an example of the operation of the document monitoring apparatus according to the present embodiment. First, on receiving an authentication request from the user terminal 1, the user authentication unit 13 authenticates the user who is posting and acquires a user ID (S11). Hereinafter, this user is referred to as the target user. Next, the user information acquisition unit 14 acquires the user information of the target user from the user information DB 12 based on the user ID, and acquires from the user information the user address, which is the address of the target user (S12).

Next, the user information acquisition unit 14 searches the latitude/longitude DB 32 for the user address and acquires the latitude and longitude of the user address (S13). Next, the document receiving unit 21 receives the document transmitted from the user terminal 1 for posting and registers it in the document DB 22 (S16). Hereinafter, this document is referred to as the target document. Next, the keyword extraction unit 24 performs morphological analysis on the target document to extract nouns, treats them as keywords, and registers them in the extracted keyword DB 26 (S17).

FIG. 3 is a table showing an example of the contents of the latitude/longitude DB according to the present embodiment. The latitude/longitude DB 32 is created in advance from map information and stores, for each place, the place name, latitude, and longitude. Place names also include address character strings such as prefecture names, municipality names, and street addresses. The latitude/longitude DB 32 contains, for example, places throughout Japan.

Next, keyword determination processing (steps S22 to S33) is performed.

First, the latitude/longitude acquisition unit 31 selects one of the keywords stored in the extracted keyword DB 26 as the selected keyword (S22). Next, the latitude/longitude acquisition unit 31 searches the latitude/longitude DB 32 for the selected keyword and determines whether there is a hit (S23). The latitude/longitude acquisition unit 31 may restrict the search of the latitude/longitude DB 32 to the vicinity of the user address.

If there is no hit (S23, N), the selected keyword is judged not to be a place name, and the flow returns to step S22 to process the next selected keyword.

If there is a hit (S23, Y), the latitude/longitude acquisition unit 31 judges that the selected keyword is a place name and acquires the latitude and longitude corresponding to that place name (S25).
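The lookup in steps S22 to S25 can be sketched as a dictionary search. This is a minimal illustration only: the `LAT_LON_DB` contents, the place names, and the function names are invented here and are not taken from the patent.

```python
from typing import Dict, Iterable, Optional, Tuple

LatLon = Tuple[float, float]

# Hypothetical contents of the latitude/longitude DB: place name -> (lat, lon).
# The names and coordinates are invented for illustration.
LAT_LON_DB: Dict[str, LatLon] = {
    "Midori Park": (35.5760, 139.6590),
    "Sakura Elementary School": (35.5801, 139.6612),
}


def lookup_place(keyword: str, db: Dict[str, LatLon]) -> Optional[LatLon]:
    """S23/S25: return coordinates on a hit, or None when the keyword is not a place name."""
    return db.get(keyword)


def place_keywords(keywords: Iterable[str], db: Dict[str, LatLon]) -> Dict[str, LatLon]:
    """Run the lookup for every extracted keyword and keep only the hits."""
    hits: Dict[str, LatLon] = {}
    for kw in keywords:  # S22: select one keyword at a time
        coords = lookup_place(kw, db)
        if coords is not None:
            hits[kw] = coords
    return hits
```

Keywords that miss are simply skipped, which mirrors the return to step S22 on a miss.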

Next, the distance calculation unit 33 calculates the distance between the user address and the selected keyword from the latitude and longitude of the user address and the latitude and longitude of the selected keyword (S31). Next, the distance calculation unit 33 determines whether the distance between the user address and the selected keyword is equal to or less than a distance threshold (S32). The distance threshold is, for example, 5 km.
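The text does not state which formula the distance calculation unit 33 uses. As one possible sketch, the great-circle (haversine) distance can be computed from the two latitude/longitude pairs and compared against the threshold. The haversine formula, the Earth radius constant, and the function names below are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0      # mean Earth radius (assumed constant)
DISTANCE_THRESHOLD_KM = 5.0   # example threshold given in the text


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees (S31)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_within_threshold(user, place, threshold_km: float = DISTANCE_THRESHOLD_KM) -> bool:
    """S32: True when the place is at or inside the distance threshold."""
    return haversine_km(user[0], user[1], place[0], place[1]) <= threshold_km
```

For distances of a few kilometres a simpler planar approximation would likely give nearly the same result; the haversine form is used here only because it is valid at any scale.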

If the distance between the user address and the selected keyword exceeds the distance threshold (S32, N), the flow proceeds to step S41. If the distance is equal to or less than the distance threshold (S32, Y), the distance calculation unit 33 treats the selected keyword as a dangerous keyword, registers the dangerous keyword, the user ID of the target user, and related information in the dangerous keyword DB 34 (S33), and the flow proceeds to step S41. This processing detects keywords that represent places close to the user address.

FIG. 4 is a table showing an example of the contents of the dangerous keyword DB according to the present embodiment. The dangerous keyword DB 34 stores one entry per user ID and per dangerous keyword. Each entry stores the place name that is the dangerous keyword, the user ID, the latitude and longitude corresponding to the place name, and the distance from the posting user's address to the place of the dangerous keyword.

Next, the latitude/longitude acquisition unit 31 determines whether the keyword determination processing has been completed for all keywords in the target document (S41). If it has not been completed (S41, N), the flow returns to step S22. If it has been completed for all keywords in the target document (S41, Y), the determination unit 35 counts the number of dangerous keywords of the user in the dangerous keyword DB 34 (S42) and determines whether that number is equal to or greater than a dangerous keyword count threshold (S43). The dangerous keyword count threshold is, for example, five.

If the target user has posted documents in the past, the dangerous keyword DB 34 also holds the dangerous keywords from those earlier posts, so the determination in step S43 covers all dangerous keywords accumulated up to the present.
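A minimal sketch of the count check in steps S42 and S43, assuming the dangerous keyword DB is an in-memory list of entries that persists across posts. The `DangerEntry` type and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DangerEntry:
    """One row of the dangerous keyword DB (fields reduced for illustration)."""
    user_id: str
    place_name: str
    distance_km: float


def count_dangerous_keywords(db: List[DangerEntry], user_id: str) -> int:
    """S42: count every entry of this user, including entries from earlier posts."""
    return sum(1 for entry in db if entry.user_id == user_id)


def needs_warning(db: List[DangerEntry], user_id: str, count_threshold: int = 5) -> bool:
    """S43: True when the accumulated count reaches the threshold (example value: 5)."""
    return count_dangerous_keywords(db, user_id) >= count_threshold
```

Because the count is per user rather than per document, a post containing a single nearby place name can still trigger the warning if earlier posts already contributed four entries.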

If the number of dangerous keywords is less than the dangerous keyword count threshold (S43, N), the document transmission unit 44 transmits the target document to the server 2 (S59), and the flow ends. If the number of dangerous keywords is equal to or greater than the threshold (S43, Y), the warning unit 42 performs dangerous keyword display processing (S56). The correction unit 43 performs correction processing that corrects the dangerous keywords in the target document and saves the corrected target document in the document DB 22 (S57). In the correction processing, the correction unit 43 replaces each dangerous keyword in the target document with other characters (masking characters). Next, the correction unit 43 deletes from the dangerous keyword DB 34 the entries of the dangerous keywords that the correction processing removed from the target document (S58).
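The masking step of the correction processing (S57) might look like the following sketch. The choice of `*` as the masking character and the decision to keep the masked run the same length as the keyword are assumptions; the text only says that the keyword is replaced with other characters.

```python
from typing import Iterable


def mask_keywords(text: str, dangerous_keywords: Iterable[str], mask_char: str = "*") -> str:
    """Replace every occurrence of each dangerous keyword with masking characters."""
    # Longer keywords first, so a keyword that contains a shorter one is masked whole.
    for kw in sorted(dangerous_keywords, key=len, reverse=True):
        text = text.replace(kw, mask_char * len(kw))
    return text
```

For example, `mask_keywords("I walked to Midori Park today", ["Midori Park"])` leaves the sentence intact except for the place name, which becomes eleven asterisks.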

Next, the document transmission unit 44 transmits the target document stored in the document DB 22 to the user terminal 1 and the server 2 (S59), and the flow ends. The server 2 publishes the target document received from the document monitoring apparatus.

Next, the dangerous keyword display processing will be described.

The warning unit 42 displays information about the dangerous keywords on the user terminal 1. FIG. 5 is a screen showing an example of the display produced by the dangerous keyword display processing according to the present embodiment. In the dangerous keyword display processing, the warning unit 42 displays a circle centered on the user address with the distance threshold as its radius. The warning unit 42 also uses the latitude and longitude of the user address and of each dangerous keyword to determine the position of the dangerous keyword relative to the user address, and displays the dangerous keyword at that relative position. The dangerous keyword display processing may instead display only the dangerous keywords, or may display a warning message. In the correction processing, a target document that the user retransmits in response to the display by the warning unit 42 may be used as the correction result.
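To place a dangerous keyword relative to the user address, the display needs an east/north offset from the user's position. The text does not say how this is computed; the sketch below uses an equirectangular approximation, which should be adequate at the scale of a few kilometres. The function name and the Earth radius constant are assumptions.

```python
import math
from typing import Tuple

KM_PER_DEGREE = 6371.0 * math.pi / 180.0  # length of one degree of latitude (assumed mean radius)


def relative_offset_km(user_lat: float, user_lon: float,
                       lat: float, lon: float) -> Tuple[float, float]:
    """Return (east_km, north_km) of a place relative to the user address."""
    north = (lat - user_lat) * KM_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east = (lon - user_lon) * KM_PER_DEGREE * math.cos(math.radians(user_lat))
    return east, north
```

The resulting offsets can be scaled so that the distance threshold maps to the radius of the displayed circle.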

The document monitoring apparatus may be configured to include only one of the warning unit 42 and the correction unit 43.

According to the present embodiment, keywords that are likely to allow the user address to be identified are detected, based on latitude and longitude, in the document to be posted, and a warning or correction is made, so that disclosure of information relating to the user address can be prevented. In addition, by accumulating dangerous keywords for each user, the possibility that the user address can be identified from multiple documents can be determined.

Embodiment 2
First, the configuration of the document monitoring apparatus according to the present embodiment will be described.

FIG. 6 is a block diagram showing an example of the configuration of the document monitoring apparatus according to the present embodiment. In this figure, the same reference numerals as in FIG. 1 denote elements identical or equivalent to those shown in FIG. 1, and their description is omitted here. Compared with FIG. 1, this configuration includes an address character string acquisition unit 51, an address character string DB 52 (address database), a risk level calculation unit 53, a risk level DB 54, and a determination unit 55 in place of the latitude/longitude acquisition unit 31, the latitude/longitude DB 32, the distance calculation unit 33, the dangerous keyword DB 34, and the determination unit 35.

FIG. 7 is a flowchart showing an example of the operation of the document monitoring apparatus according to the present embodiment. In this figure, the same reference numerals as in FIG. 2 denote steps identical or equivalent to those shown in FIG. 2, and their description is omitted here. First, steps S11 and S12 are executed as in Embodiment 1.

Next, the user information acquisition unit 14 acquires from the user information a user address character string (first character string), which is the character string of the target user's address (S14), performs morphological analysis on the user address character string, and divides it into blocks (S15).

FIG. 8 is a diagram showing an example of the division of the user address character string according to the present embodiment. Each block is a unit such as a prefecture name, a municipality name, or a ward name.

Next, the document receiving unit 21 receives the target document transmitted from the user terminal 1 for posting (S16). Next, the keyword extraction unit 24 performs morphological analysis on the target document to extract nouns, treats them as keywords, and registers them in the extracted keyword DB 26 (S17). Next, the address character string acquisition unit 51 reads the keyword with the highest risk level from the risk level DB 54 and treats it as the maximum risk keyword, treats its risk level as the maximum risk level, and treats the address character string corresponding to the maximum risk keyword as the maximum risk address character string (S18). Here, the risk level of a keyword is the similarity between the address character string corresponding to that keyword and the user address character string; the specific calculation method is described later.

FIG. 9 is a table showing an example of the contents of the risk level DB according to the present embodiment. The risk level DB 54 stores a place name, an address character string, and a risk level for each keyword and each user.

Next, keyword determination processing (steps S62 to S68) is performed.

First, the address character string acquisition unit 51 selects one of the keywords stored in the extracted keyword DB 26 as the selected keyword (S62). Next, the address character string acquisition unit 51 searches the address character string DB 52 for the selected keyword and determines whether there is a hit (S63).

FIG. 10 is a table showing an example of the contents of the address character string DB according to the present embodiment. The address character string DB 52 is created in advance from map information and stores, for each place, the place name and the address character string. The address character string DB 52 contains, for example, places throughout Japan.

If the selected keyword hits (S63, Y), the risk level calculation unit 53 calculates the risk level of the selected keyword by the first risk level calculation process (S64). If the selected keyword does not hit (S63, N), the risk level calculation unit 53 calculates the risk level of the selected keyword by the second risk level calculation process (S65). Next, the risk level calculation unit 53 registers the selected keyword and its risk level in the risk level DB 54 (S66). Next, the risk level calculation unit 53 determines whether the risk level of the selected keyword is greater than the maximum risk level (S67).

If the risk level of the selected keyword is equal to or less than the maximum risk level (S67, N), the flow proceeds to step S71. If the risk level of the selected keyword is greater than the maximum risk level (S67, Y), the risk level calculation unit 53 updates the maximum risk keyword to the selected keyword (S68), and the flow proceeds to step S71.

Next, the address character string acquisition unit 51 determines whether the keyword determination processing has been completed for all keywords in the target document (S71). If it has not been completed (S71, N), the flow returns to step S62. If it has been completed for all keywords in the target document (S71, Y), the determination unit 55 treats each keyword in the risk level DB 54 whose risk level is equal to or greater than a risk level threshold as a dangerous keyword, and determines whether any dangerous keyword exists in the target document (S73). The risk level threshold is, for example, 80%.
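The determination in step S73 amounts to a threshold filter over the stored risk levels. A minimal sketch, assuming the risk levels for the target user are available as a mapping from keyword to percentage (the mapping and the function name are hypothetical):

```python
from typing import Dict, List

RISK_THRESHOLD_PERCENT = 80.0  # example threshold given in the text


def dangerous_keywords(risk_by_keyword: Dict[str, float],
                       threshold: float = RISK_THRESHOLD_PERCENT) -> List[str]:
    """S73: keywords whose risk level is at or above the threshold."""
    return [kw for kw, risk in risk_by_keyword.items() if risk >= threshold]
```

An empty result corresponds to the (S73, N) branch, in which the document is sent to the server unchanged.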

If no dangerous keyword exists (S73, N), the document transmission unit 44 transmits the target document stored in the document DB 22 to the server 2 (S79), and the flow ends. If a dangerous keyword exists (S73, Y), the warning unit 42 performs the same dangerous keyword display processing as in Embodiment 1 (S76). The correction unit 43 performs the same correction processing as in Embodiment 1 and saves the target document in the document DB 22 (S77). Next, the correction unit 43 deletes from the risk level DB 54 the entries of the dangerous keywords that the correction processing removed from the target document (S78).

Next, the document transmission unit 44 transmits the target document stored in the document DB 22 to the user terminal 1 and the server 2 (S79), and the flow ends. The server 2 publishes the target document received from the document monitoring apparatus.

Next, the first risk level calculation process will be described.

FIG. 11 is a flowchart showing an example of the operation of the first risk level calculation process according to the present embodiment. First, the address character string acquisition unit 51 acquires from the address character string DB 52 the address character string corresponding to the place name that matched the selected keyword, and uses it as the selected address character string (second character string) (S81). The address character string acquisition unit 51 may restrict the search for the selected keyword to the vicinity of the user address. Next, the risk level calculation unit 53 sets N to the larger of the length of the selected address character string and the length of the maximum risk address character string (S96). Next, it compares the selected address character string with the maximum risk address character string and sets M to the length of the consecutively matching character string (S97). Next, the risk level calculation unit 53 takes (M / N × 100) as the risk level [%] of the selected keyword (S98), and the flow ends. A higher risk level indicates that the selected keyword is closer to the user address or makes the user address easier to identify.
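Steps S96 to S98 can be sketched as follows. The text does not define "consecutively matching" precisely; because address strings run from the largest unit to the smallest, this sketch assumes M is the length of the common prefix. That interpretation, the function names, and the example addresses are assumptions made for illustration.

```python
def common_prefix_length(a: str, b: str) -> int:
    """Number of leading characters that the two strings share."""
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length


def risk_percent(selected_address: str, reference_address: str) -> float:
    """Risk level in percent: M / N x 100 (S96 to S98)."""
    n = max(len(selected_address), len(reference_address))  # S96
    if n == 0:
        return 0.0  # guard for empty input; not covered by the text
    m = common_prefix_length(selected_address, reference_address)  # S97
    return 100.0 * m / n  # S98
```

For instance, `risk_percent("神奈川県川崎市中原区", "神奈川県川崎市幸区")` shares the seven-character prefix 神奈川県川崎市 out of a maximum length of ten, giving 70.0, which falls below the 80% example threshold.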

With the first risk level calculation process, the more similar the user address character string is to the address character string corresponding to the place name of the selected keyword, the higher the risk level becomes.

Next, the second risk level calculation process will be described.

FIG. 12 is a flowchart showing an example of the operation of the second risk level calculation process according to the present embodiment. First, the risk level calculation unit 53 takes, from the blocks of the user's address character string, the one block that follows the maximum risk address character string, appends it to the maximum risk address character string, and uses the result as the search address character string (third character string) (S82). Next, the risk level calculation unit 53 searches a database (index) of content on the Internet for content containing both the selected keyword and the search address character string, and determines whether such content exists (S83).

If no content containing both the selected keyword and the search address character string exists (S83, N), the flow proceeds to step S95. If such content exists (S83, Y), the one block that follows the search address character string among the blocks of the user's address character string is appended to the search address character string to form a new search address character string (S84), and the flow returns to step S83.

Next, the address character string obtained by removing the last block from the search address character string is used as the selected address character string (second character string) (S95). Steps S96 to S98 are then executed as in the first risk level calculation process, and the flow ends.
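The loop in steps S82 to S95 can be sketched as below. The Internet content search is represented by a caller-supplied function, since the text does not name a search service. The sketch also assumes that the maximum risk address character string consists of the first `known_blocks` blocks of the user address; that assumption, the space separator, the stop condition when the blocks run out, and all names and sample data are illustrative only.

```python
from typing import Callable, Sequence


def infer_selected_address(
    keyword: str,
    user_blocks: Sequence[str],
    known_blocks: int,
    content_exists: Callable[[str, str], bool],
    sep: str = " ",
) -> str:
    """Extend the address one block at a time while the search still hits."""
    k = known_blocks + 1  # S82: add the next block
    # S83/S84: keep extending on a hit; stop on a miss or when blocks run out.
    while k <= len(user_blocks) and content_exists(keyword, sep.join(user_blocks[:k])):
        k += 1
    # S95: drop the last block, which caused the miss.
    return sep.join(user_blocks[:k - 1])


# Hypothetical stand-in for an index of Internet content.
PAGES = ["Midori Bakery, Kanagawa Kawasaki Nakahara Kosugi 1-2"]


def fake_search(keyword: str, address: str) -> bool:
    """True when some page contains both the keyword and the address string."""
    return any(keyword in page and address in page for page in PAGES)
```

With `user_blocks = ["Kanagawa", "Kawasaki", "Nakahara", "Kamikodanaka"]` and `known_blocks = 1`, the first two searches hit and the third misses, so the function returns `"Kanagawa Kawasaki Nakahara"`. That string would then be scored by steps S96 to S98.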

図１３は、本実施の形態に係る第２危険度算出処理に関する情報の一例を示す図である。ここでは、最大危険度住所文字列、選択キーワード、選択キーワードの住所の例を示す。図１４は、本実施の形態に係る第２危険度算出処理の一例を示す図である。 FIG. 13 is a diagram showing an example of information related to the second risk level calculation process according to the present embodiment. It shows examples of the maximum risk level address character string, the selected keyword, and the address of the selected keyword. FIG. 14 is a diagram showing an example of the second risk level calculation process according to the present embodiment.

１回目の検索では、選択キーワードと最大危険度住所文字列に１ブロックを加えた検索住所文字列とによる検索が行われ、検索住所文字列は選択キーワードの住所に含まれるため、選択キーワードに関するサイトにヒットする。同様に、２回目の検索では、選択キーワードと最大危険度住所文字列に２ブロックを加えた検索住所文字列とによる検索が行われ、検索住所文字列は選択キーワードの住所に含まれるため、検索は選択キーワードに関するサイトにヒットする。３回目の検索では、選択キーワードと最大危険度住所文字列に３ブロックを加えた検索住所文字列とによる検索が行われ、検索住所文字列は選択キーワードの住所に含まれないため、検索はミスする。ここで、検索住所文字列から最後の１ブロックを削除したものを選択住所文字列とする。 In the first search, the selected keyword is searched together with the search address character string formed by adding one block to the maximum risk level address character string. Because this search address character string is contained in the address of the selected keyword, the search hits a site about the selected keyword. Likewise, in the second search, the search address character string is formed by adding two blocks to the maximum risk level address character string. It is still contained in the address of the selected keyword, so the search again hits a site about the selected keyword. In the third search, the search address character string is formed by adding three blocks to the maximum risk level address character string. It is no longer contained in the address of the selected keyword, so the search misses. At this point, the search address character string with its last block deleted is taken as the selected address character string.

この第２危険度算出処理によれば、選択キーワードの場所名に対応する住所文字列が住所文字列ＤＢ５２から得られない場合でもインターネット上のコンテンツに基づいて選択キーワードに対応する住所文字列を生成することができ、危険度を算出することができる。 According to this second risk level calculation process, even when an address character string corresponding to the place name of the selected keyword cannot be obtained from the address character string DB 52, an address character string corresponding to the selected keyword can be generated from content on the Internet, and the risk level can be calculated.

なお、住所文字列ＤＢ５２を用いずに、全ての危険度を第２危険度算出処理により算出しても良い。 Note that all risk levels may be calculated by the second risk level calculation process without using the address character string DB 52.

本実施の形態によれば、住所文字列に基づいて、ユーザが投稿しようとする文書から、ユーザ住所が特定される可能性の高いキーワードを検出し、警告または修正を行うことにより、ユーザ住所に関わる情報の公開を防ぐことができる。 According to the present embodiment, keywords that are highly likely to allow the user's address to be identified are detected, based on address character strings, in a document the user is about to post, and a warning or correction is performed. This prevents information related to the user's address from being disclosed.
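
As a rough illustration of this detect-then-warn-or-correct flow, the sketch below scores each noun by how many leading blocks of the user's address match the address registered for that noun, then masks nouns at or above a threshold. It is a simplified sketch under stated assumptions: the names `leading_match_blocks` and `check_document`, the dictionary standing in for the address character string DB, the use of the matched block count as the risk level, the threshold, and masking with asterisks are all hypothetical. Noun extraction (morphological analysis in the source) is assumed to have been done already.

```python
from typing import Dict, List, Tuple


def leading_match_blocks(user_blocks: List[str], address: str) -> int:
    """Count how many leading blocks of the user's address `address` starts with."""
    matched = 0
    prefix = ""
    for block in user_blocks:
        prefix += block
        if not address.startswith(prefix):
            break
        matched += 1
    return matched


def check_document(
    text: str,
    nouns: List[str],
    user_blocks: List[str],
    address_db: Dict[str, str],
    threshold: int,
) -> Tuple[List[Tuple[str, int]], str]:
    """Return (risky nouns with their risk level, text with risky nouns masked)."""
    risky: List[Tuple[str, int]] = []
    for noun in nouns:
        address = address_db.get(noun)
        if address is None:
            continue  # no address known for this noun
        risk = leading_match_blocks(user_blocks, address)  # assumed risk measure
        if risk >= threshold:
            risky.append((noun, risk))
    corrected = text
    for noun, _ in risky:
        corrected = corrected.replace(noun, "*" * len(noun))  # assumed correction
    return risky, corrected


# Made-up example data.
user_blocks = ["AA", "BB", "CC", "DD"]
address_db = {"Example Park": "AABBCCZZ", "Far Tower": "QQRR"}
text = "I walked to Example Park and saw Far Tower."
risky, corrected = check_document(
    text, ["Example Park", "Far Tower"], user_blocks, address_db, threshold=2
)
```

Here "Example Park" shares three leading blocks with the user's address and is masked, while "Far Tower" shares none and is left as is. A warning-only variant would return `risky` to the user without touching the text.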

なお、第１取得部は、実施の形態におけるユーザ情報取得部に対応する。また、抽出部は、実施の形態におけるキーワード抽出部に対応する。また、第２取得部及び算出部は、実施の形態における危険度算出部に対応する。また、表示部は、実施の形態における警告部、修正部、文書送信部に対応する。 The first acquisition unit corresponds to the user information acquisition unit in the embodiment. The extraction unit corresponds to the keyword extraction unit in the embodiment. The second acquisition unit and the calculation unit correspond to the risk calculation unit in the embodiment. The display unit corresponds to the warning unit, the correction unit, and the document transmission unit in the embodiment.

更に、文書監視装置を構成するコンピュータにおいて上述した各ステップを実行させるプログラムを、文書監視プログラムとして提供することができる。上述したプログラムは、コンピュータにより読取り可能な記録媒体に記憶させることによって、文書監視装置を構成するコンピュータに実行させることが可能となる。ここで、上記コンピュータにより読取り可能な記録媒体としては、ＲＯＭやＲＡＭ等のコンピュータに内部実装される内部記憶装置、ＣＤ−ＲＯＭやフレキシブルディスク、ＤＶＤディスク、光磁気ディスク、ＩＣカード等の可搬型記憶媒体や、コンピュータプログラムを保持するデータベース、或いは、他のコンピュータ並びにそのデータベースや、更に回線上の伝送媒体をも含むものである。 Furthermore, a program that causes the computer constituting the document monitoring apparatus to execute the steps described above can be provided as a document monitoring program. By storing this program on a computer-readable recording medium, the computer constituting the document monitoring apparatus can be made to execute it. Here, computer-readable recording media include internal storage devices mounted inside a computer, such as ROM and RAM; portable storage media such as CD-ROMs, flexible disks, DVDs, magneto-optical disks, and IC cards; databases holding computer programs; other computers and their databases; and transmission media on a line.

（付記１）ユーザにより作成された文書の監視をコンピュータに実行させる文書監視プログラムであって、
前記ユーザの住所を表す文字列である第１文字列を取得し、
前記ユーザにより作成された文書から名詞を抽出し、
データベースにおいて前記名詞を検索することにより、前記名詞が示す住所を表す文字列である第２文字列を取得し、
前記第１文字列と前記第２文字列との類似度から危険度を算出し、
前記危険度に基づいて前記第２文字列に関する表示を行う
ことをコンピュータに実行させる文書監視プログラム。
（付記２）付記１に記載の文書監視プログラムにおいて、
前記第１文字列に含まれる文字列を第３文字列とし、インターネット上のコンテンツのデータベースにおいて前記第３文字列と前記名詞の両方を含むコンテンツを検索し、該コンテンツが存在する場合に前記第３文字列を前記第２文字列とする文書監視プログラム。
（付記３）付記２に記載の文書監視プログラムにおいて、
前記第１文字列との類似度が最も高い第２文字列を用いて前記第３文字列を生成する文書監視プログラム。
（付記４）付記３に記載の文書監視プログラムにおいて、
前記第１文字列を複数のブロックに分割し、前記第３文字列に前記ブロックを追加して新たな第３文字列を生成する文書監視プログラム。
（付記５）付記３に記載の文書監視プログラムにおいて、
前記所定の関係は、全ての第２文字列のうち前記第１文字列との類似度が最も高い第２文字列を用いて前記第３文字列を生成する文書監視プログラム。
（付記６）付記１に記載の文書監視プログラムにおいて、
地名と該地名の住所を示す文字列とが対応付けられた住所データベースにおいて、前記名詞と一致する地名を検索し、前記名詞と一致する地名が存在した場合、該地名の住所を示す文字列を前記データベースから取得して前記第２文字列とする文書監視プログラム。
（付記７）付記１に記載の文書監視プログラムにおいて、
地名と該地名の住所を示す文字列とを対応付けて格納する住所データベースにおいて、前記名詞と一致する地名を検索し、前記名詞と一致する地名が存在しない場合、前記第１文字列に含まれる文字列を第３文字列とし、インターネットにおいて前記第３文字列と前記名詞の両方を含むコンテンツを検索し、該コンテンツが存在する場合に前記第３文字列を前記第２文字列とする文書監視プログラム。
（付記８）付記１に記載の文書監視プログラムにおいて、
前記ユーザにより作成された文書の形態素解析を行い、該文書から名詞を抽出する文書監視プログラム。
（付記９）付記１に記載の文書監視プログラムにおいて、
前記第１文字列と前記第２文字列とにおいて、連続して一致する文字列の長さに基づいて前記類似度を算出する文書監視プログラム。
（付記１０）付記１に記載の文書監視プログラムにおいて、
前記第１文字列と前記第２文字列との類似度が所定の条件を満たす場合、警告を表示する文書監視プログラム。
（付記１１）付記１に記載の文書監視プログラムにおいて、
前記第１文字列と前記第２文字列との類似度が所定の条件を満たす場合、前記文書において該第２文字列を修正する文書監視プログラム。
（付記１２）ユーザにより作成された文書の監視をコンピュータに実行させる文書監視プログラムであって、
前記ユーザの住所を取得し、
前記ユーザにより作成された文書から名詞を抽出し、
データベースにおいて前記名詞を検索することにより、前記名詞により示される位置を取得し、
前記ユーザの住所と前記位置との関連性を算出し、
前記関連性に基づいて前記名詞に関する表示を行う
ことをコンピュータに実行させる文書監視プログラム。
（付記１３）付記１２に記載の文書監視プログラムにおいて、
前記地名と該地名の緯度及び経度とを対応付けて格納する緯度経度データベースにおいて、前記名詞と一致する地名を検索し、前記名詞と一致する地名が存在した場合、該地名の緯度経度に基づいて該地名から前記ユーザの住所までの距離を算出し、前記距離が所定の条件を満たす場合、前記名詞に関する表示を行う文書監視プログラム。
（付記１４）ユーザにより作成された文書の監視を行う文書監視装置であって、
前記ユーザの住所を表す文字列である第１文字列を取得する第１取得部、
前記ユーザにより作成された文書から名詞を抽出する抽出部と、
データベースにおいて前記名詞を検索することにより、前記名詞が示す住所を表す文字列である第２文字列を取得する第２取得部と、
前記第１文字列と前記第２文字列との類似度から危険度を算出する算出部と、
前記危険度に基づいて前記第２文字列に関する表示を行う表示部と、
を備える文書監視装置。
（付記１５）付記１４に記載の文書監視装置において、
前記第２取得部は、前記第１文字列に含まれる文字列を第３文字列とし、インターネット上のコンテンツのデータベースにおいて前記第３文字列と前記名詞の両方を含むコンテンツを検索し、該コンテンツが存在する場合に前記第３文字列を前記第２文字列とする文書監視プログラム。
（付記１６）付記１５に記載の文書監視プログラムにおいて、
前記第２取得部は、前記第１文字列との類似度が所定の関係にある第２文字列を用いて前記第３文字列を生成する文書監視プログラム。
（付記１７）付記１６に記載の文書監視プログラムにおいて、
前記第２取得部は、前記第１文字列を複数のブロックに分割し、前記第３文字列に前記ブロックを追加して新たな第３文字列を生成する文書監視プログラム。
（付記１８）付記１４に記載の文書監視プログラムにおいて、
地名と該地名の住所を示す文字列とが対応付けられた住所データベースにおいて、前記名詞と一致する地名を検索し、前記名詞と一致する地名が存在した場合、該地名の住所を示す文字列を前記データベースから取得して前記第２文字列とする文書監視プログラム。
（付記１９）付記１４に記載の文書監視プログラムにおいて、
地名と該地名の住所を示す文字列とを対応付けて格納する住所データベースにおいて、前記名詞と一致する地名を検索し、前記名詞と一致する地名が存在しない場合、前記第１文字列に含まれる文字列を第３文字列とし、インターネットにおいて前記第３文字列と前記名詞の両方を含むコンテンツを検索し、該コンテンツが存在する場合に前記第３文字列を前記第２文字列とする文書監視プログラム。
（付記２０）ユーザにより作成された文書の監視を行う文書監視方法であって、
前記ユーザの住所を表す文字列である第１文字列を取得し、
前記ユーザにより作成された文書から名詞を抽出し、
データベースにおいて前記名詞を検索することにより、前記名詞が示す住所を表す文字列である第２文字列を取得し、
前記第１文字列と前記第２文字列との類似度から危険度を算出し、
前記危険度に基づいて前記第２文字列に関する表示を行う
文書監視方法。

(Supplementary Note 1) A document monitoring program for causing a computer to monitor a document created by a user, the program causing the computer to:
obtain a first character string, which is a character string representing the address of the user;
extract nouns from the document created by the user;
obtain a second character string, which is a character string representing the address indicated by a noun, by searching for the noun in a database;
calculate a risk level from the similarity between the first character string and the second character string; and
perform display related to the second character string based on the risk level.
(Supplementary Note 2) In the document monitoring program described in Supplementary Note 1,
a character string included in the first character string is set as a third character string, a database of content on the Internet is searched for content including both the third character string and the noun, and, when such content exists, the third character string is used as the second character string.
(Supplementary Note 3) In the document monitoring program described in Supplementary Note 2,
the third character string is generated using the second character string having the highest similarity to the first character string.
(Supplementary Note 4) In the document monitoring program described in Supplementary Note 3,
the first character string is divided into a plurality of blocks, and one of the blocks is added to the third character string to generate a new third character string.
(Supplementary Note 5) In the document monitoring program described in Supplementary Note 3,
the third character string is generated using, among all the second character strings, the second character string having the highest similarity to the first character string.
(Supplementary Note 6) In the document monitoring program described in Supplementary Note 1,
in an address database in which place names are associated with character strings indicating the addresses of the place names, a place name matching the noun is searched for, and, when a place name matching the noun exists, the character string indicating the address of that place name is obtained from the database and used as the second character string.
(Supplementary Note 7) In the document monitoring program described in Supplementary Note 1,
in an address database that stores place names in association with character strings indicating the addresses of the place names, a place name matching the noun is searched for, and, when no place name matching the noun exists, a character string included in the first character string is set as a third character string, the Internet is searched for content including both the third character string and the noun, and, when such content exists, the third character string is used as the second character string.
(Supplementary Note 8) In the document monitoring program described in Supplementary Note 1,
morphological analysis is performed on the document created by the user, and nouns are extracted from the document.
(Supplementary Note 9) In the document monitoring program described in Supplementary Note 1,
the similarity is calculated based on the length of the character string that matches consecutively between the first character string and the second character string.
(Supplementary Note 10) In the document monitoring program described in Supplementary Note 1,
a warning is displayed when the similarity between the first character string and the second character string satisfies a predetermined condition.
(Supplementary Note 11) In the document monitoring program described in Supplementary Note 1,
the second character string is corrected in the document when the similarity between the first character string and the second character string satisfies a predetermined condition.
(Supplementary Note 12) A document monitoring program for causing a computer to monitor a document created by a user, the program causing the computer to:
obtain the address of the user;
extract nouns from the document created by the user;
obtain the position indicated by a noun by searching for the noun in a database;
calculate the relevance between the address of the user and the position; and
perform display related to the noun based on the relevance.
(Supplementary Note 13) In the document monitoring program described in Supplementary Note 12,
in a latitude/longitude database that stores place names in association with the latitudes and longitudes of the place names, a place name matching the noun is searched for; when a place name matching the noun exists, the distance from that place name to the address of the user is calculated based on the latitude and longitude of the place name; and, when the distance satisfies a predetermined condition, display related to the noun is performed.
(Supplementary Note 14) A document monitoring apparatus that monitors a document created by a user, comprising:
a first acquisition unit that obtains a first character string, which is a character string representing the address of the user;
an extraction unit that extracts nouns from the document created by the user;
a second acquisition unit that obtains a second character string, which is a character string representing the address indicated by a noun, by searching for the noun in a database;
a calculation unit that calculates a risk level from the similarity between the first character string and the second character string; and
a display unit that performs display related to the second character string based on the risk level.
(Supplementary Note 15) In the document monitoring apparatus described in Supplementary Note 14,
the second acquisition unit sets a character string included in the first character string as a third character string, searches a database of content on the Internet for content including both the third character string and the noun, and, when such content exists, uses the third character string as the second character string.
(Supplementary Note 16) In the document monitoring program described in Supplementary Note 15,
the second acquisition unit generates the third character string using a second character string whose similarity to the first character string is in a predetermined relationship.
(Supplementary Note 17) In the document monitoring program described in Supplementary Note 16,
the second acquisition unit divides the first character string into a plurality of blocks and adds one of the blocks to the third character string to generate a new third character string.
(Supplementary Note 18) In the document monitoring program described in Supplementary Note 14,
in an address database in which place names are associated with character strings indicating the addresses of the place names, a place name matching the noun is searched for, and, when a place name matching the noun exists, the character string indicating the address of that place name is obtained from the database and used as the second character string.
(Supplementary Note 19) In the document monitoring program described in Supplementary Note 14,
in an address database that stores place names in association with character strings indicating the addresses of the place names, a place name matching the noun is searched for, and, when no place name matching the noun exists, a character string included in the first character string is set as a third character string, the Internet is searched for content including both the third character string and the noun, and, when such content exists, the third character string is used as the second character string.
(Supplementary Note 20) A document monitoring method for monitoring a document created by a user, the method comprising:
obtaining a first character string, which is a character string representing the address of the user;
extracting nouns from the document created by the user;
obtaining a second character string, which is a character string representing the address indicated by a noun, by searching for the noun in a database;
calculating a risk level from the similarity between the first character string and the second character string; and
performing display related to the second character string based on the risk level.
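
The distance-based variant of Supplementary Notes 12 and 13 can be sketched as below. The notes only say that a distance is calculated from latitude and longitude and compared against a predetermined condition; the haversine great-circle formula, the "within `max_km`" condition, the function names, and the sample coordinates are assumptions chosen here for illustration.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_place_names(
    nouns: List[str],
    user_latlon: Tuple[float, float],
    latlon_db: Dict[str, Tuple[float, float]],
    max_km: float,
) -> List[Tuple[str, float]]:
    """Return (noun, distance) for nouns whose place lies within max_km of the user."""
    hits = []
    for noun in nouns:
        place = latlon_db.get(noun)
        if place is None:
            continue  # noun is not a known place name
        distance = haversine_km(user_latlon[0], user_latlon[1], place[0], place[1])
        if distance <= max_km:  # assumed form of the "predetermined condition"
            hits.append((noun, distance))
    return hits


# Made-up latitude/longitude database and user position.
latlon_db = {"Near Station": (35.01, 139.0), "Far Castle": (36.0, 139.0)}
hits = nearby_place_names(
    ["Near Station", "Far Castle", "lunch"], (35.0, 139.0), latlon_db, max_km=5.0
)
```

With these sample values only "Near Station", roughly 1.1 km from the user, is reported. Nouns that are not in the database are skipped.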

実施の形態１に係る文書監視装置の構成の一例を示すブロック図である。 FIG. 1 is a block diagram showing an example of the configuration of the document monitoring apparatus according to Embodiment 1.
実施の形態１に係る文書監視装置の動作の一例を示すフローチャートである。 FIG. 2 is a flowchart showing an example of the operation of the document monitoring apparatus according to Embodiment 1.
実施の形態１に係る緯度経度ＤＢの内容の一例を示す表である。 FIG. 3 is a table showing an example of the contents of the latitude/longitude DB according to Embodiment 1.
実施の形態１に係る危険キーワードＤＢの内容の一例を示す表である。 FIG. 4 is a table showing an example of the contents of the danger keyword DB according to Embodiment 1.
実施の形態１に係る危険キーワード表示処理による表示の一例を示す画面である。 FIG. 5 is a screen showing an example of display by the danger keyword display process according to Embodiment 1.
実施の形態２に係る文書監視装置の構成の一例を示すブロック図である。 FIG. 6 is a block diagram showing an example of the configuration of the document monitoring apparatus according to Embodiment 2.
実施の形態２に係る文書監視装置の動作の一例を示すフローチャートである。 FIG. 7 is a flowchart showing an example of the operation of the document monitoring apparatus according to Embodiment 2.
実施の形態２に係るユーザ住所文字列の分割の一例を示す図である。 FIG. 8 is a diagram showing an example of the division of the user address character string according to Embodiment 2.
実施の形態２に係る危険度ＤＢの内容の一例を示す表である。 FIG. 9 is a table showing an example of the contents of the risk level DB according to Embodiment 2.
実施の形態２に係る住所文字列ＤＢの内容の一例を示す表である。 FIG. 10 is a table showing an example of the contents of the address character string DB according to Embodiment 2.
実施の形態２に係る第１危険度算出処理の動作の一例を示すフローチャートである。 FIG. 11 is a flowchart showing an example of the operation of the first risk level calculation process according to Embodiment 2.
実施の形態２に係る第２危険度算出処理の動作の一例を示すフローチャートである。 FIG. 12 is a flowchart showing an example of the operation of the second risk level calculation process according to Embodiment 2.
実施の形態２に係る第２危険度算出処理に関する情報の一例を示す図である。 FIG. 13 is a diagram showing an example of information related to the second risk level calculation process according to Embodiment 2.
実施の形態２に係る第２危険度算出処理の一例を示す図である。 FIG. 14 is a diagram showing an example of the second risk level calculation process according to Embodiment 2.

Explanation of symbols

１ユーザ端末、２サーバ、１１ユーザ情報登録部、１２ユーザ情報ＤＢ、１３ユーザ認証部、１４ユーザ情報取得部、２１文書受信部、２２文書ＤＢ、２４キーワード抽出部、２６抽出キーワードＤＢ、３１緯度経度取得部、３２緯度経度ＤＢ、３３距離算出部、３４危険キーワードＤＢ、３５判定部、４２警告部、４３修正部、４４文書送信部、５１住所文字列取得部、５２住所文字列ＤＢ、５３危険度算出部、５４危険度ＤＢ、５５判定部。 1: user terminal; 2: server; 11: user information registration unit; 12: user information DB; 13: user authentication unit; 14: user information acquisition unit; 21: document reception unit; 22: document DB; 24: keyword extraction unit; 26: extracted keyword DB; 31: latitude/longitude acquisition unit; 32: latitude/longitude DB; 33: distance calculation unit; 34: danger keyword DB; 35: determination unit; 42: warning unit; 43: correction unit; 44: document transmission unit; 51: address character string acquisition unit; 52: address character string DB; 53: risk level calculation unit; 54: risk level DB; 55: determination unit.

Claims

A document monitoring program for causing a computer to monitor a document created by a user, the program causing the computer to:
obtain a first character string, which is a character string representing the address of the user;
extract nouns from the document created by the user;
obtain a second character string, which is a character string representing the address indicated by a noun, by searching for the noun in a database;
calculate a risk level from the similarity between the first character string and the second character string; and
perform display related to the second character string based on the risk level.

The document monitoring program according to claim 1,
wherein a character string included in the first character string is set as a third character string, a database of content on the Internet is searched for content including both the third character string and the noun, and, when such content exists, the third character string is used as the second character string.

The document monitoring program according to claim 2,
A document monitoring program for generating the third character string using a second character string having the highest similarity to the first character string.

The document monitoring program according to claim 2,
wherein, in an address database that stores place names in association with character strings indicating the addresses of the place names, a place name matching the noun is searched for, and, when no place name matching the noun exists, a character string included in the first character string is set as a third character string, the Internet is searched for content including both the third character string and the noun, and, when such content exists, the third character string is used as the second character string.

The document monitoring program according to any one of claims 1 to 4,
wherein, in an address database in which place names are associated with character strings indicating the addresses of the place names, a place name matching the noun is searched for, and, when a place name matching the noun exists, the character string indicating the address of that place name is obtained from the database and used as the second character string.

A document monitoring apparatus for monitoring a document created by a user, comprising:
a first acquisition unit that obtains a first character string, which is a character string representing the address of the user;
an extraction unit that extracts nouns from the document created by the user;
a second acquisition unit that obtains a second character string, which is a character string representing the address indicated by a noun, by searching for the noun in a database;
a calculation unit that calculates a risk level from the similarity between the first character string and the second character string; and
a display unit that performs display related to the second character string based on the risk level.

A document monitoring method for monitoring a document created by a user, the method comprising:
obtaining a first character string, which is a character string representing the address of the user;
extracting nouns from the document created by the user;
obtaining a second character string, which is a character string representing the address indicated by a noun, by searching for the noun in a database;
calculating a risk level from the similarity between the first character string and the second character string; and
performing display related to the second character string based on the risk level.