JP3676120B2

JP3676120B2 - Text electronic authentication apparatus, method, and recording medium recording text electronic authentication program

Info

Publication number: JP3676120B2
Application number: JP14567699A
Authority: JP
Inventors: 博人稲垣; 和宏早川; 大二郎森; 一男田中
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1999-05-25
Filing date: 1999-05-25
Publication date: 2005-07-27
Anticipated expiration: 2019-05-25
Also published as: JP2000338870A

Description

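[0001]
[Technical field to which the invention belongs]
The present invention relates to an electronic authentication apparatus that places watermark information in text for the purpose of electronically authenticating an electronic document (file) in which the text is described.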
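[0002]
[Prior art]
Conventionally, HTML (Hyper Text Markup Language) document files are used as the standard document format on the Internet. Text described in an HTML document file is authenticated by using a secure method that can be certified from an appropriate site, for example SSL (Secure Sockets Layer).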
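[0003]
[Problems to be solved by the invention]
However, these documents can be retrieved from the server side to the client side by a browser, or by software called a crawler that mechanically fetches HTML documents from a web server, so a document retrieved to the client side could be used illegally. In addition, methods such as SSL are appropriate when only specific text information is exchanged, but it is difficult for them to authenticate arbitrary documents appropriately, and it has been difficult to give appropriate authentication to general HTML documents.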
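[0004]
The present invention has been made in view of the above points. It provides a text electronic authentication apparatus and method, and a recording medium recording a text electronic authentication program, that can give appropriate authentication to electronic documents typified by the HTML documents used as a standard on the Internet, and that can retrieve the authentication information (watermark information) embedded in an electronic document such as an HTML document in order to eliminate illegal copies that ignore copyright.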
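[0005]
[Means for solving the problems]
The invention according to claim 1 is a text electronic authentication apparatus that makes an electronic document in which HTML text is described authenticable by embedding authentication information in the electronic document, the apparatus comprising: a text reading unit that reads the HTML text from the electronic document in which the HTML text is described; a text publisher information input unit that inputs information on the publisher of the HTML text described in the electronic document; a text authentication information input unit that inputs information for authenticating the publisher of the HTML text; a text authentication information generation unit that generates encrypted data that a user cannot decipher from the publisher information input to the text publisher information input unit and the information for authenticating the publisher of the HTML text input to the text authentication information input unit, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a web browser; and a text authentication information embedding unit that embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.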
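[0006]
The invention according to claim 2 is a text electronic authentication apparatus that makes an electronic document in which HTML text is described authenticable by embedding authentication information in the electronic document, the apparatus comprising: a text reading unit that reads the HTML text from the electronic document in which the HTML text is described; a text feature extraction unit that extracts features of the HTML text read by the text reading unit; a text publisher information input unit that inputs information on the publisher of the HTML text described in the electronic document; a text authentication information input unit that inputs information for authenticating the publisher of the HTML text; a text authentication information generation unit that generates encrypted data that a user cannot decipher from the information representing the features of the HTML text extracted by the text feature extraction unit, the publisher information input to the text publisher information input unit, and the information for authenticating the publisher of the HTML text input to the text authentication information input unit, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a web browser; and a text authentication information embedding unit that embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.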
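[0007]
The invention according to claim 3 is the text electronic authentication apparatus according to claim 2, wherein, in generating the encrypted data, the text authentication information generation unit generates first encrypted data by encrypting the publisher information and the information for authenticating the publisher of the HTML text, and generates the final encrypted data from the first encrypted data and the information representing the features of the HTML text.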
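[0008]
The invention according to claim 4 is an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded, the apparatus comprising: a text authentication information retrieval unit that reads the electronic document, takes out the portion described in 2-byte codes that are not used in the kanji code and are invisible in a web browser as the encrypted and code-converted authentication information, and separates the remainder of the electronic document, with that authentication information removed, as the HTML text; a text authentication information reading unit that code-converts the encrypted and code-converted authentication information separated and taken out by the text authentication information retrieval unit into encrypted authentication information that a user cannot decipher, by converting the 2-byte codes that are not used in the kanji code and are invisible in a web browser into 1-byte codes through a one-to-one mapping, and decrypts the encrypted authentication information to obtain the authentication information; a text publisher information extraction unit that reads the text publisher information from the authentication information decrypted by the text authentication information reading unit; and a text authentication information extraction unit that reads the information for authenticating the text publisher from the authentication information decrypted by the text authentication information reading unit.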
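[0009]
The invention according to claim 5 is an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded, the apparatus comprising: a text authentication information retrieval unit that reads the electronic document, takes out the portion described in 2-byte codes that are not used in the kanji code and are invisible in a web browser as the encrypted and code-converted authentication information, and separates the remainder of the electronic document, with that authentication information removed, as the HTML text; a text feature retrieval unit that extracts features of the HTML text based on the HTML text described in the electronic document and taken out by the text authentication information retrieval unit; a text authentication information reading unit that code-converts the encrypted and code-converted authentication information separated and taken out by the text authentication information retrieval unit into encrypted authentication information that a user cannot decipher, by converting the 2-byte codes that are not used in the kanji code and are invisible in a web browser into 1-byte codes through a one-to-one mapping, and decrypts the encrypted authentication information using the features of the HTML text to obtain the authentication information; a text publisher information extraction unit that reads the text publisher information from the authentication information decrypted by the text authentication information reading unit; and a text authentication information extraction unit that reads the information for authenticating the text publisher from the authentication information decrypted by the text authentication information reading unit.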
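[0010]
The invention according to claim 6 is an electronic authentication method in a text electronic authentication apparatus that makes an electronic document in which HTML text is described authenticable by embedding authentication information in the electronic document, the method comprising: a text reading procedure in which a text reading unit reads the HTML text from the electronic document in which the HTML text is described; a text publisher information input procedure in which a text publisher information input unit inputs information on the publisher of the HTML text described in the electronic document; a text authentication information input procedure in which a text authentication information input unit inputs information for authenticating the publisher of the HTML text; a text authentication information generation procedure in which a text authentication information generation unit generates encrypted data that a user cannot decipher from the publisher information input to the text publisher information input unit and the information for authenticating the publisher of the HTML text input to the text authentication information input unit, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a web browser; and a text authentication information embedding procedure in which a text authentication information embedding unit embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.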
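[0011]
The invention according to claim 7 is an electronic authentication method in a text electronic authentication apparatus that makes an electronic document in which HTML text is described authenticable by embedding authentication information in the electronic document, the method comprising: a text reading procedure in which a text reading unit reads the HTML text from the electronic document in which the HTML text is described; a text feature extraction procedure in which a text feature extraction unit extracts features of the HTML text read by the text reading unit; a text publisher information input procedure in which a text publisher information input unit inputs information on the publisher of the HTML text described in the electronic document; a text authentication information input procedure in which a text authentication information input unit inputs information for authenticating the publisher of the HTML text; a text authentication information generation procedure in which a text authentication information generation unit generates encrypted data that a user cannot decipher from the information representing the features of the HTML text extracted by the text feature extraction unit, the publisher information input to the text publisher information input unit, and the information for authenticating the publisher of the HTML text input to the text authentication information input unit, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a web browser; and a text authentication information embedding procedure in which a text authentication information embedding unit embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.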
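[0012]
The invention according to claim 8 is an electronic authentication method in an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded, the method comprising: a procedure in which a text authentication information retrieval unit reads the electronic document, takes out the portion described in 2-byte codes that are not used in the kanji code and are invisible in a web browser as the encrypted and code-converted authentication information, and separates the remainder of the electronic document, with that authentication information removed, as the HTML text; a procedure in which a text authentication information reading unit code-converts the encrypted and code-converted authentication information separated and taken out by the text authentication information retrieval unit into encrypted authentication information that a user cannot decipher, by converting the 2-byte codes that are not used in the kanji code and are invisible in a web browser into 1-byte codes through a one-to-one mapping, and decrypts the encrypted authentication information to obtain the authentication information; a procedure in which a text publisher information extraction unit reads the text publisher information from the authentication information decrypted by the text authentication information reading unit; and a procedure in which a text authentication information extraction unit reads the information for authenticating the text publisher from the authentication information decrypted by the text authentication information reading unit.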
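[0013]
The invention according to claim 9 is an electronic authentication method in an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded, the method comprising: a procedure in which a text authentication information retrieval unit reads the electronic document, takes out the portion described in 2-byte codes that are not used in the kanji code and are invisible in a web browser as the encrypted and code-converted authentication information, and separates the remainder of the electronic document, with that authentication information removed, as the HTML text; a procedure in which a text feature retrieval unit extracts features of the HTML text based on the HTML text described in the electronic document and taken out by the text authentication information retrieval unit; a procedure in which a text authentication information reading unit code-converts the encrypted and code-converted authentication information separated and taken out by the text authentication information retrieval unit into encrypted authentication information that a user cannot decipher, by converting the 2-byte codes that are not used in the kanji code and are invisible in a web browser into 1-byte codes through a one-to-one mapping, and decrypts the encrypted authentication information using the features of the HTML text to obtain the authentication information; a procedure in which a text publisher information extraction unit reads the text publisher information from the authentication information decrypted by the text authentication information reading unit; and a procedure in which a text authentication information extraction unit reads the information for authenticating the text publisher from the authentication information decrypted by the text authentication information reading unit.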
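[0014]
The invention according to claim 10 is a computer-readable recording medium recording a program for implementing on a computer the text electronic authentication method according to claim 6 or 7.
The invention according to claim 11 is a computer-readable recording medium recording a program for implementing on a computer the text electronic authentication method according to claim 8 or 9.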
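[0015]
[Embodiments of the invention]
Embodiments of the present invention are described below with reference to the drawings.
Fig. 1 is a block diagram showing the configuration of a text electronic authentication apparatus according to an embodiment of the present invention.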
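[0016]
T-1 is a text reading unit, which opens a file in which text is described and reads it into the apparatus or onto a storage medium.
T-2 is a text feature extraction unit, which extracts data representing the features of the text read by the text reading unit T-1 (text feature information). Various kinds of text feature information are conceivable. For example, all of the text data may be treated as text feature information. In that case, if the amount of data used to authenticate the text is fixed, the larger the text feature information becomes, the smaller the proportion of it that is described in the text authentication information. In other words, when all of the text data is used as text feature information, the text data as a whole can be authenticated, but it is likely that the individual pieces of content described in the text can no longer be authenticated.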
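[0017]
As for how the apparatus is used: when the whole text is to be authenticated, the whole text should be extracted as text feature information. When the content of the text is also to be authenticated, for example to verify whether a passage is part of a text one wrote oneself, independent words or important words contained in the text must be extracted and used as text feature information instead of the whole text. Taking these characteristics into account, the user sets in the text feature extraction unit T-2 the optimum text feature amount (the size of the text feature information), an example of which is shown below, or a flag (identifier, text feature parameter) corresponding to that text feature amount.

Based on the text feature amount set by the user, the text feature extraction unit T-2 notifies the text authentication information generation unit T-5 of a flag specifying the text extraction method used, as the text feature parameter.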
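[0018]
T-3 is a text publisher information input unit. It describes information showing who published the text (text publisher information), for example the company that holds the copyright of the text, or the address, name, and URL of the individual who authored it.
T-4 is a text authentication information input unit. It is the part that inputs the publisher's ID (text publisher ID) issued by a public body or some kind of certification company. The publisher can be identified from this text publisher ID. Because it must identify the publisher uniquely, the text publisher ID has to be globally unique.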
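[0019]
T-5 is a text authentication information generation unit. It takes as input the text feature information and text feature parameter extracted by the text feature extraction unit T-2, the text publisher information input through the text publisher information input unit T-3, and the text publisher ID input through the text authentication information input unit T-4, and generates the text authentication information for the text concerned.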
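[0020]
In detail, as shown in Fig. 2, it consists of: a text feature information input unit T-5a that inputs the text feature information extracted by the text feature extraction unit T-2; a feature parameter input unit T-5b that inputs the text feature parameter notified by the text feature extraction unit T-2; a publisher information input unit T-5c that inputs the text publisher information input through the text publisher information input unit T-3; a publisher ID input unit T-5d that inputs the text publisher ID input through the text authentication information input unit T-4; an encryptor T-5e that encrypts the text authentication information made up of the text feature information, text publisher information, text publisher ID, and text feature parameter; a code conversion unit T-5f that further converts the encrypted text authentication information into codes invisible to a browser (an invisible byte string); and an encrypted authentication information output unit T-5g that outputs the invisible byte string produced by the code conversion unit T-5f. Encryption and code conversion are described in detail later.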
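[0021]
T-6 is a text authentication information embedding unit. It embeds the text authentication information (invisible byte string) generated by the text authentication information generation unit T-5 in the input text. Because the carrier is text, however, a simple embedding may reveal to others that something has been embedded, and they may then delete the text authentication information. Embedding it in the text may also get in the way of ordinary browsing of the text. This apparatus is therefore characterized in that it scatters the text authentication information generated by T-5 through the text so that it is hard for others to detect, and embeds it in a form that does not interfere with browsing. That is, the text authentication information is embedded so that the text looks no different from normal when browsed, and it is distributed through the text so that it also withstands modifications such as cutting or deleting parts of the text.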
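[0022]
In detail, as shown in Fig. 3, it consists of: a feature parameter input unit T-6a that reads the text feature parameter (F value) from the text authentication information generation unit T-5; an encrypted authentication information input unit T-6b that inputs the invisible byte string from the text authentication information generation unit T-5; a text input unit T-6c that inputs the text supplied by the text reading unit T-1; a determination unit T-6d that determines whether all of the invisible byte string has been output and controls each output unit; an embedded encrypted authentication information output unit T-6e that reads the invisible byte string based on the text feature parameter (F value) and outputs the invisible bytes it has read; an embedded text output unit T-6f that reads the input text based on the text feature parameter (F value) and outputs the text bytes it has read; and a text output unit T-6g that outputs the remaining input text once all of the invisible byte string has been output.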
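[0023]
T-7 is a text authentication information retrieval unit. It is the processing unit that takes the text authentication information embedded by the text authentication information embedding unit T-6 out of the text in which it is embedded, for authentication and similar purposes. This unit also separates the text authentication information from the original text.
T-8 is a text feature retrieval unit. It extracts, from the text with the text authentication information separated out (the original text), the text feature information that was used when the text authentication information was generated. Since T-7 separates the text authentication information from the original text, the features of the text (text feature information) are recomputed from the separated text.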
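[0024]
T-9 is a text authentication reading unit. Using the text feature information extracted by T-8, it separates and extracts the text publisher ID and the text publisher information from the text authentication information separated by the text authentication information retrieval unit T-7.
T-10 is a text publisher information extraction unit, which takes the text publisher information extracted by T-9 and outputs it.
T-11 is a text authentication information extraction unit, which takes the text publisher ID extracted by T-9 and outputs it to an output device.
The text electronic authentication apparatus may be built as a dedicated device, or it may be realized by having a computer load and execute a program that implements the functions of the units above. An input device, a display device, and so on (none shown) are assumed to be connected to the apparatus as peripherals. Here, an input device means an input device such as a keyboard or mouse, and a display device means a CRT (Cathode Ray Tube), a liquid crystal display, or the like.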
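[0025]
Next, the operation of the text electronic authentication apparatus of this embodiment, configured as above, is described with reference to the drawings. The following description is an example of operation on an HTML document (HTML text).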
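[0026]
HTML text, the mainstream format on the Internet, is a document whose text is structured according to text attributes written in tags delimited by "<" and ">". These tags are generally approved and specified by bodies such as the W3C (World Wide Web Consortium). Authoring software and similar tools exist for writing HTML text. For viewing (browsing) documents written as HTML text, there is what is generally called a browser. Based on the tag information, a browser structures the text information, lays it out on the display device, and presents it. Such browsers have a mechanism for interpreting and displaying ordinary HTML text.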
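[0027]
Consider, for example, the following HTML text.
<HTML>
<TITLE>これはテストです</TITLE>
<BODY>
<H1>今日は天気がよい。
</BODY>
</HTML>
As this shows, HTML text has a plain-text structure.
When it is displayed in a browser capable of viewing HTML text, it looks, for example, like Fig. 4. The <TITLE> part is displayed at the very top of the display (browser), while the part marked by the <H1> tag is displayed inside the display area.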
[0028]
The text reading unit T-1 reads such HTML text (Fig. 8(a): step S1). If the data is read in 8-bit units, the following group of bytes is read into the apparatus.

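[0029]
Next, the text feature extraction unit T-2 extracts the text feature information, that is, data representing the features of the text read by the text reading unit T-1 (Fig. 8(a): step S2).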
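[0030]
For example, the relationship among the text feature amount, the part of the text extracted, the text extraction method, and the text feature parameter (F value) is defined as follows.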

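[0031]
When the whole text is extracted, all of the text read by the text reading unit T-1 is passed unchanged to the text authentication information generation unit T-5 as the text feature information. The flag (F) indicating the text extraction method is set at initialization.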
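[0032]
Passing text feature information consisting of the whole text means passing the following byte string to the text authentication information generation unit T-5. In this case 167 bytes are required.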

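[0033]
HTML text structures a document through its tag information, but the tags themselves do not carry the meaning of the document. The text with the HTML tag information removed can therefore also be used as the text feature information. For example,
これはテストです
今日は天気がよい。
expressed as a byte string becomes

which can be represented in 38 bytes.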
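The tag-removal step can be sketched as follows. This is only an illustration under stated assumptions: the function name `strip_tags`, the regular expression, and the choice to drop empty lines are not from the source, and EUC-JP is assumed as the byte encoding only because the hex dumps elsewhere in the description (for example `baa3 c6fc` for 今日) are consistent with it. The byte count produced by this sketch depends on those choices, so it is not asserted against the figure quoted above.

```python
import re

# Hypothetical helper: anything between "<" and ">" is treated as a tag.
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(html: str) -> str:
    """Remove tags and return the remaining non-empty text lines."""
    without_tags = TAG_PATTERN.sub("", html)
    lines = [line.strip() for line in without_tags.splitlines()]
    return "\n".join(line for line in lines if line)


sample = (
    "<HTML>\n"
    "<TITLE>これはテストです</TITLE>\n"
    "<BODY>\n"
    "<H1>今日は天気がよい。\n"
    "</BODY>\n"
    "</HTML>\n"
)

feature_text = strip_tags(sample)
# Assumed encoding: EUC-JP (inferred from the hex dumps, not stated in the source).
feature_bytes = feature_text.encode("euc_jp")
print(feature_text)
print(feature_bytes.hex())
```

A real extractor would likely need an HTML parser rather than a regular expression, since attribute values and comments can contain ">".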
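[0034]
Furthermore, the independent words contained in a text are often used as its keywords and to represent its content. Morphological analysis is the usual way to extract these independent words. In morphological analysis, the input character string is looked up in a word dictionary to obtain information such as part-of-speech information (POS), sentence-initial permission information (initial OK), forward connection information (forward), and backward connection information (backward). An ordinary word dictionary uses a special structure called a TRIE dictionary structure so that lookups are fast.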
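[0035]
When the dictionary contains entries such as "ああ", "あいさつ", and "あい", the entries that share a first character (since this is Japanese, a character here means one 2-byte Japanese character, unlike the alphabet), those that share a second character, and so on are arranged successively in a tree structure. When the match continues through the last character, the part-of-speech information, sentence-initial permission information, forward connection information, backward connection information, and so on for that dictionary entry are described at that point.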
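A minimal sketch of such a tree-structured dictionary is given below. The class and method names, and the contents of the per-word information records, are hypothetical; the source only states that entries sharing leading characters share a path and that the word information is stored where a word ends.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    entry: Optional[dict] = None  # set only on nodes where a dictionary word ends


class TrieDictionary:
    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, word: str, info: dict) -> None:
        """Insert a word; words with a common prefix share the same path."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.entry = info

    def common_prefix_search(self, text: str, start: int = 0) -> List[Tuple[str, dict]]:
        """Return every dictionary word that begins at text[start]."""
        node = self.root
        found: List[Tuple[str, dict]] = []
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.entry is not None:
                found.append((text[start:i + 1], node.entry))
        return found


# Hypothetical word information; the field names are illustrative only.
dictionary = TrieDictionary()
dictionary.add("ああ", {"pos": "interjection", "initial_ok": True})
dictionary.add("あい", {"pos": "noun", "initial_ok": True})
dictionary.add("あいさつ", {"pos": "noun", "initial_ok": True})

matches = dictionary.common_prefix_search("あいさつする")
print([word for word, _ in matches])
```

One walk from the root yields all candidate words starting at a position, which is what a morphological analyzer needs when it builds its candidate lattice.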
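[0036]
Sentence-initial permission information is a flag indicating whether a word may appear at the start of a sentence. A word marked as permitted may appear at the start of a sentence; a word marked as not permitted may not. With forward connection information, a connection is permitted only when the part of speech or attribute of the preceding word is appropriate, and a word whose forward connection is not permitted is removed from the candidates. Likewise, with backward connection information, a connection is permitted only when the part of speech or attribute of the following word is appropriate, and a word whose backward connection is not permitted is removed from the candidates.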
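[0037]
Candidates are selected through these part-of-speech connections. The most likely candidate is selected by what is called the minimum cost method, which takes the morpheme candidate with the lowest cost as the most likely candidate. Two kinds of cost are used in morphological analysis:
1. Connection cost
2. Word cost
The connection cost is the cost required to connect one word to another. Because it applies between words, the connection cost between a word and its own inflection is 0. The word cost is the cost associated with the word itself; for example, a frequently used word has a low cost. An inflection is not a word, so its cost is 0. Morphological analysis breaks the text into words and at the same time assigns each word the part of speech considered most likely to be correct. In this embodiment, the candidate that minimizes the sum of the connection costs and word costs is taken as the most likely candidate. The numerical values of the connection costs and word costs are defined separately.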
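The minimum cost selection can be illustrated with a small dynamic-programming sketch. Everything numeric here is an assumption: the lexicon, the part-of-speech labels, the word costs, and the connection cost rule are invented for the example, since the source says only that the costs are defined separately.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: surface form -> (part of speech, word cost).
LEXICON: Dict[str, Tuple[str, int]] = {
    "今日": ("noun", 2), "今": ("noun", 5), "日": ("noun", 5),
    "は": ("particle", 1), "天気": ("noun", 2), "天": ("noun", 5),
    "気": ("noun", 5), "が": ("particle", 1), "よい": ("adjective", 2),
    "よ": ("particle", 4), "い": ("noun", 6),
}


def connection_cost(prev_pos: str, pos: str) -> int:
    """Hypothetical connection costs: common transitions are cheap."""
    cheap = {("BOS", "noun"), ("noun", "particle"),
             ("particle", "noun"), ("particle", "adjective")}
    return 1 if (prev_pos, pos) in cheap else 3


def segment(text: str) -> List[Tuple[str, str]]:
    """Return the (word, pos) sequence with the smallest total cost."""
    # best[i][pos] = (total cost, previous index, previous pos, surface)
    best: List[dict] = [dict() for _ in range(len(text) + 1)]
    best[0]["BOS"] = (0, -1, "", "")
    for i in range(len(text)):
        for prev_pos, (cost, *_rest) in list(best[i].items()):
            for j in range(i + 1, len(text) + 1):
                surface = text[i:j]
                if surface not in LEXICON:
                    continue
                pos, word_cost = LEXICON[surface]
                total = cost + connection_cost(prev_pos, pos) + word_cost
                if pos not in best[j] or total < best[j][pos][0]:
                    best[j][pos] = (total, i, prev_pos, surface)
    if not best[-1]:
        raise ValueError("no segmentation found")
    # Backtrack from the cheapest final state.
    pos = min(best[-1], key=lambda p: best[-1][p][0])
    i = len(text)
    words: List[Tuple[str, str]] = []
    while i > 0:
        _, prev_i, prev_pos, surface = best[i][pos]
        words.append((surface, pos))
        i, pos = prev_i, prev_pos
    return list(reversed(words))


result = segment("今日は天気がよい")
print(result)
```

With these invented costs the two-character nouns win over their one-character splits because each extra word adds both a word cost and a connection cost.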
[0038]
Now suppose the earlier example sentences are input.
これはテストです
今日は天気がよい。
The morphological analysis for this example is as follows.
[0039]

[0040]

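[0041]
Extracting only the independent words from these example sentences gives
これ、テスト、今日、天気
Expressed as a byte string, this becomes the following.

In this example the features of the text can be expressed in a total of 19 bytes.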
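[0042]
A summary of the text can also be used. For example, with the text summarization technique invented by Inagaki et al. (Japanese Patent Application No. H10-180181, "Document summarization apparatus and recording medium recording a program therefor"), the important gist can be extracted from a text.
Suppose, for example, that
今日は天気がよい。
is chosen as the gist of the example above. The corresponding byte string is then as follows.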

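[0043]
Further, applying morphological analysis to the sentence above and extracting its independent words gives
今日、天気
Expressed as a byte string, this becomes the following.
0000000 baa3 c6fc c5b7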
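[0044]
In this way, suitable text feature information is extracted according to the intended use. When the whole text is to be authenticated, the whole text should be extracted. When the content of the text is also to be authenticated, for example to verify whether a passage is part of a text one wrote oneself, the independent words or important words contained in the text must be extracted and used as the text feature information instead of the whole text. The user sets the optimum text feature amount with these characteristics in mind. Based on the text feature amount set by the user, the text feature extraction unit T-2 notifies the text authentication information generation unit T-5 of the text extraction method (F value) used to extract the text feature information.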
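[0045]
Next, the text publisher information input unit T-3 receives information showing who published the text (text publisher information) and describes it in a predetermined format (Fig. 8(a): step S3). For example, information identifying the publisher, such as the company that holds the copyright of the text or the address, name, and URL of the individual who authored it, is input and described in the predetermined format.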
[0046]
As an example, text publisher information such as the following is described.
<氏名>あいうえおたろう</氏名>
<所属>たろう株式会社</所属>
<住所>京都太郎区１</住所>
<URL>http://aaaaaa.ne.jp/aaa.html</URL>
<作成日>９９年３月１日</作成日>
<発行日>９９年３月２日</発行日>
<権利保有日>２０００年３月２日</権利保有日>
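[0047]
The text publisher information input unit T-3 then sends the text publisher information to the text authentication information generation unit T-5. When it does so, each value is enclosed in tags naming its attribute, in the same way as SGML (Standard Generalized Markup Language), to make clear what the attribute is (an address, a name, and so on). For example, the attribute value, here the name, is written between the tags <氏名> and </氏名>. The end of an attribute such as the name is marked by the tag written with "/" (here, </氏名>).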
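The tagged record format can be sketched as a pair of helper functions. The function names and the regular expression are hypothetical, and the sketch assumes that attribute values contain no nested tags, which the source does not state either way.

```python
import re
from typing import Dict


def build_record(fields: Dict[str, str]) -> str:
    """Wrap each value in <name>...</name>, one attribute per line."""
    return "\n".join(f"<{name}>{value}</{name}>" for name, value in fields.items())


# An opening tag, the value, and the matching closing tag written with "/".
FIELD_PATTERN = re.compile(r"<([^/<>]+)>(.*?)</\1>", re.DOTALL)


def parse_record(record: str) -> Dict[str, str]:
    """Recover the attribute names and values from a tagged record."""
    return {name: value for name, value in FIELD_PATTERN.findall(record)}


publisher = {
    "氏名": "あいうえおたろう",
    "所属": "たろう株式会社",
    "URL": "http://aaaaaa.ne.jp/aaa.html",
}
record = build_record(publisher)
print(record)
print(parse_record(record))
```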
[0048]
Expressed as a byte string, these become the following.

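[0049]
The text authentication information input unit T-4 receives the publisher's ID issued by a public body or some kind of certification company and describes it in a predetermined format (Fig. 8(b): step S4). The publisher can be identified from this text publisher ID. Because it must identify the publisher uniquely, the text publisher ID has to be globally unique.
For example, the issuing organization ID shown below and the ID that this issuing organization has uniquely issued are together described as the text publisher ID.
<発行元組織ＩＤ>ＡＡＡ</発行元組織ＩＤ>
<発行元ＩＤ>123456789</発行元ＩＤ>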
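[0050]
Next, the text authentication information generation unit T-5 generates the text authentication information (Fig. 8(a): step S5). In detail, the text feature information input unit T-5a receives the text feature information extracted by the text feature extraction unit T-2 (Fig. 9: step S51); the feature parameter input unit T-5b receives the text feature parameter notified by the text feature extraction unit T-2 (Fig. 9: step S52); the publisher information input unit T-5c receives the text publisher information input through the text publisher information input unit T-3 (Fig. 9: step S53); and the publisher ID input unit T-5d receives the text publisher ID input through the text authentication information input unit T-4 (Fig. 9: step S54). Once all of this information is available, the text authentication information is generated as follows.
Because the text authentication information is itself embedded in the text, embedding it as is would make it appear in the browser and would also leave it open to tampering with some special editor. In this apparatus, the text authentication information is therefore encrypted so that it is invisible in browsers and the like and so that what it describes cannot be discerned.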
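[0051]
That is, the encryptor T-5e first encrypts the text authentication information so that ordinary users cannot decipher it (Fig. 9: step S55). For the encryption, an encryption device such as FEAL-8 or FEAL-NX invented by Shimizu et al. (Japanese Patent Application No. S60-252650, "Data diffusion apparatus") is used, for example. An encryption device is basically a device that, given a byte string and an encryption key, encrypts the byte string with the key and returns the encrypted byte string. Normally such devices convert the byte string into some arbitrary byte string. They do not, however, perform the encryption in a way that is invisible to a browser; the output may be visible to the browser or may turn out to be control codes.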
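[0052]
For example, an ordinary browser recognizes 0x20 as a half-width space and 0x0a as a line feed. Text authentication information produced by the encryptor T-5e alone would therefore be visible in the browser. To make it invisible, the byte string produced by the encryptor T-5e is converted into a special code sequence, yielding a byte string that the browser does not display. For example, in the SJIS kanji code system the first byte is defined as follows.
Control codes: 0x00 to 0x1f
ASCII codes: 0x20 to 0x7F
SJIS codes: 0x81 to 0x9F
Half-width katakana codes: 0xa1 to 0xCF
SJIS codes: 0xE0 to 0xf9
These are the codes used as kanji codes. Conversely, the remaining codes form a group that the browser does not display. Plotting the codes that belong to the kanji code in terms of the first byte and the second byte gives Fig. 5.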
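[0053]
That is, the hatched area shows the codes used by the kanji code, and the white area shows the codes that are not used by the kanji code and that can be invisible in a browser. By mapping each encrypted byte onto this unused area, which is expressed in 2 bytes, an arbitrary encrypted 8-bit value can be represented.
For example, by writing a mapping table such as
Encrypted byte    Representation bytes
0x00    0x8080
0x01    0x8081
0x02    0x8082
0x03    0x8083
…
0xff    0xffff
the encrypted byte string can be converted into an invisible byte string that the browser does not display.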
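A one-to-one mapping of this kind, and its inverse, can be sketched as below. The source lists only the rows for 0x00 to 0x03 and 0xff and elides the rest, so the rule used here is an assumption chosen to agree with the listed rows: the second byte is the encrypted byte with its top bit set, and the first byte is 0x80 for values below 0x80 and 0xFF otherwise. Both first bytes lie outside the first-byte ranges listed in the description. Whether a particular browser actually leaves such bytes undisplayed is not something this sketch can guarantee.

```python
def to_invisible(encrypted: bytes) -> bytes:
    """Map each encrypted byte to a 2-byte code (hypothetical rule)."""
    out = bytearray()
    for b in encrypted:
        lead = 0x80 if b < 0x80 else 0xFF  # assumed unused first bytes
        out += bytes((lead, b | 0x80))
    return bytes(out)


def from_invisible(invisible: bytes) -> bytes:
    """Invert to_invisible; reject anything that is not a valid 2-byte code."""
    if len(invisible) % 2:
        raise ValueError("invisible byte string must have even length")
    out = bytearray()
    for i in range(0, len(invisible), 2):
        lead, trail = invisible[i], invisible[i + 1]
        if trail < 0x80 or lead not in (0x80, 0xFF):
            raise ValueError(f"not a watermark code at offset {i}")
        out.append(trail & 0x7F if lead == 0x80 else trail)
    return bytes(out)


table_rows = to_invisible(bytes([0x00, 0x01, 0x02, 0x03, 0xFF]))
print(table_rows.hex())
```

Because every byte value gets a distinct 2-byte code, the round trip `from_invisible(to_invisible(x)) == x` holds for any byte string `x`.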
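[0054]
In the text authentication information generation unit T-5, the code conversion unit T-5f finally converts the encrypted byte string into the invisible byte string (Fig. 9: step S56), and the encrypted authentication information output unit T-5g outputs this invisible byte string (Fig. 9: step S57). The data to be encrypted consists of the text publisher information output by the text publisher information input unit T-3, the text publisher ID output by the text authentication information input unit T-4, and the text feature information and text feature parameter (F value) output by the text feature extraction unit T-2. The text feature information is used to pin down the features of the input text, so if the restored data is found to differ from the current data, this indicates that the current data is a tampered version of the original data.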
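[0055]
Next, the encryption in the text authentication information generation unit T-5 is described in detail.
The text publisher information, the text publisher ID, and the text feature parameter are encrypted, and the text feature information is encrypted according to the number of bytes to be embedded. Because the text publisher information and the text publisher ID are always embedded, the number of embedded bytes must be larger than the total number of bytes of this information. That is,
(bytes of text publisher information) + (bytes of text publisher ID) + n
< number of embedded bytes
where n is the number of bytes of the text feature parameter (F value).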
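[0056]
The text authentication information generation unit T-5 divides the data to be encrypted into units of the embedding byte count (the remainder at the very end is padded with 0x00). As shown in Fig. 6, each divided byte string is encrypted; the encrypted result is then XORed with the next divided piece of text feature information, and that result is encrypted with the secret key. This is repeated for as many pieces as the text feature information was divided into by the embeddable byte count, and the final encrypted result is embedded.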
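[0057]
This is close to a mode of operation used with block-cipher encryption devices such as FEAL to strengthen the cipher against skewed input data (for example, text), which such devices handle poorly. With this mode of operation, the text features are not simply encrypted; they are also XORed with the previously encrypted text features, so the input to the encryptor is spread out and the cipher strength becomes markedly higher.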
[0058]
For example, if the number of embeddable bytes is 8, the text must be encrypted within 8 bytes. The byte string of the information to be encrypted is as follows.

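[0059]
This is divided by the number of embeddable bytes (8 bytes).
(1) 3c48 544d 4c3e 0a3c
(2) 5449 544c 453e a4b3
(omitted) ...
(24) 2f48 544d 4c3e 0a00
The 24th and last piece of divided data is padded at the end with 0x00. The encryptor output for (1) is then XORed with (2), the result is passed through the encryptor, and that output is XORed with (3), and so on. The result of passing the data through the encryptor up to the final piece (24) is the final encrypted result.
This concludes the detailed description of the encryption in the text authentication information generation unit T-5.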
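The chaining described above (encrypt the first piece, XOR the output with the next piece, encrypt again, and keep the last output) can be sketched as follows. FEAL is not available in the Python standard library, so `toy_encrypt_block` is a stand-in: a small Feistel construction built on SHA-256 that is a permutation of 8-byte blocks but is not FEAL and must not be treated as secure. The function names and the key are hypothetical.

```python
import hashlib
from typing import List

BLOCK = 8  # embeddable byte count used in the example


def split_blocks(data: bytes, size: int = BLOCK) -> List[bytes]:
    """Split into fixed-size pieces, padding the last one with 0x00."""
    padded = data + b"\x00" * (-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Stand-in for the block cipher (NOT FEAL, NOT secure): 4-round Feistel."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def chained_encrypt(data: bytes, key: bytes) -> bytes:
    """Encrypt piece 1, then repeatedly XOR with the next piece and encrypt."""
    if not data:
        raise ValueError("nothing to encrypt")
    blocks = split_blocks(data)
    state = toy_encrypt_block(blocks[0], key)
    for block in blocks[1:]:
        mixed = bytes(a ^ b for a, b in zip(state, block))
        state = toy_encrypt_block(mixed, key)
    return state  # final 8-byte result


key = b"hypothetical-key"
data = b"<HTML>\n<TITLE>example</TITLE>\n</HTML>\n"
print(chained_encrypt(data, key).hex())
```

Because each step feeds the previous ciphertext into the next encryption, changing any piece of the input changes the final 8 bytes, which is what lets the result act as a check on the text features.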
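[0060]
The text as a whole may also be broken up, for example when part of it is edited. In this apparatus, text feature information is extracted for each unit of analysis determined by the F value, and the text authentication information embedding unit T-6 embeds, for each unit of analysis, the encrypted text feature information together with the text publisher information, text publisher ID, and text feature parameter (F value) of the whole text (Fig. 8(b): step S7).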
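[0061]
The operation of the text authentication information embedding unit T-6 is now described in detail.
First, the text feature parameter (F value) is input from the text authentication information generation unit T-5 to the text feature parameter input unit T-6a (Fig. 10: step S61).
Then the invisible byte string generated by the text authentication information generation unit T-5 is input to the encrypted authentication information input unit T-6b (Fig. 10: step S62).
Further, the text in which the text authentication information is to be embedded is input from the text reading unit T-1 to the text input unit T-6c (Fig. 10: step S63).
Once the information for each input unit is available, the determination unit T-6d determines whether all of the invisible byte string has been output (Fig. 10: step S64).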
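[0062]
If the determination unit T-6d determines that not all of the invisible byte string has been output, the embedded text output unit T-6f reads the input text in byte strings of a size determined separately from the F value and outputs them (Fig. 10: step S65).
The embedded encrypted authentication information output unit T-6e then reads the invisible byte string in byte strings of a size determined separately from the F value and outputs them (Fig. 10: step S66).
Steps S64 to S66 of Fig. 10 are repeated until all of the invisible byte string has been output.
When the determination unit T-6d determines that all of the invisible byte string has been output, all of the remaining input text is output (Fig. 10: step S67).
This concludes the detailed description of the operation of the text authentication information embedding unit T-6.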
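The loop of steps S64 to S67 can be sketched as below. The step sizes `chars_per_step` and `codes_per_step` are hypothetical parameters standing in for the sizes that the source says are determined separately from the F value, and Shift_JIS is assumed as the document encoding. Text is consumed one encoded character at a time so that a watermark code never lands between the two bytes of a double-byte character.

```python
def embed(text: str, invisible: bytes, chars_per_step: int = 2,
          codes_per_step: int = 1, encoding: str = "shift_jis") -> bytes:
    """Interleave text and 2-byte watermark codes (sketch of steps S64-S67)."""
    if len(invisible) % 2:
        raise ValueError("invisible byte string must have even length")
    # Encoding each character separately keeps 2-byte boundaries intact.
    chars = [ch.encode(encoding) for ch in text]
    codes = [invisible[i:i + 2] for i in range(0, len(invisible), 2)]
    out = bytearray()
    ci = wi = 0
    while wi < len(codes):                                # S64: anything left?
        out += b"".join(chars[ci:ci + chars_per_step])    # S65: output text
        ci += chars_per_step
        out += b"".join(codes[wi:wi + codes_per_step])    # S66: output codes
        wi += codes_per_step
    out += b"".join(chars[ci:])                           # S67: remaining text
    return bytes(out)


marked = embed("ABCDEF", bytes.fromhex("80818082"))
print(marked)
```

If the text runs out before the watermark does, this sketch simply appends the remaining codes; the source does not say how that case is handled.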
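[0063]
Next, the operation of the text authentication information embedding unit T-6 is described with a concrete example.
When the F value from the earlier example is 1, that is, when the whole text is used as the text feature, the information is inserted at arbitrary positions in the whole text (as long as 2-byte boundaries such as those of SJIS are not broken). For example, suppose the following text is input,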

[0064]
and the final encrypted data is
8081 8283 8485 8687
Then the encrypted data is embedded at the underlined positions.

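[0065]
When the F value is 2, processing is done tag by tag, so the data is embedded for each tag. When the final encrypted data for the first tag is
8081 8283 8485 8687
and the final encrypted data for the second tag is
8889 9091 9293 9495
they are embedded in the data as follows.

This is the operation up to the point where the text authentication information is embedded in the HTML text.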
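[0066]
Next, the operation for authenticating HTML text in which text authentication information has been embedded is described.
[0067]
First, the text authentication information retrieval unit T-7 takes the text authentication information embedded by the text authentication information embedding unit T-6 out of the text for authentication. It also separates the text authentication information from the original text. That is, byte strings in the code area shown in Fig. 5 that is used by, for example, SJIS are judged to be text (text byte strings), and all other byte strings are judged to be text authentication information. The text byte strings are passed to the text feature retrieval unit T-8 so that the text feature information can be extracted. The encrypted text authentication information is passed to the text authentication information reading unit T-9 so that it can be restored.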
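The separation step can be sketched as a single left-to-right scan. The double-byte first-byte ranges follow those listed in the description; the assumption that watermark codes begin with 0x80 or 0xFF is hypothetical, because the source shows only a few rows of its mapping table. Any byte that is neither is treated as single-byte text.

```python
from typing import Tuple

WATERMARK_LEADS = (0x80, 0xFF)  # hypothetical first bytes of watermark codes


def is_double_byte_lead(b: int) -> bool:
    """First byte of a 2-byte SJIS character, per the ranges in the description."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xF9


def separate(document: bytes) -> Tuple[bytes, bytes]:
    """Split a marked document into (text bytes, watermark bytes)."""
    text = bytearray()
    mark = bytearray()
    i = 0
    while i < len(document):
        b = document[i]
        if b in WATERMARK_LEADS:
            if i + 1 >= len(document):
                raise ValueError("truncated watermark code")
            mark += document[i:i + 2]
            i += 2
        elif is_double_byte_lead(b):
            text += document[i:i + 2]  # keep both bytes of the character together
            i += 2
        else:
            text.append(b)
            i += 1
    return bytes(text), bytes(mark)


doc = ("今日".encode("shift_jis") + b"\x80\x81" + b"A"
       + b"\xff\xff" + "天気".encode("shift_jis"))
text_bytes, mark_bytes = separate(doc)
print(text_bytes.decode("shift_jis"), mark_bytes.hex())
```

Scanning character by character matters because the second byte of an SJIS character can itself fall in the range used for watermark bytes (日 is 0x93 0xFA, for example), so a plain byte filter would corrupt the text.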
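[0068]
Next, the text feature retrieval unit T-8 extracts, from the text with the text authentication information separated out (that is, the original text), the text feature information that was used when the text authentication information was generated. Since the text authentication information retrieval unit T-7 separates the text authentication information from the original text, the features of the text are recomputed from the separated original text using the same processing as the text feature extraction unit T-2. The text feature parameter (F value) is set at initialization, so the corresponding processing is carried out according to its value, and the text feature information and the unit of analysis used to extract it are obtained.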
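[0069]
Next, the text authentication reading unit T-9 uses the text feature information and unit of analysis extracted by the text feature retrieval unit T-8 to separate and extract the text publisher ID and the text publisher information from the text authentication information separated by the text authentication information retrieval unit T-7. The number of embedded bytes is determined from the unit of analysis extracted by T-8, and the text feature information extracted by T-8 is divided into n strings accordingly. These n strings are decrypted by the method shown in Fig. 7, and the text publisher information and the text publisher ID are extracted.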
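[0070]
Continuing the earlier example, consider separating the encrypted information from the original text information in text such as the following, in which text authentication information and related information have been embedded as watermark information.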

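[0071]
The data is separated into the underlined part and the rest. The separation uses the SJIS code system shown in Fig. 5 and is carried out by splitting the data into SJIS codes and everything else. As a result, the data is separated as follows.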

[0072]
The encrypted part is then restored, and the text publisher information and text publisher ID are computed.
For example, from the data above,
<氏名>たろう</氏名>
<発行元ＩＤ>123</発行元ＩＤ>
are extracted.
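[0073]
Next, the text publisher information extraction unit T-10 takes the text publisher information extracted by the text authentication reading unit T-9 and outputs it. In the example above, the text publisher information extracted by T-9 is "<氏名>たろう</氏名>", so this information is output to the output device.
Next, the text authentication information extraction unit T-11 takes the text publisher ID extracted by the text authentication reading unit T-9 and outputs it to the output device. In the example above, the text publisher ID extracted by T-9 is "<発行元ＩＤ>123</発行元ＩＤ>", so this information is output to the output device.
This is the operation for authenticating HTML text in which text authentication information has been embedded.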
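[0074]
The present invention may use a LAN or a dial-up network as well as the Internet, and may also be realized as a stand-alone apparatus. The functions of embedding and retrieving the authentication information may each be realized as a separate apparatus or may be realized in the same apparatus.
A program for realizing the text electronic authentication apparatus of the present invention (a text electronic authentication program) may also be recorded on a computer-readable recording medium, and electronic authentication of text may be performed by having a computer system load and execute the program recorded on that medium.
That is, one part of this text electronic authentication program causes a computer to realize the functions of the text reading unit, the text feature extraction unit, the text publisher information input unit, the text authentication information input unit, the text authentication information generation unit, and the text authentication information embedding unit. The other part causes a computer to realize the functions of the text authentication information retrieval unit, the text feature retrieval unit, the text authentication information reading unit, the text publisher information extraction unit, and the text authentication information extraction unit.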
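[0075]
The "computer system" referred to here includes an OS and hardware such as peripheral devices.
When a WWW system is used, the "computer system" also includes the homepage providing environment (or display environment).
A "computer-readable recording medium" means a portable medium such as a floppy disk, magneto-optical disk, ROM, or CD-ROM, or a storage device such as a hard disk built into a computer system. It further includes media that hold a program dynamically for a short time (a transmission medium or transmission wave), such as the communication line used when a program is transmitted over a network such as the Internet or a communication line such as a telephone line, and media that hold a program for a certain time, such as the volatile memory inside the computer system acting as server or client in that case.
The program may implement only part of the functions described above. It may also be a so-called difference file (difference program) that realizes those functions in combination with a program already recorded in the computer system.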
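[0076]
An embodiment of the invention has been described in detail above with reference to the drawings, but the concrete configuration is not limited to this embodiment and includes designs and the like that do not depart from the gist of the invention.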
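[0077]
[Effects of the invention]
As described in detail above, according to the present invention, the features of a text are extracted from the electronic document in which the text is described, information on the publisher of the text and information for authentication are input, and this information is turned into encrypted electronic authentication information that is invisible in existing electronic document viewers and cannot be deciphered, and is embedded in the electronic document. Appropriate authentication can therefore be given to commonly used electronic documents in which text is described.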
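[Brief description of the drawings]
[Fig. 1] A diagram showing the configuration of a text electronic authentication apparatus according to an embodiment of the present invention.
[Fig. 2] A diagram showing the configuration of the text authentication information generation unit according to the embodiment.
[Fig. 3] A diagram showing the configuration of the text authentication information embedding unit according to the embodiment.
[Fig. 4] A display example of an example HTML text.
[Fig. 5] A diagram showing the kanji codes and the other codes in the SJIS kanji code system.
[Fig. 6] A diagram explaining the encryption procedure in the text authentication information generation unit.
[Fig. 7] A diagram explaining the decryption procedure in the text authentication reading unit.
[Fig. 8] A diagram showing the operating procedure of the text electronic authentication apparatus according to the embodiment.
[Fig. 9] A diagram showing the operating procedure of the text authentication information generation unit according to the embodiment.
[Fig. 10] A diagram showing the operating procedure of the text authentication information embedding unit according to the embodiment.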
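[Explanation of reference signs]
T-1: text reading unit
T-2: text feature extraction unit
T-3: text publisher information input unit
T-4: text authentication information input unit
T-5: text authentication information generation unit
T-6: text authentication information embedding unit
T-7: text authentication information retrieval unit
T-8: text feature retrieval unit
T-9: text authentication information reading unit
T-10: text publisher information extraction unit
T-11: text authentication information extraction unit
T-5a: text feature information input unit
T-5b: feature parameter input unit (feature identifier input unit)
T-5c: publisher information input unit
T-5d: publisher ID input unit (publisher authentication information input unit)
T-5e: encryptor
T-5f: code conversion unit
T-5g: encrypted authentication information output unit
T-6a: feature parameter input unit (feature identifier input unit)
T-6b: encrypted authentication information input unit
T-6c: text input unit
T-6d: determination unit
T-6e: embedded encrypted authentication information output unit
T-6f: embedded text output unit
T-6g: text output unit
[0001]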
[Technical field to which the invention belongs]
The present invention relates to an electronic authentication apparatus for placing watermark information in a text for the purpose of electronic authentication for an electronic document (file) in which the text is described.
[0002]
[Prior art]
Conventionally, HTML (Hyper Text Markup Language) document files are used as the standard document format on the Internet. Text described in an HTML document file is authenticated by using a secure method that can be certified from an appropriate site, for example SSL (Secure Sockets Layer).
[0003]
[Problems to be solved by the invention]
However, these documents can be retrieved from the server side to the client side by a browser, or by software called a crawler that mechanically fetches HTML documents from a web server, so a document retrieved to the client side could be used illegally. In addition, methods such as SSL are appropriate when only specific text information is exchanged, but it is difficult for them to authenticate arbitrary documents appropriately, and it has been difficult to give appropriate authentication to general HTML documents.
[0004]
The present invention has been made in view of the above points. It provides a text electronic authentication apparatus and method, and a recording medium recording a text electronic authentication program, that can give appropriate authentication to electronic documents typified by the HTML documents used as a standard on the Internet, and that can retrieve the authentication information (watermark information) embedded in an electronic document such as an HTML document in order to eliminate illegal copies that ignore copyright.
[0005]
[Means for Solving the Problems]
The invention according to claim 1 is a text electronic authentication apparatus that makes an electronic document in which HTML text is described authenticable by embedding authentication information in the electronic document, the apparatus comprising: a text reading unit that reads the HTML text from the electronic document in which the HTML text is described; a text publisher information input unit that inputs information on the publisher of the HTML text described in the electronic document; a text authentication information input unit that inputs information for authenticating the publisher of the HTML text; a text authentication information generation unit that generates encrypted data that a user cannot decipher from the publisher information input to the text publisher information input unit and the information for authenticating the publisher of the HTML text input to the text authentication information input unit, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a web browser; and a text authentication information embedding unit that embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.
[0006]
The invention according to claim 2 is a text electronic authentication apparatus that makes an electronic document in which HTML text is described authenticable by embedding authentication information in the electronic document, the apparatus comprising: a text reading unit that reads the HTML text from the electronic document in which the HTML text is described; a text feature extraction unit that extracts features of the HTML text read by the text reading unit; a text publisher information input unit that inputs information on the publisher of the HTML text described in the electronic document; a text authentication information input unit that inputs information for authenticating the publisher of the HTML text; a text authentication information generation unit that generates encrypted data that a user cannot decipher from the information representing the features of the HTML text extracted by the text feature extraction unit, the publisher information input to the text publisher information input unit, and the information for authenticating the publisher of the HTML text input to the text authentication information input unit, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a web browser; and a text authentication information embedding unit that embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.
[0007]
The invention according to claim 3 is the text electronic authentication apparatus according to claim 2, wherein, in generating the encrypted data, the text authentication information generation unit generates first encrypted data by encrypting the publisher information and the information for authenticating the publisher of the HTML text, and generates the final encrypted data from the first encrypted data and the information representing the features of the HTML text.
[0008]
The invention according to claim 4 is an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded, the apparatus comprising: a text authentication information retrieval unit that reads the electronic document, takes out the portion described in 2-byte codes that are not used in the kanji code and are invisible in a web browser as the encrypted and code-converted authentication information, and separates the remainder of the electronic document, with that authentication information removed, as the HTML text; a text authentication information reading unit that code-converts the encrypted and code-converted authentication information separated and taken out by the text authentication information retrieval unit into encrypted authentication information that a user cannot decipher, by converting the 2-byte codes that are not used in the kanji code and are invisible in a web browser into 1-byte codes through a one-to-one mapping, and decrypts the encrypted authentication information to obtain the authentication information; a text publisher information extraction unit that reads the text publisher information from the authentication information decrypted by the text authentication information reading unit; and a text authentication information extraction unit that reads the information for authenticating the text publisher from the authentication information decrypted by the text authentication information reading unit.
[0009]
The invention according to claim 5 is an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded, the apparatus comprising: a text authentication information retrieval unit that reads the electronic document, takes out the portion described in 2-byte codes that are not used in the kanji code and are invisible in a web browser as the encrypted and code-converted authentication information, and separates the remainder of the electronic document, with that authentication information removed, as the HTML text; a text feature retrieval unit that extracts features of the HTML text based on the HTML text described in the electronic document and taken out by the text authentication information retrieval unit; a text authentication information reading unit that code-converts the encrypted and code-converted authentication information separated and taken out by the text authentication information retrieval unit into encrypted authentication information that a user cannot decipher, by converting the 2-byte codes that are not used in the kanji code and are invisible in a web browser into 1-byte codes through a one-to-one mapping, and decrypts the encrypted authentication information using the features of the HTML text to obtain the authentication information; a text publisher information extraction unit that reads the text publisher information from the authentication information decrypted by the text authentication information reading unit; and a text authentication information extraction unit that reads the information for authenticating the text publisher from the authentication information decrypted by the text authentication information reading unit.
[0010]
The invention according to claim 6 is an electronic authentication method in a text electronic authentication apparatus that makes an electronic document in which HTML text is described authenticable by embedding authentication information in the electronic document, the method comprising: a text reading procedure in which a text reading unit reads the HTML text from the electronic document in which the HTML text is described; a text publisher information input procedure in which a text publisher information input unit inputs information on the publisher of the HTML text described in the electronic document; a text authentication information input procedure in which a text authentication information input unit inputs information for authenticating the publisher of the HTML text; a text authentication information generation procedure in which a text authentication information generation unit generates encrypted data that a user cannot decipher from the publisher information input to the text publisher information input unit and the information for authenticating the publisher of the HTML text input to the text authentication information input unit, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a web browser; and a text authentication information embedding procedure in which a text authentication information embedding unit embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.
[0011]
The invention according to claim 7 is an electronic authentication method in a text electronic authentication apparatus that makes an electronic document containing HTML text authenticable by embedding authentication information in the electronic document, the method comprising: a text reading procedure in which the text reading unit reads the HTML text from the electronic document containing the HTML text; a text feature extraction procedure in which the text feature extraction unit extracts features of the HTML text read by the text reading unit; a text issuer information input procedure in which the text issuer information input unit inputs information on the issuer of the HTML text described in the electronic document; a text authentication information input procedure in which the text authentication information input unit inputs information for authenticating the issuer of the HTML text; a text authentication information generation procedure in which the text authentication information generation unit generates, from the information representing the features of the HTML text extracted by the text feature extraction unit, the issuer information input to the text issuer information input unit, and the information for authenticating the issuer of the HTML text input to the text authentication information input unit, encrypted data that the user cannot decipher, and generates encrypted electronic authentication information by mapping each encrypted 1-byte code one-to-one to a 2-byte code that is not used in kanji codes and is invisible in a WEB browser; and a text authentication information embedding procedure in which the text authentication information embedding unit embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit.
[0012]
Further, the invention according to claim 8 is an electronic authentication method in an apparatus for extracting authentication information from an electronic document that contains HTML text and in which encrypted and code-converted authentication information is embedded, the method comprising: a procedure in which the text authentication information extraction unit reads the electronic document, extracts as the encrypted and code-converted authentication information the portion written in 2-byte codes that are not used in kanji codes and are invisible in a WEB browser, and separates as HTML text the portion that remains after that authentication information is removed from the electronic document; a procedure in which the text authentication information reading unit converts the authentication information separated and extracted by the text authentication information extraction unit from the 2-byte codes, which are not used in kanji codes and are invisible in a WEB browser, into 1-byte codes by a one-to-one mapping, thereby obtaining encrypted authentication information that the user cannot decipher, and obtains the authentication information by decrypting the encrypted authentication information; a procedure in which the text issuer information extraction unit reads the text issuer information from the authentication information decrypted by the text authentication information reading unit; and a procedure in which the text authentication information extraction unit reads the information for authenticating the text issuer from the authentication information decrypted by the text authentication information reading unit.
[0013]
Further, the invention according to claim 9 is an electronic authentication method in an apparatus for extracting authentication information from an electronic document that contains HTML text and in which encrypted and code-converted authentication information is embedded, the method comprising: a procedure in which the text authentication information extraction unit reads the electronic document, extracts as the encrypted and code-converted authentication information the portion written in 2-byte codes that are not used in kanji codes and are invisible in a WEB browser, and separates as HTML text the portion that remains after that authentication information is removed from the electronic document; a procedure in which the text feature extraction unit extracts features of the HTML text from the HTML text separated by the text authentication information extraction unit; a procedure in which the text authentication information reading unit converts the authentication information separated and extracted by the text authentication information extraction unit from the 2-byte codes, which are not used in kanji codes and are invisible in a WEB browser, into 1-byte codes by a one-to-one mapping, thereby obtaining encrypted authentication information that the user cannot decipher, and obtains the authentication information by decrypting the encrypted authentication information using the features of the HTML text; a procedure in which the text issuer information extraction unit reads the text issuer information from the authentication information decrypted by the text authentication information reading unit; and a procedure in which the text authentication information extraction unit reads the information for authenticating the text issuer from the authentication information decrypted by the text authentication information reading unit.
[0014]
The invention described in claim 10 is a computer-readable recording medium recording a program for causing a computer to realize the text electronic authentication method according to claim 6 or 7.
The invention described in claim 11 is a computer-readable recording medium recording a program for causing a computer to realize the text electronic authentication method according to claim 8 or 9.
[0015]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing a configuration of a text electronic authentication device according to an embodiment of the present invention.
[0016]
T-1 is a text reading unit that opens a file in which text is described, located on the device or on a storage medium, and reads it.
T-2 is a text feature extraction unit that extracts data (text feature information) representing text features read by the text reading unit T-1. Here, various text feature information can be considered. For example, all text data may be captured as text feature information. In this case, if the amount of data for authenticating the text is fixed, the ratio of the text feature information described in the text authentication information decreases as the text feature information increases. That is, when all text data is used as text feature information, it is possible to authenticate all text data, but there is a high possibility that authentication of individual text contents described in the text cannot be performed.
[0017]
When this device is used to authenticate a text as a whole, the entire text should be extracted as the text feature information. When the content of the text must also be authenticated, for example to determine whether a passage is part of a text that the user wrote, it is necessary to extract not the entire text but the independent words or important words contained in it, and to use these as the text feature information. Taking these characteristics into account, the user sets in the text feature extraction unit T-2 an optimum text feature amount (the size of the text feature information), an example of which is shown below, or a flag (identifier, text feature parameter) corresponding to that text feature amount.

Based on the text feature amount set by the user, the text feature extraction unit T-2 notifies the text authentication information generation unit T-5 of a flag, as the text feature parameter, that specifies the text extraction method used.
[0018]
T-3 is a text issuer information input unit. It is used to enter information indicating who issued the text (text issuer information): for example, the company or organization that holds the copyright of the text, and the address, name, URL, and the like of the person who authored the text.
T-4 is a text authentication information input unit. It is used to enter an issuer ID (text issuer ID) issued by a public institution or by some kind of authentication company. The issuer can be identified from the text issuer ID. For the ID to identify the issuer unambiguously, however, the text issuer ID must be unique worldwide.
[0019]
T-5 is a text authentication information generation unit. It generates the text authentication information for the text in question from the text feature information and text feature parameter extracted by the text feature extraction unit T-2, the text issuer information entered in the text issuer information input unit T-3, and the text issuer ID entered in the text authentication information input unit T-4.
[0020]
Specifically, as shown in FIG. 2, the text authentication information generation unit is composed of: a text feature information input unit T-5a that inputs the text feature information extracted by the text feature extraction unit T-2; a feature parameter input unit T-5b that inputs the text feature parameter notified by the text feature extraction unit T-2; an issuer information input unit T-5c that inputs the text issuer information entered in the text issuer information input unit T-3; an issuer ID input unit T-5d that inputs the text issuer ID entered in the text authentication information input unit T-4; an encryptor T-5e that encrypts the text authentication information consisting of the text feature information, text issuer information, text issuer ID, and text feature parameter; a code conversion unit T-5f that converts the encrypted text authentication information into a code invisible to the browser (an invisible byte string); and an encrypted authentication information output unit T-5g that outputs the invisible byte string produced by the code conversion unit T-5f. Details of the encryption and code conversion are described later.
[0021]
T-6 is a text authentication information embedding unit. It embeds the text authentication information (invisible byte string) generated by the text authentication information generation unit T-5 in the input text. Because the target is text information, however, simple embedding risks revealing to others that something is embedded in the text, in which case the text authentication information may be deleted. Embedded information may also get in the way of ordinary browsing of the text. This apparatus therefore disperses the text authentication information generated by the text authentication information generation unit T-5 throughout the text, so that its presence is not revealed to others and it does not hinder browsing. In other words, the text authentication information is embedded so that the text looks no different in ordinary browsing, and it is distributed through the text so that it withstands changes such as cutting or deleting parts of the text.
[0022]
Specifically, as shown in FIG. 3, the text authentication information embedding unit is composed of: a feature parameter input unit T-6a that reads the text feature parameter (F value) from the text authentication information generation unit T-5; an encrypted authentication information input unit T-6b that inputs the invisible byte string from the text authentication information generation unit T-5; a text input unit T-6c that inputs the text from the text reading unit T-1; a determination unit T-6d that determines whether all of the invisible byte string has been output and controls each output unit; an embedded text output unit T-6e that reads the input text based on the text feature parameter (F value) and outputs the byte string read; an embedded encrypted authentication information output unit T-6f that reads the invisible byte string based on the text feature parameter (F value) and outputs the byte string read; and a text output unit T-6g that outputs the remaining input text once all of the invisible byte string has been output.
[0023]
T-7 is a text authentication information extraction unit. This is a processing unit that extracts the text authentication information embedded by the text authentication information embedding unit of T-6 from the text in which the text authentication information is embedded for authentication or the like. The processing unit also performs processing for separating the text authentication information and the original text from the text in which the text authentication information is embedded.
T-8 is a text feature extraction unit. This is a part for extracting the text feature information used when generating the text authentication information from the text (original text) from which the text authentication information is separated. Since the text authentication information and the original text are separated by the T-7 text authentication information extraction unit, the text features (text feature information) are recalculated from the separated texts.
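The separation performed by T-7 can be pictured with the following minimal Python sketch. It is not the patented implementation: it assumes an SJIS-style byte stream, treats 0x81 to 0x9F and 0xE0 to 0xF9 as kanji lead bytes, and assumes, purely for illustration, that every embedded 2-byte code begins with 0x80 or 0xFF. The function name `separate_auth_codes` is hypothetical.

```python
from typing import Tuple

# Lead bytes of ordinary 2-byte kanji codes (SJIS-style ranges).
KANJI_LEAD = set(range(0x81, 0xA0)) | set(range(0xE0, 0xFA))
# ASSUMPTION: embedded (invisible) 2-byte codes start with one of these bytes.
INVISIBLE_LEAD = {0x80, 0xFF}


def separate_auth_codes(document: bytes) -> Tuple[bytes, bytes]:
    """Split a document into (original text, embedded 2-byte codes)."""
    text = bytearray()
    auth = bytearray()
    i = 0
    while i < len(document):
        b = document[i]
        has_pair = i + 1 < len(document)
        if b in INVISIBLE_LEAD and has_pair:
            auth += document[i:i + 2]   # embedded authentication code
            i += 2
        elif b in KANJI_LEAD and has_pair:
            text += document[i:i + 2]   # ordinary 2-byte character, kept whole
            i += 2
        else:
            text.append(b)              # single-byte character
            i += 1
    return bytes(text), bytes(auth)
```

Scanning character by character, rather than searching for raw byte values, keeps the second byte of an ordinary 2-byte character from being mistaken for an embedded code.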
[0024]
T-9 is a text authentication information reading unit. Using the text feature information extracted by T-8, it separates and extracts the text issuer ID and the text issuer information from the text authentication information separated by the text authentication information extraction unit T-7.
T-10 is a text issuer information extraction unit that extracts and outputs the text issuer information extracted in T-9.
T-11 is a text authentication information extraction unit that extracts the text issuer ID extracted in T-9 and outputs it to the output device.
The text electronic authentication device may be configured as a dedicated device, or may be realized by causing a computer to read and execute a program that realizes the function of each unit. The text electronic authentication device is connected to an input device, a display device, and the like (none of which are shown) as peripheral devices. Here, the input device refers to an input device such as a keyboard and a mouse. The display device refers to a CRT (Cathode Ray Tube), a liquid crystal display device, or the like.
[0025]
Next, the operation of the text electronic authentication device of the present embodiment configured as described above will be described with reference to the drawings. The following description is an operation example for an HTML document (HTML text).
[0026]
HTML text, the mainstream document format on the Internet, is a document in which text is structured according to text attributes described by tags enclosed in "<" and ">". These tags are generally standardized and defined by the W3C (World Wide Web Consortium) and similar bodies. HTML text can be written with authoring software and the like. Documents written in the HTML text format are generally viewed (browsed) with a browser, which presents information by structuring the text information and rearranging it on a display device based on the tag information. Such browsers have a mechanism for interpreting and displaying ordinary HTML text.
[0027]
For example, consider the following example of HTML text.
<HTML>
<TITLE>This is a test</TITLE>
<BODY>
<H1>The weather is good today.
</BODY>
</HTML>
Thus, the HTML text has a plain text structure.
When this is displayed by a browser capable of viewing HTML text, the result is, for example, as shown in FIG. 4. The <TITLE> part is displayed at the top of the display (browser), whereas the part indicated by the <H1> tag is displayed inside the display area.
[0028]
The T-1 text reading unit reads such HTML text (FIG. 8A: step S1). If data is read in units of 8 bits, the following byte group is read into the device.

[0029]
Next, the text feature extraction unit of T-2 extracts text feature information that is data representing the feature of the text read by the text reading unit of T-1 (FIG. 8A: step S2).
[0030]
For example, the relationship between a text feature quantity, a text extraction part, a text extraction method, and a text feature quantity parameter (F value) is defined as follows.

[0031]
When the entire text is extracted, all the text read by the text reading unit of T-1 is directly passed to the text authentication information generation unit of T-5 as text feature information. The flag (F) indicating the text extraction method is initially set.
[0032]
When the text feature information consisting of the entire text is passed, the following byte sequence is passed to the text authentication information generation unit of T-5. In this case, 167 bytes are required.

[0033]
HTML text is a document structured by tag information, but the tags themselves carry none of the document's meaning. The text obtained by deleting the tag information from the HTML text can therefore also be used as text feature information. For example,
This is a test
The weather is nice today.
is represented by the following byte sequence,

and can be represented in 38 bytes.
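A minimal Python sketch of this tag-removal step follows. The regular expression and the name `strip_tags` are illustrative assumptions, not part of the described apparatus, and the sample uses the English rendering of the example, so its byte count differs from the 38 bytes of the original Japanese text.

```python
import re

# Anything between "<" and ">" is treated as tag information.
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(html: str) -> str:
    """Delete tag information and drop lines that become empty."""
    without_tags = TAG_PATTERN.sub("", html)
    lines = [line.strip() for line in without_tags.splitlines()]
    return "\n".join(line for line in lines if line)


sample = """<HTML>
<TITLE>This is a test</TITLE>
<BODY>
<H1>The weather is nice today.
</BODY>
</HTML>"""

# The two remaining lines are what would serve as text feature information.
print(strip_tags(sample))
```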
[0034]
Furthermore, the independent words contained in a text are often used as keywords of the text and often represent its content. Morphological analysis is usually used to extract the independent words contained in a text. Morphological analysis searches a word dictionary for the input character string and obtains information such as part-of-speech information, sentence-head availability information (whether the word may begin a sentence), forward connection information, and backward connection information. An ordinary word dictionary uses a special dictionary structure called a TRIE structure, which allows high-speed search.
[0035]
For example, if the dictionary contains entries such as "Ah", "Greeting", and "Ai", the search checks whether the first character matches (because the text is Japanese, a character here means a 2-byte Japanese character, unlike the alphabet), then whether the second character matches, and so on. When an entry matches through its last character, the information recorded for that dictionary entry is obtained: part-of-speech information, sentence-head availability information, forward connection information, and backward connection information.
[0036]
The sentence-head availability information is a flag indicating whether the word may appear at the beginning of a sentence; if it may, the word can be a candidate at the beginning of a sentence. With the forward connection information, connection is permitted only when the part of speech or attribute of the preceding word is appropriate, and the word is removed from the candidates when it does not permit that connection. Likewise, with the backward connection information, connection is permitted only when the part of speech or attribute of the following word is appropriate, and the word is removed from the candidates otherwise.
[0037]
Candidates are narrowed down by these part-of-speech connections, and the maximum likelihood candidate is then selected by the minimum cost method, a processing method that takes the morpheme candidate with the lowest cost as the maximum likelihood candidate. Two kinds of cost are used in morphological analysis:
1. Connection cost
2. Word cost
The connection cost is the cost of connecting one word to another. Because it applies between words, the connection cost between a word and its conjugated ending is zero. The word cost is a cost attached to the word itself; for example, a frequently used word has a low cost. Because a conjugated ending is not a word, its word cost is zero. Morphological analysis decomposes the text into words and at the same time assigns each word the part of speech considered correct. In the present embodiment, the candidate with the smallest sum of connection costs and word costs is taken as the maximum likelihood candidate. The numerical values of the connection costs and word costs are defined separately.
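The minimum cost method can be sketched as a small dynamic program. The following Python example is an illustration under stated assumptions, not the analyzer used in the embodiment: the toy dictionary, the parts of speech, and every cost value are hypothetical, and an unspaced English string stands in for Japanese text.

```python
from typing import Dict, List, Tuple

# Hypothetical word dictionary: surface form -> (part of speech, word cost).
WORD_DICT: Dict[str, Tuple[str, int]] = {
    "today": ("noun", 2),
    "to": ("particle", 3),
    "day": ("noun", 3),
    "is": ("verb", 1),
    "nice": ("adjective", 2),
}
# Hypothetical connection costs between adjacent parts of speech.
CONNECTION_COST: Dict[Tuple[str, str], int] = {
    ("particle", "noun"): 2,
    ("noun", "verb"): 1,
    ("verb", "adjective"): 1,
}
DEFAULT_CONNECTION_COST = 5
BOS = "BOS"  # marker for the beginning of the sentence

Analysis = Tuple[int, List[Tuple[str, str]]]


def min_cost_analysis(text: str) -> Analysis:
    """Return (total cost, [(word, part of speech), ...]) with minimum cost."""
    # best[i][pos]: cheapest analysis of text[:i] whose last word has that pos.
    best: List[Dict[str, Analysis]] = [dict() for _ in range(len(text) + 1)]
    best[0][BOS] = (0, [])
    for i in range(len(text)):
        for prev_pos, (cost, path) in list(best[i].items()):
            for word, (pos, word_cost) in WORD_DICT.items():
                if not text.startswith(word, i):
                    continue
                if prev_pos == BOS:
                    conn = 0
                else:
                    conn = CONNECTION_COST.get((prev_pos, pos),
                                               DEFAULT_CONNECTION_COST)
                total = cost + conn + word_cost
                j = i + len(word)
                if pos not in best[j] or total < best[j][pos][0]:
                    best[j][pos] = (total, path + [(word, pos)])
    if not best[-1]:
        raise ValueError("no analysis covers the whole text")
    return min(best[-1].values(), key=lambda item: item[0])
```

With these toy costs, "todayisnice" is analyzed as today / is / nice at a total cost of 7, because the competing split to / day costs more in both word cost and connection cost.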
[0038]
Here, it is assumed that the previous example sentence is input.
This is a test
The weather is nice today.
The morphological analysis in this example is as follows.
[0039]

[0040]

[0041]
If only the independent words are extracted from this example sentence,
This, test, today, weather
are obtained. These are expressed as a byte sequence as follows.

In this example, the feature of the text can be expressed with a total of 19 bytes.
[0042]
A summary of the text can also be used. For example, a summary can be extracted with the text summarization technique invented by Inagaki et al. (Japanese Patent Application No. Hei 10-180181, "Document summarization apparatus and recording medium recording a program therefor").
For the above example, the following sentence is selected as the gist:
The weather is nice today.
The byte sequence for this is as follows.

[0043]
Furthermore, when this sentence is morphologically analyzed and the independent words are extracted,
Today, weather
are obtained. These are expressed as a byte sequence as follows.
0000000 baa3 c6fc c5b7
[0044]
In this way, text feature information appropriate to the application is extracted. To authenticate a text as a whole, the entire text should be extracted. When the content of the text must also be authenticated, for example to determine whether a passage is part of a text that the user wrote, it is necessary to extract not the entire text but the independent words or important words contained in it as the text feature information. The user sets an optimum text feature amount in consideration of these characteristics. Based on the text feature amount set by the user, the text feature extraction unit T-2 notifies the text authentication information generation unit T-5 of the text extraction method (F value) used to extract the text feature information.
[0045]
Next, the text issuer information input unit of T-3 receives the input of the information (text issuer information) indicating the person who issued the text and describes it in a predetermined format (FIG. 8 (a): step S3). For example, information indicating the company organization having the copyright of the text and the address, name, URL, etc. of the person who authored the text is input and described in a predetermined format.
[0046]
As an example, the following text issuer information is described.
<Name>Taro Aiue</Name>
<Affiliation>Taro Co., Ltd.</Affiliation>
<Address>Kyoto Taro Ward 1</Address>
<URL>http://aaaaaa.ne.jp/aaa.htm1</URL>
<Creation date>March 1, 1999</Creation date>
<Issue date>March 2, 1999</Issue date>
<Rights holding date>March 2, 2000</Rights holding date>
[0047]
Next, the text issuer information input unit T-3 sends the text issuer information to the text authentication information generation unit T-5. To make the attribute of each item of text issuer information clear (that is, whether it is an address or a name), each item is enclosed in tags carrying its attribute, as in SGML (Standard Generalized Markup Language). For example, the attribute value, here the name, is written between the <Name> and </Name> tags. The end of an attribute such as the name is indicated by a tag written with "/" (here, the </Name> part).
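The tagged description can be produced and read back with a few lines of Python. This is a sketch only: the helper names `tag_fields` and `read_field` are hypothetical, and the "predetermined format" of the embodiment is assumed to be one tagged item per line.

```python
from typing import List, Tuple


def tag_fields(fields: List[Tuple[str, str]]) -> str:
    """Enclose each attribute value in <name>...</name> tags, one per line."""
    return "\n".join(f"<{name}>{value}</{name}>" for name, value in fields)


def read_field(tagged: str, name: str) -> str:
    """Return the value written between <name> and </name>."""
    start_tag, end_tag = f"<{name}>", f"</{name}>"
    start = tagged.index(start_tag) + len(start_tag)
    return tagged[start:tagged.index(end_tag, start)]


issuer_info = tag_fields([
    ("Name", "Taro Aiue"),
    ("Affiliation", "Taro Co., Ltd."),
])
print(issuer_info)
print(read_field(issuer_info, "Name"))
```

Because each value carries its own attribute name, the extraction side can recover a single item, such as the name, without knowing the order in which the items were written.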
[0048]
These are expressed as byte strings as follows.

[0049]
The text authentication information input unit T-4 accepts the issuer ID issued by a public institution or by some kind of authentication company and describes it in a predetermined format (FIG. 8B: step S4). The issuer can be identified from the text issuer ID. For the ID to identify the issuer unambiguously, however, the text issuer ID must be unique worldwide.
For example, the ID of the issuing organization and the issuer ID uniquely assigned by that organization, as shown below, are described together as the text issuer ID.
<Issuing organization ID>AAA</Issuing organization ID>
<Issuer ID>123456789</Issuer ID>
[0050]
Next, the text authentication information generation unit T-5 generates the text authentication information (FIG. 8A: step S5). Specifically, the text feature information input unit T-5a receives the text feature information extracted by the text feature extraction unit T-2 (FIG. 9: step S51), the feature parameter input unit T-5b receives the text feature parameter notified by the text feature extraction unit T-2 (FIG. 9: step S52), the issuer information input unit T-5c receives the text issuer information entered in the text issuer information input unit T-3 (FIG. 9: step S53), and the issuer ID input unit T-5d receives the text issuer ID entered in the text authentication information input unit T-4 (FIG. 9: step S54). Once all of this information is available, the text authentication information is generated as follows.
Because the text authentication information is embedded in the text itself, simply embedding it would cause it to be displayed in the browser, and it could be altered with a special editor. In this apparatus, therefore, the text authentication information is made invisible in a browser or the like and is encrypted so that its content cannot be determined.
[0051]
That is, the encryptor T-5e first encrypts the text authentication information so that an ordinary user cannot decrypt it (FIG. 9: step S55). The encryption uses, for example, an encryption device invented by Shimizu et al., such as FEAL-8 or FEAL-NX (Japanese Patent Application No. 60-252650, "Data diffusion device"). Such an encryption device, given a byte string and an encryption key, encrypts the byte string based on the key and returns the encrypted byte string. Encryption devices of this kind normally convert a byte string into some other byte string, but they make no attempt to produce output that is invisible in a browser; the output may be visible in the browser or may act as a control code.
[0052]
For example, an ordinary browser recognizes 0x20 as a single-byte space and 0x0A as a line feed code. The text authentication information produced by the encryptor T-5e alone would therefore be visible in the browser. To make it invisible, the byte string generated by the encryptor T-5e is converted into a special code string that the browser does not display. For example, in the SJIS kanji code system, the first byte is defined as follows.
Control code group from 0x00 to 0x1F
ASCII code group from 0x20 to 0x7F
SJIS code group from 0x81 to 0x9F
Half-width katakana code group from 0xA1 to 0xCF
SJIS code group from 0xE0 to 0xF9
These are used as kanji codes. The remaining codes form a group that is not displayed by the browser. The relationship between the first byte and the second byte of the codes that belong to the kanji code is as shown in the figure.
[0053]
That is, the hatched portion is the group of codes used in the kanji code, and the white portion is the group of codes not used in the kanji code, which can be made invisible in the browser. By mapping each encrypted byte to a 2-byte code in this unused area, any encrypted 8-bit value can be represented.
For example,
Encrypted byte    Representation bytes
0x00              0x8080
0x01              0x8081
0x02              0x8082
0x03              0x8083
...
0xff              0xffff
By describing such a mapping table, the encrypted byte sequence can be converted into an invisible byte sequence invisible to the browser.
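A one-to-one table of this kind can be sketched in Python as follows. Only the rows listed above are given in the description; the rule used here for the omitted rows (lead byte 0x80 for inputs below 0x80, lead byte 0xFF otherwise, second byte equal to the input with its top bit set) is an assumption chosen solely because it reproduces the listed rows. The function names are hypothetical.

```python
def encode_invisible(encrypted: bytes) -> bytes:
    """Map every encrypted byte to a 2-byte code (ASSUMED mapping rule)."""
    out = bytearray()
    for b in encrypted:
        lead = 0x80 if b < 0x80 else 0xFF   # assumed unused lead bytes
        out += bytes((lead, b | 0x80))
    return bytes(out)


def decode_invisible(codes: bytes) -> bytes:
    """Inverse mapping: recover the encrypted bytes from the 2-byte codes."""
    if len(codes) % 2:
        raise ValueError("invisible byte string must have even length")
    out = bytearray()
    for i in range(0, len(codes), 2):
        lead, trail = codes[i], codes[i + 1]
        if lead == 0x80 and trail >= 0x80:
            out.append(trail & 0x7F)
        elif lead == 0xFF and trail >= 0x80:
            out.append(trail)
        else:
            raise ValueError("not an invisible 2-byte code")
    return bytes(out)


print(encode_invisible(bytes([0x00, 0x01, 0xFF])).hex())
```

Because the mapping is one-to-one, the extraction side can invert it exactly; the price is that the embedded data is twice the size of the encrypted data.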
[0054]
In the text authentication information generation unit T-5, the code conversion unit T-5f finally converts the encrypted byte sequence into an invisible byte sequence (FIG. 9: step S56), and the encrypted authentication information output unit T-5g outputs this invisible byte sequence (FIG. 9: step S57). The data string to be encrypted consists of the text issuer information output from the text issuer information input unit T-3, the text issuer ID output from the text authentication information input unit T-4, and the text feature information and text feature parameter (F value) output from the text feature extraction unit T-2. The text feature information captures the features of the input text: if a difference is found between the restored data and the current data at restoration time, the current data has been altered from the original.
[0055]
Next, details of encryption in the T-5 text authentication information generation unit will be described.
The text issuer information, the text issuer ID, and the text feature parameter are encrypted, and the text feature information is encrypted to the extent allowed by the number of bytes to be embedded. Because the text issuer information and the text issuer ID must always be embedded, the number of embedded bytes must be larger than the total number of bytes of these items. That is,
(number of bytes of text issuer information) + (number of bytes of text issuer ID) + n < (number of embedded bytes)
where n is the number of bytes of the text feature parameter (F value).
[0056]
The text authentication information generation unit T-5 divides the data to be encrypted into pieces of the number of embeddable bytes (the final remainder is padded with 0x00). As shown in FIG. 6, each divided byte string is encrypted, the exclusive OR of the encrypted result and the next divided piece of text feature information is calculated, and that result is encrypted with the secret key. This is repeated for every piece obtained by dividing the text feature information by the number of embeddable bytes, and the final encrypted result is embedded.
[0057]
This resembles an operation mode used to strengthen encryption of biased input data (for example, text), which block ciphers such as FEAL handle poorly. With this operation mode, the text feature is not simply encrypted; each piece is also exclusive-ORed with the preceding encrypted result before encryption, so the data is diffused and the cryptographic strength increases markedly.
[0058]
For example, if the number of bytes that can be embedded is 8 bytes, the text must be encrypted within 8 bytes. The byte sequence of the information to be encrypted is as follows.

[0059]
These are divided by the number of bytes that can be embedded (8 bytes).
(1) 3c48 544d 4c3e 0a3c
(2) 5449 544c 453e a4b3
(omitted) ...
(24) 2f48 544d 4c3e 0a00
The last divided piece, (24), is padded at the end with 0x00. The output of the encryptor for (1) is exclusive-ORed with (2), the result is fed to the encryptor, and its output is in turn exclusive-ORed with (3). The result of applying the encryptor through the last piece, (24), is the final encryption result.
The details of the encryption in the T-5 text authentication information generation unit have been described above.
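The chaining described above can be sketched in Python as follows. FEAL is not available in the Python standard library, so `toy_encrypt` is a stand-in (a keyed BLAKE2b hash truncated to 8 bytes) that only plays the role of the encryptor; it is not FEAL and is not invertible. The function names and the key are hypothetical.

```python
import hashlib
from typing import List

BLOCK_SIZE = 8  # number of embeddable bytes in the example


def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in for the encryptor (NOT FEAL): keyed hash cut to 8 bytes."""
    return hashlib.blake2b(block, key=key, digest_size=BLOCK_SIZE).digest()


def split_blocks(data: bytes) -> List[bytes]:
    """Divide data into 8-byte pieces, padding the last piece with 0x00."""
    padded = data + b"\x00" * (-len(data) % BLOCK_SIZE)
    return [padded[i:i + BLOCK_SIZE] for i in range(0, len(padded), BLOCK_SIZE)]


def chained_encrypt(data: bytes, key: bytes) -> bytes:
    """Encrypt piece (1), then XOR each output with the next piece and encrypt."""
    if not data:
        raise ValueError("nothing to encrypt")
    blocks = split_blocks(data)
    state = toy_encrypt(blocks[0], key)
    for block in blocks[1:]:
        mixed = bytes(a ^ b for a, b in zip(state, block))
        state = toy_encrypt(mixed, key)
    return state  # result after the last piece


result = chained_encrypt(b"<HTML>\n<TITLE>...</TITLE>\n</HTML>\n", b"secret key")
print(result.hex())
```

Because each piece is mixed with the previous encryptor output, a change in any one piece propagates to the final result.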
[0060]
In addition, because part of a text may be edited, the entire text may be split apart. This apparatus therefore extracts the text feature information for each unit of analysis determined by the F value. The text authentication information embedding unit T-6 then embeds the text feature information encrypted for each unit of analysis, together with the text issuer information of the entire text, the text issuer ID, and the text feature parameter (F value) (FIG. 8B: step S7).
[0061]
Here, details of the operation of the text authentication information embedding unit of T-6 will be described.
First, a text feature parameter (F value) is input from the text authentication information generation unit at T-5 to the text feature parameter input unit T-6a (FIG. 10: step S61).
Then, the invisible byte sequence generated by the T-5 text authentication information generation unit is input to the encrypted authentication information input unit T-6b (FIG. 10: step S62).
Further, the text into which the text authentication information is to be embedded is input from the T-1 text reading unit to the text input unit T-6c (FIG. 10: step S63).
When the information input to each input unit is ready, the determination unit T-6d determines whether all invisible byte sequences have been output (FIG. 10: step S64).
[0062]
When the determination unit T-6d determines that not all invisible byte sequences have been output, the embedded text output unit T-6f reads the input text in byte sequences of a size determined separately from the F value and outputs them (FIG. 10: step S65).
Further, the embedded encryption authentication information output unit T-6e reads and outputs the invisible byte sequence for each byte sequence having a size determined separately based on the F value (FIG. 10: step S66).
Then, steps S64 to S66 in FIG. 10 are repeated until all invisible byte sequences are output.
If the determination unit T-6d determines that all invisible byte sequences have been output, all remaining input text is output (FIG. 10: step S67).
The details of the operation of the T-6 text authentication information embedding unit have been described above.
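Steps S64 to S67 amount to interleaving chunks of text with chunks of the invisible byte sequence. The following is a minimal sketch of that loop. The function name `embed` and the two chunk-size parameters are hypothetical; the text only says the sizes are determined separately from the F value. The sketch does not check 2-byte character boundaries.

```python
def embed(text: bytes, invisible: bytes, text_chunk: int, info_chunk: int) -> bytes:
    """Interleave text and invisible authentication bytes.

    text_chunk and info_chunk are assumed chunk sizes; how they are derived
    from the F value is not specified here.
    """
    if text_chunk <= 0 or info_chunk <= 0:
        raise ValueError("chunk sizes must be positive")
    out = bytearray()
    t = 0  # read position in the text
    i = 0  # read position in the invisible byte sequence
    while i < len(invisible):                # S64: anything left to output?
        out += text[t:t + text_chunk]        # S65: output a chunk of text
        t += text_chunk
        out += invisible[i:i + info_chunk]   # S66: output a chunk of invisible bytes
        i += info_chunk
    out += text[t:]                          # S67: output all remaining text
    return bytes(out)
```

A real implementation would also need to choose chunk boundaries that do not split a 2-byte character, as the description requires for encodings such as SJIS.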
[0063]
Next, the operation of the T-6 text authentication information embedding unit will be described with a specific example.
When the F value in the above example is 1, that is, when the entire text is used as the text feature, the encrypted data is inserted at an arbitrary position in the text, provided that the position does not split a 2-byte character in an encoding such as SJIS. For example, suppose the following text is entered:

[0064]
If the final encrypted data is
8081 8283 8485 8687
then the encrypted data is embedded at the underlined portion.

[0065]
When the F value is 2, processing is performed per tag, so authentication information is embedded for each tag. The final encrypted data of the first tag,
8081 8283 8485 8687
and the final encrypted data of the second tag,
8889 9091 9293 9495
are embedded in the data as follows.

The above is the operation until the text authentication information is embedded in the HTML text.
[0066]
Next, an operation when authenticating HTML text embedded with text authentication information will be described.
[0067]
First, the T-7 text authentication information extraction unit extracts, from the text to be authenticated, the text authentication information embedded by the T-6 text authentication information embedding unit. It also separates the text authentication information from the original text. That is, within the code area shown in FIG. 5, byte strings in the area used by SJIS, for example, are judged to be text (text byte strings), and all other byte strings are judged to be text authentication information. The text byte strings are passed to the T-8 text feature extraction unit, which extracts the text feature information. The encrypted text authentication information is passed to the T-9 text authentication information reading unit, which restores it.
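The separation can be sketched as a single pass over the bytes. FIG. 5 is not reproduced here, so the rule for recognizing authentication codes is an assumption: the hypothetical predicate `is_assumed_auth_lead` treats lead bytes 0xF0 to 0xFC as marking 2-byte authentication codes. The function name `separate` is also hypothetical.

```python
from typing import Callable, Tuple


def is_assumed_auth_lead(lead: int) -> bool:
    """Hypothetical rule (assumption): lead bytes 0xF0-0xFC mark authentication codes."""
    return 0xF0 <= lead <= 0xFC


def separate(
    data: bytes,
    is_auth_lead: Callable[[int], bool] = is_assumed_auth_lead,
) -> Tuple[bytes, bytes]:
    """Split a byte string into (text bytes, authentication bytes)."""
    text = bytearray()
    auth = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80 or 0xA1 <= b <= 0xDF:
            # Single-byte SJIS character (ASCII range or half-width katakana).
            text.append(b)
            i += 1
        else:
            # Everything else is handled as a 2-byte unit.
            unit = data[i:i + 2]
            if is_auth_lead(b):
                auth.extend(unit)
            else:
                text.extend(unit)
            i += 2
    return bytes(text), bytes(auth)
```

Passing the predicate as a parameter keeps the scan independent of the exact code area, which would have to be taken from FIG. 5 in practice.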
[0068]
Next, the T-8 text feature extraction unit extracts, from the text from which the text authentication information has been separated (that is, the original text), the text feature information that was used when the text authentication information was generated. Because the T-7 text authentication information extraction unit has separated the text authentication information from the original text, the features are recalculated from the separated original text by the same processing as in the T-2 text feature extraction unit. Since the text feature parameter (F value) is set at the outset, the corresponding processing is performed according to that value, yielding the text feature information and the analysis units from which it was extracted.
[0069]
Next, the T-9 text authentication information reading unit uses the text feature information and the analysis unit extracted by the T-8 text feature extraction unit to separate and extract the text issuer ID and text issuer information from the text authentication information separated by the T-7 text authentication information extraction unit. Because the number of embedded bytes is determined from the analysis unit extracted by the T-8 text feature extraction unit, the text feature information extracted by that unit is divided into n sequences according to the number of embedded bytes. Decryption is performed over these n sequences by the method shown in FIG. 7, and the text issuer information and text issuer ID are extracted.
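FIG. 7 is not reproduced here, so the following is only one plausible reading of the decryption: the chaining used at embedding time is undone from the last feature block to the first. The cipher pair `toy_encrypt` and `toy_decrypt` is an insecure placeholder (an assumption, not FEAL), and `seal` and `unseal` are hypothetical names.

```python
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Placeholder cipher (assumption): XOR with the key, rotate left one byte."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt: rotate right one byte, XOR with the key."""
    unrotated = block[-1:] + block[:-1]
    return xor_bytes(unrotated, key)


def seal(issuer_block: bytes, feature_blocks: List[bytes], key: bytes) -> bytes:
    """Encrypt the issuer block, then chain in each text feature block."""
    out = toy_encrypt(issuer_block, key)
    for feature in feature_blocks:
        out = toy_encrypt(xor_bytes(out, feature), key)
    return out


def unseal(final: bytes, feature_blocks: List[bytes], key: bytes) -> bytes:
    """Undo the chain using feature blocks recomputed from the separated text."""
    out = final
    for feature in reversed(feature_blocks):
        out = xor_bytes(toy_decrypt(out, key), feature)
    return toy_decrypt(out, key)
```

Under this reading, the issuer block is recovered correctly only when the feature blocks recomputed from the separated text match those used at embedding time, so an edited text yields different output.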
[0070]
Continuing the previous example, consider separating the encrypted information from the original text information in text in which text authentication information and related information are embedded as watermark information, as follows:

[0071]
The text is separated into the underlined and non-underlined parts. The separation is performed by distinguishing SJIS codes from other data according to the SJIS code system shown in FIG. 5. The result of the separation is as follows.

[0072]
Next, the encrypted portion is decrypted, and the text issuer information and text issuer ID are obtained.
For example, from the above data,
<Name>Taro</Name>
<Issuer ID>123</Issuer ID>
are extracted.
[0073]
Next, the T-10 text issuer information extraction unit takes the text issuer information extracted by the T-9 text authentication information reading unit and outputs it. In the above example, the text issuer information extracted by the T-9 text authentication information reading unit is “<Name>Taro</Name>”, so this information is output to the output device.
Next, the T-11 text authentication information extraction unit takes the text issuer ID extracted by the T-9 text authentication information reading unit and outputs it to the output device. In the above example, the text issuer ID extracted by the T-9 text authentication information reading unit is “<Issuer ID>123</Issuer ID>”, so this information is output to the output device.
The operation at the time of authenticating the HTML text in which the text authentication information is embedded has been described above.
[0074]
The present invention may be used over a LAN or a dial-up network as well as the Internet. It may also be realized as a stand-alone device. Further, the authentication information embedding function and the retrieving function may be realized as separate devices or as the same device.
Further, a program (text electronic authentication program) for realizing the text electronic authentication device of the present invention may be recorded on a computer-readable recording medium, and text electronic authentication may be performed by loading the program recorded on the recording medium into a computer system and executing it.
That is, one text electronic authentication program causes a computer to realize the function of the text reading unit, the function of the text feature extraction unit, the function of the text issuer information input unit, the function of the text authentication information input unit, the function of the text authentication information generation unit, and the function of the text authentication information embedding unit. The other text electronic authentication program causes a computer to realize the function of the text authentication information extraction unit, the function of the text feature extraction unit, the function of the text authentication information reading unit, the function of the text issuer information extraction unit, and the function of the text authentication information extraction unit.
[0075]
Here, the “computer system” includes an OS and hardware such as peripheral devices.
Further, the “computer system” includes a homepage providing environment (or display environment) if a WWW system is used.
The “computer-readable recording medium” refers to a portable medium such as a floppy disk, a magneto-optical disk, a ROM, or a CD-ROM, and to a storage device such as a hard disk incorporated in a computer system. The “computer-readable recording medium” also includes a medium that holds a program dynamically for a short time (a transmission medium or transmission wave), such as a communication line used when a program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds a program for a certain period of time, such as a volatile memory inside a computer system serving as a server or client in that case.
The program may realize only part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.
[0076]
The embodiment of the present invention has been described in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment and includes designs and the like within a scope that does not depart from the gist of the present invention.
[0077]
[Effect of the invention]
As described in detail above, according to the present invention, the features of the text are extracted from an electronic document in which text is described, the information on the issuer of the text and the information for authentication are input, and these pieces of information are turned into encrypted electronic authentication information that is invisible on existing electronic document display devices and cannot be deciphered, which is then embedded in the electronic document. Appropriate authentication can therefore be given to electronic documents in which commonly used text is described.
[Brief description of the drawings]
FIG. 1 is a diagram showing a configuration of a text electronic authentication device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a configuration of a text authentication information generation unit according to an embodiment.
FIG. 3 is a diagram illustrating a configuration of a text authentication information embedding unit according to an embodiment.
FIG. 4 is a display example of HTML text.
FIG. 5 is a diagram showing Kanji codes and other codes in the SJIS Kanji code system.
FIG. 6 is a diagram illustrating an encryption procedure in a text authentication information generation unit.
FIG. 7 is a diagram illustrating a decrypting procedure in a text authentication reading unit.
FIG. 8 is a diagram illustrating an operation procedure of the text electronic authentication device according to the embodiment.
FIG. 9 is a diagram illustrating an operation procedure of a text authentication information generation unit according to an embodiment.
FIG. 10 is a diagram illustrating an operation procedure of a text authentication information embedding unit according to an embodiment.
[Explanation of symbols]
T-1 ... Text reading unit
T-2 ... Text feature extraction unit
T-3 ... Text issuer information input unit
T-4 ... Text authentication information input unit
T-5 ... Text authentication information generation unit
T-6 ... Text authentication information embedding unit
T-7 ... Text authentication information extraction unit
T-8 ... Text feature extraction unit
T-9 ... Text authentication information reading unit
T-10 ... Text issuer information extraction unit
T-11 ... Text authentication information extraction unit
T-5a ... Text feature information input unit
T-5b ... Feature parameter input unit (feature parameter identifier input unit)
T-5c ... Issuer information input unit
T-5d ... Issuer ID input unit (issuer authentication information input unit)
T-5e ... Encryptor
T-5f ... Code converter
T-5g ... Encryption authentication information output unit
T-6a ... Feature parameter input unit (feature parameter identifier input unit)
T-6b ... Encrypted authentication information input unit
T-6c ... Text input unit
T-6d ... Determination unit
T-6e ... Embedded encryption authentication information output unit
T-6f ... Embedded text output unit
T-6g ... Text output unit

Claims

A text electronic authentication device that enables authentication of an electronic document in which HTML text is described by embedding authentication information in the electronic document,
A text reading unit that reads HTML text from an electronic document in which the HTML text is described;
A text issuer information input unit for inputting information of an HTML text issuer described in the electronic document;
A text authentication information input unit for inputting information for authentication of the HTML text issuer;
A text authentication information generation unit that generates encrypted data that cannot be deciphered by a user from the issuer information input to the text issuer information input unit and the information for authentication of the HTML text issuer input to the text authentication information input unit, and that generates encrypted electronic authentication information in which each encrypted 1-byte code is mapped one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a WEB browser;
A text authentication information embedding unit for embedding the encrypted electronic authentication information generated by the text authentication information generating unit in the HTML text read by the text reading unit;
A text electronic authentication device comprising:

A text electronic authentication device that enables authentication of an electronic document in which HTML text is described by embedding authentication information in the electronic document,
A text reading unit that reads HTML text from an electronic document in which the HTML text is described;
A text feature extraction unit for extracting features of HTML text read by the text reading unit;
A text issuer information input unit for inputting information of an HTML text issuer described in the electronic document;
A text authentication information input unit for inputting information for authentication of the HTML text issuer;
A text authentication information generation unit that generates encrypted data that cannot be deciphered by a user from the information representing the features of the HTML text extracted by the text feature extraction unit, the issuer information input to the text issuer information input unit, and the information for authentication of the HTML text issuer input to the text authentication information input unit, and that generates encrypted electronic authentication information in which each encrypted 1-byte code is mapped one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a WEB browser;
A text authentication information embedding unit for embedding the encrypted electronic authentication information generated by the text authentication information generating unit in the HTML text read by the text reading unit;
A text electronic authentication device comprising:

The text electronic authentication device according to claim 2, wherein, in generating the encrypted data, the text authentication information generation unit
generates first encrypted data by encrypting the issuer information and the information for authentication of the HTML text issuer, and
generates final encrypted data from the first encrypted data and the information representing the features of the HTML text.

An apparatus for retrieving authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded,
A text authentication information extraction unit that reads the electronic document, extracts, as the encrypted and code-converted authentication information, the portion written in 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, and separates the remainder of the electronic document, excluding that authentication information, as HTML text;
A text authentication information reading unit that converts the encrypted and code-converted authentication information separated and extracted by the text authentication information extraction unit into 1-byte codes by the one-to-one mapping from the 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, thereby obtaining encrypted authentication information that cannot be deciphered by a user, and decrypts the encrypted authentication information to obtain the authentication information;
A text issuer information extraction unit that reads information of a text issuer from the authentication information decrypted by the text authentication information reading unit;
A text authentication information extraction unit that reads information for authentication of the text issuer from the authentication information decrypted by the text authentication information reading unit;
A text electronic authentication device comprising:

An apparatus for retrieving authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded,
A text authentication information extraction unit that reads the electronic document, extracts, as the encrypted and code-converted authentication information, the portion written in 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, and separates the remainder of the electronic document, excluding that authentication information, as HTML text;
A text feature extraction unit that extracts features of the HTML text based on the HTML text of the electronic document separated by the text authentication information extraction unit;
A text authentication information reading unit that converts the encrypted and code-converted authentication information separated and extracted by the text authentication information extraction unit into 1-byte codes by the one-to-one mapping from the 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, thereby obtaining encrypted authentication information that cannot be deciphered by a user, and decrypts the encrypted authentication information using the features of the HTML text to obtain the authentication information;
A text issuer information extraction unit that reads information of a text issuer from the authentication information decrypted by the text authentication information reading unit;
A text authentication information extraction unit that reads information for authentication of the text issuer from the authentication information decrypted by the text authentication information reading unit;
A text electronic authentication device comprising:

An electronic authentication method in a text electronic authentication apparatus that makes it possible to authenticate an electronic document in which HTML text is described by embedding authentication information in the electronic document,
A text reading procedure in which the text reading unit reads HTML text from an electronic document in which the HTML text is described;
A text issuer information input procedure in which information on the issuer of the HTML text described in the electronic document is input to the text issuer information input unit;
A text authentication information input procedure in which information for authentication of the HTML text issuer is input to the text authentication information input unit;
A text authentication information generation procedure in which the text authentication information generation unit generates encrypted data that cannot be deciphered by a user from the issuer information input to the text issuer information input unit and the information for authentication of the HTML text issuer input to the text authentication information input unit, and generates encrypted electronic authentication information in which each encrypted 1-byte code is mapped one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a WEB browser;
A text authentication information embedding procedure in which the text authentication information embedding unit embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit;
A text electronic authentication method characterized by comprising:

An electronic authentication method in a text electronic authentication apparatus that makes it possible to authenticate an electronic document in which HTML text is described by embedding authentication information in the electronic document,
A text reading procedure in which the text reading unit reads HTML text from an electronic document in which the HTML text is described;
A text feature extraction procedure in which the text feature extraction unit extracts features of the HTML text read by the text reading unit;
A text issuer information input procedure in which information on the issuer of the HTML text described in the electronic document is input to the text issuer information input unit;
A text authentication information input procedure in which information for authentication of the HTML text issuer is input to the text authentication information input unit;
A text authentication information generation procedure in which the text authentication information generation unit generates encrypted data that cannot be deciphered by a user from the information representing the features of the HTML text extracted by the text feature extraction unit, the issuer information input to the text issuer information input unit, and the information for authentication of the HTML text issuer input to the text authentication information input unit, and generates encrypted electronic authentication information in which each encrypted 1-byte code is mapped one-to-one to a 2-byte code that is not used in the kanji code and is invisible in a WEB browser;
A text authentication information embedding procedure in which the text authentication information embedding unit embeds the encrypted electronic authentication information generated by the text authentication information generation unit in the HTML text read by the text reading unit;
A text electronic authentication method characterized by comprising:

An electronic authentication method in an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded,
A procedure in which the text authentication information extraction unit reads the electronic document, extracts, as the encrypted and code-converted authentication information, the portion written in 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, and separates the remainder of the electronic document, excluding that authentication information, as HTML text;
A procedure in which the text authentication information reading unit converts the encrypted and code-converted authentication information separated and extracted by the text authentication information extraction unit into 1-byte codes by the one-to-one mapping from the 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, thereby obtaining encrypted authentication information that cannot be deciphered by a user, and decrypts the encrypted authentication information to obtain the authentication information;
A procedure for reading information of the text issuer from the authentication information decrypted by the text authentication information reading unit;
A procedure for reading out information for authentication of the text issuer from the authentication information decrypted by the text authentication information reading unit;
A text electronic authentication method characterized by comprising:

An electronic authentication method in an apparatus that retrieves authentication information from an electronic document describing HTML text in which encrypted and code-converted authentication information is embedded,
A procedure in which the text authentication information extraction unit reads the electronic document, extracts, as the encrypted and code-converted authentication information, the portion written in 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, and separates the remainder of the electronic document, excluding that authentication information, as HTML text;
A procedure in which the text feature extraction unit extracts features of the HTML text based on the HTML text of the electronic document separated by the text authentication information extraction unit;
A procedure in which the text authentication information reading unit converts the encrypted and code-converted authentication information separated and extracted by the text authentication information extraction unit into 1-byte codes by the one-to-one mapping from the 2-byte codes that are not used in the kanji code and are invisible in a WEB browser, thereby obtaining encrypted authentication information that cannot be deciphered by a user, and decrypts the encrypted authentication information using the features of the HTML text to obtain the authentication information;
A procedure for reading information of the text issuer from the authentication information decrypted by the text authentication information reading unit;
A procedure for reading out information for authentication of the text issuer from the authentication information decrypted by the text authentication information reading unit;
A text electronic authentication method characterized by comprising:

A computer-readable recording medium storing a program for realizing the text electronic authentication method according to claim 6 or 7 on a computer.

A computer-readable recording medium storing a program for realizing the text electronic authentication method according to claim 8 or 9 on a computer.