JP2001507837A - Method and apparatus for securely storing data

Info

Publication number: JP2001507837A
Application number: JP53026198A
Authority: JP
Inventors: Benjamin D. Goldstein
Original assignee: Benjamin D. Goldstein
Priority date: 1996-12-30
Filing date: 1997-12-29
Publication date: 2001-06-12
Also published as: CA2276036A1; WO1998029981A1; CA2276036C; AU5723198A; US5963642A; EP1013030A1; EP1013030A4

Abstract

(57) [Abstract] An end-user client workstation (1) communicates with a database server computer (30). The end-user client workstation (1) has a memory (6) containing a codebook (11) and two algorithms (15, 19). The database server computer (30) has a CPU (31), a communication port (32), and a memory (33). The memory (33) contains a q-code database (35) and an algorithm (39). The end-user client workstation (1) further has a communication port (5) for communicating with the database server computer (30).

Description

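[Detailed Description of the Invention] Method and apparatus for securely storing data

[0001]
Field of the Invention
The present invention relates to the secure storage of data. More specifically, it relates to securely storing meaningfully encrypted data in a manner that does not require the data to be decrypted.

[0002]
Background of the Invention
Database systems are required to preserve the integrity and confidentiality of various data sets, so that only authorized individual users or groups of users are permitted to access and manipulate the data. This requirement is generally handled through user authentication controls. Audit trails are also kept, recording, at least in theory, which user accessed which information and when that access took place. Among other purposes, audit trails are intended to establish accountability for accessing and manipulating data on a database system, and thereby help deter improper access to or manipulation of that data. Access controls and audit trails are useful and prudent mechanisms for supporting the confidentiality of a database system, but database systems that rely on them remain vulnerable to breaches of confidentiality. A basic area of remaining vulnerability is the people who administer the system. A person with the authority to administer a system can disable, erase, or rewrite the audit trail. Because administrators must be able to do their work effectively, and because of the limitations of current technology, they hold special privileges to access database information. The chief limitation is that current database systems lack effective and efficient security based on data encryption. Furthermore, when access controls are defeated, for example by a computer hacker, the confidentiality of the data is put at risk because the database is not protected by encryption.

[0003]
Problems with Current Approaches
It is widely known that applying existing cryptographic techniques to improve database security affects database performance. Because strong encryption methods alter the data structures in a database, many kinds of query operations and other DML (database manipulation language) operations are severely affected, since the data must first be decrypted before the database can be operated on. Moreover, decrypting data with existing encryption methods exposes the information in plaintext.

[0004]
Database encryption differs from communications encryption in many respects. The differences have been discussed in several places, including the following: [Gudes, E. "The Application of Cryptography to Data Base Security." Ph.D. Dissertation, Ohio State University, 1976]; [Gudes, E., H. S. Koch, and F. A. Stahl. "The Application of Cryptography for Data Base Security." In Proceedings of the National Computer Conference, AFIPS Press, 1976, pp. 97-107]; [Seberry, J. and J. Pieprzyk. "Cryptography: An Introduction to Computer Security." New York: Prentice-Hall, 1989, pp. 233-259]. These references are incorporated herein as background to the invention and form part of this application. Databases are built so that multiple users can access, query, and manipulate shared stored data. These users typically hold different privileges over different kinds of stored information. The point to emphasize is that shared stored data with varying access privileges poses problems quite different from those of simple communication, because in simple communication the parties are not normally concerned with operating on commonly shared stored data.

[0005]
A database allows the records it holds to be transformed selectively and unpredictably. Compared with communications, where the manipulation of messages is either not an issue or is simple, this property further restricts the kinds of encryption that can be applied effectively to a database. Meeting the requirements of database encryption is clearly harder than meeting the straightforward requirements of file encryption.

[0006]
In almost every database, building indexes is essential for adequate query performance. No general technique is known for using an index effectively while it remains encrypted when database information is accessed. For an index to be usable, therefore, it must be in unencrypted form.

[0007]
Overview of Database Encryption
Gudes [Gudes, E. "The Application of Cryptography to Data Base Security." Ph.D. Dissertation, Ohio State University, 1976] and Gudes, Koch, and Stahl [Gudes, E., H. S. Koch, and F. A. Stahl. "The Application of Cryptography for Data Base Security." In Proceedings of the National Computer Conference, AFIPS Press, 1976, pp. 97-107] identified three fundamental constraints that distinguish database encryption from communications encryption. These constraints severely restrict the kinds of encryption that can be applied effectively to a database. First, the method must allow data to be retrieved selectively and efficiently. Because the data in a database is arranged to facilitate such operations, the encryption and decryption of an individual record should not involve other records. Second, data normally resides in a database for a long time. If the data is encrypted and a change of encryption key is required, the data must be re-encrypted with the new key. Third, there is a "processing problem". It would be very convenient if database operations could be performed directly on the encrypted data, that is, if encrypted data could be handled in the same way as plaintext data. This would not only eliminate the overhead of encrypting plaintext and decrypting ciphertext; it would also mean that the data never exists in plaintext at any exposed point in the data-processing cycle, which increases the security of the data.

[0008]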
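Gudes, and Gudes, Koch, and Stahl, point out that a multilevel model is the most suitable model for addressing the problems of database encryption. A database is viewed as having multiple levels: the data exists in the form of several data structures, can be referenced accordingly, and is managed with mappings between the levels. These mappings in effect define transformations of the data. Because such transformations are an ordinary feature of database design, they can be exploited and extended to provide cryptographic security. Their analysis examined the various types of encryption available between adjacent levels of a multilevel database architecture. Gudes et al. defined their own database architecture with several levels, both physical and logical. Their multilevel architecture is used to highlight the possibilities for encryption between different levels of a database. They emphasize that, in a database, data exists in different forms on the system's various physical media (disk, memory, display), so that the data has various physical levels, each with a corresponding abstract (logical) meaning. Various types of cryptographic transformation can be performed between the levels of this architecture. The logical levels define the format of database records as appropriate at each level, while the physical levels consist of the concrete data in the format defined by the logical records. Typically, any number of physical records exist for every defined logical record. Although the work of Gudes, and of Gudes, Koch, and Stahl, was published in 1976, their model nevertheless accommodates distributed computing architectures, since it is implicitly understood that the levels of a multilevel database architecture can be placed at physically separate locations.

[0009]
Gudes et al. defined five logical levels of database architecture:
1) user logical level
2) system logical level
3) access logical level
4) storage level (or structured storage level)
5) unstructured storage level
One or more physical levels can be assigned to each logical level, depending on the number of physical media. Whether a logical level corresponds to a physical level also depends on implementation details.

[0010]
Gudes et al. detail the specific types of cryptographic transformation that are applied in mapping between adjacent levels of the database architecture. Seberry and Pieprzyk give a more recent summary and analysis of the work of Gudes, Koch, and Stahl in their book on computer security. [Seberry, J. and J. Pieprzyk. "Cryptography: An Introduction to Computer Security." New York: Prentice-Hall, 1989, pp. 233-259]

[0011]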
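The cryptography achieved through the multilevel database architecture of Gudes et al. concerns the encryption of data as it sits in storage on a computer system. Multilevel encryption does not actually address how encrypted data should be manipulated. It does not make direct manipulation of encrypted data elements possible in all cases: the encrypted data must first be decrypted before an item is accessed.

[0012]
A fundamental constraint on database encryption is the requirement that stored data be in a form convenient for manipulation. This condition limits the strength of the encryption when ciphers are applied between adjacent levels of a database. Ultimately, the data must be stored at the lowest level in a readily retrievable form. If it is not, extra work is needed to reorganize the data into a form that can accept queries.

[0013]
Combining different types of cryptographic transformation can yield a very strong cipher. On the other hand, in a naturally structured multilevel database architecture, the inherently limited number of available levels places a severe constraint on the quest for cryptographic security. Cryptographic security is what is ultimately sought from a database through encryption, without altering the database so drastically that its performance degrades.

[0014]
The concept of a multilevel database architecture is well known, and various multilevel database architectures and associated terminology have been defined. The "ANSI/SPARC" architecture is the most widely recognized model. Date's description of the ANSI/SPARC architecture [Date, C. J. "An Introduction to Database Systems." 5th ed. New York: Addison-Wesley, 1990, Vol. I, Chapter 2, pp. 31-54] is incorporated herein as background to the invention and forms part of this application.

[0015]
Ciphers and Q-codes
Cryptography has two main branches: ciphers and q-codes. Each deals with a different kind of encryption. A cipher involves transforming individual symbols, or groups of symbols, of an alphabet. Symbols here include, for example, uppercase letters, lowercase letters, digits, and punctuation marks. The cipher transformation is applied to each symbol or group of symbols in a completely general way, without any special rules for semantic units. Arbitrary, meaningless strings of symbols can be enciphered just as readily as meaningful text. The syntactic unit on which a cipher operates carries meaning only by coincidence. A q-code, by contrast, involves transforming syntactic units such as words, phrases, or even whole sentences that have definite semantic content. Any nontrivial q-code requires a large codebook. An example makes the reason clear: a simple q-code able to transform English text needs a codebook equivalent to one containing every English word, including proper nouns. The codebook used in a q-code constitutes the key of the code. The entries in the codebook can themselves be regarded as keys.

[0016]
Cryptanalytic techniques rely essentially on the statistical characteristics of the plaintext domain. When attempting to analyze a cipher whose plaintext is presumed to be a message in some particular natural language, the cryptanalyst examines the frequencies of individual letters and letter combinations in plaintext of that language. For a q-code, correspondingly, the frequencies of individual words and word combinations are examined.

[0017]
Overview of SPARCOM
SPARCOM is an acronym for "Sparse Associative Relational Connection Matrix". It is an approach proposed and studied by Ashany for dynamically structuring data in database systems so as to provide short response times and high throughput for many kinds of applications. In this approach, discrete-valued data is converted into large sparse matrices, and database operations are carried out using a wide range of sparse matrix techniques. SPARCOM stores and manipulates the sparse matrices in compressed form, which saves large amounts of storage space and execution time. The normalization process inherent in SPARCOM reduces the data redundancy often caused by entities that have multiple values for a given attribute. Database operations are performed as arithmetic operations on the sparse matrix structures that hold the database's structured information at the internal level.

[0018]
SPARCOM provides content addressing for discrete-valued data: data elements are addressed and retrieved as a function of their content. To achieve this, SPARCOM converts a given entity-attribute relation into a corresponding entity-property relation. An entity-attribute relation indicates which attributes the objects of a given relation have. The corresponding entity-property relation indicates whether a given object has each of a set of properties, with one property for every value that a given attribute can take. The entity-property relation is represented as a matrix, which is usually extremely sparse. In conventional relational database theory, several relations (that is, multiple tuples in a table) must be created for an object that has multiple values for a given attribute. In SPARCOM this is unnecessary, because it is built on entity-property relations.

[0019]
Ashany describes the Binary Property Matrix as the fundamental data structure in SPARCOM, corresponding to the entity-property relation [Ashany, pp. 62-63]: An n-dimensional attribute space with attributes A_1, A_2, ..., A_n, whose domains D_1, D_2, ..., D_n have cardinalities d_1, d_2, ..., d_n respectively, can be transformed into an N-dimensional property space according to

$N = \sum_{i=1}^{n} d_i$

The N independent properties P_1, P_2, ..., P_N map every point of the n-dimensional attribute space to a point of the N-dimensional property space. Clearly N is larger than n, and more axes are needed to represent a point in the higher-dimensional Euclidean space; hence the vectors are larger. In the property space, however, each axis has only two distinguishable points, 0 and 1, and each axis represents one property.

[0020]
An entity represented in attribute space by n single-valued attributes, that is, by an n-tuple, is represented in property space by an N-tuple with n ones and N−n zeros. A 1 is inserted at each position representing a property the entity has. Since the cardinality of the attribute sex is 2 (male, female) and the cardinality of the attribute eye color is 5 (black, blue, brown, green, hazel), an entity with the properties (male, blue), that is, one whose sex is male and whose eye color is blue, is represented by the 7-tuple E(1,0,0,1,0,0,0). The 2-tuple is transformed into a 7-tuple containing two nonzero elements. When the single-valued attributes have cardinalities such as d_1 = 10 and d_2 = 12, a 2-tuple in attribute space is transformed into a 22-tuple in property space: again a binary vector, with 2 nonzero values and 20 zeros. These vectors in property space are called Extended Binary Vectors (EBV), and they are usually sparse.

[0021]
A set of m entities is represented by an m × N binary matrix of 0s and 1s called the Binary Connection Matrix (BCM), because its nonzero elements indicate the relationships between each entity and its corresponding properties. More specifically, this matrix is called the Binary Property Matrix (BPM). An important feature of the EBV is that single-valued and multi-valued attributes are represented by one and the same vector, which resolves the redundancy problem.

[0022]
A consequence of the way the Binary Property Matrix is constructed is that, because each and every property is indexed, the matrix is itself a fully inverted file (and also a direct file). The BPM is therefore inherently well suited to retrieval by property. Range queries are also far easier under SPARCOM than with classical database structures, because answering one requires multiplying the BPM by just one query vector. By contrast, in an attribute-based database, answering a range query generally requires repeating the search operation several times.

[0023]
The relations on which Binary Property Matrices are based are in their own normal forms, called SPARCOM normal forms (SNFs), which differ from the normal forms of other relational databases in being property-based rather than attribute-based. The most notable difference lies in how multi-valued relationships are handled. The 1NF normalization defined by Codd reduces the redundancy produced by multi-valued attributes by decomposing a relation into several relations. Such decomposition is unnecessary and inappropriate under SPARCOM, because SPARCOM organizes relations by property rather than by attribute. In effect, the goal of 1NF is achieved automatically under SPARCOM without resorting to the decomposition normally associated with 1NF.

[0024]
The Consultant relation in the example of FIG. 1A is not in 1NF. The functional dependencies among the attributes of the Consultant relation are: Name → Hourly rate, Name → Skill, Name → Day. These are taken as given, although other dependencies may exist. Skill and Day are given here as multi-valued attributes. In FIG. 1A, for purposes of illustration, the multi-valued attributes Skill and Day are handled in two different ways. Multiple instances of Skill for a particular consultant are broken out into multiple records. Multiple instances of the days on which a consultant is available, on the other hand, are formed into a repeating group within the individual record. Neither technique for representing multi-valued attributes is desirable.

[0025]
On the one hand, holding multiple records for each instance of a multi-valued attribute, in a relation that is not in 1NF, needlessly duplicates other attributes that are not part of the functional dependency in question. (In this example, Hourly rate and Day do not participate in the functional dependency Name → Skill, yet their data is duplicated.) Repeating groups, on the other hand, are not atomic values, so operating on records that contain repeating groups requires additional processing. Furthermore, the records of relations with repeating groups either are of unequal length or contain null values, and neither case is desirable.

[0026]
Decomposing the Consultant relation (Name, Hourly rate, Skill, Day) to remove the multi-valued attributes, as 1NF requires, yields three separate relations: CRate(Name, Hourly rate), CSkills(Name, Skill), and CDays(Name, Day). FIG. 1B shows the 1NF relations obtained by decomposing the data of FIG. 1A. The relations of FIG. 1B are also in 3NF and in BCNF (Boyce-Codd Normal Form).

[0027]
Decomposition normally reduces overall redundancy, but it creates a small redundancy in the process: in the example of FIG. 1B, Name appears as part of every relation. This redundancy is necessary. It serves to reconstruct the original relation by performing an ordinary join on the attribute that is present in all the relations derived from the decomposition.

[0028]
FIG. 1C shows the SNF version of the Consultant relation used in FIG. 1A. Because a SPARCOM database does not need to be decomposed into 1NF, the redundancy that decomposition would create is avoided, and the data is organized more efficiently. This is precisely an advantage of the SPARCOM model over conventional attribute-based database models.

[0029]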
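As an aside on the property-space representation described in the background above, the following minimal Python sketch builds extended binary vectors from attribute values. The names `DOMAINS`, `property_columns`, and `extended_binary_vector` are hypothetical and chosen only for illustration; the attribute domains are the sex and eye-color example from the text, and the column ordering (attributes in listed order, values in listed order) is an assumption.

```python
from typing import Dict, List, Tuple

# Attribute domains from the sex / eye-color example in the text.
# The ordering of attributes and values is an assumption of this sketch.
DOMAINS: Dict[str, List[str]] = {
    "sex": ["male", "female"],
    "eye": ["black", "blue", "brown", "green", "hazel"],
}


def property_columns(domains: Dict[str, List[str]]) -> Dict[Tuple[str, str], int]:
    """Assign one 1-based column to every (attribute, value) property."""
    cols: Dict[Tuple[str, str], int] = {}
    for attr, values in domains.items():
        for value in values:
            cols[(attr, value)] = len(cols) + 1
    return cols


def extended_binary_vector(
    entity: Dict[str, List[str]], cols: Dict[Tuple[str, str], int]
) -> List[int]:
    """Encode an entity as an N-element 0/1 vector (one BPM row).

    `entity` maps each attribute to a list of values, so a multi-valued
    attribute simply sets several 1s in the same row.
    """
    ebv = [0] * len(cols)
    for attr, values in entity.items():
        for value in values:
            ebv[cols[(attr, value)] - 1] = 1
    return ebv


COLS = property_columns(DOMAINS)  # N = 2 + 5 = 7 properties
MALE_BLUE = extended_binary_vector({"sex": ["male"], "eye": ["blue"]}, COLS)
```

With these domains, `MALE_BLUE` is the 7-element vector with 1s in the columns for "male" and "blue", matching the E(1,0,0,1,0,0,0) example; a multi-valued attribute needs no extra rows, only extra 1s.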
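Ashany describes one of the main performance advantages of the SPARCOM approach as follows [Ashany, p. 184]: Many algorithms that deal with sparse matrices share one characteristic: only the nonzero elements of the matrix are stored. The aim is to save storage space, and especially access and execution time, by operating on these matrices without representing them in full, since zero elements need not be represented or manipulated.

[0030]
Another major performance advantage of SPARCOM over other database approaches is the content addressability of its data. Other database systems often need to maintain several indexes in order to answer different types of query quickly. In a non-SPARCOM database system, when no index has been built to support a particular query, the response to that query is very slow, because without an index an exhaustive search of the data elements is required. SPARCOM does not need multiple indexes, because it indexes all of the data: the various techniques used to compress the BPMs in SPARCOM are in fact themselves indexing techniques.

[0031]
Sparse Matrix Concepts
As stated above, SPARCOM uses binary sparse matrices as the building blocks of the internal-level data structures of a property-based database. The persistent data in SPARCOM consists of BPMs, whereas query results take the form of nonbinary matrices, which may or may not themselves be sparse. Ashany discussed three schemes for indexing matrices: the Bit Map Scheme (BMS), the Single Index Scheme (SIS), and the Double Index Scheme (DIS). Each of these compression schemes achieves significant compression of sparse matrices, and higher compression still for binary sparse matrices. Database operations were performed on BPMs indexed under each of the three schemes. Results varied with the choice of indexing scheme, data set, and database operation, but excellent performance was obtained with each of the compression schemes. FIG. 2 gives an example of how a matrix is compressed under the BMS, SIS, and DIS schemes detailed below.

[0032]
Bit Map Scheme
In the bit map scheme, an m × n matrix A (m rows, n columns) is decomposed into three components:
1) a 2-tuple Dim(m, n), where m and n are the numbers of rows and columns of A;
2) a binary matrix B of dimension m × n, in which the nonzero values of A are replaced by 1s;
3) a vector v whose elements are the nonzero values of A, listed in some order of traversal.
The bits of the binary matrix B are stored as a bit string S_B formed by concatenating the rows (or columns) of the matrix. The number of bytes needed to store S_B is given by the simple formula

$S_B = \lceil (m \times n) / S \rceil$

where S is the number of bits in a byte. The elements of v are ordered as they appear when the matrix is scanned in sequence from row 1 to row m, or from column 1 to column n. Other orderings are of course possible.

[0033]
The bit map scheme achieves significant compression by storing each element of the binary matrix B as a single bit. Multiple binary elements are packed, in the well-known way, into a single byte, the actual number of bits depending on the byte size. This compression is clearly achieved most efficiently on hardware that can perform bit manipulation efficiently, using a language that supports it. That the bit map representation of a BPM A does not need the vector v is clear from the fact that the 2-tuple Dim(m, n) and the bit map B suffice to define the BPM A completely.

[0034]
Single Index Scheme
In contrast to the bit map scheme, the single index scheme stores only the nonzero elements of the matrix. It represents a nonbinary matrix A using three components:
1) a 2-tuple Dim(m, n), where m is the number of rows of A and n is the number of columns of A;
2) a position vector v_1 whose elements list the positions of the nonzero elements in A;
3) a vector v_2 whose elements are the nonzero values of A.
The elements of the two vectors v_1 and v_2 are indexed so that element b_i of v_2 holds the value of the element found in A at the position given by element a_i of v_1.

[0035]
The position k of element (i, j) of A is determined by the linear mapping function

$k = f(i, j) = j + (i - 1) \times n$

where i and j are the row and column of the element and n is the number of columns of A. This formula simply orders the elements of the matrix as they are encountered when the matrix is scanned row by row, from row 1 to row m.

[0036]
A binary matrix such as a BPM is represented in the same way under the single index scheme, but only two components are needed: 1) the 2-tuple Dim(m, n) and 2) the position vector v_1, as defined above. Since every nonzero value of a binary matrix is 1, there is clearly no need for the second vector v_2 specifying the nonzero values.

[0037]
Double Index Scheme
The double index scheme has three components, the second of which consists of two parts:
1) a 2-tuple Dim(m, n) giving the numbers of rows and columns of the matrix;
2) two vectors v_1 and v_2 that index the positions of the elements of the matrix;
3) a vector v_3 whose elements are the nonzero values of A.
Components 1 and 3 correspond exactly to their counterparts in both the bit map scheme and the single index scheme described above, and need no further discussion. As with the other compression schemes, the vector v_3 (which here holds the values of the nonzero elements of the matrix) is not needed for binary matrices.

[0038]
For each row, from row 1 to row m, of a matrix A with Dim(m, n), the vector v_1 lists in order the column numbers of the elements of A that have nonzero values. The last element of v_1 must hold a distinguished symbol; any symbol other than an integer in the range 1 to n may be used. (Ashany uses the symbol "Δ".) The number of elements in v_1 is one more than the number of nonzero elements in A.

[0039]
The elements of the vector v_2 identify the positions in v_1 of the entries that hold the first nonzero element of each row of A. The vector v_2 is indexed so that its element i gives the index in v_1 of the first nonzero element in row i of A. The vector v_2 has m + 1 elements. The last element of v_2 points to the last element of v_1, which is the distinguished symbol.

[0040]
Other Indexing Schemes
Many other techniques exist for compressing sparse matrices. A frequently cited and easily programmed technique uses lists linked either by row or by column. Doubly linked lists, from which data can readily be retrieved either by row or by column, can also be used, as can other methods that index the nonzero sparse matrix values in arrays (as in the SIS and DIS compression techniques) or in more complex data structures.

[0041]
Tradeoffs exist in choosing among the various sparse matrix compression schemes. For example, compared with the bit map, single index, and double index schemes described above, implementations based on linked and doubly linked lists do not provide comparable compression, because of the additional overhead needed to maintain the linked lists: each node of a list holds both the value of an element and link address information. On the positive side, linked list structures tend to perform better than the obvious implementations of the above schemes when new nonzero elements are inserted into a sparse matrix. When the vectors used in the above compression schemes are implemented as simple dense arrays, inserting a new nonzero element requires that a new array be constructed. At best, this involves shifting the left subvector backward in memory or the right subvector forward in memory, which assumes that sufficient memory has been allocated before the shift.

[0042]
Queries
In SPARCOM, a simple query is executed by matrix-multiplying the BPM by the transpose of a query vector. The query vector is a row vector and must be constructed with as many elements as the queried BPM has columns. The query vector is binary, that is, it contains only 1s and 0s. The 1s in the query vector indicate the properties that are sought.

[0043]
In SPARCOM, the result of a simple query is a column vector (or response matrix) that is in general not binary. The dimension of this column vector corresponds to the number of rows in the BPM from which it is derived. The value of the i-th element of the column vector obtained in a simple query indicates the number of properties that the query vector and row i of the queried BPM have in common. The degree of a query vector is the number of 1s in the vector. For a simple query, row i of the BPM "matches" the query vector when the i-th element of the (often nonbinary) column vector equals the degree of the query vector. Put another way, for a simple query the threshold for the elements of the response matrix equals the degree of the query vector. FIG. 3 shows an example of a simple query.

[0044]
Many more complex types of query, including range queries and queries that require Boolean operations, are readily executed using the SPARCOM approach. In a range query, multiple values (properties) are specified for some attribute. A range query returns the records that have any one of the specified properties. Suppose we are dealing with a customer relation Cust(Name, Street, City, State, Zip code). An SQL statement for this relation that specifies a range query on State could be given as follows:
Select * from Cust where State = "NY" or State = "NJ" or State = "CT"

[0045]
This SQL statement highlights the fact that range queries require "or" operations, often multiple "or" operations. In an attribute-oriented database, each "or" operation increases the search time the query requires. Under the SPARCOM approach, a range query on a single-valued attribute is achieved by performing an ordinary matrix multiplication of the BPM with a query vector that contains all of the values in the specified range. The only adjustment needed to obtain the matching rows is that the threshold for the elements of the response matrix must equal the number of attributes queried rather than the degree of the query vector. In SPARCOM, therefore, a range query on a single-valued attribute requires no extra search time. FIG. 4 illustrates how a SPARCOM range query executes the SQL statement for the Cust relation shown above.

[0046]
Matrix multiplication on sparse matrices compressed with certain techniques, such as the SIS and DIS compression techniques, needs to process only the nonzero elements of the sparse matrix, so that excellent performance can be obtained for sparse matrix multiplication.

[0047]
The above overview of the SPARCOM approach to performing database operations serves only as an introduction. A more detailed description is given in Ashany's doctoral dissertation, which is incorporated herein as background to the invention and forms part of this application. [Ashany, R. "SPARCOM: A Sparse Matrix Associative Relational Approach to Dynamic Structuring and Data Retrieval." Ph.D. Dissertation, Polytechnic Institute of New York, June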
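1976]

[0048]
Summary of the Invention
The present invention relates to an apparatus for securely storing data. The apparatus comprises a database holding a store of meaningfully encrypted data. The apparatus comprises a database mechanism for performing meaningful database operations involving the meaningfully encrypted data without requiring decryption of the data. The database mechanism is connected to the database. The apparatus also comprises an access mechanism connected to the database mechanism for obtaining data from the database mechanism.

[0049]
The present invention relates to an apparatus for storing data. The apparatus comprises a database having meaningfully represented data. The apparatus comprises a database mechanism for performing database operations with the meaningfully represented data. The database mechanism is connected to the database. The apparatus comprises an access mechanism connected to the database mechanism for obtaining data from the database mechanism, the access mechanism including various users who hold different representations of the meaningfully encrypted data.

[0050]
The present invention relates to a method for securely storing data. The method includes the step of storing meaningfully encrypted data in a memory. Next there is the step of performing database operations using the meaningfully encrypted data from the memory without requiring decryption of the data. There is also the step of obtaining data from the memory.

[0051]
Brief Description of the Drawings
The accompanying drawings show the preferred embodiment of the invention and preferred methods of practicing the invention.
FIG. 1A shows, as background information, a relation that is not in 1NF (first normal form).
FIG. 1B shows, as background information, the relation of FIG. 1A after conversion to 3NF (third normal form).
FIG. 1C shows, as background information, the relation of FIG. 1A after conversion to SNF (SPARCOM normal form).
FIG. 2 shows, as background information, an example of how a matrix is compressed using the BMS (Bit Map Scheme), SIS (Single Index Scheme), and DIS (Double Index Scheme) compression methods.
FIG. 3 shows, as background information, an example of a simple query executed by the SPARCOM method.
FIG. 4 shows, as background information, an example of a range query executed by the SPARCOM method.
FIG. 5 shows the basic elements of the invention as a block diagram.
FIG. 6A shows the network architecture of the preferred embodiment of the invention.
FIG. 6B shows another network architecture of the preferred embodiment of the invention.
FIG. 7A shows a BPM (Binary Property Matrix) for an example "Sales Rep" relation.
FIG. 7B represents the BPM of the same "Sales Rep" relation shown in FIG. 7A. The BPM has been modified so that its number of columns and the property content of its columns correspond to those of the BPM of the "Cust" relation with which it will be joined.
FIG. 7C shows the BPM obtained from the BPM of FIG. 7B after a projection that selects only the columns representing "State" and "sanitizes" the rest of the relation.
FIG. 7D shows the BPM of FIG. 7C after transposition.
FIG. 8 represents the BPM of the same "Cust" relation shown in FIG. 2. The BPM has been modified so that the property content of its columns and its number of columns match those of the BPM of the "Sales Rep" relation with which it will be joined.
FIG. 9 shows the response matrix obtained by the matrix multiplication FIG. 8 × FIG. 7D. The response matrix identifies the rows of the original BPMs (given in FIGS. 8 and 7B) that are to be joined.
FIG. 10A shows another network architecture of the preferred embodiment of the invention.
FIG. 10B shows another network architecture of the preferred embodiment of the invention.
FIG. 11 shows the ordering of the coordinates of a 5 × 8 matrix under the single index compression scheme (SIS), together with an example matrix and its SIS representation. It will be noted that encrypting only the dimension information of the SIS representation of this matrix further enhances the cryptographic security of the encoded matrix with virtually no sacrifice of performance.

[0052]
Description of the Preferred Embodiment
Referring to the drawings, in which like reference numerals denote similar or identical parts, and more particularly to the individual figures, there is shown an apparatus for securely storing data. The apparatus includes a database holding a store of meaningfully encrypted data. The apparatus comprises a database mechanism for performing meaningful database operations involving the meaningfully encrypted data without requiring decryption of the data. The database mechanism is connected to the database. The apparatus also comprises an access mechanism connected to the database mechanism for obtaining data from the database mechanism.

[0053]
The access mechanism preferably includes an encryption/decryption mechanism, connected to the database mechanism, which receives plaintext data, encrypts it, and supplies it to the database. It also receives encrypted data from the database mechanism and decrypts it. Preferably, the access mechanism includes an end-user workstation having a user CPU and a workstation memory, in which the encryption/decryption mechanism includes a codebook stored in the memory and a software program in the memory that accesses and updates the codebook.

[0054]
The meaningfully encrypted data is preferably a property-oriented positional Q-code. The property-oriented positional Q-code preferably comprises sparse binary matrices. The security of each property-oriented positional Q-code is enhanced by using dummy columns, dummy rows, column splitting, column offsets, permutation of the BPM columns, and encryption of the dimension information of the compressed sparse matrices.

[0055]
The apparatus preferably includes a database server computer having the database mechanism and the database. The database mechanism preferably comprises a server CPU and a server memory connected to the server CPU. The server memory holds the database. Preferably, the server memory comprises a database command storage buffer and a database response storage buffer. The server computer preferably includes a workstation communication port connected to the server memory and the server CPU. Preferably, the workstation has a server communication port connected to the workstation communication port and to the workstation CPU and the workstation memory, as well as an input port and an output port, both of which are connected to the workstation memory, the workstation CPU, and the communication port.

[0056]
The present invention relates to an apparatus for protecting the storage of data. The apparatus comprises a database having fully indexed data. The apparatus also comprises a database mechanism for performing operations on or with the fully indexed data, where index information permits access to and interpretation of the fully indexed data. The apparatus comprises an access mechanism connected to the database mechanism for obtaining data from the database mechanism.

[0057]
The present invention relates to an apparatus for storing data. The apparatus comprises a database holding a store of meaningfully represented data. The apparatus comprises a database mechanism for performing database operations using the meaningfully represented data. The database mechanism is connected to the database. The apparatus comprises an access mechanism connected to the database mechanism for obtaining data from the database mechanism, such that the access mechanism includes various users who hold different representations of the meaningfully encrypted data.

[0058]
The access mechanism provides various natural-language translations of the meaningfully represented data to the various users. Alternatively, the access mechanism provides an audio representation of the meaningfully represented data to visually impaired users.

[0059]
The present invention relates to a method for protecting the storage of data. The method includes the step of storing meaningfully encrypted data in a memory. Next there is the step of performing database operations with the meaningfully encrypted data from the memory without requiring decryption of the data. There is also the step of obtaining data from the memory.

[0060]
The central idea of the invention, as shown in FIG. 5, is essentially a database encryption mechanism and method in which database information is organized and distributed so that internal-level data in the form of q-code information is held on one or more database servers, while external-level (user-level) and/or conceptual-level (community-level) schema information is placed on end-user client workstations. In addition to the external-level or conceptual-level schema information, the database information residing on any given end-user client workstation includes a codebook containing a list of value pairs. One member of each pair identifies a property. The other value identifies the set of equivalent q-codes for the given property. The codebook can thus be regarded as providing an index of properties, for which any given end-user client workstation holds the cryptographic key. In the preferred embodiment, the database information (data content) residing on any given database server is a SPARCOM database; that is, it consists of a set of compressed BPMs, each of which is an instance of a particular property-entity relation. The database server does not hold the index information needed to interpret this internal-level compressed BPM data. The network architecture of the preferred embodiment is shown in FIG. 6A.

[0061]
Referring to FIG. 5, as described above, on the end-user client workstation (1), Algorithm 1A (15) accepts user input at the input port (2), parses the input, performs codebook lookups to find the encrypted equivalents of the properties specified in the input, formulates encrypted database commands, submits them to the database server, and stores information about the issued commands in the context storage buffer (13). Algorithm 1B (19) waits until an encrypted response is sent from the communication port (32) of the database server computer (30) and received at (5), reads the ciphertext, and stores it temporarily in the database response storage buffer (17). Algorithm 1B (19) then inspects the context storage buffer (13) to determine which command the received encrypted response relates to, decrypts the ciphertext by performing codebook lookups to determine the plaintext equivalents of its elements, and processes the plaintext results as directed by the user on (1), that is, it directs the output to (4), to (23), or to both. The central data structure of Algorithms 1A and 1B is the codebook (11). The codebook (11) on the end-user client workstation (1) holds a listing of the specific properties of the particular tables to which that individual workstation (1) has been given access. The codebook listing consists of a set of (property, column) 2-tuples.

[0062]
This set may itself be completely partitioned into disjoint subsets of (property, column) 2-tuples, each subset being a view of a different table on the database server (12). A view may or may not list every column in a given table, depending on the access privileges assigned to the particular user. Typically, the columns excluded from a particular view of a table will consist of groups of columns pertaining to particular attributes of the table. A user can, however, just as easily be given access to only some of the columns pertaining to a particular attribute, so fine-grained control is readily achieved.

[0063]
The codebook can be implemented as a simple list of records having two fields, stored linearly or in a linked or doubly linked list in the memory (6). Many other algorithms well known to those skilled in the art are available that support both forward and reverse lookups and work efficiently on paired associations. Fast codebook lookup is important to the invention because any nontrivial database that uses the invention has many properties.

[0064]
In one preferred embodiment, the separate, disjoint subsets of 2-tuples of the codebook (11) that pertain to different tables are each collected and programmed as two separate associative arrays for fast lookup. That is, there are two associative arrays per table: one array for forward lookup from property to column number, and a second array for reverse lookup from column number to property. In an alternative preferred embodiment, all of the (property, column) 2-tuples of the codebook are used together as one set and are programmed for fast lookup using two associative arrays (one for forward lookup and one for reverse lookup), where each property field is constructed as a string consisting of the ordered pair (table name, property¹). Here property¹ denotes a property (that is, an ordered attribute-value pair) that may appear in multiple tables. A (property, column) 2-tuple therefore appears as a codebook entry with the format (table name, property¹, column). Since tables owned by different users may have exactly the same name, it should be noted that the table name field may itself be a compound component of the form table name = (owner, table name¹). Furthermore, as shown in FIG. 6B, multiple database servers can exist on the network. Property values may therefore effectively sit in a more extended hierarchy, for example address.database.owner.tablename¹.property¹. It will be seen that each codebook in effect constitutes a simple, special-purpose, single-user database.

[0065]
Algorithm 1A (15) converts plaintext property information into positional q-code column information. It does so by performing a "forward lookup" in the codebook (11) of the end-user client workstation. Algorithm 1B (19) converts positional q-code column information into plaintext property information. It does so by performing a "reverse lookup" in the codebook (11).

[0066]
Queries and other database operations are assembled at the end-user client workstation (1), as set out above, into SPARCOM database commands that contain positional q-codes. The database commands formulated in this way at the end-user client workstation are sent over the network to the database server (30). The database commands sent to the target database server (30) are thus ciphertext, because they contain positional q-codes. It is clear that the network traffic carrying database commands from the communication port (5) of the end-user client workstation (1) to the communication port (32) of the database server (30) is itself encrypted, since it contains the enciphered database commands. (This network traffic can of course be additionally encrypted using other forms of encryption, such as DES or RSA.)

[0067]
Referring to FIG. 5, in the preferred embodiment, Algorithm 2 (39) parses the database commands received at the communication port (32). The commands contain positional q-codes and are therefore ciphertext. Algorithm 2 (39) then stores the database commands in the database command storage buffer (37) and executes the commands found in the buffer (37), performing the operations they specify on the positional q-code database (35). Queries and other internal-level database operations are performed on the BPM data held on the database server computer (30) using the SPARCOM approach described in the background. This means performing operations on the compressed binary property matrices found in the q-code database (35) contained in the memory (33) of the database server computer (30). The database operations are thus performed directly on the encrypted data stored on the database server computer (30), without ever exposing the data in plaintext form. The output generated by executing commands on the positional q-code database (35) comprises compressed BPMs and operation status information, for example "transaction ID number", "success", or "failure", and is stored temporarily in the database response storage buffer (41). (The operation status information is not essential to the basic functioning of the invention; rather, it is standard database programming information that can be provided in the system. The use of a "transaction ID number" helps to keep track of individual transactions in a complex networked system. Other methods would serve the needs of the system equally well. The use of a "transaction ID number" is shown merely as one way of addressing this need.) The final step performed by Algorithm 2 (39) is to send the output generated by executing the database command back to the end-user client workstation (1) that originated the command.

[0068]
The network traffic carrying the output generated by executing database commands on the database server (30), sent from the communication port (32) to the communication port (5) of the end-user client workstation (1), is clearly itself encrypted, at least in cases where data is returned, because it contains compressed BPMs, which are positional q-code ciphertext. (As with the network traffic in the other direction, this traffic can of course be additionally encrypted using other forms of encryption, such as DES or RSA.)

[0069]
In an alternative embodiment, the end-user client workstations themselves perform database operations directly on the BPMs located on the database server computer. In this case the database server simply acts as a file server for the database. All SPARCOM database operations (for example, queries involving matrix multiplication) are performed by the end-user client workstations using their own CPUs and memory caches. The usual tradeoffs exist between the two approaches. Central processing of the data demands more of the central host machine. Remote processing of the data on networked workstations reduces the load on the central host computer and makes use of desktop processing power. It may, however, generate more network traffic, and it may cause further difficulties when many end-user client workstations try to change data on the database server computer at the same time.

[0070]
Example 1
The following annotated example illustrates how the SPARCOM range query example described in the background section is handled by the preferred embodiment of the invention. The uncompressed data representation (BPM) for the "Cust" relation is shown in FIG. 4.
1) The user issues a high-level database range query command at the input port (2) of the end-user workstation (1):
Select * from Cust where State in ('NY', 'NJ', 'CT')
2a) Algorithm 1A (15) reads and parses this input.
2b) Algorithm 1A (15) performs a forward lookup on the codebook (11) to determine the column numbers for the properties specified in the range query.

[0071]
The entries in the codebook (11) that pertain to the Cust table are as follows:
Name.Lynn 1
Name.Mark 2
Name.Bill 3
Name.Sam 4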
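Name.Liza 5
Name.Carl 6
Street.5 Oak 7
Street.6 Gunn 8
Street.2 Pine 9
Street.8 Main 10
Street.4 Main 11
City.Nyack 12
City.Union 13
City.Derby 14
City.Reno 15
City.Butte 16
State.NY 17
State.NJ 18
State.CT 19
State.NV 20
State.MT 21
2c) Algorithm 1A (15) constructs a query vector from the column information obtained by the codebook lookup for the specified properties. The query vector constructed by the algorithm is a binary vector with 1s in the array elements whose indices correspond to the column numbers found by the codebook lookup. All other array elements are 0s. Algorithm 1A (15) thus builds the appropriate query vector, containing the column numbers equivalent to NY, NJ, and CT. In uncompressed form, the query vector QV is:
QV = (000000000000000011100)
As with BPMs, however, it is of course possible, and desirable, to generate the query vector QV directly in compressed form.

[0072]
The preferred way to generate the query vector in compressed form is to allocate an array whose length equals the number of nonzero elements (1s) of the query vector. The indices (that is, the column numbers) of the nonzero query vector elements are entered sequentially into the compressed representation of the vector. (Alternative ways of compressing the query vector are of course possible. For example, the vector can be represented as a linked list of the indices of its nonzero columns.) The query vector QV is therefore represented as:
QV = (17, 18, 19)

[0073]
The range query is then constructed, and a "threshold" is specified by Algorithm 1A (15) on the end-user workstation (1). (The threshold, "1" in this case, is used to determine which entries of the response matrix generated by the query indicate that the corresponding row of the queried BPM meets the selection criteria.) The query consists of four fields: 1) the op-code (operation code), 2) the table identification number (identifying the BPM), 3) the query vector QV, and 4) a unique transaction identification number. Suppose that the database server computer (30) is designated "1" (note that there may be many database servers), that user "6" is the owner of the CUST table, which has identification number "38", and that client 4 (the end-user workstation that formulates the range query) generates the unique transaction ID "client4.id185". The query then contains the following data:
Op-code = "Range Query, Threshold value=1", Table ID = 1.6.38, Query Vector = (17,18,19), Transaction ID = client4.id185

[0074]
Using a more compact representation of the op-code, in this case "RQ1" for "Range Query, Threshold value=1", the query constructed by Algorithm 1A (15) is:
RQ1 1.6.38 (17,18,19) client4.id185
where spaces are used to delimit the fields. Obviously, other delimiters would work equally well, and the order of the fields is merely a matter of convenience. It is also easy to generalize the table identification number to cover the case of multiple databases on the same database server computer. For example, 1.3.6.38 could be used to indicate that the database operation refers to database server computer "1", database "3" (that is, the third database on database server computer "1"), user "6", and table "38".

[0075]
2d) Algorithm 1A (15) on the end-user workstation (1) then sends the above range query from the communication port (5) to the database server computer (30) designated "1". The algorithm also stores the query that was sent, together with other relevant context information, such as information on where the response output for the query should be directed, in the context storage buffer (13).
3a) Algorithm 2 (39), in the memory (33) of the database server computer (30) designated "1", then receives the instruction (sent by Algorithm 1A (15) in the previous step) on the communication port (32) of the database server computer (30).

[0076]
3b) Algorithm 2 (39) parses the instruction and executes the specified command. The op-code in this case is "RQ1", meaning "Range Query, threshold = 1"; the instruction refers to table 38 and supplies the query vector QV = (17, 18, 19). Algorithm 2 (39) therefore performs the matrix multiplication

$RM = BPM_{38} \times (17, 18, 19)^{T}$

where RM is the response matrix, Table 38 = BPM_38, and (17, 18, 19)^T is the transpose of the query vector QV = (17, 18, 19). The annotated FIG. 4 described in the background section shows the uncompressed representation of this matrix multiplication.

[0077]
In one preferred embodiment of the invention, table 38 (that is, BPM_38) is implemented as a compressed BPM using the single index compression scheme described in the background section. Table 38 (that is, BPM_38) is therefore represented as a single vector together with its dimensions:
(1,7,12,17,25,32,34,41,44,52,56,61,68,71,79,84,87,93,99,102,111,112,117,122) Dim(6,21)
Similarly, the response matrix RM is represented as a vector together with its dimensions:
(1,3,5,6) Dim(6,1)
(In this case there is no need to store the values of the nonzero entries of the response matrix, because by the nature of the query they are all equal to 1.)

[0078]
The response matrix RM is then used to select from table 38 the rows that satisfy the selection criteria. (In practice this is done "on the fly", so that the response matrix RM never actually needs to be stored explicitly.) The op-code "RQ1" indicates a range query operation with threshold 1, so the row numbers of the entries in the response matrix that meet the threshold of 1 identify the rows of BPM_38 that satisfy the range query. Hence, as shown in FIG. 4, rows 1, 3, 5, and 6 are selected as matching the range query. (As noted in the background section, different queries have different thresholds.) Algorithm 2 (39) generates a new matrix, BPM_RESPONSE, consisting of the four rows of BPM_38 that satisfy the range query. BPM_RESPONSE is generated in compressed form; using the single index compression scheme it is represented as:
(1,7,12,17,23,31,35,40,45,51,57,60,69,70,75,80) Dim(4,21)
BPM_RESPONSE is stored temporarily in the database response storage buffer (41).

[0079]
3c) Algorithm 2 (39) next sends the encrypted response from the database server computer (30) (designated database server computer "1" in this case) via the communication port (32) to the communication port (5) of the end-user workstation (1) that issued the instruction just processed, in this case client 4. The encrypted response consists of four fields: 1) the vector from the single index compressed representation of BPM_RESPONSE; 2) the ordered pair giving the dimensions of BPM_RESPONSE; 3) the transaction ID "client4.id185", which was originally sent with the range query solely for identification; and 4) an operation status code indicating successful completion of the specified RQ1 operation. Fields 1) and 2) are taken from the database response storage buffer (41), field 3) is taken from the database command storage buffer (37), and field 4) is generated directly by Algorithm 2 (39). If "RQ1AA" represents the appropriate status code, the encrypted response is:
(1,7,12,17,23,31,35,40,45,51,57,60,69,70,75,80) (4,21) client4.id185 RQ1AA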
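The server-side steps of the range query example above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the function names (`sis_position`, `sis_coords`, `range_query`, `select_rows`) are hypothetical, a binary matrix is assumed to be held as a sorted list of 1-based single index positions k = j + (i − 1)·n, and a row is treated as matching when its count of shared properties is at least the threshold (the text describes the threshold test in terms of equality, which gives the same result for this single-valued attribute).

```python
from typing import Dict, List, Sequence, Tuple

Dim = Tuple[int, int]  # (number of rows m, number of columns n)


def sis_position(i: int, j: int, n: int) -> int:
    """Single index position of element (i, j), 1-based: k = j + (i - 1) * n."""
    return j + (i - 1) * n


def sis_coords(k: int, n: int) -> Tuple[int, int]:
    """Inverse of sis_position: recover the 1-based (row, column) from k."""
    i, j = divmod(k - 1, n)
    return i + 1, j + 1


def range_query(bpm: Sequence[int], dim: Dim, qv: Sequence[int],
                threshold: int = 1) -> List[int]:
    """Multiply a compressed BPM by a compressed query vector.

    Only nonzero entries are visited. Returns the 1-based numbers of the
    rows whose response-matrix entry reaches the threshold.
    """
    m, n = dim
    wanted = set(qv)
    counts = [0] * m  # uncompressed response matrix (a column vector)
    for k in bpm:
        i, j = sis_coords(k, n)
        if j in wanted:
            counts[i - 1] += 1
    return [i for i, c in enumerate(counts, start=1) if c >= threshold]


def select_rows(bpm: Sequence[int], dim: Dim,
                rows: Sequence[int]) -> Tuple[List[int], Dim]:
    """Build the compressed response BPM from the selected rows, renumbered 1..len(rows)."""
    _, n = dim
    new_row: Dict[int, int] = {old: new for new, old in enumerate(rows, start=1)}
    out: List[int] = []
    for k in bpm:
        i, j = sis_coords(k, n)
        if i in new_row:
            out.append(sis_position(new_row[i], j, n))
    return out, (len(rows), n)


# Data taken from the example in the text: table 38 in single index form.
BPM_38 = [1, 7, 12, 17, 25, 32, 34, 41, 44, 52, 56, 61, 68, 71, 79, 84,
          87, 93, 99, 102, 111, 112, 117, 122]
DIM_38 = (6, 21)

matching_rows = range_query(BPM_38, DIM_38, [17, 18, 19], threshold=1)
bpm_response, response_dim = select_rows(BPM_38, DIM_38, matching_rows)
```

Run on the example data, `matching_rows` comes out as rows 1, 3, 5, and 6, and `bpm_response` with `response_dim` reproduces the compressed BPM_RESPONSE vector and Dim(4,21) given in the text. Note that the server-side code handles only column numbers and never needs the codebook.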
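In the encrypted response above, spaces delimit the fields, as before.
[0080] 4a) Algorithm 1B (19) on the end-user workstation (1), in this case client 4, processes the encrypted response sent from communication port (32) of the database server computer (30) (designated in this case as database server computer "1"). The algorithm reads the encrypted response received on communication port (5) and stores it temporarily in the database response storage buffer (17). Algorithm 1B (19) then decodes the received encrypted response using the following steps. 1) Check the operation status code; since RQ1AA indicates success, continue. 2) Check the transaction ID. The transaction ID is used to locate the context information specified for this transaction in the context storage buffer (13); the context storage buffer indicates that this transaction applies to the "Cust" relation. 3) Decode BPM_RESPONSE (given in single-index compressed form in the first two fields of the encrypted response) by reverse lookup in the codebook (11), determining the properties present in each row of BPM_RESPONSE. The reverse lookup is performed on the codebook entries that apply to the "Cust" relation and finds the (plaintext) property equivalent of each column number specified by BPM_RESPONSE.
[0081] 4b) Algorithm 1B (19) checks the context information specified for this transaction in the context storage buffer (13) to determine how the decoded data generated in step 4a above is to be routed and formatted. The context information specifies whether the output is to be sent directly to the output port (4) or to the auxiliary user storage area (23); it can also specify that the output be sent to both. Alternatively, the context information can specify that the output be passed to another process or application residing in memory (6) of the end-user workstation (1). Assuming the data format is of the usual type, the output generated by Algorithm 1B (19) is as follows:
Lynn 5 Oak Street Nyack, NY
Mark 8 Main Street Derby, CT
Bill 2 Pine Street Reno, NJ
Carl 5 Oak Street Nyack, NY
[0082] Insert, update, and delete operations are performed with the invention in a manner similar to the range query example above (Example 1), with the obvious difference that data in the Q-code database (35) in memory (33) of the database server computer (30) is changed. When insert and update operations are performed using the SPARCOM method, additional columns must be supplied to a database table (BPM) whenever new properties are specified. To our knowledge, this is not addressed in the prior art. The preferred way of handling this problem is for the creator of any table to specify the number of columns the underlying BPM is to have. New properties are then assigned column numbers so that they can be introduced into the table (BPM) without re-dimensioning the columns of the BPM. Column numbers with no assigned property are placed in an "available column storage pool" and are removed from it as needed, in assigned order (sequentially or randomly). A default number of columns is supplied by the DBMS when the table's creator supplies no value.
[0083] Even though only nonzero values are stored in a BPM, the number of columns in the BPM clearly affects the size of its compressed representation. The size of a compressed BPM is roughly proportional to the number of bits required to represent a column number; for example, in a BPM with 65,536 (2^16) columns, each column number can be represented in 16 bits on many basic computer hardware architectures.
[0084] Presetting the number of columns in a BPM before new properties are introduced (by either insert or update operations) makes insert and update operations in a SPARCOM database easier to perform, particularly when the single-index compression scheme is used. For example, consider the single-index representation of the BPM of Figure 4:
(1,7,12,17,25,32,34,41,44,52,56,61,68,71,79,84,87,93,99,102,111,112,117,122) Dim(6,21)
Inserting into this BPM a new row with the new property information (Ann, 6 Gulf Road, Tampa, FL) requires adding four new property columns. The original shows the modified uncompressed BPM at this point [matrix not reproduced in this text]; the new row is inserted as the last row of the modified BPM, and the four new property columns are columns 22-25. The single-index representation of the modified BPM is:
(1,7,12,17,29,36,38,45,52,60,64,69,80,83,91,96,103,109,115,118,131,132,137,142,172,173,174,175) Dim(7,25)
[0085] Clearly, in this example, because adding the new properties changes the number of columns in the BPM, the index values of the nonzero ("1") entries must be recomputed for the single-index representation. For instance, the entry in row 2, column 4 moves from index (2 − 1) × 21 + 4 = 25 to index (2 − 1) × 25 + 4 = 29. Presetting the number of columns to a large number before data is inserted into the BPM eliminates the need to recompute the indices of the nonzero entries under the single-index compression scheme.
[0086] Joins
The join is a particularly important operation in relational database systems. One skilled in the art can implement join operations in a SPARCOM DBMS (i.e., a DBMS that uses the SPARCOM method to structure and manipulate database information), and the invention can therefore be used on such a system. Nevertheless, two useful and non-obvious systems and methods for numbering the columns of a SPARCOM database, which make natural joins (also known as equijoins) easier to construct, are given below.
[0087] A preferred method and system for implementing a SPARCOM database combines all of the relations of the database into a single "Database Binary Property Matrix", or "DBPM". The DBPM combines all the properties (i.e., columns) present in the relations of the database (preferably in SPARCOM normal form), merges the columns that apply to the same property, and adds one column for each relation incorporated into the DBPM. Each row of the database therefore applies to a particular database relation (preferably in SPARCOM normal form), and the relation to which a row applies is indicated by the presence of a "1" in the column associated with that relation. A second preferred method (and system) for implementing a SPARCOM database system involves a virtual implementation of the DBPM. It uses the same column-numbering scheme obtained with the first embodiment but, as in Ashany's original formulation of the SPARCOM method, maintains a separate BPM for each relation. It should be noted that in the second preferred embodiment all BPMs have the same total number of columns, although of course only the nonzero (i.e., "1") entries are stored and operated on, in compressed form. Also, in the second method, because all rows of a given BPM clearly pertain to one relation, there is no need to maintain a column indicating the relation to which an individual row pertains.
[0088] A SPARCOM DBMS can easily assist in numbering the columns of a SPARCOM database according to this single numbering system. The DBMS maintains a counter and simply assigns each new property added to the database a new column number equal to the incremented value of the counter. Schemes for assigning unique column numbers within a SPARCOM database that are more complex to program, or even simpler, can obviously also be used. Presetting the number of columns for the DBPM and placing all column numbers in an "available column number pool" when the database is created permits the use of a random number generator to select column numbers "at random" from the range of numbers available in the pool. In the present invention, identical column numbers can be assigned to identical properties in different relations on the end-user workstation (1), by checking the codebook (11) to determine whether the property already exists, where a property denotes an attribute-value pair as defined above.
[0089] A natural join involving non-null attributes of relations is accomplished, on a SPARCOM database whose columns are numbered according to the above scheme, by multiplying the BPMs of the two relations involved in the natural join. Before the multiplication, a projection should first be performed on one of the relations to eliminate all attributes not involved in the join. The multiplication is then performed using this "cleaned" relation.
[0090] With the universal numbering scheme specified here, all BPMs in the database have the same number of columns, which makes the BPMs of all relations conformable for multiplication with one another once one matrix or the other has been transposed. Moreover, because identical properties in different relations share the same column number under the universal numbering scheme, the positions of the nonzero ("1") entries in the response matrix produced by the multiplication indicate which rows of the BPMs involved are to be joined. The response matrix obtained from the matrix multiplication is thus used to select the rows of the two original BPMs that are to be joined to each other. Eliminating, before the matrix multiplication for the join is performed, the common attributes that are not involved in the natural join (i.e., the "cleaning" referred to above) prevents "false positives" from appearing in the response matrix. Note, however, that the response matrix indicates which rows of the original BPMs are to be joined; that is, it is the BPMs of the relations themselves that are joined.
[0091] Clearly, using this scheme to number the columns of a SPARCOM database significantly increases the size of the BPMs in uncompressed form. Because, once again, BPMs are of course stored in compressed form, the actual impact on the size of the stored data is very small. The following example illustrates how a natural join is performed on SPARCOM-structured data when column numbers are assigned as described here. The data sets used are very small, for purposes of illustration.
[0092] Example 2
Consider the BPM given for the "Cust" relation in Figure 4 and the BPM given for the "SalesRep" relation in Figure 7. These two BPMs can be modified so that they have the same number of columns and so that all properties the two relations have in common use the same column numbers. The two modified BPMs for "Cust" and "SalesRep" are shown in Figure 8 and Figure 7B, respectively. An SQL statement that joins these two relations on the "state" attribute is:
Select * from Cust c, SalesRep s where c.state = s.state;
[0093] Project the "SalesRep" relation so as to select only the "state" properties, obtaining the BPM of Figure 7C. Transpose this BPM to obtain the BPM of Figure 7D. Multiply the BPM of the "Cust" relation (Figure 8) by the transposed BPM of the "SalesRep" relation (Figure 7D) to obtain the response matrix BPM_9 of Figure 9. BPM_9 has three nonzero ("1") entries: (1,1), (1,3), (2,2). These entries identify the rows of the original BPMs (shown in Figure 8 and Figure 7B) that are to be joined: row 1 of BPM_8 with row 1 of BPM_7B, row 2 of BPM_8 with row 2 of BPM_7B, and row 6 of BPM_8 with row 1 of BPM_7B.
[0094] Key exchange
In the present invention, each end-user client workstation can interpret the BPM data on the database server computer only to the extent that it holds the corresponding codebook information. Conversely, an end-user client workstation cannot interpret the meaning of columns for which it has no codebook entries. For BPM database information to be accessible from two or more end-user client workstations, the codebooks of those workstations must contain entries for the properties to which access is shared. The invention therefore requires some mechanism or method for securely distributing codebook information (in whole or in part, as the case may be) so that database information can be shared. When a new property is added to a table from an end-user client workstation, the other end-user client workstations permitted to access this information must have their codebooks updated with the entry for the new property. The exchange of codebook entries is clearly a key-exchange problem. Codebook entries are cryptographic keys; transferring the details of codebook entries is therefore an exchange of cryptographic keys.
[0095] How to exchange cryptographic keys securely is a well-known problem that has been successfully addressed by many protocols and methods. Codebook update information is distributed either directly (peer to peer) or through a trusted intermediary.
[0096] When the exchange of codebook entries is handled peer to peer using public-key cryptography (e.g., RSA), the key-sharing mechanism (or algorithm) of an end-user client workstation (called the "sending station" in the steps below) sends a codebook update to the other end-user client workstations by the following steps.
Step 1) Check which end-user client workstations are entitled to access the database information associated with the codebook entries being distributed. In one preferred embodiment of the invention, this information is maintained locally. In another preferred embodiment, it is held remotely on a trusted third-party computer.
Step 2) The codebook entries are digitally signed (encrypted) using the sending station's private key.
Step 3) The digitally signed (i.e., encrypted) codebook entries are then encrypted using the public key of the end-user client workstation authorized to receive the codebook update.
Step 4) The appropriate codebook update is sent from the sending station to the other end-user client workstations authorized to receive it.
[0097] An end-user client workstation receiving the encrypted codebook update (called the "receiving station" in the steps below) processes the codebook update from the sending station by the following steps.
Step 1) Receive the public-key-encrypted codebook update. Check whether the sending station is authorized to supply the update. If it is, proceed to the next step; otherwise, report that a security breach has occurred.
Step 2) Decrypt the received codebook update using the private key (i.e., the receiving station's private key). This yields (presumably) another encrypted message, consisting of the codebook update encrypted with the sending station's private key.
Step 3) Decrypt the ciphertext obtained in the previous step using the sending station's public key, confirming the origin of the received codebook update. If the received update is authentic (i.e., the sending station is permitted to provide codebook updates for the particular relations specified), the receiving station proceeds to the next step; otherwise, it reports that a security breach has occurred.
Step 4) The receiving station's codebook is updated with the received information.
[0098] A low-tech method of exchanging keys securely is for the user of an end-user client workstation to carry a diskette containing the appropriate codebook update in person to the other users authorized to access the q-code column information being transferred. This is slow but nonetheless effective. For additional security, the contents of each diskette can be encrypted with the intended recipient's public key, so that only the intended recipient can use the data.
[0099] A "trusted key server" can also be used to distribute codebook information. In this case, an update is first sent to the trusted key server, which checks its authorization database and forwards the encrypted codebook update to the end-user client workstations authorized to hold the information. Figure 10A shows a configuration of the invention that includes a trusted key server. It is worth noting that the trusted key server need not be given "complete trust". For example, it need not be granted access to any database server computer on the network, and it need serve only as a conduit for distributing keys. A DBA (database administrator) can therefore administer the trusted key server and define database tables without having access to the data. Multiple trusted key servers can be used in the same way. Figure 10B shows a configuration of the invention that includes multiple trusted key servers as well as multiple database server computers.
[0100] Cryptographic enhancements
In-place strengthening of the Q-code
Several methods that can be used with this invention to alter the apparent statistical frequencies of properties on the database server computer are described in detail below. Using these methods makes cryptanalysis of the data on the database server computer (i.e., the BPMs) more difficult.
[0101] 1) Dummy columns
Meaningless columns can be added to a BPM, with "1"s and "0"s placed in any manner, for example at random or as a function correlated with some of the "1"s in the current row. End-user client workstations need not be given codebook updates in order to continue accessing their information; the introduced data serves only to make cryptanalysis of the BPM more difficult. Even so, sending out "codebook updates containing dummy columns" is quite useful, because it hinders others from distinguishing "true" BPM updates from "false" ones.
[0102] 2) Dummy rows
Meaningless (or false) rows of information can be added to the database. End-user client workstations must be able to recognize dummies and ignore them when performing database operations. The preferred way of handling dummy rows is to provide BPMs that have dummy rows with "dummy-row marker columns". Every dummy row then has a "1" in at least one dummy-row marker column. End-user client workstations authorized to access a BPM containing dummy rows are given the codebook information for the dummy-row marker columns of that BPM. Database operations first check whether a particular row contains a dummy-row marker and, if it does, ignore that row.
[0103] 3) Column splitting
This method is used to equalize property frequencies. For example, if it is known that 80% of all soldiers are male, four columns are used to record the property "male" for every one column used to record the property "female". Even when the frequencies of the various properties of a given attribute do not vary, multiple columns can be used for a property in order to skew the actual statistical frequencies or simply to obscure the relationship between properties and columns.
[0104] In one of the most extreme forms of column splitting, each column is used for only one instance of a property. If a second instance of the property must be added to the BPM, a new column must be added for it. As an example, consider the following four records:
Martha, female, blue eyes, 5'6", 120 lb
George, male, blue eyes, 6'1", 190 lb
George, male, brown eyes, 5'6", 190 lb
Lisa, female, brown eyes, 5'6", 120 lb
[0105] These records are represented by the following BPM (or by some column permutation of it), annotated to make the meaning of the columns clear [matrix not reproduced in this text]. From an information-theoretic standpoint, a BPM structured in this way is remarkably secure. It should be noted that query operations can still be performed on this BPM.
[0106] A trusted key server can be permitted access in order to monitor the frequencies of the various properties of the various attributes and to direct the splitting of columns when some threshold is exceeded. Alternatively, column splitting can be coordinated by the end-user client workstations, since they can of course compute the frequencies of the properties to which they are permitted access.
[0107] 4) Column offsets
All indices in the compressed representation of a BPM are offset from their actual values. Different BPMs are given different offsets. Random (otherwise meaningless) data is generated to fill the columns whose index values are less than the offset applied to a given BPM. The offset can be applied by any formula from which the original values are easily computed, so that the original index values can be used in database operations. As a very simple example, referring again to Figure 4, if an offset of +5 is applied to the BPM represented in the single-index compression scheme, the BPM is represented as:
(6,12,17,22,30,37,39,46,49,57,61,66,73,76,84,89,92,98,104,107,116,117,122,127) Dim(11,26)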
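The +5 offset example can be reproduced with a minimal sketch. The function names `apply_offset` and `remove_offset` are hypothetical, and the choice to offset both dimensions as well as every index is inferred from the Dim(11,26) shown in the example rather than stated as a rule in the text.

```python
from typing import List, Tuple

Dim = Tuple[int, int]  # (rows, columns)

# Single-index representation of the BPM of Figure 4, copied from the text.
FIG4 = [1, 7, 12, 17, 25, 32, 34, 41, 44, 52, 56, 61, 68, 71, 79, 84,
        87, 93, 99, 102, 111, 112, 117, 122]
FIG4_DIM: Dim = (6, 21)


def apply_offset(indices: List[int], dim: Dim, offset: int) -> Tuple[List[int], Dim]:
    """Shift every stored index, and both dimensions, by a constant offset."""
    m, n = dim
    return [i + offset for i in indices], (m + offset, n + offset)


def remove_offset(indices: List[int], dim: Dim, offset: int) -> Tuple[List[int], Dim]:
    """Undo the shift; entries that map to an index below 1 are filler and dropped."""
    m, n = dim
    real = [i - offset for i in indices if i - offset >= 1]
    return real, (m - offset, n - offset)


shifted, shifted_dim = apply_offset(FIG4, FIG4_DIM, 5)
print(shifted_dim, shifted[:4])

# Random filler placed below the offset is discarded when the offset is removed.
restored, restored_dim = remove_offset([2, 4] + shifted, shifted_dim, 5)
```

A constant shift is the simplest invertible formula; the text allows any formula from which the original index values can easily be recovered.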
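[0108] The offset information for a BPM that has an offset must be securely distributed to the end-user client workstations authorized to access that BPM. Operations involving the BPM must also be adapted to take its offset into account, explicitly discarding or ignoring data in columns whose index values are less than the offset. When a configuration with one or more trusted key servers is used with this invention, the offset information for a given BPM need not be transmitted through the same trusted key server that is used to distribute the codebook updates for that BPM. The offset information can be delivered either from a different trusted key server or directly between end-user client workstations. Database commands issued from an end-user client workstation to the database server also include the column offset information.
[0109] Encrypting the dimension information of compressed sparse matrices
Additional security is easily provided by encrypting the dimensions of the compressed BPMs used in this invention. Bitmap compression of sparse matrices and the single-index and double-index compression schemes all require that the dimensions of the matrix be specified. Other sparse-matrix compression schemes likewise require the matrix dimensions to be specified for the compression. Simply encrypting the data that specifies the dimensions of a compressed sparse matrix increases the cryptographic security of the encoded matrix at almost no implementation cost.
[0110] For example, in the single-index compression scheme, a BPM A consists of exactly two components: 1) the two-tuple specifying the dimensions of the BPM, and 2) the vector v specifying the positions of the nonzero elements of the BPM. In the single-index scheme, the elements of A are numbered sequentially in one dimension. Knowing the number of columns of A is therefore crucial to interpreting which row and column of A each element of the vector v represents. Because the BPMs of a SPARCOM database are generally very large, Figure 11 shows, for illustration, the linear numbering of element positions in a 5 × 8 matrix, together with the single-index representation of an example BPM of the same dimensions.
[0111] In the example of Figure 11, without knowing that the dimensions of the matrix are Dim(5,8) (in particular, that the number of columns is 8), one cannot know, for example, that the fourth element of the vector v, which has the value 13, means that BPM A has a "1" at coordinates (2,5). Similarly, one cannot know that the tenth element of v, which has the value 37, means that A has a "1" at coordinates (5,5). And of course, without knowing that A has a "1" at both (2,5) and (5,5), one cannot know that the second and fifth records of BPM A agree on whatever property the fifth column of the BPM represents. Any encryption scheme (preferably a strong one) can be used to encrypt the dimensions of the compressed sparse matrices held in database storage.
[0112] 6) Permuting BPM columns
Permuting the columns of a BPM is a way of changing the keys needed to access the information. Column permutation can be accomplished in many ways. A preferred way is for the owner of the table to perform the task on his or her own end-user client workstation by the following steps.
Step 1) Download the table (BPM) to the owner's end-user client workstation.
Step 2) Permute the columns randomly. (A program using a pseudo-random number generator can be used to help select the column ordering, or a program coupled to a physical source of random numbers can be used for the same purpose.)
Step 3) Delete the original BPM from the database server computer.
Step 4) Upload the newly permuted BPM to the database server computer in place of the original BPM.
After BPM A has been permuted in this way, it is of course necessary to provide codebook updates to the users authorized to access the data in BPM A.
[0113] If the universal numbering scheme for columns specified above is used to facilitate natural join operations, then other BPMs that have properties in common with BPM A must have their columns permuted in a manner consistent with the property column-number assignments of BPM A. Clearly, chained BPM permutations are necessary to keep property column numbers consistent across the web of common properties present in the BPMs of the database.
[0114] Variations on the distributed database architecture of the invention
The distributed database architecture first described for this invention is shown in Figure 6A. Other distributed database architectures are shown in Figures 6B, 10A, and 10B. There are clearly many other ways of distributing the components of the distributed database architecture of this invention that are consistent with the invention described here. One noteworthy additional form has some or all of the individual end-user workstations on the network house portions of the available SPARCOM data. Under this scheme, an end-user workstation would access SPARCOM database information located on other end-user workstations, instead of on one or more distinct database-only SPARCOM servers.
[0115] Property independence of the database
A practical advantage of the invention is that it provides property independence for the database. The compressed BPMs used in this invention record only whether a property is present or absent; it is the end-user client workstation that actually specifies the content of each property to which it has access. On different end-user client workstations, the codebook entries for a given property (i.e., a BPM column number, or a set of column numbers if "column splitting" is used) can contain different interpretations of that property. For example, two different codebooks referring to the same column of a given BPM can contain entries with the same meaning in different natural languages, such as English and Japanese. In contrast to attribute-oriented databases, the compressed BPM data used in this invention is completely free of natural-language bias. The property independence provided by this invention applies equally to more complex data objects, such as images, video, and sound, as well as references to objects of these types. For example, two different codebooks referring to the same BPM can even contain entries with different data types for the same property: one codebook may specify a text value for a given property, while the other specifies an audio or image file for the same BPM column number.
[0116] Although the invention has been described in detail in the foregoing embodiments for the purpose of illustration, it is to be understood that such detail is solely for that purpose, and that variations can be made by those skilled in the art without departing from the spirit and scope of the invention, except as it may be described by the following claims.
DETAILED DESCRIPTION OF THE INVENTION
Method and apparatus for securely storing data
[0001] Industrial applications
The present invention relates to a method for securely storing data. More specifically, the invention relates to securely storing meaningfully encrypted data in a manner that does not require the data to be decrypted.
[0002] Background of the invention
Database systems are required to maintain the integrity and confidentiality of various data sets, so that only authorized individual users or groups of users are allowed to access and manipulate the data.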
This requirement is generally handled through user authentication controls. Audit trails are kept as well, recording, at least in theory, which information a user accessed and when the access took place. Among other purposes, audit trails are intended to establish accountability for accessing and manipulating data on the database system, and thereby help to deter improper access to or manipulation of that data. Access controls and audit trails are useful and prudent mechanisms for supporting the confidentiality of database systems, but database systems that use these methods remain vulnerable to breaches of confidentiality. A fundamental area of remaining vulnerability is the people who administer the system. Anyone with authority to administer the system can turn off, erase, or alter audit trail records. Because of the limitations of current technology, system administrators, who are required to do their work effectively, hold special privileges to access the information in the database. The chief of these limitations is that current database systems lack effective and efficient security through data encryption. Furthermore, when access controls are defeated, for example by a computer hacker, the confidentiality of the data is at risk because the database is not protected by encryption.
[0003] Problems with current methods
It is widely known that using existing cryptographic techniques to improve database security degrades database performance. Because strong encryption methods alter the data structures in the database, many types of query operations and other DML (database manipulation language) operations are severely affected, since the data must first be decrypted before the database can be operated on. In addition, decrypting the data with existing encryption methods exposes the information in plaintext.
[0004] Database encryption differs from communication encryption in many respects.
The differences between database encryption and communication encryption have been discussed in several places, including the following, which are incorporated by reference as part of this application: [Gudes, E. "The Application of Cryptography to Data Base Security." Ph.D. Dissertation, Ohio State University, 1976]; [Gudes, E., H. S. Koch, and F. A. Stahl. "The Application of Cryptography for Data Base Security." In Proceedings of the National Computer Conference, AFIPS Press, 1976, pp. 97-107]; [Seberry, J. and J. Pieprzyk. "Cryptography: An Introduction to Computer Security." New York: Prentice-Hall, 1989, pp. 233-259]. Databases are designed so that multiple users can access stored, shared data and issue queries and operations against it. Those users typically have different rights to the various types of stored information. What is emphasized here is that stored data shared under varying access rights poses a problem entirely different from the general problem of simple communication, because in simple communication the communicating parties do not usually operate on common, shared, stored data.
[0005] A database must permit selective and unpredictable access to the records that belong to it. Compared with simple communication, where message handling of this kind is not an issue, this places additional restrictions on the kinds of encryption that can be effectively applied to a database. Meeting all the requirements of database encryption is clearly more difficult than meeting the straightforward requirements of file encryption.
[0006] For almost all databases, creating indexes is essential to obtaining adequate query performance. No general method is known for using an index effectively while it remains encrypted when database information is accessed. For an index to be usable, therefore, it must be unencrypted.
[0007] Overview of database encryption
Gudes [Gudes, E. "The Application of Cryptography to Data Base Security." Ph.D. Dissertation, Ohio State University, 1976] and Gudes, Koch, and Stahl [Gudes, E., H. S. Koch, and F. A. Stahl. "The Application of Cryptography for Data Base Security." In Proceedings of the National Computer Conference, AFIPS Press, 1976, pp. 97-107] pointed out three fundamental constraints that distinguish database encryption from communication encryption. These constraints severely limit the kinds of encryption that can be effectively applied to a database. First, the method must allow data to be retrieved selectively and efficiently. Because the data in a database is arranged to facilitate these operations, it is desirable that encrypting and decrypting an individual record not involve other records. Second, data usually remains in a database for a long time. If the data is encrypted and the encryption key must be changed, the data has to be re-encrypted with the new key. Third, there is the "processing problem". It would be very convenient if database operations could be performed directly on encrypted data, handling it in the same way as unencrypted data. This would not only eliminate the overhead of encrypting plaintext and decrypting ciphertext, but would also mean that the data never exists in plaintext at any point in the data manipulation cycle where it could be exposed, which improves data security.
[0008] Gudes and Gudes, Koch, and Stahl point out that a multi-level model is the most appropriate one for addressing database encryption. A database is viewed as having multiple levels: data exists in the form of data structures, can be referenced according to this scheme, and is managed as if mappings exist between the levels.
These mappings actually define data transformations, and because such transformations are a natural feature of database design, they can be used and extended to provide cryptographic security functions. Their analysis examined the various kinds of encryption available between adjacent levels of a multi-level database structure. Gudes et al. defined their own database structure with several levels, both physical and logical. This multi-level database structure is used to highlight the possibilities for encryption between different levels of the database. They emphasize that, in a database, data exists in different formats on different physical media (disk, memory, display) and therefore has different physical levels, each of which has a corresponding logical meaning. Various kinds of cryptographic transformations are feasible between the various levels of this structure. The logical levels pertain to the formats of database records at the various levels, while the physical levels consist of the specific data in the formats defined by the logical records. Typically, any number of physical records exist for each defined logical record. Although the work of Gudes and of Gudes, Koch, and Stahl was published in 1976, their model nonetheless accommodates distributed computer architectures, since it is understood that the various levels of a multi-level database structure can be placed in physically separate locations.
[0009] Gudes et al. defined five logical levels in the database structure:
1) User logical level
2) System logical level
3) Access logical level
4) Storage level (or organized storage level)
5) Unorganized storage level
One or more physical levels can be assigned to the different logical levels, depending on the number of physical media.
Whether there is a correspondence between logical and physical levels depends on the details of the implementation.
[0010] Gudes et al. describe in detail the particular types of cryptographic transformation to be applied in the mappings between adjacent levels of the database architecture. A more recent summary and analysis of the work of Gudes, Koch, and Stahl is given in the computer security literature by Seberry and Pieprzyk [Seberry, J. and J. Pieprzyk. "Cryptography: An Introduction to Computer Security." New York: Prentice-Hall, 1989, pp. 233-259].
[0011] The encryption achieved by the multi-level database structure of Gudes et al. concerns data as it is stored in the computer system. Multi-level encryption does not really address the question of how encrypted data is to be manipulated. The multi-level encryption of Gudes et al. does not make it possible in all cases to operate directly on encrypted data elements. Encrypted data must first be converted to plaintext before an item is accessed.
[0012] A fundamental limitation facing database encryption is that data must be stored in a form that is convenient to operate on. Encryption applied between adjacent levels of the database is limited by this requirement. Ultimately, data must be stored at a low level in a form that allows easy retrieval. If this is not possible, additional work is required, because the data must be reconstructed into a form that can be queried.
[0013] Combining different forms of cryptographic transformation can yield very strong encryption. In a naturally formulated multi-level database structure, however, the limited number of database levels available imposes severe restrictions on secure queries. Security through encryption can significantly alter database performance, and in the end unencrypted data is obtained from the database.
[0014] The concept of a multi-level database structure is well known.
Various terms relating to multi-level database structures have been defined. The "ANSI/SPARC" multi-level database architecture is the most widely recognized model of a multi-level database structure. Date's description of the ANSI/SPARC architecture [Date, C. J. "An Introduction to Database Systems." 5th ed. New York: Addison-Wesley, 1990, Vol. I, Chapter 2, pp. 31-54] is hereby incorporated by reference as part of this application, as background to the present invention.
[0015] Encryption / Q-code
There are two main branches of cryptography: encipherment and q-coding. Each deals with a different kind of cryptogram, namely ciphers and q-codes. Encipherment involves transforming individual symbols, or groups of symbols, of an alphabet. Symbols here include, for example, uppercase letters, lowercase letters, digits, and punctuation. Encipherment transformations are applied to each symbol or group of symbols in a completely general way, without special rules for handling semantic units. A text of arbitrary, meaningless symbols is as easy to encipher as a meaningful sentence. The units of text on which encipherment operates are meaningful only by chance. A q-code, by contrast, involves replacing textual units that have clear semantic content, such as words, phrases, and sometimes entire sentences. Any nontrivial q-code requires a large code table. An example makes the reason clear: a simple q-code able to convert all English text requires a code table containing the equivalent of every English word, including proper nouns. The code table used in a q-code constitutes the key of the code. The entries of the code table can themselves also be regarded as keys.
[0016] Cryptanalysis techniques are essentially based on statistical features of the plaintext domain.
When attempting to analyze a cipher whose original text is assumed to be, for example, a message in a particular natural language, the cryptanalyst considers the frequencies with which characters and combinations of characters occur in that language. For a q-code, the frequencies of individual words and combinations of words are examined instead.
[0017] SPARCOM overview
SPARCOM stands for "Sparse Associative Relational Connection Matrix". It is a method, proposed and studied by Ashany, for structuring data dynamically, and it is used in many kinds of applications to provide fast response times and high throughput. In this approach, discrete-valued data is converted into large sparse matrices, and database operations are carried out using a wide range of sparse-matrix techniques. In the SPARCOM approach, the sparse matrices are stored and operated on in compressed form, which saves large amounts of storage space and execution time. SPARCOM's own normalization process reduces the data redundancy often caused by entities that have multiple values for some attribute. Database operations are executed as arithmetic operations on the sparse-matrix structures that hold, at the internal level, the structured information of the database.
[0018] SPARCOM provides content-addressable retrieval of discrete-valued data; that is, data elements are retrieved as a function of their content. To achieve content addressability, SPARCOM converts a given entity-attribute relationship into the corresponding entity-property relationship. An entity-attribute relationship indicates which value each attribute takes for an entity. The corresponding entity-property relationship indicates only whether or not the entity has each of the properties corresponding to all possible values of its attributes.
The entity-property relationship is represented as a matrix, which is usually very sparse. In conventional relational database theory, several relations must be created for entities that have multiple values for some attribute. SPARCOM, by contrast, is built on entity-property relationships, so there is no need to create such relations.
[0019] Ashany describes the Binary Property Matrix, the basic data structure of SPARCOM, in terms of the relationship between entities and attributes [Ashany, pp. 62-63]:
An n-dimensional attribute space with attributes A_1, A_2, ..., A_n, whose domains D_1, D_2, ..., D_n consist of d_1, d_2, ..., d_n discrete components respectively, can be transformed into an N-dimensional property space according to the following equation:
N = d_1 + d_2 + ... + d_n
The independent properties P_1, P_2, ..., P_N map every point of the n-dimensional attribute space into the N-dimensional property space. Obviously N is greater than n, and more axes are required to represent points in the higher-dimensional Euclidean space; hence the larger vectors. In the property space, however, there are only two distinguishable points on each axis, 0 and 1, and each component axis represents one property.
[0020] An entity represented in the attribute space by n single-valued attributes, i.e., by an n-digit number, is represented in the property space by an N-digit number consisting of n 1's and N−n 0's. A 1 is placed at each position that represents a property the entity has. Since the attribute "gender" has two basic values (male, female) and the attribute "eye color" has five (black, blue, brown, green, reddish brown), the two-tuple (male, blue), i.e., an entity whose gender is male and whose eyes are blue, is represented by the seven-digit number E(1,0,0,1,0,0,0). The two-tuple is replaced by a seven-digit number containing two nonzero elements.
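The attribute-to-property mapping can be illustrated with a minimal sketch. The function name `extended_binary_vector` and the dictionary layout are assumptions made for illustration; only the two domains and the resulting vector come from the text.

```python
from typing import Dict, List, Set, Tuple

# Attribute domains from the example: gender has 2 values, eye color has 5.
DOMAINS: List[Tuple[str, List[str]]] = [
    ("gender", ["male", "female"]),
    ("eye color", ["black", "blue", "brown", "green", "reddish brown"]),
]


def extended_binary_vector(entity: Dict[str, Set[str]],
                           domains: List[Tuple[str, List[str]]]) -> List[int]:
    """Map attribute values to one 0/1 position per possible value (property)."""
    vector: List[int] = []
    for attribute, values in domains:
        held = entity.get(attribute, set())
        # One axis per property; a multi-valued attribute simply sets several 1s.
        vector.extend(1 if value in held else 0 for value in values)
    return vector


N = sum(len(values) for _, values in DOMAINS)  # N = d_1 + d_2 = 2 + 5
ebv = extended_binary_vector({"gender": {"male"}, "eye color": {"blue"}}, DOMAINS)
print(N, ebv)
```

Because each attribute value is held as a set, an entity with several values for one attribute still maps to a single vector, which is the point made about multi-valued attributes.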
If the single-valued attributes have cardinalities of, for example, d_1 = 10 and d_2 = 12, a two-tuple in the attribute space is converted to a 22-digit number in the property space: again a binary vector, with two nonzero values and 20 zeros. The vectors in the property space are called Extended Binary Vectors (EBV) and are usually sparse.
[0021] A set of m entities is represented by an m × N matrix of 0's and 1's called a Binary Connection Matrix (BCM), because its nonzero elements indicate the relationships that exist between each entity and each of its properties. More specifically, this matrix is called a Binary Property Matrix (BPM). One important feature of EBVs is that single-valued attributes and multi-valued attributes are represented in one and the same vector, which resolves the redundancy problem.
[0022] Because every property arising from the construction of the binary property matrix is indexed, the file is itself completely inverted; an important feature is that it is an inverted file and a direct file at the same time. The binary property matrix is inherently well suited to retrieval by property. Range queries in the SPARCOM approach are very easy to perform compared with classical database structures: to obtain the answer to a query, only a single query vector needs to be applied. In an attribute-based database, by contrast, the search operation generally has to be repeated several times to obtain the answer.
[0023] Binary Property Matrices are in SPARCOM Normal Form (SNF), a normal form of its own, which is distinguished from other database normal forms in that it is based on properties rather than attributes.
Most notably, it differs in how relations with multi-valued attributes are handled. The 1NF normalization defined by Codd removes the redundancy generated by a multi-valued attribute by decomposing the relation into several relations. Such decomposition is unnecessary and inappropriate under the SPARCOM method, because in SPARCOM relations are organized on the basis of properties rather than attributes. In effect, the purpose of 1NF is achieved automatically under SPARCOM, without relying on the decomposition normally associated with 1NF.
[0024] The relation in the consultant example of Figure 1A is not in 1NF. The functional dependencies assumed here among the attributes of the consultant relation are: name → hourly rate, name → skill, name → day of the week; there may be other dependencies. Skill and day of the week are given here as multi-valued attributes. For purposes of illustration, the two multi-valued attributes, skill and day of the week, are handled in two different ways in Figure 1A. The multiple skills of a particular consultant are broken out into multiple records. The days on which a consultant is available, on the other hand, are all held as a repeating group within the individual record. Neither of these techniques for representing multi-valued attributes is desirable.
[0025] On the one hand, holding multiple records in a relation, one for each value of a multi-valued attribute, instead of putting the relation in 1NF, unnecessarily duplicates other attributes that have no inherent functional dependency on it. (In this example, the hourly rate and day-of-week attributes are unrelated to the dependency of skill on name, yet their data is duplicated.)
On the other hand, a repeating group is not an atomic value, so operations on records with repeating groups require additional processing. Moreover, records in a relation with repeating groups either differ in length or contain nulls, and neither case is desirable.
[0026] Decomposing the consultant relation (name, hourly rate, skill, day of the week) to remove the multi-valued attributes, as 1NF requires, yields three separate relations: CRate (name, hourly rate), CSkills (name, skill), and CDays (name, day of the week). Figure 1B shows the 1NF relations obtained by decomposing the data of Figure 1A. The relations of Figure 1B are also in 3NF and BCNF (Boyce-Codd Normal Form).
[0027] Decomposition usually reduces overall redundancy, but in the process it creates a smaller redundancy of its own: in the example of Figure 1B, "name" appears as part of every relation. This redundancy is necessary. It preserves the relationships and makes it possible to reconstruct the original relation by performing natural joins on the attributes shared by the relations that the decomposition produced.
[0028] Figure 1C shows the SNF version of the consultant relation used in Figure 1A. Because a SPARCOM database does not need to decompose the data into 1NF, it avoids the redundancy generated in that process. This is precisely an advantage of the SPARCOM model over database models based on attributes.
[0029] Ashany explains one of the key performance advantages of the SPARCOM approach as follows [Ashany, p. 184]:
Many algorithms that deal with sparse matrices have one feature in common: only the nonzero elements of the matrix are stored. The goal is to operate on these matrices without the entire matrix being present, in order to save storage space and, in particular, to reduce access and execution time.
What This is because there is no need to represent or manipulate the zero element. [0030] Another key performance that the SPARCOM approach has over other database approaches An advantage in terms of data resides in the content addressability of the data. Other database systems The system uses multiple indexes to answer various types of queries quickly. Is often required to be maintained. Database other than SPARCOM In the system Will not be indexed to support the specified query. The answer speed for the query will be very slow. Because the index When there are no resources, it is necessary to perform a thorough search for data elements. SP ARCOM does not require multiple indexes. Because it is all data Is indexed. To compress the BPM used in SPARCOM This is because various methods for indexing are actually methods for performing the index itself. [0031]Sparse matrix concept As mentioned above, SPARCOM is a database based on features, Use binary sparse matrices to create internal-level data structures, such as building bricks I have. The persistence data in SPARCOM is composed of BPM, Results in a non-binary matrix that is itself sparse or not. Format. Ashany discussed three approaches to indexing matrices . Ashany investigated the methods of Bitmap (BMS), "Single Index" (SIS), and "D ouble Index "(DIS) compression method. Perform reduction on sparse matrices. And for binary sparse matrices, higher pressure Perform shrinkage. Database operations are implemented according to each of these three approaches. Dexification Made against BPM. Index methods, datasets, and databases Good or bad results depending on which one you select in the source operation , But the results that show excellent performance in each are various Obtained in various compression methods. Fig. 2 shows how the matrix depends on each of the BMS, SIS, and DIS compression methods detailed below. This is an example of whether compression is performed. 
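As a rough illustration of the three schemes just named (their components are spelled out in paragraphs [0032] to [0039]), the following Python sketch compresses a small non-binary matrix each way. The function names, the example matrix, the 8-bit byte, and the treatment of an empty row in the double index scheme (its entry points at the start of the next row) are assumptions made for illustration and are not taken from Ashany's text.

```python
from math import ceil


def bitmap_scheme(a, bits_per_byte=8):
    """BMS: Dim(m, n), a row-major bit string, the non-zero values, and the byte count."""
    m, n = len(a), len(a[0])
    bits = [1 if x != 0 else 0 for row in a for x in row]
    values = [x for row in a for x in row if x != 0]
    n_bytes = ceil(m * n / bits_per_byte)  # bytes needed to hold the bit string
    return (m, n), bits, values, n_bytes


def single_index_scheme(a):
    """SIS: Dim(m, n), positions k = j + (i - 1) * n (1-based), and the non-zero values."""
    m, n = len(a), len(a[0])
    v1, v2 = [], []
    for i, row in enumerate(a, start=1):
        for j, x in enumerate(row, start=1):
            if x != 0:
                v1.append(j + (i - 1) * n)
                v2.append(x)
    return (m, n), v1, v2


def double_index_scheme(a, end_marker="Δ"):
    """DIS: Dim(m, n), column numbers v1, row start positions v2, and values v3."""
    m, n = len(a), len(a[0])
    v1, v2, v3 = [], [], []
    for row in a:
        # 1-based position in v1 where this row's non-zero elements begin
        # (assumption: an empty row points at the next row's start).
        v2.append(len(v1) + 1)
        for j, x in enumerate(row, start=1):
            if x != 0:
                v1.append(j)
                v3.append(x)
    v1.append(end_marker)  # distinguished symbol closing v1
    v2.append(len(v1))     # last element of v2 points at the marker
    return (m, n), v1, v2, v3


if __name__ == "__main__":
    example = [[0, 5, 0, 0],
               [0, 0, 0, 0],
               [3, 0, 0, 7]]
    print(bitmap_scheme(example))
    print(single_index_scheme(example))
    print(double_index_scheme(example))
```

For a binary matrix such as a BPM, the value vectors (the third component of BMS and DIS, the second vector of SIS) carry no information and could simply be dropped.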
[0032] "Bit Map" scheme
In the bitmap scheme, an m × n matrix A (m rows, n columns) is decomposed into three components:
1) A pair Dim (m, n), where m and n are the numbers of rows and columns in matrix A.
2) A binary matrix B of dimension m × n, obtained from A by replacing each non-zero value of A with 1.
3) A vector v whose elements are the non-zero values of A.
The bits of the elements of binary matrix B are stored as a bit string S_B formed by concatenating the rows (or columns) of the matrix. The number of bytes required to store S_B can be calculated by the following simple formula:
$$S_B = \left\lceil \frac{m \times n}{S} \right\rceil$$
Here S is the number of bits in one byte. The elements of v are arranged in the order in which they appear when the matrix is scanned by rows 1 to m or by columns 1 to n. Other orderings are of course possible.

[0033] The bitmap scheme achieves significant compression by storing each element of the binary matrix B as a single bit. Several binary elements are packed into a single byte by well-known methods, the actual number depending on the byte size. This compression technique is clearly efficient both in languages that support bit operations and on hardware that performs them efficiently. The bitmap representation of a BPM A does not need the vector v: the pair Dim (m, n) and the bitmap B are clearly sufficient to define BPM A completely.

[0034] Single indexing scheme
In contrast to the bit mapping scheme, the single indexing scheme stores only the non-zero elements. The single indexing scheme represents a non-binary matrix A using three components:
1) A pair Dim (m, n), where m is the number of rows of A and n is the number of columns of A.
2) A position vector v₁ whose elements list the positions of the non-zero elements in A.
3) A vector v₂ whose elements are the non-zero values of A.
The two vectors v₁ and v₂ are indexed so that element b_i of v₂ holds the value of the element of matrix A at the location specified by element a_i of v₁.

[0035] The position k of element (i, j) of A is determined by the linear mapping function
$$k = f(i, j) = j + (i - 1) \times n$$
In this formula, i and j are the row and column of the element, respectively, and n is the number of columns in A. The formula simply fixes the ordering of the elements of the matrix: the elements are scanned one row at a time, from the first row through the m-th row.

[0036] Binary matrices such as BPMs are represented in a similar way under the single indexing scheme, but only two of the components defined above are needed: 1) the pair Dim (m, n) and 2) the position vector v₁. Because every non-zero value in a binary matrix is 1, there is clearly no need for the second vector v₂ to identify the values.

[0037] Double indexing scheme
The double indexing scheme has three components, the second of which itself consists of two parts: 1) the pair Dim (m, n), which gives the number of rows and columns in the matrix, 2) two vectors v₁ and v₂, which index the positions of the elements of the matrix, and 3) a vector v₃ whose elements are the non-zero values of A. Components 1 and 3 need no further discussion, since they correspond to those of both the bitmap and single indexing schemes described above. As before, the vector v₃ (which holds the values of the non-zero elements of the matrix) is not required for binary matrices.

[0038] Vector v₁ lists, row by row from row 1 to row m of a matrix A with Dim (m, n), the column numbers of the elements of A that have non-zero values. The last element of v₁ must hold a distinguished symbol, one that is not an integer in the range 1 to n. (Ashany uses the symbol "Δ".) The number of elements in vector v₁ is one more than the number of non-zero elements in A.

[0039] Vector v₂ identifies the position in v₁ of the first non-zero element of each row of matrix A. The elements of v₂ are themselves indexed so that element i of v₂ gives the index in v₁ at which the non-zero elements of row i of matrix A begin. Vector v₂ contains m + 1 elements. The last element of v₂ identifies the last element of v₁, which is the distinguished symbol.

[0040] Other indexing schemes
There are many other techniques for compressing sparse matrices. Frequently cited techniques that are simple and easy to program include linked lists organized either by rows or by columns. Doubly linked lists, from which the data can easily be recovered either by rows or by columns, can be used as well. Like the SIS and DIS compression techniques, these index the non-zero values of a sparse matrix, whether in arrays or in more complex data structures.

[0041] Tradeoffs accompany the choice among the various sparse matrix compression schemes. For example, linked and doubly linked lists do not provide compression comparable to the bit mapping, single index and double index schemes, because of the additional overhead of maintaining the links: each node in the list contains both the element value and the link address information. On the positive side, a linked list structure tends to perform better when a new non-zero element is inserted into the sparse matrix than the obvious implementations of the schemes above. If the vectors used in those sparse matrix compression schemes are implemented as simple dense arrays, then inserting a new element into the compressed sparse matrix requires a new array to be constructed. This can be done by shifting the left subvector backward in memory or by shifting the right subvector forward in memory. It is assumed that suitable memory has been allocated before these shift operations.

[0042] Queries
In SPARCOM, a simple query is performed by matrix multiplication of the BPM by the transpose of a query vector. The query vector is a row vector, and must therefore be constructed with the same number of elements as the number of columns in the BPM to be queried. The query vector is binary, that is, it contains only 1s and 0s. A 1 in the query vector indicates a property that is being sought.

[0043] In SPARCOM, the result of a simple query is typically a non-binary column vector (the response matrix). The dimension of this column vector corresponds to the number of rows of the BPM from which it was derived. The value of the i-th element of the column vector obtained from a simple query is the number of properties that row i of the BPM has in common with the query vector. The degree of a query vector is the number of 1s in the vector. For a simple query, row i of the BPM "matches" the query vector when element i of the resulting column vector (which is often non-binary) is equal to the degree of the query vector. Put another way, for a simple query the threshold for the elements of the response matrix equals the degree of the query vector. FIG. 3 shows an example of a simple query.

[0044] Many more complex types of query, including range queries and queries that require Boolean operations, can easily be implemented using the SPARCOM approach. In a range query, multiple values (properties) are specified for some attribute. A range query returns the records that have any one of the specified properties.
Given a relation holding customer information, Cust (name, street, city, state, postal code), an SQL statement for a range query on state against this relation could be written as follows:
select * from Cust where state = "NY" or state = "NJ" or state = "CT"

[0045] This SQL statement highlights the fact that a range query is an "or" operation, and often several "or" operations. In an attribute-oriented database, each "or" operation increases the search time required for the query. Under the SPARCOM approach, a range query on a single-valued attribute can be performed by an ordinary matrix multiplication of the BPM by a query vector containing all the values specified in the range. In this case, the only adjustment needed to obtain the matching rows is that the threshold for the elements of the response matrix must equal the number of attributes queried, not the degree of the query vector. In SPARCOM, therefore, a range query on a single-valued attribute requires no extra search time. Figure 4 shows an example of how a SPARCOM range query executes the Cust SQL statement shown above.

[0046] Matrix multiplication on sparse matrices compressed with a specific technique, such as the SIS or DIS compression technique, involves only the non-zero elements of the sparse matrix as factors, so the calculation performs very well.

[0047] The above overview of the SPARCOM approach to performing database operations serves only as an introduction. For a more detailed explanation, see Ashany's doctoral dissertation, which is incorporated in the present application by way of background. [Ashany, R. "SPARCOM: A Sparse Matrix Associative Relational Approach to Dynamic Structuring and Data Retrieval." Ph.D. Dissertation, Polytechnic Institute of New York, June 1976]

[0048] Summary of the Invention
The present invention relates to a device for securely storing data. The device comprises a database having storage for meaningfully encrypted data. The device has a database mechanism for performing meaningful database operations on the meaningfully encrypted data without the need to decrypt the data. The database mechanism is connected to the database. The device also has an access mechanism, connected to the database mechanism, for obtaining data from the database mechanism.

[0049] The present invention relates to an apparatus for storing data. The device comprises a database having meaningfully represented data. The device has a database mechanism for performing database operations on the meaningfully represented data. The database mechanism is connected to the database. The device has an access mechanism, connected to the database mechanism, for obtaining data from the database mechanism. The access mechanism provides various users with various representations of the meaningfully encrypted data.

[0050] The present invention relates to a method for securely storing data. The method includes the step of storing meaningfully encrypted data in a memory. Next, there is the step of performing database operations on the meaningfully encrypted data from the memory without the need to decrypt the data. Further, there is the step of obtaining data from the memory.

[0051] BRIEF DESCRIPTION OF THE FIGURES
The accompanying drawings show a preferred embodiment of the present invention and a preferred method of practicing the invention.
FIG. 1A shows, as background information, a relation that is not in 1NF (first normal form).
FIG. 1B shows, as background information, the relation of FIG. 1A after conversion to 3NF (third normal form).
FIG. 1C shows, as background information, the relation of FIG. 1A after conversion to SNF (SPARCOM normal form).
FIG. 2 gives, as background information, an example of how a matrix is compressed using the BMS (Bit Mapping Scheme), SIS (Single Indexing Scheme) and DIS (Double Indexing Scheme) compression methods.
FIG. 3 shows, as background information, an example of a simple query using the SPARCOM method of executing queries.
FIG. 4 shows, as background information, an example of a range query using the SPARCOM method of executing range queries.
FIG. 5 shows the basic elements of the invention in a block diagram.
FIG. 6A shows a network architecture of a preferred embodiment of the present invention.
FIG. 6B shows another network architecture of the preferred embodiment of the present invention.
FIG. 7A shows a BPM (binary property matrix) using the "Sales Rep" relation as an example.
FIG. 7B shows the same "Sales Rep" BPM as FIG. 7A, modified so that the number of columns of the BPM and the properties of its columns correspond to those of the "Cust" relation BPM with which it is to be joined.
FIG. 7C shows the BPM obtained from the BPM of FIG. 7B after a projection that selects only the columns representing "state" and "sanitizes" the others.
FIG. 7D shows the BPM of FIG. 7C after being converted.
FIG. 8 shows the same "Cust" relation BPM shown earlier, modified so that its column contents and number of columns match those of the "Sales Rep" relation BPM with which it is to be joined.
FIG. 9 shows the response matrix obtained by matrix multiplication of the matrices of the preceding figures. The response matrix identifies the rows of the original BPMs (obtained in FIGS. 8 and 7B) that are to be joined.
FIG. 10A shows another network architecture of the preferred embodiment of the present invention.
FIG. 10B shows another network architecture of the preferred embodiment of the present invention.
FIG. 11 shows how the coordinates of the cells of a 5 × 8 matrix are arranged under the Single Indexing compression Scheme (SIS), together with an example matrix and its SIS representation. It will be noted that encrypting only the dimension information of the SIS representation of this matrix further enhances the security of the encryption method for the compressed matrix without substantially sacrificing performance.

[0052] Description of the preferred embodiment
In the drawings, like reference numbers indicate similar or identical parts. Referring more particularly to the figures, an apparatus for securely storing data is shown. The device comprises a database having storage for meaningfully encrypted data. The device has a database mechanism for performing meaningful database operations on the meaningfully encrypted data without the need to decrypt the data. The database mechanism is connected to the database. The device also has an access mechanism, connected to the database mechanism, for obtaining data from the database mechanism.

[0053] The access mechanism preferably includes an encryption/decryption mechanism. Preferably, this mechanism is connected to the database mechanism, receives unencrypted data, encrypts it and supplies it to the database mechanism, and also receives encrypted data from the database mechanism and decrypts it. Preferably, the access mechanism includes an end-user workstation having a workstation CPU and a workstation memory, and the encryption/decryption mechanism includes a codebook stored in the memory and software programs located in the memory for accessing and updating the codebook.
[0054] The meaningfully encrypted data is preferably in the form of property-oriented positional Q-codes. The property-oriented positional Q-codes preferably comprise sparse binary matrices. To improve their security, the property-oriented positional Q-codes preferably use dummy columns, dummy rows, column splitting, column offsets, permutation of BPM columns, and encryption of the dimension information of the compressed sparse matrices.

[0055] The device preferably includes a database server computer having the database mechanism and the database. The database mechanism preferably has a server CPU and a server memory connected to the server CPU. The server memory has the database. Preferably, the server memory has a database command storage buffer and a database response storage buffer. The server computer preferably includes a server communication port connected to the server memory and the server CPU. Preferably, the workstation includes a workstation communication port connected to the server communication port, the workstation CPU and the workstation memory, together with an input port and an output port, both of which are connected to the workstation memory, the workstation CPU and the workstation communication port.

[0056] The present invention relates to a device for securely storing data. The device comprises a database having fully indexed data. The device also has a database mechanism for performing database operations on or with the fully indexed data, the index information being what allows access to and interpretation of the fully indexed data. The device has an access mechanism, connected to the database mechanism, for obtaining data from the database mechanism.

[0057] The present invention relates to an apparatus for storing data. The device comprises a database having storage for meaningfully represented data. The device has a database mechanism for performing database operations using the meaningfully represented data. The database mechanism is connected to the database. The device has an access mechanism, connected to the database mechanism, for obtaining data from the database mechanism. The access mechanism provides various users with various representations of the meaningfully encrypted data.

[0058] The access mechanism provides various natural-language translations of the meaningfully represented data to various users. Alternatively, the access mechanism provides an audio representation of the meaningfully represented data to visually impaired users.

[0059] The present invention relates to a method for securely storing data. The method includes the step of storing meaningfully encrypted data in a memory. Next, there is the step of performing database operations on the meaningfully encrypted data from the memory without the need to decrypt the data. Further, there is the step of obtaining data from the memory.

[0060] The main idea of the present invention is shown in FIG. 5. Basically, it is a database encryption mechanism and method in which the database information is organized and distributed so that internal-level data, in the form of q-code information, is maintained on one or more database servers, while schema information at the external level (user level) and/or the conceptual level (community level) is held on end-user client workstations. In addition to the external-level or conceptual-level schema information, the database information present on an authorized end-user client workstation includes a codebook containing a list of pairs of values. One member of each pair specifies a property. The other member identifies a set of equivalent q-codes for the given property. The codebook can therefore be thought of as holding an index, and the client workstation as holding the key to the encryption. In the preferred embodiment, the database information (data content) that resides on a database server is a SPARCOM database. That is, the information consists of a set of compressed BPMs, each of which represents a specific property-entity relationship. The database server does not hold the index information needed to interpret the internal-level compressed BPM data. The network architecture of the preferred embodiment is shown in FIG. 6A.

[0061] Referring to FIG. 5, as described above, on the end-user client workstation (1), algorithm 1A (15) receives user input at the input port (2), parses the input, performs a codebook search to find the encrypted equivalents of the properties identified, formulates encrypted database commands, submits these to the database server, and stores a summary of the commands issued in the context storage buffer (13). Algorithm 1B (19) waits until an encrypted response from the communication port (32) of the database server computer (30) is received at (5), reads the encrypted response, and stores it temporarily in the database response storage buffer (17). Algorithm 1B (19) then inspects the storage buffer (13), decrypts the response by searching the codebook to determine the plaintext equivalents of the encrypted elements received, and processes the plaintext result as directed by the user on (1), that is, directs the output to (4), to (23), or to both. The data structure at the center of algorithms 1A and 1B is the codebook (11). The codebook (11) on the end-user client workstation (1) holds the property listings for the specific tables to which the workstation (1) has been given access. A codebook listing consists of a set of (property, column) tuples (2-tuples).
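A minimal sketch of such a codebook, assuming a single table and exactly one column per property. The class and method names are hypothetical, and the three sample entries are borrowed from the Cust example given later in this description. It keeps one mapping for the lookup used by algorithm 1A (property to column) and one for the lookup used by algorithm 1B (column to property).

```python
class Codebook:
    """Hypothetical codebook holding (property, column) pairs for one table."""

    def __init__(self, pairs):
        self._forward = {}  # property -> column number (used by algorithm 1A)
        self._reverse = {}  # column number -> property (used by algorithm 1B)
        for prop, column in pairs:
            self._forward[prop] = column
            self._reverse[column] = prop

    def forward(self, prop):
        """Return the column number that stands for a plaintext property."""
        return self._forward[prop]

    def reverse(self, column):
        """Return the plaintext property that a column number stands for."""
        return self._reverse[column]


if __name__ == "__main__":
    codebook = Codebook([("State.NY", 17), ("State.NJ", 18), ("State.CT", 19)])
    print(codebook.forward("State.NJ"))
    print(codebook.reverse(19))
```

Two dictionaries are used so that both directions are constant-time lookups; a single list of pairs would also work but would need a scan for one of the two directions.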
[0062] The set itself is completely partitioned into disjoint subsets of (property, column) tuples, each subset representing a view of a different table on the database server (12). Depending on the access privileges specified for a particular user, a view may or may not list all the columns of a given table. In general, the columns excluded from a particular view of a table are themselves a group of columns relating to particular attributes of the table. However, a user can just as easily be given access to only some of the columns for a particular attribute. As a result, access to the data can easily be partitioned at a finer level.

[0063] A codebook can be implemented as a simple list of records with two fields. The list can be stored linearly, or as a linked or doubly linked list within the memory (6). Many other algorithms well known to those skilled in the art are available that allow the associated items of a pair to be searched efficiently both forward and backward. Fast codebook search is important to the invention, because non-trivial databases that make use of the invention often have many properties.

[0064] In a preferred embodiment, each of the disjoint subsets of the codebook (11) is implemented as two separate associative arrays for fast searching. That is, for each table there are two associative arrays: the first is used to search forward from property to column number, and the second is used to search backward from column number to property. In an alternative preferred embodiment, all the (property, column) pairs of the codebook are held together as one set using two associative arrays (a forward search array and a reverse search array). Here each property field is an ordered pair (table name, property¹), where property¹ is a property that may appear in multiple tables (that is, an ordered attribute-value pair). A (property, column) pair therefore appears as a codebook entry of the form (table name, property¹, column). Because tables owned by different users may have exactly the same name, it should be noted that the table name field may itself be a composite of the form table name = (owner, table name¹). Further, as shown in FIG. 6B, multiple database servers may be present on the network. The property value is therefore effectively an extended hierarchical structure, e.g. address; database; owner; table name¹. property¹. It will be seen that each codebook is itself effectively structured as a simple, special-purpose, single-user database.

[0065] Algorithm 1A (15) converts plaintext property information into positional q-codes. (15) achieves this by performing a "forward search" in the codebook (11) on the end-user client workstation. Algorithm 1B (19) converts positional q-code column information into plaintext property information. (19) achieves this by performing a "reverse search" in the codebook (11).

[0066] As described above, queries and other database operations are assembled on the end-user client workstation (1) into SPARCOM database commands containing positional q-codes. Database commands formulated in this way on the end user's client workstation are sent over the network to the database server (30). The database commands sent to the target database server (30) are thus encrypted, because they contain positional q-codes. It is clear that network traffic from the communication port (5) of the end-user client workstation (1) to the communication port (32) of the database server (30) that contains database commands is itself encrypted, because it contains the encrypted commands.
(This network traffic can, of course, be additionally encrypted using any other form of encryption technique, such as DES or RSA.)

[0067] Referring to FIG. 5, in the preferred embodiment, algorithm 2 (39) parses the database command received at (32). Because the command contains q-codes, it is encrypted. Next, algorithm 2 (39) stores the database command in the database command storage buffer (37) and executes the command found in the buffer (37), performing the operations specified in the command on the positional q-code database (35). Queries and other internal-level database operations are performed on the BPM data held on the database server computer (30) using the SPARCOM approach described in the background section. This means that the operations are performed on the compressed binary property matrices found in the q-code database (35) contained in the memory (33) of the database server computer (30). The database operations are executed directly on the encrypted data stored on the database server computer (30), without exposing plaintext data. The output generated by executing the command on the q-code database (a BPM), together with operation status information such as a "transaction ID number" and "success" or "failure", is stored temporarily in the database response storage buffer (41). (Operation status information is not essential to the basic functioning of the present invention; rather, it is standard database program information that can be provided in the system. Use of a "transaction ID number" helps to keep track of individual transactions in complex network systems. Other methods can also meet this requirement of the system. The use of a "transaction ID number" is shown merely as one way of addressing this need.) The final step taken by algorithm 2 (39) is to return the output generated by executing the database command to the end-user client workstation (1) that invoked the command.

[0068] It is clear that network traffic sent from the communication port (32) of the database server (30) to the communication port (5) of the end-user workstation (1), containing the output generated by executing database commands, is itself encrypted, because when data is sent back it contains at least a compressed BPM of positional q-codes. (As with network traffic in the other direction, this network traffic can, of course, be additionally encrypted using other types of encryption techniques, such as DES or RSA.)

[0069] In an alternative embodiment, the end-user client workstation itself performs database operations directly on the BPMs located on the database server computer. In this case, the database server simply acts as a database file server. All SPARCOM database operations (e.g., queries involving matrix multiplication) are executed by the workstation using its own CPU and memory cache. The usual tradeoffs exist between the two approaches. Central processing of data demands more of the central host machine. Remote processing of data on networked workstations eases the burden on the central host computer and makes use of the processing power of the desktop. However, it may cause more network traffic, and it may cause further difficulties when many end-user client workstations try to change data on the database server computer at the same time.

[0070] Example 1
The following annotated example illustrates specifically how the SPARCOM range query example described in the Background section above is handled by the preferred embodiment of the present invention.
The uncompressed data representation (BPM) for the "Cust" relation is shown in FIG. 4.
1) The user issues a high-level database range query command at the input port (2) of the end-user workstation (1):
Select * from Cust where State in ('NY', 'NJ', 'CT');
2a) Algorithm 1A (15) reads and parses this input.
2b) Algorithm 1A (15) performs a forward search on the codebook (11) to determine the column numbers for the attributes identified in the query.
[0071] The relevant entries in the codebook (11) for the Cust table are:
Name.Lynn 1
Name.Mark 2
Name.Bill 3
Name.Sam 4
Name.Liza 5
Name.Carl 6
Street.5 Oak 7
Street.6 Gunn 8
Street.2 Pine 9
Street.8 Main 10
Street.4 Main 11
City.Nyack 12
City.Union 13
City.Derby 14
City.Reno 15
City.Butte 16
State.NY 17
State.NJ 18
State.CT 19
State.NV 20
State.MT 21
2c) Algorithm 1A (15) constructs a query vector from the information obtained by the codebook search for the identified attributes. The query vector constructed by the algorithm is a binary vector that has 1's in the elements whose indices correspond to the column numbers found in the codebook search; all other elements are 0's. Algorithm 1A (15) therefore constructs a query vector with 1's at the column numbers corresponding to NY, NJ and CT. In uncompressed form, the query vector QV is:
QV = (000000000000000011100)
However, as with the BPM, it is of course possible, and desirable, to generate the query vector QV directly in compressed form.
[0072] The preferred way to generate this query vector in compressed form is to allocate an array whose length equals the number of non-zero elements (1's) of the query vector. The indices (i.e., column numbers) of the non-zero query vector elements are entered sequentially into this compressed representation of the vector. (Alternative methods of compressing the query vector are of course possible. For example, the vector could be represented as a linked list of non-zero column number indices.) The query vector QV is therefore expressed as follows:
QV = (17,18,19)
[0073] The range query is now constructed, and the "threshold" is specified by Algorithm 1A (15) on the end-user workstation (1). (The threshold, in this case "1", is used to determine whether an entry in the response matrix generated by the query indicates that the corresponding row of the queried BPM meets the selection criteria.) The query consists of four fields: 1) the OP code (operation code), 2) the table identification number (i.e., of the BPM), 3) the query vector QV, and 4) a unique transaction identification number. Suppose the database server computer (30) is designated "1" (note that there may be many database servers), user "6" is the owner of the Cust table with identification number "38", and the workstation formulating the range query generates the unique transaction ID "client4.id185". The query then contains the following data:
Op-code = "Range Query, Threshold value = 1", Table ID = 1.6.38, Query Vector = (17,18,19), Transaction ID = client4.id185
[0074] With a more concise OP code notation, in this case "RQ1" for "Range Query, Threshold value = 1", the query constructed by Algorithm 1A (15) looks like this:
RQ1 1.6.38 (17,18,19) client4.id185
Here, spaces are used to delimit the fields. Obviously, other delimiters work equally well, and the order of the fields is merely a matter of convenience. It is also easy to generalize the table identification number to cover the case of multiple databases on the same database server computer. For example, 1.3.6.38 indicates that the database operation involves database server computer "1", database "3" (i.e., the third database on database server computer "1"), user "6" and table "38".
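Steps 2b and 2c can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's Algorithm 1A: the function name `build_range_query` and the dictionary layout of the codebook are hypothetical, while the codebook entries, the field order and the identifiers are taken from the example above.

```python
# Codebook entries for the "Cust" table, as listed in paragraph [0071].
CODEBOOK = {
    "Name.Lynn": 1, "Name.Mark": 2, "Name.Bill": 3, "Name.Sam": 4,
    "Name.Liza": 5, "Name.Carl": 6,
    "Street.5 Oak": 7, "Street.6 Gunn": 8, "Street.2 Pine": 9,
    "Street.8 Main": 10, "Street.4 Main": 11,
    "City.Nyack": 12, "City.Union": 13, "City.Derby": 14,
    "City.Reno": 15, "City.Butte": 16,
    "State.NY": 17, "State.NJ": 18, "State.CT": 19,
    "State.NV": 20, "State.MT": 21,
}


def build_range_query(attribute, values, table_id, transaction_id, threshold=1):
    """Hypothetical helper: forward codebook search plus message formatting."""
    # Forward search: plaintext attribute-value pair -> column number.
    columns = sorted(CODEBOOK[f"{attribute}.{value}"] for value in values)
    # Compressed query vector: just the indices of the non-zero elements.
    query_vector = "(" + ",".join(str(c) for c in columns) + ")"
    # Four space-delimited fields: OP code, table ID, query vector, transaction ID.
    return f"RQ{threshold} {table_id} {query_vector} {transaction_id}"


message = build_range_query("State", ["NY", "NJ", "CT"], "1.6.38", "client4.id185")
print(message)
```

Only column numbers leave the workstation; the plaintext values 'NY', 'NJ' and 'CT' stay in the codebook lookup.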
[0075] 2d) Algorithm 1A (15) on the end-user workstation (1) sends the above range query from communication port (5) to the database server computer (30) designated "1". The algorithm also stores, in the context storage buffer (13), context information relevant to the query that was sent, such as where to send the output of the response to the query.
3a) Algorithm 2 (39), in the memory (33) of the database server computer (30) designated "1", then receives the instruction (sent by Algorithm 1A (15) in the previous step) on communication port (32) of the database server computer (30).
[0076] 3b) Algorithm 2 (39) parses the instruction and executes the specified command. The OP code in this case is "RQ1", meaning "Range Query, Threshold = 1"; the instruction refers to table 38, and the query vector is QV = (17,18,19). Algorithm 2 (39) therefore performs this matrix multiplication:
RM = BPM₃₈ × (17,18,19)ᵀ
where RM is the response matrix, BPM₃₈ is table 38, and (17,18,19)ᵀ is the transpose of the query vector QV = (17,18,19). FIG. 4, described in the Background section, shows the uncompressed representation of this matrix multiplication.
[0077] In one preferred embodiment of the present invention, table 38 (i.e., BPM₃₈) is stored and operated on as a BPM compressed with the single index compression technique described in the Background section. Table 38 (i.e., BPM₃₈) is therefore represented simply as a vector together with its dimensions, as follows:
(1,7,12,17,25,32,34,41,44,52,56,61,68,71,79,84,87,93,99,102,111,112,117,122) Dim (6,21)
Similarly, the response matrix RM is represented as a vector together with its dimensions:
(1,3,5,6) Dim (6,1)
(In this case there is no need to store the values of the non-zero entries of the response matrix, because by the nature of the query they are all equal to 1.)
[0078] The response matrix RM is then used to select from table 38 the rows that satisfy the selection criteria. (In practice this is done "on the fly", so the response matrix RM need not actually be stored.) The OP code specifies "RQ1", a threshold-1 range query operation, so the row numbers of the entries of the response matrix that meet the threshold of 1 identify the rows of BPM₃₈ that satisfy the range query. Therefore, as shown in FIG. 4, rows 1, 3, 5 and 6 are selected as matching the range query. (As mentioned in the Background section, different queries have different thresholds.) Algorithm 2 (39) generates a new matrix, BPM_RESPONSE, consisting of the four rows of BPM₃₈ that satisfy the range query. BPM_RESPONSE is generated in compressed format; its representation under the single index compression technique is:
(1,7,12,17,23,31,35,40,45,51,57,60,69,70,75,80) Dim (4,21)
BPM_RESPONSE is temporarily stored in the database response storage buffer (41).
[0079] 3c) Algorithm 2 (39) then sends the encrypted response from the database server computer (30) (in this case database server computer "1") via communication port (32) to communication port (5) of the end-user workstation (1) that sent the instruction just processed, in this case client 4. The encrypted response consists of four fields: 1) the vector of the single index compressed representation of BPM_RESPONSE, 2) the dimensions of BPM_RESPONSE, 3) the transaction ID "client4.id185" originally sent with the range query, returned simply for identification, and 4) an operation status code indicating successful completion of the specified RQ1 operation. Fields 1) and 2) are taken from the database response storage buffer (41) and field 3) is taken from the database instruction storage buffer (37), while field 4) is generated directly by Algorithm 2 (39).
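The server-side work of step 3b can be reproduced with a short Python sketch that operates on column numbers only, never on plaintext. It is a minimal illustration rather than the patent's Algorithm 2: the function names are hypothetical, the row-major, 1-based reading of the single index scheme is inferred from the vectors in the example, and treating the threshold as "at least" is an assumption.

```python
def rows_from_single_index(indices, n_cols):
    """Expand 1-based, row-major single-index positions into {row: set of columns}."""
    rows = {}
    for index in indices:
        row, col = divmod(index - 1, n_cols)
        rows.setdefault(row + 1, set()).add(col + 1)
    return rows


def range_query(indices, dim, query_vector, threshold=1):
    """Return (matching rows, compressed BPM_RESPONSE vector, its dimensions)."""
    n_rows, n_cols = dim
    rows = rows_from_single_index(indices, n_cols)
    wanted = set(query_vector)
    # Entry r of the response matrix is the dot product of BPM row r with the
    # binary query vector, i.e. the number of query columns set in that row.
    hits = [r for r in range(1, n_rows + 1)
            if len(rows.get(r, set()) & wanted) >= threshold]  # ">=" is an assumption
    # Re-encode the selected rows as a new, smaller single-index vector.
    response = []
    for new_row, r in enumerate(hits, start=1):
        response.extend((new_row - 1) * n_cols + c for c in sorted(rows[r]))
    return hits, response, (len(hits), n_cols)


BPM_38 = [1, 7, 12, 17, 25, 32, 34, 41, 44, 52, 56, 61, 68, 71, 79, 84,
          87, 93, 99, 102, 111, 112, 117, 122]
hits, bpm_response, response_dim = range_query(BPM_38, (6, 21), [17, 18, 19])
print(hits, bpm_response, response_dim)
```

With the example data this yields rows 1, 3, 5 and 6 and the same compressed BPM_RESPONSE vector and dimensions as given in paragraph [0078].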
Assuming that "RQ1AA" is the appropriate status code, the encrypted response looks like this:
(1,7,12,17,23,31,35,40,45,51,57,60,69,70,75,80) (4,21) client4.id185 RQ1AA
Here, as above, spaces are used to delimit the fields.
[0080] 4a) Algorithm 1B (19) on the end-user workstation (1), in this case client 4, processes the encrypted response sent from communication port (32) of the database server computer (30) (designated "1"). The algorithm reads the encrypted response received on communication port (5) and temporarily stores it in the database response storage buffer (17). Algorithm 1B (19) then decrypts the received response using the following steps.
1) Check the operation status code. Since RQ1AA indicates success, continue.
2) Check the transaction ID. The transaction ID is used to locate the context information stored for this transaction in the context storage buffer (13). The context storage buffer indicates that this transaction pertains to the "Cust" relation.
3) Perform a reverse search of the codebook (11) to determine the attributes present in each row of BPM_RESPONSE (which is given in single index compressed form in the first two fields of the encrypted response). The reverse search is made on the codebook entries for the "Cust" relation and finds the (plaintext) attribute equivalent of each column number specified by BPM_RESPONSE.
[0081] 4b) Algorithm 1B (19) checks the context information stored for this transaction in the context storage buffer (13) to determine how the decrypted data is to be routed and formatted. The context information indicates whether the output should be sent directly to the output port (4) or to the auxiliary area (23); it can also specify that the output be sent to both the output port (4) and the auxiliary area (23).
Alternatively, the context information may specify that the output is to be used by another process or application residing in the memory (6) of the end-user workstation (1). If the data format is of the usual type, the output generated by Algorithm 1B (19) is as follows:
Lynn 5 Oak Street Nyack, NY
Mark 8 Main Street Derby, CT
Bill 2 Pine Street Reno, NJ
Carl 5 Oak Street Nyack, NY
[0082] Insert, update and delete operations are performed with the present invention in a manner similar to the above example (Example 1), with obvious differences, on the q-code database (35) in the memory (33) of the database server computer (30). With insert and update operations under the SPARCOM method, an additional column must be provided in the database table (BPM) whenever a new attribute is identified. To our knowledge, this issue is not addressed in the prior art. The preferred way of handling it is for the creator of a table to specify a base number of columns for the BPM in advance. New attributes can then be introduced into the table (BPM) without having to recreate the column dimension of the BPM. Column numbers are assigned as follows. Column numbers that have no assigned attribute are placed in an "available column number pool" and are removed from the pool (sequentially or randomly) as needed. When the creator of the table supplies no value, a default number of columns is supplied by the DBMS.
[0083] Even though only non-zero values are stored in the BPM, the number of columns in the BPM obviously affects the size of its compressed representation. The size of the compressed BPM is roughly proportional to the number of bits required to represent a column number; for example, with 65,536 (2¹⁶) columns, each column number can be represented with 16 bits on many basic computer hardware architectures.
[0084] Presetting the number of columns before new attributes are introduced (by either an insert or an update operation) makes insert and update operations in a SPARCOM database easier to perform, especially when the single index compression technique is used. For example, consider the single index representation of the BPM of FIG. 4:
(1,7,12,17,25,32,34,41,44,52,56,61,68,71,79,84,87,93,99,102,111,112,117,122) Dim (6,21)
Inserting a new row of attribute information (Ann, 6 Gulf Road, Tampa, FL) into this BPM requires the addition of four new attribute columns. In the modified uncompressed BPM, the new columns are inserted as the last columns, so the four new attribute columns are columns 22-25. The single index representation of the modified BPM is as follows:
(1,7,12,17,29,36,38,45,52,60,64,69,80,83,91,96,103,109,115,118,131,132,137,142,172,173,174,175) Dim (7,25)
[0085] Obviously, in this example, because the number of columns in the BPM changes when the new attributes are added, the index values of the non-zero ("1's") entries of the BPM must be recalculated. Presetting the number of columns to a large number before data is inserted into the BPM eliminates the need to recalculate the indices of the non-zero ("1's") entries under the single index compression scheme.
[0086] Join
Join is a particularly important operation in relational database systems. Joins can be performed on a SPARCOM DBMS (i.e., a DBMS that uses the SPARCOM method to structure and manipulate database information), and can therefore be used on the system of the present invention. Nevertheless, two useful and non-obvious systems and methods for numbering the columns of a SPARCOM database, which facilitate the performance of natural joins (also known as equi-joins), are presented below.
[0087] A preferred method and system for implementing a SPARCOM database combines all of the relations in the database into a single "database binary property matrix", or "DBPM". The DBPM combines all attributes (i.e., columns) present in the relations of the database (preferably in SPARCOM normal form), merges the columns for identical attributes, and adds one column for each relation included in the DBPM. Each of these added columns of the DBPM thus corresponds to a specific database relation (preferably in SPARCOM normal form), and a row's membership in a relation is indicated by the presence of a "1" in the column associated with that particular relation. A second preferred method (and system) for implementing a SPARCOM database system implements the DBPM virtually. It uses the same column numbering scheme as the first embodiment, but maintains a separate BPM for each relation (as in Ashany's original formulation of the SPARCOM method). In the second preferred embodiment of the invention, of course, only the non-zeros (i.e., the "1" entries) are stored and/or operated on, in compressed form; it should be noted, however, that all BPMs have the same total number of columns. Also, in the second method, because all rows in a given BPM belong to a single relation, there is no need to maintain columns indicating the particular relation to which each row belongs.
[0088] With this single numbering system, a SPARCOM DBMS can easily support numbering the columns of a SPARCOM database. The DBMS maintains a counter and simply assigns each new attribute added to the database a column number equal to the incremented counter.
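The counter-based numbering can be sketched as follows. This is a simplified illustration under stated assumptions: the class name `ColumnNumberer` is hypothetical, and a single in-memory dictionary stands in for whatever lookup the DBMS and the codebooks would actually share.

```python
class ColumnNumberer:
    """Hypothetical sketch: one counter shared by every relation in the database."""

    def __init__(self):
        self.counter = 0
        self.assigned = {}  # attribute-value pair -> column number

    def column_for(self, attribute):
        # An attribute that already has a column keeps it, whichever relation
        # asks; only a genuinely new attribute increments the counter.
        if attribute not in self.assigned:
            self.counter += 1
            self.assigned[attribute] = self.counter
        return self.assigned[attribute]


numberer = ColumnNumberer()
cust_ny = numberer.column_for("State.NY")    # first seen in the "Cust" relation
cust_nj = numberer.column_for("State.NJ")
rep_ny = numberer.column_for("State.NY")     # seen again in "Sales Rep"
print(cust_ny, cust_nj, rep_ny)
```

Because identical attributes in different relations receive the same column number, the BPMs of those relations can later be multiplied directly when a join is performed.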
More complex ways of assigning unique column numbers in a SPARCOM database, which are nevertheless easy to program, can obviously also be used. If the number of columns of the DBPM is preset and all column numbers are placed in an "available column number pool" when the database is created, a random number generator can be used to select column numbers "randomly" from the range of numbers available in the pool. In the present invention, assigning identical column numbers to identical attributes in different relations can be achieved on the end-user workstation (1) by checking the codebook (11) to determine whether the attribute already exists, where an attribute, as defined above, denotes an attribute-value pair.
[0089] A natural join involving non-null attributes of relations is accomplished, on a SPARCOM database whose columns are numbered according to the above method, by multiplying the BPMs of the two relations involved in the natural join. Before the multiplication, one of the relations is first projected so that all attributes not involved in the join are eliminated. The multiplication is then performed using this "cleaned" relation.
[0090] Because the universal numbering scheme specified here gives all BPMs the same number of columns, one matrix or the other can be transposed and the BPMs of the relations involved can then be multiplied. Also, because under the universal numbering scheme identical attributes in different relations share the same column number, the positions of the non-zeros ("1's") in the response matrix produced by the multiplication indicate which rows of the BPMs involved in the matrix multiplication should be joined.
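The join-by-multiplication idea can be illustrated with a small Python sketch in which each BPM row is held as the set of column numbers containing a "1". The "Cust" rows are decoded from the single index vector of Example 1; the two "Sales Rep" rows, the column numbers 22 and 23 and the function name are hypothetical values chosen only for illustration.

```python
def natural_join_pairs(left_rows, right_rows, join_columns):
    """Return the (left row, right row) pairs a natural join should combine."""
    # "Clean" the right relation: project it onto the join columns only, so
    # that other shared columns cannot produce false positives.
    cleaned = [row & join_columns for row in right_rows]
    pairs = []
    for i, left in enumerate(left_rows, start=1):
        for j, right in enumerate(cleaned, start=1):
            # Entry (i, j) of left x cleaned-transposed counts the columns
            # in which both rows hold a 1.
            if len(left & right) > 0:
                pairs.append((i, j))
    return pairs


# "Cust" rows as sets of column numbers (State columns are 17-21).
CUST_ROWS = [{1, 7, 12, 17}, {4, 11, 13, 20}, {2, 10, 14, 19},
             {5, 8, 16, 21}, {3, 9, 15, 18}, {6, 7, 12, 17}]
# Hypothetical "Sales Rep" rows: a rep in NY (17) and a rep in NV (20).
SALES_REP_ROWS = [{22, 17}, {23, 20}]
STATE_COLUMNS = {17, 18, 19, 20, 21}

pairs = natural_join_pairs(CUST_ROWS, SALES_REP_ROWS, STATE_COLUMNS)
print(pairs)
```

A join on several attributes at once would need a threshold equal to the number of join attributes rather than a simple non-zero test; the sketch covers only the single-attribute case.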
The response matrix obtained from the matrix multiplication is then used to select the rows of the two original BPMs that are to be joined. Projecting out the common attributes not involved in the natural join before performing the matrix multiplication for the join (i.e., "cleaning", as described above) avoids false positives in the response matrix. Note, however, that the rows the response matrix indicates are to be joined are rows of the original BPMs of the relations.
[0091] Obviously, numbering the columns of a SPARCOM database with this technique significantly increases the size of the BPMs in uncompressed form. But since, to repeat once more, BPMs are of course stored in compressed form, the actual impact on the size of the stored BPMs is very small. The following example illustrates how a natural join is performed on SPARCOM-structured data when column numbers are assigned as described here. The data set used is very small, for purposes of illustration.
[0092] Example 2
Consider the BPM given for the "Cust" relation in FIG. 4 and the BPM given for the "Sales Rep" relation. These two BPMs can be modified so that they have the same number of columns and so that all attributes the two relations have in common use the same column numbers. The two modified BPMs for "Cust" and "Sales Rep" are shown in FIGS. 8 and 7B, respectively. The SQL statement that joins these two relations on the "State" attribute is given below.
Select * from Cust c, Salesrep s where c.state = s.state;
[0093] Projecting the "Sales Rep" relation to select only its "State" attribute yields a projected BPM. This BPM is transposed to obtain the BPM of FIG. 7D. Multiplying the BPM of the "Cust" relation (FIG. 8) by the transposed BPM of the "Sales Rep" relation (FIG. 7D) yields the response matrix BPM₉. BPM₉ has three non-zero ("1's") entries: (1, 1) (1, 3) (2, 2). These entries specify the rows of the BPMs (shown in FIGS. 8 and 7B) that are to be joined: row 1 of BPM₈ with row 1 of BPM_7B, row 2 of BPM₈ with row 2 of BPM_7B, and row 6 of BPM₈ with row 1 of BPM_7B.
[0094] Key exchange
In the present invention, the BPM data on the database server computer can be interpreted only by an end-user client workstation that holds the corresponding codebook entries. Conversely, an end-user client workstation that does not have the codebook entries cannot interpret the data. For the database information of a BPM to be accessible on two or more end-user client workstations, the codebooks of those workstations must contain entries for the properties to which they share access. In other words, the present invention requires some mechanism or method for securely distributing codebook information (whether in whole or in part) so that database information can be shared. When a new property is added to a table from one end-user client workstation, the other end-user client workstations that are granted access to this information must have their codebooks updated with the new entry. The exchange of codebook entries is clearly a key exchange problem: codebook entries are cryptographic keys, so transferring the details of codebook entries is an exchange of cryptographic keys.
[0095] How to exchange cryptographic keys securely is a well-known problem, and it is well addressed by many protocols and methods. Codebook updates are distributed either directly (peer-to-peer) or through a trusted intermediary.
[0096] To handle the exchange of codebook entries with a peer-to-peer method using public key cryptography (for example, RSA), the end-user client workstation uses a key sharing mechanism (or algorithm) consisting of the following steps. These steps send the updated version of the codebook to the other end-user client workstations.
Step 1) Check which end-user client workstations are entitled to access the database information associated with the codebook entries being distributed. In one preferred embodiment of the present invention, this information is maintained locally. In another preferred embodiment, it is maintained remotely on a trusted third-party computer.
Step 2) The codebook entries are digitally signed (encrypted) with the transmitting station's private key.
Step 3) The digitally signed (i.e., encrypted) codebook entries are then encrypted with the public key of each end-user client workstation authorized to receive the codebook update.
Step 4) The appropriate codebook update is sent from the transmitting station to the other end-user client workstations entitled to receive it.
[0097] The end-user client workstation receiving the encrypted codebook update (hereafter the "receiving station") performs the following steps to receive the codebook update from the transmitting station.
Step 1) Receive the codebook update encrypted with the public key. Check whether the transmitting station is authorized to supply updates. If it is, proceed to the next step; otherwise, report that a security breach has occurred.
Step 2) Decrypt the received codebook update using the private key (i.e., the receiving station's private key). The result is another encrypted message, (presumably) consisting of the codebook update encrypted with the transmitting station's private key.
Step 3) Decrypt the ciphertext obtained in the previous step using the transmitting station's public key, and confirm that the source of the received codebook update is correct. If the received update is authentic (i.e., the transmitting station is authorized to provide the codebook update), the receiving station proceeds to the next step; otherwise, it reports that a security breach has occurred.
Step 4) The receiving station's codebook is updated with the received information.
[0098] A low-tech technique for exchanging keys securely is to deliver, in person, diskettes containing the appropriate codebook updates to the other users whose client workstations are entitled to access the q-code information concerned. This is a less efficient but nevertheless effective method. For added security, the contents of each diskette can simply be encrypted with the public key of the intended recipient, so that only the intended recipient can use the data.
[0099] A "trusted key server" can also be used to distribute codebook information. In this case, the update is first sent to the trusted key server; the trusted key server checks its authorization database and forwards the update to the end-user client workstations entitled to the information. FIG. 10A shows a configuration of the present invention that includes a trusted key server. It should be noted that the trusted key server need not be "completely trusted". For example, it need not be granted access to any of the database server computers on the network, nor need it be the only conduit for key distribution. A DBA (database administrator) can therefore manage the trusted key server and define database tables without having access to the data. Multiple trusted key servers can be used in the same way. FIG. 10B shows a configuration of the present invention that includes not only multiple database server computers but also multiple trusted key servers.
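The sign-then-encrypt flow of the peer-to-peer steps in paragraphs [0096] and [0097] can be sketched with textbook RSA on tiny numbers. This is a toy illustration only and must not be used for real security: the primes, exponents and function names are assumptions, a single column number stands in for a codebook entry, and real systems would add hashing and padding through a vetted cryptographic library. It needs Python 3.8 or later for the modular inverse form of `pow`.

```python
def make_keypair(p, q, e):
    """Toy RSA key pair from two small primes (illustrative values only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse of e (Python 3.8+)
    return (e, n), (d, n)        # (public key, private key)


def send_update(entry, sender_private, receiver_public):
    """Transmitting station, steps 2 and 3: sign, then encrypt for the receiver."""
    d_s, n_s = sender_private
    signed = pow(entry, d_s, n_s)            # "digital signature (encryption)"
    e_r, n_r = receiver_public
    return pow(signed, e_r, n_r)             # encrypt with receiver's public key


def receive_update(ciphertext, receiver_private, sender_public):
    """Receiving station, steps 2 and 3: decrypt, then undo the signature."""
    d_r, n_r = receiver_private
    signed = pow(ciphertext, d_r, n_r)       # receiver's private key
    e_s, n_s = sender_public
    return pow(signed, e_s, n_s)             # sender's public key


# The receiver's modulus is chosen larger than the sender's so that the signed
# value always fits; this ordering is a simplification of the toy example.
sender_public, sender_private = make_keypair(61, 53, 17)      # n = 3233
receiver_public, receiver_private = make_keypair(89, 97, 5)   # n = 8633

column_number = 17   # stands in for the codebook entry "State.NY 17"
ciphertext = send_update(column_number, sender_private, receiver_public)
recovered = receive_update(ciphertext, receiver_private, sender_public)
print(recovered)
```

Only the holder of the receiver's private key can strip the outer layer, and only a value produced with the sender's private key decodes sensibly under the sender's public key, which is what lets the receiving station check the source of the update.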
[0100] Cryptographic extensions
Extensions of the fixed-position Q-code method
Several methods used in the present invention to change the apparent statistical frequencies of properties are described specifically below. Using these methods makes cryptanalysis of the data (i.e., the BPMs) on the database server computer more difficult.
[0101] 1) Dummy columns
A BPM can be given extra columns whose "1"s and "0"s are meaningless, added, for example, at random or as a function correlated with some of the "1"s in the existing rows. End-user client workstations need not be provided with codebook updates for this information, since the data serves only to make cryptanalysis of the BPM more difficult. Still, sending codebook updates that include dummy columns is useful, because it prevents "true" updates from being distinguished from "false" ones.
[0102] 2) Dummy rows
Meaningless (or false) rows of information can be added to the database. End-user client workstations must recognize the presence of the dummy rows and ignore them when performing database operations. The preferred way to handle dummy rows is to provide a BPM that contains dummy rows with "dummy row marker columns". Every dummy row has a "1" in at least one dummy row marker column. End-user client workstations authorized to access a BPM containing dummy rows receive the codebook information for the dummy row marker columns of that BPM. Database computations first check whether a row contains a dummy row marker and, if it does, ignore the row.
[0103] 3) Column splitting
This method is used to equalize property frequencies. For example, if it is known that 80% of all soldiers are male, four columns are used to record the property "male" for every one column used to record the property "female".
Multiple columns can also be used for a property even where the frequencies of its various values show no real statistical skew, simply to obscure the relationship between properties and columns.
[0104] In the most extreme form of column splitting, each column is used for only one instance of a property. If a second instance of the property needs to be added to the BPM, a new column must be added. As an example, consider the following four records.
Martha, female, blue eyes, 5'6", 120 lbs.
George, male, blue eyes, 6'1", 190 lbs.
George, male, brown eyes, 5'6", 190 lbs.
Lisa, female, brown eyes, 5'6", 120 lbs.
[0105] These records can be represented by a BPM (or by some of the rows of a BPM) of this kind, annotated to clarify the meaning of the columns. From an information-theoretic point of view, a BPM constructed in this way is remarkably secure. It should be noted that query operations can still be performed on this BPM.
[0106] A trusted key server can be given access to monitor the frequencies of the various properties and, when a certain threshold is exceeded, direct that columns be split. Alternatively, column splitting can be initiated by the end-user client workstations, since an end-user client workstation can of course calculate the frequencies of the properties to which it has been granted access.
[0107] 4) Column offset
All indices in the compressed BPM are offset from their actual values. Different BPMs have different offsets. Columns with index values less than the offset applied to a given BPM are filled with random (or otherwise meaningless) data. The offset can be applied by any formula from which the original values can easily be calculated, so that they can still be used in database operations. As a very simple example, referring once again to FIG. 4, when an offset of +5 is applied to the BPM represented in the single index compression system, the BPM is expressed as follows.
(6,12,17,22,30,37,39,46,49,57,61,66,73,76,84,89,92,98,104,107,116,117,122,127) Dim (11,26)
[0108] The offset information for a given BPM must be securely distributed to the end-user client workstations entitled to access that BPM. Operations involving the BPM must also take the BPM's offset into account, and column data with index values less than the offset must be discarded or ignored. When a configuration with one or more trusted key servers is used in the present invention, the offset information for a given BPM need not be sent through the same trusted key server that is used to distribute codebook updates for that BPM. The offset information can be distributed from a different trusted key server, or directly between end-user client workstations. Database instructions issued from an end-user client workstation to the database server also include the column offset information.
[0109] 5) Encryption of the dimension information of compressed sparse matrices
Additional security can be provided by encrypting the dimensions of the compressed BPMs used in the present invention. Sparse matrix compression using bitmaps, and the single index and double index compression systems, all require that the dimensions of the matrix be specified. Other sparse matrix compression systems likewise require the dimensions of the matrix to be specified in order to compress the matrix. Encrypting just the data that specifies the dimensions of a compressed sparse matrix improves the security of the encoded matrix at little implementation cost.
[0110] For example, in the single index compression system, a BPM A has exactly two components: 1) a pair of elements specifying the dimensions of the BPM, and 2) a vector v specifying the positions of the non-zero elements. In the single index method, the elements of A are numbered sequentially as if laid out in one dimension.
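Why the column count matters can be seen from a two-line conversion between a single index value and its (row, column) coordinates. The sketch below assumes the 1-based, row-major numbering that the example vectors in this description follow; the function name is hypothetical, and the values 13 and 37 with Dim (5,8) are those used in the FIG. 11 example of paragraph [0111].

```python
def to_coordinates(index, n_cols):
    """Convert a 1-based single-index position into 1-based (row, column)."""
    row, col = divmod(index - 1, n_cols)
    return row + 1, col + 1


# With the true column count of 8, both entries fall in column 5.
right_13 = to_coordinates(13, 8)
right_37 = to_coordinates(37, 8)

# With a wrong guess of 7 columns, the same indices land elsewhere, so the
# shared property of the two records is no longer visible.
wrong_13 = to_coordinates(13, 7)
wrong_37 = to_coordinates(37, 7)

print(right_13, right_37, wrong_13, wrong_37)
```

An attacker who holds the vector v but not the (encrypted) dimensions cannot tell which of the many possible coordinate readings is the real one.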
Therefore, knowing the number of columns of A is critical to interpreting which row and column of A each element of the vector v represents. Because the BPMs of a SPARCOM database are generally very large, FIG. 11 shows a small example for purposes of illustration, illustrating the single index system for a BPM of the same dimensions.
[0111] In the example of FIG. 11, if the dimensions of the matrix, Dim (5,8), are not known (in particular, that the number of columns equals 8), one cannot know, for example, that the fourth element of the vector v, which has the value 13, means that BPM A has a "1" at coordinates (2,5). Similarly, one cannot know that the tenth element of v, which has the value 37, means that A has a "1" at coordinates (5,5). And without knowing that A has a "1" at both (2,5) and (5,5), one cannot know that the second and fifth records of BPM A are identical with respect to whatever property the fifth column of the BPM represents. Any encryption system (preferably a strong one) can be used to encrypt the dimensions of the compressed sparse matrices kept in database storage.
[0112] 6) Reordering of BPM columns
Reordering the columns of a BPM is a way of changing the keys needed to access the information. Column reordering can be accomplished in a number of ways. The preferred way is for the owner of the table to perform the following steps on his or her end-user client workstation.
Step 1) Download the table (BPM) to the end-user client workstation.
Step 2) Reorder the columns randomly. (A program using a pseudo-random number generator, or a program linked to random numbers from a physical source, is used to assist in selecting the column ordering.)
Step 3) Delete the original BPM from the database server computer.
Step 4) Upload the newly reordered BPM to the database server computer in place of the original BPM.
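Step 2 and the matching codebook update can be sketched as follows. The code is a minimal illustration under stated assumptions: the tiny two-row BPM, its four-entry codebook and the function names are hypothetical, the single index scheme is taken to be 1-based and row-major, and the seeded `random.Random` merely stands in for whatever pseudo-random or physical source the table owner would really use.

```python
import random


def reorder_columns(indices, dim, codebook, seed=None):
    """Randomly renumber the columns of a single-index BPM and of its codebook."""
    n_rows, n_cols = dim
    new_numbers = list(range(1, n_cols + 1))
    random.Random(seed).shuffle(new_numbers)
    mapping = dict(zip(range(1, n_cols + 1), new_numbers))   # old -> new column
    new_indices = []
    for index in indices:
        row, col = divmod(index - 1, n_cols)
        new_indices.append(row * n_cols + mapping[col + 1])
    new_codebook = {attr: mapping[col] for attr, col in codebook.items()}
    return sorted(new_indices), new_codebook


def decode(indices, dim, codebook):
    """Reverse codebook search: {row number: set of plaintext attributes}."""
    n_cols = dim[1]
    names = {col: attr for attr, col in codebook.items()}
    records = {}
    for index in indices:
        row, col = divmod(index - 1, n_cols)
        records.setdefault(row + 1, set()).add(names[col + 1])
    return records


# Hypothetical toy table: row 1 = (Lynn, NY), row 2 = (Bill, NJ).
codebook = {"State.NY": 1, "State.NJ": 2, "Name.Lynn": 3, "Name.Bill": 4}
bpm, dim = [1, 3, 6, 8], (2, 4)

new_bpm, new_codebook = reorder_columns(bpm, dim, codebook, seed=7)
print(decode(new_bpm, dim, new_codebook))
```

The reordered BPM decodes to the same records only with the updated codebook, which is why the old codebook stops being a usable key once the original BPM has been replaced on the server.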
After BPM A has been reordered in this way, it is of course necessary to provide updated codebooks to the users who are authorized to access the data of BPM A.

[0113] If the universal property column numbering scheme specified above is used to assist natural join operations, other BPMs that share properties with BPM A must have their columns reordered in the same way as BPM A. It is therefore clear that a chain of BPMs linked through the web of common properties that exists among the BPMs of the database must be reordered together in order to maintain the property column numbering.

[0114] Variations on the architecture of a distributed database according to the present invention
FIG. 6A shows the distributed database architecture described at the beginning of this specification. FIGS. 6B, 10A, and 10B show other distributed database architectures. There are many other ways of distributing the components of the distributed database architecture of the present invention, and it is readily apparent that they are not inconsistent with the invention described herein. A notable additional configuration is one in which some or all of the individual end users' workstations that are located on the network and available house portions of the SPARCOM database. Under this scheme, an end user's workstation accesses SPARCOM database information located on one or more different workstations instead of on a dedicated SPARCOM database server.

[0115] Database property independence
A practical advantage of the present invention is that it can provide property independence for the database. The compressed BPMs used in this invention record only whether a property is present or absent; it is the end user's client workstation that identifies the content of each property to which it has access. The client workstations of different end users can hold different interpretations of a given property (i.e., of a given BPM column number or set of column numbers). For example, two different codebooks that refer to the same column of a given BPM can contain entries that have the same meaning in different natural languages, for example English and Japanese. Compared with attribute-oriented databases, the compressed BPM data used here is completely free of natural-language bias. The property independence offered by this invention can also be applied to more complex data objects, such as images, video, and sound. For example, two different codebooks that refer to the same BPM can even contain entries with different data types for the same property, as when one codebook specifies a text value for a given property and the other codebook specifies an audio or image file for the same BPM column number.

[0116] Although the present invention has been described in detail in the foregoing embodiments for the purpose of illustration, it should be understood that such detail is solely for that purpose, and that variations can be made by those skilled in the art without departing from the spirit and scope of the invention, except as it may be described in the following claims.
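As an illustration of the property independence described in paragraph [0115], the following minimal sketch interprets one and the same BPM through different codebooks. The BPM contents, the column meanings, the file name, and the `interpret` helper are all hypothetical and chosen only for illustration; the specification does not prescribe this data layout.

```python
# Hypothetical compressed BPM: the set of 1-based (record, column) pairs that hold a "1".
bpm_ones = {(1, 3), (2, 5), (5, 5)}

# Hypothetical codebooks that refer to the same BPM columns.
codebook_en = {3: "color: red", 5: "size: large"}              # English text values
codebook_ja = {3: "色: 赤", 5: "サイズ: 大"}                    # Japanese text values
codebook_audio = {5: ("audio", "size_large.wav")}              # a different data type

def interpret(record, ones, codebook):
    """Return the codebook entries for the columns set in the given record.

    Columns missing from the codebook are skipped, which models a user
    who has no access to the meaning of those properties.
    """
    return [codebook[col] for (row, col) in sorted(ones)
            if row == record and col in codebook]

print(interpret(2, bpm_ones, codebook_en))
print(interpret(2, bpm_ones, codebook_ja))
print(interpret(2, bpm_ones, codebook_audio))
```

The stored matrix is identical in all three cases; only the codebook held on the client side determines the language or data type in which a property is presented.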

Continuation of front page
(51) Int.Cl.⁷, identification symbol, FI, theme code (reference): G09C 1/00 660, G06F 15/40 320B

Claims

[Claims]

1. An apparatus for securely storing data, comprising: a database for storing semantically encrypted data; a database mechanism for performing meaningful database operations on the semantically encrypted data without requiring decryption of the data, the database mechanism being connected to the database; and an access mechanism connected to the database for obtaining data from the database mechanism.

2. The apparatus according to claim 1, wherein the access mechanism includes an encryption/decryption mechanism, the encryption/decryption mechanism being connected to the database mechanism, receiving decrypted data, encrypting the data, and providing the encrypted data to the database mechanism, and receiving encrypted data from the database mechanism and decrypting the encrypted data.

3. The apparatus according to claim 2, wherein the semantically encrypted data is a fixed-position Q-code based on properties.

4. The apparatus according to claim 3, wherein the fixed-position Q-code for the properties includes a sparse binary matrix.

5. The apparatus according to claim 4, wherein the access mechanism comprises an end user's workstation, the end user's workstation includes a workstation CPU and a workstation memory connected to the CPU, and the encryption/decryption mechanism comprises a codebook stored in the memory and a software program, stored in the memory, for accessing and updating the codebook.

6. The apparatus according to claim 5, comprising a database server computer that has the database mechanism and the database.

7. The apparatus according to claim 6, wherein the database mechanism comprises a server CPU and a server memory connected to the server CPU, the server memory having the database.

8. The apparatus according to claim 7, wherein the server memory includes a database command storage buffer and a database response storage buffer.

9. The apparatus according to claim 8, wherein the server computer has a server communication port connected to the server memory and the server CPU.

10. The apparatus according to claim 9, wherein the workstation includes a workstation communication port, an input port, and an output port; the workstation communication port is connected to the server communication port, the workstation CPU, and the workstation memory; and the input port and the output port are connected to the workstation memory, the workstation CPU, and the workstation communication port.

11. An apparatus for securely storing data, comprising: a database having fully indexed data; a database mechanism for performing operations on, or using, the fully indexed data, which has indexed information that allows access to and interpretation of the fully indexed data, the database mechanism being connected to the database; and an access mechanism connected to the database mechanism for obtaining data from the database mechanism.

12. The apparatus according to claim 4, wherein, to improve the security of the positional Q-code for the properties, dummy columns, dummy rows, column splitting, column offsets, encryption of the dimension information of the compressed sparse matrix, or reordering of BPM columns is used.

13. An apparatus for storing data, comprising: a database that stores semantically represented data; a database mechanism that performs database operations using the semantically represented data, the database mechanism being connected to the database; and an access mechanism connected to the database mechanism for obtaining data from the database mechanism, the access mechanism providing different users with different representations of the semantically represented data.

14. The apparatus according to claim 13, wherein the access mechanism provides the semantically represented data to different users in different natural languages.

15. The apparatus according to claim 13, wherein the access mechanism provides the semantically represented data in an audio representation to a person who is visually impaired.

16. A method for securely storing data, comprising the steps of: storing semantically encrypted data in a memory; performing database operations using the semantically encrypted data from the memory without the need to decrypt the data; and obtaining data from the memory.