JP3928677B2 - Information search method and information search apparatus

Info

Publication number: JP3928677B2
Application number: JP31957797A
Authority: JP
Inventors: 憲浅野; 環前野
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1997-11-20
Filing date: 1997-11-20
Publication date: 2007-06-13
Anticipated expiration: 2017-11-20
Also published as: JPH11154156A

Description

【０００１】
【発明の属する技術分野】
この発明は、例えば、ＣＤ−ＲＯＭ（コンパクトディスクＲＯＭ）などの記録媒体に情報を圧縮して記録する方法、装置、これらの方法、装置により圧縮された情報が記録された記録媒体、および、圧縮されて記録媒体に記録された情報の検索方法、検索装置に関する。
【０００２】
【従来の技術】
ＣＤ−ＲＯＭに記録された国語辞典や英和辞典など各種文献の内容情報を、例えば、専用の情報検索装置を用いて検索することができるようにされた、いわゆる電子ブックシステムが提供されている。
【０００３】
この電子ブックシステムは、例えば、意味や内容を知りたい言葉や単語などの検索キー情報（インデックス情報）を情報検索装置に入力することにより、当該装置に装填されたＣＤ−ＲＯＭに記録されている情報を検索する。そして、入力された検索キー情報に対応する情報を、当該ＣＤ−ＲＯＭから読み出し、これを情報検索装置の表示画面に表示するなどして、利用者に提供する。
【０００４】
したがって、電子ブックシステムの利用者は、国語辞典や英和辞典などの文献のページをめくって、調べたい言葉や単語についての記述を見付け出すなど、手間や時間をかけることなく、迅速に目的とする言葉や単語の意味内容を検索して、得ることができる。
【０００５】
ところで、電子ブックシステムにおいては、情報の迅速な検索を実現するため、情報検索用の階層構造のインデックス情報が作成され、このインデックス情報が文献の内容情報（以下、本文データという）と共に、ＣＤ−ＲＯＭに記録されている。
【０００６】
この検索用の階層構造のインデックス情報のうち、最下層以外の各階層のインデックス情報は、入力された検索キー情報と比較される比較キー情報と、その比較キー情報に対応する次層のインデックス情報の先頭記録位置を示すアドレス情報を有する構成とされている。また、最下層のインデックス情報は、入力された検索キー情報と比較されるキー情報であって、入力された検索キー情報に一致する比較キー情報と、入力された検索キー情報に対応する情報のＣＤ−ＲＯＭ上の先頭記録位置を示すアドレス情報（本文アドレス情報）を有している。
【０００７】
そして、入力された検索キー情報とインデックス情報の比較キー情報との比較処理を順次に行うことによって、入力された検索キー情報に対応する情報のＣＤ−ＲＯＭ上の先頭記録開始位置を探し出すようにされている。この場合、インデックス情報の全ての比較キー情報を対象に検索を行うことなく、入力された検索キー情報の検索範囲を徐々に絞り込んで行くことができるようにされており、迅速に入力された検索キー情報に対応する情報をＣＤ−ＲＯＭに記録されている本文データから探し出して利用することができるようにされている。
【０００８】
【発明が解決しようとする課題】
ところで、近年、電子ブックシステムのＣＤ−ＲＯＭに、より多くの情報を記憶させることにより、その内容を充実させたいとする要求が大きくなってきている。このように、ＣＤ−ＲＯＭに記憶する情報量を多くすることにより、電子ブックシステムにおいて、ＣＤ−ＲＯＭの交換回数を少なくすることができるなど、電子ブックシステムの利便性を向上させることが期待できる。
【０００９】
しかし、電子ブックシステムのＣＤ−ＲＯＭに記録される各種文献の本文データは、従来から非圧縮状態で記憶するようにされており、ＣＤ−ＲＯＭの容量不足が問題となっている。
【００１０】
そこで、本文データを圧縮してＣＤ−ＲＯＭに記憶することが考えられる。しかし、以下のような問題があり、単純に本文データを圧縮してＣＤ−ＲＯＭに記録することはできない。
【００１１】
まず、圧縮する本文データの処理単位が問題になる。例えば、１つの文献の本文データの全部を１固まりのデータ（処理単位）として、圧縮するようにした場合には、圧縮されたこの１固まりの本文データの全部を情報検索装置に取り込んで、圧縮解凍処理（伸長処理）しなければならないために、情報検索装置に大きなメモリを搭載しなければならなくなる。また、この場合、処理単位当たりの本文データのデータ量が多いので、ＣＤ−ＲＯＭからの圧縮された本文データの取り込みや、圧縮された本文データの伸長処理に時間が掛り、迅速な検索処理が実現できなくなる。
【００１２】
このため、本文データを、例えば検索の対象となる１まとまりの情報（検索対象項目）毎に区切り、この区切った情報毎に本文データを圧縮することが考えられる。本文データが、例えば国語辞典のデータである場合には、単語とその単語の意味内容を示す情報を検索の対象となる１まとまりの検索対象項目として、この検索対象項目毎に本文データを圧縮するようにする。
【００１３】
しかし、このようにした場合には、検索対象データが小さすぎたり、あるいは、検索対象項目のデータの大きさがまちまちとなるために、効率のよいデータ圧縮ができない可能性がある。
【００１４】
また、電子ブックシステムの場合、前述したように、迅速な検索を可能にするため、各検索対象項目のＣＤ−ＲＯＭ上の先頭記録位置を示すアドレスを有するインデックス情報が、本文データと共にＣＤ−ＲＯＭに書き込まれるようにされている。このため、検索対象データ毎に、本文データを圧縮してＣＤ−ＲＯＭに記録するようにした場合には、従来からある電子ブックシステムのソフトウエアについても、そのインデックス情報を、圧縮した本文データに対応して作り直さなければならない。
【００１５】
電子ブック規格のインデックス情報は、前述のように階層構造の複雑な構成とされており、本文データの圧縮に伴いインデックス情報を作り直すには、インデックス情報を新規に作成する場合と同じ位の時間とコストがかかる。このことが、本文データを圧縮することにより、より多くの本文データをＣＤ−ＲＯＭに記録するようにした電子ブックシステムのＣＤ−ＲＯＭの提供を阻害する原因になっている。
【００１６】
以上のことにかんがみ、この発明は、上記問題点を一掃し、記録媒体の記憶容量を有効に活用できるようにする情報記録方法、情報記録装置、および、記録媒体に記録された情報から目的の情報を合理的かつ迅速に検索することができる情報検索方法、情報検索装置および目的の情報を合理的かつ迅速に検索することができるように情報が記録された情報記録媒体を提供することを目的とする。
【００１７】
【課題を解決するための手段】
上記課題を解決するため、請求項１に記載の発明の情報検索方法は、
記録媒体からデータを読み出す読み出し手段と、読み出されたデータを圧縮解凍する圧縮解凍手段と、圧縮解凍されたデータを出力するデータ出力手段とを備える情報検索装置において用いられ、複数個の検索対象項目を含む本文データが、前記検索対象項目毎の区切りなく、順次に記録媒体に記録されたときの前記検索対象項目毎の先頭記録位置を検出するためのインデックス情報を、検索のためのキー情報として、圧縮されて記録されている前記複数個の検索対象項目を含む本文データから、目的とする前記検索対象項目を検出するようにする情報検索方法であって、
前記記録媒体には、前記本文データが、所定の等しい大きさのデータ量毎に分割されて、その分割データ単位で圧縮されたものが、記録位置が連続するように順次に記録されていると共に、前記分割データ単位毎の圧縮後のデータサイズの累算値が、各分割データ毎に対応付けられて記述された圧縮サイズテーブルが、前記インデックス情報に加えて記録されており、
前記読み出し手段により、前記インデックス情報に基づいて、指定された前記検索対象項目の先頭位置を示すアドレスまでのデータ量を特定し、前記先頭位置を示すアドレスまでのデータ量を前記所定の大きさのデータ量で割り算することで、前記先頭位置を含む前記分割データ単位を特定すると共に、前記分割データの圧縮後のデータサイズの累算値に基づいて、当該特定した分割データ単位に対応する圧縮後の本文データの記録開始位置と、そのデータ量とを特定して、当該特定した分割データ単位に対応する本文データを、前記記録媒体から読み出す読み出し工程と、
前記圧縮解凍手段により、前記読み出し工程において前記読み出し手段により読み出された前記分割データ単位の圧縮された本文データの圧縮を解凍する圧縮解凍工程と、
前記データ出力手段により、前記圧縮解凍工程において前記圧縮解凍手段により圧縮解凍されたデータの中の、前記インデックス情報に基づいて検出される前記指定された検索対象項目の先頭位置から、当該検索対象項目のデータを出力する対象データ出力工程と
を備えることを特徴とする。
【００２７】
この請求項１に記載の発明の情報検索方法によれば、記録媒体には、複数個の検索対象項目を含む本文データが、予め決められた等しい大きさのデータ量毎に分割され、この分割データ単位に圧縮されて記録されていると共に、分割データ単位毎の圧縮後のデータサイズに関する情報と、圧縮前の本文データの検索対象項目毎の先頭位置を検出するためのインデックス情報とが記録されている。
【００２８】
読み出し工程において、読み出し手段により、前記インデックス情報と、前記データサイズに関する情報とに基づいて、指定された検索対象項目の先頭位置を含む分割データ単位が特定され、特定された分割データ単位に対応する圧縮後の本文データが、前記記録媒体から読み出される。
【００２９】
読み出された圧縮されている本文データは、圧縮解凍工程において、圧縮解凍手段により、圧縮解凍すなわち伸長されて、元の本文データに復元される。対象データ出力工程により、この復元された本文データの中から、指定された検索対象項目のデータが検出され、これが、データ出力工程において、データ出力手段により例えば表示されるなどして出力される。
【００３０】
このように、圧縮されて記録媒体に記録されている本文データの読み出しや、読み出した本文データの圧縮解凍処理は、分割データ単位に行われる。したがって、例えば、１つの文献の本文データの全体を１つの処理単位とする場合のように、処理単位当たりのデータ量が大きすぎることがなく、記録媒体から目的の検索対象項目のデータを迅速に読み出して、迅速に圧縮解凍して利用することができる。
【００３２】
また、各分割データの圧縮後のデータサイズが累算されて得られる累算値が、各分割データに対応して記述されたもの（圧縮サイズテーブル）が、データサイズに関する情報として記録媒体に記録されている。
【００３３】
この場合、各分割データに対する圧縮後のデータサイズの累算値は、次の分割データの圧縮後の先頭記録位置を示し、また、目的とする分割データまでの圧縮後のデータサイズの累算値から、その１つ前の分割データまでの圧縮後のデータサイズの累算値を減算することにより、目的とする分割データの圧縮後のデータ量を得ることができる。
【００３４】
これにより、読み出し工程においては、インデックス情報を参照して指定された検索対象項目の先頭を含む分割データを特定すると共に、この特定した分割データに対応する圧縮後の本文データの先頭記録位置と、データ量とを簡単な演算処理により合理的に検出することができるようにされる。つまり、圧縮されて記録媒体に記録されている本文データの中から、指定された検索対象項目のデータを迅速に検索して利用できるようにすることができる。
【００３５】
【発明の実施の形態】
以下、図を参照しながら、この発明の方法、装置の一実施の形態について説明する。
【００３６】
この実施の形態においては、いわゆる電子ブックシステムにこの発明を適用したものとして説明する。電子ブックシステムは、前述もしたように、各種の文献の本文データを、例えばＣＤ−ＲＯＭに記録しておき、情報検索装置を用いて、ＣＤ−ＲＯＭに記録されている本文データの中から目的とする検索対象項目を検索するようにしたものである。
【００３７】
例えば、国語辞典は、単語とその単語の意味内容を示す情報とからなる検索対象データが多数集まることにより、１冊の国語辞典の本文データを形成するが、この本文データをデジタルデータとして、例えばＣＤ−ＲＯＭに記録することにより電子ブックシステムのＣＤ−ＲＯＭを作成する。
【００３８】
そして、例えば電子ブックシステム用の情報検索装置に当該ＣＤ−ＲＯＭを装填し、調べたい単語を検索キー情報として入力すると、この検索キー情報に対応する検索対象項目のデータ（この場合には、単語とその意味内容を示す情報）がＣＤ−ＲＯＭに記録されている本文データの中から検索されて、これが情報検索装置の表示画面に表示されるなどしてユーザに提供される。
【００３９】
電子ブックシステムは、このように、ＣＤ−ＲＯＭなどの記録媒体に記録された文献情報を、簡単な操作で、迅速に検索して利用することができるようにされたものである。
【００４０】
［電子ブックシステム用のＣＤ−ＲＯＭの作成］
まず、電子ブックシステムで用いられる、いわゆる電子ブック規格のＣＤ−ＲＯＭの作成について説明する。図１は、ＣＤ−ＲＯＭに情報を書き込むことにより、電子ブック規格のＣＤ−ＲＯＭを作成するこの実施の形態の情報記録装置を説明するためのブロック図である。この実施の形態の情報記録装置は、この発明による情報記録方法が適用されたものであり、従来、非圧縮で記録していた各種の文献の本文データを圧縮してＣＤ−ＲＯＭに記録することができるようにされたものである。
【００４１】
図１に示すように、この実施の形態の情報記録装置は、インデックス情報発生部１、本文データ発生部２、データ分割部３、データ圧縮部４、圧縮サイズテーブル生成部５、書き込み制御部６を備えている。また、ＣＤ−ＲＯＭ２００は、この実施の形態の情報記録装置に装填され、本文データ、インデックス情報、圧縮サイズテーブルが書き込まれるものである。
【００４２】
図１に示すこの実施の形態の情報記録装置の具体的な説明に入る前に、この実施の形態のインデックス情報発生部１において発生されて、ＣＤ−ＲＯＭ２００に記録するようにされるインデックス情報について説明する。このインデックス情報は、電子ブックシステムにおいて、迅速な検索処理を実現するために作成されて用いられるものである。
【００４３】
図２は、インデックス情報発生部１から出力される電子ブック規格のインデックス情報の一例を説明するための図である。電子ブック規格のインデックス情報は、ＣＤ−ＲＯＭに記録する文献の本文データに応じて、ｎ次の階層構造で作成される。図２に示した電子ブック規格のインデックス情報は、３次の階層構造の例であり、例えば、国語辞典用のインデックス情報の例である。
【００４４】
図２に示すように、この例のインデックス情報は、第１次インデックスブロック１Ｂ、第２次インデックスブロック２Ｂ、第３次インデックスブロック３Ｂからなっている。第２次インデックスブロック２Ｂ、および、この例の最下層のインデックスブロックである第３次インデックスブロック３Ｂは、その内容がさらに細分化され、複数の細分化ブロック２Ｂ１、２Ｂ２、…、複数の細分化ブロック３Ｂ１、３Ｂ２、…、を備えている。
【００４５】
そして、第１次インデックスブロック１Ｂ、および、第２次インデックスブロック２Ｂの各細分化ブロック２Ｂ１、２Ｂ２、…は、入力された検索キー情報と比較される「あま」、「かき」といった比較キー情報と、その比較キー情報に対応する次層の細分化ブロックの先頭記録位置を示すアドレス情報を有している。
【００４６】
また、第３次インデックスブロック３Ｂの各細分化ブロック３Ｂ１、３Ｂ２、…は、この例の最下層のインデックスブロックであり、入力された検索キー情報と比較されるキー情報であって、入力された検索キー情報に一致する比較キー情報と、ＣＤ−ＲＯＭなどの記録媒体に記録されている本文データのうちの、比較キー情報に対応する検索対象項目の先頭記録位置を示すアドレス情報（本文アドレス情報）を有している。
【００４７】
このように構成されたインデックス情報を用いて、入力された検索キー情報に基づく情報検索は、以下のようにして行なわれる。
【００４８】
この例においては、まず、第１次インデックスブロック１Ｂを参照し、入力された検索キー情報の先頭から２文字の情報と、第１次インデックスブロック１Ｂの比較キー情報とを比較する。この比較処理により、入力された検索キー情報の先頭から２文字の情報は、五十音順で、第１次インデックスブロック１Ｂの比較の対象となった比較キー情報より、前に位置する情報か、後ろに位置する情報か、あるいは、第１次インデックスブロック１Ｂの当該比較キー情報と同じ情報かを判断する。
【００４９】
入力された検索キー情報の先頭から２文字の情報が、五十音順で、第１次インデックスブロック１Ｂの比較の対象となった比較キー情報より後ろに位置する情報であると判断したときには、第１次インデックスブロック１Ｂの次の比較キー情報について、同じように比較処理を行う。
【００５０】
また、入力された検索キー情報の先頭から２文字の情報が、五十音順で、第１次インデックスブロック１Ｂの比較の対象となった比較キー情報より前に位置する情報である、あるいは、第１次インデックスブロック１Ｂの当該比較キー情報と同じ情報であると判断したときには、第１次インデックスブロックの当該比較キー情報に対応する次層のアドレス情報に基づいて、第２次インデックスブロック２Ｂの該当する細分化ブロックを参照する。
【００５１】
そして、入力された検索キー情報の先頭から２文字の情報と、第２次インデックスブロック２Ｂの指定された細分化ブロックの比較キー情報との間で、上述の第１次インデックスブロックの場合と同様に比較処理を行う。
【００５２】
この第２次インデックスブロック２Ｂの指定された細分化ブロックの比較キー情報との間で行なわれる比較処理において、入力された検索キー情報の先頭から２文字の情報が、五十音順で、第２次インデックスブロック２Ｂの細分化ブロックの比較の対象となった比較キー情報より前に位置する情報である、あるいは、比較の対象となった比較キー情報と同じ情報であると判断したときには、その比較キー情報に対応する次層のアドレス情報により指定される、第３次インデックスブロック３Ｂの該当する細分化ブロックを参照する。
【００５３】
そして、入力された検索キー情報と、第３次インデックスブロック３Ｂの指定された細分化ブロックの比較キー情報との間で比較処理を行い、入力された検索キー情報と一致する比較キー情報を検出する。この検出された第３次インデックスブロック３Ｂの細分化ブロックの比較キー情報に対応して記憶されている本文アドレスが、入力された検索キー情報に対する検索対象データのＣＤ−ＲＯＭ上の記録開始位置を示している。したがって、この本文アドレスにより示されるＣＤ−ＲＯＭの記録位置から本文データを読み出すことにより、入力された検索キー情報に対応する検索対象データを取得することができるようにされる。
【００５４】
例えば、「あいさつ」が検索キー情報として入力された場合には、この検索キー情報の先頭から２文字の「あい」が、第１次インデックスブロックの比較キー情報と比較される。まず、検索キー情報の先頭から２文字「あい」と、第１次インデックスブロックの比較キー情報「あま」とが比較される。検索キー情報の先頭から２文字「あい」は、比較キー情報「あま」よりも五十音順で前に位置する情報であるので、比較キー情報「あま」に対応して記録されているアドレス情報に基づいて、第２次インデックスブロックの細分化ブロック２Ｂ１を参照する。
【００５５】
そして、検索キー情報の先頭から２文字「あい」と、第２次インデックスブロック２Ｂの細分化ブロック２Ｂ１の比較キー情報とが比較される。最初に、検索キー情報の先頭から２文字「あい」と、第２次インデックスブロック２Ｂの細分化ブロック２Ｂ１の比較キー情報「あう」とが比較される。検索キー情報の先頭から２文字「あい」は、比較キー情報「あう」よりも五十音順で前に位置する情報であるので、比較キー情報「あう」に対応して記録されているアドレス情報に基づいて、第３次インデックスブロックの細分化ブロック３Ｂ１を参照する。
【００５６】
そして、細分化ブロック３Ｂ１の比較キー情報の中から、検索キー情報「あいさつ」に一致する比較キー情報を検出し、この検出された比較キー情報「あいさつ」に対応して記録されている本文アドレスに基づいて、入力された検索キー情報に対応する検索対象項目のデータが読み出されて提供される。つまり、この場合には、検索キー情報「あいさつ」の意味内容を示すテキストデータが検索対象項目のデータとして本文データから読み出されてユーザに提供される。
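The three-level lookup walked through above can be sketched as follows. This is only an illustration, not the actual index format of the electronic book standard: the block layout, every sample key other than 「あま」, 「かき」, 「あう」 and 「あいさつ」, and all addresses are hypothetical. Python's code-point string comparison is used as a stand-in for gojūon order, which is an assumption that holds for these plain-hiragana keys.

```python
from typing import List, Optional, Tuple

Entry = Tuple[str, int]  # (comparison key, child block number or body address)

def find_child(block: List[Entry], prefix: str) -> Optional[int]:
    """Return the child of the first entry whose key is not before `prefix`."""
    for key, child in block:
        if prefix <= key:  # prefix sorts before the key, or equals it
            return child
    return None  # prefix sorts after every key in this block

def lookup(level1: List[Entry], level2: List[List[Entry]],
           level3: List[List[Entry]], word: str) -> Optional[int]:
    """Walk the three index levels and return the body address of `word`."""
    prefix = word[:2]  # the upper levels compare only the first two characters
    b2 = find_child(level1, prefix)
    if b2 is None:
        return None
    b3 = find_child(level2[b2], prefix)
    if b3 is None:
        return None
    for key, body_address in level3[b3]:  # lowest level: exact match
        if key == word:
            return body_address
    return None

# Hypothetical sample index (addresses are made up for illustration).
LEVEL1 = [("あま", 0), ("かき", 1)]
LEVEL2 = [[("あう", 0), ("あま", 1)], [("かき", 2)]]
LEVEL3 = [[("あい", 0x0000), ("あいさつ", 0x0120)],
          [("あお", 0x0400)],
          [("かい", 0x2A00)]]

print(hex(lookup(LEVEL1, LEVEL2, LEVEL3, "あいさつ")))  # 0x120
```

Each level only scans one small block, so the number of comparisons stays far below the total number of keys in the index.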
【００５７】
そして、この実施の形態においては、図２を用いて説明したように階層構造で作成される電子ブック規格のインデックス情報は、図３に示すように、本文データが記録されるＣＤ−ＲＯＭに設けられるインデックス領域ＩＤＸに記録される。
【００５８】
この場合、インデックス領域ＩＤＸには、インデックス領域ＩＤＸの物理アドレスが低い方から高い方へ、第１次インデックスブロック１Ｂ、第２次インデックスブロック群２Ｂ、…、第ｎ次インデックスブロック群というように順に記録される。これにより、物理アドレスの低い方から高い方へ、順にインデックス情報の階層をたどることができるようにされている。
【００５９】
また、この実施の形態においては、文字管理ブロックＭＫが設けられ、入力された検索キー情報に応じて、第１次インデックスブロックのどこから検索を開始するかを決めることができるようにされている。例えば、検索キー情報の先頭文字が、五十音の「あ行」から「な行」までなら、第１次インデックスブロック１Ｂの先頭から検索を開始し、検索キー情報の先頭文字が、「は行」以降であれば、第１次インデックスブロックの中間位置付近の予め定められた位置から、すなわち、第１次インデックスブロックの「は行」の開始位置から検索を開始することができるようにされている。
【００６０】
このように、電子ブック規格のインデックス情報は、インデックス情報の全ての比較キー情報を対象に検索処理を行うことなく、インデックス情報を階層構造にすることで、検索範囲を効率よく絞り込んで行き、入力された検索キー情報に対応する検索対象項目を本文データから迅速に探し出して利用することができるようにされている。
【００６１】
ところで、前述もしたように、電子ブック規格の本文データは、非圧縮状態でＣＤ−ＲＯＭに記憶されているため、より多くの文献の本文データなどを記録することができないなど、ＣＤ−ＲＯＭの限られた記憶容量を有効に活用していない場合も多い。そこで、本文データの圧縮が考えられるが、前述したように単純に圧縮することはできない。
【００６２】
また、本文データを圧縮してＣＤ−ＲＯＭに記録する場合、インデックス情報の作り直しが必要になるが、電子ブック規格のインデックス情報は、図２を用いて前述したように、階層構造の複雑な構成とされており、インデックス情報を作り直すには、時間とコストがかかる。このことが、より多くの本文データをＣＤ−ＲＯＭに記録することにより、より充実した内容の電子ブックシステムのＣＤ−ＲＯＭの提供を阻害する原因になっている。
【００６３】
そこで、図１に示すこの実施の形態の情報記録装置は、本文データを、予め決められた所定のデータ量毎に分割する。以下、この明細書においては、本文データが所定のデータ量毎に分割されて形成されるデータの集まり（本文データの一部分）を単位ブロックという。そして、この単位ブロック毎に本文データを圧縮し、この圧縮した本文データを連続するアドレスに順次につめてＣＤ−ＲＯＭに記録する。
【００６４】
また、この実施の形態の情報記録装置は、圧縮前の当該本文データに対応して既に作成されているインデックス情報をそのままＣＤ−ＲＯＭに記録する。この圧縮前の本文データに対して作成されたインデックス情報を用いて、圧縮された本文データの中から入力された検索キー情報に対応する検索対象項目のデータを取得することができるようにするため、当該本文データを分割することにより形成した各単位ブロック毎の圧縮後のデータ量の累算値を、各単位ブロックに対応して記憶した圧縮サイズテーブルを形成し、これを単位ブロック毎に圧縮した本文データやインデックス情報と共に、ＣＤ−ＲＯＭに記憶することにより、電子ブック規格のＣＤ−ＲＯＭを作成する。
【００６５】
この場合、ＣＤ−ＲＯＭの記録領域は、本文データの記録領域である本文データ領域、インデックス情報の記録領域であるインデックス領域、および、圧縮サイズテーブルの記録領域である圧縮サイズテーブル領域に分離され、圧縮された本文データ、インデックス情報、圧縮サイズテーブルは、それぞれ対応する記録領域に記録される。以下、図１に示すこの実施の形態の情報記録装置について詳述する。
【００６６】
インデックス情報発生部１は、ＣＤ−ＲＯＭ２００に圧縮して記録しようとする圧縮前の本文データに対応して、予め作成されたインデックス情報に基づいて、ＣＤ−ＲＯＭ２００に記録するインデックス情報を発生し、これを書き込み制御部６に供給する。
【００６７】
本文データ発生部２は、ＣＤ−ＲＯＭ２００に圧縮して記録する本文データを発生し、これをデータ分割部３に供給する。データ分割部３は、供給された本文データを予め決められたデータ量の単位ブロックに分割し、単位ブロック毎に本文データをデータ圧縮部４に供給する。
【００６８】
図４は、データ分割部３、データ圧縮部４において行われる本文データの分割処理および圧縮処理を説明するための図であり、説明を簡単にするため、圧縮前の本文データ（以下、圧縮前本文データという）の一部を抜き出して示したものである。
【００６９】
この実施の形態において、データ分割部３は、図４Ａに示すように、圧縮前本文データＤＴを、予め決められた大きさの単位ブロックＤＴ１、ＤＴ２、ＤＴ３、ＤＴ４、…に分割する。
【００７０】
図４Ａにおいて、圧縮前本文データＤＴの左側に付された、００００Ｈ、１０００Ｈ、２０００Ｈ、３０００Ｈは、圧縮前本文データの先頭からの各単位ブロックＤＴ１、ＤＴ２、ＤＴ３、ＤＴ４の開始アドレスを示し、３ＦＦＦＨは、単位ブロックＤＴ４の終了アドレスを示している。各アドレスの末尾に付されたアルファベットの「Ｈ」は、当該アドレスが１６進法で表現されていることを示している。以下、この明細書において、アドレス情報、圧縮サイズ、バイト数などの末尾に付されたアルファベットの「Ｈ」は、それらの情報が１６進数で表現されていることを示すものとする。
【００７１】
したがって、この実施の形態においては、圧縮前本文データＤＴは、図４Ａに示すように、４０９６バイト毎に圧縮の処理単位となる単位ブロックＤＴ１、ＤＴ２、ＤＴ３、ＤＴ４、…に分割され、この単位ブロック毎の圧縮前本文データがデータ圧縮部４に供給される。
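The fixed-size division can be sketched as below. The function name is hypothetical, and letting the last block be shorter than 4096 bytes is an assumption; the text does not say how a final partial block is handled.

```python
from typing import List

UNIT_BLOCK_SIZE = 0x1000  # 4096 bytes, the size used in this embodiment

def split_into_unit_blocks(body: bytes, size: int = UNIT_BLOCK_SIZE) -> List[bytes]:
    """Cut `body` into consecutive blocks of `size` bytes (the last may be shorter)."""
    return [body[i:i + size] for i in range(0, len(body), size)]

# 0x4000 bytes of dummy data give four unit blocks, like DT1 to DT4.
blocks = split_into_unit_blocks(bytes(0x4000))
starts = [f"{i * UNIT_BLOCK_SIZE:04X}H" for i in range(len(blocks))]
print(starts)  # ['0000H', '1000H', '2000H', '3000H']
```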
【００７２】
データ圧縮部４は、各単位ブロック毎に本文データを圧縮する。このデータ圧縮部４において単位ブロック毎に圧縮された本文データが、順次にＣＤ−ＲＯＭ２００に記録されて、図４Ａに示すように、圧縮後の本文データＤＴＡを形成するようにされる。ここで、圧縮後の本文データＤＴＡを構成する圧縮後単位ブロックＤＴＡ１、ＤＴＡ２、ＤＴＡ３、ＤＴＡ４は、圧縮前本文データの各単位ブロックＤＴ１、ＤＴ２、ＤＴ３、ＤＴ４のデータを圧縮することにより形成されたものである。
【００７３】
そして、図４Ｂに示すように、例えば、アドレス００００Ｈ〜アドレス０ＦＦＦＨまでの約４キロバイト（４０９６バイト）の単位ブロックＤＴ１は、これに対応する圧縮後単位ブロックＤＴＡ１に示すように、２０６６バイト（１６進数で表すと０８１２Ｈバイト）に圧縮される。
【００７４】
同様に、図４Ａに示すように、アドレス１０００Ｈ〜１ＦＦＦＨまでの約４キロバイトの単位ブロックＤＴ２は、圧縮後単位ブロックＤＴＡ２に示すように、２２９４バイト（１６進数で表すと０８Ｆ６Ｈバイト）に圧縮され、アドレス２０００Ｈ〜２ＦＦＦＨまでの単位ブロックＤＴ３は、圧縮後単位ブロックＤＴＡ３が示すように、１７６７バイト（１６進数で表すと０６Ｅ７Ｈバイト）に圧縮される。また、アドレス３０００Ｈ〜３ＦＦＦＨまでの単位ブロックＤＴ４は、圧縮後単位ブロックＤＴＡ４が示すように、２５７８バイト（１６進数で表すと０Ａ１２Ｈバイト）に圧縮される。
【００７５】
そして、データ圧縮部４は、単位ブロック毎に圧縮した本文データを書き込み制御部６に供給する。また、データ圧縮部４は、各単位ブロック毎の圧縮後の本文データのデータ量を検出し、これを圧縮サイズテーブル生成部５に供給する。
【００７６】
この実施の形態において、圧縮サイズテーブル生成部５は、データ圧縮部４からの各単位ブロック毎の圧縮後のデータ量の累算値を求め、この累算値を圧縮サイズとして、各単位ブロックに対応付けた圧縮サイズテーブルを作成し、これを書き込み制御部６に供給する。
【００７７】
つまり、データ圧縮部４は、各単位ブロックＤＴ１〜ＤＴ４を圧縮することにより形成した圧縮後単位ブロックＤＴＡ１〜ＤＴＡ４のデータ量を順次に圧縮サイズテーブル生成部５に供給する。この実施の形態において、圧縮サイズテーブル生成部５は、各単位ブロック毎の圧縮後のデータ量を、その先頭の単位ブロックから順に累算して累算値を得て、この累算値と各単位ブロックとを対応付けた圧縮サイズテーブルＴＢを形成する。
【００７８】
図４Ａにおいて、圧縮後本文データＤＴＡの右側に付された、０８１２Ｈ、１１０８Ｈ、１７ＥＦＨ、２２０１Ｈは、各圧縮後単位ブロックＤＴＡ１〜ＤＴＡ４のデータ量の累算値を示している。
【００７９】
つまり、単位ブロックＤＴ１、ＤＴ２、ＤＴ３、ＤＴ４を圧縮することにより形成された圧縮後単位ブロックＤＴＡ１、ＤＴＡ２、ＤＴＡ３、ＤＴＡ４のデータ量は、前述したように、０８１２Ｈ、０８Ｆ６Ｈ、０６Ｅ７Ｈ、０Ａ１２Ｈとなる。圧縮サイズテーブル生成部５は、これを順次累算していくことにより、各単位ブロックに対応して、圧縮対象となった単位ブロックまでの圧縮後のデータ量の累算値を得る。
【００８０】
したがって、圧縮後単位ブロックＤＴＡ１を先頭の圧縮後単位ブロックとすると、圧縮後単位ブロックＤＴＡ１までの累算値は、０８１２Ｈとなる。また、圧縮後単位ブロックＤＴＡ２までの累算値は、０８１２Ｈと０８Ｆ６Ｈが加算され、１１０８Ｈとなる。同様に、圧縮後単位ブロックＤＴＡ３までの累算値は、１１０８Ｈと０６Ｅ７Ｈとが加算され、１７ＥＦＨとなり、圧縮後単位ブロックＤＴＡ４までの累算値は、１７ＥＦＨと０Ａ１２Ｈとが加算されて、２２０１Ｈとなる。
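The running totals in this paragraph can be checked with a few lines of Python. The variable names are illustrative only; the four sizes are the ones given for DTA1 to DTA4.

```python
from itertools import accumulate

# Compressed sizes of DTA1..DTA4 as given in the text (hexadecimal).
sizes = [0x0812, 0x08F6, 0x06E7, 0x0A12]

# Running total: entry k is the compressed size of all blocks up to and including k.
table = list(accumulate(sizes))
print([f"{value:04X}H" for value in table])  # ['0812H', '1108H', '17EFH', '2201H']
```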
【００８１】
このようにして、圧縮サイズテーブル生成部５は、各単位ブロック毎のデータ量を累算して累算値を得て、この圧縮後のデータ量の累算値を圧縮サイズとして、各単位ブロックと対応付けた圧縮サイズテーブルＴＢを形成する。
【００８２】
図５は、この実施の形態の圧縮サイズテーブルＴＢを説明するための図である。図５Ａに示すように、各単位ブロック毎に求められる圧縮後の単位ブロックの大きさの累算値が、圧縮サイズとして圧縮サイズテーブルＴＢに記録される。この場合、各圧縮サイズ（圧縮後単位ブロックのデータ量の累算値）は、１６進数、４バイトで表現されたものである。また、圧縮サイズテーブルＴＢのアドレスは、圧縮サイズテーブルＴＢの先頭からの各圧縮サイズの先頭アドレスである。圧縮サイズテーブル生成部５において作成された圧縮サイズテーブルＴＢは、書き込み制御部６に供給される。
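A table of 4-byte entries could be serialized as in the sketch below. The text only states that each compressed size occupies 4 bytes; the big-endian byte order and the function names here are assumptions made for illustration.

```python
import struct
from typing import List

ENTRY_SIZE = 4  # each compressed size occupies 4 bytes in the table

def encode_table(cumulative: List[int]) -> bytes:
    """Pack the cumulative sizes as consecutive unsigned 32-bit values."""
    return b"".join(struct.pack(">I", value) for value in cumulative)  # ">I": assumed big-endian

def read_entry(table: bytes, block_number: int) -> int:
    """Read the entry for a 0-based block number; its table address is 4 * block_number."""
    return struct.unpack_from(">I", table, block_number * ENTRY_SIZE)[0]

table_bytes = encode_table([0x0812, 0x1108, 0x17EF, 0x2201])
print(f"{read_entry(table_bytes, 2):04X}H")  # 17EFH
```

Because every entry has the same width, the entry for any unit block is found by a single multiplication rather than by scanning the table.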
【００８３】
書き込み制御部６は、各部から供給される情報を、ＣＤ−ＲＯＭ２００に書き込む。つまり、書き込み制御部６は、インデックス情報発生部１からのインデックス情報を、ＣＤ−ＲＯＭ２００上に形成されるインデックス領域に書き込む。また、データ圧縮部４からの単位ブロック毎の圧縮後の本文データを順次につめて、ＣＤ−ＲＯＭ２００の本文データ領域に書き込む。同様に、圧縮サイズテーブル生成部５からの圧縮サイズテーブルの情報を圧縮サイズテーブル領域に書き込む。
【００８４】
このようにして、この実施の形態の情報記録装置により、圧縮前の本文データに対応して作成されたインデックス情報と、単位ブロック毎に圧縮した本文データと、圧縮サイズテーブルとがＣＤ−ＲＯＭに書き込まれ、電子ブック規格のＣＤ−ＲＯＭが作成される。
【００８５】
そして、詳しくは後述するように、ＣＤ−ＲＯＭに記録されたインデックス情報と、圧縮サイズテーブルとに基づいて、指定された検索対象項目を含む、圧縮された本文データが読み出され、これを伸長することにより、指定された検索対象データを利用することができるようにされる。
【００８６】
次に、前述した電子ブック規格のＣＤ−ＲＯＭの作成処理について、図６のフローチャートを用いて説明する。
【００８７】
本文データ発生部２からの記録しようとする圧縮前本文データは、データ分割部３により、予め決められた一定の大きさのデータ量の単位ブロックに分割される（ステップＳ１）。つまり、図４を用いて前述したように、圧縮前本文データを、例えば、４０９６バイトのデータ量の単位ブロックに分割するのが、このステップＳ１の処理である。
【００８８】
次に、データ圧縮部４により、一定の大きさの単位ブロックを処理単位として、単位ブロック毎に本文データの圧縮処理が行われ（ステップＳ２）、圧縮後の単位ブロックのデータ量が取得される（ステップＳ３）。
【００８９】
単位ブロック毎に圧縮された本文データは、書き込み制御部６の制御により、順次につめて、ＣＤ−ＲＯＭ２００の本文領域に記録される（ステップＳ４）。また、データ圧縮部４により取得された圧縮後の単位ブロックのデータ量が、圧縮サイズテーブル生成部５に供給され、圧縮後の単位ブロックのデータ量の累算値が算出される（ステップＳ５）。このステップＳ５の処理では、例えば、前回までの圧縮後の単位ブロックのデータ量の累算値を保持しておき、この前回までの累算値と、今回圧縮の対象となった単位ブロックの圧縮後のデータ量を加算することにより今回の累算値を求めることができる。
【００９０】
そして、圧縮サイズテーブル生成部５は、例えば、自己が備えるメモリに、今回の累算値を圧縮サイズとして、今回圧縮の対象となった単位ブロックと対応がとれるようにして、圧縮サイズテーブルを作成していく（ステップＳ６）。
【００９１】
つまり、このステップＳ６においては、１番目の単位ブロックまでの圧縮後の大きさの累算値は、圧縮サイズテーブルの１番目に記録し、２番目の単位ブロックまでの圧縮後の大きさの累算値は、圧縮サイズテーブルの２番目に記録するというように、どの単位ブロックまでの累算値かが分かるように、圧縮して記録する当該本文データに対する圧縮サイズテーブルを作成する。
【００９２】
そして、圧縮する当該本文データの全ての単位ブロックについて、圧縮処理が終了したか否かを判断し（ステップＳ７）、終了していないと判断したときには、次の単位ブロックを処理の対象とするように位置付けて（ステップＳ８）、ステップＳ２からの処理を繰り返す。
【００９３】
また、ステップＳ７の判断処理において、圧縮する当該本文データの全ての単位ブロックについて、圧縮処理が終了したと判断したときには、圧縮サイズテーブル生成部５において作成された圧縮サイズテーブルと、インデックス情報発生部１からの圧縮前の当該本文データに対するインデックス情報とが、書き込み制御部６に供給され、これらの情報がＣＤ−ＲＯＭのインデックス領域、圧縮サイズテーブル領域に記録された後（ステップＳ９）、図６に示した本文データの圧縮処理および圧縮サイズテーブルの作成処理を終了する。
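Steps S1 to S8 of the flowchart can be sketched in a few lines. The patent does not name a compression algorithm, so `zlib` is used here purely as a stand-in, and the function name and in-memory output are hypothetical; writing the result to the disc areas (steps S4 and S9) is reduced to returning the packed bytes and the table.

```python
import zlib
from typing import List, Tuple

def build_compressed_body(body: bytes, unit: int = 4096) -> Tuple[bytes, List[int]]:
    """Compress `body` block by block and build the compressed size table."""
    packed = bytearray()   # compressed blocks, packed back to back (S4)
    table: List[int] = []  # cumulative compressed sizes (S6)
    total = 0              # running total kept from the previous block (S5)
    for start in range(0, len(body), unit):          # S1 split, S7/S8 loop
        compressed = zlib.compress(body[start:start + unit])  # S2 (zlib is an assumption)
        total += len(compressed)                     # S3 size, S5 accumulate
        packed += compressed
        table.append(total)
    return bytes(packed), table

# Usage with made-up dictionary-like text.
sample = b"aisatsu: a greeting. " * 1000
packed, table = build_compressed_body(sample)
print(len(sample), len(packed), len(table))
```

The index information is not touched anywhere in this process, which is the point of the scheme: it still refers to positions in the uncompressed body data.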
【００９４】
このようにして、本文データは、単位ブロックに分割され、単位ブロック毎に圧縮されて電子ブックシステムのＣＤ−ＲＯＭに記録されると共に、圧縮後の単位ブロックのデータ量の累算値を圧縮サイズとする図５Ｂに示したような圧縮サイズテーブルと、圧縮前の本文データに応じて作成されたインデックス情報とが、ＣＤ−ＲＯＭ２００に記録される。
【００９５】
このように、本文データは圧縮されてＣＤ−ＲＯＭに記録されるので、従来よりもさらに多くの文献の本文データを同じＣＤ−ＲＯＭに記録することができる。そして、複数の文献の本文データを圧縮して１枚のＣＤ−ＲＯＭに記録するようにした場合には、各文献の単位ブロック毎に圧縮した本文データに対応する圧縮サイズテーブルと、その文献の圧縮前の本文データに対応して作成されたインデックス情報を同じＣＤ−ＲＯＭに記録しておくことにより、１枚のＣＤ−ＲＯＭに記録された各種の本文データを同じように利用することができる。
【００９６】
また、本文データは、所定の大きさの単位ブロック毎に圧縮することにより、圧縮された単位で伸長処理を行えばよいので、大きなメモリを情報検索装置に搭載しなくてもすむようにすることができる。また、圧縮処理が所定の単位ブロック毎に行われれば、その単位ブロック毎にデータを処理すればよいので、圧縮された本文データの読み出しや伸長処理に長い時間がかかることもない。
【００９７】
さらに、インデックス情報が既に作成されている場合、その既存のインデックス情報をそのまま用いることができるので、圧縮後の本文データに対応した新たなインデックス情報を作成する必要もない。
【００９８】
なお、前述の説明においては、すべての単位ブロックの本文データについて、圧縮してＣＤ−ＲＯＭ２００に記録した後に、当該本文データに対するインデックス情報と圧縮サイズテーブルとをＣＤ−ＲＯＭ２００に記録するようにした。しかし、インデックス情報は、圧縮前の本文データに対応して予め作成されているので、先にインデックス情報を記録したＣＤ−ＲＯＭを作成しておいて、このＣＤ−ＲＯＭに単位ブロック毎に圧縮した本文データと、圧縮サイズテーブルとを記録するようにしてもよい。
【００９９】
また、圧縮サイズテーブル生成部５において、すべての単位ブロックに対応する圧縮サイズからなる圧縮サイズテーブルを完成させた後に、この完成された圧縮サイズテーブルをＣＤ−ＲＯＭ２００に記録するようにしてもよいし、単位ブロック毎の圧縮後のデータ量の累算値を順次にＣＤ−ＲＯＭ２００に記録するようにすることもできる。
【０１００】
［情報検索装置について］
次に、前述のようにして、圧縮前の本文データに応じて作成されたインデックス情報と、単位ブロック毎に圧縮された本文データと、圧縮サイズテーブルとが記録されて作成された電子ブックシステム用のＣＤ−ＲＯＭを用いた情報の検索について説明する。この場合、圧縮前の本文データおよび圧縮後の本文データの先頭は、図４に示したように、ＣＤ−ＲＯＭ２００の本文データ領域の先頭（００００Ｈ）に一致するようにされているものとして説明する。
【０１０１】
図７は、図１を用いて前述した情報記録装置により作成された電子ブックシステム用のＣＤ−ＲＯＭ２００が装填され、ＣＤ−ＲＯＭ２００に記録された情報を検索することができるこの実施の形態の電子ブックシステムの情報検索装置を説明するためのブロック図である。この情報検索装置は、この発明による情報検索方法が適用されたものである。
【０１０２】
図７に示すように、この実施の形態の電子ブックシステムの情報検索装置は、光ピックアップ１１、２軸デバイス１２、スピンドルモータ１３、ドライバ１４、ＲＦアンプ１５、信号処理部１６、表示制御部１７、表示パネル１８を備えると共に、ＲＯＭ１０１、ＲＡＭ１０２、キー操作部１０３が接続されたＣＰＵ１００を備えている。
【０１０３】
ＣＰＵ１００は、この情報検索装置の各部の動作を制御するシステムコントローラとしての機能を有するものである。ＲＯＭ１０１は、動作プログラムや表示文字のフォントデータなど、この情報検索装置において用いられるプログラムやデータが記録されたものである。
【０１０４】
ＲＡＭ１０２は、ＣＤ−ＲＯＭ２００から読み出した再生データを一時記憶するなど、この情報検索装置において行われる処理の作業領域として用いられる。また、キー操作部１０３は、数字キーやアルファベットキーなどの複数の操作キーを備え、検索キー情報などのユーザからの情報入力を受け付ける。
【０１０５】
そして、キー操作部１０３の操作キーがユーザにより操作され、検索キー情報が入力されると、ＣＰＵ１００はこれを受け付けて、ＣＤ−ＲＯＭ２００に記録されている本文データの検索処理を開始する。
【０１０６】
まず、ＣＰＵ１００は、ドライバ１４に対して、検索処理の開始を指示する制御信号を供給する。ドライバ１４は、この制御信号に応じて、光ピックアップ１１、２軸デバイス１２、スピンドルモータ１３を駆動させ、光ピックアップ１１によりＣＤ−ＲＯＭ２００に記録されているデータを読み出す。
【０１０７】
光ピックアップ１１は、図示しないが、例えば、レーザダイオード、対物レンズ、ハーフミラー、フォトディテクタなどを備え、ＣＤ−ＲＯＭ２００のトラックにレーザビームを照射し、その反射光をフォトディテクタで受光して、反射光の光量の変化に基づいて、ＣＤ−ＲＯＭ２００に記録されているデータを読み出す。
【０１０８】
この実施の形態において、光ピックアップ１１のフォトディテクタは、フォーカスエラー、および、トラッキングエラーを検出するために、複数個の受光領域に分割されたものである。
【０１０９】
光ピックアップ１１のフォトディテクタの各受光領域で受光されたＣＤ−ＲＯＭ２００からの反射光は、電気信号に変換されてＲＦアンプ１５に供給される。ＲＦアンプ１５は、光ピックアップ１１のフォトディテクタの各受光領域からの電気信号から、再生高周波信号、および、フォーカスエラー信号ＦＥ、トラッキングエラー信号ＴＥを形成する。
【０１１０】
ＲＦアンプ１５において形成されたフォーカスエラー信号ＦＥ、トラッキングエラー信号ＴＥは、ＣＰＵ１００に供給される。ＣＰＵ１００は、これらの信号ＦＥ、ＴＥに基づいて、ドライバ１４を通じて、２軸デバイス１２を制御し、フォーカスエラー制御、トラッキングエラー制御を行うことができるようにされている。
【０１１１】
また、ＲＦアンプ１５で形成された再生高周波信号は、信号処理部１６に供給され、ここで、アナログ／デジタル変換処理や、ＣＤ−ＲＯＭ２００への記録時の変調方式に応じた復調処理がなされ、復調されたデータが取り出される。
【０１１２】
この場合、後述もするように、ＣＰＵ１００は、まず始めに、入力された検索キー情報に基づいて、ＣＤ−ＲＯＭ２００に記録されているインデックス情報を参照し、このインデックス情報に基づいて、入力された検索キー情報に対応する検索対象項目の先頭を含む単位ブロックを特定する。
【０１１３】
つまり、ＣＰＵ１００は、ＣＤ−ＲＯＭ２００に記録されている圧縮前の本文データに対応して作成されたインデックス情報を参照し、入力された検索キー情報に対応する検索対象項目の先頭位置を示す本文アドレスを取得する。そして、当該本文データの圧縮前の先頭から、当該検索対象項目の先頭位置までの圧縮前の本文データのデータ量を求める。この求めたデータ量を、圧縮処理時の処理単位である単位ブロック当たりのデータ量で割り算することにより、当該検索対象項目の先頭位置を含む単位ブロックは、先頭単位ブロックから何番目の単位ブロックかを特定する。
【０１１４】
次に、ＣＰＵ１００は、圧縮サイズテーブルを参照し、この圧縮サイズテーブルの情報に基づいて、特定した単位ブロックに対応する圧縮された本文データをＣＤ−ＲＯＭ２００から読み出す。
【０１１５】
つまり、圧縮サイズテーブルの各圧縮サイズは、図５を用いて前述したように、圧縮後単位ブロックのデータ量の累算値であり、次の圧縮後単位ブロックの開始位置に対応している。また、目的とする圧縮後単位ブロックまでのデータ量の累算値（圧縮サイズ）から、その１つ前の圧縮後単位ブロックまでのデータ量の累算値（圧縮サイズ）を減算することにより、目的とする圧縮後単位ブロックのデータ量を得ることができる。
【０１１６】
これにより、例えば、図４Ａに示した例において、圧縮前単位ブロックＤＴ３に対応する圧縮後単位ブロックＤＴＡ３の圧縮されたデータを読み出して、利用しようとする場合には、図５Ｂに示した圧縮サイズテーブルＴＢから、圧縮後単位ブロックＤＴＡ３までの圧縮サイズ１７ＥＦＨと、その１つ前の圧縮後単位ブロックＤＴＡ２までの圧縮サイズ１１０８Ｈとを読み出し、圧縮後単位ブロックＤＴＡ３までの圧縮サイズ１７ＥＦＨから圧縮後単位ブロックＤＴＡ２までの圧縮サイズ１１０８Ｈを減算することにより、圧縮後単位ブロックＤＴＡ３のデータ量を得る。この場合、圧縮後単位ブロックＤＴＡ３の大きさは、図４にも示したように、０６Ｅ７Ｈバイト（１７６７バイト）であることが分かる。
【０１１７】
そして、圧縮後本文データの先頭から、圧縮前単位ブロックＤＴ３の１つ前の単位ブロックに対応する圧縮後単位ブロックＤＴＡ２までの圧縮サイズである１１０８Ｈバイト目から、圧縮後単位ブロックＤＴＡ３のデータ量分、つまり、０６Ｅ７Ｈバイト（１７６７バイト）分、圧縮後の本文データを読み出せば、単位ブロックＤＴ３に対応する圧縮された本文データ、この場合には、圧縮後単位ブロックＤＴＡ３の全部を読み出すことができる。このようにして読み出された、特定された単位ブロックに対応する圧縮された本文データは、ＲＡＭ１０２に一時記憶される。
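The start position and size of a compressed unit block follow from two neighbouring table entries. The sketch below uses the cumulative values from the DTA1 to DTA4 example; the function name and the 0-based block numbering are illustrative choices, not part of the source.

```python
from typing import List, Tuple

def block_extent(table: List[int], block_number: int) -> Tuple[int, int]:
    """Return (start offset, compressed size) of a 0-based block number."""
    # The previous cumulative value is where this block starts; the first block starts at 0.
    start = table[block_number - 1] if block_number > 0 else 0
    return start, table[block_number] - start

TB = [0x0812, 0x1108, 0x17EF, 0x2201]  # cumulative sizes for DTA1..DTA4
start, size = block_extent(TB, 2)      # DTA3 is the third block (index 2)
print(f"{start:04X}H {size:04X}H {size}")  # 1108H 06E7H 1767
```

The size 06E7H is 1767 bytes in decimal, matching the compressed size stated for DTA3.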
【０１１８】
そして、ＣＰＵ１００は、ＲＡＭ１０２に一時記憶した圧縮されている本文データを圧縮解凍し、特定された単位ブロックの圧縮前の元の本文データを得る。そして、この圧縮解凍された単位ブロックの本文データから、前述したように、インデックス情報から取得される入力された検索キー情報に対応する検索対象項目の先頭位置を示す本文アドレスに基づいて、入力された検索キー情報に対応する検索対象項目のデータを取得する。
【０１１９】
つまり、特定された単位ブロックは、前述したようにＣＤ−ＲＯＭ２００に記録されている当該本文データ全体の何番目の単位ブロックかは既に分かっており、また、各単位ブロックは、予め決められた大きさのデータ量毎に分割されたものであるので、当該本文データ全体の先頭からの当該特定された単位ブロックの先頭位置は容易に分かる。すなわち、当該本文データの先頭単位ブロックから当該特定された単位ブロックまでの単位ブロック数に、予め定められている単位ブロック当たりのデータ量を掛け合わせれば、特定された単位ブロックの先頭位置が分かる。
【０１２０】
したがって、インデックス情報から取得される検索キー情報に対応する検索対象項目の先頭位置を示す本文アドレスから、当該特定された単位ブロックの先頭位置を示すアドレスを引き算すれば、当該特定された単位ブロックの先頭からの、目的とする検索対象項目の先頭位置を特定することができる。そして、圧縮解凍された単位ブロックの特定された検索対象項目の先頭位置に対応する位置から本文データを読み出せば、入力された検索キー情報に対応する目的とする検索対象項目のデータを取得することができる。
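The division and subtraction described in the last few paragraphs amount to one integer division with remainder. In this sketch the body address 2345H is a made-up example, block numbers are 0-based, and `body_start` is a hypothetical parameter for a body that does not begin at address 0000H.

```python
from typing import Tuple

def locate(body_address: int, unit: int = 0x1000, body_start: int = 0) -> Tuple[int, int]:
    """Return (0-based unit block number, offset of the item inside that block)."""
    # Quotient: which unit block holds the address. Remainder: position within it.
    return divmod(body_address - body_start, unit)

block_number, offset = locate(0x2345)
print(block_number, f"{offset:04X}H")  # 2 0345H
```

Block number 2 is the third unit block (DT3 in the running example), whose uncompressed start address is 2 × 1000H = 2000H.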
【０１２１】
このようにして、検索キー情報に対応する検索対象項目のデータを取得し、この検索対象項目のデータに基づいて、ＲＯＭ１０１に記憶されているフォントデータなどを用い、表示させようとする文字などの表示情報の形状データを形成し、これを表示制御部１７に供給する。
【０１２２】
表示制御部１７は、表示用メモリ７１を備えており、ＣＰＵ１００からの表示情報の形状データに応じて、表示用メモリ７１に表示用の画像データを形成する。そして、表示制御部１７は、液晶表示パネルなどで構成される表示パネル１８を制御して、表示用メモリ７１に形成した画像データに応じた画像を表示パネル１８に表示させる。
【０１２３】
これにより、表示パネル１８には、ユーザからの検索キー情報に基づいて、ＣＤ−ＲＯＭ２００から読み出された検索対象項目のデータが表示される。
【０１２４】
［情報検索装置においての情報検索時の動作について］
次に、この実施の形態の電子ブックシステムの情報検索装置の検索時の動作について、図８のフローチャートを参照しながら説明する。
【０１２５】
この実施の形態の情報検索装置に電子ブックシステムのＣＤ−ＲＯＭ２００が装填され、キー操作部１０３を通じて、ユーザにより検索キー情報が入力されると（ステップＳ１１）、情報検索装置のＣＰＵ１００は、ドライバ１４を通じて、光ピックアップ１１、２軸デバイス１２、スピンドルモータ１３を駆動させて、圧縮前の本文データに応じて作成されたインデックス情報を参照し、前述したように、入力された検索キー情報に対応する検索対象データの先頭を含む単位ブロック（圧縮前）を特定する（ステップＳ１２）。
【０１２６】
そして、前述した圧縮サイズテーブルから、特定した単位ブロックまでの圧縮後の単位ブロックの大きさの累算値（圧縮サイズ）ＲＡと、その１つ前の単位ブロックまでの圧縮後の単位ブロックの大きさの累算値（圧縮サイズ）ＲＢとを読み出し（ステップＳ１３）、圧縮サイズＲＡから圧縮サイズＲＢを減算することにより、特定した単位ブロックの大きさＳＡを算出する（ステップＳ１４）。
【０１２７】
このステップＳ１４の減算処理を具体的に説明すると、例えば、特定した単位ブロックまでの圧縮サイズＲＡが、０１４Ｃ１０ＣＦＨであり、その１つ前の単位ブロックまでの圧縮サイズＲＢが、０１４Ｃ０９Ｆ６Ｈであった場合、０１４Ｃ１０ＣＦＨから０１４Ｃ０９Ｆ６Ｈが減算されて、特定された単位ブロックの大きさＳＡは、１７５３バイトであることが分かる。
【０１２８】
そして、前述したように、圧縮サイズは、圧縮後の単位ブロックの大きさの累算値であるので、圧縮サイズ自体が、次の単位ブロックの先頭位置を示すことになる。そこで、ＣＰＵ１００は、圧縮サイズＲＢが示すＣＤ−ＲＯＭ２００上の特定した単位ブロックの先頭位置に読み出し位置を位置付け（ステップＳ１５）、そこから、特定した単位ブロックの大きさＳＡ分、圧縮された本文データを読み出す（ステップＳ１６）。
【０１２９】
つまり、上述の例によれば、特定された単位ブロックの１つ前の単位ブロックまでの圧縮サイズＲＢ＝０１４Ｃ０９Ｆ６Ｈ＝２１７６０５０２バイトであるので、圧縮された本文データの先頭を基準にして２１７６０５０２バイト目から１７５３バイト分、圧縮された本文データを読み出すことになる。
【０１３０】
読み出した圧縮された本文データは、ＲＡＭ１０２に一時記憶される。このＲＡＭ１０２に一時記憶された本文データは、特定された単位ブロックの本文データが圧縮されたものであるので、これを圧縮解凍することにより、特定した単位ブロックの圧縮前の元の本文データを得る（ステップＳ１７）。
【０１３１】
そして、この圧縮解凍した本文データから、前述したように、入力された検索キー情報に対応する検索対象項目のデータの先頭位置を特定し、目的とする検索対象項目のデータを圧縮解凍された単位ブロックの本文データから読み出して再生する（ステップＳ１８）。
【０１３２】
このステップＳ１８においては、ＲＡＭ１０２に一時記憶されて、圧縮解凍された本文データから、検索キー情報に対応する検索対象項目を読み出し、ＲＯＭ１０１に記憶されているフォントデータを用いて、検索キー情報に対応する検索対象項目の表示画像を、表示制御部１７の表示用メモリ７１に形成する。この表示用メモリ７１の表示画像が、表示パネル１８に表示されて、入力された検索キー情報に対応する本文データがユーザに提供される。
【０１３３】
なお、入力された検索キー情報に対応する検索対象データが、複数のブロックにまたがることを考慮して、各文献の本文データを構成する各検索対象データの終りには、その検索対象データの終りを示すいわゆるエンドマークを付加するようにしておく。そして、このエンドマークが検出されない場合には、特定した単位ブロックの次の単位ブロックを新たに特定した単位ブロックとしてステップＳ１３からの処理を行うようにする。これにより、入力された検索キー情報に対応する検索対象データが、複数のブロックにまたがった場合にも対応することができる。
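The retrieval flow of steps S12 to S18, including an item that runs across a block boundary, can be sketched as follows. Several things here are assumptions for illustration: `zlib` stands in for the unspecified compression method, the end mark is taken to be a single zero byte, the compressed body is held in memory rather than read from a disc, and the function names are hypothetical.

```python
import zlib
from typing import List

END_MARK = b"\x00"  # hypothetical end mark; the text does not define its value

def read_block(packed: bytes, table: List[int], k: int) -> bytes:
    """Read and decompress the k-th (0-based) unit block (S13 to S17)."""
    start = table[k - 1] if k > 0 else 0          # RB: where block k starts
    return zlib.decompress(packed[start:table[k]])  # RA - RB bytes are read

def fetch_item(packed: bytes, table: List[int], body_address: int,
               unit: int = 4096) -> bytes:
    """Return the search target item that starts at `body_address`."""
    k, offset = divmod(body_address, unit)        # S12: block number and offset
    data = read_block(packed, table, k)[offset:]
    # No end mark yet: the item continues in the next unit block.
    while END_MARK not in data and k + 1 < len(table):
        k += 1
        data += read_block(packed, table, k)
    end = data.find(END_MARK)
    return data if end < 0 else data[:end]
```

Only the blocks that actually contain the item are read and decompressed, so the working memory needed stays at a small multiple of the unit block size.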
【０１３４】
このように、この実施の形態の情報検索装置を用いることにより、圧縮前の本文データに応じて作成されたインデックス情報と、単位ブロック毎に圧縮された本文データと、圧縮サイズテーブルとが記録されて作成された電子ブックシステムのＣＤ−ＲＯＭ２００から、ユーザにより入力された検索キー情報に対応する検索対象項目のデータを迅速かつ正確に読み出して圧縮解凍し、利用することができるようにされる。
【０１３５】
また、本文データは、予め決められた大きさの単位ブロック毎に圧縮されてＣＤ−ＲＯＭに記録されているので、本文データの読み出しや、圧縮解凍処理の処理単位を小さくすることができる。このため、ＣＤ−ＲＯＭからの本文データの読み出しや、圧縮解凍処理に時間がかかることもなく、ＣＤ−ＲＯＭに記録された本文データの中から、検索キー情報に対応する検索対象項目のデータを迅速に得ることができる。
【０１３６】
また、既に作成されているインデックス情報は、そのまま用いることができるので、本文データを圧縮してＣＤ−ＲＯＭに記録することにより、より多くの本文データを記録した内容の充実した電子ブックシステムに問題なく移行することが可能となる。
【０１３７】
なお、前述した実施の形態においては、圧縮前の本文データおよび圧縮後の本文データの先頭は、ＣＤ−ＲＯＭ２００の本文データ領域の先頭に一致するものとして説明した。しかし、ＣＤ−ＲＯＭに複数の文献の本文データを記録するようにした場合には、各文献の本文データのＣＤ−ＲＯＭ上の位置は異なる。
【０１３８】
そこで、各文献の圧縮前の本文データが記録された場合の当該本文データの先頭位置情報や圧縮後の本文データの先頭位置情報、あるいは、分割ブロックのデータ量などの情報を、ＣＤ−ＲＯＭのＴＯＣ（テーブル・オブ・コンテンツ）や、当該ＣＤ−ＲＯＭの他の領域に記憶させておき、圧縮前の単位ブロックは当該本文データの何番目の単位ブロックであるか、あるいは、圧縮前の単位ブロックの先頭から検索キー情報に対応する検索対象項目のデータの先頭位置までのデータ量などを算出する場合などに用いることができるようにしておくことにより、１枚のＣＤ−ＲＯＭに複数の文献の本文データを記録した場合にも問題なく対応することができる。
【０１３９】
また、インデックス情報や、圧縮サイズテーブルが、これらの情報の本文データの先頭を、例えば００００Ｈとして作成しても、上述のように、各文献の圧縮前の本文データが記録された場合の当該本文データの先頭位置情報や圧縮後の本文データの先頭位置情報、あるいは、分割ブロックのデータ量などの情報をＣＤ−ＲＯＭに記録させておくことにより、前述のようにして、検索キー情報に対応する検索対象項目のデータを取得することができる。
【０１４０】
また、前述した実施の形態においては、単位ブロックの大きさは、４０９６バイト（約４キロバイト）であるものとして説明したが、これに限るものではない。情報検索装置のメモリの記憶容量や、ＣＤ−ＲＯＭからのデータの読み出し速度、圧縮解凍処理にかかる時間などを考慮して、大きくしたり、小さくしたりすることができる。
【０１４１】
また、前述した実施の形態においては、圧縮サイズテーブルには、圧縮後の各単位ブロック毎のデータ量の累算値を圧縮サイズとして、各単位ブロックと対応付けて記録するようにしたが、これに限るものではない。
【０１４２】
例えば、圧縮後の各単位ブロック毎のデータ量自体を、各単位ブロックに対応付けた圧縮サイズテーブルを作成するようにしてもよい。つまり、第１の単位ブロックの圧縮後のデータ量は何バイト、第２の単位ブロックのデータ量は何バイトというように、各単位ブロックの圧縮後のデータ量が分かるように圧縮サイズテーブルを作成する。
【０１４３】
このようにしておくことにより、目的とする単位ブロックの圧縮後の本文データの先頭位置は、先頭の単位ブロックから当該目的とする単位ブロックまでの圧縮後のデータ量を加算することにより得られる。また、当該目的とする単位ブロックの圧縮後のデータ量は、先頭の単位ブロックから当該目的とする単位ブロックまでの圧縮後のデータ量の合計値から、先頭の単位ブロックから当該目的とする単位ブロックの１つ前の単位ブロックまでの圧縮後のデータ量の合計値を減算することにより得られる。このように、各単位ブロックの圧縮後のデータ量が分かるように圧縮サイズテーブルを作成した場合にも、目的とする単位ブロックの圧縮後のデータ量と、その先頭記録位置を求め、ＣＤ−ＲＯＭから読み出して利用することができる。
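For the alternative table that stores each block's own compressed size, the start position has to be summed on demand. A minimal sketch, with a hypothetical function name and 0-based block numbers:

```python
from typing import List, Tuple

def extent_from_sizes(sizes: List[int], block_number: int) -> Tuple[int, int]:
    """Return (start offset, compressed size) from a table of per-block sizes."""
    # The start is the sum of all preceding sizes; the size is read directly.
    return sum(sizes[:block_number]), sizes[block_number]

sizes = [0x0812, 0x08F6, 0x06E7, 0x0A12]  # individual sizes of DTA1..DTA4
start, size = extent_from_sizes(sizes, 2)
print(f"{start:04X}H {size:04X}H")  # 1108H 06E7H
```

Compared with the cumulative table, this variant reads the size directly but needs one addition per preceding block to find the start, whereas the cumulative table needs a single subtraction.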
【０１４４】
また、前述した実施の形態においては、ＣＤ−ＲＯＭを電子ブックシステムの記録媒体として用いるようにしたが、これに限るものではない。いわゆるフロッピィディスクやミニディスク（ＭＤ）と呼ばれる小型光磁気ディスク、ＤＶＤ（デジタルビデオディスク）など各種の記録媒体を用いることができる。
【０１４５】
また、文献の本文データとしては、テキストデータだけでなく、グラフィックスデータについても同様に処理することができる。
【０１４６】
また、前述した情報検索装置は、電子ブックシステム専用のものとして説明したが、これに限るものではない。例えば、パーソナルコンピュータなどの情報処理装置にこの発明を適用することができる。
【０１４７】
【発明の効果】
以上説明したように、この発明によれば、データを圧縮して記録媒体に記録できるので、より多くのデータを記録媒体に記録することができる。また、圧縮して記録するデータについて、圧縮前の当該データに対するインデックス情報がある場合には、当該データを圧縮して記録媒体に記録した場合であっても、その既存のインデックス情報を用いて、検索処理を行うようにすることができる。
【０１４８】
また、記録媒体に記録するデータの圧縮は、当該データを予め決められた大きさの単位ブロックに分割し、この単位ブロック毎に圧縮処理するようにされるので、処理単位を特定するために必要とされる付加情報が少なく、かつ、特定にかかる演算も単純なので、合理的で高速な検索処理を行うことができる。
【図面の簡単な説明】
【図１】この発明による情報記録装置の一実施の形態を説明するためのブロック図である。
【図２】電子ブック規格のインデックス情報の一例を説明するための図である。
【図３】電子ブック規格のインデックス情報の一例を説明するための図である。
【図４】この発明による情報記録装置の一実施の形態において行われる情報の圧縮処理を説明するための図である。
【図５】この発明による情報記録装置の一実施の形態において作成される圧縮サイズテーブルを説明するための図である。
【図６】この発明による情報記録装置の一実施の形態において行われる情報の圧縮処理および圧縮サイズテーブルの作成処理を説明するためのフローチャートである。
【図７】この発明による情報検索装置の一実施の形態を説明するためのブロック図である。
【図８】この発明による情報検索装置の一実施の形態の情報の検索処理時の動作を説明するためのフローチャートである。
【符号の説明】
１…インデックス情報発生部、２…本文データ発生部、３…データ分割部、４…データ圧縮部、５…圧縮サイズテーブル生成部、６…書き込み制御部、１１…光ピックアップ、１２…２軸デバイス、１３…スピンドルモータ、１４…ドライバ、１５…ＲＦアンプ、１６…信号処理部、１７…表示制御部、７１…表示用メモリ、１８…表示パネル、１００…ＣＰＵ、１０１…ＲＯＭ、１０２…ＲＡＭ、１０３…キー操作部、ＤＴ…圧縮前本文データ、ＤＴＡ…圧縮後本文データ、ＴＢ…圧縮サイズテーブル
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a method and an apparatus for compressing information and recording it on a recording medium such as a CD-ROM (compact disc ROM), to a recording medium on which information compressed by such a method or apparatus is recorded, and to a method and an apparatus for searching information that has been compressed and recorded on a recording medium.
[0002]
[Prior art]
So-called electronic book systems are available in which the content of various documents recorded on a CD-ROM, such as Japanese dictionaries and English-Japanese dictionaries, can be searched using, for example, a dedicated information search device.
[0003]
In such an electronic book system, search key information (index information), such as a word or term whose meaning the user wants to know, is input to the information search device, which then searches the information recorded on the CD-ROM loaded in the device. The information corresponding to the input search key information is read from the CD-ROM and provided to the user, for example by displaying it on the display screen of the information search device.
[0004]
A user of an electronic book system can therefore quickly look up the meaning of a word or term without the time and effort of paging through a Japanese dictionary or an English-Japanese dictionary to find the relevant entry.
[0005]
In an electronic book system, hierarchically structured index information for searching is created to enable quick searches, and this index information is recorded on the CD-ROM together with the document content information (hereinafter referred to as body data).
[0006]
In this hierarchical index information, the index information of each layer other than the lowest layer has comparison key information, which is compared with the input search key information, and address information indicating the head recording position of the next-layer index information that corresponds to that comparison key information. The index information of the lowest layer has comparison key information that is compared with, and matches, the input search key information, and address information (body address information) indicating the head recording position on the CD-ROM of the information corresponding to the input search key information.
[0007]
By comparing the input search key information with the comparison key information of the index information in sequence, the recording start position on the CD-ROM of the information corresponding to the input search key information is found. The search range can thus be narrowed step by step without searching all of the comparison key information in the index, so that the information corresponding to the input search key information can be quickly located in the body data recorded on the CD-ROM and used.
[0008]
[Problems to be solved by the invention]
In recent years, there has been a growing demand to enrich the content of electronic book CD-ROMs by storing more information on them. Increasing the amount of information stored on a CD-ROM can be expected to improve the convenience of the electronic book system, for example by reducing the number of times the CD-ROM must be exchanged.
[0009]
However, the body data of the various documents recorded on electronic book CD-ROMs has conventionally been stored uncompressed, and insufficient CD-ROM capacity has become a problem.
[0010]
It is therefore conceivable to compress the body data before storing it on the CD-ROM. However, because of the following problems, the body data cannot simply be compressed and recorded on the CD-ROM.
[0011]
The first problem is the processing unit of the body data to be compressed. For example, if the entire body data of one document is compressed as a single unit (processing unit), the information search device must load and decompress that entire compressed unit, and must therefore be equipped with a large memory. In addition, because the amount of body data per processing unit is large, loading the compressed body data from the CD-ROM and decompressing it take time, and a quick search cannot be achieved.
[0012]
It is therefore conceivable to divide the body data into the individual pieces of information to be searched (search target items) and to compress the body data item by item. If the body data is that of a Japanese dictionary, for example, each word together with the information describing its meaning is treated as one search target item, and the body data is compressed for each such item.
[0013]
In that case, however, efficient data compression may not be possible, because the search target items are too small or vary in size.
[0014]
In the case of an electronic book system, as described above, in order to enable a quick search, index information having an address indicating the head recording position on the CD-ROM of each search target item is stored together with the text data in the CD-ROM. Has been written to. For this reason, when the text data is compressed and recorded on the CD-ROM for each search target data, the index information of the software of the conventional electronic book system is also converted into the compressed text data. It must be recreated accordingly.
[0015]
As described above, the electronic book standard index information has a complicated hierarchical structure. To recreate the index information as the main text data is compressed, it takes about the same amount of time as when index information is newly created. costly. This is a cause of obstructing provision of the CD-ROM of the electronic book system in which more text data is recorded on the CD-ROM by compressing the text data.
[0016]
In view of the above, an object of the present invention is to eliminate the above problems and to provide an information recording method and an information recording apparatus that make effective use of the storage capacity of a recording medium, an information search method and an information search device that can search the recorded information for the target information reasonably and quickly, and an information recording medium on which information is recorded in this way.
[0017]
[Means for Solving the Problems]
  In order to solve the above problems, the information search method of the invention according to claim 1 is
  an information search method used in an information retrieval apparatus comprising reading means for reading data from a recording medium, compression/decompression means for decompressing the read data, and data output means for outputting the decompressed data, the method detecting a target search target item from compressed and recorded body data including a plurality of search target items, using, as key information for the search, index information for detecting the head recording position of each search target item when the body data is recorded on the recording medium sequentially, without a break between search target items, wherein
  on the recording medium, the body data is divided into units of a predetermined equal data amount, the data compressed in these divided data units is recorded sequentially so that the recording positions are continuous, and, in addition to the index information, a compressed size table is recorded in which an accumulated value of the compressed data sizes of the divided data units is described in association with each divided data unit,
  the method comprising: a reading step of, by the reading means, specifying, based on the index information, the data amount up to the address indicating the head position of the designated search target item, dividing that data amount by the data amount of the predetermined size to specify the divided data unit that includes the head position, detecting, based on the accumulated values of the compressed data sizes of the divided data, the recording start position and the data amount of the compressed body data corresponding to the specified divided data unit, and reading the body data corresponding to the specified divided data unit from the recording medium;
  a decompression step of, by the compression/decompression means, decompressing the compressed body data read in divided data units by the reading means in the reading step; and
  a target data output step of, by the data output means, outputting the data of the search target item from the head position of the designated search target item, detected based on the index information, in the data decompressed by the compression/decompression means in the decompression step.
[0027]
  According to the information search method of the invention described in claim 1, body data including a plurality of search target items is divided into units of a predetermined equal data amount and is compressed and recorded on the recording medium in these divided data units. Information on the data size after compression of each divided data unit, and index information for detecting the head position of each search target item in the body data before compression, are also recorded.
[0028]
  In the reading step, the reading means specifies, based on the index information and the information on the data size, the divided data unit that includes the head position of the designated search target item, and reads the compressed body data corresponding to the specified divided data unit from the recording medium.
[0029]
  The compressed body data that has been read out is decompressed by the compression/decompression means in the decompression step and restored to the original body data. In the target data output step, the data of the designated search target item is detected in the restored body data and is output, for example displayed, by the data output means.
[0030]
As described above, the body data compressed and recorded on the recording medium is read and decompressed in units of divided data. The amount of data per processing unit therefore does not become too large, as it would if the entire body data of one document were treated as one processing unit, and the data of the target search target item can be read quickly from the recording medium, decompressed quickly, and used.
[0032]
  Also, a compressed size table, in which the accumulated value obtained by accumulating the compressed data sizes of the divided data is described in association with each divided data unit, is recorded on the recording medium as the information on the data size.
[0033]
In this case, the accumulated value of the compressed data sizes up to a given divided data unit indicates the head recording position, after compression, of the next divided data unit. Subtracting the accumulated value up to the immediately preceding divided data unit from the accumulated value up to the target divided data unit gives the data amount after compression of the target divided data unit.
[0034]
Thereby, in the reading step, the divided data unit that includes the head of the designated search target item is specified with reference to the index information, and the head recording position and the data amount of the compressed body data corresponding to that divided data unit can be detected by simple arithmetic processing. That is, the data of the designated search target item can be quickly retrieved from the body data compressed and recorded on the recording medium and used.
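The arithmetic described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `locate_block` and the list form of the compressed size table are hypothetical, and the 1000H (4096-byte) unit size and the sample accumulated values are borrowed from the FIG. 4 example given in the embodiment.

```python
BLOCK_SIZE = 0x1000  # assumed size of one divided data unit (4096 bytes)


def locate_block(body_address, size_table, block_size=BLOCK_SIZE):
    """Map an uncompressed body address to its compressed divided data unit.

    size_table[i] is the accumulated compressed size up to and including
    unit i (a hypothetical in-memory form of the compressed size table).
    Returns (unit index, compressed start position, compressed length,
    offset of the item head inside the decompressed unit).
    """
    # Dividing the address by the unit size gives the unit that holds the head.
    index, offset = divmod(body_address, block_size)
    # The accumulated value of the previous unit is where this unit starts.
    start = size_table[index - 1] if index > 0 else 0
    # The difference of two accumulated values is this unit's compressed size.
    length = size_table[index] - start
    return index, start, length, offset


# Accumulated values from the FIG. 4 example (hexadecimal).
table = [0x0812, 0x1108, 0x17EF, 0x2201]
index, start, length, offset = locate_block(0x2345, table)
```

For the body address 2345H, the sketch selects the third unit (index 2), whose compressed data starts at 1108H and is 06E7H bytes long; the item head lies 345H bytes into the decompressed unit.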
[0035]
DETAILED DESCRIPTION OF THE INVENTION
  One embodiment of the method and apparatus of the present invention will be described below with reference to the drawings.
[0036]
In this embodiment, the present invention is applied to a so-called electronic book system. As described above, the electronic book system records the text data of various documents on, for example, a CD-ROM, and uses an information retrieval device to search the text data recorded on the CD-ROM for the target search target item.
[0037]
For example, in the case of a Japanese dictionary, the body data of one dictionary is formed by collecting a large number of search target items, each consisting of a word and information indicating the meaning of that word. A CD-ROM of the electronic book system is created by recording this body data on the CD-ROM.
[0038]
Then, for example, when the CD-ROM is loaded into an information search device for the electronic book system and a word to be looked up is input as search key information, the search target item data corresponding to the search key information (in this case, a word and information indicating its meaning) is retrieved from the text data recorded on the CD-ROM, displayed on the display screen of the information retrieval device, and provided to the user.
[0039]
As described above, the electronic book system can quickly search and use the literature information recorded on the recording medium such as the CD-ROM with a simple operation.
[0040]
[Creation of CD-ROM for electronic book system]
First, creation of a so-called electronic book standard CD-ROM used in the electronic book system will be described. FIG. 1 is a block diagram for explaining an information recording apparatus according to this embodiment that creates an electronic book standard CD-ROM by writing information on a CD-ROM. The information recording apparatus of this embodiment applies the information recording method according to the present invention, and is able to compress and record on the CD-ROM the text data of various documents that has conventionally been recorded uncompressed.
[0041]
As shown in FIG. 1, the information recording apparatus of this embodiment includes an index information generation unit 1, a body data generation unit 2, a data division unit 3, a data compression unit 4, a compression size table generation unit 5, and a write control unit 6. It has. Further, the CD-ROM 200 is loaded into the information recording apparatus of this embodiment, and the body data, index information, and compression size table are written therein.
[0042]
Before the detailed description of the information recording apparatus of this embodiment shown in FIG. 1, the index information generated in the index information generation unit 1 of this embodiment and recorded on the CD-ROM 200 will be explained. This index information is created and used in an electronic book system in order to realize quick search processing.
[0043]
FIG. 2 is a diagram for explaining an example of the electronic book standard index information output from the index information generation unit 1. The index information of the electronic book standard is created in an nth-order hierarchical structure according to the text data of documents recorded on the CD-ROM. The index information of the electronic book standard shown in FIG. 2 is an example of a tertiary hierarchical structure, for example, index information for a Japanese dictionary.
[0044]
As shown in FIG. 2, the index information in this example includes a primary index block 1B, a secondary index block 2B, and a tertiary index block 3B. The secondary index block 2B and the tertiary index block 3B, which is the lowest-level index block in this example, are each further subdivided: the secondary index block 2B into a plurality of subdivided blocks 2B1, 2B2, ..., and the tertiary index block 3B into a plurality of subdivided blocks 3B1, 3B2, ....
[0045]
The primary index block 1B and the subdivided blocks 2B1, 2B2, ... of the secondary index block 2B each have comparison key information, such as “Ama” and “Kaki”, that is compared with the input search key information, and address information indicating the head recording position of the subdivided block of the next layer corresponding to that comparison key information.
[0046]
Further, each of the subdivided blocks 3B1, 3B2, ... of the tertiary index block 3B, which is the lowest-level index block in this example, has comparison key information, that is, key information that is compared with the input search key information and that can match it, and address information (body address information) indicating the head recording position of the search target item corresponding to that comparison key information within the text data recorded on a recording medium such as a CD-ROM.
[0047]
Using the thus configured index information, information retrieval based on the inputted retrieval key information is performed as follows.
[0048]
In this example, first, the primary index block 1B is referred to, and the first two characters of the input search key information are compared with the comparison key information of the primary index block 1B. This comparison determines whether, in Japanese syllabary order, the first two characters of the input search key information come before the comparison key information being compared in the primary index block 1B, come after it, or are the same as it.
[0049]
When it is determined that the first two characters of the input search key information come after the comparison key information being compared in the primary index block 1B in Japanese syllabary order, the same comparison process is performed with the next comparison key information of the primary index block 1B.
[0050]
When it is determined that the first two characters of the input search key information come before the comparison key information being compared in the primary index block 1B in Japanese syllabary order, or are the same as that comparison key information, the corresponding subdivided block of the secondary index block 2B is referred to based on the next-layer address information associated with that comparison key information of the primary index block 1B.
[0051]
Then, between the two-character information from the beginning of the input search key information and the comparison key information of the specified segmented block of the secondary index block 2B, the same as in the case of the above-mentioned primary index block. The comparison process is performed.
[0052]
When the comparison with the comparison key information of the designated subdivided block of the secondary index block 2B determines that the first two characters of the input search key information come before the comparison key information being compared in Japanese syllabary order, or are the same as it, the corresponding subdivided block of the tertiary index block 3B, designated by the next-layer address information associated with that comparison key information, is referred to.
[0053]
Then, the input search key information is compared with the comparison key information of the designated subdivided block of the tertiary index block 3B, and the comparison key information that matches the input search key information is detected. The body address stored in association with the detected comparison key information indicates the recording start position on the CD-ROM of the search target data for the input search key information. Therefore, the search target data corresponding to the input search key information can be acquired by reading the body data from the recording position on the CD-ROM indicated by the body address.
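The three-level lookup described above can be sketched in Python. This is a minimal illustration, not the electronic book standard format: the index contents, the block labels used in place of next-layer addresses, and the function names are hypothetical, and plain ASCII strings with Python's built-in string ordering stand in for kana keys and Japanese syllabary order.

```python
# Hypothetical three-level index. In the real index, the references in the
# upper layers are addresses of subdivided blocks; labels are used here.
INDEX = {
    "primary": [("ap", "2B1"), ("ch", "2B2")],
    "secondary": {
        "2B1": [("ab", "3B1"), ("ap", "3B2")],
        "2B2": [("ba", "3B3"), ("ch", "3B4")],
    },
    "tertiary": {
        "3B1": [("aardvark", 0x0000), ("abacus", 0x0120)],
        "3B2": [("acorn", 0x0300), ("apple", 0x0450)],
        "3B3": [("banana", 0x0800)],
        "3B4": [("berry", 0x0A00), ("cherry", 0x0C40)],
    },
}


def next_block(entries, prefix):
    """Return the next-layer reference of the first comparison key that the
    prefix precedes or equals; None if the prefix follows every key."""
    for compare_key, reference in entries:
        if prefix <= compare_key:
            return reference
    return None


def find_body_address(index, search_key):
    """Walk primary -> secondary -> tertiary and return the body address."""
    prefix = search_key[:2]  # upper layers compare only the first two characters
    ref2 = next_block(index["primary"], prefix)
    if ref2 is None:
        return None
    ref3 = next_block(index["secondary"][ref2], prefix)
    if ref3 is None:
        return None
    # The lowest layer requires an exact match with the full search key.
    for compare_key, body_address in index["tertiary"][ref3]:
        if compare_key == search_key:
            return body_address
    return None
```

With this sample data, `find_body_address(INDEX, "apple")` passes through 2B1 and 3B2 and returns 0450H, while a key that is absent from the lowest-level block yields `None`.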
[0054]
For example, when “greeting” is input as the search key information, the two characters “ai” from the head of the search key information are compared with the comparison key information of the primary index block. First, the two characters “Ai” from the beginning of the search key information are compared with the comparison key information “Ama” of the primary index block. Since the two characters “Ai” from the beginning of the search key information are information located in front of the comparison key information “Ama” in Japanese alphabetical order, the address recorded corresponding to the comparison key information “Ama” Based on the information, the subdivision block 2B1 of the secondary index block is referred to.
[0055]
Then, the two characters “Ai” from the beginning of the search key information are compared with the comparison key information of the subdivision block 2B1 of the secondary index block 2B. First, the two characters “Ai” from the beginning of the search key information are compared with the comparison key information “Au” of the subdivision block 2B1 of the secondary index block 2B. Since the two characters “Ai” from the beginning of the search key information are information located in front of the comparison key information “AOU” in the order of the Japanese syllabary order, the address recorded corresponding to the comparison key information “AOU” Based on the information, the segmented block 3B1 of the third index block is referred to.
[0056]
Then, the comparison key information matching the search key information “greeting” is detected from the comparison key information of the subdivided block 3B1, and, based on the body address recorded in association with the detected comparison key information “greeting”, the data of the search target item corresponding to the input search key information is read out and provided. That is, in this case, the text data indicating the meaning of the search key information “greeting” is read from the body data as the data of the search target item and provided to the user.
[0057]
In this embodiment, as described with reference to FIG. 2, the index information of the electronic book standard created in a hierarchical structure is provided in a CD-ROM in which text data is recorded, as shown in FIG. Recorded in the index area IDX.
[0058]
In this case, the primary index block 1B, the secondary index block group 2B, ..., and the nth index block group are recorded in the index area IDX in this order, starting from the lowest physical address of the index area IDX. Thereby, the hierarchy of the index information can be traced in order from the lowest physical address to the higher physical addresses.
[0059]
In this embodiment, a character management block MK is provided so that the position in the primary index block from which the search starts can be determined according to the input search key information. For example, if the first character of the search key information is in the “a line” to the “na line” of the Japanese syllabary, the search starts from the top of the primary index block 1B, and if the first character of the search key information is in the “ha line” or later, the search can be started from a predetermined position near the middle of the primary index block, that is, from the start position of “ha” in the primary index block.
[0060]
In this way, because the index information of the electronic book standard has a hierarchical structure, the search range can be narrowed down efficiently without searching all the comparison key information of the index information, and the search target item corresponding to the input search key information can be quickly retrieved from the body data and used.
[0061]
By the way, as described above, since the text data of the electronic book standard is stored on the CD-ROM in an uncompressed state, the text data of more documents cannot be recorded, and in many cases the limited storage capacity is not used effectively. Compression of the text data is therefore conceivable, but, as described above, the text data cannot simply be compressed.
[0062]
Further, when the main text data is compressed and recorded on the CD-ROM, it is necessary to recreate the index information. However, as described above with reference to FIG. 2, the electronic book standard index information has a complicated hierarchical structure. Therefore, it takes time and cost to recreate the index information. This is a cause of obstructing provision of a CD-ROM of an electronic book system having a more substantial content by recording more text data on the CD-ROM.
[0063]
Therefore, the information recording apparatus of this embodiment shown in FIG. 1 divides the text data into predetermined data amounts. Hereinafter, in this specification, a collection of data (part of the text data) formed by dividing the text data into predetermined data amounts is referred to as a unit block. Then, the body data is compressed for each unit block, and the compressed body data is sequentially packed into consecutive addresses and recorded on the CD-ROM.
[0064]
The information recording apparatus of this embodiment also records the index information already created for the text data before compression on the CD-ROM as it is. So that the data of the search target item corresponding to the input search key information can be obtained from the compressed body data using this index information, the apparatus forms a compressed size table that stores, in association with each unit block, the accumulated value of the compressed data amounts of the unit blocks formed by dividing the body data. The electronic book standard CD-ROM is created by storing this table on the CD-ROM together with the body data compressed for each unit block and the index information.
[0065]
In this case, the recording area of the CD-ROM is divided into a body data area, which is the recording area for the body data, an index area, which is the recording area for the index information, and a compression size table area, which is the recording area for the compression size table. The compressed body data, the index information, and the compressed size table are recorded in the corresponding recording areas. Hereinafter, the information recording apparatus of this embodiment shown in FIG. 1 will be described in detail.
[0066]
The index information generating unit 1 generates index information to be recorded on the CD-ROM 200 based on the index information created in advance corresponding to the uncompressed body data to be compressed and recorded on the CD-ROM 200. This is supplied to the write control unit 6.
[0067]
The body data generating unit 2 generates body data to be compressed and recorded on the CD-ROM 200 and supplies this to the data dividing unit 3. The data dividing unit 3 divides the supplied text data into unit blocks having a predetermined data amount, and supplies the text data to the data compression unit 4 for each unit block.
[0068]
FIG. 4 is a diagram for explaining the text data division processing and compression processing performed in the data division unit 3 and the data compression unit 4. For the sake of simplicity, FIG. 4 shows only an excerpt of the text data before compression (hereinafter referred to as uncompressed text data).
[0069]
In this embodiment, the data dividing unit 3 divides the uncompressed text data DT into unit blocks DT1, DT2, DT3, DT4,... Having a predetermined size as shown in FIG.
[0070]
In FIG. 4A, 0000H, 1000H, 2000H, and 3000H attached to the left side of the uncompressed body data DT indicate the start addresses of the unit blocks DT1, DT2, DT3, and DT4, measured from the beginning of the uncompressed body data, and the last address shown indicates the end address of the unit block DT4. The letter “H” appended to the end of each address indicates that the address is expressed in hexadecimal. Hereinafter, in this specification, the letter “H” attached to the end of address information, a compressed size, a number of bytes, and the like indicates that the value is expressed in hexadecimal.
[0071]
Therefore, in this embodiment, as shown in FIG. 4A, the uncompressed text data DT is divided into unit blocks DT1, DT2, DT3, DT4,... Which are units of compression every 4096 bytes. The uncompressed text data for each block is supplied to the data compression unit 4.
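The division into 4096-byte unit blocks can be sketched in Python as follows. The function name and the handling of a final, shorter block are assumptions made for illustration; the text does not state how a trailing partial block is treated.

```python
UNIT_BLOCK_SIZE = 4096  # 1000H bytes, the unit of compression in this embodiment


def split_into_unit_blocks(body, size=UNIT_BLOCK_SIZE):
    """Split uncompressed body data into consecutive unit blocks.

    Block k covers the uncompressed addresses k * size .. (k + 1) * size - 1.
    A shorter final block is kept as it is (an assumption of this sketch).
    """
    return [body[pos:pos + size] for pos in range(0, len(body), size)]


blocks = split_into_unit_blocks(bytes(10000))
block_lengths = [len(block) for block in blocks]  # 4096, 4096, 1808
```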
[0072]
  The data compression unit 4 compresses the body data for each unit block. The text data compressed for each unit block in the data compression unit 4 is sequentially recorded on the CD-ROM 200 to form the compressed text data DTA as shown in FIG. 4A. Here, the post-compression unit blocks DTA1, DTA2, DTA3, and DTA4 constituting the post-compression body data DTA are formed by compressing the data of the unit blocks DT1, DT2, DT3, and DT4 of the body data before compression.
[0073]
As shown in FIG. 4B, for example, the unit block DT1 of about 4 kilobytes (4096 bytes) from address 0000H to address 0FFFH is compressed to 2066 bytes (0812H bytes in hexadecimal), as indicated by the corresponding post-compression unit block DTA1.
[0074]
Similarly, as shown in FIG. 4A, the unit block DT2 of about 4 kilobytes from address 1000H to 1FFFH is compressed to 2294 bytes (08F6H bytes in hexadecimal) as indicated by the post-compression unit block DTA2, and the unit block DT3 from address 2000H to 2FFFH is compressed to 1767 bytes (06E7H bytes in hexadecimal) as indicated by the post-compression unit block DTA3. The unit block DT4 from address 3000H to 3FFFH is compressed to 2578 bytes (0A12H bytes in hexadecimal) as indicated by the post-compression unit block DTA4.
[0075]
Then, the data compression unit 4 supplies the body data compressed for each unit block to the write control unit 6. Further, the data compression unit 4 detects the data amount of the compressed body data for each unit block, and supplies this to the compression size table generation unit 5.
[0076]
In this embodiment, the compression size table generation unit 5 obtains the accumulated value of the data amounts after compression of the unit blocks supplied from the data compression unit 4, creates a compressed size table in which this accumulated value is associated with each unit block as its compressed size, and supplies the table to the write control unit 6.
[0077]
That is, the data compression unit 4 sequentially supplies the compressed data sizes of the post-compression unit blocks DTA1 to DTA4, formed by compressing the unit blocks DT1 to DT4, to the compressed size table generation unit 5. In this embodiment, the compressed size table generation unit 5 accumulates the compressed data amounts of the unit blocks in order from the head unit block to obtain accumulated values, and forms a compressed size table TB in which each accumulated value is associated with its unit block.
[0078]
In FIG. 4A, 0812H, 1108H, 17EFH, and 2201H attached to the right side of the compressed text data DTA indicate the accumulated values of the data amounts of the compressed unit blocks DTA1 to DTA4.
[0079]
That is, as described above, the data amounts of the compressed unit blocks DTA1, DTA2, DTA3, and DTA4 formed by compressing the unit blocks DT1, DT2, DT3, and DT4 are 0812H, 08F6H, 06E7H, and 0A12H. The compression size table generation unit 5 sequentially accumulates these to obtain an accumulated value of the amount of data after compression up to the unit block to be compressed corresponding to each unit block.
[0080]
Therefore, if the post-compression unit block DTA1 is the first unit block after compression, the accumulated value up to the post-compression unit block DTA1 is 0812H. The accumulated value up to the post-compression unit block DTA2 is 0812H plus 08F6H, which is 1108H. Similarly, the accumulated value up to the post-compression unit block DTA3 is 1108H plus 06E7H, which is 17EFH, and the accumulated value up to the post-compression unit block DTA4 is 17EFH plus 0A12H, which is 2201H.
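The running totals can be checked with a few lines of Python. The variable names are illustrative only; the four compressed sizes are the values given for DTA1 to DTA4.

```python
from itertools import accumulate

# Compressed sizes of DTA1..DTA4 as stated in the text (hexadecimal).
compressed_sizes = [0x0812, 0x08F6, 0x06E7, 0x0A12]

# Running total: entry i is the accumulated size up to and including block i.
accumulated = list(accumulate(compressed_sizes))

print([f"{value:04X}H" for value in accumulated])
# ['0812H', '1108H', '17EFH', '2201H']
```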
[0081]
  In this way, the compression size table generation unit 5 obtains accumulated values by accumulating the data amounts of the compressed unit blocks, and forms the compressed size table TB in which each accumulated value of the data amount after compression is associated with its unit block as the compressed size.
[0082]
FIG. 5 is a diagram for explaining the compression size table TB of this embodiment. As shown in FIG. 5A, the accumulated value of the size of the compressed unit block obtained for each unit block is recorded in the compressed size table TB as a compressed size. In this case, each compression size (accumulated value of the data amount of the unit block after compression) is expressed in hexadecimal and 4 bytes. The address of the compressed size table TB is the head address of each compressed size from the head of the compressed size table TB. The compressed size table TB created in the compressed size table generation unit 5 is supplied to the write control unit 6.
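A table of 4-byte entries of this kind might be laid out as in the following Python sketch. The byte order is not stated in the text, so big-endian is assumed here purely for illustration, and the function names are hypothetical.

```python
import struct

ENTRY_SIZE = 4  # each compressed size (accumulated value) occupies 4 bytes


def pack_size_table(accumulated_sizes):
    """Serialize accumulated sizes as consecutive 4-byte entries.

    ">I" means big-endian unsigned 32-bit; the byte order is an assumption.
    """
    return b"".join(struct.pack(">I", value) for value in accumulated_sizes)


def read_entry(table_bytes, block_index):
    """Read the entry for one unit block; its table address is index * 4."""
    (value,) = struct.unpack_from(">I", table_bytes, block_index * ENTRY_SIZE)
    return value


table_bytes = pack_size_table([0x0812, 0x1108, 0x17EF, 0x2201])
third_entry = read_entry(table_bytes, 2)  # 17EFH, stored at table address 8
```

Because every entry has the same width, the entry for a unit block can be found by multiplication alone, without scanning the table.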
[0083]
The writing control unit 6 writes information supplied from each unit to the CD-ROM 200. That is, the write control unit 6 writes the index information from the index information generation unit 1 in an index area formed on the CD-ROM 200. Further, the compressed body data for each unit block from the data compression unit 4 is sequentially packed and written in the body data area of the CD-ROM 200. Similarly, the compressed size table information from the compressed size table generating unit 5 is written into the compressed size table area.
[0084]
As described above, the information recording apparatus according to this embodiment stores the index information created corresponding to the uncompressed body data, the body data compressed for each unit block, and the compression size table on the CD-ROM. The data is written, and an electronic book standard CD-ROM is created.
[0085]
Then, as will be described in detail later, based on the index information recorded on the CD-ROM and the compressed size table, the compressed body data including the designated search target item is read and decompressed. By doing so, the designated search target data can be used.
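The read-and-decompress step can be sketched in Python as follows. Several points are assumptions rather than part of the described apparatus: `zlib` stands in for the unspecified compression scheme, the `length` parameter is a hypothetical way of saying how much of the item to return, and continuing into the next unit block when an item crosses a block boundary is a design choice of this sketch.

```python
import zlib

UNIT_BLOCK_SIZE = 0x1000  # assumed unit block size (4096 bytes)


def read_item(body_area, size_table, body_address, length,
              block_size=UNIT_BLOCK_SIZE):
    """Return `length` bytes of body data starting at an uncompressed address.

    body_area  : all compressed unit blocks packed back to back
    size_table : accumulated compressed sizes, one entry per unit block
    """
    index, offset = divmod(body_address, block_size)
    out = bytearray()
    while len(out) < length and index < len(size_table):
        # Start and end of this unit block inside the compressed body area.
        start = size_table[index - 1] if index > 0 else 0
        end = size_table[index]
        block = zlib.decompress(body_area[start:end])
        # Copy from the item head (or from the block start on later blocks).
        out += block[offset:offset + (length - len(out))]
        index, offset = index + 1, 0
    return bytes(out)
```

Only the unit blocks that overlap the requested item are read and decompressed, so the working memory needed is on the order of one unit block rather than the whole body data.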
[0086]
Next, the above-described creation process of the electronic book standard CD-ROM will be described with reference to the flowchart of FIG.
[0087]
The uncompressed body data to be recorded, supplied from the body data generating unit 2, is divided by the data dividing unit 3 into unit blocks having a predetermined amount of data (step S1). That is, as described above with reference to FIG. 4, the processing of step S1 divides the uncompressed text data into unit blocks having a data amount of, for example, 4096 bytes.
[0088]
Next, the data compression unit 4 performs a compression process of the body data for each unit block using a unit block of a certain size as a processing unit (step S2), and acquires the data amount of the unit block after compression. (Step S3).
[0089]
The text data compressed for each unit block is sequentially packed under the control of the write control unit 6 and recorded in the body data area of the CD-ROM 200 (step S4). Further, the data amount of the compressed unit block acquired by the data compression unit 4 is supplied to the compression size table creation unit 5, and the accumulated value of the data amounts of the compressed unit blocks is calculated (step S5). In step S5, for example, the accumulated value of the data amounts of the compressed unit blocks up to the previous time is held, and the current accumulated value is obtained by adding the compressed data amount of the unit block compressed this time to that held value.
[0090]
Then, the compression size table creation unit 5 creates a compression size table in the memory provided therein, for example, by using the current accumulated value as the compression size so that it can be associated with the unit block targeted for the current compression. (Step S6).
[0091]
That is, in step S6, the accumulated value of the compressed sizes up to the first unit block is recorded as the first entry of the compression size table, the accumulated value up to the second unit block is recorded as the second entry, and so on. The compressed size table for the body data to be compressed and recorded is thus created so that it is clear up to which unit block each accumulated value extends.
[0092]
Then, it is determined whether or not the compression process has been completed for all the unit blocks of the body data to be compressed (step S7). If it is determined that the compression has not been completed, the next unit block is set as the processing target (step S8), and the process is repeated from step S2.
[0093]
If it is determined in step S7 that the compression processing has been completed for all unit blocks of the body data to be compressed, the compression size table created in the compression size table creation unit 5 and the index information from the index information generation unit 1 are supplied to the write control unit 6, and are recorded in the compressed size table area and the index area of the CD-ROM, respectively (step S9). This completes the body data compression processing and the compression size table creation processing of the flowchart.
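Steps S1 to S9 can be summarized in a short Python sketch. It is an illustration under stated assumptions: `zlib` stands in for the unspecified compression scheme, the function name `record_body` is hypothetical, and the actual writing to the CD-ROM areas is reduced to returning the packed data and the table.

```python
import zlib

UNIT_BLOCK_SIZE = 4096  # assumed unit block size (1000H bytes)


def record_body(body, block_size=UNIT_BLOCK_SIZE):
    """Compress body data per unit block and build the compressed size table.

    Returns (body_area, size_table): the compressed unit blocks packed back
    to back, and the accumulated compressed size for each unit block.
    """
    body_area = bytearray()
    size_table = []
    accumulated = 0
    for pos in range(0, len(body), block_size):  # S1 division; S7/S8 loop
        unit_block = body[pos:pos + block_size]
        compressed = zlib.compress(unit_block)   # S2: compress one unit block
        accumulated += len(compressed)           # S3, S5: size and running total
        body_area += compressed                  # S4: pack into the body data area
        size_table.append(accumulated)           # S6: table entry for this block
    # S9: the caller writes size_table (and the unchanged index) to the medium.
    return bytes(body_area), size_table
```

The last table entry equals the total length of the packed body data, which gives a simple consistency check after recording.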
[0094]
In this way, the body data is divided into unit blocks, compressed for each unit block, and recorded on the CD-ROM of the electronic book system. In addition, the compressed size table shown in FIG. 5B, in which the accumulated value of the data amounts of the compressed unit blocks is used as the compressed size, and the index information created for the body data before compression are recorded on the CD-ROM 200.
[0095]
As described above, since the body data is compressed before being recorded on the CD-ROM, the body data of more documents than before can be recorded on a single CD-ROM. When the body data of a plurality of documents is compressed and recorded on one CD-ROM, recording on the same CD-ROM, for each document, the compressed size table corresponding to the body data compressed per unit block and the index information created for the body data before compression allows all of the body data recorded on that CD-ROM to be used in the same way.
[0096]
In addition, since the body data is compressed per unit block of a predetermined size, decompression need only be performed on one compressed unit at a time, so the information retrieval apparatus does not need a large memory. Likewise, because only one unit block has to be handled at a time, reading and decompressing the compressed body data does not take long.
[0097]
Further, when the index information has already been created, the existing index information can be used as it is, so that it is not necessary to create new index information corresponding to the compressed body data.
[0098]
In the above description, the body data of all unit blocks is compressed and recorded on the CD-ROM 200, and then the index information and the compressed size table for that body data are recorded on the CD-ROM 200. However, since the index information is created in advance for the body data before compression, a CD-ROM on which the index information has already been recorded may be prepared, and the body data compressed per unit block and the compressed size table may then be recorded on that CD-ROM.
[0099]
Further, the completed compressed size table may be recorded on the CD-ROM 200 after the compressed size table creation unit 5 has finished a table containing the compressed sizes for all unit blocks. Alternatively, the accumulated value of the compressed data amount may be recorded on the CD-ROM 200 sequentially, one unit block at a time.
[0100]
[Information retrieval device]
Next, information retrieval using the CD-ROM for the electronic book system created as described above, on which are recorded the index information created for the body data before compression, the body data compressed per unit block, and the compressed size table, will be described. Here it is assumed that the head of the body data before compression and the head of the body data after compression both coincide with the head (0000H) of the body data area of the CD-ROM 200, as shown in the drawings.
[0101]
FIG. 7 is a block diagram for explaining the information retrieval apparatus of the electronic book system according to this embodiment, into which the CD-ROM 200 for the electronic book system created by the information recording apparatus described above with reference to FIG. 1 is loaded, and which can search the information recorded on the CD-ROM 200. The information retrieval method according to the present invention is applied to this information retrieval apparatus.
[0102]
As shown in FIG. 7, the information retrieval apparatus of the electronic book system of this embodiment includes an optical pickup 11, a biaxial device 12, a spindle motor 13, a driver 14, an RF amplifier 15, a signal processing unit 16, a display control unit 17, a display panel 18, and a CPU 100 to which a ROM 101, a RAM 102, and a key operation unit 103 are connected.
[0103]
The CPU 100 functions as a system controller that controls the operation of each unit of the information retrieval apparatus. The ROM 101 stores programs and data used in the information retrieval apparatus, such as operation programs and font data for displayed characters.
[0104]
The RAM 102 is used as a work area for processing performed in the information retrieval apparatus, such as temporarily storing reproduction data read from the CD-ROM 200. The key operation unit 103 includes a plurality of operation keys such as numeric keys and alphabet keys, and receives information input from the user such as search key information.
[0105]
When the operation key of the key operation unit 103 is operated by the user and the search key information is input, the CPU 100 accepts this and starts the search process of the body data recorded on the CD-ROM 200.
[0106]
First, the CPU 100 supplies the driver 14 with a control signal instructing it to start search processing. In response to this control signal, the driver 14 drives the optical pickup 11, the biaxial device 12, and the spindle motor 13, and the data recorded on the CD-ROM 200 is read by the optical pickup 11.
[0107]
Although not shown, the optical pickup 11 includes, for example, a laser diode, an objective lens, a half mirror, and a photodetector. It irradiates a track of the CD-ROM 200 with a laser beam, receives the reflected light with the photodetector, and reads the data recorded on the CD-ROM 200 from changes in the amount of reflected light.
[0108]
In this embodiment, the photodetector of the optical pickup 11 is divided into a plurality of light receiving areas in order to detect a focus error and a tracking error.
[0109]
The reflected light from the CD-ROM 200 received by each light receiving area of the photodetector of the optical pickup 11 is converted into an electric signal and supplied to the RF amplifier 15. The RF amplifier 15 forms a reproduction high-frequency signal, a focus error signal FE, and a tracking error signal TE from electric signals from the respective light receiving regions of the photodetector of the optical pickup 11.
[0110]
The focus error signal FE and tracking error signal TE formed in the RF amplifier 15 are supplied to the CPU 100. The CPU 100 controls the biaxial device 12 through the driver 14 based on these signals FE and TE so as to perform focus error control and tracking error control.
[0111]
The reproduced high-frequency signal formed by the RF amplifier 15 is supplied to the signal processing unit 16, where analog/digital conversion and demodulation matching the modulation method used when recording on the CD-ROM 200 are performed to obtain demodulated data.
[0112]
In this case, as will be described later, the CPU 100 first refers to the index information recorded on the CD-ROM 200 based on the input search key information and, based on that index information, specifies the unit block that contains the head of the search target item corresponding to the input search key information.
[0113]
That is, the CPU 100 refers to the index information, which is recorded on the CD-ROM 200 and was created for the uncompressed body data, and acquires the body address indicating the head position of the search target item corresponding to the input search key information. It then obtains the amount of pre-compression body data from the head of the body data to the head position of the search target item. Dividing this data amount by the data amount per unit block, which is the processing unit used at compression time, identifies which unit block, counted from the first unit block, contains the head position of the search target item.
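The division described here amounts to an integer division of the pre-compression offset by the unit block size. A minimal sketch follows, assuming a 4096-byte unit block and a zero-based block index; the function name `locate_block` and the `body_start` parameter are hypothetical and only serve the illustration.

```python
BLOCK_SIZE = 4096  # data amount per unit block before compression


def locate_block(body_address: int, body_start: int = 0,
                 block_size: int = BLOCK_SIZE) -> int:
    """Return the zero-based index of the unit block containing body_address.

    body_address is the pre-compression address taken from the index
    information; body_start is the head address of the body data.
    """
    data_amount = body_address - body_start  # bytes from head of body data
    return data_amount // block_size         # integer division picks the block
```

With a zero-based index, block 0 covers addresses 0000H to 0FFFH, block 1 covers 1000H to 1FFFH, and so on.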
[0114]
Next, the CPU 100 refers to the compressed size table, and reads out compressed body data corresponding to the specified unit block from the CD-ROM 200 based on the information of the compressed size table.
[0115]
That is, as described above with reference to FIG. 5, each compressed size in the compressed size table is an accumulated value of the data amounts of the unit blocks after compression, and therefore corresponds to the start position of the next unit block after compression. In addition, subtracting the accumulated value (compressed size) up to the preceding compressed unit block from the accumulated value (compressed size) up to the target compressed unit block gives the data amount of the target unit block after compression.
[0116]
Accordingly, in the example shown in FIG. 4A, when the compressed data of the post-compression unit block DTA3 corresponding to the pre-compression unit block DT3 is to be read and used, the compressed size 17EFH up to the unit block DT3 and the compressed size 1108H up to the preceding unit block DT2 are read from the compressed size table TB shown in FIG. 5B. Subtracting 1108H from 17EFH gives the data amount of the post-compression unit block DTA3, which in this case is 06E7H, that is, 1767 bytes, as shown in the drawings.
[0117]
Then, if 1767 bytes of compressed body data, which is the data amount of the post-compression unit block DTA3, are read starting from byte 1108H, which is the compressed size from the head of the compressed body data through the post-compression unit block DTA2 immediately preceding DTA3, the compressed body data corresponding to the unit block DT3, that is, the entire post-compression unit block DTA3, can be read out. The compressed body data of the specified unit block read out in this way is temporarily stored in the RAM 102.
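The lookup in the compressed size table can be sketched as below. The values 1108H and 17EFH come from the example in the text; the first table entry is invented purely so that the list has three entries, and the function name `compressed_range` is hypothetical.

```python
def compressed_range(size_table, block_index):
    """Return (offset, length) of a compressed unit block.

    size_table holds cumulative compressed sizes, one entry per unit block,
    and block_index is zero-based.
    """
    end = size_table[block_index]
    # The previous cumulative value is where this block starts; block 0 starts at 0.
    start = size_table[block_index - 1] if block_index > 0 else 0
    return start, end - start


# First entry is a made-up placeholder; 0x1108 and 0x17EF are from the text.
table = [0x0890, 0x1108, 0x17EF]
print(compressed_range(table, 2))  # (4360, 1767)
```

The result says that the third compressed block starts at byte 4360 (1108H) and is 1767 bytes (06E7H) long.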
[0118]
Then, the CPU 100 decompresses the compressed body data temporarily stored in the RAM 102 to restore the original, pre-compression body data of the specified unit block. From the decompressed body data of that unit block, it acquires the data of the search target item corresponding to the input search key information, based on the body address, acquired from the index information as described above, that indicates the head position of the search target item.
[0119]
That is, which unit block of the whole body data recorded on the CD-ROM 200 has been specified is already known from the above, and every unit block has the same predetermined size before compression. The head position of the specified unit block, measured from the head of the whole body data, is therefore easily found: it is the number of unit blocks preceding the specified unit block multiplied by the predetermined data amount per unit block.
[0120]
Therefore, subtracting the address indicating the head position of the specified unit block from the body address, acquired from the index information, that indicates the head position of the search target item gives the head position of the search target item measured from the head of the specified unit block. Reading the body data from that position in the decompressed unit block then yields the data of the search target item corresponding to the input search key information.
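The multiplication and subtraction in the two preceding paragraphs can be sketched together as follows. The 4096-byte block size is the one used in the embodiment; the function name `offset_in_block` and the `body_start` parameter are hypothetical.

```python
BLOCK_SIZE = 4096  # pre-compression size of every unit block


def offset_in_block(body_address: int, body_start: int = 0,
                    block_size: int = BLOCK_SIZE):
    """Return (block_index, offset) for a pre-compression body address.

    offset is the item's head position measured from the head of the
    decompressed unit block that contains it.
    """
    relative = body_address - body_start
    block_index = relative // block_size
    block_head = block_index * block_size   # head position of the block
    return block_index, relative - block_head


print(offset_in_block(0x2345))  # (2, 837)
```

The same pair could be obtained with `divmod(relative, block_size)`; the explicit form is used here only to mirror the wording of the description.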
[0121]
In this way, the data of the search target item corresponding to the search key information is acquired. Based on this data, the font data stored in the ROM 101 is used to form shape data of the display information, such as the characters to be displayed, and this shape data is supplied to the display control unit 17.
[0122]
The display control unit 17 includes a display memory 71 and forms display image data in the display memory 71 in accordance with the shape data of the display information from the CPU 100. Then, the display control unit 17 controls the display panel 18 configured by a liquid crystal display panel or the like, and causes the display panel 18 to display an image corresponding to the image data formed in the display memory 71.
[0123]
Thereby, the display panel 18 displays the data of the search target item read from the CD-ROM 200 based on the search key information from the user.
[0124]
[Operations during information retrieval in the information retrieval device]
Next, the operation at the time of a search in the information retrieval apparatus of the electronic book system of this embodiment will be described with reference to the flowchart.
[0125]
When the CD-ROM 200 for the electronic book system is loaded in the information retrieval apparatus of this embodiment and search key information is input by the user through the key operation unit 103 (step S11), the CPU 100 controls the optical pickup 11, the biaxial device 12, and the spindle motor 13 through the driver 14, refers to the index information created for the body data before compression, and, as described above, specifies the unit block (before compression) that contains the head of the search target data corresponding to the input search key information (step S12).
[0126]
Then, the accumulated value (compressed size) RA of the post-compression unit block sizes up to and including the specified unit block, and the accumulated value (compressed size) RB of the post-compression unit block sizes up to the preceding unit block, are read from the compressed size table described above (step S13). The post-compression size SA of the specified unit block is calculated by subtracting RB from RA (step S14).
[0127]
The subtraction in step S14 is illustrated concretely as follows. If, for example, the compressed size RA up to the specified unit block is 014C10CFH and the compressed size RB up to the preceding unit block is 014C09F6H, subtracting 014C09F6H from 014C10CFH gives 06D9H, so the size SA of the specified unit block is 1753 bytes.
[0128]
As described above, since the compressed size is an accumulated value of the post-compression unit block sizes, each compressed size itself indicates the head position of the next unit block. The CPU 100 therefore moves the reading position to the head position of the specified unit block on the CD-ROM 200, which is indicated by the compressed size RB (step S15), and reads compressed body data amounting to the size SA of the specified unit block from that position (step S16).
[0129]
That is, in the above example, the compressed size RB up to the unit block immediately preceding the specified unit block is 014C09F6H = 21,760,502 bytes, so 1753 bytes of compressed body data are read starting from byte offset 21,760,502, measured from the head of the compressed body data.
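The hexadecimal arithmetic of steps S14 to S16 can be checked with a few lines of Python. The two table values are the ones given in the example; the variable names simply follow the symbols RA, RB, and SA used in the description.

```python
RA = 0x014C10CF  # compressed size up to and including the specified unit block
RB = 0x014C09F6  # compressed size up to the preceding unit block

SA = RA - RB     # step S14: post-compression size of the specified unit block

print(f"SA = {SA} bytes ({SA:04X}H)")         # SA = 1753 bytes (06D9H)
print(f"read offset = {RB} ({RB:08X}H)")      # read offset = 21760502 (014C09F6H)
```

RB serves directly as the read offset in step S15 because the head of the compressed body data is assumed to coincide with the head of the body data area.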
[0130]
The compressed body data thus read is temporarily stored in the RAM 102. Since the body data temporarily stored in the RAM 102 is the compressed form of the body data of the specified unit block, decompressing it restores the original, pre-compression body data of the specified unit block (step S17).
[0131]
Then, as described above, the head position of the data of the search target item corresponding to the input search key information is located within the decompressed body data, and the data of the search target item is read from the decompressed body data of the unit block and reproduced (step S18).
[0132]
In step S18, the search target item corresponding to the search key information is read from the decompressed body data temporarily stored in the RAM 102, and a display image of that item is formed in the display memory 71 of the display control unit 17 using the font data stored in the ROM 101. The display image in the display memory 71 is displayed on the display panel 18, and the body data corresponding to the input search key information is thereby provided to the user.
[0133]
Because the search target data corresponding to the input search key information may extend over a plurality of unit blocks, a so-called end mark indicating the end of the data is appended to each piece of search target data making up the body data of each document. If the end mark is not detected in the decompressed unit block, the processing from step S13 is repeated with the unit block following the specified unit block as the newly specified unit block. This makes it possible to handle the case where the search target data corresponding to the input search key information extends over a plurality of unit blocks.
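The retrieval flow of steps S12 to S18, including the continuation into following unit blocks when no end mark is found, might look like the sketch below. Several things here are assumptions made only for illustration: `zlib` stands in for the unnamed compression scheme, the end mark is taken to be a single zero byte (the source does not give its value), the compressed data is held in memory rather than read from a disc, and all function names are hypothetical.

```python
import zlib

BLOCK_SIZE = 4096
END_MARK = b"\x00"  # hypothetical end mark value; not specified in the source


def build(body, block_size=BLOCK_SIZE):
    """Compress body per unit block; return (compressed bytes, cumulative table)."""
    chunks, table, total = [], [], 0
    for i in range(0, len(body), block_size):
        c = zlib.compress(body[i:i + block_size])
        chunks.append(c)
        total += len(c)
        table.append(total)
    return b"".join(chunks), table


def read_block(compressed, table, index):
    """Steps S13 to S17: locate, read, and decompress one unit block."""
    start = table[index - 1] if index > 0 else 0
    return zlib.decompress(compressed[start:table[index]])


def fetch_item(compressed, table, body_address, block_size=BLOCK_SIZE):
    """Return the item starting at body_address, up to (not including) END_MARK."""
    index, offset = divmod(body_address, block_size)   # step S12
    item = bytearray()
    while index < len(table):
        data = read_block(compressed, table, index)
        end = data.find(END_MARK, offset)
        if end != -1:                    # end mark found: item is complete
            item += data[offset:end]
            return bytes(item)
        item += data[offset:]            # no end mark: continue in next block
        index, offset = index + 1, 0
    raise ValueError("end mark not found")
```

Only the blocks that the item actually touches are decompressed, which is the point of compressing per unit block rather than as one large unit.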
[0134]
As described above, with the information retrieval apparatus of this embodiment, the data of the search target item corresponding to the search key information input by the user can be read quickly and accurately from the CD-ROM 200 for the electronic book system, on which the index information created for the body data before compression, the body data compressed per unit block, and the compressed size table are recorded, and can be decompressed for use.
[0135]
Further, since the body data is compressed per unit block of a predetermined size and recorded on the CD-ROM, the unit of processing for reading and decompressing the body data can be kept small. Consequently, reading the body data from the CD-ROM and decompressing it does not take long, and the data of the search target item corresponding to the search key information can be obtained quickly from the body data recorded on the CD-ROM.
[0136]
In addition, since index information that has already been created can be used as it is, a transition can be made without difficulty to an electronic book system with richer content, in which more body data is recorded by compressing it before recording on the CD-ROM.
[0137]
In the above-described embodiment, it has been described that the body data before compression and the head of the body data after compression coincide with the head of the body data area of the CD-ROM 200. However, when the text data of a plurality of documents is recorded on the CD-ROM, the positions of the text data of each document on the CD-ROM are different.
[0138]
Therefore, information such as the head position of each document's body data as it would be recorded before compression, the head position of the body data after compression, and the data amount of the unit blocks is stored in the TOC (table of contents) or another area of the CD-ROM. This information can then be used when determining which unit block of the body data a given pre-compression unit block is, or when calculating the data amount from the head of the body data to the head position of the search target item corresponding to the search key information. In this way, even when the body data of a plurality of documents is recorded, it can be handled without any problem.
[0139]
Further, even if the index information and the compressed size table are created with the head of the corresponding body data taken as, for example, 0000H, the data of the search target item corresponding to the search key information can be acquired by recording on the CD-ROM, as described above, information such as the head position of the body data of each document as recorded before compression, the head position of the compressed body data, and the data amount of the unit blocks.
[0140]
In the above-described embodiment, the unit block has been described as having a size of 4096 bytes (about 4 kilobytes), but is not limited thereto. It can be increased or decreased in consideration of the storage capacity of the memory of the information retrieval apparatus, the data reading speed from the CD-ROM, the time required for compression / decompression processing, and the like.
[0141]
In the above-described embodiment, the accumulated value of the post-compression data amount of the unit blocks is recorded in the compressed size table, in association with each unit block, as the compressed size. However, the present invention is not limited to this.
[0142]
For example, a compressed size table may be created in which the post-compression data amount of each unit block itself is associated with that unit block. In other words, the compressed size table is created so that the post-compression data amount of each individual unit block is known: how many bytes the first unit block occupies after compression, how many bytes the second unit block occupies, and so on.
[0143]
In this case, the head position of the compressed body data of the target unit block is obtained by adding up the post-compression data amounts from the first unit block through the unit block immediately preceding the target unit block. The post-compression data amount of the target unit block can be read directly from the table; equivalently, it is the total of the post-compression data amounts from the first unit block through the target unit block minus the total from the first unit block through the preceding unit block. Thus, even when the compressed size table is created so that the post-compression data amount of each unit block is known, the post-compression data amount of the target unit block and its head recording position can be obtained, and the block can be read from the CD-ROM and used.
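For this alternative table layout, the head position becomes a sum over the preceding entries. The sketch below uses made-up per-block sizes chosen so that the third block matches the earlier example (offset 4360, length 1767); the function names are hypothetical.

```python
from itertools import accumulate


def range_from_sizes(sizes, block_index):
    """Return (offset, length) of a compressed block from per-block sizes.

    sizes[i] is the post-compression data amount of unit block i (zero-based).
    """
    offset = sum(sizes[:block_index])   # total of all preceding blocks
    return offset, sizes[block_index]


def to_cumulative(sizes):
    """Convert per-block sizes into the cumulative form used in the embodiment."""
    return list(accumulate(sizes))


sizes = [2000, 2360, 1767]              # illustrative values only
print(range_from_sizes(sizes, 2))       # (4360, 1767)
print(to_cumulative(sizes))             # [2000, 4360, 6127]
```

The trade-off is that per-block sizes need a summation for every lookup, while the cumulative form needs a single subtraction.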
[0144]
In the above-described embodiment, a CD-ROM is used as the recording medium for the electronic book system, but the present invention is not limited to this. Various recording media can be used, such as a so-called floppy disk, the small magneto-optical disk called a MiniDisc (MD), and a DVD (digital video disk).
[0145]
Further, as the body data of documents, not only text data but also graphics data can be processed in the same manner.
[0146]
Moreover, although the above-described information retrieval apparatus has been described as dedicated to an electronic book system, the present invention is not limited to this. For example, the present invention can be applied to an information processing apparatus such as a personal computer.
[0147]
[Effects of the invention]
As described above, according to the present invention, data can be compressed before being recorded on a recording medium, so that more data can be recorded on the recording medium. In addition, if index information exists for the data before compression, search processing can be performed using that existing index information even when the data is compressed and recorded on the recording medium.
[0148]
In addition, since the data to be recorded on the recording medium is divided into unit blocks of a predetermined size and compressed per unit block, little additional information is needed to specify the processing unit and the calculation for specifying it is simple, so that efficient, high-speed search processing can be performed.
[Brief description of the drawings]
FIG. 1 is a block diagram for explaining an embodiment of an information recording apparatus according to the present invention.
FIG. 2 is a diagram for explaining an example of electronic book standard index information;
FIG. 3 is a diagram for explaining an example of electronic book standard index information;
FIG. 4 is a diagram for explaining information compression processing performed in an embodiment of an information recording apparatus according to the present invention;
FIG. 5 is a diagram for explaining a compressed size table created in an embodiment of an information recording apparatus according to the present invention;
FIG. 6 is a flowchart for explaining information compression processing and compression size table creation processing performed in an embodiment of the information recording apparatus according to the present invention;
FIG. 7 is a block diagram for explaining an embodiment of an information retrieval apparatus according to the present invention;
FIG. 8 is a flow chart for explaining an operation at the time of information search processing of the embodiment of the information search device according to the present invention;
[Explanation of symbols]
1 ... Index information generation unit, 2 ... Body data generation unit, 3 ... Data division unit, 4 ... Data compression unit, 5 ... Compressed size table creation unit, 6 ... Writing control unit, 11 ... Optical pickup, 12 ... Biaxial device, 13 ... Spindle motor, 14 ... Driver, 15 ... RF amplifier, 16 ... Signal processing unit, 17 ... Display control unit, 71 ... Display memory, 18 ... Display panel, 100 ... CPU, 101 ... ROM, 102 ... RAM, 103 ... Key operation unit, DT ... Body data before compression, DTA ... Body data after compression, TB ... Compressed size table

Claims

An information retrieval method for use in an information retrieval apparatus comprising reading means for reading data from a recording medium, compression/decompression means for decompressing the read data, and data output means for outputting the decompressed data, the method detecting a target search target item from body data that includes a plurality of search target items and is recorded in compressed form, by using, as key information for the search, index information for detecting the head recording position of each search target item when the body data is recorded on the recording medium sequentially, without gaps between search target items, wherein
on the recording medium, the body data is divided into pieces of data of a predetermined equal size, the pieces compressed in units of the divided data are recorded sequentially so that their recording positions are contiguous, and, in addition to the index information, a compressed size table is recorded in which an accumulated value of the post-compression data size of each divided data unit is described in association with that divided data, the method comprising:
a reading step in which the reading means specifies, based on the index information, the data amount up to the address indicating the head position of the designated search target item, specifies the divided data unit including the head position by dividing that data amount by the data amount of the predetermined size, specifies, based on the accumulated values of the post-compression data size of the divided data, the recording start position and the data amount of the compressed body data corresponding to the specified divided data unit, and reads the body data corresponding to the specified divided data unit from the recording medium;
a compression/decompression step in which the compression/decompression means decompresses the compressed body data of the divided data unit read by the reading means in the reading step; and
a target data output step in which the data output means outputs the data of the search target item from the head position of the designated search target item, detected based on the index information, in the data decompressed by the compression/decompression means in the compression/decompression step.

An information retrieval apparatus for detecting a target search target item from body data that includes a plurality of search target items and is recorded in compressed form, by using, as key information for the search, index information for detecting the head recording position of each search target item when the body data is recorded on a recording medium sequentially, without gaps between search target items, wherein
on the recording medium, the body data is divided into pieces of data of a predetermined equal size, the pieces compressed in units of the divided data are recorded sequentially so that their recording positions are contiguous, and, in addition to the index information, a compressed size table is recorded in which an accumulated value of the post-compression data size of each divided data unit is described in association with that divided data, the apparatus comprising:
a loading unit for the recording medium;
search key information input means for receiving input of search key information for searching for the target search target item;
reading means for specifying, based on the index information corresponding to the search target item designated by the search key information input through the search key information input means, the data amount up to the address indicating the head position of the designated search target item, specifying the divided data unit including the head position by dividing that data amount by the data amount of the predetermined size, specifying, based on the accumulated values of the post-compression data size of the divided data, the recording start position and the data amount of the compressed body data corresponding to the specified divided data unit, and reading the body data corresponding to the specified divided data unit from the recording medium;
compression/decompression means for decompressing the compressed body data, compressed in units of the divided data, read by the reading means; and
target data output means for outputting the data of the search target item from the head position of the designated search target item, detected based on the index information, in the data decompressed by the compression/decompression means.