JP2004110411A - Document display system, document display method, and document display program

Info

Publication number: JP2004110411A
Application number: JP2002272120A
Authority: JP
Inventors: Shuji Senda; 仙田　修司
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2002-09-18
Filing date: 2002-09-18
Publication date: 2004-04-08

Abstract

PROBLEM TO BE SOLVED: To provide a system that displays a folded (line-wrapped) document image without using character segmentation.

SOLUTION: A text block extraction means 2 outputs the position and size of each block of character strings (text block) arranged in the document image, and a line rectangle extraction means 3 outputs the position and size of each line inside the text block. The user designates the text block to be displayed by means of a user instruction means 6, and a folded image generation means 4 generates a folded image of the text block, fitted to the display area, from the positions and sizes of the lines inside the designated text block.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、文書表示システム、文書表示方法および文書表示用プログラムに関し、特に表示領域の大きさに合わせて文字列を再配置して表示する文書表示システム、文書表示方法および文書表示用プログラムに関する。
【０００２】
【従来の技術】
従来の文書画像（文書を画素の集合としてデジタル化したもの）を表示するシステムでは、表示領域に収まりきらない文書画像を表示するために上下左右のスクロールを用いていた。文書画像と表示領域の関係の一例を図１５に示す。図１５に示すように表示したい文書画像１０２に対して表示領域１０１が小さい場合、ユーザは横方向と縦方向のスクロールを頻繁に繰り返しながら内容を把握しなければならず、非常に不便である。例えば、電子図書館などで、ディジタル画像として保存されている文書をパソコンなどで閲覧する場合、閲覧する文書量が多いと、ユーザにかかる負担も非常に大きくなる。
【０００３】
一方、文書を画像ではなく文字コードで扱うエディタまたはビューワは、図１６に示すように表示領域１０１に合わせて行を折り返し表示することにより、一方向（横書きであれば縦方向）のスクロールのみで内容を把握することができる（例えば、特許文献１参照。）。ここで折り返し表示とは、表示したい文書ブロックに対して表示領域が小さい場合に、文章ブロックの行の途中から新たな行に折り返して表示を行うことをいう。図１５の例では、文書画像１０２の文章ブロックの最初の１行は「あいうえお、かきくけこ、さし」からなっているのに対して、表示領域１０１は小さく、１行を１画面に表示できない。そこで図１６のように、表示領域１０１の大きさに合わせて、最初の１行の表示を「あいうえお、か」とし、「きくけこ、さし」を次の行に折り返して表示している。つまり、表示装置の違いや利用形態の違い（２つの文書を同時に閲覧するなど）により表示領域の大きさが変化する場合でも、それに合わせて文書中の文字列を折り返すことでその表示領域に適した表示を行う。
【０００４】
それとは別に、文書画像をレイアウト解析した結果を利用して折り返し表示を行う装置もある（例えば、特許文献２参照。）。レイアウト解析によって、文章のブロック、行、文字といった文書画像を構成している各要素の種類と位置と大きさを得ることができ、それらの情報を利用して文字単位で画像を再配置すれば前記エディタまたはビューワと同様に折り返し表示が実現できる。
【０００５】
【特許文献１】
特開平７−２６１７３６号公報　（第３−５頁、第６図）
【特許文献２】
特開２００１−２１６２９２号公報　（第４−５頁、第２図）
【０００６】
【発明が解決しようとする課題】
しかしながら、特許文献１に記載された発明による拡大表示装置における折り返し表示は、文書を文字コードで扱っているためにできることであり、文書画像には適用できない。また、文字コードで扱っていても、文書作成時に文字の配置が固定されている文書フォーマット（ＰＤＦ（ポータブルドキュメントフォーマット）や一部のＨＴＭＬ（ハイパーテキストマークアップラングエイジ）など）では折り返し表示を行うことができない。
【０００７】
また、特許文献２に記載された発明によるレイアウト画像編集装置及びレイアウト画像編集方法の第１の問題点は、文字ごとの画像を抽出する文字切り出し処理に依存していることである。文字切り出しは文字認識と同程度に難しい課題であるためこれを誤りなく行うことは困難であるにもかかわらず、解析誤りへの対応がなされていない。
【０００８】
第２の問題点は、ネットワークを介した文書配信の効率化を考慮していないことである。文書画像はデータ量が膨大であるために、インターネットなどの高速でないネットワークを介して配信するためには効率的な手法が必要となる。
【０００９】
第３の問題点は、文書画像以外の文書フォーマットに対応していないことである。そのため、画像ではない文書を折り返し表示したいという要求に応えられない。その一方、各文書フォーマットにはそれに対応した表示装置（または表示プログラム）がそれぞれ存在するが、それらで折り返し表示を実現するにはそれぞれの装置（またはプログラム）を改造しなくてはならない。
【００１０】
本発明は、文字切り出しを行わずに文書画像の折り返し表示を行うシステムおよびプログラムを提供することを目的とする。
【００１１】
また本発明は、文書画像をネットワークを介して効率的に配信できるシステムを提供することも目的とする。
【００１２】
さらに本発明は、異なるフォーマットの文書を統一したフォーマットに変換することで単一の表示装置（または表示プログラム）で折り返し表示が可能なシステムを提供することも目的とする。
【００１３】
【課題を解決するための手段】
請求項１記載の発明による文書表示システムは、文書画像中に配置されたひとかたまりの文字列である文章ブロックの位置と大きさとを出力する文章ブロック抽出手段と、該抽出された文章ブロック内の各行の位置と大きさとを抽出して、文章ブロック内の一列の文字を矩形状に囲んだ領域内の文字列である行矩形を出力する行矩形抽出手段と、ユーザが表示させたい文章ブロックを指示するユーザ指示手段と、該ユーザが指示した文章ブロック内の行の位置と大きさとから、表示領域に合わせて該文章ブロックを折り返した、折り返し画像を生成する折り返し画像生成手段とを備えたことを特徴とする。
【００１４】
請求項２記載の発明による文書表示システムは、行矩形抽出手段により抽出された行矩形を分割する際に、行矩形の区切り位置を抽出する区切り位置抽出手段を備え、折り返し画像生成手段は、該抽出された区切り位置で行矩形の分割を行うことを特徴とする。
【００１５】
請求項３記載の発明による文書表示システムは、文書画像全体を縮小した縮小画像を生成する縮小画像生成手段を備え、縮小画像と折り返し画像を同時に表示し、折り返し画像生成手段は、縮小画像生成手段により生成された縮小画像を避けた表示領域に対して折り返し画像を生成することを特徴とする。
【００１６】
請求項４記載の発明による文書表示システムは、ユーザの指示に応じてネットワークを介して文書画像、文章ブロックおよび行矩形の情報を送受信する、サーバ通信手段およびクライアント通信手段と、画像を一定の大きさの部分画像であるタイルに分割しつつ、各タイルを複数の解像度で保持する多重解像度タイル化画像記憶メモリとを備えたことを特徴とする。
【００１７】
請求項５記載の発明による文書表示システムは、特定の文書フォーマットを画像に変換する、フォーマット変換手段を対応する文書フォーマットの数だけ備えたことを特徴とする。
【００１８】
請求項６記載の発明による文書表示方法は、文書画像中に配置された文章ブロックの位置と大きさ、および文章ブロック内の各行の位置と大きさとを抽出し、ユーザが指示した文章ブロック内の行の位置と大きさとから、表示領域に合わせて該文章ブロックを折り返した、折り返し画像を生成することを特徴とする。
【００１９】
請求項７記載の発明による文書表示プログラムは、文書画像中に配置されたひとかたまりの文字列である文章ブロックの位置と大きさとを出力する文章ブロック抽出手段と、文章ブロック内の一列の文字を矩形状に囲んだ領域の文字列である行矩形を出力する行矩形抽出手段と、ユーザ指示手段によりユーザが選択した行の位置と大きさとから表示領域に合わせて文章ブロックを折り返した折り返し画像生成手段としてコンピュータを機能させることを特徴とする。
【００２０】
【発明の実施の形態】
次に、本発明の実施の形態について図面を参照して説明する。
図１は、本発明による文書表示システムの第１の実施の形態を示すブロック図である。図１に示すように、文書表示システムは、文書画像を蓄積する画像記憶メモリ１と、文書画像から文章ブロックを抽出する文章ブロック抽出手段２と、文章ブロック内の一列の文字を矩形で囲んだものである行矩形を抽出する行矩形抽出手段３と、文章ブロック内の行矩形から折り返し画像を生成する折り返し画像生成手段４と、文書画像全体を縮小した縮小画像を生成する縮小画像生成手段５と、マウスやキーボードなどによってユーザがシステムに指示を与えるユーザ指示手段６と、文書画像の縮小表示と折り返し表示のどちらかを選択する表示選択手段７と、画像の表示を行う画像表示装置８とから構成されている。
【００２１】
画像記憶メモリ１は、表示対象となる文書画像を文字を読むのに十分な解像度で保持する。本実施の形態では、画像記憶メモリ１には表示対象とする文書画像があらかじめ存在するとしているが、別に存在する文書画像データベースの中からユーザの指示によって文書画像が転送されるとしてもよい。
【００２２】
文章ブロック抽出手段２は、画像記憶メモリ１に格納された文書画像のレイアウトを画像理解により解析し、例えば１段落のように文字のまとまりとなる構成要素（文章ブロック）を抽出する。そして、抽出した文章ブロックのそれぞれの位置と大きさとを出力する。ただし、本実施の形態では、文章ブロックは、文章ブロック抽出手段２によって自動抽出されるが、ユーザが折り返し表示したい文章ブロックの抽出を行ってもよい。
【００２３】
行矩形抽出手段３は、各文章ブロックについて、その内部の文字列が縦書きか横書きかを判定したのち、行間の空白などを手がかりとして行を抽出する。そして、抽出した行のそれぞれの位置と大きさとを出力する。
【００２４】
折り返し画像生成手段４は、ユーザ指示手段６によって指示された文章ブロック内の各行を切り貼りした折り返し画像を生成する。文章ブロックが横書きであれば横幅を表示領域に合わせて上から下に伸びる縦長の画像を生成し、縦書きであれば縦幅を表示領域に合わせて右から左に伸びる横長の画像を生成する。
【００２５】
縮小画像生成手段５は、画像記憶メモリ１に格納された文書画像を縮小するだけでなく、その上に文章ブロック抽出手段２により抽出された文章ブロックの枠を重畳した画像を生成する。ただし、本実施の形態では、文章ブロックの部分を強調するために、文章ブロックの枠を重畳した画像を生成するが、文章ブロック内の色を変化させたりするなど他の表現を用いてもよい。
【００２６】
ユーザ指示手段６は、ユーザによるボタン押下などにより、縮小画像と折り返し画像のどちらを表示するかを表示選択手段７に伝える。また、縮小画像に重畳表示された文章ブロック枠をユーザが選択することにより折り返し表示したい文章ブロックを折り返し画像生成手段４に伝える。さらに、縮小表示の表示倍率の変更や上下左右のスクロール、折り返し画像の上下方向（縦書きの場合は左右方向）のスクロールを画像表示装置８に指示する。
【００２７】
表示選択手段７は、ユーザ指示手段６によるユーザの指示に従って、縮小画像または折り返し画像のどちらの画像を表示するかを選択する。画像表示装置８は、表示選択手段７により選択された画像の表示を行い、ユーザ指示手段６によるユーザの指示によって画像の表示倍率の変更やスクロールを行う。文章ブロック抽出手段２、行矩形抽出手段３、折り返し画像生成手段４、縮小画像生成手段５、表示選択手段７はパーソナルコンピュータ等のＣＰＵでの処理が可能であり、画像記憶メモリ１を含めると、この実施の形態は、パーソナルコンピュータ等での実施が可能である。
【００２８】
次に、図２、図３および図４のフローチャートを参照して、この実施の形態の動作について説明する。まず、文章ブロック抽出手段２が、画像記憶メモリ１に格納された文書画像のレイアウトを画像理解により解析し、文章ブロックを抽出する（ステップＡ１１）。このとき、文章ブロック抽出手段２は、例えば特開平４−４４１８５号公報記載の、文書画像を文字行のブロックや、図などのブロックに分割し、分割結果にもとづいて文書画像のレイアウトを解析する手法を用いることができる。さらに、行矩形抽出手段３が、各文章ブロックについて縦書きであるか横書きであるかを判定したのち、行間の空白などを手がかりとして行を抽出する（ステップＡ１２）。このとき行矩形抽出手段３は、例えば、文章ブロックに外接する矩形特徴や、画素の縦軸・横軸のヒストグラム特徴等を求めることにより、これらの特徴の分布にもとづいて、各文章ブロックが、縦書きであるか横書きであるかの判定を行うことができる。
【００２９】
次に、縮小画像生成手段５が、画像記憶メモリ１に格納された文書画像を縮小し、その上に文章ブロック枠を重畳した画像を生成し（ステップＡ２１）、画像表示装置８が生成された縮小画像を表示する（ステップＡ２２）。そして、システムはユーザ指示手段６からの指示を待つループに入る（ステップＡ２３、Ａ２５、Ａ２７）。スクロールバーの操作などによりスクロールを指示された場合（ステップＡ２３）、画像表示装置８は縮小画像をスクロールする（ステップＡ２４）。スクロール操作には表示領域の大きさ分だけ一度に移動するページ送りなどの方法を用いてもよい。また、拡大または縮小のボタン押下などにより拡大または縮小を指示された場合（ステップＡ２５）、縮小画像生成手段５により拡大または縮小された画像を再度生成する（ステップＡ２６）。重畳表示された文章ブロック枠を選択された場合、表示選択手段７により折り返し画像の表示に切り替える（ステップＡ２７）。
【００３０】
折り返し画像の表示に切り替えられた場合、折り返し画像の生成（ステップＡ３１）、表示（ステップＡ３２）、スクロール（ステップＡ３３とＡ３４）、拡大または縮小（ステップＡ３５とＡ３６）を縮小画像の場合と同様に行う。縮小画像との相違は、縮小画像のスクロールは上下左右の４方向に可能であるのに対して、折り返し画像はどのような拡大や縮小を行っても表示領域の大きさに合わせて生成されるので、折り返し画像のスクロールは上下または左右の２方向に対してのみ可能となる点である。そして、ボタン押下などにより縮小画像の表示に切り替える（ステップＡ３７）。
【００３１】
折り返し画像の生成（ステップＡ３１）について、図５のフローチャートを参照して、動作について説明する。ここでは横書きを対象として説明するが、対象が縦書きの場合でも同様に動作する。始めに、残り幅という変数を表示領域の幅に設定する（ステップＡ４１）。次に、最初の行矩形を一つ取り出し（ステップＡ４２）、行矩形の行幅が、残り幅を超えていないかどうかを比較する（ステップＡ４３）。行幅が残り幅よりも大きい場合、現在の行矩形を先頭から残り幅分と、それ以外の部分とに分割し、両方を行矩形とみなしてステップＡ４２の処理に戻る（ステップＡ４４）。行幅が残り幅と同じか小さい場合は、その行矩形を折り返し画像に貼り付ける（ステップＡ４５）。行矩形を貼り付ける際には、行間をあける、文字と背景のコントラストを高くする、などの文字を読みやすくするための調整を行ってもよい。そして、処理していない行矩形が残っていなければ終了する（ステップＡ４６）。そうでない場合は、残り幅を貼り付けた行矩形の幅分だけ減らす（ステップＡ４７）。残り幅が０より大きければステップＡ４２へ、０であれば残り幅を再設定するためにステップＡ４１へと戻る（ステップＡ４８）。
【００３２】
次に、図６に具体的な実施例を示し、この実施の形態の動作について説明する。図６（Ａ）に、対象とする文書画像の一例を示す。文章ブロック抽出手段２は、表示領域１０１に表示されている縮小画像１１１から、文章ブロック１１２と文章ブロック１１３の２つを抽出する（ステップＡ１１）。そして、行矩形抽出手段３はそれぞれの文章ブロック内の行を抽出する（ステップＡ１２）。次に、縮小画像生成手段５が縮小した文章画像の上に文章ブロックの枠を重畳した縮小画像１１１を生成し（ステップＡ２１）、画像表示装置８がそれを表示する（ステップＡ２２）。縮小画像の表示倍率の初期値は、表示領域に画像全体が収まるように縮小画像生成手段５が決定した。ユーザが文章ブロック１１２を選択すると折り返し画像の表示に切り替わる（ステップＡ２７）。そして、折り返し画像生成手段４が、表示領域の横幅に合わせて図６（Ｂ）に示す折り返し画像１１４を生成し（ステップＡ３１）、画像表示装置８がそれを表示する（ステップＡ３２）。折り返し画像１１４は、上下方向のスクロールだけで全体を表示することができる。
【００３３】
本実施の形態では、文字切り出しを利用せず、行抽出だけを用いて折り返し表示を行うために、文字切り出しの誤りによりユーザを混乱させることなく、どのような拡大や縮小に対しても表示領域に合わせた折り返し画像を生成できる。行抽出は文字切り出しに比べればはるかに誤りにくい処理である。このように生成された折り返し画像は、上下または左右のスクロールだけで読むことができるため、折り返さない場合に比べて操作量が少なく、快適に文章を読むことができる。
【００３４】
次に、本発明の第２の実施の形態について図面を参照して説明する。
図７は、本発明による文書表示システムの第２の実施の形態を示すブロック図である。図７を参照すると、本発明の第２の実施の形態は、行矩形を分割する際の区切り位置を抽出する区切り位置抽出手段１１と、文章ブロック内の行矩形と区切り位置から折り返し画像を生成する第２の折り返し画像生成手段１２と、縮小画像と折り返し画像を同時に表示する第２の画像表示装置１３とを有する点、および表示選択手段７を有しない点が第１の実施の形態と異なる。
【００３５】
区切り位置抽出手段１１は、第２の折り返し画像生成手段１２で行矩形を分割する際に、１文字の途中で分割を行わないように分割の区切り位置をあらかじめ抽出する。区切り位置抽出手段１１は、例えば図８に示すように、行矩形１２１のうち、縦方向に見て行矩形中に文字画素が全くない空白区間の中間を区切り位置１２２とする。その他、文字のピッチなどを利用する、より高度な文字切り出しの手法を利用してもよい。その場合、折り返し画像の生成では文字の途中で分割が起こらないことが重要であるので、可変ピッチなどで文字切り出しが困難な場合に、文字を過剰に分割してしまうよりは、確実性の高い区切り位置のみを抽出する方がよい。
【００３６】
第２の折り返し画像生成手段１２は、折り返し画像生成手段４の動作を示す図５において、ステップＡ４４にて行矩形を分割する際に、区切り位置抽出手段１１で抽出した区切り位置でのみ分割するという点が、折り返し画像生成手段４と異なる。さらに、図９の例に示すように、縮小画像と折り返し画像を同時に表示するために、ステップＡ４１にて残り幅に設定する表示領域の幅は一定ではない点が、折り返し画像生成手段４と異なる。それ以外の動作は折り返し画像生成手段４と同様である。
【００３７】
第２の画像表示装置１３は、縮小画像と折り返し画像を同時に表示するとともに、ユーザ指示手段６によるユーザの指示によって画像の表示倍率の変更やスクロールを行う。また、ユーザは縮小画像の移動を指示することができ、その場合は縮小画像を避けるように新たに折り返し画像を生成したのち両者を同時に表示する。
【００３８】
次に、図１０、図１１および図１２にこの実施の形態のフローチャート示し、図面を参照して動作について説明する。まず、文章ブロックの抽出と行矩形の抽出と区切り位置の抽出を行う（ステップＡ１１、Ａ１２、Ａ１０１）。次に、縮小画像の生成を行ってから、それを避けるように折り返し画像の生成を行い、図９の例に示すように両者を同時に表示する（ステップＡ２１、Ａ１０２、Ａ１０３）。そして、縮小画像に対する移動と拡大縮小、文章ブロック枠の選択、折り返し画像のスクロールと拡大縮小の指示を待つループに入る（ステップＡ１０４、Ａ２５、Ａ２７、Ａ３３、Ａ３５）。折り返し画像は縮小画像を避けて生成するので、折り返し画像のスクロールをした場合でも折り返し画像を再度生成する（ステップＡ３４）。
【００３９】
次に、本実施の形態の効果について説明する。第１の実施の形態では、折り返し画像の生成の際に、表示領域にあわせて折り返し位置を決定して折り返し表示をしていたのに対し、本実施の形態では、区切り位置の抽出とそれを利用した折り返し画像の生成を行うので、１文字単位で折り返し表示ができるようになる。区切り位置の抽出は、文字切り出しとは異なり、確実性の高いものだけを抽出すればよいので誤りが起こりにくい。また、本実施の形態では、縮小画像と折り返し画像を同時に表示するので、折り返し表示している文章ブロックの文章画像中での位置が把握しやすくなる。その際に、縮小画像を避けるように折り返し画像を生成するので、縮小画像の下の領域が見えないといった問題を回避しつつ表示領域全体を有効に利用できる。
【００４０】
次に、本発明の第３の実施の形態について図面を参照して説明する。図１３は、本発明による文書表示システムの第３の実施の形態を示すブロック図である。図１３を参照すると、本発明の第３の実施の形態は、ネットワークを介してサーバ部２０４とクライアント部２０５が通信するための制御を行うサーバ通信手段２０１とクライアント通信手段２０２と、文書を画像として保持し、画像をタイルとよぶ一定の大きさの部分画像に分割しつつ各タイルを複数の解像度で保持する多重解像度タイル化画像記憶メモリ２０３とを備える点が第１の実施例と異なる。また、文章ブロック抽出手段２と、行矩形抽出手段３とをサーバ部２０４が備える。
【００４１】
次に本実施の形態の動作について図１３を参照して説明する。サーバ部２０４において、多重解像度タイル化画像記憶メモリ２０３で画像として保持している文書から、文章ブロック抽出手段２は文章ブロックを抽出し、行矩形抽出手段３は行矩形を抽出する。抽出された文章ブロックの情報はサーバ通信手段２０１を介してクライアント部２０５へ送られる。また、多重解像度タイル化画像記憶メモリ２０３は、保持している文書画像全体を一定の大きさの部分画像に分割してタイル化し、サーバ通信手段２０１を介してクライアント部２０５へ送信する。ここで、文章ブロックを含むタイルの解像度は、文章ブロックを含まないタイルの解像度に比べて高くなるようにタイル化を行う。クライアント部２０５のクライアント通信手段２０２で受信した文章ブロックと文書画像の情報は、縮小画像生成手段５に送られる。縮小画像生成手段５は、文書画像を縮小し、縮小した文書画像の上に文章ブロックの枠を重畳した画像を生成し、表示選択手段７を介して画像表示装置８へ出力する。
【００４２】
ここで、ユーザによるボタン押下などにより、ユーザ指示手段６が、文章ブロックを選択し、折り返し画像表示の指示を行うと、クライアント通信手段２０２を介してサーバ部２０４へ、文章ブロックおよび行矩形の情報の要求を行う。サーバ通信手段２０１は、クライアント通信手段２０２の要求に応じて、情報を送信する。クライアント通信手段２０２は、サーバ通信手段２０１から送られた情報を、折り返し画像生成手段４へ出力する。折り返し画像生成手段は、送られた情報から折り返し画像を生成し表示選択手段７へ出力する。表示選択手段７は、折り返し画像の表示を選択し、生成した折り返し画像を画像表示装置８へ出力する。画像表示装置８は折り返し画像の表示を行う。
【００４３】
その後、ユーザがボタン押下などにより、文章ブロックの選択、または画像のスクロールや拡大縮小の指示を、ユーザ指示手段６を介して行うと、クライアント通信手段２０２は、文書ブロックおよび行矩形の情報の要求、または解像度とタイル番号による画像取得の要求をサーバ部２０４へ発する。サーバ通信手段２０１はクライアント通信部２０５の要求に応えて、必要な情報を送信する。このとき、表示倍率からは適切な解像度を、表示領域の位置と大きさからは必要なタイルをそれぞれ決定し、通信データ量をなるべく小さくするよう制御する。そして必要な情報を得たクライアント部２０５は、ユーザの指示通りの処理を行う。このときのクライアント部２０５の動作は、第１の実施の形態と同様である。
【００４４】
本発明における縮小画像は、レイアウトの把握を主目的としているので、文書全体にわたる広範囲な画像が必要である。一方、折り返し画像は文字を読むのが目的なため、縮小画像よりも高い解像度の画像が必要であるが、その範囲は文章ブロック内だけで済む。例えば、文章ブロックの領域が全画像領域の半分である文書画像を考える。ここで、この画像の文字を読むには例えば１５０ｄｐｉの解像度が必要で、文書全体を把握するための縮小画像は例えば７５ｄｐｉで十分であるとする。さらに、１５０ｄｐｉの画像全体を、何らかの手法で圧縮したデータ量が４００ｋＢであり、７５ｄｐｉの画像全体の圧縮後のデータ量は１００ｋＢであったとする。従来は、文書の通信のためには、文字が可読な解像度である１５０ｄｐｉの画像を全て送らねばならず、４００ｋＢの通信量が発生したが、本発明の第３の実施の形態によれば、縮小画像＋必要な文書ブロック部分＝１００ｋＢ＋４００ｋＢ／２＝３００ｋＢの通信量で済む。また、画像の圧縮方式として、例えばＪＰＥＧ２０００のような、解像度に対して階層的な圧縮方式を用いるならば、縮小画像である７５ｄｐｉの圧縮データを１５０ｄｐｉの圧縮データの一部として利用できるので、通信量をさらに削減することができる。以上説明したように、本実施の形態の効果は、文章ブロックだけを高解像で取得するように、解像度とタイルに応じて通信を制御して、ネットワークを介した文書画像を配信する際のデータ量を削減することができることである。
【００４５】
次に、本発明の第４の実施の形態について図面を参照して説明する。図１４は、本発明による文書表示システムの第４の実施の形態を示すブロック図である。図１４を参照すると、本発明の第４の実施の形態は、クライアント部２０５との通信だけでなく、標準的な文書フォーマットであるＨＴＭＬやＰＤＦの文書を、インターネットなどを介して取得することができる第２のサーバ通信手段３０１と、取得したＨＴＭＬ文書を画像に変換して多重解像度タイル化画像記憶メモリ２０３に送るＨＴＭＬ画像化手段３０２と、取得したＰＤＦ文書を画像に変換して多重解像度タイル化画像記憶メモリ２０３に送るＰＤＦ画像化手段３０３とを備えたことが、第３の実施の形態と異なる。
【００４６】
次に本実施の形態の動作について図１４を参照して説明する。第２のサーバ通信手段３０１は、クライアント部からのユーザ指示などによって指定された、ＨＴＭＬやＰＤＦなどの文書を、インターネットなどを介して取得し、それをそれぞれのフォーマットに応じた画像化手段へと送る。ＨＴＭＬ画像化手段３０２は、第２のサーバ通信手段３０１によって取得されたＨＴＭＬ文書を画像に変換する。その手法は、例えば、ＷｅｂブラウザなどのＨＴＭＬ文書の表示ソフトウエアの印刷機能を利用して擬似的なプリンタとして動作することにより取得する方法がある。このような方法を取れば、文書フォーマットそのものを直接扱うことなく容易に画像への変換が達成できる。ＰＤＦ画像化手段３０３は、画像フォーマットの違いを除けばＨＴＭＬ画像化手段３０２と同様である。多重解像度タイル化画像記憶メモリ２０３が画像を取得すれば、その他の動作は、第３の実施の形態と同様である。
【００４７】
本実施の形態の効果は、文書を画像に変換する手段を文書フォーマットの分だけ用意するだけで、クライアント部の変更はなしに様々な文書フォーマットに対応した折り返し表示ができることである。つまり、本実施の形態では、ＨＴＭＬやＰＤＦを文書の例として説明しているが、その他のフォーマットについてもそれを画像に変換する手段を用意するだけで同様に動作する。従って、本実施の形態が対象とする文書は、表示装置上で画像として表示可能なものであれば、あるいは、画面に表示できない場合であっても文書を画像に変換することが可能なものであれば、そのいずれの文書に対しても実施できる。例えば、画面上に表示された文書を画面キャプチャしたり、印刷出力を画像へ変換するなどといった手法によって、文書を画像へ変換できる。
【００４８】
【発明の効果】
以上のように、請求項１記載の発明によれば、文書画像中に配置されたひとかたまりの文字列である文章ブロックの位置と大きさとを出力する文章ブロック抽出手段と、該抽出された文章ブロック内の各行の位置と大きさとを抽出して、文章ブロック内の一列の文字を矩形状に囲んだ領域内の文字列である行矩形を出力する行矩形抽出手段と、ユーザが表示させたい文章ブロックを指示するユーザ指示手段と、該ユーザが指示した文章ブロック内の行の位置と大きさとから、表示領域に合わせて該文章ブロックを折り返した、折り返し画像を生成する折り返し画像生成手段とを備えたため、文字切り出しを利用せず、行抽出だけを用いて折り返し表示を行うために、文字切り出しの誤りによりユーザを混乱させることなく、どのような拡大や縮小に対しても表示領域に合わせた折り返し画像を生成できる効果がある。
【００４９】
請求項２記載の発明によれば、行矩形抽出手段により抽出された行矩形を分割する際に、行矩形の区切り位置を抽出する区切り位置抽出手段を備え、折り返し画像生成手段は、該抽出された区切り位置で行矩形の分割を行う構成にしたので、１文字の途中で折り返すことが少なくなり、読みやすい折り返し表示ができる効果がある。
【００５０】
請求項３記載の発明によれば、文書画像全体を縮小した縮小画像を生成する縮小画像生成手段を備え、縮小画像と折り返し画像とを同時に表示し、折り返し画像生成手段は、縮小画像生成手段により生成された縮小画像を避けた表示領域に対して折り返し画像を生成する構成にしたので、折り返し表示している文章ブロックの文章画像中での位置が把握しやすくなる効果がある。また、その際に、縮小画像を避けるように折り返し画像を生成するので、縮小画像の下の領域が見えないといった問題を回避しつつ、表示領域全体を有効に利用できるという効果がある。
【００５１】
請求項４記載の発明によれば、ユーザの指示に応じてネットワークを介して文書画像、文章ブロックおよび行矩形の情報を送受信する、サーバ通信手段およびクライアント通信手段と、画像を一定の大きさの部分画像であるタイルに分割しつつ、各タイルを複数の解像度で保持する多重解像度タイル化画像記憶メモリとを備えた構成にしたので、解像度とタイルに応じて通信を制御しつつ文章ブロックだけを高解像で取得することにより、ネットワークを介した文書画像の配信の際のデータ量を削減できる効果がある。
【００５２】
請求項５記載の発明によれば、特定の文書フォーマットを画像に変換する、フォーマット変換手段を対応する文書フォーマットの数だけ備えた構成にしたので、クライアント部の変更はせずに、様々な文書フォーマットに対応した折り返し表示ができる効果がある。
【００５３】
請求項６および請求項７記載の発明によれば、文書画像中に配置された文章ブロックの位置と大きさ、および、文章ブロック内の各行の位置と大きさとを抽出し、ユーザが指示した文章ブロック内の行の位置と大きさとから、表示領域に合わせて該文章ブロックを折り返した、折り返し画像を生成する構成にしたので、このようにして生成された折り返し画像は、上下または左右のスクロールだけで読むことができるため、折り返さない場合に比べて操作量が少なく、快適に文章を読むことができる効果がある。
【図面の簡単な説明】
【図１】本発明の第１の実施の形態の構成を示すブロック図である。
【図２】第１の実施の形態の動作を示す流れ図その１である。
【図３】第１の実施の形態の動作を示す流れ図その２である。
【図４】第１の実施の形態の動作を示す流れ図その３である。
【図５】折り返し画像生成手段４の動作を示す流れ図である。
【図６】縮小画像と折り返し画像の例である。
【図７】本発明の第２の実施の形態の構成を示すブロック図である。
【図８】区切り位置の例である。
【図９】縮小画像と折り返し画像を重畳表示した例である。
【図１０】第２の実施の形態の動作を示す流れ図その１である。
【図１１】第２の実施の形態の動作を示す流れ図その２である。
【図１２】第２の実施の形態の動作を示す流れ図その３である。
【図１３】本発明の第３の実施の形態の構成を示すブロック図である。
【図１４】本発明の第４の実施の形態の構成を示すブロック図である。
【図１５】文書画像と表示領域の関係の例を示す図である。
【図１６】文書の折り返し表示の例を示す図である。
【符号の説明】
１　画像記憶メモリ
２　文章ブロック抽出手段
３　行矩形抽出手段
４　折り返し画像生成手段
５　縮小画像生成手段
６　ユーザ指示手段
７　表示選択手段
８　画像表示装置
１１　区切り位置抽出手段
１２　第２の折り返し画像生成手段
１３　第２の画像表示装置
１０１　表示領域
１０２　文書画像
１０３　折り返したテキスト
１１１　縮小画像
１１２、１１３　文章ブロック
１１４　折り返し画像
１２１　行矩形
１２２　区切り位置
２０１　サーバ通信手段
２０２　クライアント通信手段
２０３　多重解像度タイル化画像記憶メモリ
２０４　サーバ部
２０５　クライアント部
３０１　第２のサーバ通信手段
３０２　ＨＴＭＬ画像化手段
３０３　ＰＤＦ画像化手段

[0001]
[Technical field of the invention]
The present invention relates to a document display system, a document display method, and a document display program, and more particularly, to a document display system, a document display method, and a document display program for rearranging and displaying a character string according to the size of a display area.
[0002]
[Prior art]
In a conventional system for displaying a document image (a document digitized as a set of pixels), scrolling up, down, left, and right is used to display a document image that does not fit in the display area. FIG. 15 shows an example of the relationship between a document image and a display area. When the display area 101 is small relative to the document image 102 to be displayed, as shown in FIG. 15, the user must grasp the contents while repeatedly scrolling both horizontally and vertically, which is very inconvenient. For example, when documents stored as digital images in an electronic library or the like are browsed on a personal computer, the burden on the user becomes very large if the amount of material to be browsed is large.
[0003]
On the other hand, an editor or viewer that handles a document as character codes rather than as an image folds lines to fit the display area 101, as shown in FIG. 16, so that the contents can be grasped by scrolling in only one direction (vertically, in the case of horizontal writing) (see, for example, Patent Document 1). Here, folded display means that, when the display area is small relative to the text block to be displayed, a line of the text block is broken partway and continued on a new line. In the example of FIG. 15, the first line of the text block of the document image 102 consists of 「あいうえお、かきくけこ、さし」, but the display area 101 is too small to show the whole line on one screen. Therefore, as shown in FIG. 16, the first displayed line is 「あいうえお、か」, fitted to the size of the display area 101, and 「きくけこ、さし」 is folded onto the next line. In other words, even when the size of the display area changes because of a difference in the display device or in the manner of use (for example, browsing two documents at the same time), the character strings in the document are folded accordingly, giving a display suited to that display area.
[0004]
Separately, there is also an apparatus that performs folded display using the result of layout analysis of a document image (see, for example, Patent Document 2). Layout analysis yields the type, position, and size of each element constituting the document image, such as text blocks, lines, and characters; by using this information to rearrange the image character by character, folded display can be realized in the same manner as in the editor or viewer described above.
[0005]
[Patent Document 1]
JP-A-7-261736 (pages 3-5, FIG. 6)
[Patent Document 2]
JP-A-2001-216292 (pages 4-5, FIG. 2)
[0006]
[Problems to be solved by the invention]
However, the folded display in the enlarged display device according to the invention described in Patent Document 1 is possible only because the document is handled as character codes, and it cannot be applied to a document image. Moreover, even when character codes are used, folded display cannot be performed for document formats in which the character arrangement is fixed when the document is created (such as PDF (Portable Document Format) and some HTML (Hypertext Markup Language)).
[0007]
The first problem of the layout image editing apparatus and layout image editing method according to the invention described in Patent Document 2 is that they depend on character segmentation, a process that extracts an image for each character. Character segmentation is as difficult a task as character recognition, so performing it without error is difficult; nevertheless, no measure is taken against analysis errors.
[0008]
A second problem is that it does not consider the efficiency of document distribution via a network. Since a document image has a huge amount of data, an efficient method is required to distribute the document image via a low-speed network such as the Internet.
[0009]
The third problem is that it does not support document formats other than document images. Therefore, it is impossible to respond to a request to display a document that is not an image in a folded manner. On the other hand, each document format has a display device (or display program) corresponding to it, but in order to realize folded display with them, each device (or program) must be modified.
[0010]
An object of the present invention is to provide a system and a program that perform folded display of a document image without performing character segmentation.
[0011]
Another object of the present invention is to provide a system capable of efficiently distributing a document image via a network.
[0012]
Still another object of the present invention is to provide a system in which documents in different formats are converted into a unified format, so that folded display is possible with a single display device (or display program).
[0013]
[Means for Solving the Problems]
The document display system according to the invention of claim 1 comprises: text block extracting means for outputting the position and size of a text block, which is a group of character strings arranged in a document image; line rectangle extracting means for extracting the position and size of each line in the extracted text block and outputting a line rectangle, which is the character string within a region enclosing one line of characters in the text block in a rectangle; user instruction means by which the user designates the text block to be displayed; and folded image generating means for generating a folded image, in which the text block is folded to fit the display area, from the positions and sizes of the lines in the text block designated by the user.
[0014]
The document display system according to the invention of claim 2 comprises break position extracting means for extracting break positions of a line rectangle, for use when the line rectangle extracted by the line rectangle extracting means is divided, and the folded image generating means divides the line rectangle at the extracted break positions.
[0015]
The document display system according to the invention of claim 3 comprises reduced image generating means for generating a reduced image of the entire document image, displays the reduced image and the folded image simultaneously, and the folded image generating means generates the folded image for the part of the display area that avoids the reduced image generated by the reduced image generating means.
[0016]
The document display system according to the invention of claim 4 comprises server communication means and client communication means for transmitting and receiving a document image and information on text blocks and line rectangles via a network in response to a user's instruction, and a multi-resolution tiled image storage memory that divides an image into tiles, which are partial images of a fixed size, and holds each tile at a plurality of resolutions.
[0017]
The document display system according to the invention of claim 5 comprises format conversion means, each of which converts a specific document format into an image, in a number equal to the number of supported document formats.
[0018]
The document display method according to the invention of claim 6 extracts the position and size of each text block arranged in a document image and the position and size of each line in the text block, and generates a folded image, in which the text block is folded to fit the display area, from the positions and sizes of the lines in the text block designated by the user.
[0019]
The document display program according to the invention of claim 7 causes a computer to function as: text block extracting means for outputting the position and size of a text block, which is a group of character strings arranged in a document image; line rectangle extracting means for outputting a line rectangle, which is the character string within a region enclosing one line of characters in the text block in a rectangle; and folded image generating means for folding the text block to fit the display area from the positions and sizes of the lines selected by the user through user instruction means.
[0020]
[Embodiments of the invention]
Next, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing a first embodiment of the document display system according to the present invention. As shown in FIG. 1, the document display system comprises an image storage memory 1 for storing document images; text block extracting means 2 for extracting text blocks from a document image; line rectangle extracting means 3 for extracting line rectangles, each of which encloses one line of characters in a text block in a rectangle; folded image generating means 4 for generating a folded image from the line rectangles in a text block; reduced image generating means 5 for generating a reduced image of the entire document image; user instruction means 6 by which the user gives instructions to the system with a mouse, keyboard, or the like; display selection means 7 for selecting either the reduced display or the folded display of the document image; and an image display device 8 for displaying images.
[0021]
The image storage memory 1 holds a document image to be displayed at a resolution sufficient to read characters. In the present embodiment, it is assumed that the document image to be displayed exists in the image storage memory 1 in advance, but the document image may be transferred according to a user's instruction from a separately existing document image database.
[0022]
The text block extracting means 2 analyzes the layout of the document image stored in the image storage memory 1 by image understanding and extracts constituent elements that form a cohesive group of characters, such as a single paragraph (text blocks). It then outputs the position and size of each extracted text block. In the present embodiment the text blocks are extracted automatically by the text block extracting means 2, but the user may instead extract the text block to be displayed in folded form.
[0023]
The line rectangle extracting means 3 determines, for each text block, whether the character strings inside it are written vertically or horizontally, and then extracts the lines using cues such as the blank space between lines. It then outputs the position and size of each extracted line.
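As an illustration only, line extraction that uses the blank space between lines as a cue might be sketched as follows for a horizontally written text block. The binary-image representation (1 for a character pixel, 0 for background), the function name `extract_line_rects`, and the `(x, y, width, height)` rectangle format are assumptions made for this sketch and do not appear in the specification.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical format


def extract_line_rects(block: List[List[int]]) -> List[Rect]:
    """Return one rectangle per text line of a horizontally written block.

    `block` is a binary image given as rows of 0/1 values, where 1 marks a
    character pixel. A run of rows containing character pixels is treated as
    one line; fully blank rows are treated as the space between lines.
    """
    if not block:
        return []
    width = len(block[0])
    rects: List[Rect] = []
    start = None  # top row of the line currently being scanned, if any
    # A blank sentinel row closes a line that touches the bottom edge.
    for y, row in enumerate(block + [[0] * width]):
        has_ink = any(row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            # Trim the rectangle horizontally to the extent of the ink.
            cols = [x for r in block[start:y] for x, v in enumerate(r) if v]
            left, right = min(cols), max(cols)
            rects.append((left, start, right - left + 1, y - start))
            start = None
    return rects
```

For vertical writing the same procedure could be applied to the transposed image, with blank columns taking the role of blank rows.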
[0024]
The folded image generating means 4 generates a folded image by cutting and pasting the lines in the text block designated through the user instruction means 6. If the text block is written horizontally, it generates a vertically long image whose width is fitted to the display area and which extends from top to bottom; if the text block is written vertically, it generates a horizontally long image whose height is fitted to the display area and which extends from right to left.
[0025]
The reduced image generating means 5 not only reduces the document image stored in the image storage memory 1 but also generates an image in which the frames of the text blocks extracted by the text block extracting means 2 are superimposed on it. In the present embodiment the frames are superimposed in order to emphasize the text blocks, but other representations, such as changing the color inside a text block, may be used.
[0026]
The user instruction means 6 informs the display selection means 7, in response to a button press or the like by the user, which of the reduced image and the folded image is to be displayed. When the user selects a text block frame superimposed on the reduced image, it informs the folded image generating means 4 of the text block to be displayed in folded form. It also instructs the image display device 8 to change the display magnification of the reduced display, to scroll it up, down, left, and right, and to scroll the folded image vertically (horizontally in the case of vertical writing).
[0027]
The display selection means 7 selects which of the reduced image and the folded image is to be displayed, in accordance with the user's instruction given through the user instruction means 6. The image display device 8 displays the image selected by the display selection means 7, and changes the display magnification of the image or scrolls it in response to the user's instruction given through the user instruction means 6. The text block extracting means 2, the line rectangle extracting means 3, the folded image generating means 4, the reduced image generating means 5, and the display selection means 7 can be implemented as processing on the CPU of a personal computer or the like, and, together with the image storage memory 1, this embodiment can be implemented on a personal computer or the like.
[0028]
Next, the operation of this embodiment will be described with reference to the flowcharts of FIGS. 2, 3, and 4. First, the text block extracting means 2 analyzes the layout of the document image stored in the image storage memory 1 by image understanding, and extracts text blocks (step A11). At this time, the text block extracting means 2 can use, for example, the technique described in JP-A-4-44185, which divides a document image into blocks of character lines and blocks such as figures and analyzes the layout of the document image based on the division result. Further, the line rectangle extracting means 3 determines whether each text block is written vertically or horizontally, and then extracts the lines using cues such as the blank space between lines (step A12). At this time, the line rectangle extracting means 3 can determine whether each text block is written vertically or horizontally by obtaining, for example, features of the rectangle circumscribing the text block and histogram features of the pixels along the vertical and horizontal axes, and examining the distribution of these features.
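The specification leaves the exact features and decision rule open. Purely as a hypothetical example of a histogram-based decision, the following sketch compares the fraction of blank rows with the fraction of blank columns in the pixel histograms of a binary text block image: blank rows between lines suggest horizontal writing, and blank columns suggest vertical writing. The function name `guess_orientation`, the 0/1 image representation, and the tie-breaking rule are assumptions of this sketch, not part of the described method.

```python
from typing import List


def guess_orientation(block: List[List[int]]) -> str:
    """Guess "horizontal" or "vertical" writing for a binary text block.

    `block` holds rows of 0/1 values, where 1 marks a character pixel.
    The heuristic (an assumption of this sketch) compares how much of each
    axis histogram is empty: line gaps appear as empty rows in horizontal
    writing and as empty columns in vertical writing.
    """
    if not block or not block[0]:
        raise ValueError("block must be a non-empty image")
    row_hist = [sum(row) for row in block]        # ink count per row
    col_hist = [sum(col) for col in zip(*block)]  # ink count per column
    blank_row_ratio = row_hist.count(0) / len(row_hist)
    blank_col_ratio = col_hist.count(0) / len(col_hist)
    # Ties are resolved in favor of horizontal writing (arbitrary choice).
    return "horizontal" if blank_row_ratio >= blank_col_ratio else "vertical"
```

A practical system would likely combine several such features, as the paragraph above suggests, rather than rely on a single ratio.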
[0029]
Next, the reduced image generating means 5 reduces the document image stored in the image storage memory 1 and generates an image in which the text block frames are superimposed on it (step A21), and the image display device 8 displays the generated reduced image (step A22). The system then enters a loop waiting for an instruction from the user instruction means 6 (steps A23, A25, A27). When scrolling is instructed by operating a scroll bar or the like (step A23), the image display device 8 scrolls the reduced image (step A24). For the scrolling operation, a method such as paging, which moves by the size of the display area at a time, may be used. When enlargement or reduction is instructed by pressing an enlargement or reduction button or the like (step A25), the reduced image generating means 5 generates the enlarged or reduced image again (step A26). When a superimposed text block frame is selected, the display selection means 7 switches to the display of the folded image (step A27).
[0030]
When the display is switched to the folded image, generation (step A31), display (step A32), scrolling (steps A33 and A34), and enlargement or reduction (steps A35 and A36) of the folded image are performed in the same way as for the reduced image. The difference from the reduced image is that the reduced image can be scrolled in four directions (up, down, left, and right), whereas the folded image is always generated to fit the size of the display area, whatever enlargement or reduction is applied, so it can be scrolled in only two directions, either up and down or left and right. The display is then switched back to the reduced image by pressing a button or the like (step A37).
[0031]
The generation of a folded image (step A31) will be described with reference to the flowchart of FIG. 5. Horizontal writing is assumed here, but the operation is the same for vertical writing. First, a variable called the remaining width is set to the width of the display area (step A41). Next, the first unprocessed line rectangle is taken out (step A42), and its line width is compared with the remaining width (step A43). If the line width is larger than the remaining width, the current line rectangle is divided into a leading part as wide as the remaining width and the rest, both parts are regarded as line rectangles, and the process returns to step A42 (step A44). If the line width is equal to or smaller than the remaining width, the line rectangle is pasted onto the folded image (step A45). When a line rectangle is pasted, adjustments that make the characters easier to read, such as adding space between lines or increasing the contrast between characters and background, may be made. If no unprocessed line rectangle remains, the process ends (step A46). Otherwise, the remaining width is reduced by the width of the pasted line rectangle (step A47). If the remaining width is larger than 0, the process returns to step A42; if it is 0, the process returns to step A41 to reset the remaining width (step A48).
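The flow of steps A41 to A48 can be sketched in code as follows. This is a minimal sketch for horizontal writing, not the actual implementation: the function name `fold_lines`, the `(x, y, width, height)` rectangle format, and the representation of the folded image as a list of output rows (each a list of source rectangles to be pasted side by side) are assumptions introduced for illustration. Pixel copying and the readability adjustments of step A45 are omitted.

```python
from collections import deque
from typing import Deque, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in the source image


def fold_lines(line_rects: List[Rect], display_width: int) -> List[List[Rect]]:
    """Arrange line rectangles into output rows no wider than display_width."""
    if display_width <= 0:
        raise ValueError("display_width must be positive")
    pending: Deque[Rect] = deque(line_rects)
    rows: List[List[Rect]] = []
    current: List[Rect] = []          # pieces pasted on the current output row
    remaining = display_width         # step A41: set the remaining width
    while pending:                    # step A46: stop when nothing is left
        x, y, w, h = pending.popleft()            # step A42: take a rectangle
        if w > remaining:                         # step A43: too wide?
            # Step A44: split into a part of width `remaining` and the rest,
            # and treat both as line rectangles again.
            pending.appendleft((x + remaining, y, w - remaining, h))
            pending.appendleft((x, y, remaining, h))
            continue
        current.append((x, y, w, h))              # step A45: paste
        remaining -= w                            # step A47: reduce the width
        if remaining == 0:                        # step A48: row is full
            rows.append(current)
            current = []
            remaining = display_width             # back to step A41
    if current:
        rows.append(current)                      # last, partially filled row
    return rows
```

For example, folding a 10-pixel-wide line followed by a 4-pixel-wide line into a 6-pixel-wide display area yields three output rows, the second of which joins the tail of the first line with the head of the second line.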
[0032]
Next, the operation of this embodiment will be described using the specific example shown in FIG. 6. FIG. 6A shows an example of a target document image. The text block extraction unit 2 extracts the two text blocks 112 and 113 that appear in the reduced image 111 displayed in the display area 101 (step A11). Then, the line rectangle extracting means 3 extracts the lines in each text block (step A12). Next, the reduced image generating means 5 generates a reduced image 111 in which the text block frames are superimposed on the reduced document image (step A21), and the image display device 8 displays it (step A22). The initial display magnification of the reduced image is determined by the reduced image generation unit 5 so that the entire image fits in the display area. When the user selects the text block 112, the display is switched to the folded image (step A27). Then, the folded image generating means 4 generates the folded image 114 shown in FIG. 6B according to the width of the display area (step A31), and the image display device 8 displays it (step A32). The entire folded image 114 can be viewed by scrolling up and down only.
[0033]
In the present embodiment, folded display is performed using only line extraction, without character segmentation, so a folded image that fits the display area can be generated at any enlargement or reduction without confusing the user through character segmentation errors. Line extraction is far less error-prone than character segmentation. Since the folded image generated in this manner can be read by scrolling only up and down or only left and right, less operation is required than when the text is not folded, and the text can be read comfortably.
[0034]
Next, a second embodiment of the present invention will be described with reference to the drawings.
FIG. 7 is a block diagram showing a second embodiment of the document display system according to the present invention. Referring to FIG. 7, the second embodiment differs from the first embodiment in that it has a delimiter position extracting unit 11 that extracts break positions at which a line rectangle may be divided, a second folded image generating unit 12 that generates a folded image from the line rectangles and break positions in the text block, and a second image display device 13 that displays the reduced image and the folded image simultaneously, and in that it does not have the display selecting unit 7.
[0035]
So that the second folded image generation unit 12 does not divide a line rectangle in the middle of a character, the delimiter position extraction unit 11 extracts the break positions in advance. For example, as shown in FIG. 8, the delimiter position extracting unit 11 sets as a break position 122 the middle of each blank section of the line rectangle 121, that is, each section that contains no character pixels when the line rectangle is viewed in the vertical direction. A more advanced character segmentation method using the character pitch or the like may also be used. Even then, what matters in generating the folded image is that no division falls in the middle of a character. Therefore, when character segmentation is difficult because of variable pitch or the like, it is better to extract only the reliable break positions than to divide characters excessively.
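The blank-section approach of FIG. 8 can be sketched as a vertical projection over the pixels of one line rectangle. The function name `break_positions` and the input format (rows of 0/1 values, 1 meaning a character pixel) are assumptions made for illustration. The sketch also assumes that only blank sections lying between two inked columns yield break positions, so leading and trailing margins are ignored.

```python
def break_positions(line_pixels):
    """Return break positions for one line rectangle.

    line_pixels is a list of rows, each a list of 0/1 values where 1 marks
    a character pixel (assumed binarized input). A break position is the
    column index at the middle of a run of columns with no character pixel.
    """
    if not line_pixels:
        return []
    width = len(line_pixels[0])
    # Vertical projection: does column x contain any character pixel?
    has_ink = [any(row[x] for row in line_pixels) for x in range(width)]

    positions = []
    gap_start = None   # first column of the blank run being scanned
    seen_ink = False   # becomes True once the first character column is found
    for x, ink in enumerate(has_ink):
        if ink:
            if seen_ink and gap_start is not None:
                # The blank run covers columns gap_start .. x-1.
                positions.append((gap_start + x) // 2)
            gap_start = None
            seen_ink = True
        elif gap_start is None:
            gap_start = x
    return positions
```

Splitting the rectangle into columns `[0, p)` and `[p, width)` at a returned position `p` never cuts through an inked column, since both sides of the cut fall inside the blank run or at its edge.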
[0036]
The second folded image generating means 12 differs from the folded image generating means 4, whose operation is shown in FIG. 5, in that it divides the line rectangle in step A44 only at the break positions extracted by the delimiter position extracting means 11. It also differs in that, because the reduced image and the folded image are displayed simultaneously as in the example of FIG. 9, the width of the display area assigned to the remaining width in step A41 is not constant. Other operations are the same as those of the folded image generation unit 4.
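The two differences described above, splitting only at break positions and a per-row display width, can be sketched as follows. Everything here is a hypothetical illustration: `fold_at_breaks` and `width_for_row` are invented names, each line is given as a `(width, breaks)` pair with break offsets measured from the start of the line, and the behaviour when no break position fits (move to the next row, or hard-split on an empty row) is an assumption that the text does not specify.

```python
def fold_at_breaks(lines, width_for_row):
    """Fold lines, splitting only at break positions.

    lines: list of (width, breaks) pairs; breaks are offsets from line start.
    width_for_row: callable giving the usable width of each output row,
        which may vary where the reduced image occupies part of the display.
    Returns (line_index, start, end, row, dst_x) tuples.
    """
    def usable_width(row):
        width = width_for_row(row)
        if width <= 0:
            raise ValueError("each output row needs a positive width")
        return width

    placements = []
    row = 0
    row_width = remaining = usable_width(row)          # step A41
    for index, (width, breaks) in enumerate(lines):
        start = 0
        while start < width:
            if width - start <= remaining:
                end = width                            # the rest fits as is
            else:
                limit = start + remaining
                fitting = [b for b in breaks if start < b <= limit]
                if fitting:
                    end = max(fitting)                 # step A44, at a break
                elif remaining < row_width:
                    # Assumption: nothing fits in the rest of this row,
                    # so continue on a fresh row.
                    row += 1
                    row_width = remaining = usable_width(row)
                    continue
                else:
                    # Assumption: hard split if no break fits an empty row.
                    end = limit
            placements.append((index, start, end, row, row_width - remaining))
            remaining -= end - start                   # step A47
            start = end
            if remaining == 0:                         # step A48
                row += 1
                row_width = remaining = usable_width(row)
    return placements
```

For example, `width_for_row` could return a narrower width for the rows that sit beside the reduced image and the full display width for the rows below it.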
[0037]
The second image display device 13 simultaneously displays the reduced image and the folded image, and changes the display magnification of the image or scrolls the image according to an instruction from the user using the user instruction unit 6. In addition, the user can instruct the movement of the reduced image. In this case, a new folded image is generated so as to avoid the reduced image, and both are displayed at the same time.
[0038]
Next, the operation of this embodiment will be described with reference to the flowcharts in FIGS. 10, 11 and 12. First, text blocks, line rectangles, and break positions are extracted (steps A11, A12, A101). Next, a reduced image is generated, a folded image is generated so as to avoid it, and both are displayed simultaneously as in the example of FIG. 9 (steps A21, A102, A103). The process then enters a loop waiting for an instruction to move or scale the reduced image, select a text block frame, or scroll or scale the folded image (steps A104, A25, A27, A33, and A35). Since the folded image is generated so as to avoid the reduced image, it is regenerated even when it is merely scrolled (step A34).
[0039]
Next, effects of the present embodiment will be described. In the first embodiment, the folding position is determined solely by the display area when a folded image is generated. In the present embodiment, break positions are extracted and the folded image is generated using them, so folding can be performed in units of whole characters. Unlike character segmentation, break position extraction only needs to extract positions that have a high degree of certainty. Further, since the reduced image and the folded image are displayed at the same time, the position within the document image of the text block being displayed in folded form can be easily grasped. Because the folded image is generated so as to avoid the reduced image, the entire display area can be used effectively while avoiding the problem that the area beneath the reduced image cannot be seen.
[0040]
Next, a third embodiment of the present invention will be described with reference to the drawings. FIG. 13 is a block diagram showing a third embodiment of the document display system according to the present invention. Referring to FIG. 13, the third embodiment of the present invention includes a server communication unit 201 and a client communication unit 202 that control communication between a server unit 204 and a client unit 205 via a network, and a multi-resolution tiled image storage memory 203 that divides the image into fixed-size partial images called tiles and holds each tile at a plurality of resolutions. The server unit 204 includes a sentence block extracting unit 2 and a line rectangle extracting unit 3.
[0041]
Next, the operation of the present embodiment will be described with reference to FIG. 13. In the server unit 204, the sentence block extracting unit 2 extracts text blocks from a document held as an image in the multi-resolution tiled image storage memory 203, and the line rectangle extracting unit 3 extracts line rectangles. Information on the extracted text blocks is sent to the client unit 205 via the server communication unit 201. The multi-resolution tiled image storage memory 203 divides the entire held document image into fixed-size partial images (tiles) and transmits the tiled image to the client unit 205 via the server communication unit 201. Here, tiles that include a text block are given a higher resolution than tiles that do not. The text block information and the document image received by the client communication unit 202 of the client unit 205 are sent to the reduced image generation unit 5. The reduced image generation means 5 reduces the document image, generates an image in which text block frames are superimposed on the reduced document image, and outputs the generated image to the image display device 8 via the display selection means 7.
[0042]
Here, when the user selects a text block through the user instructing means 6 and gives an instruction to display a folded image by pressing a button or the like, a request for the text block and line rectangle information is sent to the server unit 204 via the client communication means 202. The server communication unit 201 transmits the information in response to the request from the client communication unit 202. The client communication unit 202 outputs the information sent from the server communication unit 201 to the folded image generation unit 4. The folded image generation unit 4 generates a folded image from the transmitted information and outputs it to the display selection means 7. The display selection unit 7 selects display of the folded image and outputs the generated folded image to the image display device 8, which displays it.
[0043]
Thereafter, when the user selects a text block or gives an instruction to scroll, enlarge, or reduce an image by pressing a button or the like via the user instruction unit 6, the client communication unit 202 sends the server unit 204 either a request for text block and line rectangle information or a request to acquire images specified by resolution and tile number. The server communication unit 201 transmits the necessary information in response to the request from the client communication unit 202. At this time, an appropriate resolution is determined from the display magnification, the necessary tiles are determined from the position and size of the display area, and control is performed so as to minimize the amount of communication data. The client unit 205, having obtained the necessary information, then performs the processing instructed by the user. The operation of the client unit 205 at this time is the same as in the first embodiment.
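How a resolution and a set of tile numbers might be derived from the display magnification and the display area can be sketched as below. The text does not specify the selection rule or the tile numbering, so this is an assumption throughout: `choose_resolution` picks the lowest stored resolution that is at least the magnification times a reference resolution, and `tiles_for_view` numbers tiles in row-major order. Both function names and all parameters are hypothetical.

```python
def choose_resolution(magnification, base_dpi, available_dpi):
    """Pick the lowest stored resolution that satisfies the magnification.

    base_dpi is the resolution that corresponds to magnification 1.0
    (an assumed convention). Falls back to the highest stored resolution.
    """
    needed = magnification * base_dpi
    for dpi in sorted(available_dpi):
        if dpi >= needed:
            return dpi
    return max(available_dpi)


def tiles_for_view(x, y, w, h, tile_size, image_w, image_h):
    """Return row-major tile numbers that intersect the display area.

    (x, y, w, h) is the display area in pixel coordinates of the image at
    the chosen resolution; tiles are tile_size x tile_size squares.
    """
    # Clip the display area to the image bounds.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(image_w, x + w), min(image_h, y + h)
    if x0 >= x1 or y0 >= y1:
        return []
    tiles_per_row = -(-image_w // tile_size)  # ceiling division
    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [
        r * tiles_per_row + c
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]
```

Requesting only the tiles returned by `tiles_for_view`, at the resolution returned by `choose_resolution`, is one way to keep the transferred data close to what is actually displayed.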
[0044]
Since the main purpose of the reduced image in the present invention is to grasp the layout, a wide-range image covering the entire document is required. On the other hand, since the purpose of the folded image is to read characters, an image with a higher resolution than the reduced image is required, but it only needs to cover the text block. For example, consider a document image in which the text blocks occupy half of the total image area. Assume that a resolution of, for example, 150 dpi is required to read the characters, and that, for example, 75 dpi is sufficient for a reduced image used to grasp the entire document. Assume further that the entire 150 dpi image compressed by some method amounts to 400 kB, and the entire 75 dpi image after compression amounts to 100 kB. Conventionally, communicating the document required sending the whole image at the character-readable resolution of 150 dpi, producing 400 kB of traffic. According to the third embodiment of the present invention, however, only the reduced image plus the required text block portion, that is, 100 kB + 400 kB / 2 = 300 kB, is needed. If a hierarchical compression method such as JPEG2000 is used for the image, the 75 dpi compressed data serving as the reduced image can also be used as part of the 150 dpi compressed data, so the amount of data can be reduced further. As described above, the effect of the present embodiment is that, when a document image is distributed via a network, the amount of data can be reduced by controlling communication according to resolution and tile so that only the text blocks are acquired at high resolution.
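The arithmetic of this example can be reproduced with a few lines of Python. The function name `transfer_amount_kb` is hypothetical, and the sketch makes the same simplifying assumption as the example: the compressed size of the high-resolution text block portion is proportional to the share of the image area it occupies.

```python
def transfer_amount_kb(reduced_kb, full_res_kb, text_area_fraction):
    """Estimate data sent: whole reduced image plus high-res text blocks.

    Assumes compressed size scales with image area, as in the example.
    """
    if not 0.0 <= text_area_fraction <= 1.0:
        raise ValueError("text_area_fraction must be between 0 and 1")
    return reduced_kb + full_res_kb * text_area_fraction


conventional_kb = 400                               # whole image at 150 dpi
proposed_kb = transfer_amount_kb(100, 400, 0.5)     # 100 kB + 400 kB / 2
saving = 1 - proposed_kb / conventional_kb          # fraction of data saved
```

With the figures from the example, `proposed_kb` is 300 and `saving` is 0.25, that is, a quarter less data than sending the whole 150 dpi image.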
[0045]
Next, a fourth embodiment of the present invention will be described with reference to the drawings. FIG. 14 is a block diagram showing a fourth embodiment of the document display system according to the present invention. Referring to FIG. 14, the fourth embodiment differs from the third embodiment in that it has a second server communication unit 301 that can not only communicate with the client unit 205 but also acquire documents in standard formats such as HTML or PDF via the Internet or the like, an HTML imaging unit 302 that converts an acquired HTML document into an image and sends it to the multi-resolution tiled image storage memory 203, and a PDF imaging unit 303 that converts an acquired PDF document into an image and sends the converted image to the multi-resolution tiled image storage memory 203.
[0046]
Next, the operation of the present embodiment will be described with reference to FIG. 14. The second server communication unit 301 acquires, via the Internet or the like, a document such as HTML or PDF specified by a user instruction or the like from the client unit, and sends it to the imaging unit corresponding to its format. The HTML imaging unit 302 converts the HTML document acquired by the second server communication unit 301 into an image. One method, for example, is to use the print function of HTML document display software such as a Web browser and capture the output by acting as a pseudo printer. With such a method, conversion to an image can be achieved easily without handling the document format itself directly. The PDF imaging unit 303 is the same as the HTML imaging unit 302 except for the format it handles. Once the multi-resolution tiled image storage memory 203 has acquired the image, the remaining operations are the same as in the third embodiment.
[0047]
The effect of the present embodiment is that folded display can be supported for various document formats without changing the client unit, simply by preparing a means of converting each document format into an image. That is, although HTML and PDF are described here as examples, other formats work in the same way as long as a means of converting them into images is provided. The present embodiment can therefore be applied to any document that can be displayed as an image on a display device, or that can be converted into an image even if it cannot be displayed on a screen. For example, a document can be converted into an image by capturing the screen on which it is displayed or by converting its print output into an image.
[0048]
[Effect of the invention]
As described above, the first aspect of the present invention provides text block extracting means that outputs the position and size of a text block, which is a group of character strings arranged in a document image; line rectangle extracting means that extracts the position and size of each line in the extracted text block and outputs line rectangles, each a rectangular region enclosing one line of characters in the text block; user instructing means by which the user designates the text block to be displayed; and folded image generating means that generates a folded image by folding the text block to fit the display area, based on the position and size of the lines in the text block designated by the user. Because folded display is performed using only line extraction, without character segmentation, a folded image that fits the display area can be generated even when the image is enlarged or reduced.
[0049]
According to the second aspect of the present invention, delimiter position extracting means is provided that extracts the break positions at which a line rectangle extracted by the line rectangle extracting means may be divided, and the line rectangle is divided at those break positions. This reduces folds that fall in the middle of a character and provides an easy-to-read folded display.
[0050]
According to the third aspect of the present invention, reduced image generating means is provided that generates a reduced image of the entire document image, the reduced image and the folded image are displayed simultaneously, and the folded image generating means generates the folded image for the display area that avoids the reduced image. The position within the document image of the text block being displayed in folded form can therefore be easily grasped. Because the folded image is generated so as to avoid the reduced image, the entire display area can be used effectively while avoiding the problem that the area beneath the reduced image cannot be seen.
[0051]
According to the fourth aspect of the present invention, server communication means and client communication means are provided that transmit and receive document images, text blocks, and line rectangle information via a network according to the user's instructions, together with a multi-resolution tiled image storage memory that divides an image into tiles, which are fixed-size partial images, and holds each tile at a plurality of resolutions. Because communication is controlled so that only the text blocks are acquired at high resolution, the amount of data can be reduced when a document image is distributed via a network.
[0052]
According to the fifth aspect of the present invention, format conversion means for converting a specific document format into an image is provided for each supported document format, so folded display can be supported for various document formats without changing the client unit.
[0053]
According to the sixth and seventh aspects of the present invention, the position and size of the text blocks arranged in the document image and the position and size of each line in each text block are extracted, and a folded image is generated by folding the text block to fit the display area based on the position and size of the lines in the text block designated by the user. Compared with the case where no folding is performed, less operation is required and the text can be read comfortably.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a first exemplary embodiment of the present invention.
FIG. 2 is a first flowchart showing the operation of the first embodiment;
FIG. 3 is a second flowchart illustrating the operation of the first exemplary embodiment;
FIG. 4 is a third flowchart showing the operation of the first exemplary embodiment;
FIG. 5 is a flowchart showing the operation of the folded image generating means 4;
FIG. 6 is an example of a reduced image and a folded image.
FIG. 7 is a block diagram illustrating a configuration of a second exemplary embodiment of the present invention.
FIG. 8 is an example of a break position.
FIG. 9 is an example in which a reduced image and a folded image are displayed in a superimposed manner.
FIG. 10 is a first flowchart illustrating the operation of the second embodiment;
FIG. 11 is a second flowchart illustrating the operation of the second embodiment;
FIG. 12 is a third flowchart showing the operation of the second embodiment;
FIG. 13 is a block diagram illustrating a configuration of a third exemplary embodiment of the present invention.
FIG. 14 is a block diagram illustrating a configuration of a fourth exemplary embodiment of the present invention.
FIG. 15 is a diagram illustrating an example of a relationship between a document image and a display area.
FIG. 16 is a diagram illustrating an example of a folded display of a document.
[Explanation of symbols]
1 Image storage memory
2 Text block extraction means
3 Line rectangle extraction means
4 Folded image generating means
5 Reduced image generation means
6 User instruction means
7 Display selection means
8 Image display device
11 Separation position extraction means
12 Second folded image generating means
13 Second image display device
101 Display area
102 Document image
103 Wrapped text
111 Reduced image
112, 113 Text blocks
114 Folded image
121 Line rectangle
122 Break position
201 Server communication means
202 Client communication means
203 Multi-resolution tiled image storage memory
204 Server unit
205 Client unit
301 Second server communication means
302 HTML imaging means
303 PDF imaging means

Claims

1. A document display system comprising:
text block extracting means for outputting the position and size of a text block, which is a group of character strings arranged in a document image;
line rectangle extracting means for extracting the position and size of each line in the extracted text block and outputting line rectangles, each being a rectangular region enclosing one line of characters in the text block;
user instructing means by which a user designates a text block to be displayed; and
folded image generating means for generating a folded image by folding the text block in accordance with the display area based on the position and size of the lines in the text block designated by the user.

2. The document display system according to claim 1, further comprising delimiter position extracting means for extracting break positions at which a line rectangle extracted by the line rectangle extracting means is to be divided,
wherein the folded image generating means divides the line rectangle at the extracted break positions.

3. The document display system according to claim 1, further comprising reduced image generating means for generating a reduced image obtained by reducing the entire document image,
wherein the reduced image and the folded image are displayed at the same time, and
the folded image generating means generates the folded image in a display area that avoids the reduced image generated by the reduced image generating means.

4. The document display system, further comprising:
server communication means and client communication means for transmitting and receiving document images, text blocks, and line rectangle information via a network according to a user's instruction; and
a multi-resolution tiled image storage memory that divides an image into tiles, which are partial images of a fixed size, and holds each tile at a plurality of resolutions.

5. The document display system according to claim 4, wherein format conversion means for converting a specific document format into an image is provided for each supported document format.

6. A document display method comprising: extracting the position and size of a text block, which is a group of character strings arranged in a document image, and the position and size of each line in the text block; and generating a folded image in which the text block is folded in accordance with the display area based on the position and size of the lines in the text block designated by the user.

7. A document display program for causing a computer to function as:
text block extracting means that outputs the position and size of a text block, which is a group of character strings arranged in a document image;
line rectangle extracting means that outputs line rectangles, each being a rectangular region enclosing one line of characters in the text block; and
folded image generating means that folds the text block in accordance with the display area based on the position and size of the lines in the text block selected by the user through user instructing means.