JP2002175176A

JP2002175176A - Information presenting device and presenting method

Info

Publication number: JP2002175176A
Application number: JP2000373246A
Authority: JP
Inventors: Yoshitoku Kawai; 良徳河合; Yuji Ikeda; 裕治池田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2000-12-07
Filing date: 2000-12-07
Publication date: 2002-06-21

Abstract

PROBLEM TO BE SOLVED: To make the contents of an image file, or of HTML content that includes an image file, easy to recognize when presented, even when used in a mobile environment with limited display resources or by a visually impaired person. SOLUTION: The device is provided with an image analysis processing unit 102, which analyzes an image file that includes image data representing an image and image information about that image data and obtains the image information, and an audio output unit 106, which outputs the image information obtained by the image analysis processing unit 102 as voice.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to an information presenting method and apparatus for presenting information held in an image file and the like, and to a recording medium storing a control program for controlling the information presenting apparatus. In particular, it relates to an information presenting method and apparatus suitable for presenting, in the World Wide Web (hereinafter "WWW") system on the Internet, the information held in HTML content that includes image files stored on a WWW server, and to a recording medium storing a control program for controlling that information presenting apparatus.

[0002]

2. Description of the Related Art An image file such as a JPEG file can be displayed using an image viewer or the like, and the image can also be displayed enlarged or reduced. Various pieces of image information, such as the creator of the image file, its creation date, title, comments, and links related to the image file, can be attached to the image file as a header. The information contained in the header can be displayed as text using the image viewer's functions.

[0003] In the WWW system, browser software such as Netscape Navigator or Internet Explorer is used to access a WWW server, obtain HTML content written in HTML from the server, and present the information contained in that content using the browser's display functions. In addition to text and links, HTML content frequently makes use of references to image files.

[0004] In a mobile environment with limited display resources, a voice browser that outputs information by voice is an effective way to use those resources efficiently. Telephones and similar devices use voice browsers that output voice only, while small portable terminals such as PDAs (Personal Data Assistance) use voice browsers that output voice as a supplement to screen output.

[0005] For visually impaired persons, conventional WWW browsers are very difficult to use because their output relies on a GUI. A voice browser, by contrast, uses voice as its output means and can therefore also be used by visually impaired persons.

[0006] Voice output in voice browsers has focused on how to read out document data and link information effectively. Development has addressed points such as how to read out the document data contained in HTML content intelligibly, and how to convey the contained link information by voice and allow it to be selected by voice.

[0007] HTML content also contains image files. An image file carries a great deal of visual information, but expressing that visual information by voice is very difficult. Consequently, when the information in HTML content is presented using a voice browser, image files are either ignored without consideration or displayed as they are (JP-A-11-110186), or a reduced image is created and displayed according to the available resources (JP-A-10-326244).

[0008] Other methods have also been considered, such as describing the image information in the HTML document, for example by reading out the text given in the ALT attribute of the IMG tag that references an image file in the HTML content, or analyzing the image using OCR and reading out the text contained in the image (JP-A-11-288364).

[0009]

PROBLEMS TO BE SOLVED BY THE INVENTION When a user wants to check an image file on a device with limited screen display resources, such as a mobile device, this has conventionally been done by displaying a reduced image or by displaying the image information as text. Because display resources are limited, however, this is inconvenient.

[0010] If an image referenced from HTML content is ignored, the information contained in that image is lost entirely; if a reduced image is displayed, any text contained in the image may become unreadable. Naturally, displaying the image as it is on a device with limited display resources is also inconvenient. Analyzing the image by OCR and reading out the text contained in it is not effective when the image contains no text.

[0011] With the method of describing image information in the HTML document, if the creator of the HTML content and the creator of the image data are different, the image information intended by the image creator is not necessarily conveyed. Furthermore, when one image file is referenced from multiple pieces of HTML content, changing the descriptions in all of the referencing HTML content whenever the image file is modified is not easy, and mistakes are likely. In short, it is difficult for conventional voice browsers to present the information contained in an image adequately by voice.

[0012] SUMMARY OF THE INVENTION The present invention has been made in view of the above problems. Its object is to make the contents of an image file, or of HTML content that includes an image file, easy to recognize when presented, even when used in a mobile environment with limited display resources or by a visually impaired person.

[0013]

MEANS FOR SOLVING THE PROBLEMS To solve these problems, an information presenting apparatus of the present invention has, for example, the following configuration: image analysis means for analyzing an image file that includes image data representing an image and image information about that image data and obtaining the image information; and audio output means for outputting, as voice, the image information obtained by the image analysis means.

[0014]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS [Embodiment 1] An embodiment of the present invention will now be described in detail with reference to the drawings.

[0015] FIG. 1 is a block diagram showing the configuration of an information presenting apparatus according to an embodiment of the present invention.

[0016] In FIG. 1, reference numeral 100 denotes an image file to be presented. The image file consists of image information described as header information and image data representing the image itself. The header information describes, as image information, items such as the creator of the image file, its creation date, title, comments, and links related to the image file.
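
As an illustration of reading image information from a file header, the following is a minimal sketch in Python. The patent does not say which header field carries the image information; this sketch assumes, purely for illustration, that it is stored in JPEG comment (COM, marker 0xFFFE) segments. The function name `read_jpeg_comments` is hypothetical.

```python
import struct

def read_jpeg_comments(data):
    """Return the text of every COM segment found before the image scan data.

    Assumption: the image information is carried in JPEG COM segments.
    The patent itself only speaks of "header information".
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    comments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # malformed stream: stop scanning
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):
            break  # SOS (start of scan) or EOI: the header part is over
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:  # COM segment
            payload = data[pos + 4:pos + 2 + length]
            comments.append(payload.decode("latin-1"))
        pos += 2 + length
    return comments
```

A real implementation would more likely rely on an image library and a defined metadata format; the point here is only that the image information travels inside the image file itself rather than in the referencing document.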

[0017] Reference numeral 101 denotes an information presenting apparatus that presents the image data and image information of an input image file. The information presenting apparatus 101 comprises an image analysis processing unit 102 that analyzes the image information and image data of the image file, a presentation method selection unit 103 that selects how the information is presented, a screen output unit 104 that displays the analyzed image data and displays the image information as text, a speech synthesis processing unit 105 that synthesizes speech from the analyzed image information, and an audio output unit 106 that outputs the speech synthesized by the speech synthesis processing unit 105.

[0018] There are three methods of presenting information: (1) presenting the image data on the screen output unit 104, (2) presenting the image information as text data on the screen output unit 104, and (3) presenting the image information as audio data through the audio output unit 106. Each can be used alone or in combination with the others, and the presentation method selection unit 103 selects which to use. The selection may be preset according to the specifications of the information presenting apparatus 101, or set by the user according to the user's own needs.
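
The three combinable presentation methods can be modelled as bit flags. The sketch below is one possible reading, not the patent's implementation: the names `Presentation` and `select_presentation` are hypothetical, and the rule that a user request is limited to what the device supports is an assumption made for illustration.

```python
from enum import Flag, auto

class Presentation(Flag):
    """The three presentation methods; members can be combined with |."""
    IMAGE = auto()  # (1) image data on the screen output unit
    TEXT = auto()   # (2) image information as text on the screen output unit
    VOICE = auto()  # (3) image information as speech through the audio output unit

def select_presentation(device_caps, user_choice=None):
    """Pick the methods to use.

    Without a user request the device preset is used as is; with one,
    only the requested methods that the device supports are kept
    (an assumed policy, not stated in the patent).
    """
    if user_choice is None:
        return device_caps
    return device_caps & user_choice
```

For example, a terminal preset to `Presentation.TEXT | Presentation.VOICE` whose user asks for image and voice would end up with voice output only.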

[0019] The operation of the information presenting apparatus 101 according to this embodiment will be described with reference to the flowchart of FIG. 2.

[0020] First, in step S201, an image file is read. The location of the image file to be read is not restricted: the file may be read from the information presenting apparatus itself, from another apparatus, or via a network.

[0021] Next, in step S202, the image analysis processing unit 102 analyzes the image file that was read and obtains the image information and the image data. In step S203, it is determined whether the image data is to be presented on the screen output unit 104. If so, the image data is displayed on the screen output unit 104 in step S204. If not, the process proceeds to step S205.

[0022] In step S205, it is determined whether the image information obtained by the analysis is to be presented as text on the screen output unit 104. If so, the image information is displayed as text on the screen output unit 104 in step S206. If not, the process proceeds to step S207.

[0023] In step S207, it is determined whether the image information obtained by the analysis is to be output as voice through the audio output unit 106. If so, in step S208 the image information is converted into speech by the speech synthesis processing unit 105 and output by the audio output unit 106. If not, the process simply ends.
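
The decision sequence of steps S202 to S208 can be sketched as follows. This is an illustrative outline under stated assumptions: the `ImageFile` type, the function name, and the action labels are hypothetical, real screen and speech output are replaced by a returned list of actions, and it is assumed that processing continues to the next decision after each output step.

```python
from dataclasses import dataclass

@dataclass
class ImageFile:
    info: dict   # image information taken from the header
    data: bytes  # image data representing the picture itself

def present_image_file(image_file, show_image, show_text, speak):
    """Walk through the decisions of FIG. 2 and return the output actions in order."""
    info, data = image_file.info, image_file.data  # S202: analysis result
    text = "; ".join(f"{key}: {value}" for key, value in info.items())
    actions = []
    if show_image:                                 # S203
        actions.append(("screen_image", data))     # S204
    if show_text:                                  # S205
        actions.append(("screen_text", text))      # S206
    if speak:                                      # S207
        actions.append(("speech", text))           # S208: synthesize, then output
    return actions
```

Returning the actions instead of performing them keeps the sketch testable; an actual device would hand each action to its screen output unit or speech synthesizer.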

[0024] Examples of presenting image data and image information according to this embodiment will be described with reference to FIG. 3.

[0025] Reference numeral 301 denotes a presentation example in which the image data is displayed and the image information is both output as voice and displayed as text. It consists of an image data display result 302, in which the image data is displayed, an image information display result 303, in which the image information is displayed as text, and a voice output result 304, in which the image information is output as voice. The image information shown is only an example; various other information may also be presented.

[0026] Reference numeral 305 denotes a presentation example in which the image data is displayed and the image information is displayed as text. It is an example of the case where the user did not select voice output, or where the voice output of the information presenting apparatus was restricted and unavailable.

[0027] Reference numeral 306 denotes a presentation example in which the image information is displayed as text and output as voice. It is an example of the case where the user did not select display of the image data, or where the resources of the screen output unit of the information presenting apparatus were limited, as on a mobile terminal.

[0028] In addition, the presentation method can be determined according to the user's requests and the resources of the information presenting apparatus.

[0029] [Embodiment 2] In the above embodiment, the analysis performed by the image analysis processing unit 102 of the information presenting apparatus 101 was described as limited to analyzing the header information of the image file. Combining it with analysis of the text contained in the image data by character recognition yields further information. The information obtained by character recognition is added to the image information, and, as in the above embodiment, text display on the screen output unit 104 and voice output through the audio output unit 106 can be performed according to the flowchart of FIG. 2.
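
A minimal sketch of adding recognized text to the header-derived image information is shown below. The character recognition engine is deliberately left abstract: `recognize_text` is a hypothetical callable supplied by the caller, and the key name "Text in image" is likewise an assumption for illustration.

```python
def build_image_information(header_info, image_data, recognize_text):
    """Combine header metadata with text recognized in the image data.

    recognize_text is any callable taking the image bytes and returning a
    string; no particular OCR library is implied.
    """
    merged = dict(header_info)  # leave the caller's dictionary untouched
    recognized = recognize_text(image_data).strip()
    if recognized:
        merged["Text in image"] = recognized
    return merged
```

The merged dictionary can then be presented as text or speech in the same way as header-only image information.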

[0030] [Embodiment 3] FIG. 4 is a block diagram showing the configuration of an information presenting apparatus according to an embodiment of the present invention.

[0031] In FIG. 4, reference numeral 400 denotes HTML content that includes image files. The HTML content consists of an HTML document file written in HTML, image files referenced from the HTML content, and other files referenced from the HTML content.

[0032] Each referenced image file consists of image information described as header information and image data representing the image itself. The header information describes, as image information, items such as the creator of the image file, its creation date, title, comments, and links related to the image file.

[0033] Reference numeral 401 denotes an information presenting apparatus that presents input HTML content; it has the following configuration.

[0034] The HTML content analysis processing unit 402 analyzes the HTML document file in the input HTML content, analyzing the document information, link information, and the like described in it, and identifies the image files it references.

[0035] The image analysis processing unit 403 analyzes the image information and image data of each image file referenced from the HTML content.

[0036] The HTML content reconstruction processing unit 404 updates the HTML document file so that the obtained image information is reflected in the HTML content.

[0037] The presentation method selection unit 405 selects the presentation method to be used when the HTML content is reconstructed.

[0038] The screen output unit 406 displays the reconstructed HTML content on the screen, the speech synthesis processing unit 407 synthesizes speech from the reconstructed HTML content, and the audio output unit 408 outputs the synthesized speech.

[0039] There are three methods of presenting an image file in HTML content: (1) presenting the image data on the screen output unit 406, (2) presenting the image information obtained by the analysis as text on the screen output unit 406, and (3) presenting the image information obtained by the analysis as audio data through the audio output unit 408. Each can be used alone or in combination with the others. The selection may be preset according to the specifications of the information presenting apparatus 401, or set by the user according to the user's own needs.

[0040] The operation of the information presenting apparatus 401 according to this embodiment will be described with reference to the flowchart of FIG. 5.

[0041] First, in step S501, HTML content is read. Normally, a WWW server is accessed via a network and the HTML content placed on that server is obtained.

[0042] Next, in step S502, the HTML content analysis processing unit 402 analyzes the HTML content that was read. That is, it analyzes the document information and link information contained in the HTML content and the image files referenced from it.

[0043] In step S503, the image analysis processing unit 403 analyzes the referenced image files and obtains the image information and image data.

[0044] In step S504, the HTML content reconstruction processing unit 404 reconstructs the original HTML content so that the image information obtained in step S503 is reflected in it. Whether to display the image data, whether to display the image information as text, and whether to output the image information as voice are decided according to the presentation method selected by the presentation method selection unit 405. Then, in step S505, screen output and voice output are performed from the reconstructed HTML content according to the presentation method.
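
The reconstruction of step S504 might look like the sketch below. It is an assumption-laden illustration rather than the patent's method: `rebuild_html` is a hypothetical name, image references are found with a simple regular expression that only handles double-quoted `src` attributes, and the image information is inserted as a paragraph after (or instead of) the image tag.

```python
import html
import re

# Matches <img ...> tags with a double-quoted src attribute (a simplification;
# a production system would use a real HTML parser).
IMG_TAG = re.compile(r'<img\b[^>]*\bsrc="([^"]*)"[^>]*>', re.IGNORECASE)

def rebuild_html(document, image_info, show_image=True, show_text=True):
    """Return the document with image information reflected next to each image.

    image_info maps an image's src value to the text obtained by analyzing
    that image file.
    """
    def replace(match):
        parts = []
        if show_image:
            parts.append(match.group(0))  # keep the original image tag
        info = image_info.get(match.group(1))
        if show_text and info:
            parts.append("<p>" + html.escape(info) + "</p>")
        return "".join(parts)

    return IMG_TAG.sub(replace, document)
```

Because the image information becomes ordinary document text in this sketch, a voice browser reading the reconstructed content would speak it along with the rest of the page; deciding separately between text display and voice output would need additional markup not shown here.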

[0045] An example of presenting HTML content information according to this embodiment will be described with reference to FIG. 6.

[0046] Reference numeral 601 denotes a presentation example in which the image information is neither displayed as text nor output as voice. Reference numeral 602 denotes the screen display of the information described in the HTML document, 603 the screen display of the image data, and 604 the voice output of the information described in the HTML document. Commonly used voice browsers produce this kind of output.

[0047] Reference numeral 605 denotes a presentation example in which the image data is displayed and the image information is output as voice. It is an example of the case where the user did not select text display of the image information.

[0048] Reference numeral 606 denotes a presentation example in which the image information is displayed as text and output as voice. It is an example of the case where the user did not select display of the image data, or where the resources of the screen output unit of the information presenting apparatus were limited, as on a mobile terminal.

[0049] In addition, the presentation method can be determined according to the user's requests and the resources of the information presenting apparatus.

[0050] [Embodiment 4] In the above embodiment, the analysis performed by the image analysis processing unit 403 of the information presenting apparatus 401 was described as limited to analyzing the header information of the image file. Combining it with analysis of the text contained in the image data by character recognition yields further information. The information obtained by character recognition is added to the image information, and, as in the above embodiment, text display on the screen output unit 406 and voice output through the audio output unit 408 can be performed according to the flowchart of FIG. 5.

[0051] [Embodiment 5] In the above embodiments, to reduce the processing load on the terminal side of the information presenting apparatus 401 that presents input HTML content, the HTML content analysis processing unit 402, the image analysis processing unit 403, and the HTML content reconstruction processing unit 404 may be implemented in a separate apparatus detached from the information presenting apparatus. The invention is thus also applicable to a gateway-type information presenting apparatus.

[0052] [Embodiment 6] The above embodiments dealt with HTML content written in HTML, which is currently widely used in the WWW system. The invention is naturally also applicable to content written in other markup languages, such as XML.

[0053] [Embodiment 7] In the above embodiments, the HTML content reconstruction processing unit 404 reconstructs HTML content suited to the image information presentation method. The content may instead be reconstructed as content written in a markup language for voice browsers.

[0054] [Embodiment 8] In the above embodiments, the image information contained in the header information of the image file may be described in a form suited to speech synthesis, such as phonetic text notation.

[0055] [Embodiment 9] Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a storage medium on which the program code of software implementing the functions of the above embodiments is recorded, and having a computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the storage medium.

[0056] In this case, the program code read from the storage medium itself implements the functions of the above embodiments, and the storage medium storing that program code constitutes the present invention.

[0057] Storage media that can be used to supply the program code include, for example, a floppy disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, and ROM.

[0058] Needless to say, the invention covers not only the case where the functions of the above embodiments are implemented by the computer executing the program code it has read, but also the case where an OS (operating system) or the like running on the computer performs part or all of the actual processing according to the instructions of the program code, and the functions of the above embodiments are implemented by that processing.

[0059] Needless to say, the invention further covers the case where the program code read from the storage medium is written into memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that board or unit performs part or all of the actual processing according to the instructions of the program code, and the functions of the above embodiments are implemented by that processing.

[0060]

EFFECTS OF THE INVENTION As described in detail above, according to the present invention, the contents of an image file, or of HTML content that includes an image file, can be easily recognized when presented, even when used in a mobile environment with limited display resources or by a visually impaired person.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the information presenting apparatus according to the first embodiment of the present invention.

FIG. 2 is a flowchart showing the processing flow of the information presenting apparatus according to the first embodiment of the present invention.

FIG. 3 is a diagram showing output of the information presenting apparatus according to the first embodiment of the present invention.

FIG. 4 is a block diagram showing the configuration of the information presenting apparatus according to the third embodiment of the present invention.

FIG. 5 is a flowchart showing the processing flow of the information presenting apparatus according to the third embodiment of the present invention.

FIG. 6 is a diagram showing output of the information presenting apparatus according to the third embodiment of the present invention.

[Explanation of symbols]

100 image file
101 information presenting apparatus
102 image analysis processing unit
103 presentation method selection unit
104 screen output unit
105 speech synthesis processing unit
106 audio output unit
400 HTML content
401 information presenting apparatus
402 HTML content analysis processing unit
403 image analysis processing unit
404 HTML content reconstruction processing unit
405 presentation method selection unit
406 screen output unit
407 speech synthesis processing unit
408 audio output unit
410 network

Continued from the front page
(51) Int.Cl.⁷: // G09B 21/00 (FI: G09B 21/00 D)
F-terms (reference): 5B050 AA08 BA15 CA07 CA08 FA02 FA10 FA13 FA19; 5B075 ND06 NK04 PQ02 PQ04; 5E501 AA01 AA04 AB03 AC15 AC22 BA03 BA12 CA08 EA34 FA13 FA14 FA32 FB34

Claims

[Claims]

1. An information presenting apparatus comprising: image analysis means for analyzing an image file that includes image data representing an image and image information about the image data and obtaining the image information; and audio output means for outputting, as voice, the image information obtained by the image analysis means.

2. The information presenting apparatus according to claim 1, further comprising: extraction means for analyzing the image data and recognizing characters contained in the image data by character recognition to extract image data extraction information; and image information generation means for generating image information by combining the image data extraction information obtained by the extraction means with the image information, wherein the image information generated by the image information generation means is supplied to the audio output means.

3. The information presentation device according to claim 2, further comprising: screen output means for displaying the image data and the image information; and presentation method selection means for arbitrarily selecting whether the image data is displayed by the screen output means, whether the image information is displayed by the screen output means, and whether the image information is output as speech by the audio output means.

4. The information presentation device according to claim 1, further comprising: HTML content analysis means for analyzing HTML content that includes an image file and extracting a document file and the image file; and HTML content reconstruction processing means for reconstructing the HTML content by reflecting, in the document file extracted by the HTML content analysis processing means, the image information obtained by using the image analysis means or the image generation means on the basis of the image file extracted by the HTML content analysis means, wherein the audio output means outputs, as speech, the HTML content reconstructed by the HTML content reconstruction processing means.

5. The information presentation device according to claim 4, further comprising: screen output means for displaying the image data and the image information; and presentation method selection means for arbitrarily selecting whether the image data is displayed by the screen output means, whether the document file is displayed by the screen output means, and whether the document file is output as speech by the audio output means.

6. An information presentation method comprising: an image analysis step of analyzing an image file that includes image data representing an image and image information relating to the image data, and obtaining the image information; and an audio output step of outputting, as speech, the image information obtained in the image analysis step.

7. The information presentation method, further comprising: an extraction step of analyzing the image data and recognizing, by character recognition, characters contained in the image data so as to extract image data extraction information; and an image information generation step of generating image information by combining the image data extraction information obtained in the extraction step with the image information, wherein the image information generated by the image information generating means is provided to the audio output means.

8. The information presentation method according to claim 7, further comprising: a screen output step of displaying the image data and the image information; and a presentation method selection step in which a user arbitrarily selects whether the image data is displayed in the screen output step, whether the image information is displayed in the screen output step, and whether the image information is output as speech in the audio output step.

9. The information presentation method according to claim 6 or 7, further comprising: an HTML content analysis step of analyzing HTML content that includes an image file and extracting a document file and the image file; and an HTML content reconstruction processing step of reconstructing the HTML content by reflecting, in the document file extracted in the HTML content analysis processing step, the image information obtained by using the image analysis step or the image generation step on the basis of the image file extracted in the HTML content analysis step, wherein the audio output means outputs, as speech, the HTML content reconstructed in the HTML content reconstruction processing step.

10. The information presentation method according to claim 9, further comprising: a screen output step of displaying the image data and the image information; and a presentation method selection step in which a user arbitrarily selects whether the image data is displayed in the screen output step, whether the document file is displayed in the screen output step, and whether audio output is performed in the audio output step.

11. A storage medium for storing a control program for causing a computer to implement the information presentation method according to claim 6.
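
For illustration only, the flow recited in claims 4 and 5 (and in method claims 9 and 10) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation disclosed in this publication. The names `ImageInfoRebuilder`, `reconstruct_for_speech` and `present` are hypothetical, the image information is assumed to have been obtained already and is passed in as a plain dictionary keyed by image source, and the screen and speech outputs are stand-in callables rather than a real display or voice synthesizer.

```python
from html.parser import HTMLParser


class ImageInfoRebuilder(HTMLParser):
    """Collects document text and replaces each <img> with its image information.

    `image_info` is a hypothetical mapping from an image's src attribute to
    text describing that image (assumed to come from a prior analysis step).
    """

    def __init__(self, image_info):
        super().__init__(convert_charrefs=True)
        self.image_info = image_info
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Only <img> tags are of interest; other tags contribute no text.
        if tag == "img":
            src = dict(attrs).get("src")
            info = self.image_info.get(src)
            if info:
                self.parts.append(f"[Image: {info}]")

    def handle_data(self, data):
        # Keep non-empty text nodes, trimmed of surrounding whitespace.
        text = data.strip()
        if text:
            self.parts.append(text)


def reconstruct_for_speech(html, image_info):
    """Return the content as one text string with image information merged in."""
    parser = ImageInfoRebuilder(image_info)
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def present(text, *, display, speak, screen, speaker):
    """Send the text to the outputs selected by the user.

    `screen` and `speaker` are hypothetical callables standing in for the
    screen output and the speech output, respectively.
    """
    if display:
        screen(text)
    if speak:
        speaker(text)
```

A usage example under the same assumptions: calling `reconstruct_for_speech('<p>Our new camera</p><img src="cam.jpg">', {"cam.jpg": "Front view of the camera"})` yields a single string in which the image is represented by its description, and `present(..., display=False, speak=True, ...)` then routes that string to the speech output only, which corresponds to the case of a user who cannot rely on the screen.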