JPH03282945A - Device for forming image information with voice

Info

Publication number: JPH03282945A
Application number: JP2085181A
Authority: JP
Inventors: Hideji Fujita (藤田　秀治); Hideaki Suzaki (洲崎　秀昭); Hiroshi Nishida (西田　博)
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 1990-03-30
Filing date: 1990-03-30
Publication date: 1991-12-13

Abstract

PURPOSE: To efficiently generate image information with audio by first inputting the audio data, which consists of a spoken-language explanation, as character codes and then converting those codes into audio data in a speech synthesis unit. CONSTITUTION: An operator types the required explanatory text as character codes on the keyboard of a word processor 21 or a personal computer 22. The character codes are sent to the speech synthesis unit 40 or the image processing unit 30 and, when necessary, stored in the image storage unit 31 or the audio storage unit 41. Meanwhile, display image data for one screen, or moving-image data, is edited and generated from the input image data or the character codes. While watching the display screen, the operator gives editing commands to the image processing unit 30 and reads the necessary image data out of the storage unit 31 to lay it out. Image information with audio can thus be generated efficiently.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to an apparatus for generating image information with audio, and in particular to an apparatus that generates image information with audio suitable for playback on a VTR.

[Conventional technology]

Images and sound are the most basic elements of any information medium, and VTRs, which provide both at once, are now found in virtually every household. Devices that handle such image information with audio generally process the image data and the audio data in separate systems. The video camera is the typical device for generating image information with audio for a VTR: it generates the image data by shooting a subject and the audio data by recording sound. Image data can also be generated with a computer graphics system or captured with a scanner, and audio data can also be played back from a prerecorded audio tape or compact disc.

[Problem to be solved by the invention]

With a conventional apparatus for generating image information with audio, however, the only way to obtain a spoken explanation as audio data is to have an announcer read the explanatory text aloud and to record the reading. Image information with audio that is supplied commercially requires a highly skilled announcer to read the text, which raises labor costs and complicates the work.

An object of the present invention is therefore to provide an apparatus for generating image information with audio that reduces cost and improves workability when a spoken explanation is used as the audio data.

[Means to solve the problem]

(1) The first invention of the present application is an apparatus for generating image information with audio, comprising: an image input device for inputting image data; a character input device for inputting character codes; an image processing unit that prepares display image data for one screen based on the input image data; a speech synthesis unit that prepares audio data corresponding to the input character codes; and an output device that outputs the one screen of display image data prepared by the image processing unit and the audio data prepared by the speech synthesis unit in association with each other.

(2) The second invention of the present application is an apparatus for generating image information with audio, comprising: an image input device for inputting image data; a character input device for inputting character codes; an image processing unit that prepares moving-image data based on the input image data; a speech synthesis unit that prepares audio data corresponding to the input character codes; and an output device that outputs the moving-image data prepared by the image processing unit and the audio data prepared by the speech synthesis unit in association with each other.

(3) The third invention of the present application is the apparatus according to the first or second invention, further comprising: an image storage unit that is connected to the image processing unit and can hold image data in the form of a database; and an audio storage unit that is connected to the speech synthesis unit and can hold audio data in the form of a database, so that the image processing unit and the speech synthesis unit can prepare data using these databases.

[Operation]

With the apparatus for generating image information with audio according to the present invention, audio data consisting of a spoken explanation can first be input as character codes. That is, the operator only has to type the explanatory text as character codes on a word processor or similar device.

The input character codes are converted into audio data in the speech synthesis unit. Unlike with conventional apparatus, no reading by a skilled announcer is needed, so image information with audio can be generated at lower cost and with better workability.

[Embodiment]

The present invention is described below with reference to the illustrated embodiment. FIG. 1 is a block diagram showing the configuration of one embodiment of the apparatus for generating image information with audio according to the present invention. A video camera 11, a VTR 12, a computer graphics system 13, and a scanner 14 are connected to the image input unit 10 as devices for inputting image data.

The video camera 11 inputs image data by shooting a subject, and the VTR 12 inputs image data by playing back a prerecorded tape. The computer graphics system 13 inputs, as image data, figures, graphs, and the like that the operator has created on its screen. The scanner 14 breaks down an original bearing the image to be input into a set of minute pixel data and inputs that set as image data. Image data input in these ways is sent from the image input unit 10 to the image processing unit 30 and, if necessary, stored in the image storage unit 31. The apparatus of this embodiment can handle both still images and moving images as image data.

A word processor 21 and a personal computer 22 are connected to the character input unit 20 as devices for inputting character codes. From the keyboard of either device, the operator can type the desired explanatory text as character codes. The input character codes are sent to the speech synthesis unit 40 or the image processing unit 30 and, if necessary, stored in the image storage unit 31 or the audio storage unit 41.

The image processing unit 30 has a display device and edits and generates display image data for one screen, or moving-image data, from the captured image data or character codes. While watching the display screen, the operator gives editing instructions to the image processing unit 30, reads the required items of captured image data out of the image storage unit 31, and lays out the read image data within a desired display area. The image data being laid out can be trimmed, composited, enlarged, or reduced. Its gradation or number of colors may also be changed, and its layout position may be moved.

The image storage unit 31 stores both still-image and moving-image data and consists of an optical disk device or the like. The image data is kept in the storage unit 31 in the form of a database, so the operator can read any desired image data out of the database and process it in various ways. For example, the operator can read a moving image and a still image from the database and superimpose the still image on the moving image.

In the apparatus of this embodiment, the image processing unit 30 can also generate character images from the character codes received from the character input unit 20. The image processing unit 30 holds character fonts in various typefaces for each character code. When the operator specifies the layout position, typeface, size, and so on for a series of input character codes, a character image is generated as specified. A fairly low resolution is sufficient for this character image if it is to be shown on an ordinary television display.

The display image data for one screen, or the moving-image data, prepared in this way is transferred to the output unit 50. If necessary, it may instead be given a file name and registered in the database in the storage unit 31. In that case the data can be retrieved and read out at any later time by specifying the file name.

The distinguishing feature of this apparatus is that the audio data is generated by the speech synthesis unit 40, which generates audio data corresponding to the input character codes.

Specifically, the unit parses the given series of character codes, determines the correct phonetic reading of the series, and artificially synthesizes the speech that corresponds to that reading. Techniques for synthesizing speech from character codes in this way are disclosed in, for example, Nikkei Byte, July 1988 issue, pp. 201 to 213, and are therefore not described here. The apparatus of this embodiment performs the synthesis with the text-reading software "Danwa-100" (談話-100) sold by Kabushiki Kaisha Gengo Kogaku Kenkyujo (株式会社言語工学研究所). The synthesized audio data is transferred to the output unit 50. If necessary, it may instead be given a file name and registered in the audio storage unit 41 in the form of a database. In that case the data can be retrieved and read out at any time by specifying the file name.

The output unit 50 outputs the image data and audio data obtained in this way. Because the image data is output as a television signal (NTSC), connecting the output unit 50 to a VTR allows the generated image data and audio data to be recorded. If the image data and audio data to be output are registered in the output unit 50 in advance, they can also be output automatically according to a program schedule. If the apparatus is used in a broadcasting station with a transmitter as the output unit 50, the generated image data and audio data can be superimposed directly on a carrier wave and transmitted as radio waves. In particular, satellite communication is expected to be opened up widely as an information transmission system that ordinary companies can use. Because this apparatus generates image information with audio efficiently, it should prove highly useful in satellite transmitting stations operated by individual companies.

The process by which this apparatus generates image information with audio is now described using a concrete example. Suppose the operator first shoots a map of Europe with the video camera 11 to obtain the still-image data shown in FIG. 2, and then uses the computer graphics system 13 to obtain still-image data consisting of the graph shown in FIG. 3. This image data is passed from the image input unit 10 to the image processing unit 30 and stored for the time being in the storage unit 31. Suppose the operator also uses the word processor 21 to input the character codes (for example, JIS kanji codes) for explanatory text 1 and explanatory text 2 shown in FIG. 4. The operator attaches an identifier to each text to indicate that explanatory text 1 is for display as an image and explanatory text 2 is for speech. Based on this identifier, the character codes for the image are passed to the image processing unit 30 and stored for the time being in the storage unit 31, and the character codes for speech are passed to the speech synthesis unit 40 and stored for the time being in the storage unit 41.
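The identifier-based routing described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation in the patent: the identifier values `"image"` and `"speech"`, the `Explanation` class, and the `route` function are hypothetical names, and plain Python lists stand in for storage units 31 and 41.

```python
from dataclasses import dataclass

# Hypothetical identifier values. The source only says that an identifier
# marks each explanatory text as being for the image or for speech.
FOR_IMAGE = "image"
FOR_SPEECH = "speech"


@dataclass
class Explanation:
    identifier: str  # FOR_IMAGE or FOR_SPEECH
    text: str        # the character codes typed by the operator


def route(explanations, image_store, audio_store):
    """Append each text to the list standing in for storage unit 31 or 41."""
    for item in explanations:
        if item.identifier == FOR_IMAGE:
            image_store.append(item.text)   # via image processing unit 30
        elif item.identifier == FOR_SPEECH:
            audio_store.append(item.text)   # via speech synthesis unit 40
        else:
            raise ValueError(f"unknown identifier: {item.identifier!r}")


image_store, audio_store = [], []
route(
    [
        Explanation(FOR_IMAGE, "explanatory text 1"),
        Explanation(FOR_SPEECH, "explanatory text 2"),
    ],
    image_store,
    audio_store,
)
```

Rejecting unknown identifiers, rather than silently dropping the text, is a design choice of this sketch; the source does not say how an untagged text is handled.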

The operator can likewise input various other still-image data, moving-image data, or character codes for images and keep them in the storage unit 31, and can input various character codes for speech and keep them in the storage unit 41. Once the input work is complete, the image processing unit 30 creates display image data, in this example for one screen. Suppose the operator creates the one-screen display image data 60 shown in FIG. 5. That is, the image data shown in FIG. 2 is trimmed, enlarged, and laid out to create partial image 61; the image data shown in FIG. 3 is laid out to create partial image 62; and explanatory text 1, which was input as character codes for the image, is given a typeface and size to create partial image 63. The image data 60 created in this way is registered in the storage unit 31 under a file name such as "Italian Furniture" (イタリア家具).
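The layout step for display image data 60 can be pictured with a small data model. The sketch below is an assumption-laden illustration rather than the apparatus's actual processing: the `Placement` and `Screen` classes, the `trim_and_scale` helper, the 640 x 480 screen size, and every coordinate are hypothetical, and only sizes and positions are tracked, not pixels.

```python
from dataclasses import dataclass, field


@dataclass
class Placement:
    source: str   # label of the stored image or text (hypothetical)
    x: int        # top-left position on the screen, in pixels
    y: int
    width: int    # size after any trimming and scaling
    height: int


@dataclass
class Screen:
    width: int
    height: int
    placements: list = field(default_factory=list)

    def place(self, p: Placement) -> None:
        """Add a partial image, rejecting one that falls outside the screen."""
        fits = (p.x >= 0 and p.y >= 0
                and p.x + p.width <= self.width
                and p.y + p.height <= self.height)
        if not fits:
            raise ValueError(f"{p.source} does not fit on the screen")
        self.placements.append(p)


def trim_and_scale(width, height, crop, scale):
    """Return the size of an image after cropping and then scaling.

    `crop` is (left, top, right, bottom) in source pixels.
    """
    left, top, right, bottom = crop
    if not (0 <= left < right <= width and 0 <= top < bottom <= height):
        raise ValueError("crop rectangle lies outside the source image")
    return round((right - left) * scale), round((bottom - top) * scale)


# Screen size and all coordinates are assumed values for illustration.
screen = Screen(640, 480)
w, h = trim_and_scale(640, 480, crop=(200, 150, 400, 300), scale=1.5)
screen.place(Placement("map (FIG. 2)", 20, 20, w, h))              # partial image 61
screen.place(Placement("graph (FIG. 3)", 330, 20, 290, 225))       # partial image 62
screen.place(Placement("explanatory text 1", 20, 300, 600, 120))   # partial image 63
```

Keeping the geometry separate from the pixel data means the same placement list could be stored alongside the images and re-rendered later, which fits the database-style storage the text describes, though the source does not specify such a format.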

Next, the speech synthesis unit 40 performs the speech synthesis. In this example, the operator reads explanatory text 2 out of the storage unit 41 and instructs the speech synthesis unit 40 to synthesize speech from it. The speech synthesis unit 40 then synthesizes audio data in which the explanatory text "As you can see in the graph, ...... continues on its steady course." is pronounced correctly.

The audio data synthesized in this way is also registered in the storage unit 41 under the file name "Italian Furniture".

In this way the operator can register multiple sets of image data and audio data in the storage units 31 and 41. Finally, the required files are retrieved and read out of the storage units 31 and 41 and sent to the output unit 50, which outputs the generated image information with audio. If the output is sent by satellite communication, each receiving station shows the still image of FIG. 5 and plays back the speech "As you can see in the graph, ...... continues on its steady course."
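Registering the image and the audio under a shared file name, then reading both back for output, might look like the sketch below. It assumes a simple in-memory dictionary per storage unit; the `Store` class and `fetch_pair` function are hypothetical names, and the byte strings are placeholders for real image and audio data.

```python
class Store:
    """Dict-backed stand-in for storage unit 31 (images) or 41 (audio)."""

    def __init__(self):
        self._files = {}

    def register(self, file_name, data):
        """Store data under a file name, replacing any earlier entry."""
        self._files[file_name] = data

    def read(self, file_name):
        """Return the data registered under file_name (KeyError if absent)."""
        return self._files[file_name]


def fetch_pair(file_name, image_store, audio_store):
    """Read the image and the audio that share one file name."""
    return image_store.read(file_name), audio_store.read(file_name)


image_store, audio_store = Store(), Store()
# Placeholder byte strings stand in for real image and audio data.
image_store.register("Italian Furniture", b"<screen 60>")
audio_store.register("Italian Furniture", b"<synthesized narration>")

picture, narration = fetch_pair("Italian Furniture", image_store, audio_store)
```

Using the same file name as the key in both stores is what associates a screen with its narration in this sketch; a missing entry in either store surfaces as a `KeyError` before anything is sent to the output stage.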

With this apparatus, image information with audio can thus be created through the operator's work alone, and a skilled announcer to read the explanatory text is no longer needed. This lowers production costs and also improves workability.

Although the embodiment above uses a still image as the image data, the apparatus of the present invention handles both moving images and still images. For a moving image, the image and the audio are preferably associated with each other using the time code recorded in the moving-image data. In the apparatus of FIG. 1, the arrows suggest that data is transferred between the blocks over online connections, but the transfer may instead be made offline using a medium such as a floppy disk or video tape. In short, the essence of the present invention is that, when image information with audio is generated, the audio data is input as character codes and synthesized from those codes. The invention can be implemented in various forms as long as they do not depart from this essence.
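One way to picture the time-code association for moving images is a cue table that maps frame positions to audio clips. The source does not describe the mechanism, so everything here is an assumption: the `HH:MM:SS:FF` time-code format, the non-drop-frame rate of 30 frames per second, the clip names, and the functions `timecode_to_frames`, `build_cue_table`, and `clip_at`.

```python
import bisect

FPS = 30  # assumed non-drop-frame rate; real NTSC timing differs slightly


def timecode_to_frames(timecode, fps=FPS):
    """Convert an 'HH:MM:SS:FF' time code into an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not 0 <= frames < fps:
        raise ValueError(f"frame field out of range in {timecode!r}")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def build_cue_table(cues):
    """Turn (timecode, clip_name) pairs into parallel lists sorted by start frame."""
    table = sorted((timecode_to_frames(tc), name) for tc, name in cues)
    return [start for start, _ in table], [name for _, name in table]


def clip_at(frame, starts, names):
    """Return the clip whose start is the latest one at or before `frame`."""
    index = bisect.bisect_right(starts, frame) - 1
    return names[index] if index >= 0 else None


# Hypothetical cues: each synthesized clip starts at a time code in the video.
starts, names = build_cue_table([
    ("00:00:10:00", "narration"),
    ("00:00:02:15", "title"),
])
```

With these cues, `clip_at(75, starts, names)` returns `"title"` and `clip_at(300, starts, names)` returns `"narration"`, while any frame before 75 returns `None`. The sketch does not model clip durations, so a clip is treated as current until the next one starts.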

[Effect of the Invention]

As described above, in the apparatus for generating image information with audio according to the present invention, audio data consisting of a spoken explanation is first input as character codes and then converted into audio data in the speech synthesis unit. Unlike with conventional apparatus, no reading by a skilled announcer is needed, so image information with audio can be generated at lower cost and with better workability.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the basic configuration of an apparatus for generating image information with audio according to one embodiment of the present invention. FIG. 2 shows an image input with the video camera of the apparatus of FIG. 1. FIG. 3 shows an image input with the computer graphics system of the apparatus of FIG. 1. FIG. 4 shows character strings input with the word processor of the apparatus of FIG. 1. FIG. 5 shows a one-screen display image created by the image processing unit of the apparatus of FIG. 1.

Claims

[Claims]

(1) An apparatus for generating image information with audio, comprising: an image input device for inputting image data; a character input device for inputting character codes; an image processing unit that prepares display image data for one screen based on the input image data; a speech synthesis unit that prepares audio data corresponding to the input character codes; and an output device that outputs the one screen of display image data prepared by the image processing unit and the audio data prepared by the speech synthesis unit in association with each other.

(2) An apparatus for generating image information with audio, comprising: an image input device for inputting image data; a character input device for inputting character codes; an image processing unit that prepares moving-image data based on the input image data; a speech synthesis unit that prepares audio data corresponding to the input character codes; and an output device that outputs the moving-image data prepared by the image processing unit and the audio data prepared by the speech synthesis unit in association with each other.

(3) The apparatus for generating image information with audio according to claim 1 or 2, further comprising: an image storage unit that is connected to the image processing unit and can hold image data in the form of a database; and an audio storage unit that is connected to the speech synthesis unit and can hold audio data in the form of a database, characterized in that the image processing unit and the speech synthesis unit can prepare data using the databases.