JP2004266850A

JP2004266850A - Motion picture treatment apparatus and its method

Info

Publication number: JP2004266850A
Application number: JP2004117055A
Authority: JP
Inventors: Kunihiro Yamamoto; 邦浩山本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1992-11-05
Filing date: 2004-04-12
Publication date: 2004-09-24

Abstract

PROBLEM TO BE SOLVED: Captions on motion pictures are often rendered in high-luminance white to improve legibility. When a captioned motion picture is compressed with a scheme based on quantization in the frequency domain, mosquito noise tends to occur around the characters and bits concentrate in the character regions, so the quality of the whole picture deteriorates compared with the uncaptioned case.

SOLUTION: Motion picture data is compressed by a motion picture compression circuit 113, multiplexed with caption text data by a multiplexer 114, and sent to the user through a storage/transmission system 115. The data received by the user is separated into compressed picture data and caption text data by a demultiplexer 116. The picture data is sent to a motion picture expansion circuit 117 through a signal line 120. The text data is sent through a signal line 121 to a caption superimposer 118, where it is expanded into a bitmap text picture and superimposed on the motion picture expanded by the motion picture expansion circuit 117.

COPYRIGHT: (C)2004,JPO&NCIPI

Description

The present invention relates to a moving image processing apparatus and method, and more particularly to moving image processing that applies edits such as superimposition to a moving image.

Conventionally, when digital moving image data is edited, compressed, and transmitted, the moving image data is generally edited first and the edited data is then compressed. Here, "editing" means, for example, superimposing characters or figures on a moving image, or applying to it a transition effect of the kind described next.

A transition effect is, for example, a process that fades out the temporally earlier moving image while fading in the temporally later one, in order to connect two temporally consecutive moving image sequences smoothly. In this case, within the range designated as the transition period, images appear in a special state in which the two moving image sequences overlap.

The edited moving image data thus obtained is encoded by one of the various moving image encoding methods that have been proposed and is transmitted to the receiving side over a communication line. The receiving side decodes the received code data and displays it.

FIG. 2 is a flowchart showing a conventional moving image editing procedure. FIG. 2(a) shows the procedure of the transmitting terminal, and FIG. 2(b) shows the procedure of the receiving terminal.

In FIG. 2(a), the transmitting terminal inputs moving image data and stores it in memory in step S1. It then accepts the designation of an effect such as a transition effect in step S2 and, in step S3, edits the image data stored in memory according to that designation.

Next, in FIG. 2(b), the receiving terminal receives the code data in step S6. In step S7 it decodes the received code data and stores the resulting moving image data in memory. In step S8 it displays the moving image on the display.

Patent documents: JP-A-4-103271; JP-A-60-79883; JP-A-2-123881; JP-A-1-286682; JP-A-1-232573; JP-A-4-256296

Subtitles on moving images such as television programs and movies are usually displayed in high-luminance white to improve visibility. However, when such a subtitled moving image is compressed with a method based on quantization in the frequency domain, such as MPEG, mosquito noise tends to occur around the characters, and bits concentrate in the character portions, so the image quality of the whole picture is degraded compared with the case without subtitles.
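
The effect described above can be reproduced on a single row of samples. The following is a minimal sketch, assuming an 8-sample orthonormal DCT and a uniform quantizer with a deliberately coarse, hypothetical step of 80; it is not the quantizer of any particular MPEG encoder. It shows that a sharp dark-to-white edge, such as the boundary of a subtitle character, is not reconstructed exactly once its frequency coefficients are quantized.

```python
import math

N = 8  # samples per block row (assumed block length)

def scale(k):
    """Normalization factor of the orthonormal DCT."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(x):
    """Orthonormal DCT-II of a length-N sequence."""
    return [scale(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N))
            for k in range(N)]

def idct(c):
    """Inverse of dct()."""
    return [sum(scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

def quantize(c, step):
    """Uniform quantization: snap each coefficient to a multiple of step."""
    return [step * round(v / step) for v in c]

# One row crossing a character edge: dark background (16) next to white text (235).
row = [16, 16, 16, 16, 16, 235, 235, 235]
coarse = idct(quantize(dct(row), step=80))  # hypothetical coarse step
ripple = max(abs(a - b) for a, b in zip(coarse, row))
print([round(v) for v in coarse], "max error:", round(ripple, 1))
```

Without the quantization step the row is recovered exactly (up to floating-point error), so the reconstruction error comes entirely from quantizing the coefficients.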

There are also many problems in terms of operability for the user. For example, the subtitle position is normally fixed at the bottom of the screen and the user cannot change it, nor can the user hide the subtitles when they are not needed.

Furthermore, a moving image such as a movie cannot be watched while switching between, for example, Japanese and English subtitles. A user who needs two kinds of subtitles has to obtain two sets of moving image data that differ only in the subtitle portion, and the producer has to create a different version for each sales region (for example, the Japanese-speaking and English-speaking regions), which is wasteful.

The present invention solves the problems described above. Its object is to prevent the image quality degradation caused by superimposing text data on moving image data, and to make it possible to set the text superimposition conditions on the receiving side.

As one means of achieving this object, the present invention has the following configuration.

According to the present invention, information data is received from an external device, the information data multiplexing text data with moving image encoded data obtained by frequency-domain transforming and compression-encoding moving image data on which the text data is not superimposed. The moving image encoded data and the text data are separated from the received information data, and the separated moving image encoded data is decoded. A setting of a superimposition condition, to be applied when the separated text data is superimposed on the decoded moving image data, is accepted. The text data is superimposed on the moving image data based on the superimposition condition, and the moving image data with the text data superimposed is output.

According to the present invention, superimposing the text data on the moving image data at the receiving side prevents the image quality degradation that results from superimposing text data on moving image data at the transmitting side, and makes it possible to set the text superimposition conditions at the receiving side.

Hereinafter, moving image processing according to embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing a configuration example of a system that processes moving images by the moving image processing method of the first embodiment of the present invention. The moving image transmission system in the figure consists of a plurality of terminals a1, b2, c3, ... connected by a communication line, and terminal a1 is configured as follows.


Reference numeral 11 denotes an input port, which takes in moving image data from an image input device 19 such as a video camera. Reference numeral 12 denotes a CPU, which processes image data and controls the whole of terminal a1 via a bus 18 according to a control program stored in a ROM 14. By accessing a given area of a RAM 13, the CPU 12 can read the pixel value at any coordinate of any frame of the moving image data stored in the RAM 13.

Reference numeral 15 denotes an encoding/decoding circuit, 16 a display, and 17 an interface that mediates data communication with a communication line 101. These components are interconnected by the bus 18. The other terminals b2, c3, ... have substantially the same configuration as terminal a1.

FIG. 3 is a flowchart showing an example of the moving image editing procedure of this embodiment. FIG. 3(a) shows the procedure of the transmitting terminal, and FIG. 3(b) shows the procedure of the receiving terminal.

In FIG. 3(a), the transmitting terminal of this embodiment reads moving image data into the RAM 13 via the input port 11 in step S11. In step S12 it accepts the designation of a plurality of effects, such as superimposition and transition effects (fade-out, wipe, and so on).

In step S13, the transmitting terminal sends the effect designation data (command data) accepted in step S12 to the communication line 101 via the interface 17. In step S14 it encodes the moving image data with the encoding/decoding circuit 15, and in step S15 it sends the code data to the communication line 101 via the interface 17. The moving image data may instead be encoded in advance and sent to the communication line 101 with the effect designation data attached.

That is, in this embodiment, the effect designation data and the encoded, unedited moving image data are transmitted.

Next, in FIG. 3(b), the receiving terminal receives the effect designation data through the interface 17 and stores it in the RAM 13 in step S16. In step S17 it receives the code data of the moving image through the interface 17 and stores it in the RAM 13. In step S18 the encoding/decoding circuit 15 decodes the code data stored in the RAM 13, and the resulting moving image data is stored in the RAM 13. In step S19 the receiving terminal edits the moving image data stored in the RAM 13 according to the effect designation data stored in the RAM 13.

Editing is performed, for example, as follows. Suppose the received moving image has a rate of 5 frames per second and the display 16 can show 20 frames per second. When a fade-out command is received, the frames of the 5 frames-per-second moving image are interpolated to create 20 frames-per-second moving image data, and the fade-out is then applied (see FIG. 6A). The simplest interpolation is just to repeat each picture for four frames.
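
The 5-to-20 frames-per-second example above can be sketched as follows. This is a minimal illustration under stated assumptions: a frame is modeled as a flat list of pixel values, the function names `repeat_frames` and `fade_out` are hypothetical, and the fade is taken to be a linear ramp to black, which the text does not specify.

```python
def repeat_frames(frames, src_fps, dst_fps):
    """Raise the frame rate by showing each source frame dst_fps // src_fps times."""
    if dst_fps % src_fps != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    factor = dst_fps // src_fps
    return [frame for frame in frames for _ in range(factor)]

def fade_out(frames):
    """Scale brightness linearly from full (first frame) to black (last frame)."""
    last = len(frames) - 1
    faded = []
    for i, frame in enumerate(frames):
        gain = 1.0 if last == 0 else 1.0 - i / last
        faded.append([round(p * gain) for p in frame])
    return faded

# One second of received 5 fps video; each "frame" is a tiny list of pixel values.
received = [[200, 100]] * 5
display = fade_out(repeat_frames(received, src_fps=5, dst_fps=20))
print(len(display))              # 20
print(display[0], display[-1])   # [200, 100] [0, 0]
```

Because the fade is applied after the frame rate is raised, the brightness changes in 20 steps per second rather than 5, which corresponds to the smoother result attributed to FIG. 6A.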

With the conventional method, the displayed image is as shown in FIG. 6B. Besides fade-out, transition effects such as a wipe can be performed in the same way.

The receiving terminal then displays the moving image on the display 16 in step S20.

That is, in this embodiment, the receiving terminal receives the effect designation data and the code data of the moving image, decodes the code data, and edits the resulting moving image data according to the effect designation data.

In the conventional example described above, superimposing characters on a moving image means encoding image data on which line-drawing data (the characters) has been superimposed. Known moving image encoding methods (for example, MPEG) are aimed mainly at compressing natural images, so improving the image quality prevents efficient compression and increases the amount of code data, while raising the compression ratio causes severe degradation such as characters becoming hard to read. Also, in the conventional example, when moving image data is transmitted at a low frame rate of about 10 frames per second and a fade-out is applied, the change in the fading image can only be expressed at that low frame rate, so the effect is jerky and awkward.

In this embodiment, by contrast, the characters are superimposed at the receiving terminal immediately before display, so compression efficiency is not reduced and image quality is not degraded. Moreover, when a moving image is faded out, performing the interpolation described above at the receiving terminal makes it easy to raise the frame rate of only the changing, fading-out portion above that of the other portions, so a smooth fade-out is obtained without reducing the transmission efficiency of the image data.

The description and figures above show an example in which the encoding/decoding circuit 15 encodes and decodes the moving image, but this embodiment is not limited to that. For example, encoding and decoding may be implemented in software on the CPU 12. Processing is then slower, but the cost of the terminal can be reduced.

The description and figures above also show an example in which the moving image data input through the input port 11, or the code data received through the interface 17, is stored in the RAM 13 as is, but this embodiment is not limited to that. For example, with a sufficiently fast encoding/decoding circuit, the moving image data may be encoded in real time as it is input, or the code data decoded in real time as it is received, and the result stored in the RAM 13. This improves processing speed and, in the former case, also reduces the memory capacity required of the RAM 13.

The description and figures above use a RAM to store data, but this embodiment is not limited to that. For example, when high-speed processing is not required, an external storage device such as a hard disk may be used. Moving image data with a large data volume can then be processed at low cost.

As described above, according to this embodiment, editing the moving image data at the receiving terminal achieves high-quality moving image transmission without incurring the reduced compression efficiency or degraded image quality associated with encoding and decoding.

The terminal of this embodiment may be a system made up of a plurality of devices, for example a video camera and a host computer, or an apparatus consisting of a single device, for example a host computer that stores moving images. The processing of this embodiment may also be implemented by supplying a program stored on a medium to a system or apparatus.

As described above, the first embodiment of the present invention provides a moving image processing method that transmits editing information for moving image data and encodes the moving image data for efficient transmission. The first embodiment also provides a moving image processing method that edits, based on the received editing information, the moving image data obtained by decoding the received code data.

This embodiment applies the moving image processing method of the first embodiment to techniques that superimpose text on the image in synchronization with the moving image, such as movie subtitles or the lyrics of karaoke with video.

In this embodiment, the moving image data and the text data such as subtitles are held separately, and at playback the text is expanded into a bitmap and superimposed on the image. This prevents the image quality degradation during compression and decompression that inserting text such as subtitles would cause, and at the same time solves the operability problems described above.

Hereinafter, a second embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 4 is a configuration diagram of the moving image processing system of this embodiment. Reference numeral 111 denotes a subtitle text data memory that stores subtitle text data, 112 a moving image data memory that stores moving image data, 113 a moving image compression circuit, 114 a multiplexer, 115 a storage/transmission system, 116 a demultiplexer, 117 a moving image decompression circuit, 118 a subtitle superimposer, 119 a display, and 120 and 121 signal lines.

The moving image data stored in the moving image data memory 112 is compressed by the moving image compression circuit 113, combined by the multiplexer 114 with the text data stored in the subtitle text data memory 111, and sent to the storage/transmission system 115. Here, the storage/transmission system means packaged media such as tape or CD-ROM, or a communication system such as ISDN or TV broadcasting, through which the data is delivered to the user.

The data received by the user is separated by the demultiplexer 116 into compressed image data and text data. The image data is sent to the moving image decompression circuit 117 through the signal line 120. The text data is sent through the signal line 121 to the subtitle superimposer 118, where it is expanded into a bitmap text image and superimposed on the moving image decompressed by the moving image decompression circuit 117. The subtitled moving image thus obtained is displayed on the display 119.
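
The roles of the multiplexer 114 and the demultiplexer 116 can be sketched with a simple length-prefixed packet stream. This is only an illustrative container, assuming one subtitle string per compressed video chunk, a one-byte packet type, and UTF-8 text; the packet layout and the names `mux` and `demux` are hypothetical and are not the format used in the embodiment.

```python
import struct

VIDEO, TEXT = 0, 1  # hypothetical packet type tags

def mux(video_chunks, subtitles):
    """Interleave compressed video chunks and subtitle strings into one byte stream."""
    stream = bytearray()
    for chunk, subtitle in zip(video_chunks, subtitles):
        for kind, payload in ((VIDEO, chunk), (TEXT, subtitle.encode("utf-8"))):
            # Header: 1-byte type followed by a 4-byte big-endian payload length.
            stream += struct.pack(">BI", kind, len(payload)) + payload
    return bytes(stream)

def demux(stream):
    """Split the stream back into video chunks and subtitle strings."""
    video, text, pos = [], [], 0
    while pos < len(stream):
        kind, size = struct.unpack_from(">BI", stream, pos)
        pos += struct.calcsize(">BI")
        payload = stream[pos:pos + size]
        pos += size
        if kind == VIDEO:
            video.append(payload)   # would go to the decompression circuit
        else:
            text.append(payload.decode("utf-8"))  # would go to the superimposer
    return video, text

chunks = [b"\x10\x20\x30", b"\x40"]   # stand-ins for compressed pictures
subs = ["Hello", ""]                  # an empty string means no subtitle
video, text = demux(mux(chunks, subs))
print(video == chunks, text == subs)  # True True
```

The compressed video bytes pass through untouched, so the text never enters the frequency-domain encoder.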

Instead of the display, an image forming apparatus (a laser beam printer or the like) may be used to make a hard copy of the moving image frame by frame. The notable point of this embodiment is that the subtitle superimposer is on the user's side. The user sets whether subtitles are displayed, and their display color, display position, character size, and so on, through an operation panel 122, and a control circuit 123 controls the superimposition performed by the subtitle superimposer 118 according to those settings.
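
The user-selectable superimposition conditions can be sketched as a small settings record that drives the overlay step. This is a minimal illustration under stated assumptions: frames are lists of rows of pixel values, the 3x3 glyph table is a toy stand-in for a real font, and the names `SuperimposeSettings`, `render_bitmap`, and `superimpose` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SuperimposeSettings:
    visible: bool = True   # display / non-display
    color: int = 255       # pixel value written for text pixels
    x: int = 0             # left edge of the text, in pixels
    y: int = 0             # top edge of the text, in pixels
    scale: int = 1         # integer character magnification

# Toy 3x3 font; a real superimposer would use proper glyph data.
GLYPHS = {"H": ["101", "111", "101"], "I": ["111", "010", "111"]}
BLANK = ["000", "000", "000"]

def render_bitmap(text, scale):
    """Expand text into a bitmap: a list of rows of 0/1 values."""
    rows = [[] for _ in range(3 * scale)]
    for ch in text:
        for r, line in enumerate(GLYPHS.get(ch, BLANK)):
            # Widen each bit `scale` times, then add `scale` blank columns as a gap.
            bits = [int(b) for b in line for _ in range(scale)] + [0] * scale
            for s in range(scale):
                rows[r * scale + s].extend(bits)
    return rows

def superimpose(frame, text, settings):
    """Return a copy of frame with text drawn according to settings."""
    out = [row[:] for row in frame]
    if not settings.visible:
        return out
    for r, bits in enumerate(render_bitmap(text, settings.scale)):
        for c, bit in enumerate(bits):
            py, px = settings.y + r, settings.x + c
            if bit and 0 <= py < len(out) and 0 <= px < len(out[0]):
                out[py][px] = settings.color
    return out

frame = [[0] * 8 for _ in range(5)]   # stand-in for one decoded frame
shown = superimpose(frame, "HI", SuperimposeSettings(x=1, y=1))
hidden = superimpose(frame, "HI", SuperimposeSettings(visible=False))
for row in shown:
    print("".join("#" if p else "." for p in row))
```

Changing only the settings object moves, recolors, enlarges, or hides the text; the decoded frame itself is never modified.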

In the conventional transmission method described above, the subtitle superimposer is on the transmitting side (the producer of the moving image), so the subtitles are always displayed at a fixed position. According to this embodiment, by contrast, the user can give instructions to the subtitle superimposer and freely set whether subtitles are displayed, and their display color, display position, character size, and so on, which greatly improves operability. This embodiment also improves the trade-off between image quality and compression efficiency described above.

An encoder for MPEG, an international standard moving image compression method, can be used as the moving image compression circuit 113 shown in FIG. 4. The MPEG bitstream defines a "user data area" in which arbitrary data can be written for each picture while remaining MPEG-compliant, so the multiplexer 114 writes the text data into this "user data area". The demultiplexer 116 reads the text data out of the "user data area" of the MPEG bitstream and sends it to the subtitle superimposer 118. An MPEG decoder is used as the moving image decompression circuit 117.
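
Carrying text in the user data area can be sketched at the byte level. In MPEG-1/2 video, user data is introduced by the start code 0x000001B2 and runs until the next start code prefix 0x000001. The sketch below assumes UTF-8 text (the embodiment does not name an encoding) and uses made-up stand-in bytes for the pictures; the helper names are hypothetical, and the sketch does not model where the standard permits user data to appear.

```python
START_PREFIX = b"\x00\x00\x01"          # prefix shared by all MPEG video start codes
USER_DATA_START = b"\x00\x00\x01\xb2"   # user_data_start_code

def make_user_data(text):
    """Wrap subtitle text as a user data section."""
    payload = text.encode("utf-8")      # assumption: encoding is not specified in the source
    if START_PREFIX in payload:
        raise ValueError("payload would emulate a start code")
    return USER_DATA_START + payload

def extract_user_data(stream):
    """Collect the text of every user data section in the stream."""
    texts, pos = [], 0
    while True:
        start = stream.find(USER_DATA_START, pos)
        if start < 0:
            return texts
        begin = start + len(USER_DATA_START)
        end = stream.find(START_PREFIX, begin)
        if end < 0:
            end = len(stream)
        # Zero bytes directly before the next start code are stuffing, not text.
        texts.append(stream[begin:end].rstrip(b"\x00").decode("utf-8"))
        pos = end

PICTURE_START = b"\x00\x00\x01\x00"     # picture_start_code
SLICE_START = b"\x00\x00\x01\x01"       # first slice start code
stream = b"".join(
    PICTURE_START + b"\x0a\x0b" + make_user_data(text) + SLICE_START + b"\xaa\xbb"
    for text in ("Hello", "World")      # fake header and slice bytes, one subtitle per picture
)
print(extract_user_data(stream))        # ['Hello', 'World']
```

A decoder that does not look for the user data start code simply skips these sections, which is why the picture data remains decodable on its own.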

This embodiment has the advantage that it can generate code data compliant with MPEG, a standard moving image compression method.

When the data is played back by an ordinary MPEG decoder instead of the playback system described in this embodiment, the data in the user data area is ignored. The subtitles cannot be displayed, but the moving image itself is reproduced correctly, so compatibility is maintained.

As described above, the second embodiment of the present invention avoids the image quality degradation caused by superimposing text on the image, and lets the user choose freely how subtitles are displayed.

Hereinafter, a third embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 5 is a configuration diagram of the moving image processing system of this embodiment. Reference numerals 131, 132, and 133 each denote a subtitle text data memory that stores subtitle text data, 134 a moving image data memory, 135 a moving image compression circuit, 136 a multiplexer, 137 a storage/transmission system, 138 a demultiplexer, 139 a moving image decompression circuit, 140 a subtitle superimposer, 141 a display, and 142 a changeover switch.

The subtitle text data memories 131, 132, and 133 each hold a different kind of text data. For example, suppose the first subtitle text data memory 131 holds Japanese subtitles, the second subtitle text data memory 132 holds English subtitles, and the third subtitle text data memory 133 holds Chinese subtitles. These three kinds of text data are multiplexed by the multiplexer 136 with the image data read from the moving image data memory 134 and compressed by the moving image compression circuit 135, and the result is sent to the storage/transmission system 137.

On the receiving side, the data passes through the demultiplexer 138 and is separated into compressed image data and the three kinds of text data. The compressed image data is decompressed by the moving image decompression circuit 139 and sent to the subtitle superimposer 140. The switch 142 selects one of the text data streams, so in this embodiment the user can choose the desired subtitles from three languages. The selected text data is expanded into a bitmap by the superimposer 140, superimposed on the moving image, and displayed on the display 141.
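
The changeover switch 142 amounts to passing exactly one of the demultiplexed text tracks on to the superimposer. A minimal sketch, assuming the tracks are keyed by hypothetical language tags and each track is a list of subtitle strings with made-up content:

```python
def select_track(tracks, choice):
    """Model of the changeover switch: pass one text track to the superimposer."""
    if choice not in tracks:
        raise KeyError(f"no subtitle track {choice!r}")
    return tracks[choice]

# Demultiplexed text data, one list of subtitle strings per track (hypothetical content).
tracks = {
    "ja": ["こんにちは", "さようなら"],
    "en": ["Hello", "Goodbye"],
    "zh": ["你好", "再见"],
}

for lang in ("en", "ja"):   # the user flips the switch during playback
    print(lang, select_track(tracks, lang))
```

Because all tracks travel with the same compressed picture data, switching languages needs no second copy of the moving image.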

With the configuration described above, the user can select any one of several kinds of subtitles for display.

Three subtitle text streams are used here, but the number is of course not limited to three; as many streams can be included as the capacity and transfer capability of the storage/transmission system allow. Nor do the streams have to be in different languages: texts with different content in the same language may be included.

Instead of expanding the text into a bitmap and superimposing it on the image, the text may be read aloud by a speech synthesizer and played back mixed with (or in place of) the audio that accompanies the moving image.

The present invention may be applied to a system made up of a plurality of devices or to an apparatus consisting of a single device. It also goes without saying that the present invention is applicable where it is achieved by supplying a program to a system or apparatus.

FIG. 1 is a block diagram showing a configuration example of a system that processes moving images by the moving image processing of the first embodiment of the present invention.
FIG. 2 is a flowchart showing a conventional moving image editing procedure.
FIG. 3 is a flowchart showing the moving image editing procedure of the first embodiment of the present invention.
FIG. 4 is a block diagram showing a configuration example of the second embodiment of the present invention.
FIG. 5 is a block diagram showing a configuration example of the third embodiment of the present invention.
FIG. 6A is a diagram illustrating an example of fade-out according to the first embodiment.
FIG. 6B is a diagram illustrating fade-out according to the conventional example.

Claims

Receiving means for receiving, from an external device, information data in which text data is multiplexed with moving image encoded data obtained by frequency-domain transforming and compression-encoding moving image data on which the text data is not superimposed,
Separating means for separating the moving image encoded data and the text data from the information data received by the receiving means,
Decoding means for decoding the moving picture encoded data separated by the separation means,
Setting means for setting a superimposition condition of text data when superimposing the text data separated by the separation means on the moving image data decoded by the decoding means;
Superimposing means for superimposing the text data on the moving image data based on the superimposing condition;
Output means for outputting moving image data on which the text data is superimposed.

Receiving, from an external device, information data in which text data is multiplexed with moving image encoded data obtained by frequency-domain transforming and compression-encoding moving image data on which the text data is not superimposed,
Separating the moving image encoded data and the text data from the received information data,
Decoding the separated moving image encoded data,
When superimposing the separated text data on the decoded moving image data, accepting the setting of the superimposition condition of the text data,
Superimposing the text data on the moving image data based on the superimposition condition,
A moving image processing method, comprising outputting moving image data on which the text data is superimposed.