JPH10257437A

JPH10257437A - MPEG encoder/decoder

Info

Publication number: JPH10257437A
Application number: JP9058135A
Authority: JP
Inventors: Yasunori Okada; 恭典岡田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1997-03-12
Filing date: 1997-03-12
Publication date: 1998-09-25

Abstract

PROBLEM TO BE SOLVED: To provide an MPEG encoder/decoder that prevents noise at the start of playback, without any complicated processing on the decoder side, when playback starts from the head of a GOP in the middle of a stream, as in jump playback. SOLUTION: An audio frame detection unit 105 searches the audio stream obtained by MPEG encoding and detects the head of each of the audio frames that make up that stream. Based on the detection result, an audio packetizing unit 106 packetizes the audio stream so that no audio frame is split across different packets.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an MPEG encoding/decoding device and, more specifically, to an MPEG encoding/decoding device having a function of preventing the noise generated during jump playback.

[0002]

2. Description of the Related Art

In recent years, research has been advancing on video servers that store and deliver large amounts of video and audio information. In a video server, video and audio information is digitized and encoded, and is stored on a disk or the like as compressed data (a stream) whose information volume has been greatly reduced. The encoding method generally adopted is MPEG (Moving Picture Experts Group), the international standard for moving picture coding. An MPEG encoding/decoding device is a device, provided in such a video server or the like, that encodes and decodes video and audio information using MPEG.

[0003] FIG. 7 shows the structure of the streams used in MPEG. FIG. 7(1) shows a program stream, FIG. 7(2) a video elementary stream, and FIG. 7(3) an audio elementary stream. The program stream is obtained by multiplexing PES packets whose payloads store the video elementary stream with PES packets whose payloads store the audio elementary stream.

[0004] The following describes how each of the streams in FIGS. 7(1) to 7(3) is constructed. The video data obtained by digitization and MPEG encoding forms a plurality of video frames, and a picture header is attached to the head of each video frame. The picture header describes, among other things, the attribute of that video frame (I, P, or B frame).

[0005] Here, an I frame is an intra-coded frame and can be decoded on its own. A P frame is encoded with reference to a past I or P frame, and a B frame is encoded with reference to two frames, one past and one future. Because P and B frames encode only the difference from the predicted frame, they carry less data than an I frame, but they cannot be decoded on their own.

[0006] A plurality of video frames constitute one GOP (Group of Pictures), and a GOP header is attached to the head of each GOP. A GOP is the smallest unit of video data that can be randomly accessed, and one GOP typically consists of 15 frames. A plurality of GOPs in turn constitute one sequence, with a sequence header attached to the head of the sequence. A plurality of sequences then constitute the video elementary stream.

[0007] Likewise, the audio data obtained by digitization and MPEG encoding forms a plurality of audio frames, and a header is attached to the head of each audio frame. An audio frame is the smallest unit of audio data that can be decoded on its own.

[0008] The video elementary stream and the audio elementary stream constructed as described above are each packetized by being divided and stored in the payload portions of a plurality of PES packets. Because the length of the payload portion of a PES packet (the payload length) is fixed uniformly in advance, the video and audio elementary streams are each divided according to this payload length. Consequently, a single video frame or audio frame may be split across several PES packets. Note that a single PES packet never stores both video and audio data.
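As a rough illustration of why a fixed payload length splits frames, the following Python sketch cuts a stream at fixed offsets and reports which frames straddle a packet boundary. The function names and the list of frame lengths are hypothetical, and PES headers are omitted for brevity; frame lengths are assumed to be positive.

```python
from typing import List


def split_fixed(stream: bytes, payload_len: int) -> List[bytes]:
    """Cut an elementary stream into payloads of a fixed length."""
    return [stream[i:i + payload_len] for i in range(0, len(stream), payload_len)]


def frames_split_across_packets(frame_lengths: List[int], payload_len: int) -> List[int]:
    """Return the indices of frames whose bytes land in more than one payload."""
    split = []
    offset = 0  # byte offset of the current frame within the stream
    for index, length in enumerate(frame_lengths):
        first_packet = offset // payload_len
        last_packet = (offset + length - 1) // payload_len
        if last_packet != first_packet:
            split.append(index)
        offset += length
    return split
```

With three 300-byte frames and a 512-byte payload, the second frame begins in the first payload and ends in the second, which is exactly the situation that later produces a frame fragment at the head of a packet.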

[0009] A PES packet that contains the head of a video frame or audio frame includes, in its header portion (the PES header), a time stamp (Presentation Time Stamp; hereinafter, PTS) indicating the time at which that frame is to be presented. A plurality of PES packets constitute a pack, with a pack header attached to its head. A plurality of packs then constitute the program stream.

[0010] One application of video servers is video on demand (VOD), which supplies video and audio data whenever a client requests it. VOD is required to provide special playback functions such as fast playback, slow playback, and jump playback. Of these, jump playback starts playback from a position arbitrarily specified by the user. In a video server using MPEG, however, the GOP is the unit of random access, so playback of the program stream must start from the head of the GOP that contains the specified playback start position.

[0011] As already described, however, the program stream consists of a plurality of PES packets, and input to the decoder is also performed packet by packet. Each PES packet stores a piece of the video or audio elementary stream divided according to the predetermined payload length. As a result, at the start of playback, the head of the payload of the first video packet to be decoded may contain the remainder of a video frame whose earlier part is stored in the preceding video packet (hereinafter, a video frame fragment), and playing back that fragment can disturb the picture.

[0012] Similarly, at the start of playback, the head of the payload of the first audio packet to be decoded may contain the remainder of an audio frame whose earlier part is stored in the preceding audio packet (an audio frame fragment), and playing back that fragment can produce noise.

[0013] An information recording/reproducing device disclosed in Japanese Patent Laid-Open No. 6-86224 is known as a means of preventing this kind of noise. (That device prevents the noise generated when a still image is played back, but preventing noise during still-image playback is equivalent to preventing noise during jump playback.) In that device, the playback side detects the audio synchronization signal attached to the head of each audio frame. Until an audio synchronization signal is detected, received audio packets are not input to the decoder; input to the decoder begins with the packet following the one in which the audio synchronization signal was detected. As a result, no audio frame fragment is input to the decoder at the start of playback, and noise is prevented.

[0014] FIG. 8 shows an example of a program stream near the playback start position during jump playback. In FIG. 8, ..., VN, V0, V1, ... are video packets, and V0 stores the video data at the head of a GOP. ..., AM, A0, A1, ... are audio packets.

[0015] When playback of the stream in FIG. 8 starts by jumping to the playback start position shown in the figure, the packets from V0 onward are input to the decoder in sequence. As already explained, the first video packet input at the start of playback (V0) may contain a video frame fragment, which disturbs the picture, and the first audio packet input (A0) may contain an audio frame fragment, which produces noise.

[0016] In addition, the first audio packet input (A0) does not necessarily correspond to the video packets at or after the playback start position; it may correspond to, for example, VM or VN. In that case, audio from before the playback start position is played back at the start of playback.

[0017] To prevent this problem, the device of the above publication relies on the fact that, at the start of playback, the first audio packet arrives at least a time ΔT (= T2 - T1, where T1 is the transmission time of the video packet and T2 is its display time) after the first video packet. It therefore begins inputting audio packets to the decoder once the time ΔT has elapsed since the first video packet arrived. As a result, audio from before the playback start position is not played back at the start of playback.

[0018]

Problems to Be Solved by the Invention: In the device of the above publication, however, the playback device (decoding device) detects the audio synchronization signal and controls the input of audio packets to the decoder based on the detection result, so the configuration of the playback device is complicated. Consequently, if the same function were implemented in an MPEG encoding/decoding device in which a decoding device is provided for each client, the device as a whole would become larger and more expensive as the number of clients grows.

[0019] Furthermore, the device of the above publication assumes that received video packets are first buffered, whereas received audio packets are played back in real time without buffering. In an MPEG decoding device, however, audio packets are normally also buffered before playback. Therefore, even if input of audio packets to the decoder begins once the time ΔT has elapsed since the first video packet arrived, there is no guarantee that audio from before the playback start position will not be played back.

[0020] An object of the present invention is therefore to provide an MPEG encoding device that can prevent the noise generated at the start of playback, without complicated processing on the decoding device side, when playback starts from the head of a GOP in the middle of a stream, as in jump playback.

[0021] Another object of the present invention is to provide an MPEG decoding device that can prevent audio from before the playback start position from being played back when playback starts from the head of a GOP in the middle of a stream, as in jump playback.

[0022]

Means for Solving the Problems and Effects of the Invention: The MPEG encoding device of the first invention comprises MPEG encoding means for encoding audio data by the MPEG method; audio frame detecting means for searching the audio stream obtained by the MPEG encoding means and detecting the head of each of the plurality of audio frames constituting that stream; and packetizing means for packetizing the audio stream in accordance with the detection result of the audio frame detecting means. The packetizing means packetizes the audio stream so that none of the audio frames is split across different packets.

[0023] As described above, the first invention searches the audio stream obtained by MPEG encoding and detects the head of each of the audio frames constituting that stream. Based on the detection result, it packetizes the audio stream so that none of the audio frames is split across different packets.

[0024] When playback starts from the head of a GOP in the middle of a stream, as in jump playback, the head of the payload portion of the first audio packet to be decoded may hold the remainder of an audio frame whose earlier part is stored in the preceding audio packet (hereinafter, an audio frame fragment), and playing back that fragment can produce noise. The audio stream is therefore packetized so that no audio frame is split across different packets. The first audio packet to be decoded then never holds an audio frame fragment, so no noise occurs at the start of playback even without complicated processing on the decoding device side.
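A minimal sketch of frame-aligned packetization, assuming every audio frame fits within one payload. The function name is hypothetical, and filling unused payload space with 0xFF bytes is a simplifying assumption made here for illustration only.

```python
from typing import List

PADDING_BYTE = 0xFF  # assumed filler value, chosen for illustration


def packetize_whole_frames(frames: List[bytes], payload_len: int) -> List[bytes]:
    """Pack audio frames into fixed-length payloads without splitting any frame."""
    payloads = []
    current = bytearray()
    for frame in frames:
        if len(frame) > payload_len:
            raise ValueError("frame is longer than one payload and cannot be kept whole")
        if len(current) + len(frame) > payload_len:
            # The next frame does not fit: pad the rest and close this payload.
            current.extend(bytes([PADDING_BYTE]) * (payload_len - len(current)))
            payloads.append(bytes(current))
            current = bytearray()
        current.extend(frame)
    if current:
        # Pad and close the final, partially filled payload.
        current.extend(bytes([PADDING_BYTE]) * (payload_len - len(current)))
        payloads.append(bytes(current))
    return payloads
```

Every payload produced this way starts at a frame boundary, so a decoder that begins at any packet never sees a frame fragment; the cost is the padded space at the end of each payload.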

[0025] The MPEG decoding device of the second invention comprises a buffer for temporarily storing a video stream and an audio stream; MPEG decoding means for decoding the video stream and audio stream stored in the buffer; and control means for controlling the operation of the MPEG decoding means. At the start of playback, the control means compares the display time of the audio frame currently being decoded, among the audio frames constituting the audio stream, with the display time of the video frame to be displayed first, among the video frames constituting the video stream. If the display time of the audio frame is earlier than that of the video frame, the control means instructs the MPEG decoding means not to output the audio data obtained by decoding that audio frame.

[0026] As described above, in the second invention, at the start of playback, the display time of the audio frame that has been read from the buffer and is being decoded is compared with the display time of the video frame, stored in the buffer, that is to be displayed first. If the display time of the audio frame being decoded is earlier than that of the video frame to be displayed first, the audio data obtained by decoding that audio frame is not output.

[0027] When playback starts from the head of a GOP in the middle of a stream, as in jump playback, the first audio frame to be decoded may correspond to a video frame from before the playback start position, so that audio from before the playback start position is played back. Therefore, at the start of playback, the display time of the audio frame currently being decoded is compared with the display time of the video frame, stored in the buffer, that is to be displayed first. If the display time of the audio frame being decoded is earlier than that of the video frame to be displayed first, the audio data obtained by decoding that audio frame is not output. As a result, audio from before the playback start position is not played back at the start of playback.
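The display-time comparison can be sketched as a gate that stays closed until an audio frame reaches the display time of the first video frame. This is only an illustration: the function name is hypothetical, display times are modeled as plain integers, and wraparound of the time stamp counter is ignored.

```python
from typing import List


def select_audio_output(audio_pts: List[int], first_video_pts: int) -> List[bool]:
    """For each decoded audio frame, report whether its audio data is output."""
    decisions = []
    gate_open = False
    for pts in audio_pts:
        # Open the gate at the first audio frame that is not earlier than
        # the video frame to be displayed first; keep it open afterwards.
        if not gate_open and pts >= first_video_pts:
            gate_open = True
        decisions.append(gate_open)
    return decisions
```

Frames with an earlier display time are still decoded in this model; only their output is suppressed, which matches the behavior described for the control means.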

[0028]

BEST MODE FOR CARRYING OUT THE INVENTION

(First Embodiment) A first embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the MPEG encoding/decoding device according to the first embodiment of the present invention. The MPEG encoding/decoding device of FIG. 1 consists of an encoding device 10 and a decoding device 11. The encoding device 10 includes an MPEG encoder 101, a GOP detection unit 102, a video frame detection unit 103, a video packetizing unit 104, an audio frame detection unit 105, an audio packetizing unit 106, and a multiplexing unit 107. The decoding device 11 includes an MPEG decoder 111 and a buffer unit 112. The buffer unit 112 includes a channel buffer 112a and a header buffer 112b.

[0029] The MPEG encoder 101 encodes video data and audio data using MPEG. The GOP detection unit 102 searches the video stream obtained by the MPEG encoder 101 and detects the head of each GOP. The video frame detection unit 103 searches the video stream obtained by the MPEG encoder 101 and detects the head of each video frame. The video packetizing unit 104 packetizes the video stream obtained by the MPEG encoder 101.

[0030] The audio frame detection unit 105 searches the audio stream obtained by the MPEG encoder 101 and detects the head of each audio frame. The audio packetizing unit 106 packetizes the audio stream obtained by the MPEG encoder 101. The multiplexing unit 107 multiplexes the PES packets produced by the video packetizing unit 104 (video packets) with the PES packets produced by the audio packetizing unit 106 (audio packets).

[0031] The MPEG decoder 111 temporarily stores the program stream produced by the multiplexing unit 107 in the buffer unit 112 and then decodes it. In the buffer unit 112, the channel buffer 112a stores the video stream and the audio stream in units of packets, and the header buffer 112b stores the pack headers and PES headers.

[0032] The operation of the MPEG encoding/decoding device of FIG. 1 is described below. In the device of FIG. 1, for example, the encoding device 10 is provided on the video server side and the decoding device 11 on the client side. The output of the encoding device 10 (a program stream as shown in FIG. 7(1)) is assumed to be stored once on a disk or the like (not shown), then read from that disk in response to a request from the client side and sent to the decoding device 11.

[0033] In the encoding device 10, video data and audio data are first input to the MPEG encoder 101. The MPEG encoder 101 encodes the input video data and audio data using MPEG and outputs a video elementary stream and an audio elementary stream as shown in FIGS. 7(2) and 7(3). The GOP detection unit 102 detects, in the output video elementary stream, the group_start_code (0x000001B8) that marks the head of a GOP. It then notifies the video frame detection unit 103 and the video packetizing unit 104 of the byte position of the detected group_start_code.
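The detection of group_start_code can be sketched as a byte search over the video elementary stream. The function name is hypothetical, and the sketch assumes the stream is available as one in-memory bytes object and that start codes are byte-aligned.

```python
from typing import List

GROUP_START_CODE = bytes.fromhex("000001B8")


def find_gop_starts(video_es: bytes) -> List[int]:
    """Return the byte position of every group_start_code in the stream."""
    positions = []
    pos = video_es.find(GROUP_START_CODE)
    while pos != -1:
        positions.append(pos)
        # Resume the search just past the code that was found.
        pos = video_es.find(GROUP_START_CODE, pos + len(GROUP_START_CODE))
    return positions
```

The returned positions play the role of the byte positions that the GOP detection unit passes on to the video frame detection unit and the video packetizing unit.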

[0034] Instead of detecting only the group_start_code, the GOP detection unit 102 may detect either the sequence_header_code (0x000001B3), which marks the head of a sequence header, or the group_start_code. In that case, the byte position of whichever of the sequence header and the GOP header is detected first is notified to the video frame detection unit 103 and the video packetizing unit 104.
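The "whichever is detected first" variant amounts to taking the smaller of two search results. A minimal sketch, with a hypothetical function name and the same in-memory, byte-aligned assumptions as any simple byte search:

```python
from typing import Optional

SEQUENCE_HEADER_CODE = bytes.fromhex("000001B3")
GROUP_START_CODE = bytes.fromhex("000001B8")


def find_random_access_point(video_es: bytes, start: int = 0) -> Optional[int]:
    """Return the position of the first sequence header or GOP header at or after start."""
    hits = [video_es.find(code, start) for code in (SEQUENCE_HEADER_CODE, GROUP_START_CODE)]
    hits = [pos for pos in hits if pos != -1]  # drop codes that were not found
    return min(hits) if hits else None
```

When a sequence header directly precedes a GOP header, this returns the sequence header position, so the packet boundary falls before the sequence header rather than between the two headers.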

[0035] Next, the video frame detection unit 103 starts searching for the picture_start_code (0x00000100) of a picture header from the byte position notified by the GOP detection unit 102, that is, from the head of the GOP. The byte position of the first picture_start_code found gives the byte position of the head of the I frame. The byte position of the second picture_start_code found gives the byte position of the head of the B or P frame immediately following that I frame (hereinafter, a non-I frame). The video frame detection unit 103 notifies the video packetizing unit 104 of the byte positions of the head of the I frame and the head of the non-I frame immediately following it.
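A sketch of the search for the first two picture start codes after the head of a GOP. The function name and the tuple return value are illustrative choices, and the sketch assumes the four-byte pattern appears only as a start code.

```python
from typing import Optional, Tuple

PICTURE_START_CODE = bytes.fromhex("00000100")


def first_two_pictures(video_es: bytes, gop_start: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (I-frame position, following non-I-frame position) after gop_start."""
    i_frame = video_es.find(PICTURE_START_CODE, gop_start)
    if i_frame == -1:
        return None, None
    # The second picture_start_code marks the frame right after the I frame.
    next_frame = video_es.find(PICTURE_START_CODE, i_frame + len(PICTURE_START_CODE))
    return i_frame, (next_frame if next_frame != -1 else None)
```

The second position doubles as the end of the I frame, which is what a packetizer needs in order to know where padding may be required.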

[0036] The video packetizing unit 104 packetizes the video elementary stream with reference to the byte position of the head of the GOP notified by the GOP detection unit 102 and the byte positions of the heads of the I frame and the immediately following non-I frame notified by the video frame detection unit 103. The payload length of the PES packets obtained by packetization is assumed to be a predetermined length for both video packets and audio packets.

[0037] The operation by which the video packetizing unit 104 packetizes the video elementary stream is described with reference to FIG. 2, which illustrates that operation. As shown in FIG. 2, the portion of the video elementary stream from the first data of the sequence header to the last data of the I frame is divided and stored in the payload portions of a plurality of PES packets.

[0038] If, in the PES packet that stores the last data of the I frame, the length of the stored data is less than the payload length, padding is applied to the remaining payload portion in which no data is stored (the portion indicated by (a) in the figure). This ensures that the PES packet storing the last data of the I frame does not also store the first data of the next frame (data of a non-I frame).

[0039] The padding is performed by generating and inserting a padding packet whose length equals that of the remaining payload portion.
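For illustration, a padding packet of a given total length could be built as below. The layout (start code prefix 0x000001, padding stream ID 0xBE, a 16-bit length field, then 0xFF bytes) follows the MPEG-2 systems convention and is an assumption here; the text does not specify the byte layout, and the function name is hypothetical.

```python
PADDING_HEADER_LEN = 6  # 3-byte start code prefix + 1-byte stream ID + 2-byte length


def make_padding_packet(total_len: int) -> bytes:
    """Build a padding packet occupying exactly total_len bytes (assumed MPEG-2 layout)."""
    if not PADDING_HEADER_LEN <= total_len <= PADDING_HEADER_LEN + 0xFFFF:
        raise ValueError("total_len cannot be represented as a single padding packet")
    body_len = total_len - PADDING_HEADER_LEN
    # The length field counts the bytes that follow it.
    return b"\x00\x00\x01\xbe" + body_len.to_bytes(2, "big") + b"\xff" * body_len
```

Under this layout a gap shorter than six bytes cannot be filled by a padding packet at all, so a practical packetizer would need another mechanism for very small gaps; that case is outside what the text describes.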

[0040] Next, the data of the series of non-I frames from the above I frame to the next I frame (in the figure, a B frame, a P frame, ..., a P frame) is divided and stored in the payload portions of a plurality of PES packets. Unlike the case of the I frame, no padding is applied to the PES packet that stores the last data of each non-I frame, even if the length of the stored data is less than the payload length.

[0041] However, if, in the PES packet that stores the last data of the final non-I frame, that is, the last data of the GOP, the length of the stored data is less than the payload length, padding is applied to the remaining portion in which no data is stored (the portion indicated by (b) in the figure). This ensures that the PES packet storing the last data of the GOP does not also store the first data of the next GOP (data of an I frame).
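The video packetization rule (pad after the I frame and after the end of the GOP, but not between non-I frames) can be sketched as follows. The function name is hypothetical, the I-frame argument is assumed to include the headers that precede it, and padding is modeled as 0xFF filler bytes inside the payload rather than as a separate padding packet, purely to keep the sketch short.

```python
from typing import List

PADDING_BYTE = 0xFF  # assumed filler value, chosen for illustration


def packetize_gop(i_frame: bytes, non_i_frames: List[bytes], payload_len: int) -> List[bytes]:
    """Split one GOP into fixed-length payloads, padding after the I frame and at the GOP end."""

    def split_and_pad(data: bytes) -> List[bytes]:
        chunks = [data[i:i + payload_len] for i in range(0, len(data), payload_len)]
        if chunks and len(chunks[-1]) < payload_len:
            # Pad only the last payload of this run.
            chunks[-1] += bytes([PADDING_BYTE]) * (payload_len - len(chunks[-1]))
        return chunks

    # Non-I frames are concatenated first, so they may share payloads with each other.
    return split_and_pad(i_frame) + split_and_pad(b"".join(non_i_frames))
```

For a 700-byte I frame followed by two 300-byte non-I frames and a 512-byte payload, this yields four payloads: two for the I frame (the second padded) and two for the non-I frames (the B and P data share the third payload, and the fourth is padded).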

[0042] Because the PES packet storing the last data of an I frame is thus kept free of the first data of the next frame (a non-I frame), the first data of a non-I frame (a video frame fragment) is never decoded when only I frames are to be played back selectively, as in fast playback. The picture is therefore not disturbed during fast playback.

[0043] Likewise, because the PES packet storing the last data of a GOP is kept free of the first data of the next GOP (data of an I frame), the last data of the immediately preceding GOP (a video frame fragment) is never decoded when playback starts from an arbitrary GOP in the middle of the stream, as in jump playback. The picture is therefore not disturbed at the start of playback.

[0044] Meanwhile, the audio frame detection unit 105 searches the audio elementary stream and detects the synchronization bits (0xFFF) contained in each header. It then notifies the audio packetizing unit 106 of the byte position of each detected synchronization pattern. The audio packetizing unit 106 packetizes the audio elementary stream with reference to the notified byte positions.
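As a rough illustration of this detection step, the following Python sketch scans a byte string for the 12-bit pattern 0xFFF on byte boundaries and returns the byte positions found. The function name `find_sync_positions` is hypothetical, and byte alignment of the headers is assumed. The scan is deliberately naive: coded audio data can contain the same bit pattern by chance, so a practical detector would probably also validate the header fields or the expected frame length before accepting a position.

```python
def find_sync_positions(stream: bytes):
    """Return byte offsets at which the 12-bit pattern 0xFFF begins."""
    positions = []
    for i in range(len(stream) - 1):
        # 0xFFF = one full 0xFF byte followed by a high nibble of 0xF.
        if stream[i] == 0xFF and (stream[i + 1] & 0xF0) == 0xF0:
            positions.append(i)
    return positions
```

The returned offsets play the role of the byte positions that the detection unit reports to the packetizing unit.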

[0045] The operation by which the audio packetizing unit 106 packetizes the audio elementary stream will now be described with reference to FIG. 3. FIG. 3 is a diagram for explaining the operation by which the audio packetizing unit 106 of FIG. 1 packetizes the audio elementary stream. As shown in FIG. 3, the audio packetizing unit 106 first stores the data of the first audio frame in the payload of the first PES packet. It then compares the remaining payload length, in which no data is stored, with the data length of the second audio frame.

[0046] Because the comparison shows that the remaining payload length is longer than the data length of the second audio frame, that audio frame is stored in the remaining payload of the first PES packet. The remaining payload length, in which no data is stored, is then compared with the data length of the third audio frame.

[0047] Because the comparison shows that the remaining payload length is shorter than the data length of the third audio frame, that audio frame is not stored in the remaining payload of the first PES packet but in the payload of the second PES packet. The remaining payload of the first PES packet (the part indicated by (c) in the figure) is padded in the same manner as already described.

[0048] Thereafter, in the same way, each time an audio frame is stored in a PES packet, the remaining payload length of that packet is compared with the data length of the next audio frame. If the comparison shows that the remaining payload length is shorter than the data length of the next audio frame, so that the remaining payload cannot hold all the data of the next frame, the remaining payload of that packet is padded. Repeating this operation ensures that the data of a single audio frame is never divided and stored across multiple PES packets.
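The compare-then-pad loop of paragraphs [0045] to [0048] can be sketched in Python as follows. This is an illustration under stated assumptions, not the circuit of the audio packetizing unit 106: the function name `packetize_audio`, the padding byte 0xFF, the padding of the final packet, and the choice to store a frame that exactly fills the remaining payload are all assumptions, since the text only covers the "longer" and "shorter" cases.

```python
def packetize_audio(frames, payload_len, pad_byte=0xFF):
    """Pack whole audio frames into fixed-length payloads without splitting."""
    packets = []
    current = bytearray()
    for frame in frames:
        if len(frame) > payload_len:
            raise ValueError("audio frame is longer than one payload")
        remaining = payload_len - len(current)
        if len(frame) > remaining:
            # The next frame does not fit: pad the rest and open a new packet.
            current += bytes([pad_byte]) * remaining
            packets.append(bytes(current))
            current = bytearray()
        current += frame
    if current:
        # Assumed: the last, partly filled packet is padded as well.
        current += bytes([pad_byte]) * (payload_len - len(current))
        packets.append(bytes(current))
    return packets
```

With three 4-byte frames and a 10-byte payload, the first two frames share the first packet, two bytes are padded, and the third frame starts the second packet, which mirrors the sequence described for FIG. 3.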

[0049] Next, the multiplexing unit 107 multiplexes the PES packets obtained by the video packetizing unit 104 (video packets) with the PES packets obtained by the audio packetizing unit 106 (audio packets), adds a pack header to the head of the resulting program stream, and outputs it. The output program stream is temporarily stored on a disk or the like (not shown).

[0050] Next, the operation of the decoding device 11 during jump reproduction will be described. In response to an instruction from the client, the program stream stored on the disk or the like as described above is read out from the head of a GOP in the middle of the stream onward and sent to the decoding device 11. In the decoding device 11, the MPEG decoder 111 extracts the pack header and the PES headers from the received program stream and stores them in the header buffer 112b. It also extracts the video elementary stream and the audio elementary stream and stores them in the channel buffer 112a.

[0051] When the amount of data stored in the channel buffer 112a exceeds a preset threshold, the MPEG decoder 111 starts reading the video stream and the audio stream from the channel buffer 112a packet by packet and decoding them. The decoded video data and audio data are converted into an analog video signal and an analog audio signal, respectively, by D/A converters (not shown). Display of the picture then starts on a screen (not shown), and output of the sound starts from a speaker (not shown).

[0052] Because the encoding device 10 has packetized the video stream so that the head data of the next GOP is never stored in the PES packet that stores the data at the end of a GOP, the first video packet to be decoded contains no fragment of a video frame preceding the reproduction start position. The picture is therefore not disturbed at the start of reproduction.

[0053] Similarly, because the encoding device 10 has packetized the audio stream so that the data of a single audio frame is never divided and stored across multiple PES packets, the first audio packet to be decoded contains no fragment of an audio frame preceding the reproduction start position. No noise is therefore generated at the start of reproduction.

[0054] As described above, according to this embodiment, when reproduction is started from the head of a GOP in the middle of the stream, such as in jump reproduction, picture disturbance and noise at the start of reproduction can be prevented without the decoding device 11 performing any complicated processing.

[0055] (Second Embodiment) A second embodiment of the present invention will now be described with reference to the drawings. FIG. 4 is a block diagram showing the configuration of an MPEG encoding/decoding device according to the second embodiment of the present invention. The encoding/decoding device of FIG. 4 comprises an encoding device 10 and a decoding device 11a. The encoding device 10 of FIG. 4 has the same configuration as the encoding device 10 of FIG. 1. The decoding device 11a of FIG. 4 is the decoding device 11 of FIG. 1 with an output control unit 113 added. The output control unit 113 controls the output operation of the MPEG decoder 111.

[0056] The operation of the MPEG encoding/decoding device of FIG. 4 will now be described. The operation of the encoding device 10 is the same as in FIG. 1, so its description is omitted. FIG. 5 is a diagram for explaining the operation by which the MPEG decoder 111 of FIG. 4 stores headers in the header buffer 112b and stores the video elementary stream and the audio elementary stream in the channel buffer 112a. During jump reproduction, in the decoding device 11a, the MPEG decoder 111 first extracts the headers from the incoming program stream and stores them in the header buffer 112b, and extracts the video elementary stream and the audio elementary stream and stores them in the channel buffer 112a. This operation is described below with reference to FIG. 5.

[0057] When the client instructs that reproduction start from a certain GOP in the middle of the stream, a program stream such as that shown in FIG. 7(1) is read from the disk (not shown) and input to the decoding device 11a. In the decoding device 11a, the MPEG decoder 111 extracts the pack header (not shown) from the input stream and stores it in the channel buffer 112a.

[0058] Next, the MPEG decoder 111 extracts the PES headers (header (1), header (2), ...) from the PES packets constituting the stream (packet (1), packet (2), ...) and stores them in the header buffer 112b. It also extracts the video data/audio data (data (1), data (2), ...) and stores them in the channel buffer 112a.

[0059] In doing so, the MPEG decoder 111 writes the value of the write pointer immediately after the last byte of each PES header. Each written write pointer value (write pointer (1), write pointer (2), ...) indicates the address in the channel buffer 112a at which the first byte of the corresponding video data/audio data (data (1), data (2), ...) is written.

[0060] Next, the MPEG decoder 111 sequentially reads the PES headers from the header buffer 112b and analyzes their contents. If the analysis finds a PTS in a PES header (as described earlier, a PES packet containing the head data of a video frame/audio frame carries a PTS in its PES header), the PTS value (PTS (1), PTS (2), ...) and the write pointer value (write pointer (1), write pointer (2), ...) are paired and accumulated in a PTS table such as that shown in FIG. 6.

[0061] FIG. 6 is a diagram showing an example of the PTS table created by the MPEG decoder 111 of FIG. 4. The MPEG decoder 111 creates one PTS table such as that shown in FIG. 6 for video frames and one for audio frames. Referring to these tables reveals the relationship between the storage position and the display time of each video frame/audio frame.
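The bookkeeping of paragraphs [0058] to [0061] can be pictured with the following Python sketch. It is a simplification under stated assumptions: the `PesPacket` type and the `demux` function are hypothetical, the header buffer 112b is not modelled separately (the PTS is read directly from each packet), and the channel buffer is treated as one growing byte array whose current length serves as the write pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PesPacket:           # hypothetical, already-parsed PES packet
    stream: str            # "video" or "audio"
    pts: Optional[int]     # set only when the packet holds the head of a frame
    payload: bytes


def demux(packets):
    """Store payloads in a channel buffer and build one PTS table per stream."""
    channel_buffer = bytearray()
    pts_tables = {"video": [], "audio": []}
    for packet in packets:
        # Address at which the first payload byte of this packet is written.
        write_pointer = len(channel_buffer)
        channel_buffer += packet.payload
        if packet.pts is not None:
            pts_tables[packet.stream].append((packet.pts, write_pointer))
    return channel_buffer, pts_tables
```

Each table entry pairs a display time with a storage position, which is the relationship the PTS tables are said to provide.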

[0062] When the amount of data accumulated in the channel buffer 112a exceeds a preset threshold, the MPEG decoder 111 starts reading the video stream and the audio stream from the channel buffer 112a and decoding them. At the start of reproduction, however, the decoder is set to output the decoded video data but not the decoded audio data; it outputs invalid data instead (mute processing).

[0063] Meanwhile, the output control unit 113 monitors the decoding operation of the MPEG decoder 111. While the MPEG decoder 111 is reading audio data from the channel buffer 112a and decoding it, the output control unit 113 obtains the write pointer value of that audio data from the MPEG decoder 111 and obtains the PTS value corresponding to that write pointer value from the audio-frame PTS table managed by the MPEG decoder 111. The PTS value thus obtained is the display time of the audio frame that the MPEG decoder 111 is about to decode.
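One way to picture this lookup is the Python sketch below. The function name `pts_for_pointer` and the table layout (a list of `(pts, write_pointer)` pairs sorted by write pointer) are assumptions for illustration. The text only says that the PTS corresponding to the write pointer value is obtained; the sketch additionally assumes that a position inside a frame should map to the most recent entry at or before it.

```python
from bisect import bisect_right


def pts_for_pointer(pts_table, pointer):
    """Return the PTS of the last table entry whose write pointer <= pointer."""
    pointers = [wp for _, wp in pts_table]
    index = bisect_right(pointers, pointer) - 1
    if index < 0:
        return None  # the position precedes every recorded frame head
    return pts_table[index][0]
```

A return value of `None` signals that no frame head has been recorded at or before the given position.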

[0064] Next, the output control unit 113 refers to the video-frame PTS table managed by the MPEG decoder 111 and compares the obtained PTS value (the display time of the audio frame currently being decoded) with the smallest PTS value stored in the video-frame PTS table (the display time of the video frame to be decoded first).

[0065] If the comparison shows that the obtained PTS value is smaller than the smallest PTS value stored in the video-frame PTS table, the audio frame currently being decoded precedes the reproduction start position. If the obtained PTS value is larger, the audio frame currently being decoded follows the reproduction start position.

[0066] Accordingly, at the moment the obtained PTS value becomes larger than the smallest PTS value stored in the video-frame PTS table, the output control unit 113 instructs the MPEG decoder 111 to stop outputting invalid data and to start outputting the decoded audio data.
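The mute control of paragraphs [0062] to [0066] reduces to a single comparison per audio frame, as the following Python sketch suggests. The function name `audio_output_flags` and the table layout are hypothetical. The strict "larger than" test follows the wording of the text, which does not say what happens when the two PTS values are equal, and PTS wrap-around is ignored.

```python
def audio_output_flags(audio_pts_in_decode_order, video_pts_table):
    """Return, per audio frame, whether decoded audio is output (True) or muted."""
    # Smallest PTS in the video table: display time of the first video frame.
    first_video_pts = min(pts for pts, _ in video_pts_table)
    muted = True  # reproduction starts with invalid (mute) data on the audio output
    flags = []
    for pts in audio_pts_in_decode_order:
        if muted and pts > first_video_pts:
            muted = False  # switch from invalid data to decoded audio
        flags.append(not muted)
    return flags
```

Once the output has been unmuted it stays unmuted, which matches the description of a one-time switch at the start of reproduction.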

[0067] The decoded video data and audio data are converted into an analog video signal and an analog audio signal, respectively, by D/A converters (not shown). Display of the picture then starts on a screen (not shown), and output of the sound starts from a speaker (not shown). Because the output control unit 113 controls the output as described above, sound preceding the reproduction start position is not reproduced at the start of reproduction.

[0068] As described above, according to this embodiment, when reproduction is started from the head of a GOP in the middle of the stream, such as in jump reproduction, picture disturbance and noise at the start of reproduction can be prevented without the decoding device 11a performing any complicated processing, as in the first embodiment. In addition, sound preceding the reproduction start position is not reproduced at the start of reproduction.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an MPEG encoding/decoding device according to a first embodiment of the present invention.

FIG. 2 is a diagram for explaining the operation by which the video packetizing unit 104 of FIG. 1 packetizes a video elementary stream.

FIG. 3 is a diagram for explaining the operation by which the audio packetizing unit 106 of FIG. 1 packetizes an audio elementary stream.

FIG. 4 is a block diagram showing the configuration of an MPEG encoding/decoding device according to a second embodiment of the present invention.

FIG. 5 is a diagram for explaining the operation by which the MPEG decoder 111 of FIG. 4 stores headers in the header buffer 112b and stores the video elementary stream and the audio elementary stream in the channel buffer 112a.

FIG. 6 is a diagram showing an example of the PTS table created by the MPEG decoder 111 of FIG. 4.

FIG. 7 is a diagram showing the structure of a stream used in MPEG.

FIG. 8 is a diagram showing an example of a program stream near the reproduction start position during jump reproduction.

[Explanation of symbols]

10 ... Encoding device
11, 11a ... Decoding device
101 ... MPEG encoder
102 ... GOP detection unit
103 ... Video frame detection unit
104 ... Video packetizing unit
105 ... Audio frame detection unit
106 ... Audio packetizing unit
107 ... Multiplexing unit
111 ... MPEG decoder
112 ... Buffer unit
112a ... Channel buffer
112b ... Header buffer
113 ... Output control unit

Claims

[Claims]

1. An MPEG encoding device comprising: MPEG encoding means for encoding audio data by the MPEG method; audio frame detection means for searching the audio stream obtained by the encoding of the MPEG encoding means and detecting the head of each of a plurality of audio frames constituting the stream; and packetizing means for packetizing the audio stream in accordance with the detection result of the audio frame detection means, wherein the packetizing means packetizes the audio stream so that none of the plurality of audio frames is divided and stored in mutually different packets.

2. An MPEG decoding device comprising: a buffer for temporarily storing a video stream and an audio stream; MPEG decoding means for decoding the video stream and the audio stream stored in the buffer; and control means for controlling the operation of the MPEG decoding means, wherein, at the start of reproduction, the control means compares the display time of the audio frame currently being decoded, among a plurality of audio frames constituting the audio stream, with the display time of the video frame to be displayed first, among a plurality of video frames constituting the video stream, and, if the comparison shows that the display time of the audio frame is earlier than that of the video frame, instructs the MPEG decoding means not to output the audio data obtained by decoding that audio frame.