JPH02246432A - Video audio multiplex system

Info

Publication number: JPH02246432A
Application number: JP1066784A
Authority: JP
Inventors: Riyouichi Danki; 亮一段木
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1989-03-18
Filing date: 1989-03-18
Publication date: 1990-10-02

Abstract

PURPOSE: To make the reproduced video output coincide with the reproduced audio output at the receiver side by obtaining in advance the mean processing delay of the video signal relative to the processing time of the audio signal, and applying a fixed delay based on it. CONSTITUTION: The mean value of the processing delay obtained by subtracting the coding and decoding delay of the audio signal from the coding and decoding delay of the video signal is measured in advance. When the multiplex section 7 multiplexes the video and audio signals, the audio frames coded by the audio coding section 2 are multi-framed, so that the coded audio is delayed by the mean delay of the video reproduction output relative to the audio reproduction output; the coded video information is then multiplexed and sent to the transmission line. Synchronization between the video signal and the audio signal is thus ensured without a special delay circuit, and natural reproduction is attained.

Description

[Detailed Description of the Invention] [Summary] The invention relates to a system that encodes, multiplexes, and transmits audio information and video information. Its object is to provide a video/audio multiplexing system in which the reproduced video output and the reproduced audio output can be made to coincide on the receiving side. The system comprises an A/D converter that converts audio input into digital audio; an audio encoding section that encodes the digital audio; an A/D converter that converts video input into digital video; a video encoding section that encodes the digital video; and an audio delay/multiplexing section that multi-frames the encoded audio information, delays the encoded audio by the average delay time of the video reproduction output relative to the audio reproduction output, and then multiplexes the encoded video information and transmits the result.

[Industrial application field]

The present invention relates to a video/audio multiplexing system, and particularly to a system that encodes, multiplexes, and transmits audio information and video information.

In recent years, communications that transmit both audio and video information, such as video conferencing and videotelephony, have become widespread. In such multiplexed communication systems, it is important both to encode and decode the two efficiently and to keep the audio and video information balanced.

[Conventional technology and issues]

The video encoding section used in such a video/audio multiplexing system compresses the redundancy of the video information with techniques such as interframe predictive coding, intraframe predictive coding, motion-compensated predictive coding, and variable-length coding. Because a video signal carries far more information than an audio signal, however, its processing is delayed relative to the encoding/decoding processing time of the audio signal.

As a result, synchronization between video and audio (lip sync) could not be ensured on the receiving side, and reproduction was unnatural, with the video and audio out of step.

Accordingly, an object of the present invention is to provide a video/audio multiplexing system in which the reproduced video output and the reproduced audio output can be made to coincide on the receiving side.

[Means to solve the problem]

The present invention performs fixed delay control by determining in advance the average processing delay of the video signal relative to the processing time of the audio signal.

That is, as shown in principle in FIG. 1, the video/audio multiplexing system according to the present invention comprises: an A/D converter 1 that converts audio input into digital audio; an audio encoding section 2 that encodes the digital audio; an A/D converter 3 that converts video input into digital video; a video encoding section 4 that encodes the digital video; and an audio delay/multiplexing section 7 that multi-frames the encoded audio information, delays the encoded audio by the average delay time of the video reproduction output relative to the audio reproduction output, and then multiplexes the encoded video information and transmits the result.

[Operation]

The processing delay Td = Tv - Ta, obtained by subtracting the delay Ta required for encoding and decoding the audio signal in the audio encoding and decoding sections from the delay Tv required for encoding and decoding the video signal in the video encoding and decoding sections, is what puts the video reproduction output and the audio reproduction output out of synchronization. Therefore, if the average value of this delay Td is measured in advance, the two can be synchronized on average.
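
The averaging step can be illustrated with a minimal sketch. The function name `mean_relative_delay` and the sample values are hypothetical and are not taken from the patent; the sketch only assumes that Tv and Ta have been measured in pairs, in milliseconds.

```python
from statistics import mean


def mean_relative_delay(video_delays_ms, audio_delays_ms):
    """Return the mean of Td = Tv - Ta over paired measurements (milliseconds)."""
    if not video_delays_ms or len(video_delays_ms) != len(audio_delays_ms):
        raise ValueError("need equally many, non-empty Tv and Ta measurements")
    return mean(tv - ta for tv, ta in zip(video_delays_ms, audio_delays_ms))


# Hypothetical measurements for illustration only.
tv = [170.0, 155.0, 165.0]   # video encode + decode delay Tv per trial
ta = [5.0, 5.0, 5.0]         # audio encode + decode delay Ta per trial
td_mean = mean_relative_delay(tv, ta)
```

The resulting mean would then be used as the fixed amount by which the coded audio is held back.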

For this reason, in the present invention, as shown in FIG. 1, the audio frames encoded by the audio encoding section 2 are multi-framed when the multiplexing section 7 multiplexes the video and audio.

As shown in FIG. 2, this multi-framing accumulates the audio frames that the audio encoding section 2 outputs in step with the output timing clock CLK, in successive groups of consecutive frames, which delays the audio information. The video information is multiplexed and sent to the transmission path only after the delay time produced by this accumulation of audio frames has elapsed, so that the video reproduction output and the audio reproduction output can be synchronized.
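
Delay by accumulation can be sketched as a simple first-in, first-out holding buffer. The class name `AudioFrameAccumulator` and its `depth` parameter are assumptions made for illustration; the patent describes the behavior only at the level of FIG. 2.

```python
from collections import deque


class AudioFrameAccumulator:
    """Delay coded audio frames by holding `depth` of them before release."""

    def __init__(self, depth):
        if depth < 0:
            raise ValueError("depth must be non-negative")
        self.depth = depth
        self._held = deque()

    def push(self, frame):
        """Store one coded audio frame; return the frame now due, or None."""
        self._held.append(frame)
        if len(self._held) > self.depth:
            return self._held.popleft()
        return None


# Hold three frames: the first frame is released when the fourth arrives.
acc = AudioFrameAccumulator(depth=3)
released = [acc.push(n) for n in range(1, 6)]
```

With a frame period of P, a depth of N frames corresponds to a delay of N × P, which is how accumulation alone can stand in for a separate delay circuit.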

[Embodiment]

FIG. 3 shows an embodiment of the transmitting side and the receiving side of the video/audio multiplexing system of the present invention shown in FIG. 1. In this embodiment, the audio encoding section 2 consists of a plurality of adaptive encoding sections 2_1 to 2_n (for example, ADPCM), each using a different number of coding bits for the digital audio; an evaluation section 5 that determines the coding bit rate from the reproduced output of each adaptive encoding section 2_1 to 2_n and the digital audio on the basis of a noise evaluation method, and generates coding bit rate information; and a selection section 6 that selects the output audio information of the adaptive encoding section corresponding to that coding bit rate information.
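
The role of the evaluation section 5 and selection section 6 can be sketched as follows. This is an illustration under strong assumptions: a uniform quantizer stands in for each ADPCM coder, and a signal-to-noise threshold stands in for the noise evaluation method, which the patent does not specify. The names `quantize`, `snr_db`, and `select_bits` are hypothetical.

```python
import math


def quantize(samples, bits):
    """Uniform quantizer over [-1, 1]: a stand-in for one adaptive coder."""
    levels = 2 ** bits
    step = 2.0 / levels
    decoded = []
    for x in samples:
        idx = min(levels - 1, max(0, int((x + 1.0) / step)))
        decoded.append(-1.0 + (idx + 0.5) * step)
    return decoded


def snr_db(original, decoded):
    """Signal-to-noise ratio of the decoded output against the input, in dB."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if signal == 0:
        return -math.inf
    if noise == 0:
        return math.inf
    return 10.0 * math.log10(signal / noise)


def select_bits(samples, candidate_bits, min_snr_db):
    """Return the smallest bit count meeting the SNR target, else the largest."""
    for bits in sorted(candidate_bits):
        if snr_db(samples, quantize(samples, bits)) >= min_snr_db:
            return bits
    return max(candidate_bits)


# Test tone for illustration: two periods of a sine wave.
samples = [math.sin(2 * math.pi * k / 32) for k in range(64)]
chosen = select_bits(samples, [2, 3, 4, 5], min_snr_db=0.0)
```

Choosing the lowest rate that still meets a quality target is one plausible selection rule; other rules would fit the same structure.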

The receiving side consists of a separation section 11 that separates the multiplexed signal from the transmission path 8 into video information and other information; a separation section 12 that further separates the non-video information into the coding bit rate information and audio information; an audio decoding section 13 that decodes the audio information according to the coding bit rate information; a D/A converter 14 that converts the decoded digital audio into an audio signal; a video decoding section 15 that decodes the video information; and a D/A converter 16 that converts the decoded digital video into video output.

The frame (packet) format of the audio information output from the adaptive encoding sections 2_1 to 2_n is shown in FIG. 4. Each frame consists of a frame header a followed by a series of audio data. The audio data comprises a number of bits equal to the number of coding bits of the adaptive encoding section that received the best noise evaluation multiplied by the sampling rate n (n = 8 kHz), and is sent through the selection section 6 to the audio delay/multiplexing section 7.

FIG. 5 shows an embodiment of the audio delay/multiplexing section 7. In this embodiment it consists of a CPU 21; a CPU bus 22; a ROM 23 storing the procedure program for sending data to the transmission path in the frame format shown in FIG. 6; interfaces (I/F) 24 to 26 that bring the video information, audio information, and coding bit rate information, respectively, onto the bus 22; a RAM 27 whose address space is used to temporarily store the video information, audio information, and coding bit rate information from the interfaces 24 to 26; and an interface 28 for sending each piece of information to the transmission path.

The transmission frame format shown in FIG. 6 allocates bits according to CCITT draft Recommendation Y.221. Each multiframe consists of 16 transmission frames (1 multiframe = 16 transmission frames), and one transmission frame consists of 8 bits horizontally by 80 bits vertically, 640 bits in total. One transmission frame is made up of audio information, video information, and, as control information (AC (Application Channel) information), FAS (Frame Alignment Signal) information and BAS (Bit Allocation Signal) information.

As described above, one multiframe consists of 16 transmission frames. As shown in FIGS. 6(a) and 6(b), however, the data is organized and sent in pairs of transmission frames, in the order indicated in the figure, for example F1-1 to F1-16 paired with F2-1 to F2-16. The FAS information and BAS information are separate for transmission frames F1 and F2.

The FAS information is framing information. It is used to establish synchronization through two procedures: (1) Y.221 frame synchronization and (2) multiframe synchronization.

That is, (1) makes it possible to delimit individual frames, and (2) makes it possible to identify each frame. All frames need to be identified in order to recognize the unit in which changes in the BAS (Bit Allocation Signal) information, described below, take effect.

The BAS information carries the coding information for the amounts of audio and video information (for example, the ratio between them) as set on the transmitting side. On receiving the coding bit rate information from the evaluation section 5 in FIG. 3, the transmitting side places this coding bit rate in the BAS information and transmits it; after frame synchronization is established, the receiving side uses it to separate the data (see FIG. 8). The BAS information is evaluated once per sub-multiframe (1 multiframe = 2 sub-multiframes), and the next coding bit rate is detected from the preceding BAS information by majority logic (at least 5 of 8 frames) (see FIG. 9). If there is only one encoding section and its bit rate is constant, the BAS information may be stored in the multiplexing section 7 in advance.
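
The majority decision over one sub-multiframe can be sketched as below. The function name `majority_bas` and the integer codes are hypothetical; the only detail taken from the text is the threshold of at least 5 agreeing frames out of 8.

```python
from collections import Counter


def majority_bas(bas_values, threshold=5):
    """Return the BAS code seen in at least `threshold` frames, else None."""
    if not bas_values:
        return None
    code, count = Counter(bas_values).most_common(1)[0]
    return code if count >= threshold else None


# Eight received BAS codes for one sub-multiframe, two of them corrupted.
accepted = majority_bas([3, 3, 3, 7, 3, 3, 1, 3])
# A four-four split reaches no majority.
undecided = majority_bas([1, 1, 1, 1, 2, 2, 2, 2])
```

What a receiver does when no code reaches the threshold is not stated in the source; keeping the previously detected bit rate would be one reasonable assumption.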

Next, the operation of delaying the audio by multi-framing in this embodiment will be described.

With the transmission frame structure described above (FIG. 6), the multiplexing section 7 determines the bit allocation according to the coding bit rate that the CPU 21 has stored as BAS information. As described above, each frame excluding its frame header consists of 8 × n bits, so frames can be allocated optimally as shown in FIG. 7.

That is, for each vertical 8-bit unit (for example, the units labeled (1) to (8) in FIG. 7), audio data is filled rightward across bits 1 to 7 on the horizontal axis; filling then moves to the next 8-bit group below, and audio data is packed in sequence in a form corresponding to the selected coding bit rate.

Because the audio frame header is carried in the BAS information, the coding rate must stay the same for a fixed number of frames, and that unit is the "1 multiframe (= 16 frames)" described above. Consequently, when adaptive coding is performed, the audio coding rate can change once per multi-transmission frame.

The audio information and video information are then stored in the RAM 27, and when a transmission frame is complete the data is sent to the transmission path 8 at a rate matching the transmission speed.

As described above, the transmission frame shown in FIG. 7 is 8 bits wide and 80 bits tall, so one transmission frame contains 640 bits. If the transmission speed is 64 kb/s, 100 transmission frames can be sent per second, and each transmission frame takes 10 msec. Since one multiframe is defined as 16 transmission frames, the transmission time of one multi-transmission frame is 160 msec.
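
The timing figures follow directly from the frame dimensions and the line rate. The short calculation below reproduces them; the constant names are chosen here for illustration.

```python
BITS_PER_ROW = 8             # horizontal width of a transmission frame
ROWS_PER_FRAME = 80          # vertical height of a transmission frame
FRAMES_PER_MULTIFRAME = 16   # transmission frames in one multiframe
LINE_RATE_BPS = 64_000       # 64 kb/s transmission speed

bits_per_frame = BITS_PER_ROW * ROWS_PER_FRAME                 # 640 bits
frames_per_second = LINE_RATE_BPS // bits_per_frame            # 100 frames/s
frame_time_ms = 1000 * bits_per_frame / LINE_RATE_BPS          # 10 ms
multiframe_time_ms = frame_time_ms * FRAMES_PER_MULTIFRAME     # 160 ms
```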

This is very close to the measured average value of the audio-to-video delay time Td described above.

Therefore, 16 transmission frames, each consisting of 10 multi-audio frames, are prepared, and audio information amounting to 160 multi-audio frames is accumulated in the RAM 27 before the video information is written from the buffer 43. If the transmission frame is treated as complete at that point and sent to the transmission path 8, the audio information is delayed by the desired amount relative to the video information, and video/audio reproduction synchronization (lip sync) can be ensured without providing a separate delay section for the audio.

Furthermore, if the first data position is always placed starting from a fixed frame number, the operation of the separation section is simplified and its processing time is shortened.

The separation section 11 on the receiving side has the same configuration as the multiplexing section 7 shown in FIG. 5, with the arrows reversed.

That is, for the data input from the transmission path, the FAS information is analyzed as described above to establish (1) Y.221 frame synchronization and (2) multiframe synchronization. The separation section 11 then separates the data on the basis of the BAS information and passes the separated signals through their respective interfaces: information other than video information goes to the separation section 12, and video information goes to the video decoding section 15. The separation section 12 further separates its input into audio information (to which a frame header is added for each frame on the basis of the BAS information) and the coding bit rate information contained in the BAS information.

The audio decoding section 13 then receives the coding bit rate information from the separation section 12 and decodes the audio information according to that bit rate.

As a result, the audio reproduction output of the audio decoding section 13 and the video reproduction output of the video decoding section 15 are produced at substantially the same time, and synchronization is ensured.

In the embodiment above, the audio encoding section 2 uses a plurality of adaptive encoding sections, but it goes without saying that exactly the same operation is obtained with audio frames from a single encoding section whose bit rate is fixed in advance.

[Effect of the invention]

As described above, the video/audio multiplexing system according to the present invention multi-frames the encoded audio frames and multiplexes the video information only after the delay time needed to make the audio reproduction output and the video reproduction output coincide has elapsed. Synchronization between video and audio (lip sync) can therefore be ensured without a special delay circuit, natural reproduction is obtained, and the efficiency of the transmission path can be used to the fullest.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the principle of the video/audio multiplexing system according to the present invention.
FIG. 2 is a time chart for explaining the audio multi-frame delay operation according to the present invention.
FIG. 3 is a block diagram showing an embodiment of the system of the present invention.
FIG. 4 is a diagram showing the audio frame used in the present invention.
FIG. 5 is a block diagram showing a configuration example of the multiplexing/separation section used in the present invention.
FIG. 6 is a diagram showing the structure of the transmission frame used in the present invention.
FIG. 7 is a transmission frame diagram showing the multi-packetization of audio according to an embodiment of the present invention.
FIG. 8 is a diagram showing an example of the video/audio allocation codes of the BAS information used in the present invention.
FIG. 9 is a diagram explaining video/audio separation based on the BAS information.

In FIG. 1: 1, 3 ... A/D converters; 2 ... audio encoding section; 4 ... video encoding section; 7 ... audio delay/multiplexing section. In the figures, the same reference numerals denote the same or corresponding parts.

Claims

(1) A video/audio multiplexing system comprising: an A/D converter (1) that converts audio input into digital audio; an audio encoding section (2) that encodes the digital audio; an A/D converter (3) that converts video input into digital video; a video encoding section (4) that encodes the digital video; and an audio delay/multiplexing section (7) that multi-frames the encoded audio information, delays the encoded audio by the average delay time of the video reproduction output relative to the audio reproduction output, and then multiplexes the encoded video information and transmits the result.

(2) The video/audio multiplexing system according to claim 1, wherein the period of multi-frame conversion of the audio information corresponds to the period of one multi-transmission frame as a unit of information processing.

(3) The video/audio multiplexing system according to claim 1, wherein the audio encoding section (2) comprises: a plurality of adaptive encoding sections (2_1 to 2_n), each using a different number of coding bits for the digital audio; an evaluation section (5) that determines a coding bit rate from the reproduced output of each adaptive encoding section (2_1 to 2_n) and the digital audio on the basis of a noise evaluation method, and generates coding bit rate information; and a selection section (6) that selects the output audio information of the adaptive encoding section corresponding to the coding bit rate information.

(4) The video/audio multiplexing system according to claim 2, further comprising: a separation section (11) that separates the multiplexed signal from the transmission path into the video information and other information; a separation section (12) that further separates the information other than the video information into the coding bit rate information and the audio information; an audio decoding section (13) that decodes the audio information according to the coding bit rate information; a D/A converter (14) that converts the decoded digital audio into an audio signal; a video decoding section (15) that decodes the video information; and a D/A converter (16) that converts the decoded digital video output into video output.