JP2523995B2

JP2523995B2 - Video / audio multiplex transmission system

Info

Publication number: JP2523995B2
Application number: JP2504687A
Authority: JP
Inventors: 亮一段木; 武彦藤山; 敏彰臼井; 考志川畑
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1989-03-16
Filing date: 1990-03-16
Publication date: 1996-08-14
Anticipated expiration: 2011-08-14

Description

[Technical field]

The present invention relates to a video/audio multiplex transmission system, and more particularly to a system that encodes audio information and video information and multiplexes them for transmission.

In recent years, communication that carries both audio and video information, such as video conferencing and videophone service, has become widespread. In such multiplexed communication it is important not only to encode and decode both signals efficiently but also to keep the audio information and the video information in balance.

[Background art]

In conventionally known video/audio coding systems, coded audio information is multiplexed and transmitted together with coded video information and other control information. When the transmission rate is particularly low, as in the 2×B system using two 64 kb/s lines, the 2B system using one 128 kb/s line, or the B system using one 64 kb/s line, the transmission ratio between the audio and video signals is generally fixed at about 1:1 (for example, an audio coding rate of 56 kb/s and a video coding rate of 64 kb/s), about 1:3 (for example, an audio coding rate of 32 kb/s and a video coding rate of 96 kb/s), or about 1:7 (for example, an audio coding rate of 16 kb/s and a video coding rate of 112 kb/s).
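The fixed splits quoted above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function name and the list of rate pairs are introduced here for the example and are not part of the patent.

```python
from fractions import Fraction

# (audio kb/s, video kb/s) pairs quoted in the background section
FIXED_SPLITS = [(56, 64), (32, 96), (16, 112)]


def audio_video_ratio(audio_kbps, video_kbps):
    """Return the exact audio:video ratio as a reduced fraction."""
    return Fraction(audio_kbps, video_kbps)


for audio, video in FIXED_SPLITS:
    ratio = audio_video_ratio(audio, video)
    # The "about 1:N" figure in the text is video divided by audio.
    print(f"audio {audio} kb/s, video {video} kb/s: "
          f"exact {ratio.numerator}:{ratio.denominator}, about 1:{video / audio:.1f}")
```

The 56/64 split reduces to 7:8, which the text rounds to "about 1:1"; the other two splits are exactly 1:3 and 1:7.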

However, although the information densities of audio and video signals inherently differ by a factor of several hundred, the signals are transmitted at the ratios given above. At a transmission ratio of 1:1, the amount of video information transmitted per unit time is small, which degrades the quality of the reproduced moving picture; at a ratio of 1:7, on the other hand, it is the audio quality that suffers. In the latter case in particular, when video with little motion is encoded, unnecessary bits (fill bits) are added to fill the transmission capacity and match the transmission bit rate.

The conventional systems thus have the problem that either video quality or audio quality must be sacrificed.

The video encoding unit used in such a video/audio multiplexing system compresses the redundancy of the video information with coding techniques such as interframe predictive coding, intraframe predictive coding, motion-compensated predictive coding, and variable-length coding. Because the video signal carries far more information than the audio signal, however, video processing lags behind the encoding/decoding processing time of the audio signal.

As a result, the receiving side cannot keep the video and audio synchronized (lip sync), and the reproduction is unnatural because picture and sound do not match.

One known countermeasure is to determine in advance the average delay of video signal processing relative to audio processing and to insert a fixed delay, chosen with reference to that average, into the audio processing of the system. The actual delay varies with the content of the video information, however, so this method cannot sufficiently remove the unnaturalness.

Moreover, because the amount of video information is normally far greater than the amount of audio information, not all of the video information is actually sent. The portion that cannot be sent is discarded, and the video information is thinned out before transmission. The video reproduced on the receiving side therefore has coarse motion. This is accepted because real-time behavior matters more than picture quality.

For the audio signal, on the other hand, a transmission capacity of 16 kb/s or 56 kb/s is reserved even though the speech contains silent periods. It is therefore desirable to use the silent periods of the audio signal for the video signal.

[Disclosure of the invention]

An object of the present invention is to control the entire system, including video information processing and audio information processing, on the basis of at least one of the video information and the audio information to be transmitted, and thereby to maintain the video quality and audio quality that are optimal for the system as a whole.

According to one aspect of the present invention, this object is achieved as follows.

The object is achieved by a video/audio multiplex transmission system that transmits audio and video while changing their transmission ratio according to the content being sent, and that consists of a transmitting unit and a receiving unit.

The transmitting unit comprises: an A/D converter that converts the audio input into digital audio; an audio encoding unit that encodes the digital audio, outputs it as coded audio in a form whose transmission amount is selectable, and also outputs audio content information; an A/D converter that converts the video input into digital video; a video encoding unit that encodes the digital video and outputs it as coded video; a coding control unit that determines the transmission ratio of coded audio to coded video according to the information amount of at least one of them and outputs the result as an allocation signal; and a multiplexing unit that multiplexes the coded audio and coded video allotted according to the allocation signal, together with control information including the allocation signal, into transmission frames of constant length.

The receiving unit comprises: a demultiplexing unit that receives the multiplexed signal sent from the transmitting unit over the transmission line and separates it into coded audio, coded video, and a control signal including the allocation signal; an audio decoding unit that decodes the coded audio into decoded digital audio; a video decoding unit that decodes the coded video into decoded digital video; a decoding control unit that controls the audio decoding unit and the video decoding unit on the basis of the allocation signal; a D/A converter that converts the decoded digital audio into an audio signal; and a D/A converter that converts the decoded digital video into a video signal.

Audio information is smaller in volume than video information, but it cannot tolerate thinning or similar loss. For this reason the transmission ratio between video and audio has conventionally been fixed, and audio information of constant quality has been sent regardless of the content being transmitted.

In the present invention the transmission ratio can be varied. To do so, of course, the ratio must be changed without interrupting the audio information. Adaptive coding is therefore used, in which the audio information is output as a plurality of coded audio streams with different coding bit rates. The single most suitable coding bit rate is then selected according to the content being transmitted. The audio quality fluctuates as a result, but the audio is never interrupted, and transmission proceeds with the video and audio quality best suited to the conditions at that moment. The receiving side likewise reproduces the audio according to the coding bit rate.

The plurality of coding bit rates are also compared, and the most suitable quality, that is, the most suitable bit rate, is selected on the basis of the audio content.

Another adaptive coder is SB-ADPCM, which codes the signal separately in a low-frequency band of high information density and a high-frequency band of low information density; the choice of the number of bits affects only the sound quality.

With either SB-ADPCM or the plural coding bit rates, the system as a whole becomes better balanced if the number of bits is selected in consideration not only of the audio information but also of the video information and the state of the video buffer. The interframe rate of change of information, for example, is used as the video information.

The video encoding unit performs various kinds of processing to compress the video information. It therefore consists of a video encoding section that encodes and quantizes the digital video, a variable-length coding section that variable-length codes the coded video information, a buffer in which the data is temporarily stored before multiplexing, and a buffer determination section that indicates the storage state of this buffer. The video encoding unit outputs a storage amount signal indicating this buffer storage state.

When the buffer becomes full, the video signal is thinned out, and it also takes longer before the data is sent out.

To keep the video and audio outputs aligned with no offset regardless of the content transmitted, a video/audio multiplex transmission system consisting of the following transmitting unit and receiving unit is used.

The transmitting unit comprises: an A/D converter that converts the audio input into digital audio; an audio encoding unit that encodes the digital audio; an A/D converter that converts the video input into digital video; a video encoding unit that encodes the digital video; a delay amount calculation unit that generates, from the input and output information of the video encoding unit, video coding delay time information for synchronizing the reproduced video output with the reproduced audio output; and a multiplexing unit that multiplexes the coded video information, the coded audio information, and the delay time information.

The receiving unit comprises: a demultiplexing unit that separates the multiplexed signal from the transmission line into audio information, video information, and delay time information; a variable delay control unit that delays the audio information according to the delay time information; an audio decoding unit that decodes the audio information from the variable delay control unit; a D/A converter that converts the decoded digital audio into an audio signal; a video decoding unit that decodes the video information; and a D/A converter that converts the decoded digital video output into a video output.

To make use of the delay and transmit the speech portions together while omitting the periods in which the audio is below a certain level, that is, the silent periods, the following video/audio multiplex transmission system is used.

For the video signal, the system has a video encoding unit that encodes the video signal and a variable-length coding unit that assigns variable-length codes to the coded result; for the audio signal, it has an audio encoding unit. The output of the variable-length coding unit and the output corresponding to the audio encoding unit are multiplexed and transmitted, and the receiving side extracts the video signal and the audio signal from what was transmitted.

In this system a time-division coding unit is provided that, based on the output of the audio encoding unit, extracts the signal in the valid periods of the audio and packetizes it, and that notifies the system control unit of the audio transmission rate. On receiving the audio transmission rate, the system control unit changes the threshold data used to control the coding amount of the video encoding unit in accordance with the amount of data in the buffer memory within the variable-length coding unit. Transmission is then performed according to a frame format matched to the audio transmission rate.
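The idea of extracting only the valid audio periods and packetizing them can be sketched as follows. This is a minimal illustration, not the patent's time-division coding unit: the frame length, the level threshold, the peak-level test, and the `(start_index, samples)` packet layout are all assumptions made for the example.

```python
def packetize_active_audio(samples, frame_len=4, threshold=0.05):
    """Split samples into frames and keep only frames whose peak level
    exceeds the threshold. Each kept frame is tagged with its start index
    so the receiver could put it back in place (hypothetical layout)."""
    packets = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if max(abs(s) for s in frame) > threshold:
            packets.append((start, frame))
    return packets


def active_fraction(samples, frame_len=4, threshold=0.05):
    """Fraction of frames that carry audio above the threshold."""
    total_frames = -(-len(samples) // frame_len)  # ceiling division
    if total_frames == 0:
        return 0.0
    return len(packetize_active_audio(samples, frame_len, threshold)) / total_frames


# Hypothetical input: silence, speech, near-silence, speech
demo = [0.0, 0.0, 0.0, 0.0,
        0.3, -0.2, 0.1, 0.0,
        0.0, 0.01, 0.0, 0.0,
        0.5, 0.4, 0.0, 0.0]
print([start for start, _ in packetize_active_audio(demo)])
print(active_fraction(demo))
```

In this toy input half of the frames are silent, so only half of the nominal audio capacity would be needed; the remainder is what the text proposes to hand over to the video signal.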

[Brief description of drawings]

FIG. 1 shows a conventional video/audio multiplex transmission system.
FIG. 2 shows a transmission format based on CCITT draft Recommendation Y.221.
FIG. 3 shows a conventional example provided with a delay unit.
FIG. 4 is an explanatory diagram of feedback control of the buffer portion in the system shown in FIG. 3.
FIG. 5 shows transmission frame formats for different transmission rates in the system shown in FIG. 3.
FIG. 6 is a basic configuration diagram of the present invention.
FIG. 7 is a block diagram of the transmitting side of one embodiment.
FIG. 8 shows the audio frame format.
FIG. 9 shows audio frame formats with different numbers of bits.
FIG. 10 shows a configuration example of the multiplexing/demultiplexing unit.
FIG. 11 shows an example of BAS code allocation and a table of the codes.
FIG. 12 is an explanatory diagram of separating the coding bit rate from the BAS information.
FIG. 13 is an explanatory diagram showing the allocation of information to a transmission frame.
FIGS. 14 to 17 show examples of transmission frames when the number of bits of audio information is changed.
FIG. 18 shows an example of another transmission frame format.
FIG. 19 shows the method of writing into the format of FIG. 18.
FIGS. 20 to 23 show examples of transmission frames when the number of bits of audio information is changed.
FIG. 24 shows an example in which the bit rate selection unit of the system shown in FIG. 7 is moved to the multiplexing unit.
FIG. 25 shows a configuration example of the multiplexing/demultiplexing unit of the system shown in FIG. 24.
FIG. 26 shows changes in the audio/video transmission ratio.
FIG. 27 is a conceptual diagram of data transmission when the audio/video transmission ratio changes.
FIG. 28 is a basic configuration diagram of another embodiment.
FIG. 29 is a block diagram of the transmitting side of the system shown in FIG. 28.
FIG. 30 is a flowchart showing the procedure for determining the bit rate from the interframe rate of change of the video and the optimum audio bit rate.
FIG. 31 is a basic configuration diagram of another embodiment.
FIG. 32 is an explanatory diagram of the buffer determination unit of the system shown in FIG. 31.
FIG. 33 shows an example of the bit arrangement of the FAS information.
FIG. 34 shows an example of BAS code assignment.
FIG. 35 is a frame format diagram showing an example of the audio allocation bits.
FIG. 36 is a diagram for explaining the decoding operation based on the audio allocation bits.
FIG. 37 is a diagram showing the in-channel connection protocol.
FIG. 38 is a diagram showing an example of additional bits for a capability inquiry in the AC information.
FIG. 39 is a basic configuration diagram of another embodiment.
FIG. 40 is a conceptual diagram of the frame information used in the system shown in FIG. 39.
FIG. 41 is a block diagram showing one embodiment of the delay amount calculation unit of the system shown in FIG. 39.
FIG. 42 is a block diagram showing one embodiment of the multiplexing/demultiplexing unit of the system shown in FIG. 39.
FIG. 43 is a diagram showing the control information in the frame format of the system shown in FIG. 39.
FIG. 44 is a basic configuration diagram of another embodiment.
FIG. 45 is an explanatory diagram of the transmission frame format of the system shown in FIG. 44.
FIG. 46 is an explanatory diagram of the processing in the time-division coding unit.
FIG. 47 is an explanatory diagram of feedback control.

[Best mode for carrying out the invention]

Before the embodiments of the present invention are described, conventional video/audio multiplex transmission systems are explained with reference to FIGS. 1 to 5 so that the invention may be better understood.

Throughout the drawings, identical elements are given identical reference numerals. FIG. 1 is a block diagram showing the configuration of a conventional video/audio multiplex transmission system; only the transmitting unit and the receiving unit at the two end stations are shown. As shown in the figure, the audio input is converted into digital audio by the A/D converter 1, then encoded by the audio encoding unit 2 and input to the multiplexing unit 6. The video input is converted into digital video by the A/D converter 3, then quantized and variable-length coded in the video encoding unit 4. The variable-length-coded video information is temporarily stored in a buffer and input to the multiplexing unit 6; the video encoding unit 4 is configured to provide these functions. The coded audio and coded video input to the multiplexing unit 6 are multiplexed at a fixed ratio into the transmission frame format shown in FIG. 2, control information such as FAS, BAS, and AC is multiplexed with them, and the result is sent out on the transmission line 10.

The multiplexed signal from the transmission line 10 is separated by the demultiplexing unit 11 into coded audio, coded video, and control information. The separated coded audio is converted into digital audio by the audio decoding unit 12, then converted into an audio signal by the D/A converter 13 and output. The separated coded video is converted into digital video by the video decoding unit 14, then converted into a video signal by the D/A converter 15 and output. The audio encoding unit 2 and the audio decoding unit 12 are matched so that the coded audio can be decoded back to its original form; the same holds for the video encoding unit 4 and the video decoding unit 14.

The case in which the delay is estimated in advance and a fixed delay is applied is as follows.

FIG. 3 shows the conventional configuration of one end station in a video/audio transmission system. In the figure, reference numeral 20 denotes the end station, 3 an A/D converter for the video signal, 41 a video encoding unit, 42 a variable-length coding unit, 1 an A/D converter for the audio signal, 2 an audio encoding unit, 7 a delay control unit, 6 and 11 a multiplexing/demultiplexing unit, 68 a transmission line interface unit, 14 a video decoding unit, 15 a D/A converter, 12 an audio decoding unit, 13 a D/A converter, and 19 a system control unit.

After processing in the video encoding unit 3, the video signal undergoes variable-length coding and is supplied to the multiplexing/demultiplexing unit 6, 11. The audio signal, meanwhile, is coded in the audio encoding unit 2 at 16 kbps for a 4 kHz band or at 56 kbps for a 7 kHz band. A delay control unit 7 is provided that applies a delay corresponding to the coding delay of the video signal; the audio signal is delayed in the delay control unit 7 and then supplied to the multiplexing/demultiplexing unit 6, 11.

The video signal and the audio signal are then multiplexed and transmitted from the transmission line interface unit 68 to the opposite end station.

A signal received from the opposite end station is separated into a video signal and an audio signal in the multiplexing/demultiplexing unit 6, 11. The video signal passes through the video decoding unit 14 and the D/A converter 15 and is delivered as the video output. The audio signal passes through the audio decoding unit 12 and the D/A converter 13 and is delivered as the audio output.

The delay control amount T in the delay control unit 7 is given by T = (t_V1 + t_V2) - (t_A1 + t_A2).
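The delay formula can be written as a one-line function. The text does not define the four terms; the sketch below assumes, purely for illustration, that t_V1 and t_V2 are the video encoding and decoding delays and t_A1 and t_A2 the audio encoding and decoding delays. The millisecond values are hypothetical.

```python
def delay_control_amount(t_v1, t_v2, t_a1, t_a2):
    """T = (t_V1 + t_V2) - (t_A1 + t_A2).

    Assumed meaning (not stated in the source): t_v1/t_v2 are the video
    encode/decode delays and t_a1/t_a2 the audio encode/decode delays,
    all in the same time unit."""
    return (t_v1 + t_v2) - (t_a1 + t_a2)


# Hypothetical delays in milliseconds
print(delay_control_amount(t_v1=120, t_v2=80, t_a1=10, t_a2=10))
```

Because the conventional system fixes T in advance, any change in the actual video delay shows up directly as a lip-sync error, which is the weakness noted earlier in the text.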

The transmission line has a fixed capacity, with a line rate from 64 kbps to 384 kbps in 64 kbps steps. For this reason, feedback control as shown in FIG. 4 is performed to control the coding process in the video encoding unit 4.

FIG. 4 shows the mode of feedback control. Reference numerals 41, 4, and 19 in the figure correspond to those in FIG. 3; 42 denotes a variable-length coding unit, 43 a buffer memory, 17 a first threshold, and 18 a second threshold. Variable-length-coded video data is stored in the buffer memory 43, and is read out of the buffer memory 43 at a constant rate and passed to the multiplexing/demultiplexing unit 6, 11.

Control follows the amount of video data accumulated in the buffer memory 43. As shown in FIG. 4(B), for example, when the accumulated amount exceeds the second threshold 18, the system control unit 19 instructs the video encoding unit 41 to stop video coding, or alternatively makes the quantization of the video encoding unit coarser. When the accumulated amount falls to the first threshold 17 or below, the system control unit 19 instructs video coding to resume, or alternatively makes the quantization finer.
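The two-threshold behavior described above is a hysteresis loop, and it can be sketched as a small state machine. The class name, the occupancy units, and the threshold values below are hypothetical; only the stop-above-second-threshold and resume-at-or-below-first-threshold rule comes from the text, and the coarser/finer quantization alternative is not modeled.

```python
class BufferFeedback:
    """Stop/resume control driven by buffer occupancy with two thresholds."""

    def __init__(self, first_threshold, second_threshold):
        if first_threshold >= second_threshold:
            raise ValueError("first threshold must be below second threshold")
        self.first_threshold = first_threshold    # resume at or below this level
        self.second_threshold = second_threshold  # stop above this level
        self.encoding = True

    def update(self, occupancy):
        """Return True if video coding should run for this occupancy sample."""
        if occupancy > self.second_threshold:
            self.encoding = False
        elif occupancy <= self.first_threshold:
            self.encoding = True
        # Between the thresholds the previous state is kept (hysteresis).
        return self.encoding


controller = BufferFeedback(first_threshold=30, second_threshold=80)
states = [controller.update(level) for level in [10, 50, 90, 70, 40, 20, 60]]
print(states)
```

Keeping the previous state between the two thresholds prevents the encoder from toggling on every small fluctuation of the buffer level.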

FIG. 5 shows the conventional transmission frame formats.

FIG. 5(A) shows the frame format for 64 kbps: 8 kbps is allocated to the frame information portion, 16 kbps to audio, and 40 kbps to video.
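The 64 kbps allocation is simple subtraction: whatever is left after frame information and audio goes to video. The helper below is a hypothetical illustration of that bookkeeping and is applied only to the 64 kbps case, since the text gives the frame information figure only for that case.

```python
def video_share_kbps(line_kbps, frame_info_kbps, audio_kbps):
    """Capacity left for video after frame information and audio are allocated."""
    video = line_kbps - frame_info_kbps - audio_kbps
    if video < 0:
        raise ValueError("allocations exceed the line rate")
    return video


# FIG. 5(A) case from the text: 64 kbps line, 8 kbps frame information, 16 kbps audio
print(video_share_kbps(line_kbps=64, frame_info_kbps=8, audio_kbps=16))
```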

FIG. 5(B) shows the frame format for 128 kbps, for which there are two cases: 16 kbps allocated to audio (B-1) and 56 kbps allocated to audio (B-2). FIG. 5(C) shows the frame format for 192 kbps; either case (B-1) or case (B-2) is used for the first 128 kbps. FIG. 5(D) shows the frame format for 384 kbps; the capacity up to 384 kbps is used for the video signal.

In the conventional case, as shown in FIG. 5, the audio segment is fixed at 16 kbps or 56 kbps. For the video signal, the processing in the video encoding unit 41 is controlled by feedback control as described with reference to FIG. 4. For the audio signal, however, a transmission capacity of 16 kbps or 56 kbps is reserved as described above even though the speech contains silent periods.

Embodiments of the present invention are described next.

FIG. 6 is a basic configuration diagram of the present invention, which consists of a transmitting unit B and a receiving unit C.

FIG. 7 shows one embodiment of the transmitting side of the video/audio multiplexing system of the present invention. In this embodiment the audio encoding unit 2 consists of eight adaptive coding units (for example, ADPCM) 2₁ to 2₈, each using a different number of coding bits for the digital audio; an evaluation unit 2A that determines the coding bit rate by a noise evaluation method from the reproduced outputs of the adaptive coding units 2₁ to 2₈ and the digital audio, and generates coding bit rate information; and a selection unit 2B that selects the output audio information of the adaptive coding unit corresponding to that coding bit rate information. Here the function of the coding control unit 5 is in effect performed by the evaluation unit 2A. The video encoding unit 4 consists of a coding section (Q) 41 that encodes and quantizes the digital video from the A/D converter 3, a variable-length coding section (VWLC) 42 that variable-length codes the coded video information, and a buffer 43 that temporarily stores the variable-length-coded video information. Reference numeral 9 denotes a clock supply unit for the transmission frames that are sent and received.
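The evaluate-then-select arrangement can be illustrated with a toy model. Everything specific in the sketch below is an assumption: a uniform quantizer stands in for the adaptive coders (it is not ADPCM), signal-to-noise ratio stands in for the unspecified noise evaluation method, and the 20 dB target and the smallest-sufficient-bit-count rule are hypothetical choices. The one detail taken from the text is the candidate range of 0 to 7 coding bits.

```python
import math


def quantize(samples, bits):
    """Uniform quantizer on [-1, 1); a stand-in for one adaptive coder.
    With 0 bits every sample is reproduced as 0.0 (no audio sent)."""
    levels = 2 ** bits
    step = 2.0 / levels
    decoded = []
    for s in samples:
        index = min(levels - 1, max(0, int((s + 1.0) / step)))
        decoded.append(-1.0 + (index + 0.5) * step)
    return decoded


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB between the input and the reproduced output."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10 * math.log10(signal / noise)


def select_bits(samples, candidates=range(0, 8), target_db=20.0):
    """Return the smallest candidate bit count whose reproduction meets the target."""
    for bits in candidates:
        if snr_db(samples, quantize(samples, bits)) >= target_db:
            return bits
    return max(candidates)


silence = [0.0] * 256
tone = [0.5 * math.sin(2 * math.pi * k / 32) for k in range(256)]
print(select_bits(silence), select_bits(tone))
```

In this model a silent block is reproduced exactly with 0 bits, so it selects the 0-bit coder and frees its capacity, while a tone needs several bits to reach the target.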

The frame (packet) format of the audio information output from the adaptive coding units 2₁ to 2₈ is shown in FIG. 8. It consists of a frame header a followed by a run of audio data whose number of bits is the number of coding bits k of the adaptive coding unit that received the best noise evaluation multiplied by the sampling rate n (n = 8 kHz).

FIG. 9 shows examples of the audio frames coded with 0 to 7 bits by the adaptive coding units 2₁ to 2₈, respectively; one of these is selected by the selection unit 2B and sent to the multiplexing unit.

FIG. 10 shows one embodiment of the audio delay/multiplexing unit 6, 7. In this embodiment the unit consists of a CPU 21; a CPU bus 22; a ROM 23 that stores the procedure program for sending data onto the transmission line in the frame format shown in FIG. 9; interfaces (I/F) 24 to 26 for taking the video information, the audio information, and the coding bit rate information, respectively, onto the bus 22; a RAM 27 with an address space into which the video information, audio information, and bit rate information from the interfaces 24 to 26 are temporarily written; and an interface 28 for sending each kind of information onto the transmission line.

The transmission frame format shown in FIG. 2 allocates bits according to CCITT draft Recommendation Y.221. Each multiframe consists of 16 transmission frames (1 multiframe = 16 transmission frames), and each transmission frame consists of 8 bits horizontally × 80 bits vertically, 640 bits in total. A transmission frame carries audio information, video information, and, as control information (AC (Application Channel) information), FAS (Frame Alignment Signal) information and BAS (Bit Allocation Signal) information. Although one multiframe consists of 16 transmission frames as stated above, the data is organized and sent as pairs of transmission frames in the direction of the arrow (→), for example F_1-1 to F_1-16 paired with F_2-1 to F_2-16, as shown in FIG. 2(a) and (b). The FAS information and BAS information, however, are separate for the transmission frames F₁ and F₂.
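The frame dimensions above can be turned into a short arithmetic sketch. The constants come from the text. The function names and the 64 kb/s default channel rate are assumptions for illustration (64 kb/s lines are mentioned in the background section).

```python
FRAME_WIDTH_BITS = 8         # horizontal bits per transmission frame
FRAME_HEIGHT_BITS = 80       # vertical bits per transmission frame
FRAMES_PER_MULTIFRAME = 16   # 1 multiframe = 16 transmission frames


def frame_bits() -> int:
    """Total bits in one transmission frame."""
    return FRAME_WIDTH_BITS * FRAME_HEIGHT_BITS


def multiframe_bits() -> int:
    """Total bits in one multiframe."""
    return frame_bits() * FRAMES_PER_MULTIFRAME


def frame_duration_ms(channel_bps: int = 64_000) -> float:
    """Time needed to send one frame on a channel of the given rate (assumed 64 kb/s)."""
    return 1000 * frame_bits() / channel_bps
```

Under the 64 kb/s assumption a 640-bit frame lasts 10 ms, that is, 100 frames per second.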

The FAS information is framing information used to establish synchronization through the procedures of (1) Y.221 frame synchronization and (2) multiframe synchronization. Step (1) delimits the individual frames, and step (2) makes it possible to identify each frame. Every frame needs to be identified in order to recognize the unit in which a change in the BAS (Bit Allocation Signal) information, described later, takes effect.

The BAS information carries the coding information for the audio and video information (for example, their respective coding bit rates) as set on the transmitting side. The coding bit rate received from the evaluation unit 2A is embedded in the BAS information as shown in FIG. 11 and transmitted, and the receiving side uses it to separate the data after frame synchronization is established. The BAS information is evaluated once per sub-multiframe (1 multiframe = 2 sub-multiframes). As shown in FIG. 12, the next coding bit rate is detected from the preceding BAS information by majority logic (agreement in at least 5 of 8 frames).
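The majority logic described above can be sketched as follows. This is an illustration only: the function name is hypothetical, BAS values are represented as plain integers, and keeping the current rate when no value reaches the threshold is an assumption, since the text does not say what happens in that case.

```python
from collections import Counter

FRAMES_PER_SUB_MULTIFRAME = 8
MAJORITY_THRESHOLD = 5  # "5 frames or more out of 8" in the text


def next_bit_rate(received_bas, current):
    """Return the BAS value agreed on by at least 5 of the 8 frames.

    If no value reaches the threshold, the current value is kept
    (an assumption, not stated in the source).
    """
    if len(received_bas) != FRAMES_PER_SUB_MULTIFRAME:
        raise ValueError("expected one BAS value per frame of a sub-multiframe")
    value, count = Counter(received_bas).most_common(1)[0]
    return value if count >= MAJORITY_THRESHOLD else current
```

A few corrupted BAS values in a sub-multiframe therefore do not change the detected rate as long as five frames still agree.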

Next, the transmission framing operation of this embodiment is described.

In the multiplexing unit 6 using the transmission frame format described above (FIG. 2), the CPU 21 determines the bit allocation according to the coding bit rate stored as BAS information. As described above, once the frame header is removed by carrying it in the BAS information, each audio frame has an "8 × n" bit structure, so frames can be allocated optimally as shown in FIG. 13.

That is, within each vertical group of 8 bits (for example, the groups marked (1) to (8) in FIG. 13), audio data is filled in toward the right for 1 to 7 bits (horizontal axis). Filling then moves on to the next 8-bit group below, and the audio data is filled in sequentially in a form that corresponds to the selected coding bit rate.

The audio data and video data are embedded one row of the 8-bit vertical unit at a time. As soon as the audio data of a row has been embedded, video data from the buffer 43 is embedded in the remainder of that row.
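The row-by-row filling described above can be sketched as follows. This is a simplified illustration under stated assumptions: every row is treated as 8 data bits wide, the bit positions reserved for FAS and BAS information are ignored, and missing input bits are padded with 0. The function name and parameters are hypothetical.

```python
def fill_frame(audio_bits, video_bits, k, rows=80, width=8):
    """Fill each row with k audio bits first, then video bits in the remainder.

    Simplified sketch: FAS/BAS positions are ignored and missing input
    is padded with 0 (both are assumptions, not taken from the source).
    """
    if not 0 <= k <= width:
        raise ValueError("k must fit within one row")
    audio = iter(audio_bits)
    video = iter(video_bits)
    frame = []
    for _ in range(rows):
        row = [next(audio, 0) for _ in range(k)]            # audio part of the row
        row += [next(video, 0) for _ in range(width - k)]   # video fills the rest
        frame.append(row)
    return frame
```

With k = 3, for example, each of the 80 rows carries 3 audio bits followed by 5 video bits, so lowering k directly frees space for video.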

One transmission frame is formed in this way. FIGS. 14 to 17 show the relationship between the audio data and the video data within one transmission frame for 0-bit, 1-bit, 3-bit, and 7-bit coding, respectively.

In this embodiment the audio frame header is carried in the BAS information, so the audio coding rate can be changed once per multiframe (= 16 frames).

Carrying the frame header in the BAS information also allows the data to be stored in units of 8 bits, which has the advantage that fractional leftover data of the kind shown in FIG. 23, described later, does not arise.

FIG. 18 shows the transmission frame format used when the audio frame header is not carried in the BAS information. In this case the header is 3 bits of information, 1 bit vertically by 3 bits horizontally as illustrated (see FIG. 9). When the audio and video data are written horizontally as shown in FIG. 19, one transmission frame is formed as shown in FIGS. 20 to 23 for 0-bit, 1-bit, 3-bit, and 7-bit coding, respectively.

The transmission frame made up of this audio information and video information is stored in the RAM 27. Once the transmission frame is complete (640 bits), the data is sent to the transmission line 10 at a rate that matches the transmission rate.

On the receiving side, the separation unit 11 has the same configuration as the multiplexing unit 7 shown in FIG. 10, with the arrows simply reversed.

That is, for the input data from the transmission line, the FAS information is analyzed as described above to establish (1) Y.221 frame synchronization and (2) multiframe synchronization. The separation unit 11 then separates the data according to the BAS information and sends each separated signal through its own interface: the coding bit rate information to the decoding control unit 16, the audio information to the audio decoding unit 12, and the video information to the video decoding unit 14.

The audio decoding unit 12 then receives the decoding bit rate information from the decoding control unit and decodes the audio information at that bit rate.

Because the transmission frame used in the present invention has a fixed bit length, varying the coding rate from one transmission frame to the next requires operation synchronized with the transmission line clock rate. This is achieved by having the receiving side extract the clock from the clock supply unit 9 shown in FIG. 7 (in the case of CMI code). With RS422 or INCU, however, the clock is supplied over a separate line.

In the embodiment above, the outputs of the adaptive coding units 2₁ to 2₈ are input to the multiplexing unit 6 through the selection unit 2B. In the embodiment of FIG. 24, by contrast, the outputs of the adaptive coding units 2₁ to 2₈ are input directly to the multiplexing unit 6, and the selection information is also sent to the multiplexing unit 6.

FIG. 25 shows the system configuration of the multiplexing unit 6 in this case. The output audio data of the adaptive coding units 2₁ to 2₈ is received through the interfaces 25₁ to 25₈, and one of these outputs is selected according to the coding bit rate information and stored in the RAM 27 as the audio information.

In the present invention, the adaptive audio coding unit 2 performs redundancy compression by changing the bit rate as appropriate to the information content of the input audio, and sends the coded audio information and its bit rate information to the multiplexing unit 6.

The multiplexing unit 6 multiplexes this audio information and bit rate information together with the video information from the video encoding unit 4. Because the transmission frame from the multiplexing unit 6 has a fixed length, the amount of video information can be increased by exactly the number of bits saved by compressing the audio information.

The transmission frame sent to the receiving side in this way is separated by the separation unit 11 into video information, audio information, and bit rate information, and the audio decoding unit 12 decodes the audio information at that bit rate.

The sharing of data between the audio information and the video information changes in this way, as illustrated in FIG. 26 (1) to (3).

FIG. 27 shows the multiplexed transmission data. As the figure shows, video and audio data are sent in fixed frame units. The number of audio data bits sent varies with the amount of audio information, and video data is inserted into the entire free area, which improves the video quality.

In the embodiment of FIG. 7, the optimum audio bit rate from the audio coding unit 2 was input to the encoding control unit 5 and used directly as the allocation signal. The next embodiment, whose principle is shown in FIG. 28, additionally takes the content of the video information into account when determining the transmission ratio.

FIG. 29 is a block diagram of the transmitting side of this embodiment.

The audio coding unit 2 consists of eight adaptive coding units (for example, ADPCM) 2₁ to 2₈, each with a different number of coding bits for the digital audio, and an evaluation unit 2A that determines the optimum audio coding bit rate by a noise evaluation method from the reproduced output of each adaptive coding unit 2₁ to 2₈ and the digital audio, and generates audio bit rate information ACI.

The video encoding unit 4 consists of an encoding unit (Q) 41 that encodes the digital video from the A/D converter 3, a variable-length coding unit (VWLC) 42 that variable-length codes the coded video information, and a buffer 43 that temporarily stores the variable-length coded video information. The inter-frame change rate determination unit 8 consists of a frame memory (FM) 81 that stores two consecutive frames of the output of the A/D converter 3, and a change rate comparison unit 82 that takes in these two frames, computes the rate of change between them, compares it with the threshold Th from the encoding control unit 5, and passes the comparison result to the encoding control unit 5. The threshold Th is adjusted by the encoding control unit 5 according to VBI, which indicates the amount of data stored in the buffer 43. The selection and multiplexing unit 6 includes a selection unit 6a that receives the coded outputs of the adaptive coding units 2₁ to 2₈ in parallel and selects the output audio information of the corresponding adaptive coding unit according to the video/audio allocation bit rate information MI from the encoding control unit 5.

Examples of audio frames coded with 0 to 7 bits by the adaptive coding units 2₁ to 2₈ are shown in FIG. 9. One of these is selected by the selection unit 6a and multiplexed.

FIG. 30 shows the control algorithm by which the encoding control unit 5 generates the video/audio allocation bit rate information MI from the comparison result of the change rate comparison unit 82.

In FIG. 30, the comparison result from the comparison unit 82 is first checked to see whether the inter-frame rate of change of the video information exceeds the threshold Th (step S1). When it does, for example when a participant in a video conference stands up and the video information changes substantially, the video information is given priority regardless of the video/audio allocation bit rate information MI. As shown in step S2, only 2 bits (16 kb/s) are allocated to the audio information (see FIG. 9 (3)) and 14 bits (112 kb/s) to the video information, and video/audio allocation bit rate information to that effect is generated and output. The frame structure of 2 audio bits + 14 video bits = 16 bits is the one used in the multiplexing operation of the multiplexing unit 6.

If, on the other hand, the comparison in step S1 shows that the inter-frame rate of change of the video information is smaller than the threshold Th, the amount of data VBI stored in the buffer 43 is checked against the threshold Th′ (step S3). When VBI is large enough to exceed Th′, processing proceeds to step S2 as above and the video information is allocated preferentially.

When VBI < Th′, the allocation bit rates for video and audio are determined in steps S4 to S15 below according to the content of the audio bit rate information ACI from the evaluation unit 2A.

That is, when the audio bit rate information ACI indicates 2-bit coding in step S4, processing proceeds to step S2; when it indicates 3-bit coding, 13 bits (104 kb/s) are allocated to the video information; and so on. The audio bit rate and the video bit rate are determined in this sequence, and the video/audio allocation bit rate information MI is output to the multiplexing unit 6.
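The decision order of FIG. 30 as described above can be sketched as follows. The function and parameter names are hypothetical. The text spells out only the 2-bit and 3-bit cases of steps S4 to S15, so the sketch assumes the same pattern (video gets 16 minus the audio bits) continues up to 7 bits and rejects other values.

```python
ROW_BITS = 16        # 2 audio bits + 14 video bits = 16 bits in the text
KBPS_PER_BIT = 8     # 2 bits correspond to 16 kb/s, 14 bits to 112 kb/s
MIN_AUDIO_BITS = 2   # allocation used when video is given priority (step S2)


def allocate(change_rate, th, vbi, th_buf, aci_bits):
    """Return (audio_bits, video_bits) following the order of checks in FIG. 30."""
    if change_rate > th:            # step S1: large inter-frame change
        audio = MIN_AUDIO_BITS      # step S2: video first
    elif vbi > th_buf:              # step S3: video buffer is filling up
        audio = MIN_AUDIO_BITS
    else:                           # steps S4 onward: follow the audio evaluation
        if not 2 <= aci_bits <= 7:
            raise ValueError("only ACI values of 2 to 7 bits are assumed here")
        audio = aci_bits
    return audio, ROW_BITS - audio


def to_kbps(bits):
    """Convert a per-row bit allocation into a bit rate in kb/s."""
    return bits * KBPS_PER_BIT
```

For instance, a quiet scene with a nearly empty buffer and a 3-bit audio evaluation yields 3 audio bits and 13 video bits, that is, 24 kb/s and 104 kb/s.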

In the embodiment above, the transmission ratio was determined from the optimum audio bit rate, the inter-frame rate of change of the video information, and the amount of data stored in the buffer. The decision may, however, be based on at least one of these, and other information about the video or audio, such as the intra-frame rate of change of the video or the high-frequency ratio of the audio, may also be used.

The next embodiment uses SB-ADPCM (sub-band ADPCM) as the audio coding unit 2. SB-ADPCM codes the signal in two frequency bands: the low-frequency band, where information density is high, is assigned the larger share of bits in the upper positions (for example, 6 of 8 bits), and the high-frequency band, where information density is low, is assigned the smaller share in the lower positions (2 bits). Data can therefore be discarded starting from the low-order bits, and the allocation rate can be selected adaptively (6, 7, or 8 bits). As shown in principle in FIG. 31, the transmitting side comprises an A/D converter 1 that converts the audio input into digital audio, an SB-ADPCM coding unit 2 that codes the digital audio separately as a low-frequency bit portion and a high-frequency bit portion, an A/D converter 3 that converts the video input into digital video, a video encoding unit 41 that encodes the digital video, a variable-length coding unit 42 that variable-length codes the coded video information, a buffer 43 that temporarily stores the variable-length coded video information, a buffer determination unit 44 that determines the audio allocation rate according to the amount of information stored in the buffer 43, and a multiplexing unit 6 that, according to the allocation rate, assigns part of the video information to the high-frequency bit portion of the SB-ADPCM coding unit 2 and multiplexes the allocation rate information, the video information, and the audio information.

The receiving side comprises a separation unit 11 that separates the multiplexed signal from the transmission line 10 into audio information, video information, and allocation information, an audio decoding unit 12 that SB-ADPCM decodes the audio information, a D/A converter 13 that converts the SB-ADPCM decoded digital audio into an audio signal, a buffer 143 that temporarily stores the video information, a variable-length decoding unit 142 that variable-length decodes the video information in the buffer 143, an inverse encoding unit 141 that inversely encodes the variable-length decoded video information to generate a digital video signal, a D/A converter 15 that converts the digital video output into a video output, and a decoding control unit 16 that controls the audio decoding unit 12 according to the allocation rate.

On the transmitting side in FIG. 31, the video input is converted into a digital signal by the A/D converter 3, filtered, and passed to the encoding unit, where the amount of information is compressed using the information of the previous frame and the resulting values are quantized. The variable-length coding unit (VWLC) 42 then assigns short codes to frequently occurring data and long codes to rarely occurring data in the quantized video information, and the coded data is temporarily stored in the buffer (BUF) 43.

The data in the buffer 43 is multiplexed by the multiplexing unit (MUX) 6 and sent out. When the transmission line capacity is low, the amount of data stored in the buffer 43 grows; when the capacity is high, it shrinks.

The amount of data stored in the buffer 43 therefore indicates the amount of data being sent to the transmission line. This information is passed to the buffer determination unit 44, which uses it to set the audio allocation (coding) rate.

The audio input is converted into digital audio by the A/D converter 1 and sent to the SB-ADPCM coding unit 2.

According to the allocation rate determined by the buffer determination unit 44, the coded output of the SB-ADPCM coding unit 2 is selected, the required number of audio bits is discarded, and video data is allocated to the discarded positions. The multiplexing unit (MUX) 6 then multiplexes and transmits the audio information, the video information, and the allocation rate information.
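Discarding audio bits from the low-order end can be illustrated with a short sketch. It assumes, following the description of SB-ADPCM above, that each 8-bit code word holds the 6 low-band bits in the upper positions and the 2 high-band bits in the lower positions. The function names are hypothetical, and refilling the discarded positions with zeros at the receiver is an assumption.

```python
CODEWORD_BITS = 8


def truncate_codeword(code: int, keep_bits: int) -> int:
    """Keep the upper keep_bits bits (6, 7, or 8) of an 8-bit audio code word."""
    if keep_bits not in (6, 7, 8):
        raise ValueError("the text describes 6-, 7-, and 8-bit allocations")
    if not 0 <= code <= 0xFF:
        raise ValueError("code must be an 8-bit value")
    return code >> (CODEWORD_BITS - keep_bits)


def restore_codeword(truncated: int, keep_bits: int) -> int:
    """Re-align a truncated code word to 8 bits, zero-filling the dropped bits (assumed)."""
    return (truncated << (CODEWORD_BITS - keep_bits)) & 0xFF
```

The one or two bit positions freed per code word are what the multiplexing unit hands over to video data.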

On the receiving side, the separation unit (DMUX) 11 separates the signal into video information, audio information, and allocation rate information.

The separated allocation rate information can then be used when the SB-ADPCM decoding unit 12 decodes the separated audio data.

In this way, as in FIGS. 26 and 27, the amount of audio information varies as the amount of video information rises and falls, so that video information of higher quality can be transmitted.

FIG. 32 shows an embodiment of the buffer determination unit 44 of FIG. 31. In this embodiment it receives the write address W from the variable-length coding unit 42 to the buffer 43, which is implemented in RAM, and the read address R from the multiplexing unit 6 to the buffer 43, and makes the following determinations.

When W − R < threshold Th₁, reading is not keeping pace with the writing of data to the buffer 43, so the amount of video information generated is judged to be large. The 2 bits assigned to the high-frequency component of the 8-bit coded audio data from the SB-ADPCM coding unit 2 are discarded, and allocation rate information (0,1), which specifies coding of only the 6 bits of audio data assigned to the low-frequency component, is sent to the multiplexing unit 6.

When threshold Th₂ > W − R > threshold Th₁, the amount of video information generated is judged to be moderate. One of the 2 bits assigned to the high-frequency component of the 8-bit coded audio data from the SB-ADPCM coding unit 2 is discarded, and allocation rate information (0,1), which specifies coding of 7 bits of audio data in total including the 6 bits assigned to the low-frequency component, is sent to the multiplexing unit 6.

When threshold Th₂ < W − R, the amount of video information generated is judged to be small. Allocation rate information (1,1), which specifies coding without discarding any of the 8-bit coded audio data from the SB-ADPCM coding unit 2, is sent to the multiplexing unit 6.
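The three threshold cases above can be written as a small decision function. The sketch follows the comparisons exactly as they appear in the text. The function name is hypothetical, the handling of values that fall exactly on a threshold is an assumption, and the (0,1) and (1,1) rate codes are left out because the sketch returns only the number of audio bits.

```python
def audio_bits_from_buffer(w: int, r: int, th1: int, th2: int) -> int:
    """Map the write/read address difference W - R to an audio allocation of 6, 7, or 8 bits."""
    if th1 >= th2:
        raise ValueError("expected Th1 < Th2")
    diff = w - r
    if diff < th1:      # first case in the text: video output judged large
        return 6
    if diff < th2:      # second case: video output judged moderate
        return 7
    return 8            # third case: video output judged small
```

For example, with Th₁ = 4 and Th₂ = 16, an address difference of 10 selects the 7-bit allocation.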

FIG. 33 shows an example of the bit arrangement of the FAS information. Frame synchronization (1) is achieved by recognizing the FAW (Frame Alignment Word), which is "0011011" in FIG. 33. Multiframe synchronization (2) is identified by the information Mi placed in the first bit of the FAS information: the Mi bits of the 1st, 3rd, 5th, 7th, 9th, and 11th frames are examined, and synchronization is established on the pattern "001011".
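The multiframe alignment search described above can be sketched as follows. The input is assumed to be a list holding one Mi bit per received frame, already extracted after frame synchronization. The function name and this input representation are hypothetical.

```python
MULTIFRAME_PATTERN = (0, 0, 1, 0, 1, 1)    # "001011"
ODD_FRAME_OFFSETS = (0, 2, 4, 6, 8, 10)    # frames 1, 3, 5, 7, 9, 11


def find_multiframe_start(mi_bits):
    """Return the index of the first frame of a multiframe, or None if not found.

    mi_bits holds one Mi bit per frame, in order of reception (assumed layout).
    """
    last_start = len(mi_bits) - ODD_FRAME_OFFSETS[-1]
    for start in range(max(last_start, 0)):
        sampled = tuple(mi_bits[start + off] for off in ODD_FRAME_OFFSETS)
        if sampled == MULTIFRAME_PATTERN:
            return start
    return None
```

A practical receiver would normally confirm the pattern over several consecutive multiframes before declaring synchronization; the source does not describe that step, so it is omitted here.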

FIG. 34 shows an example of the BAS information. The transmitting side generates this BAS information so as to set the ratio between the amount of audio information and the amount of video information according to the allocation rate described above (see FIG. 12), and the receiving side uses it to separate the data after frame synchronization is established. The data separation processing can be changed once per multiframe or once per sub-multiframe (1 multiframe = 2 sub-multiframes). The BAS information is evaluated once per sub-multiframe. As shown in FIG. 12, once the separation unit 11 recognizes a change in the BAS information by majority decision (at least 5 of 8 frames agree), it can determine the audio/video separation position within the next multiframe or sub-multiframe. This is needed because the BAS information changes at the moment the audio/video allocation changes, so the receiver must decide which BAS information to rely on within a frame. Audio and video data are separated in units of 80 bits.

Turning to the operation: in the multiplexing unit 6 using the frame format described above, the CPU 21 determines the bit allocation from the BAS information, which is based on the allocation rate information output by the encoding control unit 5, as shown in FIG. 35. There are three cases: all 8 bits of the audio data are used (FIG. 35(a)), 1 bit is discarded and 7 bits are used (FIG. 35(b)), or 2 bits are discarded and only the 6 bits of the low-frequency component are used (FIG. 35(c)). The audio information and video information are then stored in the RAM 27.

Once the frame format of FIG. 35 is complete, the data is sent to the transmission line 10 at a rate that matches the transmission rate.

The transmission rate of the audio information in FIG. 35 for each per-frame bit allocation is therefore as follows.

(a) 6 × 80 (480 bits) × 100 frames (per second) = 48 kb/s

(b) 7 × 80 (560 bits) × 100 frames (per second) = 56 kb/s

(c) 8 × 80 (640 bits) × 100 frames (per second) = 64 kb/s

On the receiving side, the separation unit 11 has the same configuration as the multiplexing unit 6 shown in FIG. 10, with the arrows simply reversed.
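The audio rates listed above for the 6-, 7-, and 8-bit allocations follow from a single multiplication, shown here as a small sketch. The constants are the 80 rows per frame and 100 frames per second used in that list; the function name is illustrative.

```python
ROWS_PER_FRAME = 80       # vertical size of one transmission frame
FRAMES_PER_SECOND = 100   # frame rate used in the rate list


def audio_rate_bps(audio_bits_per_row: int) -> int:
    """Audio bit rate for a given number of audio bits in each row of a frame."""
    if audio_bits_per_row not in (6, 7, 8):
        raise ValueError("the text describes 6-, 7-, and 8-bit allocations")
    return audio_bits_per_row * ROWS_PER_FRAME * FRAMES_PER_SECOND
```

Each audio bit given up per row therefore frees 8 kb/s for video.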

That is, for the input data from the transmission line, the FAS information is analyzed as described above to establish (1) Y.221 frame synchronization and (2) multiframe synchronization. The separation unit 11 then separates the data according to the BAS information and sends each separated signal through its own interface: the audio information to the SB-ADPCM decoding unit 12, and the video information to the buffer 143 and the variable-length decoding unit 142.

The separated audio information must be output in step with the transmission line clock (8 kHz, 64 kHz), with the amount of data matching the number of allocated audio bits, as shown in FIG. 36. The audio information is therefore decoded by the SB-ADPCM decoding unit 12 according to the allocation rate information carried in the BAS information and output as an analog signal by the D/A converter 13. The video information, meanwhile, is read from the buffer 143 and decoded by the variable-length decoding unit 142, decoded by the video decoding unit 141, converted into an analog video signal by the D/A converter 15, and output.

In the description above, the ADPCM-MQ and SB-ADPCM schemes can be used for the adaptive audio coding unit, and other possible schemes include DPCM and APC-AB. The transmitting side and the receiving side therefore need to confirm whether their schemes match.

Therefore, to learn the capabilities of each device and find a common operating mode, an ICP (in-channel connection protocol) procedure is used, as shown in FIG. 37. After power-on, frame synchronization is established using the FAS information.

As shown in FIG. 38, 8 bits of the 64-bit AC information are used to query each other's capabilities, that is, the supported coding schemes.

A common coding scheme, chosen on the basis of the query results, is specified using the BAS information. (If no common scheme exists, a procedure for communicating at a fixed coding rate is executed instead.) Automatic interconnection is achieved in this way, and conflicts with conventional equipment can be avoided.
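
The capability exchange described above can be sketched as follows. This is only an illustration, assuming each terminal advertises its supported audio coding schemes as a set of names. The function name `negotiate_scheme`, the preference order, and the `"FIXED_RATE"` label are hypothetical and do not come from the patent.

```python
# Hypothetical preference order; the patent only names these schemes.
PREFERENCE = ["SB-ADPCM", "ADPCM-MQ", "APC-AB", "DPCM"]

def negotiate_scheme(local, remote, preference=PREFERENCE):
    """Return the first mutually supported scheme, or a fixed-rate fallback."""
    common = set(local) & set(remote)
    for scheme in preference:
        if scheme in common:
            return scheme
    # No common adaptive scheme: fall back to fixed-rate communication.
    return "FIXED_RATE"
```

For example, a terminal supporting SB-ADPCM and DPCM talking to one supporting DPCM and APC-AB would settle on DPCM under this assumed ordering.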

Next, an embodiment is described that corrects the offset between the audio output and the video output caused by the difference in processing speed between audio and video encoding and decoding.

As shown in principle in FIG. 39, this embodiment consists of a transmitting unit D and a receiving unit E. The transmitting side comprises an A/D converter 1 that converts the audio input into digital audio, an audio encoding unit 2 that encodes the digital audio, an A/D converter 3 that converts the video input into digital video, a video encoding unit 4 that encodes the digital video, a delay amount calculation unit 31 that generates, from the input and output information of the video encoding unit 4, video encoding delay time information for synchronizing the video and audio reproduction outputs, and a multiplexing unit 6 that multiplexes the encoded video information, the encoded audio information, and the delay time information.

The receiving side comprises a separation unit 11 that separates the multiplexed signal from the transmission line 10 into audio information, video information, and delay time information, a variable delay control unit 32 that delays the audio information according to the delay time information, an audio decoding unit 12 that decodes the audio information from the variable delay control unit 32, a D/A converter 13 that converts the decoded digital audio into an audio signal, a video decoding unit 14 that decodes the video information, and a D/A converter 15 that converts the decoded digital video into a video output.

As shown in FIG. 39, the delay required for encoding in the video encoding unit 4 (Tv, variable) is long compared with the fixed delay Ta (constant), which is the fixed processing delay of the audio encoding unit 2 and the audio decoding unit 12 minus the fixed processing delay of the video decoding unit 14. The delay Td (variable) that the variable delay control unit 32 must apply so that the audio decoding unit 12 and the video decoding unit 14 produce their outputs simultaneously is therefore given by Td = Tv - Ta.

Since Ta is constant, the delay information Td can be obtained simply by determining Tv.

For this reason, in the present invention the delay amount calculation unit 31 calculates the encoding delay Tv from the input and output information of the video encoding unit 4, derives information on the delay Td from Tv and the constant delay Ta known in advance, and supplies it to the multiplexing unit 6. This delay time information is multiplexed with the video information and the audio information and transmitted (see FIG. 40).

On the receiving side, the separation unit 11 separates the delay time information and supplies it to the variable delay control unit 32, which delays the separated audio information by Td and passes it to the audio decoding unit 12.

Because the delay is controlled adaptively according to the input video in this way, the decoded audio output and the decoded video output can be produced in step with each other.
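
The behavior of the variable delay control unit 32 can be illustrated with a small frame queue. This is a sketch under assumptions: audio is handled in whole frames, the delay is expressed as a number of frame periods, and when the delay is reduced the oldest queued frames are simply discarded. The class name `VariableDelay` and that discard policy are hypothetical, not taken from the patent.

```python
from collections import deque

class VariableDelay:
    """Delays audio frames by a frame count that may change at run time."""

    def __init__(self):
        self.queue = deque()

    def push(self, frame, delay_frames):
        """Store one frame; return the frame that is now due, or None."""
        self.queue.append(frame)
        # If the requested delay shrank, discard the oldest frames (assumed policy).
        while len(self.queue) > delay_frames + 1:
            self.queue.popleft()
        if len(self.queue) > delay_frames:
            return self.queue.popleft()
        return None  # nothing due yet: output silence
```

With a constant delay of two frames, frame "a" is released when the third frame is pushed.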

FIG. 41 shows one embodiment of the delay amount calculation unit 31 of FIG. 39. In this embodiment it includes a buffer (BUF) 2a that stores the video information from the A/D converter 1, a selector (SEL) 2b that selects either the video information from the buffer 2a or an evaluation pattern according to a select signal, a video information amount detection unit 2c that detects the amount of video information using a memory 35, and an arithmetic control unit 2d that receives the video information amount, generates the evaluation pattern and the select signal, and receives the evaluation pattern back from the video encoding unit 4.

In operation, the arithmetic control unit 2d monitors changes in the output of the video information amount detection unit 2c. When, for example, the difference between temporally adjacent video information amounts (T1 - T0) reaches or exceeds a threshold (Th), it moves to a sequence that measures the processing delay T_VC of the video encoding unit 4.

For this measurement, the arithmetic control unit 2d supplies the select signal to the selector 2b, which switches from the video data of the buffer 2a to the evaluation pattern. The evaluation pattern is output and sent from the selector 2b to the video encoding unit 4.

The processing delay T_VC is obtained by measuring the time from when the evaluation pattern is sent until it is output from the encoding unit 4.

The evaluation pattern must be one that does not occur in the video data, and the encoding unit 4 outputs it unprocessed when it is input. The embodiment uses the evaluation pattern "00000000000000".

While the evaluation pattern is being sent, the video data is held in the buffer 2a and reading from it is inhibited, so no video data is discarded.
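
The measurement sequence can be simulated as follows. This is a toy sketch, assuming time advances in discrete ticks and the video encoder behaves like a fixed-latency pipeline that passes the evaluation pattern through unchanged. The helper names (`should_measure`, `measure_encoder_delay`, `make_fifo_encoder`) are hypothetical.

```python
from collections import deque

EVAL_PATTERN = "00000000000000"  # pattern given in the embodiment

def should_measure(prev_amount, cur_amount, threshold):
    """Trigger a measurement when (T1 - T0) reaches the threshold Th."""
    return (cur_amount - prev_amount) >= threshold

def measure_encoder_delay(encoder_step, max_ticks=1000):
    """Inject the evaluation pattern and count ticks until it reappears."""
    for tick in range(max_ticks):
        # The pattern goes in on tick 0; afterwards nothing is fed,
        # mirroring the video data being held back in buffer 2a.
        out = encoder_step(EVAL_PATTERN if tick == 0 else None)
        if out == EVAL_PATTERN:
            return tick
    raise TimeoutError("evaluation pattern never emerged")

def make_fifo_encoder(latency_ticks):
    """Toy stand-in for the video encoder: a pipeline of fixed latency."""
    pipe = deque([None] * latency_ticks)
    def step(item):
        pipe.append(item)
        return pipe.popleft()
    return step
```

Measuring a toy encoder built with `make_fifo_encoder(5)` returns 5 ticks, which plays the role of T_VC in this sketch.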

From the delay T_VC measured in this way, the arithmetic control unit 2d then determines the delay Td needed for synchronization:

Td = T_VC + T_Vd - T_aC - T_ad

Here T_Vd is the video decoding delay, T_aC is the audio encoding delay, and T_ad is the audio decoding delay. Ta = T_Vd - T_aC - T_ad can be regarded as constant compared with the video encoding delay T_VC, so by determining Ta in advance the delay Td = T_VC + Ta is obtained and can be sent to the multiplexing unit 6. (Ta as defined here has the opposite sign to the Ta used with FIG. 39, where Td = Tv - Ta.)
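
As a worked illustration of this relation, the following sketch computes Td from its four components. The function name `sync_delay_ms` and the numeric values in the usage note are illustrative assumptions, not figures from the patent.

```python
def sync_delay_ms(t_vc, t_vd, t_ac, t_ad):
    """Td = T_VC + T_Vd - T_aC - T_ad, all values in milliseconds."""
    ta = t_vd - t_ac - t_ad  # fixed part, assumed known in advance
    return t_vc + ta
```

With assumed values T_VC = 250, T_Vd = 40, T_aC = 10, and T_ad = 10, the function gives Td = 270 ms.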

FIG. 42 shows one embodiment of the multiplexing unit 6. In this embodiment it consists of a CPU 21, a CPU bus 22, a ROM 23 that stores the procedure program for sending data to the transmission line in the frame format shown in FIG. 2, interfaces (I/F) 24 to 26 for taking the video information, the audio information, and the delay time information from the delay amount calculation unit 31 onto the bus 22, a RAM 27 having an address space for temporarily writing the video information, audio information, and delay information from the interfaces 24 to 26, and an interface 28 for sending the information to the transmission line.

The transmission frame format is the one shown in FIG. 2. As shown in FIG. 6, the FAS information is framing information used to establish synchronization through two procedures: (1) Y.221 frame synchronization and (2) multiframe synchronization. Procedure (1) makes it possible to delimit frames, and procedure (2) makes it possible to identify each frame. Every frame must be identifiable so that the unit of response to changes in the BAS (Bit Allocation Signal) information, described below, can be recognized.

The BAS information carries the coding information for the amounts of audio and video information (for example, their ratio), set in advance on the transmitting side, and is used by the receiving side to separate the data after frame synchronization has been established. The BAS information is evaluated once per sub-multiframe (1 multiframe = 2 sub-multiframes).

The delay time information from the delay amount calculation unit 31 is carried in the control information (AC information) field shown in FIG. 43, using the bits DLY0 to DLY7 spread over eight frames (CODEC function field) to indicate the delay Td described above. These 8 bits allow 256 different audio delay values to be sent to the receiving device.
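
One way to map a delay onto the eight DLY bits is sketched below. The patent states only that 8 bits give 256 delay values. The 4 ms step size, the bit order (DLY0 as least significant bit), and the function names are assumptions made for illustration.

```python
STEP_MS = 4  # hypothetical resolution: 256 steps of 4 ms cover 0 to 1020 ms

def delay_to_dly_bits(td_ms, step_ms=STEP_MS):
    """Quantize Td to 8 bits and return [DLY0, ..., DLY7], one bit per frame."""
    code = max(0, min(255, round(td_ms / step_ms)))
    # DLY0 is treated as the least significant bit (assumed ordering).
    return [(code >> i) & 1 for i in range(8)]

def dly_bits_to_delay(bits, step_ms=STEP_MS):
    """Rebuild the delay in milliseconds from the eight received bits."""
    code = sum(bit << i for i, bit in enumerate(bits))
    return code * step_ms
```

Under these assumptions a 200 ms delay becomes code 50, and delays beyond the representable range saturate at code 255.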

In operation, with the frame format described above, the CPU 21 of the multiplexing unit 6 determines the bit allocation according to the BAS information stored in advance, and stores the audio information and the video information in the RAM 27. At the same time, the delay time information obtained by the delay amount calculation unit 31 is stored as control information. When the frame has been assembled, the data is sent to the transmission line 10 at a rate matching the transmission speed.

The separation unit 11 on the receiving side has the same configuration as the multiplexing unit 6 shown in FIG. 42, with the arrows simply reversed.

That is, as described above, the FAS information in the input data from the transmission line is analyzed to establish (1) Y.221 frame synchronization and (2) multiframe synchronization. The separation unit 11 then separates the data on the basis of the BAS information and sends each separated signal through its own interface: the audio information to the variable delay control unit 32 and the video information to the video decoding unit 14. The delay time information in the control information is separated in the same way and sent to the variable delay control unit 32.

The variable delay control unit 32 receives the delay time information separated by the separation unit 11, delays the audio information by Td, and sends it to the audio decoding unit 12.

As another way of determining the delay, several delay tables can be prepared in advance, as in the example below, and the most appropriate value selected according to the amount of video information generated and used as the video encoding delay T_VC.

Table example:
Intra-frame coding: T_VC = 500 msec
Inter-frame coding
(1) Generated information amount large: T_VC = 250 msec
(2) Generated information amount large: T_VC = 200 msec
(3) Generated information amount large: T_VC = 150 msec

In the above, all delays other than the variable delay T_V of the video encoding process are prepared in advance as fixed delays in the delay amount calculation unit 31 on the transmitting side. Alternatively, a delay amount calculation unit may also be provided on the receiving side to obtain the desired delay Td.
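
The table-based alternative amounts to a simple lookup. In the sketch below the millisecond values come from the table example, while the key labels (`"intra"`, `"inter"`, levels 1 to 3) and the function name `lookup_t_vc` are hypothetical.

```python
# Values from the table example; the key labels are illustrative only.
DELAY_TABLE_MS = {
    ("intra", None): 500,
    ("inter", 1): 250,
    ("inter", 2): 200,
    ("inter", 3): 150,
}

def lookup_t_vc(mode, level=None):
    """Return the tabulated video encoding delay T_VC in milliseconds."""
    key = (mode, None) if mode == "intra" else (mode, level)
    return DELAY_TABLE_MS[key]
```

How a measured amount of generated information is mapped to levels 1 to 3 is not specified in the text, so the caller is assumed to supply the level directly.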

When the transmission ratio is varied according to the content of the video and audio information as described earlier, this offset becomes smaller, but some of it still remains. A better balance between video and audio can therefore be maintained by varying the transmission ratio of video and audio and, in addition, applying a delay matched to the input information at reproduction.

Next, an example is described that exploits the delay of the video relative to the audio: the silent portions of the audio are removed, and the audio is transmitted in batches covering a time corresponding to that delay.

FIG. 44 shows the basic configuration of this embodiment. In the figure, A denotes a terminal station, 3 an A/D converter for the video signal, 41 a video encoding unit, 42 a variable-length encoding unit, 1 an A/D converter for the audio signal, 2 an audio encoding unit, 6 and 11 a multiplexing/demultiplexing unit, 9 a transmission line interface unit, 142 a variable-length decoding unit for the video signal, 141 a video decoding unit, 15 a D/A converter, 12 an audio decoding unit, 13 a D/A converter, 19 a system control unit, 71 a time-division encoding unit, and 72 a time-division decoding unit.

The video signal undergoes predictive coding in the video encoding unit 41 followed by variable-length coding, and is supplied to the multiplexing/demultiplexing unit 6, 11. The audio signal is coded by the audio encoding unit 2 at 16 kbps for a 4 kHz band or at 56 kbps for a 7 kHz band. The time-division encoding unit 71 then extracts only the signal in the active periods, with the silent periods removed, packetizes it, collects the packets over a period approximately equal to the desired delay control amount T, and supplies them to the multiplexing/demultiplexing unit 6, 11.

The signal received from the opposite terminal station is separated into a video signal and an audio signal by the multiplexing/demultiplexing unit 6, 11. The video signal passes through the variable-length decoding unit 142, the video decoding unit 141, and the D/A converter 15, and is delivered as the video output. The audio signal passes through the time-division decoding unit 72, the audio decoding unit 12, and the D/A converter 13, and is delivered as the audio output.

Based on the output of the audio encoding unit 2, the time-division encoding unit 71 extracts the active-period signal with the silent periods removed and assembles it into packets. It collects these packets over a time corresponding to the delay control amount T and supplies them to the multiplexing/demultiplexing unit 6, 11 in units of, for example, 8 kbps, from a minimum of zero to a maximum of seven units (56 kbps). The time-division encoding unit 71 then notifies the system control unit 19 and the multiplexing/demultiplexing unit 6, 11 of which audio transmission rate, from 0 kbps to 56 kbps, its output corresponds to.

On receiving this notification, the system control unit 19 changes the threshold (the second threshold) used when controlling the encoding process in the video encoding unit 41 according to the amount of data in the buffer memory of the variable-length encoding unit 42. That is, the amount of video signal transmitted is increased as the amount of audio signal transmitted decreases.

The multiplexing/demultiplexing unit 6, 11, for its part, receives the audio transmission rate and writes information notifying the opposite terminal station of that rate into the frame information of the multiplexed frame format.

FIG. 45(A) shows the transmission frame format of this embodiment, and FIG. 45(B) illustrates the configuration frame information.

The frame information is allocated 8 kbps. It contains the frame header and also the configuration frame information that notifies the opposite terminal station of the audio transmission rate. As shown in FIG. 45(B), the configuration frame information describes, as a code from "000" to "111", the audio transmission rate that the time-division encoding unit 71 has assembled within the range of 0 kbps to 56 kbps and supplied to the multiplexing/demultiplexing unit 6, 11.

In the present invention, a segment of between 0 kbps and 56 kbps is allocated for transmitting the audio signal according to the audio transmission rate, as indicated by "variable" in FIG. 45(A). If a kbps is given to the audio signal, the remaining (56 - a) kbps shown in the figure is used to transmit the video signal.
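
The split of the 56 kbps segment between audio and video can be written out directly. The sketch below assumes audio is always assigned in whole 8 kbps units, as in the embodiment. The function name `split_capacity` is hypothetical.

```python
TOTAL_KBPS = 56  # capacity shared by audio and video in FIG. 45(A)
UNIT_KBPS = 8    # audio is assigned in 8 kbps units

def split_capacity(audio_units):
    """Return (audio_kbps, video_kbps) for 0 to 7 audio units."""
    if not 0 <= audio_units <= 7:
        raise ValueError("audio_units must be in 0..7")
    audio = audio_units * UNIT_KBPS
    return audio, TOTAL_KBPS - audio
```

For example, two audio units give 16 kbps to audio and leave 40 kbps for video, and zero units leave the full 56 kbps for video.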

FIG. 46 shows how the time-division encoding unit operates.

When an audio input such as that shown in the figure is given, it is digitized by the A/D converter 1 of FIG. 44 to become the A/D output. The segments labeled "1", "2", and "3" in the figure are the signals in the active periods. The silent periods are removed from the A/D output as illustrated, and the remainder is gathered into the packetized output shown in the figure.

The packetized output is generated every predetermined period t_S as items (a), (b), (c), and so on, as shown in the packet information of the figure. These items are collected over a time corresponding to the delay control amount T and supplied to the multiplexing/demultiplexing unit 6, 11 as zero to seven units of 8 kbps each, as shown by the input data to the multiplexing/demultiplexing unit in the figure. The number of items collected determines the audio transmission rate: zero items yields the code "000" shown in FIG. 45(B), one item yields "001", and so on up to seven items, which yield "111".
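
The silence removal and rate coding can be sketched as follows. This is a simplified illustration, assuming the audio collected over the period T is already divided into units and that a caller-supplied predicate decides which units are silent. The function name `pack_active_audio` is hypothetical, and truncating to seven units is an assumption, since the text does not say what happens when more are available.

```python
def pack_active_audio(frames, is_silent, max_units=7):
    """Drop silent units and return (kept_units, 3-bit rate code)."""
    active = [f for f in frames if not is_silent(f)]
    # Assumed policy: keep at most max_units units per collection period.
    units = active[:max_units]
    # The count of kept units (0..7) is sent as a code from "000" to "111".
    code = format(len(units), "03b")
    return units, code
```

With three active units out of five, the code is "011", which corresponds to 24 kbps of audio.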

FIG. 47 shows the form of feedback control in the present invention.

The audio transmission rate generated by the time-division encoding unit 71 (the coding rate in the figure) is supplied to the system control unit 19 as the second threshold, shown as threshold 2' in the figure. The first threshold is given as a fixed value, as in the conventional case.

Based on the audio transmission rate (coding rate), the system control unit 19 changes the condition under which it instructs the video encoding unit 41 to stop encoding, as shown in FIG. 47(B), or alternatively changes the quantization table. That is, the higher the audio transmission rate, the smaller the amount of data accumulated in the buffer memory 43 at which the stop is instructed, or the coarser the quantization table selected. Conversely, the lower the audio transmission rate, the larger the amount of accumulated data at which the stop is instructed, or the finer the quantization table that can be used.
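
The dependence of the stop condition on the audio rate can be sketched as a threshold that falls as the audio rate rises. The buffer capacity, the linear mapping from 90% down to 20% of the buffer, and the function names are all assumptions chosen for illustration. The text specifies only the direction of the change.

```python
BUFFER_SIZE = 64_000  # hypothetical buffer capacity in bits

def stop_threshold(audio_units, buffer_size=BUFFER_SIZE):
    """Buffer fill level at which video encoding is halted."""
    if not 0 <= audio_units <= 7:
        raise ValueError("audio_units must be in 0..7")
    # Assumed linear mapping: 0 units -> 90% full, 7 units -> 20% full.
    percent = 90 - 10 * audio_units
    return buffer_size * percent // 100

def should_stop_encoding(buffer_fill, audio_units):
    """True when the buffer has reached the stop level for this audio rate."""
    return buffer_fill >= stop_threshold(audio_units)
```

With these assumed numbers, a buffer holding 20,000 bits stops the encoder when audio uses all seven units but not when audio is silent.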

"Control A" in the figure is the control signal to the video encoding unit 3, and "Control B" is the control signal that notifies the multiplexing/demultiplexing unit 6, 11 of the audio transmission rate.

[Potential for industrial use]

The present invention is used in video conferencing systems and the like that employ a video/audio multiplex transmission system. It is most effective in comparatively low-end systems whose transmission capacity cannot be called sufficient, but it can also be applied where the transmission capacity is large.

Continuation of the front page
(31) Priority claim number: Japanese Patent Application No. Hei 1-66783 (32) Priority date: March 18, 1989 (Heisei 1) (33) Priority claiming country: Japan (JP)
(31) Priority claim number: Japanese Patent Application No. Hei 1-178454 (32) Priority date: July 11, 1989 (Heisei 1) (33) Priority claiming country: Japan (JP)
(56) References: JP-A-64-16040 (JP, A); JP-A-63-7042 (JP, A); JP-A-55-136741 (JP, A); JP-A-60-144037 (JP, A)

Claims

(57) [Claims]

1. A video/audio multiplex transmission system having a transmitting section (B) that comprises: an A/D converter (1) that converts an audio input into digital audio; an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information; an A/D converter (3) that converts a video input into digital video; a video encoding unit (4) that encodes the digital video and outputs it as encoded video information; an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video information and outputs the ratio as an allocation signal; and a multiplexing unit (6) that multiplexes the encoded audio based on the allocation signal, the encoded video, and control information including the allocation signal so as to give a constant transmission frame length, wherein the audio encoding unit (2) is an SB-ADPCM encoding unit that encodes the digital audio by dividing it into a low-frequency bit portion and a high-frequency bit portion, and selects the allocation amount of the high-frequency bit portion according to the allocation signal.

2. A video/audio multiplex transmission system having a transmitting section (B) that comprises: an A/D converter (1) that converts an audio input into digital audio; an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information; an A/D converter (3) that converts a video input into digital video; a video encoding unit (4) that encodes the digital video and outputs it as encoded video information; an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video information and outputs the ratio as an allocation signal; and a multiplexing unit (6) that multiplexes the encoded audio based on the allocation signal, the encoded video, and control information including the allocation signal so as to give a constant transmission frame length, wherein the transmitting section further comprises an inter-frame change rate determination unit (8) that obtains the inter-frame information change rate of the digital video, compares it with a threshold, and outputs the determination result to the encoding control unit (5).

3. A video/audio multiplex transmission system having a transmitting section (B) that comprises: an A/D converter (1) that converts an audio input into digital audio; an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information; an A/D converter (3) that converts a video input into digital video; a video encoding unit (4) that encodes the digital video and outputs it as encoded video information; an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video information and outputs the ratio as an allocation signal; and a multiplexing unit (6) that multiplexes the encoded audio based on the allocation signal, the encoded video, and control information including the allocation signal so as to give a constant transmission frame length, wherein the video encoding unit (4) comprises a video encoding unit that encodes the digital video, a variable-length encoding unit (42) that variable-length codes the encoded video information, a buffer (43) that temporarily stores the variable-length-coded video information, and a buffer determination unit (44) that determines the amount of information stored in the buffer (43) and outputs a storage amount signal to the encoding control unit (5).

4. A video/audio multiplex transmission system having a transmitting section (B) that comprises: an A/D converter (1) that converts an audio input into digital audio; an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information; an A/D converter (3) that converts a video input into digital video; a video encoding unit (4) that encodes the digital video and outputs it as encoded video information; an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video information and outputs the ratio as an allocation signal; and a multiplexing unit (6) that multiplexes the encoded audio based on the allocation signal, the encoded video, and control information including the allocation signal so as to give a constant transmission frame length, wherein the transmitting section (B) further comprises a delay amount calculation unit (31) that generates, from the input and output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

5. A video/audio multiplex transmission system having a receiving section (C) that comprises: a separation unit (11) that receives and separates a signal from a transmission line (10) in which encoded audio, encoded video, and a control signal including an allocation signal are multiplexed; an audio decoding unit (12) that decodes the encoded audio into decoded digital audio; a video decoding unit (14) that decodes the encoded video into decoded digital video; a decoding control unit (16) that controls the audio decoding unit (12) and the video decoding unit (14) based on the allocation signal; a D/A converter (13) that converts the decoded digital audio into an audio signal; and a D/A converter (15) that converts the decoded digital video into a video signal, wherein the receiving section (C) further comprises a variable delay control unit (32) that delays the audio information according to delay time information multiplexed in the multiplexed signal from the transmission line.

6. A video/audio multiplex transmission system that transmits while changing the transmission ratio of audio and video according to the transmission content, the system comprising: a transmission unit (B) including an A/D converter (1) that converts an audio input into digital audio, an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information, an A/D converter (3) that converts a video input into digital video, a video encoding unit (4) that encodes the digital video and outputs it as encoded video, an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video and outputs it as an allocation signal, and a multiplexing unit (6) that multiplexes the encoded audio and the encoded video based on the allocation signal, together with control information including the allocation signal, so as to have a constant transmission frame length; and a receiving unit (C) including a separation unit (11) that receives, from a transmission line (10), the signal in which the encoded audio, the encoded video, and the control information including the allocation signal are multiplexed, and separates it, an audio decoding unit (12) that decodes the encoded audio into decoded digital audio, a video decoding unit (14) that decodes the encoded video into decoded digital video, a decoding control unit (16) that controls the audio decoding unit (12) and the video decoding unit (14) based on the allocation signal, a D/A converter (13) that converts the decoded digital audio into an audio signal, and a D/A converter (15) that converts the decoded digital video into a video signal, wherein the audio encoding unit (2) is an SB-ADPCM encoding unit that divides the digital audio into a low-frequency bit portion and a high-frequency bit portion and encodes it, the allocation amount of the high-frequency bit portion being selected by the allocation signal, and the audio decoding unit (12) SB-ADPCM-decodes the encoded audio according to the allocation signal.
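
As an informal illustration of the constant-frame-length multiplexing recited in the claims (not part of the claims themselves), the following is a minimal Python sketch. The frame length, the size of the control field, and the table of audio allocations are assumptions chosen for the example (a 10 ms frame on a 128 kb/s line, with audio at 16, 32, or 56 kb/s as in the background section); the names `multiplex` and `separate` are hypothetical.

```python
FRAME_BITS = 1280                   # assumed: 10 ms of a 128 kb/s line
HEADER_BITS = 8                     # assumed size of the control field
AUDIO_BITS_TABLE = (160, 320, 560)  # assumed: 16, 32, 56 kb/s per 10 ms frame


def multiplex(audio: str, video: str, alloc_index: int) -> str:
    """Pack control field, audio, and video into one fixed-length frame.

    Payloads are bit strings ('0'/'1'). Each payload is truncated or
    zero-padded to its allotted size, so the frame length never changes.
    """
    audio_bits = AUDIO_BITS_TABLE[alloc_index]
    video_bits = FRAME_BITS - HEADER_BITS - audio_bits
    header = format(alloc_index, f"0{HEADER_BITS}b")  # the allocation signal
    audio_field = audio[:audio_bits].ljust(audio_bits, "0")
    video_field = video[:video_bits].ljust(video_bits, "0")
    return header + audio_field + video_field


def separate(frame: str) -> tuple[int, str, str]:
    """Recover the allocation signal, then split audio from video with it."""
    if len(frame) != FRAME_BITS:
        raise ValueError("unexpected frame length")
    alloc_index = int(frame[:HEADER_BITS], 2)
    audio_bits = AUDIO_BITS_TABLE[alloc_index]
    audio = frame[HEADER_BITS:HEADER_BITS + audio_bits]
    video = frame[HEADER_BITS + audio_bits:]
    return alloc_index, audio, video
```

Because the receiver reads the allocation from the control field before splitting the payload, the audio/video boundary can move from frame to frame without any change in the frame length.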

7. A video/audio multiplex transmission system that transmits while changing the transmission ratio of audio and video according to the transmission content, the system comprising: a transmission unit (B) including an A/D converter (1) that converts an audio input into digital audio, an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information, an A/D converter (3) that converts a video input into digital video, a video encoding unit (4) that encodes the digital video and outputs it as encoded video, an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video and outputs it as an allocation signal, and a multiplexing unit (6) that multiplexes the encoded audio and the encoded video based on the allocation signal, together with control information including the allocation signal, so as to have a constant transmission frame length; and a receiving unit (C) including a separation unit (11) that receives, from a transmission line (10), the signal in which the encoded audio, the encoded video, and the control information including the allocation signal are multiplexed, and separates it, an audio decoding unit (12) that decodes the encoded audio into decoded digital audio, a video decoding unit (14) that decodes the encoded video into decoded digital video, a decoding control unit (16) that controls the audio decoding unit (12) and the video decoding unit (14) based on the allocation signal, a D/A converter (13) that converts the decoded digital audio into an audio signal, and a D/A converter (15) that converts the decoded digital video into a video signal, wherein the transmission unit further comprises an inter-frame change rate determination unit (8) that obtains the inter-frame information change rate of the digital video, compares it with a threshold value, and outputs the determination result to the encoding control unit (5).
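
For illustration only, the threshold comparison performed by the inter-frame change rate determination unit (8) could be sketched as follows. The claims do not define how the change rate is measured; the per-pixel difference test, the default `pixel_delta`, the default `threshold`, and both function names are assumptions made for this example.

```python
def interframe_change_rate(prev, curr, pixel_delta=8):
    """Fraction of pixels whose value changed by more than pixel_delta.

    prev and curr are equal-length sequences of pixel intensities.
    This particular metric is an assumption; the claims only require
    some inter-frame information change rate.
    """
    if len(prev) != len(curr) or not prev:
        raise ValueError("frames must be non-empty and the same size")
    changed = sum(1 for a, b in zip(prev, curr) if abs(a - b) > pixel_delta)
    return changed / len(prev)


def change_rate_is_high(prev, curr, threshold=0.25):
    """Determination result passed to the encoding control unit."""
    return interframe_change_rate(prev, curr) > threshold
```

A `True` result indicates a frame with a lot of motion, which is the case in which the encoding control unit would be expected to give video a larger share of the frame.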

8. A video/audio multiplex transmission system that transmits while changing the transmission ratio of audio and video according to the transmission content, the system comprising: a transmission unit (B) including an A/D converter (1) that converts an audio input into digital audio, an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information, an A/D converter (3) that converts a video input into digital video, a video encoding unit (4) that encodes the digital video and outputs it as encoded video, an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video and outputs it as an allocation signal, and a multiplexing unit (6) that multiplexes the encoded audio and the encoded video based on the allocation signal, together with control information including the allocation signal, so as to have a constant transmission frame length; and a receiving unit (C) including a separation unit (11) that receives, from a transmission line (10), the signal in which the encoded audio, the encoded video, and the control information including the allocation signal are multiplexed, and separates it, an audio decoding unit (12) that decodes the encoded audio into decoded digital audio, a video decoding unit (14) that decodes the encoded video into decoded digital video, a decoding control unit (16) that controls the audio decoding unit (12) and the video decoding unit (14) based on the allocation signal, a D/A converter (13) that converts the decoded digital audio into an audio signal, and a D/A converter (15) that converts the decoded digital video into a video signal, wherein the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).
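
The storage amount signal produced by the buffer determination unit (44) can be pictured as a coarse occupancy level. The sketch below is an assumption-laden illustration: the number of levels, the linear quantization, and the function name `storage_amount_signal` are not taken from the patent.

```python
def storage_amount_signal(stored_bits: int, capacity_bits: int, levels: int = 4) -> int:
    """Quantize buffer occupancy into an integer level 0 .. levels-1.

    Level 0 means the buffer is nearly empty; the top level means it is
    close to overflowing and video needs more transmission capacity.
    """
    if capacity_bits <= 0:
        raise ValueError("capacity must be positive")
    # Clamp so that out-of-range inputs still map to a valid level.
    fill = min(max(stored_bits / capacity_bits, 0.0), 1.0)
    return min(int(fill * levels), levels - 1)
```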

9. A video/audio multiplex transmission system that transmits while changing the transmission ratio of audio and video according to the transmission content, the system comprising: a transmission unit (B) including an A/D converter (1) that converts an audio input into digital audio, an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information, an A/D converter (3) that converts a video input into digital video, a video encoding unit (4) that encodes the digital video and outputs it as encoded video, an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video and outputs it as an allocation signal, and a multiplexing unit (6) that multiplexes the encoded audio and the encoded video based on the allocation signal, together with control information including the allocation signal, so as to have a constant transmission frame length; and a receiving unit (C) including a separation unit (11) that receives, from a transmission line (10), the signal in which the encoded audio, the encoded video, and the control information including the allocation signal are multiplexed, and separates it, an audio decoding unit (12) that decodes the encoded audio into decoded digital audio, a video decoding unit (14) that decodes the encoded video into decoded digital video, a decoding control unit (16) that controls the audio decoding unit (12) and the video decoding unit (14) based on the allocation signal, a D/A converter (13) that converts the decoded digital audio into an audio signal, and a D/A converter (15) that converts the decoded digital video into a video signal, wherein the transmission unit (B) further comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

10. A video/audio multiplex transmission system that transmits while changing the transmission ratio of audio and video according to the transmission content, the system comprising: a transmission unit (B) including an A/D converter (1) that converts an audio input into digital audio, an audio encoding unit (2) that encodes the digital audio, outputs it as encoded audio with a selectable transmission amount, and outputs audio content information, an A/D converter (3) that converts a video input into digital video, a video encoding unit (4) that encodes the digital video and outputs it as encoded video, an encoding control unit (5) that determines the transmission ratio of the encoded audio and the encoded video according to the information amount of at least one of the encoded audio and the encoded video and outputs it as an allocation signal, and a multiplexing unit (6) that multiplexes the encoded audio and the encoded video based on the allocation signal, together with control information including the allocation signal, so as to have a constant transmission frame length; and a receiving unit (C) including a separation unit (11) that receives, from a transmission line (10), the signal in which the encoded audio, the encoded video, and the control information including the allocation signal are multiplexed, and separates it, an audio decoding unit (12) that decodes the encoded audio into decoded digital audio, a video decoding unit (14) that decodes the encoded video into decoded digital video, a decoding control unit (16) that controls the audio decoding unit (12) and the video decoding unit (14) based on the allocation signal, a D/A converter (13) that converts the decoded digital audio into an audio signal, and a D/A converter (15) that converts the decoded digital video into a video signal, wherein the receiving unit (C) further comprises a variable delay control unit (32) that delays the audio information according to delay time information multiplexed in the multiplexed signal from the transmission line.

11. The video/audio multiplex transmission system according to any one of claims 1 to 4, wherein the audio encoding unit (2) outputs a plurality of encoded audio signals having different encoding bit rates, one of which is selected based on the allocated bit rate output as the allocation signal, and the transmission unit further comprises an inter-frame change rate determination unit (8) that obtains the inter-frame information change rate of the digital video, compares it with a threshold value, and outputs the determination result to the encoding control unit (5).

12. The video/audio multiplex transmission system according to any one of claims 1 to 4, wherein the audio encoding unit (2) outputs a plurality of encoded audio signals having different encoding bit rates, one of which is selected based on the allocated bit rate output as the allocation signal, and the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).

13. The video/audio multiplex transmission system according to any one of claims 1 to 4, wherein the audio encoding unit (2) outputs a plurality of encoded audio signals having different encoding bit rates, one of which is selected based on the allocated bit rate output as the allocation signal, the transmission unit (B) further comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

14. The video / audio multiplex transmission system according to claim 5, wherein said decoding control unit (16) performs control based on the allocated bit rate which is the separated allocation signal.

15. The video/audio multiplex transmission system according to any one of claims 6 to 10, wherein the audio encoding unit (2) outputs a plurality of encoded audio signals having different encoding bit rates, one of which is selected based on the allocated bit rate output as the allocation signal, the decoding control unit (16) performs control based on the allocated bit rate that is the separated allocation signal, and the transmission unit further comprises an inter-frame change rate determination unit (8) that obtains the inter-frame information change rate of the digital video, compares it with a threshold value, and outputs the determination result to the encoding control unit (5).

16. The video/audio multiplex transmission system according to any one of claims 6 to 10, wherein the audio encoding unit (2) outputs a plurality of encoded audio signals having different encoding bit rates, one of which is selected based on the allocated bit rate output as the allocation signal, the decoding control unit (16) performs control based on the allocated bit rate that is the separated allocation signal, and the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).

17. The video/audio multiplex transmission system according to any one of claims 6 to 10, wherein the audio encoding unit (2) outputs a plurality of encoded audio signals having different encoding bit rates, one of which is selected based on the allocated bit rate output as the allocation signal, the decoding control unit (16) performs control based on the allocated bit rate that is the separated allocation signal, the transmission unit (B) further comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

18. The video/audio multiplex transmission system according to any one of claims 11 to 13, wherein the audio encoding unit (2) outputs a signal of the optimum audio bit rate, and the transmission unit further comprises an inter-frame change rate determination unit (8) that obtains the inter-frame information change rate of the digital video, compares it with a threshold value, and outputs the determination result to the encoding control unit (5).

19. The video/audio multiplex transmission system according to any one of claims 11 to 13, wherein the audio encoding unit (2) outputs a signal of the optimum audio bit rate, and the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).

20. The video/audio multiplex transmission system according to any one of claims 11 to 13, wherein the audio encoding unit (2) outputs a signal of the optimum audio bit rate, the transmission unit (B) comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

21. The video/audio multiplex transmission system according to any one of claims 18 to 20, wherein the encoding control unit (5) outputs the optimum audio bit rate as the allocation signal as it is, the transmission unit (B) comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

22. The video/audio multiplex transmission system according to any one of claims 15 to 17, wherein the audio encoding unit (2) outputs a signal of the optimum audio bit rate, and the transmission unit further comprises an inter-frame change rate determination unit (8) that obtains the inter-frame information change rate of the digital video, compares it with a threshold value, and outputs the determination result to the encoding control unit (5).

23. The video/audio multiplex transmission system according to any one of claims 15 to 17, wherein the audio encoding unit (2) outputs a signal of the optimum audio bit rate, and the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).

24. The video/audio multiplex transmission system according to any one of claims 15 to 17, wherein the audio encoding unit (2) outputs a signal of the optimum audio bit rate, the transmission unit (B) comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

25. The video/audio multiplex transmission system according to any one of claims 22 to 24, wherein the encoding control unit (5) outputs the optimum audio bit rate as the allocation signal as it is, the transmission unit (B) comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

26. The video/audio multiplex transmission system according to any one of claims 1 to 4, wherein the audio encoding unit (2) is an SB-ADPCM encoding unit that divides the digital audio into a low-frequency bit portion and a high-frequency bit portion and encodes it, the allocation amount of the high-frequency bit portion being selected by the allocation signal, and the transmission unit further comprises an inter-frame change rate determination unit (8) that obtains the inter-frame information change rate of the digital video, compares it with a threshold value, and outputs the determination result to the encoding control unit (5).

27. The video/audio multiplex transmission system according to any one of claims 1 to 4, wherein the audio encoding unit (2) is an SB-ADPCM encoding unit that divides the digital audio into a low-frequency bit portion and a high-frequency bit portion and encodes it, the allocation amount of the high-frequency bit portion being selected by the allocation signal, and the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).

28. The video/audio multiplex transmission system according to any one of claims 1 to 4, wherein the audio encoding unit (2) is an SB-ADPCM encoding unit that divides the digital audio into a low-frequency bit portion and a high-frequency bit portion and encodes it, the allocation amount of the high-frequency bit portion being selected by the allocation signal, the transmission unit (B) further comprises a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), delay time information for synchronizing the video reproduction output and the audio reproduction output, and the delay time information is also multiplexed by the multiplexing unit (6).

29. The video/audio multiplex transmission system according to claim 5, wherein the audio decoding unit SB-ADPCM-decodes the encoded audio according to the allocation signal.

30. The video/audio multiplex transmission system according to claim 18, wherein the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).

31. The video/audio multiplex transmission system according to claim 26, wherein the video encoding unit (4) comprises a video encoding unit (41) that encodes the digital video, a variable-length encoding unit (42) that variable-length-encodes the encoded video information, a buffer (43) that temporarily stores the variable-length-encoded video information, and a buffer determination unit (44) that outputs, to the encoding control unit (5), a storage amount signal corresponding to the amount of information stored in the buffer (43).

32. The video/audio multiplex transmission system according to claim 11 or 26, wherein the encoding control unit (5) generates the allocation signal based on the determination result output from the inter-frame change rate determination unit (8).

33. The video / audio multiplex transmission system according to claim 12 or 27, wherein said encoding control unit (5) generates an allocation signal based on a storage amount signal.

34. The video / audio multiplex transmission system according to claim 18, wherein the encoding control unit (5) generates an allocation signal according to the interframe change rate determination result and the optimum audio bit rate.

35. The video/audio multiplex transmission system according to claim 19 or 30, wherein the encoding control unit (5) generates the allocation signal according to the storage amount signal and the optimum audio bit rate.

36. The video/audio multiplex transmission system according to claim 30, wherein the encoding control unit (5) generates the allocation signal according to the inter-frame change rate determination result, the storage amount signal, and the optimum audio bit rate.

37. The video/audio multiplex transmission system according to claim 31, wherein the encoding control unit (5) generates the allocation signal according to the inter-frame change rate determination result and the storage amount signal.
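
Claims 32 to 37 recite which inputs the encoding control unit (5) uses to generate the allocation signal, but not the decision rule itself. The following Python sketch shows one possible rule purely as an illustration. The candidate audio rates are taken from the examples in the background section; the step-down policy, the 0.75 buffer threshold, and the name `choose_audio_rate` are assumptions.

```python
AUDIO_RATES_KBPS = (16, 32, 56)  # candidate audio rates (background examples)


def choose_audio_rate(optimum_audio_kbps: float,
                      motion_high: bool,
                      buffer_fill: float) -> int:
    """Return the audio bit rate (kb/s) to signal as the allocation.

    optimum_audio_kbps: rate requested by the audio encoding unit.
    motion_high: inter-frame change rate determination result.
    buffer_fill: video buffer occupancy in the range 0.0 to 1.0.
    """
    # Start from the smallest candidate that satisfies the audio request,
    # or the largest candidate if the request exceeds all of them.
    index = len(AUDIO_RATES_KBPS) - 1
    for i, rate in enumerate(AUDIO_RATES_KBPS):
        if rate >= optimum_audio_kbps:
            index = i
            break
    # Assumed policy: hand capacity to video when it is under pressure.
    if motion_high:
        index -= 1
    if buffer_fill > 0.75:
        index -= 1
    return AUDIO_RATES_KBPS[max(index, 0)]
```

With a quiet scene and an empty buffer the audio request is honored; with high motion and a nearly full buffer the audio rate drops toward the lowest candidate, which leaves more of the fixed-length frame for video.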

38. A video/audio multiplex transmission system having a transmission unit (D) comprising: an A/D converter (1) that converts an audio input into digital audio; an audio encoding unit (2) that encodes the digital audio; an A/D converter (3) that converts a video input into digital video; a video encoding unit (4) that encodes the digital video; a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), video coding delay time information for synchronizing the video reproduction output and the audio reproduction output; and a multiplexing unit (6) that multiplexes the encoded video information and audio information with the delay time information.

39. A video/audio multiplex transmission system having a receiving unit (E) comprising: a separation unit (11) that separates a multiplexed signal from a transmission line (10) into audio information, video information, and delay time information; a variable delay control unit (32) that delays the audio information according to the delay time information; an audio decoding unit (12) that decodes the audio information from the variable delay control unit (32); a D/A converter (13) that converts the decoded digital audio into an audio signal; a video decoding unit (14) that decodes the video information; and a D/A converter (15) that converts the decoded digital video output into a video output.

40. A video/audio multiplex transmission system comprising: a transmission unit (D) including an A/D converter (1) that converts an audio input into digital audio, an audio encoding unit (2) that encodes the digital audio, an A/D converter (3) that converts a video input into digital video, a video encoding unit (4) that encodes the digital video, a delay amount calculation unit (31) that generates, from the input/output information of the video encoding unit (4), video coding delay time information for synchronizing the video reproduction output and the audio reproduction output, and a multiplexing unit (6) that multiplexes the encoded video information and audio information with the delay time information; and a receiving unit (E) including a separation unit (11) that separates the multiplexed signal from a transmission line (10) into audio information, video information, and delay time information, a variable delay control unit (32) that delays the audio information according to the delay time information, an audio decoding unit (12) that decodes the audio information from the variable delay control unit (32), a D/A converter (13) that converts the decoded digital audio into an audio signal, a video decoding unit (14) that decodes the video information, and a D/A converter (15) that converts the decoded digital video output into a video output.
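
The synchronization mechanism of claims 38 to 40 can be sketched as follows, for illustration only. It assumes that the "input/output information" of the video encoding unit is a pair of timestamps in milliseconds and that the receiver holds each audio frame for the signalled delay; the class `VariableDelay` and the function `video_coding_delay_ms` are hypothetical names, not elements of the patent.

```python
from collections import deque


def video_coding_delay_ms(input_time_ms: int, output_time_ms: int) -> int:
    """Delay amount: how long a picture spent inside the video encoder."""
    return max(0, output_time_ms - input_time_ms)


class VariableDelay:
    """Hold audio frames and release each one after its signalled delay."""

    def __init__(self):
        self._queue = deque()  # entries are (release_time_ms, frame)

    def push(self, frame, arrival_ms: int, delay_ms: int) -> None:
        self._queue.append((arrival_ms + delay_ms, frame))

    def pop_ready(self, now_ms: int) -> list:
        """Return frames whose release time has passed, in arrival order."""
        ready = []
        while self._queue and self._queue[0][0] <= now_ms:
            ready.append(self._queue.popleft()[1])
        return ready
```

Usage: the transmitter computes `video_coding_delay_ms` for the current picture and multiplexes it into the frame; the receiver passes that value as `delay_ms` when pushing the matching audio, so the audio is decoded at about the time the delayed video becomes available.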

41. A video/audio transmission system that comprises, for a video signal, a video encoding unit (41) that encodes the video signal and a variable-length encoding unit (42) that assigns variable-length codes to the encoded result, and, for an audio signal, an audio encoding unit (2), and in which the output from the variable-length encoding unit (42) and the output corresponding to the audio encoding unit (2) are multiplexed and transmitted, and the receiving side extracts the video signal and the audio signal from the transmitted signal, wherein a time-division encoding unit (71) is provided that, based on the output from the audio encoding unit (2), extracts and packetizes the signal in the active periods of the audio; the time-division encoding unit (71) is configured to notify a system control unit (19) of the audio transmission rate; and the system control unit (19), upon receiving the audio transmission rate, changes the threshold data for selecting a quantizer according to the amount of data in the buffer memory (43) in the variable-length encoding unit (42), thereby controlling the amount of information generated by the encoding in the video encoding unit (41), so that transmission is performed in a frame format suited to the audio transmission rate.