JPH10304360A

JPH10304360A - Method for encoding video/voice, device therefor and medium for recording coded program

Info

Publication number: JPH10304360A
Application number: JP28149897A
Authority: JP
Inventors: Takao Matsumoto; 孝夫松本; Aki Yoneda; 亜旗米田; Koichi Horiuchi; 浩一堀内; Hidenori Tatsumi; 英典辰巳; Eiji Kawahara; 栄治河原; Yoshitaka Arase; 吉隆荒瀬
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1996-10-15
Filing date: 1997-10-15
Publication date: 1998-11-13

Abstract

PROBLEM TO BE SOLVED: To set encoding parameters appropriately for the basic capability of a computer by determining them based on at least one of the resolution, the frame rate, and other encoding parameters that influence processing performance. SOLUTION: The apparatus comprises encoding means 101 and encoding parameter determination means 102, which includes a resolution reference table 110. The encoding parameter determination means 102 uses the resolution reference table 110 to determine the resolution from a designated frame rate and encoding pattern, and outputs encoding parameters, including a parameter indicating the determined resolution, to the encoding means 101. The encoding means 101 performs encoding according to these parameters, so that encoding at a higher resolution can be achieved while the requested conditions are satisfied. The encoding pattern can likewise be determined from a designated resolution by using a similar reference table.
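The table lookup described in the abstract can be pictured with a minimal sketch. The table contents, the pattern strings, and the name `decide_parameters` are all hypothetical; the patent's actual table 110 is not reproduced here.

```python
# Hypothetical resolution reference table:
# (frame rate in fps, encoding pattern) -> (width, height).
# Every entry is illustrative and not taken from the patent.
RESOLUTION_TABLE = {
    (30, "IIIIII"): (320, 240),
    (30, "IPIPIP"): (240, 180),
    (30, "IBBPBB"): (160, 120),
    (15, "IIIIII"): (640, 480),
    (15, "IPIPIP"): (320, 240),
    (15, "IBBPBB"): (240, 180),
}


def decide_parameters(frame_rate, pattern):
    """Look up the resolution for a designated frame rate and encoding pattern."""
    width, height = RESOLUTION_TABLE[(frame_rate, pattern)]
    return {"frame_rate": frame_rate, "pattern": pattern,
            "resolution": (width, height)}


params = decide_parameters(30, "IBBPBB")
print(params["resolution"])
```

In this sketch a heavier pattern or a higher frame rate maps to a lower resolution, which is one plausible way such a table could trade the parameters against each other.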

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a video/audio encoding method, an encoding apparatus, and an encoding program recording medium. More particularly, it relates to an encoding method, an encoding apparatus, and an encoding program recording medium that use general-purpose computer resources to encode video, audio, or video with audio under software control as the data is captured.

[0002]

[Description of the Related Art] Techniques for digitizing video and audio, which are inherently analog, to obtain digital video and audio data have spread and developed remarkably because digital data is easy to handle in recording, transmission, editing, and duplication. One advantage of digitization is that the data can be compressed easily, and compression encoding is an important technique for recording and transmission in particular. International standards have been established for compression encoding, and among them the MPEG standard has become widespread as a general digital standard that can handle both video and audio.

[0003] In recent years, as computers and semiconductor devices such as VLSI have become faster and cheaper, so-called multimedia personal computers have reached the market at low prices. Playback of compression-encoded digital video and audio, which formerly required additional decoding hardware, can now be performed easily in software even on home and personal computers. Along with this, video and audio are now also distributed over the Internet and similar networks, and the range of uses for video and audio data encoded in conformance with standards such as MPEG is expanding.

[0004] On the other hand, the encoding process that creates such encoded video and audio data is generally difficult to perform in software on a home or personal computer, and is instead performed with additional dedicated hardware. It is also possible to record the input to a file first and encode it afterwards in software, but the conversion then takes many times the duration of the video or audio input, so such software is hardly attractive to ordinary users.

[0005] For ordinary personal computer users to be able to capture video (including moving images) and audio and create encoded data easily, it is desirable that video and audio can be captured with a capture board or sound board and encoded by software in real time as they are captured. With the development and spread of hardware, this is a field in which progress is awaited. To describe the current state of video, audio, and video/audio encoding, conventional apparatuses that perform "A. video encoding", "B. audio encoding", and "C. video/audio encoding" are described below.

[0006] A. Video encoding apparatus according to the prior art
Digitizing video, including still or moving images, in real time, capturing it into a computer, and encoding it as it is captured has been realized by using, for example, a personal computer expansion card that encodes video in real time with the MPEG system, the international standard for moving picture compression.

[0007] FIG. 58 is a block diagram showing the configuration of a video encoding apparatus realized on a computer that has such dedicated hardware. As shown, the conventional video encoding apparatus consists of encoding means 5001 and encoding parameter determination means 5002; it receives video as input image data and outputs encoded video data. The encoding means 5001 includes DCT processing means 5003, quantization means 5004, variable-length encoding means 5005, bit stream generation means 5006, inverse quantization means 5007, inverse DCT processing means 5008, and predicted image generation means 5009.

[0008] In the figure, the encoding means 5001 receives as input image data video data consisting of a series of still images obtained by digitizing the video, encodes it according to the encoding parameters that have been set, and outputs encoded data. Each still image making up the input image data is called a frame image. The encoding parameters specify the encoding type and the resolution, and are supplied by the encoding parameter determination means 5002 described below. The encoding parameter determination means 5002 determines the encoding type, which indicates intra-frame or inter-frame encoding, and the resolution, and outputs them to the encoding means 5001 as the encoding parameters.

[0009] Inside the encoding means 5001, the DCT processing means 5003 first applies DCT (discrete cosine transform) processing to the input image data and outputs DCT-transformed data. The quantization means 5004 then quantizes the DCT-transformed data and outputs quantized data. Next, the variable-length encoding means 5005 applies variable-length encoding to the quantized data, producing compression-encoded variable-length encoded data. This data is input to the bit stream generation means 5006, which outputs the encoded data, the encoding result of the apparatus, as a bit stream suitable for transmission or recording.

[0010] The inverse quantization means 5007 applies inverse quantization, the inverse of the quantization process, to the quantized data output from the quantization means 5004 and outputs inverse-quantized data. The inverse DCT processing means 5008 then applies inverse DCT processing, the inverse of the DCT process, to the inverse-quantized data and outputs inverse-DCT-transformed data. This data is input to the predicted image generation means 5009 and output as predicted image data. When the encoding parameters call for encoding that uses a predicted image, the difference between this predicted image data and the input image data is input to the DCT processing means 5003, so that the encoding means 5001 performs inter-frame encoding.

[0011] The operation of the conventional video encoding apparatus configured in this way during video encoding is described below. First, before encoding starts, the encoding parameter determination means 5002 determines the encoding parameters, namely the encoding type and the resolution, and outputs them to the encoding means 5001.

[0012] In general, compression encoding comes in two forms. Intra-frame encoding compresses the still image of one frame (one screen) by removing redundancy based on its spatial correlation (correlation within the frame). Inter-frame encoding compresses still images of frames that are close in time, for example consecutive frames, by removing redundancy based on their temporal correlation (correlation between frames).

[0013] The conventional video encoding apparatus basically performs intra-frame encoding, but encoded data with a higher compression ratio is obtained by also performing inter-frame encoding. Inter-frame encoding, however, generates a predicted image through decoding, which is the inverse of encoding, and through motion detection and motion compensation, and then takes the difference between this predicted image and the image to be encoded. These additional steps increase the processing load on the apparatus. The predicted image for inter-frame encoding is generated by one of the following: forward prediction, which predicts from data processed immediately before; backward prediction, which predicts from data processed immediately after; or bidirectional prediction, which predicts in the forward or the backward direction. In what follows, intra-frame encoding is denoted "I", forward predictive encoding "P", and bidirectional predictive encoding (including the backward direction) "B".

[0014] Image resolution is generally expressed as the number of horizontal and vertical pixels per screen, such as "320×240" or "160×120". Processing at a higher resolution, that is, with more pixels per screen, yields data with better playback quality, but the amount of data to be processed grows accordingly and so does the processing load.
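As a rough worked example of how resolution drives the amount of data, the sketch below counts pixels and 8×8 blocks per frame for the two resolutions named above. The helper name `frame_cost` is hypothetical, and the 8×8 block size is taken from the DCT description later in this document.

```python
def frame_cost(width, height, block=8):
    """Return (pixels per frame, number of block x block tiles per frame)."""
    pixels = width * height
    blocks = (width // block) * (height // block)
    return pixels, blocks


high = frame_cost(320, 240)
low = frame_cost(160, 120)
print(high, low, high[0] // low[0])
```

Doubling both dimensions quadruples the pixel and block counts, so the per-frame work grows by roughly the same factor if the cost per block stays constant.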

[0015] To conform to the MPEG standard, data must be input, output, and transferred at a fixed transfer rate, and the encoding process must run so as to satisfy this rate when outputting encoded data. When video is processed, the transfer rate is generally expressed as a frame rate, the number of frames per second.

[0016] It is therefore desirable that the encoding parameters be set, taking the processing capability of the video encoding apparatus into account, so that the frame rate is satisfied, the video is processed in real time as it is captured, and encoded data with as high a compression ratio and as good a playback quality (as high a resolution) as possible is obtained.

[0017] In the conventional video encoding apparatus, the encoding parameters can be set in advance with these factors in mind; the encoding parameter determination means 5002 holds the preset parameters and outputs them to the encoding means 5001 at encoding time. As for the encoding type, a method of determining it from information about the input video, such as scene changes, is disclosed in "Image Encoding Apparatus" (Japanese Patent Laid-Open No. 8-98185).

[0018] Of the encoding parameters input to the encoding means 5001, the parameter indicating the resolution is input to the DCT processing means 5003 and used in its processing. The parameter indicating the encoding type is used to control whether the input to the DCT processing means 5003 is the input image data itself or its difference from the predicted image.

[0019] Based on the resolution supplied by the encoding parameter determination means 5002, the DCT processing means 5003 applies DCT processing to the input frame image or to the difference data and outputs DCT-transformed data. DCT processing generally divides the data into blocks of 8×8 pixels and applies a two-dimensional discrete cosine transform to each block. The quantization means 5004 then quantizes the DCT-transformed data using a predetermined value and outputs quantized data. Quantization is generally performed as a division by the quantization step (the predetermined value mentioned above). The variable-length encoding means 5005 applies variable-length encoding to the quantized data and outputs variable-length encoded data. Variable-length encoding reduces the total amount of data by assigning codes with fewer bits to values that occur more frequently. The bit stream generation means 5006 generates a bit stream, the encoding result, from the variable-length encoded data output by the variable-length encoding means 5005 and outputs it as the output of the video encoding apparatus.
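The 8×8 transform and the division-based quantization can be sketched as follows. This is a direct, unoptimized orthonormal DCT-II with a single uniform step; the names `dct_2d` and `quantize` and the step value 16 are assumptions made for illustration, and real encoders typically use per-coefficient quantization matrices and fast transforms.

```python
import math

N = 8  # block size used by the DCT stage


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (list of N lists of N numbers)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Quantize by dividing each coefficient by the step and rounding."""
    return [[round(c / step) for c in row] for row in coeffs]


# A flat block has all of its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
q = quantize(dct_2d(flat), step=16)
print(q[0][0], q[0][1])
```

For smooth image regions most quantized coefficients become zero, which is what makes the following variable-length encoding stage effective.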

[0020] When inter-frame encoding is performed, the following additional operations take place. The inverse quantization means 5007 inverse-quantizes the quantized data output by the quantization means 5004 and outputs inverse-quantized data. The inverse DCT processing means 5008 then applies a two-dimensional inverse discrete cosine transform, the inverse of the two-dimensional discrete cosine transform, to the inverse-quantized data for each of the 8×8-pixel blocks into which the DCT processing means 5003 divided the data, and outputs inverse-DCT-transformed data. The predicted image generation means 5009 generates a predicted image from the inverse-DCT-transformed data and outputs it. The difference data between the input image data and the predicted image is then input to the DCT processing means 5003.
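The local decoding loop used for inter-frame encoding can be illustrated with a deliberately simplified sketch. To keep it short, the DCT and inverse DCT stages are omitted and quantization is applied directly to pixel values, and no motion compensation is performed; the pixel values and the step size are made up for illustration.

```python
def quantize(values, step):
    """Divide by the quantization step and round."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Inverse quantization: multiply the levels by the step."""
    return [q * step for q in levels]


STEP = 8
frame1 = [101, 104, 96, 120]   # first frame, encoded on its own (intra)
frame2 = [103, 105, 99, 150]   # next frame, mostly similar to frame1

# Intra-frame path, followed by local decoding to obtain the predicted image.
q1 = quantize(frame1, STEP)
predicted = dequantize(q1, STEP)

# Inter-frame path: only the difference from the predicted image is encoded.
residual = [cur - pred for cur, pred in zip(frame2, predicted)]
q2 = quantize(residual, STEP)

# What a decoder would reconstruct for frame2.
recon2 = [pred + d for pred, d in zip(predicted, dequantize(q2, STEP))]
print(q2, recon2)
```

Because consecutive frames are similar, most of the quantized differences are zero, which is where the higher compression ratio comes from; the price is the extra inverse quantization and reconstruction work on the encoder side.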

[0021] B. Audio encoding apparatus according to the prior art
For audio encoding, a band-division (sub-band) encoding method conforming to the MPEG Audio system is used to encode general audio over a wide band, such as human voices, music, sounds of the natural environment, and various sound effects. On a high-performance multimedia personal computer, audio captured with a sound board, which is now commonly fitted as standard, can also be encoded in real time as it is captured. As a first example of a conventional audio encoding apparatus, an apparatus that encodes input audio with this band-division encoding method is described.

[0022] Another audio encoding method conforming to MPEG1 Audio applies psychoacoustic analysis. An encoder conforming to MPEG1 Audio is originally meant to use a psychoacoustic model, taking into account the limits of human hearing and the masking effect, to decide the priority with which bits are allocated to each band. This enables highly efficient encoding matched to the static and dynamic characteristics of human hearing, but it does not affect the data format of the MPEG1 Audio standard, and MPEG1 Audio encoded data can be created even without it. As described later, psychoacoustic analysis carries a heavy processing load, and omitting it, as in the first conventional example, greatly reduces the load on the CPU. The playback sound quality, however, is lower to the extent that psychoacoustic analysis is not applied. As a second example of conventional audio encoding, encoding that applies such psychoacoustic analysis is described.

[0023] B-1. First example of an audio encoding apparatus according to the prior art
FIG. 59 is a block diagram showing the configuration of the audio encoding apparatus of the first conventional example. As shown, this apparatus consists of an audio input unit 2551, an input audio sampling unit 2553, a band division unit 2555, an encoding bit allocation unit 2556, a quantization unit 2557, an encoding unit 2558, and an encoded data recording unit 2559.

[0024] In the figure, the audio input unit 2551 inputs the audio to be encoded. Audio is generally input from a microphone or as a line input. The input audio sampling unit 2553 is realized by the input function of a sound board and a control program, and samples the audio input by the audio input unit 2551. The band division unit 2555 divides the sampled data into bands. The encoding bit allocation unit 2556 allocates encoding bits to each of the bands produced by the band division unit 2555. The quantization unit 2557 performs quantization according to the number of encoding bits allocated by the encoding bit allocation unit 2556. The encoding unit 2558 outputs the quantized values from the quantization unit 2557 as encoded audio data. Units 2555 to 2558 are all realized by the CPU, main memory, and programs of the computer. The encoded data recording unit 2559 is realized by a storage device such as a magnetic storage device and a control program for that device, and records the encoded data that is output.

[0025] FIG. 60 is a flowchart of the conventional audio encoding method, FIG. 61 is a diagram explaining the sampling process, and FIGS. 62 and 63 are diagrams explaining band division. The operation of the audio encoding apparatus of the first conventional example is described below, following the flow of FIG. 60 and referring to FIGS. 59 to 63.

[0026] In step 1 of FIG. 60, the input audio sampling unit 2553 samples the input audio signal at a preset sampling frequency fs to obtain sampled data. As shown in FIG. 61, the input audio is represented as a graph of sound pressure against time. Sampling divides this input audio into equal intervals of length ts, called the sampling period, and the sampling period ts and the sampling frequency fs are reciprocals of each other, as shown in the figure.
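The reciprocal relation between the sampling period and the sampling frequency, ts = 1/fs, can be made concrete with a short sketch. The 32 kHz rate, the 1 kHz test tone, and the function name `sample` are illustrative assumptions, not values from the source.

```python
import math


def sample(signal, fs, duration):
    """Sample a continuous-time function `signal` at frequency fs (Hz)."""
    ts = 1.0 / fs                      # sampling period in seconds
    count = int(round(duration * fs))  # number of samples in `duration` seconds
    return ts, [signal(n * ts) for n in range(count)]


def tone(t):
    """A 1 kHz sine wave standing in for the input audio."""
    return math.sin(2 * math.pi * 1000 * t)


ts, data = sample(tone, fs=32000, duration=0.01)
print(ts, len(data))
```

At 32 kHz, ten milliseconds of audio yields 320 samples, one every 31.25 microseconds.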

[0027] Step 2 of FIG. 60 and the subsequent steps consist mainly of arithmetic performed in software under the control of the CPU. In step 2, the band division unit 2555 divides the sampled data into M frequency bands. FIG. 62 is a conceptual diagram of the case in which the audio data, treated as a full-band input signal, is divided into 12 bands, producing twelve band signals from band 0 signal BPF0 to band 11 signal BPF11. FIG. 63 shows the twelve band signals obtained in this way. Unlike FIG. 61, this figure plots sound pressure against frequency rather than time.

[0028] MPEG Audio defines Layers 1 to 3. Playback sound quality improves in the order 1 → 2 → 3, but the required hardware performance rises and the hardware scale grows. For audio encoding according to Layer 1, the number p of input audio samples handled in one band division operation is p = 32. The band division uses 512 input samples centered on the 32 target samples, that is, including the samples before and after them, divides them into 32 bands, and outputs audio data for each band.
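The framing described for Layer 1, 32 new samples per band division step with a 512-sample window around them, can be sketched as below. This only shows how the windows are cut; it does not implement the actual filter bank, and the zero padding at the edges and the name `analysis_windows` are assumptions of this sketch.

```python
def analysis_windows(samples, hop=32, window=512):
    """Yield (start, block): one `window`-sample block per `hop` target samples.

    The target samples samples[start:start + hop] sit in the middle of each
    block; positions outside the input are filled with zeros.
    """
    pad = (window - hop) // 2  # context on each side of the target samples
    padded = [0.0] * pad + list(samples) + [0.0] * pad
    for start in range(0, len(samples) - hop + 1, hop):
        yield start, padded[start:start + window]


samples = [float(i) for i in range(1, 129)]  # 128 dummy input samples
windows = list(analysis_windows(samples))
print(len(windows), len(windows[0][1]))
```

With 128 input samples the band division runs four times, and each run would produce one value for each of the 32 bands.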

[0029] The M band signals obtained by band division in step 2 are passed from the band division unit 2555 to the quantization unit 2557. Meanwhile, in step 3, the encoding bit allocation unit 2556 allocates encoding bits to all M band signals. In step 4, the quantization unit 2557 quantizes the band signal data received from the band division unit 2555 band by band, according to the number of encoding bits allocated by the encoding bit allocation unit 2556, to obtain quantized values. In step 5, the encoding unit 2558 encodes and outputs the quantized values, and the encoded data recording unit 2559 records the encoded data that is output.
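Quantizing each band according to its allocated bit count might look like the following minimal sketch. It assumes band values normalized to the range -1 to 1 and a simple symmetric uniform quantizer; the scale-factor handling of a real MPEG encoder is left out, and the function names are hypothetical.

```python
def quantize_band(values, bits):
    """Uniformly quantize values in [-1, 1] with `bits` bits per value.

    A band with 0 bits is dropped entirely. A 1-bit allocation is rejected
    because this symmetric quantizer needs at least 2 bits.
    """
    if bits == 0:
        return []
    if bits < 2:
        raise ValueError("this sketch supports 0 bits or at least 2 bits")
    levels = (1 << (bits - 1)) - 1  # largest magnitude, e.g. 7 for 4 bits
    return [round(v * levels) for v in values]


def quantize_bands(bands, allocation):
    """Quantize every band with its own allocated bit count."""
    return [quantize_band(b, n) for b, n in zip(bands, allocation)]


bands = [[1.0, -1.0, 0.0, 0.3], [0.2, -0.2], [0.9, 0.1]]
print(quantize_bands(bands, allocation=[4, 0, 2]))
```

Bands given more bits keep finer detail, while a band given no bits contributes nothing to the output.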

[0030] While audio input continues, steps 1 to 5 are repeated, so that audio keeps being input, encoded in real time, output as encoded data, and recorded. When the audio input ends, encoding ends promptly. The encoded data stored in the storage device is kept as data that can be played back as MPEG. Instead of being recorded and stored, the encoded data can also be transmitted over a network or the like and used. This concludes the description of the first conventional example, an audio encoding apparatus that obtains encoded data in real time as audio is captured.

[0031] B-2. Second example of an audio encoding apparatus according to the prior art
FIG. 64 is a block diagram showing the configuration of the audio encoding apparatus of the second conventional example. As shown, this apparatus consists of an audio input unit 2651, an input audio sampling unit 2653, a band division unit 2655, a quantization unit 2657, an encoding unit 2658, an encoded data recording unit 2659, an FFT unit 2660, a psychoacoustic analysis unit 2661, and an encoding bit allocation unit 2662. This apparatus adds the FFT unit 2660 and the psychoacoustic analysis unit 2661 to the apparatus of the first example.

[0032] In the figure, the FFT (fast Fourier transform) unit 2660 applies a Fourier transform to the signal so that psychoacoustic analysis can be performed. The psychoacoustic analysis unit 2661 compares the signal processed by the FFT unit 2660 with the minimum audible limit and analyzes masking effects. Based on the analysis results of the psychoacoustic analysis unit 2661, the encoding bit allocation unit 2662 allocates encoding bits so that relatively more bits go to signals that the human ear can hear. The audio input unit 2651, input audio sampling unit 2653, band division unit 2655, quantization unit 2657, encoding unit 2658, and encoded data recording unit 2659 are the same as 2551 to 2555 and 2557 to 2559 of the first example, so their description is omitted.

[0033] FIG. 65 is a flowchart of MPEG1 Audio encoding, and FIG. 66 is a diagram showing the minimum audible limit. The operation of the audio encoding apparatus of the second example is described below with reference to FIGS. 64 to 66. Steps 1 and 2 in the flow of FIG. 65 are performed as in the first example, yielding a signal divided into M bands. As a typical example, M = 32 band signals are assumed. As in the first example, the band signals are passed from the band division unit 2655 to the quantization unit 2657.

[0034] Meanwhile, in step 3, the FFT unit 2660 divides the sampled input audio data into L bands by fast Fourier transform (FFT) processing and passes them to the psychoacoustic analysis unit 2661, which analyzes these L signals. Under MPEG Audio Layer 1, for example, 512 samples are used, and the FFT unit 2660 divides them into L = 256 bands. Under Layer 2, 1024 samples are used to output 512 bands, so the processing load is correspondingly larger.
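How 512 samples turn into 256 frequency bands can be seen from a plain radix-2 FFT, sketched below. The windowing and level calibration a real psychoacoustic model applies are omitted, and the test tone and function names are assumptions made for illustration.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_magnitudes(samples):
    """Magnitudes of the first len(samples) // 2 frequency bins."""
    spectrum = fft(samples)
    return [abs(c) for c in spectrum[:len(samples) // 2]]


# 512 samples of a sine wave that completes exactly 16 cycles in the block.
tone = [math.sin(2 * math.pi * 16 * n / 512) for n in range(512)]
bands = band_magnitudes(tone)
peak = max(range(len(bands)), key=lambda k: bands[k])
print(len(bands), peak)
```

A real input of 512 samples has only 256 independent frequency bins, which matches L = 256; doubling the block to 1024 samples doubles the number of bins and increases the transform cost accordingly.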

【００３５】心理聴覚分析部２６６１は、各帯域信号に
ついて、図６６に示すような人間の耳に聞こえない限界
レベルである最小可聴限界との比較を行う。なお、図６
６は３２帯域への分割として示してあるが、この例のよ
うに２５６個などと分割数が増加した場合でも最小可聴
限界のグラフは同じものであって、図６６に示したのと
同じ範囲に対して横軸（帯域）についての細分化が行わ
れることとなる。The psychoacoustic analysis unit 2661 compares each band signal with the minimum audible limit shown in FIG. 66, that is, the level below which a sound cannot be heard by the human ear. Although FIG. 66 is drawn for a division into 32 bands, the graph of the minimum audible limit is the same even when the number of divisions is increased, for example to 256 as in this example; the horizontal axis (band) is simply subdivided more finely over the same range as shown in FIG. 66.

【００３６】心理聴覚分析部２６６１の分析により、最
小可聴限界未満とされた帯域に対しては、後段の処理に
おいてビット割り当てが行われず、その分、他の帯域に
より多くのビットが割り当てられることとなる。For any band that the psychoacoustic analysis unit 2661 finds to be below the minimum audible limit, no bits are allocated in the subsequent processing, and correspondingly more bits are allocated to the other bands.

【００３７】又、人間の聴覚については、比較的小さな
音、すなわち音圧の小さな信号は、周波数的に、または
時間的に近接する大きな音、即ち音圧の大きい信号があ
るときには、聞き取られないというマスキング現象があ
ることが認められている。そこで心理聴覚分析部２６６
１は、各帯域の信号について近接する信号との関係を調
べ、マスキング現象によってマスクされる（聞き取れな
い）信号を検出する。ここで検出された信号について
も、後段の処理においてビットが割り当てられず、その
分、他の帯域により多くのビットが割り当てられること
となる。It is also known that human hearing exhibits a masking phenomenon: a relatively quiet sound, that is, a signal with low sound pressure, cannot be heard when a loud sound, that is, a signal with high sound pressure, is close to it in frequency or in time. The psychoacoustic analysis unit 2661 therefore examines the relationship between the signal in each band and neighboring signals, and detects signals that are masked (inaudible) by the masking phenomenon. No bits are allocated to the signals detected here in the subsequent processing either, and correspondingly more bits are allocated to the other bands.
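
The two tests described above (comparison with the minimum audible limit, then detection of masked signals) might be sketched as follows. This is a simplified illustration under stated assumptions, not the psychoacoustic model of the patent or of MPEG Audio: the function name `audible_bands`, the 20 dB masking margin, and the restriction to immediately adjacent bands are all hypothetical, and masking in time is left out entirely.

```python
def audible_bands(levels_db, limit_db, mask_drop_db=20.0):
    """Return one boolean per band: True if the band is judged audible.

    A band is judged inaudible when it lies below the minimum audible
    limit, or when an adjacent band is louder by more than mask_drop_db
    (a crude stand-in for frequency masking; the margin is an assumption).
    """
    result = []
    for i, level in enumerate(levels_db):
        if level < limit_db[i]:
            result.append(False)  # below the minimum audible limit
            continue
        neighbours = levels_db[max(0, i - 1):i] + levels_db[i + 1:i + 2]
        masked = any(n - level > mask_drop_db for n in neighbours)
        result.append(not masked)
    return result


levels = [10.0, 60.0, 35.0, 5.0, 40.0]   # made-up band levels in dB
limits = [20.0, 10.0, 10.0, 10.0, 10.0]  # made-up minimum audible limit per band
print(audible_bands(levels, limits))
```

In this made-up input, bands 0 and 3 fall below the limit and band 2 is masked by its much louder neighbor, so only bands 1 and 4 would go on to receive bits.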

【００３８】図６５のフローのステップ５では、符号化
ビット割り当て部２６５７が、心理聴覚分析部２６６１
の分析結果に応じて符号化ビットの割り当てを行う。こ
こでは、Ｌ帯域の分析結果について、Ｍ帯域に対する割
り当てが行われる。従って、人間の耳に聞こえない、ま
たは聞こえにくい信号についてはビットが割り当てられ
ず、その分、よく聞こえる信号に対して多くのビットが
割り当てられる。ステップ６以降については、第１の例
と同様であり、ステップ１〜７が繰り返されることで、
音声入力に伴った音声符号化が行われる。In step 5 of the flow in FIG. 65, the coded bit allocation unit 2657 allocates coded bits according to the analysis result of the psychoacoustic analysis unit 2661. Here, the analysis result for the L bands is mapped onto an allocation for the M bands. Consequently, no bits are allocated to signals that the human ear cannot hear or can hardly hear, and correspondingly more bits are allocated to clearly audible signals. Step 6 and the subsequent steps are the same as in the first example, and audio is encoded as it is input by repeating steps 1 to 7.
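
One way to picture how an analysis over L bands can drive an allocation over M bands is the greedy sketch below. It is an assumption-laden illustration rather than the allocation rule of the patent or of the MPEG standard: the name `allocate_bits`, the grouping of L/M consecutive analysis bands per coded band, and the rule of thumb of about 6 dB per bit are all introduced here for the example.

```python
def allocate_bits(audible_db, m_bands, total_bits, db_per_bit=6.0):
    """Distribute total_bits over m_bands coded bands.

    audible_db holds one entry per analysis band (L entries): the level
    in dB by which the band exceeds audibility, or None when the band
    was judged inaudible (below the limit or masked).
    """
    group = len(audible_db) // m_bands  # analysis bands per coded band
    need = []
    for b in range(m_bands):
        levels = [v for v in audible_db[b * group:(b + 1) * group]
                  if v is not None]
        # A coded band with no audible component never receives bits.
        need.append(max(levels) if levels else float("-inf"))

    bits = [0] * m_bands
    for _ in range(total_bits):
        b = max(range(m_bands), key=lambda i: need[i])
        if need[b] <= 0:
            break  # nothing audible is left to refine
        bits[b] += 1
        need[b] -= db_per_bit  # assumed gain of each additional bit
    return bits


# L = 8 analysis bands mapped onto M = 4 coded bands, 6 bits available.
print(allocate_bits([12.0, None, None, None, 30.0, 6.0, 3.0, None], 4, 6))
```

The second coded band contains only inaudible analysis bands and so receives nothing, and its share goes to the bands that are heard, which is the behavior the paragraph describes.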

【００３９】このようにして、人間の聴覚により聞き取
りやすい音声に、より多くの符号化ビットが割り当てら
れることにより、心理聴覚分析を取り入れたＭＰＥＧオ
ーディオの音声符号化では、再生音質の良好な符号化音
声データを得ることが可能となる。In this way, more coded bits are allocated to the sounds that human hearing perceives most readily, so MPEG Audio encoding that incorporates psychoacoustic analysis can produce coded audio data with good reproduction quality.

【００４０】Ｃ．従来の技術による映像音声符号化装置図６７は従来の技術による映像音声符号化装置の概略構
成を示す図である。図示するように、従来の技術による
映像符号化装置は、ビデオカメラ２７０１、音声キャプ
チャ部２７０２、音声符号化部２７０３、映像キャプチ
ャ部２７０４、および映像符号化部２７０５から構成さ
れている。当該映像音声符号化装置からは、図示するよ
うに符号化音声情報と符号化映像情報とが装置出力とし
て出力され、これらは必要に応じて伝送されたり記録さ
れることとなる。C. Conventional video/audio encoding apparatus. FIG. 67 is a diagram showing the schematic configuration of a conventional video/audio encoding apparatus. As shown in the figure, the conventional apparatus comprises a video camera 2701, an audio capture unit 2702, an audio encoding unit 2703, a video capture unit 2704, and a video encoding unit 2705. As shown in the figure, the video/audio encoding apparatus outputs coded audio information and coded video information as its device outputs, and these are transmitted or recorded as required.

【００４１】同図において、ビデオカメラ２７０１は、
映像音声情報を取り込み、アナログ音声情報とアナログ
映像情報とに分けて出力する。音声キャプチャ部２７０
２は、ビデオカメラ２７０１から出力されたアナログ音
声情報を入力し、離散的なデジタルデータからなるデジ
タル原音声情報として出力する。音声符号化部２７０３
は、原音声情報を圧縮符号化処理し、符号化音声情報を
出力する。映像キャプチャ部２７０４は、ビデオカメラ
２７０１から出力されたアナログ映像情報を入力し、離
散的なデジタルデータからなり、単位時間ごとの静止画
像の複数枚から構成されるデジタルの原映像情報を出力
する。映像符号化部２７０５は、映像キャプチャ部２７
０４から出力された原映像情報を入力し、圧縮符号化し
て符号化映像情報を出力する。In the figure, the video camera 2701 captures video/audio information and outputs it separately as analog audio information and analog video information. The audio capture unit 2702 receives the analog audio information output from the video camera 2701 and outputs it as digital original audio information consisting of discrete digital data. The audio encoding unit 2703 compresses and encodes the original audio information and outputs coded audio information. The video capture unit 2704 receives the analog video information output from the video camera 2701 and outputs digital original video information, which consists of discrete digital data and is made up of a plurality of still images, one per unit time. The video encoding unit 2705 receives the original video information output from the video capture unit 2704, compresses and encodes it, and outputs coded video information.

【００４２】このように構成される従来の技術による映
像音声符号化装置における、映像音声のとりこみにとも
なっての符号化の際の動作を以下に説明する。まず、ビ
デオカメラ２７０１が映像音声情報を取り込み、アナロ
グ音声情報とアナログ映像情報とに分けて出力する。The operation of the conventional video/audio encoding apparatus configured in this way, when encoding video and audio as they are captured, is described below. First, the video camera 2701 captures video/audio information and outputs it separately as analog audio information and analog video information.

【００４３】アナログ音声情報は、音声キャプチャ部２
７０２に入力され、音声キャプチャ部２７０２は、アナ
ログ／デジタル変換処理によって、デジタル原音声情報
を作成して、これを音声符号化部２７０３に出力する。
一方、アナログ映像情報は、映像キャプチャ部２７０４
に入力され、映像キャプチャ部２７０４は、アナログ／
デジタル変換処理によって、複数の静止画像情報からな
るデジタル原映像情報を作成して、これを映像符号化部
２７０５に出力する。The analog audio information is input to the audio capture unit 2702, which creates digital original audio information by analog-to-digital conversion and outputs it to the audio encoding unit 2703. Meanwhile, the analog video information is input to the video capture unit 2704, which creates digital original video information consisting of a plurality of pieces of still image information by analog-to-digital conversion and outputs it to the video encoding unit 2705.

【００４４】音声符号化部２７０３は、原音声情報に対
して符号化処理を行って、符号化音声情報を出力する。
一方、映像符号化部２７０５は、原映像情報に対して符
号化処理を行って符号化映像情報を出力する。The audio encoding unit 2703 performs an encoding process on the original audio information and outputs encoded audio information.
On the other hand, the video encoding unit 2705 performs an encoding process on the original video information and outputs encoded video information.

【００４５】ビデオカメラ２７０１から、映像音声のと
りこみが続く間は、音声キャプチャ部２７０２、音声符
号化部２７０３、映像キャプチャ部２７０４、および映
像符号化部２７０５によるデジタル化と符号化が継続さ
れ、映像音声のとりこみが終了した後、デジタル化と符
号化も終了する。While the capture of video and audio from the video camera 2701 continues, digitization and encoding by the audio capture unit 2702, audio encoding unit 2703, video capture unit 2704, and video encoding unit 2705 continue; once the capture of video and audio ends, digitization and encoding end as well.

【００４６】[0046]

【発明が解決しようとする課題】従来の技術のＡ〜Ｃの
例おいて示したように、従来の技術による映像符号化装
置、音声符号化装置、および映像音声符号化装置は、映
像、音声、または映像音声のとりこみに伴って、符号化
処理を行い、符号化映像データ、符号化音声データ、ま
たは符号化映像データと符号化音声データとを出力し
て、記録や伝送しての利用に供するものである。As shown in examples A to C of the prior art, the conventional video encoding apparatus, audio encoding apparatus, and video/audio encoding apparatus perform encoding as video, audio, or video and audio are captured, and output coded video data, coded audio data, or both coded video data and coded audio data, for use in recording or transmission.

【００４７】Ａ．従来の技術による映像符号化の問題点しかしながら、従来技術のＡに示す、リアルタイム処理
の可能な映像符号化装置を、例えばパーソナルコンピュ
ータ（ＰＣ）等の汎用計算機システムにおいて、符号化
処理を行うソフトウェアを実行するものとして実現しよ
うとした場合には、当該ソフトウェアは様々な環境（周
辺機器、ネットワーク環境等）におかれた、多様な性能
のハードウェアにおいて実行され得るものであることか
ら、以下のような問題点につながることとなる。A. Problems of conventional video encoding. However, suppose that the real-time video encoding apparatus shown in A of the prior art is to be realized by executing encoding software on a general-purpose computer system such as a personal computer (PC). Such software may be executed on hardware of widely varying performance placed in a variety of environments (peripheral devices, network environment, and so on), which leads to the following problems.

【００４８】例えば上記のリアルタイム映像符号化装置
をＰＣ上で動作するアプリケーションソフトとして実現
するものとして、入力された映像を、とりこみに伴って
のリアルタイム処理をし、３２０×２４０の解像度にお
いてＭＰＥＧ１の規格に従った符号化をする場合に、符
号化タイプとしてフレーム内符号化である「Ｉ」と、フ
レーム間予測符号化である「Ｐ」および「Ｂ」について
「ＩＢＢＰＢＢ」の順番で繰り返すパターンを選択した
とする。この場合に、基本的なハードウェア性能が比較
的高い場合、例えば動作周波数１６６MHz の制御装置
（ＣＰＵ）を有する場合に、かかるソフトウェア処理を
行うものであれば、上記の設定に従い、「ＩＢＢＰＢ
Ｂ」のパターンを有する符号化タイプにおいて６つのフ
レーム画像を処理して、６／３０秒を要するものとす
る。この場合であれば、結果として３０フレーム/ 秒で
映像をリアルタイム符号化できることとなる。For example, suppose that the above real-time video encoding apparatus is realized as application software running on a PC, that input video is processed in real time as it is captured and encoded according to the MPEG1 standard at a resolution of 320 × 240, and that, as the encoding types, a pattern that repeats in the order “IBBPBB” is selected, where “I” denotes intra-frame encoding and “P” and “B” denote inter-frame predictive encoding. Suppose further that, when the basic hardware performance is relatively high, for example with a control device (CPU) operating at 166 MHz, the software processes the six frame images of the “IBBPBB” pattern in 6/30 second under these settings. In that case, the video can be encoded in real time at 30 frames/second.

【００４９】一方、基本的なハードウェア性能が低い場
合、例えば動作周波数１００MHz の制御装置（プロセッ
サ、ＣＰＵ）を有する場合に、かかるソフトウェア処理
を行うものであれば、上記のような符号化処理を６／３
０秒では行い得なくなり、得られた符号化データのフレ
ームレートが小さなものとなる。符号化結果において、
フレームレートが、３０（フレーム／秒）以下である
と、その符号化結果からの再生により得られる映像は動
きがぎこちないものになるので、このような場合には良
好な符号化をえないことになってしまう。On the other hand, when the basic hardware performance is low, for example with a control device (processor, CPU) operating at 100 MHz, the software can no longer complete the above encoding in 6/30 second, and the frame rate of the resulting coded data drops. If the frame rate of the encoded result falls below 30 frames/second, the video reproduced from it moves jerkily, so a good encoding result cannot be obtained in such a case.
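
The arithmetic behind the 166 MHz and 100 MHz cases can be made concrete with a short sketch. The function names are hypothetical, and the scaling rule (encoding time inversely proportional to clock frequency) is an assumption made here purely for illustration; the patent states only that the slower machine cannot finish in 6/30 second.

```python
def achieved_frame_rate(frames_in_pattern, seconds_for_pattern):
    """Frames encoded per second when one pattern takes the given time."""
    return frames_in_pattern / seconds_for_pattern


def scaled_time(seconds_at_reference, reference_mhz, actual_mhz):
    """Assumed model: encoding time is inversely proportional to clock rate."""
    return seconds_at_reference * reference_mhz / actual_mhz


# Case from the text: six frames (IBBPBB) in 6/30 second at 166 MHz.
fast = achieved_frame_rate(6, 6 / 30)

# Same workload at 100 MHz under the assumed scaling model.
slow = achieved_frame_rate(6, scaled_time(6 / 30, 166, 100))

print(round(fast, 2), round(slow, 2))
```

Under this assumed model the 100 MHz machine reaches only about 18 frames/second, well short of the 30 frames/second needed for smooth motion.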

【００５０】同様の事態は、例えばこのようなソフトウ
ェア処理を、マルチタスクオペレーティングシステム上
の一つのタスク（作業）として実行する際に、他のタス
ク（作業）として、ワードプロセッサ等の別のアプリケ
ーションソフトが実行された場合や、割り込みによる中
断があった場合などでは、比較的高性能なハードウェア
環境においても起こり得るものである。A similar situation can arise even in a relatively high-performance hardware environment, for example when such software processing runs as one task on a multitasking operating system and another application such as a word processor is executed as another task, or when processing is suspended by an interrupt.

【００５１】また、同様の符号化を解像度「３２０×２
４０」でならば問題なく実行できても、解像度「６４０
×４００」で実行したならば、処理速度が十分なものと
ならず、フレームレート低下による問題が起こるという
こともある。Likewise, even if the same encoding runs without problems at a resolution of “320 × 240”, running it at a resolution of “640 × 400” may leave the processing speed insufficient and cause problems due to a reduced frame rate.

【００５２】以上は、ハードウェア性能が不足となる場
合の問題点であるが、逆に高性能なハードウェアを生か
し切れないこととなる事態も起こり得る。例えば、動作
周波数１６６MHz の制御装置（ＣＰＵ）を有するハード
ウェアにおいて、入力された映像を、とりこみに伴って
のリアルタイム処理をし、３２０×２４０の解像度にお
いてＭＰＥＧ１の規格に従った符号化をする場合に、符
号化タイプとしてフレーム内符号化である「Ｉ」のみを
用いることとして処理するならば、この条件での処理で
あれば、「Ｉ」タイプにおいて１フレーム画像を１/ ３
０秒で処理できることから、結果として３０フレーム/
秒で映像をリアルタイム符号化できるものとなる。The above are problems that arise when hardware performance is insufficient, but the opposite situation, in which high-performance hardware cannot be fully exploited, can also occur. For example, suppose that hardware with a control device (CPU) operating at 166 MHz processes input video in real time as it is captured and encodes it according to the MPEG1 standard at a resolution of 320 × 240, using only “I”, that is, intra-frame encoding, as the encoding type. Under these conditions, one frame image of type “I” can be processed in 1/30 second, so the video can be encoded in real time at 30 frames/second.

【００５３】これに対して、基本的なハードウェア性能
がより高い場合、例えば動作周波数２００MHz の制御装
置（ＣＰＵ）を有する場合に、かかるソフトウェア処理
を行うものであれば、本来上記「Ｉ」タイプの１フレー
ム画像処理を１／３０秒より短時間で行い得るものであ
ることから、ハードウェア性能を生かせない事態となっ
てしまう。高性能な制御装置を用いることは、それだけ
コストも高いものとなることから、このことは、映像符
号化装置としてはコストパフォーマンスが良くないとい
うことを意味する。In contrast, when the basic hardware performance is higher, for example with a control device (CPU) operating at 200 MHz, the same software could process one frame image of type “I” in less than 1/30 second, so the hardware performance is left unused. Since a higher-performance control device also costs more, this means that the cost performance of the video encoding apparatus is poor.

【００５４】この場合には、例えば、「Ｉ」タイプのみ
ならず、「Ｐ」や「Ｂ」の符号化タイプを用いて処理を
行えば、同等の画質において圧縮率の高い符号化データ
が得られるものであるから、上記のように「Ｉ」タイプ
のみにおいて圧縮率の低い符号化データを生成すること
は、結局装置資源を活用していなかったということにな
る。In this case, processing with the “P” and “B” encoding types in addition to the “I” type would yield coded data with a higher compression ratio at equivalent image quality. Generating coded data with a low compression ratio using the “I” type alone, as described above, therefore means that the device resources were not being fully used.

【００５５】同様のことは、マルチタスクオペレーティ
ングシステム上での実行などで、予想以上に計算機資源
（ＣＰＵ時間の割り当て）を利用できた場合や、「３２
０×２４０」よりも低解像度である「１６０×１２０」
の解像度で符号化を行う場合などにも起こり得るもので
ある。The same can happen when more computer resources (allocated CPU time) than expected are available, for example when running on a multitasking operating system, or when encoding at a resolution of “160 × 120”, which is lower than “320 × 240”.

【００５６】Ｂ．従来の技術による音声符号化の問題点Ｂ−１の第１の例のように行われる従来の音声符号化方
法では、サウンドボードを装備したマルチメディアパソ
コン等において、ソフトウェア処理により、音声取り込
みに伴ったリアルタイムでの音声符号化が可能である。
しかし、このことは即ち、音声の入力に伴った実時間符
号化に十分な性能を有する装置が用いられることが前提
であり、目的に適応して設計されたＬＳＩで構成した
り、十分な性能を持つ制御装置（プロセッサ）を選択し
たりして対応がされていた。または、十分な性能を有し
ていないプロセッサでの処理においては、途中でデータ
をファイルとして記録し、この記録されたデータを処理
するなど、実時間の何倍かの時間を許して符号化処理を
行うほかはなかった。B. Problems of conventional audio encoding. With the conventional audio encoding method performed as in the first example of B-1, a multimedia personal computer or the like equipped with a sound board can encode audio in real time as it is captured, by software processing. However, this presupposes the use of a device with sufficient performance for real-time encoding of incoming audio, and the requirement was met by building the device from an LSI designed for the purpose or by selecting a control device (processor) with sufficient performance. Otherwise, with a processor lacking sufficient performance, there was no choice but to record the data to a file partway through and then process the recorded data, allowing the encoding to take several times longer than real time.

【００５７】すなわち、ＭＰＥＧＡｕｄｉｏなどで用い
られる帯域分割符号化処理を、ソフトウェア的にＣＰＵ
で実行させて、音声入力に伴った実時間処理を行おうと
する場合、当該ソフトウェアを実行するハードウェア環
境、代表的なものとしてはＣＰＵ性能によって可能か、
不可能かが決定されてしまい、例えば、ＣＰＵ性能に対
応した符号化レベルで実時間符号化することなどは出来
なかった。That is, when the band division encoding used in MPEG Audio and the like is executed in software on a CPU in an attempt to process audio in real time as it is input, whether this is possible at all is determined by the hardware environment on which the software runs, typically the CPU performance; it was not possible, for example, to encode in real time at an encoding level matched to the CPU performance.

【００５８】又、上記のように構成された音声符号化装
置は、当初の設定に従って、一定レートで音声を入力
し、実時間符号化処理を一定レートで行うように設計さ
れているものであるが、汎用のパーソナルコンピュータ
等であると、マルチタスク処理による他のタスクの影
響、その他の割り込み発生等で、ＣＰＵの処理能力が割
かれ、かかる状態では当初の設定に従って音声符号化処
理を行えなくなることがあり、これに対応することが困
難である。Furthermore, the audio encoding apparatus configured as described above is designed to input audio at a constant rate and to perform real-time encoding at a constant rate according to its initial settings. On a general-purpose personal computer or the like, however, the processing capacity of the CPU can be taken away by other tasks under multitasking, by interrupts, and so on; in such a state the audio encoding may no longer be able to proceed according to the initial settings, and it is difficult to cope with this.

【００５９】次に、Ｂ−２の第２の例に示すように心理
聴覚分析を行った帯域分割符号化処理では、上記のよう
に人間の聴覚特性に応じたビット割り当てがなされるこ
とにより、再生音質の良好な符号化データが得られる。Next, in band division encoding with psychoacoustic analysis, as shown in the second example of B-2, bits are allocated according to the characteristics of human hearing as described above, so coded data with good reproduction quality is obtained.

【００６０】しかし、多数の帯域への分割と、それら分
割された信号についての変換処理、および比較処理の処
理負担は大きく、かかる心理聴覚分析を行うことによ
り、一般的に約２倍程度に処理負担が増加することとな
る。従って、標準的なパーソナルコンピュータレベルで
は、心理聴覚分析を取り入れると、音声取り込みに伴っ
た実時間処理を行うことは困難であり、専用のプロセッ
サやボード等の高性能なハードウェアの追加に頼るか、
実時間処理をあきらめ、ファイルとして記録した後に時
間をかけて符号化処理を行うかをせざるを得なかった。However, the division into many bands, and the transform and comparison processing applied to the divided signals, impose a heavy processing load; performing such psychoacoustic analysis generally increases the processing load to about twice. Therefore, on a standard personal computer, it is difficult to perform real-time processing of captured audio once psychoacoustic analysis is incorporated, and there was no choice but either to rely on additional high-performance hardware such as a dedicated processor or board, or to give up real-time processing and take time over the encoding after recording the data to a file.

【００６１】Ｃ．従来の技術による映像音声符号化の問
題点上述のように、上記従来の映像音声符号化装置は、音
声、および映像の符号化に際しては、原音声情報（デジ
タル音声情報）、および原映像情報（デジタル映像情
報）が直接、それぞれ対応する符号化部に入力され、そ
れぞれに符号化処理されるものである。そのため、音声
符号化部、および映像符号化部については、入力される
原音声情報、および原映像情報を、例えばＭＰＥＧ規格
のような規格に従って、確実に処理できる能力が必要と
される。例えば、音声符号化部は、サンプリング周波数
が４８ＫＨｚで、１サンプル１ｂｙｔｅの音声情報を入
力するのであれば、１秒に４８Ｋｂｙｔｅの音声情報を
確実に符合化できる能力が必要となる。また、映像符号
化部は、横３２０ピクセル、縦２４０ピクセル、１ピク
セル２ｂｙｔｅ、３０ｆｐｓの映像情報が入力されるの
であれば、１秒に４．６Ｍｂｙｔｅの映像情報を確実に
符号化できる能力が必要となる。C. Problems of conventional video/audio encoding. As described above, in the conventional video/audio encoding apparatus, the original audio information (digital audio information) and the original video information (digital video information) are input directly to their respective encoding units and encoded separately. The audio encoding unit and the video encoding unit must therefore be capable of reliably processing the incoming original audio and video information according to a standard such as MPEG. For example, if the audio encoding unit receives audio information with a sampling frequency of 48 kHz and 1 byte per sample, it must be able to reliably encode 48 Kbytes of audio information per second. Likewise, if the video encoding unit receives video information of 320 pixels horizontally by 240 pixels vertically, 2 bytes per pixel, at 30 fps, it must be able to reliably encode 4.6 Mbytes of video information per second.
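
The two input rates quoted above follow directly from the stated formats, as the short calculation below shows. The function names are hypothetical; the figures are those given in the text.

```python
def audio_bytes_per_second(sample_rate_hz, bytes_per_sample):
    """Raw audio input rate in bytes per second."""
    return sample_rate_hz * bytes_per_sample


def video_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Raw video input rate in bytes per second."""
    return width * height * bytes_per_pixel * fps


audio_rate = audio_bytes_per_second(48_000, 1)        # 48 kHz, 1 byte/sample
video_rate = video_bytes_per_second(320, 240, 2, 30)  # 320 x 240, 2 bytes/pixel, 30 fps

print(audio_rate, video_rate)  # 48000 and 4608000 bytes per second
```

The video input is therefore roughly 96 times the audio input in raw volume, which gives a sense of how much more computing power video encoding can consume on a shared CPU.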

【００６２】このため、従来は、音声符号化部、および
映像符号化部については、それぞれ独立して動作し、符
号化処理を保証し得る専用ハードウェアを用いることに
よって映像音声の符号化を実現していた。これに対し
て、専用ハードウェアを用いず、汎用ＣＰＵを用いたマ
ルチタスクオペレーティングシステム上で動作するソフ
トウェアプログラムとして音声、および映像の符号化部
を実現することは、きわめて困難である。For this reason, video/audio encoding has conventionally been realized by using, for the audio encoding unit and the video encoding unit, dedicated hardware in which each operates independently and the encoding can be guaranteed. By contrast, it is extremely difficult to realize the audio and video encoding units as software programs running on a multitasking operating system on a general-purpose CPU, without dedicated hardware.

【００６３】なぜなら、マルチタスクオペレーティング
システム上では、各符号化部もそれぞれタスクとして動
作するものであり、同様に他のソフトウェア（通信処理
等を行う常駐プログラムなど) もタスクとして機能して
いる場合には、当該他のタスクがある期間だけＣＰＵの
動作時間を奪ってしまうことが起こる。その期間には符
号化処理は停止するため、常に、符号化ソフトウェア
が、音声または映像を十分に処理し得るという保証をす
ることができない。従って、音声や映像のとぎれ等の再
生トラブルを生じない、良好な符号化結果が常に得られ
るとは限らないこととなる。This is because, on a multitasking operating system, each encoding unit runs as a task, and when other software (such as a resident program that performs communication processing) is also running as a task, that other task may take CPU time away for a certain period. Encoding stops during that period, so there is no guarantee that the encoding software can always process the audio or video adequately. Consequently, a good encoding result, free of reproduction problems such as interrupted audio or video, is not always obtained.

【００６４】また、映像音声を処理対象とする場合に
は、他のタスク以外にも問題は存在する。すなわち、マ
ルチタスクオペレーティングシステム上で、映像符号化
と音声符号化とは別のタスクとして処理されることが常
であるため、互いに上記の他のタスクとして影響を与え
あうことになるためである。例えば、映像は一様なもの
であることは期待できず、入力される映像情報を構成す
る静止画は時々刻々変化するものとなる。一連の映像の
ある部分については、きわめて符号化（圧縮) しずらい
ものであるため、その部分の処理については符号化処理
に時間のかかってしまうという可能性もある。この場
合、オペレーティングシステム上では、他のソフトウェ
アがまったく動作していなかったとしても、映像符号化
部に大量のＣＰＵ時間をとられ、音声符号化部の処理が
遅滞することによって、音声の途切がある符号化結果し
か得られないという事態に陥ることがあり得る。When video and audio are both processed, the problem is not limited to other tasks. On a multitasking operating system, video encoding and audio encoding are normally handled as separate tasks, so each affects the other in the same way as the other tasks described above. For example, video cannot be expected to be uniform; the still images that make up the input video information change from moment to moment. Some parts of a video sequence are very difficult to encode (compress), and encoding those parts may take a long time. In that case, even if no other software at all is running on the operating system, the video encoding unit may take a large amount of CPU time and delay the processing of the audio encoding unit, so that only an encoding result with interrupted audio is obtained.

【００６５】もう１つの問題は、汎用計算機上のソフト
ウェア実行により、映像音声符号化装置を構成しようと
した場合、ＡやＢの場合と同様に、当該ソフトウェア
は、様々なハードウェア能力のコンピュータシステム上
で実行される可能性があることに起因する。そのため、
前出の問題点、すなわち平均的にみれば音声も映像も符
号化できる能力がある場合において、ある期間には符号
化に割り当てられる計算機能力が少なくなることに起因
する問題とは別に、当該ソフトウェアが実行されるハー
ドウェアにおいて、そもそも、十分な計算機能力を有し
ない場合に、当該ソフトウェア設計時の当初の設定値の
ままでは、映像と音声の符号化処理が行えないという事
態が発生する可能性がある。このような場合、動作する
コンピュータシステムにあわせ、すみやかに映像符号化
が消費する計算機能力を低減しなければ、良好な符号化
結果が得られず、再生時に音声途切れが生じてしまうと
いう事態を招く。Another problem arises because, when a video/audio encoding apparatus is built by running software on a general-purpose computer, the software may be executed on computer systems with widely varying hardware capability, as in cases A and B. This is separate from the problem described above, in which the computer has enough capacity on average to encode both audio and video but the computing power available to encoding drops during certain periods. If the hardware on which the software runs does not have sufficient computing power in the first place, video and audio may not be encodable at all with the settings fixed when the software was designed. In such a case, unless the computing power consumed by video encoding is promptly reduced to suit the computer system in use, a good encoding result cannot be obtained and the audio is interrupted on reproduction.
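
The interaction between the two encoding tasks can be pictured with a toy model of a shared CPU. Everything in this sketch is a hypothetical simplification introduced for illustration (fixed time slices, a single CPU budget per slice, video served first, a fixed audio cost); it is not the scheduling model or the control method of the patent.

```python
def audio_shortfalls(cpu_budget, audio_cost, video_costs, cap_video=False):
    """Count time slices in which audio encoding cannot finish.

    video_costs lists the CPU units video encoding wants in each slice.
    With cap_video=True, video is limited to what remains after the
    audio cost has been reserved.
    """
    shortfalls = 0
    for video_cost in video_costs:
        if cap_video:
            video_cost = min(video_cost, cpu_budget - audio_cost)
        remaining = cpu_budget - video_cost
        if remaining < audio_cost:
            shortfalls += 1  # audio would be interrupted in this slice
    return shortfalls


demand = [50, 70, 95, 60, 90]  # made-up video demand; hard scenes cost more
print(audio_shortfalls(100, 20, demand))                  # uncapped video
print(audio_shortfalls(100, 20, demand, cap_video=True))  # video held back
```

With the made-up demand above, the two hardest scenes starve the audio task when video is left unchecked, while limiting the computing power given to video removes the shortfalls, at the cost of doing less video work in those slices.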

【００６６】もちろん、音声符号化処理の影響により、
映像符号化の符号化結果に不具合が生じる可能性もある
が、一般に同時間に相当する映像と音声とを比較する
と、映像の方がデータ量が多いものであり、また映像デ
ータの欠落よりも音声データの欠落の影響が、再生時の
影響が大きいことから、音声符号化における問題点の方
が、比重が大きいものと言え、音声途切れの防止に対す
る要請がより大きいと一般に言い得るものである。Of course, audio encoding may also cause defects in the result of video encoding. In general, however, when video and audio covering the same period are compared, the video has the larger amount of data, and missing audio data affects reproduction more than missing video data does. The problems on the audio encoding side therefore carry more weight, and it can generally be said that the demand for preventing audio interruption is the greater one.

【００６７】以上、Ａ〜Ｃより、パーソナルコンピュー
タ等の汎用計算機上で符号化ソフトウェアを実行するこ
とにより、映像、音声、または映像音声を、とりこみに
伴ってのリアルタイム符号化処理をしようとする場合に
は、以下の問題点があるものと言える。From A to C above, the following problems can be identified when encoding software is executed on a general-purpose computer such as a personal computer in order to encode video, audio, or video and audio in real time as they are captured.

【００６８】（１）当該ソフトウェアを実行するハード
ウェアの性能の影響が大きい。ハードウェア性能が低い
ならば、良好な符号化結果が得られなくなり、ハードウ
ェア性能が高いならば、装置資源を活用できなくなる可
能性がある。（２）当該ソフトウェアが、マルチタスクオペレーティ
ングシステム上で実行される場合、他のタスクの影響が
大きい。他のタスクによる装置資源の占有の大小が、
（１）のおけるハードウェア性能の高低と実質的に同様
の影響を与えることとなる。（３）さらに、映像音声を処理対象とする場合には、映
像符号化と音声符号化とが互いに他のタスクとして影響
を与えあうということが起こり得る。(1) The performance of the hardware that executes the software has a large influence. If the hardware performance is low, a good encoding result cannot be obtained; if it is high, the device resources may not be fully used. (2) When the software is executed on a multitasking operating system, other tasks have a large influence. The extent to which other tasks occupy the device resources has substantially the same effect as the level of hardware performance in (1). (3) Furthermore, when video and audio are both processed, video encoding and audio encoding may affect each other as other tasks.

【００６９】本発明は、かかる事情に鑑みてなされたも
のであり、映像取り込みにともなってのリアルタイムで
の映像符号化処理を行う符号化方法において、当該符号
化方法を実施する計算機の基本的な能力に対応して、解
像度や符号化タイプを含む符号化パラメータを適切に設
定して、装置資源を活用して、良好な符号化結果を得る
ことの可能な映像符号化方法を提供することを目的とす
る。The present invention has been made in view of these circumstances. An object of the invention is to provide a video encoding method that encodes video in real time as it is captured and that, by appropriately setting encoding parameters, including the resolution and the encoding type, according to the basic capability of the computer that executes the method, makes full use of device resources and obtains a good encoding result.

【００７０】また、本発明は、映像取り込みにともなっ
てのリアルタイムでの映像符号化処理を行う符号化方法
において、当該符号化方法を実施する計算機のその時点
での能力に対応して、解像度や符号化タイプを含む符号
化パラメータを適切に設定して、装置資源を活用して、
良好な符号化結果を得ることの可能な映像符号化方法を
提供することを目的とする。Another object of the invention is to provide a video encoding method that encodes video in real time as it is captured and that, by appropriately setting encoding parameters, including the resolution and the encoding type, according to the capability available at that moment on the computer that executes the method, makes full use of device resources and obtains a good encoding result.

【００７１】また、本発明は、音声取り込みにともなっ
てのリアルタイムでの音声符号化処理を行う符号化方法
において、当該符号化方法を実施する計算機の基本的な
能力に対応して、符号化処理の制御を行い、装置資源を
活用して、良好な符号化結果を得ることの可能な音声符
号化方法を提供することを目的とする。Another object of the invention is to provide an audio encoding method that encodes audio in real time as it is captured and that, by controlling the encoding according to the basic capability of the computer that executes the method, makes full use of device resources and obtains a good encoding result.

【００７２】また、本発明は、音声取り込みにともなっ
てのリアルタイムでの音声符号化処理を行う符号化方法
において、当該符号化方法を実施する計算機のその時点
での能力に対応して、符号化処理の制御を行い、装置資
源を活用して、良好な符号化結果を得ることの可能な音
声符号化方法を提供することを目的とする。Another object of the invention is to provide an audio encoding method that encodes audio in real time as it is captured and that, by controlling the encoding according to the capability available at that moment on the computer that executes the method, makes full use of device resources and obtains a good encoding result.

【００７３】また、本発明は、音声取り込みにともなっ
てのリアルタイムでの音声符号化処理を行う符号化方法
において、当該符号化方法を実施する計算機の基本的な
能力に対応して、心理聴覚分析の代替処理を実行し、装
置資源を活用して、良好な符号化結果を得ることの可能
な音声符号化方法を提供することを目的とする。Another object of the invention is to provide an audio encoding method that encodes audio in real time as it is captured and that, by executing a substitute for psychoacoustic analysis according to the basic capability of the computer that executes the method, makes full use of device resources and obtains a good encoding result.

【００７４】また、本発明は、映像音声取り込みにとも
なってのリアルタイムでの映像符号化、および音声符号
化処理を行う符号化方法において、当該符号化方法を実
施する計算機の基本的な能力に対応して、映像符号化処
理の制御を行い、装置資源を活用して、音途切れのない
良好な符号化結果を得ることの可能な映像音声符号化方
法を提供することを目的とする。Another object of the invention is to provide a video/audio encoding method that encodes video and audio in real time as they are captured and that, by controlling the video encoding according to the basic capability of the computer that executes the method, makes full use of device resources and obtains a good encoding result free of audio interruption.

【００７５】また、本発明は、映像音声取り込みにとも
なってのリアルタイムでの映像符号化、および音声符号
化処理を行う符号化方法において、当該符号化方法を実
施する計算機のその時点での能力に対応して、映像符号
化処理の制御を行い、装置資源を活用して、音途切れの
ない良好な符号化結果を得ることの可能な音声符号化方
法を提供することを目的とする。Another object of the invention is to provide an audio encoding method that encodes video and audio in real time as they are captured and that, by controlling the video encoding according to the capability available at that moment on the computer that executes the method, makes full use of device resources and obtains a good encoding result free of audio interruption.

【００７６】また、本発明は、上記のような映像符号化
方法、音声符号化方法、および映像音声符号化方法を実
行する映像符号化装置、音声符号化装置、および映像音
声符号化装置を提供することを目的とする。Another object of the invention is to provide a video encoding apparatus, an audio encoding apparatus, and a video/audio encoding apparatus that execute the video encoding method, audio encoding method, and video/audio encoding method described above.

【００７７】また、本発明は、パーソナルコンピュータ
等の汎用計算機において実行することで、上記のような
映像符号化方法、音声符号化方法、および映像音声符号
化方法を実現できる映像符号化プログラム、音声符号化
プログラム、および映像音声符号化プログラムを記録し
た記録媒体を提供することを目的とする。Another object of the invention is to provide a recording medium on which are recorded a video encoding program, an audio encoding program, and a video/audio encoding program that, when executed on a general-purpose computer such as a personal computer, realize the video encoding method, audio encoding method, and video/audio encoding method described above.

【００７８】[0078]

【課題を解決するための手段】上記目的を達成するた
め、請求項１にかかる映像符号化方法は、映像を符号化
する映像符号化方法において、映像がデジタル化され
た、複数の静止画像情報からなる原映像情報に対して、
上記静止画像情報の１つまたは複数を、後述する符号化
パラメータに従って符号化する映像符号化ステップと、
原映像情報の有する解像度、符号化によって得られる符
号化データを再生する際に要求されるフレームレート、
上記映像符号化ステップを実行する符号化装置の処理能
力を示す処理性能、または上記映像符号化ステップにお
ける符号化処理の処理量に影響する１つ、もしくは複数
の符号化パラメータのうちいずれか１つ以上に基づい
て、１つ以上の上記符号化パラメータを決定する符号化
パラメータ決定ステップとを実行するものである。To achieve the above objects, the video encoding method according to claim 1 is a video encoding method for encoding video, which executes: a video encoding step of encoding, for original video information consisting of a plurality of pieces of still image information obtained by digitizing the video, one or more of the pieces of still image information according to the encoding parameters described below; and an encoding parameter determination step of determining one or more of the encoding parameters based on at least one of the resolution of the original video information, the frame rate required when reproducing the coded data obtained by encoding, the processing performance indicating the processing capability of the encoding apparatus that executes the video encoding step, and one or more encoding parameters that affect the amount of processing in the video encoding step.

【００７９】また、請求項２にかかる映像符号化方法
は、請求項１の方法において、当該映像符号化方法の、
上記映像符号化ステップを実行する符号化装置の処理能
力を判断して、判断結果を出力する処理能力判断ステッ
プをさらに実行するものである。The video encoding method according to claim 2 is the method of claim 1, further executing a processing capability determination step of determining the processing capability of the encoding apparatus that executes the video encoding step and outputting the determination result.

【００８０】また、請求項３にかかる映像符号化方法
は、請求項１または２の方法において、上記符号化パラ
メータは、上記原映像情報に対して行う符号化処理にお
ける解像度、フレーム内符号化、もしくは予測符号化を
示す符号化タイプ、または上記予測符号化に用いる動き
ベクトルを検出する際の検出範囲のうち１つ以上を含む
ものである。The video encoding method according to claim 3 is the method of claim 1 or 2, wherein the encoding parameters include one or more of: the resolution used in the encoding process performed on the original video information; an encoding type indicating intra-frame encoding or predictive encoding; and the search range used when detecting the motion vectors used for the predictive encoding.

【００８１】また、請求項４にかかる映像符号化方法
は、請求項２の方法において、上記処理能力判断ステッ
プでは、当該映像符号化方法の有する制御装置の種類に
基づいて上記判断を行うものである。The video encoding method according to claim 4 is the method of claim 2, wherein the processing capability determination step makes the determination on the basis of the type of control device with which the video encoding method is carried out.

【００８２】また、請求項５にかかる映像符号化方法
は、請求項２の方法において、上記処理能力判断ステッ
プでは、上記符号化ステップにおける符号化処理の所要
時間に基づいて上記判断を行うものである。The video encoding method according to claim 5 is the method of claim 2, wherein the processing capability determination step makes the determination on the basis of the time required for the encoding process in the encoding step.

【００８３】また、請求項６にかかる映像符号化方法
は、請求項２の方法において、上記処理能力判断ステッ
プでは、上記入力される原映像情報を一時蓄積し、該蓄
積にあたっては、上記原映像情報を構成する一連の静止
画像情報を順次保存していくとともに、上記符号化ステ
ップにおいて読み出されて、上記符号化処理が行われた
静止画像情報を順次廃棄する映像バッファリングステッ
プと、上記映像バッファリングステップにおける上記一
連の静止画像情報の保存を、上記与えられたフレームレ
ートに基づいて決定される一定のフレームレートにおい
て行うように制御するフレームレート制御ステップとを
実行し、上記映像バッファリングステップにおいて一時
蓄積された上記原映像情報の蓄積量に基づいて上記判断
を行うものである。The video encoding method according to claim 6 is the method of claim 2, wherein the processing capability determination step executes: a video buffering step of temporarily storing the input original video information by sequentially saving the series of still image information constituting the original video information, and sequentially discarding the still image information that has been read out and encoded in the encoding step; and a frame rate control step of controlling the saving of the series of still image information in the video buffering step so that it is performed at a constant frame rate determined on the basis of the given frame rate; and makes the determination on the basis of the amount of original video information temporarily stored in the video buffering step.

【００８４】また、請求項７にかかる音声符号化方法
は、音声に対して、帯域分割符号化方式により符号化を
行う音声符号化方法において、符号化処理に用いる数値
である、設定周波数ｆｓと、変換定数ｎとを記憶する記
憶ステップと、符号化の対象である音声を入力する音声
入力ステップと、上記記憶した設定周波数ｆｓに基づい
て決定されるサンプリング周波数を用いて、サンプリン
グ音声データを作成する入力音声サンプリングステップ
と、上記設定周波数ｆｓをサンプリング周波数として用
いた場合に得られるサンプリング音声データの個数をｍ
個とし、上記変換定数ｎに基づいて定められる数をｍ’
として、ｍ’個のサンプリング音声データを含む、ｍ個
の音声データからなる変換音声データを出力する音声デ
ータ変換ステップと、上記変換音声データを、帯域分割
してＭ個の帯域信号を得る帯域分割ステップと、上記記
憶した設定周波数ｆｓと変換定数ｎとから得られる周波
数ｆｓ／２ｎを制限周波数として、上記帯域信号のう
ち、制限周波数以下の帯域信号にのみ符号化ビットを割
り当てる符号化ビット割り当てステップと、上記割り当
てた符号化ビットに基づいて量子化を行う量子化ステッ
プと、上記量子化したデータを符号化データとして出力
する符号化ステップと、上記出力される符号化データを
記録する符号化データ記録ステップとを実行するもので
ある。The audio encoding method according to claim 7 is a method for encoding audio by a band division encoding scheme that executes: a storage step of storing a set frequency fs and a conversion constant n, which are numerical values used in the encoding process; an audio input step of inputting the audio to be encoded; an input audio sampling step of creating sampled audio data using a sampling frequency determined on the basis of the stored set frequency fs; an audio data conversion step of outputting converted audio data consisting of m pieces of audio data that include m′ pieces of sampled audio data, where m is the number of pieces of sampled audio data obtained when the set frequency fs is used as the sampling frequency and m′ is a number determined on the basis of the conversion constant n; a band division step of dividing the converted audio data into bands to obtain M band signals; an encoded bit allocation step of allocating encoded bits only to those band signals at or below a limit frequency, the limit frequency being the frequency fs/2n obtained from the stored set frequency fs and conversion constant n; a quantization step of performing quantization on the basis of the allocated encoded bits; an encoding step of outputting the quantized data as encoded data; and an encoded data recording step of recording the output encoded data.
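As a non-limiting illustration, the band limitation of claim 7 might be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the M bands have equal width and together cover 0 to fs/2, and that the limit frequency fs/2n is read as fs/(2n); neither assumption is stated in the claim itself.

```python
def bands_receiving_bits(num_bands: int, n: int) -> list:
    """Return one flag per band: True if the band may receive encoded bits.

    Assumption: band b spans [b, b + 1) * fs / (2 * num_bands), so its upper
    edge is at or below the limit frequency fs / (2 * n) exactly when
    (b + 1) * n <= num_bands. The set frequency fs cancels out.
    """
    if num_bands < 1 or n < 1:
        raise ValueError("num_bands and n must be positive")
    return [(b + 1) * n <= num_bands for b in range(num_bands)]
```

Under these assumptions, with M = 32 and n = 2 only the lower 16 bands are eligible for bit allocation, so the allocation, quantization, and encoding work for the upper bands is avoided.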

【００８５】また、請求項８にかかる音声符号化方法
は、請求項７の方法において、上記入力音声サンプリン
グステップでは、上記記憶した設定周波数ｆｓをサンプ
リング周波数として、上記入力された音声のサンプリン
グ処理により、ｍ個のサンプリング音声データを作成す
るものであり、上記音声データ変換ステップでは、上記
ｍ個のサンプリング音声データより、（ｎ−１）個おき
にサンプリング音声データを抽出し、２つの隣接する上
記抽出したサンプリング音声データの間に、（ｎ−１）
個の音声データを挿入して、ｍ個の変換音声データに変
換するものである。The audio encoding method according to claim 8 is the method of claim 7, wherein the input audio sampling step creates m pieces of sampled audio data by sampling the input audio with the stored set frequency fs as the sampling frequency, and the audio data conversion step extracts sampled audio data from the m pieces at intervals of (n−1) pieces, inserts (n−1) pieces of audio data between each two adjacent extracted pieces of sampled audio data, and thereby converts them into m pieces of converted audio data.

【００８６】また、請求項９にかかる音声符号化方法
は、請求項８の方法において、上記音声データ変換ステ
ップでは、上記抽出したサンプリング音声データがそれ
ぞれｎ個ずつ連続する変換音声データを作成するもので
ある。The audio encoding method according to claim 9 is the method of claim 8, wherein the audio data conversion step creates converted audio data in which each extracted piece of sampled audio data is repeated n times in succession.

【００８７】また、請求項１０にかかる音声符号化方法
は、請求項７の方法において、上記入力音声サンプリン
グステップでは、上記記憶した設定周波数ｆｓと変換定
数ｎとから得られる周波数ｆｓ／ｎをサンプリング周波
数として、上記入力された音声のサンプリング処理によ
り、ｍ／ｎ個のサンプリング音声データを作成するもの
であり、上記音声データ変換ステップでは、上記サンプ
リング音声データに基づき、２つの隣接するサンプリン
グ音声データの間に（ｎ−１）個の音声データを挿入し
て、ｍ個の変換音声データに変換するものである。The audio encoding method according to claim 10 is the method of claim 7, wherein the input audio sampling step creates m/n pieces of sampled audio data by sampling the input audio with the frequency fs/n, obtained from the stored set frequency fs and conversion constant n, as the sampling frequency, and the audio data conversion step inserts (n−1) pieces of audio data between each two adjacent pieces of sampled audio data, thereby converting them into m pieces of converted audio data.

【００８８】また、請求項１１にかかる音声符号化方法
は、請求項１０の方法において、上記音声データ変換ス
テップでは、上記ｍ／ｎ個のサンプリング音声データ
が、それぞれｎ個ずつ連続する変換音声データを作成す
るものである。The audio encoding method according to claim 11 is the method of claim 10, wherein the audio data conversion step creates converted audio data in which each of the m/n pieces of sampled audio data is repeated n times in succession.
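The two conversion variants of claims 8 to 11 might be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and trimming the output to the input length when m is not a multiple of n is an assumption that the claims do not address.

```python
def hold_each_sample(samples, n):
    """Claim 11 variant: repeat each of the m/n samples n times in succession."""
    out = []
    for s in samples:
        out.extend([s] * n)
    return out


def thin_and_hold(samples, n):
    """Claims 8 and 9 variant: keep every n-th of the m samples, then repeat
    each kept sample n times so that the output again has m entries."""
    kept = samples[::n]  # skip (n - 1) samples between extractions
    return hold_each_sample(kept, n)[:len(samples)]
```

For example, `thin_and_hold([1, 2, 3, 4, 5, 6], 2)` yields `[1, 1, 3, 3, 5, 5]`. In both variants the converted data carries no more information than a signal sampled at fs/n, which is consistent with allocating bits only at or below fs/2n in claim 7.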

【００８９】また、請求項１２にかかる音声符号化方法
は、請求項７ないし１１のいずれかの方法において、上
記サンプリング音声データを、入力バッファに一時的に
保持する音声バッファリングステップと、上記入力バッ
ファのデータ量を調べて、これを予め設定した値と比較
し、上記比較の結果に基づいて、上記レジスタに記憶さ
れた上記変換定数ｎの値を変更する入力バッファ監視ス
テップとを実行し、上記入力音声サンプリングステップ
では、上記サンプリング音声データを上記入力バッファ
に書き込むものであり、上記音声データ変換ステップで
は、上記入力バッファよりサンプリング音声データを読
み出して、これを上記変換するものである。The audio encoding method according to claim 12 is the method of any one of claims 7 to 11, further executing: an audio buffering step of temporarily holding the sampled audio data in an input buffer; and an input buffer monitoring step of checking the amount of data in the input buffer, comparing it with a preset value, and changing the value of the conversion constant n stored in the register on the basis of the comparison result; wherein the input audio sampling step writes the sampled audio data to the input buffer, and the audio data conversion step reads the sampled audio data from the input buffer and converts it.
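One possible form of the input buffer monitoring of claim 12 is sketched below in Python. The claim states only that n is changed according to a comparison with a preset value; the use of two thresholds, the step size of 1, the bounds on n, and all names are assumptions made for illustration.

```python
def adjust_conversion_constant(n, buffered, high_mark, low_mark,
                               n_min=1, n_max=8):
    """Return an updated conversion constant n (hypothetical policy).

    Assumption: a fuller input buffer means encoding is falling behind, so n
    is raised to reduce the encoding work; an emptier buffer lets n be
    lowered again to recover audio bandwidth.
    """
    if buffered > high_mark:
        return min(n + 1, n_max)
    if buffered < low_mark:
        return max(n - 1, n_min)
    return n
```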

【００９０】また、請求項１３にかかる音声符号化方法
は、請求項７ないし１１のいずれかの方法において、上
記符号化ステップにおいて出力される単位時間当たりの
符号化データ量を調べて、これを予め設定した値と比較
し、上記比較の結果に基づいて、上記レジスタに記憶さ
れた上記変換定数ｎの値を変更する符号化データ監視ス
テップを実行するものである。The audio encoding method according to claim 13 is the method of any one of claims 7 to 11, further executing an encoded data monitoring step of checking the amount of encoded data output per unit time in the encoding step, comparing it with a preset value, and changing the value of the conversion constant n stored in the register on the basis of the comparison result.

【００９１】また、請求項１４にかかる音声符号化方法
は、音声に対して、帯域分割符号化方式を用いて符号化
を行う音声符号化方法において、上記符号化に用いる制
御定数を記憶する制御定数記憶ステップと、入力音声を
サンプリング処理して、サンプリングデータを出力する
サンプリングステップと、上記サンプリングステップで
得られたサンプリングデータに対して帯域分割を行い、
帯域信号データを出力する帯域分割ステップと、上記帯
域分割ステップで得られた帯域信号データに対して、符
号化ビットの割り当てを行う符号化ビット割り当てステ
ップと、上記符号化ビットの割り当てに従って、上記帯
域信号データの量子化を行い、量子化値を出力する量子
化ステップと、上記量子化ステップで得られた量子化値
に基づき、符号化データを出力する符号化ステップと、
上記記憶した制御定数に基づいて、上記帯域分割ステッ
プ、上記符号化ビット割り当てステップ、上記量子化ス
テップ、および上記符号化ステップにおけるデータ処理
を制御する符号化処理制御ステップとを実行するもので
ある。The audio encoding method according to claim 14 is a method for encoding audio using a band division encoding scheme that executes: a control constant storage step of storing a control constant used for the encoding; a sampling step of sampling the input audio and outputting sampled data; a band division step of dividing the sampled data obtained in the sampling step into bands and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band division step; a quantization step of quantizing the band signal data in accordance with the encoded bit allocation and outputting quantized values; an encoding step of outputting encoded data on the basis of the quantized values obtained in the quantization step; and an encoding process control step of controlling, on the basis of the stored control constant, the data processing in the band division step, the encoded bit allocation step, the quantization step, and the encoding step.

【００９２】また、請求項１５にかかる音声符号化方法
は、請求項１４の方法において、上記制御定数記憶ステ
ップでは、上記制御定数として、単位期間判定定数ｋを
単位期間判定定数レジスタに記憶するものであり、上記
符号化処理制御ステップは、上記帯域分割ステップでの
１回の帯域分割処理で対象とするサンプリングデータ数
をｐとし、ｐ個のサンプリングデータに相当する時間を
単位期間として、上記出力されるサンプリングデータの
ｐ個ごとに、相当する単位期間が符号化対象期間である
か符号化対象外期間であるかの判定を、上記記憶した単
位期間判定定数に基づいて行い、上記単位期間が上記符
号化対象期間と判定されたときのみ、該単位期間のサン
プリングデータが上記帯域分割ステップに出力されるよ
う制御し、上記単位期間が上記符号化対象外期間と判定
されたときは、上記符号化ステップにおいて、予め記憶
した固定的符号化データを符号化データとして出力する
よう制御する判定制御ステップであるものである。The audio encoding method according to claim 15 is the method of claim 14, wherein the control constant storage step stores a unit period determination constant k in a unit period determination constant register as the control constant, and the encoding process control step is a determination control step that: takes p as the number of pieces of sampled data handled in one band division operation in the band division step and takes the time corresponding to p pieces of sampled data as a unit period; determines, for every p pieces of output sampled data and on the basis of the stored unit period determination constant, whether the corresponding unit period is an encoding target period or a non-target period; performs control so that the sampled data of a unit period is output to the band division step only when that unit period is determined to be an encoding target period; and, when the unit period is determined to be a non-target period, performs control so that the encoding step outputs fixed encoded data stored in advance as the encoded data.

【００９３】また、請求項１６にかかる音声符号化方法
は、請求項１５の方法において、上記判定制御ステップ
では、ｉ番目の単位期間をｔｉとして、上記記憶した単
位期間判定定数ｋと任意の整数ｎとからｉ＝ｎ×ｋ＋１
が成立するとき、上記単位期間ｔｉが上記符号化対象期
間であると判定するものである。The audio encoding method according to claim 16 is the method of claim 15, wherein the determination control step, denoting the i-th unit period by ti, determines that the unit period ti is an encoding target period when i = n×k+1 holds for the stored unit period determination constant k and some integer n.
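The determination of claims 15 and 16 might be sketched in Python as follows. The sketch assumes that unit periods are numbered from i = 1 and that n ranges over the non-negative integers; the function names and the `encode` and `fixed_data` parameters are hypothetical.

```python
def is_target_period(i, k):
    """True when i = n * k + 1 for some integer n >= 0 (i counted from 1)."""
    return (i - 1) % k == 0


def encode_unit_periods(periods, k, encode, fixed_data):
    """Encode only target periods; emit pre-stored fixed data for the rest."""
    return [encode(block) if is_target_period(i, k) else fixed_data
            for i, block in enumerate(periods, start=1)]
```

With k = 3, for instance, unit periods 1, 4, 7, and so on are encoded, and the remaining two out of every three periods are replaced by the fixed encoded data, so band division and the subsequent steps run for roughly one third of the input.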

【００９４】また、請求項１７にかかる音声符号化方法
は、請求項１４の方法において、上記制御定数記憶ステ
ップでは、上記制御定数として、演算処理判定定数ｑを
演算処理判定定数レジスタに記憶するものであり、上記
符号化処理制御ステップは、上記帯域分割ステップに内
包され、上記記憶した演算処理判定定数ｑに基づいて、
上記帯域分割ステップにおける演算処理を途中で打ち切
るように制御する演算処理中止ステップであるものであ
る。The audio encoding method according to claim 17 is the method of claim 14, wherein the control constant storage step stores an arithmetic processing determination constant q in an arithmetic processing determination constant register as the control constant, and the encoding process control step is an arithmetic processing stop step that is included in the band division step and performs control so that the arithmetic processing in the band division step is cut off partway on the basis of the stored arithmetic processing determination constant q.

【００９５】また、請求項１８にかかる音声符号化方法
は、請求項１７の方法において、上記演算処理中止ステ
ップでは、上記帯域分割ステップにおける基本低域通過
フィルタの演算処理を、該フィルタの両端ステップ分に
ついては途中で打ち切るように制御するものである。The audio encoding method according to claim 18 is the method of claim 17, wherein the arithmetic processing stop step performs control so that the arithmetic processing of the basic low-pass filter in the band division step is cut off partway for the steps at both ends of the filter.
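A minimal Python sketch of the truncation in claim 18 is given below, assuming that the basic low-pass filter is evaluated as a sum of products over its taps and that q counts the taps omitted at each end. This reading of q and the function name are assumptions; the claim does not fix how q maps onto the omitted steps.

```python
def truncated_filter_output(window, coeffs, q):
    """Sum of products over the filter taps, skipping q taps at each end.

    Assumption: the omitted end taps contribute comparatively little, so the
    result approximates the full sum with fewer multiply-accumulates.
    """
    taps = len(coeffs)
    if len(window) != taps:
        raise ValueError("window and coeffs must have the same length")
    if q < 0 or 2 * q > taps:
        raise ValueError("q is out of range for this filter length")
    return sum(window[j] * coeffs[j] for j in range(q, taps - q))
```

Setting q = 0 reproduces the full filter computation, so the constant acts as a dial between accuracy and processing load.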

【００９６】また、請求項１９にかかる音声符号化方法
は、請求項１４の方法において、上記制御定数記憶ステ
ップでは、上記制御定数として、帯域選択定数ｒを帯域
選択定数レジスタに記憶するものであり、上記符号化処
理制御ステップは、上記帯域分割ステップが出力する帯
域信号データのうち、上記記憶した帯域選択定数ｒに基
づいて選択したもののみに対して、上記符号化ビット割
り当てステップと上記量子化ステップとにおける処理を
実行するよう制御する帯域間引きステップであるもので
ある。The audio encoding method according to claim 19 is the method of claim 14, wherein the control constant storage step stores a band selection constant r in a band selection constant register as the control constant, and the encoding process control step is a band thinning step that performs control so that the processing in the encoded bit allocation step and the quantization step is executed only on the band signal data that is selected, on the basis of the stored band selection constant r, from the band signal data output by the band division step.

【００９７】また、請求項２０にかかる音声符号化方法
は、請求項１９の方法において、上記帯域間引きステッ
プでは、上記帯域分割ステップで得られたＭ個の帯域信
号データ出力から、上記記憶した帯域選択定数であるｒ
個おきに帯域信号データを選択するものである。The audio encoding method according to claim 20 is the method of claim 19, wherein the band thinning step selects band signal data at intervals of r, the stored band selection constant, from the M band signal data outputs obtained in the band division step.

【００９８】また、請求項２１にかかる音声符号化方法
は、請求項１４ないし２０のいずれかの方法において、
音声符号化におけるデータ処理の状況を取得し、該取得
した状況に応じて、上記記憶した上記制御定数の値を変
更する処理状況監視ステップを実行するものである。The audio encoding method according to claim 21 is the method of any one of claims 14 to 20, further executing a processing status monitoring step of acquiring the status of data processing in the audio encoding and changing the value of the stored control constant according to the acquired status.

【００９９】また、請求項２２にかかる音声符号化方法
は、請求項２１の方法において、上記処理状況監視ステ
ップでは、サンプリングデータを入力バッファに一時蓄
積する音声バッファリングステップと、上記入力バッフ
ァに保持されるデータの量を予め設定した値と比較し、
上記比較の結果に基づいて上記制御定数変更を行う入力
監視ステップとを実行するものである。The audio encoding method according to claim 22 is the method of claim 21, wherein the processing status monitoring step executes: an audio buffering step of temporarily storing the sampled data in an input buffer; and an input monitoring step of comparing the amount of data held in the input buffer with a preset value and changing the control constant on the basis of the comparison result.

【０１００】また、請求項２３にかかる音声符号化方法
は、請求項２１の方法において、上記処理状況監視ステ
ップは、上記符号化ステップにおいて単位時間当たりに
出力される上記符号化データの量を、予め設定した値と
比較し、上記比較の結果に基づいて上記制御定数の値を
変更する符号化監視ステップであるものである。The audio encoding method according to claim 23 is the method of claim 21, wherein the processing status monitoring step is an encoding monitoring step of comparing the amount of encoded data output per unit time in the encoding step with a preset value and changing the value of the control constant on the basis of the comparison result.

【０１０１】また、請求項２４にかかる音声符号化方法
は、音声がデジタル化された原音声情報に対して、帯域
分割符号化方式を用いて符号化を行う音声符号化方法に
おいて、入力音声をサンプリング処理して、サンプリン
グデータを出力するサンプリングステップと、上記サン
プリングステップで得られたサンプリングデータに対し
て帯域分割を行い、帯域信号データを出力する帯域分割
ステップと、上記帯域分割ステップで得られた帯域信号
データに対して、符号化ビットの割り当てを行う符号化
ビット割り当てステップと、上記符号化ビット割り当て
ステップにおける割り当てを心理聴覚分析代替制御方式
により制御するビット割り当て制御ステップと、上記符
号化ビットの割り当てに従って、上記帯域信号データの
量子化を行い、量子化値を出力する量子化ステップと、
上記量子化ステップで得られた量子化値に基づき、符号
化データを出力する符号化ステップとを実行するもので
ある。The audio encoding method according to claim 24 is a method for encoding original audio information, obtained by digitizing audio, using a band division encoding scheme, that executes: a sampling step of sampling the input audio and outputting sampled data; a band division step of dividing the sampled data obtained in the sampling step into bands and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band division step; a bit allocation control step of controlling the allocation in the encoded bit allocation step by a psychoacoustic-analysis substitute control scheme; a quantization step of quantizing the band signal data in accordance with the encoded bit allocation and outputting quantized values; and an encoding step of outputting encoded data on the basis of the quantized values obtained in the quantization step.

【０１０２】また、請求項２５にかかる音声符号化方法
は、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、心理聴覚分析代替制御方式により予
め定められたビット割り当て順に従って、符号化ビット
割り当てを行うよう制御する順次ビット割り当てステッ
プであるものである。The audio encoding method according to claim 25 is the method of claim 24, wherein the bit allocation control step is a sequential bit allocation step that performs control so that encoded bits are allocated to the band signal data obtained in the band division step in accordance with a bit allocation order predetermined by the psychoacoustic-analysis substitute control scheme.
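The sequential bit allocation of claim 25 might look like the following Python sketch. The claim specifies only that a predetermined order is followed; granting one bit per visit, cycling through the order until the budget is spent, the per-band cap, and all names are assumptions made for illustration.

```python
def sequential_bit_allocation(order, total_bits, num_bands, max_bits_per_band):
    """Hand out bits one at a time by walking a fixed band order repeatedly.

    `order` is a predetermined list of band indices (hypothetical format).
    No signal analysis is performed, which is the point of substituting a
    fixed order for psychoacoustic analysis.
    """
    alloc = [0] * num_bands
    remaining = total_bits
    progressed = True
    while remaining > 0 and progressed:
        progressed = False
        for band in order:
            if remaining == 0:
                break
            if alloc[band] < max_bits_per_band:
                alloc[band] += 1
                remaining -= 1
                progressed = True
    return alloc
```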

【０１０３】また、請求項２６にかかる音声符号化方法
は、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、心理聴覚分析代替制御方式により予
め定められた各帯域への重み付けと、各帯域信号データ
の有する出力レベルとに基づいた符号化ビット割り当て
を行うよう制御する帯域出力適応ビット割り当てステッ
プであるものである。The audio encoding method according to claim 26 is the method of claim 24, wherein the bit allocation control step is a band-output-adaptive bit allocation step that performs control so that encoded bits are allocated to the band signal data obtained in the band division step on the basis of a weighting for each band predetermined by the psychoacoustic-analysis substitute control scheme and the output level of each piece of band signal data.

【０１０４】また、請求項２７にかかる音声符号化方法
は、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、心理聴覚分析代替制御方式により予
め定められた各帯域への重み付けと、各帯域毎のビット
割り当て数に対する重み付けと、各帯域信号データの有
する出力レベルとに基づいた符号化ビット割り当てを行
うよう制御する改良型帯域出力適応ビット割り当てステ
ップであるものである。The audio encoding method according to claim 27 is the method of claim 24, wherein the bit allocation control step is an improved band-output-adaptive bit allocation step that performs control so that encoded bits are allocated to the band signal data obtained in the band division step on the basis of a weighting for each band predetermined by the psychoacoustic-analysis substitute control scheme, a weighting for the number of bits allocated to each band, and the output level of each piece of band signal data.

【０１０５】また、請求項２８にかかる音声符号化方法
は、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、帯域信号データごとに最小可聴限界
値との比較を行い、上記比較により最小可聴限界未満と
判定された帯域信号データにはビット割り当てを行わ
ず、他の帯域に対してのビット割り当てを増加するよう
制御する最小可聴限界比較ステップであるものである。The audio encoding method according to claim 28 is the method of claim 24, wherein the bit allocation control step is a minimum audible limit comparison step that compares each piece of band signal data obtained in the band division step with a minimum audible limit value, allocates no bits to band signal data determined by the comparison to be below the minimum audible limit, and performs control so that the bit allocation to the other bands is increased.
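The comparison of claim 28 might be sketched in Python as follows. Dividing the freed bits evenly among the remaining bands is an assumption chosen for brevity; the claim requires only that bands below the minimum audible limit receive no bits and that the allocation to the other bands increases. The names and the per-band threshold list are hypothetical.

```python
def allocate_audible_bands(levels, audible_limits, total_bits):
    """Give no bits to bands below their audible limit; share the budget
    evenly among the remaining bands (even sharing is an assumption)."""
    audible = [b for b, (lvl, lim) in enumerate(zip(levels, audible_limits))
               if lvl >= lim]
    alloc = [0] * len(levels)
    if not audible:
        return alloc
    share, extra = divmod(total_bits, len(audible))
    for rank, band in enumerate(audible):
        alloc[band] = share + (1 if rank < extra else 0)
    return alloc
```

In this sketch, when two of four bands fall below the limit, each remaining band receives twice the bits it would get from an even split across all four.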

【０１０６】また、請求項２９にかかる映像音声符号化
方法は、映像と音声とを符号化するにあたり、上記２つ
の符号化処理に含まれる処理過程の一部または全部を、
共通の計算機資源を用いて実行する映像音声符号化方法
において、単位時間毎の静止画像を表す複数の静止画像
情報からなる原映像情報と、音声を表す原音声情報とか
ら構成される映像音声情報が入力されたとき、上記原音
声情報を一時的に蓄積する音声バッファリングステップ
と、上記音声バッファリングステップにおいて蓄積され
た原音声情報を読み出し、この読み出した上記原音声情
報を符号化処理し、符号化音声情報を出力する音声符号
化ステップと、映像符号化の負荷程度を表す符号化負荷
基準情報を用いて、当該映像音声符号化処理についての
処理能力を判断し、その判断の結果に基づいて、後述す
る映像符号化ステップにおける原映像情報に対する符号
化を制御する符号化負荷評価ステップと、上記符号化負
荷評価ステップにおける制御に従って、入力された上記
原映像情報を構成する静止画像情報を符号化処理し、符
号化映像情報を出力する映像符号化ステップとを実行す
るものである。The video/audio encoding method according to claim 29 is a method that, in encoding video and audio, executes part or all of the processing included in the two encoding processes using common computer resources, and that, when video/audio information consisting of original video information made up of a plurality of pieces of still image information each representing a still image per unit time and original audio information representing audio is input, executes: an audio buffering step of temporarily storing the original audio information; an audio encoding step of reading out the original audio information stored in the audio buffering step, encoding it, and outputting encoded audio information; an encoding load evaluation step of determining the processing capability available for the video/audio encoding process using encoding load reference information that represents the degree of load of video encoding, and controlling, on the basis of the determination result, the encoding of the original video information in the video encoding step described below; and a video encoding step of encoding the still image information constituting the input original video information under the control of the encoding load evaluation step and outputting encoded video information.

【０１０７】また、請求項３０にかかる映像音声符号化
方法は、請求項２９の方法において、上記符号化負荷評
価ステップは、上記原映像情報を構成する静止画像情報
が入力されたとき、上記音声バッファリングステップに
おいて蓄積された原音声情報の総量と、上記符号化負荷
基準情報とに基づいて符号化負荷評価情報を求め、上記
符号化負荷評価情報を予め設定された負荷限度と比較し
て、上記符号化負荷評価情報が上記負荷限度に達してい
ない場合に静止画像情報を出力し、上記符号化負荷評価
情報が上記負荷限度に達した場合に、上記静止画像情報
を破棄するものである。The video/audio encoding method according to claim 30 is the method of claim 29, wherein the encoding load evaluation step, when still image information constituting the original video information is input, obtains encoding load evaluation information on the basis of the total amount of original audio information stored in the audio buffering step and the encoding load reference information, compares the encoding load evaluation information with a preset load limit, outputs the still image information if the encoding load evaluation information has not reached the load limit, and discards the still image information if the encoding load evaluation information has reached the load limit.
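The frame-level decision of claim 30 might be sketched in Python as below. The claim does not state how the encoding load evaluation information is computed from its two inputs; the product used here is an assumption, as are the function names.

```python
def encoding_load(buffered_audio, load_reference):
    """Hypothetical load evaluation: buffered audio amount scaled by the
    encoding load reference information (the product is an assumption)."""
    return buffered_audio * load_reference


def should_encode_frame(buffered_audio, load_reference, load_limit):
    """Pass the still image on to video encoding only while the evaluated
    load has not reached the limit; otherwise the frame is discarded."""
    return encoding_load(buffered_audio, load_reference) < load_limit
```

The design intent conveyed by the claim is that unencoded audio piling up in the buffer signals that audio encoding is being starved, so video frames are dropped first in order to keep the audio uninterrupted.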

【０１０８】また、請求項３１にかかる映像音声符号化
方法は、請求項２９の方法において、アナログ映像情報
を入力し、後述する映像解像度情報が出力されたとき、
上記アナログ映像情報を複数の離散的デジタル画素情報
からなり、上記映像解像度情報に従う解像度を持つ複数
の静止画像情報で構成される原映像情報に変換し、上記
映像符号化ステップにおいて処理されるよう出力する映
像キャプチャステップを実行するものであり、上記符号
化負荷評価ステップでは、上記音声バッファリングステ
ップにおいて蓄積された原音声情報の総量と、映像符号
化の負荷程度を表す符号化負荷基準情報とに基づいて符
号化負荷評価情報を求め、上記符号化負荷評価情報に基
づいて、映像符号化に用いる映像の解像度を表す映像解
像度情報を求め、上記映像解像度情報を出力するもので
あり、上記映像符号化ステップでは、上記映像解像度情
報が出力されたとき、上記映像解像度情報に従って上記
静止画像情報に対して符号化処理を行い、符号化映像情
報を出力するものである。The video/audio encoding method according to claim 31 is the method of claim 29, further executing a video capture step of receiving analog video information and, when the video resolution information described below is output, converting the analog video information into original video information that consists of a plurality of pieces of still image information, each made up of a plurality of pieces of discrete digital pixel information and having a resolution in accordance with the video resolution information, and outputting the original video information to be processed in the video encoding step; wherein the encoding load evaluation step obtains encoding load evaluation information on the basis of the total amount of original audio information stored in the audio buffering step and the encoding load reference information representing the degree of load of video encoding, obtains from the encoding load evaluation information video resolution information representing the resolution of the video used for video encoding, and outputs the video resolution information; and the video encoding step, when the video resolution information is output, encodes the still image information in accordance with the video resolution information and outputs encoded video information.

【０１０９】また、請求項３２にかかる映像音声符号化
方法は、請求項２９の方法において、上記符号化負荷評
価ステップでは、符号化負荷評価情報を上記映像符号化
ステップにおいて処理されるよう出力するものであり、
上記映像符号化ステップでは、上記静止画像情報に対し
て、上記出力された符号化負荷評価情報を用いて計算さ
れる処理量だけ符号化処理を行い、符号化映像情報とし
て出力するものである。The video/audio encoding method according to claim 32 is the method of claim 29, wherein the encoding load evaluation step outputs the encoding load evaluation information to be processed in the video encoding step, and the video encoding step encodes the still image information only to the extent of a processing amount calculated using the output encoding load evaluation information, and outputs the result as encoded video information.

【０１１０】また、請求項３３にかかる映像音声符号化
方法は、請求項２９ないし３１のいずれかの方法におい
て、上記音声符号化ステップでは、上記音声バッファリ
ングステップにおいて蓄積された原音声情報を読み出
し、この読み出した上記原音声情報の総量を計算して処
理済み音声情報量として出力し、その後、上記原音声情
報を符号化処理して符号化音声情報として出力するもの
であり、上記符号化負荷評価ステップでは、経過時間
と、上記原音声情報の時間当たりの入力量に基づいて原
音声入力量を求め、この原音声入力量と上記処理済み音
声情報量との差である予測音声バッファ量を求め、上記
予測音声バッファ量を用いて、上記符号化負荷評価情報
を求めるものである。The video/audio encoding method according to claim 33 is the method of any one of claims 29 to 31, wherein the audio encoding step reads out the original audio information stored in the audio buffering step, calculates the total amount of the original audio information read out and outputs it as a processed audio information amount, and then encodes the original audio information and outputs it as encoded audio information; and the encoding load evaluation step obtains an original audio input amount from the elapsed time and the input amount of original audio information per unit time, obtains a predicted audio buffer amount as the difference between this original audio input amount and the processed audio information amount, and obtains the encoding load evaluation information using the predicted audio buffer amount.
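The predicted audio buffer amount of claim 33 follows directly from the quantities named in the claim, and might be computed as in this Python sketch. The function name, parameter names, and the choice of bytes and seconds as units are assumptions.

```python
def predicted_audio_buffer(elapsed_seconds, input_bytes_per_second,
                           processed_bytes):
    """Estimate the buffered audio without inspecting the buffer itself:
    audio that should have arrived so far, minus audio already read out."""
    original_audio_input = elapsed_seconds * input_bytes_per_second
    return original_audio_input - processed_bytes
```

Because the estimate depends only on the elapsed time and a running total kept by the audio encoder, it can be evaluated each time a still image arrives without querying the buffer.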

【０１１１】また、請求項３４にかかる映像音声符号化
方法は、請求項２９ないし３１のいずれかの方法におい
て、上記符号化負荷評価ステップでは、上記静止画像情
報が入力されたとき、経過時間と、上記原音声情報の時
間当たりの入力量とに基づいて原音声入力量を求め、か
つ、上記音声符号化ステップにおいて出力された符号化
音声情報の総量に基づいて処理済み音声情報量を求め、
さらに、上記求めた原音声入力量と上記求めた処理済み
音声情報量との差である予測音声バッファ量を求めた
後、上記予測音声バッファ量を用いて、上記符号化負荷
評価情報を求めるものである。The video/audio encoding method according to claim 34 is the method of any one of claims 29 to 31, wherein the encoding load evaluation step, when the still image information is input, obtains an original audio input amount from the elapsed time and the input amount of original audio information per unit time, obtains a processed audio information amount from the total amount of encoded audio information output in the audio encoding step, obtains a predicted audio buffer amount as the difference between the obtained original audio input amount and the obtained processed audio information amount, and then obtains the encoding load evaluation information using the predicted audio buffer amount.

【０１１２】また、請求項３５にかかる映像音声符号化
方法は、請求項２９ないし３１のいずれかの方法におい
て、上記符号化負荷評価ステップにおける、上記判断の
結果の変動を監視し、上記変動に対応して、上記符号化
負荷基準情報を設定するものである。The video/audio encoding method according to claim 35 is the method of any one of claims 29 to 31, wherein variation in the result of the determination in the encoding load evaluation step is monitored, and the encoding load reference information is set in response to that variation.

【０１１３】また、請求項３６にかかる映像符号化装置
は、映像を符号化する映像符号化装置において、映像が
デジタル化された、複数の静止画像情報からなる原映像
情報に対して、上記静止画像情報の１つまたは複数を、
後述する符号化パラメータに従って符号化する映像符号
化手段と、１つ以上の解像度を一の符号化パラメータと
し、フレーム内符号化、順方向予測符号化、逆方向予測
符号化、及び双方向予測符号化の各タイプを含む符号化
タイプのうち１つ以上の符号化タイプを他の符号化パラ
メータとして、上記符号化手段の処理量を決定するもの
である符号化パラメータを、与えられたフレームレート
に基づいて決定する符号化パラメータ決定手段とを備え
たものである。The video encoding apparatus according to claim 36 is an apparatus for encoding video, comprising: video encoding means for encoding, in accordance with encoding parameters described below, one or more pieces of still image information in original video information that consists of a plurality of pieces of still image information obtained by digitizing video; and encoding parameter determination means for determining, on the basis of a given frame rate, the encoding parameters that determine the processing amount of the encoding means, with one or more resolutions as one encoding parameter and with one or more encoding types, from among encoding types including intra-frame encoding, forward predictive encoding, backward predictive encoding, and bidirectional predictive encoding, as another encoding parameter.

【０１１４】また、請求項３７にかかる音声符号化装置
は、音声に対して、帯域分割符号化方式により符号化を
行う音声符号化装置において、符号化処理に用いる数値
である、設定周波数ｆｓと、変換定数ｎとを記憶するレ
ジスタと、符号化の対象である音声を入力する音声入力
手段と、上記記憶した設定周波数ｆｓに基づいて決定さ
れるサンプリング周波数を用いて、サンプリング音声デ
ータを作成する入力音声サンプリング手段と、上記設定
周波数ｆｓをサンプリング周波数として用いた場合に得
られるサンプリング音声データの個数をｍ個とし、上記
変換定数ｎに基づいて定められる数をｍ’として、ｍ’
個のサンプリング音声データを含む、ｍ個の音声データ
からなる変換音声データを出力する音声データ変換手段
と、上記変換音声データを、帯域分割してＭ個の帯域信
号を得る帯域分割手段と、上記記憶した設定周波数ｆｓ
と変換定数ｎとから得られる周波数ｆｓ／２ｎを制限周
波数として、上記帯域信号のうち、制限周波数以下の帯
域信号にのみ符号化ビットを割り当てる符号化ビット割
り当て手段と、上記割り当てた符号化ビットに基づいて
量子化を行う量子化手段と、上記量子化したデータを符
号化データとして出力する符号化手段と、上記出力され
る符号化データを記録する符号化データ記録手段とを備
えたものである。The audio encoding apparatus according to claim 37 is an apparatus for encoding audio by a band division encoding scheme, comprising: a register that stores a set frequency fs and a conversion constant n, which are numerical values used in the encoding process; audio input means for inputting the audio to be encoded; input audio sampling means for creating sampled audio data using a sampling frequency determined on the basis of the stored set frequency fs; audio data conversion means for outputting converted audio data consisting of m pieces of audio data that include m′ pieces of sampled audio data, where m is the number of pieces of sampled audio data obtained when the set frequency fs is used as the sampling frequency and m′ is a number determined on the basis of the conversion constant n; band division means for dividing the converted audio data into bands to obtain M band signals; encoded bit allocation means for allocating encoded bits only to those band signals at or below a limit frequency, the limit frequency being the frequency fs/2n obtained from the stored set frequency fs and conversion constant n; quantization means for performing quantization on the basis of the allocated encoded bits; encoding means for outputting the quantized data as encoded data; and encoded data recording means for recording the output encoded data.

【０１１５】また、請求項３８にかかる音声符号化装置
は、音声に対して、帯域分割符号化方式を用いて符号化
を行う音声符号化装置において、上記符号化に用いる制
御定数を記憶する制御定数記憶手段と、入力音声をサン
プリング処理して、サンプリングデータを出力するサン
プリング手段と、上記サンプリング手段で得られたサン
プリングデータに対して帯域分割を行い、帯域信号デー
タを出力する帯域分割手段と、上記帯域分割手段で得ら
れた帯域信号データに対して、符号化ビットの割り当て
を行う符号化ビット割り当て手段と、上記符号化ビット
の割り当てに従って、上記帯域信号データの量子化を行
い、量子化値を出力する量子化手段と、上記量子化手段
で得られた量子化値に基づき、符号化データを出力する
符号化手段と、上記記憶した制御定数に基づいて、上記
帯域分割手段、上記符号化ビット割り当て手段、上記量
子化手段、および上記符号化手段におけるデータ処理を
制御する符号化処理制御手段とを備えたものである。The audio encoding apparatus according to claim 38 encodes audio using a band division encoding method and comprises: control constant storage means for storing a control constant used in the encoding; sampling means for sampling the input audio and outputting sampling data; band dividing means for dividing the sampling data obtained by the sampling means into bands and outputting band signal data; encoded bit allocating means for allocating encoded bits to the band signal data obtained by the band dividing means; quantizing means for quantizing the band signal data according to the encoded bit allocation and outputting quantized values; encoding means for outputting encoded data based on the quantized values obtained by the quantizing means; and encoding process control means for controlling, based on the stored control constant, the data processing in the band dividing means, the encoded bit allocating means, the quantizing means, and the encoding means.

【０１１６】また、請求項３９にかかる音声符号化装置
は、音声に対して、帯域分割符号化方式を用いて符号化
を行う音声符号化装置において、入力音声をサンプリン
グ処理して、サンプリングデータを出力するサンプリン
グ手段と、上記サンプリング手段で得られたサンプリン
グデータに対して帯域分割を行い、帯域信号データを出
力する帯域分割手段と、上記帯域分割手段で得られた帯
域信号データに対して、符号化ビットの割り当てを行う
符号化ビット割り当て手段と、上記符号化ビット割り当
て手段における割り当てを心理聴覚分析代替制御方式に
より制御するビット割り当て制御手段と、上記符号化ビ
ットの割り当てに従って、上記帯域信号データの量子化
を行い、量子化値を出力する量子化手段と、上記量子化
手段で得られた量子化値に基づき、符号化データを出力
する符号化手段とを備えたものである。The audio encoding apparatus according to claim 39 encodes audio using a band division encoding method and comprises: sampling means for sampling the input audio and outputting sampling data; band dividing means for dividing the sampling data obtained by the sampling means into bands and outputting band signal data; encoded bit allocating means for allocating encoded bits to the band signal data obtained by the band dividing means; bit allocation control means for controlling the allocation in the encoded bit allocating means by a psychoacoustic analysis alternative control method; quantizing means for quantizing the band signal data according to the encoded bit allocation and outputting quantized values; and encoding means for outputting encoded data based on the quantized values obtained by the quantizing means.

【０１１７】また、請求項４０にかかる映像音声符号化
装置は、映像と音声とを符号化するにあたり、上記２つ
の符号化処理に含まれる処理過程の一部または全部を、
共通の計算機資源を用いて実行する映像音声符号化装置
において、単位時間毎の静止画像を表す複数の静止画像
情報からなる原映像情報と、音声を表す原音声情報とか
ら構成される映像音声情報が入力されたとき、上記原音
声情報を一時的に蓄積する音声バッファリング手段と、
上記音声バッファリング手段において蓄積された原音声
情報を読み出し、この読み出した上記原音声情報を符号
化処理し、符号化音声情報を出力する音声符号化手段
と、映像符号化の負荷程度を表す符号化負荷基準情報を
用いて、当該映像音声符号化装置の処理能力を判断し、
その判断の結果に基づいて、後述する映像符号化手段に
対しての上記原映像情報の出力を制御する符号化負荷評
価手段と、上記符号化負荷評価手段の制御に従って、上
記原映像情報を構成する静止画像情報が入力されたと
き、上記静止画像情報を符号化処理し、符号化映像情報
を出力する映像符号化手段とを備えたものである。The video/audio encoding apparatus according to claim 40 encodes video and audio while executing part or all of the processing steps of the two encoding processes on common computer resources, and comprises: audio buffering means for temporarily storing original audio information when video/audio information is input, the video/audio information consisting of original video information, made up of a plurality of pieces of still image information each representing a still image per unit time, and the original audio information representing audio; audio encoding means for reading the original audio information stored in the audio buffering means, encoding it, and outputting encoded audio information; encoding load evaluation means for judging the processing capability of the video/audio encoding apparatus using encoding load reference information that represents the degree of the video encoding load and, based on the result of that judgment, controlling the output of the original video information to the video encoding means described later; and video encoding means for encoding the still image information constituting the original video information when it is input under the control of the encoding load evaluation means, and outputting encoded video information.

【０１１８】また、請求項４１にかかる映像符号化プロ
グラム記録媒体は、映像を符号化処理する映像符号化プ
ログラムを記録した記録媒体において、映像がデジタル
化された、複数の静止画像情報からなる原映像情報に対
して、上記静止画像情報の１つまたは複数を、後述する
符号化パラメータに従って符号化する映像符号化ステッ
プと、１つ以上の解像度を一の符号化パラメータとし、
フレーム内符号化、順方向予測符号化、逆方向予測符号
化、及び双方向予測符号化の各タイプを含む符号化タイ
プのうち１つ以上の符号化タイプを他の符号化パラメー
タとして、上記符号化ステップの処理量を決定するもの
である符号化パラメータを、与えられたフレームレート
に基づいて決定する符号化パラメータ決定ステップとを
実行する符号化プログラムを記録したものである。The video encoding program recording medium according to claim 41 is a recording medium on which a video encoding program for encoding video is recorded, the program executing: a video encoding step of encoding, for original video information consisting of a plurality of pieces of still image information obtained by digitizing video, one or more pieces of the still image information according to encoding parameters described later; and an encoding parameter determination step of determining, based on a given frame rate, the encoding parameters that determine the processing amount of the encoding step, with one or more resolutions as one encoding parameter and one or more encoding types, among encoding types including intra-frame encoding, forward predictive encoding, backward predictive encoding, and bidirectional predictive encoding, as another encoding parameter.

【０１１９】また、請求項４２にかかる音声符号化プロ
グラム記録媒体は、音声に対して、帯域分割符号化方式
により符号化を行う音声符号化プログラムを記録した記
録媒体において、符号化処理に用いる数値である、設定
周波数ｆｓと、変換定数ｎとを記憶する記憶ステップ
と、符号化の対象である音声を入力する音声入力ステッ
プと、上記記憶した設定周波数ｆｓに基づいて決定され
るサンプリング周波数を用いて、サンプリング音声デー
タを作成する入力音声サンプリングステップと、上記設
定周波数ｆｓをサンプリング周波数として用いた場合に
得られるサンプリング音声データの個数をｍ個とし、ｍ
≧ｍ’である、上記変換定数ｎに基づいて定められる数
をｍ’として、ｍ’個のサンプリング音声データを含
む、ｍ個の音声データからなる変換音声データを出力す
る音声データ変換ステップと、上記変換音声データを、
帯域分割してＭ個の帯域信号を得る帯域分割ステップ
と、上記記憶した設定周波数ｆｓと変換定数ｎとから得
られる周波数ｆｓ／２ｎを制限周波数として、上記帯域
信号のうち、制限周波数以下の帯域信号にのみ符号化ビ
ットを割り当てる符号化ビット割り当てステップと、上
記割り当てた符号化ビットに基づいて量子化を行う量子
化ステップと、上記量子化したデータを符号化データと
して出力する符号化ステップと、上記出力される符号化
データを記録する符号化データ記録ステップとを実行す
る符号化プログラムを記録したものである。The audio encoding program recording medium according to claim 42 is a recording medium on which an audio encoding program for encoding audio by a band division encoding method is recorded, the program executing: a storage step of storing a set frequency fs and a conversion constant n, which are numerical values used in the encoding process; an audio input step of inputting the audio to be encoded; an input audio sampling step of creating sampled audio data using a sampling frequency determined based on the stored set frequency fs; an audio data conversion step of outputting converted audio data consisting of m pieces of audio data that include m' pieces of sampled audio data, where m is the number of pieces of sampled audio data obtained when the set frequency fs is used as the sampling frequency and m' is a number determined based on the conversion constant n such that m ≥ m'; a band division step of dividing the converted audio data into bands to obtain M band signals; an encoded bit allocation step of allocating encoded bits only to those band signals at or below a limit frequency, the limit frequency being the frequency fs/2n obtained from the stored set frequency fs and conversion constant n; a quantization step of performing quantization based on the allocated encoded bits; an encoding step of outputting the quantized data as encoded data; and an encoded data recording step of recording the output encoded data.

【０１２０】また、請求項４３にかかる音声符号化プロ
グラム記録媒体は、音声に対して、帯域分割符号化方式
を用いて符号化を行う音声符号化プログラムを記録した
記録媒体において、上記符号化に用いる制御定数を記憶
する制御定数記憶ステップと、入力音声をサンプリング
処理して、サンプリングデータを出力するサンプリング
ステップと、上記サンプリングステップで得られたサン
プリングデータに対して帯域分割を行い、帯域信号デー
タを出力する帯域分割ステップと、上記帯域分割ステッ
プで得られた帯域信号データに対して、符号化ビットの
割り当てを行う符号化ビット割り当てステップと、上記
符号化ビットの割り当てに従って、上記帯域信号データ
の量子化を行い、量子化値を出力する量子化ステップ
と、上記量子化ステップで得られた量子化値に基づき、
符号化データを出力する符号化ステップと、上記記憶し
た制御定数に基づいて、上記帯域分割ステップ、上記符
号化ビット割り当てステップ、上記量子化ステップ、お
よび上記符号化ステップにおけるデータ処理を制御する
符号化処理制御ステップとを実行する符号化プログラム
を記録したものである。The audio encoding program recording medium according to claim 43 is a recording medium on which an audio encoding program for encoding audio using a band division encoding method is recorded, the program executing: a control constant storage step of storing a control constant used in the encoding; a sampling step of sampling the input audio and outputting sampling data; a band division step of dividing the sampling data obtained in the sampling step into bands and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band division step; a quantization step of quantizing the band signal data according to the encoded bit allocation and outputting quantized values; an encoding step of outputting encoded data based on the quantized values obtained in the quantization step; and an encoding process control step of controlling, based on the stored control constant, the data processing in the band division step, the encoded bit allocation step, the quantization step, and the encoding step.

【０１２１】また、請求項４４にかかる音声符号化プロ
グラム記録媒体は、音声に対して、帯域分割符号化方式
を用いて符号化を行う音声符号化プログラムを記録した
記録媒体において、入力音声をサンプリング処理して、
サンプリングデータを出力するサンプリングステップ
と、上記サンプリングステップで得られたサンプリング
データに対して帯域分割を行い、帯域信号データを出力
する帯域分割ステップと、上記帯域分割ステップで得ら
れた帯域信号データに対して、符号化ビットの割り当て
を行う符号化ビット割り当てステップと、上記符号化ビ
ット割り当てステップにおける割り当てを心理聴覚分析
代替制御方式により制御するビット割り当て制御ステッ
プと、上記符号化ビットの割り当てに従って、上記帯域
信号データの量子化を行い、量子化値を出力する量子化
ステップと、上記量子化ステップで得られた量子化値に
基づき、符号化データを出力する符号化ステップとを実
行する符号化プログラムを記録したものである。The audio encoding program recording medium according to claim 44 is a recording medium on which an audio encoding program for encoding audio using a band division encoding method is recorded, the program executing: a sampling step of sampling the input audio and outputting sampling data; a band division step of dividing the sampling data obtained in the sampling step into bands and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band division step; a bit allocation control step of controlling the allocation in the encoded bit allocation step by a psychoacoustic analysis alternative control method; a quantization step of quantizing the band signal data according to the encoded bit allocation and outputting quantized values; and an encoding step of outputting encoded data based on the quantized values obtained in the quantization step.

【０１２２】また、請求項４５にかかる映像音声符号化
プログラム記録媒体は、映像と音声とを符号化するにあ
たり、上記２つの符号化処理に含まれる処理過程の一部
または全部を、共通の計算機資源を用いて実行する映像
音声符号化プログラムを記録した記録媒体において、単
位時間毎の静止画像を表す複数の静止画像情報からなる
原映像情報と、音声を表す原音声情報とから構成される
映像音声情報が入力されたとき、上記原音声情報を一時
的に蓄積する音声バッファリングステップと、上記音声
バッファリングステップにおいて蓄積された原音声情報
を読み出し、この読み出した上記原音声情報を符号化処
理し、符号化音声情報を出力する音声符号化ステップ
と、映像符号化の負荷程度を表す符号化負荷基準情報を
用いて、当該映像音声符号化処理についての処理能力を
判断し、その判断の結果に基づいて、後述する映像符号
化ステップにおける原映像情報に対する符号化を制御す
る符号化負荷評価ステップと、上記符号化負荷評価ステ
ップにおける制御に従って、入力された上記原映像情報
を構成する静止画像情報を符号化処理し、符号化映像情
報を出力する映像符号化ステップとを実行する符号化プ
ログラムを記録したものである。The video/audio encoding program recording medium according to claim 45 is a recording medium on which a video/audio encoding program is recorded, the program encoding video and audio while executing part or all of the processing steps of the two encoding processes on common computer resources, and executing: an audio buffering step of temporarily storing original audio information when video/audio information is input, the video/audio information consisting of original video information, made up of a plurality of pieces of still image information each representing a still image per unit time, and the original audio information representing audio; an audio encoding step of reading the original audio information stored in the audio buffering step, encoding it, and outputting encoded audio information; an encoding load evaluation step of judging the processing capability available for the video/audio encoding process using encoding load reference information that represents the degree of the video encoding load and, based on the result of that judgment, controlling the encoding of the original video information in the video encoding step described later; and a video encoding step of encoding the input still image information constituting the original video information under the control of the encoding load evaluation step, and outputting encoded video information.

【０１２３】[0123]

BEST MODE FOR CARRYING OUT THE INVENTION

実施の形態１．本発明の実施の形態１による映像符号化
方法は、複数の符号化パラメータのうちあるパラメータ
を定め、設定されたフレームレートと、上記定めたパラ
メータとに基づいて他のパラメータを決定するものであ
る。Embodiment 1. The video encoding method according to Embodiment 1 of the present invention fixes one of a plurality of encoding parameters and determines another parameter based on the set frame rate and the fixed parameter.

【０１２４】図１は、本発明の実施の形態１による映像
符号化装置の構成を示すブロック図である。図示するよ
うに、本実施の形態１による映像符号化装置は、符号化
手段１０１と、符号化パラメータ決定手段１０２とから
構成されており、符号化手段１０１は、ＤＣＴ処理手段
１０３、量子化手段１０４、可変長符号化手段１０５、
ビットストリーム生成手段１０６、逆量子化手段１０
７、逆ＤＣＴ処理手段１０８、および予測画像生成手段
１０９を、また、符号化パラメータ決定手段１０２は解
像度参照テーブル１１０を内包している。FIG. 1 is a block diagram showing the configuration of the video encoding apparatus according to Embodiment 1 of the present invention. As shown in the figure, the video encoding apparatus of Embodiment 1 consists of encoding means 101 and encoding parameter determination means 102. The encoding means 101 contains DCT processing means 103, quantization means 104, variable-length encoding means 105, bit stream generation means 106, inverse quantization means 107, inverse DCT processing means 108, and predicted image generation means 109; the encoding parameter determination means 102 contains a resolution reference table 110.

【０１２５】符号化手段１０１は、映像がデジタル化さ
れた、一連の静止画像からなる映像データを入力画像デ
ータとして入力し、設定された符号化パラメータに従っ
て符号化処理し、符号化データを出力する。入力画像デ
ータを構成する個々の静止画像データをフレーム画像と
呼ぶ。また、符号化パラメータは、後述する符号化パラ
メータ決定手段１０２から与えられるものであり、符号
化タイプを示すパラメータと、解像度を示すパラメータ
とが含まれている。符号化タイプを示すパラメータは、
フレーム内符号化処理、または順方向予測符号化処理を
示すものであり、符号化手段１０１は、当該パラメータ
に従って、フレーム内符号化、または順方向予測符号化
を行う。解像度を示すパラメータは後述するＤＣＴ処理
手段に入力され、当該解像度において符号化処理が行わ
れることとなる。The encoding means 101 receives, as input image data, video data consisting of a series of still images obtained by digitizing video, encodes it according to the set encoding parameters, and outputs encoded data. Each still image making up the input image data is called a frame image. The encoding parameters are supplied by the encoding parameter determination means 102 described later and include a parameter indicating the encoding type and a parameter indicating the resolution. The encoding type parameter indicates either intra-frame encoding or forward predictive encoding, and the encoding means 101 performs intra-frame encoding or forward predictive encoding accordingly. The resolution parameter is input to the DCT processing means described later, and encoding is performed at that resolution.

【０１２６】符号化手段１０１の内部においては、入力
画像データに対してまずＤＣＴ手段１０３がＤＣＴ（離
散コサイン変換）処理を行ってＤＣＴ変換データを出力
し、次に、量子化手段１０４が、ＤＣＴ変換データに対
して量子化処理を行って量子化データを出力し、次に可
変長符号化手段１０５が量子化データに対して可変長符
号化処理を行うことによって、圧縮符号化された可変長
符号化データが作成される。可変長符号化データはビッ
トストリーム生成手段１０６に入力され、ビットストリ
ーム生成手段１０６から、伝送や記録を行うことのでき
るビットストリームとして、装置出力である符号化デー
タが出力される。In the encoding means 101, first, the DCT means 103 performs DCT (discrete cosine transform) processing on the input image data to output DCT transformed data, and then the quantization means 104 The quantized data is output by performing a quantization process on the transformed data, and then the variable-length encoding unit 105 performs a variable-length encoding process on the quantized data, so that the compressed encoded variable-length Encoded data is created. The variable length coded data is input to the bit stream generating means 106, and the bit stream generating means 106 outputs coded data, which is an output from the device, as a bit stream that can be transmitted or recorded.

【０１２７】逆量子化手段１０７は、量子化手段１０４
から出力された量子化データに対して、量子化処理の逆
処理である逆量子化処理を行って逆量子化データを出力
し、次に逆ＤＣＴ手段１０８が逆量子化データに対し
て、ＤＣＴ処理の逆処理である逆ＤＣＴ処理を行って、
逆ＤＣＴ変換データを出力し、逆ＤＣＴ変換データは予
測画像生成手段１０９に入力され、予測画像データとし
て出力されることとなる。符号化パラメータに従って、
予測画像を用いたフレーム間符号化処理が行われる場合
には、この予測画像データと入力画像データとの差分デ
ータがＤＣＴ手段１０３に入力されることにより、符号
化手段１０１においては順方向予測符号化が行われるこ
ととなる。The inverse quantization means 107 applies inverse quantization, the inverse of the quantization process, to the quantized data output from the quantization means 104 and outputs inversely quantized data. The inverse DCT means 108 then applies an inverse DCT, the inverse of the DCT process, to the inversely quantized data and outputs inverse DCT transformed data, which is input to the predicted image generation means 109 and output as predicted image data. When inter-frame encoding using a predicted image is performed according to the encoding parameters, the difference data between this predicted image data and the input image data is input to the DCT means 103, so that forward predictive encoding is performed in the encoding means 101.

【０１２８】また、本実施の形態１による映像符号化装
置では、符号化パラメータ決定手段１０２は、指定され
たフレームレートと符号化パターンとから、内包する解
像度参照テーブル１１０を用いて解像度を決定し、当該
決定した解像度を示すパラメータを含む、上記符号化パ
ラメータを符号化手段１０１に出力する。In the video coding apparatus according to the first embodiment, the coding parameter determining means 102 determines the resolution from the designated frame rate and coding pattern using the included resolution reference table 110. , And outputs the above-mentioned encoding parameter including the parameter indicating the determined resolution to the encoding means 101.

【０１２９】なお、本実施の形態１による映像符号化装
置は、パーソナルコンピュータ（ＰＣ）において処理制
御装置（ＣＰＵ）の制御により映像符号化プログラムが
実行されることによって実現されるものとし、符号化処
理の実行においては、以下の５つの条件が成立するもの
とする。The video encoding apparatus according to the first embodiment is realized by executing a video encoding program under the control of a processing control device (CPU) in a personal computer (PC). In execution of the processing, it is assumed that the following five conditions are satisfied.

【０１３０】（１) 符号化処理時間は、フレーム内符号
化処理、順方向予測符号化処理ともに、処理するフレー
ム画像の解像度に比例した時間を要するものとする。（２) 順方向予測符号化を実行した場合の処理時間は、
フレーム内符号化を実行した場合の６倍の時間がかかる
ものとする。（３) フレーム内符号化を実行した場合、得られる符号
化データ量は入力画像データの１／１０の量となり、順
方向予測符号化を実行した場合、得られる符号化データ
のデータ量は入力画像データの１／６０の量となるもの
とする。（４) 本装置を実現するＰＣのＣＰＵが動作周波数１０
０MHz で動作する場合に、３２０×２４０の解像度のフ
レーム画像をフレーム内符号化を用いて符号化処理した
場合には、１／２４秒で処理できるものとする。（５) 本装置の処理能力は、本装置に搭載されるＣＰＵ
の動作周波数に比例するものとする。すなわち、本装置
における符号化処理の処理時間は、動作周波数の逆数に
比例するものとする。ここで、本装置に搭載されるＣＰＵの動作周波数は１０
０MHz であり、符号化開始時に指定されるフレームレー
トは２４フレーム／秒、符号化タイプの組み合わせとし
ての符号化パターンは、すべて「Ｉ」のみとするパター
ンであるパターン１「ＩＩ」と、２フレームごとに
「Ｉ」「Ｐ」を繰り返すパターン２「ＩＰ」との２種類
があるとする。ただし、フレーム内符号化を「Ｉ」、順
方向予測符号化を「Ｐ」で表すものとする。
(1) For both intra-frame encoding and forward predictive encoding, the encoding time is proportional to the resolution of the frame image being processed.
(2) Forward predictive encoding takes six times as long as intra-frame encoding.
(3) Intra-frame encoding yields encoded data 1/10 the size of the input image data; forward predictive encoding yields encoded data 1/60 the size of the input image data.
(4) When the CPU of the PC implementing this apparatus runs at an operating frequency of 100 MHz, a frame image with a resolution of 320 × 240 can be intra-frame encoded in 1/24 second.
(5) The processing capability of the apparatus is proportional to the operating frequency of its CPU; that is, the encoding time is inversely proportional to the operating frequency.
Here, the CPU of the apparatus operates at 100 MHz, the frame rate specified at the start of encoding is 24 frames/second, and two encoding patterns (combinations of encoding types) are available: pattern 1 "II", in which every frame is "I", and pattern 2 "IP", which repeats "I" and "P" in a two-frame cycle. "I" denotes intra-frame encoding and "P" denotes forward predictive encoding.
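The five conditions above amount to a simple cost model, which can be written down as a minimal sketch. The function and constant names are hypothetical and are not taken from the patent; exact fractions are used so that the arithmetic matches the stated ratios.

```python
from fractions import Fraction

BASE_PIXELS = 320 * 240        # reference resolution in condition (4)
BASE_TIME = Fraction(1, 24)    # seconds per I frame at the reference point
BASE_CLOCK_MHZ = 100           # reference CPU clock in condition (4)
P_COST_FACTOR = 6              # condition (2): P costs six times as much as I
SIZE_RATIO = {"I": Fraction(1, 10), "P": Fraction(1, 60)}   # condition (3)


def frame_time(width: int, height: int, coding_type: str,
               clock_mhz: int = BASE_CLOCK_MHZ) -> Fraction:
    """Seconds needed to encode one frame under conditions (1), (2), (4), (5)."""
    if coding_type not in ("I", "P"):
        raise ValueError(f"unknown coding type: {coding_type!r}")
    t = BASE_TIME * Fraction(width * height, BASE_PIXELS)   # (1) scales with pixels
    if coding_type == "P":
        t *= P_COST_FACTOR                                   # (2)
    return t * Fraction(BASE_CLOCK_MHZ, clock_mhz)           # (5) inverse of clock


def encoded_size(input_bytes: int, coding_type: str) -> Fraction:
    """Encoded data size for one frame under condition (3)."""
    return input_bytes * SIZE_RATIO[coding_type]


if __name__ == "__main__":
    print(frame_time(320, 240, "I"))        # reference case from condition (4)
    print(frame_time(160, 120, "P"))        # quarter resolution, P frame
    print(frame_time(320, 240, "I", 200))   # doubled clock
```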

【０１３１】以上のような設定のもとに、上述のように
構成された本実施の形態１による映像符号化装置の動作
を以下に説明する。まず、符号化対象である映像はデジ
タル化され、一連のフレーム画像として当該符号化装置
の符号化手段１０１に入力される。図２は、符号化手段
１０１の動作を示すフローチャート図である。符号化手
段１０１の動作を、以下に、図２に従って説明する。な
お、符号化パラメータ決定手段１０２は、符号化開始時
の最初のフレーム画像に対しては、符号化手段１０１に
対して必ずフレーム内符号化を指示するものとする。The operation of the video encoding apparatus according to the first embodiment configured as described above under the above settings will be described below. First, the video to be encoded is digitized and input to the encoding unit 101 of the encoding device as a series of frame images. FIG. 2 is a flowchart illustrating the operation of the encoding unit 101. The operation of the encoding means 101 will be described below with reference to FIG. Note that the encoding parameter determination unit 102 always instructs the encoding unit 101 to perform intra-frame encoding for the first frame image at the start of encoding.

【０１３２】ステップＡ０１では、符号化パラメータ決
定手段１０２より入力された符号化パラメータについて
判断がなされ、フレーム内符号化が指示されていた場合
にはステップＡ０２以降の処理が実行され、順方向予測
符号化が指示されていた場合には、ステップＡ０７以降
の処理が実行される。In step A01, the encoding parameters input from the encoding parameter determination means 102 are examined. If intra-frame encoding has been instructed, the processing from step A02 onward is executed; if forward predictive encoding has been instructed, the processing from step A07 onward is executed.

【０１３３】ステップＡ０２以降が実行される場合は、
次のようになる。ステップＡ０２でＤＣＴ処理手段１０
３は、符号化パラメータ決定手段１０２が指示する解像
度に基づき、入力されたフレーム画像を８画素×８画素
のブロックに分割し、分割したブロックごとに２次元離
散コサイン変換して、ＤＣＴ変換データを出力する。次
いで、ステップＡ０３では、量子化手段１０４は、ＤＣ
Ｔ変換データに対して、ある定められた値を用いて量子
化処理を行い、量子化データを出力する。そして、ステ
ップＡ０４で、可変長符号化手段１０５は、量子化デー
タを可変長符号化し、可変長符号化データを出力する。
ステップＡ０５において、ビットストリーム生成手段１
０６は、可変長符号化手段１０５が出力した可変長符号
化データと、符号化パラメータ決定手段１０２より出力
された解像度、および符号化タイプとを用いて、当該映
像符号化装置の装置出力として、符号化結果であるビッ
トストリームを生成して出力する。When step A02 and the subsequent steps are executed, processing proceeds as follows. In step A02, the DCT processing means 103 divides the input frame image into blocks of 8 × 8 pixels based on the resolution instructed by the encoding parameter determination means 102, applies a two-dimensional discrete cosine transform to each block, and outputs DCT transformed data. In step A03, the quantization means 104 quantizes the DCT transformed data using a predetermined value and outputs quantized data. In step A04, the variable-length encoding means 105 variable-length encodes the quantized data and outputs variable-length encoded data. In step A05, the bit stream generation means 106 uses the variable-length encoded data output by the variable-length encoding means 105, together with the resolution and encoding type output by the encoding parameter determination means 102, to generate and output a bit stream, the encoding result, as the output of the video encoding apparatus.
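Steps A02 and A03 can be sketched in a few lines. This is an illustrative sketch only: the patent does not specify the DCT normalization or the quantization value, so the orthonormal DCT-II and the uniform step size of 16 used here are assumptions, and the function names are hypothetical. Frame dimensions are assumed to be multiples of 8, which holds for 320 × 240 and 160 × 120.

```python
import math

N = 8  # block size used in step A02


def split_into_blocks(frame):
    """Split a frame (list of rows) into 8x8 blocks, scanning left to right, top to bottom."""
    height, width = len(frame), len(frame[0])
    return [[row[x:x + N] for row in frame[y:y + N]]
            for y in range(0, height, N)
            for x in range(0, width, N)]


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of one 8x8 block."""
    def scale(u):
        return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantization with a single step size (assumed; step A03 only says
    that a predetermined value is used)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


if __name__ == "__main__":
    frame = [[128] * 16 for _ in range(16)]   # flat grey 16x16 test frame
    blocks = split_into_blocks(frame)
    print(len(blocks))                        # number of 8x8 blocks
    print(quantize(dct_2d(blocks[0]))[0][:3]) # first quantized coefficients
```

For a flat block only the DC coefficient survives quantization, which is the property that the variable-length encoding of step A04 exploits.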

【０１３４】ステップＡ０６では、符号化が終了してい
るか否かが判断され、符号化が終了したと判断されたな
らば処理は終了する。一方、符号化終了でなければ上記
のステップＡ０１に戻り、ステップＡ０１の判断以降が
実行される。In step A06, it is determined whether or not the encoding has been completed. If it is determined that the encoding has been completed, the process ends. On the other hand, if the encoding is not completed, the process returns to step A01, and the steps after step A01 are executed.

【０１３５】これに対して、ステップＡ０１の判断によ
り、ステップＡ０７以降が実行される場合は次のように
なる。まず、ステップＡ０７で逆量子化手段１０７は、
量子化手段１０４が直前のフレーム画像に対してすでに
出力している量子化データを逆量子化し、逆量子化デー
タを出力する。次いでステップＡ０８では、逆ＤＣＴ処
理手段１０８が、逆量子化データに対して、ＤＣＴ処理
手段１０３が分割した８画素×８画素のブロックごと
に、２次元離散コサイン変換の逆処理である２次元逆離
散コサイン変換を実行し、逆ＤＣＴ変換データを出力す
る。ステップＡ０９において、予測画像生成手段１０９
は、逆ＤＣＴ変換データに基づいて予測画像を生成し出
力する。In contrast, when step A07 and the subsequent steps are executed as a result of the judgment in step A01, processing proceeds as follows. First, in step A07, the inverse quantization means 107 inversely quantizes the quantized data that the quantization means 104 has already output for the immediately preceding frame image, and outputs inversely quantized data. In step A08, the inverse DCT processing means 108 applies to the inversely quantized data, for each 8 × 8 pixel block produced by the DCT processing means 103, a two-dimensional inverse discrete cosine transform, the inverse of the two-dimensional discrete cosine transform, and outputs inverse DCT transformed data. In step A09, the predicted image generation means 109 generates and outputs a predicted image based on the inverse DCT transformed data.

【０１３６】ステップＡ１０でＤＣＴ処理手段１０３
は、入力されたフレーム画像と予測画像生成手段１０９
が出力した予測画像とを、それぞれ指示された解像度に
基づき、８画素×８画素のブロックに分割し、分割した
ブロックごとに、入力されたフレーム画像のデータから
予測画像のデータを差し引くことにより差分データを得
る。そして、この差分データに対して、分割したブロッ
クごとに２次元離散コサイン変換して、ＤＣＴ変換デー
タを出力する。ＤＣＴ変換データが出力された後のステ
ップＡ１１〜Ａ１４はステップＡ０３からＡ０６と同様
に実行される。In step A10, the DCT processing means 103 divides both the input frame image and the predicted image output by the predicted image generation means 109 into blocks of 8 × 8 pixels based on the instructed resolution and, for each block, subtracts the predicted image data from the input frame image data to obtain difference data. It then applies a two-dimensional discrete cosine transform to the difference data block by block and outputs DCT transformed data. After the DCT transformed data has been output, steps A11 to A14 are executed in the same way as steps A03 to A06.

【０１３７】このように、符号化手段１０１では、入力
されたフレーム画像ごとに、ステップＡ０１の判定によ
り、ステップＡ０２〜Ａ０６か、ステップＡ０７〜Ａ１
４かの処理が行われることとなる。ステップＡ０２〜Ａ
０６はフレーム内符号化であり、ステップＡ０７〜Ａ１
４は直前のフレーム画像に対しての符号化結果を用いた
予測画像に基づく順方向符号化処理が行われるものであ
り、この切り替えはステップＡ０１の判定において、入
力された符号化パラメータに従ってなされるものであ
る。In this way, for each input frame image the encoding means 101 performs either steps A02 to A06 or steps A07 to A14, according to the judgment in step A01. Steps A02 to A06 perform intra-frame encoding; steps A07 to A14 perform forward predictive encoding based on a predicted image derived from the encoding result of the immediately preceding frame image. The switch between the two is made in step A01 according to the input encoding parameters.
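The branch structure of FIG. 2 described above can be summarized as a small lookup. This is only a sketch of the control flow; the step descriptions paraphrase the text, and the function name is hypothetical.

```python
# Steps executed after the judgment in step A01, keyed by encoding type.
INTRA_PATH = [
    ("A02", "split into 8x8 blocks and apply 2-D DCT"),
    ("A03", "quantize"),
    ("A04", "variable-length encode"),
    ("A05", "generate bit stream"),
    ("A06", "check for end of encoding"),
]
PREDICTIVE_PATH = [
    ("A07", "inverse-quantize previous frame's quantized data"),
    ("A08", "apply 2-D inverse DCT"),
    ("A09", "generate predicted image"),
    ("A10", "subtract predicted image and apply 2-D DCT"),
    ("A11", "quantize"),
    ("A12", "variable-length encode"),
    ("A13", "generate bit stream"),
    ("A14", "check for end of encoding"),
]


def steps_for_frame(coding_type: str) -> list[str]:
    """Return the step labels run for one frame, starting with the A01 judgment."""
    if coding_type == "I":
        path = INTRA_PATH
    elif coding_type == "P":
        path = PREDICTIVE_PATH
    else:
        raise ValueError(f"unknown coding type: {coding_type!r}")
    return ["A01"] + [label for label, _ in path]


if __name__ == "__main__":
    for frame_type in "IP":
        print(frame_type, steps_for_frame(frame_type))
```

The longer P path reflects the extra inverse quantization, inverse DCT, and prediction work that makes forward predictive encoding more expensive than intra-frame encoding.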

【０１３８】（表１）は符号化パラメータ決定手段１０
２が内包する解像度参照テーブル１１０を示す表であ
る。また、図３は符号化パラメータ決定手段１０２の動
作を示すフローチャート図である。以下に、符号化パラ
メータを決定して、符号化手段１０１に出力する符号化
パラメータ決定手段１０２の動作を、表１を参照し、図
３のフローに従って説明する。Table 1 shows the resolution reference table 110 contained in the encoding parameter determination means 102, and FIG. 3 is a flowchart showing the operation of the encoding parameter determination means 102. The operation of the encoding parameter determination means 102, which determines the encoding parameters and outputs them to the encoding means 101, is described below with reference to Table 1 and following the flow of FIG. 3.

【０１３９】[0139]

【表１】 [Table 1]

【０１４０】（表１）に示す解像度参照テーブル１１０
は、符号化に先立ちって予め作成しておかれるものであ
る。テーブル作成は、後述する条件を考慮した上で、例
えば経験的知識に基づいて、あるいは実験符号化やシミ
ュレーション等の結果を用いて、することができる。
（表１）の「入力」の欄は設定されたフレームレートと
指示されるパラメータとを、また「出力」の欄は入力に
対応して決定されるパラメータを示している。同表に示
すように、本実施の形態１ではフレームレートと符号化
パターンとに対応して、解像度が決定される。フレーム
レートについては固定的に「２４（フレーム／秒）」と
されているものであり、一方、符号化パターンについて
は、「ＩＩ」のパターン１と、「ＩＰ」のパターン２と
を指示することができる。これらのうち、パターン１
「ＩＩ」はすべてのフレーム画像に対してフレーム内符
号化( Ｉ) をすることを意味し、パターン２「ＩＰ」
は、２フレームごとにフレーム内符号化( Ｉ) と順方向
予測符号化( Ｐ) とを繰り返すことを意味する。The resolution reference table 110 shown in Table 1 is created in advance, before encoding. It can be created, taking into account the conditions described later, from empirical knowledge, for example, or from the results of trial encoding or simulation. The "input" column of Table 1 shows the set frame rate and the specified parameter, and the "output" column shows the parameter determined from that input. As the table shows, in Embodiment 1 the resolution is determined from the frame rate and the encoding pattern. The frame rate is fixed at 24 frames/second, while the encoding pattern can be specified as either pattern 1 "II" or pattern 2 "IP". Pattern 1 "II" means that every frame image is intra-frame encoded (I); pattern 2 "IP" means that intra-frame encoding (I) and forward predictive encoding (P) alternate in a two-frame cycle.

【０１４１】参照テーブルの作成は、次の条件を考慮し
て行われる。第一に、順方向予測符号化処理は、逆量子
化手段１０７、逆ＤＣＴ処理手段１０８、および予測画
像生成手段１０９による処理が付加される分、フレーム
内符号化処理よりも処理量が多くなること、第二に、入
力画像に対して高解像度での符号化を行った場合は、低
解像度で符号化した場合と比べて処理量が多くなること
である。これらの条件を考慮し、指定されたフレームレ
ートを実現しつつ、できるだけ高解像度で符号化処理を
実行できるように、解像度参照テーブル１１０は作成さ
れるものである。The creation of the reference table is performed in consideration of the following conditions. First, the forward prediction encoding process requires a larger processing amount than the intra-frame encoding process because the processes performed by the inverse quantization unit 107, the inverse DCT processing unit 108, and the predicted image generation unit 109 are added. Second, when the input image is coded at a high resolution, the processing amount is larger than when the coding is performed at a low resolution. In consideration of these conditions, the resolution reference table 110 is created so that the encoding process can be executed at the highest possible resolution while realizing the specified frame rate.

【０１４２】まず、図３のフローのステップＢ０１にお
いて、符号化パラメータ決定手段１０２は、指定された
フレームレートである２４フレーム／秒と、符号化パタ
ーン( ＩＩもしくはＩＰ) とから、解像度参照テーブル
１１０を参照して、符号化を実行するフレーム画像の解
像度を決定する。First, in step B01 of the flow of FIG. 3, the encoding parameter determination means 102 determines the resolution of the frame images to be encoded by looking up the specified frame rate of 24 frames/second and the encoding pattern (II or IP) in the resolution reference table 110.
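The lookup in step B01 can be sketched as a dictionary keyed by frame rate and encoding pattern. Table 1 itself is not reproduced in this text, so the two entries below are an assumption inferred from the resolutions discussed for Table 2 in paragraph [0147] (320 × 240 for pattern "II" and 160 × 120 for pattern "IP"); the names are hypothetical.

```python
# Assumed contents of the resolution reference table 110:
# (frame rate in frames/second, encoding pattern) -> (width, height)
RESOLUTION_TABLE = {
    (24, "II"): (320, 240),
    (24, "IP"): (160, 120),
}


def decide_resolution(frame_rate: int, pattern: str) -> tuple[int, int]:
    """Step B01: look up the resolution for the given frame rate and pattern."""
    try:
        return RESOLUTION_TABLE[(frame_rate, pattern)]
    except KeyError:
        raise ValueError(
            f"no table entry for {frame_rate} frames/s with pattern {pattern!r}"
        ) from None


if __name__ == "__main__":
    print(decide_resolution(24, "II"))
    print(decide_resolution(24, "IP"))
```

Because the table is prepared before encoding, the decision at run time is a constant-time lookup and adds no measurable load to the encoder.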

【０１４３】次いでステップＢ０２では、符号化パラメ
ータ決定手段１０２は、符号化手段１０１に対して、ス
テップＢ０１で決定した解像度を指示するとともに、指
定された符号化パターンを実現できるように、処理対象
であるフレーム画像に用いるべき符号化タイプ( Ｉもし
くはＰ) を指示する。Next, in step B02, the encoding parameter determination means 102 instructs the encoding means 101 to use the resolution determined in step B01 and, so that the specified encoding pattern is realized, also indicates the encoding type (I or P) to be used for the frame image being processed.

【０１４４】その後、ステップＢ０３では符号化が終了
したか否かが判定され、符号化が終了したと判定された
ならば処理は終了する。一方、終了でなければ、ステッ
プＢ０２に戻ることによって、符号化手段１０１に対す
る符号化パラメータ出力が繰り返される。Thereafter, in step B03, it is determined whether or not the coding has been completed. If it is determined that the coding has been completed, the process is completed. On the other hand, if not finished, the process returns to step B02, whereby the output of the encoding parameters to the encoding means 101 is repeated.
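The loop formed by steps B02 and B03 can be sketched as a generator that emits one parameter set per frame. The sketch assumes the encoding pattern is given as a string such as "II" or "IP" and that the number of frames is known in advance, which stands in for the end-of-encoding test of step B03; the function name is hypothetical.

```python
from itertools import cycle, islice


def encoding_parameters(pattern: str, resolution: tuple[int, int], num_frames: int):
    """Yield (resolution, coding_type) for each frame.

    The resolution decided in step B01 stays fixed, while the coding type
    cycles through the pattern (step B02) until num_frames parameter sets
    have been produced (a stand-in for the check in step B03).
    """
    for coding_type in islice(cycle(pattern), num_frames):
        yield resolution, coding_type


if __name__ == "__main__":
    for params in encoding_parameters("IP", (160, 120), 4):
        print(params)
```

Both patterns begin with "I", so the first frame is always intra-frame encoded, consistent with the requirement stated for the start of encoding.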

【０１４５】符号化手段１０１と、符号化パラメータ決
定手段１０２との以上のような動作によって、符号化が
実行されるが、（表２）は、本実施の形態１による映像
符号化装置において符号化を行なった結果を示す表であ
る。Encoding is carried out through the operations of the encoding means 101 and the encoding parameter determination means 102 described above. Table 2 shows the results of encoding with the video encoding apparatus of Embodiment 1.

【０１４６】[0146]

【表２】 [Table 2]

【０１４７】（表２）は、指示される２つの符号化条件
に対して、本実施の形態１の符号化装置において決定さ
れる解像度（決定されるパラメータ）と、それらのパラ
メータを用いた符号化処理の結果として得られたフレー
ムレート（符号化結果）とを示している。（表２）に示
す符号化結果の数値については、符号化パターン「Ｉ
Ｉ」において、解像度を３２０×２４０とした場合に２
４フレーム／秒で処理できることに基づいて、その他の
場合のフレームレートが算出されている。符号化パター
ンがＩＰで解像度が１６０×１２０の場合のフレームレ
ートは、Ｐの処理にＩの処理の６倍の時間を要すること
と、解像度が１／４の場合は１／４の時間で処理できる
こととから、２枚のフレーム画像を符号化するのに( １
／２４＋６／２４) ÷４＝０．０７３秒を要することが
算出でき、これから、２７．４２８フレーム／秒と算出
できる。Table 2 shows, for each of the two specified encoding conditions, the resolution determined by the encoding apparatus according to the first embodiment (the determined parameter) and the frame rate obtained as a result of encoding with those parameters (the encoding result). The encoding-result values in Table 2 are calculated from the fact that encoding pattern "II" at a resolution of 320 × 240 can be processed at 24 frames/second. For encoding pattern IP at a resolution of 160 × 120, processing a P frame takes six times as long as processing an I frame, and a frame at 1/4 the resolution is processed in 1/4 of the time, so encoding two frame images takes (1/24 + 6/24) ÷ 4 = 0.073 seconds, which gives a frame rate of 27.428 frames/second.
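The arithmetic in the paragraph above can be reproduced with a short sketch. It assumes only the three facts stated in the text: pattern "II" at 320 × 240 runs at 24 frames/second, a P frame costs six times an I frame, and processing time scales with the pixel count. The function and constant names are illustrative and do not come from the patent.

```python
from fractions import Fraction

BASE_FPS = 24              # "II" at 320x240 runs at 24 frames/s (stated in the text)
P_COST_FACTOR = 6          # a P frame takes 6x the time of an I frame (stated in the text)
BASE_PIXELS = 320 * 240    # reference resolution for the baseline above


def estimated_frame_rate(pattern, width, height):
    """Estimate frames/second for one repetition of a pattern such as 'IP'."""
    i_time = Fraction(1, BASE_FPS)                    # seconds per I frame at 320x240
    scale = Fraction(width * height, BASE_PIXELS)     # time scales with pixel count
    total = sum(i_time * (P_COST_FACTOR if t == "P" else 1) for t in pattern) * scale
    return len(pattern) / total


print(float(estimated_frame_rate("IP", 160, 120)))   # about 27.43 frames/s
```

Exact fractions are used so that the result, 192/7 frames/second, can be compared directly with the 27.428 quoted in the text.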

【０１４８】比較のため、（表３）に従来の技術による
映像符号化装置を用いて符号化を行なった場合の動作結
果を示す。For comparison, Table 3 shows the operation results when encoding is performed using a conventional video encoding device.

【０１４９】[0149]

【表３】 [Table 3]

【０１５０】（表３）においても（表２）の場合と同様
の算出がなされており、符号化パターンＩＩにおいて、
解像度が３２０×２４０の場合に２４フレーム／秒で処
理できることに基づいて、その他の場合のフレームレー
トが算出されている。The values in Table 3 are calculated in the same way as in Table 2: the frame rates for the other cases are derived from the fact that encoding pattern II at a resolution of 320 × 240 can be processed at 24 frames/second.

【０１５１】従来の技術による映像符号化装置では、符
号化結果として得られるフレームレートを考慮せずに、
符号化タイプ（パターン）、あるいは解像度を決定して
いたものである。従って、符号化処理の結果として得ら
れるフレームレートが要望される値に近くなるように設
定することが困難であり、不必要な数値となってしまう
設定を選定せざるを得ない場合などがあった。これに比
べ、本実施の形態１の映像符号化装置においては、符号
化結果であるフレームレートを考慮して、指定された符
号化タイプ（パターン）に応じて解像度を決定すること
で、表２と表３との対比において示されるように、指定
されたフレームレートに近いフレームレートを実現しつ
つ、より高解像度での符号化が実行されていることがわ
かる。In the conventional video encoding apparatus, the encoding type (pattern) or the resolution was determined without considering the frame rate obtained as the encoding result. It was therefore difficult to choose settings so that the resulting frame rate came close to the desired value, and in some cases a setting that yielded an unnecessary value had to be selected. In contrast, the video encoding apparatus according to the first embodiment determines the resolution according to the specified encoding type (pattern) while taking the resulting frame rate into account. As the comparison between Table 2 and Table 3 shows, it achieves a frame rate close to the specified one while encoding at a higher resolution.

【０１５２】このように、本実施の形態１による映像符
号化装置によれば、符号化手段１０１と、解像度参照テ
ーブル１１０を内包した符号化パラメータ決定手段１０
２とを備えたことで、符号化パラメータ決定手段１０２
は、指定されたフレームレートと符号化タイプとに対応
して解像度を決定して、符号化パラメータを符号化手段
１０１に出力し、符号化手段１０１はこの符号化パラメ
ータに応じて符号化の処理を行うので、要求される条件
を実現しつつ、より高解像度での符号化を行うことが可
能となる。As described above, the video encoding apparatus according to the first embodiment comprises the encoding means 101 and the encoding parameter determining means 102, which includes the resolution reference table 110. The encoding parameter determining means 102 determines the resolution according to the specified frame rate and encoding type and outputs the encoding parameters to the encoding means 101, and the encoding means 101 performs encoding according to those parameters. Encoding at a higher resolution is thus possible while the required conditions are satisfied.

【０１５３】なお、本実施の形態１による映像符号化装
置では、指定された符号化パターンに対応して解像度を
決定するものとしたが、同様の参照テーブルを用いるこ
とによって、指定された解像度に対応して符号化パター
ン（タイプ）を決定することも可能であり、要求される
フレームレートと符号化パターンとの下で、より圧縮率
の高い符号化結果の得られる処理をすることが可能とな
る。In the video encoding apparatus according to the first embodiment, the resolution is determined according to the specified encoding pattern. However, by using a similar reference table, it is also possible to determine the encoding pattern (type) according to a specified resolution, which makes it possible to obtain an encoding result with a higher compression ratio under the required frame rate and encoding pattern.

【０１５４】実施の形態２．本発明の実施の形態２によ
る映像符号化方法は、当該符号化装置の処理能力に対応
して、設定されたフレームレートに基づいて符号化パラ
メータを決定するものであり、制御装置（ＣＰＵ）の動
作周波数により、処理能力を判断するものである。Embodiment 2. The video encoding method according to the second embodiment of the present invention determines the encoding parameters based on the set frame rate in accordance with the processing capability of the encoding apparatus, and judges that processing capability from the operating frequency of the control device (CPU).

【０１５５】図４は、本発明の実施の形態２による映像
符号化装置の構成を示すブロック図である。図示するよ
うに、本実施の形態２による映像符号化装置は、符号化
手段２０１と、符号化パラメータ決定手段２０２と、処
理能力判断手段２１１とから構成されており、符号化手
段２０１は、ＤＣＴ処理手段２０３、量子化手段２０
４、可変長符号化手段２０５、ビットストリーム生成手
段２０６、逆量子化手段２０７、逆ＤＣＴ処理手段２０
８、および予測画像生成手段２０９を、また、符号化パ
ラメータ決定手段２０２は符号化パターン参照テーブル
２１０を内包している。FIG. 4 is a block diagram showing the configuration of a video encoding apparatus according to the second embodiment of the present invention. As shown in the figure, the video encoding apparatus according to the second embodiment comprises an encoding means 201, an encoding parameter determining means 202, and a processing capability judging means 211. The encoding means 201 includes a DCT processing means 203, a quantization means 204, a variable-length encoding means 205, a bit stream generating means 206, an inverse quantization means 207, an inverse DCT processing means 208, and a predicted image generating means 209, and the encoding parameter determining means 202 includes an encoding pattern reference table 210.

【０１５６】符号化手段２０１は、実施の形態１による
映像符号化装置の符号化手段１０１と同様であり、符号
化パラメータ決定手段１０２から入力される符号化パラ
メータに対応して、入力されたフレーム画像に対して、
指示された解像度で、かつ、フレーム内符号化（Ｉ) 、
または順方向予測符号化（Ｐ) といった指示された符号
化タイプで符号化を行う。The encoding means 201 is the same as the encoding means 101 of the video encoding apparatus according to the first embodiment. In accordance with the encoding parameters input from the encoding parameter determining means, it encodes each input frame image at the indicated resolution and with the indicated encoding type, either intra-frame encoding (I) or forward predictive encoding (P).

【０１５７】符号化パラメータ決定手段２０２は、処理
能力判断手段２１１の判断結果に応じて符号化パラメー
タを決定し、符号化手段２０１に出力する。処理能力判
断手段２１１は、当該符号化装置の符号化処理能力を判
断し、判断結果を符号化パラメータ決定手段２０２に出
力する。本実施の形態２では、処理能力判断手段２１１
は、当該符号化装置の処理能力を示す「ＣＰＵの動作周
波数」を判断結果として出力するものであり、符号化パ
ラメータ決定手段２０２は、指定されたフレームレー
ト、および解像度と、判断結果である動作周波数とから
符号化パターンを決定し、当該符号化パターンに応じて
符号化手段２０１に符号化タイプを指示するものであ
る。符号化パターンを決定するため、符号化パラメータ
決定手段２０２は、内包する符号化パターン参照テーブ
ル２１０を用いる。The encoding parameter determining means 202 determines the encoding parameters according to the judgment result of the processing capability judging means 211 and outputs them to the encoding means 201. The processing capability judging means 211 judges the encoding processing capability of the encoding apparatus and outputs the judgment result to the encoding parameter determining means 202. In the second embodiment, the processing capability judging means 211 outputs, as its judgment result, the "operating frequency of the CPU", which indicates the processing capability of the encoding apparatus. The encoding parameter determining means 202 determines the encoding pattern from the specified frame rate and resolution and from the operating frequency given as the judgment result, and instructs the encoding means 201 on the encoding type according to that pattern. To determine the encoding pattern, the encoding parameter determining means 202 uses the encoding pattern reference table 210 that it includes.

【０１５８】なお、本実施の形態２による映像符号化装
置においても、実施の形態１と同様に、ＰＣにおける符
号化プログラムの実行によって実現されるものとし、実
施の形態１に示した条件（１）〜（５）が成立するもの
とする。また、本装置に搭載されるＣＰＵの動作周波数
は、１００MHz 、または１６６MHz のいずれかであると
し、符号化開始時に指定されるフレームレートは２４フ
レーム／秒であり、入力画像におけるフレーム画像の解
像度として、３２０×２４０あるいは１６０×１２０が
指定されるものとする。Note that the video encoding apparatus according to the second embodiment is, as in the first embodiment, realized by executing an encoding program on a PC, and conditions (1) to (5) described in the first embodiment are assumed to hold. The operating frequency of the CPU mounted in the apparatus is assumed to be either 100 MHz or 166 MHz, the frame rate specified at the start of encoding is 24 frames/second, and the resolution specified for the frame images of the input image is either 320 × 240 or 160 × 120.

【０１５９】以上のような設定のもとに、上述のように
構成された本実施の形態２による映像符号化装置の動作
を以下に説明する。入力画像データがフレーム画像ごと
に入力され、符号化手段２０１はこれを符号化処理す
る。符号化手段２０１の動作は、実施の形態１において
示した符号化手段１０１と同様である。The operation of the video encoding apparatus according to the second embodiment configured as described above under the above settings will be described below. Input image data is input for each frame image, and the encoding unit 201 performs an encoding process on the input image data. The operation of the encoding unit 201 is the same as that of the encoding unit 101 shown in the first embodiment.

【０１６０】一方、処理能力判断手段２１１は、本装置
の処理能力を判断するために、本装置を実現しているＰ
Ｃに搭載されているＣＰＵの動作周波数を検出し、これ
を判断結果として符号化パラメータ決定手段２０２に通
知する。符号化パラメータ決定手段２０２には、動作周
波数１００MHz 、または１６６MHz のいずれかを示す判
断結果が入力され、符号化パラメータ決定手段２０２
は、この判断結果を用いて符号化パラメータを決定す
る。Meanwhile, to judge the processing capability of the apparatus, the processing capability judging means 211 detects the operating frequency of the CPU mounted in the PC that implements the apparatus and notifies the encoding parameter determining means 202 of it as the judgment result. The encoding parameter determining means 202 thus receives a judgment result indicating an operating frequency of either 100 MHz or 166 MHz and uses it to determine the encoding parameters.

【０１６１】（表４）は符号化パラメータ決定手段２０
２が内包する符号化パターン参照テーブル２１０を示あ
う表である。また、図５は符号化パラメータ決定手段２
０２の動作を示すフローチャート図である。以下に、符
号化パラメータ決定手段２０２の動作を、表４を参照
し、図５のフローに従って説明する。Table 4 shows the encoding pattern reference table 210 included in the encoding parameter determining means 202, and FIG. 5 is a flowchart showing the operation of the encoding parameter determining means 202. The operation of the encoding parameter determining means 202 is described below with reference to Table 4 and following the flow of FIG. 5.

【０１６２】[0162]

【表４】 [Table 4]

【０１６３】実施の形態１における符号化パラメータ決
定手段１０１の内包する解像度参照テーブル１１０と同
様に、（表４）における符号化パターン参照テーブル２
１０は、後述する条件を考慮して符号化に先立って予め
作成されるものである。Like the resolution reference table 110 included in the encoding parameter determining means of the first embodiment, the encoding pattern reference table 210 shown in Table 4 is created in advance, before encoding, in consideration of the conditions described later.

【０１６４】（表４）に示す「入力」欄と「出力」欄と
の関係については（表１）と同様であり、設定されたフ
レームレートと解像度、および入力された判断結果であ
る動作周波数の３つに対応して、符号化パターンを決定
するものである。ここで、符号化パターンについては、
「ＩＩＩＩＩＩ」はすべてのフレーム画像に対してフレ
ーム内符号化( Ｉ) をすることを意味し、「ＩＰＩＰＩ
Ｐ」は、２フレームごとにフレーム内符号化( Ｉ) と順
方向予測符号化( Ｐ) とを繰り返すことを、「ＩＩＩＩ
ＩＰ」は６フレームごとにフレーム内符号化( Ｉ) を５
回繰り返したのち順方向予測符号化( Ｐ) を１回実施す
るという処理を繰り返すことを、また、「ＩＰＰＰＰ
Ｐ」は６フレームごとにフレーム内符号化( Ｉ) を１回
実施したのち順方向予測符号化( Ｐ) を５回繰り返すと
いう処理を繰り返すことを意味する。The relationship between the "input" and "output" columns in Table 4 is the same as in Table 1: the encoding pattern is determined from three items, namely the set frame rate, the set resolution, and the operating frequency received as the judgment result. As for the encoding patterns, "IIIIII" means that intra-frame encoding (I) is applied to every frame image; "IPIPIP" means that intra-frame encoding (I) and forward predictive encoding (P) alternate in a two-frame cycle; "IIIIIP" means that, in a six-frame cycle, intra-frame encoding (I) is repeated five times and then forward predictive encoding (P) is performed once; and "IPPPPP" means that, in a six-frame cycle, intra-frame encoding (I) is performed once and then forward predictive encoding (P) is repeated five times.

【０１６５】実施の形態１における解像度参照テーブル
の設定の場合と同様に、参照テーブルの作成は、次の条
件を考慮して行われる。第一に、順方向予測符号化処理
は、フレーム内符号化処理と比較して、処理量が多いが
高圧縮率で符号化できること、第二に、符号化に際して
フレーム画像を高解像度で符号化した場合は、低解像度
で符号化した場合と比べて処理量が多くなること、第三
に、ＣＰＵの動作周波数が高い程その処理能力は高く、
符号化処理を短時間で実行できることがその条件であ
る。これらの条件を考慮し、指定されたフレームレート
を実現しつつ、できるだけ高い圧縮率で符号化処理を実
行できるように、符号化パターン参照テーブル１１０は
作成される。As with the resolution reference table in the first embodiment, the reference table is created in consideration of the following conditions. First, forward predictive encoding requires more processing than intra-frame encoding but achieves a higher compression ratio. Second, encoding a frame image at high resolution requires more processing than encoding it at low resolution. Third, the higher the operating frequency of the CPU, the higher its processing capability and the shorter the time needed for encoding. Taking these conditions into account, the encoding pattern reference table 110 is created so that encoding is performed at the highest possible compression ratio while the specified frame rate is achieved.

【０１６６】処理能力判断手段２１１から判断結果が入
力されると、符号化パラメータ決定手段２０２は図５の
フローに従って動作する。まず、ステップＣ０１で符号
化パラメータ決定手段２０２は、指定されたフレームレ
ート( ２４フレーム／秒) と、解像度（３２０×２４
０、または１６０×１２０) と、処理能力判断手段２１
１により入力されるＣＰＵの動作周波数（１００MHz も
しくは１６６MHz ）とから、符号化パターン参照テーブ
ル２１０を参照して、符号化を実行する時の符号化パタ
ーンを決定する。When the judgment result is input from the processing capability judging means 211, the encoding parameter determining means 202 operates according to the flow of FIG. 5. First, in step C01, the encoding parameter determining means 202 refers to the encoding pattern reference table 210 and determines the encoding pattern to be used for encoding from the specified frame rate (24 frames/second), the specified resolution (320 × 240 or 160 × 120), and the CPU operating frequency (100 MHz or 166 MHz) input from the processing capability judging means 211.

【０１６７】続いてステップＣ０２が実行され、符号化
パラメータ決定手段２０２は、符号化手段２０１に対し
て符号化パラメータを出力する。符号化パラメータ決定
手段２０２は、ステップＣ０１で決定した符号化パター
ンを実現できるように符号化タイプ( ＩもしくはＰ) を
指示するとともに、指定された解像度を指示する。その
後、ステップＣ０３において符号化が終了したか否かが
判定され、符号化が終了したと判定されたならば処理は
終了する。一方、終了でなければ、ステップＣ０２に戻
ることによって、符号化手段２０１に対する符号化パラ
メータ出力が繰り返される。Subsequently, step C02 is executed, and the coding parameter determining means 202 outputs coding parameters to the coding means 201. The coding parameter determining means 202 specifies the coding type (I or P) so as to realize the coding pattern determined in step C01, and also specifies the designated resolution. Thereafter, it is determined in step C03 whether or not encoding has been completed. If it is determined that encoding has been completed, the process ends. On the other hand, if the processing has not been completed, the process returns to step C02, whereby the output of the coding parameters to the coding means 201 is repeated.
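Steps C01 to C03 can be sketched as a small generator. This is only an illustration under stated assumptions: Table 4 is not reproduced in the text, so the lookup table below holds just the one case that appears in the worked example of the description, and the `frame_count` argument is a hypothetical stand-in for the "encoding finished" check of step C03.

```python
from itertools import cycle, islice

# Stand-in for the encoding pattern reference table 210. Table 4 is not
# reproduced in the text, so only the case from the worked example is listed.
# Key: (frame rate in frames/s, resolution, CPU MHz) -> encoding pattern.
PATTERN_TABLE = {
    (24, (320, 240), 166): "IIIIIP",
}


def encoding_parameters(frame_rate, resolution, cpu_mhz, frame_count):
    """Yield (encoding type, resolution) for each frame until encoding ends."""
    # Step C01: look up the pattern once, before encoding starts.
    pattern = PATTERN_TABLE[(frame_rate, resolution, cpu_mhz)]
    # Step C03 is modelled by stopping after frame_count frames (an assumption).
    for encoding_type in islice(cycle(pattern), frame_count):
        # Step C02: output the encoding type (I or P) and the resolution.
        yield encoding_type, resolution


for params in encoding_parameters(24, (320, 240), 166, 8):
    print(params)
```

Cycling the pattern string keeps the per-frame instruction of step C02 separate from the one-time table lookup of step C01, which mirrors the loop back to C02 in the flowchart.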

【０１６８】符号化手段２０１、符号化パラメータ決定
手段２０２、および処理能力判断手段２１１の以上のよ
うな動作によって、符号化が実行されるが、（表５）
は、本実施の形態２による映像符号化装置において符号
化を行なった結果を示す表である。Encoding is carried out through the operations of the encoding means 201, the encoding parameter determining means 202, and the processing capability judging means 211 described above. Table 5 shows the results of encoding performed by the video encoding apparatus according to the second embodiment.

【０１６９】[0169]

【表５】 [Table 5]

【０１７０】（表５）においては、指定される符号化条
件、および処理能力についての判断結果と、以上から決
定される決定パラメータである符号化パターンについて
は（表４）と同様であり、それぞれの場合について、符
号化処理を行った結果（符号化結果）として得られるフ
レームレート、および符号化データ量を示している。こ
こで、符号化データ量については、入力画像における１
枚のフレーム画像のデータ量を１としたときの、符号化
データにおける１枚のフレーム画像に相当する符号化デ
ータのデータ量を示している。すなわち、符号化データ
量が少ないほど圧縮率が高い。In Table 5, the specified encoding conditions, the judgment result for the processing capability, and the encoding pattern determined from them (the determined parameter) are the same as in Table 4. For each case, the table shows the frame rate and the encoded data amount obtained as the encoding result. The encoded data amount is the amount of encoded data corresponding to one frame image, taking the data amount of one frame image of the input image as 1. In other words, the smaller the encoded data amount, the higher the compression ratio.

【０１７１】なお、（表５）においても、実施の形態１
の場合の（表２）と同様に、以下のようにして符号化結
果のフレームレートを算出している。すなわち、ＣＰＵ
の動作周波数が１００MHz 、符号化パターン「ＩＩＩＩ
ＩＩ」、解像度が３２０×２４０の場合に２４フレーム
／秒で処理できることに基づいて、その他の場合のフレ
ームレートが算出されるものである。例えば、ＣＰＵが
１６６MHz 、符号化パターン「ＩＩＩＩＩＰ」、解像度
が３２０×２４０の場合のフレームレートは、Ｐの処理
にＩの処理の６倍の時間を要すること、ＣＰＵの動作周
波数１６６MHzの場合は１００MHz の１００／１６６の
時間で符号化を処理できることから、６枚のフレーム画
像を符号化するのに（５／２４＋６／２４) ×（１００
／１６６) ＝０．２７６秒を要することが算出でき、こ
れから、２１．７３１フレーム／秒と算出できる。同様
に、符号化結果として示す１枚のフレーム画像における
符号化のデータ量についても、Ｉで符号化した場合に１
／１０となること、Ｐで符号化した場合に１／６０とな
ることにそれぞれ基づき算出されている。例えば、パタ
ーン「ＩＩＩＩＩＰ」で符号化した場合には、６枚のフ
レーム画像の符号化データ量は（５／１０＋１／６０)
＝０．５１７になることから、１枚のフレーム画像に対
する符号化データ量は、０．０８６となる。In Table 5, as in Table 2 of the first embodiment, the frame rate of the encoding result is calculated as follows. The frame rates for the other cases are derived from the fact that, with a CPU operating frequency of 100 MHz, encoding pattern "IIIIII", and a resolution of 320 × 240, processing runs at 24 frames/second. For example, with a 166 MHz CPU, encoding pattern "IIIIIP", and a resolution of 320 × 240, processing a P frame takes six times as long as processing an I frame, and at a CPU operating frequency of 166 MHz encoding takes 100/166 of the time required at 100 MHz. Encoding six frame images therefore takes (5/24 + 6/24) × (100/166) = 0.276 seconds, which gives 21.731 frames/second. Similarly, the encoded data amount per frame image shown as an encoding result is calculated on the basis that the data amount becomes 1/10 with I encoding and 1/60 with P encoding. For example, with the pattern "IIIIIP", the encoded data amount for six frame images is (5/10 + 1/60) = 0.517, so the encoded data amount per frame image is 0.086.
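The two calculations above can be checked with the following sketch. It uses only the figures given in the text (24 frames/second for "IIIIII" at 100 MHz and 320 × 240, a sixfold cost for P frames, processing time inversely proportional to the CPU frequency, and data amounts of 1/10 for I and 1/60 for P). The function names are illustrative, and the resolution scaling is carried over from the first embodiment's calculation as an assumption.

```python
from fractions import Fraction


def frame_rate(pattern, cpu_mhz, width=320, height=240):
    """Estimated frames/second for a repeating pattern of 'I'/'P' types."""
    i_time = (Fraction(1, 24)                        # I frame at 100 MHz, 320x240
              * Fraction(100, cpu_mhz)               # faster CPU, shorter time
              * Fraction(width * height, 320 * 240)) # time scales with pixel count
    total = sum(i_time * (6 if t == "P" else 1) for t in pattern)
    return len(pattern) / total


def data_per_frame(pattern):
    """Encoded data per frame, relative to one input frame (= 1)."""
    total = sum(Fraction(1, 60) if t == "P" else Fraction(1, 10) for t in pattern)
    return total / len(pattern)


print(round(float(frame_rate("IIIIIP", 166)), 3))   # 21.731
print(round(float(data_per_frame("IIIIIP")), 3))    # 0.086
```

Both printed values match the figures quoted in the paragraph above for the 166 MHz, "IIIIIP", 320 × 240 case.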

【０１７２】比較のため、（表６）に従来の技術による
映像符号化装置を用いて符号化を行なった場合の符号化
結果を示す。For comparison, Table 6 shows an encoding result when encoding is performed using a conventional video encoding device.

【０１７３】[0173]

【表６】 [Table 6]

【０１７４】なお、（表６）においても、ＣＰＵの動作
周波数が１００MHz 、符号化パターン「ＩＩＩＩＩＩ」
で、解像度が３２０×２４０の場合に２４フレーム／秒
で処理できることも基づいて、その他の場合の符号化結
果であるフレームレートを算出しているものである。ま
た、符号化データ量に関しても、１枚のフレーム画像に
対して、Ｉで符号化した場合に１／１０となること、Ｐ
で符号化した場合に１／６０となることに基づき算出さ
れている。In Table 6 as well, the frame rates of the encoding results for the other cases are calculated from the fact that, with a CPU operating frequency of 100 MHz, encoding pattern "IIIIII", and a resolution of 320 × 240, processing runs at 24 frames/second. The encoded data amount is likewise calculated on the basis that, for one frame image, it becomes 1/10 with I encoding and 1/60 with P encoding.

【０１７５】従来の技術による映像符号化装置では、符
号化結果として得られるフレームレートや、当該符号化
装置を構成するハードウェア能力の変動を考慮せずに、
符号化タイプ（パターン）、あるいは解像度を決定して
いたものである。従って、かかる設定に従った符号化処
理の結果として得られるフレームレートは、ときには不
必要な数値となってしまうなど不具合があった。これに
比べ、本実施の形態２の映像符号化装置においては、当
該符号化装置の処理能力や、符号化結果であるフレーム
レートを考慮して、指定された解像度に応じて符号化タ
イプ（パターン）を決定することで、表５と表６との対
比において示されるように、指定されたフレームレート
に近いフレームレートを実現でき、かつ、より高圧縮率
での符号化が実行されていることがわかる。In the conventional video encoding apparatus, the encoding type (pattern) or the resolution was determined without considering the frame rate obtained as the encoding result or variations in the capability of the hardware constituting the encoding apparatus. As a result, the frame rate obtained by encoding with such settings sometimes took an unnecessary value. In contrast, the video encoding apparatus according to the second embodiment determines the encoding type (pattern) according to the specified resolution while taking into account the processing capability of the encoding apparatus and the resulting frame rate. As the comparison between Table 5 and Table 6 shows, it achieves a frame rate close to the specified one while encoding at a higher compression ratio.

【０１７６】特に、当該符号化装置のハードウェア的能
力の変動に対応し得るという利点は、映像符号化プログ
ラムをコンピュータ等において実行することによって、
当該映像符号化装置を実現する場合には有用である。In particular, the advantage of being able to cope with fluctuations in the hardware capabilities of the encoding apparatus is that by executing a video encoding program on a computer or the like,
This is useful when implementing the video encoding device.

【０１７７】このように、本実施の形態２による映像符
号化装置によれば、符号化手段２０１と、符号化パター
ン参照テーブル２１０を内包した符号化パラメータ決定
手段２０２と、処理能力判断手段２１１とを備えたこと
で、符号化パラメータ決定手段２０２は、指定されたフ
レームレートと解像度、そして処理能力判断手段２１１
の出力する判断結果とに対応して符号化パターンを決定
して、符号化パラメータを符号化手段２０１に出力し、
符号化手段２０１はこの符号化パラメータに応じて符号
化の処理を行うので、要求される条件を実現しつつ、よ
り高圧縮率の得られる符号化を行うことが可能となる。As described above, the video encoding apparatus according to the second embodiment comprises the encoding means 201, the encoding parameter determining means 202, which includes the encoding pattern reference table 210, and the processing capability judging means 211. The encoding parameter determining means 202 determines the encoding pattern according to the specified frame rate and resolution and the judgment result output by the processing capability judging means 211, and outputs the encoding parameters to the encoding means 201. The encoding means 201 performs encoding according to those parameters, so encoding with a higher compression ratio is possible while the required conditions are satisfied.

【０１７８】なお、本実施の形態２による映像符号化装
置では、指定された解像度に対応して符号化パターンを
決定するものとしたが、同様の参照テーブルを用いるこ
とによって、指定された符号化パターン（タイプ）に対
応して解像度を決定することも可能であり、要求される
フレームレートと符号化パターンとの下で、より高解像
度での符号化処理をすることが可能となる。In the video encoding apparatus according to the second embodiment, the encoding pattern is determined according to the designated resolution. However, by using the same lookup table, the designated encoding is performed. It is also possible to determine the resolution in accordance with the pattern (type), and it is possible to perform encoding processing at a higher resolution under the required frame rate and encoding pattern.

【０１７９】また、本実施の形態２では、処理能力の判
断については、ＣＰＵの動作周波数に基づいて行うもの
としているが、ＣＰＵあるいはＤＳＰ等のプロセッサの
品番、バージョン、製造メーカなどの、装置能力を示す
諸要素に基づいて判断することとしてもよく、種々の応
用が可能である。In the second embodiment, the processing capability is judged from the operating frequency of the CPU. However, it may instead be judged from other factors that indicate the capability of the apparatus, such as the product number, version, or manufacturer of a processor such as a CPU or DSP, and various applications are possible.

【０１８０】実施の形態３．本発明の実施の形態３によ
る映像符号化方法は、当該符号化装置の処理能力に対応
して、設定されたフレームレートに基づいて符号化パラ
メータを決定するものであり、所要処理時間に基づいて
処理能力を判断するものである。Embodiment 3. The video encoding method according to the third embodiment of the present invention determines the encoding parameters based on the set frame rate in accordance with the processing capability of the encoding apparatus, and judges that processing capability from the time required for processing.

【０１８１】図６は、本発明の実施の形態３による映像
符号化装置の構成を示すブロック図である。図示するよ
うに、本実施の形態３による映像符号化装置は、符号化
手段３０１と、符号化パラメータ決定手段３０２と、処
理能力判断手段３１１とから構成されており、符号化手
段３０１は、ＤＣＴ処理手段３０３、量子化手段３０
４、可変長符号化手段３０５、ビットストリーム生成手
段３０６、逆量子化手段３０７、逆ＤＣＴ処理手段３０
８、および予測画像生成手段３０９を、また、符号化パ
ラメータ決定手段３０２は符号化パターン決定手段３１
０を内包している。符号化手段３０１は、実施の形態１
による映像符号化装置の符号化手段１０１と同様であ
り、符号化パラメータ決定手段３０２から入力される符
号化パラメータに対応して、入力されたフレーム画像に
対して、指示された解像度で、かつ、フレーム内符号化
( Ｉ) 、または順方向予測符号化( Ｐ) といった指示さ
れた符号化タイプで符号化を行う。FIG. 6 is a block diagram showing the configuration of a video encoding apparatus according to the third embodiment of the present invention. As shown in the figure, the video encoding apparatus according to the third embodiment comprises an encoding means 301, an encoding parameter determining means 302, and a processing capability judging means 311. The encoding means 301 includes a DCT processing means 303, a quantization means 304, a variable-length encoding means 305, a bit stream generating means 306, an inverse quantization means 307, an inverse DCT processing means 308, and a predicted image generating means 309, and the encoding parameter determining means 302 includes an encoding pattern determining means 310. The encoding means 301 is the same as the encoding means 101 of the video encoding apparatus according to the first embodiment. In accordance with the encoding parameters input from the encoding parameter determining means 302, it encodes each input frame image at the indicated resolution and with the indicated encoding type, either intra-frame encoding (I) or forward predictive encoding (P).

【０１８２】符号化パラメータ決定手段３０２は、処理
能力判断手段３１１の判断結果に応じて、内包する符号
化パターン決定手段３１０を用いて符号化パラメータを
決定し、符号化手段３０１に出力する。処理能力判断手
段３１１は、当該符号化装置の符号化処理能力を判断
し、判断結果を符号化パラメータ決定手段３０２に出力
する。本実施の形態３では、処理能力判断手段３１１
は、当該符号化装置における符号化処理の平均フレーム
レートを判断結果として出力するものであり、符号化パ
ラメータ決定手段３０２は、指定されたフレームレート
と、解像度と、判断結果である平均フレームレートとか
ら符号化パターンを決定し、当該符号化パターンに応じ
て符号化手段３０１に符号化タイプを指示するものであ
る。符号化パターンを決定するため、符号化パラメータ
決定手段３０２は、符号化パターン決定手段３１０を用
いる。The encoding parameter determining means 302 determines the encoding parameters, using the encoding pattern determining means 310 that it includes, according to the judgment result of the processing capability judging means 311, and outputs them to the encoding means 301. The processing capability judging means 311 judges the encoding processing capability of the encoding apparatus and outputs the judgment result to the encoding parameter determining means 302. In the third embodiment, the processing capability judging means 311 outputs the average frame rate of the encoding processing in the encoding apparatus as its judgment result. The encoding parameter determining means 302 determines the encoding pattern from the specified frame rate, the resolution, and the average frame rate given as the judgment result, and instructs the encoding means 301 on the encoding type according to that pattern. To determine the encoding pattern, the encoding parameter determining means 302 uses the encoding pattern determining means 310.

【０１８３】なお、本実施の形態３による映像符号化装
置においても、実施の形態１と同様に、ＰＣにおける符
号化プログラムの実行によって実現されるものとし、実
施の形態１に示した条件（１）〜（５）が成立するもの
とする。また、本装置に搭載されるＣＰＵの動作周波数
は１００MHz であるとする。また、符号化開始時に指定
されるフレームレートは８フレーム／秒であり、入力画
像におけるフレーム画像の解像度として３２０×２４０
が指定されるとする。Note that the video encoding apparatus according to the third embodiment is, as in the first embodiment, realized by executing an encoding program on a PC, and conditions (1) to (5) described in the first embodiment are assumed to hold. The operating frequency of the CPU mounted in the apparatus is assumed to be 100 MHz. The frame rate specified at the start of encoding is 8 frames/second, and 320 × 240 is specified as the resolution of the frame images of the input image.

【０１８４】以上のような設定のもとに、上述のように
構成された本実施の形態３による映像符号化装置の動作
を、以下に説明する。入力画像データがフレーム画像ご
とに入力され、符号化手段３０１はこれを符号化処理す
る。符号化手段３０１の動作は、実施の形態１において
示した符号化手段１０１と同様である。The operation of the video encoding apparatus according to the third embodiment configured as described above under the above settings will be described below. Input image data is input for each frame image, and the encoding unit 301 performs an encoding process on the input image data. The operation of the encoding unit 301 is the same as that of the encoding unit 101 shown in the first embodiment.

【０１８５】一方、処理能力判断手段３１１は、本装置
の処理能力を判断するために、４つのフレーム画像ごと
に、それら４つのフレーム画像を含めた、それら以前の
すべてのフレーム画像に対して、符号化手段３０１が処
理するのに要した時間を測定し、当該測定した処理所要
時間と、フレーム画像の処理数とから、その時点までの
符号化処理の平均フレームレートを算出し、符号化パラ
メータ決定手段３０２に通知する。なお、処理能力判断
手段は平均フレームレートの初期値としては、要求され
たフレームレートである８フレーム／秒を通知するもの
とする。Meanwhile, to judge the processing capability of the apparatus, the processing capability judging means 311 measures, every four frame images, the time the encoding means 301 has taken to process all frame images up to and including those four. From the measured processing time and the number of frame images processed, it calculates the average frame rate of the encoding processing up to that point and notifies the encoding parameter determining means 302 of it. As the initial value of the average frame rate, the processing capability judging means reports the requested frame rate of 8 frames/second.
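The measurement described above amounts to a cumulative average that is refreshed once per group of four frames. A minimal sketch follows, assuming the per-frame encoding time is supplied by the caller; the class and method names are hypothetical and not taken from the patent.

```python
class ProcessingCapabilityJudge:
    """Tracks the average encoding frame rate, updated every group of frames."""

    def __init__(self, requested_fps, group_size=4):
        # The initial value reported is the requested frame rate (8 in the text).
        self.average_fps = float(requested_fps)
        self.group_size = group_size
        self.frames = 0        # frames processed so far
        self.elapsed = 0.0     # total encoding time so far, in seconds

    def record_frame(self, seconds):
        """Add one frame's encoding time.

        Returns the new average after each complete group, otherwise None.
        """
        self.frames += 1
        self.elapsed += seconds
        if self.frames % self.group_size == 0 and self.elapsed > 0:
            # Average over ALL frames so far, not just the latest group.
            self.average_fps = self.frames / self.elapsed
            return self.average_fps
        return None


judge = ProcessingCapabilityJudge(requested_fps=8)
for _ in range(4):
    update = judge.record_frame(0.25)   # each frame took 0.25 s in this example
print(update)                           # 4.0 frames/s after the first group
```

Because the average covers every frame since encoding began, a brief slowdown shifts the reported value less and less as encoding proceeds.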

【０１８６】符号化パラメータ決定手段３０２は、上記
通知された平均フレームレートを用いて符号化パラメー
タを決定する。符号化パラメータ決定手段３０２に含ま
れる符号化パターン決定手段３１０の動作を以下に説明
する。符号化パターン決定手段３１０は、限定されたい
くつかの状態をとるものであり、条件に応じて、それら
のとり得る状態間を遷移する有限状態マシンとして動作
する。図７は、有限状態マシンとしての符号化パターン
決定手段３１０における(a) 状態の遷移を示す状態遷移
図、および(b) 状態遷移条件を示す図である。符号化パ
ターン決定手段３１０はＳ０〜Ｓ３までの全部で４つの
状態をとり、それぞれの状態においては（表７）で示す
符号化パターンを出力する。The encoding parameter determining means 302 determines the encoding parameters using the notified average frame rate. The operation of the encoding pattern determining means 310 included in the encoding parameter determining means 302 is described below. The encoding pattern determining means 310 takes one of a limited number of states and operates as a finite state machine that transitions between those states according to conditions. FIG. 7 shows (a) a state transition diagram and (b) the state transition conditions of the encoding pattern determining means 310 as a finite state machine. The encoding pattern determining means 310 takes four states in total, S0 to S3, and in each state outputs the encoding pattern shown in Table 7.

【０１８７】[0187]

【表７】 [Table 7]

【０１８８】（表７）に示す符号化パターンについて
は、「ＩＩＩＩ」は、処理対象である４つのフレーム画
像のすべてのフレーム画像に対してフレーム内符号化
（Ｉ) を実行することを、「ＩＩＩＰ」は該４つのフレ
ーム画像のうち、最初の３つに対してはフレーム内符号
化（Ｉ) を、最後の１つに対しては順方向予測符号化
（Ｐ) を実行することを、「ＩＰＩＰ」は該４つのフレ
ーム画像のうち、最初と３つめに対してはフレーム内符
号化（Ｉ) を、２つ目と４つ目に対しては順方向予測符
号化（Ｐ) を実行することを、「ＩＰＰＰ」は、該４つ
のフレーム画像のうち。最初の１つに対してはフレーム
内符号化（Ｉ) を、残りの３つに対しては、順方向予測
符号化（Ｐ) を実行することを、それぞれ意味する。The encoding patterns shown in Table 7 have the following meanings. "IIII" means that intra-frame encoding (I) is performed on all four frame images being processed. "IIIP" means that intra-frame encoding (I) is performed on the first three of the four frame images and forward predictive encoding (P) on the last one. "IPIP" means that intra-frame encoding (I) is performed on the first and third frame images and forward predictive encoding (P) on the second and fourth. "IPPP" means that intra-frame encoding (I) is performed on the first frame image and forward predictive encoding (P) on the remaining three.

【０１８９】また、符号化パターン決定手段３１０の状
態の遷移については、４つのフレーム画像ごとに状態遷
移についての判定を実行するものであり、判定は、当該
判定の直前に処理能力判断手段３１１から通知された平
均フレームレートの値と、指定されたフレームレートの
値とを用いて、図９(a) に示す条件に従って行なわれ
る。ただし、本実施の形態３においては、上記の有限状
態マシンとしてとる状態の初期値はＳ１であるものとす
る。State transitions of the encoding pattern determining means 310 are judged once every four frame images. Each judgment uses the average frame rate reported by the processing capability judging means 311 immediately beforehand and the specified frame rate, and follows the conditions shown in FIG. 9(a). In the third embodiment, the initial state of the finite state machine is S1.

【０１９０】以上のことから、本実施の形態３による映
像符号化装置では、符号化が開始するとまず、処理能力
判断手段３１１から初期値「８フレーム／秒」が出力さ
れ、符号化パターン決定手段３１０は初期状態Ｓ１であ
ることから、表７に示すように符号化パターンとして
「ＩＩＩＰ」を出力する。従って、符号化パラメータ決
定手段３０２は、該パターンを実現できるように符号化
パラメータを符号化手段３０１に出力し、フレーム画像
３つに対してフレーム内符号化、次の１つに対して順方
向予測符号化が行われるように制御がなされる。As described above, in the video encoding apparatus according to the third embodiment, when encoding is started, first, the initial value “8 frames / sec” is output from the processing capability judging unit 311 and the encoding pattern determining unit Since 310 is in the initial state S1, "IIIP" is output as an encoding pattern as shown in Table 7. Accordingly, the coding parameter determination means 302 outputs the coding parameters to the coding means 301 so as to realize the pattern, and performs intra-frame coding for three frame images and forward coding for the next one. Control is performed so that predictive coding is performed.

【０１９１】この後、処理能力判断手段３１１から得ら
れる平均フレームレートが、指定されたフレームレート
より低いものとなったときには、図７(a) に示すように
Ｓ１→Ｓ０の遷移がなされることにより、表７に示す符
号化パターンは「ＩＩＩＩ」に変更され、フレーム内符
号化ばかりが行われるようになる。Thereafter, when the average frame rate obtained from the processing capability judging means 311 becomes lower than the designated frame rate, a transition from S1 to S0 is made as shown in FIG. As a result, the encoding pattern shown in Table 7 is changed to "IIII", and only intra-frame encoding is performed.

【０１９２】一方、処理能力判断手段３１１から得られ
る平均フレームレートが指定されたフレームレートより
も高いときには、図７(a) に示すＳ１→Ｓ２の遷移がさ
れて、符号化パターンが「ＩＰＩＰ」に変更され、順方
向予測符号化の比率が増すこととなる。On the other hand, when the average frame rate obtained from the processing capability judging means 311 is higher than the designated frame rate, the transition S1→S2 shown in FIG. 7(a) is made, the encoding pattern is changed to "IPIP", and the proportion of forward-predictive encoding increases.

【０１９３】このような制御をすることによって、処理
能力判断手段３１１の出力する平均フレームレートが指
定されたフレームレートよりも小さいときは、当該符号
化装置の処理負担が重いものと考えられるため、図７
(a) に示すＳ３→Ｓ０方向の遷移によって、符号化処理
における、処理負担の小さなフレーム内符号化の比率を
高くし、一方、処理能力判断手段３１１の出力する平均
フレームレートが指定されたフレームレートよりも大き
いときは、当該符号化装置の処理能力に余力があるもの
と考えられるため、図７(a) に示すＳ０→Ｓ３方向の遷
移によって、符号化処理における、処理負担の大きな順
方向予測符号化の比率を高くして、より高圧縮度の符号
化結果が得られるように図るものである。Under this control, when the average frame rate output by the processing capability judging means 311 is lower than the designated frame rate, the processing load on the encoding apparatus is considered heavy, so a transition in the S3→S0 direction shown in FIG. 7(a) raises the proportion of intra-frame encoding, which has a small processing load. Conversely, when the average frame rate output by the processing capability judging means 311 is higher than the designated frame rate, the encoding apparatus is considered to have spare processing capacity, so a transition in the S0→S3 direction shown in FIG. 7(a) raises the proportion of forward-predictive encoding, which has a large processing load, in order to obtain an encoding result with a higher degree of compression.
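
The state machine described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `next_state`, the one-step-per-decision rule, the "no change when the rates are equal" rule, and the saturation at S0 and S3 are assumptions, since the exact transition conditions are given only in the figure. The assignment of "IPPP" to S3 is inferred from the order in which the four patterns are listed.

```python
# Sketch of the encoding pattern state machine (states S0..S3).
# Assumptions: one state step per decision, no change when the two
# rates are equal, and saturation at the end states S0 and S3.

PATTERNS = ["IIII", "IIIP", "IPIP", "IPPP"]  # patterns for S0, S1, S2, S3


def next_state(state, average_fps, designated_fps):
    """Return the state to use for the next group of four frames."""
    if average_fps < designated_fps:
        # Encoder is falling behind: move toward S0 (more intra frames).
        return max(state - 1, 0)
    if average_fps > designated_fps:
        # Encoder has spare capacity: move toward S3 (more predicted frames).
        return min(state + 1, len(PATTERNS) - 1)
    return state


state = 1  # initial state S1
pattern_at_start = PATTERNS[state]
state_when_slow = next_state(state, average_fps=7.0, designated_fps=8.0)
state_when_fast = next_state(state, average_fps=10.0, designated_fps=8.0)
```

The decision is made once per group of four frames, so the returned state selects the pattern for the whole next group.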

【０１９４】このように、処理能力として示される符号
化処理の状態に対応して、符号化パラメータが変化させ
ながら、符号化が実行されるが、（表８）は、上記のよ
うにして、２８枚の連続したフレーム画像に対して符号
化を実施した場合における、符号化の結果を示す表であ
る。In this way, encoding is executed while the encoding parameters are changed according to the state of the encoding process, expressed as processing capability. Table 8 shows the result of encoding 28 consecutive frame images in this manner.

【０１９５】[0195]

【表８】 [Table 8]

【０１９６】同表において、指定されるフレームレート
は、固定的に「８」と設定されている。符号化パター
ン、所要時間、および処理能力判断手段３１１の出力に
ついては、０番目から３番目、４番目から７番目…等の
４つずつのフレーム画像の処理ごとに、その値を示すも
のである。符号化パターンは４つのフレーム画像ごとに
符号化処理に用いられる符号化パターンを、所要時間は
符号化手段３０１が４つのフレーム画像の符号化処理に
要する時間（秒）を、また、処理能力判断手段３１１の
出力は、所要時間を用いて取得した平均フレームレート
を示している。そして、２８枚のフレーム画像を符号化
した結果として、符号化処理における平均フレームレー
トと符号化データ量とを示している。In Table 8, the designated frame rate is fixed at "8". The encoding pattern, the required time, and the output of the processing capability judging means 311 are each given per group of four frame images (0th to 3rd, 4th to 7th, and so on). The encoding pattern is the pattern used to encode each group of four frame images; the required time is the time in seconds that the encoding means 301 needs to encode the four frame images; and the output of the processing capability judging means 311 is the average frame rate obtained from the required time. The average frame rate of the encoding process and the encoded data amount are shown as the result of encoding the 28 frame images.

【０１９７】なお、（表８）においては、ＣＰＵの動作
周波数が１００MHz 、符号化パターン「ＩＩＩＩ」にお
いて、解像度が３２０×２４０の場合に、２４フレーム
／秒で処理できることから、４枚のフレーム画像を処理
するのに必要な時間と、符号化結果としてのフレームレ
ートとが算出されている。例えば、０番目から３番目ま
での４枚のフレーム画像を処理するのに必要な時間は、
１／２４×３＋６／２４×１＝０．３７５秒となる。ま
た、符号化結果としては、Ｉフレームが１５枚とＰフレ
ームが１３枚生成されるため、１枚のフレーム画像の符
号化に要した時間は、( １／２４×１５＋６／２４×１
３) ÷２８＝０．１３８秒となり、平均フレームレート
は７．２２５フレーム／秒と算出できる。また、符号化
結果における１枚のフレーム画像の符号化データ量は、
Ｉで符号化した場合に１／１０となること、Ｐで符号化
した場合に１／６０となることに基づいて算出されてい
る。その結果、符号化結果としては、Ｉフレームが１５
枚とＰフレームが１３枚生成されるため、２８枚のフレ
ーム画像の符号化データ量は( １５／１０＋１３／６
０) になることから、１枚のフレーム画像に対する符号
化データ量は、０．０６１となる。In Table 8, the time required to process four frame images and the resulting frame rate are calculated on the basis that, with a CPU operating frequency of 100 MHz, the encoding pattern "IIII", and a resolution of 320×240, processing runs at 24 frames/second. For example, the time required to process the four frame images from the 0th to the 3rd is 1/24 × 3 + 6/24 × 1 = 0.375 seconds. Since 15 I frames and 13 P frames are generated as the encoding result, the time required to encode one frame image is (1/24 × 15 + 6/24 × 13) ÷ 28 = 0.138 seconds, and the average frame rate is calculated as 7.225 frames/second. The encoded data amount per frame image is calculated on the basis that it becomes 1/10 when encoded as I and 1/60 when encoded as P. Since 15 I frames and 13 P frames are generated, the encoded data amount of the 28 frame images is (15/10 + 13/60), so the encoded data amount per frame image is 0.061.
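
The arithmetic above can be checked with a short script. It is only a sketch: the constant and function names are hypothetical, and the per-frame costs (1/24 second for an I frame, 6/24 second for a P frame, relative data amounts 1/10 and 1/60) are taken directly from the figures used in the worked example.

```python
# Reproduces the worked figures above. Costs are relative to the stated
# baseline of 24 frames/second for all-intra encoding at 320x240.
I_TIME = 1 / 24   # seconds per I frame
P_TIME = 6 / 24   # seconds per P frame, as used in the arithmetic above
I_SIZE = 1 / 10   # relative data amount of an I frame
P_SIZE = 1 / 60   # relative data amount of a P frame


def group_time(pattern):
    """Seconds needed to encode one group of frames, e.g. "IIIP"."""
    return sum(I_TIME if t == "I" else P_TIME for t in pattern)


def summary(n_i, n_p):
    """Return (seconds per frame, frames per second, data per frame)."""
    frames = n_i + n_p
    sec_per_frame = (n_i * I_TIME + n_p * P_TIME) / frames
    data_per_frame = (n_i * I_SIZE + n_p * P_SIZE) / frames
    return sec_per_frame, 1 / sec_per_frame, data_per_frame


first_group = group_time("IIIP")
sec_per_frame, fps, data_per_frame = summary(n_i=15, n_p=13)
```

Computed without intermediate rounding, the rate comes to about 7.226 frames/second; the 7.225 quoted in the text is the same value truncated to three decimal places.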

【０１９８】（表９）は、本実施の形態３による映像符
号化装置における、符号化パターン決定手段３１０、お
よび符号化パラメータ決定手段３０２の機能を説明する
ための表である。同表においては、上記（表８）におけ
る処理対象の２８枚のフレーム画像のうち、０番目から
１１番目までの１２枚のフレーム画像について１枚ごと
についての、符号化パターン決定手段３１０、および符
号化パラメータ決定手段３０２の出力を含む、本実施の
形態３による映像符号化装置の状態を示すものである。Table 9 explains the functions of the encoding pattern determining means 310 and the encoding parameter determining means 302 in the video encoding apparatus according to the third embodiment. For the first 12 of the 28 frame images processed in Table 8 (the 0th to the 11th), it shows, frame by frame, the state of the video encoding apparatus according to the third embodiment, including the outputs of the encoding pattern determining means 310 and the encoding parameter determining means 302.

【０１９９】[0199]

【表９】 [Table 9]

【０２００】同表において、所要時間と、処理能力判断
手段３１１の出力とは、（表８）と同様である。この処
理能力判断手段３１１の出力に対応して、符号化パター
ン決定手段３１０は状態遷移をし、状態に応じて符号化
パターンを出力する。そして、符号化パラメータ決定手
段３０２は、出力された符号化パターンに対応して、同
表に示すように、１枚のフレーム画像ごとに符号化タイ
プを出力する。In Table 9, the required time and the output of the processing capability judging means 311 are the same as in Table 8. In response to the output of the processing capability judging means 311, the encoding pattern determining means 310 makes a state transition and outputs the encoding pattern for the resulting state. The encoding parameter determining means 302 then outputs an encoding type for each frame image in accordance with that encoding pattern, as shown in the table.

【０２０１】同表に示すように、符号化パターン決定手
段３１０の状態は、上記のようにまずＳ１をとるので、
このため表７に示すように符号化パターン「ＩＩＩＰ」
が出力される。符号化パラメータ決定手段３０２はこれ
に対応して、最初の４つのフレーム画像について「Ｉ」
「Ｉ」「Ｉ」「Ｐ」を出力する。As shown in Table 9, the encoding pattern determining means 310 first takes the state S1 as described above, so the encoding pattern "IIIP" is output as shown in Table 7. In response, the encoding parameter determining means 302 outputs "I", "I", "I", "P" for the first four frame images.

【０２０２】８番目から１１番目までの４つのフレーム
画像に対しては、符号化パターン決定手段３１０の状態
がＳ１→Ｓ２に遷移したことから、符号化パターン「Ｉ
ＰＩＰ」が出力されるので、符号化パラメータ決定手段
３０２の出力は、「Ｉ」「Ｐ」「Ｉ」「Ｐ」となる。For the four frame images from the 8th to the 11th, the state of the encoding pattern determining means 310 has changed from S1 to S2, so the encoding pattern "IPIP" is output and the encoding parameter determining means 302 outputs "I", "P", "I", "P".

【０２０３】以下、表８に示すように、４つのフレーム
画像を処理するごとに、処理能力の判断と、それに対応
した符号化パターンの選択がなされ、該パターンに対応
した符号化タイプにおいて符号化処理が行われる。Thereafter, as shown in Table 8, each time four frame images have been processed, the processing capability is judged, the corresponding encoding pattern is selected, and encoding is performed with the encoding types corresponding to that pattern.

【０２０４】比較のため、（表１０）に従来の技術によ
る映像符号化装置を用いて、２８枚のフレーム画像を、
予め決められた符号化パターンで符号化した場合におけ
る符号化結果を示す。For comparison, Table 10 shows the encoding results obtained when 28 frame images are encoded with a predetermined encoding pattern by a video encoding apparatus according to the prior art.

【０２０５】[0205]

【表１０】 [Table 10]

【０２０６】符号化パターンとしては、（表７）で示さ
れている４つのパターンを用いており、符号化結果とし
て、符号化処理のフレームレートと、１枚のフレーム画
像に対する符号化データ量とを示している。なお、（表
１０）においても、ＣＰＵの動作周波数が１００MHz 、
符号化パターンＩＩＩＩで、解像度が３２０×２４０の
場合に２４フレーム／秒で処理できることから、その他
の場合のフレームレートが算出されている。また、符号
化データ量に関しても、１枚のフレーム画像に対して、
Ｉで符号化した場合に１／１０となること、Ｐで符号化
した場合に１／６０となることに基づき算出されてい
る。The four patterns shown in Table 7 are used as the encoding patterns, and the frame rate of the encoding process and the encoded data amount per frame image are shown as the encoding results. In Table 10 as well, the frame rates for the other cases are calculated on the basis that, with a CPU operating frequency of 100 MHz, the encoding pattern "IIII", and a resolution of 320×240, processing runs at 24 frames/second. The encoded data amount per frame image is likewise calculated on the basis that it becomes 1/10 when encoded as I and 1/60 when encoded as P.

【０２０７】従来の技術による映像符号化装置では、符
号化結果として得られるフレームレートや、当該符号化
装置を構成するハードウェア能力の変動を考慮せずに、
符号化タイプ（パターン）、あるいは解像度を決定して
いたものである。従って、符号化処理の結果として得ら
れるフレームレートが、要望される値に近くなるよう設
定することは困難であり、不必要な数値となってしまう
設定を選定せざるを得ない場合などがあった。これに比
べ、本実施の形態３の映像符号化装置においては、当該
符号化装置の処理能力や符号化結果であるフレームレー
トの変動を考慮して、指定された解像度に応じて符号化
タイプ（パターン）を決定することで、表９と表１０と
の対比において示されるように、指定されたフレームレ
ートに近いフレームレートを実現でき、かつ、より高圧
縮率での符号化が実行されていることがわかる。A video encoding apparatus according to the prior art determines the encoding type (pattern) or the resolution without considering the frame rate obtained as the encoding result or fluctuations in the capability of the hardware constituting the encoding apparatus. It is therefore difficult to set the apparatus so that the resulting frame rate is close to the desired value, and in some cases a setting that yields an unnecessary value has to be chosen. By contrast, the video encoding apparatus according to the third embodiment determines the encoding type (pattern) in accordance with the designated resolution while taking into account the processing capability of the encoding apparatus and fluctuations in the resulting frame rate. As the comparison between Table 9 and Table 10 shows, it achieves a frame rate close to the designated frame rate and encodes at a higher compression ratio.

【０２０８】このように、本実施の形態３による映像符
号化装置によれば、符号化手段３０１と、符号化パター
ン決定手段３１０を内包した符号化パラメータ決定手段
３０２と、処理能力判断手段３１１とを備えたことで、
符号化パラメータ決定手段３０２は、指定されたフレー
ムレートと解像度、そして処理能力判断手段３１１の出
力する判断結果とに対応して符号化パターンを決定し
て、符号化パラメータを符号化手段３０１に出力し、符
号化手段３０１はこの符号化パラメータに応じて符号化
の処理を行うので、要求される条件を実現しつつ、より
高圧縮率の得られる符号化を行うことが可能となる。As described above, the video encoding apparatus according to the third embodiment comprises the encoding means 301, the encoding parameter determining means 302 containing the encoding pattern determining means 310, and the processing capability judging means 311. The encoding parameter determining means 302 determines the encoding pattern according to the designated frame rate and resolution and the judgment result output by the processing capability judging means 311, and outputs the encoding parameters to the encoding means 301, which performs encoding in accordance with them. Encoding with a higher compression ratio can therefore be performed while the required conditions are satisfied.

【０２０９】なお、本実施の形態３による映像符号化装
置では、指定された解像度に対応して符号化パターンを
決定するものとしたが、同様の処理をすることによっ
て、指定された符号化パターン（タイプ）に対応して解
像度を決定することも可能であり、要求されるフレーム
レートと符号化パターンとの下で、より高解像度での符
号化処理をすることが可能となる。In the video encoding apparatus according to the third embodiment, the encoding pattern is determined according to the designated resolution. However, the same processing can instead determine the resolution according to a designated encoding pattern (type), which makes it possible to encode at a higher resolution under the required frame rate and encoding pattern.

【０２１０】また、実施の形態１、および２による映像
符号化装置では、基本的に符号化開始に際して符号化パ
ラメータを決定し、それ以後は該決定された符号化パラ
メータに従って符号化処理をするものであるが、本実施
の形態３による装置では、符号化の処理を行いながら平
均レートを算定し、装置の処理能力として取得される符
号化の状況に対応して動的に符号化パラメータを変更し
得るものである。従って、本実施の形態３による映像符
号化装置では、実施の形態１、および２に比較して若干
の処理負担は伴うものの、複数の演算処理等を並行して
実行する汎用計算機において、映像取り込みに伴った符
号化を実行する際などであって、当該計算機装置の状況
が変化するような場合でも、その状況の変化に対応し
て、適切な符号化条件を設定することが可能となるもの
である。The video encoding apparatuses according to the first and second embodiments basically determine the encoding parameters at the start of encoding and thereafter encode in accordance with those parameters. The apparatus according to the third embodiment, by contrast, calculates the average rate while encoding and can change the encoding parameters dynamically according to the encoding status obtained as the processing capability of the apparatus. Although this involves a slightly higher processing load than the first and second embodiments, it allows appropriate encoding conditions to be set in response to changes in the state of the computer, for example when a general-purpose computer that runs several computations in parallel encodes video as it is captured.

【０２１１】もっとも、あまり状況の変化がないなど、
当該符号化装置の処理能力が符号化処理の過程において
大きく変動しないと見込まれる場合等には、本実施の形
態３による映像符号化装置においても、実施の形態１、
および２と同様に、符号化開始に際してパラメータを設
定し、以後はその条件で符号化を実施するものとして、
制御にかかる処理負担を軽減することも可能である。However, when the processing capability of the encoding apparatus is not expected to fluctuate much during encoding, for example because conditions change little, the video encoding apparatus according to the third embodiment can also set the parameters at the start of encoding and encode under those conditions thereafter, as in the first and second embodiments, thereby reducing the processing load of the control.

【０２１２】実施の形態４．本発明の実施の形態４によ
る映像符号化方法は、当該符号化装置の処理能力に対応
して、設定されたフレームレートに基づいて符号化パラ
メータを決定するものであり、一時蓄積するデータの量
により処理能力を判断するものである。Embodiment 4. The video encoding method according to the fourth embodiment of the present invention determines the encoding parameters on the basis of the set frame rate in accordance with the processing capability of the encoding apparatus, and judges the processing capability from the amount of temporarily stored data.

【０２１３】図８は、本発明の実施の形態４における映
像符号化方法を実行する映像符号化装置の構成を示すブ
ロック図である。図示するように、本実施の形態４によ
る映像符号化装置は、符号化手段４０１、符号化パラメ
ータ決定手段４０２、処理能力判断手段４１１、バッフ
ァ手段４１２、および入力フレームレート制御手段４１
３から構成されている。符号化手段４０１は、ＤＣＴ処
理手段４０３、量子化手段４０４、可変長符号化手段４
０５、ビットストリーム生成手段４０６、逆量子化手段
４０７、逆ＤＣＴ処理手段４０８、および予測画像生成
手段４０９を、また、符号化パラメータ決定手段４０２
は符号化パターン決定手段４１０を内包している。FIG. 8 is a block diagram showing the configuration of a video encoding apparatus that executes the video encoding method according to the fourth embodiment of the present invention. As illustrated, the video encoding apparatus according to the fourth embodiment comprises an encoding means 401, an encoding parameter determining means 402, a processing capability judging means 411, a buffer means 412, and an input frame rate control means 413. The encoding means 401 contains a DCT processing means 403, a quantization means 404, a variable-length encoding means 405, a bit stream generation means 406, an inverse quantization means 407, an inverse DCT processing means 408, and a predicted image generation means 409; the encoding parameter determining means 402 contains an encoding pattern determining means 410.

【０２１４】符号化手段４０１は、実施の形態１による
映像符号化装置の符号化手段１０１と同様であり、符号
化パラメータ決定手段４０２から入力される符号化パラ
メータに対応して、入力されたフレーム画像に対して、
指示された解像度で、かつ、フレーム内符号化( Ｉ) 、
または順方向予測符号化( Ｐ) といった指示された符号
化タイプで符号化を行う。実施の形態１〜３において、
符号化手段１０１〜３０１はいずれも入力画像データを
入力するものであったが、本実施の形態４では、符号化
手段４０１は後述するバッファ手段４１２よりデータを
読み出して符号化処理を行うものである。The encoding means 401 is the same as the encoding means 101 of the video encoding apparatus according to the first embodiment. In accordance with the encoding parameters input from the encoding parameter determining means 402, it encodes each input frame image at the instructed resolution and with the instructed encoding type, namely intra-frame encoding (I) or forward-predictive encoding (P). In the first to third embodiments the encoding means 101 to 301 each received the input image data directly, whereas in the fourth embodiment the encoding means 401 reads data from the buffer means 412, described later, and encodes it.

【０２１５】符号化パラメータ決定手段４０２は、処理
能力判断手段４１１の判断結果に応じて、符号化パラメ
ータを決定し、符号化手段４０１に出力する。処理能力
判断手段４１１は、当該符号化装置の符号化処理能力を
判断し、判断結果を符号化パラメータ決定手段４０２に
出力する。本実施の形態４では、処理能力判断手段４１
１は、後述するバッファ手段４１２に一時蓄積されるバ
ッファ量を判断結果として出力するものである。また、
符号化パラメータ決定手段４０２は、指定されたフレー
ムレートと、解像度と、判断結果であるバッファ量とか
ら符号化パターンを決定し、当該符号化パターンに応じ
て符号化手段４０１に符号化タイプを指示するものであ
る。符号化パターンを決定するため、符号化パラメータ
決定手段４０２は、符号化パターン決定手段４１０を用
いる。The encoding parameter determining means 402 determines the encoding parameters according to the judgment result of the processing capability judging means 411 and outputs them to the encoding means 401. The processing capability judging means 411 judges the encoding processing capability of the encoding apparatus and outputs the judgment result to the encoding parameter determining means 402. In the fourth embodiment, the processing capability judging means 411 outputs, as the judgment result, the amount of data temporarily stored in the buffer means 412, described later. The encoding parameter determining means 402 determines the encoding pattern from the designated frame rate, the resolution, and the buffer amount given as the judgment result, and instructs the encoding means 401 of the encoding type according to that pattern. It uses the encoding pattern determining means 410 to determine the encoding pattern.

【０２１６】入力フレームレート制御手段４１３は、当
該映像符号化装置の入力である入力画像データを、一連
のフレーム画像として、指定されたフレームレートに対
応して、後述するバッファ手段４１２に出力する。バッ
ファ手段４１２は、入力画像データを一時蓄積するもの
であって、入力画像データを一連のフレーム画像として
順次保存していくとともに上記符号化手段４０１により
読み込まれたフレーム画像を順次廃棄していく。そし
て、本実施の形態４では、処理能力判断手段４１１は、
符号化開始時にバッファ手段４１２に保存されていたフ
レーム画像の枚数と、現在保存されているフレーム画像
の枚数との差分を検出してこれを判断結果として、符号
化パラメータ決定手段４０２に出力するものである。The input frame rate control means 413 outputs the input image data, which is the input to the video encoding apparatus, to the buffer means 412 as a series of frame images at the designated frame rate. The buffer means 412 temporarily stores the input image data: it stores the incoming frame images in order and discards, in order, the frame images that the encoding means 401 has read. In the fourth embodiment, the processing capability judging means 411 detects the difference between the number of frame images held in the buffer means 412 at the start of encoding and the number currently held, and outputs this difference to the encoding parameter determining means 402 as the judgment result.

【０２１７】なお、本実施の形態４による映像符号化装
置においても、実施の形態１と同様に、ＰＣにおける符
号化プログラムの実行によって実現されるものとし、実
施の形態１に示した条件（１）〜（５）が成立するもの
とする。また、本装置に搭載されるＣＰＵの動作周波数
は１００MHz であるとする。また、符号化開始時に指定
されるフレームレートは８フレーム／秒であり、入力画
像におけるフレーム画像の解像度として３２０×２４０
が指定されるものとする。As in the first embodiment, the video encoding apparatus according to the fourth embodiment is assumed to be realized by executing an encoding program on a PC, and conditions (1) to (5) given in the first embodiment are assumed to hold. The operating frequency of the CPU in the apparatus is assumed to be 100 MHz. The frame rate designated at the start of encoding is 8 frames/second, and 320×240 is designated as the resolution of the frame images in the input image.

【０２１８】以上のような設定のもとに、上述のように
構成された本実施の形態４による映像符号化装置の動作
を以下に説明する。本実施の形態４による映像符号化装
置の処理対象である映像が、入力画像データとして入力
されると、この入力画像データはまず、入力フレームレ
ート制御手段４１３に入力される。入力フレームレート
制御手段４１３は、指定されたフレームレートで、入力
画像データを一連のフレーム画像としてバッファ手段４
１２に順次入力していく。本実施の形態４においては、
入力フレームレート制御手段４１３は、上記の８フレー
ム／秒を該指定されたフレームレートとして処理を行
う。The operation of the video encoding apparatus according to the fourth embodiment, configured as described above, is explained below under these settings. When the video to be processed is input as input image data, the data first enters the input frame rate control means 413, which feeds it to the buffer means 412 in order as a series of frame images at the designated frame rate. In the fourth embodiment, the input frame rate control means 413 uses the 8 frames/second mentioned above as the designated frame rate.

【０２１９】バッファ手段４１２は、入力フレームレー
ト制御手段４１３より入力されたフレーム画像を順次保
存してゆき、符号化手段４０１により読み込まれたフレ
ーム画像を順次廃棄する。すなわち、ＦＩＦＯ（先入れ
先出し）方式によりデータを一時蓄積するものである。
なお、符号化開始時において、符号化手段４０１は入力
フレームレート制御手段４１３よりも一定の時間遅れて
動作を開始するとする。すなわち、符号化開始時におい
て、バッファ手段４１２は、ある一定の枚数のフレーム
画像を保存している。これは、バッファのアンダーフロ
ー、すなわち一時蓄積するデータの枯渇により処理が円
滑に進行しなくなることを防ぐものである。この段階で
処理能力判断手段４１１は、初期状態におけるバッファ
手段４１２に蓄積されたフレーム画像の枚数を検出し、
比較処理のために保持する。The buffer means 412 stores the frame images input from the input frame rate control means 413 in order and discards, in order, the frame images read by the encoding means 401; that is, it stores data temporarily on a FIFO (first in, first out) basis. At the start of encoding, the encoding means 401 is assumed to begin operating a fixed time later than the input frame rate control means 413, so the buffer means 412 already holds a certain number of frame images when encoding starts. This prevents buffer underflow, in which processing stalls because the temporarily stored data runs out. At this stage the processing capability judging means 411 detects the number of frame images stored in the buffer means 412 in the initial state and retains it for later comparison.

【０２２０】符号化手段４０１は、バッファ手段４１２
より一時蓄積されたフレーム画像を読み出して、符号化
処理を行う。符号化処理に際しての符号化手段４０１の
動作は、実施の形態１における符号化手段１０１と同様
のものとなる。一方、処理能力判断手段４１１は、符号
化手段４０１が４枚のフレーム画像を処理するごとに、
バッファ手段４１２に保存されているフレーム画像の枚
数を検出し、先に検出して保持している、初期状態にお
けるバッファ手段４１２のフレーム画像の枚数との差分
を取得して、この差分値を符号化パラメータ決定手段４
０２に通知する。この時、保存されているフレーム画像
の枚数が、初期状態におけるフレーム画像の枚数よりも
多い場合は、差分を正の値として、少ない場合は差分を
負の値として通知するものとする。The encoding means 401 reads the temporarily stored frame images from the buffer means 412 and encodes them. Its operation during encoding is the same as that of the encoding means 101 in the first embodiment. Each time the encoding means 401 has processed four frame images, the processing capability judging means 411 detects the number of frame images held in the buffer means 412, takes the difference from the initial number it detected and retained earlier, and notifies the encoding parameter determining means 402 of this difference. The difference is reported as a positive value when the number of stored frame images exceeds the initial number and as a negative value when it is smaller.
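
The buffer bookkeeping described above can be sketched as follows. The class name `FrameBuffer`, its method names, and the pre-fill of five frames are hypothetical choices for illustration; the text states only that the buffer is FIFO, that it holds some frames before encoding starts, and that the reported value is the current count minus the initial count.

```python
from collections import deque


class FrameBuffer:
    """FIFO frame store that reports its fill level relative to the start."""

    def __init__(self, prefilled_frames):
        self._queue = deque(prefilled_frames)
        self._initial_count = len(self._queue)  # retained for comparison

    def push(self, frame):
        """Called by the input side at the designated frame rate."""
        self._queue.append(frame)

    def pop(self):
        """Called by the encoder; the oldest frame is read and discarded."""
        return self._queue.popleft()

    def difference(self):
        """Positive: more frames than at the start. Negative: fewer."""
        return len(self._queue) - self._initial_count


buf = FrameBuffer(prefilled_frames=[0, 1, 2, 3, 4])
for frame_no in (5, 6, 7):      # three frames arrive ...
    buf.push(frame_no)
for _ in range(4):              # ... while the encoder consumes four
    buf.pop()
diff_after_group = buf.difference()
```

In this example the encoder consumed one frame more than arrived, so the difference is negative; a growing backlog would give a positive value instead.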

【０２２１】符号化パラメータ決定手段４０２では、上
記の差分が、符号化パターン決定手段４１０に入力され
る。符号化パターン決定手段４１０は、実施の形態３に
おける符号化パターン決定手段３１０と同様に、有限状
態マシンとして動作する。図９(a) は、有限状態マシン
として動作する符号化パターン決定手段４１０の状態遷
移図であり、同図(b) は、状態遷移条件を示す図であ
る。符号化パターン決定手段はＳ０〜Ｓ３までの、全部
で４つの状態をとり、それぞれの状態において、（表１
１）で示す符号化パターンを出力する。In the encoding parameter determining means 402, this difference is input to the encoding pattern determining means 410. Like the encoding pattern determining means 310 in the third embodiment, the encoding pattern determining means 410 operates as a finite state machine. FIG. 9(a) is the state transition diagram of the encoding pattern determining means 410 operating as a finite state machine, and FIG. 9(b) shows the state transition conditions. The encoding pattern determining means takes four states in total, S0 to S3, and in each state outputs the encoding pattern shown in Table 11.

【０２２２】[0222]

【表１１】 [Table 11]

【０２２３】（表１１）に示す、「ＩＩＩＩ」、「ＩＩ
ＩＰ」、「ＩＰＩＰ」、および「ＩＰＰＰ」の各パター
ンは実施の形態３の（表７）で示したものと同じであ
る。また、符号化パターン決定手段４１０における状態
の遷移は、符号化手段４０１が４枚のフレーム画像を処
理するごとに判定がなされるものであって、判定は、処
理能力判断手段４１１から通知された当該判定の直前の
差分値に基づいて、図９(b) に示した条件に従って行な
われる。なお、本実施の形態４においては、有限状態マ
シンとしての状態の初期値はＳ１であるとする。The patterns "IIII", "IIIP", "IPIP", and "IPPP" shown in Table 11 are the same as those shown in Table 7 of the third embodiment. A state transition of the encoding pattern determining means 410 is evaluated each time the encoding means 401 has processed four frame images; the evaluation follows the conditions shown in FIG. 9(b), using the difference value most recently reported by the processing capability judging means 411. In the fourth embodiment, the initial state of the finite state machine is S1.

【０２２４】符号化パラメータ決定手段４０２は、符号
化手段４０１が４枚のフレーム画像を処理するごとに、
符号化パターン決定手段４１０が出力する符号化パター
ンを取得して、当該取得した符号化パターンを実現でき
るように、１枚のフレーム画像ごとに符号化手段４０１
に対して符号化タイプを指示する。また、指定された解
像度を符号化手段４０１に対してそのまま指示する。Each time the encoding means 401 has processed four frame images, the encoding parameter determining means 402 obtains the encoding pattern output by the encoding pattern determining means 410 and instructs the encoding means 401 of an encoding type for each frame image so that the pattern is realized. It passes the designated resolution to the encoding means 401 unchanged.
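
As a small illustration of this step, a four-letter pattern can be expanded into one instruction per frame, with the resolution carried along unchanged. The function name `frame_instructions` and the tuple layout are assumptions made for this sketch only.

```python
def frame_instructions(pattern, resolution):
    """Yield one (encoding_type, resolution) instruction per frame."""
    for encoding_type in pattern:
        # The designated resolution is passed through as-is.
        yield encoding_type, resolution


instructions = list(frame_instructions("IPIP", (320, 240)))
```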

【０２２５】以上のことから、本実施の形態４による映
像符号化装置では、符号化が開始するとまず、符号化パ
ターン決定手段４１０は初期状態Ｓ１であることから、
表１２に示すように符号化パターンとして「ＩＩＩＰ」
を出力する。従って、符号化パラメータ決定手段４０２
は、該パターンを実現できるように符号化パラメータを
符号化手段４０１に出力し、フレーム画像３つに対して
フレーム内符号化、次の１つに対して順方向予測符号化
が行われるように制御がなされる。As described above, in the video encoding apparatus according to the fourth embodiment, when encoding starts, the encoding pattern determining means 410 is in the initial state S1 and therefore outputs "IIIP" as the encoding pattern, as shown in Table 12. The encoding parameter determining means 402 accordingly outputs encoding parameters to the encoding means 401 that realize this pattern, so that three frame images are intra-frame encoded and the following one is forward-predictive encoded.

【０２２６】この後、処理能力判断手段４１１から得ら
れる差分値が、負の値となったときには、図９(a) に示
すようにＳ１→Ｓ０の遷移がなされることにより、表１
２に示す符号化パターンは「ＩＩＩＩ」に変更され、フ
レーム内符号化ばかりが行われるようになる。一方、処
理能力判断手段４１１から得られる差分値が正の値にな
ったときには、図９(a) に示すＳ１→Ｓ２の遷移がされ
て、符号化パターンが「ＩＰＩＰ」に変更され、順方向
予測符号化の比率が増すこととなる。Thereafter, when the difference value obtained from the processing capability judging means 411 becomes negative, the transition S1→S0 is made as shown in FIG. 9(a); the encoding pattern is thereby changed to "IIII" as shown in Table 12, and only intra-frame encoding is performed. On the other hand, when the difference value obtained from the processing capability judging means 411 becomes positive, the transition S1→S2 shown in FIG. 9(a) is made, the encoding pattern is changed to "IPIP", and the proportion of forward-predictive encoding increases.

【０２２７】このような制御をすることにより、処理能
力判断手段４１１の出力する差分値が正である、すなわ
ち、バッファ手段４１２に蓄積されたフレーム画像の枚
数が、初期の蓄積枚数より多いときは、当該符号化装置
の処理負担が重いものと考えられるため、図９(a) に示
すＳ３→Ｓ０方向の遷移によって、符号化処理におけ
る、処理負担の小さなフレーム内符号化の比率を高くす
るように図る。一方、処理能力判断手段４１１の出力す
る差分値が負である、すなわち、バッファ手段４１２に蓄
積されたフレーム画像の枚数が、初期の蓄積枚数より少な
いときは、当該符号化装置の処理能力に余力があるもの
と考えられるため、図９(a) に示すＳ０→Ｓ３方向の遷
移によって、符号化処理における、処理負担の大きな順
方向予測符号化の比率を高くして、より高圧縮度の符号
化結果が得られるように図るものである。Under this control, when the difference value output by the processing capability judging means 411 is positive, that is, when the number of frame images stored in the buffer means 412 exceeds the initial number, the processing load on the encoding apparatus is considered heavy, so a transition in the S3→S0 direction shown in FIG. 9(a) raises the proportion of intra-frame encoding, which has a small processing load. Conversely, when the difference value is negative, that is, when the number of stored frame images is smaller than the initial number, the encoding apparatus is considered to have spare processing capacity, so a transition in the S0→S3 direction shown in FIG. 9(a) raises the proportion of forward-predictive encoding, which has a large processing load, in order to obtain an encoding result with a higher degree of compression.
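
A buffer-driven version of the state machine might look like the sketch below. The actual transition conditions are those of FIG. 9(b), which is not reproduced in this text, so the rule used here is an assumption based on the reasoning of the paragraph above: a backlog (positive difference) moves one step toward S0, a draining buffer (negative difference) moves one step toward S3, and a zero difference keeps the state. The function name and the sequence of difference values are illustrative and are not taken from Table 12.

```python
PATTERNS = ("IIII", "IIIP", "IPIP", "IPPP")  # patterns for S0..S3


def next_state_from_backlog(state, difference):
    """Choose the state for the next four frames from the buffer difference.

    Assumption: a positive difference (backlog) moves one step toward S0,
    a negative difference moves one step toward S3, zero keeps the state.
    """
    if difference > 0:
        return max(state - 1, 0)
    if difference < 0:
        return min(state + 1, len(PATTERNS) - 1)
    return state


state = 1  # initial state S1
states = [state]
for difference in (-1, -2, 1, 0):  # illustrative values only
    state = next_state_from_backlog(state, difference)
    states.append(state)
patterns_used = [PATTERNS[s] for s in states]
```

Because the difference is measured against the initial fill level rather than the previous group, the machine keeps stepping in the same direction until the buffer returns to its starting level.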

【０２２８】このように、蓄積枚数として示される符号
化処理の状態に対応して、符号化パラメータが変化させ
ながら、符号化が実行されるが、（表１２）は、上記の
ようにして、２８枚の連続したフレーム画像に対して符
号化を実施した場合における、符号化の結果を示す表で
ある。In this way, encoding is executed while the encoding parameters are changed according to the state of the encoding process, expressed as the number of stored frames. Table 12 shows the result of encoding 28 consecutive frame images in this manner.

【０２２９】[0229]

【表１２】 [Table 12]

【０２３０】同表において、指定されるフレームレート
は、固定的に「８」と設定されている。符号化パター
ン、所要時間、入力枚数、出力枚数、および処理能力判
断手段４１１の出力については、０番目から３番目、４
番目から７番目…等の４つずつのフレーム画像の処理ご
とに、値を示すものである。符号化パターンは、４つの
フレーム画像ごとに符号化処理に用いられる符号化パタ
ーンを、所要時間は、符号化手段４０１が４つのフレー
ム画像の符号化処理に要する時間（秒）を、入力枚数
は、バッファ手段４１２に対してフレーム画像が入力さ
れた枚数を、出力枚数は、バッファ手段４１２からフレ
ーム画像が出力された枚数を示している。そして、２８
枚のフレーム画像を符号化した結果として、符号化処理
における平均フレームレートと符号化データ量とを示し
ている。In Table 12, the designated frame rate is fixed at "8". The encoding pattern, the required time, the input count, the output count, and the output of the processing capability judging means 411 are each given per group of four frame images (0th to 3rd, 4th to 7th, and so on). The encoding pattern is the pattern used to encode each group of four frame images; the required time is the time in seconds that the encoding means 401 needs to encode the four frame images; the input count is the number of frame images input to the buffer means 412; and the output count is the number of frame images output from the buffer means 412. The average frame rate of the encoding process and the encoded data amount are shown as the result of encoding the 28 frame images.

【０２３１】なお、（表１２）においては、ＣＰＵの動
作周波数が１００MHz 、符号化パターン「ＩＩＩＩ」に
おいて、解像度が３２０×２４０の場合に２４フレーム
／秒で処理できることに基づいて、４枚のフレーム画像
を処理するのに必要な時間および、符号化結果としての
フレームレートが算出されている。In Table 12, the time required to process four frame images and the resulting frame rate are calculated on the basis that, with a CPU operating frequency of 100 MHz, the encoding pattern "IIII", and a resolution of 320×240, processing runs at 24 frames/second.

【０２３２】例えば、０番目から３番目までの４枚のフ
レーム画像を処理するのに必要な時間は、１／２４×３
＋６／２４×１＝０．３７５秒となる。また、符号化結
果としては、Ｉフレームが１５枚とＰフレームが１３枚
生成されるため、１枚のフレーム画像の符号化に要した
時間は、（１／２４×１５＋６／２４×１３）÷２８＝
０．１３８秒となり、平均フレームレートは７．２２５
フレーム／秒と算出できる。また、符号化結果における
１枚のフレーム画像の符号化データ量は、Ｉで符号化し
た場合に１／１０となること、Ｐで符号化した場合に１
／６０となることにそれぞれ基づき算出されている。そ
の結果、符号化結果としては、Ｉフレームが１５枚とＰ
フレームが１３枚生成されるため、２８枚のフレーム画
像の符号化データ量は( １５／１０＋１３／６０) ＝
１．７１７になることから、１枚のフレーム画像に対す
る符号化データ量は、０．０６１となる。For example, the time required to process the four frame images from the 0th to the 3rd is 1/24 × 3 + 6/24 × 1 = 0.375 seconds. Since 15 I frames and 13 P frames are generated as the encoding result, the time required to encode one frame image is (1/24 × 15 + 6/24 × 13) ÷ 28 = 0.138 seconds, and the average frame rate is calculated as 7.225 frames/second. The encoded data amount per frame image is calculated on the basis that it becomes 1/10 when encoded as I and 1/60 when encoded as P. Since 15 I frames and 13 P frames are generated, the encoded data amount of the 28 frame images is (15/10 + 13/60) = 1.717, so the encoded data amount per frame image is 0.061.

【０２３３】（表１３）は、本実施の形態４による映像
符号化装置における、符号化パターン決定手段４１０、
および符号化パラメータ決定手段４０２の機能を説明す
るための表である。同表においては、上記（表１２）に
おける処理対象の２８枚のフレーム画像のうち、０番目
から１１番目までの１２枚のフレーム画像について１枚
ごとについての、符号化パターン決定手段４１０、およ
び符号化パラメータ決定手段４０２の出力を含む、本実
施の形態４による映像符号化装置の状態を示すものであ
る。Table 13 explains the functions of the encoding pattern determining means 410 and the encoding parameter determining means 402 in the video encoding apparatus according to the fourth embodiment. For the first 12 of the 28 frame images processed in Table 12 (the 0th to the 11th), it shows, frame by frame, the state of the video encoding apparatus according to the fourth embodiment, including the outputs of the encoding pattern determining means 410 and the encoding parameter determining means 402.

【０２３４】[0234]

【表１３】 [Table 13]

[0235] In the table, the required time, the number of input frames, the number of output frames, and the output of the processing capability judging means 411 are the same as in Table 12. In response to the output of the processing capability judging means 411, the encoding pattern determining means 410 makes state transitions as described above and outputs an encoding pattern. The encoding parameter determining means 402 then outputs an encoding type for each frame image, as shown in the table, according to the output encoding pattern.

[0236] As shown in the table, the encoding pattern determining means 410 first takes state S1 as described above, so the encoding pattern "IIIP" is output as shown in Table 11. In response, the encoding parameter determining means 402 outputs "I", "I", "I", "P" for the first four frame images.

[0237] For the four frame images from the 8th to the 11th, the state of the encoding pattern determining means 410 has transitioned from S1 to S2, so the encoding pattern "IPIP" is output, and the output of the encoding parameter determining means 402 is "I", "P", "I", "P".

[0238] Thereafter, as shown in Table 12, each time four frame images are processed, the processing capability is judged, an encoding pattern corresponding to the judgment is selected, and encoding is performed with the encoding types corresponding to that pattern.

[0239] In a conventional video encoding device, as shown using Table 10 in Embodiment 3, the encoding type (pattern) or the resolution was determined without considering the frame rate obtained as the encoding result or fluctuations in the capability of the hardware constituting the encoding device. It was therefore difficult to choose settings so that the frame rate obtained from the encoding process came close to the desired value, and in some cases there was no choice but to select settings that yielded an unnecessary value. In contrast, the video encoding device of Embodiment 4, like that of Embodiment 3, determines the encoding type (pattern) according to the specified resolution while taking into account fluctuations in the processing capability of the encoding device and in the resulting frame rate. As the comparison between Table 13 and Table 10 shows, it achieves a frame rate close to the specified frame rate and encodes at a higher compression rate. Furthermore, whereas Embodiment 3 requires measuring the time taken for processing, Embodiment 4 can judge the processing capability of the device using the amount of temporarily stored data as an index, even when measuring the processing time as in Embodiment 3 is impossible or difficult.

[0240] As described above, the video encoding device according to Embodiment 4 comprises the encoding means 401, the encoding parameter determining means 402 containing the encoding pattern determining means 410, the processing capability judging means 411, the buffer means 412, and the input frame rate control means 413. The input frame rate control means 413 inputs the input image data to the buffer means 412 at a predetermined rate, and the buffer means 412 temporarily stores this input image data. The processing capability judging means 411 outputs a judgment result indicating the processing capability of the encoding device based on the amount of data stored in the buffer means 412. The encoding parameter determining means 402 determines the encoding pattern according to the specified frame rate and resolution and the judgment result output by the processing capability judging means 411, and outputs the encoding parameters to the encoding means 401. The encoding means 401 encodes according to these encoding parameters, so that encoding with a higher compression rate can be performed while satisfying the required conditions.

[0241] In the video encoding device according to Embodiment 4, the encoding pattern is determined according to the specified resolution. By similar processing, however, the resolution can instead be determined according to a specified encoding pattern (type), which makes it possible to encode at a higher resolution under the required frame rate and encoding pattern.

[0242] As in Embodiment 3, the device according to Embodiment 4 detects the amount of temporarily stored data while encoding is in progress, and can dynamically change the encoding parameters according to the encoding status thus obtained as the processing capability of the device. Therefore, although it incurs a slightly greater processing load than Embodiments 1 and 2, the video encoding device according to Embodiment 4 can set appropriate encoding conditions in response to changes in the state of the computer, for example when a general-purpose computer that executes multiple operations in parallel performs encoding as video is captured.

[0243] However, as in Embodiment 3, when the processing capability of the encoding device is not expected to fluctuate greatly during encoding, for example because the situation changes little, the video encoding device according to Embodiment 4 can also, like Embodiments 1 and 2, set the parameters at the start of encoding and thereafter encode under those conditions, thereby reducing the processing load of control.

[0244] In Embodiments 1 to 4, intra-frame encoding and forward predictive encoding are used as the encoding types, but the encoding types are not limited to these. For example, backward predictive encoding and bidirectional predictive encoding can each be adopted as separate encoding types, and changing the motion vector search range in inter-frame predictive encoding can also be treated as a different encoding type.

[0245] The video encoding methods described in Embodiments 1 to 4 can be realized by executing, on a personal computer, workstation, or the like, a video encoding program capable of carrying out the method, using a recording medium on which that program is recorded.

[0246] Embodiment 5. The audio encoding device according to Embodiment 5 of the present invention reduces the processing load on the device by applying a conversion process to the sampled audio data. FIG. 10 is a block diagram showing the configuration of the audio encoding device according to Embodiment 5, and FIG. 11 is a diagram showing the hardware configuration of the encoding device of Embodiment 5. As shown in FIG. 10, the audio encoding device comprises an audio input unit 501, a register 502, an input audio sampling unit 503, an audio data conversion unit 504, a band division unit 505, an encoded bit allocation unit 506, a quantization unit 507, an encoding unit 508, and an encoded data recording unit 509.

[0247] The audio input unit 501 inputs the audio to be encoded. The audio may be input from a microphone as shown in FIG. 11, or may be a line input. The register 502 is realized by the main memory or the external storage device of FIG. 11 and stores constants used in the encoding process. The input audio sampling unit 503 is realized by the sound board (input) and a control program of FIG. 11, and samples the audio input by the audio input unit 501.

[0248] The audio data conversion unit 504 applies a conversion process, using the constant stored in the register 502, to the data sampled by the input audio sampling unit 503. The band division unit 505 divides the data converted by the audio data conversion unit 504 into bands. The encoded bit allocation unit 506 allocates encoded bits to the bands produced by the band division unit 505. The quantization unit 507 performs quantization according to the number of encoded bits allocated by the encoded bit allocation unit 506. The encoding unit 508 outputs the quantized values output by the quantization unit 507 as encoded audio data. Units 504 to 508 are all realized by the CPU, main memory, and a program of FIG. 11. The encoded data recording unit 509 is realized by the external storage device and a control program of FIG. 11, and records the encoded data output from the encoding unit 508 as the result of the audio encoding process of the device.

[0249] In Embodiment 5, the set frequency fs is assumed to be 48 kHz, chosen from the three sampling frequencies specified by MPEG Audio: 32 kHz, 44.1 kHz, and 48 kHz. The conversion constant n is assumed to be stored in the register 502 as the value "2", predetermined according to the CPU performance. The value of n can be determined, for example, by fixing the CPU assumed for the device and setting n based on its encoding performance; by letting the user select the CPU and picking the corresponding value from values obtained in advance for each CPU, for instance by simulation; or by having the CPU perform a calculation that measures its encoding performance before encoding and setting n based on the result.

[0250] FIG. 12 is a flowchart showing the audio encoding operation of the encoding device of Embodiment 5, and FIG. 13 is a diagram for explaining the sampling by the encoding device of Embodiment 5 and the subsequent conversion of the audio data. The encoding operation of the audio encoding device according to Embodiment 5 is described below following the flowchart of FIG. 12, with reference to FIGS. 10 and 13.

[0251] In step 1 of the flow of FIG. 12, the audio signal input from the audio input unit 501 is sampled by the input audio sampling unit 503 with the set frequency fs as the sampling frequency. As in the conventional example, this sampling is performed as shown in FIG. 13(a), where ts is the time that is the reciprocal of the sampling frequency fs, and m pieces of sampled audio data are output.

[0252] In step 2, the audio data conversion unit 504 obtains the conversion constant n stored in the register 502 and extracts a total of m/n pieces of sampled audio data from the m pieces output by the input audio sampling unit 503, skipping (n-1) pieces at a time. Here (n-1) is 1, so every other piece of sampled audio data, marked with a circle in FIG. 13(b), is extracted. The audio data conversion unit 504 then creates a total of m pieces of converted audio data in which each extracted piece is repeated n times in succession. Converted audio data as in FIG. 13(c) is output; it consists of m pieces of audio data at frequency fs.
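The conversion of step 2 can be sketched as follows. This is a minimal illustration, assuming the samples are held in a plain Python list; the function name `convert_audio` and the sample values are hypothetical and not taken from the patent.

```python
def convert_audio(samples, n):
    """Keep every n-th sample and repeat each kept sample n times.

    The result has the same length m as the input, so it can still be
    handled as data sampled at the original frequency fs.
    """
    kept = samples[::n]                          # skip (n - 1) samples at a time
    held = [s for s in kept for _ in range(n)]   # repeat each kept sample n times
    return held[:len(samples)]                   # trim if m is not a multiple of n


# Data A (m = 8 samples at fs) converted with n = 2 into data B.
data_a = [10, 11, 12, 13, 14, 15, 16, 17]
data_b = convert_audio(data_a, 2)
# data_b == [10, 10, 12, 12, 14, 14, 16, 16]
```

Trimming to the input length is a design choice of this sketch, so that the output always has exactly m samples even when m is not a multiple of n.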

[0253] Consider, for comparison, the case where the audio data conversion in step 2 merely extracts samples skipping (n-1) at a time and passes them to the subsequent band division encoding. In this case, as shown by data C in FIG. 13(d), m/n pieces of audio data at sampling frequency fs/n are produced, so the output encoded audio data is only encoded data corresponding to sampling frequency fs/n and cannot be reproduced at sampling frequency fs. Under the assumptions here, if the audio is sampled at 48 kHz as specified by MPEG Audio and band division encoding is then applied to the m/2 pieces of audio data obtained by merely extracting every other sample with n = 2, only encoded audio data corresponding to a sampling frequency of fs/2 = 24 kHz is output, and it cannot be reproduced at any of the three sampling frequencies specified by MPEG Audio: 32 kHz, 44.1 kHz, and 48 kHz.

[0254] For this reason, in the encoding process according to Embodiment 5, the conversion in step 2 does not simply produce m/n pieces of audio data by skipping (n-1) samples (data C in FIG. 13(d)); it creates converted audio data in which the same data is repeated n times in succession. This converted audio data, shown as data B in FIG. 13(c), has an effective sampling frequency equivalent to fs/n, but consists of m pieces of audio data that can be handled at sampling frequency fs in the same way as data A in FIG. 13(b).

[0255] Following step 2, the conversion step described above, in step 3 the band division unit 505 divides the converted audio data into M frequency bands. MPEG Audio band division divides into 32 bands. This step is performed in the same way as in the first conventional example.

[0256] In step 4, the encoded bit allocation unit 506 obtains the set frequency fs and the conversion constant n from the register 502 and, based on them, calculates the limit frequency fs/(2n), which is the reproducible limit according to the well-known sampling theorem. It then determines the encoded bit allocation so that, among the M bands, encoded bits are allocated to the reproducible bands below the limit frequency and are not allocated to the non-reproducible bands above the limit frequency. The number of encoded bits allocated is passed from the encoded bit allocation unit 506 to the quantization unit 507.
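The band selection of step 4 can be sketched as below. The text does not spell out the band layout, so this sketch assumes that the M bands are of equal width and together cover 0 to fs/2; that layout, the function name `bands_to_encode`, and the rule "a band is encoded when its upper edge does not exceed the limit" are assumptions made for illustration.

```python
def bands_to_encode(fs_hz, n, num_bands=32):
    """Return one flag per band: True if the band receives encoded bits.

    Assumes num_bands equal-width bands covering 0 .. fs/2 (an assumption
    of this sketch, not a statement from the patent text).
    """
    limit_hz = fs_hz / (2 * n)                  # limit frequency fs/(2n)
    band_width_hz = (fs_hz / 2) / num_bands     # width of each band
    return [(k + 1) * band_width_hz <= limit_hz for k in range(num_bands)]


# fs = 48 kHz and n = 2, as assumed in this embodiment.
flags = bands_to_encode(48_000, 2)
```

Bands whose flag is False receive no encoded bits, so the quantization unit can skip them entirely.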

[0257] In step 5, the quantization unit 507 quantizes the audio data for each band according to the number of encoded bits allocated and outputs quantized values; in step 6, the encoding unit 508 outputs encoded audio data based on these quantized values. The output encoded audio data is recorded in the encoded data recording unit 509. As shown in the flow of FIG. 12, the above process is repeated while the audio to be encoded continues to be input, and encoding completes promptly after the audio input ends.

[0258] Regarding the effect of the audio encoding device according to Embodiment 5, the reduction in processing achieved by using the converted audio data obtained in step 2 for the band division in step 3 is considered below for the MPEG Audio band division method.

[0259] The MPEG Audio band division assumed here performs the following operations to divide into 32 bands.

【０２６０】[0260]

【数１】 (Equation 1)

[0261] Here, Xi is the input audio data and Si is the audio data after band division. The coefficients Ci are obtained from the coefficient table in the MPEG Audio standard, which associates sample numbers with coefficients.
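The equation image itself is not reproduced in this text. Assuming it corresponds to the analysis subband filter of MPEG-1 Audio (ISO/IEC 11172-3), which matches the roles of Xi, Ci, Zi, Yi, and Si in the surrounding paragraphs, the three operations referred to as equations (1) to (3) would read as follows. The index ranges and the cosine matrix M are taken from that standard, not from this document.

```latex
\begin{align}
Z_i &= C_i \, X_i, & i &= 0, \dots, 511 \tag{1} \\
Y_i &= \sum_{j=0}^{7} Z_{i+64j}, & i &= 0, \dots, 63 \tag{2} \\
S_i &= \sum_{k=0}^{63} M_{ik} \, Y_k, \quad
       M_{ik} = \cos\!\left[\frac{(2i+1)(k-16)\pi}{64}\right], & i &= 0, \dots, 31 \tag{3}
\end{align}
```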

[0262] Equations (1) and (2) are normally computed over m pieces of audio data. When they are computed over the converted audio data from step 2, in which each extracted sample is repeated n times in succession, each run of n samples is identical, so wherever these can be handled together the operation need only be performed on m/n pieces of audio data. The amount of computation for equation (1) is thus reduced to 1/n, and that for equation (2) can also be reduced to 1/n.

[0263] The case n = 4 is described here. In equation (1), the audio data X0 to X3 consist of four consecutive copies of X0, so X0 = X1 = X2 = X3, and the operation can also use a single value for Ci such that C0 = C1 = C2 = C3. The four values Z0, Z1, Z2, Z3 are therefore obtained by a single operation, and all Zi are obtained from equation (1) with 1/4 of the computation. To represent the coefficients in equation (1) by a single value such that C0 = C1 = C2 = … = Cn, one can, for example, use any one of the values C0 to Cn or use the average of C0 to Cn.

[0264] Equation (2), which uses these Zi, simply adds eight values taken at intervals of 64 starting from index 0, so when n is a power of 2 the Yi take the same value in runs of n and the computation is reduced to 1/n. When n is not a power of 2, however, for example n = 3, Z64, which is added to obtain Y0, is not equal to Z65, which is added to obtain Y1, so Y0 = Y1 does not hold and the computation for equation (2) is not reduced after all.
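The effect of n on equation (2) can be checked numerically. The sketch below assumes that equation (2) sums eight values of Z spaced 64 apart over a 512-value window, as the text describes, and uses a stand-in sequence for Z made of runs of n equal integers; the function names and the stand-in values are illustrative only.

```python
def fold(z):
    """Equation (2) as described in the text: add eight values spaced 64 apart."""
    return [sum(z[i + 64 * j] for j in range(8)) for i in range(64)]


def held_sequence(length, n):
    """Stand-in for Z: runs of n equal values (0, 0, ..., 1, 1, ...)."""
    return [i // n for i in range(length)]


y4 = fold(held_sequence(512, 4))   # n = 4 divides 64: runs stay aligned
y3 = fold(held_sequence(512, 3))   # n = 3 does not divide 64: runs drift

runs_of_four = all(y4[i] == y4[i - i % 4] for i in range(64))   # True
first_pair_equal_for_three = (y3[0] == y3[1])                   # False
```

With n = 4 every run of four outputs is identical, so only one in four needs to be computed. With n = 3 the runs of equal Z values fall out of step with the stride of 64, and neighbouring outputs generally differ.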

[0265] Even when n is not a power of 2, the computation can still be reduced by an appropriate setting. For example, for n = 3, the audio data conversion in step 2 is performed so that, in the division into 32 bands, as many identical values as possible run consecutively over i = 0 to 31: X0 = X1 = X2, X3 = X4 = X5, …, X27 = X28 = X29, X30 = X31. That is, the data is converted into an audio data sequence of consecutive triples of equal values, with only the last two values forming a pair. If equations (1) and (2) are computed on this converted audio data during the band division in step 3, a reduction close to 1/3 is achieved. For equation (3), the amount of computation is unchanged by the audio data conversion.
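The grouping described for n = 3 can be sketched as follows, assuming a block of 32 values held in a list. The helper names `run_lengths` and `convert_block` are hypothetical.

```python
def run_lengths(block_size, n):
    """Split a block into runs of n, with one shorter run at the end if needed."""
    full, rest = divmod(block_size, n)
    return [n] * full + ([rest] if rest else [])


def convert_block(block, n):
    """Replace each run by repetitions of its first value."""
    out, start = [], 0
    for length in run_lengths(len(block), n):
        out.extend([block[start]] * length)
        start += length
    return out


lengths = run_lengths(32, 3)                     # ten triples and one pair
converted = convert_block(list(range(32)), 3)    # 0,0,0,3,3,3,...,27,27,27,30,30
distinct = len(set(converted))                   # 11 distinct values out of 32
```

Eleven distinct values out of 32 is a ratio of about 0.34, consistent with the reduction "close to 1/3" stated in the text.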

[0266] In the encoded bit allocation of step 4, encoded bits are allocated only to bands at or below the limit frequency fs/(2n) for the following reason. Extracting samples skipping (n-1) at a time in the audio data conversion of step 2, from audio data originally sampled at sampling frequency fs, is equivalent to sampling at sampling frequency fs/n, and by the well-known sampling theorem frequency bands at or above fs/(2n) cannot be reproduced; only the reproducible bands at or below fs/(2n) are therefore encoded. Because no encoded bits are allocated to bands at or above the limit frequency and those bands need not be quantized, the load of the quantization process in step 5 is reduced to 1/2n.

[0267] As described above, the audio encoding device of Embodiment 5 comprises the register 502 and the audio data conversion unit 504. From the m pieces of sampled audio data that the input audio sampling unit 503 has sampled at the set frequency, the audio data conversion unit 504 extracts a total of m/n pieces, skipping n-1 at a time based on the conversion constant n stored in the register 502, and forms m pieces of converted audio data in which each extracted piece is repeated n times in succession. This greatly reduces the amount of computation in the subsequent processing by the band division unit 505 and, unlike simply reducing the audio data by extraction, yields encoded data that can be reproduced at the original set frequency. In addition, the encoded bit allocation unit 506 allocates no encoded bits to bands that are not reproducible according to the sampling theorem, and those bands need not be quantized by the quantization unit 507, so the load of the quantization process is reduced to 1/2n. Therefore, even when real-time encoding as the audio is input is difficult or impossible with the conventional method, for example because of insufficient CPU performance, setting the constant to reduce the load makes real-time audio encoding possible.

[0268] In Embodiment 5, the converted audio data is created by repeating the sampled audio data. The same effect can also be obtained by inserting n-1 pieces of suitable audio data between the sampled audio data, such as audio data obtained by averaging the audio data on both sides.

[0269] Embodiment 6. Like Embodiment 5, the audio encoding device according to Embodiment 6 of the present invention reduces the processing load on the device by applying a conversion process to the sampled audio data, but it differs from Embodiment 5 in that the data amount is reduced in the sampling process rather than in the audio data conversion process.

[0270] FIG. 14 is a block diagram showing the configuration of the audio encoding device according to Embodiment 6 of the present invention. As shown in the figure, the audio encoding device comprises an audio input unit 601, a register 602, an input audio sampling unit 603, an audio data conversion unit 604, a band division unit 605, an encoded bit allocation unit 606, a quantization unit 607, an encoding unit 608, and an encoded data recording unit 609. The hardware configuration of the encoding device of Embodiment 6 is the same as that of Embodiment 5 shown in FIG. 11.

[0271] Unlike the input audio sampling unit 503 of Embodiment 5, the input audio sampling unit 603 does not use the set frequency directly as the sampling frequency; it obtains the conversion constant from the register 602 and samples at a sampling frequency determined from the set frequency and this conversion constant. Unlike the audio data conversion unit 504 of Embodiment 5, the audio data conversion unit 604 does not extract sampled audio data; it creates the converted audio data by inserting audio data only. The audio input unit 601, band division unit 605, encoded bit allocation unit 606, quantization unit 607, encoding unit 608, and encoded data recording unit 609 are the same as units 501 and 505 to 509 of Embodiment 5. FIG. 15 is a flowchart showing the audio encoding operation of the encoding device of Embodiment 6. FIG. 13 is again used to explain the sampling and the audio data conversion in Embodiment 6. The encoding operation of the audio encoding device according to Embodiment 6 is described below following the flowchart of FIG. 15, with reference to FIG. 14. As in Embodiment 5, the set frequency fs is 48 kHz as specified by MPEG Audio, and the conversion constant n is "2".

[0272] In step 1 of the flow of FIG. 15, the input audio sampling unit 603 obtains the set frequency fs and the conversion constant n from the register 602 and determines from them the effective sampling frequency fs/n. The audio signal input from the audio input unit 601 is sampled by the input audio sampling unit 603 at the effective sampling frequency fs/n. This sampling outputs m/n pieces of sampled audio data, like data C in FIG. 13(d).

【０２７３】 In step 2, the audio data conversion unit 604 obtains the conversion constant n from the register 602 and, based on the m/n pieces of sampled audio data output by the input audio sampling unit 603, creates m pieces of converted audio data in which each piece of sampled audio data is repeated n times in succession. Converted audio data such as data B in FIG. 13(c) is output; this converted audio data consists of m pieces of audio data at the frequency fs.
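The conversion in step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `repeat_samples` and the use of a plain list of integers are hypothetical and do not come from the specification.

```python
def repeat_samples(sampled, n):
    """Expand m/n sampled values into m values by repeating each one n times."""
    if n < 1:
        raise ValueError("conversion constant n must be >= 1")
    converted = []
    for value in sampled:
        converted.extend([value] * n)  # n consecutive copies of the same sample
    return converted


# Data C (sampled at fs/n, here n = 2) becomes data B (m values at rate fs).
data_c = [10, 20, 30]
data_b = repeat_samples(data_c, 2)
```

With n = 2, three sampled values become six converted values, which matches the relationship between the m/n input samples and the m output samples.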

【０２７４】 As described in the fifth embodiment, the audio data of data C in FIG. 13(d) cannot be reproduced at the sampling frequency fs. By converting it into audio data such as data B in FIG. 13(c), however, encoded data that can be reproduced at the sampling frequency fs is obtained.

【０２７５】 Since the converted audio data obtained in step 2 is equivalent to that obtained in step 2 of the fifth embodiment, the subsequent steps 3 to 6 are executed in the same way as steps 3 to 6 of the fifth embodiment. Steps 1 to 6 are repeated for as long as audio input continues, and encoding completes promptly after the audio input ends.

【０２７６】 In the audio encoding device of the sixth embodiment as well, the same reduction in computation as described for the device of the fifth embodiment is possible at the band division stage and the quantization stage, so that real-time encoding accompanying audio input can be performed at a level commensurate with CPU performance and the like.

【０２７７】 As described above, the audio encoding device of the sixth embodiment includes the register 602, the input audio sampling unit 603, and the audio data conversion unit 604. The input audio sampling unit 603 determines the effective sampling frequency fs/n from the set frequency fs and the conversion constant n stored in the register 602 and performs sampling at that frequency; the audio data conversion unit 604 then inserts audio data into the resulting m/n pieces of sampled audio data to obtain converted audio data consisting of m pieces of audio data. As in the fifth embodiment, this reduces the processing load on the device. In addition, because the encoding device of the sixth embodiment samples at the frequency fs/n, the buffer memory that temporarily holds audio data at sampling input needs only 1/n of the capacity required by the device of the fifth embodiment, and the device can operate even with a sound board whose upper sampling frequency limit is below fs. Thus, even with fewer hardware resources, the device can make use of the resources available to perform real-time encoding accompanying audio input.
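As a rough worked example of these two hardware benefits, the sketch below computes the effective sampling frequency and the input buffer size for a given n. The one-second buffering interval, the 16-bit sample width, and the function names are assumptions chosen for illustration; the specification gives no concrete buffer sizes.

```python
def effective_sampling_hz(fs_hz, n):
    """Effective sampling frequency fs/n."""
    return fs_hz / n


def input_buffer_bytes(fs_hz, n, seconds, bytes_per_sample=2):
    """Bytes needed to hold `seconds` of audio sampled at fs/n (assumed sample width)."""
    return int(effective_sampling_hz(fs_hz, n) * seconds * bytes_per_sample)


fs = 48_000
full = input_buffer_bytes(fs, 1, seconds=1)     # sampling at fs
reduced = input_buffer_bytes(fs, 2, seconds=1)  # sampling at fs/2
```

Under these assumptions, n = 2 halves the buffer requirement, and a sound board limited to 24 kHz would be sufficient for fs = 48 kHz.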

【０２７８】 Note that in the sixth embodiment, as in the fifth embodiment, the converted audio data is created in a form in which copies of each piece of sampled audio data are lined up; however, the same effect can also be obtained by inserting n−1 pieces of other suitable audio data instead.

【０２７９】 Embodiment 7. The audio encoding device according to the seventh embodiment of the present invention changes the conversion constant according to the amount of input data, so that encoding suited to the current situation can be performed.

【０２８０】 FIG. 16 is a block diagram showing the configuration of the audio encoding device according to the seventh embodiment of the present invention. As shown in the figure, the audio encoding device comprises an audio input unit 701, a register 702, an input audio sampling unit 703, an audio data conversion unit 704, a band division unit 705, a coded bit allocation unit 706, a quantization unit 707, an encoding unit 708, an encoded data recording unit 709, an input buffer 7010, and an input buffer monitoring unit 7011. This configuration adds the input buffer 7010 and the input buffer monitoring unit 7011 to the audio encoding device of the fifth embodiment. The hardware configuration of the encoding device of the seventh embodiment is also the same as that of the fifth embodiment shown in FIG. 11.

【０２８１】 The input buffer 7010 is realized mainly by memory such as the main memory and stores data temporarily. The input buffer monitoring unit 7011 is realized by the CPU, the main memory, and a program; it checks the amount of data held in the input buffer 7010 for temporary storage, compares this amount with a preset value, and changes the value of the conversion constant n in the register 702 according to the result. The register 702 is the same as the register 502 of the fifth embodiment except that the stored value of the conversion constant is changed by the input buffer monitoring unit 7011. The input audio sampling unit 703 is the same as the input audio sampling unit 503 of the fifth embodiment except that it outputs the sampled audio data to the input buffer 7010. The audio data conversion unit 704 is the same as the audio data conversion unit 504 of the fifth embodiment except that it takes the sampled audio data to be processed from the input buffer 7010. The audio input unit 701, band division unit 705, coded bit allocation unit 706, quantization unit 707, encoding unit 708, and encoded data recording unit 709 are the same as units 501 and 505 to 509 of the fifth embodiment.

【０２８２】 FIG. 17 is a flowchart showing the audio encoding operation of the encoding device of the seventh embodiment. The operation of the audio encoding device of the seventh embodiment during encoding is described below following FIG. 17 and with reference to FIG. 16. As in the fifth embodiment, the set frequency fs is 48 kHz as specified by MPEG Audio. The conversion constant n is assumed to be stored in the register 702 with an initial value of 1, predetermined according to CPU performance.

【０２８３】 In step 1 of the flow of FIG. 17, the audio signal input from the audio input unit 701 is sampled by the input audio sampling unit 703 in the same way as in the fifth embodiment. In step 2, the sampled audio data is written to the input buffer 7010 and stored temporarily. In step 3, the audio data conversion unit 704 reads the temporarily stored sampled audio data from the input buffer 7010. Step 4 is described below. The processing from the audio data conversion in step 5, which follows step 4, through the output of encoded data in step 9 is executed in the same way as steps 2 to 6 of the flow of FIG. 12 in the fifth embodiment, so a description of steps 5 to 9 is omitted.

【０２８４】 After step 3 is executed, in step 4 the input buffer monitoring unit 7011 checks the amount of data held in the input buffer 7010, compares it with a preset value, and changes the value of the conversion constant n stored in the register 702 based on the comparison result. Various methods can be used to monitor the input buffer 7010 and control the value of n; here, control is assumed to proceed as follows.

【０２８５】 When, because of an increased CPU load or the like, encoding can no longer keep up with the audio input under the initial settings, data is still written to the input buffer 7010 at the same pace while the pace of reading for encoding drops, so the amount of buffered data increases.
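The growth of the buffered amount can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: the tick-based timing, the per-tick write and read counts, and the function name `simulate_buffer` are not part of the specification.

```python
from collections import deque


def simulate_buffer(write_per_tick, read_per_tick, ticks):
    """Return the buffer level after each tick for fixed write and read paces."""
    buf = deque()
    levels = []
    for t in range(ticks):
        buf.extend([t] * write_per_tick)        # sampling writes at a constant pace
        for _ in range(min(read_per_tick, len(buf))):
            buf.popleft()                       # the encoder reads what it can handle
        levels.append(len(buf))
    return levels


balanced = simulate_buffer(write_per_tick=4, read_per_tick=4, ticks=3)
overloaded = simulate_buffer(write_per_tick=4, read_per_tick=3, ticks=3)
```

When reading keeps pace with writing the level stays flat; once the read pace drops, the level rises by the difference on every tick, which is the condition the monitoring unit watches for.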

【０２８６】 When the amount of data in the input buffer 7010 exceeds a preset buffer-full level BF, the input buffer monitoring unit 7011 judges that real-time encoding is impossible under the current settings and increases the value of the conversion constant n stored in the register 702 by 1, changing it to n = 2. In the subsequent steps 5 to 9 of the flow, step 5 thins the data by skipping every other sample and converts it into a form in which the same value appears twice in succession; band-dividing this data in step 6 reduces part of the band division processing to 1/2. In step 7, when coded bits are allocated to the bands, they are allocated only to bands at or below the frequency fs/4, which reduces the quantization processing in step 8 to 1/4. In this way, the input buffer monitoring unit 7011 reduces the load on the CPU by changing the value of the conversion constant n.

【０２８７】 If, as the flow of FIG. 17 repeats, the amount of data in the input buffer 7010 still exceeds the buffer-full level BF in step 4, the input buffer monitoring unit 7011 changes the conversion constant n in the register 702, increasing it by a further 1 to n = 3. As a result, step 5 thins the data by skipping two samples at a time and converts it into a form in which the same value appears three times in succession, reducing part of the band division processing in step 6 to 1/3; and in step 7 coded bits are allocated only to bands at or below the frequency fs/6, reducing the quantization processing in step 8 to 1/6. Thereafter, the input buffer monitoring unit 7011 keeps increasing the value of n in the register 702 until the amount of data in the input buffer 7010 is at or below the buffer-full level BF in step 4.
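The effect of a given n on steps 5 and 7 can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the cutoff formula fs/(2n) is inferred from the two cases stated for this embodiment (fs/4 for n = 2 and fs/6 for n = 3), not quoted from the specification.

```python
def thin_and_repeat(samples, n):
    """Keep every n-th sample and repeat it n times, preserving the original length."""
    converted = []
    for value in samples[::n]:          # skip (n - 1) samples between kept ones
        converted.extend([value] * n)   # n consecutive copies of each kept sample
    return converted[:len(samples)]


def allocation_cutoff_hz(fs_hz, n):
    """Highest frequency that still receives coded bits, assumed to be fs / (2n)."""
    return fs_hz / (2 * n)


converted = thin_and_repeat([1, 2, 3, 4, 5, 6], 2)
cutoff_n2 = allocation_cutoff_hz(48_000, 2)
cutoff_n3 = allocation_cutoff_hz(48_000, 3)
```

For fs = 48 kHz this gives cutoffs of 12 kHz at n = 2 and 8 kHz at n = 3, so raising n trades high-frequency content for less work in the later stages.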

【０２８８】 Conversely, when the amount of data held in the input buffer 7010 falls below a preset buffer-empty level BE in step 4, the input buffer monitoring unit 7011 judges that there is spare encoding capacity. The smaller the value of the conversion constant n, the less the audio data is thinned and the fewer high-frequency components are cut, so higher-quality encoded data is obtained. The input buffer monitoring unit 7011 therefore decreases the value of n by 1, and thereafter, as above, keeps decreasing the value of n stored in the register 702 by 1 in step 4 on each repetition of the flow of FIG. 17 until the amount of data in the input buffer 7010 is at or above the buffer-empty level BE.
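The step 4 control rule with the two thresholds BF and BE can be sketched as below. The function name, the threshold values in the usage lines, and the upper bound `n_max` are assumptions for illustration; the text itself states no upper limit on n, only that n is an integer starting at 1.

```python
def next_conversion_constant(n, buffered, buffer_full, buffer_empty, n_max=8):
    """Return the updated conversion constant for one pass through step 4."""
    if buffered > buffer_full:      # above BF: real-time encoding is falling behind
        return min(n + 1, n_max)    # n_max is an assumed safety bound
    if buffered < buffer_empty:     # below BE: spare capacity, raise quality
        return max(n - 1, 1)
    return n                        # between BE and BF: keep the current setting


n = 1
history = []
for level in [10, 90, 95, 60, 5, 5]:  # hypothetical buffer levels per iteration
    n = next_conversion_constant(n, level, buffer_full=80, buffer_empty=20)
    history.append(n)
```

Because n changes by one per iteration, the constant climbs while the buffer stays above BF and falls back step by step once the level drops below BE.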

【０２８９】 In the above method, two values, the buffer-full level BF and the buffer-empty level BE, are used to control the conversion constant n, but the buffer-full level BF alone may be used. In that case, control is performed so that the value of n is increased until the amount of data in the input buffer reaches the preset buffer-full level BF, and the increase is stopped when audio input and encoding are in balance, that is, when the amount of data reaches BF.

【０２９０】 As described above, the audio encoding device of the seventh embodiment adds the input buffer 7010 and the input buffer monitoring unit 7011 to the audio encoding device of the fifth embodiment. Sampled audio data is stored temporarily in the input buffer 7010 and then read out for subsequent processing, and the input buffer monitoring unit 7011 checks the amount of data held in the input buffer 7010, uses it as an indicator of the CPU's encoding capacity at that moment, and dynamically controls the value of the conversion constant n stored in the register 702 according to the situation. This makes it possible to perform the highest-quality audio encoding that the CPU can handle at that moment.

【０２９１】 Embodiment 8. The audio encoding device according to the eighth embodiment of the present invention changes the conversion constant according to the amount of output data, so that encoding suited to the current situation can be performed. FIG. 18 is a block diagram showing the configuration of the audio encoding device according to the eighth embodiment of the present invention. As shown in the figure, the audio encoding device comprises an audio input unit 801, a register 802, an input audio sampling unit 803, an audio data conversion unit 804, a band division unit 805, a coded bit allocation unit 806, a quantization unit 807, an encoding unit 808, an encoded data recording unit 809, and an encoded data monitoring unit 8012. This configuration adds the encoded data monitoring unit 8012 to the audio encoding device of the fifth embodiment. The hardware configuration of the encoding device of the eighth embodiment is also the same as that of the fifth embodiment shown in FIG. 11.

【０２９２】 The encoded data monitoring unit 8012 is realized by the CPU, the main memory, and a program; it checks the amount of encoded data output by the encoding unit 808 per unit time, compares this amount with a preset value, and changes the value of the conversion constant n in the register 802 according to the result. The register 802 is the same as the register 502 of the fifth embodiment except that the stored value of the conversion constant is changed by the encoded data monitoring unit 8012. The audio input unit 801, input audio sampling unit 803, audio data conversion unit 804, band division unit 805, coded bit allocation unit 806, quantization unit 807, encoding unit 808, and encoded data recording unit 809 are the same as units 501 and 503 to 509 of the fifth embodiment.

【０２９３】 FIG. 19 is a flowchart showing the audio encoding operation of the encoding device of the eighth embodiment. The operation of the audio encoding device of the eighth embodiment during encoding is described below following FIG. 19 and with reference to FIG. 18. As in the fifth embodiment, the sampling frequency fs is 48 kHz as specified by MPEG Audio. The conversion constant n is assumed to be stored in the register 802 with an initial value of 1, predetermined according to CPU performance. Steps 1 to 6 of the flow of FIG. 19 are executed in the same way as steps 1 to 6 of the fifth embodiment. In step 7, the encoded data monitoring unit 8012 checks the amount of encoded data output by the encoding unit 808 per unit time, compares this amount with a preset value, and changes the value of the conversion constant n in the register 802 according to the result. Various methods can be used to monitor the amount of encoded data and control the value of n; here, control is assumed to follow the method below.

【０２９４】 When, because of an increased CPU load or the like, encoding can no longer keep up under the initial settings, the pace of encoding drops and the amount of encoded data output decreases. When the amount of encoded data does not reach a preset minimum encoding level CL in step 7, the encoded data monitoring unit 8012, like the input buffer monitoring unit 7011 of the seventh embodiment, increases the value of the conversion constant n in the register 802 to reduce the load on the CPU. Also as in the seventh embodiment, when, as the processing of FIG. 19 repeats, the amount of encoding per unit time is not below a maximum encoding level CH in step 7, the value of n in the register 802 is decreased so that higher-quality encoding can be performed. Like the control by the input buffer monitoring unit 7011 in the seventh embodiment, the encoded data monitoring unit 8012 of the eighth embodiment continues to change the value of n until the amount of encoded data is judged appropriate. Likewise, as in the seventh embodiment, control is also possible using only the minimum encoding level CL rather than both CL and CH.
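One way to picture the step 7 monitoring is a small counter that accumulates the coded bytes produced in each unit of time and then compares the total with CL and CH. The class below is a sketch under assumptions: the class and method names, the byte-based measure, the example thresholds, and the bound `n_max` are all hypothetical.

```python
class CodedDataMonitor:
    """Tracks coded bytes per unit time and adjusts the conversion constant n."""

    def __init__(self, level_low, level_high, n=1, n_max=8):
        self.level_low = level_low    # minimum encoding level CL
        self.level_high = level_high  # maximum encoding level CH
        self.n = n
        self.n_max = n_max            # assumed upper bound, not given in the text
        self._bytes = 0

    def record(self, coded):
        """Count the bytes of one coded frame output by the encoder."""
        self._bytes += len(coded)

    def end_of_unit_time(self):
        """Compare the accumulated amount with CL and CH, update n, reset the counter."""
        amount, self._bytes = self._bytes, 0
        if amount < self.level_low:       # output too low: encoder is overloaded
            self.n = min(self.n + 1, self.n_max)
        elif amount >= self.level_high:   # not below CH: capacity to spare
            self.n = max(self.n - 1, 1)
        return self.n


monitor = CodedDataMonitor(level_low=100, level_high=300)
monitor.record(b"\x00" * 50)
n_after_slow = monitor.end_of_unit_time()
monitor.record(b"\x00" * 400)
n_after_fast = monitor.end_of_unit_time()
```

Unlike the buffer-based rule of the seventh embodiment, this variant needs no input buffer; it only observes how much coded output the encoder manages to produce.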

【０２９５】 As described above, the audio encoding device of the eighth embodiment adds the encoded data monitoring unit 8012 to the audio encoding device of the fifth embodiment. The encoded data monitoring unit 8012 checks the amount of encoded data output per unit time, uses it as an indicator of the CPU's encoding capacity at that moment, and dynamically controls the value of the conversion constant n stored in the register 802 according to the situation. This makes it possible to perform the highest-quality audio encoding that the CPU can handle at that moment.

【０２９６】 In the seventh and eighth embodiments, following the fifth embodiment, the input audio sampling unit samples at the sampling frequency fs to obtain m pieces of sampled audio data, and the audio data conversion unit then thins them by skipping (n−1) pieces at a time. However, as in the device of the sixth embodiment, the input audio sampling unit may instead sample at the frequency fs/n to obtain m/n pieces of sampled audio data, which the audio data conversion unit converts into m pieces of converted audio data; this can be done easily by changing software settings. In that case, the benefits described for the sixth embodiment, namely a smaller buffer memory and the ability to use a sound board with a strict sampling frequency limit, are obtained in the same way.
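The interchangeability of the two approaches can be checked with an idealized model in which sampling simply reads a signal at integer ticks of the fs clock. The helper names, the integer test signal, and the assumption that m is divisible by n are all introduced here for illustration only.

```python
def sample(signal, step, count):
    """Read `count` values from `signal`, one every `step` ticks of the fs clock."""
    return [signal(i * step) for i in range(count)]


def repeat(values, n):
    """Repeat each value n times in succession."""
    return [v for v in values for _ in range(n)]


def fifth_style(signal, m, n):
    """Sample m values at fs, keep every n-th one, repeat each kept value n times."""
    return repeat(sample(signal, 1, m)[::n], n)


def sixth_style(signal, m, n):
    """Sample m/n values at fs/n, repeat each value n times."""
    return repeat(sample(signal, n, m // n), n)


def signal(tick):
    """Arbitrary integer test signal (hypothetical)."""
    return (tick * tick) % 17


a = fifth_style(signal, 12, 3)
b = sixth_style(signal, 12, 3)
```

In this idealized model both paths yield the same m converted values; a real sound board may filter differently at fs and at fs/n, so the equality should be read as conceptual rather than bit-exact.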

【０２９７】 In the encoding according to the fifth to eighth embodiments, audio data is in effect thinned and high-frequency components are removed, so sound quality degrades accordingly. Even so, band-division-encoded data such as MPEG Audio can be created in real time in software, without additional hardware, even on a low-performance CPU, and the result can be used as MPEG data, which is widely used as an international standard for moving picture coding. Furthermore, because adjusting the value of the conversion constant controls the degree of thinning and the proportion of high-frequency components removed to match the CPU's encoding performance, encoding is possible not only on a high-performance CPU but also on a CPU with insufficient performance, at a sound quality commensurate with its encoding capacity; encoding can thus be realized on CPUs across a wide range of performance levels. As for hardware, however, the higher the CPU performance, the better the sound board's capabilities, and the higher the data transfer rate within the device, the higher the quality of encoding that is possible.

【０２９８】 The audio encoding of the fifth to eighth embodiments can be recorded on a recording medium as an audio encoding control program and executed on a personal computer, workstation, or other device. In the fifth to eighth embodiments the encoded data is saved in a storage device, but it may also be transmitted to another device via a network or the like and recorded or used there. The fifth to eighth embodiments have been described in terms of processing by a CPU, but the same applies to software processing using a DSP in place of the CPU.

【０２９９】 Embodiment 9. The audio encoding device according to the ninth embodiment of the present invention controls, according to a set constant, whether data processing is performed for each unit period of sampling data that has been divided into unit periods, thereby reducing the processing load. FIG. 20 is a block diagram showing the configuration of the audio encoding device according to the ninth embodiment of the present invention. FIG. 21 is a flowchart of audio encoding according to the ninth embodiment, and FIG. 22 is a conceptual diagram for explaining audio encoding according to the ninth embodiment. The hardware configuration of the encoding device of the ninth embodiment is also the same as that of the fifth embodiment, and FIG. 11 is used for the description.

【０３００】 As shown in FIG. 20, the audio encoding device of the ninth embodiment comprises an audio input unit 901, a register 902, an input audio sampling unit 903, a determination control unit (unit period determination) 904, a band division unit 905, a coded bit allocation unit 906, a quantization unit 907, an encoding unit 908, an encoded data recording unit 909, and a fixed code register 910.

【０３０１】 The audio input unit 901 inputs the audio to be encoded. The audio may be input from a microphone as shown in FIG. 11 or via a line input. The unit period determination constant register 902 is realized by the main memory or external storage device of FIG. 11 and stores the unit period determination constant. The input audio sampling unit 903 is realized by the sound board (input) of FIG. 11 and a control program, and samples the audio input by the audio input unit 901. The determination control unit 904 uses the value of the constant stored in the register 902 to judge whether the data sampled by the input audio sampling unit 903 belongs to an encoding target period. The band division unit 905 band-divides the sampling data only when the determination control unit 904 has judged the period to be an encoding target period. The coded bit allocation unit 906 allocates coded bits to the bands produced by the band division unit 905. The quantization unit 907 performs quantization according to the number of coded bits allocated by the coded bit allocation unit 906. The encoding unit 908 outputs the quantized values output by the quantization unit 907 as encoded audio data. In the ninth embodiment, when the determination control unit 904 judges that the period is not an encoding target period, the encoding unit 908 outputs, as the encoded audio data, the encoded data dN corresponding to zero band output that is stored in the fixed code register 910 described below. Units 904 to 908 are all realized by the CPU, the main memory, and a program of FIG. 11. The encoded data recording unit 909 is realized by the external storage device of FIG. 11 and a control program, and records the output encoded data. The fixed code register 910 is realized by the main memory or external storage device of FIG. 11 and stores the encoded data dN corresponding to zero band output. The operation during encoding of the audio encoding device of the ninth embodiment configured in this way is described below following the flowchart of FIG. 21 and with reference to FIGS. 20 and 22.

【０３０２】 In step 1 of the flow of FIG. 21, the audio signal input from the audio input unit 901 is sampled by the input audio sampling unit 903 using the set frequency fs as the sampling frequency. Sampling data at the frequency fs is thereby output to the determination control unit 904.

【０３０３】 In step 2, the determination control unit 904 judges whether the sampling data belongs to an encoding target period. For this judgment, the period corresponding to the number p of input audio samples handled in one band division is taken as a unit period ti, and the judgment is made for each unit period. The unit period determination constant k used in the judgment is an integer of 1 or more that is set in advance by the system and stored in the register 902. A unit period ti is judged to be an encoding target period when i = n × k + 1 holds for some integer n, and not to be an encoding target period otherwise.
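The step 2 test reduces to a remainder check. The sketch below assumes that unit periods are numbered from i = 1 and that n ranges over non-negative integers; neither is stated explicitly in the text, and the function name is hypothetical.

```python
def is_encoding_target(i, k):
    """True when i = n*k + 1 for some integer n >= 0 (unit periods numbered from 1)."""
    if i < 1 or k < 1:
        raise ValueError("i and k must be >= 1")
    return (i - 1) % k == 0


# With k = 3, every third unit period is encoded, starting with the first.
targets = [i for i in range(1, 11) if is_encoding_target(i, 3)]
```

With k = 1 every unit period is a target, so the device behaves like a conventional encoder; larger k values encode only one period in every k.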

【０３０４】 When the unit period ti is judged in step 2 to be an encoding target period, steps 3 to 6 are executed, performing the same processing as in the conventional example. First, in step 3, the band division unit 905 divides the audio data of the unit period ti into M frequency bands. This step is performed in the same way as in the first conventional example described with reference to FIGS. 59 and 60. In step 4, the coded bit allocation unit 906 allocates a number of coded bits to each band and passes the allocation to the quantization unit 907. In step 5, the quantization unit 907 quantizes, according to the allocated numbers of coded bits, the audio data of the unit period ti for each band produced by the band division unit 905, and outputs quantized values. In step 6, the encoding unit 908 constructs encoded audio data from the quantized values output by the quantization unit 907 and outputs it, and the encoded audio data is recorded by the encoded data recording unit 909.

【０３０５】 If, on the other hand, step 2 finds that the unit period ti is not an encoding target period, the band division, coded bit allocation, and quantization of steps 3 to 6 are not performed, and step 7 is executed directly after step 2. In step 7, the encoding unit 908 obtains the fixed encoded data dN from the fixed code register 910 and outputs it as the encoded data. The fixed encoded data dN is data set in advance in the fixed code register 910, representing the case where the output of every band in the band division is zero. The output encoded audio data is recorded by the encoded data recording unit 909. As the flow of FIG. 21 shows, this process is repeated for as long as the audio to be encoded continues to be input, and encoding completes promptly after the audio input ends.
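The branch between steps 3 to 6 and step 7 can be sketched as the loop below. It is a minimal sketch under stated assumptions: `encode_unit` is a hypothetical stand-in for band division, bit allocation, quantization, and encoding, and `fixed_code` stands in for the preset data dN. Neither name comes from the original text.

```python
from typing import Callable, Iterable, List, Sequence


def encode_stream(
    unit_periods: Iterable[Sequence[int]],
    k: int,
    encode_unit: Callable[[Sequence[int]], bytes],
    fixed_code: bytes,
) -> List[bytes]:
    """Encode every k-th unit period and emit the fixed code for the rest."""
    if k < 1:
        raise ValueError("k must be 1 or more")
    output: List[bytes] = []
    for i, samples in enumerate(unit_periods, start=1):
        if (i - 1) % k == 0:
            # Step 2 says "target": run steps 3 to 6 (stubbed here).
            output.append(encode_unit(samples))
        else:
            # Step 2 says "not a target": step 7, output the preset data.
            output.append(fixed_code)
    return output
```

Because the non-target branch only copies a preset value, its cost is assumed to be negligible next to the full encoding path, which is what makes the overall load scale roughly with 1/k.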

【０３０６】 The process is described further using the conceptual diagram of FIG. 22, assuming that the number of input audio samples is p = 32 and the unit period determination constant is k = 3. As shown in the figure, when audio data is input, the first unit period t1 is an encoding target period because i = 1 = 0 × 3 + 1 holds (n = 0). The sampled data for this unit period is divided into 32 band signals, quantized, and encoded, and encoded data d1 is output. For the following unit periods t2 and t3, no integer n satisfies i = n × k + 1, so they are judged not to be encoding target periods. The above series of processes is skipped and the fixed encoded data dN is output. As noted above, dN is data set in advance with all 32 band signals equal to zero. Next, unit period t4 is an encoding target period because i = 4 = 1 × 3 + 1 holds (n = 1). Its sampled data is band-divided, quantized, and encoded in the same way as the data of t1, and encoded data d4 is output. The same processing continues thereafter.

【０３０７】 As described above, the speech encoding of Embodiment 9 yields encoded data in which (k − 1) items of zero-output data dN are inserted between the encoded data d1 and d4 derived from the input audio. As explained for the first conventional example, MPEG1 Audio Layer 1 encoding uses 512 input samples around the 32 target samples to divide the signal into 32 bands and output audio data for each band. Therefore, even if the band outputs for a period of 32 samples are encoded as zero and then decoded and reproduced, the sound is not interrupted at that point, because that portion is decoded and reproduced together with the encoded band outputs before and after it. The envelope of the reproduced audio (its variation over time) thus remains continuous, and human hearing does not perceive a large degradation in sound quality.

【０３０８】 As described above, the speech coding apparatus of Embodiment 9 includes the register 902, which stores the unit period determination constant k; the determination control unit 904, which uses k to determine, for each unit period of sampled data, whether that data belongs to an encoding target period; and the fixed code register 910, which stores fixed encoded data used in place of the encoded data that would be obtained by processing data outside the encoding target periods. Band division and the subsequent processing are performed only on the 1/k of the sampled data that belongs to encoding target periods. The remaining sampled data is not processed. Instead, the fixed encoded data dN is output, treating the band division output for that portion as zero. By setting the value of k, the load of each of band division, coded bit allocation, quantization, and encoded data generation can therefore be reduced to 1/k of the original processing amount.

【０３０９】 Consequently, even when real-time encoding that keeps pace with audio input is difficult or impossible with the conventional method, for example because of insufficient CPU performance, setting the constant k reduces the load and makes real-time speech encoding possible. The value of k can be determined in several ways: by assuming a fixed CPU for the apparatus and setting k from its encoding performance; by having the user select the CPU and choosing from values obtained in advance for each CPU, for example by simulation; or by having the CPU run a computation that measures its encoding performance before encoding begins and setting k from the result.
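The third method, measuring the CPU before encoding, is not specified further in the text. One possible way to turn a measurement into a value of k is sketched below. It assumes that the full encoding time for one unit period has been measured beforehand, that the fixed-code path costs nothing, and that the load scales as 1/k. The function name `choose_k` and both parameters are hypothetical.

```python
import math


def choose_k(encode_seconds_per_unit: float, unit_period_seconds: float) -> int:
    """Pick the smallest k such that the average encoding time per unit
    period (encode time / k) does not exceed the duration of one unit period.

    unit_period_seconds would be p / fs for p samples at sampling frequency fs.
    """
    if encode_seconds_per_unit < 0 or unit_period_seconds <= 0:
        raise ValueError("times must be non-negative and the period positive")
    return max(1, math.ceil(encode_seconds_per_unit / unit_period_seconds))
```

For instance, if full encoding of a unit period takes 1.5 times the period's duration, the sketch returns k = 2, the smallest setting under which the average cost fits in real time.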

【０３１０】 As noted earlier, a period of zero band output does not interrupt the sound on reproduction, but the longer this period is, the greater the degradation of the sound. It is therefore desirable to set the unit period determination constant so that the zero-output period is 32 samples, that is, to keep k = 2. If encoding must nevertheless be carried out and the degradation can be tolerated, increasing k allows real-time encoding matched to the capability of the apparatus.

【０３１１】 Embodiment 10. The speech coding apparatus according to Embodiment 10 of the present invention reduces the processing load by omitting part of the arithmetic processing in band division. FIG. 23 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 10. As shown in FIG. 23, the apparatus comprises an audio input unit 1001, an input audio sampling unit 1003, a band division unit 1005, a coded bit allocation unit 1006, a quantization unit 1007, an encoding unit 1008, an encoded data recording unit 1009, and a register 1011. Like Embodiment 9, the apparatus of Embodiment 10 has the hardware configuration shown in FIG. 11.

【０３１２】 In the figure, the register 1011 is realized by the main memory or the external storage device and stores an arithmetic processing determination constant used to control execution of the computation in band division. The band division unit 1005 of Embodiment 10 contains an arithmetic processing stop unit that obtains the value of this constant from the register 1011 and stops part of the computation in band division. The audio input unit 1001, input audio sampling unit 1003, coded bit allocation unit 1006, quantization unit 1007, encoding unit 1008, and encoded data recording unit 1009 are the same as units 901, 903, and 906 to 909 of Embodiment 9, and their description is omitted.

【０３１３】 FIG. 24 is a flowchart of speech encoding by the coding apparatus of Embodiment 10, and FIG. 25 shows the coefficients used in the computation of the basic low-pass filter processing during band division. The operation of the speech coding apparatus according to Embodiment 10 is described below following FIG. 24, with reference to FIGS. 23 and 25.

【０３１４】 Step 1 of the flow in FIG. 24 is executed in the same way as in Embodiment 9: the audio signal input from the audio input unit 1001 is sampled by the input audio sampling unit 1003 to obtain sampled data.

【０３１５】 Next, in step 2, the band division unit 1005 performs band division on the sampled data. As described in Embodiment 9, band division in MPEG1 Audio uses 512 samples around the 32 target samples to divide the signal into 32 bands and obtain audio data for each band, and basic low-pass filter processing is performed for this purpose. This basic low-pass filter processing carries out the computations of equations (1), (2), and (3) below.

【０３１６】[0316]

【数２】 (Equation 2)
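The equation image referenced as [Equation 2] did not survive in this text. As a hedged reconstruction, the three operations can be written in the form commonly used for the MPEG-1 Audio analysis filterbank, which agrees with the description of equations (1) and (2) in the following paragraph. The summation indices j and m and the output symbols Y and S are assumed notation, not taken from the original figure.

```latex
\begin{align}
Z_i &= C_i \, X_i, & i &= 0, \dots, 511 \tag{1}\\
Y_i &= \sum_{j=0}^{7} Z_{i+64j}, & i &= 0, \dots, 63 \tag{2}\\
S_i &= \sum_{m=0}^{63} Y_m \cos\!\left[\frac{(2i+1)(m-16)\pi}{64}\right], & i &= 0, \dots, 31 \tag{3}
\end{align}
```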

【０３１７】 Looking at equation (1), each of the 512 input audio data Xi (i = 0 to 511) is multiplied by a coefficient Ci obtained from a table. The coefficients Ci come from the coefficient table in the MPEG Audio standard, which lists a coefficient for each sample number. Plotted as a graph they appear as in FIG. 25, which shows that the coefficients approach 0 toward both ends. Because equation (1) is a multiplication, the product Zi approaches 0 as Ci approaches 0. Equation (2) merely adds up the Zi obtained in equation (1), so terms whose coefficient Ci is close to 0 contribute little, and there is little need to compute and add Zi for them.

【０３１８】 Accordingly, in equation (1) the computation of Zi can be skipped, and Zi taken as 0, for the portion where Ci is close to 0, that is, where i is close to 0 or 511. In equation (2) the terms corresponding to Zi = 0 are likewise not added. Performing band division in this way sacrifices some accuracy but reduces the amount of computation. As FIG. 25 also shows, Ci varies in units of 32 samples, so it is desirable to choose the cut-off points in units of 32. The section over which the computation is executed can therefore be expressed as i = 32q to i = 32(8 − q) + 255. Here the arithmetic processing determination constant q is an integer satisfying 0 ≤ q ≤ 7, set in advance according to the performance of the apparatus and the like and stored in the register 1011. Restricting the computed section in this way omits q × 1/8 = q/8 of the computation, so the larger q is, the more the computational load is reduced. After band division has been performed in step 2 of the flow in FIG. 24 with part of the computation omitted as described, the resulting band signals are processed further. Step 3 and the subsequent steps are the same as step 3 onward in the first conventional example, and their description is omitted.
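The restricted computation can be sketched as follows. The sketch assumes that equation (2) sums every 64th product Zi into 64 partial sums, as in the MPEG-1 Audio analysis filterbank. The function names are hypothetical, and the coefficient sequence `c` is supplied by the caller rather than taken from the standard's table.

```python
from typing import List, Sequence


def windowed_partial_sums(x: Sequence[float], c: Sequence[float], q: int) -> List[float]:
    """Compute the 64 partial sums, multiplying only for i in [32q, 32(8-q)+255].

    Products outside that section are treated as 0 and are never computed.
    """
    if len(x) != 512 or len(c) != 512:
        raise ValueError("x and c must each hold 512 values")
    if not 0 <= q <= 7:
        raise ValueError("q must satisfy 0 <= q <= 7")
    lo, hi = 32 * q, 32 * (8 - q) + 255  # hi equals 511 - 32q
    y = [0.0] * 64
    for i in range(lo, hi + 1):
        y[i % 64] += c[i] * x[i]  # Z_i contributes to the sum with index i mod 64
    return y


def skipped_fraction(q: int) -> float:
    """Fraction of the 512 multiplications omitted: 64q / 512 = q / 8."""
    return (64 * q) / 512
```

Because 32q products are dropped at each end, 64q of the 512 multiplications are skipped in total, which matches the q/8 figure given in the text.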

【０３１９】 As described above, the speech coding apparatus of Embodiment 10 includes the register 1011, which stores the arithmetic processing determination constant q, and the band division unit 1005 omits the basic low-pass filter computation for some of the samples in band division according to q. Controlling the value of q therefore reduces the computation of the band division unit by about q/8. Consequently, even when real-time encoding that keeps pace with audio input is difficult or impossible with the conventional method, for example because of insufficient CPU performance, setting q reduces the load and makes real-time processing possible. The value of q can be determined in the same way as the unit period determination constant in Embodiment 9.

【０３２０】 Embodiment 11. The speech coding apparatus according to Embodiment 11 of the present invention reduces the processing load by omitting the later-stage processing for some of the band-divided signals. FIG. 26 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 11. As shown in FIG. 26, the apparatus comprises an audio input unit 1101, an input audio sampling unit 1103, a band division unit 1105, a coded bit allocation unit 1106, a quantization unit 1107, an encoding unit 1108, an encoded data recording unit 1109, a register 1112, and a band thinning unit 1118. Like Embodiment 9, the apparatus of Embodiment 11 has the hardware configuration shown in FIG. 11.

【０３２１】 In the figure, the band selection constant register 1112 is realized by the main memory or the external storage device and stores a band selection constant. The band thinning unit 1118 is realized by the CPU, the main memory, and a program, and selectively extracts band signals from those produced by the band division unit 1105, according to the band selection constant stored in the register 1112. The audio input unit 1101, input audio sampling unit 1103, band division unit 1105, coded bit allocation unit 1106, quantization unit 1107, encoding unit 1108, and encoded data recording unit 1109 are the same as units 901, 903, and 905 to 909 of Embodiment 9, and their description is omitted.

【０３２２】 FIG. 27 is a flowchart of speech encoding according to Embodiment 11, and FIG. 28 is a conceptual diagram for explaining that encoding. The encoding operation of Embodiment 11 is described below following the flowchart of FIG. 27, with reference to FIGS. 26 and 28.

【０３２３】 Steps 1 and 2 of the flow in FIG. 27 are executed in the same way as in Embodiment 9. The audio signal input from the audio input unit 1101 is sampled by the input audio sampling unit 1103, and the band division unit 1105 divides the resulting sampled data into M frequency bands, yielding M band signal data.

【０３２４】 In the next step 3, the band thinning unit 1118 obtains the stored band selection constant r from the register 1112, selects band signal data at intervals of r from the M band signal data output by the band division unit 1105, and thereby extracts a total of M/(r + 1) band signal data. The constant r is an integer of 0 or more, set in advance according to the performance of the apparatus and the like and stored in the register 1112. With r = 2, for example, the band signal data marked with circles in FIG. 28 are extracted, skipping two at a time. The band thinning unit 1118 outputs the extracted M/(r + 1) band signal data to the quantization unit 1107.

【０３２５】 In step 4, the coded bit allocation unit 1106 uses the band selection constant r obtained from the register 1112 to determine the number of coded bits only for the bands extracted in step 3. The numbers of coded bits determined for the M/(r + 1) bands are passed from the coded bit allocation unit 1106 to the quantization unit 1107. Step 5 and the subsequent steps are executed on the M/(r + 1) data in the same way as in the first conventional example.

【０３２６】 As described above, the speech coding apparatus of Embodiment 11 includes the register 1112, which stores the band selection constant r, and the band thinning unit 1118. The band thinning unit 1118 extracts M/(r + 1) band signal data from the M band signal data obtained by band division, according to r, and the later-stage processing is performed on the extracted data only. Controlling the value of r therefore reduces the coded bit allocation and quantization processing to about 1/r. This applies when r is 1 or more; with r = 0 the processing load is unchanged.
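The band thinning of step 3 can be sketched with a strided selection. The function name `thin_bands` is hypothetical, and the sketch assumes that selection starts from the first band, which the text does not state (FIG. 28 is not reproduced here).

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def thin_bands(band_signals: Sequence[T], r: int) -> List[T]:
    """Keep one band, skip the next r bands, and repeat.

    r is the band selection constant (an integer of 0 or more).
    With r = 0 every band is kept.
    """
    if r < 0:
        raise ValueError("r must be 0 or more")
    return list(band_signals[:: r + 1])
```

Counting the retained bands suggests that the later stages handle roughly 1/(r + 1) of the original data. When M is not a multiple of r + 1 the count rounds up, so M = 32 and r = 2 gives 11 bands under the assumption above.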

【０３２７】 Consequently, even when real-time encoding that keeps pace with audio input is difficult or impossible with the conventional method, for example because of insufficient CPU performance, setting the band selection constant r reduces the load and makes real-time processing possible. The value of r can be determined in the same way as the unit period determination constant in Embodiment 9.

【０３２８】 Embodiment 12. The speech coding apparatus according to Embodiment 12 of the present invention monitors the amount of input data and can change the control constant accordingly. FIG. 29 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 12. As shown in FIG. 29, the apparatus of Embodiment 12 comprises an audio input unit 1201, a register 1202, an input audio sampling unit 1203, a determination control unit (unit period determination) 1204, a band division unit 1205, a coded bit allocation unit 1206, a quantization unit 1207, an encoding unit 1208, an encoded data recording unit 1209, a fixed code register 1210, an input buffer 1213, and an input buffer monitoring unit 1214. In other words, the input buffer 1213 and the input buffer monitoring unit 1214 are added to the configuration of Embodiment 9. Like Embodiment 9, the apparatus of Embodiment 12 has the hardware configuration shown in FIG. 11.

【０３２９】 In the figure, the input buffer 1213 is realized by memory such as the main memory and stores data temporarily. The input buffer monitoring unit 1214 is realized by the CPU, the main memory, and a program. It checks the amount of data held in the input buffer 1213, compares this amount with a preset value, and changes the value of the constant k in the register 1202 according to the result. The register 1202 is the same as the register 902 of Embodiment 9, except that the stored constant is changed by the input buffer monitoring unit 1214. The input audio sampling unit 1203 is the same as the input audio sampling unit 903 of Embodiment 9, except that it outputs the sampled audio data to the input buffer 1213. The determination control unit 1204 is the same as the determination control unit 904 of Embodiment 9, except that it takes the sampled audio data to be processed from the input buffer 1213. The audio input unit 1201, band division unit 1205, coded bit allocation unit 1206, quantization unit 1207, encoding unit 1208, encoded data recording unit 1209, and fixed code register 1210 are the same as units 901 and 905 to 910 of Embodiment 9.

【０３３０】 FIG. 30 is a flowchart of speech encoding according to Embodiment 12. The encoding operation of Embodiment 12 is described below following the flowchart of FIG. 30, with reference to FIG. 29. It is assumed here that the value "2", determined in advance from the performance of the CPU, is stored in the register 1202 as the initial value of the unit period determination constant k.

【０３３１】 In step 1 of the flow in FIG. 30, the audio signal input from the audio input unit 1201 is sampled by the input audio sampling unit 1203 in the same way as in Embodiment 9. In step 2, the sampled data is written to the input buffer 1213 and stored temporarily. In step 3, the determination control unit 1204 reads the temporarily stored sampled data from the input buffer 1213.

【０３３２】 Step 4, which is performed next, is described later. The determination in step 5 is performed by the determination control unit 1204 on the data read in step 3, in the same way as in Embodiment 9. The subsequent steps up to the encoded data output of step 9, and step 10, are executed in the same way as steps 3 to 6 and step 7 of the flow of FIG. 21 in Embodiment 9, so the description of steps 5 to 9 and step 10 is omitted. Through these steps, with the unit period determination constant k set to 2 here, unit periods t1, t3, ... are treated as encoding targets and undergo band division, coded bit allocation, and quantization, while t2, t4, ... do not, so encoding is performed with a reduced processing load.

【０３３３】 In Embodiment 12, in step 4 after step 3 has been executed, the input buffer monitoring unit 1214 checks the amount of data in the input buffer 1213, compares it with a preset value, and changes the value of the constant k stored in the register 1202 according to the result. Various methods can be used to control k from this data amount. Here it is assumed to be done as follows.

【０３３４】 If encoding can no longer keep pace with the audio input under the initial setting, for example because the load on the CPU has increased, writing to the input buffer 1213 continues at the same pace while reading slows down as the encoding falls behind, so the amount of data accumulated in the buffer grows.

【０３３５】 Accordingly, when the amount of data in the input buffer 1213 exceeds the buffer full level BF, a preset value, the input buffer monitoring unit 1214 judges that real-time encoding is impossible with the current setting and increases the constant k stored in the register 1202 by 1, changing it to k = 3. In steps 5 to 9 of the subsequent flow, band division, coded bit allocation, and quantization are not performed for periods t2 and t3, and the preset encoded data dN is output, treating the band output as 0. With the initial setting k = 2, these processes were omitted one time in two, reducing their load to about 1/2. Increasing k to 3 omits them two times in three, reducing their load further, to about 1/3. By changing k in this way, the input buffer monitoring unit 1214 lightens the load on the CPU, so real-time processing can continue, albeit with reduced sound quality.

【０３３６】 If, as the flow of FIG. 30 repeats, the amount of data in the input buffer 1213 still exceeds the buffer full level BF in step 4, the input buffer monitoring unit 1214 changes the constant k in the register 1202 again, increasing it by 1 to k = 4. In steps 5 to 9, band division, coded bit allocation, and quantization are then not performed for periods t2 to t4, and the preset encoded data dN is output, treating the band output as 0. This reduces the load of band division, coded bit allocation, and quantization to about 1/4. Thereafter, the input buffer monitoring unit 1214 keeps increasing the value of k in the register 1202 until the amount of data in the input buffer at step 4 falls to the buffer full level BF or below.

【０３３７】 Conversely, if the CPU has spare capacity even while encoding keeps pace with the audio input, for example because the load on the CPU has decreased, more data is read from the input buffer 1213, so the accumulated amount falls until only a small amount of data is held for a short time.

[0338] Therefore, when the amount of data in the input buffer 1213 falls below the buffer empty level BE, a preset value, in step 4, the input buffer monitoring unit 1214 determines that there is spare encoding capacity. The smaller the value of the constant k, the shorter the time during which band output 0 is used, and the higher the quality of the encoded data. The input buffer monitoring unit 1214 therefore decreases the value of the constant k by 1 and, as above, continues to decrease the constant k in the register 1202 by 1 in step 4 of each repetition of the flow of FIG. 30 until the amount of data in the input buffer 1213 is at or above the buffer empty level BE.

[0339] In the above method, two values, the buffer full level BF and the buffer empty level BE, are used to control the value of the constant k, but the buffer full level BF alone may be used. In that case, the control can be realized, for example, by increasing the value of the constant k until the amount of data in the input buffer reaches the preset buffer full level BF, and stopping the increase when audio input and encoding are in balance, that is, when the amount of data reaches BF.
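The two-threshold control performed in step 4 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `adjust_k_by_input_buffer`, the lower bound `k_min = 1` (every period processed), and the numeric levels in the trace are all assumptions introduced here.

```python
def adjust_k_by_input_buffer(k, buffered, buffer_full, buffer_empty, k_min=1):
    """Return the next value of k from the amount of data held in the input buffer."""
    if buffered > buffer_full:
        # Encoding is falling behind the input: skip more unit periods.
        return k + 1
    if buffered < buffer_empty and k > k_min:
        # Spare capacity: skip fewer unit periods (k_min is an assumed floor).
        return k - 1
    return k


# Hypothetical trace with BF = 800 and BE = 100 (arbitrary units), starting from k = 2.
k = 2
trace = []
for buffered in [900, 850, 400, 50, 50, 50]:
    k = adjust_k_by_input_buffer(k, buffered, buffer_full=800, buffer_empty=100)
    trace.append(k)
```

In this trace k rises while the buffer stays above BF, holds while the buffer level lies between BE and BF, and falls one step per repetition once the level drops below BE.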

[0340] As described above, the audio encoding apparatus of Embodiment 12 adds the input buffer 1213 and the input buffer monitoring unit 1214 to the audio encoding apparatus of Embodiment 9. Sampling data is temporarily stored in the input buffer 1213 and then read out for subsequent processing, and the input buffer monitoring unit 1214 examines the amount of data in the input buffer 1213 to dynamically control the value of the unit period determination constant k stored in the register. In addition to real-time encoding suited to the basic performance of the CPU, the apparatus can thereby respond to changes in the processing capacity of the CPU and perform the highest-quality audio encoding that is possible at that time.

[0341] Accordingly, even when a general-purpose personal computer or the like is used as the audio encoding apparatus and audio encoding is executed under multitasking, encoding can be executed in real time in response to changes in the processing capacity of the CPU caused by the execution of other tasks.

[0342] Here, the audio encoding apparatus is configured by adding the input buffer 1213 and the input buffer monitoring unit 1214 to the apparatus of Embodiment 9. The input buffer 1213 and the input buffer monitoring unit 1214 may instead be added to the apparatus of Embodiment 10; in that case, controlling the value of the arithmetic processing determination constant q increases or decreases the amount of computation in the basic low-pass filter processing, which makes it possible to reduce the processing load or improve the sound quality of the encoded data.

[0343] Similarly, the input buffer 1213 and the input buffer monitoring unit 1214 may be added to the apparatus of Embodiment 11; in that case, controlling the value of the band selection constant r increases or decreases the number of band signal data items that are selected and extracted, which makes it possible to reduce the processing load or improve the sound quality of the encoded data.

[0344] Embodiment 13. The audio encoding apparatus according to Embodiment 13 of the present invention monitors the amount of encoded data that is output and can change the control constant accordingly. FIG. 31 is a block diagram showing the configuration of the audio encoding apparatus according to Embodiment 13 of the present invention. As shown in the figure, the audio encoding apparatus comprises an audio input unit 1301, a register 1302, an input audio sampling unit 1303, a determination control unit 1304, a band division unit 1305, a coded bit allocation unit 1306, a quantization unit 1307, an encoding unit 1308, an encoded data recording unit 1309, a fixed code register 1310, and an encoded data monitoring unit 1315. This configuration adds the encoded data monitoring unit 1315 to the audio encoding apparatus of Embodiment 9. The apparatus of Embodiment 13 also has the hardware configuration shown in FIG. 11, as in Embodiment 9.

[0345] The encoded data monitoring unit 1315 is implemented by the CPU, the main memory, and a program. It examines the amount of encoded data output per unit time from the encoding unit 1308, compares this amount with a preset value, and changes the value of the constant k in the register 1302 according to the result. The register 1302 is the same as the register 902 of Embodiment 9, except that the stored constant is changed by the encoded data monitoring unit 1315. The audio input unit 1301, the input audio sampling unit 1303, the determination control unit 1304, the band division unit 1305, the coded bit allocation unit 1306, the quantization unit 1307, the encoding unit 1308, the encoded data recording unit 1309, and the fixed code register 1310 are the same as 901 and 903 to 910 of Embodiment 9. That is, whereas the apparatus of Embodiment 12 monitors the amount of input data, the apparatus of Embodiment 13 monitors the amount of encoded data to control the value of the constant stored in the register.

[0346] FIG. 32 is a flowchart of audio encoding according to Embodiment 13. The operation of audio encoding according to Embodiment 13 is described below, following the flowchart of FIG. 32 and with reference to FIG. 31. Here, it is assumed that the value 2, predetermined on the basis of the performance of the CPU, is stored in the register 1302 as the initial value of the unit period determination constant k.

[0347] Steps 1 to 7 of the flow of FIG. 32 are executed in the same manner as in Embodiment 9, so their description is omitted. Through these steps, in accordance with the unit period determination constant k, which is 2 here, the unit periods t1, t3, ... are treated as encoding targets and undergo band division, coded bit allocation, and quantization, while these processes are not performed for t2, t4, ..., so that encoding is executed with a reduced processing load.

[0348] In Embodiment 13, step 8 is executed before the flow returns to step 1 for the next repetition. In step 8, the encoded data monitoring unit 1315 examines the amount of encoded data output per unit time from the encoding unit 1308, compares this amount with a preset value, and changes the value of the constant k in the register 1302 according to the result. Various methods can be used to monitor the amount of encoded data and control the value of the constant k; here, the following method is assumed.

[0349] When encoding can no longer keep up under the initial setting, for example because the load on the CPU has increased, the pace of encoding drops and the amount of encoded data that is output decreases. Therefore, when the amount of encoded data does not reach the minimum encoding level CL, a preset value, in step 8, the encoded data monitoring unit 1315 increases the value of the unit period determination constant k stored in the register 1302, as does the input buffer monitoring unit 1214 of Embodiment 12, so as to reduce the load on the CPU.

[0350] Similarly, when the load on the CPU decreases and capacity becomes available, the amount of encoded data increases up to a certain limit. Therefore, when the amount of encoding processed per unit time is not below the maximum encoding level CH in step 8, the value of the unit period determination constant k in the register 1302 is decreased so that higher-quality encoding can be performed.

[0351] As the flow of FIG. 32 is repeated, the unit period determination constant is increased and decreased repeatedly in this way, as in Embodiment 12, so that the constant k is controlled to an appropriate value. Also as in Embodiment 12, the control can be performed using only the minimum encoding level CL instead of the two values CL and CH.
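The control performed in step 8 differs from that of Embodiment 12 in the direction of the comparisons: a low output rate signals overload, and a high output rate signals spare capacity. A minimal sketch under stated assumptions follows; the function name `adjust_k_by_coded_rate`, the floor `k_min = 1`, and the numeric levels are hypothetical.

```python
def adjust_k_by_coded_rate(k, coded_rate, code_low, code_high, k_min=1):
    """Return the next value of k from the amount of encoded data output per unit time."""
    if coded_rate < code_low:
        # Output has dropped below CL: the CPU cannot keep up, so skip more periods.
        return k + 1
    if coded_rate >= code_high and k > k_min:
        # Output is not below CH: capacity is available, so skip fewer periods.
        return k - 1
    return k


# Hypothetical trace with CL = 100 and CH = 180 (arbitrary units), starting from k = 2.
k = 2
history = []
for rate in [90, 95, 130, 200, 200]:
    k = adjust_k_by_coded_rate(k, rate, code_low=100, code_high=180)
    history.append(k)
```

Setting `code_high` to a value that is never reached gives a variant that only ever increases k, which loosely corresponds to the CL-only control mentioned above.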

[0352] As described above, the audio encoding apparatus of Embodiment 13 adds the encoded data monitoring unit 1315 to the apparatus of Embodiment 9. The encoded data monitoring unit 1315 examines the amount of encoded data output per unit time, uses it as an index of the encoding capacity of the CPU at that time, and dynamically controls the value of the constant k stored in the register 1302 according to the situation, so that the CPU performs the highest-quality audio encoding that is possible at that time. Accordingly, as with the apparatus of Embodiment 12, the apparatus can respond to changes in the processing capacity of the CPU at any given time caused by multitasking or the like.

[0353] As with Embodiment 12, the encoded data monitoring unit 1315 of Embodiment 13 may also be added to the apparatus of Embodiment 10 or the apparatus of Embodiment 11; the same effect is obtained by controlling the value of the arithmetic processing determination constant or the band selection constant.

[0354] Embodiment 14. The audio encoding apparatus according to Embodiment 14 of the present invention uses its method of allocating coded bits to realize control that substitutes for psychoacoustic analysis, improving the reproduced sound quality of the encoded data without greatly increasing the processing load. FIG. 33 is a block diagram showing the configuration of the audio encoding apparatus according to Embodiment 14 of the present invention. As shown in the figure, the audio encoding apparatus according to Embodiment 14 comprises an audio input unit 1401, an input audio sampling unit 1403, a band division unit 1405, a coded bit allocation unit 1406, a quantization unit 1407, an encoding unit 1408, an encoded data recording unit 1409, and a bit allocation control unit (sequential bit allocation) 1416. The apparatus of Embodiment 14 also has the hardware configuration shown in FIG. 11, as in Embodiment 9.

[0355] In the figure, the bit allocation control unit 1416 is implemented by the CPU, the main memory, and a program, and calculates, according to a predetermined algorithm, the number of bits that the coded bit allocation unit 1406 allocates to the M band signal data items obtained by the division performed by the band division unit 1405. The audio input unit 1401, the input audio sampling unit 1403, the band division unit 1405, the coded bit allocation unit 1406, the quantization unit 1407, the encoding unit 1408, and the encoded data recording unit 1409 are the same as 901, 903, and 905 to 909 of Embodiment 9, and their description is omitted.

[0356] FIGS. 34 to 36 are flowcharts showing the operation of audio encoding according to Embodiment 14. The operation of audio encoding according to Embodiment 14 is described below, following the flowcharts of FIGS. 34 to 36 and with reference to FIG. 33. Steps 1 and 2 of the flow of FIG. 34 are executed in the same manner as steps 1 and 2 of FIG. 65 in the second conventional example, yielding band signal data divided into M frequency bands. In accordance with the MPEG audio standard, M = 32 here, as in the second conventional example.

[0357] After this, the second conventional example shown in FIG. 65 applies a fast Fourier transform and psychoacoustic analysis to band signal data divided into L = 256 bands to determine the number of coded bits to allocate. In Embodiment 14, by contrast, in step 3 of the flow of FIG. 34, the coded bit allocation unit 1406 allocates coded bits to the M = 32 band signal data items on the basis of the result calculated by the bit allocation control unit 1416 using the sequential bit allocation method, which serves as the control substituting for psychoacoustic analysis.

[0358] First, the total number of bits to be allocated is obtained from the specified bit rate, which is 256 kbps for MPEG audio Layer 1 and 192 kbps for Layer 2. The total number of bits thus obtained is then allocated by the sequential bit allocation method described below.
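One way to derive a per-frame bit budget from the bit rate is sketched below. The frame lengths of 384 samples (Layer 1) and 1152 samples (Layer 2) follow the MPEG-1 audio standard; the sampling frequency of 48 kHz and the function name `bits_per_frame` are assumptions made for illustration, and the sketch does not subtract the bits consumed by the header, allocation information, or scale factors.

```python
# Samples per frame for MPEG-1 audio Layer 1 and Layer 2.
SAMPLES_PER_FRAME = {1: 384, 2: 1152}


def bits_per_frame(layer: int, bit_rate_bps: int, sampling_rate_hz: int = 48_000) -> int:
    """Gross number of bits available in one frame (overheads not subtracted)."""
    # Frame duration is samples / sampling rate; multiplying by the bit rate gives bits.
    return bit_rate_bps * SAMPLES_PER_FRAME[layer] // sampling_rate_hz


layer1_budget = bits_per_frame(1, 256_000)  # Layer 1 at 256 kbps
layer2_budget = bits_per_frame(2, 192_000)  # Layer 2 at 192 kbps
```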

[0359] FIG. 35 is a flowchart showing the procedure of the sequential bit allocation method. The bit allocation control unit 1416 allocates bits to bands 0 to 10 in steps 101 and 103, and then allocates bits to bands 11 to 22 in steps 105 and 107. That is, bits are allocated twice each to bands 0 to 10 and to bands 11 to 22. Bits are then allocated once more to bands 0 to 10 in step 109 and once more to bands 11 to 22 in step 111, and then once to bands 23 to 31 in step 113. Each of these steps is followed by a determination step (steps 102, 104, 106, 108, 110, 112, and 114). Unless one of the determination steps finds that the total number of bits has been allocated, the allocation is repeated from step 101; the flow ends when one of the determination steps finds that the total number of bits has been allocated.

[0360] FIG. 36 is a set of flowcharts showing the procedure for allocating bits to each band. FIG. 36(a) is executed in steps 101, 103, and 109 of the flow of FIG. 35, FIG. 36(b) in steps 105, 107, and 111, and FIG. 36(c) in step 113.

[0361] In the flow of FIG. 36(a), the variable a is first set to 0 in step 201, and 1 bit is allocated to band 0 in step 202. If step 203 finds that the total number of bits has not yet been allocated, a is set to 0 + 1 = 1 in step 204, the determination in step 205 returns the flow to step 202, and 1 bit is allocated to band 1. This is repeated, and the flow of FIG. 36(a) ends either when step 205 finds that a = 11, that is, when 1 bit has been allocated to each of bands 0 to 10, or when step 203 finds that the total number of bits has been allocated. The flows of FIGS. 36(b) and 36(c) proceed in the same way. When a flow of FIG. 36(a) to (c) ends, processing returns to the originating step of the flow of FIG. 35; if the total number of bits has been allocated, the determination step that immediately follows ends the flow of FIG. 35 as well. Step 3 of the flow of FIG. 34 then ends, and steps 4 and 5 of FIG. 34 are executed in the same manner as in the second conventional example.
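The procedure of FIGS. 35 and 36 can be sketched compactly as follows. This is an illustration rather than the disclosed program: the names `PASS_ORDER` and `sequential_bit_allocation` are hypothetical, "1 bit" is treated as one abstract allocation unit per band, a positive total is assumed, and no per-band upper limit is applied because none is described here.

```python
# Band ranges visited in one cycle of FIG. 35 (steps 101, 103, 105, 107, 109, 111, 113).
PASS_ORDER = [(0, 10), (0, 10), (11, 22), (11, 22), (0, 10), (11, 22), (23, 31)]


def sequential_bit_allocation(total_bits: int, num_bands: int = 32) -> list:
    """Distribute total_bits one at a time over the bands in the fixed pass order."""
    allocation = [0] * num_bands
    remaining = total_bits
    while remaining > 0:
        for first, last in PASS_ORDER:
            # Inner loop corresponds to FIG. 36: one bit per band, lowest band first.
            for band in range(first, last + 1):
                if remaining == 0:
                    return allocation  # total number of bits has been allocated
                allocation[band] += 1
                remaining -= 1
    return allocation


one_cycle = sequential_bit_allocation(78)   # exactly one full cycle of FIG. 35
partial = sequential_bit_allocation(100)    # one cycle plus two extra passes over bands 0 to 10
```

One full cycle consumes 3 × 11 + 3 × 12 + 9 = 78 bits, giving bands 0 to 22 three bits each and bands 23 to 31 one bit each; the allocation depends only on the total and never on the input signal.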

[0362] The sequential bit allocation method of the apparatus of Embodiment 14 allocates bits to bands 0 to 10 and 11 to 22 as described above. As shown in FIG. 66 for the second conventional example, these bands have a low minimum audible limit, that is, they are bands that human hearing picks up easily. The idea is to weight more heavily those bands, among the 32 bands produced by band division, that are easily heard by the human ear, and to allocate bits to the bands in descending order of weight. Bits are therefore allocated sequentially as described above regardless of whether a band has a large sound pressure or almost none, and thus essentially regardless of the input signal.

[0363] As described above, the audio encoding apparatus of Embodiment 14 includes the bit allocation control unit 1416 and allocates bits to the bands sequentially according to a fixed algorithm. Code bit allocation that exploits the characteristics of human hearing can therefore be performed without greatly increasing the processing load, and encoded data with good reproduced sound quality is obtained. That is, the audio encoding apparatus of Embodiment 14 adds the bit allocation control unit 1416 to the apparatus of the first conventional example and applies a simple coded bit allocation method, following the simple algorithm shown in FIGS. 35 and 36, to 32 bands, thereby improving the reproduced sound quality. Its processing load is far smaller than that of the second conventional example, which performs a Fourier transform and psychoacoustic analysis on 256 bands, so that real-time processing and improved sound quality can both be achieved even in audio encoding performed on a general-purpose personal computer, workstation, or the like having the hardware configuration shown in FIG. 21. The sequential bit allocation algorithm shown in FIGS. 35 and 36 is only an example and is not limiting; the same kind of simple sequential allocation can be performed with a different band order or number of allocated bits, and the same effect is obtained.

[0364] Embodiment 15. The audio encoding apparatus according to Embodiment 15 of the present invention uses its method of allocating coded bits to realize control that substitutes for psychoacoustic analysis and, by taking the output level of each band into account, further improves the reproduced sound quality of the encoded data. FIG. 37 is a block diagram showing the configuration of the audio encoding apparatus according to Embodiment 15 of the present invention. As shown in the figure, the audio encoding apparatus according to Embodiment 15 comprises an audio input unit 1501, an input audio sampling unit 1503, a band division unit 1505, a coded bit allocation unit 1506, a quantization unit 1507, an encoding unit 1508, an encoded data recording unit 1509, and a bit allocation control unit (band output adaptive bit allocation) 1517. This configuration is equivalent to that of the apparatus of Embodiment 14. The apparatus of Embodiment 15 also has the hardware configuration shown in FIG. 11, as in Embodiment 9.

[0365] In the figure, the bit allocation control unit 1517 is implemented by the CPU, the main memory, and a program, and calculates, according to a predetermined algorithm, the number of bits that the coded bit allocation unit 1506 allocates to the M band signal data items obtained by the division performed by the band division unit 1505. Except for this function of the bit allocation control unit 1517, Embodiment 15 has the same configuration as Embodiment 14. Accordingly, the audio input unit 1501, the input audio sampling unit 1503, the band division unit 1505, the coded bit allocation unit 1506, the quantization unit 1507, the encoding unit 1508, and the encoded data recording unit 1509 are the same as 901, 903, and 905 to 909 of Embodiment 9, and their description is omitted.

[0366] FIG. 38 is a flowchart of audio encoding according to Embodiment 15. The operation of audio encoding according to Embodiment 15 is described below, following the flowchart of FIG. 38 and with reference to FIG. 37. Steps 1 and 2 of the flow of FIG. 38 are executed in the same manner as steps 1 and 2 of FIG. 65 in the second conventional example, yielding band signal data divided into M frequency bands. In accordance with the MPEG audio standard, M = 32 here, as in the second conventional example.

[0367] After this, in Embodiment 14, the bit allocation control unit 1416 performs the calculation by the sequential bit allocation method in step 3 of the flow of FIG. 34, whereas in Embodiment 15 the bit allocation control unit 1517 determines the bit allocation by the band output adaptive bit allocation method. As described above, Embodiment 14 allocates bits to the 32 frequency bands produced by band division without taking the sound pressure of each band into account. Embodiment 15 instead generates the bit allocation information for each band on the basis of two factors: whether the band is easily heard by the human ear, and the sound pressure of the band.

[0368] First, as in Embodiment 14, the total number of bits to be allocated is obtained from the bit rate, which is 256 kbps for Layer 1. The bit allocation control unit 1517 obtains the "band output adaptive weighting (3)" from the "weighting of bands easily heard by the human ear (1)" and the "output level ratio of each band (2)", and allocates the total number of bits obtained above accordingly.

[0369] (1) Weighting of bands easily heard by the human ear
First, the weighting of bands easily heard by the human ear is determined as follows, on the basis of the minimum audible limit shown in FIG. 66.
(2) Output level ratio of each band
Next, the output level ratio of each band, which expresses the sound pressure of each band as a ratio, is obtained. Here, it is assumed to be as follows.
(3) Band output adaptive weighting
Next, item (1) * item (2) is calculated from items (1) and (2) above, the two factors to be considered. This gives the following result (sb: band).
On the basis of this result, the bit allocation control unit 1517 determines the number of bits to allocate by distributing the total number of bits so that the bit allocation of each band approaches the weighting (3), and the coded bit allocation unit 1506 allocates the coded bits. Steps 4 and 5 of the flow of FIG. 38 are then executed in the same manner as in the second conventional example.
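The combination of items (1) and (2) into the weighting (3), followed by distribution of the total number of bits, might be sketched as below. All numeric values are invented for illustration and are not the values of the embodiment; the function name `band_output_adaptive_allocation` is hypothetical, and the largest-remainder rounding used to make the allocation "approach" the weighting is an assumption, since the rounding rule is not specified in this passage.

```python
def band_output_adaptive_allocation(total_bits, hearing_weights, level_ratios):
    """Split total_bits across bands in proportion to weight (1) * level ratio (2)."""
    # Item (3): per-band product of the two factors.
    combined = [w * r for w, r in zip(hearing_weights, level_ratios)]
    total_weight = sum(combined)
    if total_weight == 0:
        return [0] * len(combined)
    ideal = [total_bits * c / total_weight for c in combined]
    allocation = [int(x) for x in ideal]  # floor, since all values are non-negative
    leftover = total_bits - sum(allocation)
    # Assumed rounding rule: hand leftover bits to the largest fractional parts.
    by_fraction = sorted(range(len(ideal)),
                         key=lambda b: ideal[b] - allocation[b], reverse=True)
    for band in by_fraction[:leftover]:
        allocation[band] += 1
    return allocation


# Hypothetical 4-band example (the embodiment uses 32 bands).
weights = [5, 3, 1, 1]   # item (1), invented values
levels = [1, 4, 2, 0]    # item (2), invented values
exact = band_output_adaptive_allocation(38, weights, levels)
rounded = band_output_adaptive_allocation(10, weights, levels)
```

In this example the products are 5, 12, 2, and 0, so a band with no output receives no bits even though its hearing weight is nonzero, which is the behaviour that distinguishes this method from the input-independent sequential allocation of Embodiment 14.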

[0370] As described above, the audio encoding apparatus of Embodiment 15 includes the bit allocation control unit 1517 and allocates bits to each band on the basis of two factors: whether the band is easily heard by the human ear, and the sound pressure of the band. Code bit allocation that exploits the characteristics of human hearing can therefore be performed without greatly increasing the processing load, and encoded data with good reproduced sound quality is obtained.

[0371] That is, the audio encoding apparatus of Embodiment 15 adds the bit allocation control unit 1517 to the apparatus of the first conventional example and applies a simple coded bit allocation method, using relatively simple arithmetic, to 32 bands, thereby improving the reproduced sound quality. Its processing load is far smaller than that of the second conventional example, which performs a Fourier transform and psychoacoustic analysis on 256 bands, so that real-time processing and improved sound quality can both be achieved even in audio encoding performed on a general-purpose personal computer, workstation, or the like having the hardware configuration shown in FIG. 2. Compared with the apparatus of Embodiment 14, the audio encoding apparatus of Embodiment 15 has a larger processing load, to the extent that it processes the characteristics of the input audio as a factor, but it yields encoded data with correspondingly better reproduced sound quality.

[0372] The calculation method for band output adaptive bit allocation shown in Embodiment 15 is only an example and is not limiting; allocation by similarly simple arithmetic can be performed with different band weightings or different weights for the two factors, and the same effect is obtained.

【０３７３】実施の形態１６．本発明の実施の形態１６
による音声符号化装置は、符号化ビットの割り当て方に
よって、心理聴覚分析代替制御を実現し、各帯域ごとの
出力レベルと、各帯域ごとのビット割り当て数とを考慮
することで、符号化データの再生音質の一層の向上を図
るものである。Embodiment 16. The speech coding apparatus according to Embodiment 16 of the present invention realizes control that substitutes for psychoacoustic analysis through the way coded bits are allocated, and further improves the reproduced sound quality of the encoded data by taking into account both the output level of each band and the number of bits allocated to each band.

【０３７４】図３９は、本発明の実施の形態１６による
音声符号化装置の構成を示すブロック図である。同図に
示すように、本実施の形態１６による音声符号化装置
は、音声入力部１６０１、入力音声サンプリング部１６
０３、帯域分割部１６０５、符号化ビット割り当て部１
６０６、量子化部１６０７、符号化部１６０８、符号化
データ記録部１６０９、およびビット割り当て制御部
（改良型帯域出力適応ビット割当）１６１６から構成さ
れている。これは、実施の形態１４による装置と同等の
構成である。また、本実施の形態１６の装置も実施の形
態９と同様、図１１に示されるハードウェア構成であ
る。FIG. 39 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 16 of the present invention. As shown in the figure, the apparatus comprises a speech input unit 1601, an input speech sampling unit 1603, a band division unit 1605, a coded bit allocation unit 1606, a quantization unit 1607, an encoding unit 1608, an encoded data recording unit 1609, and a bit allocation control unit (improved band-output adaptive bit allocation) 1617. This configuration is equivalent to that of the apparatus of Embodiment 14. As in Embodiment 9, the apparatus of Embodiment 16 has the hardware configuration shown in FIG. 11.

【０３７５】図３９において、ビット割り当て制御部１
６１７は、ＣＰＵ、メインメモリ、およびプログラムで
実現され、帯域分割部１６０５の分割によって得られる
Ｍ個の帯域信号データに対して、符号化ビット割り当て
部１６０６が割り当てるビット数を、所定のアルゴリズ
ムに従って算定する。このようなビット割り当て制御部
１６１７の機能を除いては、本実施の形態１６の装置
は、実施の形態１４、および１５の装置と同様の構成で
ある。従って、音声入力部１６０１、入力音声サンプリ
ング部１６０３、帯域分割部１６０５、符号化ビット割
り当て部１６０６、量子化部１６０７、符号化部１６０
８、および符号化データ記録部１６０９は、実施の形態
９の９０１、９０３、および９０５〜９０９と同様であ
り、説明を省略する。In FIG. 39, the bit allocation control unit 1617 is realized by a CPU, a main memory, and a program. It calculates, according to a predetermined algorithm, the number of bits that the coded bit allocation unit 1606 allocates to each of the M band signal data obtained by the division performed by the band division unit 1605. Except for this function of the bit allocation control unit 1617, the apparatus of Embodiment 16 has the same configuration as the apparatuses of Embodiments 14 and 15. Accordingly, the speech input unit 1601, input speech sampling unit 1603, band division unit 1605, coded bit allocation unit 1606, quantization unit 1607, encoding unit 1608, and encoded data recording unit 1609 are the same as units 901, 903, and 905 to 909 of Embodiment 9, and their description is omitted.

【０３７６】図４０〜図４１は本実施の形態１６による
音声符号化のフローチャート図である。以下に本実施の
形態１６による音声符号化の際の動作を、図４０〜図４
１のフローチャートに従って、図３９を参照しながら説
明する。FIGS. 40 and 41 are flowcharts of speech coding according to Embodiment 16. The operation of the apparatus during speech coding is described below following the flowcharts of FIGS. 40 and 41, with reference to FIG. 39.

【０３７７】図４０のフローのステップ１〜２は第２の
従来例における図６５のステップ１〜２と同様に実行さ
れ、Ｍ個の周波数帯域に分割された帯域信号データが得
られる。ＭＰＥＧオーディオの規格に従って、第２の従
来例と同様にＭ＝３２個であったとする。Steps 1 and 2 of the flow of FIG. 40 are carried out in the same way as steps 1 and 2 of FIG. 65 in the second conventional example, yielding band signal data divided into M frequency bands. In accordance with the MPEG Audio standard, M = 32 is assumed here, as in the second conventional example.

【０３７８】この後、実施の形態１５では、図３８のフ
ローのステップ３において、ビット割り当て制御部１５
１７が帯域出力適応ビット割り当て方式により算定する
が、本実施の形態１６ではビット割り当て制御部１６１
７が改良型帯域出力適応ビット割り当て方式によりビッ
ト割り当てを決定する。帯域分割で３２の帯域に分割さ
れた周波数帯域に対して、実施の形態１４では、その帯
域に割り当てられるビット数を考慮に入れず、人間の耳
に聞こえやすい帯域を優先してビット割り当てを行うも
のであり、また、実施の形態１５では、人間の耳に聞こ
えやすい帯域であるかどうかと、各帯域の有する音圧と
の２つの要因に基づいて各帯域に対するビット割り当て
を行うものであった。そして、本実施の形態１６では、
さらに実施の形態１５における２つの要因に加えて、各
帯域へのビット割り当て数が十分であるかどうかという
要因を考慮し、３つの要因に基づいて各帯域に対するビ
ット割り当て情報を生成するものである。At this point in Embodiment 15, the bit allocation control unit 1517 calculates the allocation by the band-output adaptive bit allocation scheme in step 3 of the flow of FIG. 38. In Embodiment 16, the bit allocation control unit 1617 instead determines the bit allocation by the improved band-output adaptive bit allocation scheme. For the 32 frequency bands produced by band division, Embodiment 14 allocates bits with priority given to bands that are easily audible to the human ear, without considering the number of bits already allocated to each band. Embodiment 15 allocates bits to each band based on two factors: whether the band is easily audible to the human ear, and the sound pressure of the band. Embodiment 16 adds to these two factors a third, namely whether the number of bits allocated to each band is already sufficient, and generates the bit allocation information for each band based on all three factors.

【０３７９】以下に本実施の形態１６で用いるビット割
当の方法について説明する。まず、レイヤ１なら２５６
ｋｂｐｓというビットレートから、割り当てるべき総ビ
ット数が求められることは実施の形態１４、および１５
と同様である。The bit allocation method used in Embodiment 16 is described below. First, as in Embodiments 14 and 15, the total number of bits to be allocated is obtained from the bit rate, for example 256 kbps for Layer 1.
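As a rough illustration of how a total bit budget follows from the bit rate, the sketch below computes the gross number of bits in one MPEG-1 Audio Layer I frame, which carries 384 samples (12 samples in each of 32 subbands). The 48 kHz sampling rate in the usage note is an assumption, since this passage does not state one, and the function name is hypothetical. The bits actually available for sample data are fewer, because the header, the allocation field, and the scale factors also consume part of the frame.

```python
SAMPLES_PER_LAYER1_FRAME = 384  # 12 samples x 32 subbands per Layer I frame


def gross_bits_per_frame(bit_rate_bps: int, sampling_rate_hz: int) -> int:
    """Return the gross number of bits in one Layer I frame.

    This is the upper bound on the budget; header and side information
    must be subtracted before bits are handed out to the bands.
    """
    return bit_rate_bps * SAMPLES_PER_LAYER1_FRAME // sampling_rate_hz
```

For example, `gross_bits_per_frame(256_000, 48_000)` gives 2048 bits per frame under the assumed 48 kHz sampling rate.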

【０３８０】ビット割り当て制御部１６１７は、「各帯
域出力レベル(1) 」と、「人間の耳に聞こえやすい帯域
への重み付け(2) 」と、「各帯域ごとのビット割り当て
数に対応した重み付け(3) 」とに基づいて上記求めた総
ビット数を割り当てる。The bit allocation control unit 1617 allocates the total number of bits obtained above based on "(1) the output level of each band", "(2) weighting toward bands easily audible to the human ear", and "(3) weighting corresponding to the number of bits allocated to each band".

【０３８１】(1) 各帯域出力レベルまず、各帯域の有する音圧からスケールファクタを求め
る。なお、スケールファクタは、０から６２の間の値を
取り、値の小さいものほど音圧は大きい。ここでは、次
のようになっていたものとする。 (2) 人間の耳に聞こえやすい帯域への重み付け次に、「人間の耳に聞こえやすい帯域への重み付け」と
しては、図６６に示される最小可聴限界に基づいて次の
ように決定する。なお、上記のように各帯域出力レベル
をスケールファクタとして表しているため、実施の形態
１４で示した重み付け値とは、意味が逆転しているもの
となる。 (3) 各帯域ごとのビット割り当て数に対応した重み付け次に、「各帯域ごとのビット割り当て数に対応した重み
付け」としては、図６６に示す最小可聴限界に基づき、
さらに同一の帯域に必要以上のビット割り当てをしない
ように考慮して作成した下記の表に従って決定する。(1) Output level of each band. First, a scale factor is obtained from the sound pressure of each band. The scale factor takes a value from 0 to 62, and a smaller value indicates a higher sound pressure. The values are assumed here to be as follows. (2) Weighting toward bands easily audible to the human ear. Next, this weighting is determined as follows based on the minimum audible limit shown in FIG. 66. Because the output level of each band is expressed as a scale factor, as noted above, the sense of these weights is reversed relative to the weighting values shown in Embodiment 14. (3) Weighting corresponding to the number of bits allocated to each band. Next, this weighting is determined according to the table below, which was prepared based on the minimum audible limit shown in FIG. 66 and with the further aim of not allocating more bits than necessary to the same band.

【０３８２】[0382]

【表１４】 [Table 14]

【０３８３】以上の(1) から(3) の要因があることを前
提に、次に、考慮すべき三つの要因である上記項目(1)
、(2) 、および(3) のうち、初めに項目(1) ＋項目(2)
を計算する。これにより次のビット割り当て情報係数
が得られる。（sb：帯域）ビット割り当て制御部１６１７は、この結果の中から最
小の値を持つ帯域を検出し、その帯域に符号化ビット割
り当て部６が符号化ビット割り当てを１ビット行う。な
お、最小の値を持つ帯域が複数個ある場合には低い帯域
を優先して行う。その後、ビット割り当て制御部１６１
７は、ビット割り当てが行われた帯域に対し、項目(3)
からビット数に対応した重み付け（＋項目(3) ）を行
い、次のような結果を得る。Given factors (1) to (3) above, item (1) + item (2) is calculated first. This yields the following bit allocation information coefficients (sb: band). The bit allocation control unit 1617 finds the band with the smallest value among these results, and the coded bit allocation unit 1606 allocates one coded bit to that band. If several bands share the smallest value, the lowest band takes priority. The bit allocation control unit 1617 then applies to the band that received the bit the weighting from item (3) corresponding to its number of bits (+ item (3)), giving the following result.

【０３８４】ビット割当部１６１７による、以上の動作は割り当て可
能な総ビット数がなくなるまで繰り返われ、符号化ビッ
ト割り当てがなされるものであるが、以下にこのように
ビット割当を行う、ビット割当部１６１７による、図４
０のフローのステップ３での動作について、図４１を用
いて説明する。[0384] The bit allocation control unit 1617 repeats the above operation until no allocatable bits remain, and coded-bit allocation is thereby completed. The operation of the bit allocation control unit 1617 in step 3 of the flow of FIG. 40, in which this bit allocation is performed, is described below with reference to FIG. 41.

【０３８５】図４１は改良型帯域出力適応ビット割り当
て方式の手順を示すフローチャートである。ステップ１
０１でビット割当部１６１７は、上記のように各帯域の
スケールファクタを算出する。これは、上記(1) に相当
する処理である。次のステップ１０２では、最小可聴限
界に基づいた重み付け処理によって、各帯域へのビット
割り当て情報係数を算出する。上記(2) に相当する処理
である。FIG. 41 is a flowchart showing the procedure of the improved band-output adaptive bit allocation scheme. In step 101, the bit allocation control unit 1617 calculates the scale factor of each band as described above. This corresponds to item (1). In the next step 102, the bit allocation information coefficient for each band is calculated by a weighting process based on the minimum audible limit. This corresponds to item (2).

【０３８６】それから、ステップ１０３で最小のビット
割り当て情報係数を持つ帯域を検出し、ステップ１０４
でその帯域に符号化ビットを１つ割り当てる。すなわ
ち、上記の項目(1) ＋(2) がなされたこととなる。Then, in step 103, the band with the smallest bit allocation information coefficient is detected, and in step 104 one coded bit is allocated to that band. In other words, the calculation of item (1) + item (2) described above has been carried out.

【０３８７】次のステップ１０５において、上記の項目
(3) に相当する重み付けがなされ、ステップ１０４でビ
ット数を割り当てた帯域に対して、その帯域に現在割り
当てられているビット数に対応して、（表１４）から得
られる重み付け係数を加算する。ステップ１０３〜１０
５の動作はステップ１０６において総ビット数の割り当
てが終了したと判断されるまで繰り返され、終了したと
判断されたときは図４１のフローが終了する。そして図
４０のフローのステップ３が終了し、同図ステップ４お
よび５は、第２の従来例と同様に実行される。In the next step 105, the weighting corresponding to item (3) is applied: for the band that received a bit in step 104, the weighting coefficient obtained from Table 14 for the number of bits currently allocated to that band is added. Steps 103 to 105 are repeated until step 106 determines that the total number of bits has been allocated, at which point the flow of FIG. 41 ends. Step 3 of the flow of FIG. 40 is then complete, and steps 4 and 5 of that figure are carried out as in the second conventional example.

【０３８８】このように、本実施の形態１６の音声符号
化装置においては、ビット割り当て制御部１６１７を備
えたことで、人間の耳に聞こえやすい帯域であるかどう
かと、各帯域の有する音圧と、同帯域への必要以上のビ
ット割り当てを避けることとの３つの要因に基づいて各
帯域にビットを割り当てるので、処理負担を大きく増大
することなく、人間の聴覚特性を活かした符号ビット割
り当てが実行でき、再生音質の良好な符号化データが得
られる。As described above, the speech coding apparatus of Embodiment 16 includes the bit allocation control unit 1617 and allocates bits to each band based on three factors: whether the band is easily audible to the human ear, the sound pressure of the band, and the avoidance of allocating more bits than necessary to the same band. Coded-bit allocation that exploits human auditory characteristics can therefore be performed without greatly increasing the processing load, and encoded data with good reproduced sound quality is obtained.

【０３８９】すなわち本実施の形態１６の音声符号化装
置は第１の従来例による装置にビット割り当て制御部１
６１７を追加した構成により、比較的単純な演算処理を
用いる簡単な符号化ビット割り当て方法を３２の帯域に
対して実行することにより、再生音質の向上を図り得る
ものであり、２５６の帯域に対してフーリエ変換と心理
聴覚分析とを行う第２の従来例と比較すると、はるかに
処理負担は小さく、第２図のハードウェア構成で示され
る汎用パーソナルコンピュータやワークステーション等
で行う音声符号化においても実時間処理と音質向上との
両立が可能なものである。また、本実施の形態１６によ
る音声符号化装置は、実施の形態１５による装置と比較
して、各帯域のビット割り当て状況の監視を行う重み付
けを考慮して処理する分だけ処理負担が大きくなるが、
それだけ再生音質の良好な符号化データが得られる。That is, the speech coding apparatus of Embodiment 16 adds the bit allocation control unit 1617 to the apparatus of the first conventional example and applies a simple coded-bit allocation method, using relatively simple arithmetic, to 32 bands, thereby improving reproduced sound quality. Compared with the second conventional example, which performs a Fourier transform and psychoacoustic analysis on 256 bands, the processing load is far smaller, so real-time processing and improved sound quality can both be achieved even when speech coding is performed on a general-purpose personal computer, workstation, or the like having the hardware configuration shown in FIG. 2. Compared with the apparatus of Embodiment 15, the apparatus of Embodiment 16 bears a larger processing load because it also applies a weighting that monitors the bit allocation status of each band, but it yields encoded data with correspondingly better reproduced sound quality.

【０３９０】なお、改良型帯域出力適応ビット割り当て
方式については、本実施の形態１６に示した算定方法は
一例であり、これに限定されるものではなく、帯域に対
する重みづけや、各帯域のビット割り当て数に対する重
みづけを変更しても、さらには、スケールファクタ値を
用いず各帯域出力レベル比を用いても同様の単純な演算
処理による割り当てを実行することは可能であり、同様
の効果が得られる。The calculation method for the improved band-output adaptive bit allocation scheme shown in Embodiment 16 is only an example, and the invention is not limited to it. Even if the weighting of the bands or the weighting for the number of bits allocated to each band is changed, or even if the output level ratio of each band is used instead of the scale factor values, allocation can still be performed with similarly simple arithmetic, and the same effect is obtained.

【０３９１】実施の形態１７．本発明の実施の形態１７
による音声符号化装置は、最小可聴限界を考慮した符号
化ビットの割り当て方によって、心理聴覚分析代替制御
を実現し、符号化データの再生音質の向上を図るもので
ある。図４２は、本発明の実施の形態１７による音声符
号化装置の構成を示すブロック図である。同図に示すよ
うに、本実施の形態１７による音声符号化装置は、音声
入力部１７０１、入力音声サンプリング部１７０３、帯
域分割部１７０５、符号化ビット割り当て部１７０６、
量子化部１７０７、符号化部１７０８、符号化データ記
録部１７０９、ビット割り当て制御部（動的ビット割
当）１７１７、および最小可聴限界比較部１７１８から
構成されている。これは、実施の形態１４による装置と
同等の構成である。また、本実施の形態１７の装置も実
施の形態９と同様、図１１に示されるハードウェア構成
である。Embodiment 17. The speech coding apparatus according to Embodiment 17 of the present invention realizes control that substitutes for psychoacoustic analysis by allocating coded bits with the minimum audible limit taken into account, and thereby improves the reproduced sound quality of the encoded data. FIG. 42 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 17. As shown in the figure, the apparatus comprises a speech input unit 1701, an input speech sampling unit 1703, a band division unit 1705, a coded bit allocation unit 1706, a quantization unit 1707, an encoding unit 1708, an encoded data recording unit 1709, a bit allocation control unit (dynamic bit allocation) 1717, and a minimum audible limit comparison unit 1718. This configuration is equivalent to that of the apparatus of Embodiment 14. As in Embodiment 9, the apparatus of Embodiment 17 has the hardware configuration shown in FIG. 11.

【０３９２】同図において、最小可聴限界比較部１７１
８は、ＣＰＵ、メインメモリ、およびプログラムで実現
され、帯域分割部１７０５の分割によって得られるＭ個
の帯域信号データに対して、最小可聴限界との比較を行
い、最小可聴限界未満の帯域を検出する。符号化ビット
割り当て部１７０６は、最小可聴限界比較部１７１８が
検出した帯域に対しては符号化ビットを割り当てない。
このようなビット割り当て制御部１５１７の機能を除い
ては、本実施の形態１５は実施の形態１４と同様の構成
である。従って、音声入力部１７０１、入力音声サンプ
リング部１７０３、帯域分割部１７０５、符号化ビット
割り当て部１７０６、量子化部１７０７、符号化部１７
０８、および符号化データ記録部１７０９は、実施の形
態９の９０１、９０３、および９０５〜９０９と同様で
あり、説明を省略する。In the figure, the minimum audible limit comparison unit 1718 is realized by a CPU, a main memory, and a program. It compares each of the M band signal data obtained by the division performed by the band division unit 1705 with the minimum audible limit and detects the bands that fall below it. The coded bit allocation unit 1706 allocates no coded bits to the bands detected by the minimum audible limit comparison unit 1718. Except for this function of the minimum audible limit comparison unit 1718, Embodiment 17 has the same configuration as Embodiment 14. Accordingly, the speech input unit 1701, input speech sampling unit 1703, band division unit 1705, coded bit allocation unit 1706, quantization unit 1707, encoding unit 1708, and encoded data recording unit 1709 are the same as units 901, 903, and 905 to 909 of Embodiment 9, and their description is omitted.

【０３９３】図４３は本実施の形態１７による音声符号
化のフローチャート図である。以下に本実施の形態１７
による音声符号化の際の動作を、図４３のフローチャー
トに従って、図４２を参照しながら説明する。本実施の
形態１７による装置では、音声符号化に先立って最小可
聴限界比較部１７１８はメモリ等を内部記憶手段として
用いて、Ｍ帯域（ここでは３２帯域）に対する最小可聴
限界をテーブル（表）として記憶しておくものである。
このテーブルについては、図６６に示すグラフや、この
ようなグラフを数表化したものより読みとって記憶して
おくこととする。FIG. 43 is a flowchart of speech coding according to Embodiment 17. The operation of the apparatus during speech coding is described below following the flowchart of FIG. 43, with reference to FIG. 42. In the apparatus of Embodiment 17, before speech coding begins, the minimum audible limit comparison unit 1718 stores the minimum audible limit for each of the M bands (32 bands here) as a table, using a memory or the like as internal storage. The table values are read from the graph shown in FIG. 66, or from a numerical table derived from such a graph, and stored.

【０３９４】図４３のフローのステップ１〜２は第２の
従来例における図６５のステップ１〜２と同様に実行さ
れ、Ｍ個の周波数帯域に分割された帯域信号データが得
られる。ＭＰＥＧオーディオの規格に従って、第２の従
来例と同様にＭ＝３２個であったとする。Steps 1 and 2 of the flow of FIG. 43 are carried out in the same way as steps 1 and 2 of FIG. 65 in the second conventional example, yielding band signal data divided into M frequency bands. In accordance with the MPEG Audio standard, M = 32 is assumed here, as in the second conventional example.

【０３９５】この後、図６５に示す第２の従来例では、
Ｌ＝２５６個に分割された帯域信号データに対して高速
フーリエ変換と、最小可聴限界との比較を含む心理聴覚
分析とを行って、符号化ビット割り当て数を決定する
が、本実施の形態１７では、図４３のフローのステップ
３において、最小可聴限界比較部１７１８が、Ｍ＝３２
個の帯域に対して最小可聴限界との比較をし、その比較
の結果に基づいて符号化ビット割り当て部１７０６がＭ
＝３２個の帯域信号データに対する符号化ビットの割り
当てを行う。At this point, the second conventional example shown in FIG. 65 applies a fast Fourier transform and a psychoacoustic analysis, including comparison with the minimum audible limit, to band signal data divided into L = 256 bands in order to determine the number of coded bits to allocate. In Embodiment 17, by contrast, in step 3 of the flow of FIG. 43 the minimum audible limit comparison unit 1718 compares the M = 32 bands with the minimum audible limit, and the coded bit allocation unit 1706 allocates coded bits to the M = 32 band signal data based on the result of that comparison.

【０３９６】そして、ステップ３での比較では、帯域分
割部１７０５が分割によって得たＭ＝３２帯域の帯域信
号データについて、最小可聴限界比較部１７１８が、上
述のように予め記憶したテーブルの対応する帯域の最小
可聴限界との比較を行い、最小可聴限界に満たない帯域
を抽出して、その結果を符号化ビット割り当て部１７０
６に出力する。In the comparison of step 3, the minimum audible limit comparison unit 1718 compares the band signal data of each of the M = 32 bands obtained by the band division unit 1705 with the minimum audible limit of the corresponding band in the table stored in advance as described above, extracts the bands that fall below the minimum audible limit, and outputs the result to the coded bit allocation unit 1706.

【０３９７】そして、ステップ４において符号化ビット
割り当て部１７０６は上記出力された比較の結果を用い
て、最小可聴限界未満の帯域にはビット割り当てを行わ
ず、その分を最小可聴限界以上の他の帯域に多くのビッ
トを割り当てるように、ビット割り当てを実行する。同
図ステップ４および５は、第２の従来例と同様に実行さ
れる。Then, in step 4, the coded bit allocation unit 1706 uses the comparison result output above to perform bit allocation such that no bits are allocated to bands below the minimum audible limit and the bits saved are given to the other bands, which are at or above the minimum audible limit. Steps 4 and 5 of the figure are carried out as in the second conventional example.
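The comparison of step 3 and the allocation of step 4 can be sketched as follows. The text states only that bands below the minimum audible limit receive no bits and that the saved bits go to the remaining bands. It does not say how the bits are divided among those bands, so the even split used here is an assumption made purely for illustration, as are the function names and the use of decibel levels.

```python
def audible_bands(band_levels_db, limits_db):
    """Step 3: flag each band whose level reaches its minimum audible limit."""
    return [level >= limit for level, limit in zip(band_levels_db, limits_db)]


def allocate_to_audible(band_levels_db, limits_db, total_bits):
    """Step 4: give no bits to inaudible bands, share the rest evenly.

    The even split is an illustrative assumption; any allocation rule
    could be applied to the bands that pass the comparison.
    """
    flags = audible_bands(band_levels_db, limits_db)
    bits = [0] * len(flags)
    candidates = [sb for sb, audible in enumerate(flags) if audible]
    if not candidates:
        return bits
    share, extra = divmod(total_bits, len(candidates))
    for position, sb in enumerate(candidates):
        # Lower bands receive the leftover bits first.
        bits[sb] = share + (1 if position < extra else 0)
    return bits
```

For example, with levels `[40, 5, 30, -3]`, limits `[10, 10, 20, 0]`, and 7 bits, bands 1 and 3 are skipped and the call returns `[4, 0, 3, 0]`.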

【０３９８】このように、本実施の形態１７の音声符号
化装置においては、最小可聴限界比較部１７１８を備
え、帯域分割部１７０５が分割して得られたＭ個の帯域
に対して、予め記憶した最小可聴限界との比較を行うこ
とで、最小可聴限界未満の帯域を検出し、符号化ビット
割り当て部１７０６は、上記検出した帯域に符号化ビッ
トを割り当てないので、処理負担を大きく増大すること
なく、人間の聴覚特性を活かした符号ビット割り当てが
実行でき、再生音質の良好な符号化データが得られる。As described above, the speech coding apparatus of Embodiment 17 includes the minimum audible limit comparison unit 1718, which compares the M bands obtained by the band division unit 1705 with the minimum audible limit stored in advance and thereby detects the bands below that limit. Because the coded bit allocation unit 1706 allocates no coded bits to the detected bands, coded-bit allocation that exploits human auditory characteristics can be performed without greatly increasing the processing load, and encoded data with good reproduced sound quality is obtained.

【０３９９】すなわち本実施の形態１７の音声符号化装
置は第１の従来例による装置に最小可聴限界比較部１７
１８を追加した構成により、先に分割して得られたＭ＝
３２個の帯域信号と最小可聴限界との比較を行うものな
ので、ＭＰＥＧ１Ａｕｄｉｏにおいて第２の従来例にお
ける最小可聴限界適用との比較では、２５６帯域へのＦ
ＦＴが不要となり、また帯域信号の演算や比較が３２／
２５６＝１／８に削減でき、大幅な処理負荷軽減を図る
ことが可能となる。従って、図２のハードウェア構成で
示される汎用パーソナルコンピュータやワークステーシ
ョン等で行う音声符号化においても実時間処理と音質向
上との両立が可能なものである。That is, the speech coding apparatus of Embodiment 17 adds the minimum audible limit comparison unit 1718 to the apparatus of the first conventional example and compares the M = 32 band signals already obtained by band division with the minimum audible limit. Compared with the way the minimum audible limit is applied in the second conventional example for MPEG-1 Audio, the FFT over 256 bands becomes unnecessary, and the number of band signal calculations and comparisons is reduced to 32/256 = 1/8, so the processing load can be cut substantially. Real-time processing and improved sound quality can therefore both be achieved even when speech coding is performed on a general-purpose personal computer, workstation, or the like having the hardware configuration shown in FIG. 2.

【０４００】なお、音声符号化に先立った最小可聴限界
テーブルの記憶については、図６６のグラフまたは数表
を読み込ませるものとしたが、他に、規格書（ISO/IEC1
1172-3）の表D.1 に従って、各帯域の最小可聴限界を求
め、これを記憶しておくこととしても良い。この表で
は、INDEX と最小可聴限界とが対照されたものとなって
いるので、Ｍ＝３２帯域であれば、３２帯域のそれぞれ
の中心周波数に近いINDEX の値を用いて、表より最小可
聴限界を求めることができる。[0400] The minimum audible limit table stored before speech coding has been described as being read from the graph or numerical table of FIG. 66. Alternatively, the minimum audible limit of each band may be obtained from Table D.1 of the standard (ISO/IEC 11172-3) and stored. That table pairs each INDEX with a minimum audible limit, so for M = 32 bands the minimum audible limit can be obtained from the table by using the INDEX value closest to the center frequency of each of the 32 bands.
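The nearest-entry lookup described above can be sketched as follows. The sketch assumes the table is supplied as (frequency in Hz, limit in dB) pairs and that the 32 bands are equal-width subbands spanning 0 Hz to half the sampling rate. The entries in the usage note are placeholders, not values from Table D.1, and the function names and the 48 kHz rate are likewise assumptions.

```python
def band_center_frequencies(sampling_rate_hz, num_bands=32):
    """Center frequency of each equal-width band between 0 and fs/2."""
    width = sampling_rate_hz / 2 / num_bands
    return [(sb + 0.5) * width for sb in range(num_bands)]


def limits_for_bands(table, sampling_rate_hz, num_bands=32):
    """Pick, for each band, the table entry nearest its center frequency.

    table -- list of (frequency_hz, limit_db) pairs
    """
    limits = []
    for center in band_center_frequencies(sampling_rate_hz, num_bands):
        # Nearest entry by absolute frequency distance.
        _, limit = min(table, key=lambda entry: abs(entry[0] - center))
        limits.append(limit)
    return limits
```

With the placeholder table `[(100, 20.0), (1000, 5.0), (20000, 40.0)]` and a 48 kHz sampling rate, band 0 (center 375 Hz) takes 20.0, band 1 (center 1125 Hz) takes 5.0, and band 31 (center 23625 Hz) takes 40.0. Because the lookup runs once before coding starts, it adds nothing to the per-frame load.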

【０４０１】以上本発明の実施の形態として、実施の形
態９〜１７を示したが、実施の形態９〜１３による符号
化においては、実質的にオーディオデータの帯域信号デ
ータレベルでの間引きや、フィルタ特性の低下を行うた
め、それに伴い音質が劣化することにはなる。しかしそ
の場合でも、性能の低いＣＰＵによっても、ハードウェ
ア的追加等を要せずソフトウェア的に、ＭＰＥＧＡｕｄ
ｉｏなどの帯域分割符号化データを実時間で作成でき、
これを、動画符号化の国際標準として広く用いられるＭ
ＰＥＧデータとして利用することが可能となる。また、
変数定数の値を調整することで、ＣＰＵの符号化処理性
能にあわせて、間引き具合やフィルタ特性を制御できる
ため、高性能なＣＰＵのみならず性能が不十分なＣＰＵ
でもその符号化処理能力なりの音質で符号化することが
出来、幅広い性能レベルのＣＰＵで符号化処理が実現で
きる。Embodiments 9 to 17 have been described above as embodiments of the present invention. In the coding of Embodiments 9 to 13, the audio data are in effect thinned out at the band signal data level, or the filter characteristics are lowered, so the sound quality deteriorates accordingly. Even so, band-division encoded data such as MPEG Audio can be produced in real time in software, without additional hardware, even on a low-performance CPU, and can be used as MPEG data, which is widely used as an international standard for video coding. Furthermore, because the degree of thinning and the filter characteristics can be controlled to match the CPU's coding performance by adjusting the values of the variable constants, coding can be performed at a sound quality commensurate with the available processing capability not only on a high-performance CPU but also on a CPU with insufficient performance, so the coding process can be realized on CPUs across a wide range of performance levels.

【０４０２】また、実施の形態１４〜１７による符号化
においては、第２の従来例における心理聴覚分析を行う
場合ほどの音質向上の効果は得られない。しかし、心理
聴覚分析を全く行わない第１の従来例による音声符号化
よりは音質を向上でき、汎用パーソナルコンピュータや
ワークステーション等の機器においても、ハードウェア
的追加等を要せずソフトウェア的に、音声取り込みにと
もなった実時間符号化を実行しつつ、再生音質の向上を
図ることが可能である。The coding of Embodiments 14 to 17 does not improve sound quality as much as the psychoacoustic analysis of the second conventional example. It does, however, give better sound quality than the speech coding of the first conventional example, which performs no psychoacoustic analysis at all, and it makes it possible, even on equipment such as general-purpose personal computers and workstations, to improve reproduced sound quality in software, without additional hardware, while performing real-time coding as the audio is captured.

【０４０３】但し、実施の形態９〜１７のいずれにおい
ても、ハードウェア面に関しては、ＣＰＵが高性能であ
るほど、またサウンドボードの機能や装置内でのデータ
伝送速度が高いほど、高品質な符号化が可能である。ま
た、実施の形態９〜１７の音声符号化は、音声符号化制
御プログラムとして記録媒体に記録し、パーソナルコン
ピュータ、ワークステーションその他の装置において実
行することが可能である。In any of Embodiments 9 to 17, however, as far as hardware is concerned, the higher the performance of the CPU, the better the functions of the sound board, and the higher the data transfer rate within the apparatus, the higher the quality of coding that can be achieved. The speech coding of Embodiments 9 to 17 can also be recorded on a recording medium as a speech coding control program and executed on a personal computer, workstation, or other apparatus.

【０４０４】また、実施の形態９〜１７では、符号化デ
ータを記憶装置に保存することとしたが、ネットワーク
等を介して他の機器に伝達し、他の機器において記録ま
たは利用することも可能である。また、実施の形態９〜
１７では、ＣＰＵ制御によるソフトウェア処理で実現す
るものとして説明したが、ＣＰＵの代わりにＤＳＰが制
御するソフトウェア処理によっても、同様である。In Embodiments 9 to 17, the encoded data are stored in a storage device, but they may instead be transmitted to another device over a network or the like and recorded or used there. Embodiments 9 to 17 have also been described as being realized by software processing under CPU control, but the same applies to software processing controlled by a DSP instead of a CPU.

【０４０５】実施の形態１８．本発明の実施の形態１８
による映像音声符号化装置は、汎用計算機等において映
像音声の符号化処理をソフトウェア処理によって行う場
合に、当該計算機等において負担増大があった場合に
も、音声の途切れを防ぐことを、音声データの蓄積量を
指標として、映像情報の符号化処理停止を行うものであ
る。Embodiment 18. The video/audio coding apparatus according to Embodiment 18 of the present invention is intended for the case where video and audio are encoded by software processing on a general-purpose computer or the like. To prevent interruptions in the audio even when the load on the computer increases, it suspends the encoding of video information, using the amount of accumulated audio data as the indicator.

【０４０６】図４４は本発明の実施の形態１８による映
像音声符号化装置の概略構成を示す図である。図示する
ように、本実施の形態１８による映像符号化装置は、ビ
デオカメラ１８０１、音声キャプチャ部１８０２、音声
バッファリング部１８０３、音声符号化部１８０５、映
像キャプチャ部１８０６、映像符号化部１８０７、およ
び符号化負荷評価部１８０８から構成されている。当該
映像音声符号化装置からは、図示するように符号化音声
情報と符号化映像情報とが装置出力として出力され、こ
れらは必要に応じて伝送されたり記録されることとな
る。FIG. 44 is a diagram showing the schematic configuration of the video/audio coding apparatus according to Embodiment 18 of the present invention. As shown in the figure, the apparatus comprises a video camera 1801, an audio capture unit 1802, an audio buffering unit 1803, an audio encoding unit 1805, a video capture unit 1806, a video encoding unit 1807, and a coding load evaluation unit 1808. As illustrated, the apparatus outputs encoded audio information and encoded video information, which are transmitted or recorded as required.

【０４０７】同図において、ビデオカメラ１８０１は、
映像音声情報を取り込み、アナログ音声情報とアナログ
映像情報とに分けて出力する。音声キャプチャ部１８０
２は、ビデオカメラ１８０１から出力されたアナログ音
声情報を入力し、離散的なデジタルデータからなるデジ
タル原音声情報として出力する。音声バッファリング部
１８０３は、音声キャプチャ部１８０２から出力された
デジタル原音声情報を一時的に蓄積する。音声バッファ
リング部１８０３に蓄積された原音声情報の総量は、原
音声バッファ量１８０４であり、本実施の形態１８によ
る映像音声符号化装置において制御に用いられる情報で
ある。音声符号化部１８０５は、音声バッファリング部
１８０３に蓄積された原音声情報を取り出して、圧縮符
号化処理し、符号化音声情報を出力する。音声符号化部
１８０３は、一時蓄積された原音声情報の取り出しにあ
たっては、該一時蓄積された原音声情報のうち、もっと
も先（過去）に蓄積された原音声情報を取り出して、こ
れを音声バッファリング部１８０３から削除する。従っ
て、音声バッファリング部１８０３は、ＦＩＦＯ(First
In First Out)構造をとることが望ましく、具体的には
リングバッファ等のアルゴリズムで実現される。[0407] In the figure, the video camera 1801 captures video and audio information and outputs it separately as analog audio information and analog video information. The audio capture unit 1802 receives the analog audio information output from the video camera 1801 and outputs it as digital original audio information consisting of discrete digital data. The audio buffering unit 1803 temporarily stores the digital original audio information output from the audio capture unit 1802. The total amount of original audio information stored in the audio buffering unit 1803 is the original audio buffer amount 1804, which the video/audio coding apparatus of Embodiment 18 uses for control. The audio encoding unit 1805 retrieves the original audio information stored in the audio buffering unit 1803, compression-encodes it, and outputs encoded audio information. When retrieving the temporarily stored original audio information, the audio encoding unit 1805 takes the oldest stored original audio information and deletes it from the audio buffering unit 1803. The audio buffering unit 1803 should therefore have a FIFO (First In First Out) structure, realized in practice by an algorithm such as a ring buffer.
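A ring buffer with the FIFO behavior described for the audio buffering unit 1803 can be sketched as follows. The class and method names are hypothetical, and refusing a block when the buffer is full is an assumption, since the text does not say what happens on overflow. The `amount` property plays the role of the original audio buffer amount 1804.

```python
class AudioRingBuffer:
    """Fixed-capacity FIFO of audio blocks, stored in a circular list."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0   # index of the oldest stored block
        self._count = 0  # number of blocks currently stored

    @property
    def amount(self):
        """Current fill level (the 'original audio buffer amount')."""
        return self._count

    @property
    def capacity(self):
        return self._capacity

    def push(self, block):
        """Store a newly captured block; return False if the buffer is full."""
        if self._count == self._capacity:
            return False
        tail = (self._head + self._count) % self._capacity
        self._slots[tail] = block
        self._count += 1
        return True

    def pop_oldest(self):
        """Remove and return the oldest block, or None if the buffer is empty."""
        if self._count == 0:
            return None
        block = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % self._capacity
        self._count -= 1
        return block
```

The capture side would call `push` for each block of samples and the encoder would call `pop_oldest`, so blocks always leave in the order in which they arrived.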

【０４０８】映像キャプチャ部１８０６は、ビデオカメ
ラ１８０１から出力されたアナログ映像情報を入力し、
離散的なデジタルデータからなり、単位時間ごとの静止
画像の複数枚から構成されるデジタルの原映像情報を出
力する。ここで、原映像情報は、あらかじめ設定された
解像度を有するものとして得られる。映像キャプチャ部
１８０６と、前出の音声キャプチャ部１８０２とは、マ
ルチメディア対応タイプのパーソナルコンピュータであ
れば、一般的に装備されるビデオキャプチャボードとし
て実現される。映像符号化部１８０７は、映像キャプチ
ャ部１８０６から出力された原映像情報を入力し、圧縮
符号化して符号化映像情報を出力する。符号化負荷評価
部１８０８は、当該映像音声符号化装置の符号化処理に
おける負荷を評価し、その評価に対応して、映像キャプ
チャ部１８０６から出力される原映像情報の映像符号化
部１８０７における処理を制御する。符号化負荷評価部
１８０８による制御は、符号化負荷基準情報１８１０を
用いて、符号化負荷評価情報１８０９を演算によって取
得し、当該取得した符号化負荷評価情報１８０９の値に
よって、原映像情報を映像符号化部１８０７に入力する
か、原映像情報を破棄するかを選択することで行われ
る。そして、原映像情報が破棄されたときには、映像符
号化部１８０７の符号化処理は中断されることとなり、
当該符号化装置の計算機資源（ＣＰＵ時間）が音声符号
化部１８０５に明け渡されることとなる。[0408] The video capture unit 1806 receives the analog video information output from the video camera 1801 and outputs digital original video information, which consists of discrete digital data and is made up of a sequence of still images, one per unit time. The original video information is obtained at a preset resolution. On a multimedia-capable personal computer, the video capture unit 1806 and the audio capture unit 1802 described above are realized by a video capture board of the kind commonly installed. The video encoding unit 1807 receives the original video information output from the video capture unit 1806, compression-encodes it, and outputs encoded video information. The coding load evaluation unit 1808 evaluates the load of the coding process in the video/audio coding apparatus and, according to that evaluation, controls how the original video information output from the video capture unit 1806 is processed in the video encoding unit 1807. To exercise this control, the coding load evaluation unit 1808 calculates coding load evaluation information 1809 using coding load reference information 1810 and, depending on the value obtained, chooses either to input the original video information to the video encoding unit 1807 or to discard it. When the original video information is discarded, the encoding process of the video encoding unit 1807 is suspended, and the computer resources (CPU time) of the coding apparatus are yielded to the audio encoding unit 1805.

【０４０９】符号化負荷評価部１８０８による、符号化
負荷評価情報１８０９の取得は、原音声バッファ量１８
０４の値を用いて評価情報を計算し、この評価情報に対
して符号化負荷基準情報１８１０を乗算することによっ
て行われる。本実施の形態１８では評価情報の計算にあ
たって、原音声バッファ量１８０４が音声バッファリン
グ部１８０３における蓄積可能量の半分以上になれば、
評価情報を０％とし、半分以下になれば、評価情報を１
００％とするものである。The coding load evaluation unit 1808 obtains the coding load evaluation information 1809 by calculating evaluation information from the value of the original audio buffer amount 1804 and multiplying that evaluation information by the coding load reference information 1810. In Embodiment 18, the evaluation information is set to 0% when the original audio buffer amount 1804 is half or more of the capacity of the audio buffering unit 1803, and to 100% when it is half or less.

[0410] The encoding load reference information 1810 indicates a reference for the processing amount of the video encoding process and is set, for example, as "an amount indicating how much video processing is performed when the audio buffer is empty". In the eighteenth embodiment it always has the value "1", so that multiplying the evaluation information by the encoding load reference information 1810 yields the evaluation information itself as the encoding load evaluation information 1809. The encoding load evaluation information 1809 is therefore either 0% or 100%. When it is 100%, the encoding load evaluation unit 1808 inputs 100% of the original video information received at that time to the video encoding unit 1807; when it is 0%, it discards all of the original video information. In the latter case it interrupts the processing of the video encoding unit 1807 and performs control so that the computer resources (CPU time) are handed over to the audio encoding unit 1805.
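The evaluation rule of the eighteenth embodiment can be sketched in Python as follows. This is only an illustrative sketch: the function names, the `encode_video` callback, and the use of ratios (1.0 for 100%) are assumptions made for the example and do not appear in the specification.

```python
def evaluation_info(buffer_amount: float, capacity: float) -> float:
    """Return 0.0 when the buffer is at least half full, otherwise 1.0."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 0.0 if buffer_amount >= capacity / 2 else 1.0


def load_evaluation(buffer_amount: float, capacity: float,
                    reference: float = 1.0) -> float:
    """Evaluation information multiplied by the load reference information.

    The reference value is fixed at 1 in the eighteenth embodiment.
    """
    return evaluation_info(buffer_amount, capacity) * reference


def handle_frame(frame, buffer_amount, capacity, encode_video):
    """Encode the frame when the load evaluation is 100%, else discard it."""
    if load_evaluation(buffer_amount, capacity) >= 1.0:
        return encode_video(frame)   # hypothetical encoder callback
    return None  # frame discarded; CPU time is left for audio encoding
```

Because the evaluation is binary here, the encoder callback is either invoked on the whole frame or not invoked at all, which is why an unmodified encoder can be used.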

[0411] The video/audio encoding device of the eighteenth embodiment, configured as described above, operates in outline as follows. When the analog audio information output from the video camera 1801 is input, the audio capture unit 1802 outputs it as digital original audio information consisting of discrete digital data. The digital original audio information output from the audio capture unit 1802 is temporarily stored in the audio buffering unit 1803. The audio encoding unit 1805 takes out the oldest original audio stored in the audio buffering unit 1803, deletes it from the audio buffering unit 1803, compression-encodes it, and outputs it as encoded audio information. The audio encoding unit 1805 also updates the original audio buffer amount 1804, a value indicating the total amount of original audio stored in the audio buffering unit 1803; the original audio buffer amount 1804 is held as information for the encoding process of the video/audio encoding device.
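The first-in, first-out behavior of the audio buffering unit and its running buffer amount can be sketched as below. The class and method names are hypothetical, and dropping samples that arrive when the buffer is full is an assumption of this sketch; the specification does not state what happens on overflow.

```python
from collections import deque


class AudioBuffer:
    """FIFO store of raw audio samples with a running fill level."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._samples = deque()

    @property
    def amount(self) -> int:
        """Number of samples currently stored (the buffer amount)."""
        return len(self._samples)

    def store(self, samples) -> None:
        """Append captured samples; anything beyond capacity is dropped
        (an assumption of this sketch)."""
        for s in samples:
            if len(self._samples) < self.capacity:
                self._samples.append(s)

    def take_oldest(self, count: int) -> list:
        """Remove and return up to `count` of the oldest samples."""
        n = min(count, len(self._samples))
        return [self._samples.popleft() for _ in range(n)]
```

Reading and deleting happen in a single `take_oldest` call, so the buffer amount is always consistent with what remains to be encoded.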

[0412] When the analog video information output from the video camera 1801 is input, the video capture unit 1806 outputs it as digital original video information consisting of discrete digital data and composed of a plurality of still images per unit time, each having a predefined resolution. When the original video information output from the video capture unit 1806 is input to it, the video encoding unit 1807 compression-encodes it and outputs encoded information. The encoding load evaluation unit 1808 calculates the encoding load evaluation information 1809 and, according to the calculated value, decides whether to input the original video information to the video encoding unit 1807 or to discard the original video information, interrupt the processing of the video encoding unit 1807, and hand over the computer resources (CPU time) to the audio encoding unit 1805. If original video information is input to it, the video encoding unit 1807 compression-encodes that information and outputs encoded video information.

[0413] FIG. 45 is a diagram schematically illustrating the operation of the video/audio encoding device of the eighteenth embodiment when it captures and encodes a certain piece of video and audio. Here, the device is assumed to be realized on a general-purpose computer such as a personal computer running a multitasking operating system that can execute a plurality of tasks concurrently, and the video/audio encoding process is handled on the operating system as separate video encoding and audio encoding tasks. The operating system allocates CPU time, a computer resource, to each task, including the video encoding and audio encoding tasks, and each task executes its processing under CPU control during the allocated period. It is assumed here that when a task finishes a series of processing and releases its allocated computer resources (CPU time), the operating system allocates them to another task.

[0414] In the figure, time progresses from top to bottom, and each rectangle indicates that a process (task) on the multitasking operating system is consuming computer resources (CPU time). A dashed arrow connecting rectangles indicates a process switch. The dashed arrows are drawn as slanted lines; the slant represents the time required for the process switch, that is, the task-switching overhead of the multitasking operating system. In the following description this overhead is treated as small relative to the processing in each task and is ignored.

[0415] In the figure, the column labeled "video encoding process" shows the time consumed by the process that encodes video information; in the above configuration, this is the working time of the process that executes the processing of the encoding load evaluation unit 1808 and of the video encoding unit 1807. The column labeled "audio encoding process" shows the time consumed by the process that encodes audio information; in the above configuration, this is the working time of the process that executes the processing of the audio encoding unit 1805. "Other processes" shows the working time of all processes other than the "video encoding process" and the "audio encoding process". "Original audio buffer amount" shows the original audio buffer amount 1804 at each time as a percentage of the maximum buffer amount (the storable capacity of the audio buffering unit 1803).

[0416] In the configuration of the video/audio encoding device, the video camera 1801 is a peripheral device that is connected to the general-purpose computer realizing the device and functions relatively independently; it operates largely in parallel with the processes that receive CPU time allocations as described above and are executed under CPU control. Likewise, the video capture board that realizes the audio capture unit 1802 and the video capture unit 1806 can function relatively independently, so the audio capture unit 1802 and the video capture unit 1806 also operate largely in parallel with those processes.

[0417] That is, even while a video or audio encoding process or some other process is being executed, the capture of video and audio, the creation of the original video information and original audio information by digitization, and the storage of the original audio information in the audio buffering unit 1803 proceed largely in parallel.

[0418] The operation of the video/audio encoding device of the eighteenth embodiment in this example is described below with reference to FIGS. 44 and 45. First, the video camera 1801 captures video and audio and outputs them separately as analog audio information and analog video information. The analog video information is input to the video capture unit 1806, which converts it by analog-to-digital conversion into digital original video information consisting of a plurality of still images, as described above, and outputs it. Because this step consists mainly of the operation of the video camera 1801 and of processing executed on the capture board serving as the video capture unit 1806, it proceeds largely in parallel with the processes on the operating system that consume CPU time, as shown in FIG. 45.

[0419] The encoding load evaluation unit 1808 receives the original video information thus output and checks the original audio buffer amount 1804 at that time. Here it is assumed that no audio has yet been input to the audio buffering unit 1803, so the original audio buffer amount 1804 is 0%. This is below the predetermined reference value of 50%, so the evaluation information is 100%. As described above, in this embodiment the encoding load reference information 1810 has the value "1" and need not be considered in the multiplication, so the encoding load evaluation information 1809 is 100%, the same as the evaluation information. The encoding load evaluation unit 1808 therefore outputs 100% of the input original video information to the video encoding unit 1807. The video encoding unit 1807 performs video encoding on the original video information and releases the CPU time when it finishes. This processing by the encoding load evaluation unit 1808 and the video encoding unit 1807 corresponds to part A of the video encoding process in FIG. 45.

[0420] Meanwhile, the audio capture unit 1802 receives the analog audio information output from the video camera 1801 and outputs it as digital original audio information by analog-to-digital conversion. The original audio information is input to and temporarily stored in the audio buffering unit 1803, which updates the original audio buffer amount 1804 held by the video/audio encoding device according to the stored amount. Because this step too consists mainly of the operation of the video camera 1801 and of processing executed on the capture board serving as the audio capture unit 1802, it proceeds largely in parallel with the processes on the operating system shown in FIG. 45. Here it is assumed that this step ran in parallel with video encoding process A and that the original audio buffer amount has reached 30%.

[0421] The audio encoding unit 1805 reads a fixed amount (the original audio read amount) of original audio information from the audio buffering unit 1803, oldest first, deletes the read portion from the audio buffering unit 1803, and updates the original audio buffer amount 1804. The audio encoding unit 1805 then encodes the original audio information. In the eighteenth embodiment the original audio read amount is 30% of the maximum buffer amount; since, as noted above, 30% worth of audio had accumulated in the audio buffering unit 1803, the unit reads and encodes all of it and releases the CPU time when encoding finishes. This processing by the audio encoding unit 1805 corresponds to part B of the audio encoding process in FIG. 45.

[0422] At this point, as shown in FIG. 45, an "other application" happens to start and requests CPU time, so the "other application" consumes CPU time. The "other application" has a relatively heavy processing load; it occupies the CPU for some time before releasing it. This step, which concerns other work, corresponds to part C under other processes in FIG. 45. The processing by the video camera 1801, the audio capture unit 1802, and the video capture unit 1806 is assumed to continue in parallel with process C. Original audio information therefore continues to accumulate, and the original audio buffer amount reaches 60%, as shown in FIG. 45.

[0423] Next, CPU time is again allocated to the encoding load evaluation unit 1808. The original audio buffer amount 1804 at this point, 60%, is at or above the reference value of 50%, so the evaluation information obtained by the encoding load evaluation unit 1808 is 0%, and multiplying it by the encoding load reference information 1810, which is "1", still gives encoding load evaluation information 1809 of 0%. The encoding load evaluation unit 1808 therefore discards the original video information at this point, no encoding is performed by the video encoding unit 1807, and the CPU time is released promptly. This processing by the encoding load evaluation unit 1808 corresponds to part D in FIG. 45.

[0424] CPU time is then allocated to the audio encoding unit 1805, which reads 30% worth of original audio information from the audio buffering unit 1803, deletes that portion from the audio buffering unit 1803, and updates the original audio buffer amount 1804. The original audio buffer amount falls from 60% to 30%. The audio encoding unit 1805 then encodes the original audio information and releases the CPU time when encoding finishes. This processing by the audio encoding unit 1805 corresponds to part E in FIG. 45.

[0425] CPU time is allocated to the encoding load evaluation unit 1808. The original audio buffer amount 1804 is now 30%, below the reference value of 50%, so, as in process A of FIG. 45, the encoding load evaluation information 1809 is 100% and the video encoding unit 1807 encodes the original video information. Because the original video information is more complex than in the earlier video encoding, encoding takes longer than in process A, and the video encoding process consumes a relatively large amount of CPU time before releasing it. This processing by the encoding load evaluation unit 1808 and the video encoding unit 1807 corresponds to part F of the video encoding process in FIG. 45. Original audio information accumulates in parallel with process F, and the original audio buffer amount reaches 90%.

[0426] CPU time is allocated to the audio encoding unit 1805, which reads 30% worth of original audio information from the audio buffering unit 1803, deletes that portion from the audio buffering unit 1803, and updates the original audio buffer amount 1804. The original audio buffer amount 1804 falls from 90% to 60%. The audio encoding unit 1805 then encodes the original audio information and releases the CPU time when encoding finishes. This processing by the audio encoding unit 1805 corresponds to part G in FIG. 45.

[0427] CPU time is allocated to the encoding load evaluation unit 1808. At this point the original audio buffer amount 1804 has reached 90%, which is at or above the reference value of 50%, so, as in process D above, the encoding load evaluation information 1809 is 0%. The encoding load evaluation unit 1808 discards the original video information at this point, no video encoding is performed by the video encoding unit 1807, and the CPU time is released promptly. This processing by the encoding load evaluation unit 1808 corresponds to part H in FIG. 45.

[0428] CPU time is allocated to the audio encoding unit 1805, which reads 30% worth of original audio information from the audio buffering unit 1803, deletes that portion from the audio buffering unit 1803, and updates the original audio buffer amount 1804. The audio encoding unit 1805 then encodes the original audio information and releases the CPU time when encoding finishes. This processing by the audio encoding unit 1805 corresponds to part I in FIG. 45.
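The sequence of parts A to I can be replayed with a small Python simulation. This is a sketch under stated assumptions: the amount of audio captured during each step is inferred from the buffer levels quoted in the walkthrough (including 30% during part G, so that part H sees 90%), and all names are hypothetical.

```python
READ_AMOUNT = 30   # percent of capacity read per audio encoding turn
THRESHOLD = 50     # percent; at or above this, the video frame is dropped

# (label, task, percent of audio captured while the step runs)
# The capture amounts are assumptions inferred from the quoted levels.
STEPS = [
    ("A", "video", 30), ("B", "audio", 0), ("C", "other", 60),
    ("D", "video", 0), ("E", "audio", 0), ("F", "video", 60),
    ("G", "audio", 30), ("H", "video", 0), ("I", "audio", 0),
]


def simulate(steps):
    """Return (label, action, buffer level after the step) for each step."""
    level = 0
    trace = []
    for label, task, captured in steps:
        if task == "video":
            # The load evaluation happens at the start of the step.
            action = "encode" if level < THRESHOLD else "drop"
        elif task == "audio":
            level -= min(READ_AMOUNT, level)
            action = "encode audio"
        else:
            action = "other work"
        level = min(100, level + captured)
        trace.append((label, action, level))
    return trace


for row in simulate(STEPS):
    print(row)
```

In this replay, video frames are encoded in parts A and F and dropped in parts D and H, while every audio turn proceeds regardless of the buffer level.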

[0429] In FIG. 44, while video and audio continue to be captured from the video camera 1801, the video encoding and audio encoding processes are executed as described above, so that video and audio are encoded as they are captured. When the capture of video and audio ends, encoding also ends.

[0430] FIG. 46 is a diagram for explaining this encoding operation of the video/audio encoding device of the eighteenth embodiment over a longer period. In section A of the figure, audio and video encoding are performed in balance and the audio buffer amount stays at or below the reference value. After that, as illustrated, "other processes" monopolize the CPU time, and original audio information accumulates excessively in the meantime. In the following section B, audio encoding is therefore given priority in order to work through the accumulated original audio information. Once the amount of original audio information has fallen to or below the reference value, normal processing resumes, as seen in section C of the figure.

[0431] As described above, the video/audio encoding device of the eighteenth embodiment includes the audio buffering unit 1803 and the encoding load evaluation unit 1808. On receiving the original video information to be encoded, and before encoding by the video encoding unit 1807, the encoding load evaluation unit 1808 checks the original audio buffer amount 1804, which is the amount of unprocessed audio information stored in the audio buffering unit 1803 at that time. If the amount is sufficiently small, video encoding is performed; if a certain amount or more of unprocessed audio information has accumulated, the video information at that time is discarded, video encoding is not performed, and the CPU time is yielded to the audio encoding unit. As a result, the effect of a shortage of computer resources, whether caused by other applications or by the video encoding unit itself, is confined to dropped video frames, which are less likely to be perceived as a problem, and interruption of the audio because of video encoding is avoided.

[0432] Also, in the video/audio encoding device of the eighteenth embodiment, the video encoding unit 1807 only needs to encode the video information that the encoding load evaluation unit 1808 outputs to it. Any encoder with the single function of encoding input video can serve as the video encoding unit 1807 of the encoding device of the eighteenth embodiment. In other words, an existing video encoder, such as an image compression subroutine, can be used without internal modification. In a video processing environment on a general-purpose computer in which compression subroutines are modularized and can be added on later, such a subroutine can be used as is for video/audio encoding, which has the particular advantage of making software development efficient.

[0433] In the eighteenth embodiment, switching between the multitasking processes occurs when a process itself releases the computer resources (CPU time), but the present invention is not limited to this form. For example, the multitasking operating system may give each process a fixed amount of CPU time and switch unconditionally to another process once that time is used up. In that case, if the video encoding process monitors the progress of the audio encoding process before using up its CPU time and voluntarily releases the CPU time when necessary, computer resources (CPU time) can be allocated more effectively and a good encoding result can be obtained.
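One way to picture the voluntary release under time-sliced scheduling is the sketch below. It assumes that the video encoder works in small units (for example, rows of blocks) and that "monitoring the progress of audio encoding" is done by polling the audio buffer fill ratio; both are assumptions of this example, as are all names.

```python
def encode_frame_cooperatively(rows, audio_level, threshold=0.5):
    """Encode a frame unit by unit, stopping early if audio falls behind.

    rows        -- iterable of work units (for example, rows of blocks)
    audio_level -- callable returning the audio buffer fill ratio (0.0-1.0)
    Returns (encoded units, True if the whole frame was finished).
    """
    done = []
    for row in rows:
        if audio_level() >= threshold:
            return done, False      # give up the CPU for audio encoding
        done.append(("encoded", row))
    return done, True
```

A caller would then either complete the frame later or discard the partial result, depending on what the encoder supports.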

[0434] In the eighteenth embodiment, the CPU time of the video encoding process is released by discarding the video information (still image information) that was to be encoded. That is, when still images are input as "still image at 0 seconds, still image at 1 second, still image at 2 seconds", the still image at 1 second is dropped if necessary and the sequence is encoded as "still image at 0 seconds, still image at 2 seconds". However, the encoded information finally output does not necessarily have to contain fewer frames. If each still image carries a timestamp indicating the point in time to which it belongs, the video encoding unit can check the timestamps to recognize whether a frame was dropped and, when one was, output an image corresponding to that frame (an image identical to the previous still image, a code indicating that the image is identical to the previous one, or the like). The encoded video information finally output is then nominally complete, with no dropped frames. With this technique, the device can readily handle output formats such as the MPEG (Moving Picture Experts Group) standard, in which a fixed number of frames (for example, 30 per second) must be guaranteed.
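The timestamp-based gap filling can be sketched as follows. The tuple layout, the `"repeat"` marker standing in for a "same as previous image" code, and the function name are all hypothetical; integer timestamps are used to keep the comparison exact.

```python
def fill_dropped_frames(frames, interval):
    """Yield ("frame", t, data) or ("repeat", t, None) for every time slot.

    frames   -- list of (timestamp, data) pairs in increasing time order
    interval -- nominal spacing between frames, in the timestamp's units
    """
    expected = None
    for timestamp, data in frames:
        if expected is not None:
            while expected < timestamp:
                # A slot with no input frame: emit a stand-in for the
                # dropped frame so the output frame count stays constant.
                yield ("repeat", expected, None)
                expected += interval
        yield ("frame", timestamp, data)
        expected = timestamp + interval
```

For the example in the text, `list(fill_dropped_frames([(0, "a"), (2, "c")], 1))` produces a real frame at 0, a repeat marker at 1, and a real frame at 2.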

[0435] Embodiment 19. The video/audio encoding device according to the nineteenth embodiment of the present invention, like that of the eighteenth embodiment, prevents interruption of the audio even when the load of software processing on a general-purpose computer or the like increases; it controls the prediction processing used for encoding, using the amount of stored audio data as an indicator.

[0436] FIG. 47 is a diagram showing the schematic configuration of the video/audio encoding device according to the nineteenth embodiment of the present invention. As illustrated, the encoding device of the nineteenth embodiment comprises a video camera 1901, an audio capture unit 1902, an audio buffering unit 1903, an audio encoding unit 1905, a video capture unit 1906, a video encoding unit 1923, and an encoding load evaluation unit 1921; the video encoding unit 1923 contains an inter-frame prediction processing unit 1924 and a frame encoding unit 1925. As in the eighteenth embodiment, the device outputs encoded audio information and encoded video information.

[0437] In the figure, the encoding load evaluation unit 1921 obtains the encoding load evaluation information 1922 by calculation based on the original audio buffer amount 1904 and the encoding load reference information 1910. The inter-frame prediction processing unit 1924 contained in the video encoding unit 1923 finds motion vectors between still images, in order to compression-encode the video with reduced temporal redundancy, and outputs those motion vectors for predictive encoding with motion compensation. The frame encoding unit 1925 contained in the video encoding unit 1923 performs encoding using the motion vectors output by the inter-frame prediction processing unit 1924 and outputs the result as encoded video information.

[0438] The video camera 1901, audio capture unit 1902, audio buffering unit 1903, audio encoding unit 1905, and video capture unit 1906 are the same as 1801 to 1803, 1805, and 1806 of the eighteenth embodiment, and their description is omitted.

[0439] The encoding load evaluation unit 1921 calculates the encoding load evaluation information 1922 by multiplying evaluation information, which is calculated from the original audio buffer amount 1904 and indicates the proportion of free space in the buffer of the audio buffering unit 1903, by the encoding load reference information 1910. The encoding load reference information 1910 is the same as in the eighteenth embodiment and is likewise fixed at "1" in the nineteenth embodiment. Accordingly, the encoding load reference information 1910 need not be taken into account in the nineteenth embodiment: the proportion of free space in the buffer of the audio buffering unit 1903, obtained as the evaluation information, is used directly as the encoding load evaluation information 1922, which takes the value 100% when the buffer is empty and 0% when it is full.
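In contrast to the binary rule of the eighteenth embodiment, the evaluation here varies continuously with the free space in the buffer. A minimal sketch, with hypothetical function names and with clamping of out-of-range inputs added as an assumption:

```python
def free_ratio(buffer_amount: float, capacity: float) -> float:
    """Fraction of the audio buffer that is still free, clamped to [0, 1]."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    used = min(max(buffer_amount, 0.0), capacity)
    return (capacity - used) / capacity


def load_evaluation_19(buffer_amount, capacity, reference=1.0):
    """Free-space ratio multiplied by the load reference (fixed at 1 here)."""
    return free_ratio(buffer_amount, capacity) * reference
```

An empty buffer gives 1.0 (100%), a full buffer gives 0.0 (0%), and a buffer that is one quarter full gives 0.75.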

[0440] In general, compression encoding comprises intra-frame encoding, which compresses the still image of one frame (corresponding to one screen) on the basis of its spatial correlation, and inter-frame encoding, which compresses the still images of temporally adjacent frames, for example consecutive frames, on the basis of their temporal correlation. Intra-frame encoding is the basic technique, but combining the two yields encoded data with a high compression ratio. To perform inter-frame encoding, the motion of each frame is detected as a motion vector, a predicted image is generated with motion compensation using this motion vector, and the difference data between the predicted image and the image to be encoded is compressed.

[0441] The inter-frame prediction processing unit 1924 obtains motion vectors between pieces of still image information. These vectors are used in the predicted-image generation process that reduces the temporal redundancy of the video for compression encoding. In the nineteenth embodiment, the inter-frame prediction processing unit 1924 performs the prediction process only to a specified proportion. That is, taking the maximum range over which the prediction process is performed as the initial value, it receives the encoding load evaluation information 1922 when processing and performs the inter-frame prediction process for the proportion of the initial value indicated by the encoding load evaluation information 1922. If the encoding load evaluation information 1922 is 100%, it performs the inter-frame prediction process for the maximum amount, which is the initial value, and outputs the optimal motion vector obtained. If the encoding load evaluation information 1922 is 50%, it performs the inter-frame prediction process for 50% of the initial value and outputs the best motion vector found up to that point. In either case, the frame encoding unit 1925 performs encoding using the output motion vector. The more processing is spent on the motion vector search, the closer to optimal the resulting motion vector becomes, so the difference between the predicted image and the image to be encoded becomes smaller and compression is more efficient. Conversely, if the amount of processing is reduced, the optimal vector is not obtained and the compression ratio falls. Note that the compression ratio can be maintained without increasing the amount of processing if image quality is sacrificed.
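
One way to picture a prediction process that runs for only a specified proportion of its maximum range is a block-matching search with a candidate budget. The sketch below is an assumption-laden illustration: the text does not state the search algorithm, the matching cost, or the order in which candidates are examined, so the full search, the sum of absolute differences (SAD), and the nearest-first ordering are all choices made here, and every function and parameter name is hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and the displaced block in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def budgeted_motion_search(cur, ref, bx, by, size, max_range, load_percent):
    """Search only load_percent of the candidate displacements.

    Returns ((dx, dy), number_of_candidates_examined).
    """
    height, width = len(ref), len(ref[0])
    # Maximum search range (the "initial value"): every displacement that keeps
    # the displaced block inside the reference frame.
    candidates = [
        (dx, dy)
        for dy in range(-max_range, max_range + 1)
        for dx in range(-max_range, max_range + 1)
        if 0 <= bx + dx and bx + dx + size <= width
        and 0 <= by + dy and by + dy + size <= height
    ]
    # Assumed ordering: examine small displacements first, so a truncated
    # search still covers the most likely vectors.
    candidates.sort(key=lambda v: (abs(v[0]) + abs(v[1]), v))
    # Examine only the share of candidates given by the load evaluation.
    budget = max(1, len(candidates) * load_percent // 100)
    best = min(candidates[:budget], key=lambda v: sad(cur, ref, bx, by, v[0], v[1], size))
    return best, budget
```

With a budget of 100 the search examines every candidate and returns the best match in the whole range; with a smaller budget it returns the best vector found among the candidates examined so far, which may be suboptimal and so leaves a larger residual to compress.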

[0442] The operation of the video/audio encoding apparatus according to the nineteenth embodiment, configured as described above, is outlined below. In this apparatus, the encoding load evaluation unit 1921 outputs the encoding load evaluation information 1922 in accordance with the encoding load reference information 1910 and the original audio buffer amount 1904 as of the time when the audio buffering unit 1903 has accumulated original audio information and updated the original audio buffer amount 1904 according to the accumulated amount. The video encoding unit 1923 encodes and outputs the original video information output from the video capture unit 1906. At this time, the inter-frame prediction processing unit 1924 obtains motion vectors between pieces of still image information so that the video can be compression-encoded with its temporal redundancy reduced. The frame encoding unit 1925 then performs encoding using the motion vectors output by the inter-frame prediction processing unit 1924 and outputs the result as encoded video information. The original audio information is read out and encoded in the same manner as in the eighteenth embodiment.

[0443] An example of the encoding operation performed on certain video and audio by the video/audio encoding apparatus according to the nineteenth embodiment is described below. As in the eighteenth embodiment, the video/audio encoding process is assumed to run on a general-purpose computer as two tasks under the control of the operating system: video encoding (the processing of the encoding load evaluation unit 1921 and the video encoding unit 1923) and audio encoding (the processing of the audio encoding unit 1905). When a task that has been allocated CPU time executes its series of processes and releases the computer resource (CPU time), the operating system allocates CPU time to another task.

[0444] First, as in the eighteenth embodiment, the video camera 1901 captures video and audio information and outputs it separately as analog audio information and analog video information. The audio capture unit 1902 receives the analog audio information output from the video camera 1901 and outputs it as digital original audio information. The audio buffering unit 1903 accumulates the original audio information and updates the original audio buffer amount 1904 according to the accumulated amount. Meanwhile, the video capture unit 1906 receives the analog video information output from the video camera 1901 and outputs it as digital original video information.

[0445] The encoding load evaluation unit 1921 checks the original audio buffer amount 1904 at that time. Here, about 30% of the buffer is filled with the input original audio information, so the encoding load evaluation information 1922 is set to 70%. The inter-frame prediction processing unit 1924 obtains the encoding load evaluation information 1922. Since it is 70%, the unit performs the inter-frame prediction process for only 70% of the initial value, obtains the best motion vector within that range, and outputs it to the frame encoding unit 1925. The frame encoding unit 1925 encodes the video information using the motion vector, outputs the result as encoded video information, and releases the CPU time allocated to video encoding.

[0446] The audio encoding unit 1905 reads a fixed amount (the original audio read amount) of original audio information from the audio buffering unit 1903, starting with the oldest accumulated data, deletes the read original audio information from the audio buffering unit 1903, and updates the original audio buffer amount 1904. The audio encoding unit 1905 then encodes the original audio information. In the nineteenth embodiment, the original audio read amount is 30% of the maximum buffer amount. Since, as noted above, 30% worth of audio had accumulated in the audio buffering unit 1903, all of it is read out and encoded, and the CPU time is released when encoding finishes.

[0447] The encoding load evaluation unit 1921, having been allocated CPU time, checks the original audio buffer amount 1904 at that time. This is immediately after the audio encoding described above, and the audio accumulated in the buffer amounts to 0%, so the encoding load evaluation unit 1921 sets the encoding load evaluation information 1922 to 100%. Since the encoding load evaluation information 1922 is 100%, the inter-frame prediction processing unit 1924 performs the inter-frame prediction process for the maximum amount, which is the initial value, and obtains the optimal motion vector. The frame encoding unit 1925 encodes the video information using this motion vector, outputs the encoded video information, and releases the CPU time when encoding finishes.
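
The sequence in this worked example (video turn, audio turn, video turn) can be traced with a few lines of code. The sketch assumes, as the example itself does, that no new audio arrives between the audio turn and the following video turn; the function names and the tuple layout of the trace are hypothetical.

```python
def video_turn(buffer_percent: int) -> int:
    """Return the share (%) of the motion search that the video task performs."""
    return 100 - buffer_percent  # free share of the audio buffer


def audio_turn(buffer_percent: int, read_percent: int = 30) -> int:
    """Return the buffer fill (%) after the audio task reads its fixed amount."""
    return buffer_percent - min(read_percent, buffer_percent)


def trace_example(start_percent: int = 30) -> list:
    """Trace one video turn, one audio turn, and the next video turn."""
    steps = []
    fill = start_percent
    steps.append(("video", fill, video_turn(fill)))  # buffer 30% full: 70% search
    fill = audio_turn(fill)
    steps.append(("audio", fill, None))              # audio task drains the buffer
    steps.append(("video", fill, video_turn(fill)))  # buffer empty: 100% search
    return steps
```

Calling `trace_example()` reproduces the figures in the text: a 70% search while the buffer is 30% full, then a full search once the audio task has emptied the buffer.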

[0448] In FIG. 47, while the capture of video and audio from the video camera 1901 continues, the video encoding and audio encoding processes are executed as described above, so that the video and audio are encoded as they are captured. When the capture of video and audio ends, the encoding also ends.

[0449] As described above, the video/audio encoding apparatus of the nineteenth embodiment comprises the audio buffering unit 1903, the encoding load evaluation unit 1921, and the video encoding unit 1923, which includes the inter-frame prediction processing unit 1924 and the frame encoding unit 1925. Before the video encoding unit 1923 performs encoding, the encoding load evaluation unit 1921 checks the amount of unprocessed audio information accumulated in the buffer at that time, that is, the original audio buffer amount 1904, and specifies the amount of processing in the inter-frame prediction processing unit 1924 according to that amount, thereby controlling the CPU time consumed by video encoding. As a result, the effect of a shortage of computer resources, caused by their consumption by other applications or by the video encoding unit itself, is confined to a temporary drop in compression ratio or image quality that is unlikely to be perceived as a problem, and a situation in which the audio is interrupted because of video encoding is avoided.

[0450] In the video/audio encoding apparatus of the nineteenth embodiment, the video encoding unit 1923 performs processing in response to the encoding load evaluation information 1922 output by the encoding load evaluation unit 1921. It must therefore be able to accept the encoding load evaluation information 1922 as input and perform processing corresponding to it, so a modularized subroutine cannot be applied as-is, unlike in the eighteenth embodiment. However, the video encoding load is reduced not by discarding video information as in the eighteenth embodiment but by reducing the amount of processing, so no frames are dropped and encoded video information with smoother motion than in the eighteenth embodiment is obtained.

[0451] In the nineteenth embodiment, the CPU time consumed by video encoding is controlled by adjusting the amount of inter-frame prediction processing, that is, the amount of computation spent finding the optimal motion vector, but the present invention is not limited to this approach. Other methods of simplifying the video encoding process may be used, for example omitting part of the encoding of color information.

[0452] Also, in the nineteenth embodiment, the proportion of free space in the buffer obtained from the original audio buffer amount is used directly as the encoding load evaluation information, but other evaluation methods may be used. For example, the encoding load evaluation information may be kept at 100% until the original audio buffer amount exceeds a certain value and then reduced to 50%, 30%, and so on once it exceeds that value.
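
A stepped evaluation of this kind might look like the sketch below. The text gives only the output levels (100%, 50%, 30%); the buffer thresholds at which each level applies are not stated, so the values 50 and 70 used here are purely illustrative assumptions, as are the names.

```python
# (upper bound of buffer fill in %, evaluation in %). The bounds 50 and 70 are
# assumed for illustration; only the levels 100, 50, and 30 come from the text.
DEFAULT_STEPS = ((50, 100), (70, 50), (100, 30))


def stepped_evaluation(buffer_percent: float, steps=DEFAULT_STEPS) -> int:
    """Return the evaluation (%) for the first step whose bound covers the fill level."""
    if not 0 <= buffer_percent <= 100:
        raise ValueError("buffer_percent must lie between 0 and 100")
    for upper_bound, evaluation in steps:
        if buffer_percent <= upper_bound:
            return evaluation
    raise ValueError("steps must cover fill levels up to 100")
```

Compared with the linear rule, a stepped rule keeps the search at full effort while the buffer is comfortably low and backs off only once the fill level passes the first threshold.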

[0453] Embodiment 20. The video/audio encoding apparatus according to Embodiment 20 of the present invention, like that of the eighteenth embodiment, prevents audio interruptions even when the load of software processing on a general-purpose computer or the like increases. It changes the video resolution used for encoding, using the accumulated amount of audio data as an index.

[0454] FIG. 48 is a diagram showing the schematic configuration of the video/audio encoding apparatus according to Embodiment 20 of the present invention. As shown in the figure, the video/audio encoding apparatus according to the twentieth embodiment comprises a video camera 2001, an audio capture unit 2002, an audio buffering unit 2003, an audio encoding unit 2005, a video capture unit 2031, a video encoding unit 2035, and an encoding load evaluation unit 2032. The video encoding unit 2035 includes a video encoding unit main body 2036 and a resolution correction information adding unit 2037. As in the eighteenth embodiment, encoded audio information and encoded video information are output as the outputs of the apparatus.

[0455] In the figure, the video capture unit 2031, as in the eighteenth and nineteenth embodiments, creates from analog video information digital original video information consisting of a plurality of still images. In the twentieth embodiment, however, it receives the video resolution information 2034 described below and creates the still image information with the resolution corresponding to that video resolution information. The video capture unit 2031 is implemented by a video capture board as in the eighteenth embodiment, but here the board is assumed to allow the resolution to be specified. The encoding load evaluation unit 2032 calculates the encoding load evaluation information 2033 and outputs the video resolution information 2034 according to the calculated value. The video encoding unit 2035 includes the video encoding unit main body 2036 and the resolution correction information adding unit 2037, both described below. It encodes the original video information output from the video capture unit 2031 and outputs encoded video information. The video encoding unit main body 2036, included in the video encoding unit 2035, performs the actual video encoding. The resolution correction information adding unit 2037, also included in the video encoding unit 2035, adds resolution information to the internally encoded video information output by the video encoding unit main body 2036, creating the encoded video information that is the output of the video/audio encoding apparatus.

[0456] The video camera 2001, audio capture unit 2002, audio buffering unit 2003, and audio encoding unit 2005 are the same as units 1801 to 1803 and 1805 of the eighteenth embodiment, and their description is omitted.

[0457] The encoding load evaluation unit 2032 calculates the encoding load evaluation information 2033 by computing basic evaluation information from the value of the original audio buffer amount 2004 and multiplying it by the value of the encoding load reference information 2010. In the twentieth embodiment, the evaluation information is 0% if the original audio buffer amount 2004 exceeds half of the amount of original audio that the audio buffering unit 2003 can store, and 100% if it is half or less. The encoding load reference information 2010 is the same as in the eighteenth embodiment and is likewise fixed at "1" in the twentieth embodiment. The evaluation information therefore becomes the encoding load evaluation information 2033 unchanged, and the encoding load evaluation information 2033 takes the value 0% or 100%. The encoding load evaluation unit 2032 creates the video resolution information 2034 from the encoding load evaluation information 2033 and outputs it to the video encoding unit 2035. If the encoding load evaluation information 2033 is 100%, the video resolution information 2034 indicates "320 pixels wide, 240 pixels high"; if it is 0%, it indicates "160 pixels wide, 120 pixels high".
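
The threshold rule and the mapping to a resolution can be sketched as below. This is an illustrative reading of the paragraph, not the patent's code: the function name, the parameter names, and the representation of a resolution as a (width, height) tuple are assumptions.

```python
def select_resolution(buffered: float, capacity: float, reference: int = 1) -> tuple:
    """Return (width, height) in pixels for the next captured frame.

    buffered  -- original audio currently held in the buffer
    capacity  -- amount of original audio the buffer can store
    reference -- encoding load reference information, fixed at 1 in this embodiment
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Basic evaluation: 0% when the buffer is more than half full, else 100%.
    evaluation = 0 if buffered > capacity / 2 else 100
    evaluation *= reference
    # 100% selects the full resolution, 0% the reduced one.
    return (320, 240) if evaluation == 100 else (160, 120)
```

A buffer that is exactly half full still counts as "half or less" and therefore keeps the full 320 by 240 resolution.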

[0458] In the twentieth embodiment, the initial value of the video resolution information 2034 is set to "320 pixels wide, 240 pixels high". When the original audio buffer amount 2004 exceeds 50% of the maximum buffer amount, the computation by the encoding load evaluation unit 2032 described above changes it to "160 pixels wide, 120 pixels high".

[0459] The operation of the video/audio encoding apparatus according to the twentieth embodiment, configured as described above, is outlined below. In this apparatus, the video capture unit 2031 receives the video resolution information 2034 and, by converting the input analog video information, creates and outputs digital original video information consisting of still image information with that resolution. The encoding load evaluation unit 2032 calculates the encoding load evaluation information 2033 and outputs the video resolution information 2034 according to the calculated value. The video encoding unit 2035 encodes and outputs the original video information output from the video capture unit 2031. At this time, the video encoding unit main body 2036 performs the actual video encoding, and the resolution correction information adding unit 2037 then adds resolution information to the encoded video information output by the video encoding unit main body 2036. Audio is handled in the same way as in the eighteenth embodiment.

[0460] An example of the encoding operation performed on certain video and audio by the video/audio encoding apparatus according to the twentieth embodiment is described below. As in the eighteenth embodiment, the video/audio encoding process is assumed to run on a general-purpose computer as two tasks under the control of the operating system: video encoding (the processing of the encoding load evaluation unit 2032 and the video encoding unit 2035) and audio encoding (the processing of the audio encoding unit 2005). When a task that has been allocated CPU time executes its series of processes and releases the computer resource (CPU time), the operating system allocates CPU time to another task.

[0461] First, as in the eighteenth embodiment, the video camera 2001 captures video and audio information and outputs it separately as analog audio information and analog video information. The audio capture unit 2002 receives the analog audio information output from the video camera 2001 and outputs it as digital original audio information. The audio buffering unit 2003 accumulates the original audio information and updates the original audio buffer amount 2004 according to the accumulated amount.

[0462] The encoding load evaluation unit 2032 checks the original audio buffer amount 2004 at that time. About 30% of the buffer is filled with the input original audio information, which is below the predetermined reference value of 50%, so the evaluation information is 100%. Because the encoding load reference information 2010 has the value "1" as described above, multiplying by it does not affect the result, and the encoding load evaluation information 2033 is 100%. The video resolution information 2034 is therefore "320 pixels wide, 240 pixels high", and the encoding load evaluation unit 2032 outputs this video resolution information 2034 to the video capture unit 2031 and the video encoding unit 2035.

[0463] Meanwhile, the video capture unit 2031 receives the analog video information output from the video camera 2001 and outputs it as digital original video information. At this point the video capture unit 2031 has not yet received the video resolution information 2034 from the encoding load evaluation unit 2032, so the initial value of the video resolution information 2034, "320 pixels wide, 240 pixels high", is used, and the video capture unit 2031 creates and outputs digital original video information consisting of still image information that is 320 pixels wide and 240 pixels high.

[0464] The original video information is input to the video encoding unit 2035 and is first encoded by the video encoding unit main body 2036, producing internally encoded video information. Next, the resolution correction information adding unit 2037 adds information indicating "320 pixels wide, 240 pixels high" to the internally encoded video information created by the video encoding unit main body 2036, and creates and outputs the encoded video information that is the output of the video/audio encoding apparatus.

[0465] Here, as in the description of the eighteenth embodiment, assume that tasks other than video encoding and audio encoding are also running on the operating system of the general-purpose computer. CPU time is allocated to another task, control passes to the "other processes", and CPU time is consumed. As shown in the eighteenth embodiment, the capture of video and audio by the video camera 2001 and the processing by the audio capture unit 2002 and the video capture unit 2031 proceed largely in parallel with these "other processes", and audio accumulates in the audio buffering unit 2003 up to 90%.

[0466] When the audio encoding process is subsequently executed, the audio encoding unit 2005 reads a fixed amount of original audio information from the audio buffering unit 2003, starting with the oldest accumulated data, deletes that original audio information from the audio buffering unit 2003, and updates the original audio buffer amount 2004. The audio encoding unit 2005 then encodes the original audio information. In the twentieth embodiment, the fixed amount that is read and deleted is 30%. The audio encoding unit 2005 reads and encodes 30% out of the original audio information that has accumulated to 90% as described above, and releases the CPU time when encoding finishes.

[0467] When processing returns to video encoding, the encoding load evaluation unit 2032 checks the original audio buffer amount 2004 at that time. Immediately after the 30% read described above, 60% of the original audio information still remains in the buffer. Because this exceeds the reference value of 50%, the evaluation information is 0%. Multiplying by the encoding load reference information 2010, which is "1", leaves the value unchanged, so the encoding load evaluation information 2033 is 0%. The video resolution information 2034 therefore becomes "160 pixels wide, 120 pixels high" and, as before, is output to the video capture unit 2031 and the video encoding unit 2035.

[0468] The video capture unit 2031 receives analog video information and outputs it as digital original video information. At this time the video resolution information 2034 is "160 pixels wide, 120 pixels high", so the video capture unit 2031 outputs digital original video information consisting of still image information that is 160 pixels wide and 120 pixels high.
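
The worked example of the twentieth embodiment up to this point can be followed as a short trace. The sketch hard-codes the figures given in the text (30% at the first video turn, accumulation to 90% while other processes run, a fixed 30% read by the audio task); the function names and the tuple layout of the log are hypothetical.

```python
def resolution_for(buffer_percent: int) -> tuple:
    """320x240 while the buffer is at most half full, otherwise 160x120."""
    return (320, 240) if buffer_percent <= 50 else (160, 120)


def trace_embodiment_20() -> list:
    """Follow the buffer fill level and capture resolution through the example."""
    log = []
    fill = 30                                         # fill level at the first video turn
    log.append(("video", fill, resolution_for(fill)))
    fill = 90                                         # audio accumulates while other processes hold the CPU
    fill -= 30                                        # audio task reads and deletes its fixed 30%
    log.append(("audio", fill, None))
    log.append(("video", fill, resolution_for(fill))) # 60% exceeds the 50% reference value
    return log
```

The trace shows the resolution dropping from 320 by 240 to 160 by 120 only on the video turn that follows the backlog, which is the point at which the audio buffer is at risk of overflowing.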

[0469] The original video information is input to the video encoding unit 2035, is first encoded in the video encoding unit main body 2036, and is output as internally encoded video information. In the previous pass the resolution was "320 pixels wide, 240 pixels high", whereas in this pass the video resolution is "160 pixels wide, 120 pixels high", so the amount of information expressed as a number of pixels is one quarter. This encoding process therefore finishes in one quarter of the time of the previous pass. The resolution correction information adding unit 2037 adds information indicating "160 pixels wide, 120 pixels high" to the internally encoded video information output from the video encoding unit main body 2036 and outputs the result as the encoded video information that is the output of the video/audio encoding apparatus.
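
The one-quarter figure follows directly from the pixel counts, as the short check below shows. The function name is hypothetical, and the conclusion that encoding time scales in the same ratio rests on the text's own premise that the time is proportional to the number of pixels.

```python
from fractions import Fraction


def pixel_ratio(full: tuple, reduced: tuple) -> Fraction:
    """Return the ratio of pixel counts between two (width, height) resolutions."""
    return Fraction(reduced[0] * reduced[1], full[0] * full[1])


# 160 * 120 = 19200 pixels against 320 * 240 = 76800 pixels.
ratio = pixel_ratio((320, 240), (160, 120))
```

Here `ratio` equals `Fraction(1, 4)`: halving both the width and the height quarters the number of pixels to be encoded.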

[0470] In FIG. 48, while the capture of video and audio from the video camera 2001 continues, the video encoding and audio encoding processes are executed as described above, so that the video and audio are encoded as they are captured. When the capture of video and audio ends, the encoding also ends.

[0471] As described above, the video/audio encoding apparatus of the twentieth embodiment comprises the audio buffering unit 2003, the video capture unit 2031, the encoding load evaluation unit 2032, and the video encoding unit 2035, which includes the video encoding unit main body 2036 and the resolution correction information adding unit 2037. The encoding load evaluation unit 2032 checks the amount of unprocessed audio information accumulated in the audio buffering unit 2003 and outputs the video resolution information 2034 according to that amount, thereby controlling the resolution of the input video information and hence the amount of video information. The CPU time consumed by video encoding can thus be controlled. The effect of a shortage of computer resources, caused by their consumption by other applications or by the video encoding unit itself, is confined to a temporary drop in resolution that is unlikely to be perceived as a problem, and a situation in which the audio is interrupted because of video encoding can be avoided.

[0472] In Embodiment 20, the video resolution information is changed on the condition that the original audio buffer amount exceeds a certain value, but other evaluation methods may be used. For example, the original audio buffer amount may be multiplied by a coefficient so that the resolution is always set according to the buffer amount; this likewise prevents audio interruption.
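
The two policies described for Embodiment 20 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names, the 0.5 threshold, the coefficient, and the linear scaling rule are all assumptions. Only the 320x240 and 160x120 frame sizes and the one-quarter pixel ratio come from the text.

```python
# Hypothetical sketch of the two resolution policies; names and constants are assumptions.
FULL = (320, 240)     # frame size used while the audio buffer is not under pressure
REDUCED = (160, 120)  # frame size used under load (both sizes appear in the description)


def resolution_by_threshold(buffer_fill, threshold=0.5):
    """Threshold policy: switch to the reduced size once the fill ratio exceeds the threshold."""
    return REDUCED if buffer_fill > threshold else FULL


def resolution_by_coefficient(buffer_fill, coefficient=1.0, min_scale=0.5):
    """Proportional policy: scale the frame down continuously as the buffer fills.

    buffer_fill is the original audio buffer amount as a ratio of capacity (0.0 to 1.0).
    The linear mapping from load to scale is an assumed example, not taken from the patent.
    """
    load = min(max(buffer_fill * coefficient, 0.0), 1.0)
    scale = 1.0 - load * (1.0 - min_scale)
    return (int(FULL[0] * scale), int(FULL[1] * scale))


def pixel_ratio(resolution):
    """Pixel count relative to the full-size frame, a rough proxy for encoding cost."""
    return (resolution[0] * resolution[1]) / (FULL[0] * FULL[1])
```

With these assumed defaults, a completely full buffer maps to 160x120 under either policy, and `pixel_ratio(REDUCED)` evaluates to 0.25, matching the one-quarter processing time stated in the description.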

[0473] Embodiment 21. Like Embodiment 18, the video/audio encoding apparatus according to Embodiment 21 of the present invention prevents audio interruption even when the load of software processing on a general-purpose computer or the like increases. It stops the encoding of video information using the amount of audio data processed as an index.

[0474] FIG. 49 shows the schematic configuration of the video/audio encoding apparatus according to Embodiment 21 of the present invention. As shown, the apparatus comprises a video camera 2101, an audio capture unit 2102, an audio buffering unit 2103, an audio encoding unit 2142, a video capture unit 2106, a video encoding unit 2107, an encoding load evaluation unit 2144, and a system timer 2141. As in Embodiment 18, encoded audio information and encoded video information are output as the device outputs.

[0475] In the figure, the audio encoding unit 2142, as in Embodiment 18, takes out the earliest-stored (oldest) original audio from the original audio accumulated in the audio buffering unit 2103, deletes it from the audio buffering unit 2103, compresses and encodes it, and outputs it as encoded audio information. In addition, the audio encoding unit 2142 of Embodiment 21 holds and updates a processed audio information amount 2143, which is the total amount of original audio taken out so far. The encoding load evaluation unit 2144 calculates, by the method described below, encoding load evaluation information 2145 used to control video encoding, and on the basis of that information indicates whether the original video information is to be encoded. The system timer 2141 measures the elapsed time of encoding.

[0476] The video camera 2101, audio capture unit 2102, audio buffering unit 2103, video capture unit 2106, and video encoding unit 2107 are the same as units 1801 to 1803, 1806, and 1807 of Embodiment 18, and their description is omitted.

[0477] To calculate the encoding load evaluation information 2145, the encoding load evaluation unit 2144 first calculates the original audio input amount from the elapsed encoding time obtained from the system timer 2141 and the input amount of original audio information per unit time, which is known in advance. It then obtains the predicted audio buffer amount as the difference between this calculated original audio input amount and the processed audio information amount 2143 held by the audio encoding unit 2142. Next, using the predicted audio buffer amount as the evaluation information, it obtains the encoding load evaluation information 2145 by multiplying by the encoding load reference information 2110, as in Embodiment 18. In Embodiment 21 as well, the encoding load reference information 2110 is fixed at "1", so the predicted audio buffer amount itself serves as the encoding load evaluation information 2145. The encoding load evaluation unit 2144 then uses the value of the encoding load evaluation information 2145 as follows: if it does not exceed a fixed amount, the unit outputs the original video information to the video encoding unit 2107 so that it is encoded; if it exceeds the fixed amount, the unit discards the original video information so that no encoding takes place. In Embodiment 21 the encoding load evaluation unit thus compares the predicted buffer amount with the fixed amount, which is set to 50% of the maximum buffer amount of the audio buffering unit 2103. The input amount of original audio information per unit time is such that the buffer of the audio buffering unit 2103 would become full in 10 seconds.
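
The calculation in this paragraph can be sketched as below. It is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, amounts are expressed in arbitrary units with a buffer capacity of 100, and the 10-second fill time, 50% threshold, and reference value of 1 are the values given in the text.

```python
# Hypothetical sketch of the Embodiment 21 decision; names and units are assumptions.

def predicted_audio_buffer(elapsed_s, input_rate, processed_amount):
    """Estimated backlog: audio that has arrived so far minus audio the encoder has taken."""
    return elapsed_s * input_rate - processed_amount


def should_encode_video(elapsed_s, processed_amount, buffer_capacity=100.0,
                        fill_time_s=10.0, threshold_ratio=0.5, load_reference=1.0):
    """Return True to encode the current frame, False to discard it."""
    # Known input rate: the buffer would become full in fill_time_s seconds.
    input_rate = buffer_capacity / fill_time_s
    # The reference value is fixed at 1, so the product equals the predicted backlog.
    evaluation = predicted_audio_buffer(elapsed_s, input_rate, processed_amount) * load_reference
    # Encode only while the value does not exceed the fixed amount (50% of capacity).
    return evaluation <= threshold_ratio * buffer_capacity
```

Note that the function never reads the buffer itself: it needs only a clock and the encoder's running total, which is the point of using a predicted amount.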

[0478] The operation of the video/audio encoding apparatus of Embodiment 21 configured in this way is outlined as follows. The audio encoding unit 2142 takes out the earliest-stored (oldest) original audio from the audio buffering unit 2103, deletes it from the audio buffering unit 2103, updates the processed audio information amount 2143, which is the total of the original audio taken out so far, and compresses and encodes the original audio for output as encoded audio information. The encoding load evaluation unit 2144 calculates the original audio input amount from the elapsed encoding time obtained from the system timer 2141 and the known input amount of original audio information per unit time, obtains the predicted audio buffer amount as the difference between this input amount and the processed audio information amount 2143, and uses it to obtain the encoding load evaluation information 2145. Video encoding is controlled according to the value of this information. Audio is handled in the same way as in Embodiment 18.

[0479] An example of the encoding of a given video/audio stream by the apparatus of Embodiment 21 is described below. As in Embodiment 18, video/audio encoding is assumed to run on a general-purpose computer as two tasks under the control of the operating system: video encoding (the processing of the encoding load evaluation unit 2144 and the video encoding unit 2107) and audio encoding (the processing of the audio encoding unit 2142). When a task that has been allocated CPU time finishes its series of operations and releases the computer resource (CPU time), the operating system allocates CPU time to another task.

[0480] First, as in Embodiment 18, the video camera 2101 captures video/audio information and outputs it separately as analog audio information and analog video information. The audio capture unit 2102 receives the analog audio information output from the video camera 2101 and outputs it as digital original audio information. The audio buffering unit 2103 accumulates the original audio information and updates the original audio buffer amount 2104 according to the amount accumulated. Meanwhile, the video capture unit 2106 receives the analog video information output from the video camera 2101 and outputs it as digital original video information.

[0481] The encoding load evaluation unit 2144 first receives the original video information output from the video capture unit 2106 and, at that point, checks the predicted audio buffer amount. The elapsed time obtained from the system timer 2141 is 1 second and the processed audio information amount 2143 is still 0, so the predicted audio buffer amount is 10%, below the predetermined reference value of 50%. Because the encoding load reference information 2110 has the value "1", which need not be considered in the multiplication, the encoding load evaluation information 2145 is 100%. The encoding load evaluation unit 2144 therefore passes the original video information to the video encoding unit 2107, which encodes it and releases the CPU time when encoding is complete.

[0482] As in the description of Embodiment 18, assume that tasks other than video encoding and audio encoding are also running on the operating system of the general-purpose computer. CPU time is allocated to those tasks, control passes to the "other processes", and CPU time is consumed. As in Embodiment 18, capture by the video camera 2101 and the processing of the audio capture unit 2102 and the video capture unit 2106 proceed largely in parallel with the "other processes", and audio accumulates in the audio buffering unit 2103 up to 90%.

[0483] When audio encoding is subsequently executed, the audio encoding unit 2142 reads a fixed amount of original audio information from the audio buffering unit 2103, oldest first, deletes that portion from the audio buffering unit 2103, and updates the original audio buffer amount 2104. The audio encoding unit 2142 then encodes the original audio information. In Embodiment 21 the fixed amount read and deleted is 30%: the audio encoding unit 2142 reads and encodes 30% out of the 90% accumulated as described above, and when encoding is complete it adds that 30% to the processed audio information amount 2143 it holds and releases the CPU time.
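
The audio-side bookkeeping in this step can be sketched as follows. This is an illustrative model only: the class names, the byte-oriented buffer, and the `compress` callback (a stand-in for whatever audio codec is used) are assumptions, not elements named in the patent.

```python
from collections import deque


class AudioBuffer:
    """FIFO of original audio chunks; `amount` stands in for the original audio buffer amount."""

    def __init__(self):
        self._chunks = deque()
        self.amount = 0

    def store(self, chunk):
        self._chunks.append(chunk)
        self.amount += len(chunk)

    def take_oldest(self, size):
        """Remove and return up to `size` bytes, oldest first."""
        out = bytearray()
        while self._chunks and len(out) < size:
            chunk = self._chunks.popleft()
            need = size - len(out)
            out += chunk[:need]
            if len(chunk) > need:
                self._chunks.appendleft(chunk[need:])  # put the unread tail back
        self.amount -= len(out)
        return bytes(out)


class AudioEncoder:
    """Consumes audio from the buffer and keeps a running total of what it has taken."""

    def __init__(self, buffer, compress):
        self.buffer = buffer
        self.compress = compress      # placeholder codec supplied by the caller
        self.processed_amount = 0     # stands in for the processed audio information amount

    def run_once(self, size):
        raw = self.buffer.take_oldest(size)
        encoded = self.compress(raw)
        self.processed_amount += len(raw)  # updated once encoding has finished
        return encoded
```

For example, storing 90 bytes and calling `run_once(30)` leaves `amount` at 60 and `processed_amount` at 30, mirroring the 90% and 30% figures in the description.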

[0484] When processing returns to video encoding, the encoding load evaluation unit 2144 first refers to the system timer 2141 to check the elapsed time at that point. Because control had passed to the "other processes", the elapsed time is 9 seconds. It then refers to the processed audio information amount 2143 held by the audio encoding unit 2142, which is 30%. The predicted audio buffer amount is therefore 60% (90% input over 9 seconds less the 30% already processed), which exceeds the reference value of 50%, so the evaluation information is 0%, and the encoding load evaluation information obtained by multiplying by the encoding load reference information 2110 of "1" is also 0%. The encoding load evaluation unit 2144 therefore discards the original video information at that point and promptly releases the CPU time so that audio encoding can be executed.
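
The sequence of turns in this worked example can be replayed with a short script. It is only a sketch: the variable names are hypothetical, all amounts are expressed as a percentage of buffer capacity, and the figures (10 seconds to fill, 50% threshold, 90% accumulated, 30% read) are those given in the description.

```python
# Replay of the Embodiment 21 example; amounts are percentages of buffer capacity.
CAPACITY = 100
RATE = CAPACITY / 10    # the buffer would fill in 10 seconds, so 10% per second
THRESHOLD = 50          # video is discarded when the predicted backlog exceeds 50%


def video_turn(elapsed_s, processed):
    """Return the predicted backlog and the resulting action for one video turn."""
    predicted = elapsed_s * RATE - processed
    action = "encode" if predicted <= THRESHOLD else "discard"
    return predicted, action


trace = []
processed = 0

# First video turn: 1 s elapsed, no audio has been encoded yet.
trace.append(video_turn(1, processed))

# Other processes hold the CPU; when the audio task runs, the buffer holds 90%.
actual_buffer = 90
actual_buffer -= 30     # the audio encoder removes a 30% portion ...
processed += 30         # ... and adds it to its processed total

# Second video turn: 9 s elapsed.
trace.append(video_turn(9, processed))

for predicted, action in trace:
    print(f"predicted backlog {predicted:.0f}% -> {action}")
print(f"actual backlog after the audio turn: {actual_buffer}%")
```

In this trace the predicted backlog at the second video turn (60%) coincides with the amount actually left in the buffer, which is why the prediction can stand in for the original audio buffer amount.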

[0485] In FIG. 49, for as long as the video camera 2101 continues to capture video and audio, the video encoding and audio encoding processes run as described above, so that encoding proceeds concurrently with capture. When capture ends, encoding ends as well.

[0486] As described above, the video/audio encoding apparatus of Embodiment 21 comprises the system timer 2141, the audio encoding unit 2142 that holds the processed audio information amount 2143, and the encoding load evaluation unit 2144. The encoding load evaluation unit 2144 calculates a predicted buffer amount from the elapsed time obtained from the system timer 2141 and the processed audio information amount 2143 obtained from the audio encoding unit 2142, and uses this predicted buffer amount in place of the original audio buffer amount 2104 to control video encoding. As in Embodiment 18, which performs the control using the original audio buffer amount, audio interruption caused by a shortage of computer resources, whether consumed by other applications or by the video encoding unit itself, can be prevented.

[0487] Furthermore, unlike Embodiment 18, Embodiment 21 can predict the original audio buffer amount 2104 that is presumably accumulated at present, even when that amount is not known, by referring to the amount of information processed by the audio encoding unit 2142. It can therefore readily be applied even when an existing application whose buffer section is a black box is used.

[0488] Embodiment 21 controls video encoding using the predicted buffer amount based on the processed audio information amount as an index, in a configuration based on Embodiment 18, which stops video encoding according to the situation. The technique of Embodiment 21 of using the predicted buffer amount as an index can also be applied to Embodiment 19, which controls the processing amount of inter-frame predictive encoding, and to Embodiment 20, which changes the resolution. The same effect is obtained, namely that control is possible even when the original audio buffer amount 2104 cannot be known.

[0489] Embodiment 22. Like Embodiment 18, the video/audio encoding apparatus according to Embodiment 22 of the present invention prevents audio interruption even when the load of software processing on a general-purpose computer or the like increases. It stops the encoding of video information using the amount of encoded audio as an index.

[0490] FIG. 50 shows the schematic configuration of the video/audio encoding apparatus according to Embodiment 22 of the present invention. As shown, the apparatus comprises a video camera 2201, an audio capture unit 2202, an audio buffering unit 2203, an audio encoding unit 2205, a video capture unit 2206, a video encoding unit 2207, an encoding load evaluation unit 2253, and a system timer 2251. As in Embodiment 18, encoded audio information and encoded video information are output as the device outputs.

[0491] In the figure, the encoding load evaluation unit 2253 obtains a predicted audio buffer amount and derives the encoding load evaluation information from it, as in Embodiment 21, but the method of obtaining the predicted audio buffer amount differs from that of the encoding apparatus of Embodiment 21. In Embodiment 22, the encoding load evaluation unit 2253 detects the encoded audio amount 2252 output from the audio encoding unit 2205 and uses the processed audio information amount 2254 derived from the encoded audio amount 2252 in place of the processed audio information amount 2143 of Embodiment 21. As noted above, the encoded audio information is the device output of the video/audio encoding apparatus and is transmitted, recorded, and so on, so its amount can easily be detected. The encoding load evaluation unit 2253 of Embodiment 22 is the same as in Embodiment 21 in that it obtains the elapsed time from the system timer 2251, obtains the original audio input amount from the elapsed time and the original audio input amount per unit time, and uses encoding load reference information 2210 fixed at "1".

[0492] The video/audio encoding apparatus of Embodiment 22 has the same configuration as that of Embodiment 21, except that the function of the encoding load evaluation unit 2253 differs as described above and the audio encoding unit 2205 does not hold a processed audio amount. Accordingly, the video camera 2201, audio capture unit 2202, audio buffering unit 2203, video capture unit 2206, and video encoding unit 2207 are the same as units 1801 to 1803, 1806, and 1807 of Embodiment 18, and the system timer 2251 is the same as in Embodiment 21, so their description is omitted.

[0493] In Embodiment 22, as in Embodiment 21, the fixed amount compared with the predicted audio buffer amount is 50% of the maximum buffer amount, and the input amount of original audio per unit time is such that the buffer of the audio buffering unit 2203 would become full in 10 seconds. In addition, the compression ratio of the audio encoding unit 2205 is 1/10.

[0494] The operation of the video/audio encoding apparatus of Embodiment 22 configured in this way is outlined as follows. The encoding load evaluation unit 2253 calculates the original audio input amount from the elapsed encoding time obtained from the system timer 2251 and the known input amount of original audio information per unit time. It obtains the processed audio information amount 2254 from the encoded audio amount 2252, which is the total amount of encoded audio information output by the audio encoding unit 2205, then obtains the predicted audio buffer amount as the difference between the original audio input amount and the processed audio information amount 2254, and uses it to obtain the encoding load evaluation information 2209. If the predicted audio buffer amount is less than the fixed amount, the encoding load evaluation unit 2253 inputs the original video information to the video encoding unit 2207; if it is equal to or greater than the fixed amount, the unit discards the original video information, suspends the processing of the video encoding unit 2207, and yields the computer resource (CPU time) to the audio encoding unit 2205.

[0495] An example of the encoding of a given video/audio stream by the apparatus of Embodiment 22 is described below. As in Embodiment 18, video/audio encoding is assumed to run on a general-purpose computer as two tasks under the control of the operating system: video encoding (the processing of the encoding load evaluation unit 2253 and the video encoding unit 2207) and audio encoding (the processing of the audio encoding unit 2205). When a task that has been allocated CPU time finishes its series of operations and releases the computer resource (CPU time), the operating system allocates CPU time to another task.

[0496] First, as in Embodiment 18, the video camera 2201 captures video/audio information and outputs it separately as analog audio information and analog video information. The audio capture unit 2202 receives the analog audio information output from the video camera 2201 and outputs it as digital original audio information. The audio buffering unit 2203 accumulates the original audio information and updates the original audio buffer amount 2204 according to the amount accumulated. Meanwhile, the video capture unit 2206 receives the analog video information output from the video camera 2201 and outputs it as digital original video information.

[0497] The encoding load evaluation unit 2253 first receives the original video information output from the video capture unit 2206 and, at that point, checks the predicted audio buffer amount. The elapsed time obtained from the system timer 2251 is 1 second and the encoded audio amount 2252 is still 0, so the predicted audio buffer amount is 10%, below the predetermined reference value of 50%. Because the encoding load reference information 2210 has the value "1", which need not be considered in the multiplication, the encoding load evaluation information 2209 is 100%. The encoding load evaluation unit 2253 therefore inputs the original video information to the video encoding unit 2207, which encodes it and releases the CPU time when encoding is complete.

[0498] As in the description of Embodiment 18, assume that tasks other than video encoding and audio encoding are also running on the operating system of the general-purpose computer. CPU time is allocated to those tasks, control passes to the "other processes", and CPU time is consumed. As in Embodiment 18, capture by the video camera 2201 and the processing of the audio capture unit 2202 and the video capture unit 2206 proceed largely in parallel with the "other processes", and audio accumulates in the audio buffering unit 2203 up to 90%.

[0499] When audio encoding is subsequently executed, the audio encoding unit 2205 reads a fixed amount of original audio information from the audio buffering unit 2203, oldest first, deletes that portion from the audio buffering unit 2203, and updates the original audio buffer amount 2204. The audio encoding unit 2205 then encodes the original audio information. In Embodiment 22 the fixed amount read and deleted is 30%: the audio encoding unit 2205 reads and encodes 30% out of the 90% accumulated as described above, and releases the CPU time when encoding is complete.

[0500] When processing returns to video encoding, the encoding load evaluation unit 2253 first refers to the system timer 2251 to check the elapsed time at that point. Because control had passed to the "other processes", the elapsed time is 9 seconds. Next, the encoded audio amount output from the audio encoding unit 2205 is 3% of the buffer capacity of the audio buffering unit 2203, and since the compression ratio is 1/10, the processed audio information amount 2254 is found to be 30%. The predicted audio buffer amount is therefore 60% (90% input over 9 seconds less the 30% already processed), which exceeds the reference value of 50%, so the evaluation information is 0%, and the encoding load evaluation information obtained by multiplying by the encoding load reference information 2210 of "1" is also 0%. The encoding load evaluation unit 2253 therefore discards the original video information at that point and promptly releases the CPU time so that audio encoding can be executed.
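
The Embodiment 22 estimate, which works back from the size of the encoder's output, can be sketched as below. The names are hypothetical and amounts are percentages of buffer capacity. The sketch also assumes, as the description does, a constant compression ratio of 1/10 (written here as an integer factor of 10); a codec with a variable ratio would need a different estimate.

```python
# Hypothetical sketch of the Embodiment 22 estimate; amounts are percentages of capacity.

def processed_from_encoded(encoded_amount, compression_factor=10):
    """Recover the original audio consumed from the encoded output size.

    compression_factor is original size divided by encoded size (10 for a 1/10 ratio).
    """
    return encoded_amount * compression_factor


def predicted_backlog(elapsed_s, encoded_amount, input_rate=10, compression_factor=10):
    """Audio that has arrived so far minus the audio inferred to have been encoded."""
    return elapsed_s * input_rate - processed_from_encoded(encoded_amount, compression_factor)


def should_encode_video(elapsed_s, encoded_amount, threshold=50):
    """Encode while the predicted backlog is below the fixed amount; otherwise discard."""
    return predicted_backlog(elapsed_s, encoded_amount) < threshold
```

With the figures from the example, `predicted_backlog(9, 3)` gives 60, so `should_encode_video(9, 3)` is false and the frame is discarded. Only the elapsed time and the observable output size are needed, so neither the buffer nor the encoder's internal counters have to be accessible.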

[0501] In FIG. 50, for as long as the video camera 2201 continues to capture video and audio, the video encoding and audio encoding processes run as described above, so that encoding proceeds concurrently with capture. When capture ends, encoding ends as well.

【０５０２】As described above, the video/audio encoding apparatus according to Embodiment 22 includes the system timer 2251 and the encoding load evaluator 2253, which obtains the processed audio information amount 2254 from the encoded audio amount 2252. The encoding load evaluator 2253 calculates a predicted buffer amount from the elapsed time obtained by referring to the system timer 2251 and the processed audio information amount 2254 obtained by referring to the encoded audio amount 2252, and controls video encoding using this predicted buffer amount in place of the original audio buffer amount 2204. As in Embodiment 18, where control uses the original audio buffer amount itself, this makes it possible to prevent audio dropouts caused by a shortage of computer resources that have been consumed by other applications or by the video encoding unit itself.
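The prediction described above can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function name, its parameters, and the constant capture rate of 10% of buffer capacity per second are assumptions chosen so that the figures of the worked example (9 seconds elapsed, 3% encoded audio at a 1/10 compression ratio, 60% predicted) are reproduced.

```python
def predicted_buffer_percent(elapsed_s, capture_rate, encoded_percent, compression_factor):
    """Estimate the unprocessed audio held in the buffer, in % of its capacity.

    elapsed_s          -- elapsed time read from the system timer (seconds)
    capture_rate       -- assumed rate at which captured audio fills the buffer (%/s)
    encoded_percent    -- encoded audio output so far, in % of buffer capacity
    compression_factor -- 10 for a 1/10 compression ratio
    """
    captured = elapsed_s * capture_rate                # raw audio captured so far
    processed = encoded_percent * compression_factor   # raw audio already encoded
    # Clamp to the physical range of the buffer.
    return max(0, min(100, captured - processed))


# Figures from the worked example: 9 s elapsed, 3% encoded, ratio 1/10.
print(predicted_buffer_percent(9, 10, 3, 10))
```

The predicted value then takes the place of the measured original audio buffer amount when the encoding load evaluator compares it against the 50% reference value.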

【０５０３】Furthermore, unlike Embodiments 18 and 21, Embodiment 22 can predict the original audio buffer amount 2204 that is presumably accumulated at present by referring to the amount of information that the audio encoding unit 2253 has processed and output, even when neither the original audio buffer amount 2204 nor the processing amount in the audio encoding unit 2253 is known. It can therefore readily accommodate an existing application in which the audio encoding unit, as well as the buffer unit, is a black box.

【０５０４】Embodiment 22 applies control that uses the predicted buffer amount, based on the encoded audio amount, as an index to Embodiment 18, which stops video encoding depending on the situation. The same technique can also be applied to Embodiment 19, which controls the processing amount of inter-frame predictive encoding, and to Embodiment 20, which changes the resolution, with the same effect.

【０５０５】Embodiments 18 to 22 described so far all assume that encoding is feasible on average, and are intended to prevent audio dropouts when computer resources become temporarily scarce because of a momentary or short-term increase in load. The video/audio encoding apparatuses of these embodiments can also be implemented as software running on a computer system whose computing capability is fundamentally limited, but in that case they cannot be said to be suitable for encoding under all conditions and for all targets.

【０５０６】FIG. 51 shows how the audio buffer amount changes when the video/audio encoding apparatus of any of Embodiments 18 to 22 is implemented on a computer system with fundamentally limited computing capability. As the figure shows, the video encoding load is too large overall, so unprocessed audio information builds up while video is being encoded. As a result, once video encoding finishes, control gives priority to audio encoding; audio encoding is then executed intensively, and video encoding stops in the meantime. When the unprocessed audio information has decreased and processing returns to video encoding, the unprocessed audio information grows again. As this cycle repeats, the encoded video swings abruptly from high quality to low quality and back to high quality, and the result is hard to watch when played back.

【０５０７】To solve this problem with the video encoding of Embodiments 18 to 22, the video encoding apparatuses according to Embodiments 23 and 24 of the present invention, described below, are implemented on a computer with fundamentally limited capability and, while still preventing audio dropouts, can suppress the large swings in video quality described above.

【０５０８】Embodiment 23. The video/audio encoding apparatus according to Embodiment 23 of the present invention handles the case where video and audio are encoded in software on a general-purpose computer or the like that is not high-performance, and does so by setting the encoding load reference information.

【０５０９】FIG. 52 shows the schematic configuration of the video/audio encoding apparatus according to Embodiment 23 of the present invention. As illustrated, the apparatus comprises a video camera 2301, an audio capture unit 2302, an audio buffering unit 2303, an audio encoding unit 2305, a video capture unit 2306, a video encoding unit 2307, an encoding load evaluator 2308, a system timer 2361, and an encoding load reference setting unit 2362. As in Embodiment 18, the apparatus outputs encoded audio information and encoded video information.

【０５１０】In the figure, the system timer 2361 measures elapsed time. The encoding load reference setting unit 2362 examines the fluctuation of the original audio buffer amount 2304 per unit time and sets the encoding load reference information 2363 according to the degree of fluctuation. The encoding load evaluator 2308 calculates the encoding load evaluation information 2309 using the encoding load reference information set by the encoding load reference setting unit 2362, rather than encoding load reference information with a fixed value. The apparatus of Embodiment 23 is the apparatus of Embodiment 18 with the system timer 2361 and the encoding load reference setting unit 2362 added; the video camera 2301, audio capture unit 2302, audio buffering unit 2303, audio encoding unit 2305, video capture unit 2306, and video encoding unit 2307 are the same as 1801 to 1807 of Embodiment 18, and their description is omitted.

【０５１１】The encoding load reference setting unit 2362 performs a counting operation that registers one count when the original audio buffer amount 2304 exceeds a fixed value or falls below that value, and resets the encoding load reference information 2363 once the count exceeds three per unit time.

【０５１２】The encoding load reference information 2363 indicates how much video processing is performed when the audio buffer is empty. For example, a value of "1", representing 100%, means that 100% of the video processing is performed when the audio buffer is empty, and a value of "0.5", representing 50%, means that 50% of it is performed.

【０５１３】In Embodiment 23, the encoding load reference information 2363 has an initial value of "1". When fluctuation of the original audio buffer amount 2304 causes the count kept by the encoding load reference setting unit 2362 to exceed three, the value is reset to "0.5".
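The counting behavior of the encoding load reference setting unit can be sketched as below. This is an illustration under stated assumptions rather than the disclosed implementation: the class and method names are hypothetical, the length of the "unit time" (60 seconds here) is not given in the text, and one count is taken to mean one swing in which the buffer rises to or above the 50% value and then falls below it. The text speaks both of the count "exceeding" three and of it "reaching" three; the sketch triggers on reaching the limit.

```python
class LoadReferenceSetter:
    """Hypothetical model of the encoding load reference setting unit."""

    def __init__(self, threshold=50, count_limit=3, window_s=60.0, reduced=0.5):
        self.threshold = threshold      # preset buffer level in %
        self.count_limit = count_limit  # swings per unit time that trigger a reset
        self.window_s = window_s        # assumed length of the "unit time"
        self.reduced = reduced          # value set after too many swings
        self.reference = 1.0            # initial encoding load reference information
        self._was_above = False
        self._count = 0
        self._window_start = 0.0

    def observe(self, now_s, buffer_percent):
        """Feed one buffer reading; return the current reference value."""
        if now_s - self._window_start >= self.window_s:
            # A new unit time begins: start counting afresh.
            self._window_start, self._count = now_s, 0
        if buffer_percent >= self.threshold:
            self._was_above = True
        elif self._was_above:
            # Fell below the preset value after having exceeded it: one count.
            self._was_above = False
            self._count += 1
        if self._count >= self.count_limit:
            self.reference = self.reduced
        return self.reference


setter = LoadReferenceSetter()
for t, level in enumerate([10, 60, 30, 90, 60, 30, 60, 30]):
    print(t, level, setter.observe(float(t), level))
```

In this run the third swing occurs at the last reading, at which point the reference value drops from 1.0 to 0.5.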

【０５１４】The encoding load evaluator 2308 uses this value to obtain the encoding load evaluation information 2309. When the encoding load reference information 2363 is "1", the encoding load evaluation information 2309 takes a value of 0% or 100%, as in Embodiment 18; when it is "0.5", the value is 0% or 50%. When the value is 50%, the encoding load evaluator 2308 passes only 50% of the original video information input at that time to the video encoding unit 2307, rather than all of it. In this case, therefore, the video is processed at 15 fps rather than at the full frame rate (30 fps).
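The relationship between the evaluation value, the reference information, and the resulting frame rate can be illustrated with a short sketch. The function names are hypothetical, and the even spacing used when dropping frames is an assumption; the text states only that 50% of the original video information is passed on.

```python
def load_evaluation(buffer_percent, reference, threshold=50.0):
    """Encoding load evaluation information in %, as evaluation value x reference."""
    evaluation_value = 0.0 if buffer_percent >= threshold else 100.0
    return evaluation_value * reference


def frames_to_encode(frames, load_percent):
    """Keep an evenly spaced share of the frames matching load_percent."""
    kept, credit = [], 0.0
    for frame in frames:
        credit += load_percent
        if credit >= 100.0:
            credit -= 100.0
            kept.append(frame)
    return kept


one_second = list(range(30))  # 30 captured frames, i.e. full frame rate
for buffer_level, reference in [(10, 1.0), (60, 1.0), (30, 0.5)]:
    load = load_evaluation(buffer_level, reference)
    print(buffer_level, reference, load, len(frames_to_encode(one_second, load)))
```

With a reference of 0.5 and a buffer below the 50% value, half of the 30 frames are kept, which corresponds to the 15 fps case described above.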

【０５１５】The operation of the video/audio encoding apparatus of Embodiment 23, configured as above, is outlined as follows. Based on the timing output of the system timer 2361, the encoding load reference setting unit 2362 examines the fluctuation of the original audio buffer amount 2304 per unit time and sets the encoding load reference information 2363 according to the degree of fluctuation. The encoding load evaluator 2308 calculates the encoding load evaluation information 2309 using the encoding load reference information 2363 thus set and, according to the calculated value, decides whether to input the original video information to the video encoding unit 2307 or to discard it, suspend the processing of the video encoding unit 2307, and yield the computer resources (CPU time) to the audio encoding unit 2305. Consequently, even when the video encoding unit 2307 does encode original video information, not all of it is necessarily processed; it is processed at a rate suited to the situation.

【０５１６】An example of the encoding of a given video/audio input by the apparatus of Embodiment 23 is described below. As in Embodiment 18, video/audio encoding is performed on a general-purpose computer as two tasks under the control of the operating system: video encoding (the processing of the encoding load evaluator 2308 and the video encoding unit 2307) and audio encoding (the processing of the audio encoding unit 2305). When a task that has been allocated CPU time completes its sequence of processing and releases the computer resources (CPU time), the operating system allocates CPU time to another task. The basic capability of the general-purpose computer implementing this apparatus is assumed to be lower than that of the computer implementing the encoding apparatus of Embodiment 18.

【０５１７】First, as in Embodiment 18, the video camera 2301 captures video and audio and outputs analog audio information and analog video information separately. The audio capture unit 2302 receives the analog audio information output from the video camera 2301 and outputs it as digital original audio information. The audio buffering unit 2303 accumulates the original audio information and updates the original audio buffer amount 2304 according to the amount accumulated. Meanwhile, the video capture unit 2306 receives the analog video information output from the video camera 2301 and outputs it as digital original video information.

【０５１８】The encoding load evaluator 2308 first receives the original video information output from the video capture unit 2306 and checks the original audio buffer amount 2304 and the encoding load reference information 2363 at that time. At this point the original audio buffer amount 2304 is 10%, below the predetermined reference value of 50%, so the evaluation value is 100%; and because the encoding load reference information 2363 is at its initial value of "1", the encoding load evaluation information 2309 is 100%. The encoding load evaluator 2308 therefore inputs all frames of the original video information to the video encoding unit 2307, which performs video encoding and releases the CPU time when encoding is finished.

【０５１９】As in Embodiment 18, the video camera 2301 and the capture board implementing the audio capture unit 2302 and the video capture unit 2306 can operate largely in parallel with the task processes under CPU control. Because the video encoding consumed a large amount of CPU time, original audio information accumulates in the audio buffering unit 2303 up to 90%.

【０５２０】The audio encoding unit 2305 reads a fixed amount (the original audio read amount) of original audio information from the audio buffering unit 2303, oldest first, deletes the portion read from the audio buffering unit 2303, and updates the original audio buffer amount 2304. The audio encoding unit 2305 then encodes the original audio information. In Embodiment 23 the original audio read amount is 30% of the maximum buffer amount. Since 90% had accumulated in the audio buffering unit 2303 as described above, 30% is read, the original audio buffer amount 2304 is updated to 60%, the original audio information is encoded, and the CPU time is released when encoding is finished. The encoding load reference setting unit 2362, which monitors the original audio buffer amount 2304, recognizes that the amount, now 60%, has exceeded the preset value of 50%.

【０５２１】When processing returns to video encoding, the encoding load evaluator 2308 checks the original audio buffer amount 2304 and the encoding load reference information 2363 at that time. The original audio buffer amount 2304 is now 60%, at or above the predetermined reference value of 50%, so the evaluation value is 0%, and multiplying it by the encoding load reference information 2363 of "1" still gives encoding load evaluation information 2309 of 0%. The encoding load evaluator 2308 therefore discards all frames of the original video information at this point and promptly releases the CPU time. The original audio buffer amount 2304 thus does not change during this video encoding step.

【０５２２】The audio encoding unit 2305 reads a fixed amount (the original audio read amount) of original audio information from the audio buffering unit 2303, oldest first, deletes the portion read from the audio buffering unit 2303, and updates the original audio buffer amount 2304. The audio encoding unit 2305 then encodes the original audio information. The original audio read amount is 30%, and 60% of original audio information had accumulated in the audio buffering unit 2303, so 30% is read, the original audio buffer amount 2304 is updated to 30%, the original audio information is encoded, and the CPU time is released when encoding is finished.

【０５２３】The encoding load reference setting unit 2362, which monitors the original audio buffer amount 2304, sees that the amount is now 30%, below the preset value of 50%. Taken together with the earlier rise above the preset value, it recognizes this as one count of fluctuation in the original audio buffer amount 2304; that is, the counting operation advances by one count.

【０５２４】As this process repeats and the count kept by the encoding load reference setting unit 2362 for fluctuation of the original audio buffer amount 2304 reaches three, the encoding load reference setting unit 2362 resets the encoding load reference information 2363 from its initial value of "1" to "0.5", which indicates 15 fps.

【０５２５】When processing returns to video encoding, the encoding load evaluator 2308 checks the original audio buffer amount 2304 and the encoding load reference information 2363 at that time. The original audio buffer amount 2304 is 30%, so the evaluation value is 100%. However, the encoding load reference information 2363 is "0.5", so multiplication gives encoding load evaluation information 2309 of 50%. Half of the frames of the original video information are therefore dropped and discarded, the remaining frames are input to the video encoding unit 2307, video encoding is performed, and the CPU time is released when it is finished.

【０５２６】The audio encoding unit 2305 reads a fixed amount (30% in Embodiment 23) of original audio information from the audio buffering unit 2303, oldest first, deletes that portion from the audio buffering unit 2303, and updates the original audio buffer amount 2304. The audio encoding unit 2305 then encodes the original audio information. Because the preceding video encoding step took only about half the time it took before the encoding load reference information was reset, 60% of audio has accumulated in the audio buffering unit 2303. Of this, 30% is read, leaving the buffer at 30%, and the CPU time is released when encoding is finished.

【０５２７】When processing returns to video encoding, the encoding load evaluator 2308 checks the original audio buffer amount 2304 and the encoding load reference information 2363 at that time. The original audio buffer amount 2304 is 30%, below the preset value, but the encoding load reference information 2363 specifies 15 fps, so half of the frames of the original video information are dropped and discarded, the remaining frames are input to the video encoding unit 2307, video encoding is performed, and the CPU time is released when it is finished.
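The alternation of video and audio turns in this worked example can be traced with a small simulation. It is only a sketch: the function name is hypothetical, and the buffer growth per video turn (80 points at full load, 30 points at half load, none when frames are discarded) is an assumption read off the percentages quoted in the example, not a property stated by the source.

```python
THRESHOLD = 50    # reference value for the buffer level (%)
AUDIO_READ = 30   # audio read per audio turn (% of buffer capacity)
# Assumed buffer growth during one video turn, taken from the example:
# 10% -> 90% at full load, 30% -> 60% at half load, unchanged on discard.
GROWTH = {100: 80, 50: 30, 0: 0}


def simulate(turns, reference=1.0, buffer=10):
    """Alternate video and audio turns; return (task, load, buffer level) steps."""
    trace = []
    for _ in range(turns):
        evaluation_value = 0 if buffer >= THRESHOLD else 100
        load = int(evaluation_value * reference)
        buffer = min(100, buffer + GROWTH[load])   # audio piles up during video
        trace.append(("video", load, buffer))
        buffer = max(0, buffer - AUDIO_READ)       # audio task drains the buffer
        trace.append(("audio", None, buffer))
    return trace


for step in simulate(2):                          # reference information "1"
    print(step)
for step in simulate(2, reference=0.5, buffer=30):  # after the reset to "0.5"
    print(step)
```

With the reference at "1" the buffer level swings across the 50% value on every cycle, whereas with "0.5" it settles into a steady alternation between 60% and 30%, which is the balanced behavior the example describes.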

【０５２８】In FIG. 52, while the video camera 2301 continues to capture video and audio, the video encoding and audio encoding processes are executed as described above, so that the video and audio are encoded as they are captured. When the capture ends, encoding also ends.

【０５２９】FIG. 53 shows the encoding operation of the video/audio encoding apparatus of Embodiment 23 over a long period. As the figure shows, in section A the audio buffer amount fluctuates widely, as seen in FIG. 51, and video encoding alternates repeatedly with encoding that prioritizes audio, as described for Embodiments 18 to 22 running on a low-performance computer. As explained above, however, the encoding load reference information 2363 is reset during this period; in section B the reference for the video encoding load has been halved, so encoding proceeds in a well-balanced manner.

【０５３０】As described above, the video/audio encoding apparatus of Embodiment 23 adds the system timer 2361 and the encoding load reference setting unit 2362 to the video/audio encoding apparatus of Embodiment 18. The encoding load reference setting unit 2362 resets the encoding load reference information 2363 in response to fluctuation of the original audio buffer amount 2304, and the encoding load evaluator 2308 obtains the encoding load evaluation information 2309 using the encoding load reference information 2363, so the proportion of original video information that is encoded changes with the situation. Consequently, in addition to preventing audio dropouts caused by load fluctuation, when video and audio are encoded by software running on a computer system with fundamentally limited computing capability, the apparatus automatically sets the video load best suited to that system and thereby avoids alternating between high-quality and low-quality video.

【０５３１】Embodiment 23 applies control that changes the encoding load reference information, based on fluctuation in the amount of accumulated original audio information, to Embodiment 18, which stops video encoding depending on the situation. The same technique of changing the encoding load reference information can also be applied to Embodiment 19, which controls the processing amount of inter-frame predictive encoding, and to Embodiment 20, which changes the resolution, with the same effect as in Embodiment 23.

【０５３２】For example, when applied to Embodiment 19, the encoding load reference information can be defined as specifying how much encoding should be done in motion vector calculation when the original audio buffer amount is "0"; when the original audio buffer amount fluctuates widely, motion vectors are then calculated for only 50% of the set amount even when the original audio buffer amount is "0". With this application, Embodiment 19 can likewise avoid alternating between high-quality and low-quality video.

【０５３３】In Embodiment 23, encoding is first actually performed with the initial value of the encoding load reference information, and suitable encoding load reference information is then set according to how the original audio buffer amount fluctuates. If the encoding load reference information obtained in this way is saved to a storage device such as a hard disk, the next encoding session can use suitable encoding load reference information from the start. That is, a period in which suitable encoding is not possible, like section A in FIG. 53, occurs only the first time; from the next time on, well-balanced encoding like that in section B can be performed.
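Saving the tuned value between sessions could look like the following sketch. The JSON file format, the key name `load_reference`, and both function names are assumptions made for illustration; the text says only that the value is kept on a storage device such as a hard disk.

```python
import json
from pathlib import Path


def load_reference(path, default=1.0):
    """Return the saved encoding load reference, or the initial value if none exists."""
    try:
        return float(json.loads(Path(path).read_text())["load_reference"])
    except (FileNotFoundError, KeyError, ValueError, TypeError):
        # No usable saved value: start from the initial value "1".
        return default


def save_reference(path, reference):
    """Persist the reference value found during this encoding session."""
    Path(path).write_text(json.dumps({"load_reference": reference}))
```

On the first run `load_reference` falls back to the initial value; once a session has called `save_reference` with the tuned value, later sessions start directly from it and skip the unbalanced period.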

【０５３４】Embodiment 24. The video/audio encoding apparatus according to Embodiment 24 of the present invention handles the case where video and audio are encoded in software on a general-purpose computer or the like that is not high-performance, does so by setting the encoding load reference information, and can present the result of that setting to the user.

【０５３５】FIG. 54 shows the schematic configuration of the video/audio encoding apparatus according to Embodiment 24 of the present invention. As illustrated, the apparatus comprises a video camera 2401, an audio capture unit 2402, an audio buffering unit 2403, an audio encoding unit 2405, a video capture unit 2406, a video encoding unit 2407, an encoding load evaluator 2408, a system timer 2461, an encoding load reference setting unit 2462, an encoding load presentation unit 2411, and a load-setting standard video/audio output unit 2412. As in Embodiment 18, the apparatus outputs encoded audio information and encoded video information. The encoding load presentation unit 2411 outputs to a monitor.

【０５３６】In the figure, the encoding load presentation unit 2411 presents the result of setting the encoding load reference information 2463 to the user of the video/audio encoding apparatus. The load-setting standard video/audio output unit 2412 outputs standard video information and audio information so that the video load can be set to match the computing capability of the computer system. The apparatus of Embodiment 24 is the apparatus of Embodiment 23 with the encoding load presentation unit 2411 and the load-setting standard video/audio output unit 2412 added. The video camera 2401, audio capture unit 2402, audio buffering unit 2403, audio encoding unit 2405, video capture unit 2406, and video encoding unit 2407 are the same as 1801 to 1807 of Embodiment 18, and the system timer 2461 and the encoding load reference setting unit 2462 are the same as in Embodiment 23, so their description is omitted.

【０５３７】このように構成された、本実施の形態２４
による映像音声符号化装置の動作の概略は、以下のよう
になる。すなわち、当該映像音声符号化装置において、
符号化負荷基準設定部２４６２は、システムタイマ２４
６１の計時出力に基づき、単位時間ごとの原音声バッフ
ァ量２４０４の変動を調査し、変動の度合いによって符
号化負荷基準情報２４６３を設定する。符号化負荷評価
部２４０８は、設定された符号化負荷基準情報２４６３
を用いて、符号化負荷評価情報２４０９を計算し、この
計算された符号化負荷評価情報２４０９の値に従って、
原映像情報を映像符号化部２４０７に入力するか、原映
像情報を破棄して映像符号化部２４０７の処理を中断
し、計算機資源（ＣＰＵ時間）を音声符号化部２４０５
に明け渡すか、否かの動作を決定する。従って、映像符
号化部２４０７において原映像情報の符号化処理が行わ
れる場合にも、すべての原映像情報が処理されるとは限
らず、状況に応じた割合での処理が行われることとな
る。The operation of the video/audio encoding apparatus according to Embodiment 24, configured as described above, is outlined as follows. In this apparatus, the encoding load reference setting unit 2462 examines, based on the timing output of the system timer 2461, the variation of the original audio buffer amount 2404 per unit time, and sets the encoding load reference information 2463 according to the degree of variation. The encoding load evaluation unit 2408 calculates the encoding load evaluation information 2409 using the set encoding load reference information 2463 and, according to the calculated value, decides whether to input the original video information to the video encoding unit 2407, or to discard the original video information, suspend the processing of the video encoding unit 2407, and yield the computer resources (CPU time) to the audio encoding unit 2405. Therefore, even when the video encoding unit 2407 encodes original video information, not all of the original video information is necessarily processed; processing is performed at a rate that depends on the situation.
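The decision described in this paragraph can be illustrated with a minimal Python sketch. The text here does not give the formulas used to derive the encoding load reference information or the encoding load evaluation information, so the function names, the mapping from buffer growth to a reference value in the range 0 to 1, the evaluation formula, and the threshold are all assumptions made purely for illustration.

```python
def set_load_reference(buffer_amounts, max_growth_per_unit):
    """Derive a reference value in [0, 1] from audio buffer samples.

    buffer_amounts holds the original audio buffer amount measured once per
    unit time. The linear mapping below is a hypothetical choice: a buffer
    that does not grow gives 1.0, and a buffer that grows by
    max_growth_per_unit or more per unit time gives 0.0.
    """
    deltas = [later - earlier
              for earlier, later in zip(buffer_amounts, buffer_amounts[1:])]
    mean_growth = sum(deltas) / len(deltas) if deltas else 0.0
    reference = 1.0 - mean_growth / max_growth_per_unit
    return max(0.0, min(1.0, reference))


def should_encode_video(buffer_amount, buffer_capacity, load_reference,
                        threshold=0.5):
    """Return True to encode the frame, False to discard it.

    Discarding the frame stands for suspending video encoding so that CPU
    time goes to the audio task. The evaluation formula and the threshold
    are assumptions, not values taken from the specification.
    """
    evaluation = load_reference * (1.0 - buffer_amount / buffer_capacity)
    return evaluation >= threshold
```

With these assumed formulas, a nearly full audio buffer or a low reference value both cause video frames to be discarded, which matches the stated aim of favoring uninterrupted audio encoding.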

【０５３８】加えて、負荷設定用標準映像出力部２４１
２は、標準映像音声情報２４１３に基づき、コンピュー
タシステムの計算機能力に合わせて映像負荷を設定する
ため、標準的な映像情報、および音声情報を出力する。
また、符号化負荷提示部２４１１は、符号化負荷基準情
報２４６３の設定結果を、映像音声符号化装置の利用者
に対して提示する。In addition, the load-setting standard video/audio output unit 2412 outputs standard video information and audio information, based on the standard video/audio information 2413, so that the video load can be set in accordance with the computing capability of the computer system. The encoding load presentation unit 2411 presents the setting result of the encoding load reference information 2463 to the user of the video/audio encoding apparatus.

【０５３９】以下に、ある映像音声に対しての、本実施
の形態２４による映像音声符号化装置による符号化処理
の一例における動作を説明する。ここで、実施の形態１
８と同様に、映像音声符号化処理は、汎用計算機におい
てオペレーティングシステムの制御に従う映像符号化
（符号化負荷評価部２４０８と映像符号化部２４０７の
処理）と、音声符号化（音声符号化部２４０５の処理）
との各タスクとしてなされるものであるとし、ＣＰＵ時
間の割り当てをされた各タスクが一連の処理を実行して
計算機資源（ＣＰＵ時間）を解放したとき、オペレーテ
ィングシステムは他のタスクにＣＰＵ時間の割り当てを
行う、という制御をするものとする。また、当該映像音
声符号化装置を実現する汎用計算機の基本的能力は、実
施の形態１８による符号化装置を実現する場合よりも低
いものであるとする。The operation of the video/audio encoding apparatus according to Embodiment 24 is described below for an example of encoding a given video and audio. As in Embodiment 18, the video/audio encoding is assumed to be performed on a general-purpose computer as two tasks under the control of an operating system: video encoding (processing by the encoding load evaluation unit 2408 and the video encoding unit 2407) and audio encoding (processing by the audio encoding unit 2405). When a task that has been allocated CPU time completes a series of processing steps and releases the computer resource (CPU time), the operating system allocates CPU time to the other task. It is also assumed that the basic capability of the general-purpose computer implementing this apparatus is lower than that of the computer implementing the encoding apparatus of Embodiment 18.

【０５４０】まず、負荷設定用標準映像音声出力部２４
１２が、実際の映像情報の符号化に先立ち、標準映像音
声情報２４１３を出力する。実施の形態２３においてと
りこまれた映像音声情報と同様に、本実施の形態２４に
よる符号化装置は、標準映像音声情報２４１３を符号化
し、符号化負荷基準設定部２４６２が、符号化負荷基準
情報２４６３を設定する。符号化負荷提示部２４１１
が、符号化負荷基準情報２４６３の内容を、モニタを通
じて利用者に対して提示し、了解を得る。負荷設定用標
準映像出力部２４１２が、出力する映像情報、および音
声情報を、ビデオカメラ２４０１からの映像情報、およ
び音声情報に切り替え、実施の形態２３に示したよう
な、通常の符号化を行う。First, prior to the encoding of actual video information, the load-setting standard video/audio output unit 2412 outputs the standard video/audio information 2413. The encoding apparatus according to Embodiment 24 encodes the standard video/audio information 2413 in the same way as the captured video/audio information in Embodiment 23, and the encoding load reference setting unit 2462 sets the encoding load reference information 2463. The encoding load presentation unit 2411 presents the contents of the encoding load reference information 2463 to the user via the monitor and obtains the user's consent. The load-setting standard video/audio output unit 2412 then switches the video and audio information it outputs to the video and audio information from the video camera 2401, and normal encoding is performed as described in Embodiment 23.

【０５４１】なお、本実施の形態２４では、符号化負荷
基準情報を設定するのに、予め用意した標準的な映像音
声情報を用いるものとしたが、本発明はこのような方法
に限定されるものではない。例えば、利用者が、符号化
したい映像に合わせて、任意の映像音声情報を評価に用
いることが可能である。また、本実施の形態２４で用い
る標準映像音声情報については、当該映像音声符号化装
置における音声符号化が、音声情報の内容（有音・無音
の別など）にかかわらずに、一定の符号化処理を行うも
のであれば、標準映像音声情報を構成する音声として無
音のデータを用いることも可能である。In Embodiment 24, standard video/audio information prepared in advance is used to set the encoding load reference information, but the present invention is not limited to this method. For example, the user may use any video/audio information for the evaluation, chosen to suit the video to be encoded. As for the standard video/audio information used in Embodiment 24, if the audio encoding in the apparatus performs the same encoding processing regardless of the content of the audio information (for example, whether it is sound or silence), silent data may be used as the audio of the standard video/audio information.

【０５４２】このように、本実施の形態２４の映像音声
符号化装置によれば、実施の形態２３の映像音声符号化
装置に、負荷設定用標準映像音声出力部２４１２と、符
号化負荷表示部２４１１とを追加した構成としたこと
で、実際の符号化処理に先立ち、まず標準的な映像音声
情報を符号化し、それにより、符号化負荷基準情報の設
定を行い、それを利用者に提示するようになっているた
め、利用者は、映像の品質の低下を納得した上で符号化
処理を行える。すなわち、本実施の形態２４で示したよ
うなソフトウェアによる符号化装置は、様々なコンピュ
ータシステムの上で動作可能である。したがって、計算
機能力が元々高いものから、低いものまで、様々な環境
の上で動作する。Thus, the video/audio encoding apparatus of Embodiment 24 adds the load-setting standard video/audio output unit 2412 and the encoding load presentation unit 2411 to the video/audio encoding apparatus of Embodiment 23. Prior to the actual encoding, it first encodes standard video/audio information, uses the result to set the encoding load reference information, and presents that information to the user, so the user can carry out the encoding having accepted the reduction in video quality. A software encoding apparatus such as the one in Embodiment 24 can run on a variety of computer systems, and therefore operates in a range of environments, from those with inherently high computing capability to those with low capability.

【０５４３】このように、計算機能力の大小にかかわら
ず、映像の品質を犠牲にすることで音声の符号化を途切
れなく行うというのが本発明の目的の１つであるが、本
実施の形態２４では、映像の品質をどれくらい犠牲にし
たかを、利用者に示すことができる。これにより、利用
者は、映像の品質の低下を、コンピュータシステムの計
算機能力の不足によるものと認識でき、その対策とし
て、動作周波数の向上であるとか、メインメモリの増設
といったような、施策を講じることができる。したがっ
て、本実施の形態２４によれば、実施の形態２３で示し
た効果に加え、利用者に当該コンピュータシステムにつ
いての状況を知らせることができるという格別の効果を
得ることができる。One of the objects of the present invention is thus to encode audio without interruption, regardless of the computing capability, by sacrificing video quality; in Embodiment 24, the user can additionally be shown how much video quality has been sacrificed. The user can thereby recognize that the reduction in video quality is due to insufficient computing capability of the computer system and take countermeasures such as raising the operating frequency or adding main memory. Thus, in addition to the effects of Embodiment 23, Embodiment 24 provides the particular effect of informing the user about the state of the computer system.

【０５４４】なお、本発明は上記各実施の形態に限定さ
れるものではなく、例えば、上記各実施の形態において
は、音声符号化部２４０５を、音声バッファリング部２
４０３に蓄積された原音声情報を読み出し、この読み出
した原音声情報を音声バッファリング部２４０３より削
除した後、原音声情報を符号化処理し、符号化音声情報
として出力するように構成したが、音声符号化部２４０
５を、音声バッファリング部２４０３に蓄積された原音
声情報を読み出し、この読み出した原音声情報をを符号
化処理し、符号化音声情報として出力した後、原音声情
報を音声バッファリング部２４０３より削除するように
構成してもよい。また、音声符号化部２４０５が音声バ
ッファリング部２４０３に蓄積された原音声情報を読み
出したことを検知した後、この読み出した原音声情報を
音声バッファリング部２４０３より削除する削除部を別
途設けてもよい。その他、本発明の請求の範囲内での種
々の設計変更および修正を加え得ることが可能である。The present invention is not limited to the embodiments described above. For example, in each of the above embodiments the audio encoding unit 2405 reads the original audio information stored in the audio buffering unit 2403, deletes the read original audio information from the audio buffering unit 2403, and then encodes it and outputs it as encoded audio information. Instead, the audio encoding unit 2405 may read the original audio information stored in the audio buffering unit 2403, encode it and output it as encoded audio information, and then delete the original audio information from the audio buffering unit 2403. Alternatively, a separate deletion unit may be provided that detects that the audio encoding unit 2405 has read the original audio information stored in the audio buffering unit 2403 and then deletes that information from the audio buffering unit 2403. Various other design changes and modifications may be made within the scope of the claims of the present invention.

【０５４５】また、実施の形態１８〜２４に示した映像
音声符号化方法については、該方法を実行し得る映像音
声符号化プログラムを記録した記録媒体を用いて、パー
ソナルコンピュータやワークステーション等において、
当該プログラムを実行することによって実現できるもの
である。The video/audio encoding methods described in Embodiments 18 to 24 can be implemented by using a recording medium on which a video/audio encoding program capable of executing the method is recorded, and executing that program on a personal computer, workstation, or the like.

【０５４６】実施の形態２５．本発明の実施の形態２５
による映像符号化方法は、実施の形態１と同様に、設定
に応じて符号化パラメータを定めるものであり、入力画
像データの有する解像度を与えられ、該解像度と、設定
されたパラメータとに基づいて他のパラメータを決定す
るものである。Embodiment 25. The video encoding method according to Embodiment 25 of the present invention determines encoding parameters according to settings, as in Embodiment 1; it is given the resolution of the input image data and determines the other parameters based on that resolution and the parameters that have been set.

【０５４７】実施の形態１〜４はいずれもフレームレー
トを指定する場合であったが、本実施の形態２５では、
フレームレートを指定せず、なるべく高いフレームレー
トで符号化処理を実行し、再生画質の高い符号化データ
を得ようとするものである。また、本実施の形態２５で
は、入力画像データの有する解像度が与えられるもので
あり、当該与えられた解像度に対応して、符号化処理を
行うものである。Embodiments 1 to 4 all concern cases in which the frame rate is specified. In Embodiment 25, by contrast, the frame rate is not specified; the encoding is executed at as high a frame rate as possible in order to obtain encoded data with high reproduction quality. In Embodiment 25 the resolution of the input image data is given, and the encoding is performed in accordance with that resolution.

【０５４８】図５５は、本発明の実施の形態２５による
映像符号化装置の構成を示すブロック図である。図示す
るように、本実施の形態２５による映像符号化装置は、
符号化手段３００１と、符号化パラメータ決定手段３０
０２とから構成されており、符号化手段３００１は、Ｄ
ＣＴ処理手段３００３、量子化手段３００４、可変長符
号化手段３００５、ビットストリーム生成手段３００
６、逆量子化手段３００７、逆ＤＣＴ処理手段３００
８、および予測画像生成手段３００９を、また、符号化
パラメータ決定手段３００２は動きベクトル（ＭＶ）検
出範囲参照テーブル３０１０を内包している。FIG. 55 is a block diagram showing the configuration of the video encoding apparatus according to Embodiment 25 of the present invention. As shown in the figure, the apparatus comprises encoding means 3001 and encoding parameter determination means 3002. The encoding means 3001 includes DCT processing means 3003, quantization means 3004, variable-length encoding means 3005, bitstream generation means 3006, inverse quantization means 3007, inverse DCT processing means 3008, and predicted image generation means 3009. The encoding parameter determination means 3002 contains a motion vector (MV) detection range reference table 3010.

【０５４９】符号化手段３００１は、映像がデジタル化
された、一連の静止画像からなる映像データを入力画像
データとして入力し、設定された符号化パラメータに従
って符号化処理し、符号化データを出力する。入力画像
データを構成する個々の静止画像データをフレーム画像
と呼ぶ。また、符号化パラメータは、後述する符号化パ
ラメータ決定手段３００２から与えられるものであり、
符号化タイプを示すパラメータと、動きベクトルの検出
範囲を示すパラメータとが含まれている。符号化タイプ
を示すパラメータは、フレーム内符号化処理、または順
方向予測符号化処理を示すものであり、符号化手段３０
０１は、当該パラメータに従って、フレーム内符号化、
または順方向予測符号化を行う。順方向予測符号化に用
いる動きベクトルは、動きベクトルの検出範囲を示すパ
ラメータによって指示される範囲内で、検出される。The encoding means 3001 receives, as input image data, video data consisting of a series of still images obtained by digitizing video, encodes it in accordance with the set encoding parameters, and outputs encoded data. Each still image making up the input image data is called a frame image. The encoding parameters are supplied by the encoding parameter determination means 3002 described later, and include a parameter indicating the encoding type and a parameter indicating the motion vector detection range. The encoding-type parameter indicates either intra-frame encoding or forward predictive encoding, and the encoding means 3001 performs intra-frame encoding or forward predictive encoding accordingly. The motion vectors used in forward predictive encoding are detected within the range indicated by the detection-range parameter.

【０５５０】符号化手段３００１の内部のＤＣＴ手段３
００３、量子化手段３００４、可変長符号化手段３００
５、ビットストリーム生成手段３００６、逆量子化手段
３００７、および逆ＤＣＴ手段３００８については、実
施の形態１における１０３〜１０８と同様であり説明を
省略する。The DCT means 3003, quantization means 3004, variable-length encoding means 3005, bitstream generation means 3006, inverse quantization means 3007, and inverse DCT means 3008 inside the encoding means 3001 are the same as 103 to 108 in Embodiment 1, and their description is omitted.

【０５５１】予測画像生成手段３００９は、逆ＤＣＴ処
理手段３００８が出力する逆ＤＣＴ変換データを入力し
て、この逆ＤＣＴ変換データと入力画像データとの間で
動きベクトルの検出処理を行った後、予測画像を生成し
て予測画像データとして出力する。動きベクトルの検出
は、上記のように動きベクトルの検出範囲を示すパラメ
ータによって指示される範囲内で行われる。予測画像を
用いたフレーム間符号化処理が行われる場合には、この
予測画像データと入力画像データとの差分データがＤＣ
Ｔ手段３００３に入力されることにより、符号化手段３
００１においては順方向予測符号化が行われることとな
る。The predicted image generation means 3009 receives the inverse-DCT-transformed data output by the inverse DCT processing means 3008, performs motion vector detection between that data and the input image data, and then generates a predicted image and outputs it as predicted image data. Motion vector detection is performed within the range indicated by the detection-range parameter, as described above. When inter-frame encoding using the predicted image is performed, the difference data between the predicted image data and the input image data is input to the DCT means 3003, so that forward predictive encoding is performed in the encoding means 3001.
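Motion vector detection restricted to a given range can be sketched as an exhaustive block-matching search. The specification does not state which search method or matching criterion the predicted image generation means 3009 uses, nor the pixel extents of the “small” and “large” ranges, so the sum-of-absolute-differences criterion, the function names, and the integer `search_range` parameter below are assumptions for illustration only.

```python
def block_sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current frame at
    (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def find_motion_vector(cur, ref, bx, by, size, search_range):
    """Exhaustive search over displacements within +/- search_range pixels.

    Returns ((dx, dy), cost). The number of candidates grows as
    (2 * search_range + 1) ** 2, which is why a larger detection range
    costs more processing time.
    """
    height, width = len(ref), len(ref[0])
    best_vector = (0, 0)
    best_cost = block_sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = block_sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_cost = cost
                best_vector = (dx, dy)
    return best_vector, best_cost
```

A displacement that lies outside the permitted range cannot be found, so a small range trades prediction accuracy (and hence compression ratio) for speed.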

【０５５２】また、本実施の形態２５による映像符号化
装置では、符号化パラメータ決定手段３００２は、入力
画像データが有する解像度と、指定された符号化パター
ンとから、内包するＭＶ検出範囲参照テーブル３０１０
を用いてＭＶ検出範囲を決定し、当該決定したＭＶ検出
範囲を示すパラメータを含む、上記符号化パラメータを
符号化手段３００１に出力する。In the video encoding apparatus according to Embodiment 25, the encoding parameter determination means 3002 determines the MV detection range from the resolution of the input image data and the specified encoding pattern, using the MV detection range reference table 3010 that it contains, and outputs the encoding parameters, including a parameter indicating the determined MV detection range, to the encoding means 3001.

【０５５３】なお、本実施の形態２５による映像符号化
装置は、実施の形態１と同様に、パーソナルコンピュー
タ（ＰＣ）において処理制御装置（ＣＰＵ）の制御によ
り映像符号化プログラムが実行されることによって実現
されるものとし、符号化処理の実行においては、実施の
形態１において示した５つの条件に加えて以下の２つの
条件が成立するものとする。As in Embodiment 1, the video encoding apparatus according to Embodiment 25 is assumed to be implemented by executing a video encoding program on a personal computer (PC) under the control of a processing control unit (CPU). In executing the encoding, the following two conditions are assumed to hold in addition to the five conditions given in Embodiment 1.

【０５５４】( ６) 順方向符号化処理が行われる場合、
動きベクトルの検出範囲が「小さい」であるならば、処
理時間はフレーム内符号化処理を行う場合の６倍とな
る。(6) When forward predictive encoding is performed with the motion vector detection range set to “small”, the processing time is six times that of intra-frame encoding.

【０５５５】( ７) 順方向符号化処理が行われる場合、
動きベクトルの検出範囲が「大きい」であるならば、処
理時間は検出範囲が「小さい」である場合の４倍とな
る。(7) When forward predictive encoding is performed with the motion vector detection range set to “large”, the processing time is four times that with the detection range set to “small”.
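Conditions (6) and (7) can be turned into relative per-frame costs, taking the time for one intra-frame encoded frame as the unit. The following Python sketch only restates that arithmetic; the function names and the choice of averaging over one repetition of the encoding pattern are assumptions for illustration.

```python
INTRA_COST = 1.0        # unit: time to intra-frame encode one frame
P_SMALL_FACTOR = 6.0    # condition (6): P with "small" range = 6 x intra
P_LARGE_FACTOR = 4.0    # condition (7): "large" range = 4 x "small" range


def frame_cost(coding_type, mv_range):
    """Relative processing time of one frame ("I" or "P")."""
    if coding_type == "I":
        return INTRA_COST
    cost = INTRA_COST * P_SMALL_FACTOR
    if mv_range == "large":
        cost *= P_LARGE_FACTOR
    return cost


def pattern_cost(pattern, mv_range):
    """Average relative processing time per frame over one pattern cycle."""
    return sum(frame_cost(t, mv_range) for t in pattern) / len(pattern)
```

Under these conditions a P frame costs 6 units with the “small” range and 24 units with the “large” range, so the “IP” pattern averages 3.5 and 12.5 units per frame respectively.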

【０５５６】ここで、本装置に搭載されるＣＰＵの動作
周波数は１００MHz であり、符号化開始時に指定される
フレームレートは２４フレーム／秒、符号化タイプの組
み合わせとしての符号化パターンは、２フレームごとに
「Ｉ」「Ｐ」を繰り返すパターン２「ＩＰ」を用いるも
のとする。ただし、フレーム内符号化を「Ｉ」、順方向
予測符号化を「Ｐ」で表すものとする。Here, the operating frequency of the CPU installed in this apparatus is 100 MHz, the frame rate specified at the start of encoding is 24 frames/second, and the encoding pattern, that is, the combination of encoding types, is pattern 2 “IP”, which repeats “I” and “P” every two frames. Intra-frame encoding is denoted “I” and forward predictive encoding is denoted “P”.

【０５５７】以上のような設定のもとに、上述のように
構成された本実施の形態２５による映像符号化装置の動
作を以下に説明する。まず、符号化対象である映像はデ
ジタル化され、一連のフレーム画像として当該符号化装
置の符号化手段３００１に入力される。図５６は、符号
化手段３００１の動作を示すフローチャート図である。
符号化手段３００１の動作を、以下に、図５６に従って
説明する。なお、符号化パラメータ決定手段３００２
は、符号化開始時の最初のフレーム画像に対しては、符
号化手段３００１に対して必ずフレーム内符号化を指示
するものとする。The operation of the video encoding apparatus according to Embodiment 25, configured as described above, is explained below under these settings. First, the video to be encoded is digitized and input to the encoding means 3001 of the apparatus as a series of frame images. FIG. 56 is a flowchart showing the operation of the encoding means 3001, which is described below with reference to that figure. For the first frame image at the start of encoding, the encoding parameter determination means 3002 always instructs the encoding means 3001 to perform intra-frame encoding.

【０５５８】ステップＤ０１では、符号化パラメータ決
定手段３００２より入力された符号化パラメータについ
て判断がなされ、フレーム内符号化が指示されていた場
合にはステップＤ０２以降の処理が実行され、順方向予
測符号化が指示されていた場合には、ステップＤ０７以
降の処理が実行される。In step D01, the encoding parameters input from the encoding parameter determination means 3002 are examined. If intra-frame encoding is indicated, the processing from step D02 onward is executed; if forward predictive encoding is indicated, the processing from step D07 onward is executed.

【０５５９】ステップＤ０２以降が実行される場合は、
ステップＤ０２〜ステップＤ０５が実施の形態１におけ
るステップＡ０２〜ステップＡ０５と同様に実行され
る。ステップＤ０６では、符号化が終了しているか否か
が判断され、符号化が終了したと判断されたならば処理
は終了する。一方、符号化終了でなければ上記のステッ
プＤ０１に戻り、ステップＤ０１の判断以降が実行され
る。When step D02 and the subsequent steps are executed, steps D02 to D05 are performed in the same way as steps A02 to A05 in Embodiment 1. In step D06, it is determined whether encoding has finished; if so, processing ends. If not, the process returns to step D01, and the determination in step D01 and the subsequent steps are executed again.

【０５６０】これに対して、ステップＤ０１の判断によ
り、ステップＤ０７以降が実行される場合は次のように
なる。まず、ステップＤ０７で逆量子化手段３００７
は、量子化手段３００４が直前のフレーム画像に対して
すでに出力している量子化データを逆量子化し、逆量子
化データを出力する。次いでステップＤ０８では、逆Ｄ
ＣＴ処理手段３００８が、逆量子化データに対して、Ｄ
ＣＴ処理手段３００３が分割した８画素×８画素のブロ
ックごとに、２次元離散コサイン変換の逆処理である２
次元逆離散コサイン変換を実行し、逆ＤＣＴ変換データ
を出力する。ステップＤ０９において、予測画像生成手
段３００９は、逆ＤＣＴ変換データに基づいて予測画像
（未補償）を生成し、該生成した予測画像と入力画像デ
ータとに対して、符号化パラメータによって指示された
範囲内で動きベクトル検出を行い、この動きベクトルを
用いて、動き補償のされた予測画像を生成し出力する。If, on the other hand, the determination in step D01 leads to step D07 and the subsequent steps, processing proceeds as follows. First, in step D07, the inverse quantization means 3007 inverse-quantizes the quantized data that the quantization means 3004 has already output for the immediately preceding frame image, and outputs inverse-quantized data. Next, in step D08, the inverse DCT processing means 3008 applies to the inverse-quantized data, for each 8×8-pixel block produced by the DCT processing means 3003, a two-dimensional inverse discrete cosine transform (the inverse of the two-dimensional discrete cosine transform), and outputs inverse-DCT-transformed data. In step D09, the predicted image generation means 3009 generates an (uncompensated) predicted image from the inverse-DCT-transformed data, performs motion vector detection between that predicted image and the input image data within the range indicated by the encoding parameters, and uses the detected motion vectors to generate and output a motion-compensated predicted image.

【０５６１】ステップＤ１０でＤＣＴ処理手段３００３
は、入力されたフレーム画像と予測画像生成手段３００
９が出力した予測画像とを、それぞれ指示された解像度
に基づき、８画素×８画素のブロックに分割し、分割し
たブロックごとに、入力されたフレーム画像のデータか
ら予測画像のデータを差し引くことにより差分データを
得る。そして、この差分データに対して、分割したブロ
ックごとに２次元離散コサイン変換して、ＤＣＴ変換デ
ータを出力する。ＤＣＴ変換データが出力された後のス
テップＤ１１〜Ｄ１４は、上記のステップＤ０３からＤ
０６と同様に実行される。In step D10, the DCT processing means 3003 divides the input frame image and the predicted image output by the predicted image generation means 3009 into blocks of 8×8 pixels, each based on the indicated resolution, and for each block obtains difference data by subtracting the predicted image data from the input frame image data. It then applies a two-dimensional discrete cosine transform to the difference data block by block and outputs DCT-transformed data. Steps D11 to D14, which follow the output of the DCT-transformed data, are executed in the same way as steps D03 to D06 above.

【０５６２】このように、符号化手段３００１では、入
力されたフレーム画像ごとに、ステップＤ０１の判定に
より、ステップＤ０２〜Ｄ０６か、ステップＤ０７〜Ｄ
１４かの処理が行われることとなる。ステップＤ０２〜
Ｄ０６はフレーム内符号化であり、ステップＤ０７〜Ｄ
１４は直前のフレーム画像に対しての符号化結果を用い
た予測画像に基づく順方向符号化処理が行われるもので
あり、この切り替えはステップＤ０１の判定において、
入力された符号化パラメータに従ってなされるものであ
る。Thus, in the encoding means 3001, for each input frame image, either steps D02 to D06 or steps D07 to D14 are performed, according to the determination in step D01. Steps D02 to D06 constitute intra-frame encoding, while steps D07 to D14 constitute forward predictive encoding based on a predicted image that uses the encoding result of the immediately preceding frame image. The switch between the two is made in step D01 in accordance with the input encoding parameters.
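The switch between the intra-frame path and the forward predictive path can be sketched in Python as below. This is a deliberately reduced illustration: the DCT, variable-length encoding, and motion compensation stages are left out, frames are flat lists of pixel values, and the quantizer step `Q` and all function names are hypothetical. What it keeps is the closed-loop structure, in which the prediction is rebuilt from previously quantized data rather than taken from the original previous frame.

```python
Q = 8  # hypothetical quantizer step size


def quantize(values):
    return [round(v / Q) for v in values]


def dequantize(levels):
    return [level * Q for level in levels]


def encode_sequence(frames, pattern="IP"):
    """Encode frames following the pattern; the first frame is always 'I'."""
    encoded = []
    reconstruction = None
    for index, frame in enumerate(frames):
        # Step D01: choose the path from the indicated encoding type.
        coding_type = "I" if index == 0 else pattern[index % len(pattern)]
        if coding_type == "I":
            # Intra path (corresponds to steps D02 onward, without the DCT).
            levels = quantize(frame)
            reconstruction = dequantize(levels)
        else:
            # Predictive path (corresponds to steps D07 onward): the
            # prediction comes from the decoded form of earlier output.
            residual = [c - p for c, p in zip(frame, reconstruction)]
            levels = quantize(residual)
            reconstruction = [p + r for p, r in
                              zip(reconstruction, dequantize(levels))]
        encoded.append((coding_type, levels))
    return encoded
```

For two frames `[16, 32]` and `[24, 40]`, the second frame is stored only as the small quantized difference from the reconstructed first frame.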

【０５６３】（表１５）は符号化パラメータ決定手段３
００２が内包するＭＶ検出範囲参照テーブル３０１０を
示す表である。また、図５７は符号化パラメータ決定手
段３００２の動作を示すフローチャート図である。以下
に、符号化パラメータを決定して、符号化手段３００１
に出力する符号化パラメータ決定手段３００２の動作
を、表１を参照し、図５７のフローに従って説明する。(Table 15) shows the MV detection range reference table 3010 contained in the encoding parameter determination means 3002. FIG. 57 is a flowchart showing the operation of the encoding parameter determination means 3002. The operation of the encoding parameter determination means 3002, which determines the encoding parameters and outputs them to the encoding means 3001, is described below with reference to Table 15 and following the flow of FIG. 57.

【０５６４】[0564]

【表１５】 [Table 15]

【０５６５】（表１５）に示すＭＶ検出範囲参照テーブ
ル３０１０は、符号化に先立ちって予め作成しておかれ
るものである。テーブル作成は、後述する条件を考慮し
た上で、例えば経験的知識に基づいて、あるいは実験符
号化やシミュレーション等の結果を用いて、することが
できる。（表１）の「入力」の欄は入力画像データの有
する解像度と、指示されるパラメータとを、また「出
力」の欄は入力に対応して決定されるパラメータを示し
ている。同表に示すように、本実施の形態２５では入力
画像データの解像度と符号化パターンとに対応して、Ｍ
Ｖ検出範囲が決定される。符号化パターンについては固
定的に「ＩＰ」とされているものであり、パターン「Ｉ
Ｐ」は、２フレームごとにフレーム内符号化( Ｉ) と順
方向予測符号化( Ｐ) とを繰り返すことを意味する。入
力画像の解像度としては、「１６０×１２０」と「８０
×６４」とのいずれかが指示されるものとする。The MV detection range reference table 3010 shown in (Table 15) is created in advance, prior to encoding. The table can be created, taking into account the conditions described below, for example on the basis of empirical knowledge or by using the results of trial encoding, simulation, or the like. The “Input” column of (Table 15) shows the resolution of the input image data and the specified parameter, and the “Output” column shows the parameter determined in response to the input. As the table shows, in Embodiment 25 the MV detection range is determined according to the resolution of the input image data and the encoding pattern. The encoding pattern is fixed at “IP”; pattern “IP” means that intra-frame encoding (I) and forward predictive encoding (P) are repeated every two frames. The resolution of the input image is specified as either “160×120” or “80×64”.

【０５６６】参照テーブルの作成は、次の条件を考慮し
て行われる。第一に、動きベクトル検出範囲が大きくな
ると、処理量が多くなること、第二に入力画像の有する
解像度が高い場合は、低解像度である場合と比べて処理
量が多くなることである。The reference table is created taking the following conditions into account: first, that a larger motion vector detection range increases the processing load; and second, that a high-resolution input image requires more processing than a low-resolution one.

【０５６７】これらの条件を考慮し、指定された入力画
像の解像度に対して、できるだけ大きな範囲で検出を行
い、高圧縮率の符号化データを得られるように、ＭＶ検
出範囲参照テーブル３０１０は作成されるものである。In view of these conditions, the MV detection range reference table 3010 is created so that, for the specified input image resolution, detection is performed over as large a range as possible and encoded data with a high compression ratio is obtained.

【０５６８】まず、図５７のフローのステップＥ０１に
おいて、符号化パラメータ決定手段３００２は、指定さ
れた入力画像の解像度と、符号化パターン( ＩＰ) とか
ら、ＭＶ検出範囲参照テーブル３０１０を参照して、予
測符号化における動きベクトルの検出範囲を決定する。First, in step E01 of the flow in FIG. 57, the encoding parameter determination means 3002 determines the motion vector detection range for predictive encoding from the specified input image resolution and the encoding pattern (IP) by referring to the MV detection range reference table 3010.

【０５６９】次いでステップＥ０２では、符号化パラメ
ータ決定手段３００２は、符号化手段３００１に対し
て、ステップＥ０１で決定したＭＶ検出範囲を指示する
とともに、指定された符号化パターンを実現できるよう
に、処理対象であるフレーム画像に用いるべき符号化タ
イプ( ＩもしくはＰ) を指示する。Next, in step E02, the encoding parameter determination means 3002 indicates to the encoding means 3001 the MV detection range determined in step E01, together with the encoding type (I or P) to be used for the frame image being processed, so that the specified encoding pattern is realized.

【０５７０】その後、ステップＥ０３では符号化が終了
したか否かが判定され、符号化が終了したと判定された
ならば処理は終了する。一方、終了でなければ、ステッ
プＥ０２に戻ることによって、符号化手段３００１に対
する符号化パラメータ出力が繰り返される。Thereafter, in step E03, it is determined whether encoding has finished; if so, processing ends. If not, the process returns to step E02, so that the output of encoding parameters to the encoding means 3001 is repeated.
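Steps E01 to E03 amount to one table lookup followed by a per-frame loop, which can be sketched in Python as follows. The contents of Table 15 are not reproduced in this text, so the two table entries below are assumptions inferred from the discussion of the encoding results (a “small” range at 160×120 and a “large” range at 80×64); the dictionary layout and function name are likewise hypothetical.

```python
# Hypothetical stand-in for the MV detection range reference table 3010.
# Keys: (input resolution, encoding pattern); values: MV detection range.
MV_RANGE_TABLE = {
    ("160x120", "IP"): "small",
    ("80x64", "IP"): "large",
}


def encoding_parameters(resolution, pattern, frame_count):
    """Yield the encoding parameters for each frame in turn."""
    # Step E01: determine the MV detection range once, from the table.
    mv_range = MV_RANGE_TABLE[(resolution, pattern)]
    # Steps E02 and E03: indicate the range and the per-frame encoding
    # type, repeating until encoding is finished (here, frame_count frames).
    for index in range(frame_count):
        # The first frame is always intra-frame encoded.
        coding_type = "I" if index == 0 else pattern[index % len(pattern)]
        yield {"coding_type": coding_type, "mv_range": mv_range}
```

Because the range is fixed for the whole run, only the encoding type changes from frame to frame, alternating I and P for the “IP” pattern.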

【０５７１】符号化手段３００１と、符号化パラメータ
決定手段３００２との以上のような動作によって、符号
化が実行されるが、（表１６）は、本実施の形態２５に
よる映像符号化装置において符号化を行なった結果を示
す表である。Encoding is carried out through the operations of the encoding means 3001 and the encoding parameter determination means 3002 described above. (Table 16) shows the results of encoding with the video encoding apparatus according to Embodiment 25.

【０５７２】[0572]

【表１６】 [Table 16]

【０５７３】（表１６）は、指示される条件に対して、
本実施の形態２５の符号化装置において決定されるＭＶ
検出範囲（決定されるパラメータ）と、それらのパラメ
ータを用いた符号化処理の結果として得られたフレーム
レート（符号化結果）とを示している。（表１６）に示
す符号化結果の数値については、符号化パターン「Ｉ
Ｐ」において、解像度を１６０×１２０とした場合に２
７．４フレーム／秒で処理できることに基づいて、その
他の場合のフレームレートが算出されている。符号化パ
ターンがＩＰで解像度が８０×６４、動きベクトルの検
出範囲が「大きい」の場合のフレームレートは、検出範
囲が大きい場合の処理に小さい場合の処理の４倍の時間
を要することと、解像度が約１／４の場合は約１／４の
時間で処理できることとから、約２７．４フレーム／秒
と算出できる。(Table 16) shows, for the specified conditions, the MV detection range determined by the encoding apparatus of Embodiment 25 (the determined parameter) and the frame rate obtained as a result of encoding with those parameters (the encoding result). The encoding-result figures in (Table 16) are calculated for the other cases on the basis that, with encoding pattern “IP” and a resolution of 160×120, processing is possible at 27.4 frames/second. For encoding pattern IP, a resolution of 80×64, and a “large” motion vector detection range, the frame rate can be calculated as approximately 27.4 frames/second, because processing with a large detection range takes four times as long as with a small one, and processing at about 1/4 the resolution takes about 1/4 the time.
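The estimate in this paragraph can be reproduced with a few lines of Python. The sketch uses only the rounded factors stated in the text (a “large” range takes 4 times as long as a “small” one, and 80×64 takes about 1/4 the time of 160×120) and the 27.4 frames/second baseline, which is taken to apply to the “small” range at 160×120. The function and table names are hypothetical, and the sketch follows the text's own rounded factors rather than a per-frame cost model.

```python
BASE_FRAME_RATE = 27.4  # frames/second: pattern "IP", 160x120, "small" range

# Relative processing time, using the approximations stated in the text.
RESOLUTION_TIME_FACTOR = {"160x120": 1.0, "80x64": 0.25}
RANGE_TIME_FACTOR = {"small": 1.0, "large": 4.0}


def estimated_frame_rate(resolution, mv_range):
    """Scale the baseline frame rate by the relative processing time."""
    relative_time = (RESOLUTION_TIME_FACTOR[resolution]
                     * RANGE_TIME_FACTOR[mv_range])
    return BASE_FRAME_RATE / relative_time
```

With these factors the 1/4 time at the lower resolution and the 4 times cost of the larger range cancel, so `estimated_frame_rate("80x64", "large")` comes out at the same 27.4 frames/second as the baseline.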

【０５７４】比較のため、（表１７）に従来の技術によ
る映像符号化装置を用いて符号化を行なった場合の動作
結果を示す。For comparison, (Table 17) shows the results of encoding with a video encoding apparatus according to the prior art.

【０５７５】[0575]

【表１７】 [Table 17]

【０５７６】（表１７）においても（表１６）の場合と
同様の算出がなされており、符号化パターン「ＩＰ」に
おいて、解像度が１６０×１２０、検出範囲が「小さ
い」の場合に２７．４フレーム／秒で処理できることに
基づいて、その他の場合のフレームレートが算出されて
いる。The figures in (Table 17) are calculated in the same way as those in (Table 16): the frame rates for the other cases are derived on the basis that, with encoding pattern “IP”, a resolution of 160×120, and a “small” detection range, processing is possible at 27.4 frames/second.

【０５７７】従来の技術による映像符号化装置では、符
号化結果として得られるフレームレートを考慮せずに、
動きベクトル検出範囲を決定していたものである。従っ
て、符号化処理の結果として得られるフレームレートが
十分高くなるように設定することが困難なことが多かっ
た。これに比べ、本実施の形態２５の映像符号化装置に
おいては、符号化結果であるフレームレートを考慮し
て、指定された符号化タイプ（パターン）と、入力され
る画像が有する解像度とに応じて動きベクトルの検出範
囲を決定することで、表１６と表１７との対比において
示されるように、できるだけ高いフレームレートを実現
しつつ、より高圧縮率の符号化データを得られるように
動きベクトルの検出範囲を設定しての符号化が実行され
ていることがわかる。In the prior-art video encoding apparatus, the motion vector detection range was determined without considering the frame rate obtained as the encoding result. Consequently, it was often difficult to configure the encoding so that the resulting frame rate was sufficiently high. By contrast, the video encoding apparatus of Embodiment 25 takes the resulting frame rate into account and determines the motion vector detection range according to the specified encoding type (pattern) and the resolution of the input image. As the comparison between Table 16 and Table 17 shows, encoding is thereby carried out with the motion vector detection range set so as to obtain encoded data with a higher compression ratio while achieving as high a frame rate as possible.

【０５７８】このように、本実施の形態２５による映像
符号化装置によれば、符号化手段３００１と、動きベク
トル検出範囲参照テーブル３０１０を内包した符号化パ
ラメータ決定手段３００２とを備えたことで、符号化パ
ラメータ決定手段３００２は、指定された符号化タイプ
と入力される画像が有する解像度に対応して、動きベク
トルの検出範囲を決定して、符号化パラメータを符号化
手段３００１に出力し、符号化手段３００１はこの符号
化パラメータに応じて符号化の処理を行うので、要求さ
れる条件を実現しつつ、より高圧縮率の符号化データの
得られる符号化を行うことが可能となる。Thus, the video encoding apparatus according to Embodiment 25 comprises the encoding means 3001 and the encoding parameter determination means 3002 containing the motion vector detection range reference table 3010. The encoding parameter determination means 3002 determines the motion vector detection range according to the specified encoding type and the resolution of the input image, and outputs the encoding parameters to the encoding means 3001, which encodes in accordance with them. This makes it possible to perform encoding that yields encoded data with a higher compression ratio while satisfying the required conditions.

【０５７９】なお、本実施の形態２５による映像符号化
装置では、指定された符号化パターンと入力画像の有す
る解像度に対応して動きベクトルの検出範囲を決定する
ものとしたが、同様の参照テーブルを用いることによっ
て、入力画像の有する解像度に対応してフィルタリング
の有無を決定することも可能であり、設定された条件下
で、より高画質の符号化結果の得られる処理をすること
が可能となる。In the video encoding apparatus according to Embodiment 25, the motion vector detection range is determined according to the specified encoding pattern and the resolution of the input image. However, by using a similar reference table it is also possible to determine whether or not to apply filtering according to the resolution of the input image, which allows processing that yields a higher-quality encoding result under the set conditions.

【０５８０】なお、実施の形態２５に示した映像符号化
方法については、実施の形態１と同様に、該方法を実行
し得る映像音声符号化プログラムを記録した記録媒体を
用いて、パーソナルコンピュータやワークステーション
等において、当該プログラムを実行することによって実
現できるものである。As in Embodiment 1, the video encoding method described in Embodiment 25 can be implemented by using a recording medium on which a video/audio encoding program capable of executing the method is recorded, and executing that program on a personal computer, workstation, or the like.

【０５８１】また、実施の形態１〜２５のいずれについ
ても、当該符号化プログラムを記録する媒体としては、
フロッピーディスク、ＣＤ−ＲＯＭ、光磁気ディスク、
相変化型光ディスク等、当該符号化プログラムを記録す
ることができ、パーソナルコンピュータ等の汎用計算機
で読み出して実行できるものであれば使用可能である。
又、ネットワークにより接続する、他のコンピュータが
管理する記憶装置に記録した当該プログラムを、ネット
ワークを介して読み出し、当該読み出したコンピュータ
において実行する運用形態をとることも可能である。In any of Embodiments 1 to 25, any medium can be used to record the encoding program, such as a floppy disk, CD-ROM, magneto-optical disk, or phase-change optical disk, provided that it can record the encoding program and can be read and executed by a general-purpose computer such as a personal computer. It is also possible to adopt an operating mode in which the program, recorded in a storage device managed by another computer connected via a network, is read out over the network and executed on the computer that read it.

【０５８２】[0582]

【発明の効果】請求項１の映像符号化方法によれば、映
像がデジタル化された、複数の静止画像情報からなる原
映像情報に対して、上記静止画像情報の１つまたは複数
を、後述する符号化パラメータに従って符号化する映像
符号化ステップと、原映像情報の有する解像度、符号化
によって得られる符号化データを再生する際に要求され
るフレームレート、上記映像符号化ステップを実行する
符号化装置の処理能力を示す処理性能、または上記映像
符号化ステップにおける符号化処理の処理量に影響する
１つ、もしくは複数の符号化パラメータのうちいずれか
１つ以上に基づいて、１つ以上の上記符号化パラメータ
を決定する符号化パラメータ決定ステップとを実行する
ので、与えられたフレームレートにおいて、高解像度
の、又は高圧縮率の符号化結果の得られる符号化パラメ
ータを設定することで、装置資源を活用して、良好な符
号化結果を得ることが可能となる。EFFECTS OF THE INVENTION According to the video encoding method of claim 1, the method executes, on original video information consisting of a plurality of pieces of still image information obtained by digitizing video, a video encoding step of encoding one or more pieces of the still image information according to encoding parameters described below, and an encoding parameter determination step of determining one or more of the encoding parameters based on at least one of: the resolution of the original video information; the frame rate required when reproducing the encoded data obtained by the encoding; the processing performance indicating the processing capability of the encoding apparatus that executes the video encoding step; or one or more encoding parameters that affect the processing amount of the encoding in the video encoding step. Therefore, by setting encoding parameters that yield a high-resolution or high-compression-ratio encoding result at a given frame rate, a good encoding result can be obtained while making use of the apparatus resources.

【０５８３】また、請求項２の映像符号化方法によれ
ば、請求項１の方法において、当該映像符号化方法の、
上記映像符号化ステップを実行する符号化装置の処理能
力を判断して、判断結果を出力する処理能力判断ステッ
プをさらに実行するので、処理能力に応じた良好な符号
化結果を得ることが可能となる。According to the video encoding method of claim 2, in the method of claim 1, a processing capability determination step is further executed that determines the processing capability of the encoding apparatus executing the video encoding step and outputs the determination result, so that a good encoding result suited to the processing capability can be obtained.

【０５８４】また、請求項３の映像符号化方法によれ
ば、請求項１または２の方法において、上記符号化パラ
メータは、上記原映像情報に対して行う符号化処理にお
ける解像度、フレーム内符号化、もしくは予測符号化を
示す符号化タイプ、または上記予測符号化に用いる動き
ベクトルを検出する際の検出範囲のうち１つ以上を含む
ものとすることで、設定に応じてこれらのパラメータを
決定して、装置資源を活用して、良好な符号化結果を得
ることが可能となる。[0584] According to the video encoding method of claim 3, in the method of claim 1 or 2, the encoding parameters include one or more of: the resolution used in the encoding performed on the original video information; an encoding type indicating intra-frame encoding or predictive encoding; or the detection range used when detecting the motion vectors for the predictive encoding. These parameters are therefore determined according to the settings, and a good encoding result can be obtained while making use of the apparatus resources.

【０５８５】また、請求項４の映像符号化方法によれ
ば、請求項３の方法において、上記処理能力判断ステッ
プでは、当該映像符号化方法の有する制御装置の種類に
基づいて上記判断を行うので、当該符号化装置のハード
ウェア能力を判断し、その能力に応じて符号化パラメー
タを設定することで、装置資源を活用して良好な符号化
結果を得ることが可能となる。According to the video encoding method of claim 4, in the method of claim 3, the processing capability determination step makes the determination based on the type of control device used for the video encoding method. The hardware capability of the encoding apparatus is thus determined and the encoding parameters are set according to that capability, so that a good encoding result can be obtained while making use of the apparatus resources.

【０５８６】また、請求項５の映像符号化方法によれ
ば、請求項３の方法において、上記処理能力判断ステッ
プでは、上記符号化ステップにおける符号化処理の所要
時間に基づいて上記判断を行うので、上記所要時間が示
す、当該符号化装置の符号化処理能力を判断し、その能
力に応じて符号化パラメータを設定することで、装置資
源を活用して良好な符号化結果を得ることが可能とな
る。According to the video encoding method of claim 5, in the method of claim 3, the processing capability determination step makes the determination based on the time required for the encoding in the encoding step. The encoding capability of the encoding apparatus, as indicated by the required time, is thus determined and the encoding parameters are set according to that capability, so that a good encoding result can be obtained while making use of the apparatus resources.

【０５８７】また、請求項６の映像符号化方法によれ
ば、請求項３の方法において、上記処理能力判断ステッ
プでは、上記入力される原映像情報を一時蓄積し、該蓄
積にあたっては、上記原映像情報を構成する一連の静止
画像情報を順次保存していくとともに、上記符号化ステ
ップにおいて読み出されて、上記符号化処理が行われた
静止画像情報を順次廃棄する映像バッファリングステッ
プと、上記映像バッファリングステップにおける上記一
連の静止画像情報の保存を、上記与えられたフレームレ
ートに基づいて決定される一定のフレームレートにおい
て行うように制御するフレームレート制御ステップとを
実行し、上記映像バッファリングステップにおいて一時
蓄積された上記原映像情報の蓄積量に基づいて上記判断
を行うので、上記蓄積量が示す、当該符号化装置の符号
化処理能力を判断し、その能力に応じて符号化パラメー
タを設定することで、装置資源を活用して良好な符号化
結果を得ることが可能となる。According to the video encoding method of claim 6, in the method of claim 3, the processing capability determination step executes a video buffering step and a frame rate control step. The video buffering step temporarily stores the input original video information: it sequentially saves the series of still image information constituting the original video information, and sequentially discards the still image information that has been read out and encoded in the encoding step. The frame rate control step controls the saving of the series of still image information in the video buffering step so that it is performed at a constant frame rate determined from the given frame rate. The determination is made based on the amount of original video information temporarily stored in the video buffering step. The encoding capability of the encoding apparatus, as indicated by the stored amount, is thus determined and the encoding parameters are set according to that capability, so that a good encoding result can be obtained while making use of the apparatus resources.
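The buffering-based determination of claim 6 can be pictured with a small sketch. The class and function names, the three capability labels, and the thresholds below are assumptions made for illustration; the passage only states that the stored amount is the basis for the determination.

```python
from collections import deque


class VideoBuffer:
    """FIFO of captured frames; a frame leaves the buffer once it is read."""

    def __init__(self):
        self._frames = deque()

    def store(self, frame):
        # Called by the capture side at the fixed frame rate.
        self._frames.append(frame)

    def take(self):
        # Called by the encoder; the frame is discarded from the buffer.
        return self._frames.popleft()

    def occupancy(self):
        return len(self._frames)


def judge_capability(occupancy, low=2, high=6):
    """Classify capability from buffer occupancy (thresholds are assumed)."""
    if occupancy >= high:
        return "insufficient"  # encoder is falling behind the capture rate
    if occupancy <= low:
        return "spare"         # encoder keeps up with room to spare
    return "adequate"
```

A growing backlog means frames arrive faster than they are encoded, which is why occupancy can stand in for a direct measurement of processing capability.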

【０５８８】また、請求項７の音声符号化方法によれ
ば、音声に対して、帯域分割符号化方式により符号化を
行う音声符号化方法において、符号化処理に用いる数値
である、設定周波数ｆｓと、変換定数ｎとを記憶する記
憶ステップと、符号化の対象である音声を入力する音声
入力ステップと、上記記憶した設定周波数ｆｓに基づい
て決定されるサンプリング周波数を用いて、サンプリン
グ音声データを作成する入力音声サンプリングステップ
と、上記設定周波数ｆｓをサンプリング周波数として用
いた場合に得られるサンプリング音声データの個数をｍ
個とし、上記変換定数ｎに基づいて定められる数をｍ’
として、ｍ’個のサンプリング音声データを含む、ｍ個
の音声データからなる変換音声データを出力する音声デ
ータ変換ステップと、上記変換音声データを、帯域分割
してＭ個の帯域信号を得る帯域分割ステップと、上記記
憶した設定周波数ｆｓと変換定数ｎとから得られる周波
数ｆｓ／２ｎを制限周波数として、上記帯域信号のう
ち、制限周波数以下の帯域信号にのみ符号化ビットを割
り当てる符号化ビット割り当てステップと、上記割り当
てた符号化ビットに基づいて量子化を行う量子化ステッ
プと、上記量子化したデータを符号化データとして出力
する符号化ステップと、上記出力される符号化データを
記録する符号化データ記録ステップとを実行するので、
変換定数ｎの設定によって、処理負担を軽減し、音声取
り込みにともなったリアルタイム符号化処理を行って、
当該符号化装置の性能に応じた音質の符号化結果を得る
ことが可能となる。According to the audio encoding method of claim 7, an audio encoding method that encodes audio by a band division encoding scheme executes: a storage step of storing a set frequency fs and a conversion constant n, which are numerical values used in the encoding; an audio input step of inputting the audio to be encoded; an input audio sampling step of creating sampled audio data using a sampling frequency determined based on the stored set frequency fs; an audio data conversion step of outputting converted audio data consisting of m pieces of audio data that include m' pieces of sampled audio data, where m is the number of pieces of sampled audio data obtained when the set frequency fs is used as the sampling frequency and m' is a number determined based on the conversion constant n; a band division step of dividing the converted audio data into M band signals; an encoded bit allocation step of allocating encoded bits only to those band signals at or below a limit frequency, the limit frequency being the frequency fs/2n obtained from the stored set frequency fs and conversion constant n; a quantization step of quantizing based on the allocated encoded bits; an encoding step of outputting the quantized data as encoded data; and an encoded data recording step of recording the output encoded data. Therefore, by setting the conversion constant n, the processing load is reduced, real-time encoding can be performed as the audio is captured, and an encoding result with sound quality suited to the performance of the encoding apparatus can be obtained.
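The limit-frequency rule of claim 7 can be illustrated with a short sketch. It assumes that the M bands divide the range from 0 to fs/2 uniformly and that the limit frequency fs/2n is read as fs/(2n); neither assumption is stated in this passage, and the function name is hypothetical.

```python
def allocatable_bands(num_bands, n):
    """Indices of bands lying entirely at or below the limit frequency fs/(2n).

    Band b is assumed to span [b, b + 1] * fs / (2 * num_bands), so its upper
    edge is at or below fs / (2 * n) exactly when (b + 1) * n <= num_bands.
    The set frequency fs cancels out, which keeps the test in integers.
    """
    if n < 1:
        raise ValueError("conversion constant n must be at least 1")
    return [b for b in range(num_bands) if (b + 1) * n <= num_bands]
```

Under these assumptions, with M = 32 and n = 2 only the lower 16 bands receive bits, so the bit allocation and quantization work for the upper half is skipped.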

【０５８９】また、請求項８の音声符号化方法によれ
ば、請求項７の方法において、上記入力音声サンプリン
グステップでは、上記記憶した設定周波数ｆｓをサンプ
リング周波数として、上記入力された音声のサンプリン
グ処理により、ｍ個のサンプリング音声データを作成す
るものであり、上記音声データ変換ステップでは、上記
ｍ個のサンプリング音声データより、（ｎ−１）個おき
にサンプリング音声データを抽出し、２つの隣接する上
記抽出したサンプリング音声データの間に、（ｎ−１）
個の音声データを挿入して、ｍ個の変換音声データに変
換するので、処理負担を軽減し得る変換音声データを得
ることによって、音声取り込みにともなったリアルタイ
ム符号化処理を行って、当該符号化装置の性能に応じた
音質の符号化結果を得ることが可能となる。[0589] According to the audio encoding method of claim 8, in the method of claim 7, the input audio sampling step creates m pieces of sampled audio data by sampling the input audio with the stored set frequency fs as the sampling frequency, and the audio data conversion step extracts sampled audio data from the m pieces at intervals of (n-1) pieces and inserts (n-1) pieces of audio data between each two adjacent extracted pieces, thereby converting them into m pieces of converted audio data. Converted audio data that reduces the processing load is thus obtained, so that real-time encoding can be performed as the audio is captured and an encoding result with sound quality suited to the performance of the encoding apparatus can be obtained.

【０５９０】また、請求項９の音声符号化方法によれ
ば、請求項８の方法において、上記音声データ変換ステ
ップでは、上記抽出したサンプリング音声データがそれ
ぞれｎ個ずつ連続する変換音声データを作成すること
で、上記の処理負担を軽減し得る変換音声データを、容
易に得ることが可能となる。[0590] According to the audio encoding method of claim 9, in the method of claim 8, the audio data conversion step creates converted audio data in which each extracted piece of sampled audio data is repeated n times in succession, so that converted audio data that reduces the processing load can be obtained easily.
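The conversion of claims 8 and 9 (keep one sample, skip n-1, then repeat the kept sample n times) can be sketched as below. The function name and the handling of a final short group are assumptions.

```python
def decimate_and_hold(samples, n):
    """Keep every n-th sample and repeat it so the output length is unchanged."""
    if n < 1:
        raise ValueError("conversion constant n must be at least 1")
    out = []
    for start in range(0, len(samples), n):
        kept = samples[start]
        # The last group may be shorter than n when len(samples) % n != 0.
        out.extend([kept] * min(n, len(samples) - start))
    return out
```

For example, decimate_and_hold([1, 2, 3, 4, 5, 6], 2) gives [1, 1, 3, 3, 5, 5]: still m = 6 values, but only m/n of them carry independent information.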

【０５９１】また、請求項１０の音声符号化方法によれ
ば、請求項７の方法において、上記入力音声サンプリン
グステップでは、上記記憶した設定周波数ｆｓと変換定
数ｎとから得られる周波数ｆｓ／ｎをサンプリング周波
数として、上記入力された音声のサンプリング処理によ
り、ｍ／ｎ個のサンプリング音声データを作成するもの
であり、上記音声データ変換ステップでは、上記サンプ
リング音声データに基づき、２つの隣接するサンプリン
グ音声データの間に（ｎ−１）個の音声データを挿入し
て、ｍ個の変換音声データに変換するので、変換定数の
設定によって音声入力時のサンプリング周波数を制御す
ることにより、処理負荷を軽減することで、音声取り込
みにともなったリアルタイム符号化処理を行って、当該
符号化装置の性能に応じた音質の符号化結果を得ること
に加えて、サンプリングデータ量が低減することによ
る、データ一時蓄積のためのバッファ使用量の低減をも
図ることが可能となる。[0591] According to the audio encoding method of claim 10, in the method of claim 7, the input audio sampling step creates m/n pieces of sampled audio data by sampling the input audio with the frequency fs/n, obtained from the stored set frequency fs and conversion constant n, as the sampling frequency, and the audio data conversion step inserts (n-1) pieces of audio data between each two adjacent pieces of the sampled audio data, thereby converting them into m pieces of converted audio data. Because the sampling frequency at audio input is controlled by the setting of the conversion constant, the processing load is reduced, real-time encoding can be performed as the audio is captured, and an encoding result with sound quality suited to the performance of the encoding apparatus can be obtained. In addition, because the amount of sampled data is reduced, the amount of buffer used for temporarily storing data can also be reduced.

【０５９２】また、請求項１１の音声符号化方法によれ
ば、請求項１０の方法において、上記音声データ変換ス
テップでは、上記ｍ／ｎ個のサンプリング音声データ
が、それぞれｎ個ずつ連続する変換音声データを作成す
ることで、上記の処理負担を軽減し得る変換音声データ
を、容易に得ることが可能となる。[0592] According to the audio encoding method of claim 11, in the method of claim 10, the audio data conversion step creates converted audio data in which each of the m/n pieces of sampled audio data is repeated n times in succession, so that converted audio data that reduces the processing load can be obtained easily.
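For claims 10 and 11 the input is already sampled at fs/n, so only the repetition remains. A minimal sketch, with a hypothetical function name:

```python
def hold_upsample(low_rate_samples, n):
    """Repeat each sample captured at fs/n so the output has n times as many."""
    if n < 1:
        raise ValueError("conversion constant n must be at least 1")
    return [s for s in low_rate_samples for _ in range(n)]
```

Compared with sampling at fs and discarding samples afterwards, the input buffer here holds only m/n samples, which is the buffer saving mentioned for claim 10.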

【０５９３】また、請求項１２の音声符号化方法によれ
ば、請求項７ないし１１のいずれかの方法において、上
記サンプリング音声データを、入力バッファに一時的に
保持する音声バッファリングステップと、上記入力バッ
ファのデータ量を調べて、これを予め設定した値と比較
し、上記比較の結果に基づいて、上記レジスタに記憶さ
れた上記変換定数ｎの値を変更する入力バッファ監視ス
テップとを実行し、上記入力音声サンプリングステップ
では、上記サンプリング音声データを上記入力バッファ
に書き込むものであり、上記音声データ変換ステップで
は、上記入力バッファよりサンプリング音声データを読
み出して、これを上記変換するので、一時蓄積されるデ
ータ量を指標として、その時点での当該符号化装置の処
理能力を判断し、その結果に応じて変換定数ｎの数値を
変更することによって、当該装置の処理能力の変動に対
応して、音声取り込みにともなったリアルタイム符号化
処理を行うことが可能となる。According to the audio encoding method of claim 12, in the method of any one of claims 7 to 11, the method executes an audio buffering step of temporarily holding the sampled audio data in an input buffer, and an input buffer monitoring step of checking the amount of data in the input buffer, comparing it with a preset value, and changing the value of the conversion constant n stored in the register based on the result of the comparison. The input audio sampling step writes the sampled audio data to the input buffer, and the audio data conversion step reads the sampled audio data from the input buffer and converts it. The amount of temporarily stored data thus serves as an index for determining the processing capability of the encoding apparatus at that moment, and the value of the conversion constant n is changed according to the result, so that real-time encoding can be performed as the audio is captured while following fluctuations in the processing capability of the apparatus.
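One possible shape for the input buffer monitoring step is sketched below. The passage says only that the buffered amount is compared with a preset value and that n is changed accordingly; the direction and size of the change, the bounds n_min and n_max, and the function name are all assumptions.

```python
def adjust_conversion_constant(n, buffered, threshold, n_min=1, n_max=4):
    """Return the new conversion constant after one monitoring pass."""
    if buffered > threshold:
        # Backlog is growing: raise n to make the encoding cheaper.
        return min(n + 1, n_max)
    if buffered < threshold:
        # Encoder has headroom: lower n to recover sound quality.
        return max(n - 1, n_min)
    return n
```

A practical implementation would likely use separate upper and lower thresholds to avoid oscillating around a single value; the single threshold here simply mirrors the "preset value" wording.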

【０５９４】また、請求項１３の音声符号化方法によれ
ば、請求項７ないし１１のいずれかの方法において、上
記符号化ステップにおいて出力される単位時間当たりの
符号化データ量を調べて、これを予め設定した値と比較
し、上記比較の結果に基づいて、上記レジスタに記憶さ
れた上記変換定数ｎの値を変更する符号化データ監視ス
テップを実行するので、符号化データ量を指標として、
その時点での当該符号化装置の処理能力を判断し、その
結果に応じて変換定数ｎの数値を変更することによっ
て、当該装置の処理能力の変動に対応して、音声取り込
みにともなったリアルタイム符号化処理を行うことが可
能となる。According to the audio encoding method of claim 13, in the method of any one of claims 7 to 11, an encoded data monitoring step is executed that checks the amount of encoded data output per unit time in the encoding step, compares it with a preset value, and changes the value of the conversion constant n stored in the register based on the result of the comparison. The amount of encoded data thus serves as an index for determining the processing capability of the encoding apparatus at that moment, and the value of the conversion constant n is changed according to the result, so that real-time encoding can be performed as the audio is captured while following fluctuations in the processing capability of the apparatus.

【０５９５】また、請求項１４の音声符号化方法によれ
ば、音声に対して、帯域分割符号化方式を用いて符号化
を行う音声符号化方法において、上記符号化に用いる制
御定数を記憶する制御定数記憶ステップと、入力音声を
サンプリング処理して、サンプリングデータを出力する
サンプリングステップと、上記サンプリングステップで
得られたサンプリングデータに対して帯域分割を行い、
帯域信号データを出力する帯域分割ステップと、上記帯
域分割ステップで得られた帯域信号データに対して、符
号化ビットの割り当てを行う符号化ビット割り当てステ
ップと、上記符号化ビットの割り当てに従って、上記帯
域信号データの量子化を行い、量子化値を出力する量子
化ステップと、上記量子化ステップで得られた量子化値
に基づき、符号化データを出力する符号化ステップと、
上記記憶した制御定数に基づいて、上記帯域分割ステッ
プ、上記符号化ビット割り当てステップ、上記量子化ス
テップ、および上記符号化ステップにおけるデータ処理
を制御する符号化処理制御ステップとを実行するので、
制御定数の設定によって、処理負担を軽減し、音声取り
込みにともなったリアルタイム符号化処理を行って、当
該符号化装置の性能に応じた音質の符号化結果を得るこ
とが可能となる。[0595] According to the audio encoding method of claim 14, an audio encoding method that encodes audio using a band division encoding scheme executes: a control constant storage step of storing a control constant used in the encoding; a sampling step of sampling the input audio and outputting sampling data; a band division step of dividing the sampling data obtained in the sampling step into bands and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band division step; a quantization step of quantizing the band signal data according to the encoded bit allocation and outputting quantized values; an encoding step of outputting encoded data based on the quantized values obtained in the quantization step; and an encoding process control step of controlling, based on the stored control constant, the data processing in the band division step, the encoded bit allocation step, the quantization step, and the encoding step. Therefore, by setting the control constant, the processing load is reduced, real-time encoding can be performed as the audio is captured, and an encoding result with sound quality suited to the performance of the encoding apparatus can be obtained.

【０５９６】また、請求項１５の音声符号化方法によれ
ば、請求項１４の方法において、上記制御定数記憶ステ
ップでは、上記制御定数として、単位期間判定定数ｋを
単位期間判定定数レジスタに記憶するものであり、上記
符号化処理制御ステップは、上記帯域分割ステップでの
１回の帯域分割処理で対象とするサンプリングデータ数
をｐとし、ｐ個のサンプリングデータに相当する時間を
単位期間として、上記出力されるサンプリングデータの
ｐ個ごとに、相当する単位期間が符号化対象期間である
か符号化対象外期間であるかの判定を、上記記憶した単
位期間判定定数に基づいて行い、上記単位期間が上記符
号化対象期間と判定されたときのみ、該単位期間のサン
プリングデータが上記帯域分割ステップに出力されるよ
う制御し、上記単位期間が上記符号化対象外期間と判定
されたときは、上記符号化ステップにおいて、予め記憶
した固定的符号化データを符号化データとして出力する
よう制御する判定制御ステップであるので、対象期間ご
とに区切られたサンプリングデータに対して、符号化処
理対象期間のもののみに対して符号化処理を実行し、他
は符号化処理を実行せず、固定的符号化データを用いる
ことで、処理負担を軽減し、音声取り込みにともなった
リアルタイム符号化処理を行って、当該符号化装置の性
能に応じた音質の符号化結果を得ることが可能となる。According to the audio encoding method of claim 15, in the method of claim 14, the control constant storage step stores a unit period determination constant k in a unit period determination constant register as the control constant. The encoding process control step is a determination control step in which, with p denoting the number of pieces of sampling data handled by one band division operation in the band division step and the time corresponding to p pieces of sampling data taken as a unit period, it is determined for every p pieces of output sampling data, based on the stored unit period determination constant, whether the corresponding unit period is an encoding target period or a non-target period. Only when the unit period is determined to be an encoding target period is the sampling data of that unit period output to the band division step; when the unit period is determined to be a non-target period, the encoding step is controlled to output fixed encoded data stored in advance as the encoded data. Encoding is thus executed only on the sampling data of encoding target periods, and fixed encoded data is used for the rest, so that the processing load is reduced, real-time encoding can be performed as the audio is captured, and an encoding result with sound quality suited to the performance of the encoding apparatus can be obtained.

【０５９７】また、請求項１６の音声符号化方法によれ
ば、請求項１５の方法において、上記判定制御ステップ
では、ｉ番目の単位期間をｔｉとして、上記記憶した単
位期間判定定数ｋと任意の整数ｎとからｉ＝ｎ×ｋ＋１
が成立するとき、上記単位期間ｔｉが上記符号化対象期
間であると判定するので、単位期間鑑定定数ｋの設定に
応じて上記符号化対象期間を定め、処理負担の軽減を図
ることが可能となる。According to the audio encoding method of claim 16, in the method of claim 15, the determination control step determines, with ti denoting the i-th unit period, that the unit period ti is an encoding target period when i = n × k + 1 holds for the stored unit period determination constant k and an arbitrary integer n. The encoding target periods are thus set according to the unit period determination constant k, and the processing load can be reduced.
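The rule i = n × k + 1 of claim 16, combined with the fixed-data substitution of claim 15, can be sketched as follows. The function names and the encode and fixed_data arguments are placeholders, and unit periods are assumed to be numbered from 1 with n taken as a non-negative integer.

```python
def is_encoding_target(i, k):
    """True when i = n*k + 1 for some integer n >= 0 (periods count from 1)."""
    if i < 1 or k < 1:
        raise ValueError("i and k must be positive")
    return (i - 1) % k == 0


def encode_periods(periods, k, encode, fixed_data):
    """Encode target periods; emit pre-stored fixed data for the others."""
    return [
        encode(block) if is_encoding_target(i, k) else fixed_data
        for i, block in enumerate(periods, start=1)
    ]
```

With k = 3 the target periods are t1, t4, t7, and so on, so roughly one third of the band division work is performed.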

【０５９８】また、請求項１７の音声符号化方法によれ
ば、請求項１４の方法において、上記制御定数記憶ステ
ップでは、上記制御定数として、演算処理判定定数ｑを
演算処理判定定数レジスタに記憶するものであり、上記
符号化処理制御ステップは、上記帯域分割ステップに内
包され、上記記憶した演算処理判定定数ｑに基づいて、
上記帯域分割ステップにおける演算処理を途中で打ち切
るように制御する演算処理中止ステップであるので、演
算処理判定定数ｑの設定に応じて、帯域分割ステップに
おける演算処理の一部を省略することによって、処理負
担の軽減を図ることが可能となる。[0598] According to the audio encoding method of claim 17, in the method of claim 14, the control constant storage step stores an arithmetic processing determination constant q in an arithmetic processing determination constant register as the control constant, and the encoding process control step is an arithmetic processing abort step that is contained in the band division step and, based on the stored arithmetic processing determination constant q, controls the arithmetic processing in the band division step so that it is cut off partway. Part of the arithmetic processing in the band division step is thus omitted according to the setting of q, and the processing load can be reduced.

【０５９９】また、請求項１８の音声符号化方法によれ
ば、請求項１７の方法において、上記演算処理中止ステ
ップでは、上記帯域分割ステップにおける基本低域通過
フィルタの演算処理を、該フィルタの両端ステップ分に
ついては途中で打ち切るように制御することで、上記の
演算処理の省略をし、処理負担の軽減を図ることが可能
となる。According to the audio encoding method of claim 18, in the method of claim 17, the arithmetic processing abort step controls the computation of the basic low-pass filter in the band division step so that it is cut off partway for the steps at both ends of the filter, whereby the above computation is omitted and the processing load can be reduced.
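Cutting off the filter computation at both ends can be pictured as a multiply-accumulate that ignores the outermost taps. The sketch below is an assumption-laden illustration: the parameter skip (the number of taps dropped at each end) is hypothetical, and how the constant q of claim 17 maps onto it is not stated in this passage.

```python
def truncated_fir(samples, coeffs, skip):
    """Multiply-accumulate over the filter taps, omitting `skip` taps per end."""
    if len(samples) != len(coeffs):
        raise ValueError("samples and coeffs must have the same length")
    if skip < 0 or 2 * skip > len(coeffs):
        raise ValueError("skip is out of range for this filter length")
    lo, hi = skip, len(coeffs) - skip
    return sum(c * x for c, x in zip(coeffs[lo:hi], samples[lo:hi]))
```

The end taps of a low-pass prototype filter usually have the smallest magnitudes, so dropping them saves multiplications at a comparatively small cost in accuracy.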

【０６００】また、請求項１９の音声符号化方法によれ
ば、請求項１４の方法において、上記制御定数記憶ステ
ップでは、上記制御定数として、帯域選択定数ｒを帯域
選択定数レジスタに記憶するものであり、上記符号化処
理制御ステップは、上記帯域分割ステップが出力する帯
域信号データのうち、上記記憶した帯域選択定数ｒに基
づいて選択したもののみに対して、上記符号化ビット割
り当てステップと上記量子化ステップとにおける処理を
実行するよう制御する帯域間引きステップであるので、
帯域選択定数ｒの設定に応じて、帯域信号データの一部
に対して、後段の処理を省略することによって、処理負
担の軽減を図ることが可能となる。According to the audio encoding method of claim 19, in the method of claim 14, the control constant storage step stores a band selection constant r in a band selection constant register as the control constant, and the encoding process control step is a band thinning step that controls the encoded bit allocation step and the quantization step so that they are executed only on those pieces of band signal data, among those output by the band division step, that are selected based on the stored band selection constant r. The subsequent processing is thus omitted for part of the band signal data according to the setting of r, and the processing load can be reduced.

【０６０１】また、請求項２０の音声符号化方法によれ
ば、請求項１９の方法において、上記帯域間引きステッ
プでは、上記帯域分割ステップで得られたＭ個の帯域信
号データ出力から、上記記憶した帯域選択定数であるｒ
個おきに帯域信号データを選択することによって、上記
の帯域信号データの選択を実行し、処理負担の軽減を図
ることが可能となる。[0601] According to the audio encoding method of claim 20, in the method of claim 19, the band thinning step performs the above selection by selecting band signal data at intervals of r, the stored band selection constant, from the M band signal data outputs obtained in the band division step, so that the processing load can be reduced.
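Band thinning amounts to a strided selection over the M band outputs. Whether "at intervals of r" corresponds to a stride of r or of r + 1 is not settled by this passage, so the sketch below takes the stride as an explicit, hypothetical parameter.

```python
def select_bands(band_data, stride):
    """Return (index, data) pairs for the bands kept by a strided selection."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return [(b, band_data[b]) for b in range(0, len(band_data), stride)]
```

Only the returned bands would go on to bit allocation and quantization; the others would be treated as carrying no bits.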

【０６０２】また、請求項２１の音声符号化方法によれ
ば、請求項１４ないし２０のいずれかの方法において、
音声符号化におけるデータ処理の状況を取得し、該取得
した状況に応じて、上記記憶した上記制御定数の値を変
更する処理状況監視ステップを実行するので、データ処
理の状況を指標として、その時点での当該符号化装置の
処理能力を判断し、その結果に応じて制御定数の数値を
変更することによって、当該装置の処理能力の変動に対
応して、音声取り込みにともなったリアルタイム符号化
処理を行うことが可能となる。[0602] According to the audio encoding method of claim 21, in the method of any one of claims 14 to 20, a processing status monitoring step is executed that acquires the status of data processing in the audio encoding and changes the value of the stored control constant according to the acquired status. The data processing status thus serves as an index for determining the processing capability of the encoding apparatus at that moment, and the value of the control constant is changed according to the result, so that real-time encoding can be performed as the audio is captured while following fluctuations in the processing capability of the apparatus.

【０６０３】また、請求項２２の音声符号化方法によれ
ば、請求項２１の方法において、上記処理状況監視ステ
ップでは、サンプリングデータを入力バッファに一時蓄
積する音声バッファリングステップと、上記入力バッフ
ァに保持されるデータの量を予め設定した値と比較し、
上記比較の結果に基づいて上記制御定数変更を行う入力
監視ステップとを実行することで、一時蓄積量を指標と
して、その時点での当該符号化装置の処理能力を判断
し、その結果に応じて変換定数ｎの数値を変更すること
によって、当該装置の処理能力の変動に対応して、音声
取り込みにともなったリアルタイム符号化処理を行うこ
とが可能となる。[0603] According to the audio encoding method of claim 22, in the method of claim 21, the processing status monitoring step executes an audio buffering step of temporarily storing the sampling data in an input buffer, and an input monitoring step of comparing the amount of data held in the input buffer with a preset value and changing the control constant based on the result of the comparison. The temporarily stored amount thus serves as an index for determining the processing capability of the encoding apparatus at that moment, and the value of the conversion constant n is changed according to the result, so that real-time encoding can be performed as the audio is captured while following fluctuations in the processing capability of the apparatus.

【０６０４】また、請求項２３の音声符号化方法によれ
ば、請求項２１の方法において、上記処理状況監視ステ
ップは、上記符号化ステップにおいて単位時間当たりに
出力される上記符号化データの量を、予め設定した値と
比較し、上記比較の結果に基づいて上記制御定数の値を
変更する符号化監視ステップであるものとしたことで、
符号化データ量を指標として、その時点での当該符号化
装置の処理能力を判断し、その結果に応じて変換定数ｎ
の数値を変更することによって、当該装置の処理能力の
変動に対応して、音声取り込みにともなったリアルタイ
ム符号化処理を行うことが可能となる。[0604] According to the audio encoding method of claim 23, in the method of claim 21, the processing status monitoring step is an encoding monitoring step that compares the amount of encoded data output per unit time in the encoding step with a preset value and changes the value of the control constant based on the result of the comparison. The amount of encoded data thus serves as an index for determining the processing capability of the encoding apparatus at that moment, and the value of the conversion constant n is changed according to the result, so that real-time encoding can be performed as the audio is captured while following fluctuations in the processing capability of the apparatus.

【０６０５】また、請求項２４の音声符号化方法によれ
ば、音声がデジタル化された原音声情報に対して、帯域
分割符号化方式を用いて符号化を行う音声符号化方法に
おいて、入力音声をサンプリング処理して、サンプリン
グデータを出力するサンプリングステップと、上記サン
プリングステップで得られたサンプリングデータに対し
て帯域分割を行い、帯域信号データを出力する帯域分割
ステップと、上記帯域分割ステップで得られた帯域信号
データに対して、符号化ビットの割り当てを行う符号化
ビット割り当てステップと、上記符号化ビット割り当て
ステップにおける割り当てを心理聴覚分析代替制御方式
により制御するビット割り当て制御ステップと、上記符
号化ビットの割り当てに従って、上記帯域信号データの
量子化を行い、量子化値を出力する量子化ステップと、
上記量子化ステップで得られた量子化値に基づき、符号
化データを出力する符号化ステップとを実行すること
で、心理聴覚分析を簡略化したビット割り当て制御を実
行し、人間の聴覚の特性に応じた高品質の符号化結果の
得られる符号化処理を、処理負担を大きく増大すること
なく実行することが可能となる。[0605] According to the audio encoding method of claim 24, an audio encoding method that encodes original audio information, obtained by digitizing audio, using a band division encoding scheme executes: a sampling step of sampling the input audio and outputting sampling data; a band division step of dividing the sampling data obtained in the sampling step into bands and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band division step; a bit allocation control step of controlling the allocation in the encoded bit allocation step by a psychoacoustic analysis alternative control scheme; a quantization step of quantizing the band signal data according to the encoded bit allocation and outputting quantized values; and an encoding step of outputting encoded data based on the quantized values obtained in the quantization step. Bit allocation control that simplifies psychoacoustic analysis is thus executed, so that encoding that yields a high-quality result matched to the characteristics of human hearing can be performed without greatly increasing the processing load.

【０６０６】また、請求項２５の音声符号化方法によれ
ば、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、心理聴覚分析代替制御方式により予
め定められたビット割り当て順に従って、符号化ビット
割り当てを行うよう制御する順次ビット割り当てステッ
プであるものとしたことで、帯域ごとに単純なアルゴリ
ズムでビットを割り当てて、簡略化した心理聴覚分析を
実行し、若干の処理負担の増大はあるものの、再生音質
の良好な符号化データを得ることが可能となる。[0606] According to the audio encoding method of claim 25, in the method of claim 24, the bit allocation control step is a sequential bit allocation step that controls the encoded bit allocation so that it is performed on the band signal data obtained in the band division step in a bit allocation order predetermined by the psychoacoustic analysis alternative control scheme. Bits are thus allocated to each band by a simple algorithm, a simplified psychoacoustic analysis is carried out, and encoded data with good playback sound quality can be obtained at the cost of only a slight increase in processing load.

【０６０７】また、請求項２６の音声符号化方法によれ
ば、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、心理聴覚分析代替制御方式により予
め定められた各帯域への重み付けと、各帯域信号データ
の有する出力レベルとに基づいた符号化ビット割り当て
を行うよう制御する帯域出力適応ビット割り当てステッ
プであるものとしたことで、帯域ごとの割り当てと、デ
ータ自体の性質とに応じてビットを割り当てて、簡略化
した心理聴覚分析を実行し、若干の処理負担の増大はあ
るものの、再生音質の良好な符号化データを得ることが
可能となる。[0607] According to the audio encoding method of claim 26, in the method of claim 24, the bit allocation control step is a band output adaptive bit allocation step that controls the encoded bit allocation so that it is based on a weighting of each band predetermined by the psychoacoustic analysis alternative control scheme and on the output level of each piece of band signal data. Bits are thus allocated according to both the per-band weighting and the nature of the data itself, a simplified psychoacoustic analysis is carried out, and encoded data with good playback sound quality can be obtained at the cost of only a slight increase in processing load.

【０６０８】また、請求項２７の音声符号化方法によれ
ば、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、心理聴覚分析代替制御方式により予
め定められた各帯域への重み付けと、各帯域毎のビット
割り当て数に対する重み付けと、各帯域信号データの有
する出力レベルとに基づいた符号化ビット割り当てを行
うよう制御する改良型帯域出力適応ビット割り当てステ
ップであるものとしたことで、帯域ごとの割り当てと、
データ自体の性質とに応じて、すでに割り当てたビット
配分を考慮しつつビットを割り当てて、簡略化した心理
聴覚分析を実行し、若干の処理負担の増大はあるもの
の、再生音質の良好な符号化データを得ることが可能と
なる。[0608] According to the audio encoding method of claim 27, in the method of claim 24, the bit allocation control step is an improved band output adaptive bit allocation step that controls the encoded bit allocation so that it is based on a weighting of each band predetermined by the psychoacoustic analysis alternative control scheme, a weighting according to the number of bits allocated to each band, and the output level of each piece of band signal data. Bits are thus allocated according to the per-band weighting and the nature of the data itself while taking the bits already allocated into account, a simplified psychoacoustic analysis is carried out, and encoded data with good playback sound quality can be obtained at the cost of only a slight increase in processing load.
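One way to combine the three factors named for claim 27 is a greedy loop that hands out one bit at a time to the band with the highest score. This is a sketch under stated assumptions, not the patent's procedure: the multiplicative score, the greedy strategy, the tie-breaking by lowest band index, and all names and weight values are hypothetical.

```python
def allocate_bits(levels, band_weights, alloc_weights, total_bits):
    """Greedy bit allocation by level x band weight x allocation-count weight.

    alloc_weights[j] scales the score of a band that already holds j bits;
    its length is also the maximum number of bits a single band may receive.
    """
    bits = [0] * len(levels)
    for _ in range(total_bits):
        best, best_score = None, 0.0
        for b, level in enumerate(levels):
            if bits[b] >= len(alloc_weights):
                continue  # this band already holds the maximum number of bits
            score = level * band_weights[b] * alloc_weights[bits[b]]
            if score > best_score:
                best, best_score = b, score
        if best is None:
            break  # no band can usefully take another bit
        bits[best] += 1
    return bits
```

With a decreasing alloc_weights sequence, a loud band stops dominating once it holds several bits, which is the effect of considering the distribution already allocated. Dropping alloc_weights from the score would give the simpler weighting-and-level criterion of claim 26.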

【０６０９】また、請求項２８の音声符号化方法によれ
ば、請求項２４の方法において、上記ビット割り当て制
御ステップは、上記帯域分割ステップで得られた帯域信
号データに対して、帯域信号データごとに最小可聴限界
値との比較を行い、上記比較により最小可聴限界未満と
判定された帯域信号データにはビット割り当てを行わ
ず、他の帯域に対してのビット割り当てを増加するよう
制御する最小可聴限界比較ステップであるものとしたこ
とで、最小可聴限界を考慮してビットを割り当てて、簡
略化した心理聴覚分析を実行し、若干の処理負担の増大
はあるものの、再生音質の良好な符号化データを得るこ
とが可能となる。[0609] According to the audio encoding method of claim 28, in the method of claim 24, the bit allocation control step is a minimum audible limit comparison step that compares each piece of band signal data obtained in the band division step with a minimum audible limit value, allocates no bits to band signal data determined by the comparison to be below the minimum audible limit, and increases the bit allocation to the other bands. Bits are thus allocated with the minimum audible limit taken into account, a simplified psychoacoustic analysis is carried out, and encoded data with good playback sound quality can be obtained at the cost of only a slight increase in processing load.
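The minimum audible limit comparison can be sketched as a filter followed by a redistribution. The even split of the freed bits among the audible bands is an assumption made for brevity; the passage says only that the allocation to the other bands is increased. The names are hypothetical.

```python
def allocate_with_threshold(levels, thresholds, total_bits):
    """Give no bits to bands below their audibility threshold; share the rest."""
    if len(levels) != len(thresholds):
        raise ValueError("levels and thresholds must have the same length")
    audible = [b for b, (lv, th) in enumerate(zip(levels, thresholds)) if lv >= th]
    bits = [0] * len(levels)
    if not audible:
        return bits
    share, extra = divmod(total_bits, len(audible))
    for rank, b in enumerate(audible):
        # The first `extra` audible bands absorb the remainder, one bit each.
        bits[b] = share + (1 if rank < extra else 0)
    return bits
```

Because inaudible bands cost nothing, the same bit budget is spread over fewer bands, and each audible band is quantized more finely.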

【０６１０】また、請求項２９の映像音声符号化方法に
よれば、映像と音声とを符号化するにあたり、上記２つ
の符号化処理に含まれる処理過程の一部または全部を、
共通の計算機資源を用いて実行する映像音声符号化方法
において、単位時間毎の静止画像を表す複数の静止画像
情報からなる原映像情報と、音声を表す原音声情報とか
ら構成される映像音声情報が入力されたとき、上記原音
声情報を一時的に蓄積する音声バッファリングステップ
と、上記音声バッファリングステップにおいて蓄積され
た原音声情報を読み出し、この読み出した上記原音声情
報を符号化処理し、符号化音声情報を出力する音声符号
化ステップと、映像符号化の負荷程度を表す符号化負荷
基準情報を用いて、当該映像音声符号化処理についての
処理能力を判断し、その判断の結果に基づいて、後述す
る映像符号化ステップにおける原映像情報に対する符号
化を制御する符号化負荷評価ステップと、上記符号化負
荷評価ステップにおける制御に従って、入力された上記
原映像情報を構成する静止画像情報を符号化処理し、符
号化映像情報を出力する映像符号化ステップとを実行す
るので、当該映像音声符号化を行う符号化装置の処理能
力に対応して、映像符号化を制御し、映像音声の取り込
みにともなったリアルタイムの符号化処理を実行し、再
生時の音途切れのない、良好な符号化結果を得ることが
可能となる。According to the video/audio encoding method of claim 29, in a video/audio encoding method that, when encoding video and audio, executes some or all of the processing included in the two encoding processes using common computer resources, the following are executed when video/audio information is input that consists of original video information, made up of a plurality of pieces of still image information each representing a still image per unit time, and original audio information representing audio: an audio buffering step of temporarily storing the original audio information; an audio encoding step of reading out the original audio information stored in the audio buffering step, encoding it, and outputting encoded audio information; an encoding load evaluation step of determining the processing capability available for the video/audio encoding using encoding load reference information, which represents the degree of load of video encoding, and controlling, based on the result, the encoding of the original video information in the video encoding step described below; and a video encoding step of encoding the still image information constituting the input original video information under the control of the encoding load evaluation step and outputting encoded video information. Video encoding is thus controlled according to the processing capability of the encoding apparatus, real-time encoding is performed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained.

[0611] According to the video/audio encoding method of claim 30, in the method of claim 29, when still image information constituting the original video information is input, the encoding load evaluation step obtains encoding load evaluation information based on the total amount of original audio information stored in the audio buffering step and on the encoding load reference information, and compares the encoding load evaluation information with a preset load limit. If the encoding load evaluation information has not reached the load limit, the still image information is output; if it has reached the load limit, the still image information is discarded. By using the amount of temporarily stored audio data as an indicator to decide whether or not to encode the video data, real-time encoding can be executed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained.
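The decision described in paragraph [0611] can be sketched as follows. This is only an illustration: the function names are hypothetical, and the product form of the evaluation (buffered audio amount multiplied by the reference value) is an assumption, since this passage says only that the evaluation is obtained "based on" the two quantities.

```python
def encoding_load_evaluation(buffered_audio_bytes: int, load_reference: float) -> float:
    """Assumed form: the evaluation grows with the audio backlog, scaled by the reference value."""
    return buffered_audio_bytes * load_reference


def should_encode_frame(buffered_audio_bytes: int, load_reference: float,
                        load_limit: float) -> bool:
    """Return True to pass the still image to the video encoder, False to discard it."""
    evaluation = encoding_load_evaluation(buffered_audio_bytes, load_reference)
    # "Has not reached the load limit" is read here as strictly below the limit.
    return evaluation < load_limit
```

A growing audio backlog means the audio encoder is starved of processor time, so dropping a video frame frees the shared resources for audio.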

[0612] According to the video/audio encoding method of claim 31, the method of claim 29 further executes a video capture step that receives analog video information and, when the video resolution information described below is output, converts the analog video information into original video information composed of a plurality of pieces of still image information, each consisting of discrete digital pixel information and having a resolution that follows the video resolution information, and outputs it for processing in the video encoding step. The encoding load evaluation step obtains encoding load evaluation information based on the total amount of original audio information stored in the audio buffering step and on the encoding load reference information indicating the degree of the video encoding load, derives from the encoding load evaluation information video resolution information indicating the resolution of the video to be used for video encoding, and outputs the video resolution information. When the video resolution information is output, the video encoding step encodes the still image information in accordance with it and outputs encoded video information. By using the amount of temporarily stored audio data as an indicator to determine the resolution used in encoding the video data, real-time encoding can be executed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained without interruptions in the video.
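As a rough illustration of paragraph [0612], the mapping from encoding load evaluation information to video resolution information could look like the sketch below. The threshold values, the resolutions, and all names are hypothetical; the passage does not state how the resolution is derived.

```python
# (upper bound on the load evaluation, (width, height)); all values are hypothetical.
RESOLUTION_STEPS = [
    (1000.0, (320, 240)),
    (2000.0, (160, 120)),
]
FALLBACK_RESOLUTION = (80, 64)  # used when the load evaluation exceeds every bound


def select_resolution(load_evaluation: float) -> tuple:
    """Pick a lower capture resolution as the encoding load evaluation rises."""
    for upper_bound, resolution in RESOLUTION_STEPS:
        if load_evaluation < upper_bound:
            return resolution
    return FALLBACK_RESOLUTION
```

Unlike discarding frames, lowering the resolution keeps every frame in the stream, which matches the stated benefit of avoiding interruptions in the video.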

[0613] According to the video/audio encoding method of claim 32, in the method of claim 29, the encoding load evaluation step outputs the encoding load evaluation information for use in the video encoding step, and the video encoding step encodes the still image information only to the extent of a processing amount calculated from the output encoding load evaluation information and outputs the result as encoded video information. By using the amount of temporarily stored audio data as an indicator to determine the proportion of encoding executed on the video data, real-time encoding can be executed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained without interruptions in the video.
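One way to picture the "processing amount" of paragraph [0613] is as a number of work units (for example, blocks of a frame) that shrinks as the load evaluation approaches the limit. The linear headroom rule and the names below are assumptions made for illustration, not the calculation given in the patent.

```python
def processing_amount(total_units: int, load_evaluation: float, load_limit: float) -> int:
    """Number of units of a frame to encode, given the current load evaluation."""
    if load_limit <= 0:
        raise ValueError("load_limit must be positive")
    # Fraction of the limit still unused, clamped to the range [0, 1].
    headroom = min(1.0, max(0.0, 1.0 - load_evaluation / load_limit))
    return int(total_units * headroom)
```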

[0614] According to the video/audio encoding method of claim 33, in the method of any one of claims 29 to 32, the audio encoding step reads the original audio information stored in the audio buffering step, calculates the total amount of the original audio information read and outputs it as a processed audio information amount, and then encodes the original audio information and outputs it as encoded audio information. The encoding load evaluation step obtains an original audio input amount from the elapsed time and the input amount of original audio information per unit time, obtains a predicted audio buffer amount as the difference between this original audio input amount and the processed audio information amount, and uses the predicted audio buffer amount to obtain the encoding load evaluation information. By using the predicted buffer amount as an indicator in place of the amount of temporarily stored audio data to control the processing of the video data, real-time encoding can be executed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained even when the amount of temporarily stored audio data is impossible or difficult to obtain.
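The predicted audio buffer amount of paragraph [0614] is simple arithmetic: the audio that must have arrived so far (elapsed time multiplied by the input rate) minus the audio already read for encoding. A minimal sketch, with hypothetical names and units of bytes and seconds; the clamp at zero is an added safeguard, not something the text states:

```python
def predicted_audio_buffer(elapsed_seconds: float, input_bytes_per_second: float,
                           processed_bytes: float) -> float:
    """Estimate the audio backlog without querying the buffer itself."""
    original_input = elapsed_seconds * input_bytes_per_second  # original audio input amount
    # Clamp so that timing jitter never yields a negative backlog (assumption).
    return max(0.0, original_input - processed_bytes)
```

For example, two seconds of 16-bit stereo audio at 44.1 kHz is 2 × 176,400 = 352,800 bytes; if 300,000 bytes have been processed, the predicted backlog is 52,800 bytes.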

[0615] According to the video/audio encoding method of claim 34, in the method of any one of claims 29 to 32, when the still image information is input, the encoding load evaluation step obtains an original audio input amount from the elapsed time and the input amount of original audio information per unit time, obtains a processed audio information amount from the total amount of encoded audio information output in the audio encoding step, then obtains a predicted audio buffer amount as the difference between the original audio input amount and the processed audio information amount, and uses the predicted audio buffer amount to obtain the encoding load evaluation information. By using the predicted buffer amount as an indicator in place of the amount of temporarily stored audio data to control the processing of the video data, real-time encoding can be executed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained even when it is impossible or difficult to obtain both the amount of temporarily stored audio data and the processing amount from the encoding step.

[0616] According to the video/audio encoding method of claim 35, in the method of any one of claims 29 to 34, the variation in the result of the judgment made in the encoding load evaluation step is monitored, and the encoding load reference information is set in response to that variation. Controlling the encoding of the video makes it possible to execute real-time encoding as the video and audio are captured and to obtain a good encoding result free of audio dropouts during playback. In addition, monitoring the variation in the information used for control and setting the control reference in response to that variation makes it possible to suppress severe fluctuations in the quality of the reproduced video.

[0617] The video encoding device according to claim 36 is a video encoding device that encodes video, comprising: video encoding means for encoding one or more pieces of still image information, taken from original video information that consists of a plurality of pieces of still image information obtained by digitizing the video, in accordance with the encoding parameters described below; and encoding parameter determining means for determining, based on a given frame rate, encoding parameters that determine the processing amount of the encoding means, one encoding parameter being one or more resolutions and another encoding parameter being one or more encoding types selected from encoding types that include intra-frame encoding, forward predictive encoding, backward predictive encoding, and bidirectional predictive encoding. By setting encoding parameters that yield a high-resolution or high-compression-ratio encoding result at the given frame rate, the resources of the device can be put to use and a good encoding result can be obtained.
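The abstract describes the encoding parameter determining means as consulting a resolution reference table keyed by the designated frame rate and encoding pattern. The sketch below illustrates that kind of lookup. The table contents, the pattern labels ("I", "IP", "IBBP"), and the function name are all hypothetical; only the idea of a table lookup comes from the source.

```python
# (frame rate, encoding pattern) -> resolution. Hypothetical values: patterns that
# use more predictive coding are assumed to cost more processing, so they are
# paired with lower resolutions at the same frame rate.
RESOLUTION_TABLE = {
    (30, "I"): (320, 240),
    (30, "IP"): (160, 120),
    (30, "IBBP"): (80, 64),
    (15, "I"): (320, 240),
    (15, "IP"): (320, 240),
    (15, "IBBP"): (160, 120),
}


def determine_parameters(frame_rate: int, pattern: str) -> dict:
    """Look up the resolution for a designated frame rate and encoding pattern."""
    try:
        resolution = RESOLUTION_TABLE[(frame_rate, pattern)]
    except KeyError:
        raise ValueError(f"no table entry for {frame_rate} fps with pattern {pattern!r}")
    return {"frame_rate": frame_rate, "pattern": pattern, "resolution": resolution}
```

Precomputing the table for a given machine keeps the run-time decision to a single lookup, which suits software encoding on general-purpose hardware.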

[0618] According to the audio encoding device of claim 37, an audio encoding device that encodes audio by a band division encoding scheme comprises: a register that stores a set frequency fs and a conversion constant n, which are numerical values used in the encoding process; audio input means for inputting the audio to be encoded; input audio sampling means for creating sampled audio data using a sampling frequency determined based on the stored set frequency fs; audio data conversion means for outputting converted audio data consisting of m pieces of audio data that include m' pieces of sampled audio data, where m is the number of pieces of sampled audio data obtained when the set frequency fs is used as the sampling frequency and m' is a number determined based on the conversion constant n; band division means for dividing the converted audio data into bands to obtain M band signals; encoding bit allocation means for allocating encoding bits only to those band signals at or below a limit frequency, the limit frequency being the frequency fs/2n obtained from the stored set frequency fs and the conversion constant n; quantization means for performing quantization based on the allocated encoding bits; encoding means for outputting the quantized data as encoded data; and encoded data recording means for recording the output encoded data. Setting the conversion constant n therefore reduces the processing load, allows real-time encoding to be performed as the audio is captured, and yields an encoding result whose sound quality matches the performance of the encoding device.
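The two roles of the conversion constant n in paragraph [0618] can be sketched as follows. Several points are assumptions made for illustration: fs/2n is read as fs/(2·n), the M bands are taken to be of equal width over 0 to fs/2 with M = 32 as a default, and the conversion to m pieces of data is shown as repeating each of the m' samples n times. The function names are hypothetical.

```python
def limit_frequency(fs_hz: float, n: int) -> float:
    """Limit frequency fs/(2n); with n = 1 this is the Nyquist frequency fs/2."""
    return fs_hz / (2 * n)


def bands_receiving_bits(fs_hz: float, n: int, num_bands: int = 32) -> list:
    """Indices of the equal-width bands that lie entirely at or below the limit frequency."""
    band_width = fs_hz / (2 * num_bands)
    limit = limit_frequency(fs_hz, n)
    return [k for k in range(num_bands) if (k + 1) * band_width <= limit]


def convert_audio(sampled: list, n: int) -> list:
    """Expand m' samples to m = m' * n pieces of data by repeating each sample n times."""
    converted = []
    for sample in sampled:
        converted.extend([sample] * n)
    return converted
```

With fs = 48 kHz and n = 2, the limit frequency is 12 kHz and only the lower 16 of 32 bands receive bits, so bit allocation and quantization work is roughly halved.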

[0619] According to the audio encoding device of claim 38, an audio encoding device that encodes audio using a band division encoding scheme comprises: control constant storage means for storing a control constant used in the encoding; sampling means for sampling the input audio and outputting sampling data; band division means for dividing the sampling data obtained by the sampling means into bands and outputting band signal data; encoding bit allocation means for allocating encoding bits to the band signal data obtained by the band division means; quantization means for quantizing the band signal data in accordance with the encoding bit allocation and outputting quantized values; encoding means for outputting encoded data based on the quantized values obtained by the quantization means; and encoding process control means for controlling, based on the stored control constant, the data processing in the band division means, the encoding bit allocation means, the quantization means, and the encoding means. Setting the control constant therefore reduces the processing load, allows real-time encoding to be performed as the audio is captured, and yields an encoding result whose sound quality matches the performance of the encoding device.

[0620] According to the audio encoding device of claim 39, an audio encoding device that encodes audio using a band division encoding scheme comprises: sampling means for sampling the input audio and outputting sampling data; band division means for dividing the sampling data obtained by the sampling means into bands and outputting band signal data; encoding bit allocation means for allocating encoding bits to the band signal data obtained by the band division means; bit allocation control means for controlling the allocation performed by the encoding bit allocation means by a psychoacoustic-analysis alternative control scheme; quantization means for quantizing the band signal data in accordance with the encoding bit allocation and outputting quantized values; and encoding means for outputting encoded data based on the quantized values obtained by the quantization means. Bit allocation control that simplifies psychoacoustic analysis is therefore executed, and encoding that yields a high-quality result suited to the characteristics of human hearing can be performed without greatly increasing the processing load.

[0621] According to the video/audio encoding device of claim 40, a video/audio encoding device in which, when video and audio are encoded, part or all of the processing steps included in the two encoding processes are executed using common computer resources comprises: audio buffering means for temporarily storing the original audio information when video/audio information is input, the video/audio information consisting of original video information, composed of a plurality of pieces of still image information each representing a still image per unit time, and original audio information representing audio; audio encoding means for reading the original audio information stored in the audio buffering means, encoding it, and outputting encoded audio information; encoding load evaluation means for judging the processing capability of the video/audio encoding device using encoding load reference information that indicates the degree of the video encoding load and, based on the result of that judgment, controlling the output of the original video information to the video encoding means described below; and video encoding means for, when still image information constituting the original video information is input under the control of the encoding load evaluation means, encoding the still image information and outputting encoded video information. Video encoding is therefore controlled in accordance with the processing capability of the encoding device that performs the video/audio encoding, real-time encoding can be executed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained.

[0622] According to the video encoding program recording medium of claim 41, a recording medium on which a video encoding program for encoding video is recorded holds an encoding program that executes: a video encoding step of encoding one or more pieces of still image information, taken from original video information that consists of a plurality of pieces of still image information obtained by digitizing the video, in accordance with the encoding parameters described below; and an encoding parameter determining step of determining, based on a given frame rate, encoding parameters that determine the processing amount of the encoding step, one encoding parameter being one or more resolutions and another encoding parameter being one or more encoding types selected from encoding types that include intra-frame encoding, forward predictive encoding, backward predictive encoding, and bidirectional predictive encoding. By running the encoding program on a general-purpose computer such as a personal computer, encoding parameters that yield a high-resolution or high-compression-ratio encoding result at the given frame rate are set, so the resources of the device can be put to use and a good encoding result can be obtained.

[0623] According to the audio encoding program recording medium of claim 42, a recording medium on which an audio encoding program for encoding audio by a band division encoding scheme is recorded holds an encoding program that executes: a storing step of storing a set frequency fs and a conversion constant n, which are numerical values used in the encoding process; an audio input step of inputting the audio to be encoded; an input audio sampling step of creating sampled audio data using a sampling frequency determined based on the stored set frequency fs; an audio data conversion step of outputting converted audio data consisting of m pieces of audio data that include m' pieces of sampled audio data, where m is the number of pieces of sampled audio data obtained when the set frequency fs is used as the sampling frequency and m' is a number determined based on the conversion constant n and satisfying m ≧ m'; a band division step of dividing the converted audio data into bands to obtain M band signals; an encoding bit allocation step of allocating encoding bits only to those band signals at or below a limit frequency, the limit frequency being the frequency fs/2n obtained from the stored set frequency fs and the conversion constant n; a quantization step of performing quantization based on the allocated encoding bits; an encoding step of outputting the quantized data as encoded data; and an encoded data recording step of recording the output encoded data. By running the encoding program on a general-purpose computer such as a personal computer, setting the conversion constant n reduces the processing load, allows real-time encoding to be performed as the audio is captured, and yields an encoding result whose sound quality matches the performance of the encoding device.

[0624] According to the audio encoding program recording medium of claim 43, a recording medium on which an audio encoding program for encoding audio using a band division encoding scheme is recorded holds an encoding program that executes: a control constant storing step of storing a control constant used in the encoding; a sampling step of sampling the input audio and outputting sampling data; a band division step of dividing the sampling data obtained in the sampling step into bands and outputting band signal data; an encoding bit allocation step of allocating encoding bits to the band signal data obtained in the band division step; a quantization step of quantizing the band signal data in accordance with the encoding bit allocation and outputting quantized values; an encoding step of outputting encoded data based on the quantized values obtained in the quantization step; and an encoding process control step of controlling, based on the stored control constant, the data processing in the band division step, the encoding bit allocation step, the quantization step, and the encoding step. By running the encoding program on a general-purpose computer such as a personal computer, setting the control constant reduces the processing load, allows real-time encoding to be performed as the audio is captured, and yields an encoding result whose sound quality matches the performance of the encoding device.

[0625] According to the audio encoding program recording medium of claim 44, a recording medium on which an audio encoding program for encoding audio using a band division encoding scheme is recorded holds an encoding program that executes: a sampling step of sampling the input audio and outputting sampling data; a band division step of dividing the sampling data obtained in the sampling step into bands and outputting band signal data; an encoding bit allocation step of allocating encoding bits to the band signal data obtained in the band division step; a bit allocation control step of controlling the allocation performed in the encoding bit allocation step by a psychoacoustic-analysis alternative control scheme; a quantization step of quantizing the band signal data in accordance with the encoding bit allocation and outputting quantized values; and an encoding step of outputting encoded data based on the quantized values obtained in the quantization step. By running the encoding program on a general-purpose computer such as a personal computer, bit allocation control that simplifies psychoacoustic analysis is executed, and encoding that yields a high-quality result suited to the characteristics of human hearing can be performed without greatly increasing the processing load.

[0626] According to the video/audio encoding program recording medium of claim 45, a recording medium on which is recorded a video/audio encoding program in which, when video and audio are encoded, part or all of the processing steps included in the two encoding processes are executed using common computer resources holds an encoding program that executes: an audio buffering step of temporarily storing the original audio information when video/audio information is input, the video/audio information consisting of original video information, composed of a plurality of pieces of still image information each representing a still image per unit time, and original audio information representing audio; an audio encoding step of reading the original audio information stored in the audio buffering step, encoding it, and outputting encoded audio information; an encoding load evaluation step of judging the processing capability available for the video/audio encoding process using encoding load reference information that indicates the degree of the video encoding load and, based on the result of that judgment, controlling the encoding of the original video information in the video encoding step described below; and a video encoding step of encoding the input still image information constituting the original video information under the control of the encoding load evaluation step and outputting encoded video information. By running the encoding program on a general-purpose computer such as a personal computer, video encoding is controlled in accordance with the processing capability of the encoding device that performs the video/audio encoding, real-time encoding can be executed as the video and audio are captured, and a good encoding result free of audio dropouts during playback can be obtained.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing the configuration of a video encoding device according to Embodiment 1 of the present invention.

FIG. 2 is a flowchart showing the processing procedure of the encoding means of the video encoding device of the same embodiment.

FIG. 3 is a flowchart showing the processing procedure of the encoding parameter determining means of the video encoding device of the same embodiment.

FIG. 4 is a block diagram showing the configuration of a video encoding device according to Embodiment 2 of the present invention.

FIG. 5 is a flowchart showing the processing procedure of the encoding parameter determining means of the video encoding device of the same embodiment.

FIG. 6 is a block diagram showing the configuration of a video encoding device according to Embodiment 3 of the present invention.

FIG. 7 is a state transition diagram showing the state transitions in the encoding pattern determining means of the video encoding device of the same embodiment.

FIG. 8 is a block diagram showing the configuration of a video encoding device according to Embodiment 4 of the present invention.

FIG. 9 is a state transition diagram showing the state transitions in the encoding pattern determining means of the video encoding device of the same embodiment.

FIG. 10 is a block diagram showing the configuration of an audio encoding device according to Embodiment 5 of the present invention.

FIG. 11 is a diagram showing the hardware configuration of the device of the same embodiment.

FIG. 12 is a flowchart showing the audio encoding procedure performed by the device of the same embodiment.

FIG. 13 is a diagram for explaining the sampling process and the audio data conversion process performed by the device of the same embodiment.

FIG. 14 is a block diagram showing the configuration of an audio encoding device according to Embodiment 6 of the present invention.

FIG. 15 is a flowchart showing the audio encoding procedure performed by the device of the same embodiment.

FIG. 16 is a block diagram showing the configuration of an audio encoding device according to Embodiment 7 of the present invention.

FIG. 17 is a flowchart showing the audio encoding procedure performed by the device of the same embodiment.

FIG. 18 is a block diagram showing the configuration of an audio encoding device according to Embodiment 8 of the present invention.

FIG. 19 is a flowchart showing the audio encoding procedure performed by the device of the same embodiment.

FIG. 20 is a block diagram showing the configuration of an audio encoding device according to Embodiment 9 of the present invention.

FIG. 21 is a flowchart showing the audio encoding procedure performed by the device of the same embodiment.

FIG. 22 is a conceptual diagram for explaining audio encoding according to the same embodiment.

FIG. 23 is a block diagram showing the configuration of an audio encoding device according to Embodiment 10 of the present invention.

FIG. 24 is a flowchart showing the audio encoding procedure performed by the device of the same embodiment.

FIG. 25 is a diagram showing the coefficients Ci in the arithmetic expression of the basic low-pass filter used for band division in the same embodiment.

FIG. 26 is a block diagram showing the configuration of an audio encoding device according to Embodiment 11 of the present invention.

FIG. 27 is a flowchart showing the audio encoding procedure performed by the device of the same embodiment.

【図２８】同実施の形態による音声符号化における帯域
の間引きを説明するための概念図である。FIG. 28 is a conceptual diagram for explaining band thinning in speech encoding according to the embodiment.

【図２９】本発明の実施の形態１２による音声符号化装
置の構成を示すブロック図である。FIG. 29 is a block diagram showing a configuration of a speech coding apparatus according to Embodiment 12 of the present invention.

【図３０】同実施の形態の装置による音声符号化の処理
手順を示すフローチャート図である。FIG. 30 is a flowchart showing a procedure of speech encoding performed by the apparatus of the embodiment.

【図３１】本発明の実施の形態１３による音声符号化装
置の構成を示すブロック図である。FIG. 31 is a block diagram illustrating a configuration of a speech coding apparatus according to Embodiment 13 of the present invention.

【図３２】同実施の形態の装置による音声符号化の処理
手順を示すフローチャート図である。FIG. 32 is a flowchart showing a procedure of speech encoding performed by the apparatus of the embodiment.

【図３３】本発明の実施の形態１４による音声符号化装
置の構成を示すブロック図である。FIG. 33 is a block diagram illustrating a configuration of a speech coding apparatus according to Embodiment 14 of the present invention.

【図３４】同実施の形態の装置による音声符号化の処理
手順を示すフローチャート図である。FIG. 34 is a flowchart showing a processing procedure of speech encoding performed by the apparatus of the embodiment.

【図３５】同実施の形態の装置による順次ビット割り当
ての処理手順を示すフローチャート図である。FIG. 35 is a flowchart showing a processing procedure of sequential bit allocation by the device of the embodiment.

【図３６】同実施の形態による順次ビット割り当てにお
ける、各帯域へのビット割り当ての処理手順を示すフロ
ーチャート図である。FIG. 36 is a flowchart showing a processing procedure of bit allocation to each band in the sequential bit allocation according to the embodiment.

【図３７】本発明の実施の形態１５による音声符号化装
置の構成を示すブロック図である。FIG. 37 is a block diagram illustrating a configuration of a speech encoding device according to Embodiment 15 of the present invention.

【図３８】同実施の形態の装置による音声符号化の処理
手順を示すフローチャート図である。FIG. 38 is a flowchart showing a procedure of speech encoding performed by the apparatus of the embodiment.

【図３９】本発明の実施の形態１６による音声符号化装
置の構成を示すブロック図である。FIG. 39 is a block diagram showing a configuration of a speech coding apparatus according to Embodiment 16 of the present invention.

【図４０】同実施の形態の装置による音声符号化の処理
手順を示すフローチャート図である。FIG. 40 is a flowchart showing a procedure of speech encoding performed by the apparatus of the embodiment.

【図４１】同実施の形態の装置による改良型帯域出力適
応ビット割り当ての処理手順を示すフローチャート図で
ある。FIG. 41 is a flowchart showing a processing procedure of improved band output adaptive bit allocation by the device of the embodiment.

【図４２】本発明の実施の形態１７による音声符号化装
置の構成を示すブロック図である。FIG. 42 is a block diagram illustrating a configuration of a speech coding apparatus according to Embodiment 17 of the present invention.

【図４３】同実施の形態の装置による音声符号化の処理
手順を示すフローチャート図である。FIG. 43 is a flowchart showing a procedure of speech encoding performed by the apparatus of the embodiment.

【図４４】本発明の実施の形態１８による映像音声符号
化装置の概略構成を示す図である。FIG. 44 is a diagram illustrating a schematic configuration of a video and audio encoding device according to Embodiment 18 of the present invention.

【図４５】同実施の形態の映像音声符号化装置の動作を
図解的に表した図である。FIG. 45 is a diagram schematically illustrating an operation of the video and audio encoding device of the embodiment.

【図４６】同実施の形態の映像音声符号化装置のより長
期の時間にわたる動作を説明するための図である。FIG. 46 is a diagram for explaining an operation over a longer period of time of the video and audio encoding device of the embodiment.

【図４７】本発明の実施の形態１９による映像音声符号
化装置の概略構成を示す図である。FIG. 47 is a diagram showing a schematic configuration of a video / audio encoding device according to Embodiment 19 of the present invention.

【図４８】本発明の実施の形態２０による映像音声符号
化装置の概略構成を示す図である。FIG. 48 is a diagram showing a schematic configuration of a video / audio encoding device according to Embodiment 20 of the present invention.

【図４９】本発明の実施の形態２１による映像音声符号
化装置の概略構成を示す図である。FIG. 49 is a diagram illustrating a schematic configuration of a video / audio encoding apparatus according to Embodiment 21 of the present invention.

【図５０】本発明の実施の形態２２による映像音声符号
化装置の概略構成を示す図である。FIG. 50 is a diagram showing a schematic configuration of a video / audio encoding apparatus according to Embodiment 22 of the present invention.

【図５１】実施の形態１８ないし実施の形態２２による
映像音声符号化における現象を説明するための図であ
る。FIG. 51 is a diagram for describing a phenomenon in video / audio coding according to Embodiments 18 to 22.

【図５２】本発明の実施の形態２３による映像音声符号
化装置の概略構成を示す図である。FIG. 52 is a diagram illustrating a schematic configuration of a video / audio encoding device according to Embodiment 23 of the present invention.

【図５３】実施の形態２３による映像音声符号化を説明
するための図である。FIG. 53 is a diagram for describing video / audio coding according to Embodiment 23.

【図５４】本発明の実施の形態２４による映像音声符号
化装置の概略構成を示す図である。FIG. 54 is a diagram illustrating a schematic configuration of a video / audio encoding device according to Embodiment 24 of the present invention.

【図５５】本発明の実施の形態２５による映像符号化装
置の構成を示すブロック図である。FIG. 55 is a block diagram illustrating a configuration of a video encoding device according to Embodiment 25 of the present invention.

【図５６】同実施の形態の映像符号化装置の符号化手段
における処理手順を示すフローチャート図である。FIG. 56 is a flowchart showing a processing procedure in an encoding unit of the video encoding device of the embodiment.

【図５７】同実施の形態の映像符号化装置の符号化パラ
メータ決定手段における処理手順を示すフローチャート
図である。FIG. 57 is a flowchart showing a processing procedure in an encoding parameter determination unit of the video encoding device of the embodiment.

【図５８】従来の技術による専用のハードウェアで構成
されたリアルタイム処理を行う映像符号化装置の構成を
示すブロック図である。FIG. 58 is a block diagram illustrating a configuration of a video encoding device that performs real-time processing and is configured by dedicated hardware according to the related art.

【図５９】従来技術の第１例による音声符号化装置の構
成を示すブロック図である。FIG. 59 is a block diagram illustrating a configuration of a speech encoding device according to a first example of the related art.

【図６０】同例における音声符号化の処理手順を示すフ
ローチャート図である。FIG. 60 is a flowchart showing a processing procedure of speech encoding in the same example.

【図６１】音声符号化におけるサンプリング処理を説明
するための図である。FIG. 61 is a diagram for describing a sampling process in audio encoding.

【図６２】音声符号化における帯域分割を説明するため
の概念図である。FIG. 62 is a conceptual diagram for explaining band division in audio coding.

【図６３】帯域分割された帯域信号を示す図である。FIG. 63 is a diagram showing band signals obtained by band division.

【図６４】従来技術の第２例によるＭＰＥＧ１Ａｕｄｉ
ｏ音声符号化装置の構成を示すブロック図である。FIG. 64 is a block diagram illustrating a configuration of an MPEG1 Audio speech encoding device according to a second example of the prior art.

【図６５】同例における音声符号化処理を示すフローチ
ャート図である。FIG. 65 is a flowchart showing a speech encoding process in the same example.

【図６６】従来技術による心理聴覚分析に応用される、
人間の聴覚における最小可聴限界を示す図である。FIG. 66 is a diagram showing the minimum audible limit of human hearing, as applied to psychoacoustic analysis in the prior art.

【図６７】従来の技術による映像音声符号化装置の概略
構成を示す図である。FIG. 67 is a diagram showing a schematic configuration of a video / audio coding apparatus according to a conventional technique.

[Explanation of symbols]

101, 201, 301, 401, 5001: Encoding means
102, 202, 302, 402, 5002: Encoding parameter determination means
103, 203, 303, 403, 5003: DCT processing means
104, 204, 304, 404, 5004: Quantization means
105, 205, 305, 405, 5005: Variable length coding means
106, 206, 306, 406, 5006: Bit stream generation means
107, 207, 307, 407, 5007: Inverse quantization
108, 208, 308, 408, 5008: Inverse DCT processing means
109, 209, 309, 409, 5009: Predicted image generation means
110: Resolution reference table
210: Encoding pattern reference table
310, 410: Encoding pattern determination means
311, 411: Processing capability determination means
412: Buffer means
413: Input frame rate control means
501, 601, 701, 801, 901, 1001, 1101, 1201, 1301, 1401, 1501, 1601, 1701, 2551, 2651: Audio input unit
502, 602, 702, 802, 902, 1011, 1112, 1202, 1302: Register
810, 1210, 1310: Fixed encoding register
503, 603, 703, 803, 903, 1003, 1103, 1203, 1303, 1403, 1503, 1603, 1703, 2553, 2653: Input audio sampling unit
504, 604, 704, 804: Audio data conversion unit
1118: Band thinning unit
505, 605, 705, 805, 905, 1005, 1105, 1205, 1305, 1405, 1505, 1605, 1705, 2555, 2655: Band division unit
506, 606, 706, 806, 906, 1006, 1106, 1206, 1306, 1406, 1506, 1606, 1706, 2556, 2656: Coded bit allocation unit
507, 607, 707, 807, 907, 1007, 1107, 1207, 1307, 1407, 1507, 1607, 1707, 2557, 2657: Quantization unit
508, 608, 708, 808, 908, 1008, 1108, 1208, 1308, 1408, 1508, 1608, 1708, 2558, 2658: Encoding unit
509, 609, 709, 809, 909, 1009, 1109, 1209, 1309, 1409, 1509, 1609, 1709, 2559, 2659: Encoded data recording unit
7010, 1213: Input buffer
7011, 1214: Input buffer monitoring unit
8012, 1315: Encoded data monitoring unit
1718: Minimum audible limit comparison unit
2660: FFT (fast Fourier transform) unit
2661: Psychoacoustic analysis unit
1801, 1901, 2001, 2101, 2201, 2301, 2401, 2701: Video camera
1802, 1902, 2002, 2102, 2202, 2302, 2402, 2702: Audio capture unit
1803, 1903, 2003, 2103, 2203, 2303, 2403: Audio buffering unit
1805, 1905, 2005, 2142, 2205, 2305, 2405, 2703: Audio encoding unit
1806, 1906, 2006, 2106, 2206, 2306, 2406, 2704: Video capture unit
1807, 1923, 2035, 2107, 2207, 2307, 2407, 2705: Video encoding unit
1808, 1921, 2032, 2144, 2253, 2308, 2408: Encoding load evaluation unit
1924: Inter-frame prediction processing unit
1925: Frame encoding unit
2037: Resolution correction information adding unit
2141, 2251, 2361, 2461: System timer
2411: Encoding load presenting unit
2412: Load setting standard video / audio output unit

Continued from the front page
(31) Priority claim number: Japanese Patent Application No. Hei 9-42051
(32) Priority date: February 26, 1997 (Heisei 9)
(33) Priority claim country: Japan (JP)
(72) Inventor: Hidenori Tatsumi, c/o Matsushita Electric Information Systems Hiroshima Research Laboratory, 1-12-20 Hikaricho, Higashi-ku, Hiroshima City, Hiroshima Prefecture
(72) Inventor: Eiji Kawahara, c/o Matsushita Electric Information Systems Hiroshima Research Laboratory, 1-12-20 Hikaricho, Higashi-ku, Hiroshima City, Hiroshima Prefecture
(72) Inventor: Yoshitaka Arase, c/o Matsushita Electric Information Systems Hiroshima Research Laboratory, 1-12-20 Hikaricho, Higashi-ku, Hiroshima City, Hiroshima Prefecture

Claims

[Claims]

1. A video encoding method for encoding video, comprising: a video encoding step of encoding, for original video information that consists of a plurality of pieces of still image information obtained by digitizing the video, one or more pieces of the still image information according to encoding parameters described below; and an encoding parameter determining step of determining one or more of the encoding parameters based on at least one of: the resolution of the original video information, the frame rate required when reproducing the encoded data obtained by the encoding, the processing performance indicating the processing capability of the encoding device that executes the video encoding step, and one or more encoding parameters that affect the processing amount of the encoding process in the video encoding step.

2. The video encoding method according to claim 1, further comprising a processing capability determining step of determining the processing capability of the encoding device that executes the video encoding step and outputting the determination result.

3. The video encoding method according to claim 1, wherein the encoding parameters include at least one of: the resolution used in the encoding process performed on the original video information, an encoding type indicating intra-frame encoding or predictive encoding, and a detection range for detecting motion vectors used for the predictive encoding.

4. The video encoding method according to claim 2, wherein, in the processing capability determining step, the determination is made based on the type of the control device that carries out the video encoding method.

5. The video encoding method according to claim 2, wherein, in the processing capability determining step, the determination is made based on the time required for the encoding process in the encoding step.

6. The video encoding method according to claim 2, wherein the processing capability determining step comprises: a video buffering step of temporarily storing the input original video information, in which the series of still image information constituting the original video information is stored sequentially and still image information that has been read out and encoded in the encoding step is discarded sequentially; and a frame rate control step of controlling the storage of the still image information in the video buffering step so that it is performed at a constant frame rate determined based on the given frame rate; and wherein the determination is made based on the amount of original video information temporarily stored in the video buffering step.

7. A speech encoding method for encoding speech by a band division encoding method, comprising: a storage step of storing a set frequency fs and a conversion constant n, which are numerical values used for the encoding process; an audio input step of inputting the audio to be encoded; an input audio sampling step of creating sampled audio data using a sampling frequency determined based on the stored set frequency fs; an audio data conversion step of outputting converted audio data consisting of m pieces of audio data that include m′ pieces of sampled audio data, where m is the number of sampled audio data obtained when the set frequency fs is used as the sampling frequency and m′ is a number determined based on the conversion constant n; a band division step of dividing the converted audio data into bands to obtain band signals; a coded bit allocation step of allocating coding bits only to those band signals at or below a limit frequency, the limit frequency being the frequency fs/2n obtained from the stored set frequency fs and the conversion constant n; a quantization step of performing quantization based on the allocated coding bits; an encoding step of outputting the quantized data as encoded data; and a coded data recording step of recording the output encoded data.

8. The speech encoding method according to claim 7, wherein, in the input audio sampling step, m pieces of sampled audio data are created by sampling the input audio using the stored set frequency fs as the sampling frequency, and, in the audio data conversion step, sampled audio data are extracted from the m pieces of sampled audio data at intervals of (n-1) samples, and (n-1) pieces of audio data are inserted between each two adjacent extracted sampled audio data to produce m pieces of converted audio data.

9. The speech encoding method according to claim 8, wherein, in the audio data conversion step, converted audio data is created in which each of the extracted sampled audio data is repeated n times in succession.
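The audio data conversion recited in claims 8 and 9 can be illustrated with a minimal sketch. The function name `convert_audio_data` and the use of a plain Python list for the sampled audio data are assumptions made for illustration only; the claims do not prescribe an implementation. The sketch keeps one sample, skips the next (n-1), and repeats each kept sample n times so that the output again contains m values.

```python
def convert_audio_data(samples, n):
    """Thin m samples to every n-th one, then repeat each kept sample n times.

    Hypothetical helper illustrating claims 8 and 9; `samples` is a list of
    m sampled audio values and `n` is the conversion constant.
    """
    if n < 1:
        raise ValueError("conversion constant n must be at least 1")
    m = len(samples)
    # Keep one sample, skip the following (n - 1) samples.
    extracted = samples[::n]
    converted = []
    for value in extracted:
        # Fill the gap with (n - 1) copies so each value appears n times.
        converted.extend([value] * n)
    # Trim in case m is not a multiple of n, so the output has m values.
    return converted[:m]
```

Because the converted data carries no content above fs/2n, the coded bit allocation of claim 7 can be restricted to band signals at or below that limit frequency.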

10. The speech encoding method according to claim 7, wherein, in the input audio sampling step, m/n pieces of sampled audio data are created by sampling the input audio using, as the sampling frequency, the frequency fs/n obtained from the stored set frequency fs and the conversion constant n, and, in the audio data conversion step, (n-1) pieces of audio data are inserted between each two adjacent sampled audio data to produce m pieces of converted audio data.

11. The speech encoding method according to claim 10, wherein, in the audio data conversion step, converted audio data is created in which each of the m/n pieces of sampled audio data is repeated n times in succession.

12. The speech encoding method according to claim 7, further comprising: an audio buffering step of temporarily holding the sampled audio data in an input buffer; and an input buffer monitoring step of checking the amount of data in the input buffer, comparing it with a preset value, and changing the value of the conversion constant n stored in the register based on the result of the comparison; wherein the input audio sampling step writes the sampled audio data into the input buffer, and the audio data conversion step reads the sampled audio data from the input buffer and converts it.

13. The speech encoding method according to claim 7, further comprising a coded data monitoring step of checking the amount of encoded data output per unit time in the encoding step, comparing it with a preset value, and changing the value of the conversion constant n stored in the register based on the result of the comparison.

14. A speech encoding method for encoding speech using a band division encoding method, comprising: a control constant storing step of storing a control constant used for the encoding; a sampling step of sampling the input speech and outputting sampling data; a band division step of performing band division on the sampling data obtained in the sampling step and outputting band signal data; a coded bit allocation step of allocating coding bits to the band signal data obtained in the band division step; a quantization step of quantizing the band signal data according to the coded bit allocation and outputting quantized values; an encoding step of outputting encoded data based on the quantized values obtained in the quantization step; and an encoding process control step of controlling, based on the stored control constant, the data processing in the band division step, the coded bit allocation step, the quantization step, and the encoding step.

15. The speech encoding method according to claim 14, wherein, in the control constant storing step, a unit period determination constant k is stored in a unit period determination constant register as the control constant, and the encoding process control step is a determination control step in which, taking p as the number of sampling data processed in one band division process in the band division step and defining the time corresponding to p pieces of sampling data as a unit period, it is determined for each unit period, based on the stored unit period determination constant, whether that unit period is an encoding target period or a non-encoding-target period; only when the unit period is determined to be an encoding target period is the sampling data of that unit period output to the band division step; and when the unit period is determined to be a non-encoding-target period, the encoding step is controlled to output fixed encoded data stored in advance as the encoded data.

16. The speech encoding method according to claim 15, wherein, in the determination control step, the i-th unit period is denoted ti, and the unit period ti is determined to be the encoding target period when i = n × k + 1 holds for the stored unit period determination constant k and an arbitrary integer n.
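The determination in claims 15 and 16 can be sketched as follows. The names `is_encoding_target` and `encode_periods`, the 1-based numbering of unit periods, and the restriction of n to non-negative integers are assumptions for illustration; the `encode` callable stands in for the band division, bit allocation, quantization, and encoding steps.

```python
def is_encoding_target(i, k):
    """Return True when i = n * k + 1 for some integer n >= 0.

    Hypothetical helper: `i` is the 1-based index of the unit period and
    `k` is the unit period determination constant.
    """
    if i < 1 or k < 1:
        raise ValueError("i and k must be positive integers")
    return (i - 1) % k == 0


def encode_periods(periods, k, encode, fixed_data):
    """Encode only target periods; emit stored fixed data for the others."""
    output = []
    for i, period in enumerate(periods, start=1):
        if is_encoding_target(i, k):
            output.append(encode(period))
        else:
            # Non-target period: skip the processing and reuse fixed data.
            output.append(fixed_data)
    return output
```

With k = 3, for example, periods 1, 4, 7, and so on are encoded, so roughly one third of the band division work is performed.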

17. The speech encoding method according to claim 14, wherein, in the control constant storing step, an arithmetic processing determination constant q is stored in an arithmetic processing determination constant register as the control constant, and the encoding process control step is an arithmetic processing stop step that is included in the band division step and that, based on the stored arithmetic processing determination constant q, controls the arithmetic processing in the band division step to be terminated partway.

18. The speech encoding method according to claim 17, wherein, in the arithmetic processing stop step, control is performed so that the arithmetic processing of the basic low-pass filter in the band division step is cut off for the steps at both ends of the filter.

19. The speech encoding method according to claim 14, wherein, in the control constant storing step, a band selection constant r is stored in a band selection constant register as the control constant, and the encoding process control step is a band thinning step that controls the coded bit allocation step and the quantization step to be executed only on band signal data selected, based on the stored band selection constant r, from the band signal data output in the band division step.

20. The speech encoding method according to claim 19, wherein, in the band thinning step, band signal data is selected from the M band signal data outputs obtained in the band division step at intervals of r, r being the stored band selection constant.

21. The speech encoding method according to claim 14, further comprising a processing status monitoring step of acquiring the status of data processing in the speech encoding and changing the value of the stored control constant according to the acquired status.

22. The speech encoding method according to claim 21, wherein the processing status monitoring step comprises: an audio buffering step of temporarily storing the sampling data in an input buffer; and an input monitoring step of comparing the amount of data held in the input buffer with a preset value and changing the control constant based on the result of the comparison.

23. The speech encoding method according to claim 21, wherein the processing status monitoring step is an encoding monitoring step of comparing the amount of encoded data output per unit time in the encoding step with a preset value and changing the value of the control constant based on the result of the comparison.

24. A speech encoding method for encoding, by a band division encoding method, original speech information obtained by digitizing speech, comprising: a sampling step of sampling the input speech and outputting sampling data; a band division step of performing band division on the sampling data obtained in the sampling step and outputting band signal data; a coded bit allocation step of allocating coding bits to the band signal data obtained in the band division step; a bit allocation control step of controlling the allocation in the coded bit allocation step by a psychoacoustic analysis alternative control method; a quantization step of quantizing the band signal data according to the coded bit allocation and outputting quantized values; and an encoding step of outputting encoded data based on the quantized values obtained in the quantization step.

25. The speech encoding method according to claim 24, wherein the bit allocation control step is a sequential bit allocation step of controlling the coded bit allocation for the band signal data obtained in the band division step to follow a bit allocation order determined in advance by the psychoacoustic analysis alternative control method.

26. The speech encoding method according to claim 24, wherein the bit allocation control step is a band output adaptive bit allocation step of controlling the coded bit allocation for the band signal data obtained in the band division step to be based on a weighting for each band, determined in advance by the psychoacoustic analysis alternative control method, and on the output level of each band signal data.

27. The speech encoding method according to claim 24, wherein the bit allocation control step is an improved band output adaptive bit allocation step of controlling the coded bit allocation for the band signal data obtained in the band division step to be based on a weighting for each band and a weighting for the number of bits allocated to each band, both determined in advance by the psychoacoustic analysis alternative control method, and on the output level of each band signal data.

28. The speech encoding method according to claim 24, wherein the bit allocation control step is a minimum audible limit comparison step of comparing each band signal data obtained in the band division step with the minimum audible limit value for that band, performing no bit allocation for band signal data determined by the comparison to be below the minimum audible limit, and increasing the bit allocation for the other bands.
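The comparison in claim 28 can be sketched as below. The claim states only that inaudible bands receive no bits and that the other bands receive more; the equal split of the bit budget among audible bands, the function name `allocate_bits`, and the list-based representation of band levels are all assumptions made for illustration.

```python
def allocate_bits(levels, min_audible, total_bits):
    """Give no bits to bands below the minimum audible limit.

    Hypothetical helper: `levels` and `min_audible` are per-band values in
    the same unit, and `total_bits` is the bit budget for the frame. The
    equal split among audible bands is an assumed policy, not the patent's.
    """
    audible = [
        band
        for band, (level, limit) in enumerate(zip(levels, min_audible))
        if level >= limit
    ]
    allocation = [0] * len(levels)
    if not audible:
        return allocation
    share, extra = divmod(total_bits, len(audible))
    for rank, band in enumerate(audible):
        # The first `extra` audible bands absorb the remainder bits.
        allocation[band] = share + (1 if rank < extra else 0)
    return allocation
```

Bits that would have gone to inaudible bands are thereby redistributed, which is the effect the claim describes as increasing the allocation for the other bands.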

29. A video/audio encoding method in which, in encoding video and audio, some or all of the processing steps included in the two encoding processes are executed using common computer resources, the method comprising, when video/audio information is input that consists of original video information, made up of a plurality of pieces of still image information each representing a still image per unit time, and original audio information representing audio: an audio buffering step of temporarily storing the original audio information; an audio encoding step of reading the original audio information stored in the audio buffering step, encoding the read original audio information, and outputting encoded audio information; an encoding load evaluation step of determining, using encoding load reference information that indicates the degree of the video encoding load, the processing capability for the encoding process, and controlling, based on the result of the determination, the encoding of the original video information in the video encoding step described below; and a video encoding step of encoding the still image information constituting the input original video information according to the control in the encoding load evaluation step, and outputting encoded video information.

30. The video/audio encoding method according to claim 29, wherein, in the encoding load evaluation step, when still image information constituting the original video information is input, encoding load evaluation information is obtained based on the total amount of original audio information accumulated in the audio buffering step and the encoding load reference information; the encoding load evaluation information is compared with a preset load limit; the still image information is output when the encoding load evaluation information has not reached the load limit; and the still image information is discarded when the encoding load evaluation information has reached the load limit.

31. The video/audio encoding method according to claim 29, wherein a video capture step is executed in which analog video information is input and, when the video resolution information described below is output, the analog video information is converted into original video information that consists of a plurality of pieces of discrete digital pixel information and of a plurality of pieces of still image information having a resolution according to the video resolution information, and the original video information is output for processing in the video encoding step; in the encoding load evaluation step, encoding load evaluation information is obtained based on the total amount of original audio information accumulated in the audio buffering step and the encoding load reference information indicating the degree of the video encoding load, video resolution information representing the resolution of the video used for video encoding is obtained based on the encoding load evaluation information, and the video resolution information is output; and, in the video encoding step, when the video resolution information is output, the still image information is encoded according to the video resolution information and output as encoded video information.

32. The video/audio encoding method according to claim 29, wherein, in the encoding load evaluation step, the encoding load evaluation information is output to the video encoding step, and, in the video encoding step, the still image information is encoded with a processing amount calculated using the output encoding load evaluation information and is output as encoded video information.

33. The video/audio encoding method according to claim 29, wherein, in the audio encoding step, the original audio information accumulated in the audio buffering step is read, the total amount of the read original audio information is calculated and output as a processed audio information amount, and the original audio information is thereafter encoded and output as encoded audio information; and, in the encoding load evaluation step, an original audio input amount is obtained based on the input amount per unit time of the original audio information, a predicted audio buffer amount, which is the difference between the original audio input amount and the processed audio information amount, is obtained, and the encoding load evaluation information is obtained using the predicted audio buffer amount.

34. The video/audio encoding method according to claim 29, wherein, in the encoding load evaluation step, when the still image information is input, an original audio input amount is obtained based on an elapsed time and the input amount per unit time of the original audio information; a processed audio information amount is obtained based on the total amount of the encoded audio information output in the audio encoding step; a predicted audio buffer amount, which is the difference between the obtained original audio input amount and the obtained processed audio information amount, is obtained; and the encoding load evaluation information is thereafter obtained using the predicted audio buffer amount.
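
The arithmetic behind the predicted audio buffer amount of claims 33 and 34 can be written out as a short Python sketch. The function name, the use of bytes and seconds as units, and the sample figures are illustrative assumptions, not values taken from the specification.

```python
def predicted_audio_buffer(
    elapsed_seconds: int,
    input_bytes_per_second: int,
    processed_bytes: int,
) -> int:
    """Estimate the audio backlog without inspecting the buffer itself.

    original audio input amount = elapsed time x input amount per unit time
    predicted buffer amount     = original audio input amount - processed amount
    """
    original_audio_input = elapsed_seconds * input_bytes_per_second
    return original_audio_input - processed_bytes


# Hypothetical figures: 2 s of audio arriving at 88,200 bytes/s,
# of which 150,000 bytes have already been handled by the audio encoder.
backlog = predicted_audio_buffer(2, 88_200, 150_000)
print(backlog)  # 26400
```

The predicted amount then stands in for the measured buffer total when the encoding load evaluation information is computed.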

35. The video/audio encoding method according to claim 29, wherein a change in the result of the determination in the encoding load evaluation step is monitored, and the encoding load reference information is set in response to the change.

36. A video encoding apparatus for encoding video, comprising: video encoding means for encoding one or more items of still image information, from original video information consisting of a plurality of still image information items in which the video is digitized, according to encoding parameters described later; and encoding parameter determining means for determining, based on a given frame rate, the encoding parameters that determine the processing amount of the encoding means, with one or more resolutions as one encoding parameter and with one or more encoding types, among encoding types including intra-frame encoding, forward predictive encoding, backward predictive encoding, and bidirectional predictive encoding, as another encoding parameter.
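
One way to picture the encoding parameter determining means of claim 36 is a table lookup keyed on frame rate and encoding pattern, in the spirit of the resolution reference table mentioned in the abstract. The following Python sketch is only an illustration: the table layout, the pattern labels, and every numeric entry are hypothetical placeholders, not values from the specification.

```python
from typing import Optional, Tuple

# Hypothetical reference table: (width, height, encoding pattern, highest
# frame rate the machine is assumed to sustain for that combination).
REFERENCE_TABLE = [
    (320, 240, "I", 30),
    (320, 240, "IP", 20),
    (160, 120, "I", 60),
    (160, 120, "IP", 40),
]


def choose_resolution(frame_rate: int, pattern: str) -> Optional[Tuple[int, int]]:
    """Pick the largest resolution that still meets the requested frame rate."""
    candidates = [
        (width, height)
        for width, height, table_pattern, max_fps in REFERENCE_TABLE
        if table_pattern == pattern and max_fps >= frame_rate
    ]
    if not candidates:
        return None  # no entry satisfies the request
    return max(candidates, key=lambda wh: wh[0] * wh[1])
```

With these placeholder entries, a request for 30 frames per second using intra-frame encoding alone selects 320x240, while the same frame rate with the "IP" pattern falls back to 160x120.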

37. An audio encoding apparatus for encoding audio by a band-division encoding scheme, comprising: a register for storing a set frequency fs and a conversion constant n, which are numerical values used in the encoding process; audio input means for inputting audio to be encoded; input audio sampling means for creating sampled audio data using a sampling frequency determined based on the stored set frequency fs; audio data conversion means for outputting converted audio data consisting of m audio data items that include m′ sampled audio data items, where m is the number of sampled audio data items obtained when the set frequency fs is used as the sampling frequency and m′ is a number determined based on the conversion constant n; band dividing means for dividing the converted audio data to obtain M band signals; encoded bit allocation means for allocating encoded bits, with a frequency fs/2n obtained from the stored set frequency fs and the conversion constant n as a limit frequency, only to those band signals whose frequency is equal to or lower than the limit frequency; quantization means for performing quantization based on the allocated encoded bits; encoding means for outputting the quantized data as encoded data; and encoded data recording means for recording the output encoded data.
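
The limit-frequency rule of claim 37 can be sketched in Python as follows. The sketch assumes, for illustration only, that the M band signals split the range 0 to fs/2 into equal-width bands and that M = 32; the claim itself fixes neither. The function names are likewise hypothetical.

```python
from typing import List


def limit_frequency(fs_hz: float, n: int) -> float:
    """Limit frequency fs/2n from the set frequency fs and conversion constant n."""
    return fs_hz / (2 * n)


def allocatable_bands(num_bands: int, n: int) -> List[int]:
    """Indices of the bands that may receive encoded bits.

    Assumes band k spans [k*fs/(2M), (k+1)*fs/(2M)). Its upper edge is at
    or below fs/(2n) exactly when (k + 1) * n <= M, so the test can be done
    in integers without floating-point rounding.
    """
    return [k for k in range(num_bands) if (k + 1) * n <= num_bands]


# Example under the stated assumptions: fs = 44.1 kHz, n = 2, M = 32.
print(limit_frequency(44_100, 2))      # 11025.0
print(len(allocatable_bands(32, 2)))   # 16
```

Bands above the limit receive no bits, so the bit allocation, quantization, and encoding work scales down roughly in proportion to 1/n.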

38. An audio encoding apparatus for encoding audio using a band-division encoding scheme, comprising: control constant storage means for storing control constants used for the encoding; sampling means for sampling input audio and outputting sampling data; band dividing means for performing band division on the sampling data obtained by the sampling means and outputting band signal data; encoded bit allocation means for allocating encoded bits to the band signal data obtained by the band dividing means; quantization means for quantizing the band signal data in accordance with the encoded bit allocation and outputting quantized values; encoding means for outputting encoded data based on the quantized values obtained by the quantization means; and encoding processing control means for controlling, based on the stored control constants, data processing in the band dividing means, the encoded bit allocation means, the quantization means, and the encoding means.

39. An audio encoding apparatus for encoding audio using a band-division encoding scheme, comprising: sampling means for sampling input audio and outputting sampling data; band dividing means for performing band division on the sampling data obtained by the sampling means and outputting band signal data; encoded bit allocation means for allocating encoded bits to the band signal data obtained by the band dividing means; bit allocation control means for controlling the allocation in the encoded bit allocation means by a control scheme that serves as an alternative to psychoacoustic analysis; quantization means for quantizing the band signal data in accordance with the encoded bit allocation and outputting quantized values; and encoding means for outputting encoded data based on the quantized values obtained by the quantization means.

40. A video/audio encoding apparatus that, in encoding video and audio, executes part or all of the processing included in the two encoding processes using common computer resources, comprising: audio buffering means for temporarily storing the original audio information when video/audio information composed of original video information and original audio information representing audio is input; audio encoding means for reading the original audio information stored in the audio buffering means, encoding the read original audio information, and outputting encoded audio information; encoding load evaluation means for determining the processing capability available for the video/audio encoding using encoding load reference information indicating the degree of load of the video encoding, and for controlling, based on the result of the determination, the output of the original video information to the video encoding means described later; and video encoding means for encoding, under the control of the encoding load evaluation means, still image information constituting the original video information when the still image information is input, and outputting encoded video information.

41. A video encoding program recording medium on which a video encoding program for encoding video is recorded, the program executing: a video encoding step of encoding one or more items of still image information, from original video information consisting of a plurality of still image information items in which the video is digitized, according to encoding parameters described later; and an encoding parameter determining step of determining, based on a given frame rate, the encoding parameters that determine the processing amount of the encoding step, with one or more resolutions as one encoding parameter and with one or more encoding types, among encoding types including intra-frame encoding, forward predictive encoding, backward predictive encoding, and bidirectional predictive encoding, as another encoding parameter.

42. An audio encoding program recording medium on which an audio encoding program for encoding audio by band-division encoding is recorded, the program executing: a storage step of storing a set frequency fs and a conversion constant n, which are numerical values used in the encoding process; an audio input step of inputting audio to be encoded; an input audio sampling step of creating sampled audio data using a sampling frequency determined based on the stored set frequency fs; an audio data conversion step of outputting converted audio data consisting of m audio data items that include m′ sampled audio data items, where m is the number of sampled audio data items obtained when the set frequency fs is used as the sampling frequency and m′ is a number determined based on the conversion constant n, with m ≧ m′; a band dividing step of dividing the converted audio data to obtain M band signals; an encoded bit allocation step of allocating encoded bits, with a frequency fs/2n obtained from the stored set frequency fs and the conversion constant n as a limit frequency, only to those band signals whose frequency is equal to or lower than the limit frequency; a quantization step of performing quantization based on the allocated encoded bits; an encoding step of outputting the quantized data as encoded data; and an encoded data recording step of recording the output encoded data.

43. An audio encoding program recording medium on which an audio encoding program for encoding audio using band-division encoding is recorded, the program executing: a control constant storage step of storing control constants used for the encoding; a sampling step of sampling input audio and outputting sampling data; a band dividing step of performing band division on the sampling data obtained in the sampling step and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band dividing step; a quantization step of quantizing the band signal data in accordance with the encoded bit allocation and outputting quantized values; an encoding step of outputting encoded data based on the quantized values obtained in the quantization step; and an encoding processing control step of controlling, based on the stored control constants, data processing in the band dividing step, the encoded bit allocation step, the quantization step, and the encoding step.

44. An audio encoding program recording medium on which an audio encoding program for encoding audio using a band-division encoding scheme is recorded, the program executing: a sampling step of sampling input audio and outputting sampling data; a band dividing step of performing band division on the sampling data obtained in the sampling step and outputting band signal data; an encoded bit allocation step of allocating encoded bits to the band signal data obtained in the band dividing step; a bit allocation control step of controlling the allocation in the encoded bit allocation step by a control scheme that serves as an alternative to psychoacoustic analysis; a quantization step of quantizing the band signal data in accordance with the encoded bit allocation and outputting quantized values; and an encoding step of outputting encoded data based on the quantized values obtained in the quantization step.

45. A video/audio encoding program recording medium on which is recorded a video/audio encoding program that, in encoding video and audio, executes part or all of the processing included in the two encoding processes using common computer resources, the program executing: an audio buffering step of temporarily storing the original audio information when video/audio information composed of original video information consisting of still image information and original audio information representing audio is input; an audio encoding step of reading the original audio information stored in the audio buffering step, encoding the read original audio information, and outputting encoded audio information; an encoding load evaluation step of determining the processing capability of the video/audio encoding process using encoding load reference information indicating the degree of load of the video encoding, and of controlling, based on the result of the determination, the encoding of the original video information in the video encoding step described later; and a video encoding step of encoding, under the control of the encoding load evaluation step, the input still image information constituting the original video information, and outputting encoded video information.