WO2006046334A1

WO2006046334A1 - Video encoder, video decoder, video encoding method, and video decoding method

Info

Publication number: WO2006046334A1
Application number: PCT/JP2005/013406
Authority: WO
Inventors: Daijiroh Ichimura; Yoshimasa Honda
Original assignee: Matsushita Electric Industrial Co., Ltd.
Priority date: 2004-10-26
Filing date: 2005-07-21
Publication date: 2006-05-04
Also published as: JP2006128759A

Abstract

A video encoder (10) comprises a DCT section for creating transform coefficients representing frequency components by frequency-transforming a video, a variable length encoding section (32) for converting the transform coefficients into binary numbers, generating bit planes composed of the same-order bits of the transform coefficients from the most significant bit to the least significant bit, and encoding the bit planes sequentially from the high-order bit plane, and a state predicting section (34) for predicting the state of a bit of the transform coefficient to be encoded next on the basis of information on the previously encoded transform coefficients. The variable length encoding section (32) performs encoding according to the predicted bit state. Thus, high-efficiency encoding can be performed according to the bit state of the transform coefficient to be encoded.

Description

Specification

Video coding apparatus, video decoding apparatus, video coding method and video decoding method

Technical field

[0001] The present invention relates to a video coding apparatus and method for coding video to generate a video stream, and to a video decoding apparatus and method for decoding a video stream to generate decoded video.

Background art

[0002] Video has become inseparable from our daily lives. Delivered through transmission means such as the Internet, mobile phone networks, broadcast waves, and storage media, it has become an important means of enjoying visual information on a variety of display terminals such as personal computers, portable terminals, televisions, and high-definition televisions.

[0003] Video sent through such transmission means is compressed into a video stream with a smaller amount of data using video coding technology, so that the information is transmitted efficiently. Recently, video streaming has become increasingly popular, in which received coded video data is played back sequentially rather than after all the video information has been downloaded. In video coding technology such as the MPEG-4 AVC system described in ISO/IEC 14496-10, once the video has been coded, the code amount used for decoding is uniquely determined, and the quality of the reproduced video cannot be changed. When one video is to be provided over two different communication bands, either the video is encoded twice and transmitted according to each band, or the video is encoded with its image quality, resolution, and frame rate reduced to suit the narrower band.

[0004] Several scalable video coding schemes have been devised and standardized, which have a data structure consisting of several layers and can change the amount of stream transmission as needed even after coding. In the video hierarchical coding system, the image quality, resolution, frame rate, etc. can be selected after the video is coded.

[0005] As video becomes higher in definition due to advances in camera technology, the amount of information carried by the video increases, and so does the need to adjust the amount of transmission for each user. As a result, a scalable video coding scheme with good coding efficiency is required and has been studied.

[0006] MPEG-4 FGS (Fine Granularity Scalable coding) is one of the scalable video coding methods defined in ISO/IEC 14496-2 Amendment 2. In particular, it is standardized as a coding method that allows the picture quality of the stream to be selected finely.

[0007] A video stream encoded by MPEG-4 FGS is composed of a base layer stream and an enhancement layer stream. The base layer stream is a low-bandwidth, low-quality video stream that can be decoded on its own, and the enhancement layer stream is a video stream for improving the quality of the base layer stream. In MPEG-4 FGS, thanks to the hierarchically coded layer structure and to the coding process called bit plane coding used for the enhancement layer stream, the amount of code to be transmitted can be controlled frame by frame (one screen, one image), which allows very flexible adaptation to the transmission rate and the required image quality.

[0008] The following briefly describes the concept of bit plane coding used to generate an MPEG-4 FGS enhancement layer stream.

[0009] FIG. 14 is a diagram showing a video encoding device 130 of MPEG-4 FGS. The original video is input from the video signal input unit 132. The base layer coding unit 134 generates a base layer stream and the enhancement layer coding unit 140 generates an enhancement layer stream, which are output from the base layer output unit 136 and the enhancement layer output unit 138, respectively.

[0010] In the enhancement layer coding unit 140, the difference image, which is the difference between the original video and the base layer decoded video obtained by decoding the base layer stream, is subjected to DCT (Discrete Cosine Transform), scanning, and hierarchical coding for each block of 8 × 8 pixels. In MPEG-4 FGS, coding efficiency is improved by collectively coding the value "0", which appears frequently in the DCT coefficients of the difference image.

[0011] In the enhancement layer coding unit 140, the DCT unit 144 first applies the DCT to the difference image to generate DCT coefficients. Next, the scan unit 146 scans the DCT coefficients of each 8 × 8 pixel block and rearranges them. Statistically, DCT coefficients with large absolute values are concentrated at low horizontal and vertical frequencies, and coefficients at high horizontal and vertical frequencies rarely have large absolute values. By scanning the DCT coefficients in order from the low horizontal and vertical frequencies toward the high horizontal and vertical frequencies, the scan unit 146 causes more "0"s to appear in the second half of the rearranged DCT coefficients. FIG. 16A shows a scan of the DCT coefficients of 8 × 8 pixels. The signs representing plus and minus are encoded separately from the absolute values. Because the scan proceeds from the lower horizontal and vertical frequencies, and the probability that "0" appears in the high-frequency components is high, the coding efficiency can be increased.
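The reordering described above can be sketched as follows. FIG. 16A is not reproduced here, so the exact scan order is an assumption: the sketch uses the classic zigzag order, which visits anti-diagonals from low to high frequency. The names `zigzag_order` and `scan`, and the 4 × 4 sample block, are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    # Positions on one anti-diagonal share row + col. Diagonals are visited
    # from low to high frequency, alternating the direction of travel.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def scan(block):
    """Flatten a square block of coefficients into scan order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Hypothetical 4 x 4 block whose energy sits at the low frequencies.
example_block = [
    [9, 3, 1, 0],
    [4, 2, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
scanned = scan(example_block)
```

In this sample, all non-zero values land in the first six positions of `scanned`, leaving one long run of zeros at the end, which is the property the later run-length stage relies on.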

Next, the hierarchical coding unit 148 performs bit plane coding. As bit plane coding, MPEG-4 FGS performs zero run-length coding and Huffman coding on each bit plane, starting from the upper bits. A bit plane is a bit string formed by collecting the bits at the same bit position from a plurality of binary numbers. FIG. 17 is a diagram in which the DCT coefficients are arranged with the horizontal axis in scan order and the vertical axis in bit units. One column represents one DCT coefficient, and one row represents one bit plane. The most significant "1" bit of a DCT coefficient is called its MSB, and the blank cells in FIG. 17 are "0" bits above the MSB whose notation has been omitted. The bit plane containing the MSB of the largest DCT coefficient is called the MSB plane.
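As a minimal illustration of the bit plane arrangement described above, the following sketch splits the absolute values of a few coefficients into rows of bits, with the MSB plane first. The function name `bit_planes` and the sample values are hypothetical, and signs are simply dropped here because the text states they are coded separately.

```python
def bit_planes(coeffs):
    """Split absolute coefficient values into bit planes, MSB plane first."""
    magnitudes = [abs(c) for c in coeffs]
    # The MSB plane is set by the highest "1" bit of the largest magnitude.
    num_planes = max(magnitudes).bit_length()
    return [
        [(m >> bit) & 1 for m in magnitudes]
        for bit in range(num_planes - 1, -1, -1)
    ]


# Hypothetical coefficients, already in scan order.
planes = bit_planes([10, 0, 6, 0, 0, 3, 0, 1])
for row in planes:
    print(row)
```

Each printed row corresponds to one row of a diagram like FIG. 17, and each column to one coefficient.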

Zero run-length coding encodes how many "0"s appear before a coefficient other than "0" appears. It compresses the amount of information by assigning a single symbol to a run of several "0"s. Since only "0" or "1" appears in a bit plane, what is encoded is the length of each run of "0"s. In addition, for each "1" that follows a run of "0"s, "whether or not it is the last 1" in the bit plane is also encoded, so that the "0"s concentrated in the second half by the scan are encoded efficiently. In the zero run-length coding of MPEG-4 FGS illustrated in FIG. 17, in area C four "0"s are followed by a "1", and another "1" appears after it, so the combination of a zero run length of 4 and "not the last 1" is assigned. FIG. 16B is a diagram showing one bit plane of the DCT coefficients of 8 × 8 pixels. According to the scan order shown in FIG. 16A, bit B12 in FIG. 16B is the last "1" of the bit plane, and the other "1" bits are not the last "1" of the bit plane.
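The pairing of a zero run length with a "last 1" flag can be sketched as below. This is an illustrative model of the symbols only, not the MPEG-4 FGS bitstream syntax; the function name `run_length_symbols` and the sample plane are hypothetical.

```python
def run_length_symbols(plane):
    """Turn one bit plane into (zero_run, is_last_one) pairs."""
    one_positions = [i for i, bit in enumerate(plane) if bit == 1]
    symbols = []
    previous = -1  # index of the previous "1" (none yet)
    for pos in one_positions:
        zero_run = pos - previous - 1           # zeros skipped before this "1"
        is_last_one = pos == one_positions[-1]  # no further "1" in the plane
        symbols.append((zero_run, is_last_one))
        previous = pos
    return symbols


# Four zeros, a "1" that is not the last, one zero, then the last "1".
symbols = run_length_symbols([0, 0, 0, 0, 1, 0, 1, 0, 0])
```

The first symbol is the (4, not last) combination from the area C example. The zeros after the last "1" need no symbol, because the flag already marks the end of the plane. A plane containing no "1" at all yields no symbols in this sketch and would need separate signalling.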

[0014] Huffman coding is a type of variable-length coding. In MPEG-4 FGS, the appearance probability is calculated in advance for each combination of the "number of consecutive 0s", that is, the zero run length, and "whether or not it is the last 1", that is, the bit plane end signal. Information is compressed by assigning short codes to combinations with a high probability of occurrence and long codes to combinations with a low probability of occurrence. The table of codes assigned to the combinations is called a Huffman table, and MPEG-4 FGS prepares four tables: one for the MSB plane, one for the bit plane 1 bit below the MSB plane, one for the bit plane 2 bits below, and one for the bit planes below that. These Huffman tables assume that the MSB plane of the DCT coefficients of 8 × 8 pixels contains few "1"s and that the probability of "1" occurring gradually increases toward the lower bit planes. Information is compressed efficiently by predicting the combinations of "zero run length" and "whether or not it is the last 1" in this way.

[0015] FIG. 15 shows an MPEG-4 FGS video decoding apparatus 150 that decodes the base layer stream and the enhancement layer stream generated by the video coding apparatus 130 to generate a decoded image. The base layer decoded image generated by the base layer decoding unit 156 from the base layer stream and the differential decoded image generated by the enhancement layer decoding unit 160 from the enhancement layer stream are added together to generate the decoded image. The quality of the decoded image increases in proportion to the amount of enhancement layer stream that is decoded.

[0016] As described above, in MPEG-4 FGS coding and decoding, coding proceeds in order from the upper bit planes of the DCT coefficients, which strongly influence the quality of the decoded image, and these planes are stored preferentially in the video stream, so the image quality can be adjusted flexibly. For example, in a system that transmits a video stream from a video transmitting terminal to a video receiving terminal, even if the video receiving terminal cannot receive the video stream completely, an MPEG-4 FGS video stream can still be decoded, because only the lower bit planes, which have little influence on image quality, are missing. With non-scalable video coding schemes such as MPEG-4 AVC, by contrast, if the video stream cannot be received completely, subjective image quality is greatly degraded; for example, the lower half of a screen cannot be decoded.

Japanese Patent Application Laid-Open No. 2003-274406 describes an improvement to the hierarchical coding unit 148 of the MPEG-4 FGS video coding apparatus shown in FIG. 14.

FIG. 18 shows the hierarchical coding unit 148 described in the above publication. In the hierarchical coding unit 148, the bit separation unit 170 separates each input DCT coefficient into its MSB and its other, non-MSB bits. The MSB encoding unit 172 encodes only the MSBs of the DCT coefficients, the non-MSB encoding unit 174 encodes the non-MSB bits, and the combining unit 176 combines and outputs the MSB and non-MSB codes. In the above publication, coding efficiency is improved by preferentially storing in the video stream the MSB, which among the bits of one DCT coefficient has the greatest effect on the quality of the decoded image. "Good coding efficiency" is used in the following sense: when two video streams of the same data amount, coded from the same original video, are decoded and the image quality of one decoded video is superior to that of the other, the former has the better coding efficiency.

Disclosure of the invention

Problems to be solved by the invention

As described above, MPEG-4 FGS prepares a Huffman table for each bit plane, but these Huffman tables do not necessarily reflect the appearance probability of each combination of "zero run length" and "whether or not it is the last 1", so the coding efficiency cannot be said to be good. The difference image encoded by the enhancement layer coding unit 140 is the difference between the original image and the base layer decoded image and contains many edges, so DCT coefficients with large absolute values may appear in the high-frequency region. For example, suppose the MSB of one DCT coefficient is at bit 11 and the MSBs of all the other DCT coefficients are at bit 5 or below. When the bit planes from the MSB plane at bit 11 down to bit 6 are coded, only one DCT coefficient has any bits to code, yet the Huffman tables assume that many "1"s appear. The gap between the expected and actual probabilities of occurrence becomes large, and the coding efficiency deteriorates.
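The mismatch in this example can be made concrete by counting the "1" bits in each plane. The sketch below assumes that "bit 11" means the eleventh bit counted from 1 (weight 1024); the function name `ones_per_plane` and the block values are hypothetical.

```python
def ones_per_plane(coeffs, num_planes):
    """Count the "1" bits in each bit plane, most significant plane first."""
    return [
        sum((abs(c) >> bit) & 1 for c in coeffs)
        for bit in range(num_planes - 1, -1, -1)
    ]


# Hypothetical block: one coefficient whose MSB is at bit 11 (value 1024),
# all others with MSBs at bit 5 or below (values under 32).
block = [1024, 21, 9, 3, 0, 1, 0, 0]
counts = ones_per_plane(block, 11)
upper_six = counts[:6]  # planes for bits 11 down to 6
```

Here `upper_six` never contains more than a single "1" per plane, so a table that expects the number of "1"s to grow steadily from the MSB plane downward is a poor fit for those planes.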

With the coding method used in the above publication, in which only the MSBs of the DCT coefficients are given priority, an improvement in coding efficiency cannot always be expected. For example, consider two DCT coefficients whose MSBs are at bit 8 and bit 2, respectively. The non-MSB bit 7 of the former has a larger influence on image quality than the MSB at bit 2 of the latter.

[0021] In view of the above background, an object of the present invention is to provide a video coding apparatus that can perform efficient coding according to the state of the bits of the transform coefficients to be encoded, and a video decoding apparatus that decodes the resulting coded data.

Means for solving the problems

A video encoding apparatus according to the present invention comprises: a transform coefficient generation unit that frequency-transforms a video to generate transform coefficients representing frequency components; a bit plane coding unit that converts the transform coefficients generated by the transform coefficient generation unit into binary numbers, generates bit planes, each consisting of the same-order bits of a plurality of transform coefficients, from the most significant bit to the least significant bit, and encodes them in order from the upper bit plane; and a state prediction unit that predicts the state of bits of transform coefficients to be encoded later based on information on transform coefficients that have been encoded earlier. The bit plane coding unit performs coding according to the state of the bit predicted by the state prediction unit.

In the above-described video encoding device, the state prediction unit may predict the state of a bit (1) in a lower bit plane based on information of the upper bit planes; (2) according to the number of transform coefficients in which a bit state "1" appears in the bit planes higher than the bit to be predicted; (3) based on whether or not a bit state "1" has appeared in the higher-order bits of the transform coefficient of the bit to be predicted; (4) based on the distance to the nearest transform coefficient that, among the transform coefficients in which a bit state "1" appears in the bit planes higher than the bit to be predicted, is located later in coding order than the transform coefficient of the bit to be predicted; or (5) based on the transform coefficient in which a bit state "1" appears at the last position in coding order in the bit planes higher than the bit to be predicted.
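The kinds of side information listed in items (2) to (5) can all be derived from the bit planes already coded. The following is a minimal sketch under that reading; how such values are mapped to a predicted bit state is not shown, and the function name `prediction_context`, its dictionary keys, and the sample planes are hypothetical.

```python
def prediction_context(upper_planes, index, num_coeffs):
    """Gather information from already-coded upper planes for one bit.

    upper_planes: bit planes above the current one, each a list of 0/1.
    index: position in coding order of the bit to be predicted.
    """
    # A coefficient counts as "significant" once a "1" appeared above.
    significant = [
        any(plane[i] for plane in upper_planes) for i in range(num_coeffs)
    ]
    positions = [i for i in range(num_coeffs) if significant[i]]
    later = [i for i in positions if i > index]
    return {
        "significant_count": len(positions),                       # item (2)
        "already_significant": significant[index],                 # item (3)
        "distance_to_next": later[0] - index if later else None,   # item (4)
        "last_significant": positions[-1] if positions else None,  # item (5)
    }


# Two hypothetical upper planes over eight coefficients.
upper = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
]
context = prediction_context(upper, index=1, num_coeffs=8)
```

Because the decoder has decoded the same upper planes before it reaches the current bit, it can compute identical values without any extra data being transmitted.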

[0024] In the video encoding device, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane in the transform coefficients located later in coding order are "0", and the bit plane coding unit may perform zero run-length coding based on that probability.

[0025] In the above video coding device, the state prediction unit may determine whether or not, in the bit planes higher than the bit to be predicted, the bits of the transform coefficients at or after a predetermined position in coding order include a bit state "1". When it determines that no bit state "1" is included, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane in the transform coefficients located later in coding order are "0", and the bit plane coding unit may perform zero run-length coding based on that probability.

[0026] In the above video encoding device, the state prediction unit may predict a zero run length that includes the bit to be predicted, and the bit plane coding unit may perform Huffman coding using a Huffman table selected based on the zero run length predicted by the state prediction unit.

[0027] In the above video coding device, the state prediction unit may obtain, as the zero run length, the number of zeros up to the point where the probability that the bit state "0" appears consecutively falls below a preset threshold.
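One way to read this thresholding rule is sketched below. It assumes that a per-bit probability of "0" is available for each upcoming bit and that the bits are treated as independent, so the probability of a run is a running product; both assumptions, and the name `predicted_zero_run`, are illustrative rather than taken from the specification.

```python
def predicted_zero_run(p_zero, threshold):
    """Count zeros until P(all bits so far are "0") drops below threshold.

    p_zero: assumed probability that each upcoming bit is "0", in order.
    """
    run = 0
    joint = 1.0  # probability that every bit seen so far is "0"
    for p in p_zero:
        joint *= p
        if joint < threshold:
            break
        run += 1
    return run


# Ten upcoming bits, each assumed to be "0" with probability 0.9.
predicted = predicted_zero_run([0.9] * 10, threshold=0.5)
```

With these inputs the running product stays above 0.5 for six bits (about 0.53) and falls to about 0.48 on the seventh, so the predicted zero run length is 6.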

In the above-described video encoding apparatus, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" or "0", and the bit plane coding unit may perform arithmetic coding using the symbol occurrence probability determined based on the probability predicted by the state prediction unit.
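The benefit of feeding predicted probabilities to an arithmetic coder can be estimated without building the coder itself, because an ideal arithmetic coder spends about -log2(p) bits on a symbol of probability p. The sketch below computes that ideal cost only. The function name `ideal_code_length` and the probability values are hypothetical, and every probability is assumed to lie strictly between 0 and 1.

```python
import math


def ideal_code_length(bits, p_one):
    """Ideal arithmetic-coding cost in bits, given P(bit == 1) per symbol."""
    return sum(
        -math.log2(p if bit == 1 else 1.0 - p)
        for bit, p in zip(bits, p_one)
    )


bits = [0, 0, 0, 1, 0, 0, 0, 0]
# No prediction: every bit is treated as equally likely to be "0" or "1".
cost_flat = ideal_code_length(bits, [0.5] * len(bits))
# Hypothetical predictions that are confident and mostly right.
cost_predicted = ideal_code_length(
    bits, [0.1, 0.1, 0.1, 0.8, 0.05, 0.05, 0.05, 0.05]
)
```

Without prediction the eight bits cost exactly eight bits. With the sample predictions the cost falls to roughly one bit in total, and it would rise above eight if the predictions were confidently wrong.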

[0029] In the above video encoding device, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane in the transform coefficients located later in coding order are "0", and the bit plane coding unit may perform arithmetic coding using the symbol occurrence probability determined based on the probability predicted by the state prediction unit.

The above video coding apparatus may be a video coding apparatus that performs hierarchical coding into a base layer from which video can be decoded independently and an enhancement layer for improving the video quality of the base layer. In that case, the state prediction unit may predict the bit states of the transform coefficients of the enhancement layer based on information of the base layer, and the bit plane coding unit may encode the transform coefficients of the enhancement layer according to the state of the bit predicted by the state prediction unit.

[0031] In the above video coding apparatus, the state prediction unit may predict the bit states of the transform coefficients of the enhancement layer based on edge information included in the base layer or on the code amount of the base layer.

In the video coding apparatus, the state prediction unit may predict the bit states of the transform coefficients of a frame constituting the video based on information of the reference frame used for motion prediction/compensation coding, and the bit plane coding unit may encode the transform coefficients of the frame according to the state of the bit predicted by the state prediction unit.

[0033] In the video coding apparatus, the state prediction unit may predict the bit states of the transform coefficients of the frame based on edge information included in the reference frame or on the code amount of the reference frame.

The video decoding apparatus according to the present invention comprises a bit plane decoding unit that decodes bit-plane-coded video data in order from the upper bit plane, and a state prediction unit that predicts the state of bits of transform coefficients to be decoded later based on information on transform coefficients that have been decoded earlier. The bit plane decoding unit performs decoding according to the state of the bit predicted by the state prediction unit.

[0035] In the above video decoding apparatus, the state prediction unit may predict the state of a bit (1) in a lower bit plane based on information of the upper bit planes; (2) according to the number of transform coefficients in which a bit state "1" appears in the bit planes higher than the bit to be predicted; (3) based on whether or not a bit state "1" has appeared in the higher-order bits of the transform coefficient of the bit to be predicted; (4) based on the distance to the nearest transform coefficient that, among the transform coefficients in which a bit state "1" appears in the bit planes higher than the bit to be predicted, is located later in decoding order than the transform coefficient of the bit to be predicted; or (5) based on the transform coefficient in which a bit state "1" appears at the last position in decoding order in the bit planes higher than the bit to be predicted.

In the video decoding apparatus, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane in the transform coefficients located later in decoding order are "0", and the bit plane decoding unit may perform zero run-length decoding based on that probability.

[0037] In the above video decoding apparatus, the state prediction unit may determine whether or not, in the bit planes higher than the bit to be predicted, the bits of the transform coefficients at or after a predetermined position in decoding order include a bit state "1". When it determines that no bit state "1" is included, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane in the transform coefficients located later in decoding order are "0", and the bit plane decoding unit may perform zero run-length decoding based on that probability.

[0038] In the above video decoding apparatus, the state prediction unit may predict a zero run length that includes the bit to be predicted, and the bit plane decoding unit may perform Huffman decoding using a Huffman table selected based on the zero run length predicted by the state prediction unit.

The state prediction unit may obtain, as a zero run length, the number of zeros until the probability that the bit state “0” continuously appears falls below a preset threshold.

In the above video decoding apparatus, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" or "0", and the bit plane decoding unit may perform arithmetic decoding using the symbol occurrence probability determined based on the probability predicted by the state prediction unit.

[0041] In the video decoding apparatus, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane in the transform coefficients located later in decoding order are "0", and the bit plane decoding unit may perform arithmetic decoding using the symbol occurrence probability determined based on the zero run length predicted by the state prediction unit.

In the above video decoding apparatus, the state prediction unit may predict, based on the decoded information of the base layer, the bit states of the transform coefficients of the enhancement layer for improving the video quality of the base layer, and the bit plane decoding unit may decode the transform coefficients of the enhancement layer according to the state of the bit predicted by the state prediction unit.

In the above video decoding device, the state prediction unit may predict the bit states of the transform coefficients of the enhancement layer based on edge information included in the base layer or on the code amount of the base layer.

In the video decoding apparatus, the state prediction unit may predict the bit states of the transform coefficients of a frame constituting the video based on information of the reference frame used for motion prediction/compensation coding, and the bit plane decoding unit may decode the transform coefficients of the frame according to the state of the bit predicted by the state prediction unit.

In the video decoding apparatus, the state prediction unit may predict the bit states of the transform coefficients of the frame based on edge information included in the reference frame or on the code amount of the reference frame.

The video encoding method of the present invention comprises: a transform coefficient generation step of frequency-transforming a video to generate transform coefficients representing frequency components; a bit plane coding step of converting the transform coefficients generated in the transform coefficient generation step into binary numbers, generating bit planes, each consisting of the same-order bits of a plurality of transform coefficients, from the most significant bit to the least significant bit, and encoding them in order from the upper bit plane; and a state prediction step of predicting the state of bits of transform coefficients to be encoded later based on information on the previously encoded transform coefficients. The bit plane coding step performs coding according to the state of the bit predicted in the state prediction step.

The video decoding method of the present invention comprises: a bit plane decoding step of decoding bit-plane-coded video data in order from the upper bit plane; and a state prediction step of predicting the state of bits of transform coefficients to be decoded later based on information on the previously decoded transform coefficients. The bit plane decoding step performs decoding according to the state of the bit predicted in the state prediction step.

[0048] The program for video coding according to the present invention causes a computer, in order to code video, to execute: a transform coefficient generation step of frequency-transforming the video to generate transform coefficients representing frequency components; a bit plane coding step of converting the transform coefficients generated in the transform coefficient generation step into binary numbers, generating bit planes, each consisting of the same-order bits of a plurality of transform coefficients, from the most significant bit to the least significant bit, and encoding them in order from the upper bit plane; and a state prediction step of predicting the state of bits of transform coefficients to be encoded later based on information on the previously encoded bits of the transform coefficients. The bit plane coding step performs coding according to the state of the bit predicted in the state prediction step.

The program for video decoding according to the present invention causes a computer, in order to decode bit-plane-coded video data, to execute: a bit plane decoding step of decoding the data in order from the upper bit plane; and a state prediction step of predicting the state of bits of transform coefficients to be decoded later based on information on the previously decoded transform coefficients. The bit plane decoding step performs decoding according to the state of the bit predicted in the state prediction step.

[0050] As described below, there are other aspects of the present invention. Therefore, this disclosure of the invention is intended to provide some aspects of the present invention and is not intended to limit the scope of the claimed invention.

Brief description of the drawings

[FIG. 1] FIG. 1 is a diagram showing the configuration of a hierarchical coding unit according to a first embodiment of the present invention.

[FIG. 2] FIG. 2 is a diagram showing the configuration of a video encoding apparatus according to the first embodiment of the present invention.

[FIG. 3] FIG. 3 is a flowchart showing the operation of the video coding apparatus according to the first embodiment of the present invention.

[FIG. 4] FIG. 4 is a flowchart showing the operation of the hierarchical coding processing of the video coding apparatus according to the first embodiment of the present invention.

[FIG. 5] FIG. 5 is an explanatory view of the state prediction of the present invention.

[FIG. 6] FIG. 6 is a diagram showing the configuration of a video decoding apparatus according to a second embodiment of the present invention.

[FIG. 7] FIG. 7 is a diagram showing the configuration of a hierarchical decoding unit according to a second embodiment of the present invention.

[FIG. 8] FIG. 8 is a flowchart showing the operation of the video decoding apparatus according to the second embodiment of the present invention.

[FIG. 9] FIG. 9 is a flowchart showing an operation of hierarchical decoding processing of the video decoding apparatus according to the second embodiment of the present invention.

[FIG. 10] FIG. 10 is a diagram showing the configuration of a video encoding apparatus according to a third embodiment of the present invention.

[FIG. 11] FIG. 11 is a flowchart showing the operation of the video encoding apparatus according to the third embodiment of the present invention.

[FIG. 12] FIG. 12 is a diagram showing the configuration of a video decoding apparatus according to a fourth embodiment of the present invention.

[FIG. 13] FIG. 13 is a flowchart showing the operation of the video decoding apparatus according to the fourth embodiment of the present invention.

[FIG. 14] FIG. 14 is a diagram showing the configuration of a video coding device of MPEG-4 FGS.

[FIG. 15] FIG. 15 is a diagram showing the configuration of the MPEG-4 FGS video decoding device.

[FIG. 16A] FIG. 16A is an explanatory diagram of scanning of DCT coefficients.

[FIG. 16B] FIG. 16B is an explanatory diagram of bit planes of DCT coefficients.

[FIG. 17] FIG. 17 is an explanatory diagram of bit plane coding.

[FIG. 18] FIG. 18 is a diagram showing the configuration of a conventional hierarchical coding unit.

Best mode for carrying out the invention

Hereinafter, the present invention will be described in detail, but the following detailed description and the attached drawings do not limit the present invention. The scope of the invention is defined by the appended claims.

The video encoding apparatus according to the present embodiment comprises: a transform coefficient generation unit that frequency-transforms a video to generate transform coefficients representing frequency components; a bit plane coding unit that converts the transform coefficients generated by the transform coefficient generation unit into binary numbers, generates bit planes, each consisting of the same-order bits of a plurality of transform coefficients, from the most significant bit to the least significant bit, and encodes them in order from the upper bit plane; and a state prediction unit that predicts the state of bits of transform coefficients to be encoded later based on information on the previously encoded transform coefficients. The bit plane coding unit performs coding according to the state of the bit predicted by the state prediction unit.

In this way, the state of the bits to be encoded later, that is, the probability of "1" and "0" occurring in those bits, is predicted based on information on the previously encoded transform coefficients, so efficient coding can be achieved. For example, coding efficiency is improved by assigning codes with short bit lengths to combinations with a high occurrence probability.
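The principle of giving short codes to likely combinations is what a Huffman code construction does. The following sketch builds code lengths from a table of probabilities; the symbols are (zero run, last "1") pairs with made-up probabilities, and the function name `huffman_code_lengths` is hypothetical. It is not one of the MPEG-4 FGS tables.

```python
import heapq


def huffman_code_lengths(probabilities):
    """Return {symbol: code length in bits} for a Huffman code."""
    # Heap entries: (probability, tie-breaker, {symbol: depth so far}).
    # The unique tie-breaker keeps the dictionaries from being compared.
    heap = [
        (p, i, {symbol: 0})
        for i, (symbol, p) in enumerate(probabilities.items())
    ]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, first = heapq.heappop(heap)   # the two least likely subtrees
        p2, _, second = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**first, **second}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]


# Hypothetical predicted probabilities for (zero_run, is_last_one) symbols.
lengths = huffman_code_lengths({
    (0, False): 0.5,
    (1, False): 0.25,
    (2, False): 0.125,
    (0, True): 0.125,
})
```

The most probable combination receives a 1-bit code and the two least probable ones 3-bit codes, so the better the predicted probabilities match the data, the shorter the average code.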

In the video coding apparatus, the state prediction unit may predict the state of the bits of the lower bit plane based on the information of the upper bit plane.

By utilizing the correlation between the state of the upper bit plane and the state of the lower bit plane in this manner, it is possible to appropriately predict the state of the bits of the lower bit plane. In addition, the state of bits can be predicted without using information from outside the region to which the bits of the coefficient being encoded belong.

In the video coding apparatus, the state prediction unit may predict the state of the bit according to the number of transform coefficients in which the bit state “1” has appeared in a bit plane higher than the bit to be predicted.

With this configuration, the prediction accuracy of the bit state can be improved, and the coding efficiency can be improved.

[0059] In the video coding apparatus, the state prediction unit may predict the state of the bit based on whether the bit state “1” has appeared in the upper bits of the transform coefficient to which the bit to be predicted belongs. According to this configuration, the prediction accuracy of the bit state can be improved, and the coding efficiency can be improved.

[0061] In the video coding apparatus, the state prediction unit may predict the state of the bit based on the distance from the transform coefficient of the bit to be predicted to the nearest transform coefficient that is located behind it in coding order among the transform coefficients in which the bit state “1” has appeared in a bit plane higher than the bit to be predicted.

According to this configuration, since the zero run length can be predicted, it is possible to perform efficient coding.
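As a rough illustration of this distance-based prediction, the following Python sketch marks the coefficients that already have a “1” in a higher bit plane and measures the distance to the next such coefficient in coding order. The function names are hypothetical, and counting the current coefficient itself as distance 0 is an assumption of this sketch.

```python
def significance_map(upper_planes):
    """True where a coefficient already has a "1" in some higher bit plane."""
    return [any(bits) for bits in zip(*upper_planes)]


def distance_to_next_significant(significant, pos):
    """Distance from pos to the nearest significant coefficient at or after pos.

    Returns None when no such coefficient remains in coding order.
    """
    for i in range(pos, len(significant)):
        if significant[i]:
            return i - pos
    return None


# Two higher bit planes of six coefficients, listed in coding order.
upper = [[1, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 1, 0]]
sig = significance_map(upper)
print(distance_to_next_significant(sig, 1))
```

A predictor could use the returned distance as a first estimate of the zero run length starting at `pos`.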

In the above video coding apparatus, the state prediction unit may predict the state of the bit based on the transform coefficient in which the bit state “1” appeared at the rearmost position in coding order in a bit plane higher than the bit to be predicted.

[0065] In the above-mentioned video encoding device, the state prediction unit may predict the probability that the state of the bit to be predicted is “1” and that all bits belonging to the same bit plane of the transform coefficients located behind it in coding order are "0", and the bit plane coding unit may perform zero run length coding based on that probability.

[0066] According to this configuration, it is possible to predict the last "1" appearing in the bit plane and perform efficient coding.

[0067] In the above video coding apparatus, the state prediction unit may determine whether, in a bit plane higher than the bit to be predicted, the bits of the transform coefficients located after a predetermined position in coding order include the state “1”. When it determines that the state “1” is not included, it may predict the probability that the state of the bit to be predicted is “1” and that all bits belonging to the same bit plane of the transform coefficients located behind it in coding order are "0", and the bit plane coding unit may perform zero run length coding based on that probability.

With this configuration, it is possible to predict the number of "0"s after the last "1", which can be omitted by means of the bit plane end signal indicating the last "1" in coding order. When the number of “0”s following the last "1" is small, the bit plane end signal is not coded, so the coding efficiency can be improved.

In the video coding apparatus, the state prediction unit predicts a zero run length including the bit to be predicted, and the bit plane coding unit performs Huffman coding using a Huffman table selected based on the zero run length predicted by the state prediction unit.

According to this configuration, it is possible to assign a short code to the zero run length with high occurrence probability based on the predicted zero run length, and the coding efficiency is improved.
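The idea of switching code tables according to the predicted zero run length can be sketched as follows. The two prefix-code tables, the selection threshold and all names are hypothetical values chosen for illustration. They are not taken from this specification.

```python
# Hypothetical prefix-code tables for zero run lengths 0..3 (illustrative only).
TABLE_SHORT_RUNS = {0: "0", 1: "10", 2: "110", 3: "111"}  # short runs get short codes
TABLE_LONG_RUNS = {3: "0", 2: "10", 1: "110", 0: "111"}   # long runs get short codes


def select_table(predicted_run, threshold=2):
    """Pick the table whose short codes match the predicted run length."""
    return TABLE_LONG_RUNS if predicted_run >= threshold else TABLE_SHORT_RUNS


def encode_runs(runs, predicted_run):
    """Encode a list of zero run lengths with the selected table."""
    table = select_table(predicted_run)
    return "".join(table[r] for r in runs)


runs = [0, 1, 0]
good = encode_runs(runs, predicted_run=0)  # prediction matches the data
bad = encode_runs(runs, predicted_run=3)   # prediction does not match
print(len(good), len(bad))
```

Because the decoder derives the same prediction from already decoded data, it can select the same table without any side information.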

[0071] In the video coding apparatus, the state prediction unit may determine, as the zero run length, the number of zeros up to the point where the probability that the bit state "0" continues to appear falls below a preset threshold.

According to this configuration, the zero run length can be appropriately predicted.

According to this configuration, it is possible to assign a short code to the transform coefficients being encoded, and the coding efficiency is improved.

[0075] In the above video encoding device, the state prediction unit may predict the probability that the state of the bit to be predicted is "1" and that all bits belonging to the same bit plane of the transform coefficients located behind it in coding order are "0", and the bit plane coding unit may perform arithmetic coding using the symbol occurrence probability determined based on the probability predicted by the state prediction unit.

The video coding apparatus may be a video coding apparatus that performs hierarchical coding into a base layer, from which video can be decoded independently, and an enhancement layer for improving the video quality of the base layer. The state prediction unit may predict the bit state of the transform coefficients of the enhancement layer based on the information of the base layer, and the bit plane coding unit may code the transform coefficients of the enhancement layer according to the bit state predicted by the state prediction unit. According to this configuration, it is possible to appropriately predict the bit state of the transform coefficients of the enhancement layer based on the information of the base layer, which is strongly correlated with the enhancement layer.

[0079] In the above video coding apparatus, the state prediction unit may predict the bit state of the transform coefficients of the enhancement layer based on edge information included in the base layer or the code amount of the base layer.

According to this configuration, it is possible to appropriately predict the state of bits of the transform coefficient of the enhancement layer.

In the video coding apparatus, the state prediction unit may predict the bit state of the transform coefficients of a frame constituting the video based on the information of the reference frame used for motion prediction/compensation coding, and the bit plane coding unit may code the transform coefficients of the frame according to the bit state predicted by the state prediction unit.

According to this configuration, the state of the bits of the transform coefficients of the frame being encoded can be properly predicted based on the information of the reference frame, which is strongly correlated with the frame being encoded. Because the state of the residual signal of motion prediction/compensation can be predicted using the characteristics of the reference frame, the coding efficiency can be improved.

In the above video coding device, the state prediction unit may predict the bit state of the transform coefficients of the frame based on edge information included in the reference frame or the code amount of the reference frame.

According to this configuration, it is possible to properly predict the bit state of the transform coefficients of the frame being encoded.

[0085] The video decoding apparatus according to the present embodiment comprises: a bit plane decoding unit that decodes bit-plane-coded video data in order from the upper bit plane; and a state prediction unit that predicts the state of bits of the transform coefficients to be decoded later based on information on previously decoded transform coefficients. The bit plane decoding unit performs decoding according to the bit state predicted by the state prediction unit.

Thus, by predicting the state of the bits to be decoded later, i.e., the appearance probabilities of “1” and “0” in those bits, based on the information on the transform coefficients decoded earlier, it is possible to decode encoded data that was efficiently encoded according to the appearance probabilities.

[0087] In the video decoding apparatus, the state prediction unit may predict the state of the bits of the lower bit plane based on the information of the upper bit plane.

By utilizing the correlation between the state of the upper bit plane and the state of the lower bit plane in this manner, the state of the bits of the lower bit plane can be appropriately predicted, and efficiently encoded data can be decoded. In addition, the state of bits can be predicted without using information from outside the region to which the bits of the coefficient being decoded belong.

In the video decoding device, the state prediction unit may predict the state of the bit according to the number of transform coefficients in which the bit state “1” has appeared in a bit plane higher than the bit to be predicted.

According to this configuration, it is possible to decode coded data encoded efficiently by improving the prediction accuracy of the bit state.

[0091] In the above video decoding apparatus, the state prediction unit may predict the state of the bit based on whether or not the bit state "1" has appeared in the upper bits of the transform coefficient to which the bit to be predicted belongs.

[0092] With this configuration, it is possible to improve the prediction accuracy of the bit state and decode efficiently encoded data.

[0093] In the above video decoding apparatus, the state prediction unit may predict the state of the bit based on the distance from the transform coefficient of the bit to be predicted to the nearest transform coefficient that is located behind it in decoding order among the transform coefficients in which the bit state “1” has appeared in a bit plane higher than the bit to be predicted.

According to this configuration, it is possible to predict zero run length and decode coded data efficiently coded.

[0095] In the above video decoding apparatus, the state prediction unit may predict the state of the bit based on the transform coefficient in which the bit state "1" appeared at the rearmost position in decoding order in a bit plane higher than the bit to be predicted.

With this configuration, it is possible to improve the prediction accuracy of the bit state and decode efficiently encoded data.

[0097] In the above video decoding apparatus, the state prediction unit may predict the probability that the state of the bit to be predicted is “1” and that all bits belonging to the same bit plane of the transform coefficients located behind it in decoding order are "0", and the bit plane decoding unit may perform zero run length decoding based on that probability.

According to this configuration, it is possible to efficiently decode coded data encoded by predicting the last “1” appearing in the bit plane.

In the above video decoding apparatus, the state prediction unit may determine whether, in a bit plane higher than the bit to be predicted, the bits of the transform coefficients located after a predetermined position in decoding order include the state “1”. When it determines that the state “1” is not included, it may predict the probability that the state of the bit to be predicted is “1” and that all bits belonging to the same bit plane of the transform coefficients located behind it in decoding order are "0", and the bit plane decoding unit may perform zero run length decoding based on that probability.

With this configuration, it is possible to decode encoded data that was efficiently encoded by not coding the bit plane end signal when the number of “0”s after the last “1” is small.

In the video decoding apparatus, the state prediction unit predicts a zero run length including the bit to be predicted, and the bit plane decoding unit performs Huffman decoding using a Huffman table selected based on the zero run length predicted by the state prediction unit.

According to this configuration, it is possible to decode encoded data that was efficiently encoded using the Huffman table selected according to the predicted zero run length.

The state prediction unit may obtain, as the zero run length, the number of zeros until the probability that the bit state “0” appears continuously falls below a preset threshold.

In the above video decoding apparatus, the state prediction unit may predict the probability that the state of the bit to be predicted is “1” or “0”, and the bit plane decoding unit may perform arithmetic decoding using the symbol occurrence probability determined based on the probability predicted by the state prediction unit.

According to this configuration, it is possible to decode encoded data that was efficiently encoded using the occurrence probability selected according to the state of the bit.

In the video decoding apparatus, the state prediction unit may predict the probability that the state of the bit to be predicted is “1” and that all bits belonging to the same bit plane of the transform coefficients located behind it in decoding order are “0”, and the bit plane decoding unit may perform arithmetic decoding using the symbol occurrence probability determined based on the zero run length predicted by the state prediction unit.

According to this configuration, it is possible to decode encoded data efficiently encoded using the occurrence probability selected according to the state of the bit.

In the above video decoding apparatus, the state prediction unit may predict, based on the decoded information of the base layer, the bit state of the transform coefficients of the enhancement layer for improving the video quality of the base layer, and the bit plane decoding unit may decode the transform coefficients of the enhancement layer according to the bit state predicted by the state prediction unit.

According to this configuration, it is possible to appropriately predict the bit state of the transform coefficients of the enhancement layer based on the information of the base layer, which is strongly correlated with the enhancement layer.

In the above video decoding apparatus, the state prediction unit may predict the bit state of the transform coefficients of the enhancement layer based on edge information included in the base layer or the code amount of the base layer.

According to this configuration, it is possible to appropriately predict the bit state of the transform coefficient of the enhancement layer in decoding.

In the above video decoding device, the state prediction unit may predict the bit state of the transform coefficients of a frame constituting the video based on the information of the reference frame used for motion prediction/compensation coding, and the bit plane decoding unit may decode the transform coefficients of the frame according to the bit state predicted by the state prediction unit.

According to this configuration, it is possible to appropriately predict the bit state of the transform coefficient of the frame in decoding based on the information of the reference frame having a strong correlation with the frame in decoding.

In the above video decoding apparatus, the state prediction unit may predict the bit state of the transform coefficients of the frame based on edge information included in the reference frame or the code amount of the reference frame. According to this configuration, it is possible to appropriately predict the bit state of the transform coefficients of the frame being decoded.

The video encoding method according to the present embodiment comprises: a transform coefficient generation step of frequency-transforming a video to generate transform coefficients representing frequency components; a bit plane coding step of converting the transform coefficients generated in the transform coefficient generation step into binary numbers, generating bit planes each consisting of the same-order bits of a plurality of transform coefficients, from the most significant bit to the least significant bit, and coding the bit planes in order from the upper bit plane; and a state prediction step of predicting the state of bits of the transform coefficients to be encoded later based on information on previously encoded transform coefficients. In the bit plane coding step, coding is performed according to the bit state predicted in the state prediction step.

According to this configuration, it is possible to perform efficient coding as in the video coding apparatus described above. It is also possible to apply various configurations of the above video coding apparatus to the video coding method of the present embodiment.

The video decoding method according to the present embodiment comprises: a bit plane decoding step of decoding bit-plane-coded video data in order from the upper bit plane; and a state prediction step of predicting the state of bits of the transform coefficients to be decoded later based on information on previously decoded transform coefficients. In the bit plane decoding step, decoding is performed according to the bit state predicted in the state prediction step.

With this configuration, it is possible to decode encoded data that has been efficiently encoded, as in the video decoding device described above. It is also possible to apply various configurations of the video decoding apparatus described above to the video decoding method of the present embodiment.

The program for video coding according to the present embodiment causes a computer, in order to code a video, to execute: a transform coefficient generation step of frequency-transforming the video to generate transform coefficients representing frequency components; a bit plane coding step of converting the transform coefficients generated in the transform coefficient generation step into binary numbers, generating bit planes each consisting of the same-order bits of a plurality of transform coefficients, from the most significant bit to the least significant bit, and coding the bit planes in order from the upper bit plane; and a state prediction step of predicting the state of bits of the transform coefficients to be encoded later based on information on previously encoded transform coefficients. In the bit plane coding step, coding is performed according to the bit state predicted in the state prediction step.

According to this configuration, it is possible to perform efficient coding as in the video coding apparatus described above. It is also possible to apply the various configurations of the video encoding device described above to the program of the present embodiment.

The program for video decoding according to the present embodiment causes a computer, in order to decode bit-plane-coded video data, to execute: a bit plane decoding step of decoding the data in order from the upper bit plane; and a state prediction step of predicting the state of bits of the transform coefficients to be decoded later based on information on previously decoded transform coefficients. In the bit plane decoding step, decoding is performed according to the bit state predicted in the state prediction step.

With this configuration, it is possible to decode encoded data that has been efficiently encoded, as in the video decoding device described above. It is also possible to apply various configurations of the video decoding apparatus of the present invention to the program of the present invention.

According to the video encoding device and method according to the present embodiment, the state of the bits to be encoded later, i.e., the appearance probabilities of '1' and '0' in those bits, is predicted based on the information on the previously encoded transform coefficients, which provides the excellent effect that efficient coding can be performed.

Hereinafter, embodiments of the video coding device and the video decoding device according to the present invention will be described in detail with reference to the drawings.

First Embodiment

In the first embodiment, a video coding apparatus in a scalable video coding system will be described that predicts the state of the enhancement layer using previously encoded information of the enhancement layer and performs coding accordingly. The video encoding apparatus according to the first embodiment uses already-encoded information, i.e., information already stored in the video stream, to predict a state prediction parameter representing the state of the coding information, i.e., the information currently being encoded, and codes the coding information based on the state prediction parameter.

FIG. 1 is a block diagram showing the configuration of the hierarchical coding unit 26 in the video encoding apparatus 10, and FIG. 2 is a block diagram showing the configuration of the video coding apparatus 10 according to the first embodiment of the present invention. Before describing the hierarchical coding unit 26 of the video encoding device 10 according to the present embodiment with reference to FIG. 1, the overall configuration of the video encoding device 10 will be described with reference to FIG. 2.

In FIG. 2, the video encoding device 10 includes a video signal input unit 12, a base layer coding unit 14, a base layer output unit 16, an enhancement layer coding unit 18, and an enhancement layer output unit 28. The enhancement layer coding unit 18 has a difference unit 20, a DCT unit 22, a scan unit 24, and a hierarchical coding unit 26.

The video signal input unit 12 receives video from outside the video encoding device 10 frame by frame as an original image and outputs it to the base layer coding unit 14 and the enhancement layer coding unit 18. It also determines whether there is a video input from outside the video encoding device 10 and, if there is none, ends the processing.

The base layer coding unit 14 codes the original image input from the video signal input unit 12 to generate a base layer stream, and outputs the generated base layer stream to the base layer output unit 16. It also decodes the base layer stream to generate a base layer decoded image, and outputs the generated base layer decoded image to the enhancement layer coding unit 18.

The base layer output unit 16 outputs the base layer stream input from the base layer coding unit 14 to the outside of the video encoding device 10.

The difference unit 20 generates a difference image by calculating the difference between the original image input from the video signal input unit 12 and the base layer decoded image input from the base layer coding unit 14. The difference image is output to the DCT unit 22.

The DCT unit 22 divides the difference image input from the difference unit 20 into blocks, which are areas of 8 × 8 pixels, performs DCT transform for each block, and generates DCT coefficients. The DCT coefficients are output to the scan unit 24. The DCT unit 22 corresponds to a transform coefficient generation unit of the present invention.
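For reference, the per-block transform can be sketched with a direct (unoptimized) 8 × 8 DCT in Python. The orthonormal DCT-II normalization used here is an assumption, since the specification does not state which scaling the DCT unit 22 applies, and the function name `dct_8x8` is hypothetical.

```python
import math

N = 8  # block size in pixels


def dct_8x8(block):
    """Direct 2-D DCT-II of an 8x8 block (orthonormal scaling, assumed)."""

    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


# A flat block has energy only in the DC coefficient.
flat = [[10] * N for _ in range(N)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 6))
```

A practical encoder would use a fast factorized DCT, but the direct form makes the definition easy to check.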

The scan unit 24 scans the DCT coefficients input from the DCT unit 22 in a predetermined order to generate scanned DCT coefficients, and outputs the generated scanned DCT coefficients to the hierarchical coding unit 26.

With reference to FIG. 1, the hierarchical coding unit 26 in the first embodiment will be described. As shown in FIG. 1, the hierarchical coding unit 26 has a zero run length coding unit 30, a variable length coding unit 32, and a state prediction unit 34.

The zero run length coding unit 30 performs zero run length coding on the scanned DCT coefficients input from the scan unit 24, converting each bit plane into combinations of a zero run length and a bit plane end signal, and outputs them to the variable length coding unit 32.

The variable length coding unit 32 uses the state prediction parameter input from the state prediction unit 34 to variable-length code the combinations of zero run length and bit plane end signal input from the zero run length coding unit 30. The variable-length coding unit 32 outputs the enhancement layer stream generated by the encoding to the enhancement layer output unit 28.

The state prediction unit 34 generates a state prediction parameter by predicting the state of the coding information, i.e., the bits being encoded, from the scanned DCT coefficients input from the scan unit 24 to the hierarchical coding unit 26, and outputs the generated state prediction parameter to the variable-length coding unit 32. The state prediction unit 34 corresponds to the state prediction unit of the present invention, and the variable length coding unit 32 corresponds to the bit plane coding unit.

The operation of the video coding apparatus 10 configured as described above will be described. FIG. 3 is a flow chart showing an example of the operation of the video encoding apparatus 10 of the first embodiment shown in FIGS. 1 and 2. The operation shown in the flowchart of FIG. 3 can also be implemented in software by having a CPU (not shown) execute a control program stored in a storage device (not shown; for example, ROM or flash memory).

[0141] First, the video encoding device 10 performs video signal input processing (S10). Specifically, the video signal input unit 12 receives one frame of video from outside the video encoding device 10 as an original image and outputs it to the base layer coding unit 14 and the enhancement layer coding unit 18.

Next, the video encoding device 10 performs base layer encoding processing (S12). Specifically, the base layer coding unit 14 codes the original image input from the video signal input unit 12 to generate a base layer stream, and outputs the generated base layer stream to the base layer output unit 16. The base layer coding unit 14 also decodes the base layer stream to generate a base layer decoded image, and outputs the generated base layer decoded image to the enhancement layer coding unit 18.

As an encoding method of the base layer, an existing method such as MPEG-4 AVC is used. The base layer decoded image does not have to be generated by decoding the base layer stream; a decoded image produced during the intermediate processing that generates the base layer stream may be used instead, provided that an identical image is obtained.

Next, the video encoding device 10 performs differential processing (S14). Specifically, the difference unit 20 of the enhancement layer coding unit 18 calculates the difference between the original image input from the video signal input unit 12 and the base layer decoded image input from the base layer coding unit 14. As a result, a difference image is generated, and the generated difference image is output to the DCT unit 22.

Next, the video encoding device 10 performs a DCT process (S16). Specifically, the DCT unit 22 divides the difference image input from the difference unit 20 into blocks of 8 × 8 pixels, performs a DCT for each block to generate DCT coefficients, and outputs the DCT coefficients to the scan unit 24. The block size used for the DCT is not limited to 8 × 8 pixels. The method of frequency-transforming the video is not limited to the DCT, and another orthogonal transform such as a wavelet transform may be used. In addition, although the coding efficiency is lower, the orthogonal transform may be omitted altogether.

Next, the video encoding device 10 performs a scan process (S18). Specifically, the scanning unit 24 scans the DCT coefficients input from the DCT unit 22 in a predetermined order to generate scanned DCT coefficients, and outputs the generated scanned DCT coefficients to the hierarchical coding unit 26. The scan performed by the scan unit 24 is not limited to the order shown in FIG. 16A, and may be performed in another order.
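One common choice of "predetermined order" is the zigzag scan used in JPEG and MPEG coding. The Python sketch below assumes that order purely for illustration, because the scan order of the scan unit 24 is defined by the figure and is not reproduced in the text. The function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; odd diagonals run top to bottom, even ones bottom to top.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def scan(block):
    """Flatten a square block of coefficients into a 1-D list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order()
print(order[:6])
```

With this order, low-frequency coefficients come first and high-frequency coefficients last, which is what makes long zero runs toward the end of each bit plane likely.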

Instead of scanning DCT coefficients as they are, scanning may be performed after quantization processing to reduce the values by dividing the DCT coefficients by a predetermined number. In that case, it is necessary to perform inverse quantization processing for decoding.
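A minimal sketch of this optional quantization and the matching inverse quantization follows. The specification only says that the coefficients are divided by a predetermined number, so the truncation toward zero and the function names are assumptions of this sketch.

```python
def quantize(coeffs, step):
    """Divide each coefficient by step, truncating toward zero (assumed rounding)."""
    return [int(c / step) for c in coeffs]


def dequantize(levels, step):
    """Inverse quantization performed on the decoder side."""
    return [q * step for q in levels]


levels = quantize([17, -9, 3, 0], step=4)
restored = dequantize(levels, step=4)
print(levels, restored)
```

The restored values differ from the originals by less than one quantization step, which is the loss traded for smaller coefficient values.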

Next, the video encoding device 10 performs hierarchical encoding processing (S20). Specifically, the hierarchical encoding unit 26 encodes the scanned DCT coefficients input from the scanning unit 24 to generate an enhancement layer stream, and outputs the generated enhancement layer stream to the enhancement layer output unit 28.

The hierarchical encoding process (S20) will be described in detail. FIG. 4 is a flowchart showing an example of the hierarchical encoding process (S20).

First, the hierarchical encoding unit 26 performs zero run length encoding processing (S30). Specifically, the zero run length coding unit 30 zero-run-length encodes the scanned DCT coefficients input from the scan unit 24 to the hierarchical coding unit 26, converting each bit plane into several combinations of a zero run length and a bit plane end signal, and outputs them to the variable-length coding unit 32. A bit plane end signal of ON indicates that the "1" following the zero run is the last "1" of the bit plane; OFF indicates that it is not the last “1”.
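The conversion of one bit plane into (zero run length, bit plane end signal) pairs in step S30 can be sketched in Python as follows. The function names are hypothetical, and how an all-zero bit plane is signalled is not covered by this sketch.

```python
def zero_run_encode(plane):
    """Convert one bit plane into (zero_run, end_flag) pairs.

    zero_run counts the "0"s before each "1"; end_flag is True (ON) for the
    last "1" of the plane. An all-zero plane yields an empty list here.
    """
    ones = [i for i, b in enumerate(plane) if b == 1]
    symbols = []
    prev = -1
    for k, pos in enumerate(ones):
        symbols.append((pos - prev - 1, k == len(ones) - 1))
        prev = pos
    return symbols


def zero_run_decode(symbols, length):
    """Rebuild the bit plane; the trailing "0"s after the last "1" are implied."""
    plane = []
    for run, _end in symbols:
        plane.extend([0] * run + [1])
    plane.extend([0] * (length - len(plane)))
    return plane


plane = [0, 0, 1, 0, 1, 0, 0, 0]
symbols = zero_run_encode(plane)
print(symbols)
```

The three trailing "0"s are never coded explicitly: the end flag on the second pair tells the decoder that the rest of the plane is zero.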

[0150] Zero run length coding may also be applied to the scanned DCT coefficients with a plurality of bit planes grouped together.

In this case, in addition to the zero run length and the bit plane end signal, a level signal indicating which value appeared after the run of “0”s is required. For example, when three bit planes are grouped together and subjected to zero run length coding, a value from “1” to “7” other than “0” may appear after the zero run, so a level signal indicating that value is needed. In the present embodiment, only the zero run length and the bit plane end signal are described for simplicity of description.

Next, the hierarchical coding unit 26 performs state prediction processing (S32). Specifically, the state prediction unit 34 generates a state prediction parameter from the scanned DCT coefficients input from the scan unit 24 to the hierarchical coding unit 26, and outputs the generated state prediction parameter to the variable-length coding unit 32. The state prediction parameter generated by the state prediction unit 34 is, for example, the zero run length formed by a number of bits included in the coding information, the probability that each bit included in the coding information is “1”, or the probability that the bit plane end signal is ON.

FIG. 5 is a diagram showing an example of the state prediction of the present invention. FIG. 5 arranges the DCT coefficients of a certain block in scan order along the horizontal axis and in units of bits along the vertical axis. For convenience, the distance between two points on the scan axis is called the scan distance, and the distance from the left edge of the scan axis is called the scan coordinate. Region C in FIG. 5 indicates the bit group subject to state prediction. Bit B4, the leftmost bit of region C, is the bit currently being encoded.

In the present embodiment, prediction in the case where the variable length coding unit 32 performs Huffman coding will be described. In the case of Huffman coding, a zero run length is predicted. In the present embodiment, for coding information of a DCT coefficient in which the MSB has appeared in the upper bits, for example bits B7 and B8 in FIG. 5, the probability that the value is “1” is predicted to be higher than for bits of DCT coefficients in which the MSB has not appeared in the upper bits, such as bits B4 and B5, for example 50%. Conversely, for coding information of a DCT coefficient in which the MSB has not appeared, for example bits B4 and B5 in FIG. 5, the probability that the value is "1" is predicted to be lower than 50%.

As an example, consider predicting the states of a plurality of bits starting from bit B4. Assume that the probability that a bit is “1” is 50% when the MSB has appeared in an upper bit plane and 25% when it has not, and that the predicted zero run length is the number of consecutive “0”s counted up to the point where the probability that the run of “0”s continues falls below 40%. The MSB has not appeared above bit B4, the leftmost bit of region C, so the probability that it is “0” is 75%. The same holds for bit B5 to its right, so the probability that the run is 2 or more is 75% × 75% = 56.25%. The MSB has not appeared above the next bit B6 either, so its probability of being “0” is 75%, and the probability of three consecutive “0”s is 56.25% × 75% = 42.1875%. The next bit B7 has a 50% probability of being “0” because its MSB has appeared in an upper bit plane. The probability that the run of “0”s continues through bit B7 is therefore 42.1875% × 50% = 21.09375% < 40%, so the predicted zero run length is determined to be 3.
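The worked example above can be reproduced with a short sketch. The function name, its parameters, and the list-of-flags input are illustrative assumptions, not part of the specification; the 50%, 25%, and 40% values are the example figures from the text.

```python
from typing import Sequence


def predict_zero_run_length(
    msb_seen: Sequence[bool],
    p_one_after_msb: float = 0.50,   # example value from the text
    p_one_before_msb: float = 0.25,  # example value from the text
    threshold: float = 0.40,         # example value from the text
) -> int:
    """Predict the zero run length for bits listed in scan order.

    msb_seen[i] is True when the MSB of the i-th coefficient has already
    appeared in an upper bit plane (hypothetical input representation).
    """
    p_run_continues = 1.0
    run_length = 0
    for seen in msb_seen:
        p_zero = 1.0 - (p_one_after_msb if seen else p_one_before_msb)
        p_run_continues *= p_zero
        # Stop counting once the run is unlikely to extend this far.
        if p_run_continues < threshold:
            break
        run_length += 1
    return run_length


# Bits B4, B5, B6 (MSB not yet seen) followed by B7 (MSB seen).
print(predict_zero_run_length([False, False, False, True]))  # prints 3
```

The running product reaches 0.421875 after B6 and 0.2109375 after B7, which is the first value below the threshold, so the count stops at 3.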

[0155] Because DCT coefficients often have an absolute value larger than 0, the lower the bit plane, the higher the probability that the MSB has already appeared in a higher bit plane. The probability that “1” appears therefore increases toward the lower bit planes, so the state prediction unit 34 may predict a shorter zero run length as the bit position becomes lower. The zero run length may also be set shorter as the number of MSBs included in the bit planes higher than the bit plane to which the zero run belongs increases.

High-frequency DCT coefficients often have small absolute values, and the probability that coding information of a DCT coefficient whose MSB has not appeared is “1” is low. Therefore, the larger the scan coordinate of the bit at which the zero run starts, the lower the probability that a “1” bit appears, so the state prediction unit 34 may lengthen the predicted zero run length.

The prediction of the bit plane end signal will be described. In the case of Huffman coding, the state prediction unit 34 predicts a higher probability that the bit plane end signal is ON for bits further back on the scan axis. The state prediction unit 34 predicts a low probability of ON if the scan coordinate of the coding information is smaller than that of the MSB located furthest back on the scan axis in the upper bit planes, and a high probability of ON if it is larger. For example, in FIG. 5, the ON probability is predicted to be low if the scan coordinate is smaller than that of bit B1 and high if it is larger.

If a DCT coefficient with a fairly large scan coordinate contains a “1” in a bit plane above the coding information, the bit plane end signal need not be used for the bit plane to which the coding information belongs. For example, when the bit plane end signal of the upper bit plane occurs at the 62nd DCT coefficient of an 8 × 8 DCT, the number of “0”s that a bit plane end signal could omit in the lower bit planes is small, so coding efficiency can be improved by not coding the bit plane end signal.

When predicting the state of the coding information, the state prediction unit 34 according to the present embodiment performs prediction using only the upper bit planes in the block to which the coding information belongs. As a result, when a user decoding the video stream decodes only a specific block, only the enhancement layer of that block needs to be decoded, so the decoding load is lower than when the whole stream is decoded. In addition, bit rate can be saved by transmitting only part of the video stream over the network.

Next, the hierarchical coding unit 26 performs variable-length coding processing (S34). Specifically, the variable-length coding unit 32 performs variable-length coding on the combination of the zero run length and the bit plane end signal input from the zero run length coding unit 30, using the state prediction parameter input from the state prediction unit 34. The hierarchical coding unit 26 outputs the enhancement layer stream generated by the coding to the enhancement layer output unit 28.

Here, the case where the variable-length coding unit 32 performs Huffman coding will be described. In this case, the video encoding device 10 holds a plurality of Huffman tables in advance. A Huffman table improves coding efficiency by assigning short codes to combinations that appear frequently and long codes to combinations that appear rarely.

Therefore, when coding a combination whose zero run length is predicted to be long, the variable-length coding unit 32 selects a Huffman table in which short codes are assigned to combinations having long zero run lengths, and performs Huffman coding. Conversely, when coding a combination whose zero run length is predicted to be short, it selects a Huffman table in which short codes are assigned to combinations having short zero run lengths, and performs Huffman coding.

[0163] When coding a combination for which the bit plane end signal is predicted likely to be OFF, the variable-length coding unit 32 selects a Huffman table in which short codes are assigned to combinations whose bit plane end signal is OFF, and performs Huffman coding. Conversely, when coding a combination for which the bit plane end signal is predicted likely to be ON, it selects a Huffman table in which short codes are assigned to combinations whose bit plane end signal is ON, and performs Huffman coding. The hierarchical encoding process (S20) has been described above.
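As a rough illustration of this table switching, the following sketch builds four toy Huffman tables and picks one from the predicted zero run length and the predicted probability that the bit plane end signal is ON. Everything here is an assumption made for illustration: the symbol set (runs 0 to 3), the weights used to bias each table, the thresholds, and the function names. The specification only states that several tables are prepared in advance and that one is selected according to the prediction.

```python
import heapq
from itertools import count


def huffman_code(weights):
    """Build a prefix code {symbol: bitstring} from {symbol: weight}."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), {s: ""}) for s, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


def make_table(favor_long_run, favor_end_on, max_run=3):
    """Toy table over symbols (zero_run, end_on); weights are hypothetical."""
    weights = {}
    for run in range(max_run + 1):
        for end_on in (False, True):
            w_run = (run + 1) if favor_long_run else (max_run + 1 - run)
            w_end = 3 if end_on == favor_end_on else 1
            weights[(run, end_on)] = w_run * w_end
    return huffman_code(weights)


# Tables prepared in advance, indexed by (long run expected, end ON expected).
TABLES = {
    (long_run, end_on): make_table(long_run, end_on)
    for long_run in (False, True)
    for end_on in (False, True)
}


def select_table(predicted_run, p_end_on, run_threshold=2, p_threshold=0.5):
    """Pick a table from the state prediction parameters (assumed thresholds)."""
    return TABLES[(predicted_run >= run_threshold, p_end_on >= p_threshold)]


table = select_table(predicted_run=3, p_end_on=0.1)
# The favored combination gets a shorter code than the least favored one.
print(len(table[(3, False)]) < len(table[(0, True)]))  # prints True
```

A decoder running the same prediction on already-decoded bits would select the same table, which is why no table index needs to be transmitted.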

Returning to FIG. 3, the video encoding device 10 performs stream output processing (S22). Specifically, the base layer output unit 16 outputs the base layer stream input from the base layer coding unit 14 to the outside of the video encoding device 10. The enhancement layer output unit 28 outputs the enhancement layer stream input from the enhancement layer coding unit 18 to the outside of the video encoding device 10.

Next, the video encoding device 10 performs an end determination process (S24). Specifically, the video signal input unit 12 determines the presence or absence of the video input from the outside of the video encoding device 10. As a result of the determination, if there is no video input, the processing is ended, and if there is a video input, the processing returns to the video input processing (S10). The video coding apparatus according to the first embodiment of the present invention has been described above.

[0166] The video encoding device 10 according to the first embodiment uses the state of the MSBs in the higher bit planes of the enhancement layer to predict, for the coding information of the enhancement layer, the zero run length or the probability of being “1”, and the probability that the bit plane end signal is ON. By switching the Huffman table used for Huffman coding based on the prediction, it can improve the coding efficiency of variable-length coding.

According to the first embodiment, since the video encoding device 10 predicts the state of code information from the upper bit plane belonging to the same block, independence between blocks is maintained. It is possible to code only the block of interest of the user performing decoding. This can reduce the coding processing load. It is also possible to save bitrate by transmitting only the video stream that corresponds to the block of interest.

[0168] In the first embodiment described above, an example in which the variable-length coding unit 32 performs Huffman coding has been described, but the variable-length coding unit 32 may instead perform arithmetic coding. State prediction in the case of arithmetic coding will be described.

In the case of arithmetic coding, the state prediction unit 34 predicts the probability that the coding information is “1”. The state prediction unit 34 predicts that the probability that a lower bit of a DCT coefficient whose MSB has appeared in an upper bit plane is “1” is 50%. For other coding information, the probability of being “1” is predicted to be higher in lower bit planes. If the number of MSBs included in the bit planes higher than the bit plane to which the coding information belongs is large, the probability of being “1” may be increased. The probability of being “1” may be reduced as the scan coordinate of the bit becomes larger.

When coding is performed by arithmetic coding, the state prediction unit 34 predicts, for each bit, the probability that the bit plane end signal is ON. As in the case of Huffman coding, if the scan coordinate is smaller than that of the MSB located furthest back on the scan axis in the upper bit planes, the probability of ON is predicted to be low, and if it is larger, the probability of ON is predicted to be high. For example, in FIG. 5, the ON probability is predicted to be low if the scan coordinate is smaller than that of bit B1 and high if it is larger.

The variable-length coding unit 32 performs variable-length coding using the state prediction parameters generated by the state prediction unit 34. The probability that the coding information is “1” and the probability that the bit plane end signal is ON are used as the symbol appearance probabilities required for arithmetic coding. Whether the coding information is “0” or “1” may be arithmetically coded with two kinds of symbols, and whether the bit plane end signal of coding information “1” is ON or OFF may be coded by a separate arithmetic coder with another two kinds of symbols. Alternatively, the coding information may be arithmetically coded with three kinds of symbols: “0”, “1” with the bit plane end signal OFF, and “1” with the bit plane end signal ON.
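To make the three-symbol variant concrete, the sketch below turns the two predicted probabilities into a three-symbol model and estimates the ideal arithmetic-coding cost of a symbol sequence as the sum of -log2(p). The mapping from the two probabilities to three symbol probabilities (treating the end-signal probability as conditional on the bit being “1”) and all names are assumptions for illustration; the specification does not give a formula.

```python
import math


def three_symbol_model(p_one, p_end_on):
    """Assumed mapping from the state prediction parameters to symbol
    probabilities; p_end_on is taken as conditional on the bit being 1.
    Both inputs must lie strictly between 0 and 1."""
    return {
        "0": 1.0 - p_one,
        "1/end-off": p_one * (1.0 - p_end_on),
        "1/end-on": p_one * p_end_on,
    }


def ideal_cost_bits(symbols, model):
    """Ideal arithmetic-coding cost: sum of -log2(probability)."""
    return sum(-math.log2(model[s]) for s in symbols)


# A short bit plane: three zeros, then the final 1 of the plane.
plane = ["0", "0", "0", "1/end-on"]

well_predicted = three_symbol_model(p_one=0.25, p_end_on=0.5)
flat = three_symbol_model(p_one=0.5, p_end_on=0.5)

# A model that matches the data better yields a smaller cost.
print(ideal_cost_bits(plane, well_predicted) < ideal_cost_bits(plane, flat))
```

The comparison illustrates why a better state prediction lowers the code amount: the closer the assumed probabilities are to the actual bit statistics, the fewer bits an arithmetic coder spends.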

Second Embodiment

In the second embodiment, a video decoding device will be described that predicts, from previously decoded information, a state prediction parameter representing the state of the decoding information (the information currently being decoded) and performs decoding based on that parameter. The video decoding device of the second embodiment decodes the video stream generated by the video encoding device of the first embodiment. FIG. 6 is a block diagram showing the configuration of the video decoding device 40 according to the second embodiment, and FIG. 7 is a diagram showing the configuration of the hierarchical decoding unit 50. In FIG. 6, the video decoding device 40 has a base layer input unit 42, a base layer decoding unit 44, an enhancement layer input unit 46, an enhancement layer decoding unit 48, and a video signal output unit 58. The enhancement layer decoding unit 48 includes a hierarchical decoding unit 50, an inverse scan unit 52, an inverse DCT unit 54, and an addition unit 56.

[0174] The base layer input unit 42 receives the base layer stream from the outside of the video decoding device 40 and outputs it to the base layer decoding unit 44. The base layer input unit 42 determines whether a base layer stream is being input from the outside, and ends the processing if there is no input.

[0175] The base layer decoding unit 44 decodes the base layer stream input from the base layer input unit 42 to generate a base layer decoded image, and outputs the generated base layer decoded image to the enhancement layer decoding unit 48 and the video signal output unit 58.

[0176] The enhancement layer input unit 46 receives the enhancement layer stream from the outside of the video decoding device 40 and outputs it to the enhancement layer decoding unit 48.

The video signal output unit 58 outputs the base layer decoded image input from the base layer decoding unit 44 and the enhancement layer decoded image input from the enhancement layer decoding unit 48 to the outside of the video decoding device 40.

[0178] The inverse scan unit 52 performs an inverse scan that rearranges the scanned DCT coefficients input from the hierarchical decoding unit 50 into a predetermined order to generate DCT coefficients, and outputs the generated DCT coefficients to the inverse DCT unit 54.

[0179] The inverse DCT unit 54 performs an inverse DCT, block by block, on the DCT coefficients input from the inverse scan unit 52 and combines the inverse-transformed blocks to generate a differential decoded image. The inverse DCT unit 54 outputs the generated differential decoded image to the addition unit 56.

[0180] The addition unit 56 adds the base layer decoded image input from the base layer decoding unit 44 and the differential decoded image input from the inverse DCT unit 54 to generate an enhancement layer decoded image. The addition unit 56 outputs the generated enhancement layer decoded image to the video signal output unit 58.

FIG. 7 is a block diagram showing the configuration of the hierarchical decoding unit 50 in the second embodiment. The hierarchical decoding unit 50 includes a variable-length decoding unit 60, a zero run length decoding unit 62, and a state prediction unit 64.

The variable-length decoding unit 60 performs variable-length decoding on the enhancement layer stream input from the enhancement layer input unit 46, using the state prediction parameters input from the state prediction unit 64, to generate a combination of a zero run length and a bit plane end signal. The variable-length decoding unit 60 outputs the generated combination of the zero run length and the bit plane end signal to the zero run length decoding unit 62.

The zero run length decoding unit 62 decodes the combination of the zero run length and the bit plane end signal input from the variable-length decoding unit 60 to generate scanned DCT coefficients. The zero run length decoding unit 62 outputs the generated scanned DCT coefficients to the inverse scan unit 52.
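A minimal sketch of zero run length decoding for a single bit plane is given below, assuming each combination means "this many 0 bits, then one 1 bit", and that an ON bit plane end signal marks that 1 as the last 1 in the plane. The function name, the tuple representation, and the default plane length of 64 (one 8 × 8 block) are illustrative assumptions. Handling of an all-zero plane and of grouped bit planes with level signals is outside this sketch.

```python
def decode_bit_plane(combinations, plane_length=64):
    """Rebuild one bit plane (in scan order) from (zero_run, end_on) pairs."""
    bits = []
    for zero_run, end_on in combinations:
        bits.extend([0] * zero_run)  # the run of zeros
        bits.append(1)               # the 1 that terminates the run
        if end_on:
            break                    # no further 1s in this bit plane
    if len(bits) > plane_length:
        raise ValueError("combinations overrun the bit plane")
    # Everything after the last coded 1 is zero.
    bits.extend([0] * (plane_length - len(bits)))
    return bits


# Two zeros then a 1, followed immediately by the final 1 of the plane.
print(decode_bit_plane([(2, False), (0, True)], plane_length=8))
# prints [0, 0, 1, 1, 0, 0, 0, 0]
```

Stacking the decoded planes from the most significant to the least significant bit would then give the scanned DCT coefficient magnitudes.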

The state prediction unit 64 uses the scanned DCT coefficients input from the zero run length decoding unit 62 to predict, for the decoding information being decoded by the variable-length decoding unit 60, the zero run length or the probability of being “1”, and the state of the bit plane end signal, thereby generating a state prediction parameter. The state prediction unit 64 outputs the generated state prediction parameter to the variable-length decoding unit 60.

The operation of the video decoding apparatus 40 configured as described above will be described.

FIG. 8 is a flowchart showing an example of the operation of the video decoding device 40 according to the second embodiment. Note that the flowchart shown in FIG. 8 can also be executed in software by having a CPU (not shown) execute a control program stored in a storage device (not shown) such as a ROM or a flash memory.

First, the video decoding device 40 performs stream input processing (S40). Specifically, the base layer input unit 42 receives the base layer stream from the outside of the video decoding device 40 and outputs it to the base layer decoding unit 44. The enhancement layer input unit 46 receives the enhancement layer stream from the outside of the video decoding device 40 and outputs it to the enhancement layer decoding unit 48.

Next, the video decoding device 40 performs base layer decoding processing (S42). Specifically, the base layer decoding unit 44 decodes the base layer stream input from the base layer input unit 42 to generate a base layer decoded image. The base layer decoding unit 44 outputs the generated base layer decoded image to the enhancement layer decoding unit 48 and the video signal output unit 58.

Next, the video decoding device 40 performs hierarchical decoding processing (S44). Specifically, the enhancement layer decoding unit 48 decodes the enhancement layer stream input from the enhancement layer input unit 46 to generate an enhancement layer decoded image. The enhancement layer decoding unit 48 outputs the generated enhancement layer decoded image to the video signal output unit 58.

FIG. 9 is a flowchart illustrating an example of the hierarchical decoding process (S44). First, the hierarchical decoding unit 50 performs state prediction processing (S60). Specifically, the state prediction unit 64 uses the scanned DCT coefficients input from the zero run length decoding unit 62 to predict, for the decoding information being decoded by the variable-length decoding unit 60, the zero run length or the probability of being “1”, and the state of the bit plane end signal. It thereby generates a state prediction parameter and outputs the generated state prediction parameter to the variable-length decoding unit 60. The prediction performed by the state prediction unit 64 is the same as that performed in the first embodiment. The state prediction parameter generated by the state prediction unit 64 consists of either the zero run length formed by consecutive bits including the decoding information or the probability that each bit included in the decoding information is “1”, together with the probability that the bit plane end signal is ON. To predict the state of the decoding information, the state prediction unit 64 must always receive the latest scanned DCT coefficients from the zero run length decoding unit 62.

[0190] Next, the hierarchical decoding unit 50 performs variable-length decoding processing (S62). Specifically, the variable-length decoding unit 60 performs variable-length decoding on the enhancement layer stream input from the enhancement layer input unit 46 using the state prediction parameters input from the state prediction unit 64. Generate a combination of zero run length and bit plane end signal. The variable length decoding unit 60 outputs the combination of the generated zero run length and bit plane end signal to the zero run length decoding unit 62. The method of selecting a Huffman table used for variable length decoding based on the state prediction parameter is the same as that of the first embodiment.

[0191] Next, the hierarchical decoding unit 50 performs zero run length decoding processing (S64). Specifically, the zero run length decoding unit 62 decodes the combination of the zero run length and the bit plane end signal input from the variable-length decoding unit 60 to generate scanned DCT coefficients. The zero run length decoding unit 62 outputs the generated scanned DCT coefficients to the inverse scan unit 52. The hierarchical decoding process (S44) has been described above.

Returning to FIG. 8, the video decoding device 40 performs inverse scan processing (S46). Specifically, the inverse scan unit 52 performs an inverse scan that rearranges the scanned DCT coefficients input from the hierarchical decoding unit 50 into a predetermined order to generate DCT coefficients, and outputs the generated DCT coefficients to the inverse DCT unit 54.

Next, the video decoding device 40 performs inverse DCT processing (S48). Specifically, the inverse DCT unit 54 performs an inverse DCT for each block on the DCT coefficients input from the inverse scan unit 52, and combines the inverse-transformed blocks to generate a differential decoded image. The inverse DCT unit 54 outputs the generated differential decoded image to the addition unit 56.

Next, the video decoding device 40 performs addition processing (S50). Specifically, the addition unit 56 adds the base layer decoded image, which the base layer decoding unit 44 input to the enhancement layer decoding unit 48, and the differential decoded image input from the inverse DCT unit 54 to generate an enhancement layer decoded image. The addition unit 56 outputs the generated enhancement layer decoded image to the video signal output unit 58.

Next, the video decoding device 40 performs video signal output processing (S52). Specifically, the video signal output unit 58 outputs the base layer decoded image input from the base layer decoding unit 44 and the enhancement layer decoded image input from the enhancement layer decoding unit 48 to the outside of the video decoding device 40. The video signal output unit 58 may output only one of the base layer decoded image and the enhancement layer decoded image to the outside.

Next, the video decoding device 40 performs an end determination process (S54). Specifically, the base layer input unit 42 determines whether a base layer stream is being input from the outside. If there is no input of the base layer stream, the processing ends; otherwise, the processing returns to the stream input processing (S40). The video decoding device 40 according to the second embodiment of the present invention has been described above.

According to the second embodiment, the video decoding device 40 uses the state of the MSBs in the bit planes above the bit being decoded to predict the zero run length or the probability that the value is “1”, and the probability that the bit plane end signal is ON, and switches the Huffman table used for Huffman decoding based on the prediction. This reduces the code amount needed to decode the decoding information and makes it possible to improve image quality.

According to the second embodiment, since the video decoding device 40 predicts the state of the decoding information from the upper bit planes belonging to the same block, independence between blocks is maintained, and it is possible to decode only the block of interest to the user performing decoding. This lightens the decoding load. It is also possible to save bit rate by transmitting only the part of the video stream that corresponds to the block of interest.

In the second embodiment described above, an example in which Huffman decoding is used as the decoding method has been described. However, if the coded data was generated by arithmetic coding, the variable-length decoding unit 60 performs arithmetic decoding. In this case, arithmetic decoding is performed by changing the appearance probabilities of the symbols used for arithmetic decoding based on the predicted bit state.

Third Embodiment

In the third embodiment, a video encoding device for a scalable video coding system will be described that predicts the state of the enhancement layer using the previously encoded information of the base layer and performs coding accordingly. Based on the information previously encoded and stored in the base layer video stream, a state prediction parameter representing the state of the enhancement layer is generated, and the coding information is coded based on the generated state prediction parameter.

FIG. 10 is a block diagram showing the configuration of a video encoding device 70 according to the third embodiment of the present invention. In FIG. 10, the video encoding device 70 has a video signal input unit 72, a base layer coding unit 74, a base layer output unit 76, an enhancement layer coding unit 78, and an enhancement layer output unit 90. The enhancement layer coding unit 78 includes a difference unit 80, a DCT unit 82, a scan unit 84, a hierarchical coding unit 86, and a state prediction unit 88. The configuration of the video encoding device 70 according to the third embodiment is basically the same as that of the video encoding device 10 according to the first embodiment, but the functions of the hierarchical coding unit 86 and the state prediction unit 88 differ from those of the first embodiment.

The video signal input unit 72 receives video from the outside of the video encoding device 70 frame by frame as an original image and outputs it to the base layer coding unit 74 and the enhancement layer coding unit 78. It also determines whether there is a video input from the outside of the video encoding device 70, and ends the processing if there is no video input.

[0203] The base layer coding unit 74 encodes the original image input from the video signal input unit 72 to generate a base layer stream, and outputs the generated base layer stream to the base layer output unit 76. Further, the base layer coding unit 74 decodes the base layer stream to generate a base layer decoded image, and outputs the generated base layer decoded image to the enhancement layer coding unit 78.

The base layer output unit 76 outputs the base layer stream input from the base layer coding unit 74 to the outside of the video encoding device 70.

[0205] The enhancement layer output unit 90 outputs the enhancement layer stream input from the enhancement layer coding unit 78 to the outside of the video encoding device 70.

The difference unit 80 generates a difference image by calculating the difference between the original image input from the video signal input unit 72 and the base layer decoded image input from the base layer coding unit 74. The difference image is output to the DCT unit 82.

The DCT unit 82 divides the difference image input from the difference unit 80 into blocks, each an area of 8 × 8 pixels, performs a DCT for each block, and generates DCT coefficients. The DCT coefficients are output to the scan unit 84.

[0208] The scan unit 84 scans the DCT coefficients input from the DCT unit 82 in a predetermined order to generate scanned DCT coefficients, and outputs the generated scanned DCT coefficients to the hierarchical coding unit 86.

The state prediction unit 88 generates a state prediction parameter based on the base layer decoded image input from the base layer coding unit 74, and outputs the generated state prediction parameter to the hierarchical coding unit 86.

Next, the operation of the video coding device 70 configured as described above will be described.

FIG. 11 is a flowchart showing an example of the operation of the video encoding device 70 of the third embodiment. Note that the flowchart shown in FIG. 11 can also be executed in software by having a CPU (not shown) execute a control program stored in a storage device (not shown) such as a ROM or a flash memory.

First, the video encoding device 70 performs video signal input processing (S70). Specifically, the video signal input unit 72 receives video from the outside of the video encoding device 70 frame by frame as an original image and outputs it to the base layer coding unit 74 and the enhancement layer coding unit 78.

Next, the video encoding device 70 performs base layer encoding processing (S72). Specifically, the base layer coding unit 74 encodes the original image input from the video signal input unit 72 to generate a base layer stream, and outputs the generated base layer stream to the base layer output unit 76. The base layer coding unit 74 also decodes the base layer stream to generate a base layer decoded image, and outputs the generated base layer decoded image to the enhancement layer coding unit 78. As in the first embodiment, the base layer is coded with an existing method such as MPEG-4 AVC.

Next, the video encoding device 70 performs differential processing (S 74). Specifically, the difference unit 80 obtains a difference between the original image input from the video signal input unit 72 and the base layer decoded image input from the base layer coding unit 74 to generate a difference image. The difference unit 80 outputs the generated difference image to the DCT unit 82.

[0214] Next, the video encoding device 70 performs state prediction processing (S76). Specifically, the state prediction unit 88 generates a state prediction parameter based on the base layer decoded image input from the base layer coding unit 74, and outputs the generated state prediction parameter to the hierarchical coding unit 86.

In its prediction, the state prediction unit 88 divides the base layer decoded image into blocks matching the processing blocks of the DCT unit 82, examines the characteristics of each block, and predicts the state of the DCT coefficients of the corresponding block of the enhancement layer. The characteristics of a block are the code amount produced when the block was coded into the base layer stream and the amount of edges contained in the base layer decoded image of that block. When a block contains many edges, its code amount is also large. The edge amount is calculated using a known Roberts filter or Sobel filter, or by applying a DCT and taking the sum of the absolute values of the DCT coefficients. The state of a block's enhancement layer is the amount of edges contained in the corresponding block of the difference image. When a block of the difference image contains many edges, coefficients with large absolute values appear at high frequencies when an orthogonal transform such as a DCT is applied. The state prediction unit 88 predicts that the more edges a block of the base layer decoded image contains, the larger the values the DCT coefficients of the enhancement layer contain at high frequencies. The bias of the enhancement layer DCT coefficients toward high-frequency coefficients is used as the state prediction parameter.

Next, the video encoding device 70 performs DCT processing (S78). Specifically, the DCT unit 82 divides the difference image input from the difference unit 80 into blocks, each an area of 8 × 8 pixels, performs a DCT for each block, and generates DCT coefficients. The DCT coefficients are output to the scan unit 84.
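The edge-amount measure used by the state prediction unit 88 can be sketched with a Roberts cross filter, one of the filters named in the text. The function names, the use of nested lists for a block, and the threshold that turns the edge amount into a "large" or "small" bias are hypothetical choices made only for illustration.

```python
def roberts_edge_amount(block):
    """Sum of |gx| + |gy| of the Roberts cross operator over a block.

    block is a list of rows of pixel values (for example 8 x 8).
    """
    total = 0
    for y in range(len(block) - 1):
        for x in range(len(block[y]) - 1):
            gx = block[y][x] - block[y + 1][x + 1]      # one diagonal
            gy = block[y][x + 1] - block[y + 1][x]      # other diagonal
            total += abs(gx) + abs(gy)
    return total


def high_frequency_bias(block, threshold=1000):
    """Map the edge amount to a coarse prediction (threshold is assumed)."""
    return "large" if roberts_edge_amount(block) >= threshold else "small"


flat_block = [[128] * 8 for _ in range(8)]
edge_block = [[0] * 4 + [255] * 4 for _ in range(8)]  # vertical step edge

print(roberts_edge_amount(flat_block), high_frequency_bias(flat_block))
# prints 0 small
print(roberts_edge_amount(edge_block), high_frequency_bias(edge_block))
# prints 3570 large
```

Because the measure is computed from the base layer decoded image, a decoder that has the same base layer could derive the same prediction without side information.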

[0217] Next, the video encoding device 70 performs scan processing (S80). Specifically, the scan unit 84 scans the DCT coefficients input from the DCT unit 82 in a predetermined order to generate scanned DCT coefficients, and outputs the generated scanned DCT coefficients to the hierarchical coding unit 86.

Next, the video encoding device 70 performs hierarchical coding processing (S82). Specifically, the hierarchical coding unit 86 encodes the scanned DCT coefficients input from the scan unit 84 using the state prediction parameters input from the state prediction unit 88, and outputs the resulting enhancement layer stream to the enhancement layer output unit 90.

[0219] In the coding by the hierarchical coding unit 86, the scanned DCT coefficients for each block are subjected to zero run length coding for each bit plane to generate a combination of a zero run length and a bit plane end signal, Variable length coding is performed using state prediction parameters.

Consider the case in which the hierarchical coding unit 86 performs Huffman coding as the variable-length coding. When the state prediction unit 88 predicts that the bias of the DCT coefficients toward high frequencies is large, the “1” bit for which the bit plane end signal is ON is more likely to appear late in the scan than when the bias is predicted to be small. Therefore, when coding a combination belonging to a block predicted to have a large bias, the hierarchical coding unit 86 selects a Huffman table in which short codes are assigned to combinations whose zero run starts at a large scan coordinate and whose bit plane end signal is ON. Conversely, when coding a combination belonging to a block predicted to have a small bias, the hierarchical coding unit 86 selects a Huffman table in which short codes are assigned to combinations whose zero run starts at a small scan coordinate and whose bit plane end signal is ON.

[0221] When the bias of the DCT coefficients toward high frequencies is predicted to be large, the MSB is more likely to have appeared at or above a given bit plane than when the bias is predicted to be small. Therefore, when coding a combination belonging to a block predicted to have a large bias, a Huffman table in which short codes are assigned to combinations with short zero run lengths is selected to perform Huffman coding. Conversely, when coding a combination belonging to a block predicted to have a small bias, a Huffman table in which short codes are assigned to combinations with long zero run lengths is selected to perform Huffman coding.

[0222] When it is predicted that the high-frequency bias of the DCT coefficients is large, the hierarchical coding unit 86 performs arithmetic coding with a high occurrence probability assigned to the coding information at large scan coordinates being "1".

Next, the video encoding device 70 performs stream output processing (S84). Specifically, the base layer output unit 76 outputs the base layer stream input from the base layer coding unit 74 to the outside of the video encoding device 70. The enhancement layer output unit 90 outputs the enhancement layer stream input from the enhancement layer coding unit 78 to the outside of the video encoding device 70.

Next, the video encoding device 70 performs an end determination process (S86). Specifically, the video signal input unit 72 determines the presence or absence of a video input from the outside of the video encoding device 70. As a result of the determination, if there is no video input, the process ends, and if there is a video input, the process returns to the video signal input process (S70). The video encoding device 70 according to the third embodiment has been described above.

According to the third embodiment, the video encoding device 70 predicts, from the state of the edges of the base layer, the zero run length, the probability of a "1", and the probability that the bit plane end signal is ON for the coding information of the enhancement layer. By switching the Huffman table used for Huffman coding based on this prediction, it can improve the coding efficiency of the variable-length coding.

In the third embodiment described above, an example in which Huffman coding is performed as the variable-length coding has been described, but arithmetic coding may be performed instead. In this case, the state prediction unit 88 performs state prediction as follows. When it is predicted that the high-frequency bias of the DCT coefficients is large, the hierarchical coding unit 86 performs arithmetic coding with the probability that the bit plane end signal is ON set higher as the scan coordinate of the bit increases. Conversely, when the state prediction unit 88 predicts that the bias of the DCT coefficients toward high frequencies is small, arithmetic coding is performed with the probability that the bit plane end signal is ON set high even when the scan coordinate of the bit is small.
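How the predicted state changes the cost of arithmetic coding can be sketched with the ideal code length of a symbol, which is the negative base-2 logarithm of the probability assigned to it. The probability model below (a linear ramp for a large bias, a constant 0.8 for a small bias) and the function names are hypothetical and serve only to illustrate the idea; the specification does not give numerical values.

```python
import math


def end_signal_probability(scan_pos, block_len, bias_is_large):
    """Hypothetical model of P(bit plane end signal is ON) at a scan coordinate."""
    if bias_is_large:
        # The probability rises as the scan coordinate increases.
        return 0.1 + 0.8 * scan_pos / (block_len - 1)
    # The probability is already high at small scan coordinates.
    return 0.8


def ideal_bits(p_on, signal_on):
    """Ideal arithmetic-coding cost, in bits, of coding the end signal."""
    p = p_on if signal_on else 1.0 - p_on
    return -math.log2(p)


# Cost of an ON end signal early and late in a 64-coefficient scan.
for pos in (3, 60):
    for bias in (True, False):
        p = end_signal_probability(pos, 64, bias)
        print(pos, bias, round(ideal_bits(p, True), 2))
```

When the model matches the block (end signal late for a large bias, early for a small bias), the ON symbol costs well under one bit; a mismatched model costs several bits.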

In the third embodiment described above, an example of predicting the state of the enhancement layer based on the base layer in a hierarchical coding scheme has been described, but the state prediction may be performed based on other image information. Motion prediction/compensation coding, used in MPEG coding and the like, divides the original image into several areas, searches the preceding and following decoded images for a reference area for each area to be encoded, and encodes the difference between the reference area and the area to be encoded. In this motion prediction/compensation coding scheme, it is also possible to predict the state of the difference image of a subsequent frame based on the reference frame. When the present invention is applied to motion prediction/compensation coding, a reference frame is used in place of the base layer of the third embodiment and a difference image in place of the enhancement layer, so that the difference image can be encoded efficiently based on the state predicted from the reference frame.

Fourth Embodiment

In the fourth embodiment, a video decoding apparatus is described that predicts a state prediction parameter representing the state of the enhancement layer based on decoded information, that is, information of the base layer decoded earlier, and performs decoding based on the state prediction parameter. Specifically, the fourth embodiment describes a video decoding apparatus for decoding a video stream generated by the video encoding device 70 according to the third embodiment.

FIG. 12 is a block diagram showing the configuration of a video decoding apparatus 100 according to the fourth embodiment of the present invention. In FIG. 12, the video decoding apparatus 100 includes a base layer input unit 102, a base layer decoding unit 104, an enhancement layer input unit 106, an enhancement layer decoding unit 108, and a video signal output unit 120. The enhancement layer decoding unit 108 includes a hierarchical decoding unit 110, an inverse scan unit 112, an inverse DCT unit 114, an addition unit 116, and a state prediction unit 118. The basic configuration of the video decoding apparatus 100 according to the fourth embodiment is the same as that of the video decoding apparatus according to the second embodiment, but the functions of the hierarchical decoding unit 110 and the state prediction unit 118 differ.

[0230] The base layer input unit 102 receives the base layer stream from outside the video decoding apparatus 100 and outputs it to the base layer decoding unit 104. The base layer input unit 102 also determines whether a base layer stream is being input from the outside, and ends the processing if there is no input of the base layer stream.

The base layer decoding unit 104 decodes the base layer stream input from the base layer input unit 102 to generate a base layer decoded image, and outputs the generated base layer decoded image to the enhancement layer decoding unit 108 and the video signal output unit 120.

The enhancement layer input unit 106 receives the enhancement layer stream from outside the video decoding apparatus 100 and outputs it to the enhancement layer decoding unit 108.

[0233] The video signal output unit 120 outputs the base layer decoded image input from the base layer decoding unit 104 and the enhancement layer decoded image input from the enhancement layer decoding unit 108 to the outside of the video decoding apparatus 100.

Hierarchical decoding section 110 decodes the enhancement layer stream input from enhancement layer input section 106 using the state prediction parameters input from state prediction section 118 to generate scanned DCT coefficients. The generated scanned DCT coefficients are output to the inverse scan unit 112.

[0235] The inverse scan unit 112 performs an inverse scan that rearranges the scanned DCT coefficients input from the hierarchical decoding unit 110 in a specified order to generate DCT coefficients, and outputs the generated DCT coefficients to the inverse DCT unit 114.

[0236] The inverse DCT unit 114 applies the inverse DCT to the DCT coefficients input from the inverse scan unit 112 for each region, synthesizes the regions to generate a differentially decoded image, and outputs the generated differentially decoded image to the addition unit 116.

[0237] The addition unit 116 adds the base layer decoded image input from the base layer decoding unit 104 and the differentially decoded image input from the inverse DCT unit 114 to generate an enhancement layer decoded image, and outputs the generated enhancement layer decoded image to the video signal output unit 120.

The state prediction unit 118 generates state prediction parameters based on the base layer decoded image input from the base layer decoding unit 104 to the enhancement layer decoding unit 108, and outputs the generated state prediction parameters to the hierarchical decoding unit 110.

[0239] Next, the operation of the video decoding apparatus 100 configured as described above will be described.

FIG. 13 is a flowchart showing an example of the operation of the video decoding apparatus 100 of the fourth embodiment shown in FIG. The flowchart shown in FIG. 13 may also be carried out in software by executing a control program stored in a storage device (not shown) (e.g., ROM or flash memory).

First, the video decoding apparatus 100 performs stream input processing (S90). Specifically, the base layer input unit 102 receives the base layer stream from outside the video decoding apparatus 100 and outputs it to the base layer decoding unit 104. The enhancement layer input unit 106 likewise receives the enhancement layer stream from outside the video decoding apparatus 100 and outputs it to the enhancement layer decoding unit 108.

Next, the video decoding apparatus 100 performs base layer decoding processing (S92). Specifically, the base layer decoding unit 104 decodes the base layer stream input from the base layer input unit 102 to generate a base layer decoded image, and outputs the generated base layer decoded image to the enhancement layer decoding unit 108 and the video signal output unit 120.

Next, the video decoding apparatus 100 performs state prediction processing (S94). Specifically, the state prediction unit 118 generates state prediction parameters from the base layer decoded image input from the base layer decoding unit 104, and outputs the generated state prediction parameters to the hierarchical decoding unit 110. In the prediction by the state prediction unit 118, the base layer decoded image is divided according to the regions synthesized by the inverse DCT unit 114, the characteristics of each divided region are examined, and the state of the DCT coefficients of the corresponding enhancement layer region is predicted. The prediction performed by the state prediction unit 118 is the same as the prediction performed in the third embodiment.
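The region-by-region prediction can be sketched as follows. The edge measure (sum of absolute differences between neighbouring pixels), the threshold, and the rule that a region with strong edges is predicted to have a large high-frequency bias are assumptions made for illustration; the function names are hypothetical.

```python
def edge_strength(block):
    """Sum of absolute horizontal and vertical neighbour differences in a block."""
    horizontal = sum(
        abs(row[c + 1] - row[c]) for row in block for c in range(len(row) - 1)
    )
    vertical = sum(
        abs(block[r + 1][c] - block[r][c])
        for r in range(len(block) - 1)
        for c in range(len(block[0]))
    )
    return horizontal + vertical


def predict_states(base_image, block_size, threshold):
    """Return one boolean per region: True if a large high-frequency bias is predicted."""
    params = []
    for r0 in range(0, len(base_image), block_size):
        row_params = []
        for c0 in range(0, len(base_image[0]), block_size):
            block = [row[c0:c0 + block_size] for row in base_image[r0:r0 + block_size]]
            row_params.append(edge_strength(block) > threshold)
        params.append(row_params)
    return params


# Hypothetical 2 x 4 base layer decoded image: flat on the left, an edge on the right.
image = [
    [0, 0, 0, 255],
    [0, 0, 0, 255],
]
print(predict_states(image, block_size=2, threshold=100))
```

The encoder and the decoder can both run the same function on the base layer decoded image, so the resulting parameters agree on both sides.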

Next, the video decoding apparatus 100 performs hierarchical decoding processing (S96). Specifically, the hierarchical decoding unit 110 decodes the enhancement layer stream input from the enhancement layer input unit 106 using the state prediction parameters input from the state prediction unit 118 to generate scanned DCT coefficients, and outputs the generated scanned DCT coefficients to the inverse scan unit 112. In the decoding by the hierarchical decoding unit 110, variable length decoding is performed using the state prediction parameters to generate the combinations of a zero run length and a bit plane end signal for each bit plane of each region, and the scanned DCT coefficients are then generated by zero run length decoding. The hierarchical decoding unit 110 selects a Huffman table based on the state prediction parameters and performs decoding using the selected Huffman table.
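The zero run length decoding step can be sketched as below. It assumes that variable length decoding has already produced, for each bit plane, a list of (zero_run, end_signal) pairs ordered from the upper bit plane down; the function names are hypothetical and signs are ignored.

```python
def zero_run_decode(pairs, length):
    """Rebuild one bit plane of `length` bits from (zero_run, end_signal) pairs."""
    bits = [0] * length
    pos = 0
    for run, end_signal in pairs:
        pos += run          # skip the zeros preceding this '1'
        bits[pos] = 1
        pos += 1
        if end_signal:      # no further '1' in this bit plane
            break
    return bits


def rebuild_coefficients(planes_upper_first, length):
    """Accumulate bit planes, upper plane first, into coefficient magnitudes."""
    coeffs = [0] * length
    for pairs in planes_upper_first:
        bits = zero_run_decode(pairs, length)
        coeffs = [(c << 1) | b for c, b in zip(coeffs, bits)]
    return coeffs


# Hypothetical decoded combinations for three bit planes of an 8-coefficient scan.
planes = [
    [(0, True)],
    [(2, True)],
    [(0, False), (1, False), (2, True)],
]
print(rebuild_coefficients(planes, 8))
```

Decoding can stop after any bit plane; the coefficients obtained so far are then a coarser approximation, which is what makes the stream scalable.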

Next, the video decoding apparatus 100 performs inverse scan processing (S98). Specifically, the inverse scan unit 112 performs an inverse scan that rearranges the scanned DCT coefficients input from the hierarchical decoding unit 110 in a predetermined order to generate DCT coefficients, and outputs the generated DCT coefficients to the inverse DCT unit 114.
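An inverse scan can be sketched as follows. The specification only says that the coefficients are rearranged in a predetermined order; the zigzag order used here is an assumption chosen because it is common in DCT-based codecs, and the function names are hypothetical.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal; alternate the direction on each diagonal.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def inverse_scan(scanned, n):
    """Place a one-dimensional scanned coefficient list back into an n x n block."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(scanned, zigzag_order(n)):
        block[r][c] = value
    return block


# Scan indices 0..8 placed back into a 3 x 3 block.
for row in inverse_scan(list(range(9)), 3):
    print(row)
```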

Next, the video decoding apparatus 100 performs inverse DCT processing (S100). Specifically, the inverse DCT unit 114 performs the inverse DCT on the DCT coefficients input from the inverse scan unit 112 for each area, synthesizes the areas to generate a differentially decoded image, and outputs the generated differentially decoded image to the addition unit 116.

[0246] Next, the video decoding apparatus 100 performs an addition process (S112). Specifically, the addition unit 116 adds the base layer decoded image input from the base layer decoding unit 104 to the enhancement layer decoding unit 108 and the differentially decoded image input from the inverse DCT unit 114 to generate an enhancement layer decoded image. The addition unit 116 outputs the generated enhancement layer decoded image to the video signal output unit 120.
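The addition process amounts to a pixel-wise sum of the two images, as in the sketch below. Clipping the result to the 8-bit range 0 to 255 is an assumption; the specification does not state the sample range, and the function name is hypothetical.

```python
def add_layers(base_image, diff_image, max_value=255):
    """Add the differentially decoded image to the base layer decoded image."""
    return [
        [min(max(b + d, 0), max_value) for b, d in zip(base_row, diff_row)]
        for base_row, diff_row in zip(base_image, diff_image)
    ]


base = [[100, 250], [30, 40]]
diff = [[-120, 10], [5, -5]]
print(add_layers(base, diff))
```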

Next, the video decoding apparatus 100 performs video signal output processing (S114). Specifically, the video signal output unit 120 outputs the base layer decoded image input from the base layer decoding unit 104 and the enhancement layer decoded image input from the enhancement layer decoding unit 108 to the outside of the video decoding apparatus 100. The video signal output unit 120 may output only one of the base layer decoded image and the enhancement layer decoded image to the outside.

Next, the video decoding apparatus 100 performs an end determination process (S116). Specifically, the base layer input unit 102 determines whether a base layer stream is being input from the outside. If there is no input of the base layer stream, the processing ends; if there is an input of the base layer stream, the processing returns to the stream input processing (S90). The video decoding apparatus 100 according to the fourth embodiment of the present invention has been described above.

According to the fourth embodiment, the video decoding apparatus 100 predicts, from the state of the edges of the base layer, the zero run length, the probability of a "1", and the probability that the bit plane end signal is ON for the decoding information of the enhancement layer, and switches the Huffman table used for Huffman decoding based on this prediction. This reduces the code amount needed to decode the decoding information, making it possible to improve the quality of the image.

[0250] In the fourth embodiment described above, an example in which Huffman decoding is performed has been described, but the hierarchical decoding unit 110 may perform arithmetic decoding. In this case, decoding is performed by changing the occurrence probability of the symbols used for arithmetic decoding according to the predicted state of the enhancement layer.

In the fourth embodiment described above, an example has been described in which the state of the enhancement layer is predicted from the base layer and the enhancement layer is decoded based on the predicted state. However, when motion prediction/compensation decoding is performed, the state may be predicted based on the reference image. When the present invention is applied to motion prediction/compensation coding, a reference frame is used in place of the base layer of the fourth embodiment and a difference image in place of the enhancement layer, so that the difference image can be decoded based on the state predicted from the reference frame.

What has been described above is the presently preferred embodiment of the present invention. It is understood that various modifications can be made to it, and it is intended that the appended claims cover all such modifications that fall within the true spirit and scope of the present invention.

Industrial applicability

As described above, the present invention predicts the state of bits to be encoded later, that is, the occurrence probabilities of "1" and "0" in those bits, based on information on previously encoded transform coefficients. It thereby has the excellent effect of enabling efficient coding, and is useful as a video coding apparatus that codes video. The present invention is particularly useful for the video coding scheme of a system that transmits and receives video while dynamically changing the amount of the video stream over a network with a variable communication speed.

Claims

The scope of the claims

[1] A video coding apparatus comprising:

a transform coefficient generation unit that generates transform coefficients representing frequency components by frequency-transforming an image;

a bit-plane coding unit that converts the transform coefficients generated by the transform coefficient generation unit into binary numbers, generates bit planes, each consisting of the bits at the same position of a plurality of transform coefficients, from the most significant bit to the least significant bit, and codes the bit planes in order from the upper bit plane; and

a state prediction unit that predicts a state of bits of a transform coefficient to be encoded later, based on information on the transform coefficient encoded earlier,

wherein the bit-plane coding unit performs coding according to the state of the bit predicted by the state prediction unit.

[2] The video coding device according to claim 1, wherein the state prediction unit predicts the state of the bits of the lower bit plane based on the information of the upper bit plane.

[3] The video coding apparatus according to claim 2, wherein the state prediction unit predicts the state of the bit according to the number of transform coefficients in which the bit state "1" appears in a bit plane higher than the bit to be predicted.

[4] The video coding apparatus according to claim 2, wherein the state prediction unit predicts the state of the bit based on whether or not the bit state "1" appears in the upper bits of the transform coefficient of the bit to be predicted.

[5] The video coding apparatus according to claim 2, wherein the state prediction unit predicts the state of the bit based on the distance to the nearest transform coefficient that, among the transform coefficients in which the bit state "1" appears in a bit plane higher than the bit to be predicted, is located later in coding order than the transform coefficient of the bit to be predicted.

[6] The video coding apparatus according to claim 2, wherein the state prediction unit predicts the state of the bit based on the transform coefficient that is located last in coding order among the transform coefficients in which the bit state "1" appears in a bit plane higher than the bit to be predicted.

[7] The video coding apparatus according to claim 2, wherein the state prediction unit predicts the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane of the transform coefficients located later in coding order are "0", and

the bit-plane coding unit performs zero run length coding based on the probability.

[8] The state prediction unit determines whether or not the transform coefficients at and after a predetermined position in coding order include the bit state "1" in a bit plane higher than the bit to be predicted, and, upon determining that the bit state "1" is not included, predicts the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane of the transform coefficients located later in coding order are "0",

[9] The video coding apparatus according to claim 2, wherein the state prediction unit predicts a zero run length including the bit to be predicted, and

the bit-plane coding unit performs Huffman coding using a Huffman table selected based on the zero run length predicted by the state prediction unit.

[10] The video coding apparatus according to claim 9, wherein the state prediction unit obtains, as the zero run length, the number of zeros up to the point where the probability that the bit state "0" appears consecutively falls below a preset threshold.

[11] The video coding apparatus according to claim 2, wherein the state prediction unit predicts the probability that the state of the bit to be predicted is "1" or "0", and the bit-plane coding unit performs arithmetic coding using the occurrence probability of the symbol determined based on the probability predicted by the state prediction unit.

[12] The video coding apparatus according to claim 2, wherein the state prediction unit predicts the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane of the transform coefficients located later in coding order are "0", and

the bit-plane coding unit performs arithmetic coding using the occurrence probability of the symbol determined based on the probability predicted by the state prediction unit.

[13] The video coding apparatus according to claim 1, which is a video coding apparatus that performs hierarchical coding with a base layer whose video can be decoded independently and an enhancement layer that improves the video quality of the base layer, wherein

the state prediction unit predicts the state of the bits of the transform coefficients of the enhancement layer based on information of the base layer, and

the bit-plane coding unit encodes the transform coefficients of the enhancement layer according to the state of the bit predicted by the state prediction unit.

[14] The video coding apparatus according to claim 13, wherein the state prediction unit predicts the state of the bits of the transform coefficients of the enhancement layer based on edge information included in the base layer or a code amount of the base layer.

[15] The video coding apparatus according to claim 1, wherein the state prediction unit predicts the state of the bits of the transform coefficients of a frame making up a video based on information of a reference frame used for motion prediction/compensation coding, and

the bit-plane coding unit encodes the transform coefficients of the frame according to the state of the bit predicted by the state prediction unit.

[16] The video coding apparatus according to claim 1, wherein the state prediction unit predicts the state of the bits of the transform coefficients of the frame based on edge information included in the reference frame or a code amount of the reference frame.

[17] A video decoding apparatus comprising:

a bit-plane decoding unit that decodes bit-plane-coded video coded data in order from the upper bit plane; and

a state prediction unit that predicts the state of bits of the transform coefficient to be decoded later based on information on the previously decoded transform coefficient,

wherein the bit-plane decoding unit performs decoding according to the state of the bit predicted by the state prediction unit.

[18] The video decoding apparatus according to claim 17, wherein the state prediction unit predicts the state of the bits of the lower bit plane based on the information of the upper bit plane.

[19] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts the state of the bit according to the number of transform coefficients in which the bit state "1" appears in a bit plane higher than the bit to be predicted.

[20] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts the state of the bit based on whether or not the bit state "1" appears in the upper bits of the transform coefficient of the bit to be predicted.

[21] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts the state of the bit based on the distance to the nearest transform coefficient that, among the transform coefficients in which the bit state "1" appears in a bit plane higher than the bit to be predicted, is located later in decoding order than the transform coefficient of the bit to be predicted.

[22] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts the state of the bit based on the transform coefficient that is located last in decoding order among the transform coefficients in which the bit state "1" appears in a bit plane higher than the bit to be predicted.

[23] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane of the transform coefficients located later in decoding order are "0", and

the bit-plane decoding unit performs zero run length decoding based on the probability.

[24] The state prediction unit determines whether or not the transform coefficients at and after a predetermined position in decoding order include the bit state "1" in a bit plane higher than the bit to be predicted, and, upon determining that the bit state "1" is not included, predicts the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane of the transform coefficients located later in decoding order are "0",

[25] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts a zero run length including the bit to be predicted, and

the bit-plane decoding unit performs Huffman decoding using a Huffman table selected based on the zero run length predicted by the state prediction unit.

[26] The video decoding apparatus according to claim 25, wherein the state prediction unit obtains, as the zero run length, the number of zeros up to the point where the probability that the bit state "0" appears consecutively falls below a preset threshold.

[27] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts the probability that the state of the bit to be predicted is "1" or "0", and the bit-plane decoding unit performs arithmetic decoding using the occurrence probability of the symbol determined based on the probability predicted by the state prediction unit.

[28] The video decoding apparatus according to claim 18, wherein the state prediction unit predicts the probability that the state of the bit to be predicted is "1" and the states of all bits belonging to the same bit plane of the transform coefficients located later in decoding order are "0", and

the bit-plane decoding unit performs arithmetic decoding using the occurrence probability of the symbol determined based on the zero run length predicted by the state prediction unit.

[29] The video decoding apparatus according to claim 17, wherein the state prediction unit predicts, based on the decoded information of the base layer, the state of the bits of the transform coefficients of the enhancement layer that improves the video quality of the base layer, and

the bit-plane decoding unit decodes the transform coefficients of the enhancement layer according to the state of the bit predicted by the state prediction unit.

[30] The video decoding apparatus according to claim 29, wherein the state prediction unit predicts the state of the bits of the transform coefficients of the enhancement layer based on edge information included in the base layer or a code amount of the base layer.

[31] The video decoding apparatus according to claim 17, wherein the state prediction unit predicts the state of the bits of the transform coefficients of a frame making up a video based on information of a reference frame used for motion prediction/compensation coding, and

the bit-plane decoding unit decodes the transform coefficients of the frame according to the state of the bit predicted by the state prediction unit.

[32] The video decoding apparatus according to claim 31, wherein the state prediction unit predicts the state of the bits of the frame based on edge information included in the reference frame or a code amount of the reference frame.

[33] A video coding method comprising:

a transform coefficient generation step of frequency-transforming an image to generate transform coefficients representing frequency components;

a bit-plane coding step of converting the transform coefficients generated in the transform coefficient generation step into binary numbers, generating bit planes, each consisting of the bits at the same position of a plurality of transform coefficients, from the most significant bit to the least significant bit, and coding the bit planes in order from the upper bit plane; and

a state prediction step of predicting the state of bits of the transform coefficient to be encoded later based on information on the previously encoded transform coefficient,

wherein the bit-plane coding step performs coding according to the state of the bit predicted in the state prediction step.

[34] A video decoding method comprising:

a bit-plane decoding step of decoding bit-plane-coded video coded data in order from the upper bit plane; and

a state prediction step of predicting the state of bits of the transform coefficient to be decoded later based on information on the previously decoded transform coefficient,

wherein the bit-plane decoding step performs decoding according to the state of the bit predicted in the state prediction step.

[35] A program for causing a computer, in order to code a video, to execute

a transform coefficient generation step of frequency-transforming an image to generate transform coefficients representing frequency components,

wherein the bit-plane coding step performs coding according to the state of the bit predicted in the state prediction step.

A program for causing a computer, in order to decode bit-plane-coded video coded data, to execute:

a bit-plane decoding step of decoding in order from the upper bit plane; and

a state prediction step of predicting the state of bits of the transform coefficient to be decoded later based on information on the previously decoded transform coefficient,

wherein the bit-plane decoding step performs decoding according to the state of the bit predicted in the state prediction step.