KR20140129629A

KR20140129629A - Method and apparatus for processing moving image

Info

Publication number: KR20140129629A
Application number: KR1020130048163A
Authority: KR
Inventors: 박동진; 김종철
Original assignee: 주식회사 칩스앤미디어; 인텔렉추얼디스커버리 주식회사
Priority date: 2013-04-30
Filing date: 2013-04-30
Publication date: 2014-11-07

Abstract

An apparatus for processing a video is disclosed. The video processing apparatus comprises: a central video processing unit for parsing parameter information or slice header information from video data inputted from a host; and a plurality of video processing units for processing the video according to the parsed information by being controlled by the central video processing unit. The plurality of video processing units process regions assigned respectively and then perform the boundary processing between the video processing units. The boundary processing is performed by the video processing unit adjacent to the corresponding boundary portion.

Description

TECHNICAL FIELD The present invention relates to a moving image processing method and apparatus, and more particularly, to a configuration in which a moving image is processed in a scalable manner using a plurality of processing units.

BACKGROUND OF THE INVENTION As the demand for UHD has grown, it has become difficult for current moving image compression technology to fit within the capacity of storage media and the bandwidth of transmission media. A new compression standard for UHD moving images was therefore required, and standardization of HEVC (High Efficiency Video Coding) was completed in January 2013.

HEVC can also be used for video streams served over the Internet and over networks such as 3G and LTE. In this case, not only UHD but also FHD-class or HD-class video can be compressed with HEVC.

UHD TV is expected to target 4K at 30 fps in the short term, but the number of pixels to be processed per second is expected to keep increasing with 4K 60 fps / 120 fps, 8K 30 fps / 60 fps / ...

In order to cost-effectively cope with various resolutions, frame rates, etc. according to such applications, it is necessary to have a video decoding apparatus that can be easily extended according to the performance and functions required in an application.

SUMMARY OF THE INVENTION The present invention has been made in view of the above-mentioned needs, and it is an object of the present invention to provide a moving picture processing method and apparatus having a V-CPU for allocating Multi V-Cores for border processing.

According to an aspect of the present invention, there is provided an apparatus for processing moving images, the apparatus comprising: an image central processing unit for parsing parameter information or slice header information from moving image data input from a host; and a plurality of image processing units for processing the moving image according to the parsed information under the control of the image central processing unit, wherein the plurality of image processing units each process an allocated region and then perform boundary processing between the image processing units, and the boundary processing is performed by an image processing unit adjacent to the boundary portion.

According to another aspect of the present invention, there is provided a method of processing moving images in a moving image processing apparatus having an image central processing unit and a plurality of image processing units, the method comprising: parsing, by the image central processing unit, parameter information or slice header information from moving image data; and processing, by the plurality of image processing units, the moving image according to the parsed information under the control of the image central processing unit, wherein the processing includes performing boundary processing between the image processing units after each of the plurality of image processing units processes its allocated region, and the boundary processing may be performed by an image processing unit adjacent to the boundary portion.

The moving picture processing method may be embodied as a computer-readable recording medium having recorded thereon a program for execution on a computer.

According to various embodiments of the present invention, it is possible to provide a video processing apparatus and method capable of effectively processing the increasing number of pixels to be processed per second (4K 60 fps / 120 fps, 8K 30 fps / 60 fps / ...).

FIG. 1 is a block diagram illustrating a configuration of a moving picture encoding apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining an example of a method of dividing an image into blocks and processing it.
FIG. 3 is a block diagram showing an embodiment of a configuration for performing inter prediction in an encoding apparatus.
FIG. 4 is a block diagram illustrating a configuration of a moving picture decoding apparatus according to an embodiment of the present invention.
FIG. 5 is a block diagram showing an embodiment of a configuration for performing inter prediction in a decoding apparatus.
FIGS. 6 and 7 are views showing an example of the configuration of a sequence parameter set (SPS).
FIGS. 8 and 9 are diagrams showing an example of the configuration of a picture parameter set (PPS).
FIGS. 10 to 12 are views showing an example of the configuration of a slice header (SH).
FIG. 13 shows a layer structure of a moving picture decoding apparatus according to an embodiment of the present invention.
FIG. 14 is a timing diagram illustrating a moving picture decoding operation of a VPU according to an embodiment of the present invention.
FIG. 15 is a diagram illustrating a detailed operation of a V-CPU according to an embodiment of the present invention.
FIGS. 16 and 17 are diagrams for explaining boundary processing performed in post-processing in a Multi V-Core according to an embodiment of the present invention.
FIGS. 18 to 21 illustrate a method of allocating a Multi V-Core to perform boundary processing on a boundary portion according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily carry out the present invention. It should be understood, however, that the present invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. In the drawings, the same reference numbers are used throughout the specification to refer to the same or like parts.

Throughout this specification, when a part is referred to as being "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between.

Throughout this specification, when a member is said to be "on" another member, this includes not only the case where the member is in contact with the other member but also the case where another member exists between the two members.

Throughout this specification, when a part is referred to as "including" an element, this means that it may further include other elements rather than excluding them, unless specifically stated otherwise. The terms "about", "substantially", and the like, as used throughout the specification, mean at or near the stated value when manufacturing and material tolerances inherent in the stated meaning are given, and are used to prevent an unscrupulous infringer from unfairly exploiting a disclosure in which exact or absolute figures are mentioned to aid understanding. The term "step of (doing)" or "step of", as used throughout the specification, does not mean "step for".

Throughout this specification, the term "combination thereof" included in a Markush-type expression means a mixture or combination of one or more elements selected from the group consisting of the elements described in the Markush-type expression, that is, it includes one or more elements selected from that group.

As an example of a method of encoding an actual image and its depth information map, encoding may be performed using HEVC (High Efficiency Video Coding), which is jointly standardized by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) and has the highest coding efficiency among the video coding standards developed so far, but the present invention is not limited thereto.

Generally, the encoding apparatus includes an encoding process and a decoding process, and the decoding apparatus has a decoding process. The decoding process of the decoding apparatus is the same as the decoding process of the encoding apparatus. Therefore, the encoding apparatus will be mainly described below.

FIG. 1 is a block diagram illustrating a configuration of a moving picture encoding apparatus according to an embodiment of the present invention.

Referring to FIG. 1, a moving picture encoding apparatus 100 according to the present invention includes a picture division unit 110, a transform unit 120, a quantization unit 130, a scanning unit 131, an entropy coding unit 140, an intra prediction unit 150, an inter prediction unit 160, an inverse quantization unit 135, an inverse transform unit 125, a post-processing unit 170, a picture storage unit 180, a subtraction unit 190, and an addition unit 195.

The picture division unit 110 analyzes the input video signal, divides each largest coding unit (LCU) of a picture into coding units of a predetermined size, determines the prediction mode, and determines the size of the prediction unit for each coding unit.

The picture division unit 110 sends the prediction unit to be encoded to the intra prediction unit 150 or the inter prediction unit 160 according to a prediction mode (or a prediction method). Further, the picture division unit 110 sends the prediction unit to be encoded to the subtraction unit 190.

A picture may be composed of a plurality of slices, and a slice may be composed of a plurality of largest coding units (LCUs).

The LCU can be divided into a plurality of coding units (CUs), and the encoder can add information indicating whether or not to be divided to a bit stream. The decoder can recognize the position of the LCU by using the address (LcuAddr).

The coding unit CU in the case where division is not allowed is regarded as a prediction unit (PU), and the decoder can recognize the position of the PU using the PU index.

The prediction unit PU may be divided into a plurality of partitions. Also, the prediction unit PU may be composed of a plurality of transform units (TUs).

In this case, the picture division unit 110 may send the image data to the subtraction unit 190 in units of blocks of a predetermined size (for example, in units of PU or TU) according to the determined coding mode.

Referring to FIG. 2, a CTU (Coding Tree Unit) is used as the moving picture encoding unit, and the CTU is defined in various square sizes. The CTU includes coding units (CUs).

The coding unit (CU) has a quadtree structure. Starting at depth 0 with the largest coding unit (LCU) of size 64 × 64, the optimal prediction unit is searched for recursively until the depth reaches 3, that is, down to a coding unit (CU) of size 8 × 8.

The unit on which prediction is performed is defined as a PU (Prediction Unit). Each coding unit (CU) is predicted in units obtained by dividing it into a plurality of blocks, and the blocks used for prediction may be square or rectangular.
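The relationship between quadtree depth and CU size described above can be sketched as follows. This is a minimal illustration, not the encoder's search itself; the function names `cu_sizes` and `split_cu` are hypothetical and do not appear in the source.

```python
def cu_sizes(lcu_size=64, max_depth=3):
    """Side length of a square CU at each quadtree depth, from 0 to max_depth."""
    return [lcu_size >> depth for depth in range(max_depth + 1)]


def split_cu(x, y, size):
    """Split one CU at (x, y) into its four quadtree children (x, y, size)."""
    half = size // 2
    return [
        (x, y, half),                # top-left
        (x + half, y, half),         # top-right
        (x, y + half, half),         # bottom-left
        (x + half, y + half, half),  # bottom-right
    ]


if __name__ == "__main__":
    print(cu_sizes())          # sizes at depths 0..3
    print(split_cu(0, 0, 64))  # the four 32x32 children of a 64x64 LCU
```

An actual encoder would evaluate a cost at each depth and keep the split only where it pays off; that decision rule is not specified in this passage and is left out of the sketch.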

The transform unit 120 transforms the residual block, which is the residual signal between the original block of the input prediction unit and the prediction block generated by the intra prediction unit 150 or the inter prediction unit 160. The residual block is composed of a coding unit or a prediction unit. A residual block composed of a coding unit or a prediction unit is divided into optimal transform units and transformed. Different transform matrices may be determined depending on the prediction mode (intra or inter). Also, since the residual signal of intra prediction has directionality according to the intra prediction mode, the transform matrix can be adaptively determined according to the intra prediction mode.

The transformation unit can be transformed by two (horizontal, vertical) one-dimensional transformation matrices. For example, in the case of inter prediction, a predetermined conversion matrix is determined.

On the other hand, in the case of intra prediction, when the intra prediction mode is horizontal, the residual block is likely to have directionality in the vertical direction. Therefore, a DCT-based integer matrix is applied in the vertical direction, and a DST-based or KLT-based integer matrix is applied in the horizontal direction. When the intra prediction mode is vertical, a DST-based or KLT-based integer matrix is applied in the vertical direction and a DCT-based integer matrix is applied in the horizontal direction.

In case of DC mode, DCT-based integer matrix is applied in both directions. Further, in the case of intra prediction, the transformation matrix may be adaptively determined depending on the size of the conversion unit.
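The mode-dependent choice of one-dimensional transforms described above can be summarized in a small lookup. This is a sketch under stated assumptions: the function name and the string labels are hypothetical, and the DCT choice for inter prediction follows the later statement that inter residuals use an integer-based DCT matrix. Intra modes other than horizontal, vertical, and DC are not covered by this passage, so the sketch rejects them rather than guessing.

```python
def select_transforms(pred_type, intra_mode=None):
    """Return (vertical_transform, horizontal_transform) for a residual block."""
    if pred_type == "inter":
        # A single predetermined matrix is used for inter prediction.
        return ("DCT", "DCT")
    if pred_type == "intra":
        if intra_mode == "horizontal":
            return ("DCT", "DST_or_KLT")
        if intra_mode == "vertical":
            return ("DST_or_KLT", "DCT")
        if intra_mode == "dc":
            return ("DCT", "DCT")
        # Other directional modes are not described in this passage.
        raise ValueError("transform choice not specified for mode: %r" % (intra_mode,))
    raise ValueError("unknown prediction type: %r" % (pred_type,))


if __name__ == "__main__":
    for mode in ("horizontal", "vertical", "dc"):
        print(mode, select_transforms("intra", mode))
```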

The quantization unit 130 determines a quantization step size for quantizing the coefficients of the residual block transformed by the transform matrix. The quantization step size is determined for each coding unit of a predetermined size or larger (hereinafter referred to as a quantization unit).

The predetermined size may be 8x8 or 16x16. The quantization unit 130 quantizes the coefficients of the transform block using a quantization matrix determined according to the determined quantization step size and the prediction mode.

The quantization unit 130 uses the quantization step size of the quantization unit adjacent to the current quantization unit as the quantization step size predictor of the current quantization unit.

The quantization unit 130 searches the left quantization unit, the upper quantization unit, and the upper left quantization unit of the current quantization unit, in that order, and can generate a quantization step size predictor of the current quantization unit using one or two valid quantization step sizes.

For example, the effective first quantization step size searched in the above order can be determined as a quantization step size predictor. In addition, the average value of the two effective quantization step sizes searched in the above order may be determined as a quantization step size predictor, or when only one is effective, it may be determined as a quantization step size predictor.

When the quantization step size predictor is determined, the difference between the quantization step size of the current coding unit and the quantization step size predictor is transmitted to the entropy coding unit 140.

However, the left coding unit, the upper coding unit, and the upper left coding unit of the current coding unit may all be absent. Even in that case, a coding unit that precedes the current coding unit in coding order may exist within the largest coding unit.

Therefore, the quantization step sizes of the quantization units adjacent to the current coding unit and of the quantization unit immediately preceding it in coding order within the largest coding unit can be candidates.

In this case, priority may be given in the order of 1) the left quantization unit of the current coding unit, 2) the upper quantization unit of the current coding unit, 3) the upper left quantization unit of the current coding unit, and 4) the quantization unit immediately preceding in coding order. The order may be changed, and the upper left quantization unit may be omitted.
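The candidate search for the quantization step size predictor can be sketched as below. It is an illustration only: the function names are hypothetical, `None` is used here to mark a candidate that does not exist, and the rounding used when averaging two step sizes is an assumption, since the text does not state one.

```python
def predict_qstep(left, upper, upper_left, previous=None, mode="first"):
    """Predict the quantization step size from neighbouring quantization units.

    Candidates are examined in priority order: left, upper, upper left, then
    the unit immediately preceding in coding order. None marks a missing one.
    """
    valid = [q for q in (left, upper, upper_left, previous) if q is not None]
    if not valid:
        return None  # no predictor available
    if mode == "first":
        return valid[0]  # first valid step size in the search order
    if mode == "average":
        if len(valid) == 1:
            return valid[0]  # only one valid candidate: use it directly
        a, b = valid[0], valid[1]
        return (a + b + 1) // 2  # rounding rule is an assumption
    raise ValueError("unknown mode: %r" % (mode,))


def qstep_residual(current, predictor):
    """Difference that would be passed on to the entropy coder."""
    return current - predictor


if __name__ == "__main__":
    pred = predict_qstep(None, 20, 24)
    print(pred, qstep_residual(22, pred))
```

Only the residual is entropy coded, so a good predictor keeps the transmitted values small.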

The quantized transform block is provided to the inverse quantization unit 135 and the scanning unit 131.

The scanning unit 131 scans the coefficients of the quantized transform block and converts them into one-dimensional quantization coefficients. Since the coefficient distribution of the transform block after quantization may be dependent on the intra prediction mode, the scanning scheme is determined according to the intra prediction mode.

The coefficient scanning method may be determined depending on the size of the transform unit. The scan pattern may vary according to the directional intra prediction mode. The quantization coefficients are scanned in the reverse direction.

When the quantized coefficients are divided into a plurality of subsets, the same scan pattern is applied to the quantization coefficients in each subset. A zigzag scan or a diagonal scan is applied as the scan pattern between subsets. The subsets are preferably scanned in the forward direction, from the main subset containing the DC coefficient to the remaining subsets, but the reverse direction is also possible.

In addition, a scan pattern between subsets can be set in the same manner as a scan pattern of quantized coefficients in a subset. In this case, the scan pattern between the sub-sets is determined according to the intra-prediction mode. On the other hand, the encoder transmits to the decoder information indicating the position of the last non-zero quantization coefficient in the transform unit.

Information that can indicate the position of the last non-zero quantization coefficient in each subset can also be transmitted to the decoder.
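As a sketch of how a two-dimensional coefficient block becomes a one-dimensional sequence, the example below uses an up-right diagonal scan, locates the last non-zero coefficient, and then lists the coefficients in reverse order from that position back to DC. The particular diagonal direction and all function names are assumptions made for illustration; the text only says that a zigzag or diagonal pattern may be used and that scanning runs in the reverse direction.

```python
def diagonal_scan_order(n):
    """(row, col) positions of an n x n block, one anti-diagonal at a time."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies an anti-diagonal
        for row in range(min(s, n - 1), max(0, s - n + 1) - 1, -1):
            order.append((row, s - row))
    return order


def scan_forward(block):
    """Flatten a square block into a 1-D list following the diagonal order."""
    return [block[r][c] for r, c in diagonal_scan_order(len(block))]


def last_nonzero_index(coeffs):
    """Index of the last non-zero coefficient in scan order, or -1 if none."""
    for i in range(len(coeffs) - 1, -1, -1):
        if coeffs[i] != 0:
            return i
    return -1


def scan_reverse_from_last(block):
    """Return (last position, coefficients from that position back to DC)."""
    coeffs = scan_forward(block)
    last = last_nonzero_index(coeffs)
    reversed_coeffs = coeffs[last::-1] if last >= 0 else []
    return last, reversed_coeffs


if __name__ == "__main__":
    block = [[5, 0], [3, 0]]
    print(scan_forward(block))
    print(scan_reverse_from_last(block))
```

Signalling the last non-zero position lets the decoder skip the trailing zeros entirely.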

The inverse quantization unit 135 dequantizes the quantized coefficients. The inverse transform unit 125 restores the dequantized transform coefficients into a residual block in the spatial domain. The addition unit 195 combines the residual block reconstructed by the inverse transform unit 125 with the prediction block received from the intra prediction unit 150 or the inter prediction unit 160 to generate a reconstructed block.

The post-processing unit 170 performs a deblocking filtering process for removing the blocking effect generated in the reconstructed picture, an adaptive offset application process for compensating the difference from the original image on a pixel-by-pixel basis, and an adaptive loop filtering process for compensating the difference from the original image on a coding-unit basis.

The deblocking filtering process is preferably applied to the boundaries of prediction units and transform units having a size equal to or larger than a predetermined size. The size may be 8x8. The deblocking filtering process may include determining a boundary to be filtered, determining a boundary filtering strength to be applied to the boundary, determining whether to apply a deblocking filter, and selecting a filter to be applied to the boundary if it is determined that the deblocking filter is to be applied.

Whether or not the deblocking filter is applied is determined based on i) whether the boundary filtering strength is greater than 0 and ii) whether a value indicating the degree of change of the pixel values at the boundary between the two blocks (P block, Q block) adjacent to the boundary to be filtered is smaller than a first reference value determined by the quantization parameter.

It is preferable that at least two filters are available. If the absolute value of the difference between the two pixels located at the block boundary is greater than or equal to a second reference value, a filter that performs relatively weak filtering is selected.

The second reference value is determined by the quantization parameter and the boundary filtering strength.
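The two-stage deblocking decision can be sketched as follows. The thresholds are passed in as plain numbers because their derivation from the quantization parameter is not given in this passage. The measure of pixel-value change used here (a second-difference sum on each side of the edge) is an assumption borrowed from common deblocking designs, and the function name is hypothetical.

```python
def deblock_decision(bs, p, q, first_ref, second_ref):
    """Decide whether and how to filter one edge position.

    bs         : boundary filtering strength
    p, q       : three pixels on each side of the edge, nearest first,
                 so p[0] and q[0] are the two pixels touching the boundary
    first_ref  : first reference value (depends on the quantization parameter)
    second_ref : second reference value (depends on QP and bs)
    Returns "none", "weak", or "strong".
    """
    if bs <= 0:
        return "none"
    # Assumed change measure: how far each side departs from a straight line.
    change = abs(p[2] - 2 * p[1] + p[0]) + abs(q[2] - 2 * q[1] + q[0])
    if change >= first_ref:
        return "none"  # likely a real image edge, so it is left untouched
    if abs(p[0] - q[0]) >= second_ref:
        return "weak"  # large step across the boundary: filter gently
    return "strong"


if __name__ == "__main__":
    print(deblock_decision(2, [10, 10, 10], [20, 20, 20], 8, 4))
    print(deblock_decision(2, [10, 10, 10], [12, 12, 12], 8, 4))
```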

The adaptive offset application process is to reduce a distortion between a pixel in the image to which the deblocking filter is applied and the original pixel. It may be determined whether to perform the adaptive offset applying process in units of pictures or slices.

The picture or slice may be divided into a plurality of offset regions, and an offset type may be determined for each offset region. The offset type may include a predetermined number (e.g., four) of edge offset types and two band offset types.

If the offset type is an edge offset type, the edge type to which each pixel belongs is determined and the corresponding offset is applied. The edge type is determined based on the distribution of two pixel values adjacent to the current pixel.
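Edge-type classification from the two neighbouring pixel values can be sketched as below. The five category names and the sign-based rule are assumptions modelled on common sample adaptive offset designs; the text only states that the edge type follows from the distribution of the two pixels adjacent to the current pixel. Which two neighbours are used depends on the selected edge offset type (for example horizontal, vertical, or diagonal).

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def edge_category(a, cur, b):
    """Classify the current pixel against its two neighbours a and b."""
    s = sign(cur - a) + sign(cur - b)
    return {
        -2: "local_min",     # lower than both neighbours
        -1: "concave_edge",  # lower than one, equal to the other
        0: "none",           # flat or monotonic: no offset
        1: "convex_edge",    # higher than one, equal to the other
        2: "local_max",      # higher than both neighbours
    }[s]


def apply_edge_offset(a, cur, b, offsets):
    """Add the offset assigned to the pixel's edge category, if any."""
    return cur + offsets.get(edge_category(a, cur, b), 0)


if __name__ == "__main__":
    offsets = {"local_min": 2, "local_max": -2}
    print(edge_category(10, 5, 10), apply_edge_offset(10, 5, 10, offsets))
```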

The adaptive loop filtering process can perform filtering based on a value obtained by comparing the original image with the reconstructed image that has undergone the deblocking filtering process or the adaptive offset application process. The adaptive loop filtering can be applied to all pixels included in a 4x4 block or an 8x8 block.

Whether or not the adaptive loop filter is applied can be determined for each coding unit. The size and the coefficient of the loop filter to be applied may vary depending on each coding unit. Information indicating whether or not the adaptive loop filter is applied to each coding unit may be included in each slice header.

In the case of the chrominance signal, whether or not the adaptive loop filter is applied can be determined in units of pictures. Unlike for luminance, the loop filter may have a rectangular shape.

Adaptive loop filtering can be applied on a slice-by-slice basis. Therefore, information indicating whether or not adaptive loop filtering is applied to the current slice is included in the slice header or the picture header.

If the current slice indicates that adaptive loop filtering is applied, the slice header or picture header additionally includes information indicating the horizontal and / or vertical direction filter length of the luminance component used in the adaptive loop filtering process.

The slice header or picture header may include information indicating the number of filter sets. At this time, if the number of filter sets is two or more, the filter coefficients can be encoded using the prediction method. Accordingly, the slice header or the picture header may include information indicating whether or not the filter coefficients are encoded in the prediction method, and may include predicted filter coefficients when the prediction method is used.

On the other hand, not only luminance but also chrominance components can be adaptively filtered. Accordingly, the slice header or the picture header may include information indicating whether or not each of the color difference components is filtered. In this case, in order to reduce the number of bits, information indicating whether or not to filter Cr and Cb can be joint-coded (i.e., multiplexed coding).

In the case of the chrominance components, the case in which neither Cr nor Cb is filtered is likely to be the most frequent, because filtering is often skipped to reduce complexity. Therefore, when neither Cr nor Cb is filtered, the smallest index is allocated and entropy encoding is performed.

When both Cr and Cb are filtered, the largest index is allocated and entropy encoding is performed.
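The joint coding of the two chrominance filter flags can be sketched as a single index. The smallest and largest indices follow the text; the assignment of the two middle indices (Cr only, Cb only) is an assumption made for illustration, and the function name is hypothetical.

```python
def chroma_filter_index(filter_cr, filter_cb):
    """Map the (Cr, Cb) filter flags to one jointly coded index."""
    if not filter_cr and not filter_cb:
        return 0  # most frequent case gets the smallest index
    if filter_cr and filter_cb:
        return 3  # both filtered gets the largest index
    # Order of the two single-component cases is an assumption.
    return 1 if filter_cr else 2


if __name__ == "__main__":
    for cr in (False, True):
        for cb in (False, True):
            print(cr, cb, chroma_filter_index(cr, cb))
```

Giving the most frequent combination the smallest index lets the entropy coder spend the fewest bits on it.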

The picture storage unit 180 receives the post-processed image data from the post-processing unit 170, and reconstructs and stores the image on a picture-by-picture basis. The picture may be a frame-based image or a field-based image. The picture storage unit 180 has a buffer (not shown) capable of storing a plurality of pictures.

The inter-prediction unit 160 performs motion estimation using at least one reference picture stored in the picture storage unit 180, and determines a reference picture index and a motion vector indicating a reference picture.

Based on the determined reference picture index and motion vector, the inter prediction unit 160 extracts a prediction block corresponding to the prediction unit to be coded from the reference picture used for motion estimation among the plurality of reference pictures stored in the picture storage unit 180, and outputs the prediction block.

The intra prediction unit 150 performs intra prediction encoding using the reconstructed pixel values in the picture that includes the current prediction unit.

The intra prediction unit 150 receives the current prediction unit to be predictively encoded and selects one of a predetermined number of intra prediction modes according to the size of the current block to perform intra prediction.

The intra prediction unit 150 adaptively filters the reference pixels to generate intra prediction blocks. If some reference pixels are not available, they may be generated using the available reference pixels.

The entropy coding unit 140 entropy-codes the quantized coefficients quantized by the quantization unit 130, the intra prediction information received from the intra prediction unit 150, the motion information received from the inter prediction unit 160, and the like.

FIG. 3 is a block diagram of an embodiment of a configuration for performing inter prediction in the encoding apparatus. The illustrated inter prediction encoding apparatus includes a motion information determination unit 161, a motion information encoding mode determination unit 162, a motion information encoding unit 163, a prediction block generating unit 164, a residual block generating unit 165, a residual block encoding unit 166, and a multiplexer 167.

Referring to FIG. 3, the motion information determination unit 161 determines motion information of a current block. The motion information includes a reference picture index and a motion vector. The reference picture index indicates any one of the previously coded and reconstructed pictures.

When the current block is unidirectionally inter-prediction coded, the reference picture index indicates one of the reference pictures belonging to list 0 (L0). On the other hand, when the current block is bidirectionally prediction coded, the motion information may include a reference picture index indicating one of the reference pictures of list 0 (L0) and a reference picture index indicating one of the reference pictures of list 1 (L1).

In addition, when the current block is bi-directionally predictive-coded, it may include an index indicating one or two pictures among the reference pictures of the composite list LC generated by combining the list 0 and the list 1.

The motion vector indicates the position of the prediction block in the picture indicated by each reference picture index. The motion vector may have integer-pixel or sub-pixel resolution.

For example, it may have a resolution of 1/2, 1/4, 1/8 or 1/16 pixel. When the motion vector does not have integer-pixel resolution, the prediction block is generated from the pixels at integer positions.

The motion information encoding mode determination unit 162 determines whether the motion information of the current block is to be coded in the skip mode, the merge mode, or the AMVP mode.

The skip mode is applied when there is a skip candidate having the same motion information as the current block motion information, and the residual signal is zero. The skip mode is also applied when the current block is the same size as the coding unit. The current block can be viewed as a prediction unit.

The merge mode is applied when there is a merge candidate having the same motion information as the current block motion information. The merge mode is applied when the current block differs in size from the coding unit, or when the sizes are the same but a residual signal exists. The merge candidate and the skip candidate can be the same.

AMVP mode is applied when skip mode and merge mode are not applied. The AMVP candidate having the motion vector most similar to the motion vector of the current block is selected as the AMVP predictor.
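The choice among skip, merge, and AMVP described above can be sketched as a small decision function. The function and parameter names are hypothetical. The similarity measure used to pick the AMVP predictor (sum of absolute component differences) is an assumption, since the text only says "most similar".

```python
def choose_motion_mode(has_same_motion_candidate, residual_is_zero, block_is_cu_size):
    """Pick the motion information encoding mode for the current block."""
    if has_same_motion_candidate:
        if residual_is_zero and block_is_cu_size:
            return "skip"
        # Same motion as a candidate, but the block size differs from the
        # coding unit or a residual signal exists.
        return "merge"
    return "amvp"


def select_amvp_predictor(mv, candidates):
    """Index of the candidate motion vector closest to mv (assumed L1 distance)."""
    def distance(i):
        return abs(mv[0] - candidates[i][0]) + abs(mv[1] - candidates[i][1])
    return min(range(len(candidates)), key=distance)


if __name__ == "__main__":
    print(choose_motion_mode(True, True, True))
    print(choose_motion_mode(True, False, True))
    print(choose_motion_mode(False, False, True))
    print(select_amvp_predictor((5, -3), [(0, 0), (4, -2), (9, 9)]))
```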

The motion information encoding unit 163 encodes the motion information according to the method determined by the motion information encoding mode determination unit 162. When the motion information encoding mode is the skip mode or the merge mode, a merge motion vector encoding process is performed. When the motion information encoding mode is AMVP, the AMVP encoding process is performed.

The prediction block generation unit 164 generates a prediction block using the motion information of the current block. If the motion vector is an integer unit, the block corresponding to the position indicated by the motion vector in the picture indicated by the reference picture index is copied to generate a prediction block of the current block.

However, when the motion vector is not an integer unit, the pixels of the prediction block are generated from the pixels in the integer unit in the picture indicated by the reference picture index.

In this case, in the case of a luminance pixel, a prediction pixel can be generated using an 8-tap interpolation filter. In the case of a chrominance pixel, a 4-tap interpolation filter can be used to generate a predictive pixel.
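A one-dimensional sketch of sub-pixel interpolation with 8-tap and 4-tap filters is given below. The text does not list filter coefficients; the values used here are the half-sample coefficients of the HEVC standard and are an assumption for this illustration, as are the function name, the edge clamping, and the single-stage rounding.

```python
# Half-sample filter taps (each set sums to 64). Taken from HEVC as an assumption.
LUMA_HALF_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)  # 8-tap, luminance
CHROMA_HALF_TAPS = (-4, 36, 36, -4)                # 4-tap, chrominance


def interpolate_half(samples, pos, taps, bit_depth=8):
    """Half-sample value between samples[pos] and samples[pos + 1]."""
    n = len(taps)
    start = pos - (n // 2 - 1)  # leftmost integer sample used by the filter
    acc = 0
    for k, coeff in enumerate(taps):
        # Clamp the index so positions outside the row repeat the edge pixel.
        idx = min(max(start + k, 0), len(samples) - 1)
        acc += coeff * samples[idx]
    value = (acc + 32) >> 6  # divide by 64 with rounding
    return min(max(value, 0), (1 << bit_depth) - 1)  # clip to the valid range


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
    print(interpolate_half(ramp, 4, LUMA_HALF_TAPS))
    print(interpolate_half(ramp, 4, CHROMA_HALF_TAPS))
```

A two-dimensional sub-pixel position would apply such a filter horizontally and then vertically; the longer luminance filter trades more computation for a sharper prediction.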

The residual block generating unit 165 generates a residual block using the current block and the prediction block of the current block. If the current block size is 2Nx2N, a residual block is generated using a 2Nx2N prediction block corresponding to the current block and the current block.

However, if the current block size used for prediction is 2NxN or Nx2N, a prediction block is obtained for each of the two 2NxN blocks constituting the 2Nx2N block, and a final 2Nx2N prediction block can then be generated from the two 2NxN prediction blocks.

The 2Nx2N residual block may be generated using the 2Nx2N prediction block. To resolve the discontinuity at the boundary between the two 2NxN prediction blocks, the pixels of the boundary portion may be overlap-smoothed.

The residual block coding unit 166 divides the generated residual block into one or more transform units. Then, each transform unit is transform coded, quantized, and entropy encoded. At this time, the size of the transform unit may be determined according to the size of the residual block in a quadtree manner.

The residual block coding unit 166 transforms the residual block generated by the inter prediction method using an integer-based transform matrix. The transform matrix is an integer-based DCT matrix.

The residual block coding unit 166 uses a quantization matrix to quantize the coefficients of the residual block transformed by the transform matrix. The quantization matrix is determined by a quantization parameter.

The quantization parameter is determined for each coding unit equal to or larger than a predetermined size. The predetermined size may be 8x8 or 16x16. Therefore, when the current coding unit is smaller than the predetermined size, only the quantization parameter of the first coding unit in coding order among the plurality of coding units within the predetermined size is encoded, and the quantization parameters of the remaining coding units are the same as that parameter and therefore need not be encoded.
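The rule that only the first small coding unit in each region of the predetermined size carries a quantization parameter can be sketched as follows. The grouping of coding units by a grid aligned to the predetermined size is an assumption, and the function name and the `(x, y, size)` tuple layout are hypothetical.

```python
def qp_signalling_flags(cus, min_size=16):
    """For CUs given in coding order as (x, y, size), return a list of bools.

    True means the CU's quantization parameter is encoded. False means the CU
    reuses the parameter of the first CU in the same min_size x min_size region.
    """
    seen_regions = set()
    flags = []
    for x, y, size in cus:
        if size >= min_size:
            flags.append(True)  # large enough to carry its own parameter
            continue
        region = (x // min_size, y // min_size)  # assumed grid-aligned grouping
        flags.append(region not in seen_regions)
        seen_regions.add(region)
    return flags


if __name__ == "__main__":
    # Four 8x8 CUs sharing one 16x16 region, followed by one 16x16 CU.
    cus = [(0, 0, 8), (8, 0, 8), (0, 8, 8), (8, 8, 8), (16, 0, 16)]
    print(qp_signalling_flags(cus))
```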

The coefficients of the transform block are quantized using a quantization matrix determined according to the determined quantization parameter and the prediction mode.

The quantization parameter determined for each coding unit equal to or larger than the predetermined size is predictively encoded using the quantization parameter of a coding unit adjacent to the current coding unit. A quantization parameter predictor of the current coding unit can be generated by searching the left coding unit and then the upper coding unit of the current coding unit, and using one or two valid quantization parameters.

For example, the first valid quantization parameter found in the above order may be determined as the quantization parameter predictor. Alternatively, the search may proceed in the order of the left coding unit and then the coding unit immediately preceding in coding order, and the first valid quantization parameter found may be determined as the quantization parameter predictor.

The coefficients of the quantized transform block are scanned and converted into one-dimensional quantization coefficients. The scanning scheme can be set differently according to the entropy encoding mode. For example, in the case of CABAC encoding, the inter prediction encoded quantized coefficients can be scanned in a predetermined manner (zigzag or raster scan in the diagonal direction). On the other hand, when encoded by CAVLC, it can be scanned in a different manner from the above method.

For example, the scanning scheme may be zigzag in the case of inter prediction, and may be determined according to the intra prediction mode in the case of intra prediction. The coefficient scanning method may also be determined depending on the size of the transform unit.

The scan pattern may vary according to the directional intra prediction mode. The quantization coefficients are scanned in the reverse direction.

The multiplexer 167 multiplexes the motion information encoded by the motion information encoder 163 and the residual signals encoded by the residual block encoder. The motion information may vary depending on the encoding mode.

That is, in the case of skip or merge mode, only the index indicating the predictor is included. In the case of AMVP, however, the reference picture index, the difference motion vector, and the AMVP index of the current block are included.

Hereinafter, an operation of the intra predictor 150 will be described in detail.

First, the prediction mode information and the size of the prediction block are received from the picture division unit 110; the prediction mode information indicates an intra mode. The size of the prediction block may be a square of 64x64, 32x32, 16x16, 8x8, 4x4, or the like, but is not limited thereto. That is, the prediction block may be non-square rather than square.

Next, the reference pixel is read from the picture storage unit 180 to determine the intra-prediction mode of the prediction block.

Whether reference pixels need to be generated is determined by checking whether any unavailable reference pixel exists. The reference pixels are used to determine the intra prediction mode of the current block.

If the current block is located at the upper boundary of the current picture, pixels adjacent to the upper side of the current block are not defined. In addition, when the current block is located at the left boundary of the current picture, pixels adjacent to the left side of the current block are not defined.

These pixels are determined to be unavailable. Pixels are also determined to be unavailable when the current block is located at a slice boundary and the pixels adjacent to the upper or left side of the slice are not pixels that have already been encoded and reconstructed.

As described above, when there are no pixels adjacent to the left or upper side of the current block, or there are no pixels that have been previously coded and reconstructed, the intra prediction mode of the current block may be determined using only available pixels.

However, it is also possible to use the available reference pixels of the current block to generate reference pixels of unusable positions. For example, if the pixels of the upper block are not available, the upper pixels may be created using some or all of the left pixels, or vice versa.

That is, the available reference pixel at the position closest, in a predetermined direction, to the reference pixel at the unavailable position can be copied to generate the reference pixel. When no available reference pixel exists in the predetermined direction, the available reference pixel at the closest position in the opposite direction can be copied instead.
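The copy rule above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `fill_reference_pixels`, the use of `None` to mark unavailable positions, the choice of "toward lower indices" as the predetermined direction, and the fallback value 128 when no pixel is available at all are assumptions made for the example.

```python
from typing import List, Optional


def fill_reference_pixels(ref: List[Optional[int]], default: int = 128) -> List[int]:
    """Fill unavailable reference pixels (None) by copying the nearest available one.

    Assumption: the 'predetermined direction' is toward lower indices; if nothing
    is available there, the nearest pixel in the opposite direction is copied.
    """
    if all(p is None for p in ref):
        # Nothing to copy from; 'default' is a hypothetical mid-gray for 8-bit video.
        return [default] * len(ref)

    out = list(ref)
    for i, pixel in enumerate(ref):
        if pixel is not None:
            continue
        # Search in the predetermined direction first.
        j = i - 1
        while j >= 0 and ref[j] is None:
            j -= 1
        if j < 0:
            # No available pixel there: search in the opposite direction.
            j = i + 1
            while ref[j] is None:
                j += 1
        out[i] = ref[j]
    return out


filled = fill_reference_pixels([None, None, 10, None, 20])
```

Copying always from the original list (`ref`) rather than from already filled values keeps the result independent of the traversal order.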

On the other hand, even if the upper or left pixels of the current block exist, the reference pixel may be determined as an unavailable reference pixel according to the encoding mode of the block to which the pixels belong.

For example, if the block to which the reference pixels adjacent to the upper side of the current block belong is a block that was inter-coded and reconstructed, those pixels can be determined to be unavailable.

In this case, usable reference pixels can be generated from pixels belonging to blocks adjacent to the current block that were intra-coded and reconstructed. In this case, information indicating that the encoder determines the available reference pixels according to the encoding mode must be transmitted to the decoder.

Next, an intra prediction mode of the current block is determined using the reference pixels. The number of intra prediction modes that can be allowed in the current block may vary depending on the size of the block. For example, if the current block size is 8x8, 16x16, or 32x32, there may be 34 intra prediction modes. If the current block size is 4x4, 17 intra prediction modes may exist.

The 34 or 17 intra prediction modes may include at least one non-directional mode and a plurality of directional modes.

The one or more non-directional modes may be a DC mode and / or a planar mode. When the DC mode and the planar mode are included in the non-directional mode, there may be 35 intra-prediction modes regardless of the size of the current block.

At this time, it may include two non-directional modes (DC mode and planar mode) and 33 directional modes.

The planar mode generates a prediction block of the current block using the reference pixels and at least one pixel value located at the bottom-right of the current block (or a predicted value of that pixel value, hereinafter referred to as a first reference value).

The configuration of the moving picture decoding apparatus according to an embodiment of the present invention can be derived from the configuration of the moving picture coding apparatus described with reference to FIGS. 1 to 3. For example, an image can be decoded by performing the inverse of the encoding process.

FIG. 4 is a block diagram illustrating a configuration of a moving picture decoding apparatus according to an embodiment of the present invention.

Referring to FIG. 4, the moving picture decoding apparatus according to the present invention includes an entropy decoding unit 210, an inverse quantization / inverse transform unit 220, an adder 270, a deblocking filter 250, a picture storage unit 260, an intra prediction unit 230, a motion compensation prediction unit 240, and an intra / inter changeover switch 280.

The entropy decoding unit 210 decodes the encoded bit stream transmitted from the moving picture encoding apparatus into an intra prediction mode index, motion information, a quantized coefficient sequence, and the like. The entropy decoding unit 210 supplies the decoded motion information to the motion compensation prediction unit 240.

The entropy decoding unit 210 supplies the intra prediction mode index to the intra prediction unit 230 and the inverse quantization / inverse transform unit 220. In addition, the entropy decoding unit 210 supplies the quantized coefficient sequence to the inverse quantization / inverse transform unit 220.

The inverse quantization / inverse transform unit 220 converts the quantized coefficient sequence into a two-dimensional array of inverse quantization coefficients. One of a plurality of scanning patterns is selected for the conversion, based on at least one of the prediction mode of the current block (i.e., intra prediction or inter prediction) and the intra prediction mode.

The intraprediction mode is received from an intraprediction unit or an entropy decoding unit.

The inverse quantization / inverse transform unit 220 restores the quantization coefficients by applying, to the two-dimensional array of inverse quantization coefficients, a quantization matrix selected from a plurality of quantization matrices. A different quantization matrix is applied according to the size of the current block to be restored, and for blocks of the same size the quantization matrix is selected based on at least one of the prediction mode and the intra prediction mode of the current block.

Then, the reconstructed quantized coefficient is inversely transformed to reconstruct the residual block.

The adder 270 reconstructs the image block by adding the residual block reconstructed by the inverse quantization / inverse transforming unit 220 to the intra prediction unit 230 or the prediction block generated by the motion compensation prediction unit 240.

The deblocking filter 250 performs deblocking filter processing on the reconstructed image generated by the adder 270. Accordingly, blocking artifacts caused by image loss in the quantization process can be reduced.

The picture storage unit 260 is a frame memory for holding a local decoded picture in which the deblocking filter process is performed by the deblocking filter 250.

The intraprediction unit 230 restores the intra prediction mode of the current block based on the intra prediction mode index received from the entropy decoding unit 210. A prediction block is generated according to the restored intra prediction mode.

The motion compensation prediction unit 240 generates a prediction block for the current block from the picture stored in the picture storage unit 260 based on the motion vector information. When motion compensation with fractional precision is applied, the prediction block is generated by applying a selected interpolation filter.

The intra / inter selector switch 280 provides the adder 270 with a prediction block generated in either the intra prediction unit 230 or the motion compensation prediction unit 240 based on the coding mode.

FIG. 5 is a block diagram of an embodiment for performing inter prediction in a decoding apparatus. The inter prediction decoding apparatus includes a demultiplexer 241, a motion information encoding mode determination unit 242, a merge mode motion information decoding unit 243, an AMVP mode motion information decoding unit 244, a prediction block generating unit 245, a residual block decoding unit 246, and a restoration block generating unit 247.

Referring to FIG. 5, the demultiplexer 241 demultiplexes the current encoded motion information and the encoded residual signals from the received bitstream. The demultiplexer 241 transmits the demultiplexed motion information to the motion information encoding mode determination unit 242 and transmits the demultiplexed residual signal to the residual block decoding unit 246.

The motion information encoding mode determination unit 242 determines a motion information encoding mode of the current block. When the skip_flag of the received bitstream has a value of 1, the motion information encoding mode determination unit 242 determines that the motion information encoding mode of the current block is encoded in the skip encoding mode.

When the skip_flag of the received bitstream has a value of 0 and the motion information received from the demultiplexer 241 has only a merge index, the motion information encoding mode determination unit 242 determines that the motion information encoding mode of the current block is the merge mode.

When the skip_flag of the received bitstream has a value of 0 and the motion information received from the demultiplexer 241 has a reference picture index, a differential motion vector, and an AMVP index, the motion information encoding mode determination unit 242 determines that the motion information encoding mode of the current block is the AMVP mode.

The merge mode motion information decoding unit 243 is activated when the motion information encoding mode determination unit 242 determines the motion information encoding mode of the current block as a skip or merge mode.

The AMVP mode motion information decoding unit 244 is activated when the motion information encoding mode determination unit 242 determines that the motion information encoding mode of the current block is the AMVP mode.
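The decision made by the motion information encoding mode determination unit 242, and the resulting choice between units 243 and 244, can be sketched as below. The `MotionInfo` container, its field names, and the function names are hypothetical and exist only for this illustration; the text specifies the conditions, not a data layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MotionInfo:
    """Hypothetical container for demultiplexed motion information."""
    merge_index: Optional[int] = None
    ref_pic_index: Optional[int] = None
    mvd: Optional[Tuple[int, int]] = None   # differential motion vector
    amvp_index: Optional[int] = None


def motion_info_mode(skip_flag: int, info: MotionInfo) -> str:
    """Return 'skip', 'merge' or 'amvp' following the conditions in the text."""
    if skip_flag == 1:
        return "skip"
    amvp_fields = (info.ref_pic_index, info.mvd, info.amvp_index)
    if all(f is not None for f in amvp_fields):
        return "amvp"
    if info.merge_index is not None:
        return "merge"
    raise ValueError("motion information matches no known mode")


def select_decoder(mode: str) -> int:
    """Return the reference numeral of the unit activated for the mode."""
    return 243 if mode in ("skip", "merge") else 244


mode = motion_info_mode(0, MotionInfo(merge_index=2))
```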

The prediction block generator 245 generates a prediction block of the current block using the motion information reconstructed by the merge mode motion information decoding unit 243 or the AMVP mode motion information decoding unit 244.

If the motion vector is an integer unit, the block corresponding to the position indicated by the motion vector in the picture indicated by the reference picture index is copied to generate a prediction block of the current block.

However, when the motion vector is not an integer unit, the pixels of the prediction block are generated from the integer unit pixels in the picture indicated by the reference picture index. In this case, in the case of a luminance pixel, a prediction pixel can be generated using an 8-tap interpolation filter. In the case of a chrominance pixel, a 4-tap interpolation filter can be used to generate a predictive pixel.
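As an illustration of fractional-sample interpolation with an 8-tap luma filter and a 4-tap chroma filter, the sketch below interpolates a half-sample position along one row. The text only states the tap counts; the coefficient sets used here are the HEVC half-sample filters and are an assumption of this example, as are the function name and the edge clamping.

```python
from typing import Sequence

# Assumed coefficients (HEVC half-sample filters); the text gives only tap counts.
LUMA_HALF_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)   # 8 taps, sum = 64
CHROMA_HALF_TAPS = (-4, 36, 36, -4)                  # 4 taps, sum = 64


def interpolate_half(row: Sequence[int], x: int, taps: Sequence[int],
                     bit_depth: int = 8) -> int:
    """Interpolate the half-sample between row[x] and row[x + 1]."""
    n = len(taps)
    start = x - (n // 2 - 1)            # leftmost integer sample covered by the filter
    acc = 0
    for k, coeff in enumerate(taps):
        idx = min(max(start + k, 0), len(row) - 1)   # clamp at the row edges
        acc += coeff * row[idx]
    value = (acc + 32) >> 6             # round and divide by 64
    return min(max(value, 0), (1 << bit_depth) - 1)


luma_sample = interpolate_half([0, 0, 0, 0, 64, 64, 64, 64], 3, LUMA_HALF_TAPS)
chroma_sample = interpolate_half([0, 0, 64, 64], 1, CHROMA_HALF_TAPS)
```

A flat row is reproduced exactly because each coefficient set sums to 64.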

The residual block decoding unit 246 entropy decodes the residual signal. Then, the entropy-decoded coefficients are inversely scanned to generate a two-dimensional quantized coefficient block. The inverse scanning method can be changed according to the entropy decoding method.

That is, the inverse scanning method of the inter prediction residual signal may differ between decoding based on CABAC and decoding based on CAVLC. For example, a diagonal raster inverse scanning method can be applied when decoding is based on CABAC, and a zigzag inverse scanning method can be applied when decoding is based on CAVLC.

In addition, the inverse scanning method may be determined depending on the size of the prediction block.
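Inverse scanning places a one-dimensional coefficient sequence back into a two-dimensional block. The sketch below does this for a zigzag pattern only; the helper names are hypothetical, and the conventional zigzag order starting at the top-left corner is assumed rather than taken from the text.

```python
from typing import List, Sequence, Tuple


def zigzag_order(n: int) -> List[Tuple[int, int]]:
    """Return (row, col) positions of an n x n block in zigzag order."""
    order: List[Tuple[int, int]] = []
    for s in range(2 * n - 1):                       # s = row + col of the anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked upward, odd diagonals downward.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def inverse_zigzag_scan(coeffs: Sequence[int], n: int) -> List[List[int]]:
    """Rebuild an n x n coefficient block from a zigzag-ordered sequence."""
    block = [[0] * n for _ in range(n)]
    for (r, c), value in zip(zigzag_order(n), coeffs):
        block[r][c] = value
    return block


block = inverse_zigzag_scan(list(range(9)), 3)
```

Other patterns, such as a diagonal raster scan, would only need a different position list in place of `zigzag_order`.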

The residual block decoding unit 246 dequantizes the generated coefficient block using an inverse quantization matrix. The quantization parameter is restored in order to derive the quantization matrix. The quantization step size is restored for each coding unit of a predetermined size or more.

The predetermined size may be 8x8 or 16x16. Accordingly, when the current coding unit is smaller than the predetermined size, only the quantization parameter of the first coding unit in coding order among the plurality of coding units within the predetermined size is restored; the quantization parameters of the remaining coding units are the same as that parameter and therefore need not be decoded.

The quantization parameter of a coding unit adjacent to the current coding unit is used to restore the quantization parameter determined for each coding unit equal to or larger than the predetermined size. The left coding unit and then the upper coding unit of the current coding unit are searched, in that order, and the first valid quantization parameter found is determined as the quantization parameter predictor of the current coding unit.

Alternatively, the left coding unit and then the coding unit immediately preceding in coding order may be searched, in that order, and the first valid quantization parameter found may be determined as the quantization parameter predictor. The quantization parameter of the current prediction unit is then restored using the determined quantization parameter predictor and the difference quantization parameter.
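The predictor search and the final reconstruction can be sketched as follows. The function names are hypothetical, `None` marks a neighbor whose quantization parameter is not valid, and the caller chooses the search order (for example left then upper, or left then previous in coding order).

```python
from typing import Optional, Sequence


def predict_qp(candidates: Sequence[Optional[int]]) -> int:
    """Return the first valid quantization parameter in the given search order."""
    for qp in candidates:
        if qp is not None:
            return qp
    raise ValueError("no valid quantization parameter among the candidates")


def restore_qp(diff_qp: int, candidates: Sequence[Optional[int]]) -> int:
    """Restore the quantization parameter as predictor + difference."""
    return predict_qp(candidates) + diff_qp


# Left neighbor unavailable, upper neighbor has QP 30, transmitted difference is -2.
qp = restore_qp(-2, [None, 30])
```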

The residual block decoding unit 246 inversely transforms the dequantized coefficient block to recover the residual block.

The restoration block generating unit 247 adds the prediction block generated by the prediction block generating unit 245 and the residual block generated by the residual block decoding unit 246 to generate a reconstructed block.

Hereinafter, a process of restoring a current block through intraprediction will be described with reference to FIG.

First, the intra prediction mode of the current block is decoded from the received bitstream. For this, the entropy decoding unit 210 recovers the first intra prediction mode index of the current block by referring to one of the plurality of intra prediction mode tables.

The plurality of intra prediction mode tables are tables shared by the encoder and the decoder, and may be any one selected according to the distribution of intra prediction modes of a plurality of blocks adjacent to the current block.

For example, if the intra prediction mode of the left block of the current block and the intra prediction mode of the upper block of the current block are the same, the first intra prediction mode index of the current block is restored by applying the first intra prediction mode table; if they are not the same, the first intra prediction mode index of the current block can be restored by applying the second intra prediction mode table.

As another example, when the intra prediction modes of the upper block and the left block of the current block are both directional intra prediction modes, the first intra prediction mode index of the current block is restored by applying the first intra prediction mode table if the direction of the intra prediction mode of the upper block and the direction of the intra prediction mode of the left block are within a predetermined angle of each other; if they are outside the predetermined angle, the first intra prediction mode index can be restored by applying the second intra prediction mode table.

The entropy decoding unit 210 transmits the first intra-prediction mode index of the restored current block to the intra-prediction unit 230.

The intra prediction unit 230, having received the first intra prediction mode index, determines the most probable mode of the current block as the intra prediction mode of the current block when the index has the minimum value (i.e., 0).

However, if the index has a value other than 0, the index indicated by the most probable mode of the current block is compared with the first intra prediction mode index. If the first intra prediction mode index is not smaller than the index indicated by the most probable mode of the current block, the intra prediction mode corresponding to a second intra prediction mode index, obtained by adding 1 to the first intra prediction mode index, is determined as the intra prediction mode of the current block; otherwise, the intra prediction mode corresponding to the first intra prediction mode index is determined as the intra prediction mode of the current block.
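The index comparison can be written out as a small function. This sketch follows the rule literally as worded in the text; the function and parameter names are hypothetical, and it returns a mode index, leaving the mapping from index to actual prediction mode (via the shared tables) out of scope.

```python
def restore_intra_mode_index(first_index: int, mpm_index: int) -> int:
    """Return the mode index of the current block.

    first_index: first intra prediction mode index decoded from the bitstream.
    mpm_index:   index indicated by the most probable mode of the current block.
    """
    if first_index == 0:
        # Minimum value: the most probable mode itself is used.
        return mpm_index
    if first_index >= mpm_index:
        # Not smaller than the most probable mode index: use the second index.
        return first_index + 1
    return first_index


mode_index = restore_intra_mode_index(5, 5)
```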

The intra prediction mode acceptable for the current block may be composed of at least one non-directional mode and a plurality of directional modes.

The one or more non-directional modes may be a DC mode and / or a planar mode. In addition, either the DC mode or the planar mode may be adaptively included in the allowable intra prediction mode set.

To this end, information specifying the non-directional mode included in the allowable intra prediction mode set may be included in the picture header or slice header.

Next, in order to generate an intra prediction block, the intra prediction unit 230 reads the reference pixels stored in the picture storage unit 260 and determines whether any reference pixel is unavailable.

The determination may be made according to the presence or absence of the reference pixels used to generate the intra prediction block by applying the decoded intra prediction mode of the current block.

Next, when it is necessary to generate a reference pixel, the intra predictor 230 generates reference pixels of a position that is not available using the reconstructed available reference pixels.

The definition of a reference pixel that is not available and the method of generating a reference pixel are the same as those in the intra prediction unit 150 shown in FIG. However, it is also possible to selectively reconstruct a reference pixel used for generating an intra prediction block according to the decoded intra prediction mode of the current block.

Next, the intraprediction unit 230 determines whether to apply a filter to the reference pixels to generate a prediction block. That is, the intra-prediction unit 230 determines whether to apply filtering on the reference pixels to generate an intra-prediction block of the current block based on the decoded intra-prediction mode and the size of the current prediction block.

Since the problem of blocking artifacts grows as the block size increases, the number of prediction modes for which reference pixels are filtered can be increased as the block size increases. However, when the block is larger than a predetermined size, it can be regarded as a flat area, so the reference pixels may be left unfiltered to reduce complexity.

If it is determined that the filter needs to be applied to the reference pixel, the reference pixels are filtered using a filter.

Two or more filters may be adaptively applied according to the degree of level difference between the reference pixels. The filter coefficients are preferably symmetric.

In addition, the above two or more filters may be adaptively applied according to the size of the current block. That is, when a filter is applied, a filter having a narrow bandwidth may be applied to a block having a small size, and a filter having a wide bandwidth may be applied to a block having a large size.

In the case of the DC mode, since the prediction block is generated from the average value of the reference pixels, there is no need to apply a filter. That is, applying a filter would only add unnecessary computation.

In addition, it is not necessary to apply the filter to the reference pixel in the vertical mode in which the image has vertical correlation. It is not necessary to apply the filter to the reference pixel even in the horizontal mode in which the image is related to the horizontal direction.

Since whether filtering should be applied is thus related to the intra prediction mode of the current block, the reference pixels can be adaptively filtered based on the intra prediction mode of the current block and the size of the prediction block.
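A simplified sketch of this decision and of a symmetric smoothing filter is given below. It is not the claimed rule set: the mode names, the size threshold `max_filtered_size`, and the [1, 2, 1] / 4 coefficients are assumptions, and the sketch does not model a per-size list of filtered modes.

```python
from typing import List, Sequence

UNFILTERED_MODES = {"dc", "vertical", "horizontal"}   # modes the text exempts


def should_filter_reference(mode: str, block_size: int,
                            max_filtered_size: int = 32) -> bool:
    """Decide whether reference pixels are filtered before prediction."""
    if mode in UNFILTERED_MODES:
        return False
    if block_size > max_filtered_size:
        # Treated as a flat area; skipping the filter reduces complexity.
        return False
    return True


def smooth_reference(ref: Sequence[int]) -> List[int]:
    """Apply a symmetric [1, 2, 1] / 4 filter; the two end pixels are kept."""
    out = list(ref)
    for i in range(1, len(ref) - 1):
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
    return out


smoothed = smooth_reference([0, 8, 0, 8])
```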

Next, according to the reconstructed intra prediction mode, a prediction block is generated using the reference pixel or the filtered reference pixels. Since the generation of the prediction block is the same as the operation in the encoder, it is omitted. Even in the planar mode, the operation is the same as that in the encoder, so it is omitted.

Next, it is determined whether to filter the generated prediction block. This determination may use information included in the slice header or the coding unit header. It may also be made according to the intra prediction mode of the current block.

If it is determined that the generated prediction block is to be filtered, the generated prediction block is filtered. Specifically, a new pixel is generated by filtering pixels at a specific position of a prediction block generated using available reference pixels adjacent to the current block.

This may be applied together at the time of generating the prediction block. For example, in the DC mode, a prediction pixel in contact with reference pixels among prediction pixels is filtered using a reference pixel in contact with the prediction pixel.

Therefore, the predictive pixel is filtered using one or two reference pixels according to the position of the predictive pixel. The filtering of the prediction pixel in the DC mode can be applied to the prediction block of all sizes. In the vertical mode, the prediction pixels adjacent to the left reference pixel among the prediction pixels of the prediction block may be changed using reference pixels other than the upper pixel used to generate the prediction block.

Likewise, in the horizontal mode, the prediction pixels adjacent to the upper reference pixel among the generated prediction pixels may be changed using reference pixels other than the left pixel used to generate the prediction block.
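For the DC mode, the filtering of prediction pixels that touch reference pixels can be sketched as below. The corner pixel uses two reference pixels and the other edge pixels use one, as described; the specific weights are borrowed from the HEVC DC boundary filter and are an assumption here, as is the function name.

```python
from typing import List, Sequence


def dc_predict_filtered(above: Sequence[int], left: Sequence[int]) -> List[List[int]]:
    """Build an n x n DC prediction block and filter its top row and left column."""
    n = len(above)
    dc = (sum(above) + sum(left) + n) // (2 * n)      # rounded average
    pred = [[dc] * n for _ in range(n)]

    # Corner pixel touches two reference pixels.
    pred[0][0] = (above[0] + left[0] + 2 * dc + 2) >> 2
    # Remaining top-row pixels touch one upper reference pixel each.
    for x in range(1, n):
        pred[0][x] = (above[x] + 3 * dc + 2) >> 2
    # Remaining left-column pixels touch one left reference pixel each.
    for y in range(1, n):
        pred[y][0] = (left[y] + 3 * dc + 2) >> 2
    return pred


pred = dc_predict_filtered([80, 80, 80, 80], [120, 120, 120, 120])
```

The filtered edge pixels move toward the neighboring reference values, which softens the step between the reference row or column and the flat DC block.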

The current block is reconstructed using the predicted block of the current block restored in this manner and the residual block of the decoded current block.

The moving picture bitstream according to an embodiment of the present invention may include PS (parameter sets) and slice data as a unit used to store coded data in one picture.

A PS (parameter set) is divided into a picture parameter set (hereinafter simply referred to as PPS), which is data corresponding to the header of each picture, and a sequence parameter set (hereinafter simply referred to as SPS). The PPS and the SPS may include the initialization information required to initialize each encoding.

The SPS is common reference information for decoding all pictures coded in a random access unit (RAU), and includes a profile, the maximum number of pictures usable for reference, a picture size, and the like, as shown in FIGS. 6 and 7.

The PPS is reference information for decoding each picture coded in the random access unit (RAU), and includes the type of variable length coding method, the initial value of the quantization step, and a plurality of reference pictures, as shown in FIG. 9.

On the other hand, the slice header SH includes information on the corresponding slice when coding in units of slices, and can be configured as shown in FIGS. 10 to 12.

Hereinafter, a configuration for scalably processing the above-described moving image encoding and decoding processing using a plurality of processing units will be described in detail.

An apparatus for processing moving images according to an exemplary embodiment of the present invention includes an image central processing unit for parsing parameter information or slice header information from moving image data input from a host, and a plurality of image processing units for processing the moving image according to the parsed information under the control of the image central processing unit. After the plurality of image processing units process their respectively allocated areas, boundary processing between the image processing units is performed, and the boundary processing is performed by an image processing unit adjacent to the corresponding boundary portion.

The image central processing unit can assign the boundary portions to be boundary-processed to the plurality of image processing units.

In addition, the boundary processing may include at least one of Deblock and SAO processing.

Each of the plurality of image processing units may include a first processing unit that communicates with the image central processing unit and performs entropy coding on the moving image data, and a second processing unit that processes the entropy-coded moving image data in coding units.

A method of processing a moving image in a moving image processing apparatus having an image central processing unit and a plurality of image processing units according to an embodiment of the present invention includes: parsing, by the image central processing unit, parameter information or slice header information from moving image data input from a host; processing, by the plurality of image processing units, the moving image according to the parsed information under the control of the image central processing unit; and performing boundary processing between the image processing units after the plurality of image processing units process their respectively allocated areas, wherein the boundary processing can be performed by an image processing unit adjacent to the boundary portion.

The method may further include a step in which the image central processing unit assigns the boundary portions to be boundary-processed to the plurality of image processing units.

The boundary processing may include at least one of Deblock and SAO processing.

In addition, each of the plurality of image processing units may include a first processing unit and a second processing unit. The first processing unit may communicate with the image central processing unit to perform entropy coding on the moving image data, and the second processing unit may process the entropy-coded moving image data in coding units.

Here, the moving image processing apparatus may refer to the VPU 300 described later, the image central processing unit to the V-CPU 310, and the image processing unit to the V-Core 320. The first processing unit may refer to the BPU 321, and the second processing unit to the VCE 322, both described later.

Here, the moving picture processing apparatus may include both a moving picture coding apparatus and a moving picture decoding apparatus. The moving picture decoding apparatus and the moving picture encoding apparatus may be implemented as apparatuses that perform mutually inverse processes, as described above with reference to FIGS. 1 to 4. Hereinafter, a moving picture decoding apparatus will be described as an example of the moving picture processing apparatus. However, the present invention is not limited to this, and the moving picture processing apparatus may be embodied as a moving picture coding apparatus that performs the inverse process of the moving picture decoding apparatus described later.

FIG. 13 is a diagram illustrating a layer structure of a moving picture decoding apparatus according to an embodiment of the present invention. Referring to FIG. 13, the moving picture decoding apparatus may include a video processing unit (VPU) 300 that performs a moving picture decoding function. The VPU 300 may include a V-CPU 310, a BPU 321, and a VCE 322. Here, the BPU 321 and the VCE 322 may be combined to form the V-Core 320.

Here, the VPU 300 according to an embodiment of the present invention may preferably include one V-CPU 310 and a plurality of V-Cores 320 (hereinafter referred to as Multi V-Core). However, the present invention is not limited to this, and the numbers may vary depending on how the VPU 300 is implemented.

The V-CPU 310 controls the overall operation of the VPU 300. In particular, the V-CPU 310 can parse a Video Parameter Set (VPS), an SPS, a PPS, and an SH in a received moving picture bitstream. Then, the V-CPU 310 can control the overall operation of the VPU 300 based on the parsed information.

For example, the V-CPU 310 can determine the number of V-Cores 320 to be used for data parallel processing based on the parsed information. When it determines that a plurality of V-Cores 320 are necessary for the data parallel processing, the V-CPU 310 can determine the area to be processed by each V-Core 320 of the Multi V-Core.

Also, the V-CPU 310 can determine the entry points of the bit stream for the area to be allocated to each V-Core 320.

Also, the V-CPU 310 can allocate the boundary areas in one picture, generated as a result of decoding with the Multi V-Core 320, to the Multi V-Core 320.
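One possible way for a controller such as the V-CPU 310 to split a picture among cores and hand out the resulting boundaries is sketched below. The text does not specify the allocation policy; the one-dimensional strip of tiles, the contiguous split, and the rule that the lower-indexed adjacent core takes each boundary are all assumptions of this example, as are the function and variable names.

```python
from typing import Dict, List, Tuple


def allocate_regions(num_tiles: int, num_cores: int
                     ) -> Tuple[Dict[int, List[int]], Dict[Tuple[int, int], int]]:
    """Assign tiles to cores and assign each inter-core boundary to an adjacent core."""
    regions: Dict[int, List[int]] = {core: [] for core in range(num_cores)}
    for tile in range(num_tiles):
        regions[tile * num_cores // num_tiles].append(tile)   # contiguous split

    owner = {tile: core for core, tiles in regions.items() for tile in tiles}

    boundaries: Dict[Tuple[int, int], int] = {}
    for tile in range(num_tiles - 1):
        if owner[tile] != owner[tile + 1]:
            # Boundary between regions of two cores: give it to one adjacent core.
            boundaries[(tile, tile + 1)] = owner[tile]
    return regions, boundaries


regions, boundaries = allocate_regions(4, 2)
```

Boundary processing such as deblocking or SAO across the edge between tiles 1 and 2 would then run on core 0 once both neighboring regions are decoded.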

The V-CPU 310 communicates with an application programming interface (API) on a picture-by-picture basis and can communicate with the V-Core 320 on a slice / tile basis.

The V-Core 320 performs decoding and boundary processing under the control of the V-CPU 310. For example, the V-Core 320 can decode the allocated area under the control of the V-CPU 310. In addition, the V-Core 320 can perform boundary processing on the boundary area allocated under the control of the V-CPU 310.

Here, the V-Core 320 may include a BPU 321 and a VCE 322.

The BPU 321 entropy decodes the data of the allocated area (slice or tile). That is, the BPU 321 can perform the functions of the entropy decoding unit 210 described above, and can derive Coding Tree Unit (CTU), Coding Unit (CU), Prediction Unit (PU), and Transform Unit (TU) level parameters. The BPU 321 can then control the VCE 322.

Here, the BPU 321 may communicate with the V-CPU 310 on a slice or tile basis and with the VCE 322 on a CTU-by-CTU basis.

The VCE 322 may perform TQ (Transform / Quantization), Intra-prediction, Inter-prediction, Loop Filtering (LF), and Memory compression by receiving the derived parameters of the BPU 321. That is, the VCE 322 may perform the functions of the inverse quantization / inverse transformation unit 220, the deblocking filter 250, the intra prediction unit 230, and the motion compensation prediction unit 240.

Here, the VCE 322 can process the allocated area by CTU-based pipelining.

FIG. 14 is a timing diagram illustrating a moving picture decoding operation of a VPU according to an embodiment of the present invention. Referring to FIG. 14, as described above, the V-CPU 310 allocates the areas of each picture (frame) to the Multi V-Core 320, and each V-Core 320 performs core processing and boundary processing.

Hereinafter, the operation of the V-CPU 310 will be described in detail.

Specifically, the V-CPU 310 can perform an interface operation with the host processor.

Also, the V-CPU 310 can parse a Video Parameter Set (VPS), a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), and a Slice Header (SH) in the received moving picture bitstream.

In addition, the V-CPU 310 can transmit information necessary for slice / tile decoding in the V-Core 320 using the parsed information. The necessary information may include 'Picture parameter data structure' and 'Slice control data structure'.

The 'Picture parameter data structure' may include the following information.

For example, the information contained in the sequence / picture header (e.g., picture size, scaling list, CTU, min / max CU size, min / max TU size, etc.) may be included.

This Picture parameter data structure can be set once during decoding of one picture.

The 'Slice control data structure' may include the following information.

For example, the information included in the Slice header (eg, slice type, slice / tile area information, reference picture list, weighted prediction parameter, etc.) may be included.
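As a non-limiting illustration, the two data structures could be modeled as follows. All field names and types are hypothetical; the description lists only the kinds of information each structure may carry, not a concrete layout.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PictureParams:
    """Sketch of a 'Picture parameter data structure' (hypothetical fields)."""
    width: int
    height: int
    ctu_size: int
    min_cu_size: int
    max_cu_size: int
    min_tu_size: int
    max_tu_size: int
    scaling_list: List[int] = field(default_factory=list)


@dataclass
class SliceControl:
    """Sketch of a 'Slice control data structure' (hypothetical fields)."""
    slice_type: str                      # assumed encoding: "I", "P" or "B"
    area: Tuple[int, int]                # assumed: (first CTU address, CTU count)
    ref_pic_list: List[int] = field(default_factory=list)
    weighted_pred: bool = False


def ctus_in_picture(p: PictureParams) -> int:
    """Number of CTUs covering the picture, rounding partial CTUs up."""
    cols = -(-p.width // p.ctu_size)     # ceiling division
    rows = -(-p.height // p.ctu_size)
    return cols * rows


pic = PictureParams(1920, 1080, 64, 8, 64, 4, 32)
first_slice = SliceControl(slice_type="I", area=(0, ctus_in_picture(pic)))
```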

This slice control data structure can be set when the slice changes. The inter-processor communication registers of the V-Core 320 or the slice parameter buffer in external memory can store N slice control data structures. If the storage is not full, the data structure corresponding to the slice currently being decoded can be stored. In this case, N can be determined by the time point at which the V-Core 320 notifies the V-CPU 310 of the completion of processing: after the pipe of the VCE 322 is completely flushed (N = 1), or while pipelining is maintained between the current segment and the next segment (N > 1).
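As a non-limiting illustration, the storage for N slice control data structures can be sketched as a bounded queue. The class and method names are hypothetical, and the first-in-first-out release order is an assumption.

```python
from collections import deque


class SliceParamBuffer:
    """Holds at most n slice control data structures (n plays the role of N)."""

    def __init__(self, n: int) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self._slots = deque()

    def is_full(self) -> bool:
        return len(self._slots) >= self.n

    def push(self, slice_control) -> bool:
        """V-CPU side: store a structure unless the buffer is full."""
        if self.is_full():
            return False
        self._slots.append(slice_control)
        return True

    def pop(self):
        """V-Core side: release the oldest structure once its segment is done."""
        return self._slots.popleft()


buf = SliceParamBuffer(2)
results = [buf.push("slice0"), buf.push("slice1"), buf.push("slice2")]
done = buf.pop()                 # the V-Core finishes the oldest segment
retry = buf.push("slice2")       # a slot is free again
```

With N = 1 the V-CPU can hand over the next structure only after the current segment has been released; with N > 1 it can queue the next one in advance.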

Here, the information transferred from the V-CPU 310 to the V-Core 320 may be transferred through the inter-processor communication registers of the V-Core 320. The inter-processor communication registers can be implemented as a fixed-size register array (file) or as an external memory. If they are implemented as an external memory, the V-CPU 310 stores the information in the external memory and the BPU 321 reads it from the external memory.

Meanwhile, even if the number of slice control data structures that can be stored in the V-Core 320 is 1 (or any number), the V-CPU 310 must be able to perform SH decoding and parameter generation so that the V-Core 320 is not left idle for a long time between segments, as shown in FIG.

Meanwhile, when a plurality of tiles are included in one slice and are processed in parallel by the multi V-Cores 320, the V-CPU 310 transmits the same slice control data structure to each of the multi V-Cores 320.

In addition, the V-CPU 310 can control the synchronization of the Multi V-Cores 320 for data parallel processing of the Multi V-Cores 320.

Also, the V-CPU 310 can handle the exception when the V-Core 320 generates an exception. For example, when an error is detected in parameter set decoding in the V-CPU 310, when an error is detected in slice data decoding in the BPU 321 of the V-Core 320, or when the decoding time specified for frame decoding is exceeded (for example, when the peripherals of the V-CPU 310 and the V-Core 320 are stalled due to an unknown error inside the VPU 300 or a failure of the system bus), a countermeasure can be taken to resolve the problem.

In addition, the V-CPU 310 can report completion to the API upon completion of frame decoding of the VPU 300.

In addition, the V-CPU 310 can determine the number of V-Cores 320 to be used for data parallel processing based on the parsed information. When it is determined that a plurality of V-Cores 320 are necessary for the data parallel processing, the V-CPU 310 can determine the area to be processed by each V-Core 320 of the Multi V-Cores 320.
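As a non-limiting illustration, one possible allocation policy is sketched below. The policy itself (use no more cores than there are tiles, and give each used core a contiguous run of tiles) is an assumption; the description does not specify how the V-CPU 310 arrives at the number of V-Cores or at the areas.

```python
from typing import List


def assign_tiles(num_tiles: int, num_cores: int) -> List[List[int]]:
    """Split tile indices 0..num_tiles-1 into contiguous runs, one per used core.

    Hypothetical policy: the number of cores used is min(num_tiles, num_cores),
    and any remainder tiles go to the lowest-numbered cores.
    """
    if num_tiles < 1 or num_cores < 1:
        raise ValueError("need at least one tile and one core")
    used = min(num_tiles, num_cores)
    base, extra = divmod(num_tiles, used)
    areas = []
    start = 0
    for core in range(used):
        count = base + (1 if core < extra else 0)
        areas.append(list(range(start, start + count)))
        start += count
    return areas
```

For example, under this assumed policy four tiles on two cores yield two tiles per core, while three tiles on four cores leave one core unused.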

Hereinafter, the operation of the BPU 321 will be described in detail.

The BPU 321 may entropy decode the data of the allocated area (slice or tile). The slice header (SH) is decoded by the V-CPU 310, and the BPU 321 does not decode the SH because it receives all the necessary information through the picture parameter data structure and the slice control data structure.

In addition, the BPU 321 can derive a CTU (Coding Tree Unit) / CU (Coding Unit) / PU (Prediction Unit) / TU (Transform Unit) level parameter.

The BPU 321 may also send the derived parameters to the VCE 322.

Excluding the information common to every block (picture size, segment offset / size, ...) and the source / destination addresses in the DMAC, the CTU / CU / PU / TU parameters, coefficients, and reference pixel data required for decode processing can be communicated between the BPU 321 and the VCE 322 through a FIFO. However, the segment-level parameters may be set in an internal register of the VCE 322 instead of the FIFO.
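As a non-limiting illustration, the FIFO between the BPU 321 and the VCE 322 can be modeled as a bounded producer/consumer queue, with a segment-level value held in a register-like dictionary rather than in the FIFO. The function name, the FIFO depth, and the `ctu_count` register are hypothetical.

```python
import queue
import threading
from typing import List


def run_segment(ctu_params: List[dict], fifo_depth: int = 4) -> List[dict]:
    """Pass per-CTU parameters from a BPU thread to a VCE thread via a FIFO."""
    fifo = queue.Queue(maxsize=fifo_depth)
    # Segment-level parameter: written once to a "register", not sent per CTU.
    vce_regs = {"ctu_count": len(ctu_params)}
    processed: List[dict] = []

    def bpu() -> None:
        for params in ctu_params:
            fifo.put(params)              # blocks while the FIFO is full

    def vce() -> None:
        for _ in range(vce_regs["ctu_count"]):
            processed.append(fifo.get())  # blocks while the FIFO is empty

    threads = [threading.Thread(target=bpu), threading.Thread(target=vce)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


params = [{"ctu": i} for i in range(6)]
out = run_segment(params, fifo_depth=2)
```

Because there is a single producer and a single consumer, the CTU order is preserved even when the producer has to wait on a full FIFO.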

In addition, the BPU 321 may function as a VCE controller for controlling the VCE 322. The VCE controller outputs the picture_init and segment_init signals and a software reset, which the BPU 321 can control by register setting, and each sub-block of the VCE 322 can use these signals for control.

When the BPU 321 sets the above picture / segment-level parameters in the VCE controller and then sets the segment run through a register, the VCE controller does not communicate with the BPU 321 until the decoding of the set segment is completed, and can control the decoding process by referring to the fullness of the parameter FIFO and the status information of each sub-block.

Also, the BPU 321 can process the exception when an exception occurs.

In addition, the BPU 321 can report to the V-CPU 310 when the slice / tile segment processing is completed.

The VCE 322 may perform TQ (Transform / Quantization), Intra-prediction, Inter-prediction, Loop Filtering (LF), and Memory compression by receiving the derived parameters of the BPU 321.

Here, the VCE 322 can process the allocated area by CTU-based pipelining.

According to the various embodiments of the present invention described above, header parsing and data processing can be separated, the separated data processing can be pipelined, and a V-CPU for this purpose can be provided.

Hereinafter, the boundary process performed in the Multi V-Core 320 as a post-process will be described in detail with reference to FIGS.

Referring to FIG. 16, the V-CORE 320 may store information for boundary processing in a line buffer and a column buffer while decoding the area allocated by the V-CPU 310. When receiving the boundary processing command from the V-CPU 310, the V-CORE 320 calculates the Deblock and SAO parameters using the information stored in the line buffers and the column buffers, and boundary processing can be performed. In this case, the boundary processing result can be stored in the frame buffer.

Here, in the column buffer and line buffer, information such as the pixel values corresponding to the boundary portion and the motion vector, qp, coeff, and SAO parameter for obtaining Deblock and SAO parameters can be stored.
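As a non-limiting illustration, the following sketch collects the samples a core might keep for later boundary processing. It assumes a rectangular region, stores only pixel values, and takes the bottom row as the line buffer and the right column as the column buffer; the motion vector, qp, coeff, and SAO information mentioned above is omitted, and the actual buffer contents and orientation are not specified at this level of detail.

```python
from typing import List, Tuple


def collect_boundary(
    picture: List[List[int]], x0: int, y0: int, w: int, h: int
) -> Tuple[List[int], List[int]]:
    """Return (line_buffer, column_buffer) for the region [x0, x0+w) x [y0, y0+h).

    Assumption: the line buffer holds the region's bottom row and the column
    buffer holds its right-most column.
    """
    line_buffer = [picture[y0 + h - 1][x] for x in range(x0, x0 + w)]
    column_buffer = [picture[y][x0 + w - 1] for y in range(y0, y0 + h)]
    return line_buffer, column_buffer


# Toy 4x4 picture whose sample value encodes its position as 10*y + x.
toy = [[10 * y + x for x in range(4)] for y in range(4)]
line_buf, col_buf = collect_boundary(toy, 0, 0, 2, 2)
```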

Here, as shown in FIG. 16, the boundary part may be a boundary part with an adjacent area generated by decoding the area allocated to each of the Multi V-COREs 320. Referring to FIG. 16, when each Multi V-Core 320 decodes an area allocated on a picture-by-picture basis, it can be seen that a boundary portion is generated between areas to be decoded by each Multi V-Core 320.

The area that the single core processes may include multiple tiles, in which case the boundaries between the tiles may be processed while sequentially decoding the tiles. For example, if tile 1 and tile 2 are adjacent, the boundary portion between tile 1 and tile 2 can be processed while processing tile 2 after processing tile 1.

However, when tile 1 and tile 2 are processed in core 1 and core 2, respectively (the multicore case), tile 1 and tile 2 are processed at the same time, so the boundary between them cannot be processed during decoding in this sequential manner.

Thus, according to an embodiment of the present invention, information for boundary portion processing can be stored for the post-processing (boundary processing) of the boundary portion.

Hereinafter, a method of allocating the boundary portions to be processed to the Multi V-Cores 320 will be described in detail with reference to FIGS. 18 to 21.

Referring to FIGS. 16 through 17, the boundary processing method according to an embodiment of the present invention can perform filtering using parameters required for the Deblock and SAO calculated by the Multi V-CORE 320. That is, according to an embodiment of the present invention, the processing of the boundary portion can be performed using the filtering engine of the Multi V-CORE 320 without adding any additional hardware for the boundary portion processing.

However, in this case, the boundary portions to be processed must first be allocated to each of the Multi V-COREs 320. This boundary portion allocation can be performed by the V-CPU 310.

For example, when the number of V-cores is two or four, as shown in FIG. 18, the V-CPU 310 can assign a boundary portion to each of the Multi V-COREs 320 to perform boundary processing.

Alternatively, when the number of V-cores is not two or four, the V-CPU 310 can allocate a boundary portion to each of the Multi V-COREs 320 as shown in FIG.

That is, each boundary portion can be allocated to a V-Core 320 adjacent to that boundary portion. However, the present invention is not limited thereto.
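As a non-limiting illustration, the following sketch allocates each boundary to one of the two cores adjacent to it. The grid arrangement of cores and the tie-breaking rule (pick the adjacent core with fewer boundaries so far) are assumptions; the allocations actually shown in the figures may differ.

```python
from typing import Dict, Tuple


def allocate_boundaries(cols: int, rows: int) -> Dict[Tuple[int, int], int]:
    """Map each boundary (pair of adjacent core ids) to the core that processes it.

    Cores are numbered row by row on a cols x rows grid (hypothetical layout).
    """
    load = {core: 0 for core in range(cols * rows)}
    edges = []
    for r in range(rows):
        for c in range(cols):
            core = r * cols + c
            if c + 1 < cols:
                edges.append((core, core + 1))       # vertical boundary
            if r + 1 < rows:
                edges.append((core, core + cols))    # horizontal boundary
    allocation = {}
    for a, b in edges:
        owner = a if load[a] <= load[b] else b       # always an adjacent core
        allocation[(a, b)] = owner
        load[owner] += 1
    return allocation


four_cores = allocate_boundaries(2, 2)
three_cores = allocate_boundaries(3, 1)
```

With four cores on a 2 x 2 grid this assumed rule gives each core exactly one boundary, so no core sits idle during boundary processing.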

In this case, each of the Multi V-COREs 320 can calculate Deblock and SAO parameters using the information stored in the line buffers and column buffers of the allocated boundary, as shown in FIG. Each of the Multi V-COREs 320 can perform boundary processing of the allocated boundary using the calculated parameters.

On the other hand, when the boundary portions allocated to the V-CORE 320 overlap each other, as shown in FIG. 21, the V-CORE 320 can perform both boundary processing on the vertical boundary portion and boundary processing on the horizontal boundary portion.

Further, the processing for the next frame cannot be performed before the boundary processing is completed. This is because the next frame can refer to the previous frame.
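As a non-limiting illustration, the ordering constraint can be sketched with a thread pool in which each phase must finish on all cores before the next phase or frame starts. The function name and the use of Python threads to stand in for V-Cores are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def decode_frames(num_frames: int, num_cores: int) -> List[Tuple[int, str, int]]:
    """Record (frame, phase, core) events with a barrier after every phase."""
    log: List[Tuple[int, str, int]] = []
    with ThreadPoolExecutor(max_workers=num_cores) as pool:
        for frame in range(num_frames):
            for phase in ("core", "boundary"):
                def work(core: int, frame: int = frame, phase: str = phase) -> None:
                    log.append((frame, phase, core))
                # list() waits for every core, so this line acts as the barrier:
                # boundary processing starts only after all core processing, and
                # the next frame starts only after all boundary processing.
                list(pool.map(work, range(num_cores)))
    return log


events = decode_frames(2, 3)
phases = [(frame, phase) for frame, phase, _ in events]
```

The order of cores within one phase is not fixed, but no event of a later phase or frame can appear before an earlier phase has completed.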

As described above, when the V-CPU 310 allocates the boundary portions to be processed to each of the Multi V-COREs 320, the usage rate of the Multi V-CORE 320 can be increased and the processing time can be reduced. This can also be cost effective, because it eliminates the extra hardware cost for boundary processing.

The method according to the present invention may be implemented as a program for execution on a computer and stored in a computer-readable recording medium. Examples of the computer-readable recording medium include a ROM, a RAM, a CD-ROM, a floppy disk, an optical data storage device, and the like, and the medium may also be implemented in the form of a carrier wave (for example, transmission over the Internet).

The computer readable recording medium may be distributed over a networked computer system so that computer readable code can be stored and executed in a distributed manner. And, functional programs, codes and code segments for implementing the above method can be easily inferred by programmers of the technical field to which the present invention belongs.

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed exemplary embodiments, and that various modifications may be made by those skilled in the art without departing from the spirit and scope of the present invention.

Claims

An apparatus for processing moving images, comprising:
an image central processing unit for parsing parameter information or slice header information from moving picture data input from a host; and
a plurality of image processing units for processing the moving images in accordance with the parsed information under the control of the image central processing unit,
wherein boundary processing between the image processing units is performed after the plurality of image processing units process their respectively allocated areas, and
wherein the boundary processing is performed by an image processing unit adjacent to the boundary portion.

The apparatus according to claim 1,
wherein the image central processing unit
allocates the boundary portion to be processed to the plurality of image processing units.

The method according to claim 1,
Wherein the boundary processing includes at least one of Deblock and SAO processing.

The method according to claim 1,
Wherein the plurality of image processing units store the pixel values of the adjacent region and the boundary portion in a memory while processing the allocated region,
And the pixel values of the stored boundary portion are used for boundary processing between the image processing units.

5. The method of claim 4,
Wherein at least one of motion vector information, quantization parameter, coefficient presence information, and SAO parameter is stored in the memory together with pixel values of the boundary portion.

6. The method of claim 5,
Wherein at least one of a Deblock parameter and a SAO parameter is calculated based on information stored in the memory.

The method according to claim 1,
Wherein each of the plurality of image processing units comprises:
A first processing unit communicating with the image central processing unit to perform entropy coding on the moving image data; And
And a second processing unit for processing the entropy-coded moving picture data in units of coding.

A method of processing moving images in a moving image processing apparatus having an image central processing unit and a plurality of image processing units,
Parsing parameter information or slice header information from moving picture data input from a host;
Processing the moving picture according to the parsed information under the control of the image central processing unit;
And performing boundary processing between the image processing units after the respective allocated areas of the motion pictures are processed by the plurality of image processing units,
Wherein the boundary processing is performed by an image processing unit adjacent to the boundary portion.

9. The method of claim 8,
And allocating the boundary portion to be bounded to the plurality of image processing units.

9. The method of claim 8,
Wherein the boundary processing includes at least one of Deblock and SAO processing.

10. The method of claim 9,
Each of the plurality of image processing units includes a first processing unit and a second processing unit,
The first processing unit communicating with the image central processing unit to perform entropy coding on the moving picture data; And
And the second processing unit processes the entropy-coded moving picture data in a coding unit.