JP2009272969A - Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method - Google Patents


Info

Publication number
JP2009272969A
JP2009272969A (application number JP2008122851A)
Authority
JP
Japan
Prior art keywords
image
encoding
encoding mode
decoding
feature amount
Prior art date
Legal status
Pending
Application number
JP2008122851A
Other languages
Japanese (ja)
Inventor
Hiroo Ito
Masashi Takahashi
Muneaki Yamaguchi
浩朗 伊藤
宗明 山口
昌史 高橋
Original Assignee
Hitachi Ltd
株式会社日立製作所
Priority date
Filing date
Publication date
Application filed by Hitachi Ltd (株式会社日立製作所)
Priority to JP2008122851A
Publication of JP2009272969A
Application status: Pending


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 the unit being an image region, e.g. an object
    • H04N19/172 the region being a picture, frame or field

Abstract

[Problem]
To improve compression efficiency.
[Solution]
An encoding method for an encoding target image divided into a plurality of regions, comprising:
an image feature amount calculating step of calculating an image feature amount indicating a feature of the image in a region adjacent to the encoding target region of the encoding target image;
an encoding mode group selection step of selecting an encoding mode group for the encoding target region using the image feature amount calculated in the image feature amount calculating step;
an encoding mode selection step of selecting one encoding mode from among a plurality of encoding modes belonging to the encoding mode group selected in the encoding mode group selection step; and
a step of performing a predetermined conversion process on the prediction difference value calculated by prediction processing using the encoding mode selected in the encoding mode selection step, and including the result in an encoded stream.
[Selected Figure] FIG. 1

Description

The present invention relates to a moving picture coding technique for coding a moving picture and a moving picture decoding technique for decoding a moving picture.

  Encoding methods such as the MPEG (Moving Picture Experts Group) standards are known as techniques for recording and transmitting large amounts of moving image information as digital data. In particular, the MPEG standards and the H.264/AVC standard are widely known.

  As the number of encoding modes increases, the code amount of the overhead information indicating the selected mode also increases, lowering the encoding efficiency. Patent Document 1 is known as a technique for mitigating this.

[Patent Document 1] JP 2007-235991 A

  However, the technique disclosed in Patent Document 1 still needs to include in the encoded stream both information indicating a table of grouped encoding modes (an encoding mode table) and information indicating the selected encoding mode. It therefore cannot sufficiently suppress the increase in the code amount of the information indicating the encoding mode.

  An object of the present invention is to improve compression efficiency by reducing the code amount of the information indicating the encoding mode.

  In order to achieve the above object, an embodiment of the present invention may be configured as described in the claims, for example.

  According to the present invention, compression efficiency can be improved.

  Embodiments of the present invention will be described below with reference to the drawings.

  First, an example of encoding processing according to the conventional H.264/AVC standard will be described with reference to FIG. In the H.264/AVC standard, the encoding target image is encoded in units of 16 × 16-pixel macroblocks in raster scan order (301), and inter-picture prediction (302) and intra-picture prediction (303) are executed for the encoding target macroblock.

  In intra-picture prediction, prediction is executed by referring to the decoded images of the already-encoded blocks adjacent to the left, upper left, top, and upper right of the encoding target block (blocks for which encoding and decoding processing has been performed before the encoding target block; hereinafter simply referred to as "already-encoded blocks" or "already-decoded blocks"), mainly by copying the reference pixel values in a specific direction. The H.264/AVC standard allows prediction with the macroblock divided into smaller blocks, and also prepares a plurality of candidate prediction directions. Prediction is therefore executed here for each block size (304) and each prediction direction (305).

  In inter-picture prediction, on the other hand, prediction is performed by referring to an already-encoded image and searching, by motion search, for a region in the reference image that is similar to the encoding target block. Since the block size used for prediction can also be selected in this case, motion search (307) is executed for all candidate block sizes (306).
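The motion-search step described above can be sketched as an exhaustive block-matching search. The following is a minimal illustration, not the H.264/AVC search itself: the function names and the search window are invented for this sketch, and a plain sum of absolute differences (SAD) stands in for the real matching cost.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def motion_search(ref, block, top, left, radius=2):
    """Exhaustive block matching: find the (dy, dx) offset within +/-radius
    of (top, left) in the reference frame that minimizes SAD for `block`."""
    h, w = block.shape
    best = (0, 0, sad(ref[top:top + h, left:left + w], block))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate block would fall outside the frame
            cost = sad(ref[y:y + h, x:x + w], block)
            if cost < best[2]:
                best = (dy, dx, cost)
    return best  # (dy, dx, minimal SAD)
```

For a block copied verbatim from a shifted position in the reference frame, the search recovers that shift with zero residual cost.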

  Finally, the optimal one is selected (308) from the plurality of candidate encoding modes (combinations of prediction method and block size).

  FIG. 10 shows examples of lists of encoding mode types. The left side lists the encoding modes of the conventional H.264/AVC standard, and the right side lists the encoding modes according to an embodiment of the present invention.

  As described above, in the H.264/AVC standard, prediction is performed for all prediction methods and block sizes, and the optimal mode is selected from the large number of encoding modes expressed as their combinations. An example (1001) of the encoding modes in the H.264/AVC standard is shown. In the H.264/AVC standard, a symbol representing the selected encoding mode is encoded together with the prediction difference and included in the encoded stream. That is, a unique number is assigned to each of the plurality of available encoding modes shown in example (1001), and this number is variable-length encoded. Since the H.264/AVC standard has many candidate encoding modes, however, a large number of bits is required to represent this number.

  In contrast, in the example (1002) of the encoding processing of the embodiment of the present invention in FIG. 10, sets of encoding modes with similar properties are grouped into "encoding mode groups", and the final encoding mode is selected by a two-step procedure: first selecting an encoding mode group from the plurality of encoding mode groups, and then selecting the optimal mode from the plurality of encoding modes belonging to the selected group.
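The two-step selection can be illustrated as follows. The group contents below are hypothetical stand-ins (the actual grouping in example (1002) is defined by the embodiment), and `cost_fn` is an invented placeholder for whatever rate-distortion measure ranks the modes.

```python
# Hypothetical coding-mode groups; names loosely follow the H.264/AVC-style
# modes (intra/inter prediction at several block sizes) discussed in the text.
MODE_GROUPS = {
    0: ["intra_16x16", "intra_8x8", "intra_4x4"],
    1: ["inter_16x16", "inter_16x8", "inter_8x16"],
    2: ["inter_8x8", "inter_8x4", "inter_4x8", "inter_4x4"],
}

def select_mode(group_id, cost_fn):
    """Within the chosen group, pick the mode with the smallest cost.
    Returns (mode_name, index_within_group); only the small within-group
    index needs signalling, since the decoder re-derives group_id itself."""
    modes = MODE_GROUPS[group_id]
    best_idx = min(range(len(modes)), key=lambda i: cost_fn(modes[i]))
    return modes[best_idx], best_idx
```

For example, with invented costs `{"inter_16x16": 5, "inter_16x8": 3, "inter_8x16": 7}`, `select_mode(1, ...)` returns `("inter_16x8", 1)`.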

  In the decoding process according to an embodiment of the present invention, the encoding mode group selected in the encoding process and the encoding mode selected within it are identified, and decoding processing corresponding to the identified encoding mode is performed.

  In the present embodiment, the unit of encoding and decoding processing is described as the macroblock, as in the conventional H.264/AVC standard, but any regions obtained by dividing the encoding target image into a plurality of parts may be used; the unit need not be the macroblock.

  In the selection of the encoding mode group in the encoding process according to the embodiment of the present invention, the encoding mode group is selected based on information from the already-encoded blocks located around the encoding target region. Similarly, in the decoding process according to the embodiment of the present invention, the encoding mode group is identified based on information from the already-decoded blocks located around the decoding target region. As a result, the encoding mode group can be identified in the decoding process without including in the encoded stream a flag indicating the group selected in the encoding process.

  In the encoding process of an embodiment of the present invention, a flag indicating one encoding mode belonging to the selected encoding mode group is stored in the encoded stream. Since an encoding mode group is a grouping of the plurality of available encoding modes, the number of encoding modes in each group is smaller than the total number of available encoding modes.

  Therefore, compared with the conventional case, in which a unique number is assigned to each of the available encoding modes and a flag indicating that number is stored in the encoded stream, the encoding process of the embodiment of the present invention reduces the amount of flag information stored in the encoded stream.
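As a rough back-of-the-envelope illustration (with invented mode counts, and assuming fixed-length rather than variable-length flags), the saving can be seen by comparing index widths:

```python
import math

def flag_bits(n_choices):
    """Bits needed for a fixed-length index over n_choices options."""
    return math.ceil(math.log2(n_choices))

total_modes = 10      # size of a flat, H.264/AVC-style mode list (illustrative)
modes_per_group = 3   # modes inside one group (illustrative)

flat_cost = flag_bits(total_modes)        # index into the full list: 4 bits
grouped_cost = flag_bits(modes_per_group) # within-group index only: 2 bits
```

`grouped_cost` excludes any group flag because, in this scheme, the decoder derives the group from the neighbouring blocks rather than from the stream.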

  FIG. 6 shows an example of a method for selecting an encoding mode group. As shown in the adjacent block example (601), the already-encoded blocks adjacent to the left, top, upper right, and upper left of the target block are denoted adjacent block A, adjacent block B, adjacent block C, and adjacent block D, respectively.

  Furthermore, if the image feature amounts obtained by applying predetermined image processing to the decoded images of those blocks are denoted ICA, ICB, ICC, and ICD, the encoding mode group of the target block is expressed, as in equation (602), by a function g that takes these values as arguments.

  By using this function g, the encoding mode group used for the target region can be selected from information about the already-encoded blocks. Moreover, by using the same function g not only in the encoding process but also on the decoding side, there is no need to encode and transmit a number representing the encoding mode group, so the bit amount (603) needed to indicate the group can be reduced to zero.
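As a concrete but hypothetical illustration, g might be a simple threshold rule over the neighbours' feature amounts; the patent leaves the form of g open (it may equally be a learned model, as described later), so the thresholds and the flat/medium/detailed interpretation below are pure assumptions.

```python
def g(ic_a, ic_b, ic_c, ic_d):
    """Hypothetical stand-in for the function g of equation (602): maps the
    image feature amounts of adjacent blocks A-D (here, scalar edge
    strengths) to an encoding mode group number via simple thresholds."""
    mean_ic = (ic_a + ic_b + ic_c + ic_d) / 4.0
    if mean_ic < 10.0:    # flat neighbourhood  -> e.g. large-block group
        return 0
    elif mean_ic < 50.0:  # moderate detail     -> e.g. medium-block group
        return 1
    return 2              # strong edges        -> e.g. small-block group
```

Because the rule reads only already-decoded neighbour pixels, encoder and decoder evaluating the same g on the same inputs always agree on the group.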

  As the image feature amount, for example, edge information (edge strength or edge angle) may be used, or variance information of the pixel values (luminance values or per-color intensities) may be used. A combination of these kinds of information may also be used.

  The predetermined image processing may use a technique suited to the image feature amount being used. For example, if edge information is used, edge detection processing using the Sobel filters shown in FIG. 7 is effective.

  When Sobel filters are used, edges in each direction are detected using two filters: a vertical filter (701) and a horizontal filter (702). Prewitt filters may be used instead; in that case, in addition to the vertical filter (703) and horizontal filter (704), diagonal filters (705) and (706) are prepared. As a simpler filter, a MIN-MAX filter is conceivable, which prepares a rectangular window of a specific size and computes the difference between the maximum and minimum density values within it.
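For reference, the standard 3 × 3 Sobel kernels and their application to a single pixel can be written as follows; the helper name and the test image are illustrative only (the exact coefficients of filters (701)/(702) appear in FIG. 7).

```python
import numpy as np

# Standard 3x3 Sobel kernels: horizontal-derivative (responds to vertical
# edges) and vertical-derivative (responds to horizontal edges).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

def apply_kernel(img, kernel, y, x):
    """Correlate a 3x3 kernel with img, centred at pixel (y, x)."""
    patch = img[y - 1:y + 2, x - 1:x + 2].astype(np.int32)
    return int((patch * kernel).sum())
```

On a vertical step edge, SOBEL_X responds strongly while SOBEL_Y stays at zero, which is how the two filters separate edge directions.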

  FIG. 9 shows an example of calculating edge strength and edge angle using the Sobel filters (701) and (702), and of calculating the variance of pixel values. Here, the vertical filter (701) and horizontal filter (702) are applied to each pixel (pixels 1 to m × n) (902) of the decoded images of the already-encoded blocks A, B, C, and D (901) adjacent to the target block on the left, top, upper right, and upper left, respectively. If the values obtained by applying the horizontal filter and the vertical filter to pixel i (i = 1, ..., m × n) are fx(i) and fy(i), respectively, the edge strength can be calculated as in equation (903) and the edge angle as in equation (904). The variance of the pixel values can be calculated as in equation (905). As the image feature amounts ICA, ICB, ICC, and ICD, these values may be used as they are, or they may be normalized, or a plurality of feature amounts may be combined. The above example assumes a block size of m × n pixels; in the H.264/AVC standard, m = 16 and n = 16.
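A plausible reading of equations (903), (904), and (905) (the exact formulas appear only in FIG. 9, so the reductions below are assumptions) takes the per-pixel filter outputs fx(i) and fy(i) and reduces them over the block:

```python
import math

def edge_strength(fx, fy):
    """Block edge strength: sum of per-pixel gradient magnitudes,
    one plausible reading of equation (903)."""
    return sum(math.hypot(a, b) for a, b in zip(fx, fy))

def edge_angle(fx, fy):
    """Dominant edge angle from the summed gradient components,
    one plausible reading of equation (904)."""
    return math.atan2(sum(fy), sum(fx))

def variance(pixels):
    """Variance of the block's pixel values, as in equation (905)."""
    m = sum(pixels) / len(pixels)
    return sum((p - m) ** 2 for p in pixels) / len(pixels)
```

Any of these scalars (possibly normalized or combined) can then serve as the feature amounts ICA through ICD fed to the function g.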

  Any function g that outputs the encoding mode group of the target block may be used, but it is effective to realize g using, for example, the machine learning capability of a neural network.

  An example in which the function g is realized using a neural network will be described with reference to FIG. A neural network is a network in which a plurality of threshold logic units are arranged hierarchically from an input layer to an output layer. In a feed-forward network, connections between units exist only between adjacent layers and run in one direction, from the input layer toward the output layer. A connection weight is assigned to each pair of connected units, and the input to a unit in an upper layer is the weighted sum of the values output by the units in the layer below. During learning, these weights are adjusted so that the desired result is obtained at the output layer. Here, the neural network (702) is trained in advance so that, when the normalized edge strengths and edge angles of the adjacent blocks A to D, or the normalized variances of their pixel values, are input as image feature amounts (701), it calculates and outputs a likelihood for each encoding mode group number n (703). The likelihood in the present application is an index of the probability that the code amount obtained when the target block having the input image feature amounts is encoded with a mode belonging to one encoding mode group is the smallest, compared with the code amounts obtained with modes belonging to the other available encoding mode groups. If the function that returns the number of the encoding mode group with the highest likelihood is taken as the function g (704), encoding and decoding can then be performed by the method shown in FIG.
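A toy version of such a network, with hand-picked placeholder weights instead of learned ones, can make the likelihood-and-argmax structure concrete; the layer sizes, softmax normalization, and weight values here are all assumptions for illustration.

```python
import math

def g_neural(features, w_hidden, w_out):
    """Toy feed-forward realization of g: feature amounts of blocks A-D go
    in, a likelihood per mode group comes out, and g returns the group with
    the highest likelihood. In the actual scheme the weights would be
    learned offline (e.g. by back-propagation) and shared by both sides."""
    # Hidden layer: tanh threshold units over weighted sums of the inputs.
    hidden = [math.tanh(sum(w * f for w, f in zip(row, features)))
              for row in w_hidden]
    # Output layer: one score per mode group, softmax-normalized.
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    exp = [math.exp(s) for s in scores]
    likelihoods = [e / sum(exp) for e in exp]
    return max(range(len(likelihoods)), key=likelihoods.__getitem__)
```

With symmetric placeholder weights, flipping the sign of the input features flips which group wins, showing how the argmax over likelihoods selects the group.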

The learning method is not particularly limited; for example, the back-propagation method (BP method) is highly effective. The BP method is described in detail in, for example, Reference Document 1.
[Reference Document 1] JP 2003-44827 A
In addition to the above, the function g may broadly use machine learning techniques, for example a simple polynomial in variables such as edge strength, edge angle, and variance, kernel methods, SVM (Support Vector Machine), the k-nearest neighbor method, linear discriminant analysis, Bayesian estimation, hidden Markov models, and decision tree learning.

  A plurality of discriminators may also be combined by means such as boosting. The model used to realize the function g may be determined in advance. Regarding the input and output of the function g, a correspondence table may be provided in advance on both the encoding side and the decoding side. Alternatively, information on the function g may be stored in the encoded stream.

  In the above embodiment, the variance of the pixel values in the surrounding blocks and their edge strengths and angles are used as variables, but any block information may be used, such as the mean and standard deviation of the pixel values of the surrounding blocks, the encoding method, or the encoding mode, and parameters related to encoding conditions, such as the QP (Quantization Parameter) and the screen resolution, may be added. Further, although this embodiment uses information from already-encoded blocks on the same screen as the target image, image feature amounts from an image other than the target image, such as the previous frame, may be used.

  Next, an embodiment of the moving picture encoding apparatus according to the present invention will be described with reference to FIG. 1.

  The moving image encoding apparatus includes: an input image memory (102) that holds the input original image (101); a block dividing unit (103) that divides the input image into small regions; a motion search unit (104) that detects motion in units of blocks; an intra-picture prediction unit (105) that performs intra-picture prediction in units of blocks; an inter-picture prediction unit (106) that performs inter-picture prediction in units of blocks based on the amount of motion detected by the motion search unit (104); an image feature amount calculation unit (117) that applies predetermined image processing to the decoded images of the already-encoded blocks located around the encoding target block and calculates image feature amounts; a mode group selection unit (108) that selects the encoding mode group of the encoding target block using the image feature amounts calculated by the image feature amount calculation unit (117); a mode selection unit (107) that selects an encoding mode (prediction method and block size) from the encoding mode group selected by the mode group selection unit (108); a subtraction unit (109) that generates the prediction difference; a frequency transform unit (110) and a quantization processing unit (111) that encode the prediction difference; a variable-length encoding unit (112) that encodes according to the occurrence probabilities of symbols; an inverse quantization processing unit (113) and an inverse frequency transform unit (114) that decode the once-encoded prediction difference; an addition unit (115) that generates a decoded image using the decoded prediction difference; and a reference image memory (116) that holds the decoded image for use in later prediction. The operation of each unit is described in detail below.

  The input image memory (102) holds one image from the original images (101) as the encoding target image; the block dividing unit (103) divides it into fine blocks and passes them to the motion search unit (104), the intra-picture prediction unit (105), and the inter-picture prediction unit (106). The motion search unit (104) calculates the amount of motion of the corresponding block using the decoded images stored in the reference image memory (116) and passes the motion vector to the inter-picture prediction unit (106). The intra-picture prediction unit (105) and the inter-picture prediction unit (106) execute intra-picture prediction processing and inter-picture prediction processing in units of blocks of several sizes.

  The image feature amount calculation unit (117) receives from the reference image memory (116) the decoded images of the already-encoded blocks located around the target block and performs the predetermined image processing described with reference to FIG. 6 to obtain the image feature amounts, and the mode group selection unit (108) selects an encoding mode group using the function g shown in equation (602). The mode selection unit (107) selects the optimal mode from the encoding modes included in the selected encoding mode group.

  Subsequently, the subtraction unit (109) generates the prediction difference according to the selected encoding mode and passes it to the frequency transform unit (110). The frequency transform unit (110) and the quantization processing unit (111) apply, to the received prediction difference, a frequency transform such as the DCT (Discrete Cosine Transform) and quantization, respectively, in units of blocks of the specified size, and pass the result to the variable-length encoding processing unit (112) and the inverse quantization processing unit (113).

  The variable-length encoding processing unit (112) performs variable-length encoding, based on the occurrence probabilities of symbols, of the prediction difference information represented by the frequency transform coefficients together with the information necessary for predictive decoding, such as the encoding mode group number, the encoding mode number, the prediction direction in intra-picture predictive encoding, and the motion vector in inter-picture predictive encoding, and generates the encoded stream. Meanwhile, the inverse quantization processing unit (113) and the inverse frequency transform unit (114) apply inverse quantization and an inverse frequency transform such as the IDCT (Inverse DCT) to the quantized frequency transform coefficients to decode the prediction difference, and send it to the addition unit (115). The addition unit (115) then adds the prediction difference to the predicted values to generate a decoded image, which is stored in the reference image memory (116).

  Next, a method of encoding one frame of a moving image in the embodiment of the moving image encoding apparatus shown in FIG. 1 will be described with reference to FIG. 2. First, as shown by loop 1 (201), the following processing is performed on all blocks in the frame to be encoded. That is, the image feature amounts of the already-encoded blocks located around the target block are calculated using a Sobel filter or the like (202), and the encoding mode group is selected by the function g shown in equation (602) of FIG. 6. Subsequently, prediction is executed for all encoding modes included in the selected encoding mode group (204), the prediction differences are calculated (206), and the encoding mode with the highest prediction accuracy is selected from among them (207). The prediction difference generated with the selected encoding mode then undergoes frequency transform (208) and quantization (209), followed by variable-length encoding, and the variable-length-encoded data is included in the encoded stream and output (210). At this time, information indicating the selected encoding mode is also included in the encoded stream. Meanwhile, the quantized frequency transform coefficients are subjected to inverse quantization (211) and inverse frequency transform (212) to decode the prediction difference, and a decoded image is generated and stored in the reference image memory (213). When this processing has been completed for all blocks, encoding of one frame of the image is complete (214).
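The per-block encoder flow above can be condensed into a sketch. All helper names, the group table, and the residual-energy cost below are invented for illustration; the real apparatus uses the units of FIG. 1 and a proper transform/entropy-coding path.

```python
MODE_GROUPS_ENC = {0: ["intra_16x16", "intra_4x4"],
                   1: ["inter_16x16", "inter_8x8"]}  # illustrative groups

def derive_group(neighbor_features):
    """Stand-in for the feature calculation and function g: pick the mode
    group from the neighbouring blocks' feature amounts (costs no bits)."""
    return 0 if sum(neighbor_features) < 40 else 1

def encode_block(block, neighbor_features, predictors):
    """Sketch of the per-block flow: derive the group from neighbours, try
    every mode in that group, keep the one with the smallest residual
    energy (a stand-in for prediction accuracy), and return the small
    within-group index that would be written to the stream."""
    group_id = derive_group(neighbor_features)  # signalled with zero bits
    modes = MODE_GROUPS_ENC[group_id]
    def cost(mode):
        pred = predictors[mode](block)  # predictors: mode -> predicted block
        return sum((a - b) ** 2 for a, b in zip(block, pred))
    best_idx = min(range(len(modes)), key=lambda i: cost(modes[i]))
    return group_id, best_idx, modes[best_idx]
```

A mode whose predictor reproduces the block exactly wins with zero residual, and only `best_idx` (not `group_id`) would need entropy coding.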

  According to the moving picture encoding apparatus and moving picture encoding method of the present embodiment described above, the selected encoding mode is divided into encoding mode group information and information indicating one of the encoding modes belonging to that group; the encoding mode group is selected based on the image features of the blocks adjacent to the encoding target block, and only the information indicating one encoding mode within the group, not the encoding mode group information, is included in the encoded stream.

  As a result, compared with the case where information indicating one encoding mode selected from all the available encoding modes is included in the encoded stream as flag information assigning a unique number to each mode, the code amount of the information indicating the encoding mode can be reduced and the compression efficiency improved.

  Next, an embodiment of the moving picture decoding apparatus according to the present invention will be described with reference to FIG. 4. The moving picture decoding apparatus includes, for example: a variable-length decoding unit (402) that performs the reverse of variable-length encoding on the encoded stream (401) generated by the moving picture encoding apparatus shown in FIG. 1; an inverse quantization processing unit (403) and an inverse frequency transform unit (404) that decode the prediction difference; an image feature amount calculation unit (410) that calculates image feature amounts by applying predetermined image processing to the decoded images of the already-decoded blocks located around the target block; a mode group specifying unit (411) that specifies the encoding mode group of the decoding target block using the image feature amounts calculated by the image feature amount calculation unit (410); a mode specifying unit (405) that specifies, from among the encoding modes belonging to the encoding mode group specified by the mode group specifying unit (411), the encoding mode used for the corresponding block; an intra-picture prediction unit (406) that performs intra-picture prediction; an inter-picture prediction unit (407) that performs inter-picture prediction; an addition unit (408) that obtains the decoded image; and a reference image memory (409) that temporarily stores the decoded image. The operation of each unit is described in detail below.

  The variable-length decoding unit (402) performs variable-length decoding on the encoded stream (401) and acquires the frequency transform coefficient components of the prediction difference and the information necessary for prediction processing, such as the block size and the motion vector. The prediction difference information is sent to the inverse quantization processing unit (403), and the information necessary for prediction processing is sent, depending on the prediction means, to the intra-picture prediction unit (406) or the inter-picture prediction unit (407). Subsequently, the inverse quantization processing unit (403) and the inverse frequency transform unit (404) decode the prediction difference information by applying inverse quantization and an inverse frequency transform, respectively.

  The image feature amount calculation unit (410) receives from the reference image memory (409) the decoded images of the already-decoded blocks located around the target block and performs on them the predetermined image processing described with reference to FIG. 6 to calculate the image feature amounts. The mode group specifying unit (411) specifies the encoding mode group of the decoding target block based on the image feature amounts calculated by the image feature amount calculation unit (410) and the function g shown in equation (602) of FIG. 6. The mode specifying unit (405) specifies the encoding mode used for the corresponding block from the information of the encoding mode group specified by the mode group specifying unit (411) and the encoding mode number included in the encoded stream, and transmits information on the specified encoding mode to the intra-picture prediction unit (406) or the inter-picture prediction unit (407).

  Subsequently, the intra-picture prediction unit (406) or the inter-picture prediction unit (407) performs the prediction processing corresponding to the encoding mode specified by the mode specifying unit (405), referring to the information sent from the variable-length decoding unit (402) and to the decoded images stored in the reference image memory (409). The addition unit (408) adds the reference image obtained by the prediction processing to the prediction difference decoded by the inverse quantization processing unit (403) and the inverse frequency transform unit (404) to generate the decoded image, which is stored in the reference image memory (409).

  FIG. 5 shows a method of decoding one frame of a moving picture in the embodiment of the moving picture decoding apparatus shown in FIG. 4. First, as shown by loop 1 (501), the following processing is performed for all blocks in one frame. That is, variable-length decoding is performed on the input stream (502), and inverse quantization (503) and inverse frequency transform (504) are performed to decode the prediction difference of the decoding target region. Subsequently, the image feature amounts of the already-decoded blocks located around the target block are calculated using a Sobel filter or the like (505), and the encoding mode group is specified by the function g shown in equation (602) of FIG. 6 (506). The encoding mode is then specified from the specified encoding mode group and the encoding mode number included in the encoded stream (507). Furthermore, the prediction processing corresponding to the specified encoding mode is performed, and a decoded image is generated by combining the reference image obtained by the prediction processing with the decoded prediction difference; the generated decoded image is stored in the reference image memory (508). When this processing has been completed for all blocks in the frame, decoding of one frame of the image is complete (509).
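The decoder-side counterpart can be sketched the same way; the key point is that the group-derivation rule is shared with the encoder, so only the within-group index is read from the stream. The group table and the rule below are illustrative assumptions, not the embodiment's actual function g.

```python
MODE_GROUPS_DEC = {0: ["intra_16x16", "intra_4x4"],
                   1: ["inter_16x16", "inter_8x8"]}  # must match the encoder

def derive_group_dec(neighbor_features):
    """The decoder applies the same rule as the encoder to the feature
    amounts of already-decoded neighbours, so the group number itself
    never appears in the stream."""
    return 0 if sum(neighbor_features) < 40 else 1

def decode_mode(neighbor_features, mode_index_from_stream):
    """Recover the full encoding mode from the re-derived group and the
    within-group index parsed from the encoded stream."""
    group_id = derive_group_dec(neighbor_features)
    return MODE_GROUPS_DEC[group_id][mode_index_from_stream]
```

Because both sides compute the group from the same reconstructed pixels, the short within-group index is unambiguous even though the group flag is absent.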

  According to the image decoding apparatus and image decoding method of the present embodiment described above, even when the input encoded stream was generated by dividing the selected encoding mode into encoding mode group information and information indicating one encoding mode belonging to that group, and the stream does not contain the encoding mode group information, the encoding mode group can be identified from the image features of the blocks adjacent to the decoding target block. It is therefore possible to decode an encoded stream of higher compression efficiency in which the code amount of the information indicating the encoding mode has been reduced.

  In the embodiments described above, the DCT is cited as an example of frequency transformation, but any orthogonal transformation used to remove correlation between pixels may be used, such as the DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), or KLT (Karhunen-Loeve Transform). Alternatively, the prediction difference itself may be encoded without performing any frequency transform. Furthermore, variable length coding is not strictly required.
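As a sketch of why any orthonormal transform is interchangeable here, the following builds the orthonormal DCT-II basis matrix and verifies that the inverse transform (the transpose) reconstructs the prediction difference exactly. The 8-point size is an arbitrary choice for illustration.

```python
import numpy as np

# Sketch: an orthonormal transform applied to the prediction difference can
# always be inverted losslessly, which is why the DCT in the embodiment can
# be replaced by the DST, DFT, KLT, etc.  Here the DCT-II matrix stands in
# for any such transform.

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an n-by-n matrix (rows = basis vectors)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    t = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    t[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return t

n = 8
T = dct_matrix(n)
residual = np.random.default_rng(0).integers(-32, 32, size=n).astype(float)
coeffs = T @ residual                 # forward transform
restored = T.T @ coeffs               # inverse transform = transpose
assert np.allclose(T @ T.T, np.eye(n))   # the basis is orthonormal
assert np.allclose(restored, residual)   # lossless round trip
```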

  In the embodiment, the encoding mode number is variable-length encoded. However, the encoding mode number may also be selected using the image feature amounts of the surrounding blocks, as shown in FIG. 6, in the same way as the encoding mode group number. In this case, neither the encoding mode group number nor the encoding mode number needs to be encoded, and a further improvement of the compression rate can be expected.
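The zero-signalling variant above can be sketched as follows. The mode groups and the secondary selection rule are illustrative assumptions (loosely modelled on the intra mode names of FIG. 10), not values taken from the patent; the point is only that encoder and decoder apply the same deterministic rule to identical decoded-neighbour features.

```python
# Sketch of the variant in which not only the mode group but also the mode
# itself is derived from neighbouring-block feature amounts, so neither
# number is written into the stream.  Groups and rules are assumptions.

MODE_GROUPS = {
    0: ["DC", "Planar-like"],             # flat regions
    1: ["Vertical", "Vertical-left"],     # dominant vertical edges
    2: ["Horizontal", "Horizontal-up"],   # dominant horizontal edges
}

def derive_mode(edge_strength: float, edge_angle: float) -> str:
    """Deterministic rule shared by encoder and decoder: pick the group
    from the features, then pick the mode within the group."""
    if edge_strength < 8.0:
        group = 0
    elif edge_angle < 45.0 or edge_angle >= 135.0:
        group = 1
    else:
        group = 2
    candidates = MODE_GROUPS[group]
    # Within the group, choose by a secondary feature test (assumed rule).
    return candidates[0] if edge_strength < 50.0 else candidates[1]

# Encoder and decoder call derive_mode() with identical features computed
# from decoded pixels, so they agree on the mode with zero bits signalled.
assert derive_mode(3.0, 0.0) == "DC"
assert derive_mode(120.0, 0.0) == "Vertical-left"
assert derive_mode(20.0, 90.0) == "Horizontal"
```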

  Although the embodiment describes the case of encoding a moving image, the present invention is also effective for encoding still images. For example, removing the motion search unit (104) and the inter-screen prediction unit (106) from the block diagram of FIG. 1 yields the block diagram of an encoding device specialized for still images.

  In the embodiment, encoding is performed in units of blocks. However, the present invention may also be used when encoding is performed in units of objects separated from the background of an image.

Brief Description of the Drawings

FIG. 1: Block diagram of the image encoding apparatus used in this embodiment
FIG. 2: Flowchart of the image encoding apparatus used in this embodiment
FIG. 3: Conceptual diagram of encoding processing according to the H.264/AVC standard
FIG. 4: Block diagram of the image decoding apparatus used in this embodiment
FIG. 5: Flowchart of the image decoding apparatus used in this embodiment
FIG. 6: Explanatory diagram of an example of the encoding mode group selection method used in this embodiment
FIG. 7: Explanatory diagram of examples of filters used for edge detection
FIG. 8: Explanatory diagram of an example of likelihood calculation for encoding mode groups
FIG. 9: Explanatory diagram of an example of a method for calculating an image feature amount
FIG. 10: Explanatory diagram of an example of a list of the encoding mode types used in the H.264/AVC standard and in this embodiment

Explanation of symbols

101: original image, 102: original image memory, 103: block division unit, 104: motion search unit, 105: intra-screen prediction unit, 106: inter-screen prediction unit, 107: mode selection unit, 108: mode group selection unit, 109: subtraction unit, 110: frequency transform unit, 111: quantization processing unit, 112: variable length coding unit, 113: inverse quantization processing unit, 114: inverse frequency transform unit, 115: addition unit, 116: reference image memory, 117: image feature amount calculation unit, 401: encoded stream, 402: variable length decoding unit, 403: inverse quantization processing unit, 404: inverse frequency transform unit, 405: mode specifying unit, 406: intra-screen prediction unit, 407: inter-screen prediction unit, 408: addition unit, 409: reference image memory, 410: image feature amount calculation unit, 411: encoding mode group specifying unit

Claims (12)

  1. An image encoding device that encodes an encoding target image divided into a plurality of regions, the image encoding device comprising:
    An image feature amount calculating unit that calculates an image feature amount indicating the feature of an image in an area adjacent to the encoding target area of the encoding target image;
    An encoding mode group selection unit that selects an encoding mode group of the encoding target region using the image feature amount calculated by the image feature amount calculation unit;
    An encoding mode selection unit that selects one encoding mode among a plurality of encoding modes belonging to the encoding mode group selected by the encoding mode group selection unit;
    An output unit that performs a predetermined conversion process on the prediction difference value calculated by the prediction process using the encoding mode selected by the encoding mode selection unit, and outputs the result in an encoded stream.
  2.   The image encoding apparatus according to claim 1, wherein the image feature amount calculation unit performs a predetermined filtering process on an image of an already-encoded area among the areas adjacent to the encoding target area of the encoding target image, and calculates, as the image feature amount, one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof.
  3.   The image encoding apparatus according to claim 1, wherein, of the encoding mode group selected by the encoding mode group selection unit and the encoding mode selected by the encoding mode selection unit, the output unit includes only the information indicating the encoding mode in the encoded stream and outputs it.
  4. An image encoding method for encoding an encoding target image divided into a plurality of regions, comprising:
    An image feature amount calculating step for calculating an image feature amount indicating a feature of an image in an area adjacent to the encoding target area of the encoding target image;
    An encoding mode group selection step of selecting an encoding mode group of the encoding target region using the image feature amount calculated in the image feature amount calculation step;
    An encoding mode selection step of selecting one encoding mode among a plurality of encoding modes belonging to the encoding mode group selected in the encoding mode group selection step;
    An output step of performing a predetermined conversion process on the prediction difference value calculated by the prediction process using the encoding mode selected in the encoding mode selection step, and including the result in an encoded stream.
  5.   The image encoding method according to claim 4, wherein, in the image feature amount calculating step, a predetermined filtering process is performed on an image of an area encoded before the encoding target area among the areas adjacent to the encoding target area of the encoding target image, and one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof is calculated as the image feature amount.
  6. The image encoding method according to claim 4, wherein, in the output step, of the encoding mode group selected in the encoding mode group selection step and the encoding mode selected in the encoding mode selection step, only the information indicating the encoding mode is included in the encoded stream and output.
  7. An image decoding device for decoding an encoded stream in which an image divided into a plurality of regions is encoded, the image decoding device comprising:
    A prediction difference decoding unit that performs predetermined processing on the encoded stream and decodes a prediction difference value of a decoding target region;
    An image feature quantity calculating unit for calculating an image feature quantity indicating the feature of the image of the area adjacent to the decoding target area of the decoding target image;
    An encoding mode group specifying unit that specifies an encoding mode group of the decoding target area using the image feature amount calculated by the image feature amount calculating unit;
    An encoding mode specifying unit that specifies one encoding mode among a plurality of encoding modes belonging to the encoding mode group specified by the encoding mode group specifying unit;
    A decoded image generation unit that generates a decoded image by synthesizing the reference image acquired by the prediction process corresponding to the encoding mode specified by the encoding mode specifying unit and the prediction difference decoded by the prediction difference decoding unit.
  8.   The image decoding apparatus according to claim 7, wherein the image feature amount calculation unit performs a predetermined filtering process on an image of an area decoded before the decoding target area among the areas adjacent to the decoding target area of the decoding target image, and calculates, as the image feature amount, one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof.
  9.   The image decoding apparatus according to claim 7, wherein the encoding mode specifying unit specifies the encoding mode using the information on the encoding mode group specified by the encoding mode group specifying unit and the encoding mode number included in the encoded stream.
  10. An image decoding method for decoding an encoded stream in which an image divided into a plurality of regions is encoded, comprising:
    A prediction difference decoding step for performing a predetermined process on the encoded stream and decoding a prediction difference value of a decoding target region;
    An image feature amount calculating step for calculating an image feature amount indicating a feature of an image in a region adjacent to the decoding target region of the decoding target image;
    An encoding mode group specifying step for specifying an encoding mode group of the decoding target area using the image feature amount calculated in the image feature amount calculating step;
    An encoding mode specifying step of specifying one encoding mode among a plurality of encoding modes belonging to the encoding mode group specified in the encoding mode group specifying step;
    A decoded image generation step of generating a decoded image by synthesizing the reference image acquired by the prediction process corresponding to the encoding mode specified in the encoding mode specifying step and the prediction difference decoded in the prediction difference decoding step.
  11.   The image decoding method according to claim 10, wherein, in the image feature amount calculating step, a predetermined filtering process is performed on an image of an area decoded before the decoding target area among the areas adjacent to the decoding target area of the decoding target image, and one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof is calculated as the image feature amount.
  12.   The image decoding method according to claim 10, wherein, in the encoding mode specifying step, the encoding mode is specified using the information on the encoding mode group specified in the encoding mode group specifying step and the encoding mode number included in the encoded stream.
JP2008122851A 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method Pending JP2009272969A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2008122851A JP2009272969A (en) 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008122851A JP2009272969A (en) 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method
PCT/JP2009/001838 WO2009136475A1 (en) 2008-05-09 2009-04-22 Image coding device and image coding method, image decoding device and image decoding method

Publications (1)

Publication Number Publication Date
JP2009272969A true JP2009272969A (en) 2009-11-19

Family

ID=41264532

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2008122851A Pending JP2009272969A (en) 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method

Country Status (2)

Country Link
JP (1) JP2009272969A (en)
WO (1) WO2009136475A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012175543A (en) * 2011-02-23 2012-09-10 Fujitsu Ltd Motion vector detection apparatus, motion vector detection method and moving image encoding device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4995789B2 (en) * 2008-08-27 2012-08-08 日本電信電話株式会社 Intra-screen predictive encoding method, intra-screen predictive decoding method, these devices, their programs, and recording media recording the programs
JP5292343B2 (en) * 2010-03-24 2013-09-18 日本電信電話株式会社 Image quality objective evaluation apparatus, method and program

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3208101B2 (en) * 1996-11-07 2001-09-10 松下電器産業株式会社 Picture coding method and the picture coding apparatus and a recording medium recording an image encoding program
JP2003324731A (en) * 2002-04-26 2003-11-14 Sony Corp Encoder, decoder, image processing apparatus, method and program for them
MXPA04012133A (en) * 2002-06-11 2005-04-19 Nokia Corp Spatial prediction based intra coding.
BRPI0411765A (en) * 2003-06-25 2006-08-08 Thomson Licensing fast modal interframe decision coding
JP4763422B2 (en) * 2004-12-03 2011-08-31 パナソニック株式会社 Intra prediction device
JP4889231B2 (en) * 2005-03-31 2012-03-07 三洋電機株式会社 Image encoding method and apparatus, and image decoding method
JP2007104117A (en) * 2005-09-30 2007-04-19 Seiko Epson Corp Image processing apparatus and program for allowing computer to execute image processing method
JP2007208543A (en) * 2006-01-31 2007-08-16 Victor Co Of Japan Ltd Moving image encoder


Also Published As

Publication number Publication date
WO2009136475A1 (en) 2009-11-12
