WO2010007719A1 - Image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method - Google Patents

Publication number
WO2010007719A1
WO2010007719A1 PCT/JP2009/002453 JP2009002453W WO2010007719A1 WO 2010007719 A1 WO2010007719 A1 WO 2010007719A1 JP 2009002453 W JP2009002453 W JP 2009002453W WO 2010007719 A1 WO2010007719 A1 WO 2010007719A1
Authority
WO
WIPO (PCT)
Prior art keywords
image data
image
decimal pixel
frame
decoding
Prior art date
Application number
PCT/JP2009/002453
Other languages
French (fr)
Japanese (ja)
Inventor
齋藤昇平
影山昌弘
横山徹
中村克行
高橋昌史
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Priority to JP2010520739A (JPWO2010007719A1)
Publication of WO2010007719A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/573: Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H04N19/523: Motion estimation or motion compensation with sub-pixel accuracy

Definitions

  • the present invention relates to a moving picture compression and decompression technique using a compensation technique, and more particularly to an apparatus and method for image encoding or decoding using a fractional pixel precision image for the compensation technique.
  • in Non-Patent Document 1, filters applied to adjacent pixels are used to perform motion detection / compensation with sub-pixel accuracy.
  • a reference image is generated by interpolation.
  • the reference image with decimal pixel accuracy is generated by filter interpolation, which computes pixel values with decimal pixel accuracy around the pixel that minimizes the cost function in the integer-accuracy motion search.
  • the quality of the reproduced image by decoding can be improved.
  • the present invention has been made in view of the above problems, and an object thereof is to generate a reference image with higher resolution in an apparatus and method for image encoding and image decoding. Another object is to improve motion prediction accuracy. Yet another object is to contribute to high image quality.
  • the motion prediction image data is generated based on local decoded image data obtained by encoding the difference between the original image data and the motion prediction image data and decoding the encoded data.
  • the local decode image data is stored for a plurality of frames
  • decimal pixel precision image data is generated using the local decode image data of the frame to be encoded and the local decode image data of the previous frame.
  • motion detection is performed using the decimal pixel precision image data as reference image data to generate motion prediction image data.
  • motion prediction image data is added to a motion prediction error obtained by decoding encoded image data separated from encoded data encoded by motion prediction, and image data is reproduced.
  • the motion prediction image data is generated based on the reproduced image data that has been reproduced
  • the reproduction image data of the decoding target frame and the previous frame are decoded from a frame memory that holds the reproduction image data for a plurality of frames.
  • the reproduced image data is used to generate decimal pixel accuracy image data, and motion detection is performed using the generated decimal pixel accuracy image data as reference image data to generate motion prediction image data.
  • the decimal pixel precision image data as the reference image data is generated by using both the image data of the processing target frame and the image data of another frame separated from the frame. Therefore, it is possible to generate decimal pixel accuracy image data with higher accuracy than in the case of generating only by interpolation.
  • a higher-resolution reference image can be generated. Thereby, it can contribute to improvement of motion prediction accuracy. Furthermore, it can contribute to high image quality.
  • the image encoding apparatus includes a frame memory (109) that holds the local decoded image data for a plurality of frames, and a sub-pixel image data processing unit (110, 111) that generates sub-pixel accuracy image data using the local decoded image data of the frame to be encoded and the local decoded image data of a previous frame, both stored in the frame memory (109).
  • the image encoding device performs motion detection using the decimal pixel accuracy image data generated by the decimal pixel image data processing unit as reference image data to generate motion prediction image data.
  • the decimal pixel precision image data as the reference image data can be generated by using both the local decoded image data of the encoding target frame and the local decoded image data of another frame separated from the frame. Since this is possible, it is possible to generate more accurate decimal pixel accuracy image data than in the case of generating only by interpolation.
  • the decimal pixel image data processing unit determines whether the amount of motion between the local decoded image data of the encoding target frame stored in the frame memory and the local decoded image data of a previous frame within a predetermined range before the encoding target frame has decimal pixel accuracy. When it does, the decimal pixel accuracy image data is generated using the local decoded image data of the previous frame and of the encoding target frame; when it does not, the decimal pixel accuracy image data is generated by an interpolation operation on the local decoded image data of the encoding target frame.
  • the decimal pixel image data processing unit limits the frames subject to the decimal pixel accuracy determination to I pictures (Intra-Picture), which are encoded using only information within the picture without motion prediction in the time direction, and P pictures (Predictive-Picture), which are obtained by forward predictive coding between pictures.
  • B pictures (Bi-directional Predictive-Picture), which are obtained by predictive coding from the past and the future, are excluded. Thereby, it is possible to generate decimal pixel accuracy image data with higher accuracy or higher image quality.
  • for the previous frames within the predetermined range, the decimal pixel image data processing unit checks the frame type and the decimal pixel accuracy in order, starting from the previous frame closest to the encoding target frame. This is because using the local decoded image data of a frame closer to the encoding target frame yields decimal pixel accuracy image data with higher image quality (higher accuracy).
  • the decimal pixel image generation unit performs, for example, phase shift processing on each of a plurality of image signals of the image data to generate a plurality of new image signals. The decimal pixel precision image data is then generated by multiplying the plurality of image signals and the new plurality of image signals by coefficients and combining them.
  • the image coding method includes the following processes (a) to (l).
  • (b) difference processing for calculating the difference between the read motion prediction image data and the input image data as prediction error data,
  • (c) frequency conversion processing for frequency-converting the prediction error data calculated in the difference processing,
  • (d) quantization processing for quantizing the data frequency-converted in the frequency conversion processing,
  • (e) variable-length encoding processing for variable-length encoding the data quantized by the quantization processing and generating an encoded stream,
  • (f) inverse quantization processing for inverse-quantizing the data quantized by the quantization processing,
  • (g) inverse frequency transform processing for reproducing the prediction error data by performing an inverse frequency transform on the inverse-quantized data,
  • (h) addition processing for adding the prediction error data reproduced by the inverse frequency transform processing and the motion prediction image data to calculate local decoded image data,
  • (i) processing for storing the local decoded image data calculated in the addition processing in a frame memory,
  • (j) decimal pixel image data processing for generating decimal pixel accuracy image data using the local decoded image data of the encoding target frame and the local decoded image data of the previous frame held in the frame memory,
  • (k) motion detection / motion compensation processing for generating a predicted image by performing motion detection using the decimal pixel accuracy image data generated by the decimal pixel image data processing as reference image data, and
  • (l) write processing for writing the motion prediction image data generated by the motion detection / motion compensation processing to the predicted image memory.
  • Image decoding apparatus: the image decoding apparatus receives encoded data encoded by motion prediction, separates it into encoded image data and additional information, reproduces image data by adding motion prediction image data to the motion prediction error obtained by decoding the separated encoded image data, and generates the motion prediction image data based on the reproduced image data. The apparatus includes a frame memory that holds the reproduced image data for a plurality of frames.
  • the image decoding apparatus performs motion detection using the decimal pixel accuracy image data generated by the decimal pixel image data processing unit as reference image data to generate motion prediction image data.
  • the decimal pixel precision image data as the reference image data can be generated using both the reproduced image data of the decoding target frame and the reproduced image data of another frame separated from the frame. Therefore, it is possible to generate more accurate decimal pixel accuracy image data than in the case of generating only by interpolation.
  • This can contribute to improvement of motion prediction accuracy from the viewpoint of decoding, and can further contribute to high image quality of an image obtained through coding and decoding.
  • the decimal pixel image data processing unit determines whether the amount of motion between the reproduced image data of the decoding target frame stored in the frame memory and the reproduced image data of a previous frame within a predetermined range before the decoding target frame has decimal pixel accuracy. When it does, the decimal pixel accuracy image data is generated using the reproduced image data of the previous frame and of the decoding target frame; when it does not, the decimal pixel accuracy image data is generated by an interpolation operation on the reproduced image data of the decoding target frame. As a result, useless data processing is reduced, which can contribute to a reduction in processing amount and processing time.
  • the decimal pixel image data processing unit limits the frames subject to the decimal pixel accuracy determination to I pictures and P pictures. This makes it possible to generate decimal pixel accuracy image data with higher accuracy or higher image quality.
  • for the previous frames within the predetermined range, the decimal pixel image data processing unit checks the frame type and the decimal pixel accuracy in order, starting from the previous frame closest to the decoding target frame. This is because using the reproduced image data of a frame closer to the decoding target frame yields decimal pixel accuracy image data with higher image quality (higher accuracy).
  • the decimal pixel image generation unit performs, for example, phase shift processing on each of the plurality of image signals of the image data to generate a plurality of new image signals. The plurality of image signals and the new plurality of image signals are multiplied by coefficients and combined to generate decimal pixel precision image data.
  • the image decoding method includes the following processes (a) to (l).
  • (e) an addition process for reproducing image data by adding motion prediction image data in the prediction image memory to a motion prediction error reproduced by the inverse frequency conversion process, and
  • (f) the addition process is the following processes (a) to (l).
  • a target of determination as to whether or not the decimal pixel accuracy is accurate is limited to a frame form of an I picture or a P picture.
  • for the previous frames in the predetermined range, the decimal pixel image data processing checks the frame type and the decimal pixel accuracy in order, starting from the previous frame closest to the decoding target frame.
  • the decimal pixel image data processing performs phase shift processing on each of the plurality of image signals of the image data to generate a plurality of new image signals. The plurality of image signals and the new plurality of image signals are multiplied by coefficients and combined to generate decimal pixel precision image data.
  • FIG. 1 shows an example of an image encoding device according to the present invention.
  • An original image memory 101 stores input image data.
  • Reference numeral 102 denotes a subtracter that takes a difference between input image data output from the original image memory and predicted image data output from the frame memory 113.
  • Reference numeral 103 denotes a frequency conversion unit that converts the difference image between the original image data and the predicted image, calculated by the subtracter 102, into the spatial frequency domain.
  • a quantization unit 104 quantizes the data frequency-converted by the frequency conversion unit 103.
  • Reference numeral 105 denotes a variable length coding unit that performs variable length coding on the data quantized by the quantization unit 104.
  • Reference numeral 106 denotes an inverse quantization unit that inversely quantizes the data quantized by the quantization unit 104.
  • Reference numeral 107 denotes an inverse frequency conversion unit that performs inverse frequency conversion on the data inversely quantized by the inverse quantization unit 106.
  • Reference numeral 108 denotes an adder that adds the predicted image data stored in the frame memory 113 to the data subjected to inverse frequency conversion by the inverse frequency conversion unit 107.
  • Reference numeral 109 denotes a frame memory for storing data (local decoded image data) obtained by addition by the adder 108.
  • Reference numeral 110 denotes a decimal pixel accuracy image generation method determination unit, and reference numeral 111 denotes a decimal pixel accuracy image generation unit.
  • the decimal pixel accuracy image generation method determination unit 110 and the decimal pixel accuracy image generation unit 111 together constitute a decimal pixel image data processing unit that generates decimal pixel precision image data using the local decoded image data of the encoding target frame and the local decoded image data of the previous frame held in the frame memory 109.
  • the decimal pixel accuracy image generation method determination unit 110 determines the generation method, and the decimal pixel accuracy image generation unit 111 generates the decimal pixel accuracy image data from the local decoded image data according to the determined generation method. Details thereof will be described later.
  • Reference numeral 112 denotes a motion detection / compensation unit that detects an image close to the original image by motion detection, using the decimal pixel accuracy image generated by the decimal pixel accuracy image generation unit 111 as a reference image, and generates predicted image data.
  • Reference numeral 113 denotes a frame memory that stores the image (predicted image data) generated by the motion detection / compensation unit 112.
  • Reference numeral 114 denotes an intra-screen prediction unit that generates a predicted image using data in a frame from local decoded image data stored in the frame memory 109.
  • the decimal pixel accuracy image generation methods determined by the decimal pixel accuracy image generation method determination unit 110 are roughly divided into A and B in FIG.
  • in the first method, shown in A, decimal pixel precision image data is generated using local decoded image data of two different frames, namely the encoding target frame and a previous frame.
  • the resolution of the image blocks (for example, macroblocks) of the two frames is increased by using super-resolution processing, described later.
  • the second method shown in B generates decimal pixel precision image data using local decoded image data of one frame of the encoding target frame.
  • an image block (for example, a macro block) of one frame is increased in resolution by using an interpolation operation.
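As a minimal sketch of the second method, the following Python function doubles the sampling density of a single block by interpolation. The text does not specify the interpolation filter, so the bilinear averaging used here, the function name `upsample2x_bilinear`, and the list-of-lists block layout are assumptions made purely for illustration.

```python
def upsample2x_bilinear(block):
    """Return a (2h-1) x (2w-1) grid holding the original integer-position
    samples plus half-pixel samples obtained by bilinear averaging.

    Assumption: `block` is a non-empty list of equal-length rows of numbers.
    """
    h, w = len(block), len(block[0])
    out = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(2 * h - 1):
        for x in range(2 * w - 1):
            # Integer-position neighbours surrounding the output sample.
            y0, x0 = y // 2, x // 2
            y1 = min(y0 + (y % 2), h - 1)
            x1 = min(x0 + (x % 2), w - 1)
            # Even coordinates reproduce the original pixel; odd coordinates
            # average the two (or four) nearest original pixels.
            out[y][x] = (block[y0][x0] + block[y0][x1]
                         + block[y1][x0] + block[y1][x1]) / 4.0
    return out
```

Because only one frame contributes, the half-pixel samples carry no information beyond what the integer samples already contain, which is the limitation the first method is meant to address.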
  • because the first method can use both the local decoded image data of the encoding target frame and the local decoded image data of another frame separated from it, it can produce more accurate decimal pixel precision image data than generation by interpolation alone.
  • whether the first method or the second method is used is decided, for example, by determining whether the amount of motion between the local decoded image data of the encoding target frame stored in the frame memory and the local decoded image data of a previous frame within a predetermined range before it has decimal pixel accuracy.
  • the first method is selected when the determination result of decimal pixel accuracy is obtained, and the second method is selected when it is not.
  • the previous frame candidate is selected as follows.
  • candidate frames are limited to I pictures (Intra-Picture), which are encoded using only information within the picture without temporal motion prediction, and P pictures (Predictive-Picture), which are obtained by forward predictive coding between pictures.
  • a B picture (Bi-directional Predictive-Picture) which is obtained by predictive coding from the past and the future is excluded. Thereby, it is possible to generate decimal pixel accuracy image data with higher accuracy or higher image quality.
  • a picture closest to the encoding target frame is set as a candidate frame.
  • the decimal pixel accuracy image generation method determination unit 110 determines the first method or the second method by the frame selection unit 120, the motion detection unit 121, and the determination unit 122.
  • the frame selection unit 120 reads local decoded image data of a predetermined frame from the frame memory 109 according to a predetermined procedure.
  • the motion detection unit 121 performs motion detection on local decoded image data of two frames read by the frame selection unit.
  • the determination unit 122 selects the first method or the second method as described above according to the motion detection result or the like.
  • FIG. 3 illustrates a flowchart of determination processing in the decimal pixel accuracy image generation method determination unit 110.
  • the selection of the previous frame starts from the frame immediately before the encoding target frame (130). It is determined whether the selected previous frame is separated from the encoding target frame by a predetermined number of frames (131). If it is, the second method is selected, and an instruction to generate decimal pixel accuracy image data by the second method and the necessary data are provided to the decimal pixel precision image generation unit 111 (132).
  • if it is not a predetermined number of frames away, it is determined whether the previous frame is a B picture. If it is a B picture, processing returns to step 131. If it is not a B picture, motion detection is performed between the previous frame and the encoding target frame (134).
  • if the motion detection shows that the motion amount has integer pixel accuracy, processing likewise returns to step 131. If the motion amount does not have integer pixel accuracy, an instruction to generate decimal pixel accuracy image data by the first method and the necessary data (the local decoded image data of the encoding target frame, the local decoded image data of the selected previous frame, the motion detection result image data, and the motion vector information) are provided to the decimal pixel precision image generation unit 111 (136).
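The determination flow of FIG. 3 can be sketched roughly as follows. This is an illustrative reading of the flowchart, not the patent's implementation: the `Frame` class, the `detect_motion` callback, the `max_distance` parameter, the strict `>` distance test, and the fallback to the second method when the candidate list is exhausted are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    index: int          # display/coding position (hypothetical field)
    picture_type: str   # "I", "P", or "B"


def is_integer_motion(mv, tol=1e-6):
    """True when every motion vector component is (nearly) an integer."""
    return all(abs(c - round(c)) <= tol for c in mv)


def choose_method(target, previous_frames, max_distance, detect_motion):
    """Pick the first or second generation method.

    `previous_frames` must be ordered nearest-first (step 130).
    `detect_motion(prev, target)` returns a motion vector (dx, dy).
    """
    for prev in previous_frames:
        # Step 131: give up once the candidate is too far from the target.
        if target.index - prev.index > max_distance:
            break
        # B pictures are never used as the second frame.
        if prev.picture_type == "B":
            continue
        mv = detect_motion(prev, target)          # step 134
        if is_integer_motion(mv):
            continue                              # no sub-pixel information
        return ("first", prev, mv)                # step 136
    return ("second", None, None)                 # step 132
```

Scanning nearest-first means the closest usable I or P picture with fractional motion is chosen, matching the stated preference for frames close to the encoding target frame.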
  • FIG. 4 shows the overall processing flow of the encoding process.
  • the encoding apparatus stores input image data in the original image memory 101 (201).
  • input image data include digital signals such as RGB signals, Y, Cb, and Cr signals.
  • the input image may be stored in the original image memory 101 for one frame, or may be divided into a plurality of pixel blocks and stored in units of the pixel blocks.
  • the difference between the original image data read from the original image memory 101 and the predicted image data is calculated (202). If there is no difference between the original image data and the predicted image data, the encoding process is terminated (203).
  • the processing on the decoding side can be simplified.
  • the difference image calculated by the subtracter 102 is converted into the frequency domain by the frequency conversion unit 103 using, for example, the discrete cosine transform (DCT).
  • the frequency transformation may use other transformations such as Hadamard transformation and Fourier transformation in addition to DCT.
  • information for identifying the type of frequency conversion may be added to the stream.
  • the block used for the frequency conversion may have equal vertical and horizontal sizes, such as 8 × 8 pixels, or different vertical and horizontal sizes, such as 16 × 8 pixels.
  • the block size information for frequency conversion may be added to the stream.
  • the data frequency-converted by the frequency conversion unit 103 is quantized by the quantization unit 104 (205).
  • a method based on a conventional moving image coding standard may be used, or a new quantization step may be determined.
  • quantization step information may be added to the stream.
  • the data quantized by the quantization unit 104 is encoded by the variable length coding unit 105.
  • methods such as CABAC (Context-Adaptive Binary Arithmetic Coding) and CAVLC (Context-Adaptive Variable Length Coding) adopted in conventional coding standards may be used, or a new code table may be created. In the latter case, the code table information is added to the encoded stream.
  • inverse quantization is performed by the inverse quantization unit 106 (206).
  • a method based on the conventional video coding standard may be used.
  • the data calculated by the inverse quantization unit 106 is subjected to inverse frequency conversion by the inverse frequency conversion unit 107 (207).
  • the inverse frequency transform unit 107 performs inverse transform from the frequency domain to the spatial domain using the frequency transform block size and the type of frequency transform performed by the frequency transform unit 103.
  • the inverse frequency converted data and the data stored in the frame memory 113 are added and stored in the frame memory 109.
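The local decoding loop described above (difference, frequency conversion, quantization, inverse quantization, inverse frequency conversion, addition) can be sketched for a single 4 × 4 block as follows. This is only a sketch under stated assumptions: it uses a 4 × 4 Hadamard transform, which the text lists as a permitted alternative to the DCT, a single uniform quantization step `qstep`, and hypothetical function names; variable-length encoding is omitted.

```python
# Sequency-ordered 4x4 Hadamard matrix; H4 * H4^T = 4 * I.
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Y = H * X * H^T (unnormalised)."""
    return matmul(matmul(H4, block), transpose(H4))


def inverse_transform(coeffs):
    """X = H^T * Y * H / 16, undoing forward_transform."""
    raw = matmul(matmul(transpose(H4), coeffs), H4)
    return [[v / 16 for v in row] for row in raw]


def encode_block(original, predicted, qstep):
    """Return (quantised levels, local decoded block) for one 4x4 block."""
    residual = [[o - p for o, p in zip(ro, rp)]
                for ro, rp in zip(original, predicted)]          # subtracter 102
    coeffs = forward_transform(residual)                         # unit 103
    levels = [[round(c / qstep) for c in row] for row in coeffs]  # unit 104
    dequant = [[lv * qstep for lv in row] for row in levels]     # unit 106
    rec_residual = inverse_transform(dequant)                    # unit 107
    local_decoded = [[p + r for p, r in zip(rp, rr)]
                     for rp, rr in zip(predicted, rec_residual)]  # adder 108
    return levels, local_decoded
```

The encoder reconstructs `local_decoded` from the quantised levels rather than reusing the original block so that its reference data matches what a decoder can reproduce.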
  • the frame selection unit 120 selects data stored in the frame memory 109, and the motion detection unit 121 performs pixel-by-pixel motion detection processing on the selected frame data and the encoding target frame data to determine the decimal pixel precision image generation method (208).
  • the motion detection process may use the block matching method that has been used in the conventional encoding process, or may be performed on a pixel-by-pixel basis in order to improve the accuracy of the decimal pixel image.
  • if the motion vector information were added to the encoded stream, the amount of data would become enormous. The amount of data can therefore be reduced by performing the same motion detection process on the decoding side as on the encoding side.
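For reference, the block matching method mentioned above can be sketched as an exhaustive search that minimises the sum of absolute differences (SAD). The SAD cost, the full-search strategy, and the names `sad` and `block_match` are assumptions chosen for illustration; the text only says that a conventional block matching method may be used.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_match(cur, ref, bx, by, size, search):
    """Full search over [-search, search] in both directions.

    Returns (dx, dy, cost) for the integer displacement with the lowest SAD.
    Displacements that would leave the reference frame are skipped.
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= h
                      and 0 <= bx + dx and bx + dx + size <= w)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

Since the search depends only on decoded data, a decoder running the same function on the same frames obtains the same vector, which is what allows the vector to be left out of the stream.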
  • the decimal pixel accuracy image generation unit 111 generates image data with decimal pixel accuracy using the motion vector detected by the decimal pixel accuracy image generation method determination unit 110 and a plurality of image data stored in the frame memory 109 (209).
  • motion detection / compensation processing is performed using the decimal pixel image data and the original image data generated by the decimal pixel image generation unit 111 to generate a predicted image (210).
  • the motion detection / compensation processing (210) may be performed by calculating a motion vector with decimal pixel accuracy using a block matching method used in the conventional encoding processing.
  • if motion vector information with decimal pixel accuracy were added to the encoded stream, the amount of data would become enormous; the amount of data can therefore be reduced by performing the same motion detection processing on the decoding side as on the encoding side.
  • the above processing is repeated until the processing of all the blocks in the frame of the input video is completed (211).
  • FIG. 5 shows an outline of the generation processing of the decimal pixel accuracy image.
  • the decimal pixel precision image generation unit 111 uses a plurality of image data 301 stored in the frame memory 109 and the motion vectors between them detected by the motion detection unit 121 to align the plurality of image data 301. It then multiplies the pixel values by predetermined coefficients and combines the pixels of the aligned images, thereby generating a decimal pixel precision image (also referred to as a high resolution image) 302.
  • FIG. 16 is a flowchart showing the flow of processing of the decimal pixel accuracy image generation unit 111.
  • the decimal pixel accuracy image generation unit 111 generates a high resolution image by three processes, for example, (1) position estimation, (2) wideband interpolation, and (3) weighted sum.
  • (1) Position estimation estimates the difference in sampling phase (sampling position) between the image data of a plurality of input image frames (1401, 1402).
  • (2) Wideband interpolation increases the image data density by interpolating to increase the number of pixels (sampling points), using a wide-band low-pass filter that passes all high-frequency components of the original signal, including aliasing components (1403).
  • (3) The weighted sum combines the densified data with weights corresponding to the sampling phase of each, which cancels the aliasing components generated during pixel sampling and at the same time restores the high-frequency components of the original signal (1404).
  • FIG. 7 shows an overview of this high-resolution image generation technology.
  • frame # 1 (501), frame # 2 (502), and frame # 3 (503) on different time axes are input and synthesized to obtain an output frame (506).
  • the signal waveform is displaced depending on the amount of movement (504) of the subject.
  • the position deviation amount is obtained by (1) position estimation, and as shown in FIG., frame #2 (502) is motion-compensated (507) so that the position deviation is eliminated, and the phase difference θ (511) between the sampling phases (509) and (510) of the pixels (508) of each frame is obtained.
  • based on this phase difference θ (511), performing (2) wideband interpolation and (3) weighted sum generates, as shown in FIG. E, a new pixel (512) at the position exactly midway between the original pixels (508) (the position of phase difference π), which realizes sub-pixel image generation.
  • the weighted sum will be described later.
  • the movement of the subject may be accompanied by movements such as rotation and enlargement / reduction as well as parallel movement, but if the time interval between frames is very small or the movement of the subject is slow, These movements can also be considered by approximating local translation.
  • the first configuration example of the decimal pixel precision image generation unit 111 can use the techniques described in Reference Document 1 (Japanese Patent Laid-Open No. 8-336046), Reference Document 2 (Japanese Patent Laid-Open No. 9-69755), and Reference Document 3 (Shin Aoki, "Super-resolution processing by multiple digital image data", Ricoh Technical Report, No. 24, pp. 19-25, November 1998).
  • in the sub-pixel image generation unit 111, when performing the weighted sum of (3) above, using signals of at least three frame images as shown in FIG. 8 makes it possible to generate a high-resolution image with twice the resolution in the one-dimensional direction.
  • FIG. 8 is a diagram showing the frequency spectrum of each component in a one-dimensional frequency region.
  • The distance from the frequency axis represents the signal intensity, and the rotation angle around the frequency axis represents the phase.
  • When the weighted sum is performed, the aliasing components (604), (605), and (606) of the frames cancel each other and are removed, so that only the original components are extracted.
  • For this, the vector sum of the aliasing components (604), (605), and (606) of the frames must be 0, that is, both the Re-axis (real axis) component and the Im-axis (imaginary axis) component must be 0.
  • At least three aliasing components are required to satisfy this condition. Therefore, using the signals of at least three frame images makes it possible to generate a decimal pixel image with twice the resolution, that is, to remove one aliasing component.
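The cancellation condition above can be written compactly. As a sketch only, assume frame $i$ receives a real weight $w_i$ and its aliasing component has phase $\theta_i$ relative to the original component; the symbols $w_i$ and $\theta_i$ and the unit-gain condition on the original component are introduced here for illustration and are not taken from the source:

```latex
\sum_{i=1}^{N} w_i = 1, \qquad
\sum_{i=1}^{N} w_i \cos\theta_i = 0, \qquad
\sum_{i=1}^{N} w_i \sin\theta_i = 0
```

The first equation keeps the original component at unit gain, and the other two set the Re-axis and Im-axis parts of the aliasing sum to zero. With real weights these are three equations, which for general phases $\theta_i$ is consistent with the statement that at least three frames are needed.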
  • A second configuration example of the decimal pixel precision image generation unit 111 of FIG. 1 is shown in FIG. 6.
  • With this configuration of the decimal pixel accuracy image generation unit 111, a decimal pixel image with twice the resolution in the one-dimensional direction can be generated using the signals of at least two frame images. Details are described below.
  • A plurality of frames, that is, the frame to be encoded and a frame that was encoded in the past, are input from the frame memory 109 to the input unit 400.
  • The position estimation unit 401 estimates the position of the corresponding pixel on frame #2 with reference to the sampling phase (sampling position) of the pixel to be processed on frame #1 input to the input unit 400, and obtains the sampling phase difference θ (402).
  • The up-raters 403 and 404 of the motion compensation/up-rate unit 415 use the phase difference θ (402) to motion-compensate frame #2 so that it is aligned with frame #1, and double the number of pixels of frame #1 and frame #2 to increase their density.
  • The phase shift unit 416 shifts the phase of the densified data by a fixed amount.
  • The π/2 phase shifters 406 and 408 can be used as the means for shifting the data phase by a fixed amount. To compensate for the delay introduced by the π/2 phase shifters 406 and 408, the densified signals of frame #1 and frame #2 are delayed by the delay units 405 and 407.
  • The output signals of the delay units 405 and 407 and the Hilbert transformers 406 and 408 are multiplied by the coefficients C0, C2, C1, and C3, respectively, which the coefficient determiner 409 generates based on the phase difference θ (402). The multiplications are performed by the multipliers 410, 411, 412, and 413, and the resulting signals are summed by the adder 414 to obtain the output, which is output from the output unit 418.
  • The position estimation unit 401 can be realized using the above-described conventional technique as it is. Details of the up-raters 403 and 404, the π/2 phase shifters 406 and 408, and the aliasing component removing unit 417 are described later.
  • FIG. 9 shows an operation in the second configuration example of the decimal pixel precision image generation unit 111.
  • This figure shows the outputs of the delay units 405 and 407 and the π/2 phase shifters 406 and 408 shown in FIG. 6 in the one-dimensional frequency domain.
  • The up-rated signals of frame #1 and frame #2 output from the delay units 405 and 407 are the sums of the original components 701 and 702 and the aliasing components 705 and 706, respectively, which are folded back from the original sampling frequency (fs).
  • The aliasing component 706 is rotated in phase by the phase difference θ (402) described above.
  • The up-rated signals of frame #1 and frame #2 output from the π/2 phase shifters 406 and 408 are the sums of the π/2-phase-shifted original components 703 and 704 and the π/2-phase-shifted aliasing components 707 and 708, respectively.
  • The original components and the aliasing components are extracted and shown separately in FIG. 9B and FIG. 9C.
  • When the vector sum of the four components shown in FIG. 9B is taken, the Re-axis component is set to 1 and the Im-axis component to 0; when the vector sum of the four components shown in FIG. 9C is taken, both the Re-axis and Im-axis components are set to 0.
  • In FIG. 10, the horizontal axis represents frequency and the vertical axis represents gain (the ratio of the output signal amplitude to the input signal amplitude), showing the frequency-gain characteristics of the up-raters 403 and 404.
  • In the up-raters 403 and 404, a frequency (2fs) twice the sampling frequency (fs) of the original signal is taken as the new sampling frequency, and a new pixel is placed at the midpoint of each original pixel interval.
  • Because of the symmetry of the digital signal, the frequency-gain characteristic repeats at every integer multiple of 2fs.
  • FIG. 11 shows filter tap coefficients obtained by inverse Fourier transform of the frequency characteristics shown in FIG. 10.
  • The phase difference θ (402) may be expressed as a phase difference in integer pixel units (multiples of 2π) plus a phase difference in decimal pixel units. The phase difference in integer pixel units is compensated by a simple pixel shift, and the filters of the up-raters 403 and 404 may be used to compensate the phase difference in decimal pixel units.
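The split into an integer-pixel part and a decimal-pixel part can be sketched as follows. This is a minimal illustration, assuming a displacement measured in original-pixel units with one pixel corresponding to a phase of 2π; the function names, the shift direction, and the edge clamping are hypothetical choices, not details from the source.

```python
import math

def split_phase(displacement_px):
    """Split a displacement (in original-pixel units) into an integer pixel
    shift and a fractional phase difference theta in [0, 2*pi)."""
    whole = math.floor(displacement_px)
    frac = displacement_px - whole
    return whole, 2.0 * math.pi * frac

def shift_integer(row, whole):
    """Compensate the integer part with a plain pixel shift.
    Samples that would fall outside the row are clamped to the edge (assumption)."""
    n = len(row)
    return [row[min(max(i + whole, 0), n - 1)] for i in range(n)]
```

Only the fractional phase returned by `split_phase` would then need to be handled by the interpolating filter.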
  • FIG. 12 shows the frequency-gain characteristics of the π/2 phase shifters 406 and 408 used in the second configuration example of the decimal pixel image generation unit 111.
  • Generally known Hilbert transformers can be used as the π/2 phase shifters 406 and 408.
  • In the gain plot, the horizontal axis represents frequency and the vertical axis represents gain (the ratio of the output signal amplitude to the input signal amplitude), showing the frequency-gain characteristic of the Hilbert transformer.
  • A frequency (2fs) twice the sampling frequency (fs) of the original signal is taken as the new sampling frequency, and the pass band has a gain of 1.0 for all frequency components between -fs and +fs except 0.
  • In the phase plot, the horizontal axis represents frequency and the vertical axis represents phase difference (the difference of the output signal phase from the input signal phase), showing the frequency-phase difference characteristic of the Hilbert transformer.
  • The phase of frequency components between 0 and fs is delayed by π/2, and the phase of frequency components between 0 and -fs is advanced by π/2.
  • The characteristic repeats at every integer multiple of 2fs.
  • FIG. 13 shows filter tap coefficients obtained by inverse Fourier transform of the frequency characteristics shown in FIG. 12.
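As a rough illustration of such tap coefficients, the sketch below builds a truncated Hilbert transformer from the textbook ideal impulse response (2/(πk) for odd k, 0 for even k). The tap count, the optional Hann window, and the function names are assumptions made for this example; the values plotted in FIG. 13 are not reproduced here.

```python
import math

def hilbert_taps(half_len, windowed=True):
    """Return 2*half_len + 1 taps of a truncated Hilbert transformer."""
    taps = []
    for k in range(-half_len, half_len + 1):
        # Ideal response: 2/(pi*k) for odd k, 0 for even k (including k = 0).
        h = 2.0 / (math.pi * k) if k % 2 else 0.0
        if windowed:
            # Hann window to soften the truncation (the window choice is an assumption).
            h *= 0.5 + 0.5 * math.cos(math.pi * k / (half_len + 1))
        taps.append(h)
    return taps

def response(taps, omega):
    """Frequency response sum_k h[k] * exp(-j*omega*k), omega in rad/sample."""
    half_len = len(taps) // 2
    return sum(h * complex(math.cos(omega * k), -math.sin(omega * k))
               for k, h in zip(range(-half_len, half_len + 1), taps))
```

Evaluating `response(hilbert_taps(15), math.pi / 2)` gives a value close to -j, that is, roughly unit gain with a π/2 phase delay at a positive frequency.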
  • A differentiator may also be used as the π/2 phase shifters 406 and 408 for generating the decimal pixel precision image data.
  • Differentiating the general expression cos(ωt + α) for a sine wave with respect to t and multiplying the result by 1/ω realizes the π/2 phase shift function.
  • That is, the π/2 phase shift function may be realized by differentiation followed by a filter with a frequency-amplitude characteristic of 1/ω.
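The differentiation argument can be checked in one line; this is the standard trigonometric identity written out with the same symbols ω, t, and α as in the text:

```latex
\frac{1}{\omega}\,\frac{d}{dt}\cos(\omega t + \alpha)
  = -\sin(\omega t + \alpha)
  = \cos\!\left(\omega t + \alpha + \frac{\pi}{2}\right)
```

The amplitude is unchanged and only the phase moves by π/2, which is why the 1/ω amplitude correction is needed after the differentiator.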
  • The coefficient determiner (409) used in the second configuration example of the decimal pixel accuracy image generation unit 111 is described with reference to FIG. 14.
  • As shown in FIG. 9, if the coefficient by which each component is multiplied is determined so that the vector sum of the four components shown in FIG. 9B has a Re-axis component of 1 and an Im-axis component of 0, and the vector sum of the four components shown in FIG. 9C has Re-axis and Im-axis components that are both 0, an image signal processing apparatus that generates a decimal pixel image with twice the resolution can be realized using only two frame images.
  • C0 is the coefficient for the output of the delay unit 405 (the sum of the original component and the aliasing component of frame #1 after the up-rate).
  • C1 is the coefficient for the output of the π/2 phase shifter 406 (the sum of the π/2 phase shift results of the original component and the aliasing component of frame #1 after the up-rate).
  • C2 is the coefficient for the output of the delay unit 407 (the sum of the original component and the aliasing component of frame #2 after the up-rate).
  • C3 is the coefficient for the output of the Hilbert transformer 408 (the sum of the π/2 phase shift results of the original component and the aliasing component of frame #2 after the up-rate).
  • The coefficient determiner 409 may output the coefficients C0, C1, C2, and C3 obtained in this way.
  • FIG. 14D shows the values of the coefficients C0, C1, C2, and C3 when the phase difference θ (402) is changed from 0 to 2π in steps of π/8. This corresponds to estimating the position of the signal of the original frame #2 with an accuracy of 1/16 pixel and performing motion compensation with respect to frame #1.
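The two vector-sum conditions can be checked numerically. The sketch below is an illustration under stated assumptions, not the table of FIG. 14D: it assumes the original components have phase 0 at the delay outputs and -π/2 at the phase shifter outputs, and that the aliasing components have phases 0 and θ at the delay outputs and are advanced by π/2 at the phase shifter outputs. The closed form for the coefficients is derived here from those assumed phasors, and the function names are hypothetical.

```python
import cmath
import math

def coefficients(theta):
    """Return (C0, C1, C2, C3) for a phase difference theta that is not a
    multiple of 2*pi (closed form derived under the assumed phasors below)."""
    k = 0.5 / math.tan(theta / 2.0)  # equals (1 + cos(theta)) / (2 * sin(theta))
    return 0.5, -k, 0.5, k

def vector_sums(theta):
    """Weighted vector sums of the original and the aliasing components."""
    c = coefficients(theta)
    rot = cmath.exp(1j * theta)
    # Order: delay 405, phase shifter 406, delay 407, phase shifter 408.
    original = (1, -1j, 1, -1j)        # assumed phasors of the original components
    aliasing = (1, 1j, rot, 1j * rot)  # assumed phasors of the aliasing components
    return (sum(ci * p for ci, p in zip(c, original)),
            sum(ci * p for ci, p in zip(c, aliasing)))
```

Calling `vector_sums(step * math.pi / 8)` for `step` from 1 to 15 returns an original sum close to 1 and an aliasing sum close to 0. At θ = 0 the closed form is undefined, since two frames with the same sampling phase carry no extra information.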
  • a general window function such as a Hanning window function or a Hamming window function
  • By configuring the decimal pixel accuracy image generation unit 111 in FIG. 1 as described with reference to FIGS. 6 to 14, high-precision decimal pixel accuracy image data can be generated from a plurality of frames.
  • In particular, with the second configuration example of the decimal pixel accuracy image generation unit 111, one piece of high-precision decimal pixel accuracy image data can be generated from two frames, so encoding is possible with a smaller amount of memory than in the first configuration example.
  • Next, the decimal pixel accuracy image generation processing when an intra prediction frame is used as a reference frame is described. Since past frames cannot be referred to for an intra prediction frame, a motion vector is searched for within the picture, as shown in FIG. 15, to find the block closest to the encoding target block. The search unit may be a single pixel instead of a block.
  • The decimal pixel generation process after the motion vector is calculated is the same as the method used between frames.
  • A decimal pixel image cannot be generated when the position indicated by the motion vector has integer pixel accuracy. Therefore, based on the result of the decimal pixel accuracy image generation method determination unit 110, it is determined whether or not the motion amount has integer pixel accuracy.
  • When the motion detection position is an integer pixel accuracy position, the decimal pixel accuracy image is generated by the second method, that is, the filter interpolation used in conventional coding standards. Adding to the stream information that indicates whether the conventional second method or the first method is used allows the decoding side to reproduce the decimal pixel precision image.
  • The unit of switching between the two methods may be a pixel block or a frame. In that case, information on whether switching is done in units of pixel blocks or in units of frames may be added to the stream.
  • The difference between the original image and the predicted image is evaluated, and if there is no difference, the frequency conversion, quantization, inverse quantization, inverse frequency conversion, motion detection, and motion compensation processes are omitted, which reduces the processing amount on the reproduction side.
  • A decimal pixel image can be generated from a larger amount of image data, so a higher-precision decimal pixel image is obtained, motion prediction accuracy improves, and encoding with a smaller amount of data becomes possible.
  • FIG. 17 illustrates a block diagram of the image decoding apparatus according to the present invention.
  • Reference numeral 1501 denotes a variable length decoding unit that decodes encoded data sent from the encoding side.
  • Reference numeral 1502 denotes a syntax analysis unit that analyzes the syntax of the data decoded by the variable length decoding unit 1501 and separates the encoded data into encoded image data and additional information.
  • Reference numeral 1503 denotes an inverse quantization unit that inversely quantizes data sent from the syntax analysis unit 1502.
  • Reference numeral 1504 denotes an inverse frequency conversion unit that generates motion prediction error data by performing inverse frequency conversion on the data inversely quantized by the inverse quantization unit 1503.
  • Reference numeral 1505 denotes an adder that generates image data by adding the motion prediction error subjected to inverse frequency conversion by the inverse frequency conversion unit 1504 and the motion prediction image data stored in the frame memory 1510.
  • Reference numeral 1506 denotes a frame memory for storing reproduced image data obtained by addition by the adder 1505.
  • Reference numeral 1507 denotes a decimal pixel accuracy image generation method determination unit, and 1508 denotes a decimal pixel accuracy image generation unit.
  • The operation of the decimal pixel accuracy image generation method determination unit 1507 is the same as that of the decimal pixel accuracy image generation method determination unit 110 of the image encoding device shown in FIG. 1. That is, the decimal pixel accuracy image generation method determination unit 1507 determines the decimal pixel accuracy image generation method using the frame selection unit 1520, the motion detection unit 1521, and the determination unit 1522.
  • The frame selection unit 1520 reads the reproduced image data of predetermined frames from the frame memory 1506 according to a predetermined procedure.
  • The motion detection unit 1521 performs motion detection on the reproduced image data of the two frames read by the frame selection unit 1520.
  • The determination unit 1522 selects a decimal pixel accuracy image generation method according to the motion detection result and the like.
  • The details of the operation of the decimal pixel accuracy image generation method determination unit 1507 are the same as those described for the decimal pixel accuracy image generation method determination unit 110 of the image encoding device shown in FIG. 1. That is, for example, according to the flowchart of the determination process shown in FIG. 3, either the first method or the second method shown in FIG. 2 is selected as the decimal pixel precision image generation method.
  • The operation of the decimal pixel accuracy image generation unit 1508 is the same as that of the decimal pixel accuracy image generation unit 111 of the image encoding device shown in FIG. 1.
  • Because the decimal pixel accuracy image generation method determination unit 1507 and the decimal pixel accuracy image generation unit 1508 operate in the same way as the decimal pixel accuracy image generation method determination unit 110 and the decimal pixel accuracy image generation unit 111 of the image encoding device illustrated in FIG. 1, the image decoding apparatus shown in FIG. 17 can generate the highly accurate decoded image assumed by the image encoding apparatus.
  • Reference numeral 1509 denotes a motion compensation unit that generates a decoded image from the motion vector sent from the syntax analysis unit 1502 and the image data generated by the decimal pixel precision image generation unit 1508.
  • Reference numeral 1510 denotes a frame memory that stores the decoded image data generated by the motion compensation unit 1509.
  • Reference numeral 1511 denotes a video display device that reads out and outputs decoded data stored in the frame memory 1510.
  • The image decoding apparatus in FIG. 17 can decode the stream encoded by the image encoding apparatus in FIG. 1.
  • a detailed image decoding method in the image decoding apparatus will be described below.
  • FIG. 18 is a flowchart showing the entire image decoding process.
  • The encoded stream shown in FIG. 19 is encoded by, for example, an image encoding device.
  • The data area 1701 stores, for example, a determination flag indicating whether or not there is a difference. The data area 1702 stores, for example, a determination flag (1707) indicating whether or not to perform motion detection, motion vector information (1708) generated by the decimal pixel precision image generation unit 1508, and a flag (1709) for integer pixel position determination.
  • The data area 1703 stores quantization parameters, quantization steps, coefficients multiplied by these, or matrix number information used in the encoding process.
  • The data area 1704 stores the frequency conversion type and block size.
  • The data area 1705 stores information on the type of resolution enhancement method used by the decimal pixel image generation unit 111, and the data area 1706 stores the coefficients obtained by frequency conversion and quantization of the difference image between the original image and the predicted image.
  • The flags and data are sent to the respective processing units: the inverse quantization unit 1503, the inverse frequency transform unit 1504, the motion compensation unit 1509, the decimal pixel accuracy image generation method determination unit 1507 (which performs motion detection), the frame memory 1510, and the decimal pixel accuracy image generation unit 1508.
  • The inverse quantization unit 1503 performs an inverse quantization process using the data sent from the syntax analysis unit 1502 (1603).
  • When the encoded stream was encoded by the image encoding device in FIG. 1, the inverse quantization process in the inverse quantization unit 1503 is the inverse of the process in the quantization unit 104 in FIG. 1.
  • This is the same processing as that of the inverse quantization unit 106 in FIG. 1. It may use the inverse quantization of conventional decoding techniques, or it may multiply by the quantization step stored in the data area 1703.
  • The inverse frequency transform unit 1504 performs an inverse frequency transform process on the data inversely quantized by the inverse quantization unit 1503 (1604).
  • The inverse frequency transform unit 1504 performs the inverse frequency transform process using the frequency transform type and frequency transform block size information sent from the syntax analysis unit 1502.
  • The inverse frequency conversion process may use a technique from conventional image decoding.
  • Motion detection is performed using the data obtained through the adder 1505 from the output of the inverse frequency conversion unit 1504 and the data stored in the frame memory 1506 (1605).
  • The determined motion search method may be acquired from the syntax analysis unit 1502.
  • A decimal pixel image is generated using the image data stored in the frame memory 1506 and the motion vector acquired from the syntax analysis unit 1502 (1606).
  • The decimal pixel accuracy image generation unit 1508 generates a decimal pixel accuracy image using the motion vector and a plurality of image data, as in the case of the decimal pixel image generation unit 111 of FIG. 1.
  • The content of the decimal pixel accuracy image generation processing is the same as that described for the decimal pixel accuracy image generation unit 111 in FIG. 1.
  • the decimal pixel accuracy image generation unit 1508 performs high-resolution processing using a motion vector, a plurality of images, and their aliasing distortions, as described for the decimal pixel accuracy image generation unit 111 in FIG. 1. As a result, it is possible to increase the resolution of the decimal pixel precision image data referred to by the motion compensation unit 1509.
  • If the determination flag indicates that there is no difference, the process ends without performing the inverse quantization, inverse frequency conversion, motion detection, decimal pixel image generation, and motion compensation processing.
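The early exit and the choice of generation method can be summarized in a small control-flow sketch. The class, field, and step names below are hypothetical labels chosen for this illustration; only the step numbers in the comments come from the flowchart description, and the real bitstream syntax is not modeled.

```python
from dataclasses import dataclass

@dataclass
class StreamFlags:
    has_difference: bool          # determination flag in data area 1701
    integer_pixel_position: bool  # integer pixel position flag (1709)

def decoding_steps(flags):
    """Return the ordered decoding steps implied by the flags."""
    if not flags.has_difference:
        # No residual: stop before inverse quantization (1603) and later steps.
        return []
    # Integer-pixel motion falls back to conventional filter interpolation.
    method = ("filter_interpolation" if flags.integer_pixel_position
              else "multi_frame_generation")
    return [
        "inverse_quantization",         # 1603
        "inverse_frequency_transform",  # 1604
        "method_determination",         # 1605
        method,                         # 1606
        "motion_compensation",
    ]
```

For example, `decoding_steps(StreamFlags(True, False))` lists all five steps with the multi-frame generation method selected.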
  • In the decimal pixel precision image generation method determination processing 1605 in FIG. 18, when the detection result is determined to be an integer pixel position, or when the syntax analysis unit determines that the encoding side generated the decimal pixel image by the pixel filter interpolation used in conventional moving image encoding methods, the decimal pixel image is generated by that pixel filter interpolation.
  • The intra-screen prediction unit 1512 generates a prediction image and stores the data in the frame memory 1506.
  • the decoded image is output to a video display device 1511 such as a TV, a PC monitor, or a projector, for example.
  • With the image decoding device and the image decoding method described above, the resolution of the decimal pixel precision image data referred to in the motion search process can be restored, so a higher-definition decoded image can be generated.
  • In addition, based on the difference information between the original image and the predicted image stored in the encoded stream, the frequency conversion, quantization, inverse quantization, inverse frequency conversion, motion detection, and motion compensation processes can be omitted, which reduces the data processing amount on the decoding side.
  • The encoding processing by the image encoding device and the decoding processing by the image decoding device may be performed using a computer device such as a PC (Personal Computer) or EWS (Engineering Work Station).
  • In that case, a program that controls the above processing is provided to the computer device on a recording medium (hard disk, optical disk, magneto-optical disk, etc.) or via a transmission line or an appropriate network.
  • the image encoding device, image encoding method, image decoding device, and image decoding method of the present invention can be used for image encoding processing, image decoding processing, and the like based on various standards. Further, the super-resolution processing used in the first method and the third method, and the filter interpolation calculation processing used in the second method and the fourth method can be appropriately changed.

Abstract

A reference image having a higher resolution is generated to contribute to improvement of motion prediction accuracy from the viewpoint of encoding. An image encoding apparatus is capable of encoding the difference between original image data and motion prediction image data, and generating the motion prediction image data on the basis of local decoded image data that can be acquired by decoding the encoded data. The image encoding apparatus includes a frame memory (109) that maintains the local decoded image data across a plurality of frames, and decimal pixel image data processing units (110) and (111) for generating decimal pixel precision image data by using local decoded image data of a frame to be encoded that is maintained by the frame memory and local decoded image data of the preceding frame. The image encoding apparatus executes motion detection by using the decimal pixel precision image data generated by the decimal pixel image data processing units as reference image data, and generates motion prediction image data.

Description

Image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method
The present invention relates to moving picture compression and decompression techniques that use a compensation technique, and more particularly to an apparatus and method for image encoding or decoding that use decimal pixel precision images for the compensation technique.
In image encoding and decoding processes such as MPEG-2, MPEG-4, and H.264, as described in Non-Patent Document 1, a reference image is generated by filter interpolation from adjacent pixels in order to perform motion detection and motion compensation with decimal pixel accuracy. The decimal-pixel-accuracy reference image generated by filter interpolation is obtained by computing, through filter interpolation, decimal-pixel-accuracy pixel values around the pixel with the minimum cost function found in the integer-accuracy motion search. Compared with using integer-pixel-accuracy image data as the reference image, this improves the quality of the image reproduced by decoding.
In existing image coding standards such as MPEG-2, MPEG-4, and H.264, the filter interpolation that produces the reference image for decimal-pixel-accuracy motion compensation in encoding and decoding uses a 2-tap filter in the case of MPEG-2. Because this is simple pixel interpolation, high-frequency components are cut and motion prediction accuracy decreases. MPEG-4 ASP and H.264 apply 8-tap and 6-tap filters, respectively, but the adjustment of high-frequency components is still insufficient, and improving coding efficiency through better prediction accuracy has remained a challenge. In all of these cases, the conventional reference image for decimal-pixel-accuracy motion compensation is generated only from image data within the frame to which the macroblock being processed belongs.
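To make the contrast between the conventional filters concrete, the sketch below computes a half-sample value on a one-dimensional row with a 2-tap average and with a 6-tap filter. The source gives no tap values; the taps (1, -5, 20, 20, -5, 1)/32 are the well-known H.264 luma half-sample taps, while the function names and the edge clamping are assumptions made for this example.

```python
def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))

def half_pel_2tap(row, i):
    """Half-sample value between row[i] and row[i + 1] by rounded averaging."""
    return (row[i] + row[i + 1] + 1) >> 1

SIX_TAPS = (1, -5, 20, 20, -5, 1)  # taps sum to 32

def half_pel_6tap(row, i):
    """Half-sample value between row[i] and row[i + 1] with the 6-tap filter."""
    n = len(row)
    acc = 0
    for t, coeff in enumerate(SIX_TAPS):
        j = min(max(i - 2 + t, 0), n - 1)  # clamp at the row edges (assumption)
        acc += coeff * row[j]
    return clip8((acc + 16) >> 5)  # round, divide by 32, clip
```

On the step edge `[0, 0, 0, 255, 255, 255]` both filters return 128 at the edge itself (`i = 2`), but one sample to the left (`i = 1`) the 6-tap sum is negative before clipping, which shows the negative lobes that preserve more high-frequency content than plain averaging. Both filters still use samples from a single frame only.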
The present invention has been made in view of the above problems, and an object thereof is to generate a higher-resolution reference image in apparatuses and methods for image encoding and image decoding. Another object is to improve motion prediction accuracy. Yet another object is to contribute to higher image quality.
The above and other objects and novel features of the present invention will be apparent from the description of this specification and the accompanying drawings.
An outline of representative inventions disclosed in the present application is briefly described below.
That is, in an image data encoding technique, the difference between original image data and motion prediction image data is encoded, and the motion prediction image data is generated based on local decoded image data obtained by decoding the encoded data. In doing so, decimal pixel precision image data is generated from a frame memory that holds local decoded image data for a plurality of frames, using the local decoded image data of the frame to be encoded and the local decoded image data of an earlier frame, and motion detection is performed using the decimal pixel precision image data as reference image data to generate the motion prediction image data.
Likewise, in an image data decoding technique, image data is reproduced by adding motion prediction image data to the motion prediction error obtained by decoding encoded image data separated from encoded data that was encoded by motion prediction, and the motion prediction image data is generated based on the reproduced image data. In doing so, decimal pixel precision image data is generated from a frame memory that holds the reproduced image data for a plurality of frames, using the reproduced image data of the frame to be decoded and the reproduced image data of an earlier frame, and motion detection is performed using the generated decimal pixel precision image data as reference image data to generate the motion prediction image data.
In both encoding and decoding, the decimal pixel precision image data serving as reference image data can be generated using both the image data of the frame being processed and the image data of another frame apart from it, so decimal pixel precision image data of higher accuracy can be generated than when it is generated by interpolation alone.
The effects obtained by the representative inventions disclosed in the present application are briefly described below.
That is, a higher-resolution reference image can be generated. This contributes to improved motion prediction accuracy and, furthermore, to higher image quality.
FIG. 1 is a block diagram of an example of the configuration of the image encoding apparatus according to the present invention.
FIG. 2 is an explanatory diagram broadly classifying the decimal pixel accuracy image generation methods selected by the decimal pixel accuracy image generation method determination unit.
FIG. 3 is a flowchart illustrating the determination process in the decimal pixel accuracy image generation method determination unit.
FIG. 4 is a flowchart showing an example of the processing of the image encoding apparatus according to the present invention.
FIG. 5 is a diagram showing an outline of the resolution enhancement technique in the image encoding apparatus.
FIG. 6 is a block diagram of an example of the resolution enhancement processing in the image encoding apparatus.
FIG. 7 is a diagram showing an outline of the resolution enhancement technique in the image encoding apparatus.
FIG. 8 is an explanatory diagram of the phase relationship of the input signals in the first configuration example of the decimal pixel image generation unit in the image encoding apparatus.
FIG. 9 is an explanatory diagram of the phase relationship of the input signals in the second configuration example of the decimal pixel image generation unit in the image encoding apparatus.
FIG. 10 shows the frequency-gain characteristic of the up-rater used in the second configuration example of the decimal pixel image generation unit in the image encoding apparatus.
FIG. 11 shows an example of the filter tap coefficients obtained by inverse Fourier transform of the frequency characteristic of the up-rater used in the second configuration example of the decimal pixel image generation unit in the image encoding apparatus.
FIG. 12 shows the frequency-gain characteristic of the π/2 phase shifter used in the second configuration example of the decimal pixel image generation unit in the image encoding apparatus.
FIG. 13 shows the filter tap coefficients obtained by inverse Fourier transform of the frequency characteristic of the π/2 phase shifter used in the second configuration example of the decimal pixel image generation unit in the image encoding apparatus.
FIG. 14 shows an example of the coefficients of the coefficient determiner used in the second configuration example of the decimal pixel image generation unit in the image encoding apparatus.
FIG. 15 is an explanatory diagram of the intra-screen motion search in the image encoding apparatus.
FIG. 16 is a flowchart of an example of the resolution enhancement processing in the image encoding apparatus.
FIG. 17 is a block diagram of an example of the configuration of the image decoding apparatus according to the present invention.
FIG. 18 is a flowchart showing an example of the processing of the image decoding apparatus according to the present invention.
FIG. 19 is a diagram showing an example of a bitstream according to the present invention.
DESCRIPTION OF SYMBOLS
 101 ... Original image memory
 102 ... Subtractor
 103 ... Frequency transform unit
 104 ... Quantization unit
 105 ... Variable-length coding unit
 106, 1503 ... Inverse quantization unit
 107, 1504 ... Inverse frequency transform unit
 108, 1505 ... Adder
 109, 113, 1506, 1510 ... Frame memory
 110, 1507 ... Fractional-pixel-accuracy image generation method determination unit
 111, 1508 ... Fractional-pixel-accuracy image generation unit
 112, 1509 ... Motion detection / motion compensation unit
 114, 1512 ... Intra-frame prediction unit
 401 ... Position estimation unit
 403, 404 ... Up-rater
 406, 407 ... Phase shifter
 410, 411, 412, 413 ... Multiplier
 409 ... Coefficient determination unit
 1501 ... Variable-length decoding unit
 1502 ... Syntax analysis unit
 1511 ... Video display device
 1. Overview of Embodiments
 First, an overview is given of representative embodiments of the invention disclosed in this application. Reference numerals in the drawings that are cited in parentheses in this overview merely illustrate what is included in the concept of the components to which they are attached.
 [1] <<Image Encoding Apparatus>> An image encoding apparatus that encodes the difference between original image data and motion-predicted image data, and that can generate the motion-predicted image data based on local decoded image data obtained by decoding the encoded data, includes: a frame memory (109) that holds the local decoded image data for a plurality of frames; and a fractional-pixel image data processing unit (110, 111) that generates fractional-pixel-accuracy image data using the local decoded image data of the encoding target frame and the local decoded image data of an earlier frame, both held in the frame memory. The image encoding apparatus performs motion detection using the fractional-pixel-accuracy image data generated by the fractional-pixel image data processing unit as reference image data, and thereby generates the motion-predicted image data.
 With this configuration, the fractional-pixel-accuracy image data serving as reference image data can be generated from both the local decoded image data of the encoding target frame and the local decoded image data of another frame separated from that frame. It can therefore be more accurate than fractional-pixel-accuracy image data generated by interpolation alone.
 This helps improve motion prediction accuracy on the encoding side and, in turn, improves the quality of the image obtained after encoding and decoding.
 [2] In the image encoding apparatus of item 1, the fractional-pixel image data processing unit determines whether the amount of motion between the local decoded image data of the encoding target frame stored in the frame memory and the local decoded image data of a previous frame within a predetermined range before it has fractional-pixel accuracy. When it does, the unit generates the fractional-pixel-accuracy image data using the local decoded image data of that previous frame and of the encoding target frame. When it does not, the unit generates the fractional-pixel-accuracy image data by an interpolation operation on the local decoded image data of the encoding target frame.
 When the amount of motion has integer-pixel accuracy (for example, for a still image), performing the former (two-frame) processing gives substantially the same result as the latter (interpolation). Selecting interpolation in that case eliminates wasted data processing and reduces both the processing amount and the processing time. In addition, the further the previous frame is from the encoding target frame, the less the former processing can be expected to improve accuracy or image quality. For this reason, if no frame with fractional-pixel-accuracy motion is found within a predetermined number of frames of the encoding target frame, the unit selects interpolation without examining any earlier frames, which keeps wasted processing to a minimum.
 [3] In the image encoding apparatus of item 2, the fractional-pixel image data processing unit limits the frames examined for fractional-pixel accuracy to I pictures (Intra-Pictures), which are encoded using only information within the picture without temporal motion prediction, and P pictures (Predictive-Pictures), which are obtained by forward inter-picture predictive coding. B pictures (Bi-directional Predictive-Pictures), which are obtained by predictive coding from both past and future pictures, are excluded. This allows fractional-pixel-accuracy image data of higher accuracy or image quality to be generated.
 [4] In the image encoding apparatus of item 3, the fractional-pixel image data processing unit determines the frame type and whether the motion has fractional-pixel accuracy for the previous frames within the predetermined range in order, starting from the previous frame closest to the encoding target frame. This is because the closer the frame whose local decoded image data is used is to the encoding target frame, the higher the image quality (accuracy) of the fractional-pixel-accuracy image data that can be generated.
 [5] In the image encoding apparatus of item 1, the fractional-pixel image generation unit, for example, applies phase shift processing to each of a plurality of image signals of the image data to generate a plurality of new image signals, multiplies the original image signals and the new image signals by coefficients, and combines them to generate the fractional-pixel-accuracy image data.
 [6] <<Image Encoding Method>> The image encoding method includes the following processes (a) to (l):
 (a) a read process of reading motion-predicted image data from a predicted image memory;
 (b) a difference process of calculating the difference between the read motion-predicted image data and input image data as prediction error data;
 (c) a frequency transform process of frequency-transforming the prediction error data calculated in the difference process;
 (d) a quantization process of quantizing the data frequency-transformed in the frequency transform process;
 (e) a variable-length coding process of variable-length coding the data quantized in the quantization process to generate an encoded stream;
 (f) an inverse quantization process of inversely quantizing the data quantized in the quantization process;
 (g) an inverse frequency transform process of reproducing the prediction error data by inverse frequency-transforming the data inversely quantized in the inverse quantization process;
 (h) an addition process of adding the prediction error data reproduced in the inverse frequency transform process to the motion-predicted image data and outputting local decoded image data;
 (i) a process of storing the local decoded image data calculated in the addition process in a frame memory;
 (j) a fractional-pixel image data process of generating fractional-pixel-accuracy image data using the local decoded image data of the encoding target frame and the local decoded image data of an earlier frame, both held in the frame memory;
 (k) a motion detection / motion compensation process of performing motion detection using the fractional-pixel-accuracy image data generated in the fractional-pixel image data process as reference image data to generate a predicted image; and
 (l) a write process of writing the motion-predicted image data generated in the motion detection / motion compensation process to the predicted image memory.
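The reason the encoder runs its own inverse path, steps (f) to (h), is that its reference data must match what the decoder will later reconstruct. The following is a minimal sketch of that loop for one block of samples. It is not the apparatus described here: the frequency transform of steps (c) and (g) is omitted for brevity, and the names `encode_block`, `decode_block` and the uniform step size `qstep` are illustrative assumptions.

```python
def encode_block(original, predicted, qstep=4):
    """Toy version of steps (b), (d), (f) and (h) for one block of samples.

    Assumption: the frequency transform (c)/(g) is treated as the identity,
    and quantization is a plain uniform quantizer with step `qstep`.
    """
    # (b) prediction error between the input and the motion-predicted block
    residual = [o - p for o, p in zip(original, predicted)]
    # (d) quantization: this is the lossy step
    levels = [round(r / qstep) for r in residual]
    # (f) inverse quantization, exactly as the decoder will do it
    reconstructed_residual = [level * qstep for level in levels]
    # (h) local decoded data = prediction + reconstructed prediction error
    local_decoded = [p + r for p, r in zip(predicted, reconstructed_residual)]
    return levels, local_decoded


def decode_block(levels, predicted, qstep=4):
    """Decoder-side reconstruction from the transmitted levels."""
    return [p + level * qstep for p, level in zip(predicted, levels)]


levels, local_decoded = encode_block([100, 104, 98, 97], [98, 98, 98, 98])
# The encoder's local decoded block equals what the decoder reconstructs.
assert local_decoded == decode_block(levels, [98, 98, 98, 98])
```

Because both sides build their reference frames from the same reconstructed samples, any fractional-pixel-accuracy image derived from those frames is also identical on both sides.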
 [7] <<Image Decoding Apparatus>> An image decoding apparatus receives encoded data encoded using motion prediction, separates it into encoded image data and additional information, reproduces image data by adding motion-predicted image data to the motion prediction error obtained by decoding the separated encoded image data, and can generate the motion-predicted image data based on the reproduced image data. The apparatus includes: a frame memory (1506) that holds the reproduced image data for a plurality of frames; and a fractional-pixel image data processing unit (1507, 1508) that generates fractional-pixel-accuracy image data using the reproduced image data of the decoding target frame and the reproduced image data of an earlier frame, both held in the frame memory. The image decoding apparatus performs motion detection using the fractional-pixel-accuracy image data generated by the fractional-pixel image data processing unit as reference image data, and thereby generates the motion-predicted image data.
 With this configuration, the fractional-pixel-accuracy image data serving as reference image data can be generated from both the reproduced image data of the decoding target frame and the reproduced image data of another frame separated from that frame. It can therefore be more accurate than fractional-pixel-accuracy image data generated by interpolation alone.
 This helps improve motion prediction accuracy on the decoding side and, in turn, improves the quality of the image obtained after encoding and decoding.
 [8] In the image decoding apparatus of item 7, the fractional-pixel image data processing unit determines whether the amount of motion between the reproduced image data of the decoding target frame stored in the frame memory and the reproduced image data of a previous frame within a predetermined range before it has fractional-pixel accuracy. When it does, the unit generates the fractional-pixel-accuracy image data using the reproduced image data of that previous frame and of the decoding target frame. When it does not, the unit generates the fractional-pixel-accuracy image data by an interpolation operation on the reproduced image data of the decoding target frame. This eliminates wasted data processing and helps reduce the processing amount and processing time.
 [9] In the image decoding apparatus of item 8, the fractional-pixel image data processing unit limits the frames examined for fractional-pixel accuracy to I pictures and P pictures. This allows fractional-pixel-accuracy image data of higher accuracy or image quality to be generated.
 [10] In the image decoding apparatus of item 9, the fractional-pixel image data processing unit determines the frame type and whether the motion has fractional-pixel accuracy for the previous frames within the predetermined range in order, starting from the previous frame closest to the decoding target frame. This is because the closer the frame whose reproduced image data is used is to the decoding target frame, the higher the image quality (accuracy) of the fractional-pixel-accuracy image data that can be generated.
 [11] In the image decoding apparatus of item 7, the fractional-pixel image generation unit, for example, applies phase shift processing to each of a plurality of image signals of the image data to generate a plurality of new image signals, multiplies the original image signals and the new image signals by coefficients, and combines them to generate the fractional-pixel-accuracy image data.
 [12] <<Image Decoding Method>> The image decoding method includes the following processes (a) to (i):
 (a) a variable-length decoding process of receiving and variable-length decoding an encoded stream made up of encoded data encoded using motion prediction;
 (b) a syntax analysis process of separating the variable-length decoded data into encoded image data and additional information;
 (c) an inverse quantization process of inversely quantizing the separated encoded image data;
 (d) an inverse frequency transform process of reproducing the motion prediction error by inverse frequency-transforming the data inversely quantized in the inverse quantization process;
 (e) an addition process of reproducing image data by adding the motion-predicted image data in a predicted image memory to the motion prediction error reproduced in the inverse frequency transform process;
 (f) a reproduced image data write process of writing the image data reproduced in the addition process to a frame memory;
 (g) a fractional-pixel image data process of generating fractional-pixel-accuracy image data using the reproduced image data of the decoding target frame and the reproduced image data of an earlier frame, both held in the frame memory;
 (h) a motion compensation process of generating motion-predicted image data using the fractional-pixel-accuracy image data generated in the fractional-pixel image data process and the additional information; and
 (i) a write process of writing the motion-predicted image data generated in the motion compensation process to the predicted image memory.
 [13] In the image decoding method of item 12, the fractional-pixel image data process limits the frames examined for fractional-pixel accuracy to I pictures and P pictures.
 [14] In the image decoding method of item 13, the fractional-pixel image data process determines the frame type and whether the motion has fractional-pixel accuracy for the previous frames within the predetermined range in order, starting from the previous frame closest to the decoding target frame.
 [15] In the image decoding method of item 12, the fractional-pixel image data process applies phase shift processing to each of a plurality of image signals of the image data to generate a plurality of new image signals, multiplies the original image signals and the new image signals by coefficients, and combines them to generate the fractional-pixel-accuracy image data.
 2. Details of Embodiments
 The embodiments are now described in more detail. Modes for carrying out the invention are described below with reference to the drawings. In all drawings used to describe these modes, elements having the same function are given the same reference numerals, and repeated description of them is omitted.
 <<Image Encoding Apparatus and Image Encoding Method>>
 FIG. 1 shows an example of an image encoding apparatus according to the present invention. Reference numeral 101 denotes an original image memory that stores input image data. 102 denotes a subtractor that takes the difference between the input image data output from the original image memory and the predicted image data output from the frame memory 113. 103 denotes a frequency transform unit that transforms the difference image between the original image data and the predicted image data, computed by the subtractor 102, into the spatial frequency domain. 104 denotes a quantization unit that quantizes the data frequency-transformed by the frequency transform unit 103. 105 denotes a variable-length coding unit that variable-length codes the data quantized by the quantization unit 104. 106 denotes an inverse quantization unit that inversely quantizes the data quantized by the quantization unit 104. 107 denotes an inverse frequency transform unit that inverse frequency-transforms the data inversely quantized by the inverse quantization unit 106. 108 denotes an adder that adds the predicted image data stored in the frame memory 113 to the data inverse frequency-transformed by the inverse frequency transform unit 107. 109 denotes a frame memory that stores the data obtained by the addition in the adder 108 (the local decoded image data). 110 denotes a fractional-pixel-accuracy image generation method determination unit, and 111 denotes a fractional-pixel-accuracy image generation unit.
 The fractional-pixel-accuracy image generation method determination unit 110 and the fractional-pixel-accuracy image generation unit 111 together constitute a fractional-pixel image data processing unit that can generate fractional-pixel-accuracy image data using the local decoded image data of the encoding target frame and the local decoded image data of an earlier frame, both held in the frame memory 109. The determination unit 110 determines the generation method, and the generation unit 111 generates the fractional-pixel-accuracy image data from the local decoded image data according to the determined method. Details are given later.
 Reference numeral 112 denotes a motion detection / motion compensation unit that uses the fractional-pixel-accuracy image generated by the generation unit 111 as a reference image, detects an image close to the original image by motion detection, and generates predicted image data. 113 denotes a frame memory that stores the image (predicted image data) generated by the motion detection / motion compensation unit 112. 114 denotes an intra-frame prediction unit that generates a predicted image from the local decoded image data stored in the frame memory 109 using data within the frame.
 The methods of generating a fractional-pixel-accuracy image that are determined by the fractional-pixel-accuracy image generation method determination unit 110 fall broadly into A and B of FIG. 2. The first method, shown in A, generates the fractional-pixel-accuracy image data using the local decoded image data of two different frames: the encoding target frame and a frame before it. The resolution of the image blocks (for example, macroblocks) of the two frames is increased using the super-resolution processing described later. The second method, shown in B, generates the fractional-pixel-accuracy image data using the local decoded image data of a single frame, the encoding target frame. The resolution of the image block (for example, a macroblock) of that frame is increased using an interpolation operation. Because the first method can draw on both the local decoded image data of the encoding target frame and that of another frame separated from it, it can generate more accurate fractional-pixel-accuracy image data than the second method, which relies on interpolation alone. The second method, on the other hand, processes less data than the first, so it reduces the computational load of generating the fractional-pixel-accuracy image data.
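As a rough illustration of what the second method (single-frame interpolation) does, the sketch below doubles the sampling density of one row of samples by inserting a half-pixel sample between each pair of neighbours. The text here does not specify the interpolation filter, so the two-tap average used in this sketch is purely an assumption, as is the function name `interpolate_half_pixel`.

```python
def interpolate_half_pixel(row):
    """Insert a half-pixel sample between each pair of integer-position samples.

    Assumption: a simple two-tap average is used as the interpolation filter;
    the filter actually used by the apparatus is not stated in this section.
    """
    out = []
    for i, sample in enumerate(row):
        out.append(float(sample))  # integer-pixel position, copied as-is
        if i + 1 < len(row):
            # half-pixel position: average of the two neighbouring samples
            out.append((sample + row[i + 1]) / 2.0)
    return out


print(interpolate_half_pixel([10, 20, 40]))  # [10.0, 15.0, 20.0, 30.0, 40.0]
```

Every new sample here is computed from the current frame only, which is why this method is cheap but cannot recover detail that the frame itself does not contain. The first method differs precisely in that it brings in samples from a second frame.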
 Which of the two methods to use is decided, for example, by determining whether the amount of motion between the local decoded image data of the encoding target frame stored in the frame memory and the local decoded image data of a previous frame within a predetermined range before it has fractional-pixel accuracy. The first method is selected when it does, and the second method is selected when it does not.
 The candidate previous frame is selected as follows.
(1) Candidate frames are limited to I pictures (Intra-Pictures), which are encoded using only information within the picture without temporal motion prediction, and P pictures (Predictive-Pictures), which are obtained by forward inter-picture predictive coding. B pictures (Bi-directional Predictive-Pictures), which are obtained by predictive coding from both past and future pictures, are excluded. This allows fractional-pixel-accuracy image data of higher accuracy or image quality to be generated.
(2) Among the frames satisfying condition (1), the picture closest to the encoding target frame is taken as the candidate frame. This is because the closer the frame whose local decoded image data is used is to the encoding target frame, the higher the image quality (accuracy) of the fractional-pixel-accuracy image data that can be generated.
(3) If motion detection between the frame satisfying conditions (1) and (2) and the encoding target frame yields an amount of motion with integer-pixel accuracy in both the vertical and horizontal directions, the next closest frame to the encoding target frame that is an I or P picture becomes the candidate. This avoids the problem that super-resolution processing cannot be performed when the amount of motion has integer-pixel accuracy.
(4) If no frame satisfying (3) is found even after going back through the past frames in the predetermined range, candidate selection is abandoned and the second method, ordinary interpolation-based enlargement, is selected. The further the previous frame used by the first method is from the encoding target frame, the less improvement in accuracy or image quality can be expected. For this reason, if no frame with fractional-pixel-accuracy motion is found within a predetermined number of frames of the encoding target frame, interpolation is selected without examining any earlier frames, which keeps wasted processing to a minimum. Limiting the range of frames in this way also reduces the number of frames that must be kept in the frame memory, so fewer hardware resources are needed for the frame memory and cost can be reduced.
 In FIG. 1, the fractional-pixel-accuracy image generation method determination unit 110 decides between the first and second methods using a frame selection unit 120, a motion detection unit 121, and a determination unit 122. The frame selection unit 120 reads the local decoded image data of given frames from the frame memory 109 according to a predetermined procedure. The motion detection unit 121 performs motion detection on the local decoded image data of the two frames read by the frame selection unit. The determination unit 122 selects the first or second method, as described above, according to the motion detection result and other information.
 図3には前記小数画素精度画像生成方法決定部110における決定処理のフローチャートが例示される。前フレームの選択は符号化対象フレームの直前のフレームから開始される(130)。選択された前フレームが符号化対象フレームから所定フレーム数分だけ離れたフレームであるか否かが判別され(131)、離れていれば、第2方法が選択され、第2方法による小数画素精度画像データの生成の指示と必要なデータが小数画素精度画像生成部111に与えられる(132)。 FIG. 3 illustrates a flowchart of determination processing in the decimal pixel accuracy image generation method determination unit 110. The selection of the previous frame is started from the frame immediately before the encoding target frame (130). It is determined whether the selected previous frame is a frame separated from the encoding target frame by a predetermined number of frames (131). If the selected previous frame is separated, the second method is selected, and the fractional pixel accuracy according to the second method is selected. An instruction to generate image data and necessary data are provided to the decimal pixel precision image generation unit 111 (132).
If the previous frame is not separated by the predetermined number of frames, it is determined whether the previous frame is a B picture. If it is a B picture, the processing returns to step 131. If it is not a B picture, motion detection is performed between the previous frame and the encoding target frame (134). If the motion detection shows that the amount of motion has integer pixel accuracy, the processing likewise returns to step 131. If the amount of motion does not have integer pixel accuracy, an instruction to generate decimal pixel accuracy image data by the first method is given to the decimal pixel accuracy image generation unit 111 together with the necessary data (the local decoded image data of the encoding target frame, the local decoded image data of the selected previous frame, the image data of the motion detection result, and the motion vector information) (136).
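The decision flow of FIG. 3 can be sketched in Python as follows. This is only an illustrative sketch: the names `Frame`, `choose_method`, `estimate_motion`, and `max_distance` are hypothetical and do not appear in the specification. The test at step 131 is read here as "the candidate has reached the predetermined distance", and running out of stored frames is treated the same way; both readings are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

MotionVector = Tuple[float, float]  # (dx, dy) in pixels; hypothetical representation


@dataclass
class Frame:
    index: int          # hypothetical frame number
    is_b_picture: bool  # True if the frame was coded as a B picture


def has_integer_accuracy(mv: MotionVector, tol: float = 1e-6) -> bool:
    """True when both components of the motion are whole pixels."""
    return all(abs(c - round(c)) < tol for c in mv)


def choose_method(
    target: Frame,
    previous_frames: Sequence[Frame],  # nearest previous frame first (step 130)
    max_distance: int,                 # the "predetermined number of frames"
    estimate_motion: Callable[[Frame, Frame], MotionVector],
) -> Tuple[str, Optional[Frame], Optional[MotionVector]]:
    """Return ("first", frame, mv) or ("second", None, None)."""
    for distance, candidate in enumerate(previous_frames, start=1):
        if distance >= max_distance:      # step 131: too far from the target
            break
        if candidate.is_b_picture:        # B pictures are skipped
            continue
        mv = estimate_motion(target, candidate)  # step 134
        if has_integer_accuracy(mv):      # integer-pixel motion: try an older frame
            continue
        return ("first", candidate, mv)   # step 136
    return ("second", None, None)         # step 132: fall back to interpolation
```

The loop makes explicit that the first method is chosen only for the nearest non-B frame whose motion relative to the target has a fractional part, and that every other outcome ends in the interpolation-based second method.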
FIG. 4 shows the overall flow of the encoding processing, which is described below with reference to that figure. First, the encoding apparatus stores input image data in the original image memory 101 (201). The input image data is a digital signal such as an RGB signal or a Y, Cb, Cr signal. The original image memory 101 may store one frame of the input image, or the input image may be divided into a plurality of pixel blocks and stored in units of those blocks. Next, the difference between the original image data read from the original image memory 101 and the predicted image data is taken (202). If there is no difference between the original image data and the predicted image data, the encoding processing ends (203). Adding information indicating the absence of difference information to the stream at this point also simplifies the processing on the decoding side. If there is a difference, the difference image calculated by the subtracter 102 is converted into the frequency domain by the frequency conversion unit 103 using a frequency transform such as the discrete cosine transform (DCT). Transforms other than the DCT, such as the Hadamard transform or the Fourier transform, may also be used. When a plurality of frequency transforms are used, information identifying the type of transform may be added to the stream. The block size of the frequency transform may have equal vertical and horizontal dimensions, for example 8 × 8 pixels, or different ones, for example 16 × 8 pixels; in that case, the block size information may be added to the stream.
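As a rough illustration of the difference calculation (202), the early exit (203), and the subsequent frequency conversion, the following Python sketch computes a residual block and applies a separable DCT. The function names are hypothetical, and the orthonormal DCT-II scaling is an assumption; the specification does not fix a particular normalization.

```python
import math
from typing import List, Optional

Block = List[List[float]]


def dct_1d(values: List[float]) -> List[float]:
    """Orthonormal DCT-II of a 1-D sequence (assumed normalization)."""
    n = len(values)
    out = []
    for k in range(n):
        s = sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block: Block) -> Block:
    """Apply the 1-D DCT to rows, then to columns; works for non-square blocks."""
    rows = [dct_1d(list(r)) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def encode_block(original: Block, predicted: Block) -> Optional[Block]:
    """Return DCT coefficients of the residual, or None when there is no difference."""
    residual = [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(original, predicted)]
    if all(v == 0 for row in residual for v in row):
        return None  # step 203: nothing to transform for this block
    return dct_2d(residual)
```

Because `dct_1d` takes its length from the input, the same code handles both the 8 × 8 and the 16 × 8 block shapes mentioned above.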
The data frequency-converted by the frequency conversion unit 103 is quantized by the quantization unit 106 (205). The quantization may use a method based on an existing video coding standard, or a new quantization step may be determined. When a new quantization step is determined, quantization step information may be added to the stream. The data quantized by the quantization unit 106 is encoded by the variable length encoding unit 105. The variable length coding may use a method adopted in existing coding standards, such as CABAC (Context-Adaptive Binary Arithmetic Coding) or CAVLC (Context-Adaptive Variable Length Coding), or a new code table may be created. In the latter case, the code table information is added to the encoded stream.
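A minimal sketch of quantization with a freely chosen step, and of the matching inverse used when reconstructing the data, is shown below. It assumes a single uniform step for the whole block, which is a simplification; real coding standards use more elaborate schemes, and the function names are hypothetical.

```python
from typing import List


def quantize(coeffs: List[List[float]], step: float) -> List[List[int]]:
    """Map each coefficient to the nearest integer multiple of the step."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels: List[List[int]], step: float) -> List[List[float]]:
    """Reconstruct approximate coefficients from the quantized levels."""
    return [[q * step for q in row] for row in levels]
```

With this scheme the reconstruction error of each coefficient is at most half the step, which is why the step must reach the decoder whenever it is not fixed by a standard.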
Next, inverse quantization is performed by the inverse quantization unit 106 (206). The inverse quantization may likewise use a method based on an existing video coding standard. The data calculated by the inverse quantization unit 106 is inverse frequency transformed by the inverse frequency conversion unit 107 (207). The inverse frequency conversion unit 107 performs the inverse transform from the frequency domain to the spatial domain using the block size and the type of the frequency transform applied by the frequency conversion unit 103. The inverse-transformed data and the data stored in the frame memory 113 are added, and the result is stored in the frame memory 109. Next, as described above, the frame selection unit 120 selects data stored in the frame memory 109, and the motion detection unit 110 performs motion detection in pixel units on the data of the selected frame and the data of the encoding target frame to determine the decimal pixel accuracy image generation method (208). The motion detection may use the block matching method used in conventional encoding processing, or may be performed pixel by pixel to improve the accuracy of the decimal pixel image. Adding the resulting motion vector information to the encoded stream would make the amount of data enormous, so the amount of data can be reduced by having the decoding side perform the same motion detection as the encoding side. In that case, a table defining the motion detection methods is prepared on the encoding side, and the table number is added to the stream. When motion vector information is sent instead, the decoding side does not need to perform motion detection, and the motion detection method does not need to be added. The user can therefore decide whether to reduce the amount of data by adding the motion detection method, and adding that decision to the stream enables encoding suited to the processing performance of the hardware and the like. The decimal pixel accuracy image generation unit 111 generates image data with decimal pixel accuracy (209) using the motion vector detected by the decimal pixel accuracy image generation method determination unit 110 and the plurality of image data stored in the frame memory 109. The generation of decimal pixel accuracy image data by the first method is described in detail later. The second method, based on interpolation, is the same as the calculation used in the known decimal pixel accuracy motion search of MPEG-4, H.264/AVC, and the like, so its detailed description is omitted.
Next, motion detection and motion compensation are performed using the decimal pixel image data generated by the decimal pixel image generation unit 111 and the original image data, and a predicted image is generated (210). The motion detection and motion compensation (210) may calculate a motion vector with decimal pixel accuracy using the block matching method used in conventional encoding processing. Adding motion vector information with decimal pixel accuracy to the encoded stream would make the amount of data enormous, so the amount of data can be reduced by having the decoding side perform the same motion detection as the encoding side. The above processing is repeated until all blocks in the frame of the input video have been processed (211).
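The block matching of step 210 might look like the following sketch when the reference is a decimal pixel accuracy image with twice the pixel density of the original. The density factor of 2, the sum of absolute differences as the cost, the exhaustive search, and the function name are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def half_pel_block_match(
    cur: Image, ref2x: Image, top: int, left: int, size: int, search: int
) -> Optional[Tuple[int, int, int]]:
    """Return (cost, dy, dx) with dy and dx in half-pixel units, or None."""
    best = None
    height, width = len(ref2x), len(ref2x[0])
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = 2 * top + dy, 2 * left + dx          # block origin on the dense grid
            y1, x1 = y0 + 2 * (size - 1), x0 + 2 * (size - 1)
            if y0 < 0 or x0 < 0 or y1 >= height or x1 >= width:
                continue                                   # candidate leaves the reference
            cost = sum(
                abs(cur[top + i][left + j] - ref2x[y0 + 2 * i][x0 + 2 * j])
                for i in range(size)
                for j in range(size)
            )
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best
```

An odd value of `dx` or `dy` in the result corresponds to a displacement with a half-pixel fraction, which is exactly the information an integer-pixel search cannot provide.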
An outline is now given of the case in which the method described in, for example, Japanese Patent Application Laid-Open No. 2007-324789 is applied as the first method of generating image data with decimal pixel accuracy.
FIG. 5 outlines the generation of a decimal pixel accuracy image. The decimal pixel accuracy image generation unit 111 aligns the pixels of a plurality of image data 301 stored in the frame memory 109, using the motion vectors between those image data 301 detected by the motion detection unit 110. It then multiplies the pixel values of the aligned images by predetermined coefficients and combines them to generate a decimal pixel accuracy image (also referred to as a high-resolution image) 302.
FIG. 16 is a flowchart showing the processing of the decimal pixel accuracy image generation unit 111. The unit generates a high-resolution image by three processes, for example: (1) position estimation, (2) wideband interpolation, and (3) weighted sum. (1) Position estimation uses the image data of the plurality of input image frames to estimate the difference in sampling phase (sampling position) between them (1401, 1402). (2) Wideband interpolation increases the number of pixels (sampling points) of each image data by interpolation, using a wide-band low-pass filter that passes all the high-frequency components of the original signal, aliasing components included, and thereby increases the density of the image data (1403). (3) The weighted sum applies weighting coefficients that depend on the sampling phase of each densified data, which cancels the aliasing components produced during pixel sampling and at the same time restores the high-frequency components of the original signal (1404).
FIG. 7 outlines this high-resolution image generation technique. As shown in FIG. 7A, assume that frame #1 (501), frame #2 (502), and frame #3 (503), taken at different times, are input and combined to obtain an output frame (506). For simplicity, first consider the case in which the subject moves in the horizontal direction (504), and consider generating a decimal pixel image by one-dimensional signal processing along a horizontal line (505). As shown in FIGS. 7B and 7D, the signal waveforms of frame #2 (502) and frame #1 (501) are displaced from each other according to the amount of movement (504) of the subject. This displacement is obtained by (1) position estimation, and as shown in FIG. 7C, frame #2 (502) is motion-compensated (507) so that the displacement is eliminated, and the phase difference θ (511) between the sampling phases (509) and (510) of the pixels (508) of the frames is obtained. Performing (2) wideband interpolation and (3) the weighted sum based on this phase difference θ (511) generates a new pixel (512) exactly midway between the original pixels (508) (phase difference θ = π), as shown in FIG. 7E, which realizes decimal pixel image generation. The weighted sum (3) is described later. In practice, the movement of the subject may involve rotation, enlargement, or reduction as well as translation, but when the time interval between frames is very short or the subject moves slowly, these movements can be approximated locally as translation.
A first configuration example of the decimal pixel accuracy image generation unit 111 performs the high-resolution processing described in Reference 1 (Japanese Patent Application Laid-Open No. 8-336046), Reference 2 (Japanese Patent Application Laid-Open No. 9-69755), and Reference 3 (Shin Aoki, "Super-resolution processing by multiple digital image data", Ricoh Technical Report No. 24, pp. 19-25, November 1998). In the first configuration example, when the weighted sum of (3) above is performed using the signals of at least three frame images as shown in FIG. 8, a high-resolution image with twice the resolution in a one-dimensional direction can be generated.
The decimal pixel accuracy image generation processing in the first configuration example of the decimal pixel accuracy image generation unit 111 is described with reference to FIG. 8. FIG. 8 shows the frequency spectrum of each component in a one-dimensional frequency domain. In the figure, the distance from the frequency axis represents signal intensity, and the rotation angle about the frequency axis represents phase. The weighted sum of (3) above is described in detail below.
When pixels are interpolated in the wideband interpolation of (2) above by a wide-band low-pass filter that passes a band twice the Nyquist frequency (the band from frequency 0 to the sampling frequency fs), the result is the sum of a component identical to the original signal (hereinafter, the original component) and an aliasing component that depends on the sampling phase. It is well known that when the wideband interpolation of (2) is applied to the signals of three frame images, the phases of the original components (601), (602), and (603) of the frames all coincide, while the phases of the aliasing components (604), (605), and (606) rotate according to the difference in sampling phase between the frames, as shown in FIG. 8A. To make these phase relationships easier to see, FIG. 8B shows the phase relationship of the original components of the frames, and FIG. 8C shows that of the aliasing components.
By choosing appropriate coefficients to multiply the signals of the three frame images and performing the weighted sum of (3), the aliasing components (604), (605), and (606) of the frames cancel one another and are removed, so that only the original component is extracted. Making the vector sum of the aliasing components (604), (605), and (606) zero, that is, making both the Re-axis (real axis) component and the Im-axis (imaginary axis) component zero, requires at least three aliasing components. Accordingly, using the signals of at least three frame images makes it possible to generate a decimal pixel image with twice the resolution, that is, to remove one aliasing component.
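The cancellation condition can be checked numerically with a small sketch. It models each aliasing component as a unit phasor rotated by the sampling phase of its frame, takes frame #1 as the phase reference, and additionally requires the three weights to sum to 1 so that the original component passes with unit gain. That last requirement, the phasor model, and the function names are assumptions made for illustration and are not taken from the cited references.

```python
import cmath
import math
from typing import List


def det3(m: List[List[float]]) -> float:
    """Determinant of a 3 x 3 matrix."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def three_frame_weights(theta2: float, theta3: float) -> List[float]:
    """Weights for frames #1..#3 whose aliasing phasors are at 0, theta2, theta3."""
    a = [
        [1.0, 1.0, 1.0],                          # original component kept with gain 1
        [1.0, math.cos(theta2), math.cos(theta3)],  # Re part of the aliasing sum = 0
        [0.0, math.sin(theta2), math.sin(theta3)],  # Im part of the aliasing sum = 0
    ]
    rhs = [1.0, 0.0, 0.0]
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("sampling phases are not distinct enough to cancel aliasing")
    weights = []
    for col in range(3):                          # Cramer's rule
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = rhs[r]
        weights.append(det3(m) / d)
    return weights
```

Three real weights are exactly enough to satisfy the three real equations, which is one way to see why this configuration needs at least three frames. For phases spaced evenly at 2π/3 and 4π/3, the sketch yields equal weights of 1/3.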
As described above, the first configuration example can generate highly accurate decimal pixels using the image signals of three frames.
FIG. 6 shows a second configuration example of the decimal pixel accuracy image generation unit 111. In the second configuration example, a decimal pixel image with twice the resolution in a one-dimensional direction can be generated using the signals of at least two frame images. The details are described below.
First, a plurality of frames, namely the encoding target frame and previously encoded frames, are input from the frame memory 109 to the input unit 400.
The position estimation unit 401 then estimates the position of the corresponding pixel on frame #2, taking as a reference the sampling phase (sampling position) of the pixel to be processed on frame #1 input to the input unit 400, and obtains the sampling phase difference θ 402. Next, the up-raters 403 and 404 of the motion compensation and up-rate unit 415 use the phase difference θ 402 to motion-compensate frame #2 so that it is aligned with frame #1, and double the number of pixels of frame #1 and frame #2 to increase their density. The phase shift unit 416 shifts the phase of this densified data by a fixed amount. π/2 phase shifters 406 and 408 can be used as the means for shifting the phase by a fixed amount. To compensate for the delay introduced by the π/2 phase shifters 406 and 408, the delay units 405 and 407 delay the densified signals of frame #1 and frame #2. In the aliasing component removal unit 417, the output signals of the delay units 405 and 407 and of the π/2 phase shifters (Hilbert transformers) 406 and 408 are multiplied in the multipliers 410, 411, 412, and 413 by the coefficients C0, C2, C1, and C3, respectively, which the coefficient determiner 409 generates from the phase difference θ 402, and the adder 414 adds these signals to obtain the output. This output is output from the output unit 418.
The position estimation unit 401 can be realized using the above conventional techniques as they are. The up-raters 403 and 404, the π/2 phase shifters 406 and 408, and the aliasing component removal unit 417 are described in detail later.
FIG. 9 shows the operation of the second configuration example of the decimal pixel accuracy image generation unit 111. The figure shows, in a one-dimensional frequency domain, the outputs of the delay units 405 and 407 and the π/2 phase shifters 406 and 408 shown in FIG. 6. In FIG. 9A, the up-rated signals of frame #1 and frame #2 output from the delay units 405 and 407 are each the sum of an original component, 701 and 702, and an aliasing component, 705 and 706, folded back from the original sampling frequency (fs). The phase of the aliasing component 706 is rotated by the phase difference θ 402 described above. The up-rated signals of frame #1 and frame #2 output from the π/2 phase shifters 406 and 408 are each the sum of a π/2-phase-shifted original component, 703 and 704, and a π/2-phase-shifted aliasing component, 707 and 708. FIGS. 9B and 9C show the original components and the aliasing components separately, to make the phase relationships of the components in FIG. 9A easier to see. If the coefficients multiplying the components are chosen so that the vector sum of the four components in FIG. 9B has a Re-axis component of 1 and an Im-axis component of 0, and the vector sum of the four components in FIG. 9C has Re-axis and Im-axis components that are both 0, then the weighted sum cancels the aliasing components and extracts only the original component. That is, a high-resolution image with twice the resolution in a one-dimensional direction can be generated using only two frame images. The method of determining these coefficients is described in detail later.
The operation of the up-raters 403 and 404 used in the second configuration example of the decimal pixel accuracy image generation unit 111 is described with reference to FIGS. 10 and 11. FIG. 10 shows the frequency-gain characteristic of the up-raters 403 and 404, with frequency on the horizontal axis and gain (the ratio of output signal amplitude to input signal amplitude) on the vertical axis. The up-raters 403 and 404 take a frequency (2fs) twice the sampling frequency (fs) of the original signal as the new sampling frequency, double the number of pixels by inserting new sampling points (zero points) exactly midway between the original pixels, and apply a filter whose passband covers all frequencies between -fs and +fs with a gain of 2.0. As shown in the figure, the frequency-gain characteristic repeats at every integer multiple of 2fs because of the symmetry of digital signals.
FIG. 11 shows the filter tap coefficients obtained by the inverse Fourier transform of the frequency characteristic shown in FIG. 10. Each tap coefficient Ck (where k is an integer) is the well-known sinc function, shifted by (-θ) to compensate for the sampling phase difference θ 402, that is, Ck = 2sin(πk+θ)/(πk+θ). In the up-rater 403, the phase difference θ 402 is set to 0, giving Ck = 2sin(πk)/(πk). Alternatively, the phase difference θ (402) may be expressed as the sum of a phase difference in integer pixel units (2π) and a phase difference in decimal pixel units, in which case the integer-pixel part is compensated by a simple pixel shift and the decimal-pixel part by the filters of the up-raters 403 and 404.
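The tap formula above can be evaluated directly. The sketch below computes Ck = 2sin(πk+θ)/(πk+θ) for a finite range of k; the function name and the `half_length` parameter are hypothetical, and the value 2 at πk+θ = 0 is the usual limit of the sinc function.

```python
import math
from typing import List


def uprater_taps(theta: float, half_length: int) -> List[float]:
    """Taps C_k for k = -half_length .. half_length at phase difference theta."""
    taps = []
    for k in range(-half_length, half_length + 1):
        x = math.pi * k + theta
        # sin(x)/x tends to 1 as x tends to 0, so the tap tends to 2.
        taps.append(2.0 if abs(x) < 1e-12 else 2.0 * math.sin(x) / x)
    return taps
```

For θ = 0 only the centre tap is non-zero, so the filter merely scales the zero-stuffed signal by 2. For θ = π (half an original pixel) the single non-zero tap moves to k = -1, which is a shift by one sample on the doubled grid. Intermediate values of θ spread the energy over many taps, which is where the truncation discussed later matters.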
FIG. 12 shows the frequency characteristics of the π/2 phase shifters 406 and 408 used in the second configuration example of the decimal pixel image generation unit 111. A generally known Hilbert transformer can be used as the π/2 phase shifters 406 and 408. FIG. 12A shows the frequency-gain characteristic of the Hilbert transformer, with frequency on the horizontal axis and gain (the ratio of output signal amplitude to input signal amplitude) on the vertical axis. The Hilbert transformer takes a frequency (2fs) twice the sampling frequency (fs) of the original signal as the new sampling frequency, and has a passband with a gain of 1.0 for all frequency components between -fs and +fs except 0. FIG. 12B shows the frequency-phase difference characteristic of the Hilbert transformer, with frequency on the horizontal axis and phase difference (the difference of the output signal phase from the input signal phase) on the vertical axis. Frequency components between 0 and fs are delayed in phase by π/2, and frequency components between 0 and -fs are advanced in phase by π/2. As shown in the figure, these characteristics repeat at every integer multiple of 2fs because of the symmetry of digital signals.
FIG. 13 shows the filter tap coefficients obtained by the inverse Fourier transform of the frequency characteristics shown in FIG. 12. Each tap coefficient Ck is Ck = 0 when k = 2m (where m is an integer), and Ck = -2/(πk) when k = 2m+1.
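These tap values are simple enough to generate in a few lines. The following sketch lists them for a finite range of k; the function name and `half_length` are hypothetical.

```python
import math
from typing import List


def hilbert_taps(half_length: int) -> List[float]:
    """Taps C_k for k = -half_length .. half_length: 0 for even k, -2/(pi k) for odd k."""
    return [
        0.0 if k % 2 == 0 else -2.0 / (math.pi * k)
        for k in range(-half_length, half_length + 1)
    ]
```

The resulting sequence is odd in k, so the centre tap is zero and taps at +k and -k have equal magnitude and opposite sign.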
A differentiator can also be used for the π/2 phase shifters 406 and 408 used to generate the decimal pixel accuracy image data. Differentiating the general sinusoid cos(ωt+α) with respect to t and multiplying by 1/ω gives d(cos(ωt+α))/dt × (1/ω) = -sin(ωt+α) = cos(ωt+α+π/2), which realizes the π/2 phase shift. That is, the π/2 phase shift may be realized by taking the difference between the value of the target pixel and the value of an adjacent pixel and then applying a filter with a 1/ω frequency-amplitude characteristic.
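The identity can be confirmed numerically. The sketch below approximates the derivative with a central difference on a continuous-time cosine; it illustrates the identity only and is not a model of the pixel-difference filter itself. The function name and the step `h` are hypothetical.

```python
import math


def scaled_derivative(omega: float, alpha: float, t: float, h: float = 1e-6) -> float:
    """Central-difference estimate of d/dt cos(omega t + alpha), divided by omega."""
    forward = math.cos(omega * (t + h) + alpha)
    backward = math.cos(omega * (t - h) + alpha)
    return (forward - backward) / (2.0 * h) / omega
```

For any ω, α, and t, the returned value agrees with cos(ωt+α+π/2) to within the accuracy of the finite difference.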
The operation of the coefficient determiner (409) used in the second configuration example of the decimal pixel accuracy image generation unit 111, and a specific example, are described with reference to FIG. 14. As shown in FIG. 14A, if the coefficients multiplying the components are determined so that the vector sum of the four components shown in FIG. 9B has a Re-axis component of 1 and an Im-axis component of 0, and the vector sum of the four components shown in FIG. 9C has Re-axis and Im-axis components that are both 0, an image signal processing apparatus that generates a decimal pixel image with twice the resolution in a one-dimensional direction using only two frame images can be realized. As shown in FIG. 6, let C0 be the coefficient for the output of the delay unit 405 (the sum of the original component and the aliasing component of the up-rated frame #1), C1 the coefficient for the output of the π/2 phase shifter 406 (the sum of the π/2-phase-shifted original component and aliasing component of the up-rated frame #1), C2 the coefficient for the output of the delay unit 407 (the sum of the original component and the aliasing component of the up-rated frame #2), and C3 the coefficient for the output of the Hilbert transformer 408 (the sum of the π/2-phase-shifted original component and aliasing component of the up-rated frame #2). Requiring these coefficients to satisfy the conditions of FIG. 14A yields, from the phase relationships of the components shown in FIGS. 9B and 9C, the simultaneous equations shown in FIG. 14B, whose solution is shown in FIG. 14C. The coefficient determiner 409 outputs the coefficients C0, C1, C2, and C3 obtained in this way. As an example, FIG. 14D shows the values of the coefficients C0, C1, C2, and C3 when the phase difference θ 402 is varied from 0 to 2π in steps of π/8. This corresponds to estimating the position of the signal of the original frame #2 with an accuracy of 1/16 pixel and motion-compensating it with respect to frame #1.
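Since the equations and solution of FIGS. 14B and 14C are not reproduced in the text, the following sketch derives one solution under an explicitly assumed phasor convention: the original components of both frames are 1, the aliasing component of frame #2 is rotated by e^{jθ} relative to that of frame #1, and the π/2 phase shifter multiplies every component by -j. Under a different convention the signs of C1 and C3 may be swapped, so the closed form here should not be read as the content of FIG. 14C. The function names are hypothetical.

```python
import cmath
import math
from typing import Tuple

Coeffs = Tuple[float, float, float, float]


def aliasing_cancel_coefficients(theta: float) -> Coeffs:
    """Return (C0, C1, C2, C3) for sampling phase difference theta (radians)."""
    half = theta / 2.0
    if abs(math.sin(half)) < 1e-9:
        # theta is a multiple of 2*pi: both frames sample the same positions.
        raise ValueError("no sub-pixel offset between the frames; aliasing cannot be cancelled")
    c1 = math.cos(half) / (2.0 * math.sin(half))  # equals (1 + cos(theta)) / (2 sin(theta))
    return 0.5, c1, 0.5, -c1


def component_sums(theta: float, coeffs: Coeffs) -> Tuple[complex, complex]:
    """Weighted vector sums of the original and of the aliasing components."""
    c0, c1, c2, c3 = coeffs
    shift = -1j                  # assumed response of the pi/2 phase shifter
    rot = cmath.exp(1j * theta)  # assumed rotation of the frame #2 aliasing component
    original = c0 + c2 + shift * (c1 + c3)
    aliasing = c0 + c2 * rot + shift * (c1 + c3 * rot)
    return original, aliasing
```

Under these assumptions the original components sum to 1 and the aliasing components sum to 0 for every θ that is not a multiple of 2π. The system becomes singular exactly when the motion is a whole number of pixels, which is consistent with the selection procedure rejecting previous frames whose motion has integer pixel accuracy. At θ = π the coefficients reduce to a plain average of the two delayed signals.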
Note that the up-raters 403 and 404 and the π/2 phase shifters 406 and 407 require an infinite number of taps to obtain ideal characteristics, but truncating them to a finite number of taps causes no practical problem. In this case, a general window function (for example, a Hanning window or a Hamming window) may be used. If the tap coefficients of the simplified Hilbert transformer are made point-symmetric about C0, that is, C(-k) = -Ck (k is an integer), the phase can be shifted by a constant amount.
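The truncation described above can be sketched as follows. This is a minimal example, not the filter design used in the embodiment: it takes the textbook ideal Hilbert transformer impulse response (2/(πk) for odd k, 0 for even k), truncates it to a finite length, and applies a Hamming window. The function name and the tap count are assumptions for illustration.

```python
import math


def hilbert_taps(half_length):
    """Return 2*half_length + 1 windowed Hilbert transformer taps.

    The tap at index half_length + k corresponds to C(k) in the text,
    so the list is point-symmetric about the centre: C(-k) = -C(k).
    """
    if half_length < 1:
        raise ValueError("half_length must be at least 1")
    n = 2 * half_length + 1
    taps = []
    for i in range(n):
        k = i - half_length
        # Ideal Hilbert transformer: zero for even k (including k = 0).
        ideal = 0.0 if k % 2 == 0 else 2.0 / (math.pi * k)
        # Hamming window over the n taps.
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1))
        taps.append(ideal * window)
    return taps


if __name__ == "__main__":
    for k, c in zip(range(-7, 8), hilbert_taps(7)):
        print(k, round(c, 5))
```

Because the window is symmetric and the ideal response is odd, the windowed taps keep the C(-k) = -Ck property that the text relies on.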
As described above, by configuring the decimal pixel accuracy image generation unit 111 of FIG. 1 as described with reference to FIGS. 6 to 14, high-precision decimal pixel accuracy image data can be generated from a plurality of frames.
In particular, the second configuration of the decimal pixel accuracy image generation unit 111 makes it possible to generate one piece of high-precision decimal pixel accuracy image data from two frames, so encoding can be performed with less memory than in the first configuration example.
Next, decimal pixel accuracy image generation processing when an intra-picture prediction frame is used as the reference frame will be described. Because an intra-picture prediction frame cannot refer to past frames, the block closest to the encoding target block is searched for within the same picture, as shown in FIG. 15, to obtain an intra-picture motion vector. The search may be performed in units of one pixel rather than in units of blocks. The decimal pixel generation processing after the motion vector has been calculated is the same as the inter-frame method described above.
In the decimal pixel accuracy image generation processing described above, a decimal pixel image cannot be generated when the position indicated by the motion vector has integer pixel accuracy. Therefore, whether the motion amount has integer pixel accuracy is determined based on the result from the decimal pixel accuracy image generation method determination unit 110. When the motion detection position is an integer-pixel position, the decimal pixel accuracy image is generated by the second method, that is, by the filter interpolation used in conventional coding standards. Adding to the stream information indicating whether the conventional second method or the first method is used allows the decoding side to reproduce the decimal pixel accuracy image. The methods may be switched in units of pixel blocks or in units of frames. In that case, information indicating whether encoding is performed in units of pixel blocks or in units of frames is added to the stream.
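The switching rule in the preceding paragraph can be sketched as a small decision function. This is only an illustration: the function name, the representation of the motion amount in pixel units, and the return values 1 and 2 (for the first and second methods) are assumptions, and the actual determination in unit 110 follows the flowchart of FIG. 3.

```python
from fractions import Fraction


def select_generation_method(motion_x, motion_y):
    """Return 1 (multi-frame generation) or 2 (filter interpolation).

    motion_x and motion_y are the detected motion amounts in pixels.
    If both are whole numbers, the frames carry no sub-pixel offset,
    so the conventional filter interpolation (method 2) is used.
    """
    is_integer_motion = (
        Fraction(motion_x).denominator == 1
        and Fraction(motion_y).denominator == 1
    )
    return 2 if is_integer_motion else 1


if __name__ == "__main__":
    print(select_generation_method(2, 0))                 # integer motion
    print(select_generation_method(Fraction(3, 16), 0))   # 3/16-pixel motion
```

The selected value is the kind of information that would be written to the stream so that the decoder can repeat the same choice.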
According to the image encoding apparatus and image encoding method described above, a reference image with higher decimal pixel accuracy can be generated, so motion prediction accuracy improves and the video signal can be compressed efficiently with a smaller amount of data.
In addition, the difference between the original image and the predicted image is evaluated, and when there is no difference, the frequency transform, quantization, inverse quantization, inverse frequency transform, motion detection, and motion compensation processes are omitted, which reduces the processing load on the encoding side.
Furthermore, in a moving picture coding standard that allows a plurality of frames to be referenced, a decimal pixel image can be generated from a larger amount of image data. A more accurate decimal pixel image can therefore be generated, motion prediction accuracy improves, and encoding with a small amount of data becomes possible.
<< Image Decoding Apparatus and Image Decoding Method >>
FIG. 17 illustrates a block diagram of an image decoding apparatus according to the present invention. In the image decoding apparatus, 1501 is a variable length decoding unit that decodes encoded data sent from the encoding side. 1502 is a syntax analysis unit that analyzes the syntax of the data decoded by the variable length decoding unit 1501 and separates the encoded data into encoded image data and additional information. 1503 is an inverse quantization unit that inversely quantizes the data sent from the syntax analysis unit 1502. 1504 is an inverse frequency transform unit that generates motion prediction error data by applying an inverse frequency transform to the data inversely quantized by the inverse quantization unit 1503. 1505 is an adder that generates image data by adding the motion prediction error obtained by the inverse frequency transform unit 1504 and the motion prediction image data stored in the frame memory 1510. 1506 is a frame memory that stores the reproduced image data obtained by the addition in the adder 1505. 1507 is a decimal pixel accuracy image generation method determination unit, and 1508 is a decimal pixel accuracy image generation unit.
Here, the operation of the decimal pixel accuracy image generation method determination unit 1507 is the same as that of the decimal pixel accuracy image generation method determination unit 110 of the image encoding apparatus shown in FIG. 1. That is, the decimal pixel accuracy image generation method determination unit 1507 determines the decimal pixel accuracy image generation method by means of a frame selection unit 1520, a motion detection unit 1521, and a determination unit 1522. The frame selection unit 1520 reads the reproduced image data of a predetermined frame from the frame memory 1506 according to a predetermined procedure. The motion detection unit 1521 performs motion detection on the reproduced image data of the two frames read by the frame selection unit 1520. The determination unit 1522 selects a decimal pixel accuracy image generation method according to the motion detection result and the like.
The details of the operation of the decimal pixel accuracy image generation method determination unit 1507 are also the same as those of the decimal pixel accuracy image generation method determination unit 110 of the image encoding apparatus shown in FIG. 1. That is, for example, either the first method or the second method shown in FIG. 2 is selected as the decimal pixel accuracy image generation method according to the flowchart of the determination process shown in FIG. 3. Because the details are the same as in the description of the decimal pixel accuracy image generation method determination unit 110 of the image encoding apparatus shown in FIG. 1, their description is omitted.
The operation of the decimal pixel accuracy image generation unit 1508 is likewise the same as that of the decimal pixel accuracy image generation unit 111 of the image encoding apparatus shown in FIG. 1, so its description is omitted.
As described above, the decimal pixel accuracy image generation method determination unit 1507 and the decimal pixel accuracy image generation unit 1508 generate a decimal pixel accuracy image by the same processing as the decimal pixel accuracy image generation method determination unit 110 and the decimal pixel accuracy image generation unit 111 of the image encoding apparatus shown in FIG. 1. As a result, the image decoding apparatus shown in FIG. 17 can generate the high-precision decoded image assumed by the image encoding apparatus.
Next, 1509 is a motion compensation unit that generates a decoded image from the motion vector sent from the syntax analysis unit 1502 and the image data generated by the decimal pixel accuracy image generation unit 1508. 1510 is a frame memory that stores the decoded image data generated by the motion compensation unit 1509. 1511 is a video display device that reads out and outputs the decoded data stored in the frame memory 1510.
The image decoding apparatus of FIG. 17 can decode a stream encoded by the image encoding apparatus of FIG. 1. The image decoding method used in this image decoding apparatus is described in detail below. FIG. 18 is a flowchart showing the overall image decoding process.
In FIG. 18, the data encoded on the encoding side is first decoded by the variable length decoding unit 1501 (1601). Next, the data decoded by the variable length decoding unit 1501 is divided according to its syntax by the syntax analysis unit 1502 (1602). Here, the structure of the encoded stream recorded in the image decoding apparatus will be described with reference to FIG. 19. The encoded stream shown in FIG. 19 is produced, for example, by the image encoding apparatus. In FIG. 19, the data area 1701 stores, for example, a flag indicating whether or not there is a difference. The data area 1702 stores, for example, a flag indicating whether or not motion detection is performed (1707), the motion vector information generated by the decimal pixel accuracy image generation unit 1508 (1708), and a flag for determining an integer pixel position (1709). The data area 1703 stores a quantization parameter, a quantization step or a coefficient by which these are multiplied, or the number of the matrix used in the encoding process. The data area 1704 stores the type of frequency transform and the block size. The data area 1705 stores information on the type of resolution enhancement method generated by the decimal pixel image generation unit 111, and the data area 1706 stores the coefficients obtained by frequency-transforming and quantizing the difference image between the original image and the predicted image.
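One way to picture the data areas listed above is as a record with one field per area. The sketch below is only a reading aid: the class name, field names, and field types are hypothetical and are not taken from FIG. 19, which defines the actual layout.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EncodedUnit:
    """Hypothetical in-memory view of the data areas 1701 to 1706."""
    has_difference: bool                  # data area 1701
    motion_detection_enabled: bool        # flag 1707 in data area 1702
    motion_vector: Tuple[float, float]    # information 1708 in data area 1702
    integer_pixel_position: bool          # flag 1709 in data area 1702
    quantization_parameter: int           # data area 1703
    transform_type: str                   # data area 1704
    transform_block_size: int             # data area 1704
    generation_method: int                # data area 1705
    coefficients: List[int] = field(default_factory=list)  # data area 1706


if __name__ == "__main__":
    unit = EncodedUnit(
        has_difference=True,
        motion_detection_enabled=True,
        motion_vector=(0.25, -1.5),
        integer_pixel_position=False,
        quantization_parameter=28,
        transform_type="dct",
        transform_block_size=8,
        generation_method=1,
        coefficients=[12, -3, 0, 1],
    )
    print(unit)
```

Grouping the fields this way mirrors how the syntax analysis unit hands each piece of information to the processing unit that needs it.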
Next, the type of data in each data area of the encoded stream shown in FIG. 19 is identified, and each flag and each piece of data is sent to the corresponding processing unit: the inverse quantization unit 1503, the inverse frequency transform unit 1504, the motion compensation unit 1509, the motion detection unit 1507, the frame memory 1510, and the decimal pixel accuracy image generation unit 1508.
Next, the inverse quantization unit 1503 performs inverse quantization using the data sent from the syntax analysis unit 1502 (1603). When the encoded stream was encoded by the image encoding apparatus of FIG. 1, the inverse quantization in the inverse quantization unit 1503 only needs to perform the inverse of the processing of the quantization unit 104 of FIG. 1 (1604). This is the same processing as that of the inverse quantization unit 106 of FIG. 1. It may be an inverse quantization technique used in conventional decoding, or it may multiply the data by the quantization step stored in the data area 1703.
Next, the inverse frequency transform unit 1504 applies an inverse frequency transform to the data inversely quantized by the inverse quantization unit 1503 (1604). The inverse frequency transform unit 1504 performs this processing using the frequency transform type and the frequency transform block size information sent from the syntax analysis unit 1502. A technique from conventional image decoding may be used for the inverse frequency transform. Next, motion detection is performed using the data inverse-frequency-transformed by the inverse frequency transform unit 1504 and passed through the adder 1505, together with the data stored in the frame memory 1509 (1605). The motion detection method is the motion search method determined in advance on the encoding side, which is obtained from the syntax analysis unit 1502.
Next, a decimal pixel image is generated using the image data stored in the frame memory 1506 and the motion vector obtained from the syntax analysis unit 1502 (1606). Like the decimal pixel image generation unit 111 of FIG. 1, the decimal pixel accuracy image generation unit 1508 generates a decimal pixel accuracy image using the motion vector and a plurality of pieces of image data. The content of this generation processing is the same as that described for the decimal pixel accuracy image generation unit 111 of FIG. 1, so its description is omitted.
As described for the decimal pixel accuracy image generation unit 111 of FIG. 1, the decimal pixel accuracy image generation unit 1508 of FIG. 17 performs resolution enhancement processing using a motion vector, a plurality of images, and their aliasing distortion. This makes it possible to increase the definition of the decimal pixel accuracy image data referenced by the motion compensation unit 1509.
When the information sent from the syntax analysis unit 1502 indicates that there is no difference, the process ends without performing inverse quantization, inverse frequency transform, motion detection, decimal pixel image generation, or motion compensation. In addition, when the detection result is determined to be an integer pixel position in the decimal pixel accuracy image generation method determination processing 1605 of FIG. 18, or when the syntax analysis unit determines that the encoding side generated the decimal pixel image by the pixel filter interpolation used in conventional moving picture coding schemes, the decimal pixel image is generated by that same conventional pixel filter interpolation.
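The branching described above can be summarized in a short sketch. The function name, parameter names, and stage labels below are hypothetical and serve only to show the order of the checks; the actual control flow is the one given in FIG. 18.

```python
def plan_decoding(has_difference, integer_pixel_position,
                  encoder_used_filter_interpolation):
    """Return the list of decoding stages to run for one unit.

    An empty list means every stage is skipped because the stream
    signals that there is no difference to decode.
    """
    if not has_difference:
        return []
    if integer_pixel_position or encoder_used_filter_interpolation:
        generation = "fractional_pixel_generation:filter_interpolation"
    else:
        generation = "fractional_pixel_generation:multi_frame"
    return [
        "inverse_quantization",
        "inverse_frequency_transform",
        "motion_detection",
        generation,
        "motion_compensation",
    ]


if __name__ == "__main__":
    print(plan_decoding(False, False, False))
    print(plan_decoding(True, True, False))
    print(plan_decoding(True, False, False))
```

Checking the difference flag first is what lets the decoder skip all remaining stages for unchanged data.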
Next, motion compensation is performed based on the image data generated by the decimal pixel accuracy image generation unit 1508 and the motion vector information sent from the syntax analysis unit 1502 (1607). In the case of an intra-picture prediction image, the intra-picture prediction unit 1511 generates a predicted image and stores the data in the frame memory 1506. The decoded image is output to a video display device 1511 such as a TV, a PC monitor, or a projector.
According to the image decoding apparatus and image decoding method described above, the decimal pixel accuracy image data referenced during motion search processing can be restored at a higher resolution. A higher-definition decoded image can therefore be generated.
Further, according to the image decoding apparatus and image decoding method described above, the frequency transform, quantization, inverse quantization, inverse frequency transform, motion detection, and motion compensation processes can be omitted depending on the difference information between the original image and the predicted image stored in the encoded stream, which reduces the amount of data processing on the decoding side.
The encoding processing by the image encoding apparatus and the decoding processing by the image decoding apparatus are performed using a computer device. A program that controls this processing can be provided on a recording medium (a hard disk, an optical disk, a magneto-optical disk, or the like) or via a transmission line or a suitable network, and made executable on a computer device such as a PC (Personal Computer) or an EWS (Engineering Work Station), so that the image encoding processing and image decoding processing can be carried out easily.
Although the invention made by the present inventors has been described specifically based on the embodiments, the present invention is not limited to them, and it goes without saying that various modifications can be made without departing from the gist of the invention.
The image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method of the present invention can also be used for image encoding processing, image decoding processing, and the like based on various standards. The super-resolution processing used in the first and third methods and the filter interpolation processing used in the second and fourth methods can be changed as appropriate.

Claims (15)

  1.  An image encoding apparatus capable of encoding a difference between original image data and motion prediction image data and of generating the motion prediction image data based on locally decoded image data obtained by decoding the encoded data, the apparatus comprising:
     a frame memory that holds the locally decoded image data for a plurality of frames; and
     a decimal pixel image data processing unit that generates decimal pixel accuracy image data using the locally decoded image data of an encoding target frame and the locally decoded image data of a frame preceding it, both held in the frame memory,
     wherein motion prediction image data is generated by performing motion detection using, as reference image data, the decimal pixel accuracy image data generated by the decimal pixel image data processing unit.
  2.  The image encoding apparatus according to claim 1, wherein the decimal pixel image data processing unit determines whether the amount of motion between the locally decoded image data of the encoding target frame stored in the frame memory and the locally decoded image data of a previous frame within a predetermined range before it has decimal pixel accuracy; when a determination result of decimal pixel accuracy is obtained, generates decimal pixel accuracy image data using the locally decoded image data of that previous frame and the locally decoded image data of the encoding target frame; and when a determination result of decimal pixel accuracy is not obtained, generates decimal pixel accuracy image data by an interpolation operation on the locally decoded image data of the encoding target frame.
  3.  The image encoding apparatus according to claim 2, wherein the decimal pixel image data processing unit limits the target of the determination of whether the decimal pixel accuracy is present to frames whose frame type is an I picture or a P picture.
  4.  The image encoding apparatus according to claim 3, wherein the decimal pixel image data processing unit performs the determination of the frame type and the determination of whether the decimal pixel accuracy is present on the previous frames in the predetermined range sequentially, starting from the previous frame closest to the encoding target frame.
  5.  The image encoding apparatus according to claim 1, wherein the decimal pixel image generation unit performs phase shift processing on each of a plurality of image signals of the image data to generate a plurality of new image signals, and generates decimal pixel accuracy image data by multiplying the plurality of image signals and the plurality of new image signals by coefficients and combining them.
  6.  An image encoding method comprising:
     a read process of reading motion prediction image data from a prediction image memory;
     a difference process of calculating the difference between the read motion prediction image data and input image data as prediction error data;
     a frequency transform process of frequency-transforming the prediction error data calculated in the difference process;
     a quantization process of quantizing the data frequency-transformed in the frequency transform process;
     a variable length encoding process of variable-length-encoding the data quantized in the quantization process to generate an encoded stream;
     an inverse quantization process of inversely quantizing the data quantized in the quantization process;
     an inverse frequency transform process of applying an inverse frequency transform to the data inversely quantized in the inverse quantization process to reproduce the prediction error data;
     an addition process of adding the prediction error data reproduced in the inverse frequency transform process and the motion prediction image data to output locally decoded image data;
     a process of storing the locally decoded image data obtained in the addition process in a frame memory;
     a decimal pixel image data process of generating decimal pixel accuracy image data using the locally decoded image data of an encoding target frame and the locally decoded image data of a frame preceding it, both held in the frame memory;
     a motion detection and motion compensation process of generating a predicted image by performing motion detection using, as reference image data, the decimal pixel accuracy image data generated in the decimal pixel image data process; and
     a write process of writing the motion prediction image data generated in the motion detection and motion compensation process into the prediction image memory.
  7.  An image decoding apparatus capable of receiving encoded data encoded by motion prediction, separating it into encoded image data and additional information, reproducing image data by adding motion prediction image data to a motion prediction error obtained by decoding the separated encoded image data, and generating the motion prediction image data based on the reproduced image data, the apparatus comprising:
     a frame memory that holds the reproduced image data for a plurality of frames; and
     a decimal pixel image data processing unit that generates decimal pixel accuracy image data using the reproduced image data of a decoding target frame and the reproduced image data of a frame preceding it, both held in the frame memory,
     wherein motion prediction image data is generated by performing motion detection using, as reference image data, the decimal pixel accuracy image data generated by the decimal pixel image data processing unit.
  8.  The image decoding apparatus according to claim 7, wherein the decimal pixel image data processing unit determines whether the amount of motion between the reproduced image data of the decoding target frame stored in the frame memory and the reproduced image data of a previous frame within a predetermined range before it has decimal pixel accuracy; when a determination result of decimal pixel accuracy is obtained, generates decimal pixel accuracy image data using the reproduced image data of that previous frame and the reproduced image data of the decoding target frame; and when a determination result of decimal pixel accuracy is not obtained, generates decimal pixel accuracy image data by an interpolation operation on the reproduced image data of the decoding target frame.
  9.  The image decoding apparatus according to claim 8, wherein the decimal pixel image data processing unit limits the target of the determination of whether the decimal pixel accuracy is present to frames whose frame type is an I picture or a P picture.
  10.  The image decoding apparatus according to claim 9, wherein the decimal pixel image data processing unit performs the determination of the frame type and the determination of whether the decimal pixel accuracy is present on the previous frames in the predetermined range sequentially, starting from the previous frame closest to the decoding target frame.
  11.  The image decoding apparatus according to claim 7, wherein the decimal pixel image generation unit performs phase shift processing on each of a plurality of image signals of the image data to generate a plurality of new image signals, and generates decimal pixel accuracy image data by multiplying the plurality of image signals and the plurality of new image signals by coefficients and combining them.
  12.  An image decoding method comprising:
     a variable length decoding process of receiving an encoded stream composed of encoded data encoded by motion prediction and variable-length-decoding it;
     a syntax analysis process of separating the variable-length-decoded encoded data into encoded image data and additional information;
     an inverse quantization process of inversely quantizing the separated encoded image data;
     an inverse frequency transform process of applying an inverse frequency transform to the data inversely quantized in the inverse quantization process to reproduce a motion prediction error;
     an addition process of adding the motion prediction image data in a prediction image memory to the motion prediction error reproduced in the inverse frequency transform process to reproduce image data;
     a reproduced image data write process of writing the reproduced image data obtained in the addition process into a frame memory;
     a decimal pixel image data process of generating decimal pixel accuracy image data using the reproduced image data of a decoding target frame and the reproduced image data of a frame preceding it, both held in the frame memory;
     a motion compensation process of generating motion prediction image data using the decimal pixel accuracy image data generated in the decimal pixel image data process and the additional information; and
     a write process of writing the motion prediction image data generated in the motion compensation process into the prediction image memory.
  13.  The image decoding method according to claim 12, wherein the decimal pixel image data process limits the target of the determination of whether or not the decimal pixel accuracy holds to frames whose frame form is an I picture or a P picture.
  14.  The image decoding method according to claim 13, wherein the decimal pixel image data process performs the determination of the frame form and the determination of whether or not the decimal pixel accuracy holds on the previous frames in the predetermined range sequentially, starting from the previous frame closest to the decoding target frame.
  15.  The image decoding method according to claim 12, wherein the decimal pixel image data process performs phase shift processing on each of a plurality of image signals of image data to generate a plurality of new image signals, and generates decimal pixel accuracy image data by multiplying the plurality of image signals and the plurality of new image signals by coefficients and combining them.
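
The frame search recited in claims 10 and 14 (scan the previous frames in the predetermined range from the one nearest the decoding target frame, skip frames that are not I or P pictures, and stop at the first one judged to have decimal pixel accuracy) can be illustrated with a minimal sketch. This is not the claimed implementation: the `Frame` type, its fields, the function name `select_reference_frame`, and the caller-supplied predicate `has_decimal_pixel_accuracy` are all hypothetical names introduced here, and the claims in this section do not state how the accuracy test itself is carried out.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Frame:
    index: int         # position of the frame in the sequence (hypothetical field)
    picture_type: str  # "I", "P", or "B" (hypothetical field)


def select_reference_frame(
    previous_frames: Sequence[Frame],
    target_index: int,
    search_range: int,
    has_decimal_pixel_accuracy: Callable[[Frame], bool],
) -> Optional[Frame]:
    """Return the nearest qualifying previous frame, or None if there is none.

    The accuracy test is passed in as a predicate because this sketch makes
    no assumption about how that determination is performed.
    """
    # Keep only frames that precede the target within the predetermined range.
    candidates = [
        f for f in previous_frames
        if 0 < target_index - f.index <= search_range
    ]
    # Visit the frame closest to the decoding target frame first.
    candidates.sort(key=lambda f: target_index - f.index)
    for frame in candidates:
        # Frame-form determination: only I and P pictures are considered.
        if frame.picture_type not in ("I", "P"):
            continue
        # Accuracy determination: stop at the first frame that passes.
        if has_decimal_pixel_accuracy(frame):
            return frame
    return None
```

Passing the accuracy test as a callable keeps the ordering and filtering logic separate from whatever criterion a decoder would actually apply.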
PCT/JP2009/002453 2008-07-16 2009-06-02 Image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method WO2010007719A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2010520739A JPWO2010007719A1 (en) 2008-07-16 2009-06-02 Image coding apparatus, image coding method, image decoding apparatus, and image decoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008-184777 2008-07-16
JP2008184777 2008-07-16

Publications (1)

Publication Number Publication Date
WO2010007719A1 (en) 2010-01-21

Family

ID=41550131

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/002453 WO2010007719A1 (en) 2008-07-16 2009-06-02 Image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method

Country Status (2)

Country Link
JP (1) JPWO2010007719A1 (en)
WO (1) WO2010007719A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006174415A (en) * 2004-11-19 2006-06-29 Ntt Docomo Inc Image decoding apparatus, image decoding program, image decoding method, image encoding apparatus, image encoding program, and image encoding method
JP2007324789A (en) * 2006-05-31 2007-12-13 Hitachi Ltd Image signal processing unit, method for obtaining high resolution image signal, and program for executing the same
JP2008017241A (en) * 2006-07-07 2008-01-24 Hitachi Ltd High-resolution image processor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19951341B4 (en) * 1999-10-25 2012-02-16 Robert Bosch Gmbh Method for the motion-compensating prediction of moving pictures and device therefor

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011223293A (en) * 2010-04-09 2011-11-04 Hitachi Ltd Image encoding method, image encoding apparatus, image decoding method, and image decoding apparatus
JP2013534801A (en) * 2010-07-20 2013-09-05 シーメンス アクチエンゲゼルシヤフト Video coding using high resolution reference frames
US9906787B2 (en) 2010-07-20 2018-02-27 Siemens Aktiengesellschaft Method and apparatus for encoding and decoding video signal
JP2014523696A (en) * 2011-06-30 2014-09-11 エルジー エレクトロニクス インコーポレイティド Interpolation method and prediction method using the same
US9460488B2 (en) 2011-06-30 2016-10-04 Lg Electronics Inc. Interpolation method and prediction method using same

Also Published As

Publication number Publication date
JPWO2010007719A1 (en) 2012-01-05

Similar Documents

Publication Publication Date Title
RU2456761C1 (en) Operations of repeated digitisation and variation of image size for video coding and decoding with alternating defining power
DK2996336T3 (en) Device for interpolating images using a smoothing interpolation filter
KR101648058B1 (en) Method and Apparatus for Image Encoding/Decoding Using High Resolution Filter
JP2006246474A (en) Prediction image generation method and apparatus using single coding mode for all color components, and image and video encoding and decoding methods and apparatuses using the same
KR101090586B1 (en) Encoding/decoding device, encoding/decoding method and recording medium
KR101362755B1 (en) Sensor image encoding and decoding apparatuses and method thereof
JP2008541653A (en) Multi-layer based video encoding method, decoding method, video encoder and video decoder using smoothing prediction
CN103503453A (en) Encoding device, encoding method, decoding device, and decoding method
KR20090085956A (en) Method and apparatus for encoding/decoding image efficiently
WO2010007719A1 (en) Image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method
KR101690253B1 (en) Image processing method and Apparatus
JP5011138B2 (en) Image coding apparatus, image coding method, image decoding apparatus, and image decoding method
JP5113479B2 (en) Image signal processing apparatus and image signal processing method
JP2008125002A (en) Image encoder, image decoder and image processing program
KR20090012957A (en) Method for resizing image using interpolation filter selected in block transform domain based on characteristic of image
JP2004357313A (en) Image information processing apparatus and image information processing method
JP4565393B2 (en) Video signal hierarchical encoding apparatus, video signal hierarchical encoding method, and video signal hierarchical encoding program
JP3591025B2 (en) Image information processing device
JP2011223293A (en) Image encoding method, image encoding apparatus, image decoding method, and image decoding apparatus
JP2009278473A (en) Image processing device, imaging apparatus mounting the same, and image reproducing device
JP4526529B2 (en) Video signal converter using hierarchical images
JP2009105689A (en) Video display device, video display method and method for forming high resolution image
Pantoja et al. Coefficient conversion for transform domain VC-1 to H.264 transcoding
JP2010109640A (en) Video display device, video display method, and image high resolution method
JP2016165054A (en) Encoding device, decoding device, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09797648

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2010520739

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09797648

Country of ref document: EP

Kind code of ref document: A1