WO2007013194A1 - Image information compression method and FTV (free viewpoint television) system - Google Patents


Info

Publication number
WO2007013194A1
Authority
WO
WIPO (PCT)
Prior art keywords
image information
frame
frames
encoding
image
Prior art date
Application number
PCT/JP2006/300257
Other languages
English (en)
Japanese (ja)
Inventor
Masayuki Tanimoto
Toshiaki Fujii
Kenji Yamamoto
Original Assignee
National University Corporation Nagoya University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University Corporation Nagoya University filed Critical National University Corporation Nagoya University
Priority to JP2007526814A priority Critical patent/JP4825983B2/ja
Publication of WO2007013194A1 publication Critical patent/WO2007013194A1/fr


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present invention relates to an image information compression method capable of improving the coding compression efficiency of image information acquired by a plurality of cameras, and to a free viewpoint television system to which this method is applied.
  • the inventor of the present application has proposed free viewpoint television (FTV), which allows viewers to freely change their viewpoint and view a 3D scene as if they were on the spot.
  • FTV experimental equipment, in which the viewpoint can be moved freely in the horizontal plane based on photographed images acquired by 15 cameras, has been completed (see Non-Patent Document 1, for example).
  • Non-Patent Document 1 Masayuki Tanimoto, “Free Viewpoint Television”, Nihon Kogyo Publishing, Imaging Lab, February 2005, pp. 23-28
  • Non-Patent Document 2: Shinya Oka, Nonon Champurim, Toshiaki Fujii, Masayuki Tanimoto, “Light-Space Information Compression for Free Viewpoint Television”, IEICE Technical Report, CS2003-141, pp. 7-12, December 2003
  • Non-Patent Document 3 Masayuki Tanimoto, "5. Free-viewpoint TV FTV, using multi-viewpoint image processing", Journal of the Institute of Image Information and Media Sciences, Vol. 58, No. 7, pp. 898-901, 2004
  • Non-Patent Document 4: Shinya Oka, Nonon Champurim, Toshiaki Fujii, Masayuki Tanimoto, “Compression of Dynamic Ray Space for Free Viewpoint Television”, 3D Image Conference 2004, pp. 139-142, 2004
  • Non-Patent Document 2 states, “Because the light space is very similar along both the time axis and the space axis, it is thought that a high compression ratio can be obtained by applying motion (parallax) prediction to both axes.” Also, in Non-Patent Document 3, the left column of page 899 says “interpolate the ray space”, and the left column of page 900 says that the interpolation may be performed only on the necessary part, not on the entire light space.
  • FIG. 1 is a diagram conceptually showing the basic configuration of an FTV system.
  • the FTV system shown in Fig. 1 performs image acquisition by cameras (step ST1), image interpolation processing (step ST2 or ST2a), image information compression processing (step ST3), and display of the image viewed from the input viewpoint (steps ST4 and ST5).
  • image information of a subject 101 that exists in three-dimensional real space is acquired by multiple cameras (Fig. 1 shows five cameras 102, although more cameras are actually used) (step ST1), yielding the images acquired by the multiple cameras (Fig. 1 shows five images with reference numerals 103). In each image,
  • x represents a horizontal viewing direction, and
  • y represents a vertical viewing direction.
  • the plurality of cameras 102 may be arranged in a linear arrangement, in which cameras oriented parallel to each other are placed on a straight line as shown in Fig. 2(a); a circumferential arrangement (or arc arrangement), in which cameras are placed on a circumference facing its inside as shown in Fig. 2(b); a planar arrangement, in which cameras oriented parallel to each other are placed on a plane as shown in Fig. 2(c); a cylindrical arrangement, in which cameras are placed on a cylinder facing its inside as shown in Fig. 2(d); or a spherical arrangement (or hemispherical arrangement), in which cameras are placed on a spherical surface facing its inside as shown in Fig. 2(e).
  • the arrangement of the multiple cameras 102 is preferably either the linear arrangement shown in Fig. 2(a) or the circumferential arrangement shown in Fig. 2(b) when only a horizontal free viewpoint is realized, and the planar arrangement shown in Fig. 2(c), the cylindrical arrangement shown in Fig. 2(d), or the spherical arrangement shown in Fig. 2(e) when a vertical free viewpoint is realized as well.
  • one ray in three-dimensional real space is represented by one point in a multidimensional space whose coordinates are the parameters representing that ray.
  • This virtual multidimensional space is called a light space.
  • the whole ray space expresses all rays in 3D space without excess or deficiency.
  • the ray space is created by collecting images taken from many viewpoints. Since the value of a point in the ray space is the same as the pixel value of the corresponding image, the conversion from images to the ray space is just a coordinate transformation.
  • the light ray 107 in real space can be uniquely expressed by four parameters: the position (x, y) at which it passes through the reference plane 106 and its passing direction (θ, φ). In FIG. 3,
  • X is a horizontal coordinate axis in three-dimensional real space
  • Y is a vertical coordinate axis
  • Z is a depth coordinate axis.
  • θ is the horizontal angle with respect to the normal of the reference plane 106, that is, the horizontal emission angle with respect to the reference plane 106, and
  • φ is the vertical angle with respect to the normal of the reference plane 106, that is, the vertical emission angle with respect to the reference plane 106.
  • the ray information in this three-dimensional real space can be expressed as a luminance f(x, y, θ, φ).
  • here, the vertical parallax (angle φ) is ignored. As shown in Fig. 3(a), images taken by a number of cameras placed horizontally toward the reference plane 106 then form the ray space shown in Fig. 3(b).
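The conversion from camera pixels to ray-space points described above is, as the text notes, only a coordinate transformation. The following is a minimal sketch of the horizontal case (φ ignored); the camera geometry, coordinate convention, and names are illustrative assumptions, not taken from the patent.

```python
import math

def ray_space_coords(cam_x, cam_z, theta):
    """Map the ray leaving a camera at (cam_x, cam_z) (cam_z < 0, in front of
    the reference plane Z = 0) at horizontal angle theta (radians, measured
    from the plane normal) to its ray-space point (x, theta).
    The ray crosses the reference plane at x = cam_x + |cam_z| * tan(theta)."""
    x = cam_x + abs(cam_z) * math.tan(theta)
    return x, theta

# Building f(x, theta): every pixel of every camera image contributes its
# luminance at one ray-space point -- a coordinate change, no new data.
ray_space = {}
cameras = [(-1.0, -2.0), (0.0, -2.0), (1.0, -2.0)]   # linear arrangement
for cam_x, cam_z in cameras:
    for theta_deg, luminance in [(-10, 80), (0, 128), (10, 200)]:
        x, theta = ray_space_coords(cam_x, cam_z, math.radians(theta_deg))
        ray_space[(round(x, 3), theta_deg)] = luminance
```

Interpolating the ray space (steps ST2/ST2a) then amounts to filling in the (x, θ) points that no camera pixel maps onto.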
  • interpolation may be performed before transmission (step ST2) or on the receiving side (step ST2a).
  • compression of image information (step ST3 in Fig. 1) is not an indispensable process when the components of the FTV system are in the same location, but it is indispensable when the cameras and the user are in different locations and image information is distributed over the Internet.
  • as a conventional image information compression method, there is, for example, a method compliant with the H.264/AVC standard (see, for example, Patent Document 1).
  • Patent Document 1 Japanese Patent Laid-Open No. 2003-348595 (FIGS. 1 and 2) Disclosure of the invention
  • the present invention has been made to solve the above-described problems of the prior art, and its object is to provide an image information compression method capable of improving the coding compression efficiency in the encoding of image information acquired by a plurality of cameras, and an FTV system to which this method is applied.
  • the image information compression method of the present invention includes:
  • a step of encoding image information of frames arranged in the time axis direction of moving images acquired by two or more cameras selected from among three or more cameras, using intra-frame coding and inter-frame predictive coding that uses the temporal correlation between frames;
  • a step of generating, based on the image information acquired by the selected cameras, a first viewpoint-interpolated image corresponding to a frame arranged in the time axis direction of a moving image acquired by a camera other than the selected cameras; and
  • a step of encoding image information of frames arranged in the time axis direction of a moving image acquired by a camera other than the selected cameras, which includes selectively outputting, between encoding that refers to the image information of a frame at a time different from the encoding target frame and encoding that refers to the first viewpoint-interpolated image corresponding to the encoding target frame, the encoding result whose coding compression efficiency is highest. It is characterized by this.
  • Another image information compression method of the present invention includes:
  • a step of encoding image information of frames of moving images acquired by the plurality of cameras at the same time, arranged in the order of the camera arrangement, by inter-frame predictive coding using the correlation between the frames at the same time, with the same algorithm as inter-frame predictive coding using temporal correlation.
  • the FTV system of the present invention includes:
  • An image information encoding apparatus for executing the image information compression method
  • a plurality of cameras for supplying video signals to the image information encoding device
  • An image information decoding device for decoding the encoded information output from the image information encoding device
  • a user interface for inputting the viewpoint position of the viewer
  • an image information extracting unit that extracts an image seen from the viewpoint input via the user interface from the images taken at the same time by the plurality of cameras.
  • a frame of a moving image acquired by the plurality of cameras is encoded by inter-frame predictive coding using the correlation between frames at the same time.
  • image information of frames arranged in the time axis direction of moving images acquired by two or more selected cameras is encoded.
  • the first viewpoint-interpolated image corresponding to a frame of a moving image acquired by a camera other than the selected cameras is generated, and the frames acquired by the cameras other than the selected cameras can be encoded with reference to it.
  • FIG. 1 is a diagram conceptually showing the basic configuration of an FTV system.
  • FIG. 2 (a) to (e) are diagrams showing examples of the arrangement of multiple cameras, (a) is a linear arrangement, (b) is a circumferential arrangement, (c) is a planar arrangement, (d) Is a cylindrical arrangement, and (e) is a spherical arrangement.
  • FIG. 3 (a) is a diagram showing an object in real space, a linearly arranged camera, a reference plane, and light rays, and (b) is a diagram showing the light space.
  • FIG. 4 (a) is a diagram showing a light space, (b) is a diagram showing an image cut out from the light space, and (c) is a diagram showing another image cut out from the light space. is there.
  • FIG. 5 is a block diagram schematically showing a configuration of an image information encoding device capable of implementing the image information compression method of the present invention.
  • FIG. 6 is a diagram conceptually showing that frames of moving images taken by a plurality of cameras are arranged in the time axis direction, and frames at the same time are arranged in the order of camera arrangement.
  • FIG. 7 is a flowchart showing an operation of the image information encoding device shown in FIG.
  • FIG. 8 is a flowchart showing an example of the operation of the interpolated image generation / compensation step shown in FIG.
  • FIG. 9 is a flowchart showing an example of the operation of the selection step shown in FIG.
  • FIG. 10 is a block diagram schematically showing a configuration of an image information decoding apparatus capable of decoding image information encoded by the image information compression method of the present invention.
  • FIG. 11 is a flowchart showing an operation of the image information decoding apparatus shown in FIG.
  • FIG. 12 is a flowchart showing an example of the operation of the interpolated image generation / compensation step shown in FIG.
  • FIG. 13 is an explanatory diagram (part 1) of the image information compression method according to the first embodiment of the present invention.
  • FIG. 14 is an explanatory diagram (part 2) of the image information compression method according to the first embodiment of the present invention.
  • FIG. 15 is an explanatory diagram (part 1) of the image information compression method according to the second embodiment of the present invention.
  • FIG. 16 is an explanatory diagram (part 2) of the image information compression method according to the second embodiment of the present invention.
  • FIG. 17 is an explanatory diagram (part 3) of the image information compression method according to the second embodiment of the present invention.
  • FIG. 18 is an explanatory diagram (part 4) of the image information compression method according to the second embodiment of the present invention.
  • FIG. 19 is an explanatory diagram of an image information compression method according to the third embodiment of the present invention.
  • FIG. 20 is an explanatory diagram of an image information compression method according to the fourth embodiment of the present invention.
  • FIG. 21 is an explanatory diagram (part 1) of the image information compression method according to the fifth embodiment of the present invention.
  • FIG. 22 is an explanatory diagram (part 2) of the image information compression method according to the fifth embodiment of the present invention.
  • FIG. 23 is an explanatory diagram (part 3) of the image information compression method according to the fifth embodiment of the present invention.
  • FIG. 24 is an explanatory diagram (part 4) of the image information compression method according to the fifth embodiment of the present invention.
  • FIG. 25 is an explanatory diagram (part 5) of the image information compression method according to the fifth embodiment of the present invention.
  • FIG. 26 is an explanatory diagram (part 6) of the image information compression method according to the fifth embodiment of the present invention.
  • FIG. 27 is a diagram showing an example of a horizontal section of a light space in an image information compression method according to a sixth embodiment of the present invention.
  • FIG. 28 is an explanatory diagram of a motion vector prediction method in the image information compression method according to the sixth embodiment of the present invention.
  • FIG. 29 is an explanatory diagram of a motion vector prediction method in H.264/AVC as a comparative example of the sixth embodiment of the present invention.
  • FIG. 30 (a) and (b) are explanatory diagrams showing the relationship between a point in real space and a straight line in a horizontal section of the light space.
  • FIG. 31 is a diagram conceptually showing the basic structure of an FTV system in a seventh embodiment of the present invention.
  • FIG. 5 shows an image information coding apparatus capable of implementing the image information compression method of the present invention.
  • the image information encoding device 200 includes N input terminals 201 (N is an integer of 2 or more), N A/D conversion units 202, a screen rearrangement buffer 203, and so on.
  • the image information encoding apparatus 200 also includes an inverse quantization unit 211, an inverse orthogonal transform unit 212, a multi-camera frame memory 213, a motion prediction/compensation unit 215, an interpolated image generation/compensation unit 216, and a selection unit 217 that selectively outputs one of the output signals of the motion prediction/compensation unit 215 and the interpolated image generation/compensation unit 216.
  • the motion prediction / compensation unit 215, the interpolated image generation / compensation unit 216, and the selection unit 217 constitute an encoding processing unit 214 that performs the image information compression method of the present invention.
  • the image information encoding apparatus 200 shown in FIG. 5 differs from the conventional image information encoding apparatus disclosed in Patent Document 1 described above in that it can receive image information from a plurality of cameras and includes the encoding processing unit 214 that can implement the image information compression method of the present invention.
  • analog video signals acquired by N cameras whose arrangement positions and shooting directions are known are input to the input terminals 201 of the image information encoding device 200.
  • the N cameras usually have the same performance, such as resolution, and are regularly arranged as shown in FIGS. 2 (a;) to (e), for example.
  • the number of cameras is usually tens, hundreds, or more.
  • the camera arrangement is not limited to those shown in Figs. 2(a) to 2(e).
  • the analog video signals input to the input terminals 201 are each converted into digital signals by the A/D converters 202.
  • FIG. 6 conceptually shows that the frames FR of the moving images (also referred to as “images”) taken by the plurality of cameras #1 to #5 are arranged in the time axis t direction, and that the frames of the same time acquired by cameras #1 to #5 are arranged in the spatial axis S direction in the order of camera arrangement.
  • the frames FR of the moving image taken by each of the cameras #1 to #5 constitute a GOP (Group of Pictures), an image group consisting of a predetermined number of frames arranged in time series in the time axis t direction.
  • the frames taken at the same time by the cameras #1 to #5, that is, the frames of the same time, form an image group G consisting of a predetermined number of frames arranged in the spatial axis S direction (the horizontal direction in Fig. 6), which is the order of camera arrangement.
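The two orderings just described (a GOP along the time axis t for each camera, and a group G along the spatial axis S for each time instant) can be sketched as follows; the frame labels are placeholders standing in for real image data.

```python
# frames[c][t]: frame taken by camera c+1 at time t (labels stand in for images)
N_CAMERAS, N_FRAMES = 5, 4
frames = [[f"cam{c + 1}_t{t}" for t in range(N_FRAMES)]
          for c in range(N_CAMERAS)]

def gop(camera):
    """All frames of one camera, arranged along the time axis t (a GOP)."""
    return frames[camera]

def same_time_group(t):
    """Frames of all cameras at time t, in camera order along the spatial
    axis S (the image group G of Fig. 6)."""
    return [frames[c][t] for c in range(N_CAMERAS)]
```

Temporal inter-frame prediction works within one `gop(c)`, while prediction between cameras works within one `same_time_group(t)`.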
  • the screen rearrangement buffer 203 of the image information encoding device 200 performs frame rearrangement according to the GOP structure of the supplied image information.
  • the screen rearrangement buffer 203 supplies the image information of the entire frame to the orthogonal transform unit 205 for an image on which intra-frame coding (intra coding) is performed.
  • the orthogonal transform unit 205 performs orthogonal transform such as discrete cosine transform on the image information and supplies transform coefficients to the quantization unit 206.
  • the quantization unit 206 performs a quantization process on the transform coefficient supplied from the orthogonal transform unit 205.
  • the variable encoding unit 207 determines the encoding mode from the quantized transform coefficients and the quantization scale supplied from the quantization unit 206, performs variable coding such as variable-length coding or arithmetic coding on this encoding mode, and forms information to be inserted into the header portion of each image. The variable encoding unit 207 then supplies the encoded mode to the accumulation buffer 208 for accumulation; the encoded mode is output from the output terminal 209 as compressed image information. The variable encoding unit 207 also applies variable coding such as variable-length coding or arithmetic coding to the quantized transform coefficients; the encoded transform coefficients are supplied to the accumulation buffer 208, accumulated, and output from the output terminal 209 as compressed image information.
  • the behavior of the quantization unit 206 is controlled by the rate control unit 210 based on the data amount of transform coefficients accumulated in the accumulation buffer 208. Further, the quantization unit 206 supplies the quantized transform coefficient to the inverse quantization unit 211, and the inverse quantization unit 211 performs inverse quantization on the quantized transform coefficient.
  • the inverse orthogonal transform unit 212 performs inverse orthogonal transform processing on the inversely quantized transform coefficients to generate decoded image information, and supplies the information to the multi-camera frame memory 213 for accumulation.
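The local decoding path just described (quantization in unit 206, inverse quantization in unit 211, inverse orthogonal transform in unit 212, storage in the multi-camera frame memory 213) exists so that the encoder's reference frames match exactly what the decoder will reconstruct. A minimal sketch of the quantize/dequantize round trip, with an illustrative quantization scale (the orthogonal transform itself is elided):

```python
def quantize(coefs, qscale):
    """Quantize transform coefficients to integer levels (quantization unit 206)."""
    return [round(c / qscale) for c in coefs]

def dequantize(levels, qscale):
    """Inverse quantization (unit 211); in the real encoder this is followed by
    the inverse orthogonal transform (unit 212) before storage in memory 213."""
    return [lv * qscale for lv in levels]

coefs = [10.2, -3.7, 0.4, 0.0]           # transform coefficients after the DCT
levels = quantize(coefs, qscale=2.0)     # what the variable-length coder sees
recon = dequantize(levels, qscale=2.0)   # what the decoder (and memory 213) sees
```

Note that `recon` differs from `coefs` by the quantization error; predicting from `recon` rather than the original keeps encoder and decoder in step.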
  • the screen rearrangement buffer 203 supplies image information to the encoding processing unit 214 for an image on which inter-frame predictive encoding (inter-encoding) is performed.
  • the encoding processing unit 214 performs encoding processing on image information using the image information compression methods of the first to sixth embodiments of the present invention described later.
  • the encoding processing unit 214 supplies the generated reference image information to the adder 204, and the adder 204 converts the reference image information into a difference signal from the corresponding image information.
  • the encoding processing unit 214 simultaneously supplies the motion vector information to the variable encoding unit 207.
  • the variable encoding unit 207 determines the encoding mode based on the quantized transform coefficients and the quantization scale from the quantization unit 206, the motion vector information supplied from the encoding processing unit 214, and the like, performs variable coding such as variable-length coding or arithmetic coding on the determined encoding mode, and generates information to be inserted into the header portion of each image. The variable encoding unit 207 then supplies the encoded mode to the accumulation buffer 208 for accumulation; the encoded mode is output as compressed image information.
  • the variable encoding unit 207 also performs variable coding such as variable-length coding or arithmetic coding on the motion vector information and generates information to be inserted into the header portion of each image.
  • image information input to the orthogonal transform unit 205 is a difference signal obtained from the adder 204.
  • the other processes are the same as those in the case of image compression using intra coding.
  • FIG. 7 is a flowchart showing the encoding process of the image information encoding device 200 shown in FIG. 5. As shown in FIG. 7, the image information encoding device 200 performs A/D conversion of the input analog video signals by the A/D conversion units 202 (step ST11),
  • rearranges the screens by the screen rearrangement buffer 203 (step ST12), and then performs motion prediction/compensation by the motion prediction/compensation unit 215 (step ST21), interpolated image generation/compensation by the interpolated image generation/compensation unit 216 (step ST22), and determination by the selection unit 217 of whether to encode by referring to the interpolated image or by motion prediction/compensation (step ST23).
  • in the case of conventional compression encoding of image information (for example, processing conforming to the H.264/AVC standard) and in the case of the first embodiment described later, interpolated image generation/compensation by the interpolated image generation/compensation unit 216 is not required.
  • after step ST23, the image information is orthogonally transformed by the orthogonal transform section 205 (step ST24), and quantization and quantization rate control are performed by the quantization section 206 and the rate control section 210 (steps ST25 and ST26).
  • the variable encoding unit 207 performs variable encoding (step ST27), the inverse quantization unit 211 performs inverse quantization (step ST28), and the inverse orthogonal transform unit 212 performs an inverse orthogonal transform (step ST29).
  • steps ST21 to ST29 are performed on all blocks having a predetermined number of pixels in the frame, and steps ST11 and ST12, together with steps ST21 to ST29 for all blocks, are performed on all frames.
  • FIG. 8 is a flowchart showing an example of the operation of the interpolated image generation/compensation step ST22 shown in FIG. 7. For interpolated image generation/compensation, depth estimation is performed at each pixel in the block to generate an interpolated pixel (for example, a pixel value from 0 to 255), and an evaluation value E is calculated based on the pixel value of the generated interpolated pixel.
  • the evaluation value E is, for example,
  • the evaluation value E is not limited to the above definition, and other definitions can be adopted.
  • an interpolated pixel is generated using the depth that gives the minimum evaluation value E (step ST224).
  • the processing of steps ST221 to ST224 is performed on all the pixels in the block, and an evaluation value, which is an index indicating how similar the estimated block generated from the interpolated pixels is to the actual block, is calculated (step ST225).
  • when the set S of pixels of the estimated image is I_est(i, j), a ≤ i ≤ b, c ≤ j ≤ d, and the set T of pixels of the image to be encoded is I_enc(i, j), a ≤ i ≤ b, c ≤ j ≤ d, the evaluation value E is, for example, the sum of absolute differences E = Σᵢ Σⱼ |I_est(i, j) − I_enc(i, j)|.
  • a, b, c, and d are values indicating the block range. Note that the interpolation method described above is merely an example; the apparatus may be configured so that the manufacturer or user can freely select from among known frame interpolation methods usable in the present invention.
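A sketch of the block evaluation of step ST225, assuming the sum-of-absolute-differences form given above (the exact formula in the original is not reproduced in this text, so SAD is an assumption):

```python
def evaluation_value(est_block, enc_block):
    """E = sum over the block of |I_est(i, j) - I_enc(i, j)|: a lower value
    means the estimated (interpolated) block is more similar to the actual
    block to be encoded."""
    return sum(abs(e - o)
               for est_row, enc_row in zip(est_block, enc_block)
               for e, o in zip(est_row, enc_row))
```

The block range a..b, c..d is implicit in the dimensions of the two lists passed in.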
  • FIG. 9 is a flowchart showing an example of the operation of the step, shown in FIG. 7, of selecting either the interpolated image or motion prediction/compensation.
  • the evaluation value E_int when the interpolated image is adopted and the evaluation value E_mot when motion prediction/compensation is adopted are calculated; when E_int is larger than E_mot, motion prediction/compensation is adopted, and when E_int is equal to or less than E_mot, the interpolated image is selected (steps ST231 to ST233).
  • when performing conventional image information compression encoding (for example, processing conforming to the H.264/AVC standard) or when performing the image information compression method of the first embodiment described later, the image information encoded by motion prediction/compensation is selected.
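The selection rule of steps ST231 to ST233 reduces to a comparison of the two evaluation values; a minimal sketch (the mode names are illustrative):

```python
def select_mode(e_int, e_mot):
    """Choose the interpolated image when E_int <= E_mot, otherwise motion
    prediction/compensation (steps ST231 to ST233)."""
    return "interpolation" if e_int <= e_mot else "motion_compensation"
```

Ties go to the interpolated image, matching the "equal to or less than" condition in the text.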
  • FIG. 10 is a block diagram schematically showing a configuration of an image information decoding device 300 corresponding to the image information encoding device 200.
  • the image information decoding apparatus 300 includes an input terminal 301, a storage buffer 302, a variable decoding unit 303, an inverse quantization unit 304, an inverse orthogonal transform unit 305, an adder 306, a screen rearrangement buffer 307, N D/A converters 308, and N output terminals 309.
  • the image information decoding apparatus 300 also includes a multi-camera frame memory 310.
  • the motion prediction / compensation unit 312, the interpolated image generation / compensation unit 313, and the selection unit 314 constitute a decoding processing unit 311 that performs image information decoding.
  • An image information decoding apparatus 300 shown in FIG. 10 includes a decoding processing unit 311 that can decode image information encoded by the image information compression method of the present invention, and corresponds to image information of a plurality of cameras. This is different from the image information decoding apparatus disclosed in Patent Document 1 in that a plurality of analog video signals can be output.
  • N DZA converters when N digital output signals are output from 309 to 309 output terminals, N DZA converters
  • the image compression information input from the input terminal 301 is temporarily stored in the storage buffer 302 and then transferred to the variable decoding unit 303.
  • the variable decoding unit 303 performs processing such as variable-length decoding or arithmetic decoding on the compressed image information based on its determined format, acquires the encoding mode information stored in the header portion, and supplies it to the inverse quantization unit 304 and other units. Similarly, the variable decoding unit 303 acquires the quantized transform coefficients and supplies them to the inverse quantization unit 304. Further, if the frame to be decoded has been inter-coded, the variable decoding unit 303 also decodes the motion vector information stored in the header portion of the compressed image information and supplies that information to the decoding processing unit 311.
  • the inverse quantization unit 304 inverse-quantizes the quantized transform coefficient supplied from the variable decoding unit 303, and supplies the transform coefficient to the inverse orthogonal transform unit 305.
  • the inverse orthogonal transform unit 305 performs inverse orthogonal transform such as inverse discrete cosine transform on the transform coefficient based on the determined format of the image compression information.
  • in the case of an intra-coded frame, the image information subjected to the inverse orthogonal transform processing is stored in the screen rearrangement buffer 307, subjected to D/A conversion in the D/A converters 308-1 to 308-N, and then output from the output terminals 309-1 to 309-N.
  • in the case of an inter-coded frame, the decoding processing unit 311 generates a reference image based on the motion vector information subjected to variable decoding processing and the image information stored in the multi-camera frame memory 310, and supplies it to the adder 306. The adder 306 combines the reference image with the output from the inverse orthogonal transform unit 305. The other processing is the same as for an intra-coded frame.
  • FIG. 11 is a flowchart showing the decoding processing of the image information decoding apparatus 300 shown in FIG. 10.
  • the image information decoding apparatus 300 performs variable decoding (step ST31), inverse quantization (step ST32), and inverse orthogonal transform (step ST33) on the input signal; if the image information was compensated using motion prediction, it is decoded using motion prediction compensation (steps ST34 and ST35), and if it was compensated using an interpolated image, it is decoded using the interpolated image (steps ST36 and ST37).
  • the processing of steps ST31 to ST37 is performed for all blocks, and this per-block processing is repeated for all frames. Thereafter, screen rearrangement (step ST41) and D/A conversion (step ST42) are performed on the obtained decoded data.
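The decoding flow of steps ST31 to ST42 can be pictured, in highly simplified form, as a per-block dispatch. The sketch below is illustrative only: all function names are hypothetical, and the entropy decoding, dequantization, and inverse transform are reduced to toy stand-ins.

```python
# Sketch of the per-block decoding flow of steps ST31-ST37 (hypothetical
# simplified model; real H.264/AVC-style decoding is far more involved).

def decode_block(block, mode, reference=None):
    """Decode one block: entropy decode -> dequantize -> inverse
    transform, then add the prediction chosen by `mode`."""
    residual = inverse_transform(dequantize(entropy_decode(block)))
    if mode == "motion":        # ST34/ST35: motion-compensated prediction
        prediction = reference
    elif mode == "interp":      # ST36/ST37: interpolated-image prediction
        prediction = reference
    else:                       # intra block: no prediction added here
        prediction = [0] * len(residual)
    return [r + p for r, p in zip(residual, prediction)]

# Toy stand-ins for the three fixed decoding stages.
def entropy_decode(levels):      return levels
def dequantize(levels, q=2):     return [l * q for l in levels]
def inverse_transform(coeffs):   return coeffs  # identity for the sketch
```

The only difference between the "motion" and "interp" branches in this sketch is where the reference comes from (the multi-camera frame memory versus the interpolated image generator); the addition in the adder 306 is identical.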
  • FIG. 12 is a flowchart showing an example of the operation of the interpolated image generation / compensation step ST37 shown in FIG.
  • the processing in steps ST371 to ST374 in FIG. 12 is the same as the processing in steps ST221 to ST224 described above for the encoding side.
  • that is, the depth is estimated at each pixel in the block to generate an interpolated pixel (for example, a pixel value of 0 to 255), and an evaluation value E based on the pixel value of the generated interpolated pixel is calculated. The minimum value E_min of the evaluation value E over the depth search range of the block is then obtained (steps ST371 to ST373), and an interpolated pixel is generated using the depth that gives the minimum value E_min (step ST374).
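The depth search of steps ST371 to ST374 can be sketched for a single pixel as follows. Everything here is an assumption for illustration: the depth candidates are modelled as disparities along one image row, and the evaluation value E is taken to be the absolute difference between the two reference-view pixels.

```python
# Sketch of steps ST371-ST374: for one pixel, try each candidate depth
# (modelled as a disparity d), form the interpolated value from the two
# reference views, and keep the depth whose evaluation value E (here:
# absolute inter-view difference) is minimal.

def interpolate_pixel(left_row, right_row, x, max_disparity):
    best = None
    for d in range(max_disparity + 1):          # candidate depths (ST371)
        if x - d < 0 or x + d >= len(right_row):
            continue
        a, b = left_row[x - d], right_row[x + d]
        e = abs(a - b)                          # evaluation value E (ST372)
        if best is None or e < best[0]:         # track E_min (ST373)
            best = (e, (a + b) // 2)
    return best[1]                              # pixel for E_min (ST374)
```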
  • the processing in steps ST371 to ST374 is performed on all the pixels in the block.
  • although the image information encoding apparatus 200 capable of performing the image information compression method of the present invention and the image information decoding apparatus 300 capable of decoding image information encoded by that method have been described above as examples, the apparatuses that can implement the image information compression method of the present invention are not limited to those having the above-described configurations; the image information compression method of the present invention can also be applied to apparatuses having other configurations.
  • in the following, embodiments of the image information compression method of the present invention, and of the FTV system to which the image information compression method of the present invention is applied, are explained.
  • the image information compression method according to the first embodiment of the present invention will be described below.
  • the image information compression method according to the first embodiment applies inter-view prediction encoding, which is described later.
  • FIGS. 13 and 14 are explanatory diagrams (parts 1 and 2) of the image information compression method according to the first embodiment of the present invention.
  • in FIGS. 13 and 14, t represents a time axis, and S represents a spatial axis in the camera arrangement order (the camera arrangement direction).
  • # 1 to # 7 indicate camera numbers assigned in the order of camera arrangement.
  • the number of cameras may be other than the number shown, as long as there are two or more. Further, the cameras may be arranged in any of the arrangements of FIGS. 2(a) to 2(e), or in other arrangements.
  • I denotes an intra-frame coded frame (I picture), P denotes an inter-frame predictive-coded frame (P picture), and B denotes an inter-frame bidirectionally predictive-coded frame (B picture).
  • a predetermined number of frames arranged in the direction of the time axis t constitute a GOP, an image group composed of a predetermined number of frames.
  • a GOP is configured by a predetermined number of pictures of I, B, B, P, B, B, P,.
  • image information of the frames of the moving images acquired by the plurality of cameras, arranged in the time axis t direction, is encoded by intra-frame coding (intra coding) and by inter-frame predictive coding (inter coding) using the temporal correlation between frames.
  • the inter-frame predictive coding using temporal correlation is, for example, an encoding method based on the H.264/AVC standard.
  • the inter-frame predictive coding using temporal correlation is not limited to the above method, and other coding methods may be adopted.
  • by this encoding, encoded images of the moving image frames as shown in FIG. 13 are obtained.
  • the encoding of the temporally first frame in the GOP, which is composed of a predetermined number of frames aligned in the time axis t direction, is performed by intra-frame coding, so that the first frame becomes an I picture. The encoding of frames other than the first frame in the same GOP is performed by inter-frame predictive coding using temporal correlation, and the resulting encoded images are P pictures or B pictures.
  • next, image information of the frames of the moving images acquired by the plurality of cameras, taken at the same time and arranged in the spatial axis S direction in the order of camera arrangement, is encoded by inter-frame predictive coding using the correlation between frames at the same time, with the same algorithm as the inter-frame predictive coding using temporal correlation.
  • the inter-frame predictive coding using the correlation between frames at the same time is executed in units of image groups (G shown in FIG. 6) each composed of a predetermined number of frames arranged at the same time in the spatial axis S direction.
  • in this way, the inter-frame predictive coding using the correlation between frames at the same time is inter-frame predictive coding using the correlation between frames acquired at different viewpoints (for example, adjacent camera positions); it is therefore referred to here as "inter-view prediction encoding".
  • the frames subjected to inter-view prediction encoding are the temporally first frames in the GOPs, that is, the I pictures.
  • by the inter-view prediction encoding, as shown in FIG. 14, the first frames in the GOPs are encoded along the spatial axis S in the camera arrangement direction as I, B, B, P, B, B, P, ... pictures. The inter-view prediction encoding described above is executed for the first frame of each GOP acquired by the plurality of cameras.
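A toy sketch of the picture-type assignment of FIGS. 13 and 14: frames after the first in a GOP follow the temporal I/B/B/P pattern, while the temporally first frames follow the same pattern along the camera axis. The exact pattern and its period here are illustrative assumptions.

```python
# Sketch combining FIG. 13 and FIG. 14: every frame of every camera gets
# a picture type. Frames after the first in a GOP follow the temporal
# B, B, P pattern; the temporally first frames follow the same pattern
# along the camera axis S (inter-view prediction encoding).

PATTERN = ["B", "B", "P"]

def picture_type(camera_idx, frame_idx):
    """camera_idx and frame_idx are 0-based; frame_idx counts within a GOP."""
    if frame_idx > 0:                       # temporal prediction (FIG. 13)
        return PATTERN[(frame_idx - 1) % 3]
    if camera_idx == 0:                     # first frame of camera #1
        return "I"
    return PATTERN[(camera_idx - 1) % 3]    # inter-view coding (FIG. 14)
```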
  • the image information compression method according to the first embodiment focuses on the fact that, between images taken at the same time by a plurality of cameras whose positional relationships are known, there is a spatial correlation similar to the temporal correlation used in the H.264/AVC standard and the like, and proposes to apply inter-view predictive coding to the first frame of the GOP (the I picture), which has a large amount of information.
  • since the image information compression method of the first embodiment applies, to the first frames in the GOPs aligned in the spatial axis S direction, inter-frame predictive coding based on the same algorithm as the inter-frame predictive coding for the frames aligned in the time axis t direction (that is, inter-view prediction encoding), the coding compression efficiency can be improved.
  • furthermore, since the inter-view prediction encoding is based on the same algorithm as the inter-frame predictive coding for the frames arranged in the time axis t direction, the motion prediction/compensation unit 215 can be diverted to it. For this reason, no significant additional configuration (circuitry or software) is needed to implement the image information compression method of the first embodiment, which is advantageous in terms of cost.
  • the image information compression method according to the second embodiment of the present invention will be described below.
  • the image information compression method of the second embodiment uses viewpoint interpolation, which will be described later, and is executed by the multiframe memory 213 shown in FIG. 5, the motion prediction/compensation unit 215 of the encoding processing unit 214, the interpolated image generation/compensation unit 216, and the selection unit 217.
  • FIGS. 15 to 18 are explanatory diagrams (parts 1 to 4) of the image information compression method according to the second embodiment of the present invention.
  • in FIGS. 15 to 18, t represents a time axis, and S represents a spatial axis in the camera arrangement order (the camera arrangement direction).
  • for simplicity, the figures show only the frames acquired by cameras #1 to #5.
  • the number of cameras may be any number with which viewpoint interpolation is possible, that is, three or more (one camera capturing the frame to be encoded and two cameras capturing the reference frames used to generate an interpolated image corresponding to the frame to be encoded, for a total of three).
  • I, P, and B are an I picture, a P picture, and a B picture, respectively.
  • the frames arranged in the space axis S direction are frames at the same time.
  • based on the image information acquired by the selected odd-numbered cameras #1, #3, #5, and so on, images corresponding to the cameras other than the selected cameras are generated by viewpoint interpolation; such an image is referred to here as a "viewpoint interpolation image".
  • the interpolation method used for viewpoint interpolation is not limited to any particular method; a known frame interpolation method may be selected based on various factors such as the performance required of the apparatus that implements the image information compression method of the present invention, or the requests of the apparatus user. If it is clear that the movement of the subject follows a specific law, an interpolation method suited to that movement may be selected. Also, before or after generating the viewpoint interpolation images shown in FIG. 16, the inter-view prediction encoding described in the first embodiment may be applied to the first frames in the GOPs to compress the amount of information of those frames.
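As a minimal illustration of viewpoint interpolation (the text permits any known method), the sketch below forms a virtual middle view from two adjacent camera views under the strong assumption of a single global disparity d; image rows are modelled as 1-D pixel lists.

```python
# Minimal viewpoint-interpolation sketch: assuming a single global
# disparity d between adjacent cameras (true only for a fronto-parallel
# plane), the virtual middle view is formed by shifting each real view
# halfway toward the centre and averaging. Purely illustrative.

def interpolate_middle_view(left, right, d=2):
    h = d // 2                           # half-disparity shift
    width = len(left)
    out = []
    for x in range(width):
        a = left[min(width - 1, x + h)]  # left view shifted toward centre
        b = right[max(0, x - h)]         # right view shifted toward centre
        out.append((a + b) // 2)
    return out
```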
  • next, the image information of the frames arranged in the time axis t direction of the moving images acquired by the even-numbered cameras #2, #4, ... other than the selected cameras is encoded using intra-frame coding and inter-frame predictive coding using the temporal correlation between frames.
  • at this time, for the images acquired by the even-numbered cameras #2, #4, ... other than the selected cameras, the selection unit 217 of the image information encoding apparatus 200 compares the case where the frame to be encoded is encoded with reference to frames at different times with the case where it is encoded with reference to the viewpoint interpolation image corresponding to the frame to be encoded, and selectively outputs the encoding result that gives the highest coding efficiency.
  • the viewpoint interpolation image corresponding to the frame FR(#2, n) is generated based on the frames FR(#1, n) and FR(#3, n) adjacent to it. Although the case is shown where the frame FR(#2, n) to be encoded refers to the frames FR(#2, n-1) and FR(#2, n+1) at different times (drawn with thick solid lines), the frames to be referenced are not limited to FR(#2, n-1) and FR(#2, n+1); the frame FR(#2, n) may refer to only one of FR(#2, n-1) and FR(#2, n+1), or may also refer to frames at times other than those shown.
  • as described above, in the image information compression method of the second embodiment, for the image information acquired by the cameras #2, #4, ... other than the selected cameras, the encoding result that gives the highest coding compression efficiency is selectively output between the case where the encoding is performed with reference to image information of frames at times different from the target frame FR(#2, n) and the case where it is performed with reference to the viewpoint interpolation image generated from the adjacent cameras #1 and #3; the coding compression efficiency of the output image information can thereby be improved.
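The selection just described can be sketched as follows, with coded size crudely modelled as the number of nonzero residual samples; a real encoder would measure actual bits after transform, quantization, and entropy coding. All names are illustrative.

```python
# Sketch of the selection unit's choice: encode the target block against
# each candidate reference and keep the result with the smallest coded
# size. `encode_residual` is a stand-in for real coding - here the
# "size" is just the number of nonzero residual samples.

def encode_residual(block, reference):
    residual = [b - r for b, r in zip(block, reference)]
    size = sum(1 for v in residual if v != 0)
    return size, residual

def select_best_reference(block, candidates):
    """candidates: dict name -> reference block; returns (name, residual)."""
    best = None
    for name, ref in candidates.items():
        size, residual = encode_residual(block, ref)
        if best is None or size < best[0]:
            best = (size, name, residual)
    return best[1], best[2]
```

Here the "temporal" candidate would be a frame of the same camera at a different time and the "interp" candidate the viewpoint interpolation image.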
  • although the case has been described where the selected cameras are the odd-numbered cameras (#1, #3, #5, #7, ...) and the cameras other than the selected cameras are the even-numbered cameras (#2, #4, #6, ...), the selected cameras may instead be the even-numbered cameras and the cameras other than the selected cameras the odd-numbered cameras.
  • FIG. 18 shows a case where a viewpoint interpolation image is generated by interpolation, as indicated by a white arrow, but a viewpoint interpolation image may also be generated by extrapolation.
  • further, the selected cameras are not limited to the even-numbered or odd-numbered cameras. For example, a method may be adopted in which one of every three cameras, whose camera numbers are indicated by #3n-2 (specifically, #1, #4, #7, ...), is a selected camera and the remaining cameras (specifically, #2, #3, #5, #6, ...) are cameras other than the selected cameras. Alternatively, in some group of cameras the even-numbered (#2, #4, #6, ...) or odd-numbered (#1, #3, #5, ...) cameras may be the selected cameras while, in the remaining group, one of every three cameras indicated by #3n-2 is a selected camera and the remaining cameras are cameras other than the selected cameras; or, conversely, in some group one of every three cameras indicated by #3n-2 may be a selected camera with the remaining cameras being cameras other than the selected cameras, while in the remaining group the even-numbered or odd-numbered cameras are the selected cameras. That is, a scheme that selects the even- or odd-numbered cameras and a scheme that selects one camera out of every predetermined number of cameras can also be combined.
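The grouping variants above can be summarised in a small helper (illustrative only; camera numbers are 1-based, and the scheme names are hypothetical):

```python
# Sketch of the camera-grouping variants: selected cameras may be the
# odd-numbered ones, the even-numbered ones, or one camera in three
# (camera numbers #3n-2, i.e. #1, #4, #7, ...).

def selected_cameras(n, scheme):
    if scheme == "odd":
        return [c for c in range(1, n + 1) if c % 2 == 1]
    if scheme == "even":
        return [c for c in range(1, n + 1) if c % 2 == 0]
    if scheme == "one_in_three":   # camera numbers of the form 3n-2
        return [c for c in range(1, n + 1) if c % 3 == 1]
    raise ValueError(scheme)
```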
  • the image information compression method according to the third embodiment of the present invention also uses viewpoint interpolation, and is executed by the multiframe memory 213 shown in FIG. 5, the motion prediction/compensation unit 215 of the encoding processing unit 214, the interpolated image generation/compensation unit 216, and the selection unit 217.
  • the image information compression method of the third embodiment is an improved version of the image information compression method of the second embodiment, and differs from it in that a plurality of viewpoint interpolation images are referred to.
  • FIG. 19 is an explanatory diagram of an image information compression method according to the third embodiment of this invention.
  • in FIG. 19, FR(#1, n) is the frame acquired by camera #1 at time t = n. For the frame FR(#2, n), a viewpoint interpolation image generated using a first interpolation method and a viewpoint interpolation image generated using a second interpolation method are prepared; FIG. 19 thus shows two types of viewpoint interpolation images corresponding to the frame FR(#2, n).
  • the first interpolation method and the second interpolation method are not limited to specific methods; any known frame interpolation method can be selected based on various factors such as the performance required of the apparatus and the requests of the apparatus user. If it is clear that the movement of the subject follows a specific law, an interpolation method suited to that movement may be selected.
  • although the case is shown where the frame FR(#2, n) to be encoded refers to the frames FR(#2, n-1) and FR(#2, n+1) at different times (drawn with thick solid lines), the frames to be referenced are not limited to FR(#2, n-1) and FR(#2, n+1); the frame FR(#2, n) may refer to only one of them, or may also refer to frames at times other than those shown.
  • the selection unit 217 shown in FIG. 5 selectively outputs the encoding result that gives the highest coding compression efficiency among: encoding with reference to frames at different times, using inter-frame predictive coding based on the temporal correlation between frames (for example, H.264/AVC); encoding of the frame FR(#2, n) with reference to the viewpoint interpolation image generated by the first interpolation method and corresponding to the frame FR(#2, n) to be encoded; and encoding with reference to the viewpoint interpolation image generated by the second interpolation method.
  • the reason for this processing is that, depending on the instantaneous movement of the subject, frames at different times taken by the same camera #2 may be most similar to the target frame FR(#2, n), or a viewpoint interpolation image based on the same-time frames taken by the adjacent cameras #1 and #3 may be most similar; moreover, which of the two interpolation methods produces the more similar viewpoint interpolation image also varies. Whereas the image information compression method of the second embodiment uses a single viewpoint interpolation image, the third embodiment refers to a plurality of viewpoint interpolation images, so that the most similar image can be used.
  • as described above, in the image information compression method of the third embodiment, for the image information acquired by the cameras #2, #4, ... other than the selected cameras, the encoding result that gives the highest coding compression efficiency is selectively output between the case where the encoding is performed with reference to image information of frames at times different from the target frame FR(#2, n) and the cases where it is performed with reference to the viewpoint interpolation images corresponding to the frame FR(#2, n) to be encoded; the coding efficiency of the output image information can thereby be improved.
  • although the case where the selected cameras are odd-numbered and the other cameras are even-numbered has been described, the selected cameras may instead be even-numbered and the other cameras odd-numbered.
  • FIG. 19 shows a case where a viewpoint interpolation image is generated by interpolation, as indicated by a white arrow, but a viewpoint interpolation image may also be generated by extrapolation.
  • further, the selected cameras are not limited to the even-numbered or odd-numbered cameras. For example, a method may be adopted in which one of every three cameras, whose camera numbers are indicated by #3n-2, is a selected camera and the remaining cameras are cameras other than the selected cameras. Alternatively, in some group of cameras the even-numbered (#2, #4, #6, ...) or odd-numbered (#1, #3, #5, ...) cameras may be the selected cameras while, in the remaining group, one of every three cameras indicated by #3n-2 is a selected camera and the remaining cameras are cameras other than the selected cameras; or, conversely, in some group one of every three cameras indicated by #3n-2 may be a selected camera while in the remaining group the even-numbered or odd-numbered cameras are the selected cameras.
  • the image information compression method according to the fourth embodiment of the present invention also uses viewpoint interpolation, and is executed by the multiframe memory 213 shown in FIG. 5, the motion prediction/compensation unit 215 of the encoding processing unit 214, the interpolated image generation/compensation unit 216, and the selection unit 217. It is an improved version of the image information compression method of the second embodiment, differing in that, in addition to the viewpoint interpolation image, the adjacent images at the same time are also referred to.
  • FIG. 20 is an explanatory diagram of an image information compression method according to the fourth embodiment of the present invention.
  • in FIG. 20, FR(#1, n) is the frame acquired by camera #1 at time t = n, and FR(#2, n) is the frame to be encoded. Although the case is shown where the frame FR(#2, n) refers to the frames FR(#2, n-1) and FR(#2, n+1) at different times, the reference frames are not limited to FR(#2, n-1) and FR(#2, n+1); FR(#2, n) may refer to only one of them, or may also refer to frames at times other than those shown.
  • the selection unit 217 shown in FIG. 5 selectively outputs the encoding result that gives the highest coding compression efficiency among: encoding with reference to frames at different times, using inter-frame predictive coding based on the temporal correlation between frames (for example, H.264/AVC); encoding of the frame FR(#2, n) with reference to the viewpoint interpolation image corresponding to the frame FR(#2, n) to be encoded; and encoding of the frame FR(#2, n) with reference to the frame FR(#1, n) or FR(#3, n) adjacent to it at the same time.
  • the reason for this processing is that, when considering which image the frame to be encoded most resembles, there are cases where a frame at a different time taken by the same camera #2 is most similar, cases where a viewpoint interpolation image based on the same-time frames taken by the adjacent cameras #1 and #3 is most similar, and cases where a same-time frame taken by the adjacent camera #1 or #3 itself is most similar; which of these holds depends on the instantaneous movement of the subject.
  • focusing on this point, the image information compression method of the fourth embodiment encodes the frame to be encoded using whichever of the following is most similar to it: a frame at a different time taken by the same camera, a viewpoint interpolation image based on same-time frames taken by adjacent cameras, or a same-time frame taken by an adjacent camera.
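The three-way choice of the fourth embodiment can be sketched with the sum of absolute differences (SAD) as the similarity measure; SAD is an assumption here, since the text only requires selecting the most similar (most compressible) reference.

```python
# Sketch of the fourth embodiment's three-way choice: pick, by SAD, the
# most similar of (a) a different-time frame from the same camera,
# (b) a viewpoint interpolation image, and (c) a same-time frame from an
# adjacent camera. Blocks are modelled as flat pixel lists.

def sad(a, b):
    """Sum of absolute differences between two equal-length pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

def most_similar(target, temporal_ref, interp_ref, adjacent_ref):
    candidates = {"temporal": temporal_ref,
                  "interpolated": interp_ref,
                  "adjacent": adjacent_ref}
    return min(candidates, key=lambda k: sad(target, candidates[k]))
```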
  • as described above, in the fourth embodiment, for the image information acquired by the cameras #2, #4, ... other than the selected cameras, the encoding is performed with reference to image information of frames at times different from the target frame FR(#2, n), with reference to the viewpoint interpolation image corresponding to the frame FR(#2, n) to be encoded, or with reference to the adjacent same-time frames, and the result with the highest coding compression efficiency is selectively output, so the coding compression efficiency of the output image information can be improved.
  • although the case where the selected cameras are odd-numbered and the other cameras are even-numbered has been described, the selected cameras may instead be even-numbered and the other cameras odd-numbered.
  • FIG. 20 shows a case where a viewpoint interpolation image is generated by interpolation, as indicated by a white arrow, but a viewpoint interpolation image may also be generated by extrapolation.
  • further, the selected cameras are not limited to the even-numbered or odd-numbered cameras. For example, a method may be adopted in which one of every three cameras, whose camera numbers are indicated by #3n-2, is a selected camera and the remaining cameras are cameras other than the selected cameras. Alternatively, in some group of cameras the even-numbered (#2, #4, #6, ...) or odd-numbered (#1, #3, #5, ...) cameras may be the selected cameras while, in the remaining group, one of every three cameras indicated by #3n-2 is a selected camera and the remaining cameras are cameras other than the selected cameras; or, conversely, in some group one of every three cameras indicated by #3n-2 may be a selected camera while in the remaining group the even-numbered or odd-numbered cameras are the selected cameras.
  • a plurality of types of viewpoint interpolation images may be generated by combining the fourth embodiment with the third embodiment.
  • the image information compression method of the fifth embodiment is an improvement over the image information compression method of the first embodiment.
  • what distinguishes the image information compression method of the fifth embodiment is that, in the inter-view prediction encoding of the temporally first frames in the GOPs, interpolated images are also referred to; in this respect it differs from the image information compression method of the first embodiment.
  • the image information compression method of the fifth embodiment is executed by the multi-frame memory 213 shown in FIG. 5, the motion prediction/compensation unit 215, the interpolated image generation/compensation unit 216, and the selection unit 217.
  • FIGS. 21 to 26 are explanatory diagrams of an image information compression method according to the fifth embodiment of the present invention.
  • t indicates the time axis direction, and S is a spatial axis corresponding to the camera arrangement order (the camera arrangement direction).
  • the figure shows cameras # 1 to # 9, but the number of cameras is not limited to nine.
  • I indicates an I picture
  • P indicates a P picture
  • B indicates a B picture.
  • Pi is a P picture that also refers to an interpolated image, and Bi is a B picture that also refers to an interpolated image.
  • first, the image information of the frames arranged in the time axis t direction of the moving images acquired by the plurality of cameras is encoded (for example, by H.264/AVC processing). The encoding of the temporally first frame in the GOP, which is composed of a predetermined number of frames arranged in the time axis t direction, is performed by intra-frame coding, and the first frame becomes the I picture; the encoding of frames other than the first frame in the same GOP is performed by inter-frame predictive coding using temporal correlation.
  • next, the frame FR(#1, 1), which is an I picture, is selected as the first reference frame, and the frame FR(#3, 1), which is a P picture, is selected as the second reference frame.
  • a viewpoint interpolation image is generated by extrapolation based on the frame FR(#1, 1) and the frame FR(#3, 1). Encoding with reference to the image information of frames other than the encoding target frame among the same-time frames arranged in the camera arrangement order (the inter-view prediction encoding of the first embodiment) and encoding with reference to the generated viewpoint interpolation image are compared, and the encoding result that gives the highest coding compression efficiency is taken as the encoded image information of the encoding target frame (for example, FR(#5, 1)), which becomes, for example, a Pi picture.
  • thereafter, a viewpoint interpolation image is sequentially generated by extrapolation from the image of the frame FR(#3, 1) and the Pi picture just generated, and the same processing is repeated.
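The sequential extrapolation along the camera axis can be sketched as follows, with per-pixel linear extrapolation standing in for the real interpolation method (an illustrative assumption; function names are hypothetical):

```python
# Sketch of the fifth embodiment's sequential extrapolation: starting
# from the first frames of cameras #1 and #3, an extrapolated image for
# camera #5 is predicted, camera #5's frame is coded against it, and the
# result then serves as a reference for camera #7, and so on.

def extrapolate(ref_a, ref_b):
    # Predict the next equally spaced view by per-pixel linear extrapolation.
    return [2 * b - a for a, b in zip(ref_a, ref_b)]

def code_odd_cameras(first_frames):
    """first_frames: list of first-GOP frames for cameras #1, #3, #5, ...
    Returns the residuals coded for cameras #5 onward."""
    residuals = []
    a, b = first_frames[0], first_frames[1]
    for frame in first_frames[2:]:
        pred = extrapolate(a, b)
        residuals.append([f - p for f, p in zip(frame, pred)])
        a, b = b, frame          # the coded frame becomes the next reference
    return residuals
```

When the scene varies linearly across the camera baseline, the extrapolated prediction is exact and the residuals vanish, which is the situation the inter-view correlation argument relies on.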
  • for the remaining frames (for example, FR(#4, 1)), an interpolated image is generated from the frames on both sides, and the encoding result that gives the highest coding compression efficiency is taken as the encoded image information of the target frame, which becomes, for example, a Bi picture.
  • in this way, each encoding target frame is encoded with reference to the image information of other frames among the same-time frames arranged in the camera arrangement order and with reference to the viewpoint interpolation image corresponding to the encoding target frame, and the encoding result that gives the highest coding compression efficiency is selectively output.
  • the reason for this processing is that, for the first frame in the GOP, when considering which image the encoding target frame most resembles, there are cases where the image encoded by the inter-view prediction encoding of the first embodiment based on the same-time frames taken by adjacent cameras is most similar, and cases where the interpolated image created based on the reference frames taken by adjacent cameras is most similar; which of the two holds depends on the instantaneous movement of the subject. Focusing on this point, the image information compression method of the fifth embodiment encodes the encoding target frame using whichever of these images is most similar, and the coding compression efficiency of the output image information can thereby be improved. In the fifth embodiment, points other than those described above are the same as in the first embodiment.
  • FIG. 27 is a diagram showing an example of a horizontal section of a light space referred to in the image information compression method of the sixth embodiment of the present invention.
  • FIG. 28 is an explanatory diagram of a motion vector prediction method in the image information compression method according to the sixth embodiment of the present invention.
  • FIG. 29 is an explanatory diagram of a motion vector prediction method in H.264/AVC, as a comparative example for the sixth embodiment of the present invention.
  • the image information compression method of the sixth embodiment is an improvement over the image information compression method of the first embodiment.
  • the image information compression method of the sixth embodiment is based on the premise that the plurality of cameras are arranged in a straight line, parallel to one another.
  • the image information compression method according to the sixth embodiment is characterized in that, in the step of encoding the image information of the same-time frames arranged in the camera arrangement order by inter-frame predictive coding using the correlation between frames at the same time (the inter-view prediction encoding step of the first embodiment), the motion vector used in the motion-compensated predictive coding is obtained based on the straight lines appearing in the horizontal cross-sectional image obtained when the ray space is cut horizontally (EPI: Epipolar Plane Image).
  • the image information compression method according to the sixth embodiment is executed by the multiframe memory 213 shown in FIG. 5 and the motion prediction/compensation unit 215 of the encoding processing unit 214.
  • in H.264/AVC, the motion vector of a block is predicted from neighboring blocks BL (FIG. 29). The method of the sixth embodiment, by contrast, uses the fact that the plurality of cameras are linearly arranged in a line parallel to one another, and exploits the moving images acquired by the plurality of cameras.
  • the horizontal sectional structure in the light space is a collection of linear structures.
  • FIG. 30(b) shows a point X in real space as it appears on the horizontal section of the ray space, where its locus is a straight line.
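The use of EPI straight lines can be sketched as follows: once the slope of the line traced by a scene point across the views is known, the inter-view displacement to any camera follows by multiplication, with no block search. The least-squares slope fit and all names are illustrative.

```python
# Sketch of the sixth embodiment's idea: in an epipolar-plane image
# (EPI), a scene point traces a straight line whose slope is the
# per-camera disparity. Given that slope, the inter-view "motion vector"
# for a block k cameras away is simply k * slope.

def epi_line_slope(positions):
    """positions[i] = x-coordinate of the same scene point in camera i.
    Returns the per-camera disparity (slope of the EPI line),
    fitted by least squares."""
    n = len(positions)
    xs = range(n)
    mean_i = sum(xs) / n
    mean_p = sum(positions) / n
    num = sum((i - mean_i) * (p - mean_p) for i, p in zip(xs, positions))
    den = sum((i - mean_i) ** 2 for i in xs)
    return num / den

def interview_motion_vector(slope, camera_gap):
    # Displacement between views separated by `camera_gap` cameras.
    return slope * camera_gap
```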
  • although the image information compression method of the sixth embodiment has been described as applied to the first embodiment, it can also be applied to the second to fifth embodiments.
  • FIG. 30 is a diagram conceptually showing the basic structure of the FTV system according to the seventh embodiment of the present invention.
  • in FIG. 30, elements that are the same as or correspond to those already described are denoted by the same reference numerals.
  • the transmission-side device 250 and the reception-side device 350 are separate from each other, and the system transmits FTV signals from the transmission-side device 250 to the reception-side device 350 via, for example, the Internet.
  • the transmission-side apparatus 250 includes a plurality of cameras (five are shown in the figure, although more cameras are actually used) and an image information encoding device 200 that has the configuration and functions described in the first to sixth embodiments and that compresses and encodes the video information acquired by the cameras.
  • the image information compressed and encoded by the image information encoding device 200 is sent to the receiving device 350 by a communication device (not shown).
  • the reception-side apparatus 350 includes, as shown, a receiving device, the image information decoding apparatus 300 described in the first embodiment, and a unit that forms a ray space 103 based on the output signal of the image information decoding apparatus 300, extracts a cross section from the ray space 103 according to the viewpoint position input from the user interface 104, and displays it.
  • as described with reference to FIGS. 3(a), (b) and FIGS. 4(a) to (c), by using the ray space method and cutting an arbitrary surface out of the ray space 103, it is possible to generate an image viewed from an arbitrary horizontal viewpoint in real space. For example, when the cross section 103a is cut out of the ray space 103 shown in FIG. 4(a), the image shown in FIG. 4(b) is generated, and when the cross section 103b is cut out, the image shown in FIG. 4(c) is generated.
  • according to this FTV system, the compression efficiency of encoding the FTV signal can be improved.
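The sixth embodiment's key idea — deriving the inter-view motion (disparity) vector from the straight line a scene point traces in the EPI — can be sketched in Python. This is an illustrative sketch under stated assumptions, not the patent's implementation: the camera parameters (`focal`, `baseline`), the noiseless point track, and the least-squares slope fit are all assumptions introduced here.

```python
# Sketch of the EPI straight-line property for parallel cameras on a line
# (illustrative only; parameter values are not from the patent).

def disparity_from_depth(focal: float, baseline: float, depth: float) -> float:
    """Horizontal disparity (pixels) between adjacent parallel cameras."""
    return focal * baseline / depth

def epi_line_positions(x0: float, disparity: float, num_views: int) -> list:
    """Image x-coordinate of one scene point in each view.

    Because the cameras are evenly spaced on a straight line, the point's
    positions form a straight line in the EPI whose slope is the disparity.
    """
    return [x0 + i * disparity for i in range(num_views)]

def predict_disparity(positions: list) -> float:
    """Recover the EPI line's slope (the shared inter-view motion vector)
    by a least-squares fit over the views."""
    n = len(positions)
    mean_i = (n - 1) / 2
    mean_p = sum(positions) / n
    num = sum((i - mean_i) * (p - mean_p) for i, p in enumerate(positions))
    den = sum((i - mean_i) ** 2 for i in range(n))
    return num / den

# A point 50 m deep seen by cameras with 1000 px focal length, 0.1 m spacing:
d = disparity_from_depth(focal=1000.0, baseline=0.1, depth=50.0)  # 2.0 px
track = epi_line_positions(320.0, d, num_views=5)
assert abs(predict_disparity(track) - d) < 1e-9
```

Once the slope is known, the same disparity vector can serve as the motion vector for motion-compensated prediction between any pair of adjacent views, which is the saving the sixth embodiment exploits.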
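On the receiving side, generating a free-viewpoint image amounts to cutting a cross section out of the ray space 103. A toy sketch follows; modeling the ray space as a 2D array `ray_space[u][x]` (viewpoint index u, image column x) and synthesizing fractional viewpoints by linear interpolation between adjacent slices are assumptions made here for illustration, not details taken from the patent.

```python
# Toy ray-space cut: the plane u = const is the image seen from viewpoint u.

def extract_view(ray_space, u):
    """Cut the cross section u = const out of a discretized ray space.

    An integer u returns a stored slice; a fractional u is synthesized by
    linear interpolation between the two adjacent viewpoint slices.
    """
    lo = int(u)
    frac = u - lo
    if frac == 0.0:
        return list(ray_space[lo])
    return [(1.0 - frac) * a + frac * b
            for a, b in zip(ray_space[lo], ray_space[lo + 1])]

# Two stored viewpoints, two pixels each; the halfway viewpoint is blended.
rays = [[0.0, 10.0], [2.0, 14.0]]
assert extract_view(rays, 0) == [0.0, 10.0]
assert extract_view(rays, 0.5) == [1.0, 12.0]
```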

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to an image information compression method capable of improving the compression efficiency when encoding image information captured by multiple cameras, and to an FTV (free-viewpoint television) system using the method. The image information compression method comprises: a step of encoding the frames FR(#1, n-1) to FR(#1, n+1) and FR(#3, n-1) to FR(#3, n+1) of the moving images captured by the odd-numbered cameras #1 and #3; a step of generating a viewpoint-interpolated image FRint(#2, n) corresponding to a frame of the moving image captured by the even-numbered camera #2; and a step of selectively outputting, for the image captured by camera #2, whichever encoding result has the higher compression efficiency: the result of encoding with reference to the frames FR(#2, n-1) and FR(#2, n+1) at different times, or the result of encoding with reference to the viewpoint-interpolated image FRint(#2, n).
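The selection step described in the abstract — encoding the even-numbered camera's frame once against its own temporally adjacent frames and once against the viewpoint-interpolated frame FRint(#2, n), then keeping whichever result compresses better — can be sketched as below. The cost model (sum of absolute residuals as a stand-in for encoded bit count) and the averaging view interpolator are illustrative assumptions, not the patent's exact measures.

```python
# Sketch of the two-way encode-and-select step (illustrative cost model).

def residual_cost(frame, reference):
    """Proxy for encoded size: sum of absolute prediction residuals."""
    return sum(abs(a - b) for a, b in zip(frame, reference))

def interpolate_views(left, right):
    """Viewpoint-interpolated frame, e.g. FRint(#2, n) from cameras #1 and #3."""
    return [(a + b) / 2.0 for a, b in zip(left, right)]

def select_encoding(frame, temporal_ref, left_view, right_view):
    """Return ('temporal' | 'interview', cost) for the cheaper prediction."""
    t_cost = residual_cost(frame, temporal_ref)
    v_cost = residual_cost(frame, interpolate_views(left_view, right_view))
    return ('temporal', t_cost) if t_cost <= v_cost else ('interview', v_cost)
```

A real encoder would compare actual rate-distortion costs of the two complete encodings; the point here is only the structure of the choice.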
PCT/JP2006/300257 2005-07-26 2006-01-12 Image information compression method and FTV (free-viewpoint television) system WO2007013194A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007526814A JP4825983B2 (ja) 2005-07-26 2006-01-12 Image information compression method and free-viewpoint television system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2005215928 2005-07-26
JP2005-215928 2005-07-26

Publications (1)

Publication Number Publication Date
WO2007013194A1 true WO2007013194A1 (fr) 2007-02-01

Family

ID=37683102

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2006/300257 WO2007013194A1 (fr) 2005-07-26 2006-01-12 Image information compression method and FTV (free-viewpoint television) system

Country Status (2)

Country Link
JP (1) JP4825983B2 (fr)
WO (1) WO2007013194A1 (fr)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0698312A (ja) * 1992-09-16 1994-04-08 Fujitsu Ltd High-efficiency image coding system
JPH07143494A (ja) * 1993-11-19 1995-06-02 Sanyo Electric Co Ltd Moving image encoding method
JPH07154799A (ja) * 1993-11-26 1995-06-16 Sanyo Electric Co Ltd Moving image encoding method
JPH09245195A (ja) * 1996-03-08 1997-09-19 Canon Inc Image processing method and apparatus
JPH09261653A (ja) * 1996-03-18 1997-10-03 Sharp Corp Multi-view image encoding device
JPH10224795A (ja) * 1997-01-31 1998-08-21 Nippon Telegr & Teleph Corp <Ntt> Moving image encoding method, decoding method, encoder, and decoder
JP2002016945A (ja) * 2000-06-29 2002-01-18 Toppan Printing Co Ltd Three-dimensional image representation system using an image weight-reduction technique

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3477023B2 (ja) * 1996-04-05 2003-12-10 Matsushita Electric Industrial Co., Ltd. Multi-view image transmission method and multi-view image display method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KIMATA H. ET AL: "System Design of Free Viewpoint Video Communication", THE FOURTH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (CIT '04), 14 September 2004 (2004-09-14), pages 52 - 59, XP003015599 *
OKA S. ET AL.: "Jiyu Shiten Terebi no Tameno Kosen Kukan Joho Asshuku" [Ray-Space Information Compression for Free-Viewpoint Television], INFORMATION PROCESSING SOCIETY OF JAPAN KENKYU HOKOKU, vol. 2003, no. 125, 19 December 2003 (2003-12-19), pages 97 - 102 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009518877A (ja) * 2006-04-04 2009-05-07 Mitsubishi Electric Research Laboratories, Inc. Method and system for acquiring and displaying a three-dimensional light field
JP2011015227A (ja) * 2009-07-02 2011-01-20 Nippon Hoso Kyokai <Nhk> Stereoscopic image generation device and program
WO2011046085A1 (fr) * 2009-10-16 2011-04-21 Sony Corporation Image processing device and image processing method
JP2011087194A (ja) * 2009-10-16 2011-04-28 Sony Corp Image processing device and image processing method
CN102577402A (zh) * 2009-10-16 2012-07-11 Sony Corporation Image processing device and image processing method
JPWO2013038679A1 (ja) * 2011-09-13 2015-03-23 Panasonic Intellectual Property Management Co., Ltd. Encoding device, decoding device, playback device, encoding method, and decoding method
EP2757783A4 (fr) * 2011-09-13 2015-03-18 Panasonic Corp Encoding device, decoding device, playback device, encoding method, and decoding method
WO2013038679A1 (fr) * 2011-09-13 2013-03-21 Panasonic Corporation Encoding device, decoding device, playback device, encoding method, and decoding method
JP2016220236A (ja) * 2011-09-13 2016-12-22 Panasonic Intellectual Property Management Co., Ltd. Encoding device, decoding device, playback device, encoding method, and decoding method
US9661320B2 (en) 2011-09-13 2017-05-23 Panasonic Intellectual Property Management Co., Ltd. Encoding device, decoding device, playback device, encoding method, and decoding method
JP2015530788A (ja) * 2012-07-30 2015-10-15 バーソロミュー ジー ユキック System and method for generating three-dimensional image media
WO2014168121A1 (fr) * 2013-04-11 2014-10-16 Nippon Telegraph And Telephone Corporation Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, and image decoding program
JP5926451B2 (ja) * 2013-04-11 2016-05-25 Nippon Telegraph And Telephone Corporation Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, and image decoding program
US9392248B2 (en) 2013-06-11 2016-07-12 Google Inc. Dynamic POV composite 3D video system
JP2018530963A (ja) * 2015-09-14 2018-10-18 Thomson Licensing Method and apparatus for encoding and decoding light-field-based images, and corresponding computer program product
EP3145191A1 (fr) * 2015-09-17 2017-03-22 Thomson Licensing Method for encoding a light field content
WO2017046272A1 (fr) * 2015-09-17 2017-03-23 Thomson Licensing Method for encoding a light field content
US10880576B2 (en) 2015-09-17 2020-12-29 Interdigital Vc Holdings, Inc. Method for encoding a light field content

Also Published As

Publication number Publication date
JPWO2007013194A1 (ja) 2009-02-05
JP4825983B2 (ja) 2011-11-30

Similar Documents

Publication Publication Date Title
KR100667830B1 (ko) Method and apparatus for encoding multi-view video
Merkle et al. Efficient compression of multi-view video exploiting inter-view dependencies based on H. 264/MPEG4-AVC
WO2007013194A1 (fr) Image information compression method and FTV (free-viewpoint television) system
CN100512431C (zh) Method and apparatus for encoding and decoding stereoscopic video
JP4611386B2 (ja) Scalable encoding and decoding method and apparatus for multi-view video
US9154786B2 (en) Apparatus of predictive coding/decoding using view-temporal reference picture buffers and method using the same
KR100481732B1 (ko) Multi-view video encoding apparatus
KR100636785B1 (ko) Multi-view stereoscopic image system and compression and decompression method applied thereto
KR100728009B1 (ko) Method and apparatus for encoding multi-view video
EP2538675A1 (fr) Apparatus for universal coding for multi-view video
JP2007180981A (ja) Image encoding device, image encoding method, and image encoding program
US20090190662A1 (en) Method and apparatus for encoding and decoding multiview video
KR100738867B1 (ko) Encoding method for multi-view video encoding/decoding system and inter-view corrected disparity estimation method
KR20110057162A (ko) Refined depth map
US20120114036A1 (en) Method and Apparatus for Multiview Video Coding
CN111800653B (zh) Video decoding method, system and device, and computer-readable storage medium
JPWO2009001791A1 (ja) Video encoding and decoding methods, apparatuses therefor, programs therefor, and recording medium storing the programs
US20110268193A1 (en) Encoding and decoding method for single-view video or multi-view video and apparatus thereof
JP4825984B2 (ja) Image information compression method, image information compression device, and free-viewpoint television system
JP6571646B2 (ja) Method and apparatus for decoding multi-view video
CN101990103A (zh) Method and apparatus for multi-view video coding
JP2007180982A (ja) Image decoding device, image decoding method, and image decoding program
Merkle et al. Efficient compression of multi-view depth data based on MVC
Conceicao et al. LF-CAE: Context-adaptive encoding for lenslet light fields using HEVC
Zhang et al. Light field image coding with disparity correlation based prediction

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2007526814

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06711580

Country of ref document: EP

Kind code of ref document: A1