WO2007013194A1

WO2007013194A1 - Image information compression method and free viewpoint television system

Info

Publication number: WO2007013194A1
Application number: PCT/JP2006/300257
Authority: WO
Inventors: Masayuki Tanimoto; Toshiaki Fujii; Kenji Yamamoto
Original assignee: National University Corporation Nagoya University
Priority date: 2005-07-26
Filing date: 2006-01-12
Publication date: 2007-02-01
Also published as: JP4825983B2; JPWO2007013194A1

Abstract

Provided is an image information compression method capable of improving compression efficiency when encoding image information captured by a plurality of cameras, together with an FTV system employing the method. The image information compression method includes: a step of encoding frames FR(#1, n-1) to FR(#1, n+1) and FR(#3, n-1) to FR(#3, n+1) of moving images captured by the odd-numbered cameras #1 and #3; a step of generating a viewpoint interpolation image FRint(#2, n) corresponding to a frame of the moving image captured by the even-numbered camera #2; and a step of, when encoding the image captured by camera #2, selectively outputting whichever encoding result has the higher compression efficiency: the result of encoding with reference to the frames FR(#2, n-1) and FR(#2, n+1) at different times, or the result of encoding with reference to the viewpoint interpolation image FRint(#2, n).

Description

Specification

Image information compression method and free viewpoint television system

Technical field

[0001] The present invention relates to an image information compression method capable of improving the coding compression efficiency of image information acquired by a plurality of cameras, and to a free viewpoint television system to which this method is applied.

Background art

[0002] The inventors of the present application have proposed free viewpoint television (FTV), which allows viewers to change their viewpoint freely and view a 3D scene as if they were on the spot (see, for example, Non-Patent Documents 1 to 4). Furthermore, experimental FTV equipment has been completed in which the viewpoint can be moved freely in the horizontal plane based on images captured by 15 cameras (see, for example, Non-Patent Document 1).

[0003] Non-Patent Document 1: Masayuki Tanimoto, “Free Viewpoint Television”, Nihon Kogyo Publishing, Imaging Lab, February 2005, pp. 23-28

Non-Patent Document 2: Shinya Oka, Nonon Champurim, Toshiaki Fujii, Masayuki Tanimoto, “Light-Space Information Compression for Free Viewpoint Television”, IEICE Technical Report, CS2003-141, pp. 7-12, December 2003

Non-Patent Document 3: Masayuki Tanimoto, "5. Free-viewpoint TV FTV, using multi-viewpoint image processing", Journal of the Institute of Image Information and Media Sciences, Vol. 58, No. 7, pp. 898-901, 2004

Non-Patent Document 4: Shinya Oka, Nonon Champurim, Toshiaki Fujii, Masayuki Tanimoto, “Compression of Dynamic Ray Space for Free Viewpoint Television”, 3D Image Conference 2004, pp. 139-142, 2004

[0004] Note that the left column on page 9 of Non-Patent Document 2 states, “Because the ray space is very similar along both the time axis and the space axis, it is thought that a high compression ratio can be obtained by applying motion (parallax) prediction to both axes.” Non-Patent Document 3 states “interpolate the ray space” in the left column on page 899, and states in the left column on page 900 that it is sufficient to perform interpolation only on the necessary part rather than on the entire ray space. The left column on page 140 of Non-Patent Document 4 states that “dynamic ray space can be expected to have a large correlation in both time and space,” and an example of reference images is shown from the right column on page 140 to the left column on page 141.

[0005] FIG. 1 is a diagram conceptually showing the basic configuration of an FTV system. The FTV system shown in FIG. 1 performs image capture by cameras (step ST1), image interpolation processing (step ST2 or ST2a), image information compression processing (step ST3), and display of an image viewed from the input viewpoint (steps ST4 and ST5). In the FTV system, image information of a subject 101 that exists in three-dimensional real space is acquired by a plurality of cameras (FIG. 1 shows five cameras 102_1 to 102_5, but more cameras are actually used) (step ST1), and the images acquired by the cameras (FIG. 1 shows five images 103_1 to 103_5, but more images are actually used) are arranged in the ray space 103 to form an FTV signal. In FIG. 1, x represents the horizontal viewing direction, y represents the vertical viewing direction, and u (= tan θ) represents the viewing zone direction. The plurality of cameras 102 may be arranged in a linear arrangement, in which the cameras are placed on a straight line facing directions parallel to each other, as shown in FIG. 2(a); a circumferential arrangement (or arc arrangement), in which the cameras are placed on a circumference facing the inside of the circumference, as shown in FIG. 2(b); a planar arrangement, in which the cameras are placed on a plane facing directions parallel to each other, as shown in FIG. 2(c); a cylindrical arrangement, in which the cameras are placed on a cylindrical surface facing the inside of the cylinder, as shown in FIG. 2(d); or a spherical arrangement (or hemispherical arrangement), in which the cameras are placed on a spherical surface facing the inside of the sphere, as shown in FIG. 2(e). When only a horizontal free viewpoint is to be realized, the cameras 102 are arranged in either the linear arrangement of FIG. 2(a) or the circumferential arrangement of FIG. 2(b). When a free viewpoint in both the horizontal and vertical directions is to be realized, the planar arrangement of FIG. 2(c), the cylindrical arrangement of FIG. 2(d), or the spherical arrangement of FIG. 2(e) is used.

[0006] In the ray space method, one ray in three-dimensional real space is represented by one point in a multidimensional space whose coordinates are the parameters describing the ray. This virtual multidimensional space is called the ray space. The ray space as a whole expresses all rays in the three-dimensional space without excess or deficiency. The ray space is created by collecting images taken from many viewpoints. Since the value of a point in the ray space is the same as the pixel value of the image, conversion from images to the ray space is merely a coordinate transformation. As shown in FIG. 3(a), a ray 107 passing through the reference plane 106 in real space can be uniquely expressed by four parameters: the passing position (x, y) and the passing direction (θ, φ). In FIG. 3(a), X is the horizontal coordinate axis in three-dimensional real space, Y is the vertical coordinate axis, and Z is the depth coordinate axis. θ is the horizontal angle with respect to the normal of the reference plane 106, that is, the emission angle in the horizontal direction with respect to the reference plane 106, and φ is the vertical angle with respect to the normal of the reference plane 106, that is, the emission angle in the vertical direction with respect to the reference plane 106. The ray information in this three-dimensional real space can therefore be expressed as luminance f(x, y, θ, φ). Here, to simplify the explanation, the vertical parallax (angle φ) is ignored. As shown in FIG. 3(a), images taken by a number of cameras placed horizontally facing the reference plane 106 are located, in the three-dimensional space with axes x, y, and u (= tan θ), on the cross sections 103_1 to 103_5 drawn with dotted lines, as shown in FIG. 3(b). By cutting out an arbitrary plane from the ray space 103 shown in FIG. 3(b), an image viewed from an arbitrary viewpoint in the horizontal direction in real space can be generated. For example, when the cross section 103a is cut out from the ray space 103 shown in FIG. 4(a), an image as shown in FIG. 4(b) is displayed on the display 105, and when the cross section 103b is cut out from the ray space 103 shown in FIG. 4(a), an image as shown in FIG. 4(c) is displayed on the display 105.
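The cutting-out operation described above can be illustrated with a minimal Python sketch. It is not taken from the specification: it assumes a single scanline (the y axis is dropped), stores the ray space as a hypothetical dictionary that maps each camera's u = tan θ to one row of pixel values, and uses nearest-neighbour lookup in u instead of interpolation. The names `build_ray_space` and `cut_section` are illustrative only.

```python
import math

def build_ray_space(rows, angles_deg):
    """Map u = tan(theta) of each camera to its scanline (assumed layout)."""
    return {math.tan(math.radians(a)): row for a, row in zip(angles_deg, rows)}

def cut_section(ray_space, u_of_x):
    """Cut a section u = u_of_x(x) out of the (x, u) plane.

    For every x the stored section whose u is closest to the requested
    value is used (nearest neighbour; no interpolation is performed).
    """
    us = sorted(ray_space)
    width = len(ray_space[us[0]])
    out = []
    for x in range(width):
        target = u_of_x(x)
        nearest = min(us, key=lambda u: abs(u - target))
        out.append(ray_space[nearest][x])
    return out

# Three cameras at -45, 0 and +45 degrees, three pixels per scanline.
space = build_ray_space([[10, 11, 12], [20, 21, 22], [30, 31, 32]], [-45, 0, 45])
flat_view = cut_section(space, lambda x: 0.0)        # section at constant u
slanted_view = cut_section(space, lambda x: x - 1.0)  # section slanted in (x, u)
```

A constant u reproduces one captured image, while a slanted section mixes pixels from several cameras, which is the sense in which different sections correspond to different viewpoints.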

[0007] In addition, there is no data between the images (cross sections 103_1 to 103_5) arranged in the ray space 103, so this data is created by interpolation (step ST2 or ST2a in FIG. 1). Note that the interpolation need only be performed for the necessary part of the ray space, not for the whole of it. The interpolation is performed on the image information transmission side (step ST2) for applications such as VOD (Video On Demand), and on the image information reception side (step ST2a) for applications such as broadcasting.

[0008] Compression of image information (step ST3 in FIG. 1) is not indispensable when the components of the FTV system are all in the same location, but it is indispensable when the cameras and the user are in different locations and image information is distributed over the Internet. One conventional image information compression method is, for example, a method compliant with the H.264/AVC standard (see, for example, Patent Document 1).

Patent Document 1: Japanese Patent Laid-Open No. 2003-348595 (FIGS. 1 and 2)

Disclosure of the invention

Problems to be solved by the invention

[0009] However, the amount of image information distributed in an FTV system is larger than that in a conventional TV system in proportion to the number of cameras. Conventional image information compression methods alone therefore provide insufficient compression efficiency, and a more efficient image information compression method is essential for putting an FTV system that involves transmission of image information into practical use.

[0010] The present invention has been made to solve the above-described problems of the prior art, and its object is to provide an image information compression method capable of improving the coding compression efficiency in encoding image information acquired by a plurality of cameras, and an FTV system to which this method is applied.

Means for solving the problem

[0011] The image information compression method of the present invention includes:

a step of encoding image information of frames arranged in the time axis direction of moving images acquired by two or more cameras selected from among three or more cameras, using intra-frame coding and inter-frame predictive coding that uses temporal correlation between frames;

a step of generating, based on the image information acquired by the selected cameras, a first viewpoint interpolated image corresponding to a frame arranged in the time axis direction of a moving image acquired by a camera other than the selected cameras; and

a step of encoding image information of frames arranged in the time axis direction of the moving image acquired by the camera other than the selected cameras,

wherein the step of encoding the image information of the frames of the moving image acquired by the camera other than the selected cameras includes a step of selectively outputting the encoding result with the highest coding compression efficiency from between the case where encoding is performed with reference to image information of a frame that was acquired by the camera other than the selected cameras and is at a time different from that of the encoding target frame, and the case where encoding is performed with reference to the first viewpoint interpolated image corresponding to the encoding target frame.

[0012] Another image information compression method of the present invention includes:

a step of encoding image information of frames arranged in the time axis direction of moving images acquired by a plurality of cameras, using intra-frame coding and inter-frame predictive coding that uses temporal correlation between frames; and

a step of encoding image information of frames of the moving images acquired by the plurality of cameras at the same time, arranged in the order of arrangement of the cameras, using inter-frame predictive coding that uses correlation between frames at the same time and employs the same algorithm as the inter-frame predictive coding that uses the temporal correlation.
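The idea of applying one inter-frame prediction algorithm along both the time axis and the camera axis can be sketched as follows. This is a minimal illustration, not the method of the specification: it assumes exhaustive block matching with a sum of absolute differences (SAD) cost, frames stored as lists of pixel rows, and a hypothetical container `frames[camera][time]`. The function names are illustrative.

```python
def get_block(frame, top, left, size):
    """Return the size x size block whose upper-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_vector(target, reference, top, left, size, search):
    """Exhaustive block matching; returns ((dy, dx), cost) of the best match."""
    current = get_block(target, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would leave the reference frame
            cost = sad(current, get_block(reference, t, l, size))
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best

# The same routine serves both axes; only the reference frame changes:
#   temporal prediction: best_vector(frames[cam][t], frames[cam][t - 1], ...)
#   spatial prediction:  best_vector(frames[cam][t], frames[cam - 1][t], ...)
reference = [[0, 0, 0, 0, 0, 0],
             [0, 9, 8, 0, 0, 0],
             [0, 7, 6, 0, 0, 0],
             [0, 0, 0, 0, 0, 0]]
target = [[0, 0, 0, 0, 0, 0],
          [0, 0, 9, 8, 0, 0],
          [0, 0, 7, 6, 0, 0],
          [0, 0, 0, 0, 0, 0]]
vector, cost = best_vector(target, reference, top=1, left=2, size=2, search=1)
```

When the reference is a neighbouring camera's frame at the same time, the vector found this way plays the role of a parallax vector rather than a motion vector.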

[0013] Further, the FTV system of the present invention includes:

An image information encoding apparatus for executing the image information compression method;

A plurality of cameras for supplying video signals to the image information encoding device;

An image information decoding device for decoding the encoded information output from the image information encoding device;

A user interface for inputting the viewpoint position of the viewer;

and an image information extracting unit that extracts an image viewed from the viewpoint input through the user interface from the images of the same time taken by the plurality of cameras.

The invention's effect

[0014] According to the image information compression method and the FTV system of the present invention, a frame of a moving image acquired by a plurality of cameras is encoded by inter-frame prediction encoding using correlation between frames at the same time. As a result, it is possible to obtain the effect that the coding compression efficiency can be improved.

[0015] According to another image information compression method and FTV system of the present invention, image information of frames arranged in the time axis direction of moving images acquired by two or more selected cameras is encoded; a first viewpoint interpolated image corresponding to a frame of the moving image acquired by a camera other than the selected cameras is generated; and, from between the case where encoding is performed with reference to image information of a frame that was acquired by the camera other than the selected cameras and is at a time different from that of the encoding target frame and the case where encoding is performed with reference to the first viewpoint interpolated image, the encoding result with the highest coding compression efficiency is selectively output. The coding compression efficiency of the output image information can therefore be improved.

Brief Description of Drawings

FIG. 1 is a diagram conceptually showing the basic configuration of an FTV system.

[Fig. 2] (a) to (e) are diagrams showing examples of the arrangement of multiple cameras: (a) is a linear arrangement, (b) is a circumferential arrangement, (c) is a planar arrangement, (d) is a cylindrical arrangement, and (e) is a spherical arrangement.

[Fig. 3] (a) is a diagram showing an object in real space, linearly arranged cameras, a reference plane, and rays, and (b) is a diagram showing the ray space.

[Fig. 4] (a) is a diagram showing a ray space, (b) is a diagram showing an image cut out from the ray space, and (c) is a diagram showing another image cut out from the ray space.

FIG. 5 is a block diagram schematically showing a configuration of an image information encoding device capable of implementing the image information compression method of the present invention.

FIG. 6 is a diagram conceptually showing that frames of moving images taken by a plurality of cameras are arranged in the time axis direction, and frames at the same time are arranged in the order of camera arrangement.

FIG. 7 is a flowchart showing an operation of the image information encoding device shown in FIG. 5.

FIG. 8 is a flowchart showing an example of the operation of the interpolated image generation / compensation step shown in FIG. 7.

FIG. 9 is a flowchart showing an example of the operation of the selection step shown in FIG. 7.

FIG. 10 is a block diagram schematically showing a configuration of an image information decoding apparatus capable of decoding image information encoded by the image information compression method of the present invention.

FIG. 11 is a flowchart showing an operation of the image information decoding apparatus shown in FIG.

FIG. 12 is a flowchart showing an example of the operation of the interpolated image generation / compensation step shown in FIG.

FIG. 13 is an explanatory diagram (part 1) of the image information compression method according to the first embodiment of the present invention.

FIG. 14 is an explanatory diagram (part 2) of the image information compression method according to the first embodiment of the present invention.

FIG. 15 is an explanatory diagram (part 1) of the image information compression method according to the second embodiment of the present invention.

FIG. 16 is an explanatory diagram (part 2) of the image information compression method according to the second embodiment of the present invention.

FIG. 17 is an explanatory diagram (part 3) of the image information compression method according to the second embodiment of the present invention.

FIG. 18 is an explanatory diagram (part 4) of the image information compression method according to the second embodiment of the present invention.

FIG. 19 is an explanatory diagram of an image information compression method according to the third embodiment of the present invention.

FIG. 20 is an explanatory diagram of an image information compression method according to the fourth embodiment of the present invention.

FIG. 21 is an explanatory diagram (part 1) of the image information compression method according to the fifth embodiment of the present invention.

FIG. 22 is an explanatory diagram (part 2) of the image information compression method according to the fifth embodiment of the present invention.

FIG. 23 is an explanatory diagram (part 3) of the image information compression method according to the fifth embodiment of the present invention.

FIG. 24 is an explanatory diagram (part 4) of the image information compression method according to the fifth embodiment of the present invention.

FIG. 25 is an explanatory diagram (part 5) of the image information compression method according to the fifth embodiment of the present invention.

FIG. 26 is an explanatory diagram (part 6) of the image information compression method according to the fifth embodiment of the present invention.

FIG. 27 is a diagram showing an example of a horizontal section of a ray space in an image information compression method according to a sixth embodiment of the present invention.

FIG. 28 is an explanatory diagram of a motion vector prediction method in the image information compression method according to the sixth embodiment of the present invention.

FIG. 29 is an explanatory diagram of a motion vector prediction method in H.264/AVC as a comparative example of the sixth embodiment of the present invention.

[FIG. 30] (a) and (b) are explanatory diagrams showing the relationship between a point in real space and a straight line in a horizontal section of the ray space.

FIG. 31 is a diagram conceptually showing the basic structure of an FTV system in a seventh embodiment of the present invention.

Explanation of symbols

101 Subject (object)

102, 102_1 to 102_5 Camera

103 Ray space

103_1 to 103_5 Live-action image

103a, 103b Vertical section of ray space

104 User interface

105 Display

106 Reference plane

107 Ray

200 Image information encoding device

201_1 to 201_N Input terminal

202_1 to 202_N A/D conversion unit

203 Screen rearrangement buffer

204 Adder

205 Orthogonal transform unit

206 Quantization unit

207 Variable encoding unit

208 Accumulation buffer

209 Output terminal

210 Rate control unit

211 Inverse quantization unit

212 Inverse orthogonal transform unit

213 Multi-camera frame memory

214 Encoding processing unit

215 Motion prediction/compensation unit

216 Interpolated image generation/compensation unit

217 Selection unit

250 FTV system transmitter equipment

300 Image information decoding device

301 Input terminal

302 Accumulation buffer

303 Variable decoding unit

304 Inverse quantization unit

305 Inverse orthogonal transform unit

306 Adder

307 Screen rearrangement buffer

308_1 to 308_N D/A conversion unit

309_1 to 309_N Output terminal

310 Multi-camera frame memory

311 Decoding processing unit

312 Motion prediction/compensation unit

313 Interpolated image generation/compensation unit

314 Selection unit

350 FTV system receiver equipment

351 Image information extraction unit

#1, #2, #3, ..., #n, #n+1, ... Camera number

FR Frame (image)

FR(#1, n-1) Frame at t = n-1 obtained by camera #1

FR(#1, n) Frame at t = n obtained by camera #1

FR(#1, n+1) Frame at t = n+1 obtained by camera #1

FR(#2, n-1) Frame at t = n-1 obtained by camera #2

FR(#2, n) Frame at t = n obtained by camera #2

FR(#2, n+1) Frame at t = n+1 obtained by camera #2

FR(#3, n-1) Frame at t = n-1 obtained by camera #3

FR(#3, n) Frame at t = n obtained by camera #3

FR(#3, n+1) Frame at t = n+1 obtained by camera #3

FRint(#2, n) Viewpoint interpolation image corresponding to frame FR(#2, n)

FRint1(#2, n) Viewpoint interpolation image corresponding to frame FR(#2, n)

FRint2(#2, n) Viewpoint interpolation image corresponding to frame FR(#2, n)

t Time axis

GOP Group of Pictures (image group in the time axis t direction consisting of a predetermined number of frames)

G_S Image group in the spatial axis S direction consisting of a plurality of frames at the same time

I Intra-coded frame (I picture)

P Inter-frame predictive coded frame (P picture)

B Inter-frame bi-directional predictive coded frame (B picture)

BEST MODE FOR CARRYING OUT THE INVENTION

FIG. 5 is a block diagram schematically showing the configuration of an image information encoding device 200 capable of implementing the image information compression method of the present invention.

As shown in FIG. 5, the image information encoding device 200 includes N input terminals 201_1 to 201_N (N is an integer of 2 or more), N A/D conversion units 202_1 to 202_N, a screen rearrangement buffer 203, an adder 204, an orthogonal transform unit 205, a quantization unit 206, a variable encoding unit 207, an accumulation buffer 208, an output terminal 209, and a rate control unit 210. The image information encoding device 200 also includes an inverse quantization unit 211, an inverse orthogonal transform unit 212, a multi-camera frame memory 213, a motion prediction/compensation unit 215, an interpolated image generation/compensation unit 216, and a selection unit 217 that selectively outputs one of the output signals of the motion prediction/compensation unit 215 and the interpolated image generation/compensation unit 216. The motion prediction/compensation unit 215, the interpolated image generation/compensation unit 216, and the selection unit 217 constitute an encoding processing unit 214 that carries out the image information compression method of the present invention. The image information encoding device 200 shown in FIG. 5 differs from the conventional image information encoding device disclosed in Patent Document 1 in that it can receive image information from a plurality of cameras and in that it includes the encoding processing unit 214 capable of implementing the image information compression method of the present invention.

[0020] Analog video signals acquired by N cameras whose positions and shooting directions are known are input to the input terminals 201_1 to 201_N of the image information encoding device 200, respectively. The N cameras usually have the same performance, such as resolution, and are regularly arranged as shown, for example, in FIGS. 2(a) to 2(e). In an actual FTV system, the number of cameras is usually in the tens, hundreds, or more. The camera arrangement is not limited to those shown in FIGS. 2(a) to 2(e). The analog video signals input to the input terminals 201_1 to 201_N are converted into digital video signals by the A/D conversion units 202_1 to 202_N, respectively, and held in the screen rearrangement buffer 203. As a modification, when digital video signals are input to the input terminals 201_1 to 201_N, the A/D conversion units 202_1 to 202_N are not necessary.

[0021] FIG. 6 is a diagram conceptually showing that the frames (also referred to as “images”) FR of the moving images taken by a plurality of cameras #1 to #5 are arranged in the time axis t direction, and that the frames at the same time acquired by cameras #1 to #5 are arranged in the spatial axis S direction in the order of arrangement of the cameras. As shown in FIG. 6, the frames FR of the moving image taken by each of the cameras #1 to #5 constitute a GOP (Group of Pictures), an image group consisting of a predetermined number of frames arranged in time series in the time axis t direction. Also as shown in FIG. 6, the frames of the moving images taken at the same time by the cameras #1 to #5, that is, the frames at the same time, form an image group G_S consisting of a predetermined number of same-time frames arranged in the spatial axis S direction (the horizontal direction in FIG. 6), which is the order of arrangement of the cameras.
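The two ways of grouping the frames FR can be made concrete with a short Python sketch. It is only an illustration: a frame is identified by a hypothetical (camera number, time) pair in the manner of FR(#2, n), and the function names are not taken from the specification.

```python
def time_axis_gop(camera, start, gop_size):
    """Frames of one camera arranged along the time axis t (a GOP)."""
    return [(camera, t) for t in range(start, start + gop_size)]

def spatial_axis_group(time, cameras):
    """Frames taken at the same time, in camera arrangement order (G_S)."""
    return [(camera, time) for camera in cameras]

cameras = [1, 2, 3, 4, 5]
gop_of_camera_2 = time_axis_gop(camera=2, start=0, gop_size=3)
group_at_t1 = spatial_axis_group(time=1, cameras=cameras)
# The frame (2, 1), i.e. FR(#2, 1), belongs to both groups.
shared = [f for f in gop_of_camera_2 if f in group_at_t1]
```

Each frame therefore has neighbours in two directions, which is what allows prediction along either the time axis or the spatial axis.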

[0022] The screen rearrangement buffer 203 of the image information encoding device 200 rearranges frames according to the GOP structure of the supplied image information. For an image on which intra-frame coding (intra coding) is performed, the screen rearrangement buffer 203 supplies the image information of the entire frame to the orthogonal transform unit 205. The orthogonal transform unit 205 applies an orthogonal transform such as the discrete cosine transform to the image information and supplies the transform coefficients to the quantization unit 206. The quantization unit 206 quantizes the transform coefficients supplied from the orthogonal transform unit 205.

[0023] The variable encoding unit 207 determines the encoding mode from the quantized transform coefficients, the quantization scale, and the like supplied from the quantization unit 206, and applies variable encoding such as variable-length coding or arithmetic coding to this encoding mode to form information to be inserted into the header portion of each image coding unit. The variable encoding unit 207 then supplies the encoded encoding mode to the accumulation buffer 208 for accumulation. The encoded encoding mode is output from the output terminal 209 as image compression information. The variable encoding unit 207 also applies variable encoding such as variable-length coding or arithmetic coding to the quantized transform coefficients, and supplies the encoded transform coefficients to the accumulation buffer 208 for accumulation. The encoded transform coefficients are output from the output terminal 209 as image compression information.

[0024] The behavior of the quantization unit 206 is controlled by the rate control unit 210 based on the data amount of transform coefficients accumulated in the accumulation buffer 208. Further, the quantization unit 206 supplies the quantized transform coefficients to the inverse quantization unit 211, and the inverse quantization unit 211 inversely quantizes the quantized transform coefficients. The inverse orthogonal transform unit 212 applies inverse orthogonal transform processing to the inversely quantized transform coefficients to generate decoded image information, and supplies this information to the multi-camera frame memory 213 for accumulation.
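The transform, quantization, and local decoding path described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the processing of any particular standard: it uses a one-dimensional orthonormal DCT-II on a four-sample row, a single uniform quantization step, and hypothetical function names.

```python
import math

def dct_1d(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        s = sum(samples[i] * math.cos(math.pi * (i + 0.5) * k / n)
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * s)
    return coeffs

def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1 / n)
        s += sum(coeffs[k] * math.sqrt(2 / n)
                 * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        samples.append(s)
    return samples

def quantize(coeffs, step):
    """Uniform quantization: the step size is what rate control would adjust."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Inverse quantization used on the local decoding path."""
    return [level * step for level in levels]

row = [10, 10, 10, 10]
levels = quantize(dct_1d(row), step=4)            # sent to variable encoding
decoded = idct_1d(dequantize(levels, step=4))     # stored as reference data
```

A larger step gives fewer distinct levels and hence less data, at the cost of a larger difference between `decoded` and `row`; the encoder keeps the decoded version so that it predicts from the same reference data as a decoder would.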

[0025] In addition, for an image on which inter-frame predictive coding (inter coding) is performed, the screen rearrangement buffer 203 supplies the image information to the encoding processing unit 214. The encoding processing unit 214 encodes the image information using the image information compression methods of the first to sixth embodiments of the present invention described later. The encoding processing unit 214 supplies the generated reference image information to the adder 204, and the adder 204 converts it into a difference signal between the reference image information and the corresponding image information. At the same time, the encoding processing unit 214 supplies motion vector information to the variable encoding unit 207.

[0026] The variable encoding unit 207 determines the encoding mode based on the quantized transform coefficients and quantization scale from the quantization unit 206, the motion vector information supplied from the encoding processing unit 214, and the like, applies variable encoding such as variable-length coding or arithmetic coding to the determined encoding mode, and generates information to be inserted into the header portion of each image coding unit. The variable encoding unit 207 then supplies the encoded encoding mode to the accumulation buffer 208 for accumulation. The encoded encoding mode is output as image compression information.

[0027] Further, the variable encoding unit 207 applies variable encoding such as variable-length coding or arithmetic coding to the motion vector information and generates information to be inserted into the header portion of each image coding unit. Unlike intra coding, in inter coding the image information input to the orthogonal transform unit 205 is the difference signal obtained from the adder 204. The other processes are the same as in image compression by intra coding.

[0028] FIG. 7 is a flowchart showing the encoding process of the image information encoding device 200 shown in FIG. 5. As shown in FIG. 7, the image information encoding device 200 performs A/D conversion of the input analog video signals by the A/D conversion units 202_1 to 202_N (step ST11), rearranges the screens by the screen rearrangement buffer 203 (step ST12), and then performs motion prediction/compensation by the motion prediction/compensation unit 215 (step ST21), interpolated image generation/compensation by the interpolated image generation/compensation unit 216 (step ST22), and selection by the selection unit 217 of whether to perform encoding with reference to the interpolated image or encoding by motion prediction/compensation (step ST23). However, when conventional compression coding of image information (for example, processing conforming to the H.264/AVC standard) is performed, and in the case of the first embodiment described later, interpolated image generation/compensation by the interpolated image generation/compensation unit 216 is not necessary.

[0029] Thereafter, the generated image information is orthogonally transformed by the orthogonal transform unit 205 (step ST24), quantization and quantization rate control are performed by the quantization unit 206 and the rate control unit 210 (steps ST25 and ST26), variable encoding is performed by the variable encoding unit 207 (step ST27), inverse quantization is performed by the inverse quantization unit 211 (step ST28), and inverse orthogonal transform is performed by the inverse orthogonal transform unit 212 (step ST29). Steps ST21 to ST29 are performed on all blocks, each having a predetermined number of pixels, in the frame, and steps ST11 and ST12 and steps ST21 to ST29 for all blocks are performed on all frames.

FIG. 8 is a flowchart showing an example of the operation of the interpolated image generation/compensation step ST22. In interpolated image generation/compensation, depth estimation is performed at each pixel in the block to generate an interpolated pixel (for example, pixel value 0 to 255), an evaluation value $E$ is calculated based on the pixel value of the generated interpolated pixel, and the minimum value $E_{min}$ of the evaluation value $E$ within the depth range is obtained (steps ST221 to ST223). Here, when the pixel value of the generated interpolated pixel is denoted $I_{int}(i, j)$ and its depth $D_{int}(i, j)$, where $(i, j)$ indicates the position on the image, and the pixel value of the image to be encoded is denoted $I_{en}(i, j)$, the evaluation value $E$ can be, for example,

$$E = \mathrm{abs}(I_{int}(i, j) - I_{en}(i, j))$$

where abs( ) indicates the absolute value of the expression in parentheses. Alternatively, the evaluation value $E$ may be defined as

$$E = \mathrm{abs}(I_{int}(i, j) - I_{en}(i, j)) + \mathrm{abs}(D_{int}(i, j) - D_{int}(i-1, j))$$

In the present invention, the evaluation value $E$ is not limited to the above definitions, and other definitions can be adopted.
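The per-pixel search of steps ST221 to ST223 can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how an interpolated pixel is produced for a given depth, so `candidates` is a hypothetical dictionary mapping each candidate depth to the pixel value it would yield, and the second definition of the evaluation value is assumed to be the sum of the pixel term and the depth-difference term.

```python
def evaluation(interp_pixel, target_pixel, depth=None, neighbor_depth=None):
    """Evaluation value E for one candidate depth.

    First definition: abs(I_int - I_en).
    Second definition (assumed to be a sum): adds abs(D(i, j) - D(i-1, j)).
    """
    e = abs(interp_pixel - target_pixel)
    if depth is not None and neighbor_depth is not None:
        e += abs(depth - neighbor_depth)
    return e


def best_depth(candidates, target_pixel, neighbor_depth=None):
    """Return (depth, interpolated_pixel, E_min) over the depth range.

    candidates: hypothetical dict {depth: interpolated pixel value 0..255}.
    neighbor_depth: depth already chosen at (i-1, j); None selects the
    first definition of E.
    """
    best = None
    for depth, interp_pixel in candidates.items():
        e = evaluation(interp_pixel, target_pixel, depth, neighbor_depth)
        if best is None or e < best[2]:
            best = (depth, interp_pixel, e)
    return best
```

With the second definition, a candidate whose pixel value matches slightly worse can still win if its depth is closer to that of the neighboring pixel.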

[0031] Next, an interpolated pixel is generated using the depth that gives the minimum value $E_{min}$ (step ST224). The processing in steps ST221 to ST224 is performed on all the pixels in the block, and an evaluation value $E_{int}$, an index indicating how similar the estimated block generated from the interpolated pixels is to the actual block, is calculated (step ST225). Here, if the set $S_{est}$ of estimated pixels in the block is $I_{int}(i, j)$, $a < i < b$, $c < j < d$, and the set $T_{en}$ of pixels of the image to be encoded is $I_{en}(i, j)$, $a < i < b$, $c < j < d$, the evaluation value $E_{int}$ can be defined as

$$E_{int} = \sum \{\mathrm{abs}(I_{int}(i, j) - I_{en}(i, j))\}, \quad a < i < b,\ c < j < d$$

Alternatively, the evaluation value $E_{int}$ can be defined, for example, as

$$E_{int} = \sum \{\mathrm{abs}(I_{int}(i, j) - I_{en}(i, j)) \cdot \mathrm{abs}(I_{int}(i, j) - I_{en}(i, j))\}, \quad a < i < b,\ c < j < d$$

Here, $a$, $b$, $c$, and $d$ are values indicating the block range. Note that the interpolation method described above is merely an example; any interpolation method may be used in the present invention, and the apparatus may be configured so that its manufacturer or user can freely select one from known frame interpolation methods.
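The two block-level evaluation values of step ST225 amount to a sum of absolute differences and a sum of squared differences over the block. A minimal sketch, assuming that blocks are given as lists of rows of integer pixel values (a layout chosen here for illustration only):

```python
def block_sad(est_block, target_block):
    """Sum over the block of abs(I_int - I_en)."""
    return sum(
        abs(e - t)
        for est_row, tgt_row in zip(est_block, target_block)
        for e, t in zip(est_row, tgt_row)
    )


def block_ssd(est_block, target_block):
    """Sum over the block of abs(I_int - I_en) * abs(I_int - I_en)."""
    return sum(
        abs(e - t) * abs(e - t)
        for est_row, tgt_row in zip(est_block, target_block)
        for e, t in zip(est_row, tgt_row)
    )
```

The squared form penalizes a few large pixel errors more heavily than many small ones, which may matter when the interpolated block is wrong only in a small region.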

FIG. 9 is a flowchart showing an example of the operation of the selection step ST23, in which either the interpolated image or motion prediction compensation is selected. As shown in FIG. 9, in this selection step, the evaluation value $E_{int}$ for the interpolated image is calculated; when $E_{int}$ is larger than the evaluation value $E_{mot}$ obtained when motion prediction compensation is adopted, motion prediction compensation is adopted, and when $E_{int}$ is equal to or less than $E_{mot}$, the interpolated image is selected (steps ST231 to ST233). However, when conventional image information compression encoding is performed (for example, processing conforming to the H.264/AVC standard), or when the image information compression method of the first embodiment described later is performed, the image information encoded by motion prediction compensation is selected.
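The selection rule of steps ST231 to ST233 can be written in a few lines. In this sketch the function name, the string labels, and the `use_interpolation` flag (standing for the conventional and first-embodiment cases, where only motion prediction compensation is used) are illustrative names, not terms from the specification.

```python
def select_prediction(e_int, e_mot, use_interpolation=True):
    """Choose the prediction for one block.

    e_int: evaluation value for the interpolated image.
    e_mot: evaluation value for motion prediction compensation.
    Motion prediction compensation is chosen only when e_int is strictly
    larger than e_mot; a tie goes to the interpolated image.
    """
    if not use_interpolation:
        return "motion"
    return "motion" if e_int > e_mot else "interpolated"
```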

FIG. 10 is a block diagram schematically showing a configuration of an image information decoding device 300 corresponding to the image information encoding device 200.

As shown in FIG. 10, the image information decoding apparatus 300 includes an input terminal 301, a storage buffer 302, a variable decoding unit 303, an inverse quantization unit 304, an inverse orthogonal transform unit 305, an adder 306, a screen rearrangement buffer 307, N D/A converters 308_1 to 308_N, and N output terminals 309_1 to 309_N. In addition, the image information decoding apparatus 300 includes a multi-camera frame memory 310, a motion prediction/compensation unit 312, an interpolated image generation/compensation unit 313, and a selection unit 314 that selectively outputs one of the outputs of the motion prediction/compensation unit 312 and the interpolated image generation/compensation unit 313. The motion prediction/compensation unit 312, the interpolated image generation/compensation unit 313, and the selection unit 314 constitute a decoding processing unit 311 that performs image information decoding. The image information decoding apparatus 300 shown in FIG. 10 differs from the image information decoding apparatus disclosed in Patent Document 1 in that it includes the decoding processing unit 311, which can decode image information encoded by the image information compression method of the present invention, and in that it can output a plurality of analog video signals corresponding to the image information of a plurality of cameras. As a modification, when digital signals are output from the N output terminals 309_1 to 309_N, the N D/A converters 308_1 to 308_N are not necessary.

In the image information decoding apparatus 300 shown in FIG. 10, the image compression information input from the input terminal 301 is temporarily stored in the storage buffer 302 and then transferred to the variable decoding unit 303. The variable decoding unit 303 performs processing such as variable length decoding or arithmetic decoding on the image compression information based on the determined format of the image compression information, acquires the encoding mode information stored in the header, and supplies it to the inverse quantization unit 304 and other units. Similarly, the variable decoding unit 303 acquires the quantized transform coefficients and supplies them to the inverse quantization unit 304. Further, if the frame to be decoded has been inter-coded, the variable decoding unit 303 also decodes the motion vector information stored in the header of the image compression information and supplies that information to the decoding processing unit 311.

[0036] The inverse quantization unit 304 inverse-quantizes the quantized transform coefficients supplied from the variable decoding unit 303 and supplies the transform coefficients to the inverse orthogonal transform unit 305. The inverse orthogonal transform unit 305 performs an inverse orthogonal transform, such as an inverse discrete cosine transform, on the transform coefficients based on the determined format of the image compression information. Here, if the target frame has been intra-coded, the image information subjected to the inverse orthogonal transform is stored in the screen rearrangement buffer 307 and, after D/A conversion in the D/A converters 308_1 to 308_N, is output from the output terminals 309_1 to 309_N.

[0037] If the target frame has been inter-coded, the decoding processing unit 311 generates a reference image based on the motion vector information that has undergone variable decoding and the image information stored in the multi-camera frame memory 310, and supplies it to the adder 306. The adder 306 combines the reference image with the output from the inverse orthogonal transform unit 305. The other processing is the same as for an intra-coded frame.

FIG. 11 is a flowchart showing the decoding processing of the image information decoding apparatus 300 shown in FIG. 10. As shown in FIG. 11, the image information decoding apparatus 300 performs variable decoding (step ST31), inverse quantization (step ST32), and inverse orthogonal transform (step ST33) on the input signal. Then, if the image information was compensated using motion prediction compensation, decoding is performed using motion prediction compensation (steps ST34 and ST35), and if it was compensated using an interpolated image, decoding is performed using the interpolated image (steps ST36 and ST37). The processing of steps ST31 to ST37 is performed for all blocks, and this per-block processing is in turn performed for all frames. Thereafter, screen rearrangement (step ST41) and D/A conversion (step ST42) are performed based on the obtained decoded data.
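The per-block branch of FIG. 11 (steps ST34 to ST37), together with the addition performed by the adder 306, can be sketched as follows. This is a minimal illustration rather than the apparatus itself: the mode labels, the `predictors` dictionary of callables, and the list-of-rows block layout are all hypothetical, and `residual` is assumed to be the already inverse-transformed data for the block.

```python
def decode_block(residual, mode, predictors):
    """Reconstruct one block from its residual and the signalled mode.

    mode: "motion" or "interpolated" (hypothetical labels).
    predictors: dict mapping each mode to a zero-argument callable that
    returns the predicted block, so only the chosen prediction is computed.
    """
    if mode not in predictors:
        raise ValueError("unknown prediction mode: %r" % (mode,))
    prediction = predictors[mode]()
    # Element-wise sum of prediction and residual (the role of the adder).
    return [
        [p + r for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

Passing callables instead of precomputed blocks mirrors the flowchart, in which the interpolated image is generated only for blocks that were encoded with it.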

FIG. 12 is a flowchart showing an example of the operation of the interpolated image generation/compensation step ST37 shown in FIG. 11. The processing in steps ST371 to ST374 in FIG. 12 is the same as the processing in steps ST221 to ST224 in FIG. 8. When the interpolated image is generated, the depth is estimated at each pixel in the block to generate an interpolated pixel (for example, pixel value 0 to 255), an evaluation value $E$ based on the pixel value of the generated interpolated pixel is calculated, and the minimum value $E_{min}$ of the evaluation value $E$ within the depth range is obtained (steps ST371 to ST373). An interpolated pixel is then generated using the depth that gives the minimum value $E_{min}$ (step ST374). The processing in steps ST371 to ST374 is performed on all the pixels in the block.

[0040] The image information encoding apparatus 200, which can perform the image information compression method of the present invention, and the image information decoding apparatus 300, which can decode image information encoded by that method, have been described above as examples. However, the image information encoding apparatus 200 and the image information decoding apparatus 300 that can implement the image information compression method of the present invention are not limited to those having the configurations described above, and the method can also be applied to apparatuses having other configurations. Next, embodiments of the image information compression method of the present invention and the FTV system to which the method is applied are described.

The image information compression method according to the first embodiment of the present invention will be described below. The image information compression method of the first embodiment applies inter-view prediction encoding, described later, and is executed by, for example, the multi-camera frame memory 213 and the motion prediction/compensation unit 215 of the encoding processing unit 214 shown in FIG. 5.

FIGS. 13 and 14 are explanatory diagrams (parts 1 and 2) of the image information compression method according to the first embodiment of the present invention. In FIGS. 13 and 14, t represents the time axis, and S represents the spatial axis in the camera arrangement order or camera arrangement direction. In FIGS. 13 and 14, #1 to #7 indicate camera numbers assigned in the order of camera arrangement. However, in the first embodiment, the number of cameras may differ from the number shown as long as it is two or more. Further, the cameras may be arranged in any of the arrangements of FIGS. 2(a) to 2(e) or in other arrangements. In FIGS. 13 and 14, I denotes an intra-frame coded frame (I picture), P denotes an inter-frame predictive coded frame (P picture), and B denotes an inter-frame bidirectionally predictive coded frame (B picture). In FIGS. 13 and 14, the frames arranged in the spatial axis S direction are frames at the same time. A predetermined number of frames arranged in the time axis t direction constitute a GOP, which is an image group composed of a predetermined number of frames. For example, for camera #1, a GOP is composed of a predetermined number of pictures I, B, B, P, B, B, P, ....

In the image information compression method of the first embodiment, first, as shown in FIG. 13, the image information of frames arranged in the time axis t direction of the moving images acquired by the plurality of cameras is encoded by intra-frame coding (intra coding) and by inter-frame predictive coding (inter coding) using temporal correlation between frames. The inter-frame predictive coding using temporal correlation is, for example, an encoding method based on the H.264/AVC standard. However, it is not limited to this method, and other encoding methods may be adopted. As a result of the encoding, moving image frames, that is, encoded images such as those shown in FIG. 13, are obtained. The temporally first frame in a GOP composed of a predetermined number of frames arranged in the time axis t direction is intra-frame coded and is an I picture. Frames other than the first frame in the same GOP are encoded by inter-frame predictive coding using temporal correlation, and the resulting encoded images are P pictures or B pictures.

[0044] Next, the image information of frames of the moving images acquired by the plurality of cameras that are at the same time and arranged in the spatial axis S direction in the order of camera arrangement is encoded by inter-frame predictive coding using the correlation between frames at the same time, based on the same algorithm as the inter-frame predictive coding using temporal correlation. The inter-frame predictive coding using the correlation between frames at the same time is executed in units of image groups (G_S shown in FIG. 6), each composed of a predetermined number of frames at the same time arranged in the spatial axis S direction. Because this inter-frame predictive coding uses the correlation between frames acquired at different viewpoints (for example, adjacent camera positions), it is referred to as "inter-view prediction encoding". In the first embodiment, the frames subjected to inter-view prediction encoding are the first frames in the GOPs, that is, the I pictures. By this inter-view prediction encoding, as shown in FIG. 14, the first frames in the GOPs are encoded into I, B, B, P, B, B, P, ... pictures along the spatial axis S in the camera arrangement direction.

[0045] The inter-view prediction encoding described above is executed for the first frame of each GOP acquired by the plurality of cameras. As described above, the image information compression method of the first embodiment focuses on the fact that, between images taken at the same time by a plurality of cameras whose positional relationships are known, there is a spatial correlation similar to the temporal correlation exploited in the H.264/AVC standard and the like, and proposes applying inter-view prediction encoding to the first frame of the GOP (the I picture), which carries a large amount of information. Thus, with the image information compression method of the first embodiment, inter-frame predictive coding based on the same algorithm as the inter-frame predictive coding for frames arranged in the time axis t direction, that is, inter-view prediction encoding, is applied to the first frames in the GOPs arranged in the spatial axis S direction, so the encoding compression efficiency can be improved.

[0046] In addition, because the inter-view prediction encoding is based on the same algorithm as the inter-frame predictive coding for frames arranged in the time axis t direction, the motion prediction/compensation unit 215 can also be reused for inter-view prediction encoding. For this reason, no significant configuration (circuit or software) needs to be added in order to implement the image information compression method of the first embodiment, which is therefore advantageous in terms of cost.
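The picture-type layout that results from the first embodiment can be illustrated with a small labelling sketch. It assigns picture types only and performs no encoding; the function names and the fixed I, B, B, P, B, B, P, ... period of three are assumptions taken from the example pattern in the text.

```python
def picture_type(index, p_interval=3):
    """Picture type at position `index` in an I, B, B, P, B, B, P, ... run."""
    if index == 0:
        return "I"
    return "P" if index % p_interval == 0 else "B"


def first_embodiment_grid(num_cameras, gop_length):
    """Rows are cameras (S axis); columns are frames in one GOP (t axis).

    Each camera is first labelled along t as in FIG. 13. The first frame of
    each GOP is then relabelled along S, as inter-view prediction encoding
    does in FIG. 14.
    """
    grid = []
    for cam in range(num_cameras):
        row = [picture_type(t) for t in range(gop_length)]
        row[0] = picture_type(cam)  # first frame follows the pattern along S
        grid.append(row)
    return grid
```

For seven cameras, the first column reads I, B, B, P, B, B, P from camera #1 to camera #7, so only one intra-coded frame remains per group of simultaneous first frames.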

The image information compression method according to the second embodiment of the present invention will be described below. The image information compression method of the second embodiment uses viewpoint interpolation, described later, and is executed by the multi-camera frame memory 213 and by the motion prediction/compensation unit 215, the interpolated image generation/compensation unit 216, and the selection unit 217 of the encoding processing unit 214 shown in FIG. 5.

FIGS. 15 to 18 are explanatory diagrams (parts 1 to 4) of the image information compression method according to the second embodiment of the present invention. In FIGS. 15 to 18, t represents the time axis, and S represents the spatial axis in the camera arrangement order or camera arrangement direction. The figures show only the frames acquired by cameras #1 to #5. The number of cameras may be any number that allows frame interpolation, that is, three or more (one camera that captures the frame to be encoded and two cameras that capture the reference frames used to generate the interpolated image corresponding to that frame, for a total of three cameras). In the figures, I, P, and B denote an I picture, a P picture, and a B picture, respectively. In FIGS. 15 to 17, the frames arranged in the spatial axis S direction are frames at the same time.

In the image information compression method of the second embodiment, first, as shown in FIG. 15, the odd-numbered cameras #1, #3, #5, ... are selected, and the image information of frames arranged in the time axis t direction of the moving images acquired by the selected cameras #1, #3, #5, ... is encoded using intra-frame coding and inter-frame predictive coding using temporal correlation between frames.

[0050] Next, as shown in FIG. 16, based on the image information acquired by the selected odd-numbered cameras #1, #3, #5, ..., interpolated images are generated that correspond to the frames arranged in the time axis t direction of the moving images acquired by the cameras other than the selected cameras, namely the even-numbered cameras #2, #4, .... That is, frame interpolation is performed based on images taken by adjacent cameras. The process of generating an interpolated image based on frames at the same time taken by adjacent cameras (that is, from adjacent viewpoints) is referred to as "viewpoint interpolation", and the interpolated image is referred to as a "viewpoint interpolation image". The interpolation method used for viewpoint interpolation may be any interpolation method, and a known frame interpolation method may be selected based on various factors such as the performance required of the apparatus that implements the image information compression method of the present invention or the requirements of the apparatus user. In addition, if the movement of the subject is known to follow a specific rule, an interpolation method suited to that movement may be selected. Also, before or after generating the viewpoint interpolation images shown in FIG. 16, the inter-view prediction encoding described in the first embodiment may be performed on the first frame in the GOP to compress the amount of information of the first frame.
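The data flow of viewpoint interpolation can be sketched as follows. Since the text leaves the interpolation method open, a naive pixel-wise average of the two adjacent views is used here purely as a placeholder; it is not the depth-based method of FIG. 8, and the function names and the list-of-rows frame layout are hypothetical.

```python
def average_views(left, right):
    """Placeholder interpolation: pixel-wise integer mean of two frames."""
    return [
        [(a + b) // 2 for a, b in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]


def interpolate_unselected(selected_frames, interpolate=average_views):
    """Build viewpoint interpolation images for the cameras in between.

    selected_frames: dict {camera number: frame} for the selected cameras
    at one time instant, e.g. cameras 1, 3, 5.
    Returns a dict {camera number: interpolated frame} for each camera k+1
    whose neighbours k and k+2 are both available.
    """
    result = {}
    for cam in sorted(selected_frames):
        if cam + 2 in selected_frames:
            result[cam + 1] = interpolate(
                selected_frames[cam], selected_frames[cam + 2]
            )
    return result
```

Any other interpolation routine with the same two-frame signature can be passed as `interpolate`, which reflects the statement that the method may be chosen freely.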

[0051] Next, as shown in FIG. 17, the image information of frames arranged in the time axis t direction of the moving images acquired by the cameras other than the selected cameras, namely the even-numbered cameras #2, #4, ..., is encoded using intra-frame coding and inter-frame predictive coding using temporal correlation between frames.

[0052] At this time, for an image acquired by one of the even-numbered cameras #2, #4, ... other than the selected cameras, the selection unit 217 of the image information encoding apparatus 200 selectively outputs whichever encoding result has the highest encoding compression efficiency: the result of encoding with reference to images of frames at times different from that of the frame to be encoded, or the result of encoding with reference to the viewpoint interpolation image corresponding to the frame to be encoded. An explanatory diagram of this process is shown in FIG. 18. In FIG. 18, FR(#1, n-1) is the frame at time t = n-1 acquired by camera #1, FR(#1, n) is the frame at time t = n acquired by camera #1, and FR(#1, n+1) is the frame at time t = n+1 acquired by camera #1. FR(#2, n-1) is the frame at time t = n-1 acquired by camera #2, FR(#2, n) is the frame at time t = n acquired by camera #2, and FR(#2, n+1) is the frame at time t = n+1 acquired by camera #2. In addition, FR(#3, n-1) is the frame at time t = n-1 acquired by camera #3, FR(#3, n) is the frame at time t = n acquired by camera #3, and FR(#3, n+1) is the frame at time t = n+1 acquired by camera #3. FR_int(#2, n) is the viewpoint interpolation image corresponding to the frame FR(#2, n), generated based on the frames FR(#1, n) and FR(#3, n) adjacent to the frame FR(#2, n).

FIG. 18 shows the case where the frame FR(#2, n) to be encoded refers to the frames FR(#2, n-1) and FR(#2, n+1) as frames at different times (drawn with thick solid lines), but the frames referenced are not limited to FR(#2, n-1) and FR(#2, n+1). The frame FR(#2, n) to be encoded may refer to only one of the frames FR(#2, n-1) and FR(#2, n+1), or may refer to frames at different times other than those shown. The selection unit 217 shown in FIG. 5 then selects and outputs the encoding result with the highest encoding compression efficiency from among the case where encoding is performed by inter-frame predictive coding using temporal correlation between frames with reference to frames at different times (for example, H.264/AVC processing) and the case where the frame FR(#2, n) is encoded with reference to the viewpoint interpolation image FR_int(#2, n) corresponding to the frame FR(#2, n) to be encoded (for example, the case where the viewpoint interpolation image is used as the encoded image information of the frame FR(#2, n)).

[0054] The reason for performing such processing is that, when considering which image the frame FR(#2, n) to be encoded most resembles, whether it is more similar to a frame at a different time taken by the same camera #2 or to the viewpoint interpolation image FR_int(#2, n) based on frames at the same time taken by the adjacent cameras #1 and #3 differs depending on the moment-to-moment movement of the subject. Thus, the image information compression method of the second embodiment focuses on the fact that the viewpoint interpolation image FR_int(#2, n), based on frames at the same time taken by the adjacent cameras #1 and #3, may be more similar to the frame FR(#2, n) to be encoded than a frame at a different time taken by the same camera #2. It therefore also refers to the viewpoint interpolation image FR_int(#2, n) and improves the encoding compression efficiency by selecting, from among the multiple compression methods, the one with the highest encoding compression efficiency.

As described above, according to the image information compression method of the second embodiment, for image information acquired by the cameras #2, #4, ... other than the selected cameras, the encoding result with the highest encoding compression efficiency is selectively output from among the case where encoding is performed with reference to image information of frames at times different from that of the frame FR(#2, n) to be encoded and the case where encoding is performed with reference to the viewpoint interpolation image FR_int(#2, n) corresponding to the frame to be encoded, so the encoding compression efficiency of the output image information can be improved.
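Selecting the reference that gives the highest compression efficiency can be sketched by encoding the frame against each candidate and keeping the smallest result. The sketch below is illustrative only: it uses the zlib-compressed size of the byte-wise residual as a stand-in for the real encoder's output size, frames are flat `bytes` objects, and the labels in `references` are hypothetical.

```python
import zlib


def coded_size(target, reference):
    """Stand-in cost: compressed size of the residual target - reference."""
    residual = bytes((t - r) % 256 for t, r in zip(target, reference))
    return len(zlib.compress(residual))


def select_reference(target, references):
    """Return the label of the reference that yields the smallest coded size.

    references: dict {label: reference frame as bytes, same length as target},
    e.g. one entry for a frame at a different time and one for the viewpoint
    interpolation image.
    """
    return min(references, key=lambda label: coded_size(target, references[label]))
```

Because `references` may hold any number of entries, the same selection applies unchanged when several viewpoint interpolation images are offered as candidates.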

In the above description, the case has been described in which the selected cameras are the odd-numbered cameras (#1, #3, #5, #7, ...) and the cameras other than the selected cameras are the even-numbered cameras (#2, #4, #6, ...). However, the selected cameras may be the even-numbered cameras and the cameras other than the selected cameras may be the odd-numbered cameras. Further, FIG. 18 shows a case where the viewpoint interpolation image is generated by interpolation, as indicated by the white arrows, but the viewpoint interpolation image may instead be generated by extrapolation.

[0057] Further, the selected cameras are not limited to the even-numbered or odd-numbered cameras. For example, other methods may be adopted, such as making one out of every three cameras, namely those whose camera numbers are given by #3n-2 (specifically, #1, #4, #7, ...), the selected cameras and making the remaining cameras (specifically, #2, #3, #5, #6, ...) the cameras other than the selected cameras. As another example, in one part of the camera group the selected cameras may be the even-numbered (#2, #4, #6, ...) or odd-numbered (#1, #3, #5, ...) cameras, while in the remaining part of the group one out of every three cameras, namely those whose camera numbers are given by #3n-2, may be the selected cameras and the remaining cameras the cameras other than the selected cameras. As still another modification, in one part of the camera group one out of every three cameras, namely those whose camera numbers are given by #3n-2, may be the selected cameras and the remaining cameras the cameras other than the selected cameras, while in the remaining part of the group the selected cameras may be the even-numbered (#2, #4, #6, ...) or odd-numbered (#1, #3, #5, ...) cameras. That is, a method in which even-numbered or odd-numbered cameras are the selected cameras and a method in which one out of every predetermined number of cameras is a selected camera may be combined.
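The camera groupings described above reduce to picking every `stride`-th camera starting from some first camera number. A minimal sketch, in which the function name and the `stride` and `first` parameters are illustrative rather than terms from the specification:

```python
def split_cameras(num_cameras, stride=2, first=1):
    """Split cameras 1..num_cameras into (selected, others).

    stride=2, first=1 selects the odd-numbered cameras;
    stride=2, first=2 selects the even-numbered cameras;
    stride=3, first=1 selects cameras #3n-2 (1, 4, 7, ...).
    """
    selected = [
        cam for cam in range(1, num_cameras + 1) if (cam - first) % stride == 0
    ]
    others = [cam for cam in range(1, num_cameras + 1) if cam not in selected]
    return selected, others
```

The mixed variants in the text would apply this function separately to each part of the camera group, with a different `stride` for each part.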

The image information compression method according to the third embodiment of the present invention will be described below. The image information compression method of the third embodiment uses viewpoint interpolation and is executed by the multi-camera frame memory 213 and by the motion prediction/compensation unit 215, the interpolated image generation/compensation unit 216, and the selection unit 217 of the encoding processing unit 214 shown in FIG. 5. The image information compression method of the third embodiment is an improved version of that of the second embodiment and differs from it in that a plurality of viewpoint interpolation images are referenced.

FIG. 19 is an explanatory diagram of the image information compression method according to the third embodiment of the present invention. In FIG. 19, FR(#1, n-1) is the frame at time t = n-1 acquired by camera #1, FR(#1, n) is the frame at time t = n acquired by camera #1, and FR(#1, n+1) is the frame at time t = n+1 acquired by camera #1. FR(#2, n-1) is the frame at time t = n-1 acquired by camera #2, FR(#2, n) is the frame at time t = n acquired by camera #2, and FR(#2, n+1) is the frame at time t = n+1 acquired by camera #2. Furthermore, FR(#3, n-1) is the frame at time t = n-1 acquired by camera #3, FR(#3, n) is the frame at time t = n acquired by camera #3, and FR(#3, n+1) is the frame at time t = n+1 acquired by camera #3. In FIG. 19, FR_int1(#2, n) is the viewpoint interpolation image (interpolated image 1 in the figure) corresponding to the frame FR(#2, n), generated using a first interpolation method, and FR_int2(#2, n) is the viewpoint interpolation image corresponding to the frame FR(#2, n), generated using a second interpolation method different from the first interpolation method. FIG. 19 shows two types of viewpoint interpolation images, FR_int1(#2, n) and FR_int2(#2, n), but three or more types of interpolated images may be generated by using three or more interpolation methods. The first and second interpolation methods are not limited to specific methods, and any known frame interpolation method may be freely selected based on various factors such as the performance required of the apparatus or the requirements of the apparatus user. In addition, if the movement of the subject is known to follow a specific rule, an interpolation method suited to that movement may be selected.

FIG. 19 shows the case where the frame FR(#2, n) to be encoded refers to the frames FR(#2, n-1) and FR(#2, n+1) as frames at different times (drawn with thick solid lines), but the frames referenced are not limited to FR(#2, n-1) and FR(#2, n+1). The frame FR(#2, n) to be encoded may refer to only one of the frames FR(#2, n-1) and FR(#2, n+1), or may refer to frames at different times other than those shown. The selection unit 217 shown in FIG. 5 then selects and outputs the encoding result with the highest encoding compression efficiency from among the following cases: the case where encoding is performed by inter-frame predictive coding using temporal correlation between frames with reference to frames at different times (for example, H.264/AVC processing); the case where the frame FR(#2, n) is encoded with reference to the viewpoint interpolation image FR_int1(#2, n) corresponding to the frame FR(#2, n) to be encoded (for example, the case where the viewpoint interpolation image FR_int1(#2, n) is used as the encoded image information of the frame FR(#2, n)); and the case where the frame FR(#2, n) is encoded with reference to the viewpoint interpolation image FR_int2(#2, n) corresponding to the frame FR(#2, n) to be encoded (for example, the case where the viewpoint interpolation image FR_int2(#2, n) is used as the encoded image information of the frame FR(#2, n)).

[0061] The reason for performing such processing is that, when considering which image the frame FR(#2, n) to be encoded most resembles, whether it is most similar to a frame at a different time taken by the same camera #2, to the viewpoint interpolation image FR_int1(#2, n), or to the viewpoint interpolation image FR_int2(#2, n), the latter two being based on frames at the same time taken by the adjacent cameras #1 and #3, differs depending on the moment-to-moment movement of the subject. Thus, the image information compression method of the third embodiment focuses on the fact that the viewpoint interpolation image FR_int1(#2, n) or FR_int2(#2, n), based on frames at the same time taken by the adjacent cameras #1 and #3, may be more similar to the frame FR(#2, n) to be encoded than a frame at a different time taken by the same camera #2, and therefore also refers to the viewpoint interpolation images FR_int1(#2, n) and FR_int2(#2, n).

[0062] As described above, according to the image information compression method of the third embodiment, for image information acquired by the cameras #2, #4, ... other than the selected cameras, the encoding result with the highest encoding compression efficiency is selectively output from among the case where encoding is performed with reference to image information of frames at times different from that of the frame FR(#2, n) to be encoded, the case where encoding is performed with reference to the viewpoint interpolation image FR_int1(#2, n) corresponding to the frame FR(#2, n) to be encoded, and the case where encoding is performed with reference to the viewpoint interpolation image FR_int2(#2, n) corresponding to the frame FR(#2, n) to be encoded, so the encoding compression efficiency of the output image information can be improved.

[0063] In the above description, the case has been described in which the selected cameras are the odd-numbered cameras and the other cameras are the even-numbered cameras. However, the selected cameras may be the even-numbered cameras and the other cameras the odd-numbered cameras. In addition, FIG. 19 shows a case where the viewpoint interpolation images are generated by interpolation, as indicated by the white arrows, but the viewpoint interpolation images may instead be generated by extrapolation.

[0064] In addition, the selected cameras are not limited to the even-numbered or odd-numbered cameras. For example, other methods may be adopted, such as making one out of every three cameras, namely those whose camera numbers are given by #3n-2, the selected cameras and making the remaining cameras the cameras other than the selected cameras. As another example, in one part of the camera group the selected cameras may be the even-numbered (#2, #4, #6, ...) or odd-numbered (#1, #3, #5, ...) cameras, while in the remaining part of the group one out of every three cameras, namely those whose camera numbers are given by #3n-2, may be the selected cameras and the remaining cameras the cameras other than the selected cameras. As yet another variation, in one part of the camera group one out of every three cameras, namely those whose camera numbers are given by #3n-2, may be the selected cameras and the remaining cameras the cameras other than the selected cameras, while in the remaining part of the group the selected cameras may be the even-numbered (#2, #4, #6, ...) or odd-numbered (#1, #3, #5, ...) cameras.

[0065] In the third embodiment, points other than those described above are the same as in the case of the second embodiment.

[0066] The image information compression method according to the fourth embodiment of the present invention will be described below. The image information compression method according to the fourth embodiment uses viewpoint interpolation and is executed by the multiframe memory 213 shown in FIG. 5, the motion prediction/compensation unit 215 of the encoding processing unit 214, the interpolated image generation/compensation unit 216, and the selection unit 217. The image information compression method of the fourth embodiment is an improvement on the image information compression method of the second embodiment. It differs from the method of the second embodiment in that, in addition to the viewpoint interpolation image, it also refers to adjacent images at the same time.

[0067] FIG. 20 is an explanatory diagram of an image information compression method according to the fourth embodiment of the present invention. In FIG. 20, FR(#1, n-1) is the frame at time t = n-1 acquired by camera #1, FR(#1, n) is the frame at time t = n acquired by camera #1, and FR(#1, n+1) is the frame at time t = n+1 acquired by camera #1. FR(#2, n-1) is the frame at time t = n-1 acquired by camera #2, FR(#2, n) is the frame at time t = n acquired by camera #2, and FR(#2, n+1) is the frame at time t = n+1 acquired by camera #2. Furthermore, FR(#3, n-1) is the frame at time t = n-1 acquired by camera #3, FR(#3, n) is the frame at time t = n acquired by camera #3, and FR(#3, n+1) is the frame at time t = n+1 acquired by camera #3. In FIG. 20, FRint(#2, n) is a viewpoint interpolation image corresponding to the encoding target frame FR(#2, n).

[0068] Note that FIG. 20 shows the case where the frames FR(#2, n-1) and FR(#2, n+1) are referred to as the frames at times different from that of the encoding target frame (see the bold solid lines in FIG. 20), but the reference frames are not limited to the frames FR(#2, n-1) and FR(#2, n+1). The encoding target frame FR(#2, n) may refer to only one of the frames FR(#2, n-1) and FR(#2, n+1), or may refer to frames at different times other than the frames shown.

[0069] Then, the selection unit 217 shown in FIG. 5 selectively outputs the encoding result with the highest encoding compression efficiency from among the following cases: encoding with reference to frames at different times by inter-frame predictive coding that uses temporal correlation between frames (for example, H.264/AVC); encoding of the frame FR(#2, n) with reference to the viewpoint interpolation image FRint(#2, n) corresponding to the encoding target frame FR(#2, n); and encoding of the frame FR(#2, n) with reference to the frame FR(#1, n) or FR(#3, n) adjacent to the encoding target frame FR(#2, n) (for example, by applying the same algorithm as the H.264/AVC processing in the spatial axis S direction).
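The selection among reference candidates can be sketched as follows. This is only an illustration under stated assumptions: frames are flat sequences of 8-bit sample values, and the encoded size is approximated by the zlib-compressed length of the prediction residual, which stands in for the actual H.264/AVC bit count. The names `residual_cost` and `select_best_mode` are hypothetical.

```python
import zlib


def residual_cost(target, reference):
    """Proxy for the encoded size of `target` predicted from `reference`.

    Assumption for this sketch: the cost is the zlib-compressed length of
    the sample-wise residual (taken modulo 256), not a real codec bit count.
    """
    if len(target) != len(reference):
        raise ValueError("target and reference must have the same length")
    residual = bytes((t - r) & 0xFF for t, r in zip(target, reference))
    return len(zlib.compress(residual, 9))


def select_best_mode(target, candidates):
    """Return (best_name, costs) for a dict mapping candidate name to reference.

    The candidate with the smallest cost proxy is the one whose encoding
    result would be output by the selection step.
    """
    costs = {name: residual_cost(target, ref) for name, ref in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs
```

In the fourth embodiment the candidates would be, for example, a frame of camera #2 at a different time, the viewpoint interpolation image FRint(#2, n), and the adjacent frame FR(#1, n) or FR(#3, n). The same function covers the third embodiment if the candidates are the temporal reference and the images FRint1(#2, n) and FRint2(#2, n).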

[0070] The reason for this processing is that the image that the encoding target frame most resembles differs depending on the instantaneous motion of the subject. In some cases a frame at a different time captured by the same camera #2 is the most similar; in other cases the viewpoint interpolation image based on the same-time frames captured by the adjacent cameras #1 and #3 is the most similar; and in still other cases a same-time frame captured by the adjacent camera #1 or #3 is the most similar. Focusing on this point, the image information compression method of the fourth embodiment encodes the encoding target frame using the most similar image among the frames at different times captured by the same camera, the viewpoint interpolation image based on the same-time frames captured by the adjacent cameras, and the same-time frames captured by the adjacent cameras.

[0071] As described above, according to the image information compression method of the fourth embodiment, for the image information acquired by cameras #2, #4, ..., the encoding result with the highest encoding compression efficiency is selectively output from among the following cases: encoding with reference to image information of a frame at a time different from the encoding target frame FR(#2, n); encoding with reference to the viewpoint interpolation image FRint(#2, n) corresponding to the encoding target frame FR(#2, n); and encoding with reference to the frames FR(#1, n) and FR(#3, n) adjacent to the encoding target frame FR(#2, n). The encoding compression efficiency of the output image information can therefore be improved.

[0072] The above description covers the case where the selected cameras are the odd-numbered cameras and the other cameras are the even-numbered cameras. However, the selected cameras may instead be the even-numbered cameras, with the odd-numbered cameras as the other cameras. In addition, FIG. 20 shows the case where the viewpoint interpolation image is generated by interpolation, as indicated by the white arrows, but the viewpoint interpolation image may also be generated by extrapolation.

[0073] Further, the selected cameras are not limited to the even-numbered or odd-numbered cameras. Other methods may be employed: for example, one camera out of every three, namely the cameras whose camera numbers are given by #3n-2, may be the selected cameras, with the remaining cameras as the cameras other than the selected cameras. As another example, in one part of the camera group the selected cameras may be the even-numbered cameras (#2, #4, #6, ...) or the odd-numbered cameras (#1, #3, #5, ...), while in the remaining part of the group one camera out of every three, with camera number #3n-2, is a selected camera and the remaining cameras are cameras other than the selected cameras. As yet another variation, in one part of the camera group one camera out of every three, with camera number #3n-2, may be a selected camera with the remaining cameras as cameras other than the selected cameras, while in the remaining part of the group the selected cameras are the even-numbered cameras (#2, #4, #6, ...) or the odd-numbered cameras (#1, #3, #5, ...).

[0074] Furthermore, a plurality of types of viewpoint interpolation images may be generated by combining the fourth embodiment with the third embodiment.

[0075] In the fourth embodiment, points other than the above are the same as those in the second embodiment.

[0076] The image information compression method according to the fifth embodiment of the present invention will be described below. The image information compression method of the fifth embodiment is an improvement on the image information compression method of the first embodiment. It differs from the method of the first embodiment in that the interpolated image is also referred to when inter-view predictive coding is performed on the temporally first frame in the GOP. The image information compression method of the fifth embodiment is executed by the multiframe memory 213 shown in FIG. 5, the motion prediction/compensation unit 215, the interpolated image generation/compensation unit 216, and the selection unit 217.

[0077] FIGS. 21 to 26 are explanatory diagrams of an image information compression method according to the fifth embodiment of the present invention. In FIGS. 21 to 26, t indicates the time axis direction, and S indicates the spatial axis direction corresponding to the camera arrangement order or the camera arrangement direction. The figures show cameras #1 to #9, but the number of cameras is not limited to nine. In the figures, I indicates an I picture, P indicates a P picture, and B indicates a B picture. Pi indicates a P picture that also refers to the interpolated image, and Bi indicates a B picture that also refers to the interpolated image.

[0078] In the image information compression method of the fifth embodiment, first, as shown in FIG. 21, the image information of the frames arranged in the time axis t direction of the moving images acquired by the plurality of cameras is encoded (for example, by H.264/AVC) using intra-frame coding and inter-frame predictive coding that uses temporal correlation between frames. As a result, for example, as shown in FIG. 21, within each GOP composed of a predetermined number of frames arranged in the time axis t direction, the temporally first frame is encoded by intra-frame coding and becomes an I picture. The frames other than the first frame in the same GOP are encoded by inter-frame predictive coding using temporal correlation.

[0079] Next, as shown in FIG. 22, the first frames in the GOP are subjected, in the spatial axis S direction, to the inter-view predictive coding described for the image information compression method of the first embodiment. That is, the image information of the same-time frames arranged in the camera arrangement order is encoded by inter-frame predictive coding using the correlation between the same-time frames. The processing in FIGS. 21 and 22 is the same as that in the first embodiment.
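The two passes described for FIGS. 21 and 22 can be summarized as a table that assigns a prediction type to every frame. The following is a minimal sketch, not the implementation of the embodiment. The function name `prediction_plan` and the choice of camera #1 as the single camera whose first frame stays intra-coded are assumptions for illustration.

```python
def prediction_plan(num_cameras, num_frames, gop_size, anchor_camera=1):
    """Map (camera, time) to the kind of prediction applied to that frame.

    Cameras and times are numbered from 1. Frames that are not the first
    frame of their GOP use temporal inter-frame prediction. The first frame
    of each GOP is intra-coded for the anchor camera (an assumption of this
    sketch) and inter-view predicted for every other camera.
    """
    plan = {}
    for cam in range(1, num_cameras + 1):
        for t in range(1, num_frames + 1):
            first_in_gop = (t - 1) % gop_size == 0
            if not first_in_gop:
                plan[(cam, t)] = "temporal"
            elif cam == anchor_camera:
                plan[(cam, t)] = "intra"
            else:
                plan[(cam, t)] = "inter-view"
    return plan
```

With `prediction_plan(9, 8, 4)`, for instance, the frames at t = 1 and t = 5 are the first frames of their GOPs, so only those frames use intra or inter-view prediction.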

[0080] Next, as shown in FIG. 23, from among the first frames in the GOP, the frame FR(#1, 1), which is an I picture, is selected as the first reference frame, and the frame FR(#3, 1), which is a P picture, is selected as the second reference frame. A viewpoint interpolation image FRint is generated by interpolation (extrapolation) based on the frame FR(#1, 1) and the frame FR(#3, 1). Next, of the encoding with reference to image information of a frame different from the encoding target frame among the same-time frames arranged in the camera arrangement order (the inter-view predictive coding of the first embodiment) and the encoding with reference to the viewpoint interpolation image FRint corresponding to the encoding target frame, the encoding result with the highest encoding compression efficiency is taken as the encoded image information of the encoding target frame (for example, FR(#5, 1)), for example, a Pi picture. Next, viewpoint interpolation images FRint are generated successively by extrapolation based on the image of the frame FR(#3, 1) and the generated Pi picture, and the same processing is repeated. Here, as shown in FIG. 24, viewpoint interpolation images FRint1(#n+4, 1) and FRint2(#n+4, 1) may be created. In addition, as shown in FIG. 24, after the I picture, the P picture, and the Pi pictures have been generated for the first frames in the GOP, interpolated frames FRint1(#n+1, 1) and FRint2(#n+1, 1), or interpolated frames FRint1(#n+3, 1) and FRint2(#n+3, 1), are created. Next, of the encoding with reference to image information of a frame different from the encoding target frame among the same-time frames arranged in the camera arrangement order (the inter-view predictive coding of the first embodiment) and the encoding with reference to the viewpoint interpolation image FRint1(#n+1, 1) or FRint2(#n+1, 1), or the interpolated frame FRint1(#n+3, 1) or FRint2(#n+3, 1), corresponding to the encoding target frame, the encoding result with the highest encoding compression efficiency is taken as the encoded image information of the encoding target frame (for example, FR(#4, 1)), for example, a Bi picture.
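The order in which the first frames in the GOP are processed can be sketched as a schedule. This sketch rests on assumptions that go beyond the text: camera #1 holds the I picture and camera #3 the P picture as in FIG. 23, each later odd-numbered camera is extrapolated from the two preceding odd-numbered cameras, and each even-numbered camera is interpolated from its two neighbours. The function name `first_frame_schedule` and the tuple layout are hypothetical.

```python
def first_frame_schedule(num_cameras):
    """Coding order for the temporally first frames of cameras #1..#num_cameras.

    Each entry is (camera, picture_kind, inter_view_refs, interpolation_sources).
    `picture_kind` lists the possible outcomes because the actual kind
    depends on which candidate gives the highest compression efficiency.
    """
    if num_cameras < 3:
        raise ValueError("the sketch needs at least cameras #1, #2 and #3")
    schedule = [
        (1, "I", (), ()),    # intra-coded first reference frame
        (3, "P", (1,), ()),  # inter-view prediction from camera #1
    ]
    # Remaining odd-numbered cameras: inter-view prediction from the previous
    # odd camera, or an image extrapolated from the two previous odd cameras.
    for cam in range(5, num_cameras + 1, 2):
        schedule.append((cam, "P or Pi", (cam - 2,), (cam - 4, cam - 2)))
    # Even-numbered cameras lying between two coded cameras: inter-view
    # prediction, or an image interpolated from both neighbours. An even
    # camera without a right-hand neighbour is left out of this sketch.
    for cam in range(2, num_cameras, 2):
        schedule.append((cam, "B or Bi", (cam - 1, cam + 1), (cam - 1, cam + 1)))
    return schedule
```

For nine cameras the schedule visits cameras #1, #3, #5, #7, #9 and then #2, #4, #6, #8, which matches the examples FR(#5, 1) and FR(#4, 1) given in the text.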

[0081] Next, of the encoding with reference to image information of a frame different from the encoding target frame among the same-time frames arranged in the camera arrangement order and the encoding with reference to the viewpoint interpolation image corresponding to the encoding target frame, the encoding result with the highest encoding compression efficiency is selectively output. As a result, as shown in FIG. 25, the first frames at t = 1 are encoded by the method with the highest encoding efficiency.

[0082] Next, as shown in FIG. 26, the same processing is repeated for the next GOP.

[0083] The reason for this processing is that, for the first frames in the GOP, the image that the encoding target frame most resembles differs depending on the instantaneous motion of the subject. In some cases the image encoded by the inter-view predictive coding of the first embodiment, based on the same-time frames captured by the adjacent cameras, is the most similar; in other cases the interpolated image created from the reference frames captured by the adjacent cameras is the most similar. Focusing on this point, the image information compression method of the fifth embodiment encodes the encoding target frame using whichever of these two images is the most similar.

[0084] As described above, according to the image information compression method of the fifth embodiment, the encoding target frame is encoded using the most similar image, whether that is the image encoded by the inter-view predictive coding of the first embodiment or the interpolated image created from the reference frames captured by the adjacent cameras. As a result, the encoding compression efficiency of the output image information can be improved. Note that in the fifth embodiment, points other than those described above are the same as in the case of the first embodiment.

The image information compression method according to the sixth embodiment of the present invention will be described below. FIG. 27 is a diagram showing an example of a horizontal section of the ray space referred to in the image information compression method of the sixth embodiment of the present invention. FIG. 28 is an explanatory diagram of a motion vector prediction method in the image information compression method according to the sixth embodiment of the present invention. FIG. 29 is an explanatory diagram of a motion vector prediction method in H.264/AVC as a comparative example of the sixth embodiment of the present invention.

[0087] The image information compression method of the sixth embodiment is an improvement on the image information compression method of the first embodiment. It presupposes that the plurality of cameras are arranged in a row on a straight line, oriented parallel to each other. In the sixth embodiment, the step of encoding the image information of the same-time frames arranged in the camera arrangement order by inter-frame predictive coding using the correlation between the same-time frames (the inter-view predictive coding step of the first embodiment) is executed by motion-compensated predictive coding, and the motion vector used there is obtained based on straight lines appearing in a horizontal cross-sectional image (EPI: Epipolar Plane Image) obtained when the ray space is cut horizontally. The image information compression method according to the sixth embodiment is executed by the multiframe memory 213 shown in FIG. 5 and the motion prediction/compensation unit 215 of the encoding processing unit 214.

[0088] In image encoding by H.264/AVC, as shown in FIG. 29, the motion vector is predicted from the already-encoded adjacent blocks BLnei1, BLnei2, and BLnei3 that adjoin the block BLen in the frame FR of the moving image acquired by the camera. This method has the disadvantage of generating many bits when the block BLen and the reference blocks BLnei1, BLnei2, and BLnei3 differ greatly.

[0089] Therefore, the image information compression method of the sixth embodiment takes advantage of the following property: when the plurality of cameras are arranged in a row on a straight line, oriented parallel to each other, and a ray space is formed by arranging the same-time frames of the moving images acquired by the plurality of cameras parallel to each other in the camera arrangement order, the horizontal cross-sectional structure of the ray space is represented as a collection of straight-line structures. According to this property, the displacement between frames occurs continuously along a straight line, and in a region where displacements overlap (the region where the straight lines intersect in FIG. 27), the point represented by the straight line with the larger slope takes priority. A straight line with a large slope corresponds to a point nearer the front in 3D space.

[0090] Here, for the property that the horizontal cross-sectional structure of the ray space is represented as a collection of straight-line structures when the ray space is formed by arranging the frames parallel to each other in the arrangement order of the plurality of cameras, refer to FIGS. 3(a) and (b) and FIGS. 30(a) and (b). Consider a cross section where y is constant, ignoring the vertical parallax (φ). As shown in FIG. 30(a), let (X, Z) be the coordinates of a point P in real space, and let x and θ be the position and angle at which a ray passes through the reference plane 106. Then the relationship X = x + Z tan θ holds. In other words, the group of rays passing through one point in real space lies on a straight line in the horizontal section (the section where y is constant) of the ray space. FIG. 30(b) shows a point in real space as it appears on the horizontal section of the ray space.
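The relationship X = x + Z tan θ can be checked numerically. The sketch below, with the hypothetical function names `ray_crossing` and `epi_locus`, computes where rays through a real-space point (X, Z) cross the reference plane. Plotting x against tan θ is a choice made for this illustration; in those coordinates the points fall on a straight line whose slope is -Z.

```python
import math


def ray_crossing(X, Z, theta):
    """Position x at which a ray through (X, Z) with angle theta (radians)
    crosses the reference plane, rearranged from X = x + Z * tan(theta)."""
    return X - Z * math.tan(theta)


def epi_locus(X, Z, thetas):
    """Return (tan(theta), x) pairs for the rays through the point (X, Z).

    In these coordinates the pairs lie on the straight line x = X - Z * u,
    where u = tan(theta), so the slope of the line encodes the depth Z.
    """
    return [(math.tan(theta), ray_crossing(X, Z, theta)) for theta in thetas]
```

Two points at different depths therefore trace lines with different slopes in the horizontal section, which is why depth can be read from the line structure.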

[0091] Thus, unlike the method shown in FIG. 29, the image information compression method of the sixth embodiment does not rely on the motion vectors of adjacent blocks, so an appropriate motion vector can be predicted. Because an appropriate motion vector can be predicted, the sixth embodiment can improve the image compression efficiency.
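One way to obtain a vector from a straight line in the horizontal cross-sectional image is sketched below. The specification states only that the motion vector is obtained based on such a line. The least-squares fit, the function names `fit_epi_line` and `predict_vector`, and the representation of a line as (camera index, horizontal position) samples are assumptions made for this illustration.

```python
def fit_epi_line(samples):
    """Least-squares line through (camera_index, x_position) samples.

    Returns (slope, intercept), where slope is the horizontal displacement
    of the tracked point per camera step.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed to fit a line")
    mean_k = sum(k for k, _ in samples) / n
    mean_x = sum(x for _, x in samples) / n
    var_k = sum((k - mean_k) ** 2 for k, _ in samples)
    if var_k == 0:
        raise ValueError("samples must come from at least two cameras")
    slope = sum((k - mean_k) * (x - mean_x) for k, x in samples) / var_k
    return slope, mean_x - slope * mean_k


def predict_vector(samples, ref_camera, target_camera):
    """Predicted horizontal displacement of a block between the frame of
    `ref_camera` and the same-time frame of `target_camera`."""
    slope, _ = fit_epi_line(samples)
    return slope * (target_camera - ref_camera)
```

If a point appears at x = 40.0, 37.5, and 35.0 in cameras #1, #2, and #3, the fitted line predicts a displacement of about -2.5 from camera #3 to camera #4, without consulting the vectors of neighbouring blocks.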

[0092] The above description covers the case where the image information compression method of the sixth embodiment is applied to the first embodiment. However, the image information compression method of the sixth embodiment can also be applied to the second to fifth embodiments.

[0093] <Explanation of FTV System of Seventh Embodiment>

FIG. 30 is a diagram conceptually showing the basic structure of the FTV system according to the seventh embodiment of the present invention. In FIG. 30, the same or corresponding elements as those shown in FIG.

[0094] In the FTV system of the seventh embodiment, the transmission-side device 250 and the reception-side device 350 are located apart from each other, and FTV signals are transmitted from the transmission-side device 250 to the reception-side device 350 using, for example, the Internet.

[0095] As shown in FIG. 30, the transmission-side device 250 includes a plurality of cameras (the figure shows five cameras, 102_1 to 102_5, but more cameras are actually used) and an image information encoding device 200, having the configuration and functions described in the first to sixth embodiments, that compresses and encodes the image information acquired by the plurality of cameras. The image information compressed and encoded by the image information encoding device 200 is sent to the reception-side device 350 by a communication device (not shown).

[0096] In addition, as shown in the figure, the reception-side device 350 includes a receiving device and the image information decoding device 300 described in the first embodiment. It forms a ray space 103 based on the output signal from the image information decoding device 300, extracts a cross section from the ray space 103 according to the viewpoint position input from the user interface 104, and displays it.

[0097] As shown in FIGS. 3(a) and (b) and FIGS. 4(a) to (c), by using the ray space method, for example, and cutting an arbitrary plane out of the ray space 103, it is possible to generate an image viewed from an arbitrary viewpoint in the horizontal direction in real space. For example, when the cross section 103a is cut out from the ray space 103 shown in FIG. 4(a), an image as shown in FIG. 4(b) is generated, and when the cross section 103b is cut out from the same ray space 103, an image as shown in FIG. 4(c) is generated.
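Cutting a cross section out of the ray space can be illustrated on a single horizontal section. The sketch assumes a discretized section `epi[i][j]` holding the value of the ray that crosses the reference plane at position `xs[i]` with direction `us[j]` = tan θ, and it uses nearest-neighbour sampling. The function name `extract_view`, the data layout, and the sampling scheme are all assumptions for this illustration.

```python
def extract_view(epi, xs, us, X, Z):
    """Sample one scanline as seen from the viewpoint (X, Z).

    A ray with direction u = tan(theta) that reaches the viewpoint crosses
    the reference plane at x = X - Z * u (from X = x + Z * tan(theta)), so
    the view is read along that line of the horizontal section. The nearest
    available position in `xs` is used for each direction.
    """
    view = []
    for j, u in enumerate(us):
        x = X - Z * u
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        view.append(epi[nearest][j])
    return view
```

With Z = 0 the viewpoint lies on the reference plane and the function returns the row of the section captured at that position. Moving the viewpoint away from the plane tilts the sampling line, which corresponds to cutting a different cross section such as 103a or 103b.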

[0098] As described above, since the FTV system of the seventh embodiment can use the image information compression methods described in the first to sixth embodiments, the encoding compression efficiency of the FTV signal in the FTV system can be improved.

Claims

[1] An image information compression method comprising:

a step of encoding image information of frames arranged in the time axis direction of moving images acquired by two or more cameras selected from among three or more cameras, by intra-frame coding and by inter-frame predictive coding that uses temporal correlation between frames;

a step of generating, based on the image information acquired by the selected cameras, a first viewpoint interpolation image corresponding to frames arranged in the time axis direction of a moving image acquired by a camera other than the selected cameras; and

a step of encoding image information of the frames arranged in the time axis direction of the moving image acquired by the camera other than the selected cameras,

wherein the step of encoding the image information of the frames arranged in the time axis direction of the moving image acquired by the camera other than the selected cameras includes a step of selectively outputting the encoding result with the highest encoding compression efficiency from among: encoding with reference to image information, acquired by the camera other than the selected cameras, of a frame at a time different from that of the encoding target frame; and encoding with reference to the first viewpoint interpolation image corresponding to the encoding target frame.

[2] The image information compression method according to claim 1, wherein:

in the step of generating the first viewpoint interpolation image, a plurality of types of first viewpoint interpolation images are generated for one frame by using different interpolation methods; and

the step of encoding the image information of the frames arranged in the time axis direction of the moving image acquired by the camera other than the selected cameras includes a step of selectively outputting the encoding result with the highest encoding compression efficiency from among: encoding with reference to image information, acquired by the camera other than the selected cameras, of a frame at a time different from that of the encoding target frame; and encoding with reference to any one of the plurality of types of first viewpoint interpolation images corresponding to the encoding target frame.

[3] The image information compression method according to claim 1, wherein the step of encoding the image information of the frames arranged in the time axis direction of the moving image acquired by the camera other than the selected cameras includes a step of selectively outputting the encoding result with the highest encoding compression efficiency from among: encoding with reference to image information, acquired by the camera other than the selected cameras, of a frame at a time different from that of the encoding target frame; encoding with reference to the first viewpoint interpolation image corresponding to the encoding target frame; and encoding with reference to image information, acquired by the selected cameras, of a frame at the same time as the encoding target frame.

[4] The image information compression method according to claim 1, further comprising a step of encoding image information of the same-time frames of the moving images acquired by the cameras, arranged in the arrangement order of the cameras, by inter-frame predictive coding using the correlation between the same-time frames, with the same algorithm as the inter-frame predictive coding using the temporal correlation.

[5] The image information compression method according to claim 4, wherein, in the step of encoding the image information of the frames arranged in the time axis direction by the intra-frame coding and the inter-frame predictive coding using the temporal correlation between frames,

the temporally first frame in an image group composed of a predetermined number of frames arranged in the time axis direction is encoded by intra-frame coding, and the frames other than the first frame in the image group are encoded by inter-frame predictive coding using temporal correlation.

[6] The image information compression method according to claim 5, wherein, in the step of encoding the image information of the same-time frames arranged in the arrangement order of the cameras by inter-frame predictive coding using the correlation between the same-time frames,

the frames encoded by the inter-frame predictive coding using the correlation between the same-time frames are the plurality of first frames arranged in the arrangement order of the cameras.

[7] The image information compression method according to claim 6, wherein the step of encoding the image information of the same-time frames arranged in the arrangement order of the cameras by inter-frame predictive coding using the correlation between the same-time frames includes:

a step of selecting two or more reference frames from the same-time frames arranged in the arrangement order of the cameras;

a step of generating, based on the reference frames or on frames generated based on the reference frames, a second viewpoint interpolation image corresponding to any of the same-time frames arranged in the arrangement order of the cameras; and

a step of selectively outputting the encoding result with the highest encoding compression efficiency from among: encoding with reference to image information of a frame different from the encoding target frame among the same-time frames arranged in the arrangement order of the cameras; and encoding with reference to the second viewpoint interpolation image corresponding to the encoding target frame.

[8] The image information compression method according to claim 7, wherein, in the step of generating the second viewpoint interpolation image, a plurality of types of second viewpoint interpolation images are generated for one frame by using different interpolation methods, and the step of encoding the image information of the frames other than the reference frames among the same-time frames arranged in the arrangement order of the cameras includes a step of selectively outputting the encoding result with the highest encoding compression efficiency from among: encoding with reference to image information of a frame different from the encoding target frame among the same-time frames arranged in the arrangement order of the cameras; and encoding with reference to any one of the plurality of types of second viewpoint interpolation images corresponding to the encoding target frame.

[9] The image information compression method according to claim 4, wherein:

the plurality of cameras are arranged in a row on a straight line, oriented parallel to each other;

a ray space is formed by arranging the same-time frames of the moving images acquired by the plurality of cameras parallel to each other in the arrangement order of the plurality of cameras, and the step of encoding the image information of the same-time frames arranged in the arrangement order of the cameras by inter-frame predictive coding using the correlation between the same-time frames is executed by motion-compensated predictive coding using a motion vector of a block constituted by a part of the frame; and

the motion vector is obtained based on a straight line appearing in a horizontal cross-sectional image obtained when the ray space is cut in the horizontal direction.

[10] An image information compression method comprising:

a step of encoding image information of frames arranged in the time axis direction of moving images acquired by a plurality of cameras, by intra-frame coding and by inter-frame predictive coding that uses temporal correlation between frames; and

a step of encoding image information of the same-time frames of the moving images acquired by the plurality of cameras, arranged in the arrangement order of the cameras, by inter-frame predictive coding using the correlation between the same-time frames, with the same algorithm as the inter-frame predictive coding using the temporal correlation.

[11] The image information compression method according to claim 10, wherein, in the step of encoding the image information of the frames arranged in the time axis direction by the intra-frame coding and the inter-frame predictive coding using the temporal correlation between frames,

the temporally first frame in an image group composed of a predetermined number of frames arranged in the time axis direction is encoded by intra-frame coding, and the frames other than the first frame in the image group are encoded by inter-frame predictive coding using temporal correlation.

[12] The image information compression method according to claim 10, wherein, in the step of encoding the image information of the same-time frames arranged in the arrangement order of the cameras by inter-frame predictive coding using the correlation between the same-time frames,

the frames encoded by the inter-frame predictive coding using the correlation between the same-time frames are the plurality of first frames arranged in the arrangement order of the cameras.

[13] The image information compression method according to claim 12, wherein the step of encoding the image information of the same-time frames arranged in the arrangement order of the cameras by inter-frame predictive coding using the correlation between the same-time frames includes:

a step of generating, based on the reference frames or on frames generated based on the reference frames, a viewpoint interpolation image corresponding to any of the same-time frames arranged in the arrangement order of the cameras; and

a step of selectively outputting the encoding result with the highest encoding compression efficiency from among: encoding with reference to image information of a frame different from the encoding target frame among the same-time frames arranged in the arrangement order of the cameras; and encoding with reference to the viewpoint interpolation image corresponding to the encoding target frame.

[14] The image information compression method according to claim 13, wherein:

in the step of generating the viewpoint interpolation image, a plurality of types of viewpoint interpolation images are generated for one frame using different interpolation methods; and

the step of encoding the image information of frames other than the reference frames among the frames at the same time arranged in the camera arrangement order includes a step of selectively outputting the encoding result with the highest encoding compression efficiency among the case where encoding is performed with reference to image information of a frame different from the encoding target frame among the frames at the same time arranged in the camera arrangement order, and the cases where encoding is performed with reference to any one of the plurality of types of viewpoint interpolation images corresponding to the encoding target frame.
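
As a rough illustration of claim 14, the sketch below generates two viewpoint interpolation images for one scanline using different interpolation methods and then picks the reference with the smallest residual. The claim does not name the interpolation methods; the plain average and the disparity-shifted average used here are assumptions chosen for illustration, as are all function names, the uniform disparity of one pixel, and the SAD cost.

```python
def row_sad(a, b):
    """Sum of absolute differences between two scanlines."""
    return sum(abs(p - q) for p, q in zip(a, b))


def average_interp(left, right):
    """Method A (assumed): pixel-wise average of the two neighboring views."""
    return [(p + q) // 2 for p, q in zip(left, right)]


def shift(row, d):
    """Shift a scanline by d pixels (positive = right), repeating edge pixels."""
    n = len(row)
    return [row[min(max(i - d, 0), n - 1)] for i in range(n)]


def shifted_interp(left, right, d):
    """Method B (assumed): align both views by a uniform disparity d, then average."""
    return average_interp(shift(left, -d), shift(right, d))


# One scanline of a step edge seen by three cameras in a row.
left = [0, 0, 0, 0, 9, 9, 9, 9]     # camera #1
middle = [0, 0, 0, 9, 9, 9, 9, 9]   # camera #2, the encoding target
right = [0, 0, 9, 9, 9, 9, 9, 9]    # camera #3

candidates = {
    "neighbor_view": left,
    "average": average_interp(left, right),
    "shifted_average": shifted_interp(left, right, 1),
}
costs = {label: row_sad(middle, ref) for label, ref in candidates.items()}
best_label = min(costs, key=costs.get)
print(best_label, costs[best_label])  # shifted_average 0
```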

[15] The image information compression method according to claim 10, wherein:

the plurality of cameras are arranged on a straight line, facing in directions parallel to each other;

frames of the same time of the moving images acquired by the plurality of cameras are arranged parallel to each other in the arrangement order of the plurality of cameras to form a light space;

the step of encoding the image information of the frames at the same time arranged in the camera arrangement order by inter-frame predictive coding using the correlation between the frames at the same time is executed by motion-compensated predictive coding using motion vectors of blocks each constituted by a part of a frame; and

the motion vectors are obtained based on straight lines appearing in a horizontal cross-sectional image obtained when the light space is cut in the horizontal direction.
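
One possible reading of claim 15 is sketched below. It assumes that the horizontal cross-sectional image is formed by stacking the same scanline from each camera, and that the slope of a straight line in that image (pixels of horizontal shift per camera) can serve directly as the horizontal motion vector between adjacent cameras. The exhaustive slope search, the function name `line_slope`, and the toy data are hypothetical and are not taken from the specification.

```python
def line_slope(cross_section, cam, x, max_disp):
    """Estimate the slope of the straight line through (cam, x).

    `cross_section[c][i]` is pixel i of the chosen scanline in camera c.
    Each candidate slope d (pixels per camera) is scored by how well the
    pixel values along the line match the value at (cam, x).
    """
    ref = cross_section[cam][x]
    width = len(cross_section[0])
    best_d, best_cost = None, None
    for d in range(-max_disp, max_disp + 1):
        xs = [x + d * (c - cam) for c in range(len(cross_section))]
        if any(not 0 <= xc < width for xc in xs):
            continue  # this candidate line leaves the image, so skip it
        cost = sum(abs(cross_section[c][xc] - ref) for c, xc in enumerate(xs))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Four cameras, one scanline each: a bright point moves 2 pixels per camera.
cross_section = [
    [9 if i == 2 + 2 * c else 0 for i in range(12)]
    for c in range(4)
]

slope = line_slope(cross_section, cam=1, x=4, max_disp=3)
motion_vector = (slope, 0)  # horizontal shift between adjacent cameras
print(motion_vector)  # (2, 0)
```

Because the cameras lie on a straight line and face in parallel directions, the displacement between views is purely horizontal in this sketch, which is why the vertical component is fixed at zero.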

[16] A free viewpoint television system, comprising:

an image information encoding device that executes the image information compression method according to claim 1;

a user interface for inputting the viewpoint position of a viewer; and

an image information extraction unit that extracts, from images of the same time captured by the plurality of cameras, an image viewed from the viewpoint input through the user interface.

[17] The free viewpoint television system according to claim 16, wherein the image information extraction unit extracts image information viewed from the viewpoint position by cutting a light space with a surface based on the viewpoint position input through the user interface, the light space being constructed by arranging images of the same time captured by the cameras, which are images based on the image information decoded by the image information decoding device, parallel to each other in the arrangement order of the cameras.

[18] The free viewpoint television system according to claim 16, wherein the cameras are installed in any one of: a linear arrangement in which the cameras are arranged on a straight line, facing in directions parallel to each other; a circumferential arrangement in which the cameras are arranged on a circumference, facing the inside of the circumference; a planar arrangement in which the cameras are arranged on a plane, facing in directions parallel to each other; a spherical arrangement in which the cameras are arranged on a spherical surface, facing the inside of the spherical surface; and a cylindrical arrangement in which the cameras are arranged on a cylinder, facing the inside of the cylinder.

[19] The free viewpoint television system according to claim 17, wherein the cameras are installed in a linear arrangement on a straight line, facing in directions parallel to each other, and the surface that cuts the light space is a vertical plane in the light space.
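
The vertical-plane cut of claim 19 can be sketched as follows. The sketch assumes the light space is stored as `light_space[camera][y][x]` and that a vertical cutting plane corresponds to a straight line in the (camera, x) plane, so that each output column is copied from the nearest camera on that line. The parameters `cam0` and `slope`, the nearest-camera rounding, and the clamping at the ends of the camera array are illustrative assumptions, not details from the specification.

```python
def cut_vertical_plane(light_space, cam0, slope):
    """Extract one view by cutting the light space with a vertical plane.

    For output column x the source camera index is
    cam0 + slope * (x - center), rounded and clamped to the camera range.
    slope = 0 reproduces the image of camera cam0 unchanged.
    """
    n_cams = len(light_space)
    height = len(light_space[0])
    width = len(light_space[0][0])
    center = (width - 1) / 2
    view = [[0] * width for _ in range(height)]
    for x in range(width):
        cam = round(cam0 + slope * (x - center))
        cam = min(max(cam, 0), n_cams - 1)
        for y in range(height):
            view[y][x] = light_space[cam][y][x]
    return view


# Toy light space: 3 cameras, 2 rows, 3 columns; value encodes (camera, y, x).
light_space = [
    [[100 * c + 10 * y + x for x in range(3)] for y in range(2)]
    for c in range(3)
]

same_as_camera_1 = cut_vertical_plane(light_space, cam0=1, slope=0)
tilted_cut = cut_vertical_plane(light_space, cam0=1, slope=1)
print(same_as_camera_1)  # [[100, 101, 102], [110, 111, 112]]
print(tilted_cut)        # [[0, 101, 202], [10, 111, 212]]
```

A production system would presumably interpolate between neighboring cameras instead of rounding to the nearest one; rounding is used here only to keep the sketch short.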

[20] The free viewpoint television system according to claim 17, wherein the cameras are installed in a circumferential arrangement on a circumference, facing the inside of the circumference, and the surface that cuts the light space is a surface that forms a sinusoidal curve on a horizontal plane in the light space.
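
For the circumferential arrangement of claim 20, a comparable sketch replaces the straight line in the (camera, x) plane with a sinusoidal curve. The specification text available here does not give the curve's parameters, so the amplitude, the mapping of image columns onto a half period of the sine, and the wrap-around of camera indices are all hypothetical choices made only to show the shape of the operation.

```python
import math


def cut_sinusoidal_surface(light_space, cam0, amplitude):
    """Extract one view by cutting the light space along a sinusoidal curve.

    For output column x the source camera index is
    cam0 + amplitude * sin(phase), where the phase runs from -pi/2 to +pi/2
    across the image width (an assumed mapping). Indices wrap around
    because the cameras are assumed to sit on a closed circle.
    """
    n_cams = len(light_space)
    height = len(light_space[0])
    width = len(light_space[0][0])
    center = (width - 1) / 2
    view = [[0] * width for _ in range(height)]
    for x in range(width):
        phase = math.pi * (x - center) / (width - 1)
        cam = round(cam0 + amplitude * math.sin(phase)) % n_cams
        for y in range(height):
            view[y][x] = light_space[cam][y][x]
    return view


# Toy light space: 8 cameras on a circle, 1 row, 5 columns.
light_space = [[[10 * c + x for x in range(5)]] for c in range(8)]

view = cut_sinusoidal_surface(light_space, cam0=0, amplitude=2)
print(view)  # [[60, 71, 2, 13, 24]]
```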