WO2011039814A1

WO2011039814A1 - Multi-view stream data control system/method

Info

Publication number: WO2011039814A1
Application number: PCT/JP2009/005057
Authority: WO
Inventors: Gene Cheung; Antonio Ortega; Ngai-Man Cheung
Original assignee: Hewlett-Packard Development Company, L.P.
Priority date: 2009-09-30
Filing date: 2009-09-30
Publication date: 2011-04-07

Abstract

In interactive multi-view video streaming, a corresponding video data stream is transmitted to a user, in response to a request from the user. In a coding stream, if a large number of P frames are used, a data storage amount necessary in a data storage device becomes large, and if a large number of I frames are used, a data transfer amount necessary in a data communication line becomes large. Upon creating a coding graph, D (distributed source coding) frames are combined in addition to the P frames and the I frames, so that the bandwidth of the communication line can be reduced and further, a trade-off with the storage capacity of the stream server can be achieved.

Description

Multi-view stream data control system and method

The present invention relates to a technique for efficiently storing and transmitting to a user video streams shot by a plurality of cameras in multiview video coding.

In multiview video coding (hereinafter referred to as MVC), video streams shot by multiple cameras are compressed, encoded, and transmitted to the user. The user receives, decodes and plays back these video streams. In an interactive multi-view video stream, a user can interactively request a desired view from a multi-view video stream server (hereinafter referred to as a stream server). Correspondingly, the stream server sends the corresponding video stream from the coding graph to the client.

G. Cheung, T. Sakamoto, A. Ortega, “Managing Multiview Streaming Video Data Composed Of Frames”, USPTO application no. 12/246274.

As a recent trend in MVC, techniques for efficiently compressing video streams shot by a plurality of cameras by exploiting inter-view and temporal redundancy have been studied. For example, the MVC standardization process has focused on developing a new compression algorithm that encodes all frames in a multi-view video stream so as to optimize rate and distortion.

The problem of the interactive multi-view video stream (hereinafter referred to as IMVS) addressed by the present specification is as follows. In IMVS, a user can interactively request a desired view from a stream server, and the stream server sends the corresponding video stream from the coding graph to the client. The prior art has the following problems. When a large number of P frames are used to construct the coding graph, the number of P frames increases exponentially with the passage of time, so the amount of data storage required for the coding graph increases and data storage costs rise. If many I frames are used, the number of frames and the number of paths in the coding graph can be reduced, but a large amount of data must be transmitted over the data communication line, so a wide bandwidth is required on the communication line.

In the present specification, the above problems are solved by the following means. The contents described in the claims as the control system will be described below.
Claim 1 solves the problem by using DSC 1 shown in the second embodiment.
[Claim 1] A multi-view stream data control system using a computer, comprising:
(A) an encoding unit that reads a predetermined original picture at a predetermined time from a multi-view source storage unit, performs encoding, and creates an I frame, a P frame, and a D frame,
wherein the D frame includes a first D frame created by performing distributed source coding using the original picture corresponding to a first view at a first time as input information and using, as side information, at least one first picture that corresponds to the first view at the first time and is created using a frame at the first time;
(B) a coding graph creation unit that creates a coding graph by connecting the I frame, the P frame, and the D frame by a series of time-sequential paths; and
(C) a storage unit that stores the created coding graph in a coding graph storage unit.

[Claim 2]
Claim 2 solves the problem by using DSC 2 shown in the third embodiment.
The control system according to claim 1, wherein the D frame further includes a second D frame created by performing distributed source coding using the original picture corresponding to the first view at the first time as input information and using, as side information, at least one picture at the previous time that is created using a frame at the time immediately before the first time.

[Claim 3]
Claim 3 solves the problem by using DSC3 (DSC1 + DSC2) shown in the fourth embodiment.
The control system according to claim 2, wherein the D frame further includes a third D frame created by performing distributed source coding using the original picture corresponding to the first view at the first time as input information and using at least one of the first D frame and the second D frame as side information.

[Claim 4]
In a fourth aspect of the present invention, the system of the first aspect further minimizes the amount of multi-view streams to be transmitted under a predetermined memory capacity limit in the coding graph storage unit.

[Claim 5]
In a fifth aspect of the present invention, the system of the first aspect further includes a multi-view stream transmission unit that transmits a necessary stream to the client.

According to the technology disclosed in this specification, when a user watching a multi-view video stream interactively transmits a view change request to the stream server and the stream server transmits the corresponding view stream to the client, the storage efficiency of stream data in the stream server can be improved and the transmission cost to the client can be reduced.

FIG. 1 is a diagram showing the concept of data acquisition for a multi-view stream.
FIG. 2 is a diagram showing the concept of an interactive multi-view stream.
FIG. 3 is a hardware block diagram of the inside of a computer that implements the present invention.
FIG. 4 is a software block diagram of the stream server.
FIG. 5 is a software block diagram of the client 130.
FIG. 6 shows an example of the coding graph of Example 1 of the present invention.
FIG. 7 is a diagram showing the concept of distributed source coding.
FIG. 8 is a diagram showing the concept of applying distributed source coding in Example 2.
FIG. 9 shows an example of the coding graph of Example 2 of the present invention.
FIG. 10 is a diagram showing the concept of applying distributed source coding in Example 3.
FIG. 11 shows an example of the coding graph of Example 3 of the present invention.
FIG. 12 shows an example of the coding graph of Example 4 of the present invention.
FIG. 13 is a diagram showing the relationship between the transmission rate and the required memory capacity for distributed source coding.
FIG. 14 is a diagram showing the processing flow in the stream server.

[Interactive multi-view video stream]
FIG. 1 shows a scene in which one person is photographed simultaneously by n cameras. Views (j) (j = 1 to n) photographed by these n cameras are sent to the user, and the user can designate and view a desired view.
FIG. 2 shows an overview of a system for interactively delivering an interactive multi-view video stream according to a request from a user. Views taken by multiple cameras are stored in a multi-view source. These multiviews are encoded by the encoding unit 116, and a coding graph is created by the coding graph creation unit 118. The user interactively transmits a view request from the client 130 to the stream server 110. The stream server 110 transmits a view stream desired by the user to the client s (s = 1 to q).

[Example using I frame and P frame]
[Computer internal hardware configuration diagram]
FIG. 3 shows the hardware configuration inside the computer of the stream server 110 that implements the present invention and of each client 130s (s = 1 to q). Each of these computer-based apparatuses has a processor 302, which executes the programs of the present invention. Data and commands are sent from the processor 302 through the communication bus 304. The computer 300 also has a main memory 306, typified by RAM, from which program code is executed. The secondary memory 308 includes, for example, one or more hard disks 310 or a removable storage drive 312 such as a flexible disk drive, a magnetic tape drive, or a CD drive, which drives the removable storage 314. The secondary memory 308 stores the program code for implementing the present invention.

User I/O devices include a keyboard 316, a mouse 318, a display 320, and the like. The display adapter 322 interfaces the communication bus 304 and the display 320, receives data from the processor 302, and converts it into display commands for the display 320. Further, the processor 302 communicates with, for example, the Internet and a LAN through the network adapter 324.

[Software block diagram]
FIG. 4 is a software block diagram of the stream server 110. The multi-view source storage unit 112 stores a plurality of view streams. The view streams are sent to the encoding unit 116, which encodes them to create I frames, P frames, and D frames (described later). The coding graph creation unit 118 combines the I frames, P frames, and D frames and connects them by predetermined paths to form a coding graph as shown in FIGS. 6, 9, 11, and 12. The coding graph is constructed while optimizing the storage efficiency and the transmission bandwidth. If necessary, data is sent back to the encoding unit 116 for further encoding, and the coding graph is reconstructed using the result.

The created coding graph is stored in the coding graph storage unit 120. The receiving unit 124 receives the information on the view desired by the user, transmitted from the client 130 (FIG. 5), and sends it to the view stream transmission control unit 122. The view stream transmission control unit 122 reads the view stream desired by the user from the coding graph storage unit 120 and transmits it to the client via the transmission unit 126.

[Processing flow diagram on stream server]
FIG. 14 is a diagram showing a series of processing flows in the stream server in the present invention.
Step 410: Reading the original pictures. In this step, the encoding unit 116 reads each picture of the multi-view stream from the multi-view source storage unit 112.
Step 420: Encoding the original pictures. In this step, the encoding unit 116 encodes the original pictures it has read to create I frames, P frames, and D frames.
Step 430: Creating the coding graph. In this step, the coding graph creation unit 118 creates a coding graph using the created I frames, P frames, and D frames.
Step 440: Minimizing the transmission stream under the memory limit. In this step, the coding graph creation unit 118 optimizes the configuration of the coding graph so as to minimize the bandwidth of the multi-view stream to be transmitted under the memory capacity limit of the coding graph storage unit 120. If the configuration of the coding graph is not yet optimized, the process returns to Step 420.
Step 450: Storing the coding graph. In this step, the coding graph creation unit 118 stores the created coding graph in the coding graph storage unit 120.
Step 460: Receiving a request from the client. In this step, the view stream transmission control unit 122 receives a view request from the client 130 through the receiving unit 124.
Step 470: Transmitting the necessary stream in accordance with the request. In this step, the view stream transmission control unit 122 transmits the necessary data from the coding graph storage unit 120 to the client 130 according to the request.

FIG. 5 shows a software block diagram of the client 130. The client 130 receives designation of a desired view stream from the user via the user input unit 132. The information is transmitted to the stream server 110 via the transmission unit 134. A video stream corresponding to the view designated by the user is transmitted from the stream server 110. The receiving unit 136 receives the video stream and transmits it to the stream data storage unit 137. The stream data storage unit 137 stores the received video stream and sends it to the decoding unit 138. The decoding unit 138 performs decoding, and the multi-view display unit 140 displays the decoded data. Further, the decoded data is stored in the stream data storage unit 137 as necessary.

[Example using I frame and P frame]
FIG. 6 is a diagram illustrating an example of a coding graph using I frames and P frames. Here, a square represents an I frame, an ellipse represents a P frame, and a dashed rectangle represents a picture. The pictures indicated by dashed rectangles need not be stored in the coding graph storage unit 120. Since an I frame contains all the information necessary for displaying the corresponding original picture F^o, the picture F can be reproduced from the I frame alone. Here, the superscript o in F^o denotes an original picture. The picture F is a picture reconstructed after compression and therefore differs from F^o.
Since a P frame contains difference information from the picture at the previous time, the data of the picture F at the previous time is required in order to reproduce the picture F.

Here, at time index t = 2, there are a plurality of P frames for reproducing view (1), one for each path from a frame at t = 1, namely P(2,1,1) and P(2,1,2). Similarly, the P frames for reproducing view (2) are P(2,2,1) and P(2,2,2). The third parameter k distinguishes these P frames: P(i,j,k) denotes the k-th P frame (k = 1 to m) among the P frames for reproducing view (j) at time index (i).

[Play video stream]
When the user, after viewing picture F at time index (i) and view (j), requests view (h) (h = 1 to n) at the next time index (i+1), the stream server selects one of the following (1) to (3) and transmits it to the client.
(1) I frame I(i+1, h) (h = 1 to n, where j-1 <= h <= j+1)
In this case, the client can reproduce F(i+1, h) by decoding I(i+1, h).
(2) P(i+1, h, k') (k' = 1 to m), which is difference information from the currently viewed picture F(i, j, k)
In this case, F(i+1, h, k) means a picture that can be reproduced using P(i+1, h, k).
(3) P(i+1, h, k), which is difference information from a picture F(i, j, r) (r ≠ k), together with the series of P frames or I frames up to time index (i) needed to reproduce F(i, j, r)
In this case, the series of P frames or I frames up to time index (i) is used to reproduce the picture F(i, j, r) (r ≠ k), and F(i+1, h, k) can then be reproduced using P(i+1, h, k), which is difference information from F(i, j, r).

The problem with using P frames is that the number of coding paths in the coding graph, and with it the number of P frames required to reproduce the same picture, increases rapidly over time. For example, in FIG. 6, the number of P frames is 2 at time index t = 1 but increases to 4 at t = 2. A large number of P frames is thus required as time passes, which means that a large memory capacity (storage cost) is required in the stream server.
In general, when the number of P frames and the number of decoding paths become large, P frames are replaced with I frames to avoid this. This reduces the amount of memory required in the stream server, but it increases the amount of data to be sent to the client, which places a burden on the bandwidth of the transmission line.
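The growth described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the starting I frame at t = 0 can lead to any view, that every decoding path gets its own P frame, and that a viewer of view j may switch only to views j-1, j, and j+1 (the restriction stated in option (1) above). The function name `p_frames_per_time` is hypothetical and does not come from the specification.

```python
def p_frames_per_time(num_views: int, horizon: int) -> list[int]:
    """Total number of P frames at each time index 1..horizon when every
    decoding path has its own P frame (no I or D frames after t = 0)."""
    if horizon < 1:
        return []
    # Assumption: the single I frame at t = 0 can lead to any view at t = 1.
    paths_per_view = [1] * num_views
    totals = [sum(paths_per_view)]
    for _ in range(horizon - 1):
        next_paths = [0] * num_views
        for j, count in enumerate(paths_per_view):
            # A viewer of view j may move only to the neighbouring views.
            for h in range(max(0, j - 1), min(num_views, j + 2)):
                next_paths[h] += count
        paths_per_view = next_paths
        totals.append(sum(paths_per_view))
    return totals


if __name__ == "__main__":
    # Two views, as in FIG. 6: 2 P frames at t = 1 and 4 at t = 2.
    print(p_frames_per_time(num_views=2, horizon=3))
```

With two views the totals double at every time index, which is the storage growth that motivates the D frames introduced in the following embodiments.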

[Example using distributed source coding (DSC1)]
In the second embodiment, a solution using distributed source coding is disclosed to solve the above-described problem.

[Distributed source coding]
First, distributed source coding will be described. FIG. 7 is a diagram illustrating the concept of distributed source coding. The encoder performs distributed source coding using X as input information and Y1, Y2, ..., Yk as side information for X, and obtains a frame Dy as a result. The encoder transmits Dy to the decoder, and the decoder can decode X using at least one of the side information items Y1, Y2, ..., Yk.
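The idea that a single encoded frame can be decoded with any one of several correlated side information items can be sketched with a toy scalar example. The specification does not state which distributed source coding scheme is used, so the coset (modulo) scheme below is only an assumption chosen for illustration, and the names `dsc_encode`, `dsc_decode`, and `modulus` are hypothetical.

```python
def dsc_encode(x: int, modulus: int) -> int:
    """Send only the coset index of x (far fewer bits than x itself)."""
    return x % modulus


def dsc_decode(coset: int, side_info: int, modulus: int) -> int:
    """Pick the member of the coset that is closest to the side information.

    Decoding is correct whenever |x - side_info| < modulus / 2.
    """
    # Largest value <= side_info that is congruent to coset (mod modulus).
    below = side_info - ((side_info - coset) % modulus)
    above = below + modulus
    return min((below, above), key=lambda c: abs(c - side_info))


if __name__ == "__main__":
    x = 100                      # input information X
    d = dsc_encode(x, modulus=8)  # the transmitted "D frame" (toy version)
    for y in (98, 101, 103):     # different side information Y1, Y2, Y3
        print(y, "->", dsc_decode(d, y, modulus=8))
```

The same transmitted value is decoded to X from every side information item that lies close enough to X, which mirrors how one D frame can serve clients that arrived along different decoding paths.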

FIG. 8 is a diagram illustrating the concept of distributed source coding (DSC1) used in the second embodiment. The encoder performs encoding using the original picture F^o(i+1, j), which is view (j) at time index (i+1), as input information and the pictures F(i+1, j, k) (k = 1 to m) of view (j) at time index (i+1) as side information, to create Dp. The decoder decodes Dp using at least one of the pictures F(i+1, j, k) (k = 1 to m) of view (j) at time index (i+1) as side information, and obtains the picture FN(i+1, j, k) (k = 1 to m) to be displayed.
In this embodiment, the user is viewing one of F(i, j, k) (k = 1 to m). Therefore, if P(i+1, j, k') (k' = 1 to m), corresponding to the difference from the picture being viewed, is transmitted to the client, the client can reproduce F(i+1, j, k) (k = 1 to m). Using this as side information, Dp can be decoded to obtain the picture FN(i+1, j, k) (k = 1 to m) to be displayed.

[Build coding graph]
FIG. 9 is a diagram illustrating a coding graph using the first example (DSC1) of distributed source coding. D(3,1) is created using the original picture F^o(3,1) corresponding to view (1) at time index t = 3 as input information and the pictures F(3,1,1), F(3,1,2), F(3,1,3), and F(3,1,4) corresponding to view (1) as side information. Note that at time index t = 3 there are four pictures F corresponding to view (1), one for each path.

F(3,1,1) is reproduced through a series of frames starting from the I frame, namely I(0,0), P(1,1,1), P(2,1,1), and P(3,1,1). The same applies to the other three pictures F. In FIG. 9, D(3,2), F(3,2,1), F(3,2,2), F(3,2,3), and F(3,2,4) corresponding to view (2) are omitted.
Since D(3,1) is created using the original picture F^o(3,1) as input information and the four pictures F(3,1,k) (k = 1 to 4) reproducing view (1) at time index t = 3 as side information, the size of the DSC frame D(3,1) can be very small. The four pictures F(3,1,k) (k = 1 to 4) need not be stored in the stream server after the DSC frame D(3,1) is created. Pictures F(3,1,k) (k = 1 to 4) that need not be stored in the stream server are indicated by broken lines.

In each embodiment of the present specification, a predetermined evaluation function I is defined, and the I frames, P frames, D frames, and coding graph configuration that minimize it are obtained. In this embodiment, the evaluation function is
I = T(s) + λB(s)
although the evaluation function need not be limited to this. Here, T(s) is the transmission cost, B(s) is the storage cost, λ is a predetermined constant, and s represents the configuration of the coding graph. Although details of the calculation method are omitted, the value of I is computed while the structure is changed step by step, and the structure that minimizes I over the entire coding graph is obtained.
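As a rough illustration of how the evaluation function I = T(s) + λB(s) selects a structure, the Python sketch below compares a few candidate structures by brute force. The specification omits the actual search procedure, so the exhaustive comparison, the class `Structure`, the function names, and all cost values are hypothetical placeholders rather than data from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """One candidate coding graph configuration s (hypothetical values)."""
    name: str
    transmission_cost: float  # T(s)
    storage_cost: float       # B(s)


def evaluate(s: Structure, lam: float) -> float:
    """Evaluation function I = T(s) + lambda * B(s)."""
    return s.transmission_cost + lam * s.storage_cost


def best_structure(candidates: list[Structure], lam: float) -> Structure:
    """Return the candidate with the smallest evaluation value."""
    return min(candidates, key=lambda s: evaluate(s, lam))


# Illustrative numbers only: I frames are costly to transmit but cheap to
# store, P frames are the opposite, and a mix with D frames sits in between.
CANDIDATES = [
    Structure("I frames only", transmission_cost=10.0, storage_cost=4.0),
    Structure("I and P frames", transmission_cost=3.0, storage_cost=20.0),
    Structure("I, P and D frames", transmission_cost=5.0, storage_cost=8.0),
]

if __name__ == "__main__":
    for lam in (0.1, 1.0, 10.0):
        print(lam, best_structure(CANDIDATES, lam).name)
```

A small λ favours structures that are cheap to transmit, and a large λ favours structures that are cheap to store, so λ acts as the knob for the storage and bandwidth trade-off.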

[Picture playback on client]
If the user is viewing picture F(2,1,1) at time index t = 2 and then requests view (1) at time index t = 3, the D(3,1) frame and P(3,1,1) are transmitted.
Since the user is viewing picture F(2,1,1), F(3,1,1) can be reproduced once P(3,1,1) is received.
Therefore, the picture FN(3,1,1) to be displayed on the multi-view display unit can be obtained by decoding with the D(3,1) frame as input information and F(3,1,1) as side information.
[Effect of Example 2]
In the DSC1 disclosed in the second embodiment, the information transmitted from the stream server to the client is D(3,1) and P(3,1,1).
Since D(3,1) created by DSC1 uses input information and side information of the same view, its frame size is very small. Further, since P(3,1,1) is a difference from the picture F at the previous time, its frame size is also small. Therefore, this is more advantageous than transmitting the I frame I(3,1).
Further, in DSC1, a DSC frame is constructed using a plurality of P frames. That is, at time index t = 3, the decoding paths from the P frames for reproducing the pictures F(3,1,k) (k = 1 to 4) all lead only to the D(3,1) frame. As a result, the number of paths in the coding graph from t = 4 onward can be reduced.

[Example using distributed source coding (DSC2)]
[Distributed source coding]
FIG. 10 is a diagram illustrating the concept of distributed source coding (DSC2) used in the third embodiment. The stream server performs encoding using the original picture F^o(i+1, j) of view (j) at time index (i+1) as input information and, as side information, the pictures F at time index (i), namely F(i,1,1), F(i,1,2), ..., F(i,1,k1), ..., F(i,j,1), F(i,j,2), ..., F(i,j,kj), ..., F(i,n,1), F(i,n,2), ..., F(i,n,kn), to create Dp (j = 1 to n).
Here, F(i,1,1), F(i,1,2), ..., F(i,1,k1) represent view (1) at time index (i). There are one or more pictures showing view (1) at time index (i), and the third parameter (up to k1) is used to distinguish them. The same applies to F(i,j,1), F(i,j,2), ..., F(i,j,kj) representing view (j) and to F(i,n,1), F(i,n,2), ..., F(i,n,kn) representing view (n).
The client decodes Dp using at least one of the pictures F of view (j) at time index (i), for example F(i,j,w), as side information, and can obtain the picture F(i+1,j,w) to be displayed on the multi-view display unit. Here, the possible range of w is 1 to (k1 + ... + kj + ... + kn).

[Build coding graph]
FIG. 11 is a diagram illustrating a coding graph using the second example (DSC2) of distributed source coding. In DSC2, the D(3,1) frame is created using the original picture F^o(3,1) at time index t = 3 as input information and the pictures F(2,1,1), F(2,2,1), F(2,1,2), and F(2,2,2) at time index t = 2 as side information. In the third embodiment, pictures of both view (1) and view (2) are used as side information. Similarly, the D(3,2) frame is created using the original picture F^o(3,2) at time index t = 3 as input information and the pictures F(2,1,1), F(2,2,1), F(2,1,2), and F(2,2,2) at time index t = 2 as side information. In FIG. 11, the D(3,2) frame is omitted.

In this case, when constructing D(3,1) at a certain time, here time index t = 3, it is not necessary to create a plurality of P frames at time index t = 3. This means that the amount of memory required in the coding graph storage unit 120 of the stream server can be reduced compared with DSC1. On the other hand, since the pictures F at time index t = 2 used as side information include pictures of view (2), their correlation with the input information F^o(3,1) is generally weaker. As a result, the size of the D frame of DSC2 tends to be larger than that of DSC1. In the coding graph of FIG. 11, D(3,1) is created using F(2,1,1), F(2,2,1), F(2,1,2), and F(2,2,2) as side information. Alternatively, D(2,1) (not shown) can be created using F(1,1,1) and F(1,2,1) as side information and used in the coding graph. As described above, there are many variations in how a coding graph can be configured using I frames, P frames, and D frames.

[Picture playback on client]
When the request at time index t = 3 is for view (1), the corresponding D(3,1) frame is transmitted to the user.
Here, since the user is viewing a picture at time index t = 2, the client has one of the pictures F(2,1,1), F(2,2,2), F(2,1,3), and F(2,2,4) at time index t = 2. The client can reproduce F(3,1) by decoding with the D(3,1) frame as input information and the currently viewed picture F as side information.
[Effect of Example 3]
In the DSC2 disclosed in the third embodiment, the information transmitted from the stream server to the client is D(3,1). Since the client has one of the pictures F(2,1,1), F(2,2,2), F(2,1,3), and F(2,2,4) at time index t = 2, the user can reproduce the desired F(3,j,k). In the prior art, an I frame is used instead of a D frame, but in general the frame size of D(3,1) is smaller than that of the I frame I(3,j). Therefore, in DSC2, both the memory capacity required in the stream server and the amount of data transmitted to the client are reduced.

[Example using distributed source coding DSC1 + DSC2]
[Build coding graph]
FIG. 12 is a diagram showing a coding graph using the third example (DSC3) using distributed source coding. This is an embodiment in which DSC1 and DSC2 are combined at time index t = 3.
(1) F (3,1,1) is created using F (2,1,1) and P (3,1,1).
(2) F(2,2,1) can be reproduced using P(2,2,1) and F(1,1,1). Similarly, F(2,1,2) can be reproduced from P(2,1,2) and F(1,2,2), and F(2,2,2) can be reproduced from P(2,2,2) and F(1,2,2).
(3) Encoding is performed using the original picture F^o(3,1) as input information and F(2,2,1), F(2,1,2), and F(2,2,2) as side information to create D(3,1,1). This applies DSC2.
(4) Decoding is performed using D(3,1,1) and any one of F(2,2,1), F(2,1,2), and F(2,2,2) to obtain F(3,1,2).
(5) Encoding is performed using the original picture F^o(3,1) as input information and F(3,1,1) and F(3,1,2) as side information to obtain D(3,1,2). This applies DSC1.
Here, D(i,j,m) is a DSC frame for creating view (j) at time index t = i, and denotes the m-th D frame at time index t = i.

[Picture playback on client]
(1) When the user is viewing F(2,1,1):
The stream server transmits D(3,1,2) and P(3,1,1) to the client.
The client creates F(3,1,1) using P(3,1,1) based on the F(2,1,1) it already has. It then decodes with D(3,1,2) as input information and F(3,1,1) as side information to obtain the picture FN(3,1,1) to be displayed on the multi-view display unit. See FIG. 8 for picture FN.
(2) When the user is viewing one of F(2,2,1), F(2,1,2), and F(2,2,2):
The stream server transmits D(3,1,2) and D(3,1,1) to the client.
The client decodes with D(3,1,1) as input information and the picture F viewed at time index t = 2 as side information to obtain F(3,1,2). It then decodes with D(3,1,2) as input information and F(3,1,2) as side information to obtain the picture FN(3,1,2) to be displayed on the multi-view display unit. See FIG. 8 for picture FN.
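The two playback cases above amount to a lookup from the picture the client currently holds to the frames the stream server sends. The Python sketch below encodes only the FIG. 12 situation (a request for view (1) at t = 3). The function name `frames_to_send` and the string labels are hypothetical conventions for this illustration.

```python
def frames_to_send(viewed_picture: str) -> list[str]:
    """Frames sent for view (1) at t = 3 in the DSC3 coding graph of FIG. 12,
    listed in the order in which the client uses them."""
    if viewed_picture == "F(2,1,1)":
        # Path (a): a P frame rebuilds F(3,1,1), then D(3,1,2) is decoded.
        return ["P(3,1,1)", "D(3,1,2)"]
    if viewed_picture in {"F(2,2,1)", "F(2,1,2)", "F(2,2,2)"}:
        # Path (b): D(3,1,1) yields F(3,1,2), then D(3,1,2) is decoded.
        return ["D(3,1,1)", "D(3,1,2)"]
    raise ValueError(f"no decoding path from {viewed_picture!r}")


if __name__ == "__main__":
    print(frames_to_send("F(2,1,1)"))
    print(frames_to_send("F(2,2,2)"))
```

In both cases the last frame sent is D(3,1,2), so every client ends the time step on the same D frame regardless of the path it arrived on.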

[Effect of Example 4]
In the fourth embodiment, an example of DSC3 (a combination of DSC1 and DSC2) is shown. Although details will be described later, the amount of data stored in the stream server satisfies
DSC1 > (DSC1 + DSC2) > DSC2
and the bandwidth required for transmission from the stream server to the client satisfies
DSC1 < (DSC1 + DSC2) < DSC2.
Compare path (a) and path (b) in FIG. 12.
In the case of path (a), the stream server transmits D(3,1,2) and P(3,1,1) to the client.
In the case of path (b), the stream server transmits D(3,1,2) and D(3,1,1) to the client.

Therefore, when it is known empirically or probabilistically that path (a) is used frequently, constructing the coding graph with the (DSC1 + DSC2) configuration can reduce the amount of data stored in the stream server and can also reduce the effective bandwidth for transmission from the stream server to the client.
In general, a transmitted D frame is larger than a P frame. Therefore, if it is known in advance that the probability of passing through path (a) is higher than that of path (b), the configuration of the coding graph shown in the fourth embodiment reduces the amount of data to be transmitted.

[Advantages of the present invention]
(1) Comparison of DSC1 and DSC3 (DSC1 + DSC2)
Regarding the storage capacity required in the stream server, a comparison of FIG. 9 (DSC1) and FIG. 12 (DSC3) shows that the number of P frames is small in DSC3, while DSC1 needs to store a plurality of P frames. Therefore, DSC3 is generally more advantageous.
Regarding the amount of data transmitted from the stream server to the client, DSC1 transmits a P frame and a D frame. DSC3, on the other hand, transmits either a P frame and a D frame or two D frames. Therefore, DSC1 is generally more advantageous.
(2) Comparison of DSC2 and DSC3 (DSC1 + DSC2)
Regarding the storage capacity required in the stream server, a comparison of FIG. 11 (DSC2) and FIG. 12 (DSC3) shows that DSC3 includes two D frames, while DSC2 includes one D frame. Therefore, DSC2 is generally more advantageous.
Regarding the amount of data transmitted from the stream server to the client, DSC2 transmits one D frame. DSC3, on the other hand, transmits either the P frame and the second D frame (path (a) in FIG. 12) or the first and second D frames (path (b) in FIG. 12). However, since the second D frame of DSC3 uses pictures of the same view as input information and side information, its size can be very small. When the P frame and the second D frame of DSC3 together are smaller than the D frame of DSC2, and the probability that path (a) in FIG. 12 is selected rather than path (b) is sufficiently high, DSC3 is more advantageous than DSC2.
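The condition under which DSC3 transmits less than DSC2 can be made concrete by weighting the two paths of FIG. 12 by how often they are taken. The Python sketch below assumes a single probability `p_a` for path (a) and uses made-up frame sizes. None of the numbers or names come from the specification.

```python
def expected_dsc3_size(p_a: float, p_frame: float,
                       first_d: float, second_d: float) -> float:
    """Expected amount sent per view request in the DSC3 graph of FIG. 12.

    Path (a) sends the P frame and the second D frame.
    Path (b) sends the first and the second D frame.
    """
    path_a = p_frame + second_d
    path_b = first_d + second_d
    return p_a * path_a + (1.0 - p_a) * path_b


def dsc3_beats_dsc2(p_a: float, p_frame: float, first_d: float,
                    second_d: float, dsc2_d: float) -> bool:
    """True when DSC3 is expected to transmit less than the single
    D frame that DSC2 sends."""
    return expected_dsc3_size(p_a, p_frame, first_d, second_d) < dsc2_d


if __name__ == "__main__":
    # Hypothetical sizes in arbitrary units.
    sizes = dict(p_frame=10.0, first_d=30.0, second_d=5.0, dsc2_d=30.0)
    for p_a in (0.1, 0.5, 0.9):
        print(p_a, dsc3_beats_dsc2(p_a, **sizes))
```

With these placeholder sizes the break-even point is p_a = 0.25. Above it the cheap path (a) dominates and DSC3 wins, which matches the qualitative condition stated above.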

FIG. 13 shows the expected trade-off between the amount of transmission data and the amount of memory storage when the three distributed source coding techniques DSC1, DSC2, and DSC3 (DSC1 + DSC2) disclosed in this specification are used and when I frames are used.
It can be seen that the DSC1 disclosed in this specification achieves the smallest amount of transmission data on the transmission line. DSC1 is effective when transmitting over a transmission line with limited bandwidth.
Further, DSC2 requires both a smaller amount of transmission data and a smaller amount of memory storage than a coding graph of only I frames. With DSC3, that is, the combination of DSC1 and DSC2, it is possible to trade off the bandwidth of the communication line against the amount of memory storage in the stream server.
In this example, it has been demonstrated that, in terms of memory storage, the coding graph including DSC requires 28% less than the coding graph of only I frames and 20% more than the coding graph of I frames and P frames.

110: Stream server
112: Multi-view source storage unit
116: Encoding unit
118: Coding graph creation unit
120: Coding graph storage unit
122: View stream transmission control unit
124: Reception unit
126: Transmission unit
130: Client
132: User input unit
134: Transmission unit
136: Reception unit
138: Decoding unit
140: Multi-view display unit
300: Computer
302: Processor
304: Communication bus
306: Main memory
308: Secondary memory
310: Hard disk
312: Removable storage drive
316: Keyboard
318: Mouse
320: Display
322: Display adapter

Claims

A multi-view stream data control system using a computer, comprising:
(A) an encoding unit that reads a predetermined original picture at a predetermined time from a multi-view source storage unit, performs encoding, and creates an I frame, a P frame, and a D frame,
wherein the D frame includes a first D frame created by performing distributed source coding using the original picture corresponding to a first view at a first time as input information and using, as side information, at least one first picture that corresponds to the first view at the first time and is created using a frame at the first time;
(B) a coding graph creation unit that creates a coding graph by connecting the I frame, the P frame, and the D frame by a series of time-sequential paths; and
(C) a storage unit that stores the created coding graph in a coding graph storage unit.
The control system according to claim 1, wherein the D frame further includes a second D frame created by performing distributed source coding with the original picture corresponding to the first view at the first time as input information and with, as side information, at least one picture at the time immediately before the first time, the picture being created using a frame at that immediately preceding time.
The control system according to claim 2, wherein the D frame includes a third D frame created by performing distributed source coding with the original picture corresponding to the first view at the first time as input information and with at least one of the first D frame and the second D frame as side information.
The control system according to claim 1, wherein the coding graph creation unit creates the coding graph so as to minimize the amount of multi-view stream data to be transmitted under a predetermined memory capacity limit of the coding graph storage unit.
The control system according to claim 1, further comprising a multi-view stream transmission unit that reads data stored in the coding graph in accordance with a request from a client and transmits a necessary stream to the client.
A multi-view stream data creation method in a multi-view stream data control system using a computer, the method comprising:
(A) a step in which an encoding unit reads a predetermined original picture at a predetermined time from a multi-view source storage unit, performs encoding, and creates an I frame, a P frame, and a D frame,
the step further including creating a first D frame by performing distributed source coding with the original picture corresponding to a first view at a first time as input information and with, as side information, at least one first picture that corresponds to the first view at the first time and is created using a frame at the first time;
(B) a step in which a coding graph creation unit creates a coding graph by connecting the I frame, the P frame, and the D frame by a series of time-sequential paths; and
(C) a step in which the coding graph creation unit stores the created coding graph in a coding graph storage unit.
The method according to claim 6, further comprising a step in which the coding graph creation unit creates a second D frame by performing distributed source coding with the original picture corresponding to the first view at the first time as input information and with, as side information, at least one picture at the time immediately before the first time, the picture being created using a frame at that immediately preceding time.
The method according to claim 7, further comprising a step in which the coding graph creation unit creates a third D frame by performing distributed source coding with the original picture corresponding to the first view at the first time as input information and with at least one of the first D frame and the second D frame as side information.
A computer-readable medium storing program code that, when executed on a computer, performs the following steps for creating multi-view stream data:
(A) a step in which an encoding unit reads a predetermined original picture at a predetermined time from a multi-view source storage unit, performs encoding, and creates an I frame, a P frame, and a D frame,
the step further including creating a first D frame by performing distributed source coding with the original picture corresponding to a first view at a first time as input information and with, as side information, at least one first picture that corresponds to the first view at the first time and is created using a frame at the first time;
(B) a step in which a coding graph creation unit creates a coding graph by connecting the I frame, the P frame, and the D frame by a series of time-sequential paths; and
(C) a step in which the coding graph creation unit stores the created coding graph in a coding graph storage unit.
The medium according to claim 9, wherein the steps further include a step in which the coding graph creation unit creates a second D frame by performing distributed source coding with the original picture corresponding to the first view at the first time as input information and with, as side information, at least one picture at the time immediately before the first time, the picture being created using a frame at that immediately preceding time.
The medium according to claim 10, wherein the steps further include a step in which the coding graph creation unit creates a third D frame by performing distributed source coding with the original picture corresponding to the first view at the first time as input information and with at least one of the first D frame and the second D frame as side information.