CN108702478B - File generation device, file generation method, reproduction device, and reproduction method - Google Patents
- Publication number: CN108702478B (application CN201780011647.6A)
- Authority: CN (China)
- Prior art keywords: file, depth, quality, image, track
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04N13/275—Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
- H04N5/76—Television signal recording
- G02B30/40—Optical systems or apparatus for producing three-dimensional [3D] effects, e.g. stereoscopic images, giving the observer of a single two-dimensional [2D] image a perception of depth
- G06T1/00—General purpose image data processing
- G06T1/0007—Image acquisition
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/178—Metadata, e.g. disparity information
- H04N21/2362—Generation or processing of Service Information [SI]
- H04N5/91—Television signal processing therefor
- H04N9/8205—Transformation of the television signal for recording involving the multiplexing of an additional signal and the colour video signal
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Abstract
The present disclosure relates to a file generation apparatus, a file generation method, a reproduction apparatus, and a reproduction method that make it possible to generate a file that efficiently stores quality information of a depth-related image including at least a depth image. A clip file generating unit generates a quality file in which quality information indicating the quality of a depth-related image including at least a depth image is divided by type. The present disclosure can be applied, for example, to a file generation apparatus of an information processing system that distributes segment files of moving image content and an MPD file using a technique based on MPEG-DASH.
Description
Technical Field
The present disclosure relates to a file generating apparatus and a file generating method, and a reproducing apparatus and a reproducing method, and particularly to a file generating apparatus and a file generating method, and a reproducing apparatus and a reproducing method, that can generate a file that efficiently stores quality information of a depth-related image including at least a depth image.
As a technique for realizing stereoscopic vision, a technique using a texture image and a depth image is available. A depth image is an image in which the pixel value of each pixel indicates the position of the image pickup object in the depth direction at that pixel.
In the technique just described, an occlusion image is sometimes used as additional information in order to realize natural stereoscopic vision. An occlusion image is a texture image of an occlusion region, that is, a region of the image pickup object that does not appear in the texture image because it is not visible from the viewpoint of the texture image (for example, an image pickup object hidden behind a closer image pickup object). By using not only the texture image and the depth image but also the occlusion image, it is possible to generate a 3D image that maintains stereoscopic vision even when the scene is viewed from viewpoints different from that of the texture image.
The texture image and the depth image may be transmitted, for example, by the existing MPEG-DASH (Moving Picture Experts Group Dynamic Adaptive Streaming over HTTP) method (for example, refer to Non-Patent Document 1).
In this case, the DASH client selects and acquires a depth image having the maximum acceptable bit rate from among the depth images of a plurality of bit rates stored in the DASH server, taking into account the buffer amount and the transmission path of the DASH client itself.
However, when the bit rate of the depth image has little influence on the picture quality of the 3D image, when the variation in the pixel values of the depth image is small, or in similar cases, the picture quality of the 3D image does not vary greatly with the bit rate of the depth image. In such cases, if the DASH client selects and acquires the depth image having the maximum acceptable bit rate, transmission-path bandwidth and buffer space are wasted.
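The quality-aware selection this motivates can be sketched as follows. This is a hedged illustration, not the patented method: the function name, the quality scale, and the tolerance threshold are all assumptions made for the example.

```python
def select_depth_bitrate(candidates, bandwidth_budget, quality_tolerance=0.5):
    """Pick a depth-image representation using quality information.

    Rather than always taking the highest bit rate that fits (which
    wastes transmission path and buffer when extra bits barely change
    the 3D picture quality), take the cheapest representation whose
    reported quality is within `quality_tolerance` of the best
    affordable one.
    """
    affordable = [c for c in candidates if c["bitrate"] <= bandwidth_budget]
    if not affordable:
        return None
    best = max(c["quality"] for c in affordable)
    near_best = [c for c in affordable
                 if best - c["quality"] <= quality_tolerance]
    return min(near_best, key=lambda c: c["bitrate"])

# Two depth streams whose reported quality barely differs:
# the 1 Mbps stream is chosen, saving bandwidth.
reps = [{"bitrate": 2_000_000, "quality": 42.0},
        {"bitrate": 1_000_000, "quality": 41.8}]
choice = select_depth_bitrate(reps, bandwidth_budget=2_000_000)
```

Without per-stream quality information a client cannot make this comparison, which is exactly why the disclosure stores it in a file the client can fetch.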
On the other hand, ISO/IEC 23001-10 proposes to store quality information representing the quality of one or more types of texture images into an MP4 file of the ISO base media file format.
[Reference List]
[Non-Patent Document]
[Non-Patent Document 1] ISO/IEC 23009-1, Dynamic adaptive streaming over HTTP (DASH) Part 1: Media presentation description and segment formats, April 2012
Disclosure of Invention
[Technical Problem]
As described above, for a depth-related image including at least a depth image, it is likewise required to store quality information into a file, similarly to a texture image, so that the DASH client can use the quality information to acquire a depth-related image of an appropriate bit rate. However, efficient storage of quality information of depth-related images into a file has not been considered.
The present disclosure has been made in view of such a situation as described above, and makes it possible to generate a file into which quality information of a depth-related image including at least a depth image is efficiently stored.
[Solution to Problem]
A file generating apparatus of a first aspect of the present disclosure is a file generating apparatus including a file generating unit configured to generate a file in which quality information representing the quality of a depth-related image including at least a depth image is set divided by type.
The file generation method of the first aspect of the present disclosure corresponds to the file generation apparatus of the first aspect of the present disclosure.
In the first aspect of the present disclosure, a file is generated in which quality information representing the quality of a depth-related image including at least a depth image is set divided by type.
A reproduction apparatus of a second aspect of the present disclosure is a reproduction apparatus including an acquisition unit configured to acquire quality information of a given type from a file in which quality information representing the quality of a depth-related image including at least a depth image is set divided by type.
The reproduction method of the second aspect of the present disclosure corresponds to the reproduction apparatus of the second aspect of the present disclosure.
In the second aspect of the present disclosure, quality information of the given type is acquired from a file in which quality information representing the quality of a depth-related image including at least a depth image is set divided by type.
It should be noted that the file generating apparatus of the first aspect and the reproducing apparatus of the second aspect of the present disclosure may be realized by causing a computer to execute a program.
Further, in order to realize the file generating apparatus of the first aspect and the reproducing apparatus of the second aspect of the present disclosure, the program executed by the computer may be provided either by transmission via a transmission medium or by recording on a recording medium.
[Advantageous Effects of Invention]
With the first aspect of the present disclosure, a file may be generated. In particular, with the first aspect of the present disclosure, a file may be generated in which quality information of a depth-related image including at least a depth image is efficiently stored.
With the second aspect of the present disclosure, quality information can be acquired from a file in which quality information of a depth-related image including at least a depth image is efficiently stored.
It should be noted that the effects described herein are not necessarily restrictive, and the effects may be any of the effects described in the present disclosure.
Drawings
Fig. 1 is a diagram showing an overview of an information processing system according to a first embodiment to which the present disclosure is applied.
Fig. 2 is a diagram showing an occlusion image.
Fig. 3 is a diagram showing a hierarchical structure of an MPD file.
Fig. 4 is a block diagram depicting an example of a configuration of the file generating apparatus of fig. 1.
Fig. 5 is a diagram depicting an example of a clip file in the first embodiment.
Fig. 6 is a diagram depicting an example of a description of QualityMetricsSampleEntry in the first embodiment.
Fig. 7 is a diagram showing the description of QualityMetricsSampleEntry of fig. 6.
Fig. 8 is a diagram depicting an example of metric_code.
Fig. 9 is a diagram depicting an example of a Representation element of an MPD file in the first embodiment.
Fig. 10 is a diagram depicting an example of description of an MPD file in the first embodiment.
Fig. 11 is a diagram showing a leva box in the first embodiment.
Fig. 12 is a flowchart showing a file generation process.
Fig. 13 is a block diagram depicting an example of a configuration of a streaming reproduction unit.
Fig. 14 is a flowchart showing a first example of reproduction processing in the first embodiment.
Fig. 15 is a flowchart showing a second example of reproduction processing in the first embodiment.
Fig. 16 is a diagram depicting an example of a clip file in the second embodiment of the information processing system to which the present disclosure is applied.
Fig. 17 is a diagram depicting an example of a clip file in the third embodiment of the information processing system to which the present disclosure is applied.
Fig. 18 is a diagram depicting an example of description of an MPD file in the third embodiment.
Fig. 19 is a flowchart showing a first example of reproduction processing in the third embodiment.
Fig. 20 is a flowchart showing a second example of reproduction processing in the third embodiment.
Fig. 21 is a diagram depicting an example of a clip file in the fourth embodiment of the information processing system to which the present disclosure is applied.
Fig. 22 is a diagram depicting an example of a clip file in the fifth embodiment of the information processing system to which the present disclosure is applied.
Fig. 23 is a diagram depicting an example of a clip file in the sixth embodiment of the information processing system to which the present disclosure is applied.
Fig. 24 is a diagram depicting an example of description of an MPD file in the sixth embodiment.
Fig. 25 is a diagram depicting an example of a clip file in the seventh embodiment of the information processing system to which the present disclosure is applied.
Fig. 26 is a diagram depicting an example of a configuration of a sample of the track of fig. 25.
Fig. 27 is a diagram depicting an example of the configuration of the moov box of the depth file.
Fig. 28 is a diagram depicting an example of a description of a QualityMetricsConfigurationBox.
Fig. 29 is a diagram depicting an example of a description of QualityMetricsSampleEntry in the seventh embodiment.
Fig. 30 is a diagram depicting an example of a description of a SubsampleInformationBox.
Fig. 31 is a diagram depicting an example of description of a SubsampleReferenceBox.
Fig. 32 is a diagram depicting an example of description of an MPD file in the seventh embodiment.
Fig. 33 is a diagram depicting an example of a clip file in the eighth embodiment of the information processing system to which the present disclosure is applied.
Fig. 34 is a diagram depicting a first example of the description of the leva box.
Fig. 35 is a diagram showing a first example of levels and sub-samples associated with each other through the leva box of fig. 34.
Fig. 36 is a diagram depicting a second example of the description of the leva box.
Fig. 37 is a diagram showing a second example of levels and sub-samples associated with each other by the leva box of fig. 36.
Fig. 38 is a diagram depicting an example of the description of the sub-sample group entry.
Fig. 39 is a diagram showing a third example of levels and sub-samples associated with each other by leva boxes.
Fig. 40 is a block diagram depicting an example of a hardware configuration of a computer.
Detailed Description
Hereinafter, modes for carrying out the present disclosure (hereinafter referred to as embodiments) are described. Note that the description is given in the following order.
1. The first embodiment: information processing system (FIGS. 1 to 15)
2. Second embodiment: information processing system (FIG. 16)
3. The third embodiment: information processing system (FIGS. 17 to 20)
4. The fourth embodiment: information processing system (FIG. 21)
5. Fifth embodiment: information processing system (FIG. 22)
6. Sixth embodiment: information processing system (FIGS. 23 and 24)
7. Seventh embodiment: information processing system (FIGS. 25 to 32)
8. Eighth embodiment: Information processing system (FIGS. 33 to 39)
9. Ninth embodiment: Computer (FIG. 40)
< first embodiment >
(overview of information processing System)
Fig. 1 is a diagram showing an overview of an information processing system according to a first embodiment to which the present disclosure is applied.
The information processing system 10 of fig. 1 is configured such that a Web server 12 serving as a DASH server, which is connected to the file generation apparatus 11, and a video reproduction terminal 14 serving as a DASH client are connected to each other via the Internet 13.
In the information processing system 10, the Web server 12 distributes the file of the video content generated by the file generation apparatus 11 to the video reproduction terminal 14 by a method compliant with MPEG-DASH.
Specifically, the file generating means 11 encodes, at one or more bit rates, the image data of texture images, depth images, and occlusion images of video content, the sound data, and metadata including quality information of the depth images and occlusion images.
It is assumed in this specification that texture images are available at two bit rates, 8 Mbps and 4 Mbps; depth images at two bit rates, 2 Mbps and 1 Mbps; and occlusion images at one bit rate, 1 Mbps. Further, in the following description, where the depth image and the occlusion image need not be particularly distinguished from each other, they are collectively referred to as depth occlusion images.
The file generating apparatus 11 files encoded streams of image data and sound data of respective bit rates generated as a result of encoding in units of time of several seconds to about ten seconds called clips in the ISO base media file format. The file generation device 11 uploads a clip file, which is an MP4 file of the image data and the sound data generated by the above-described processing, to the Web server 12.
Further, the file generating means 11 divides the encoded stream including the metadata of the quality information of the depth occlusion image in units of segments for each type of depth occlusion image, and files the division of the encoded stream in the ISO base media file format. The file generating means 11 uploads the fragment file of the metadata generated as a result of the processing just described to the Web server 12.
Further, the file generation means 11 generates an MPD (media presentation description) file (management file) for managing a segment file group of video content. The file generation device 11 uploads the MPD file to the Web server 12.
The Web server 12 stores therein the segment file and the MPD file uploaded from the file generation apparatus 11. The Web server 12 transmits the stored clip file or MPD file to the video reproduction terminal 14 in response to a request from the video reproduction terminal 14.
The video reproduction terminal 14 (reproduction apparatus) executes control software 21 for streaming data (hereinafter referred to as control software 21), video reproduction software 22, client software 23 for HTTP (HyperText Transfer Protocol) access (hereinafter referred to as access software 23), and the like.
The control software 21 is software for controlling data to be streamed from the Web server 12. Specifically, the control software 21 causes the video reproduction terminal 14 to acquire the MPD file from the Web server 12.
Further, the control software 21 issues a transmission request of the encoded stream of the clip file of the playback target to the access software 23 based on the playback target information indicating the playback target time and the bit rate or the like specified by the video playback software 22.
The video reproduction software 22 is software for reproducing the encoded stream acquired from the Web server 12. Specifically, the video reproduction software 22 specifies reproduction target information in which the encoded stream of the metadata is a reproduction target to the control software 21. Then, when receiving a notification of the start of reception of the encoded stream of metadata from the access software 23, the video reproduction software 22 decodes the encoded stream of metadata received by the video reproduction terminal 14.
The video reproduction software 22 specifies reproduction target information in which an encoded stream of image data or sound data of a predetermined bit rate is a reproduction target to the control software 21 based on quality information included in metadata obtained as a result of the decoding, a network bandwidth of the internet 13, and the like. Then, when receiving a notification of the start of reception of the encoded stream of the image data or the sound data from the access software 23, the video reproduction software 22 decodes the encoded stream of the image data or the sound data received by the video reproduction terminal 14.
The video reproduction software 22 outputs the image data of the texture image obtained as a result of the decoding as it is. Further, the video reproduction software 22 generates and outputs image data of a 3D image using the texture image and the depth image. Further, the video reproduction software 22 generates and outputs image data of a 3D image using the texture image, the depth image, and the occlusion image. Further, the video reproduction software 22 outputs sound data obtained as a result of the decoding.
The access software 23 is software for controlling communication with the Web server 12 through the internet 13 using HTTP. Specifically, in response to the instruction of the control software 21, the access software 23 causes the video reproduction terminal 14 to transmit a transmission request for an encoded stream of a clip file of a reproduction target. Further, the access software 23 causes the video reproduction terminal 14 to start receiving the encoded stream transmitted from the Web server 12 in response to the transmission request, and provides the video reproduction software 22 with a notification of the start of reception.
It should be noted that, since the present disclosure is an invention relating to image data and metadata of video content, hereinafter, description of storage and reproduction of a clip file of sound data is omitted.
(description of occlusion image)
Fig. 2 is a diagram showing an occlusion image.
If the images of the cylinder 41 and the cube 42 in the upper part of fig. 2 are taken from the front direction indicated by the arrow mark 51, a texture image 61 in the lower left side of fig. 2 is obtained.
In a case where the texture image 62 photographed from the direction viewed from the left side indicated by the arrow mark 52 is to be generated using the texture image 61 and the depth image of the texture image 61, the pixel values of the texture image 61 corresponding to the pixels of the texture image 62 are acquired based on the depth image of the texture image 61. Then, the texture image 62 is generated by determining the pixel values of the pixels of the texture image 62 as the pixel values of the texture image 61 corresponding to these pixels.
However, as shown on the right side of the lower part of fig. 2, the texture image 62 contains a region that has no counterpart in the texture image 61. Specifically, an occlusion region 43 appears: a region of the image pickup object that is not captured when imaging from the direction indicated by the arrow mark 51 but is captured when imaging from the direction indicated by the arrow mark 52 (in the example of fig. 2, a side face of the cube 42). The texture image of the occlusion region 43 is the occlusion image.
Accordingly, it is possible to generate the texture image 62 from a viewpoint different from that of the texture image 61 by using the texture image 61, the depth image of the texture image 61, and the texture image of the occlusion region 43. Further, the depth image of the texture image 62 may be generated from the texture image 62 and the depth image of the texture image 61. Accordingly, a 3D image from a viewpoint different from that of the texture image 61 can be generated from the texture image 62 and the depth image of the texture image 62.
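The view synthesis described above can be sketched in miniature for a single 1-D row of pixels. This is a toy illustration under stated assumptions (integer depth values used directly as a disparity, with a larger value meaning closer to the camera), not the actual rendering algorithm:

```python
def synthesize_row(texture, depth, occlusion):
    """Warp one texture row to a shifted viewpoint using per-pixel depth.

    Each pixel moves right by its depth value (assumed here to act as an
    integer disparity, larger = closer).  Target positions left
    uncovered are disocclusion holes, filled from the occlusion image.
    """
    out = [None] * len(texture)
    # Visit far pixels first so nearer pixels overwrite them on conflict.
    for x in sorted(range(len(texture)), key=lambda i: depth[i]):
        nx = x + depth[x]
        if 0 <= nx < len(out):
            out[nx] = texture[x]
    # Holes (e.g. the side face of the cube 42) come from the occlusion
    # image; 0 stands in for "no data" where none is available.
    return [occlusion.get(x, 0) if v is None else v
            for x, v in enumerate(out)]

# Pixel 3 of the new view is never covered by the warp, so its value
# is taken from the occlusion image.
row = synthesize_row([10, 20, 30, 40, 50], [0, 0, 0, 1, 1], {3: 99})
```

Real systems warp in 2-D and blend several candidate pixels, but the hole-filling role of the occlusion image is the same.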
(description of MPD File)
Fig. 3 is a diagram illustrating a hierarchical structure of an MPD file.
In the MPD file, information such as the encoding method and bit rate of the video content, the size of the image, and the language of the speech is described hierarchically in XML format.
Specifically, as shown in fig. 3, elements such as Period, AdaptationSet, Representation, and SegmentInfo are hierarchically included in the MPD file.
In the MPD file, video content managed by the MPD file itself is divided by a predetermined time range (for example, a unit such as a program, a CM (commercial message), or the like). A period element is described for each division of the divided video content. The period element has information of a reproduction start time of a program of the video content (data of a set of image data or sound data or the like synchronized with each other), a URL (uniform resource locator) of the Web server 12 to which a clip file of the video content is to be stored, and the like.
The AdaptationSet element is included in the Period element, and groups the Representation elements of the clip file groups corresponding to the same encoded stream of the video content of that Period element. Representation elements are grouped, for example, by the type of data of the corresponding encoded stream. The AdaptationSet element has attributes common to the group, such as media type, language, subtitles, and dubbing.
The Representation element is included in the AdaptationSet element that groups it, and is described for each clip file group of the same encoded stream of the video content corresponding to the Period element in the upper layer. The Representation element has attributes common to the clip file group, such as bit rate and image size.
The SegmentInfo element is included in the Representation element and has information on the respective segment files of the segment file group corresponding to the Representation element.
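The Period / AdaptationSet / Representation / segment-information nesting can be illustrated by building a minimal MPD skeleton with Python's standard ElementTree. All attribute values, Representation IDs, and the segment naming template below are hypothetical; a real MPD carries many more attributes:

```python
import xml.etree.ElementTree as ET

mpd = ET.Element("MPD", mediaPresentationDuration="PT120S")
period = ET.SubElement(mpd, "Period", start="PT0S")

# One AdaptationSet grouping the texture Representations.
aset = ET.SubElement(period, "AdaptationSet", mimeType="video/mp4")
for rep_id, bandwidth in (("texture1", "8000000"), ("texture2", "4000000")):
    rep = ET.SubElement(aset, "Representation", id=rep_id, bandwidth=bandwidth)
    # Segment information for the clip file group of this Representation.
    ET.SubElement(rep, "SegmentTemplate", media=rep_id + "-$Number$.mp4")

xml_text = ET.tostring(mpd, encoding="unicode")
```

The two Representations here correspond to the 8 Mbps and 4 Mbps texture streams of the example in this embodiment.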
(example of configuration of File Generation apparatus)
Fig. 4 is a block diagram depicting an example of a configuration of the file generating apparatus of fig. 1.
The file generating apparatus 11 of fig. 4 is configured by an acquisition unit 81, an encoding unit 82, a clip file generating unit 83, an MPD file generating unit 84, and an uploading unit 85.
The acquisition unit 81 of the file generation apparatus 11 acquires image data of a texture image, a depth image, and an occlusion image of video content, and supplies the image data to the encoding unit 82. Further, the acquisition unit 81 acquires metadata of quality information of an encoded stream including a depth image of 2Mbps and 1Mbps and an occlusion image of 1Mbps, and supplies the metadata to the encoding unit 82.
The encoding unit 82 encodes the image data of the texture image supplied from the acquisition unit 81 at 8Mbps and 4Mbps, and encodes the image data of the depth image at 2Mbps and 1 Mbps. Further, the encoding unit 82 encodes the image data of the occlusion image at 1 Mbps. Further, the encoding unit 82 encodes metadata of the depth image of 2Mbps and 1Mbps and the occlusion image of 1Mbps at predetermined bitrates, respectively. The encoding unit 82 supplies the encoded stream generated as a result of the encoding to the clip file generating unit 83.
The clip file generating unit 83 files the encoded streams of the texture image, the depth image, and the occlusion image supplied from the encoding unit 82 in units of clips for each bit rate to generate a clip file of image data.
Further, the clip file generating unit 83 (file generating unit) divides the encoded stream of the metadata supplied from the encoding unit 82 into two for each type of depth occlusion image. Then, the clip file generating unit 83 sets the division of the encoded stream of the metadata into different clip files in units of clips to generate a clip file of the metadata.
Specifically, the clip file generation unit 83 divides the encoded stream of metadata supplied from the encoding unit 82 into an encoded stream of metadata in units of clips of depth images of 2Mbps and 1Mbps and another encoded stream of metadata in units of clips of occlusion images of 1 Mbps. Then, the clip file generation unit 83 converts the encoded stream of metadata in units of clips of 2Mbps and 1Mbps depth images and the encoded stream of metadata in units of clips of 1Mbps occlusion images into files, respectively, to generate a clip file of metadata. The clip file generating unit 83 supplies the generated clip file to the uploading unit 85.
An MPD file generating unit 84 (file generating unit) generates an MPD file and supplies the MPD file to the uploading unit 85.
The upload unit 85 uploads the segment file supplied from the segment file generation unit 83 and the MPD file supplied from the MPD file generation unit 84 to the Web server 12.
(example of clip files)
Fig. 5 is a diagram depicting an example of the clip file generated by the clip file generating unit 83 of fig. 4.
As shown in fig. 5, the clip file generating unit 83 generates a clip file of a texture image of 8Mbps as a texture file (texture 1 file), and generates a clip file of a texture image of 4Mbps as another texture file (texture 2 file). Further, the clip file generating unit 83 generates a clip file of a depth image of 2Mbps as a depth file (depth 1 file), and generates a clip file of a depth image of 1Mbps as a depth file (depth 2 file). Further, the clip file generating unit 83 generates a clip file of an occlusion image of 1Mbps as an occlusion file (occlusion 1 file).
Further, the clip file generating unit 83 generates a clip file including metadata of quality information of depth images of 2Mbps and 1Mbps as a quality file (quality 1 file). In the quality file (quality 1 file), metadata including quality information of a depth image of 2Mbps and metadata including quality information of a depth image of 1Mbps are set in tracks (quality track (depth 1) and quality track (depth 2)) different from each other.
Further, the clip file generating unit 83 generates a clip file including metadata of quality information of an occlusion image of 1Mbps as a quality file (quality 2 file).
As described above, the clip file generating unit 83 separately files the encoded streams of metadata including quality information for different types of depth occlusion images. Therefore, in the case where the video reproduction terminal 14 generates a 3D image using a texture image and a depth image, quality information of a desired depth image can be easily acquired from a quality file (quality 1 file) of the depth image.
In contrast, in the case of collectively filing the encoded streams of the quality information of all the depth occlusion images, the quality information of the desired depth image is acquired from a file that also includes the quality information of the unnecessary occlusion images, and the acquisition efficiency is low.
Further, in the case where encoded streams of quality information of all depth occlusion images are separately filed for each encoded stream, when a plurality of required depth occlusion images are acquired, it is necessary to acquire quality information from a plurality of files, and acquisition efficiency is low.
It is to be noted that although the acquisition unit 81 does not acquire metadata including quality information of texture images of 8Mbps and 4Mbps, the metadata may be acquired. In this case, the clip file generating unit 83 also generates a clip file in which encoded streams of metadata including quality information of texture images of 8Mbps and 4Mbps are collectively stored in units of clips. Further, metadata of a texture image of 8Mbps and metadata of a texture image of 4Mbps are set in tracks different from each other.
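The per-type filing described above (one quality file per type of depth occlusion image, with one track per encoded stream) can be sketched as a small grouping step. This is an illustrative sketch only; the function and tuple layout are our own, not part of the specification:

```python
from collections import defaultdict

def group_quality_streams(streams):
    """Group quality-information streams into one quality file per image
    type (depth, occlusion, ...), mirroring the per-type filing above.

    Each stream is a (image_type, name) pair; the result maps each type
    to the list of stream names, i.e. one quality file whose tracks hold
    the streams of that type in order.
    """
    files = defaultdict(list)
    for image_type, name in streams:
        files[image_type].append(name)
    return dict(files)
```

For the streams of fig. 5 this yields one depth quality file with two tracks (quality 1 file) and one occlusion quality file with one track (quality 2 file).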
(example of a description of QualityMetricsSampleEntry)
Fig. 6 is a diagram depicting an example of a description of QualityMetricsSampleEntry set in a quality file.
As shown in fig. 6, a QualityMetricsConfigurationBox is set in QualityMetricsSampleEntry. In the QualityMetricsConfigurationBox, field_size_bytes and metric_count are described, and metric_codes equal in number to metric_count are described.
As shown in fig. 7, field_size_bytes indicates the data size of the quality (Quality) for each type of encoded stream of quality information included in a sample of the quality file. In the case where the actual size of the encoded stream of a certain type of quality information is smaller than field_size_bytes, padding is added to the encoded stream of quality information.
Further, metric_count indicates the number of types of quality corresponding to the encoded streams of quality information included in the samples of the quality file. metric_code is information indicating the type of each quality corresponding to an encoded stream of quality information included in a sample of the quality file, and the metric_codes are described in the order of the encoded streams of quality information set in the sample.
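The three fields above can be read mechanically from the box payload. The following is a minimal sketch assuming a simplified layout (one byte each for field_size_bytes and metric_count, followed by four-character metric_codes); it is not the normative ISO/IEC 23001-10 box syntax:

```python
import struct

def parse_quality_metrics_config(payload: bytes):
    """Parse a simplified QualityMetricsConfigurationBox payload.

    Assumed layout (a sketch, not the normative syntax):
    1 byte field_size_bytes, 1 byte metric_count, then
    metric_count four-character metric_codes.
    """
    field_size_bytes, metric_count = struct.unpack_from(">BB", payload, 0)
    codes = []
    offset = 2
    for _ in range(metric_count):
        codes.append(payload[offset:offset + 4].decode("ascii"))
        offset += 4
    return {"field_size_bytes": field_size_bytes,
            "metric_count": metric_count,
            "metric_codes": codes}
```

A reader would then use field_size_bytes to step through each sample, skipping the padding added to streams shorter than that size.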
(example of metric _ code)
Fig. 8 is a diagram depicting an example of metric _ code.
As shown in fig. 8, not only psnr, ssim, msim, j144, j247, mops, and fsig defined in ISO/IEC 23001-10 but also ocer and ocpr may be set as the metric_code.
For example, psnr indicates that the type of quality indicated by the quality information is the PSNR (peak signal-to-noise ratio) of the entire picture.
Further, ocer and ocpr are set when quality information of an occlusion image is included in the sample. ocer indicates that the type of quality indicated by the quality information is the ratio, with respect to the entire picture of the texture image, of the occlusion region corresponding to the occlusion image, that is, of the effective range of the occlusion region. ocpr indicates that the type of quality indicated by the quality information is the PSNR of only the occlusion image, that is, the PSNR of only the effective range of the occlusion region.
As described above, in the case where the quality information of the occlusion image is included in the sample, ocer or ocpr may be set to the metric _ code. Thus, quality information representing the ratio of occlusion regions and PSNR can be stored in the samples. Accordingly, the video reproduction terminal 14 can select and reproduce the best occlusion file based on the quality information.
Specifically, since the occlusion image is an image of only an occlusion region within the screen, its influence on an existing quality measure of the entire picture, such as PSNR, is likely to be small. Therefore, the quality of the occlusion image cannot be sufficiently expressed by conventional quality information. Thus, by enabling quality information that indicates, as the quality, the PSNR of the occlusion region or the proportion of the occlusion region, both of which have a large influence on the quality of the occlusion image, to be stored in the sample, the video reproduction terminal 14 is enabled to select a more appropriate occlusion file based on the quality information.
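The two occlusion-specific metrics can be sketched directly from their descriptions: ocer as the ratio of the effective occlusion region to the whole picture, ocpr as PSNR restricted to that region. This is an illustrative computation under our own assumptions (flat pixel lists, a boolean validity mask); the patent does not define the exact formulas:

```python
import math

def occlusion_quality(ref, rec, valid_mask, peak=255.0):
    """Sketch of the two occlusion-specific metrics described above.

    ocer: ratio of the effective occlusion region to the entire picture.
    ocpr: PSNR computed only over the effective occlusion region.
    ref/rec are same-sized lists of pixel values; valid_mask flags the
    effective range of the occlusion region.
    """
    total = len(ref)
    valid = [i for i in range(total) if valid_mask[i]]
    ocer = len(valid) / total
    mse = sum((ref[i] - rec[i]) ** 2 for i in valid) / len(valid)
    ocpr = float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)
    return ocer, ocpr
```

This makes the point in the text concrete: a whole-picture PSNR would average the error over all pixels, diluting the occlusion region's contribution, whereas ocpr measures only the pixels that matter for the occlusion image.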
(example of a presentation element)
Fig. 9 is a diagram depicting an example of a presentation element of an MPD file generated by the MPD file generating unit 84 of fig. 4.
As shown in fig. 5, the clip file generating unit 83 generates seven types of clip files including a texture file (texture 1 file), another texture file (texture 2 file), a depth file (depth 1 file), another depth file (depth 2 file), an occlusion file (occlusion 1 file), a quality file (quality 1 file (depth)) and another quality file (quality 2 file (occlusion)). Accordingly, as shown in fig. 9, seven presentation elements are included in the MPD file.
(example of description of MPD File)
Fig. 10 is a diagram depicting an example of a description of an MPD file generated by the MPD file generating unit 84 of fig. 4.
It should be noted that, in the present specification, it is assumed that the picture quality of the 3D image is not improved even if the depth file (depth 1 file) or the occlusion file (occlusion 1 file) is used when the texture file (texture 2 file) is reproduced. Therefore, reproduction of a 3D image using the texture file (texture 2 file) together with the depth file (depth 1 file) or the occlusion file (occlusion 1 file) is not performed. Accordingly, the reproduction modes 1 to 7 described below are the modes to be reproduced.
Reproduction mode 1: reproduction using only the texture file (texture 1 file)
Reproduction mode 2: reproduction using only the texture file (texture 2 file)
Reproduction mode 3: reproduction of 3D images using the texture file (texture 1 file) and the depth file (depth 1 file)
Reproduction mode 4: reproduction of 3D images using the texture file (texture 1 file) and the depth file (depth 2 file)
Reproduction mode 5: reproduction of 3D images using the texture file (texture 1 file), the depth file (depth 1 file), and the occlusion file (occlusion 1 file)
Reproduction mode 6: reproduction of 3D images using the texture file (texture 1 file), the depth file (depth 2 file), and the occlusion file (occlusion 1 file)
Reproduction mode 7: reproduction of 3D images using the texture file (texture 2 file) and the depth file (depth 2 file)
In the MPD file of fig. 10, a texture file (texture 1 file) group and another texture file (texture 2 file) group are grouped by one adaptation set (AdaptationSet).
In the adaptive set element for a texture file, a representation element corresponding to a group of texture files (texture 1 file) and another representation element corresponding to a group of texture files (texture 2 file) are described.
The representation element includes a representation ID (Representation id), a bandwidth (bandwidth), a base URL (BaseURL), an association ID (associationId), and so forth. The representation ID is an ID unique to the representation element and is information for specifying the encoded stream corresponding to the representation element. The bandwidth is information representing the bit rate of the texture file group, and the base URL is information representing the base of the file name. Further, the association ID is the representation ID of some other representation element related to decoding or display (reproduction). The association ID is specified by ISO/IEC 23009-1, Amendment 2.
Accordingly, in the representation element corresponding to the texture file (texture 1 file) group, vt1 is described as the representation ID, 8192000 representing 8 Mbps is described as the bandwidth, and "texture1.mp4" is described as the base URL.
In the representation element corresponding to the texture file (texture 2 file) group, vt2 is described as the representation ID, 4096000 representing 4 Mbps is described as the bandwidth, and "texture2.mp4" is described as the base URL.
It should be noted that, since the texture file (texture 1 file) group and the texture file (texture 2 file) group are not related to other representation elements, the association ID is not described in the representation elements corresponding to these groups.
Further, the depth file (depth 1 file) group and the depth file (depth 2 file) group are grouped by one adaptive set element. In the adaptive set element for a depth file, a presentation element corresponding to a group of depth files (depth 1 files) and another presentation element corresponding to a group of depth files (depth 2 files) are described.
In the representation element corresponding to the depth file (depth 1 file) group, vd1 is described as the representation ID, 2048000 representing 2 Mbps is described as the bandwidth, and "depth1.mp4" is described as the base URL.
Further, in the reproduction modes 1 to 7, the texture file group related to the depth file (depth 1 file) is the texture file (texture 1 file) group. Therefore, in the representation element corresponding to the depth file (depth 1 file) group, vt1, which is the representation ID of the texture file (texture 1 file) group, is described as the association ID.
In the representation element corresponding to the depth file (depth 2 file) group, vd2 is described as the representation ID, 1024000 representing 1 Mbps is described as the bandwidth, and "depth2.mp4" is described as the base URL.
Further, in the reproduction modes 1 to 7, the texture file group of the texture image related to the depth file (depth 2 file) at the time of display is either the texture file (texture 1 file) group or the texture file (texture 2 file) group. Therefore, in the representation element corresponding to the depth file (depth 2 file) group, vt1, which is the representation ID of the texture file (texture 1 file) group, and vt2, which is the representation ID of the texture file (texture 2 file) group, are described as the association IDs.
In addition, the occlusion file (occlusion 1 file) group is grouped by one adaptation set element. In the adaptation set element for the occlusion file, a representation element corresponding to the occlusion file (occlusion 1 file) group is described.
In the representation element corresponding to the occlusion file (occlusion 1 file) group, vo1 is described as the representation ID, 1024000 representing 1 Mbps is described as the bandwidth, and "occlusion.mp4" is described as the base URL.
Further, in the reproduction modes 1 to 7, the depth file group of the depth image related to the occlusion file (occlusion 1 file) at the time of display is the depth file (depth 1 file) group or the depth file (depth 2 file) group. Therefore, in the representation element corresponding to the occlusion file (occlusion 1 file) group, vd1, which is the representation ID of the depth file (depth 1 file) group, and vd2, which is the representation ID of the depth file (depth 2 file) group, are described as the association IDs.
Further, the set of quality files (quality 1 files) and the set of quality files (quality 2 files) are grouped by one adaptive set element.
In the adaptation set element for the quality file, a combination of images to be used for reproduction among the texture images, the depth images, and the occlusion image, that is, a combination of images that become candidates for use at the time of reproduction, can be described using a SupplementalProperty whose schemeIdUri is "urn:mpeg:dash:quality:playback:combination:2015". The representation IDs of the images constituting the combination are described as the value of this SupplementalProperty.
Specifically, as information indicating the texture image to be used for reproduction in the reproduction mode 1 described above, <SupplementalProperty schemeIdUri="urn:mpeg:dash:quality:playback:combination:2015" value="vt1"> is described, where vt1 is the representation ID of the texture file (texture 1 file).
Similarly, as information indicating the texture image and the depth image to be used for reproduction in the reproduction mode 3, <SupplementalProperty schemeIdUri="urn:mpeg:dash:quality:playback:combination:2015" value="vt1 vd1"> is described, where the values are vt1, the representation ID of the texture file (texture 1 file), and vd1, the representation ID of the depth file (depth 1 file).
As information indicating the texture image and the depth image to be used for reproduction in the reproduction mode 4, <SupplementalProperty schemeIdUri="urn:mpeg:dash:quality:playback:combination:2015" value="vt1 vd2"> is described, where the values are vt1, the representation ID of the texture file (texture 1 file), and vd2, the representation ID of the depth file (depth 2 file).
As information indicating the texture image, the depth image, and the occlusion image to be used for reproduction in the reproduction mode 5, <SupplementalProperty schemeIdUri="urn:mpeg:dash:quality:playback:combination:2015" value="vt1 vd1 vo1"> is described, where the values are vt1, the representation ID of the texture file (texture 1 file), vd1, the representation ID of the depth file (depth 1 file), and vo1, the representation ID of the occlusion file (occlusion 1 file).
As information indicating the texture image, the depth image, and the occlusion image to be used for reproduction in the reproduction mode 6, <SupplementalProperty schemeIdUri="urn:mpeg:dash:quality:playback:combination:2015" value="vt1 vd2 vo1"> is described, where the values are vt1, the representation ID of the texture file (texture 1 file), vd2, the representation ID of the depth file (depth 2 file), and vo1, the representation ID of the occlusion file (occlusion 1 file).
As information indicating the texture image to be used for reproduction in the reproduction mode 2, <SupplementalProperty schemeIdUri="urn:mpeg:dash:quality:playback:combination:2015" value="vt2"> is described, where the value is vt2, the representation ID of the texture file (texture 2 file).
As information indicating the texture image and the depth image to be used for reproduction in the reproduction mode 7, <SupplementalProperty schemeIdUri="urn:mpeg:dash:quality:playback:combination:2015" value="vt2 vd2"> is described, where the values are vt2, the representation ID of the texture file (texture 2 file), and vd2, the representation ID of the depth file (depth 2 file).
As described above, the following representation ids of images are described in the MPD file: these images constitute a combination of images used in the mode to be reproduced. Accordingly, the video reproduction terminal 14 may perform reproduction with reference to the MPD file so that reproduction is not performed using any mode other than the mode to be reproduced.
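The combination descriptors above follow a uniform pattern, so an MPD writer can generate them mechanically. The following sketch assumes the element layout shown in the examples (whitespace-separated representation IDs in the value attribute); the helper name is ours:

```python
COMBINATION_SCHEME = "urn:mpeg:dash:quality:playback:combination:2015"

def combination_property(representation_ids):
    """Build one SupplementalProperty element describing a playable
    combination of representation IDs (texture, depth, occlusion)."""
    value = " ".join(representation_ids)
    return ('<SupplementalProperty schemeIdUri="%s" value="%s"/>'
            % (COMBINATION_SCHEME, value))

# The seven reproduction modes described in the text:
MODES = [["vt1"], ["vt2"], ["vt1", "vd1"], ["vt1", "vd2"],
         ["vt1", "vd1", "vo1"], ["vt1", "vd2", "vo1"], ["vt2", "vd2"]]
properties = [combination_property(m) for m in MODES]
```

A reproduction terminal performs the inverse operation: it splits each value attribute and treats each resulting ID list as one permitted combination, rejecting any combination not listed.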
Further, in the adaptive set element for a quality file, presentation elements corresponding to a quality file (quality 1 file) group and a quality file (quality 2 file) group, respectively, are described.
In the representation element corresponding to the quality file (quality 1 file (depth)) group, vq1 is described as the representation ID, and "quality1.mp4" is described as the base URL.
Further, the quality information stored in the quality file (quality 1 file) is the quality information of the depth images stored in the depth file (depth 1 file) and the depth file (depth 2 file). Therefore, in the representation element corresponding to the quality file (quality 1 file) group, vd1, which is the representation ID of the depth file (depth 1 file), and vd2, which is the representation ID of the depth file (depth 2 file), are described as the association IDs.
Accordingly, the video reproduction terminal 14 can recognize that the quality information stored in the quality file (quality 1 file) group is the quality information of the depth images corresponding to the depth file (depth 1 file) group and the depth file (depth 2 file) group.
However, the video reproduction terminal 14 cannot recognize whether the quality information stored in one of the two tracks of the quality file (quality 1 file) is the quality information of the depth image stored in the depth file (depth 1 file) group or the depth file (depth 2 file) group.
Therefore, in the MPD file of fig. 10, the sub-representation (SubRepresentation) element, which is obtained by dividing a representation element for each level that can be associated with a track, is extended so that the sub-representation element can have an association ID similar to that of the representation element. Consequently, the correspondence between the respective tracks of the quality file (quality 1 file) and the representation IDs (depth-related image specifying information) for specifying the depth images can be described.
Specifically, in the example of fig. 10, the track that stores the quality information of the depth image stored in the depth file (depth 1 file) is associated with level 1 by a leva box (LevelAssignmentBox) provided in the quality file (quality 1 file). Further, the track that stores the quality information of the depth image stored in the depth file (depth 2 file) is associated with level 2.
Therefore, <SubRepresentation level="1" associationId="vd1"> is described, which associates level 1 with vd1 as the association ID, where vd1 is the representation ID of the depth file (depth 1 file). Further, <SubRepresentation level="2" associationId="vd2"> is described, which associates level 2 with vd2 as the association ID, where vd2 is the representation ID of the depth file (depth 2 file).
Further, in the representation element corresponding to the quality file (quality 2 file) group, vq2 is described as the representation ID, and "quality2.mp4" is described as the base URL.
The quality information included in the metadata stored in the quality file (quality 2 file) is the quality information of the occlusion image stored in the occlusion file (occlusion 1 file). Therefore, in the representation element corresponding to the quality file (quality 2 file) group, vo1, which is the representation ID of the occlusion file (occlusion 1 file) group, is described as the association ID.
It is to be noted that although in the example of fig. 10, the bandwidth is not described in the adaptation set element for the quality file, the bandwidth may be described.
(description of the leva frame)
Fig. 11 is a diagram showing the leva box provided in the quality file (quality 1 file) in the case where the MPD file is the MPD file of fig. 10.
As shown in fig. 11, a leva box (LevelAssignmentBox) is set in a quality file (quality 1 file) having a plurality of tracks. In the leva box, information for specifying the track corresponding to each level is described in order from level 1, thereby describing the correspondence between levels and tracks.
Accordingly, the video reproduction terminal 14 can specify, from the leva box, the track corresponding to the association ID that a sub-representation element of the MPD file has. Specifically, the video reproduction terminal 14 can specify the track corresponding to the association ID vd1 as the first track from the top (quality track (depth 1)), which corresponds to level 1. Further, the video reproduction terminal 14 can specify the track corresponding to the association ID vd2 as the second track from the top (quality track (depth 2)), which corresponds to level 2.
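The resolution step the terminal performs, combining the sub-representation elements (level to association ID) with the leva box (level to track), can be sketched as follows. Names and data shapes here are illustrative, not the actual file structures:

```python
def track_for_association_id(assoc_id, sub_representations, leva_tracks):
    """Resolve which quality-file track holds the quality information
    for a given depth file, as described above.

    sub_representations: dict mapping level -> associationId (from MPD)
    leva_tracks: track identifiers in leva-box order (index 0 = level 1)
    """
    for level, a_id in sub_representations.items():
        if a_id == assoc_id:
            return leva_tracks[level - 1]
    raise KeyError("no sub-representation carries %s" % assoc_id)

# Example mirroring fig. 10/11: level 1 -> vd1, level 2 -> vd2, and the
# leva box lists quality track (depth 1) first, quality track (depth 2) second.
subs = {1: "vd1", 2: "vd2"}
tracks = ["quality track (depth 1)", "quality track (depth 2)"]
```

Without the sub-representation extension there would be no level-to-representation-ID mapping, and the terminal could not tell which of the two tracks belongs to which depth file.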
(Description of processing of file generation apparatus)
Fig. 12 is a flowchart showing a file generation process of the file generation apparatus 11 of fig. 1.
At step S11 of fig. 12, the acquisition unit 81 of the file generation apparatus 11 acquires the texture image, the image data of the depth image and the occlusion image of the video content, and the metadata of the quality information of the encoded stream including the depth image of 2Mbps and 1Mbps and the occlusion image of 1 Mbps. Then, the acquisition unit 81 supplies the acquired image data and metadata to the encoding unit 82.
At step S12, the encoding unit 82 encodes the image data of the texture image supplied from the acquisition unit 81 at 8Mbps and 4Mbps, and encodes the image data of the depth image at 2Mbps and 1 Mbps. Further, the encoding unit 82 encodes the image data of the occlusion image at 1 Mbps. Further, the encoding unit 82 encodes metadata of the depth image of 2Mbps and 1Mbps and the occlusion image of 1Mbps at predetermined bitrates, respectively. The encoding unit 82 supplies the encoded stream generated as a result of the encoding to the clip file generating unit 83.
At step S13, the clip file generation unit 83 files, in units of clips for each bit rate, the encoded streams of the texture images, the depth images, and the occlusion image supplied from the encoding unit 82. The clip file generation unit 83 supplies the texture files, the depth files, and the occlusion file generated as a result of the filing to the upload unit 85.
At step S14, the clip file generation unit 83 divides the encoded streams of metadata supplied from the encoding unit 82 into two, one for each type of depth occlusion image.
At step S15, the clip file generation unit 83 stores the divided encoded streams of metadata, in units of clips, into quality files different from each other to generate the quality files, and supplies the quality files to the upload unit 85.
At step S16, the MPD file generating unit 84 generates an MPD file and supplies the MPD file to the uploading unit 85. At step S17, the upload unit 85 uploads the texture file, the depth file, the occlusion file, the quality file, and the MPD file to the Web server 12.
As described above, the file generation apparatus 11 divides the quality information of the depth occlusion images by type of depth occlusion image and stores the divided information into quality files different from each other. Therefore, the number of quality files can be reduced compared with an alternative case in which the quality information of each depth occlusion image is stored in a separate quality file. Hence, it can be said that the quality information of the depth occlusion images is stored efficiently. Further, the amount of processing associated with the acquisition of quality information by the video reproduction terminal 14 can be reduced.
Further, in the case where the texture image and the depth image are used for reproduction, the video reproduction terminal 14 may acquire quality information from a quality file (quality 1 file) in which only quality information of the depth image is stored. Therefore, the acquisition efficiency of the quality information can be improved compared to an alternative case of acquiring the quality information from a quality file storing the quality information of all the depth occlusion images.
The file generation apparatus 11 also generates an MPD file in which the association ID is described in a sub-representation element. Accordingly, the MPD file can manage the correspondence between the depth occlusion images and the tracks of a quality file in which the quality information of a plurality of depth occlusion images is divisionally set in tracks different from each other. As a result, the video reproduction terminal 14 can extract the quality information of a desired depth occlusion image from such a quality file.
Further, since the file generation means 11 generates an MPD file in which a combination of images to be used for reproduction of a mode to be used for reproduction is described, only reproduction of the mode to be used for reproduction can be performed by the video reproduction terminal 14. As a result, for example, a producer of video content can provide a user with an image of quality desired by the producer. Further, since only the video reproduction terminal 14 needs to select a reproduction mode from among modes to be used for reproduction, the processing load is reduced as compared with an alternative case where a reproduction mode is selected from among all reproduction modes that can be used for reproduction.
(example of functional configuration of video reproduction terminal)
Fig. 13 is a block diagram depicting an example of a configuration of a streaming reproduction unit implemented by the video reproduction terminal 14 of fig. 1 executing the control software 21, the video reproduction software 22, and the access software 23.
The streaming reproduction unit 100 includes an MPD acquisition unit 101, an MPD processing unit 102, a quality information acquisition unit 103, a decoding unit 104, an image acquisition unit 105, a decoding unit 106, an output control unit 107, and a measurement unit 108.
The MPD acquisition unit 101 of the streaming reproduction unit 100 requests the Web server 12 to acquire an MPD file. The MPD acquisition unit 101 supplies the acquired MPD file to the MPD processing unit 102.
The MPD processing unit 102 analyzes the MPD file supplied from the MPD acquisition unit 101. Specifically, the MPD processing unit 102 acquires a bandwidth that each representation element of the MPD file has as a bit rate of an image corresponding to the representation element.
Further, the MPD processing unit 102 acquires the combinations of images to be used for reproduction in the modes to be reproduced, from the value of "urn:mpeg:dash:quality:playback:combination:2015" in the MPD file and the representation ID that each representation element has. Further, the MPD processing unit 102 acquires acquisition information, such as the file name of the segment file group corresponding to a representation element and the level corresponding to the quality information of each depth occlusion image, from the base URL that each representation element of the MPD file has, the association ID of a sub-representation element, and so forth.
The MPD processing unit 102 selects a candidate of a reproduction mode from modes to be used for reproduction. The MPD processing unit 102 creates a list of acquisition candidates of the depth occlusion image based on the network bandwidth of the internet 13 and the bit rate of the image supplied from the measurement unit 108. The MPD processing unit 102 supplies acquisition information of quality information of the depth occlusion images registered in the list to the quality information acquisition unit 103.
For example, in the case where a depth image is registered in the list of acquisition candidates of depth occlusion images, the MPD processing unit 102 supplies, to the quality information acquisition unit 103, acquisition information of the quality information, in the quality file (quality 1 file), of the depth image registered in the list. In the case where an occlusion image is registered in the list of acquisition candidates of depth occlusion images, the MPD processing unit 102 supplies, to the quality information acquisition unit 103, acquisition information of the quality information, in the quality file (quality 2 file), of the occlusion image registered in the list.
Further, the MPD processing unit 102 selects a reproduction mode from candidates of reproduction modes based on the quality information supplied from the decoding unit 104. The MPD processing unit 102 supplies acquisition information of a texture image of a texture file to be used for reproduction of the selected reproduction mode to the image acquisition unit 105. Further, in the case where the file of the depth occlusion image is used for reproduction of the selected reproduction mode, the MPD processing unit 102 supplies acquisition information of the depth occlusion image to the image acquisition unit 105.
The quality information acquisition unit 103 requests and acquires an encoded stream of metadata including quality information from the Web server 12 based on the acquisition information supplied from the MPD processing unit 102. The quality information acquisition unit 103 supplies the acquired encoded stream to the decoding unit 104.
The decoding unit 104 decodes the encoded stream supplied from the quality information acquisition unit 103, and generates metadata including quality information. The decoding unit 104 supplies the quality information to the MPD processing unit 102.
The image acquisition unit 105, the decoding unit 106, and the output control unit 107 function as a reproduction unit, and reproduce only a texture image or a texture image and a depth occlusion image based on the acquisition information supplied from the MPD processing unit 102.
Specifically, the image acquisition unit 105 requests and acquires, from the Web server 12, the encoded stream of a texture file or of a file of a depth occlusion image based on the acquisition information supplied from the MPD processing unit 102. The image acquisition unit 105 supplies the acquired encoded stream to the decoding unit 106.
The decoding unit 106 decodes the encoded stream supplied from the image acquisition unit 105 to generate image data. The decoding unit 106 supplies the generated image data to the output control unit 107.
In the case where the image data supplied from the decoding unit 106 is image data of a texture image only, the output control unit 107 causes a display unit (not shown), such as a display of the video reproduction terminal 14, to display the texture image based on the image data of the texture image.
On the other hand, in the case where the image data supplied from the decoding unit 106 is image data of a texture image and a depth occlusion image, the output control unit 107 generates image data of a 3D image based on the image data of the texture image and the depth occlusion image. Then, the output control unit 107 causes the display unit (not shown), such as a display, to display the 3D image based on the generated image data of the 3D image.
The measurement unit 108 measures the network bandwidth of the internet 13 and provides the measured network bandwidth to the MPD processing unit 102.
(description of first example of processing of streaming reproduction Unit)
Fig. 14 is a flowchart showing a first example of the reproduction process of the streaming reproduction unit 100 of fig. 13. It should be noted that in the reproduction process of fig. 14, the streaming reproduction unit 100 performs reproduction of a 3D image using a texture image and a depth image.
At step S31 of fig. 14, the MPD acquisition unit 101 requests and acquires an MPD file from the Web server 12. The MPD acquisition unit 101 supplies the acquired MPD file to the MPD processing unit 102.
At step S32, the MPD processing unit 102 analyzes the MPD file supplied from the MPD acquisition unit 101. Accordingly, MPD processing unit 102 acquires the following items: bit rates of the respective texture images and depth images, a combination of images to be used for reproduction in a mode to be reproduced, and acquisition information of the texture images, the depth images, and the quality information.
At step S33, the MPD processing unit 102 selects, from the modes to be reproduced, reproduction modes 3, 4, and 7 in which reproduction is performed using only texture images and depth images as candidates for the reproduction mode, based on a combination of images to be used for reproduction in the modes to be reproduced. The processing at subsequent steps S34 to S43 is performed in units of clips.
At step S34, the measurement unit 108 measures the network bandwidth of the internet 13 and supplies it to the MPD processing unit 102.
At step S35, the MPD processing unit 102 determines a texture image to be acquired from texture images to be used for reproduction together with candidates for a reproduction mode, based on the network bandwidth and the bit rate of the texture image.
For example, the MPD processing unit 102 assumes that 80% of the network bandwidth is the maximum acceptable bit rate of texture images, and determines, as a texture image to be acquired, a texture image having a bit rate lower than the maximum acceptable bit rate from among texture images used for reproduction together with candidates of a reproduction mode.
At step S36, the MPD processing unit 102 creates a list of acquisition candidates for a depth image based on candidates of a reproduction mode for reproducing a texture image to be acquired and the bit rate of the depth image.
For example, MPD processing unit 102 determines that 20% of the network bandwidth is the maximum acceptable bit rate for the depth image. Then, when the texture image to be acquired is a texture image of a texture file (texture 1 file), and the bit rates of the depth images of the depth file (depth 1 file) and the depth file (depth 2 file) are lower than the maximum acceptable bit rate, the MPD processing unit 102 creates the following list based on the reproduction modes 3 and 4: in this list, depth images of a depth file (depth 1 file) and a depth file (depth 2 file) are registered.
On the other hand, in a case where the texture image to be acquired is a texture image of a texture file (texture 2 file) and the bit rate of the depth image of the depth file (depth 2 file) is lower than the maximum acceptable bit rate, the MPD processing unit 102 creates a list of depth images in which the depth file (depth 2 file) is registered based on the reproduction mode 7.
Then, the MPD processing unit 102 supplies acquisition information of the quality information of the depth images registered in the list to the quality information acquisition unit 103. Note that, in the case where the bit rates of all the depth images to be reproduced together with the texture image to be acquired are equal to or higher than the maximum acceptable bit rate, nothing is registered in the list of acquisition candidates of the depth images; only the encoded stream of the texture image is acquired, decoded, and displayed, and the process proceeds to step S43.
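The list creation at step S36 can be sketched as follows. The mode table below is a toy stand-in for reproduction modes 3, 4, and 7 described above; all names and the tabular representation of the modes are assumptions, and only the 20% depth budget is taken from the description.

```python
# Hedged sketch of step S36: 20% of the bandwidth is the maximum
# acceptable depth bit rate; only depth images that a candidate
# reproduction mode pairs with the chosen texture, and whose bit rate
# is below the budget, are registered in the candidate list.

def depth_candidates(texture, mode_table, depth_bitrates, bandwidth, ratio=0.2):
    budget = bandwidth * ratio
    # depth images usable with this texture in some candidate mode
    usable = {d for t, d in mode_table if t == texture}
    return [d for d in usable if depth_bitrates[d] < budget]

modes = [("texture1", "depth1"), ("texture1", "depth2"), ("texture2", "depth2")]
rates = {"depth1": 2_000_000, "depth2": 1_000_000}
# 12 Mbps link: the 2.4 Mbps budget admits both depth images.
print(sorted(depth_candidates("texture1", modes, rates, 12_000_000)))
```

An empty result corresponds to the fallback described above: only the texture stream is acquired and the process skips ahead.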
At step S37, the quality information acquisition unit 103 requests the Web server 12 for an encoded stream of metadata including quality information of the depth image based on the acquisition information supplied from the MPD processing unit 102 to acquire the encoded stream. The quality information acquisition unit 103 supplies the acquired encoded stream to the decoding unit 104.
At step S38, the decoding unit 104 decodes the encoded stream of the quality information of the depth image supplied from the quality information acquisition unit 103 to generate metadata including the quality information of the depth image. The decoding unit 104 supplies the quality information of the depth image to the MPD processing unit 102.
At step S39, the MPD processing unit 102 determines a depth image to be acquired from depth images registered in the list of depth images based on the quality information supplied from the decoding unit 104.
For example, the MPD processing unit 102 determines the following depth images as the depth image to be acquired: the best quality depth image represented by the quality information; a depth image whose quality indicated by the quality information is closest to that of the depth image of the immediately preceding segment (or sub-segment); or depth images for which the quality indicated by the quality information is an acceptable quality and for which the bit rate is lowest in addition.
In the case where a depth image whose quality indicated by the quality information is closest to that of the depth image of the immediately preceding segment (or sub-segment) is determined as the depth image to be acquired, a sense of incongruity of the appearance of the 3D image to be reproduced can be reduced. The MPD processing unit 102 supplies acquisition information of a depth image to be acquired to the image acquisition unit 105.
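The three selection policies of step S39 can be sketched as follows. The data shape (name, quality score, bit rate), with higher score meaning better quality, is an assumption for illustration; the patent does not specify how quality is numerically encoded.

```python
# Sketch of the three step-S39 policies over assumed candidate tuples
# (name, quality_score, bitrate_bps), higher score = better quality.

def pick_best_quality(cands):
    # policy 1: the best-quality depth image
    return max(cands, key=lambda c: c[1])[0]

def pick_closest_to_previous(cands, prev_quality):
    # policy 2: quality closest to the immediately preceding (sub-)segment,
    # reducing a sense of incongruity across segment boundaries
    return min(cands, key=lambda c: abs(c[1] - prev_quality))[0]

def pick_cheapest_acceptable(cands, min_quality):
    # policy 3: lowest bit rate among images of acceptable quality
    ok = [c for c in cands if c[1] >= min_quality]
    return min(ok, key=lambda c: c[2])[0] if ok else None

cands = [("depth1", 0.9, 2_000_000), ("depth2", 0.7, 1_000_000)]
print(pick_best_quality(cands))               # depth1
print(pick_closest_to_previous(cands, 0.65))  # depth2
print(pick_cheapest_acceptable(cands, 0.6))   # depth2
```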
At step S40, the image acquisition unit 105 requests the Web server 12 for the encoded streams of the texture image and the depth image based on the acquisition information supplied from the MPD processing unit 102 to acquire the encoded streams. The image acquisition unit 105 supplies the acquired encoded stream to the decoding unit 104.
At step S41, the decoding unit 106 decodes the encoded stream supplied from the image acquisition unit 105 to generate image data of a texture image and a depth image. The decoding unit 106 supplies the generated image data of the texture image and the depth image to the output control unit 107.
At step S42, the output control unit 107 generates image data of a 3D image based on the image data of the texture image and the depth image supplied from the decoding unit 106, and controls a display unit, not shown, to display the 3D image.
At step S43, the streaming reproduction unit 100 determines whether the image of the last segment of the video content has been displayed. In a case where it is determined at step S43 that the image of the last segment of the video content has not been displayed, the process returns to step S34.
On the other hand, in a case where it is determined at step S43 that the image of the last segment of the video content has been displayed, the processing ends.
It should be noted that although not depicted, the first example of the reproduction processing of performing reproduction of a 3D image by the streaming reproduction unit 100 using a texture image, a depth image, and an occlusion image is similar to the reproduction processing of fig. 14 except for the following points.
Specifically, the candidates of the reproduction mode selected at step S33 of fig. 14 are reproduction modes 5 and 6. Further, between step S39 and step S40, the processing regarding the quality information of the occlusion image is performed similarly to the processing at steps S36 to S39 for the quality information of the depth image.
However, in this case, the maximum acceptable bit rates for the texture image, the depth image, and the occlusion image are 70%, 15%, and 15% of the network bandwidth, respectively. Further, the images handled as depth images in the processing at steps S40 to S42 include both depth images and occlusion images.
(description of second example of processing of streaming reproduction Unit)
Fig. 15 is a flowchart showing a second example of the reproduction process of the streaming reproduction unit 100 of fig. 13. Note that in the reproduction process of fig. 15, the streaming reproduction unit 100 performs reproduction of a 3D image using a texture image and a depth image.
The reproduction process of fig. 15 is different from the reproduction process of fig. 14 in that the ratio of the maximum acceptable bit rate to the network bandwidth of the texture image and the depth image is not determined.
Steps S61 to S64 of fig. 15 are similar to the processes at steps S31 to S34 of fig. 14, respectively, and thus the description thereof is omitted. The processing at steps S64 to S73 is performed in units of clips.
At step S65, the MPD processing unit 102 creates a list of acquisition candidates for a combination of texture images and depth images based on the candidates for the reproduction mode, the network bandwidth of the internet 13 supplied from the measurement unit 108, and the bit rates of the texture images and depth images.
Specifically, a list is created in which, from among combinations of texture images and depth images used for reproduction in the reproduction modes 3, 4, and 7, those combinations in which the sum of the bit rates of the texture images and depth images does not exceed the network bandwidth are registered.
It should be noted that the lower limit of the bit rate of the texture image and the depth image may be determined in advance so that any combination in which the bit rate of at least one is lower than the lower limit is excluded from the combinations registered in the list.
Further, in the case where the sums of the bit rates of the texture images and depth images used for reproduction in the reproduction modes 3, 4, and 7 all exceed the network bandwidth, nothing is registered in the list of acquisition candidates of combinations of texture images and depth images. Then, only the encoded stream of the texture image having the maximum bit rate that does not exceed the network bandwidth is acquired, decoded, and displayed, and the process proceeds to step S73.
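The combination-list creation of step S65 can be sketched as follows. The tuple layout and the optional lower-limit filter are assumptions for illustration; the registration condition (texture bit rate plus depth bit rate not exceeding the bandwidth) is taken from the description above.

```python
# Sketch of step S65: register texture+depth combinations whose summed
# bit rate fits within the network bandwidth; combinations in which
# either bit rate falls below a predetermined lower limit may be excluded.

def combination_list(combos, bandwidth, lower_limit=0):
    return [
        (t, tb, d, db)
        for (t, tb, d, db) in combos
        if tb + db <= bandwidth and tb >= lower_limit and db >= lower_limit
    ]

combos = [
    ("texture1", 8_000_000, "depth1", 2_000_000),  # 10 Mbps total
    ("texture1", 8_000_000, "depth2", 1_000_000),  # 9 Mbps total
    ("texture2", 4_000_000, "depth2", 1_000_000),  # 5 Mbps total
]
# With a 9 Mbps link, the first combination is excluded.
print(combination_list(combos, 9_000_000))
```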
At step S66, the MPD processing unit 102 creates a list of depth images registered in the list created at step S65. Then, the MPD processing unit 102 supplies acquisition information of quality information of depth images registered in the list to the quality information acquisition unit 103.
The processing at steps S67 and S68 is similar to the processing at steps S37 and S38 of fig. 14, respectively, and thus the description thereof is omitted.
At step S69, the MPD processing unit 102 determines a combination of a texture image and a depth image to be acquired from among the combinations registered in the list of combinations of texture images and depth images, based on the quality information.
For example, the MPD processing unit 102 determines a depth image to be acquired in a manner similar to step S39 of fig. 14. Then, from among the texture images whose combinations with the depth image to be acquired are registered in the combination list, the MPD processing unit 102 determines the texture image of the highest bit rate as the texture image to be acquired.
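The second half of step S69 can be sketched as follows, reusing the assumed combination-tuple shape; the function name is hypothetical.

```python
# Sketch of step S69 after the depth image is chosen: among registered
# combinations containing that depth image, take the texture image of
# the highest bit rate.

def texture_for_depth(combos, chosen_depth):
    matches = [(t, tb) for (t, tb, d, _db) in combos if d == chosen_depth]
    return max(matches, key=lambda m: m[1])[0] if matches else None

combos = [
    ("texture1", 8_000_000, "depth2", 1_000_000),
    ("texture2", 4_000_000, "depth2", 1_000_000),
]
print(texture_for_depth(combos, "depth2"))  # texture1
```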
The processing at steps S70 to S73 is similar to the processing at steps S40 to S43 of fig. 14, respectively, and thus the description thereof is omitted.
It should be noted that, although not shown, a second example of a reproduction process in which the streaming reproduction unit 100 performs reproduction of a 3D image using a texture image, a depth image, and an occlusion image is similar to the reproduction process of fig. 15 except for the following points.
Specifically, the candidates of the reproduction mode selected at step S63 of fig. 15 are reproduction modes 5 and 6. Further, the depth images in the processing at steps S65 to S72 include both the depth image and the occlusion image.
Since the video reproduction terminal 14 acquires the quality information of the depth image and the occlusion image as described above, the video reproduction terminal 14 can acquire an appropriate depth occlusion image based on the quality information.
< second embodiment >
(example of clip files)
The configuration of the second embodiment of the information processing system to which the present disclosure is applied is substantially the same as that of the information processing system 10 of fig. 1 except for the following aspects: the encoded stream of metadata including quality information is divided not for each type of depth-related image but for each texture image, and set into different quality files. Therefore, in the following description, descriptions other than the description of the quality file are appropriately omitted.
Fig. 16 depicts an example of a clip file generated by the clip file generating unit 83 of the second embodiment of the information processing system to which the present disclosure is applied.
The clip file of fig. 16 is the same as the clip file of fig. 5, except for the quality file.
As shown in fig. 16, the clip file generating unit 83 divides each of the encoded streams of the quality information of the depth images of 2 Mbps and 1 Mbps and the encoded stream of the quality information of the occlusion image of 1 Mbps in two, one division for each texture image to be reproduced using the encoded stream. Then, the clip file generating unit 83 sets the divisions of the encoded streams of metadata into different quality files in units of clips to generate quality files.
Specifically, the depth occlusion images used for reproduction in the mode to be reproduced together with the texture file (texture 1 file) are depth images of 2Mbps and 1Mbps and occlusion images of 1 Mbps. Accordingly, the clip file generating unit 83 generates the following quality file (quality 1 file): wherein an encoded stream of metadata including quality information of depth images of 2Mbps and 1Mbps and an encoded stream of metadata including quality information of occlusion images of 1Mbps are set in units of clips.
In the quality file (quality 1 file), the respective encoded streams are set in different tracks (quality track (depth 1), quality track (depth 2), and quality track (occlusion 1)).
Further, the depth occlusion image used for reproduction together with the texture file (texture 2 file) in reproduction of the mode to be reproduced is the depth image of 1 Mbps. Therefore, the clip file generating unit 83 generates the following quality file (quality 2 file) in units of clips: a quality file in which an encoded stream of metadata including quality information of the depth image of 1 Mbps is set.
In this way, the clip file generating unit 83 separately files the encoded streams of metadata including quality information into a file for each texture image. Therefore, by acquiring quality information from the quality file of the texture image to be acquired, the video reproduction terminal 14 can easily acquire quality information of the depth occlusion image to be used together with the texture image for reproduction in the mode to be reproduced.
Specifically, in the quality file, quality information of a depth occlusion image to be used for reproduction in a mode to be reproduced together with a texture image corresponding to the quality file is stored. Therefore, the quality information of the depth occlusion image to be used for reproduction together with the texture image to be acquired can be easily acquired, as compared with the alternative case where the encoded streams of the quality information of all the depth occlusion images are collectively filed.
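The second embodiment's grouping rule can be sketched as follows. The table of texture-to-depth-occlusion pairings and all names are assumptions mirroring the example above (texture 1 pairing with two depth images and one occlusion image, texture 2 with one depth image), not the patent's implementation.

```python
# Sketch of the second embodiment's grouping: the quality streams of all
# depth/occlusion images used with a given texture image are gathered
# into one per-texture quality file, each stream on its own track.

def group_quality_streams(mode_table):
    files = {}
    for texture, depth_related in mode_table:
        files.setdefault(texture, []).append(depth_related)
    return files

modes = [
    ("texture1", "depth1"), ("texture1", "depth2"),
    ("texture1", "occlusion1"), ("texture2", "depth2"),
]
# texture1 -> quality file with three tracks; texture2 -> one track
print(group_quality_streams(modes))
```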
Furthermore, although not depicted, the MPD file in the second embodiment is similar to the MPD file of fig. 10 except for the following points. Specifically, in the MPD file in the second embodiment, the presentation element of the quality file (quality 1 file) has an association ID including not only vd1 and vd2 but also vo 1. Further, the presentation element of the quality file (quality 1 file) also includes a sub-presentation element having level 3 and vo1 as an association ID. Further, the presentation element of the quality file (quality 2 file) has an associated ID not of vo1 but of vd 2.
Further, although not depicted, the file generation process in the second embodiment is the same as the file generation process of fig. 12 except for the following: at step S14, the encoded stream of metadata including quality information is divided not for each type of depth-related image but for each texture image.
Further, although not described, the reproduction process in the second embodiment is the same as that of fig. 14 except that, at step S36, the MPD processing unit 102 supplies, from among the acquisition information of the quality information of the depth images registered in the list, only the acquisition information of the quality information stored in the quality file of the texture image to be acquired to the quality information acquisition unit 103.
As described above, the file generating apparatus 11 in the second embodiment divides the quality information of the depth occlusion image for each texture image to be reproduced together with the depth occlusion image, and sets the division of the quality information into different quality files. Thus, the number of quality files can be reduced compared to the alternative case where the quality information is set in different quality files for different depth occlusion images. Therefore, it can be said that the quality information of the depth occlusion image can be efficiently stored. Further, the amount of processing associated with acquiring quality information by video reproduction terminal 14 can be reduced.
Further, the video reproduction terminal 14 may acquire quality information from within the following quality files: in the quality file, only quality information of a depth occlusion image to be reproduced together with a texture image of a reproduction target is stored. Therefore, the acquisition efficiency of the quality information can be improved compared to an alternative case of acquiring the quality information from a quality file storing the quality information of all the depth occlusion images.
< third embodiment >
(example of clip files)
The configuration of the third embodiment of the information processing system to which the present disclosure is applied is substantially the same as that of the information processing system 10 of fig. 1 except for the following aspects: the quality information of the depth occlusion images is replaced by quality information of the 3D images or texture images reproduced in the modes to be reproduced; the encoded stream of the quality information is divided for each type of reproduction mode corresponding to the quality information; and the divisions of the encoded stream are arranged in different quality files. Therefore, hereinafter, descriptions other than the description of the quality file are appropriately omitted.
Fig. 17 is a diagram depicting an example of a clip file generated by the clip file generating unit 83 in the third embodiment of the information processing system to which the present disclosure is applied.
The clip file of fig. 17 is the same as the clip file of fig. 5 except for the quality file.
As shown in fig. 17, the clip file generating unit 83 divides the encoded stream of metadata including the quality information of the modes 1 to 7 supplied from the encoding unit 82 into three, one division for each type of reproduction mode. Then, the clip file generating unit 83 sets the divisions of the encoded stream into different quality files in units of clips to generate quality files.
Specifically, the clip file generating unit 83 generates the following quality file (quality 1 file): a quality file in which encoded streams of metadata including the quality information of the reproduction modes 1 and 2, in which reproduction is performed using only texture images, are set in units of clips. In the quality file (quality 1 file), the respective encoded streams are set in different tracks (quality track (texture 1) and quality track (texture 2)).
Further, the clip file generating unit 83 generates the following quality file (quality 2 file): a quality file in which encoded streams of metadata including the quality information of the reproduction modes 3, 4, and 7, in which reproduction is performed using only texture images and depth images, are set in units of clips. In the quality file (quality 2 file), the respective encoded streams are set on different tracks (quality track (texture 1 + depth 1), quality track (texture 1 + depth 2), and quality track (texture 2 + depth 2)).
Further, the clip file generating unit 83 generates the following quality file (quality 3 file): a quality file in which encoded streams of metadata including the quality information of the reproduction modes 5 and 6, in which reproduction is performed using a texture image, a depth image, and an occlusion image, are set in units of clips. In the quality file (quality 3 file), the respective encoded streams are set in different tracks (quality track (texture 1 + depth 1 + occlusion 1) and quality track (texture 1 + depth 2 + occlusion 1)).
As described above, the clip file generating unit 83 sets, into quality files, encoded streams of metadata including the quality information of the texture images or 3D images to be reproduced in the modes to be reproduced. Accordingly, the video reproduction terminal 14 can perform reproduction such that the quality of the texture image or 3D image finally reproduced is high, based on the quality information.
Further, the clip file generating unit 83 files the encoded stream of the quality information of the mode to be reproduced separately for each type of reproduction mode. Therefore, the video reproduction terminal 14 can easily acquire the quality information of the reproduction mode to be a candidate from the quality files of the various types of reproduction modes to be candidates.
In contrast, in the case where the encoded streams of the quality information of all reproduction modes are collectively filed, it is necessary to acquire the quality information of the reproduction mode to be a candidate from the following files: these files also include quality information of unnecessary reproduction modes that are not the types of reproduction modes that become candidates.
(example of description of MPD File)
Fig. 18 is a diagram depicting an example of the description of an MPD file in the third embodiment.
The description of the MPD file of fig. 18 is the same as that of fig. 10, except for the adaptation set element for the quality file.
In the MPD file of fig. 18, a quality file (quality 1 file) group, another quality file (quality 2 file) group, and yet another quality file (quality 3 file) group are grouped by one adaptation set element.
In the adaptive set element for a quality file, presentation elements corresponding to a quality file (quality 1 file) group, a quality file (quality 2 file) group, and a quality file (quality 3 file) group, respectively, are described.
In the presentation element corresponding to the quality file (quality 1 file) group, vq1 is described as a presentation id, and "quality 1.mp 4" is described as a root domain name.
Meanwhile, the quality information stored in the quality file (quality 1 file) is the quality information of the texture file (texture 1 file) and another texture file (texture 2 file). Therefore, in the presentation element corresponding to the quality file (quality 1 file), vt1 and vt2 as the presentation IDs of the texture file (texture 1 file) group and the texture file (texture 2 file) group are described as the associated IDs.
In the example of fig. 18, the tracks corresponding to level 1 and level 2 store the quality information of the texture file (texture 1 file) and the texture file (texture 2 file), respectively.
Therefore, in the presentation element corresponding to the quality file (quality 1 file) group, <SubRepresentation level="1" associationId="vt1">, which associates level 1 with vt1 as an association ID, where vt1 is the presentation ID of the texture file (texture 1 file), is described. Further, <SubRepresentation level="2" associationId="vt2">, which associates level 2 with vt2 as an association ID, where vt2 is the presentation ID of the texture file (texture 2 file), is described. In other words, the correspondence between the respective tracks of the quality file (quality 1 file) and the Representation for specifying a texture image (texture image specifying information) is described.
Meanwhile, among the presentation elements corresponding to the quality file (quality 2 file) group, vq2 is described as a presentation id, and "quality 2.mp 4" is described as a root domain name.
Further, the quality information stored in the quality file (quality 2 file) is quality information of the following 3D image: a 3D image to be reproduced using a texture file (texture 1 file) and a depth file (depth 1 file); another 3D image to be reproduced using a texture file (texture 1 file) and a depth file (depth 2 file); and a further 3D image to be reproduced using the texture file (texture 2 file) and the depth file (depth 2 file).
Therefore, in the presentation elements corresponding to the quality file (quality 2 file) group, vt1, vt2, vd1, and vd2 as the presentation IDs of the texture file (texture 1 file), texture file (texture 2 file), depth file (depth 1 file), and depth file (depth 2 file) are described as the association IDs.
Further, in the example of fig. 18, a track corresponding to level 1 stores quality information of a 3D image to be reproduced using a texture file (texture 1 file) and a depth file (depth 1 file). The track corresponding to level 2 stores quality information of a 3D image to be reproduced using a texture file (texture 1 file) and a depth file (depth 2 file). The track corresponding to level 3 stores quality information of a 3D image to be reproduced using a texture file (texture 2 file) and a depth file (depth 2 file).
Therefore, in the presentation element corresponding to the quality file (quality 2 file) group, <SubRepresentation level="1" associationId="vt1 vd1">, which associates level 1 with vt1 and vd1 as association IDs, where vt1 and vd1 are the presentation IDs of the texture file (texture 1 file) and the depth file (depth 1 file), is described. Further, <SubRepresentation level="2" associationId="vt1 vd2">, which associates level 2 with vt1 and vd2 as association IDs, where vt1 and vd2 are the presentation IDs of the texture file (texture 1 file) and the depth file (depth 2 file), is described.
Further, <SubRepresentation level="3" associationId="vt2 vd2">, which associates level 3 with vt2 and vd2 as association IDs, where vt2 and vd2 are the presentation IDs of the texture file (texture 2 file) and the depth file (depth 2 file), is described.
Further, in the representation element corresponding to the quality file (quality 3 file), vq3 is described as a representation id, and "quality 3.mp 4" is described as a root domain name.
Further, the quality information stored in the quality file (quality 3 file) is quality information of the following 3D image: a 3D image to be reproduced using a texture file (texture 1 file), a depth file (depth 1 file), and an occlusion file (occlusion file); and another 3D image to be reproduced using the texture file (texture 1 file), the depth file (depth 2 file), and the occlusion file (occlusion file).
Therefore, in the presentation elements corresponding to the quality file (quality 3 file) group, vt1, vd1, vd2, and vo1 as the presentation IDs of the texture file (texture 1 file), the depth file (depth 1 file), the depth file (depth 2 file), and the occlusion file (occlusion file) are described as the association IDs.
Further, in the example of fig. 18, the track corresponding to level 1 stores quality information of a 3D image to be reproduced using a texture file (texture 1 file), a depth file (depth 1 file), and an occlusion file (occlusion file). The track corresponding to level 2 stores quality information of a 3D image to be reproduced using a texture file (texture 1 file), a depth file (depth 2 file), and an occlusion file (occlusion file).
Therefore, in the presentation element corresponding to the quality file (quality 3 file) group, <SubRepresentation level="1" associationId="vt1 vd1 vo1">, which associates level 1 with vt1, vd1, and vo1 as association IDs, where vt1, vd1, and vo1 are the presentation IDs of the texture file (texture 1 file), the depth file (depth 1 file), and the occlusion file (occlusion file), is described. Further, <SubRepresentation level="2" associationId="vt1 vd2 vo1">, which associates level 2 with vt1, vd2, and vo1 as association IDs, where vt1, vd2, and vo1 are the presentation IDs of the texture file (texture 1 file), the depth file (depth 2 file), and the occlusion file (occlusion file), is described.
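The level-to-association mapping described for fig. 18 can be read programmatically as sketched below. The XML fragment is a hand-written approximation of the described MPD elements, not the patent's literal file, and the function name is hypothetical.

```python
# Hedged sketch: extract the level -> associationId mapping from a
# Representation element like the quality-3-file one described above.
import xml.etree.ElementTree as ET

FRAGMENT = """
<Representation id="vq3" associationId="vt1 vd1 vd2 vo1">
  <SubRepresentation level="1" associationId="vt1 vd1 vo1"/>
  <SubRepresentation level="2" associationId="vt1 vd2 vo1"/>
</Representation>
"""

def level_map(xml_text):
    root = ET.fromstring(xml_text)
    # associationId is a whitespace-separated list of presentation IDs
    return {
        sub.get("level"): sub.get("associationId").split()
        for sub in root.findall("SubRepresentation")
    }

print(level_map(FRAGMENT))
```

A reproduction terminal could use such a mapping to locate the quality track (level) that corresponds to the texture/depth/occlusion files it intends to acquire.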
Note that, in the example of fig. 18, although the bandwidth is not described in the adaptation set element of the quality file, the bandwidth may be described.
Further, although not depicted, the file generation process in the third embodiment is the same as the file generation process of fig. 12 except for the following: the quality information acquired at step S11 is the quality information of a 3D image or texture image reproduced in a mode to be reproduced, and, at step S14, the encoded stream of metadata including the quality information is divided not for each type of depth-related image but for each type of reproduction mode.
(description of first example of processing of streaming reproduction Unit)
Fig. 19 is a flowchart showing a first example of reproduction processing of the streaming reproduction unit 100 in the third embodiment. Note that in the reproduction processing of fig. 19, the streaming reproduction unit 100 performs reproduction of a 3D image using a texture image and a depth image.
The processing at steps S91 to S95 of fig. 19 is similar to the processing at steps S31 to S35 of fig. 14, respectively, and thus the description thereof is omitted. The processing at steps S94 to S103 is performed in units of clips.
At step S96, the MPD processing unit 102 creates a list of reproduction modes for reproducing texture images to be acquired.
For example, assume that the maximum acceptable bit rate of the depth image in the MPD processing unit 102 is 20% of the network bandwidth. Then, in a case where the texture image to be acquired is a texture image of a texture file (texture 1 file), and the bit rates of the depth images of the depth file (depth 1 file) and the depth file (depth 2 file) are lower than the maximum acceptable bit rate, the MPD processing unit 102 creates a list in which reproduction modes 3 and 4 are registered.
On the other hand, in a case where the texture image to be acquired is a texture image of a texture file (texture 2 file), and the bit rate of the depth image of a depth file (depth 2 file) is lower than the maximum acceptable bit rate, the MPD processing unit 102 creates a list in which the reproduction mode 7 is registered. Then, the MPD processing unit 102 supplies acquisition information of quality information of reproduction modes registered in the list to the quality information acquisition unit 103.
It is to be noted that, in the case where the bit rates of all depth images to be reproduced together with the texture image to be acquired are equal to or higher than the maximum acceptable bit rate, nothing is registered in the list of reproduction modes, and only the encoded stream of the texture image is acquired, decoded, and displayed, and then the process proceeds to step S103.
At step S97, the quality information acquisition unit 103 requests the Web server 12 for an encoded stream of metadata including quality information of a predetermined reproduction mode based on the acquisition information supplied from the MPD processing unit 102 to acquire the encoded stream. The quality information acquisition unit 103 supplies the acquired encoded stream to the decoding unit 104.
At step S98, the decoding unit 104 decodes the encoded stream of the quality information of the predetermined reproduction mode supplied from the quality information acquisition unit 103, and generates metadata including the quality information of the predetermined reproduction mode. The decoding unit 104 supplies quality information of a predetermined reproduction mode to the MPD processing unit 102.
At step S99, the MPD processing unit 102 determines a depth image to be acquired from depth images registered in the list and to be used for reproduction of the reproduction mode, based on the quality information supplied from the decoding unit 104.
For example, the MPD processing unit 102 determines a depth image to be used for reproduction in the following reproduction modes as a depth image to be acquired: the reproduction mode with the best quality indicated by the quality information; a reproduction mode whose quality indicated by the quality information is closest to that of the reproduction mode of the immediately preceding segment (or sub-segment); or a reproduction mode in which the quality indicated by the quality information is an acceptable quality and the bit rate is also lowest.
In the case where the depth image to be used for reproduction in the reproduction mode whose quality indicated by the quality information is closest to that of the reproduction mode of the immediately preceding segment (or sub-segment) is determined as the depth image to be acquired, a sense of incongruity in the appearance of the reproduced 3D image can be reduced. The MPD processing unit 102 supplies acquisition information of the depth image to be acquired to the image acquisition unit 105.
The processing at steps S100 to S103 is similar to that at steps S40 to S43 of fig. 14, respectively, and thus description thereof is omitted.
It should be noted that although not depicted, the reproduction processing when the streaming reproduction unit 100 performs reproduction of a 3D image using a texture image, a depth image, and an occlusion image is similar to the reproduction processing of fig. 19 except for the following points.
Specifically, the candidates of the reproduction mode selected at step S93 of fig. 19 are reproduction modes 5 and 6. Further, the depth image in the processing at steps S99 to S102 is replaced with the depth image and the occlusion image. It is noted that in this case the maximum acceptable bit rates for the texture image, the depth image and the occlusion image are, for example, 70%, 15% and 15% of the network bandwidth, respectively.
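The example split above amounts to fixed shares of the measured network bandwidth; a minimal sketch (the 70%/15%/15% ratios come from the text, the function name and integer arithmetic are assumptions):

```python
def max_acceptable_bitrates(network_bandwidth_bps):
    # Example split from the text: texture 70%, depth 15%, occlusion 15%
    # of the measured network bandwidth.
    return {
        "texture": network_bandwidth_bps * 70 // 100,
        "depth": network_bandwidth_bps * 15 // 100,
        "occlusion": network_bandwidth_bps * 15 // 100,
    }

print(max_acceptable_bitrates(10_000_000))
```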
(description of second example of processing of streaming reproduction Unit)
Fig. 20 is a flowchart showing a second example of reproduction processing of the streaming reproduction unit 100 in the third embodiment. It should be noted that in the reproduction process of fig. 20, the streaming reproduction unit 100 performs reproduction of a 3D image using a texture image and a depth image.
The reproduction process of fig. 20 differs from that of fig. 19 mainly in that the ratios of the maximum acceptable bit rates of the texture image and the depth image to the network bandwidth are not determined.
The processing at steps S121 to S124 of fig. 20 is similar to the processing at steps S91 to S94 of fig. 19, respectively, and thus the description thereof is omitted. The processing at steps S124 to S132 is performed in units of clips.
At step S125, the MPD processing unit 102 creates a list of reproduction modes based on the candidates of the reproduction modes, the network bandwidth of the internet 13 provided from the measurement unit 108, and the bit rates of the texture image and the depth image.
Specifically, the MPD processing unit 102 creates the following list: in this list, from among the reproduction modes 3, 4, and 7, each reproduction mode in which the sum of the bit rates of the texture image and the depth image to be used for reproduction does not exceed the network bandwidth is registered.
It should be noted that the lower limits of the bit rates of the texture image and the depth image may be determined in advance so that any reproduction mode in which at least one of the bit rates of the texture image and the depth image to be used for reproduction is lower than the lower limit thereof is excluded from the reproduction modes registered in the list.
Further, in the case where the sum of the bit rates of the texture image and the depth image to be used for reproduction exceeds the network bandwidth in all of the reproduction modes 3, 4, and 7, nothing is registered in the list of reproduction modes. In that case, only the encoded stream of the texture image having the highest bit rate that does not exceed the network bandwidth is acquired, decoded, and displayed, and the process proceeds to step S132.
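The list creation at step S125, the optional lower-limit exclusion, and the texture-only fallback can be sketched as follows; the candidate table, bit rates, and function names are illustrative assumptions:

```python
# Sketch of list creation at step S125: register every candidate mode whose
# combined texture + depth bit rate fits in the network bandwidth, optionally
# excluding modes in which either bit rate falls below a predetermined lower limit.

def build_mode_list(candidates, bandwidth, tex_floor=0, depth_floor=0):
    registered = []
    for mode, tex_bps, depth_bps in candidates:
        if tex_bps + depth_bps > bandwidth:
            continue  # sum of bit rates exceeds the network bandwidth
        if tex_bps < tex_floor or depth_bps < depth_floor:
            continue  # below the predetermined lower limit
        registered.append(mode)
    return registered  # empty list -> fall back to texture-only reproduction

candidates = [("mode3", 4_000_000, 2_000_000),
              ("mode4", 4_000_000, 1_000_000),
              ("mode7", 2_000_000, 1_000_000)]
print(build_mode_list(candidates, bandwidth=5_500_000))  # ['mode4', 'mode7']
```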
The processing at steps S126 and S127 is similar to the processing at steps S97 and S98 of fig. 19, respectively, and thus the description thereof is omitted.
At step S128, the MPD processing unit 102 determines, based on the quality information, a combination of a texture image and a depth image to be acquired from among the combinations of texture images and depth images registered in the list and to be used for reproduction in the reproduction modes, similarly to the processing at step S69 of fig. 15.
The processing at steps S129 to S132 is similar to the processing at steps S100 to S103 of fig. 19, respectively, and thus the description thereof is omitted.
It should be noted that, although not depicted, the second example of the reproduction processing when the streaming reproduction unit 100 performs reproduction of a 3D image using a texture image, a depth image, and an occlusion image is similar to the reproduction processing of fig. 20 except for the following points.
Specifically, the candidates of the reproduction mode selected at step S123 of fig. 20 are reproduction modes 5 and 6. Further, the depth image in the processing at steps S128 to S131 includes both the depth image and the occlusion image.
As described above, the file generating apparatus 11 in the third embodiment divides the quality information of the mode to be reproduced into quality information for each type of reproduction mode, and sets the divided quality information into different quality files. Therefore, the number of quality files can be reduced as compared with an alternative case where the quality information is set in different quality files between different modes to be reproduced. Therefore, it can be said that the quality information can be efficiently stored. Further, the amount of processing associated with the acquisition of quality information by video reproduction terminal 14 can be reduced.
Further, the video reproduction terminal 14 may acquire the quality information from a quality file in which only the quality information of the type of reproduction mode that becomes a candidate is stored. Therefore, the efficiency of acquiring the quality information of the candidate reproduction modes can be improved as compared with an alternative case of acquiring the quality information from a quality file in which the quality information of all reproduction modes is stored.
< fourth embodiment >
(example of clip files)
The configuration of the fourth embodiment of the information processing system to which the present disclosure is applied is substantially the same as that of the third embodiment except for the following aspects: the encoded stream of metadata including quality information is divided not for each type of reproduction mode but for each texture image used for reproduction, and each of the divisions is set into a different quality file. Therefore, in the following description, descriptions other than the description of the quality file are appropriately omitted.
Fig. 21 depicts an example of a clip file generated by the clip file generating unit 83 of the fourth embodiment of the information processing system to which the present disclosure is applied.
The clip file of fig. 21 is identical to the clip file of fig. 5, except for the quality file.
As shown in fig. 21, the clip file generating unit 83 divides the encoded stream of metadata including the quality information of reproduction modes 1 to 7 supplied from the encoding unit 82 into two, one for each texture image to be used for reproduction. Then, the clip file generating unit 83 sets the divisions of the encoded stream of metadata into different quality files in units of clips to generate the quality files.

Specifically, the clip file generating unit 83 generates the following quality file (quality 1 file): in which the encoded streams of the quality information of reproduction modes 1 and 3 to 6, which are reproduced using the texture file (texture 1 file), are set in units of clips. In the quality file (quality 1 file), each encoded stream is set on a different track (quality track (texture 1), quality track (texture 1 + depth 1), quality track (texture 1 + depth 2), quality track (texture 1 + depth 1 + occlusion), and quality track (texture 1 + depth 2 + occlusion)).

Further, the clip file generating unit 83 generates the following quality file (quality 2 file): in which the encoded streams of the quality information of reproduction modes 2 and 7, which are reproduced using the texture file (texture 2 file), are set in units of clips. In the quality file (quality 2 file), each encoded stream is set on a different track (quality track (texture 2) and quality track (texture 2 + depth 2)).
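The division described above amounts to grouping the modes to be reproduced by the texture file they use, one quality file per group; a minimal sketch with purely hypothetical mode names:

```python
from collections import defaultdict

# Hypothetical table: which texture file each reproduction mode uses
# (the names are illustrative, not the mode numbering of the patent).
mode_texture = {"mode_a": "texture1", "mode_b": "texture1", "mode_c": "texture2"}

quality_files = defaultdict(list)
for mode, texture in sorted(mode_texture.items()):
    quality_files[texture].append(mode)  # one quality file per texture file

print(dict(quality_files))  # {'texture1': ['mode_a', 'mode_b'], 'texture2': ['mode_c']}
```

Each group then becomes one quality file, with one track per mode inside it, as in fig. 21.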
As described above, the clip file generating unit 83 separately files the encoded stream of the quality information of the reproduction mode for each texture image. Accordingly, by acquiring the quality information of the quality file of the texture image to be acquired, the video reproduction terminal 14 can easily acquire the quality information of the mode to be reproduced for performing reproduction using the texture image.
Further, although not depicted, the MPD file in the fourth embodiment is similar to the MPD file of fig. 18 except for the following point. Specifically, in the MPD file in the fourth embodiment, the number of representation elements included in the adaptation set element for the quality files is two.
The representation element of the first quality file (quality 1 file) has vt1, vd1, vd2, and vo1 as association IDs. Further, the representation element of the quality file (quality 1 file) includes five sub-representation elements having level 1 and vt1 as association ID; level 2 and vt1 and vd1 as association IDs; level 3 and vt1 and vd2 as association IDs; level 4 and vt1, vd1, and vo1 as association IDs; and level 5 and vt1, vd2, and vo1 as association IDs, respectively.
The representation element of the second quality file (quality 2 file) has vt2 and vd2 as association IDs. Further, the representation element of the quality file (quality 2 file) includes two sub-representation elements having level 1 and vt2 as association ID and level 2 and vt2 and vd2 as association IDs, respectively.
Further, although not depicted, the file generation process in the fourth embodiment is the same as that in the third embodiment except for the following: the encoded stream of metadata including quality information is divided not for each type of reproduction mode but for each texture image.
Further, although not depicted, the reproduction processing in the fourth embodiment is the same as that of fig. 19 or 20.
As described above, the file generating apparatus 11 in the fourth embodiment divides the quality information of the mode to be reproduced for each texture image, and sets the division of the quality information into separate and different quality files. Therefore, the number of quality files can be reduced as compared with an alternative case where the quality information is set in different quality files between different modes to be reproduced. Therefore, it can be said that the quality information of the mode to be reproduced can be efficiently stored. Further, the amount of processing associated with the acquisition of quality information by video reproduction terminal 14 can be reduced.
Further, the video reproduction terminal 14 may acquire quality information from the following quality files: the quality file stores therein only quality information of a mode to be reproduced in which reproduction is performed using a texture image of a reproduction target. Therefore, it is possible to improve the efficiency of acquiring quality information of a mode to be reproduced in which reproduction is performed using a texture image of a reproduction target, compared to an alternative case in which quality information is acquired from a quality file in which quality information of all reproduction modes is stored.
< fifth embodiment >
(example of clip files)
The configuration of the fifth embodiment of the information processing system to which the present disclosure is applied is substantially the same as that of the third embodiment except for the following aspect: the encoded stream of metadata including quality information is divided not only for each type of reproduction mode but also for each texture image, and the divisions are set into different quality files. Specifically, the fifth embodiment is a combination of the third embodiment and the fourth embodiment. Therefore, in the following description, descriptions other than the description of the quality file are appropriately omitted.
Fig. 22 depicts an example of a clip file generated by the clip file generating unit 83 of the fifth embodiment of the information processing system to which the present disclosure is applied.
The clip file of fig. 22 is identical to the clip file of fig. 5, except for the quality file.
As shown in fig. 22, the clip file generating unit 83 divides the encoded stream of metadata including the quality information of the reproduction modes 1 to 7 supplied from the encoding unit 82 into five for each type of reproduction mode and for each texture image to be used for reproduction in each reproduction mode. Then, the clip file generating unit 83 sets the division of the encoded stream into different quality files in units of clips to generate quality files.
Specifically, the clip file generating unit 83 generates the following quality file (quality 1 file): in which the encoded stream of the quality information of reproduction mode 1, which is reproduced using only the texture file (texture 1 file), is set in units of clips.

Further, the clip file generating unit 83 generates the following quality file (quality 2 file): in which the encoded streams of the quality information of reproduction modes 3 and 4, which are reproduced using the texture file (texture 1 file) and a depth file, are set in units of clips. In the quality file (quality 2 file), each encoded stream is set on a different track (quality track (texture 1 + depth 1) and quality track (texture 1 + depth 2)).

Further, the clip file generating unit 83 generates the following quality file (quality 3 file): in which the encoded streams of the quality information of reproduction modes 5 and 6, which are reproduced using the texture file (texture 1 file), a depth file, and an occlusion file, are set in units of clips. In the quality file (quality 3 file), each encoded stream is set on a different track (quality track (texture 1 + depth 1 + occlusion) and quality track (texture 1 + depth 2 + occlusion)).

Further, the clip file generating unit 83 generates the following quality file (quality 4 file): in which the encoded stream of the quality information of reproduction mode 2, which is reproduced using only the texture file (texture 2 file), is set in units of clips.

Further, the clip file generating unit 83 generates the following quality file (quality 5 file): in which the encoded stream of the quality information of reproduction mode 7, which is reproduced using the texture file (texture 2 file) and a depth file, is set in units of clips.
As described above, the clip file generating unit 83 separately files the encoded streams of the quality information of the reproduction modes for each type of reproduction mode and for each texture image. Therefore, the video reproduction terminal 14 can easily acquire, from a single quality file, the quality information of the plural modes to be reproduced that are of the candidate type and perform reproduction using the texture image of the reproduction target.
Also, although not depicted, the MPD file in the fifth embodiment is similar to the MPD file of fig. 18 except for the following point. Specifically, in the MPD file in the fifth embodiment, the number of representation elements included in the adaptation set element for the quality files is five.
The representation element of the first quality file (quality 1 file) has vt1 as its association ID, and the representation element of the fourth quality file (quality 4 file) has vt2 as its association ID. The representation element of the fifth quality file (quality 5 file) has vt2 and vd2 as association IDs.
The representation element of the second quality file (quality 2 file) has vt1, vd1, and vd2 as association IDs. Meanwhile, the representation element of the quality file (quality 2 file) includes two sub-representation elements having level 1 and vt1 and vd1 as association IDs and level 2 and vt1 and vd2 as association IDs, respectively.
The representation element of the third quality file (quality 3 file) has vt1, vd1, vd2, and vo1 as association IDs. Meanwhile, the representation element of the quality file (quality 3 file) includes two sub-representation elements having level 1 and vt1, vd1, and vo1 as association IDs, and level 2 and vt1, vd2, and vo1 as association IDs, respectively.
In addition, although not depicted, the file generation process in the fifth embodiment is the same as that in the third embodiment except that the encoded stream including the metadata of the quality information is divided not only for each type of reproduction mode but also for each texture image. Further, although not depicted, the reproduction processing in the fifth embodiment is the same as that of fig. 19 or 20.
As described above, the file generating apparatus 11 in the fifth embodiment divides the quality information of the modes to be reproduced for each type of reproduction mode and for each texture image, and sets the divisions of the quality information into different quality files. Therefore, the number of quality files can be reduced as compared with an alternative case where the quality information is set in different quality files between different modes to be reproduced. Therefore, it can be said that the quality information of the modes to be reproduced can be efficiently stored. Further, the amount of processing associated with the acquisition of quality information by the video reproduction terminal 14 can be reduced.
Further, the video reproduction terminal 14 may acquire the quality information from a quality file storing only the quality information of the modes to be reproduced that are of the candidate type and perform reproduction using the texture image of the reproduction target. Therefore, the efficiency of acquiring the quality information of the modes to be reproduced in which reproduction is performed using the texture image of the reproduction target can be improved, as compared with an alternative case in which the quality information is acquired from a quality file in which the quality information of all reproduction modes is stored.
< sixth embodiment >
(example of clip files)
The configuration of the sixth embodiment of the information processing system to which the present disclosure is applied is substantially the same as that of the information processing system 10 of fig. 1 except that encoded streams of metadata including quality information are collectively set into a single quality file. Therefore, in the following description, descriptions other than the description of the quality file are appropriately omitted.
Fig. 23 is a diagram depicting an example of a clip file generated by the clip file generating unit 83 in the sixth embodiment of the information processing system to which the present disclosure is applied.
The clip file of fig. 23 is the same as the clip file of fig. 5 except for the quality file.
As shown in fig. 23, the clip file generating unit 83 generates one quality file (quality 1 file) by setting, in units of clips, the encoded streams of metadata including the quality information of the depth images of 2Mbps and 1Mbps and of the occlusion image of 1Mbps into the single quality file (quality 1 file).
In the quality file (quality 1 file), different tracks (quality track (depth 1), quality track (depth 2), and quality track (occlusion 1)) are assigned to different encoded streams.
(example of description of MPD File)
Fig. 24 is a diagram depicting an example of the description of an MPD file in the sixth embodiment.
The configuration of the MPD file of fig. 24 is the same as that of fig. 10, except for an adaptation set (adaptation set) element for a quality file.
In the MPD file of fig. 24, the quality file (quality 1 file) is grouped by one adaptation set element.
In the adaptation set element for the quality file, a representation element corresponding to the quality file (quality 1 file) is described. In the representation element corresponding to the quality file (quality 1 file), vq1 is described as the representation ID, and "quality1.mp4" is described as the BaseURL.
Further, the quality information stored in the quality file (quality 1 file) is the quality information of the depth file (depth 1 file), the depth file (depth 2 file), and the occlusion file (occlusion 1 file). Therefore, in the representation element corresponding to the quality file (quality 1 file), vd1, vd2, and vo1, which are the representation IDs of the depth file (depth 1 file), the depth file (depth 2 file), and the occlusion file (occlusion 1 file), are described as association IDs.
Further, in the example of fig. 24, tracks corresponding to level 1 to level 3 store quality information of a depth file (depth 1 file), a depth file (depth 2 file), and an occlusion file (occlusion 1 file), respectively.
Therefore, in the representation element corresponding to the quality file (quality 1 file), <SubRepresentation level="1" associationId="vd1"> is described, which associates level 1 with vd1, the representation ID of the depth file (depth 1 file). Also described is <SubRepresentation level="2" associationId="vd2">, which associates level 2 with vd2, the representation ID of the depth file (depth 2 file). Also described is <SubRepresentation level="3" associationId="vo1">, which associates level 3 with vo1, the representation ID of the occlusion file (occlusion 1 file).
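The level-to-representation association above can be sketched by parsing a minimal Representation fragment; the XML below is an illustrative reconstruction based on the description of fig. 24, not the exact MPD text:

```python
import xml.etree.ElementTree as ET

# Minimal illustrative fragment of the representation element for quality1.mp4.
mpd = """
<Representation id="vq1" associationId="vd1 vd2 vo1">
  <SubRepresentation level="1" associationId="vd1"/>
  <SubRepresentation level="2" associationId="vd2"/>
  <SubRepresentation level="3" associationId="vo1"/>
</Representation>
"""

rep = ET.fromstring(mpd)
# Map each level (i.e. each quality track) to the representation it measures.
level_map = {sub.get("level"): sub.get("associationId")
             for sub in rep.findall("SubRepresentation")}
print(level_map)  # {'1': 'vd1', '2': 'vd2', '3': 'vo1'}
```

With this mapping the terminal knows which level of the quality file to read in order to obtain the quality information of a given depth or occlusion representation.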
It should be noted that although in the example of fig. 24, the bandwidth is not described in the adaptation set element for the quality file, the bandwidth may be described.
Further, although not described, the file generation process in the sixth embodiment is the same as the file generation process of fig. 12 except for the following: the process at step S14 is not performed, and the encoded stream of metadata including quality information is set in units of clips into one quality file at step S15.
Further, although not described, the reproduction processing in the sixth embodiment is the same as that in fig. 14 or fig. 15.
It is to be noted that although in the first embodiment and the sixth embodiment, the combination of images in a mode to be reproduced is described in an MPD file, the above description may not be made. In this case, candidates of the reproduction mode are selected from among reproduction modes using all combinations of images that can be combined.
Further, the encoded streams of quality information may be set in different quality files for the respective encoded streams.
< seventh embodiment >
(example of configuration of clip files)
The configuration of the seventh embodiment of the information processing system to which the present disclosure is applied is substantially the same as that of the information processing system 10 of fig. 1 except for the following aspects: the encoded streams of the depth occlusion images and the encoded streams of their quality information are divided for each type of depth occlusion image and set in different clip files; and the encoded streams of quality information within the same clip file are set on the same track. Therefore, in the following description, descriptions other than the description of the clip file for each type of depth occlusion image are appropriately omitted.
Fig. 25 is a diagram depicting an example of a clip file generated by the clip file generating unit 83 of the seventh embodiment of the information processing system to which the present disclosure is applied.
The texture file of fig. 25 is the same as the texture file of fig. 5. As shown in fig. 25, the clip file generating unit 83 divides the encoded streams of the depth images of 2Mbps and 1Mbps and the occlusion image of 1Mbps, together with the encoded streams of their quality information, into two for each type of depth occlusion image.

Then, the clip file generating unit 83 generates, in units of clips, clip files in which the respective divisions of the encoded streams of the images and of the encoded streams of the quality information are set.
Specifically, the clip file generating unit 83 generates the following depth file (depth 1 file): among them, coded streams of depth images of 2Mbps and 1Mbps and coded streams of quality information of depth images of 2Mbps and 1Mbps are set.
In the depth file (depth 1 file), the encoded streams of the quality information of the depth images of 2Mbps and 1Mbps are set on the same track (quality 1 track), while the encoded stream of the depth image of 2Mbps and the encoded stream of the depth image of 1Mbps are set on tracks (depth 1 track and depth 2 track) separate from each other.
Therefore, on the track (quality 1 track) for the encoded streams of the quality information of the depth images of 2Mbps and 1Mbps, the encoded stream of the quality information of the depth image of 2Mbps and that of the depth image of 1Mbps are set together in each sample.
Further, the clip file generating unit 83 generates the following occlusion file (occlusion 1 file): wherein, a coded stream of 1Mbps occlusion images and a coded stream of 1Mbps occlusion image quality information are set.
As described above, in the seventh embodiment, the encoded streams of quality information set in the clip files of the same depth occlusion image are set on the same track. Thus, the number of tracks in the clip file of the depth occlusion image can be reduced compared to the following alternative case: in this alternative case, the encoded streams of quality information set on the clip files of the same depth occlusion image are set on different tracks from each other for different encoded streams. As a result, the size of the clip file of the depth occlusion image can be reduced. Further, the load on the video reproduction terminal 14 can be reduced.
(example of configuration of sample)
Fig. 26 is a diagram depicting an example of a configuration of a sample of the track (quality 1 track) of fig. 25.
As shown in fig. 26, each sample of the track (quality 1 track) is divided into two sub-samples, and the encoded stream of the quality information of the depth image of 2Mbps and that of the depth image of 1Mbps are set separately into the two sub-samples.

In the example of fig. 26, in the first sub-sample of the i-th (i = 1, 2, ..., n) sample, the encoded stream of the quality information of the depth image of 2Mbps (depth 1 quality i) is set, and in the second sub-sample, the encoded stream of the quality information of the depth image of 1Mbps (depth 2 quality i) is set. Details of the sub-samples are described in ISO/IEC 23001-10.
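The sample layout above can be sketched as slicing each sample of the quality track by its sub-sample sizes; the byte payloads and sizes below are hypothetical (in an actual file the sizes are signalled as described in ISO/IEC 23001-10):

```python
def split_sample(sample, subsample_sizes):
    # Slice one sample of the quality track into its sub-samples:
    # here, the 2Mbps depth quality stream followed by the 1Mbps one.
    out, pos = [], 0
    for size in subsample_sizes:
        out.append(sample[pos:pos + size])
        pos += size
    return out

sample = b"QQQQ" + b"qq"               # hypothetical payloads of two sub-samples
print(split_sample(sample, [4, 2]))    # [b'QQQQ', b'qq']
```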
(example of configuration of moov frame of depth File)
Fig. 27 is a diagram depicting an example of the configuration of moov boxes (movie boxes) of a depth file (depth 1 file).
As shown in fig. 27, in the moov box of the depth file (depth 1 file), a trak box is set for each track. In the trak box, a tkhd box (track header box) in which a track ID (track_ID), an ID unique to the track, is described is provided.
In the example of fig. 27, the track ID of the track (depth 1 track) of the encoded stream provided with 2Mbps of depth images is 1, and the track ID of the track (depth 2 track) of the encoded stream provided with 1Mbps of depth images is 2. Further, the track ID of the track (quality 1 track) of the encoded stream provided with the quality information of the depth images of 2Mbps and 1Mbps is 3.
Further, in the trak box, a tref box (track reference box) in which the track IDs of other tracks related to the present track are described may be set. Specifically, the track (quality 1 track) is a track that stores the encoded streams of the quality information of the encoded streams of the depth images set on the track (depth 1 track) and the track (depth 2 track). Accordingly, a tref box in which 1 and 2, the track IDs of the track (depth 1 track) and the track (depth 2 track), are described is provided for the track (quality 1 track).
Accordingly, the video reproduction terminal 14 can recognize that the track (quality 1 track) has accommodated therein the encoded stream of the quality information of the encoded stream of the depth image set on the track (depth 1 track) and the track (depth 2 track).
However, the video reproduction terminal 14 cannot recognize which sub-sample of the track (quality 1 track) stores the encoded stream of the quality information of the encoded stream of the depth image accommodated on the track (depth 1 track) or the track (depth 2 track).
Therefore, in the seventh embodiment, a correspondence relationship between sub-samples and a track ID (track specifying information) for specifying a track of a depth occlusion image corresponding to an encoded stream of quality information set in the sub-samples (hereinafter referred to as a sub-sample track correspondence relationship) is described. Thus, video reproduction terminal 14 may identify quality information of the encoded stream of depth occlusion images stored in each subsample. As a result, the video reproduction terminal 14 can acquire an encoded stream of quality information of each depth occlusion image from the track (quality 1 track).
Although in the seventh embodiment the sub-sample track correspondence relationship is described in the QualityMetricsConfigurationBox, the QualityMetricsSampleEntry, the SubsampleInformationBox, or the SubsampleReferenceBox, the correspondence relationship may be described in any box other than the above.
(example of a description of QualityMetrics configuration Box)
Fig. 28 is a diagram depicting an example of the description of the QualityMetricsConfigurationBox in the case where the sub-sample track correspondence relationship is described in the QualityMetricsConfigurationBox set in the trak box of the track (quality 1 track).
In the QualityMetricsConfigurationBox of fig. 28, field_size_bytes and metric_count are described, and metric_codes equal in number to the metric_count are described. field_size_bytes, metric_count, and metric_code are similar to those in the case of fig. 6, except that the quality file is replaced with the track (quality 1 track).
In the seventh embodiment, in the depth file (depth 1 file), encoded streams of two types of quality information, for the track (depth 1 track) and the track (depth 2 track), are stored in the samples of the track (quality 1 track). Therefore, metric_count is 2.
Meanwhile, in the QualityMetricsConfigurationBox, 1, which indicates that the sub-sample track correspondence relationship can be described, is set as the flag.

In the case where the flag is 1, in the QualityMetricsConfigurationBox, a referred_track_in_file_flag indicating whether or not there is a track to be referred to in the depth file (depth 1 file) including the track (quality 1 track) is described for each metric_code.

In the case where the referred_track_in_file_flag of a metric_code is 1, in the QualityMetricsConfigurationBox, the reference_track_id_num of the sub-sample corresponding to the metric_code and track_IDs equal in number to the reference_track_id_num are described.

reference_track_id_num is the number of tracks to be referred to in the depth file (depth 1 file). track_ID is the track ID of a track to be referred to in the depth file (depth 1 file).
In the seventh embodiment, the first sub-sample of each sample of the track (quality 1 track) corresponds to the track (depth 1 track) in the depth file (depth 1 file), and the second sub-sample corresponds to the track (depth 2 track). Accordingly, the referred_track_in_file_flag of the metric_code corresponding to each sub-sample is set to 1, which indicates that there is a track to be referred to in the depth file (depth 1 file) including the track (quality 1 track).
Further, the reference_track_id_num of the metric_code corresponding to each sub-sample is 1. Further, the track_ID of the metric_code corresponding to the first sub-sample is 1, which is the track ID (track specifying information) specifying the track (depth 1 track) of the depth image corresponding to the encoded stream of the quality information set in that sub-sample. Meanwhile, the track_ID of the metric_code corresponding to the second sub-sample is 2, which is the track ID of the track (depth 2 track) of the depth image corresponding to the encoded stream of the quality information set in that sub-sample.
In this way, in the qualitymetrics configuration box, the track ID of the depth image corresponding to the sub-sample is described in the order in which the sub-samples are set in the sample, thereby describing the sub-sample track correspondence relationship.
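The resolution described above can be sketched as follows. This is an illustrative model, not the normative box syntax: the MetricEntry layout, the helper name, and the metric codes "dep1"/"dep2" are assumptions; the field names follow the text, with reference_track_id_num implied by the length of the track_id list.

```python
# Illustrative model (not normative ISOBMFF syntax) of how the extended
# QualityMetricsConfigurationBox fields resolve each sub-sample to the
# track(s) of its depth image: the i-th sub-sample in a sample corresponds
# to the i-th metric_code entry, whose track_id list names the depth track.
from dataclasses import dataclass
from typing import List

@dataclass
class MetricEntry:
    metric_code: str
    referred_track_in_file_flag: int  # 1: a track to refer to exists in this file
    track_ids: List[int]              # reference_track_id_num == len(track_ids)

def tracks_for_subsample(entries: List[MetricEntry], index: int) -> List[int]:
    """Return the track IDs for the sub-sample at the given 0-based index."""
    entry = entries[index]
    return entry.track_ids if entry.referred_track_in_file_flag == 1 else []

# Seventh-embodiment example: metric_count = 2; the first sub-sample refers
# to the depth 1 track (track_id 1), the second to the depth 2 track (2).
config = [MetricEntry("dep1", 1, [1]), MetricEntry("dep2", 1, [2])]
```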
(example of description of QualityMetricsSampleEntry)
Fig. 29 is a diagram depicting an example of description of QualityMetricsSampleEntry in the case where a sub-sample track correspondence relationship is described in the QualityMetricsSampleEntry set in the trak box of the track (quality 1 track).
In the QualityMetricsSampleEntry of fig. 29, a QualityMetricsReferenceBox is set. In the QualityMetricsReferenceBox, metric_count is described similarly as in the QualityMetricsConfigurationBox of fig. 28. Further, referred_track_in_file_flag is described in the order of the metric_codes of the QualityMetricsConfigurationBox, and in the case where the referred_track_in_file_flag of a metric_code is 1, the reference_track_id_num of the metric_code and track_ids equal in number to the reference_track_id_num are described.
(example of description of SubsampleInformationBox)
Fig. 30 is a diagram depicting an example of description of the SubsampleInformationBox in the case where a sub-sample track correspondence relationship is described in the SubsampleInformationBox set in the trak box of the track (quality 1 track).
The SubsampleInformationBox is a box that describes information about sub-samples. The trak box may have a plurality of SubsampleInformationBoxes having flags values different from each other. In the SubsampleInformationBox of fig. 30, 2, indicating that the sub-sample track correspondence can be described, can be set as the version.
In the case where the version is greater than 1, in the SubsampleInformationBox, a track_reference_is_exist_flag and a referred_track_in_file_flag are described for each sub-sample. track_reference_is_exist_flag is a flag indicating whether the extension for enabling description of the sub-sample track correspondence is necessary. This makes it possible to prevent the extension for enabling description of the sub-sample track correspondence from being performed uselessly when a value equal to or greater than 3 is set as the version.
In the seventh embodiment, since the extension for enabling description of the sub-sample track correspondence is necessary, track_reference_is_exist_flag is set to 1, which indicates that the extension is necessary.
In the case where both the track_reference_is_exist_flag and the referred_track_in_file_flag of a sub-sample are 1, in the SubsampleInformationBox, the reference_track_id_num of the sub-sample and track_ids equal in number to reference_track_id_num are described.
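The conditional field ordering just described can be sketched as follows. The flat integer list standing in for the serialized box is an assumption made purely to keep the ordering concrete; it is not the actual binary layout of the box.

```python
# Hedged sketch of the per-sub-sample field ordering described for the
# extended SubsampleInformationBox (version > 1): the two flags are always
# emitted, and reference_track_id_num plus the track_id list follow only
# when both flags are 1.
from typing import List, Tuple

# Each tuple: (track_reference_is_exist_flag, referred_track_in_file_flag, track_ids)
def emit_subsample_fields(subsamples: List[Tuple[int, int, List[int]]]) -> List[int]:
    out: List[int] = []
    for exist_flag, in_file_flag, track_ids in subsamples:
        out.append(exist_flag)
        out.append(in_file_flag)
        if exist_flag == 1 and in_file_flag == 1:
            out.append(len(track_ids))  # reference_track_id_num
            out.extend(track_ids)       # track_id, reference_track_id_num times
    return out
```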
(example of description of SubsampleReferenceBox)
Fig. 31 is a diagram depicting an example of description of a SubsampleReferenceBox in the case where a sub-sample track correspondence relationship is described in the SubsampleReferenceBox set in the trak box of the track (quality 1 track).
In the SubsampleReferenceBox of fig. 31, referred_track_in_file_flag is described for each sub-sample, and in the case where the referred_track_in_file_flag of a sub-sample is 1, the reference_track_id_num of the sub-sample and track_ids equal in number to the reference_track_id_num are described.
(example of description of MPD File)
Fig. 32 is a diagram depicting an example of the description of an MPD file in the seventh embodiment.
The configuration of the MPD file of fig. 32 is the same as that of the MPD file of fig. 10 except that the adaptation set elements for the depth files and for the occlusion files are configured differently and that no adaptation set element for quality files is set.
In the adaptation set element for depth files of fig. 32, a representation element corresponding to the depth file (depth 1 file) group is described.
In the representation element corresponding to the depth file (depth 1 file), vd1 is described as the representation id, and "depth1.mp4" is described as the BaseURL. Note that although the bandwidth is not described in the example of fig. 32, the bandwidth may be described.
Further, the texture file to be reproduced together with the depth file (depth 1 file) is the texture file (texture 1 file) or the texture file (texture 2 file). Therefore, in the representation element corresponding to the depth file (depth 1 file), vt1 and vt2 are described as the representation ids of the texture file (texture 1 file) and the texture file (texture 2 file).
Further, in the example of fig. 32, the track (depth 1 track) and the track (depth 2 track) are associated with level 1 and level 2, respectively, by the leva box.
Therefore, in the representation element corresponding to the depth file (depth 1 file) group of fig. 32, <SubRepresentation level="1" associationId="vt1">, which associates level 1 with vt1 as an associationId, where vt1 is the representation id of the texture file (texture 1 file) to be reproduced together with the track (depth 1 track), is described.
Similarly, <SubRepresentation level="2" associationId="vt1 vt2">, which associates level 2 with vt1 and vt2 as associationIds, where vt1 and vt2 are the representation ids of the texture file (texture 1 file) and the texture file (texture 2 file) to be reproduced together with the track (depth 2 track), is described.
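Putting the two sub-representation descriptions above together, the representation element for the depth file (depth 1 file) might look as follows. The attribute spellings (BaseURL, associationId) follow common MPEG-DASH conventions and are an assumption, since fig. 32 itself is not reproduced here.

```xml
<!-- Assumed reconstruction of the depth-file representation element of fig. 32 -->
<Representation id="vd1">
  <BaseURL>depth1.mp4</BaseURL>
  <SubRepresentation level="1" associationId="vt1"/>
  <SubRepresentation level="2" associationId="vt1 vt2"/>
</Representation>
```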
In the representation element corresponding to the occlusion file (occlusion 1 file) group, vo1 is described as the representation id, and "occlusion1.mp4" is described as the BaseURL. Note that although the bandwidth is not described in the example of fig. 32, the bandwidth may be described.
Further, the depth file to be reproduced together with the occlusion file (occlusion 1 file) is the depth file (depth 1 file). Therefore, in the representation element corresponding to the occlusion file (occlusion 1 file), vd1, which is the representation id of the depth file (depth 1 file), is described as the associationId.
It should be noted that although in the example of fig. 32, the combination of images to be used in the mode to be reproduced is not described in the MPD file, the combination may be described.
Further, although not depicted, the file generation process in the seventh embodiment is the same as that of fig. 12 except for the following: the corresponding quality information is set in the depth file and the occlusion file generated at step S13, and the processing at steps S14 and S15 is not performed.
Further, although not depicted, the reproduction processing in the seventh embodiment is the same as that of fig. 14 or fig. 15 except that the encoded stream is acquired also with reference to the sub-sample track correspondence.
In this way, the file generating apparatus 11 according to the seventh embodiment collectively sets the encoded streams of the plural types of quality information set in the clip file of the depth occlusion image into one track. Therefore, the number of tracks configuring the clip file of the depth occlusion image can be reduced as compared with the alternative case where the encoded streams of the different types of quality information are disposed on different tracks. In other words, the quality information of the depth occlusion image can be stored efficiently. Thus, the size of the clip file of the depth occlusion image is reduced. As a result, the transmission amount when the file generation apparatus 11 uploads the clip file of the depth occlusion image is reduced.
It should be noted that although in the seventh embodiment, the quality information is the quality information of the occlusion image, the quality information may also be the quality information of a 3D image reproduced using a texture image or a texture image and a depth occlusion image.
In this case, for example, a track of the texture image and a track of quality information of the texture image are set in the texture file, and a track of the depth image and a track of quality information of the 3D image reproduced using the texture image and the depth image are set in the depth file. Further, a track of occlusion images and a track of quality information of 3D images reproduced using texture images, depth images, and occlusion images are set in an occlusion file.
< eighth embodiment >
(example of configuration of clip files)
The configuration of the eighth embodiment of the information processing system to which the present disclosure is applied is substantially the same as that of the information processing system 10 of fig. 1 except that encoded streams of quality information in the same quality file are disposed on the same track. Therefore, in the following description, descriptions other than the description of the quality file (quality 1 file) are appropriately omitted.
Fig. 33 is a diagram describing an example of a clip file generated by the clip file generating unit 83 of the eighth embodiment of the information processing system to which the present disclosure is applied.
The clip file of fig. 33 is the same as the clip file of fig. 5 except for the quality file of the depth image (quality 1 file).
As shown in fig. 33, the clip file generating unit 83 generates a quality file (quality 1 file) in which the encoded streams of the quality information of the depth images of 2 Mbps and 1 Mbps are collectively set on one track (quality 1 track). On the track (quality 1 track), the encoded stream of the quality information of the depth image of 2 Mbps and that of the depth image of 1 Mbps are set together into the same samples.
The MPD file in the eighth embodiment is the same as the MPD file of fig. 10. Specifically, in the MPD file in the eighth embodiment, in a representation element corresponding to a quality file (quality 1 file), a correspondence between each level and an association ID for specifying a depth occlusion image is described.
(first example of description of leva frame)
Fig. 34 is a diagram depicting a first example of description of the leva box of the quality file (quality 1 file).
As shown in fig. 34, in the leva box of the quality file (quality 1 file), level_count, indicating the number of levels that the sub-representation elements corresponding to the quality file (quality 1 file) described in the MPD file have, is described.
Further, in the leva box of the quality file (quality 1 file), track_id, assignment_type, and so forth are described for levels equal in number to level_count, in order from level 1. assignment_type is the type of content associated with a level.
In the leva box of fig. 34, 5 can be set as assignment_type. In the example of fig. 34, 5 as assignment_type indicates that the type of content associated with a level is the metric_code described in the QualityMetricsConfigurationBox. Specifically, in the case where assignment_type is 5, the level i (i = 1, 2) and the i-th metric_code from the top described in the QualityMetricsConfigurationBox are associated with each other.
(description of the first example of levels and subsamples associated by the leva boxes)
Fig. 35 is a diagram showing a first example of sub-samples and levels in the quality file (quality 1 file) associated with each other by the leva box of fig. 34.
In the MPD file of the eighth embodiment, the number of levels that the sub-representation elements corresponding to the quality file (quality 1 file) have is two. Therefore, as shown in fig. 35, 2 is described as level_count in the leva box of the quality file (quality 1 file).
Further, the track corresponding to the two levels is the track having the track ID of 1 (quality 1 track). Therefore, in the leva box of the quality file (quality 1 file), 1 is described as the track_id of both levels. Further, 5 is described as the assignment_type of both levels.
Thus, with the leva box of fig. 35, the level i that each sub-representation element has is associated with the i-th metric_code from the top described in the QualityMetricsConfigurationBox. Further, through the description of the QualityMetricsConfigurationBox, the i-th metric_code from the top and the i-th sub-sample from the top are associated with each other.
As described above, by describing 5 as the assignment_type of the leva box, the level i described in the MPD file and the i-th sub-sample can be associated with each other. Therefore, 5 as assignment_type can be regarded as information for associating levels and sub-samples with each other.
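The chain of associations just described can be sketched as follows. The structure and names are illustrative stand-ins, not the normative box syntax, and the metric codes "dep1"/"dep2" are assumed placeholders.

```python
# Sketch of the association chain set up when assignment_type is 5 in the
# sense of fig. 34: level i -> i-th metric_code in the
# QualityMetricsConfigurationBox -> i-th sub-sample of each sample.
from typing import Dict, List, Tuple

def resolve_levels(track_id: int,
                   metric_codes: List[str]) -> Dict[int, Tuple[int, str, int]]:
    """Return level -> (track_id, metric_code, 0-based sub-sample index)."""
    return {i + 1: (track_id, code, i) for i, code in enumerate(metric_codes)}

# Eighth-embodiment example: level_count = 2, both levels on track_id 1
# (the quality 1 track).
levels = resolve_levels(1, ["dep1", "dep2"])
```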
(second example of description of the leva frame)
Fig. 36 is a diagram depicting a second example of description of the leva box of the quality file (quality 1 file).
The leva box of the quality file (quality 1 file) of fig. 36 has the same configuration as that of fig. 34 except for the meaning of the case in which 5 is set as assignment_type.
In the example of fig. 36, 5 as assignment_type indicates that the type of content associated with a level is a sub-sample whose information is described in a subs box (SubsampleInformationBox). Specifically, in the case where assignment_type is 5, the level i corresponds to the i-th sub-sample from the top whose information is described in the subs box. Further, in the leva box of fig. 36, in the case where assignment_type is 5, the flags of the subs box associated with the level i are described.
(description of the second example of levels and subsamples associated by the leva boxes)
Fig. 37 is a diagram showing a second example of sub-samples and levels in the quality file (quality 1 file) associated with each other by the leva box of fig. 36.
The leva box of fig. 37 is the same as that of fig. 35 except that 0 is described as the subsample_flag of each level.
Accordingly, with the leva box of fig. 37, the level i that each sub-representation element has is associated with the i-th sub-sample from the top whose information is described in the subs box whose flags are 0. Further, through the description of the subs box, the i-th sub-sample from the top whose information is described in the subs box can be specified.
By describing 5 as the assignment_type of the leva box in the manner described above, the sub-samples and the levels described in the MPD file can be associated with each other. Therefore, 5 as assignment_type can be regarded as information for associating levels and sub-samples with each other.
It should be noted that the two sub-samples arranged on the track (quality 1 track) may be grouped into one group. In this case, the sub-sample group entry (SubSampleGroupEntry) shown in fig. 38 is set in the trak box of the track (quality 1 track).
The sub-sample group entry of fig. 38 is an extension of SampleGroupDescriptionEntry; not only sub-sample group entries that store images but also sub-sample group entries that store any content other than images are based on this extended SampleGroupDescriptionEntry.
In the sub-sample group entry of fig. 38, the type of the group of sub-samples (sgss in the example of fig. 38) is described. Further, in the sub-sample group entry of fig. 38, code_parameter, which is the name of the sample entry of the sub-samples belonging to the group, and sub_sample_flags, which are the flags of the subs box, are described.
Further, as shown in fig. 39, the assignment_type of the leva box is set to 0, which indicates that the type of content associated with a level is information on samples belonging to a predetermined group described in the sgpd box (sample group description box).
Further, in the leva box, instead of the subsample_flag, the grouping_type described in the sgpd box associated with the level i (i = 1, 2) is described. grouping_type is the type of the group of sub-samples corresponding to the sgpd box. In the example of fig. 36, the type of the group to which the two sub-samples set on the track (quality 1 track) belong is sgss, and therefore the grouping_type corresponding to each level of the leva box is sgss.
With the leva box of fig. 39, the level i that each sub-representation element has is associated with the information that is described in the sgpd box and relates to the i-th sub-sample from the top among the information on the sub-samples belonging to the group whose grouping_type is sgss.
As the information related to a sub-sample, the name of the sample entry configuring the sample of the sub-sample and the flags of the subs box are described. In the example of fig. 36, the names of the sample entries configuring the samples of the two sub-samples set on the track (quality 1 track) are both vqme, and the flags of both are zero. Therefore, in the sgpd box, vqme and 0 are described as the information on each sub-sample.
Through the description of the sgpd box, the level i that each sub-representation element has can be associated with the i-th sub-sample from the top whose sample entry is named vqme and whose information is described in the subs box whose flags are 0. Then, through the description of the subs box, the i-th sub-sample from the top whose information is described in the subs box can be specified.
In this way, the levels and the sub-samples described in the MPD file can be associated with each other by the leva box of fig. 39.
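The group-based indirection described above can be sketched as follows. All structures are illustrative stand-ins for the actual boxes, not the normative syntax.

```python
# Sketch of the indirection used with assignment_type 0 (figs. 38 and 39):
# level i selects the i-th entry of the sgpd box whose grouping_type is
# "sgss"; that entry gives the sample entry name ("vqme") and the subs-box
# flags (0), which together identify the sub-sample.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SgpdEntry:
    sample_entry_name: str  # code_parameter of the sub-sample group entry
    subs_flags: int         # flags of the subs box describing the sub-sample

def resolve_level_via_group(entries: List[SgpdEntry],
                            level: int) -> Tuple[str, int, int]:
    """Return (sample entry name, subs-box flags, 0-based sub-sample index)."""
    entry = entries[level - 1]
    return entry.sample_entry_name, entry.subs_flags, level - 1

# Eighth-embodiment example: both sub-samples share the sample entry name
# vqme and subs-box flags 0.
sgpd = [SgpdEntry("vqme", 0), SgpdEntry("vqme", 0)]
```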
It should be noted that, in the eighth embodiment, the quality file may be divided as in the second to sixth embodiments.
Further, although in the first to eighth embodiments the sub-representation element is extended so that the association ID can be described, the sub-representation element may also be extended so that an attribute other than the association ID can be described.
For example, an attribute such as a sub-association ID (subassociationId) in which a representation ID corresponding to the sub-representation element is described may be newly defined, and the sub-representation element may be extended so that such an attribute can be described.
For example, the sub-representation element may be extended such that a dependency ID (dependencyId), which is an attribute representing the ID of a track that needs to be referred to at the time of decoding, can also be described in the sub-representation element, and the representation ID corresponding to the sub-representation element may be described as the dependencyId of the sub-representation element.
It should be noted that the attribute of the sub-representation element indicating the correspondence between the sub-representation element and a representation ID may also be used when a relationship between a texture image and a depth occlusion image, or a relationship other than that between a depth occlusion image and quality information, is to be expressed.
< ninth embodiment >
(description of computer to which the present disclosure applies)
Although the series of processes described above may be executed by hardware, it may also be executed by software. In the case where the series of processes is executed by software, a program that constructs the software is installed into a computer. Here, the computer includes a computer incorporated in dedicated hardware and a computer that can execute various functions by installing various programs, for example, a general-purpose personal computer.
Fig. 40 is a block diagram depicting an example of a configuration of hardware of a computer that executes the above-described series of processing according to a program.
In the computer 200, a CPU (central processing unit) 201, a ROM (read only memory) 202, and a RAM (random access memory) 203 are connected to each other by a bus 204.
The input/output interface 205 is also connected to the bus 204. The input unit 206, the output unit 207, the storage unit 208, the communication unit 209, and the drive 210 are connected to the input/output interface 205.
The input unit 206 is configured by a keyboard, a mouse, a microphone, and the like. The output unit 207 is configured by a display, a speaker, and the like. The storage unit 208 is configured by a hard disk, a nonvolatile memory, and the like. The communication unit 209 is configured by a network interface or the like. The drive 210 drives a removable medium 211 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like.
In the computer 200 configured in the above-described manner, the CPU 201 loads a program stored in, for example, the storage unit 208 into the RAM 203 through the input/output interface 205 and the bus 204, and executes the program to execute the above-described series of processes.
The program executed by the computer 200 (CPU 201) may be recorded on and provided as the removable medium 211, for example, as a package medium or the like. Further, the program may be provided through a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
In the computer 200, by loading the removable medium 211 into the drive 210, a program can be installed into the storage unit 208 through the input/output interface 205. Further, the program may be received by the communication unit 209 through a wired or wireless transmission medium and installed into the storage unit 208. Further, the program may be installed in advance in the ROM 202 or the storage unit 208.
It should be noted that the program executed by the computer 200 may be a program whose processes are performed in time series in the order described in this specification, or a program whose processes are performed in parallel or at a necessary timing such as when the program is called.
It should be noted that in this specification, the term system denotes a set of plural components (devices, modules (parts), etc.), and it is not important whether all the components are provided in the same housing. Therefore, both of the following constitute systems: a plurality of devices accommodated in separate housings and connected to each other through a network; and one apparatus in which a plurality of modules are accommodated in a single housing.
Further, the effects described in the present specification are merely exemplary and should not be restrictive, and other effects may be achieved.
Further, the embodiments of the present disclosure are not limited to the embodiments described above, and may be changed in various ways without departing from the subject matter of the present disclosure.
For example, a clip file of an occlusion image may not be generated. Specifically, the present disclosure may be applied to an information processing system in which the following fragment files are generated: the clip file includes only depth images or depth related images including both depth images and occlusion images (i.e., depth related images including at least depth images).
Note that the present disclosure may have a configuration as described below.
(1)
A file generation apparatus comprising:
a file generating unit configured to generate a file in which quality information representing the quality of a depth-related image including at least a depth image is set in a form divided for each type.
(2)
The file generation apparatus according to the above (1), wherein,
the file generation apparatus is configured such that:
the file generating unit divides the quality information for each type and sets the divided quality information into different files.
(3)
The file generation apparatus according to the above (2), wherein,
the file generation apparatus is configured such that:
the depth-related image comprises an occlusion image, which is a texture image of an occlusion region corresponding to the depth image.
(4)
The file generation apparatus according to the above (3), wherein,
the file generation apparatus is configured such that:
the quality information of the occlusion image is information indicating a ratio of the occlusion region to a picture of the texture image or an amount of noise of the occlusion image.
(5)
The file generating apparatus according to the above (3) or (4), wherein,
the file generation apparatus is configured such that:
the type is a type of the depth-related image.
(6)
The file generating apparatus according to any one of the above (2) to (4), wherein,
the file generation apparatus is configured such that:
the type is a texture image corresponding to the depth-related image.
(7)
The file generating apparatus according to the above (2) or (3), wherein,
the file generation apparatus is configured such that:
the quality information is information representing quality of a 3D image reproduced using the depth-related image and a texture image corresponding to the depth-related image.
(8)
The file generating apparatus according to any one of the above (2) to (7), wherein,
the file generation means is configured such that,
in a case where the file generating unit divides quality information of depth-related images of a plurality of bit rates for each type and sets the divided quality information into different files, the file generating unit sets a plurality of types of quality information set in the same file into tracks different from each other.
(9)
The file generating apparatus according to any one of the above (2) to (5),
the file generation apparatus is configured such that:
in a case where the file generation unit divides quality information of depth-related images of a plurality of bit rates for each type and sets the divided quality information into different files, the file generation unit collectively samples a plurality of types of quality information set in the same file and sets the plurality of types of quality information into the same track.
(10)
The file generation apparatus according to the above (9), wherein,
the file generation apparatus is configured such that:
the plurality of types of quality information are each divided into sub-samples different from each other and set into samples, and
the file generating unit describes, into a file, a correspondence between the subsample and track specifying information for specifying a track of the depth-related image corresponding to the quality information set in the subsample.
(11)
The file generation apparatus according to the above (10), wherein,
the file generation apparatus is configured such that:
the file generating unit describes the correspondence relationship into a file by describing, into the file, the track designation information of the depth-related image corresponding to the subsample in the order in which the subsample is set in the sample.
(12)
The file generation apparatus according to the above (10) or (11), wherein,
the track of the depth-related image corresponding to the quality information set in the subsample is set in the same file as the file of the track in which the quality information is set.
(13)
A file generation method, comprising:
a file generating step of generating a file in which quality information indicating quality of a depth-related image including at least a depth image is set in a form divided for each type.
(14)
A reproduction apparatus comprising:
an acquisition unit configured to acquire quality information of a given type from a file in which quality information representing the quality of a depth-related image including at least a depth image is set in a form divided for each type.
(15)
A reproduction method, comprising:
an acquisition step in which the reproduction apparatus acquires quality information of a given type from a file in which quality information representing the quality of a depth-related image including at least a depth image is set in a form divided for each type.
[ list of reference numerals ]
11 file generation means, 14 video reproduction terminal, 83 clip file generation means, 84 MPD file generation means, 103 quality information acquisition means, 105 image acquisition means, 106 decoding means, 107 output control means
Claims (12)
1. A file generation apparatus comprising:
a file generating unit configured to generate a depth file storing:
A plurality of encoded streams of depth images in a plurality of tracks, the plurality of encoded streams of depth images including at least a first encoded stream of first depth images at a first bit rate in a first track and a second encoded stream of second depth images at a second bit rate in a second track, and
a third encoded stream of quality information for each depth image in the third track,
wherein the file generation unit is further configured to:
generating identification information related to quality information included in the third encoded stream,
generating information indicating a correspondence between the identification information and the depth images in the plurality of tracks, an
Storing information indicating the correspondence in a metadata area of the depth file,
wherein the third encoded stream comprises a plurality of samples including the quality information, each sample being divided into a plurality of sub-samples,
wherein the information indicating the correspondence indicates, based on the identification information, a respective track of the plurality of tracks corresponding to each respective subsample of the plurality of subsamples, an
Wherein the file generation unit is implemented via at least one processor.
2. The file generation apparatus according to claim 1,
each respective track of the plurality of tracks is identified by a respective track ID.
3. The file generation apparatus according to claim 2,
the information indicating the correspondence refers to a respective track ID of a respective track corresponding to each respective subsample.
4. The file generation apparatus according to claim 3,
the information indicating the correspondence includes syntax of reference_track_ID_num corresponding to a respective track ID of each respective track.
5. The file generation apparatus according to claim 4,
reference_track_ID_num corresponding to the respective track ID of each respective track is stored in the moov box of the depth file.
6. A file generation method, comprising:
generating a depth file storing:
a plurality of encoded streams of depth images in a plurality of tracks, the plurality of encoded streams of depth images including at least a first encoded stream of first depth images at a first bit rate in a first track and a second encoded stream of second depth images at a second bit rate in a second track, and
a third encoded stream of quality information for each depth image in a third track;
generating identification information related to quality information included in the third encoded stream;
generating information indicating a correspondence between the identification information and the depth images in the plurality of tracks; and
storing information indicating the correspondence in a metadata area of the depth file,
wherein the third encoded stream comprises a plurality of samples including the quality information, each sample being divided into a plurality of sub-samples, an
Wherein the information indicating the correspondence indicates, based on the identification information, a respective track of the plurality of tracks corresponding to each respective subsample of the plurality of subsamples.
7. A reproduction apparatus comprising:
an acquisition unit configured to acquire a depth file storing:
A plurality of encoded streams of depth images in a plurality of tracks, the plurality of encoded streams of depth images including at least a first encoded stream of first depth images at a first bit rate in a first track and a second encoded stream of second depth images at a second bit rate in a second track, and
a third encoded stream of quality information for each depth image in the third track,
wherein the acquisition unit is further configured to:
acquire identification information related to quality information included in the third encoded stream, and
acquire information indicating a correspondence between the identification information and the depth images in the plurality of tracks,
wherein information indicating the correspondence is stored in a metadata area of the depth file,
wherein the third encoded stream comprises a plurality of samples including the quality information, each sample being divided into a plurality of sub-samples,
wherein the information indicating the correspondence indicates, based on the identification information, a respective track of the plurality of tracks corresponding to each respective subsample of the plurality of subsamples, and
wherein the acquisition unit is implemented via at least one processor.
8. The reproduction apparatus according to claim 7,
each respective track of the plurality of tracks is identified by a respective track ID.
9. The reproduction apparatus according to claim 8,
the information indicating the correspondence refers to a respective track ID of a respective track corresponding to each respective subsample.
10. The reproduction apparatus according to claim 9,
the information indicating the correspondence includes a reference_track_ID_num syntax element corresponding to a respective track ID of each respective track.
11. The reproduction apparatus according to claim 10,
the reference_track_ID_num corresponding to the respective track ID of each respective track is stored in the moov box of the depth file.
12. A reproduction method, comprising:
obtaining a depth file, the depth file storing:
a plurality of encoded streams of depth images in a plurality of tracks, the plurality of encoded streams of depth images including at least a first encoded stream of first depth images at a first bit rate in a first track and a second encoded stream of second depth images at a second bit rate in a second track, and
a third encoded stream of quality information for each depth image in a third track;
acquiring identification information related to quality information included in the third encoded stream, and
acquiring information indicating a correspondence between the identification information and the depth images in the plurality of tracks,
wherein information indicating the correspondence is stored in a metadata area of the depth file,
wherein the third encoded stream comprises a plurality of samples including the quality information, each sample being divided into a plurality of sub-samples, and
wherein the information indicating the correspondence indicates, based on the identification information, a respective track of the plurality of tracks corresponding to each respective subsample of the plurality of subsamples.
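On the reproduction side, the player reads the correspondence table back from the metadata area and uses it to resolve each quality sub-sample to its depth-image track, e.g. to pick the track whose quality best suits playback. A hypothetical sketch follows; the function names and the quality values are invented for the example, and the table layout assumes one sub-sample per referenced depth track as in the claims.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class QualitySample:
    # one sub-sample of quality information per referenced depth track
    subsamples: List[float]

def resolve_quality(reference_track_id_num: List[int],
                    sample: QualitySample) -> Dict[int, float]:
    """Map each quality sub-sample to the track ID it corresponds to,
    using the correspondence table stored in the moov box."""
    if len(reference_track_id_num) != len(sample.subsamples):
        raise ValueError("correspondence table and sub-samples disagree")
    return dict(zip(reference_track_id_num, sample.subsamples))

def pick_track(quality_by_track: Dict[int, float]) -> int:
    """Illustrative selection policy: choose the highest-quality track."""
    return max(quality_by_track, key=quality_by_track.get)

# table read from the moov box; quality values from one sample of track 3
table = [1, 2]
sample = QualitySample(subsamples=[0.95, 0.80])
q = resolve_quality(table, sample)
print(q)              # {1: 0.95, 2: 0.8}
print(pick_track(q))  # 1
```

Because the table is positional, the player needs no per-sub-sample headers in the quality stream itself; index i of every sample always refers to the same depth track.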
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016-030859 | 2016-02-22 | ||
JP2016030859 | 2016-02-22 | ||
PCT/JP2017/004531 WO2017145756A1 (en) | 2016-02-22 | 2017-02-08 | File generation device, file generation method, reproduction device, and reproduction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108702478A CN108702478A (en) | 2018-10-23 |
CN108702478B true CN108702478B (en) | 2021-07-16 |
Family
ID=59685019
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201780011647.6A Active CN108702478B (en) | 2016-02-22 | 2017-02-08 | File generation device, file generation method, reproduction device, and reproduction method |
Country Status (5)
Country | Link |
---|---|
US (2) | US10904515B2 (en) |
EP (1) | EP3422702B1 (en) |
JP (1) | JP6868783B2 (en) |
CN (1) | CN108702478B (en) |
WO (1) | WO2017145756A1 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11049219B2 (en) | 2017-06-06 | 2021-06-29 | Gopro, Inc. | Methods and apparatus for multi-encoder processing of high resolution content |
WO2019017298A1 (en) * | 2017-07-19 | 2019-01-24 | NEC Corporation | Data delivery device, system, method, and recording medium |
US11228781B2 (en) | 2019-06-26 | 2022-01-18 | Gopro, Inc. | Methods and apparatus for maximizing codec bandwidth in video applications |
US11109067B2 (en) | 2019-06-26 | 2021-08-31 | Gopro, Inc. | Methods and apparatus for maximizing codec bandwidth in video applications |
EP3972260A4 (en) * | 2019-07-04 | 2022-08-03 | Sony Group Corporation | Information processing device, information processing method, reproduction processing device, and reproduction processing method |
US11481863B2 (en) | 2019-10-23 | 2022-10-25 | Gopro, Inc. | Methods and apparatus for hardware accelerated image processing for spherical projections |
US11973817B2 (en) * | 2020-06-23 | 2024-04-30 | Tencent America LLC | Bandwidth cap signaling using combo-index segment track in media streaming |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1939056A (en) * | 2004-04-07 | 2007-03-28 | 松下电器产业株式会社 | Information recording apparatus and information converting method |
CN103957448A (en) * | 2009-04-09 | 2014-07-30 | 瑞典爱立信有限公司 | Media container file management |
KR20150084738A (en) * | 2015-07-02 | 2015-07-22 | Samsung Electronics Co., Ltd. | Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure |
CN104904220A (en) * | 2013-01-04 | 2015-09-09 | 三星电子株式会社 | Encoding apparatus and decoding apparatus for depth image, and encoding method and decoding method |
WO2015194781A1 (en) * | 2014-06-18 | 2015-12-23 | Samsung Electronics Co., Ltd. | Multi-layer video encoding method and multi-layer video decoding method using depth blocks |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9406132B2 (en) * | 2010-07-16 | 2016-08-02 | Qualcomm Incorporated | Vision-based quality metric for three dimensional video |
KR20130011994A (en) * | 2011-07-22 | 2013-01-30 | Samsung Electronics Co., Ltd. | Transmitter, receiver and the method thereof |
KR20130088636A (en) * | 2012-01-31 | 2013-08-08 | Samsung Electronics Co., Ltd. | Apparatus and method for image transmitting and apparatus and method for image reproduction |
US20130329939A1 (en) * | 2012-06-12 | 2013-12-12 | Jorg-Ulrich Mohnen | Decoding a quilted image representation into a digital asset along with content control data |
US9648299B2 (en) * | 2013-01-04 | 2017-05-09 | Qualcomm Incorporated | Indication of presence of texture and depth views in tracks for multiview coding plus depth |
CN104919809A (en) * | 2013-01-18 | 2015-09-16 | 索尼公司 | Content server and content distribution method |
EP3013064B1 (en) * | 2013-07-19 | 2019-03-13 | Sony Corporation | Information processing device and method |
CN105230024B (en) * | 2013-07-19 | 2019-05-24 | 华为技术有限公司 | A kind of media representation adaptive approach, device and computer storage medium |
US20150181168A1 (en) * | 2013-12-20 | 2015-06-25 | DDD IP Ventures, Ltd. | Interactive quality improvement for video conferencing |
US20160373771A1 (en) * | 2015-06-18 | 2016-12-22 | Qualcomm Incorporated | Design of tracks and operation point signaling in layered hevc file format |
-
2017
- 2017-02-08 EP EP17756197.4A patent/EP3422702B1/en active Active
- 2017-02-08 CN CN201780011647.6A patent/CN108702478B/en active Active
- 2017-02-08 US US16/064,628 patent/US10904515B2/en active Active
- 2017-02-08 WO PCT/JP2017/004531 patent/WO2017145756A1/en active Application Filing
- 2017-02-08 JP JP2018501553A patent/JP6868783B2/en active Active
-
2020
- 2020-11-18 US US16/951,288 patent/US11252397B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN108702478A (en) | 2018-10-23 |
EP3422702A1 (en) | 2019-01-02 |
JP6868783B2 (en) | 2021-05-12 |
US20210076024A1 (en) | 2021-03-11 |
US11252397B2 (en) | 2022-02-15 |
EP3422702B1 (en) | 2022-12-28 |
US20190007676A1 (en) | 2019-01-03 |
EP3422702A4 (en) | 2019-01-02 |
US10904515B2 (en) | 2021-01-26 |
WO2017145756A1 (en) | 2017-08-31 |
JPWO2017145756A1 (en) | 2018-12-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108702478B (en) | File generation device, file generation method, reproduction device, and reproduction method | |
KR101777348B1 (en) | Method and apparatus for transmitting and receiving of data | |
CN108702534B (en) | File generation device, file generation method, reproduction device, and reproduction method | |
US9277252B2 (en) | Method and apparatus for adaptive streaming based on plurality of elements for determining quality of content | |
JP5497919B2 (en) | File format-based adaptive stream generation and playback method and apparatus and recording medium therefor | |
KR101786050B1 (en) | Method and apparatus for transmitting and receiving of data | |
US9197689B2 (en) | Method and apparatus for adaptively streaming content including plurality of chapters | |
KR101620151B1 (en) | A client, a content creator entity and methods thereof for media streaming | |
US20110125918A1 (en) | Adaptive streaming method and apparatus | |
US10452715B2 (en) | Systems and methods for compressing geotagged video | |
US20140003523A1 (en) | Systems and methods for encoding video using higher rate video sequences | |
JP7238948B2 (en) | Information processing device and information processing method | |
KR101944601B1 (en) | Method for identifying objects across time periods and corresponding device | |
KR101452269B1 (en) | Content Virtual Segmentation Method, and Method and System for Providing Streaming Service Using the Same | |
US20140003502A1 (en) | Systems and Methods for Decoding Video Encoded Using Predictions that Reference Higher Rate Video Sequences | |
JP6131050B6 (en) | Data transmission method and apparatus, and data reception method and apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||