CN115118952A - Information processing apparatus, information processing method, reproduction apparatus, and reproduction method - Google Patents


Info

Publication number
CN115118952A
Authority
CN
China
Prior art keywords: information, image, section, mapping, file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210579940.5A
Other languages
Chinese (zh)
Inventor
胜股充
平林光浩
浜田俊也
泉伸明
高桥辽平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp filed Critical Sony Corp
Publication of CN115118952A


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/816 Monomedia components thereof involving special video data, e.g. 3D video
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/388 Volumetric displays, i.e. systems where the image is built up from picture elements distributed through a volume
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/366 Image reproducers using viewer tracking
    • H04N13/383 Image reproducers using viewer tracking for tracking with gaze detection, i.e. detecting the lines of sight of the viewer's eyes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/23439 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26258 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for generating a list of items to be played back in a given order, e.g. playlist, or scheduling item distribution according to such list
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4728 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Abstract

The present disclosure provides an information processing apparatus, an information processing method, a reproduction apparatus, and a reproduction method. The information processing apparatus includes a file acquisition section configured to acquire a file including mapping information for mapping an omnidirectional image onto a 3D model. The mapping information includes the position of a reference image within the omnidirectional image and rotation angle information of the omnidirectional image at a reference position of the 3D model, for rotating the reference image and mapping it to the reference position. The reference position of the 3D model is a position on the 3D model corresponding to a predetermined line-of-sight direction of a viewer in the case where the viewing position of the viewer is the center of the 3D model.

Description

Information processing apparatus, information processing method, reproduction apparatus, and reproduction method
The present application is a divisional application of the application with national application number 201780028140.1, international filing date of May 12, 2017, national-phase entry date of November 6, 2018, and invention title "file generation apparatus, file generation method, reproduction apparatus, and reproduction method".
Technical Field
The present disclosure relates to a file generation apparatus, a file generation method, a reproduction apparatus, and a reproduction method, and particularly to a file generation apparatus, a file generation method, a reproduction apparatus, and a reproduction method capable of identifying an image used to generate an omnidirectional image.
Background
There is known a recording apparatus that generates, from images captured by multiple cameras, an omnidirectional image in which a 360-degree horizontal field of view and a 180-degree vertical field of view are mapped onto a 2D image (planar image), encodes the omnidirectional image, and records the encoded image (see, for example, PTL 1).
Such a recording apparatus employs a method using equirectangular projection, cube mapping, or the like as the omnidirectional image generation method. In the case where the omnidirectional image generation method is the method using equirectangular projection, the omnidirectional image is the image produced by equirectangular projection of the sphere obtained when the captured images are mapped onto a spherical surface. In the case where the omnidirectional image generation method is cube mapping, the omnidirectional image is an image of the unfolded development (net) of the cube obtained when the captured images are mapped onto the faces of a cube.
Meanwhile, MPEG-DASH (Moving Picture Experts Group Dynamic Adaptive Streaming over HTTP) is known as a streaming scheme for moving video content. In MPEG-DASH, a management file that manages the encoded streams of the moving video content is transmitted from a delivery server to a terminal apparatus, and the terminal apparatus selects the encoded stream to be reproduced based on the management file and issues a request to the delivery server.
[Reference List]
[Patent Literature]
[PTL 1] JP 2006-14174 A
Disclosure of Invention
[Technical Problem]
However, in the case of delivering an encoded stream of an omnidirectional image from the delivery server to the terminal apparatus according to MPEG-DASH, no provision has been made for setting identification information that identifies the captured image used to generate the omnidirectional image. The terminal apparatus has therefore been unable to recognize the captured image used to generate the omnidirectional image, and hence unable to select the omnidirectional image to be reproduced based on the captured image.
The present disclosure has been achieved in view of these circumstances, and an object of the present disclosure is to make it possible to identify an image for generating an omnidirectional image.
[Solution to Problem]
A file generating apparatus according to a first aspect of the present disclosure is a file generating apparatus including a setting section that sets identification information for identifying an image used to generate an omnidirectional image, the omnidirectional image being generated by mapping the image onto a 3D model.
The file generation method according to the first aspect of the present disclosure corresponds to the file generation apparatus according to the first aspect of the present disclosure.
According to the first aspect of the present disclosure, identification information for identifying an image used to generate an omnidirectional image, the omnidirectional image being generated by mapping the image onto a 3D model, is set.
A reproduction apparatus according to a second aspect of the present disclosure is a reproduction apparatus including a selection section that selects, based on identification information for identifying an image used to generate an omnidirectional image, an omnidirectional image to be reproduced, the omnidirectional image being generated by mapping the image onto a 3D model.
The reproduction method according to the second aspect of the present disclosure corresponds to the reproduction apparatus according to the second aspect of the present disclosure.
According to the second aspect of the present disclosure, an omnidirectional image to be reproduced, which is generated by mapping an image onto a 3D model, is selected based on identification information for identifying the image used to generate the omnidirectional image.
Note that the file generating apparatus according to the first aspect and the reproducing apparatus according to the second aspect may be realized by causing a computer to execute a program.
In addition, the program to be executed by the computer to realize the file generating apparatus according to the first aspect and the reproducing apparatus according to the second aspect may be provided by being transmitted via a transmission medium or by being recorded on a recording medium.
For example, an MPD file generation section may set identification information for identifying a captured image used to generate an omnidirectional image, the omnidirectional image being generated by mapping the captured image onto a 3D model. The present disclosure can be applied to, for example, a file generation apparatus that generates segment files of an omnidirectional image delivered over HTTP using MPEG-DASH.
[Advantageous Effects of Invention]
According to a first aspect of the present disclosure, a file may be generated. Further, according to the first aspect of the present disclosure, a file may be generated so that an image for generating an omnidirectional image may be identified.
According to a second aspect of the disclosure, a file may be selected. Additionally, according to the second aspect of the present disclosure, an image for generating an omnidirectional image may be identified.
Note that the effects are not always limited to those described herein, but may be any effects described in the present disclosure.
Drawings
Fig. 1 is a block diagram showing an example of the configuration of a first embodiment of a delivery system to which the present disclosure is applied.
Fig. 2 is a block diagram showing an example of the configuration of the generation apparatus of fig. 1.
Fig. 3 is a perspective view showing a cube as a 3D model.
Fig. 4 is a diagram showing an example of an omnidirectional image generated by cube mapping.
Fig. 5 is a perspective view showing a sphere as a 3D model.
Fig. 6 is a diagram illustrating an example of an omnidirectional image generated by the method using equirectangular projection.
Fig. 7 is a diagram illustrating an example of an MPD file generated by the MPD file generating section of fig. 2.
Fig. 8 is an explanatory diagram of ID, X, Y, and λ.
Fig. 9 is an explanatory diagram of the mapping information.
Fig. 10 is a flowchart describing a file generation process executed by the generation apparatus of fig. 2.
Fig. 11 is a block diagram illustrating an example of the configuration of the reproducing apparatus of fig. 1.
Fig. 12 is a flowchart describing a reproduction process performed by the reproduction apparatus of fig. 11.
Fig. 13 is a block diagram showing an example of the configuration of a generation apparatus according to the second embodiment of the delivery system to which the present disclosure is applied.
Fig. 14 is a diagram showing a first example of an MPD file generated by the MPD file generating section of fig. 13.
Fig. 15 is a flowchart describing a file generation process executed by the generation apparatus of fig. 13.
Fig. 16 is a block diagram showing an example of the configuration of a reproduction apparatus according to the second embodiment of the delivery system to which the present disclosure is applied.
Fig. 17 is a flowchart describing a reproduction process performed by the reproduction apparatus of fig. 16.
Fig. 18 is a diagram showing a second example of an MPD file generated by the MPD file generating section of fig. 13.
Fig. 19 is a diagram showing a third example of an MPD file generated by the MPD file generating section of fig. 13.
Fig. 20 is a diagram showing a fourth example of an MPD file generated by the MPD file generating section of fig. 13.
Fig. 21 is a diagram showing an example of type information.
Fig. 22 is a diagram showing a fifth example of an MPD file generated by the MPD file generating section of fig. 13.
Fig. 23 is a block diagram showing an example of the configuration of a generation apparatus according to the third embodiment of the delivery system to which the present disclosure is applied.
Fig. 24 is a diagram showing an example of the configuration of two-dimensional plane information.
Fig. 25 is an explanatory diagram of two-dimensional plane information.
Fig. 26 is a diagram showing an example of 18 two-dimensional planes.
Fig. 27 is a diagram showing an example of two-dimensional plane information on the 18 two-dimensional planes of fig. 26.
Fig. 28 is an explanatory diagram of a diagonal angle of view.
Fig. 29 is an explanatory diagram of a method of setting the SRD of each high-resolution image.
Fig. 30 is a diagram showing a first example of an MPD file generated by the MPD file generating section of fig. 23.
Fig. 31 is a flowchart describing a file generation process executed by the generation apparatus of fig. 23.
Fig. 32 is a block diagram showing an example of the configuration of a reproduction apparatus according to the third embodiment of the delivery system to which the present disclosure is applied.
Fig. 33 is an explanatory diagram of mapping performed by the mapping processing section of fig. 32.
Fig. 34 is a flowchart describing a reproduction process performed by the reproduction apparatus of fig. 32.
Fig. 35 is a diagram showing a second example of an MPD file generated by the MPD file generating section of fig. 23.
Fig. 36 is a diagram showing two-dimensional plane information on 6 common images.
Fig. 37 is a diagram showing a third example of an MPD file generated by the MPD file generating section of fig. 23.
Fig. 38 is a diagram showing a fourth example of an MPD file generated by the MPD file generating section of fig. 23.
Fig. 39 is a diagram showing an example of the configuration of two-dimensional plane information according to the fourth embodiment.
Fig. 40 is an explanatory diagram of the drawing face in the case where FOV_flag is 0.
Fig. 41 is a diagram illustrating a first example of an MPD file describing the two-dimensional plane information of fig. 39.
Fig. 42 is a diagram illustrating a second example of an MPD file describing the two-dimensional plane information of fig. 39.
Fig. 43 is a diagram showing a modification of the second example of the MPD file describing the two-dimensional plane information of fig. 39.
Fig. 44 is a perspective view showing an example of a 3D model for the cube stream.
Fig. 45 is a diagram showing an example of SRD information on the expanded faces of a 3D model.
Fig. 46 is a diagram showing an example of drawing face information.
Fig. 47 is a diagram illustrating a third example of an MPD file describing the two-dimensional plane information of fig. 39.
Fig. 48 is a block diagram showing an example of the configuration of the fifth embodiment of the generating apparatus to which the present disclosure is applied.
Fig. 49 is an explanatory diagram of a method of setting various information about an image.
Fig. 50 is a diagram showing an example of the configuration of the hvcC box in the first method.
Fig. 51 is a diagram showing an example of the configuration of the scri box in the first method.
Fig. 52 is a diagram showing an example of the configuration of a tref box in the first method.
Fig. 53 is a diagram showing an example of the configuration of VisualSampleEntry in the second method.
Fig. 54 is a diagram showing an example of the configuration of the schi box in the third method.
Fig. 55 is an explanatory diagram of a sample group.
Fig. 56 is a diagram showing an example of the configuration of SphericalCoordinateRegionInfoEntry in the fourth method.
Fig. 57 is a diagram showing another example of the configuration of SphericalCoordinateRegionInfoEntry in the fourth method.
Fig. 58 is a diagram showing an example of the configuration of the scri box in the fourth method.
Fig. 59 is a diagram showing an example of the configuration of the segment file in the fifth method.
Fig. 60 is a diagram showing an example of the configuration of SphericalCoordinateRegionInfoSample in the fifth method.
Fig. 61 is a diagram showing an example of the configuration of SphericalCoordinateRegionInfoSampleEntry in the fifth method.
Fig. 62 is a diagram showing an example of the configuration of SphericalCoordinateRegionInfoConfigurationBox in the fifth method.
Fig. 63 is a diagram showing another example of the configuration of SphericalCoordinateRegionInfoSample in the fifth method.
Fig. 64 is a diagram showing still another example of the configuration of SphericalCoordinateRegionInfoSample in the fifth method.
Fig. 65 is a diagram showing another example of the configuration of SphericalCoordinateRegionInfoConfigurationBox in the fifth method.
Fig. 66 is a diagram showing still another example of the configuration of SphericalCoordinateRegionInfoSample in the fifth method.
Fig. 67 is a flowchart describing a file generation process executed by the generation apparatus of fig. 48.
Fig. 68 is a block diagram showing an example of the configuration of a reproduction apparatus according to the fifth embodiment of the delivery system to which the present disclosure is applied.
Fig. 69 is a flowchart describing a reproduction process performed by the reproduction apparatus of fig. 68.
Fig. 70 is a block diagram showing an example of the configuration of computer hardware.
Detailed Description
Modes for carrying out the present disclosure (hereinafter, referred to as "embodiments") will be described below. Note that the description will be given in the following order.
1. First embodiment: delivery system (Figs. 1 to 12)
2. Second embodiment: delivery system (Figs. 13 to 22)
3. Third embodiment: delivery system (Figs. 23 to 38)
4. Fourth embodiment: delivery system (Figs. 39 to 47)
5. Fifth embodiment: delivery system (Figs. 48 to 69)
6. Sixth embodiment: computer (Fig. 70)
< first embodiment >
(configuration example of the first embodiment of the delivery System)
Fig. 1 is a block diagram showing an example of the configuration of a first embodiment of a delivery system to which the present disclosure is applied.
The delivery system 10 of fig. 1 includes an imaging device 11, a generation device 12, a delivery server 13, a reproduction device 14, and a head-mounted display 15. The delivery system 10 generates an omnidirectional image from the images captured by the imaging device 11 and displays a display image within the viewer's field of view using the omnidirectional image.
Specifically, the imaging device 11 of the delivery system 10 includes six cameras 11A-1 to 11A-6 and a microphone 11B. Hereinafter, the cameras 11A-1 to 11A-6 are collectively referred to as "cameras 11A" when there is no particular need to distinguish them.
Each camera 11A captures a moving image, and the microphone 11B acquires the surrounding sound. The imaging device 11 supplies the captured images in the six directions captured by the cameras 11A, together with the sound acquired by the microphone 11B, to the generation device 12 as moving video content. Note that the number of cameras provided in the imaging device 11 is not limited to six, as long as it is two or more.
The generation device 12 generates an omnidirectional image from the captured images supplied from the imaging device 11 by the method using equirectangular projection, encodes the omnidirectional image at one or more bit rates, and generates an equirectangular projection stream at each bit rate. In addition, the generation device 12 generates an omnidirectional image from the captured images by cube mapping, encodes the omnidirectional image at one or more bit rates, and generates a cube stream at each bit rate. Further, the generation device 12 encodes the sound supplied from the imaging device 11 and generates an audio stream.
The generation device 12 archives the equirectangular projection stream at each bit rate, the cube stream at each bit rate, and the audio stream into files in time units of about ten seconds called "segments". The generation device 12 uploads the segment files generated as a result of the archiving to the delivery server 13.
Although it is assumed here that the streams are generated at one or more bit rates, streams may instead be generated for one or more values of a condition other than the bit rate (for example, the image size).
In addition, the generation device 12 generates an MPD (Media Presentation Description) file that manages the segment files of the moving video content, and uploads the MPD file to the delivery server 13.
The delivery server 13 stores the segment files and the MPD file uploaded from the generation device 12. The delivery server 13 transmits a stored segment file or the stored MPD file to the reproduction device 14 in response to a request from the reproduction device 14.
The reproduction device 14 issues a request for the MPD file to the delivery server 13 and receives the MPD file transmitted in response to the request. In addition, based on the MPD file, the reproduction device 14 issues a request for a segment file of an omnidirectional image generated by the omnidirectional image generation method corresponding to the mapping that the reproduction device 14 can perform, and receives the segment file transmitted in response to the request. The reproduction device 14 decodes the equirectangular projection stream or the cube stream contained in the received segment file. The reproduction device 14 then generates a 3D model image by mapping the omnidirectional image obtained as the decoding result onto a 3D model.
In addition, the reproduction device 14 includes a camera 14A and captures an image of a marker 15A attached to the head-mounted display 15. The reproduction device 14 detects the viewing position in the coordinate system of the 3D model based on the captured image of the marker 15A. Further, the reproduction device 14 receives the detection result of a gyro sensor 15B in the head-mounted display 15 from the head-mounted display 15. The reproduction device 14 determines the viewer's line-of-sight direction in the coordinate system of the 3D model based on the detection result of the gyro sensor 15B, and determines the viewer's field of view located inside the 3D model based on the viewing position and the line-of-sight direction.
The reproduction device 14 performs perspective projection of the 3D model image onto the viewer's field of view with the viewing position as the focal point, thereby generating a display image within the viewer's field of view. The reproduction device 14 supplies the display image to the head-mounted display 15.
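For illustration only, the following Python sketch shows the kind of perspective projection the reproduction device 14 could apply, with the viewing position as the focal point. This is an editorial sketch rather than part of the present disclosure; the OpenGL-style, column-vector matrix convention and the function name are assumptions.

```python
import numpy as np

def perspective(fov_y_deg, aspect, near, far):
    # Standard perspective projection matrix (assumed OpenGL-style,
    # column vectors): points inside the viewing frustum map into the
    # canonical volume, yielding the display image for the field of view.
    f = 1.0 / np.tan(np.radians(fov_y_deg) / 2.0)
    return np.array([
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ])

# Example: a 90-degree vertical field of view on a 16:9 display.
P = perspective(90.0, 16.0 / 9.0, 0.1, 100.0)
print(P.shape)  # (4, 4)
```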
The head-mounted display 15 is worn on the viewer's head and displays the display image supplied from the reproduction device 14. The marker 15A photographed by the camera 14A is attached to the head-mounted display 15. The viewer can therefore specify the viewing position by moving while wearing the head-mounted display 15. The head-mounted display 15 also includes the gyro sensor 15B, and the detection result of the angular velocity by the gyro sensor 15B is transmitted to the reproduction device 14. The viewer can therefore specify the line-of-sight direction by rotating the head on which the head-mounted display 15 is worn.
(example of configuration of generating apparatus)
Fig. 2 is a block diagram showing an example of the configuration of the generation apparatus 12 of fig. 1.
The generation device 12 of fig. 2 includes a stitching processing section 21, a mapping processing section 22, an encoder 23, a mapping processing section 24, an encoder 25, an audio acquisition section 26, an encoder 27, a segment file generation section 28, an MPD file generation section 29, and an upload section 30.
The stitching processing section 21 performs stitching processing that makes the colors and brightnesses of the captured images in the six directions supplied from the cameras 11A of fig. 1 uniform, eliminates the overlaps, and connects the images. The stitching processing section 21 supplies the stitched captured image in units of frames to the mapping processing section 22 and the mapping processing section 24.
The mapping processing section 22 generates an omnidirectional image from the captured image supplied from the stitching processing section 21 by cube mapping. Specifically, the mapping processing section 22 maps the stitched captured image onto a cube as a texture and generates an image of the unfolded development of the cube as an omnidirectional image. The mapping processing section 22 supplies the omnidirectional image to the encoder 23. Note that the stitching processing section 21 and the mapping processing section 22 may be integrated.
The encoder 23 encodes the omnidirectional image supplied from the mapping processing section 22 at one or more bit rates and generates a cube stream. The encoder 23 supplies the cube stream at each bit rate to the segment file generation section 28.
The mapping processing section 24 generates an omnidirectional image from the captured image supplied from the stitching processing section 21 by the method using equirectangular projection. Specifically, the mapping processing section 24 maps the stitched captured image onto a sphere as a texture and generates, by equirectangular projection, an image of the sphere as an omnidirectional image. The mapping processing section 24 supplies the omnidirectional image to the encoder 25. Note that the stitching processing section 21 and the mapping processing section 24 may be integrated.
The encoder 25 encodes the omnidirectional image supplied from the mapping processing section 24 at one or more bit rates and generates an equirectangular projection stream. The encoder 25 supplies the equirectangular projection stream at each bit rate to the segment file generation section 28.
The audio acquisition section 26 acquires the sound supplied from the microphone 11B of fig. 1 and supplies the sound to the encoder 27. The encoder 27 encodes the sound supplied from the audio acquisition section 26 and generates an audio stream. The encoder 27 supplies the audio stream to the segment file generation section 28.
The segment file generation section 28 archives the equirectangular projection stream at each bit rate, the cube stream at each bit rate, and the audio stream in units of segments. The segment file generation section 28 supplies the segment files created as a result of the archiving to the upload section 30.
The MPD file generation section 29 generates an MPD file. Specifically, for each segment file of the equirectangular projection streams and the cube streams, the MPD file generation section 29 (setting section) sets, in the MPD file, an ID or the like as identification information for identifying the captured image used to generate the omnidirectional image corresponding to that segment file.
In addition, for each segment file of the equirectangular projection streams and the cube streams, the MPD file generation section 29 sets, in the MPD file, mapping information corresponding to that segment file as necessary.
The mapping information is information used when mapping the omnidirectional image onto the 3D model so that a reference image at a predetermined position within the stitched captured image is mapped, at a predetermined tilt (hereinafter referred to as the "reference tilt"), to a reference position of the 3D model. Here, the mapping information includes the position of the reference image within the omnidirectional image and the rotation angle of the omnidirectional image at the reference position, the rotation angle being the angle that sets the tilt of the reference image on the 3D model to the reference tilt when the omnidirectional image is mapped onto the 3D model such that the reference image is mapped to the reference position. Note that the reference position is, for example, the position on the 3D model corresponding to a predetermined line-of-sight direction in the case where the viewing position is the center of the 3D model.
In other words, since the mapping processing section 22 and the mapping processing section 24 perform mapping independently of each other, the position within the captured image that is mapped to the reference position, and its tilt, generally differ between the two. In such a case, the MPD file generation section 29 therefore sets the mapping information. Based on the mapping information, the reproduction device 14 can thereby map the reference image to the reference position at the reference tilt regardless of the omnidirectional image generation method. The MPD file generation section 29 supplies the MPD file to the upload section 30.
The upload section 30 uploads the segment files supplied from the segment file generation section 28 and the MPD file supplied from the MPD file generation section 29 to the delivery server 13 of fig. 1.
(description of cube mapping)
Fig. 3 is a perspective view showing a cube as a 3D model onto which a captured image is mapped by the cube mapping performed by the mapping processing section 22 of fig. 2.
As shown in fig. 3, in the cube mapping performed by the mapping processing section 22, the captured image supplied from the stitching processing section 21 is mapped onto the six faces 41 to 46 of a cube 40.
In this specification, it is assumed that the axis passing through the center O of the cube 40 and orthogonal to the faces 41 and 42 is the x-axis, the axis orthogonal to the faces 43 and 44 is the y-axis, and the axis orthogonal to the faces 45 and 46 is the z-axis. In addition, when the distance between the center O and each of the faces 41 to 46 is assumed to be r, the face 41 where x = r is also referred to as the "+x face 41", and the face 42 where x = -r as the "-x face 42", as appropriate. Likewise, the face 43 where y = r, the face 44 where y = -r, the face 45 where z = r, and the face 46 where z = -r are also referred to as the "+y face 43", "-y face 44", "+z face 45", and "-z face 46", respectively, as appropriate.
The +x face 41 faces the -x face 42, the +y face 43 faces the -y face 44, and the +z face 45 faces the -z face 46.
Fig. 4 is a diagram showing an example of an omnidirectional image generated by the cube mapping performed by the mapping processing section 22 of fig. 2.
As shown in fig. 4, the omnidirectional image 50 generated by the cube mapping is an image of the unfolded development of the cube 40. Specifically, the omnidirectional image 50 is an image in which the image 52 of the -x face 42, the image 55 of the +z face 45, the image 51 of the +x face 41, and the image 56 of the -z face 46 are arranged in the middle row in order from the left, the image 53 of the +y face 43 is arranged above the image 55, and the image 54 of the -y face 44 is arranged below the image 55.
In this specification, it is assumed that the horizontal size of the omnidirectional image 50 is 4096 pixels, its vertical size is 3072 pixels, and the horizontal and vertical sizes of each of the images 51 to 56 are 1024 pixels.
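The 4x3 arrangement of fig. 4 can be summarized as a lookup from face name to pixel rectangle. The following Python sketch assumes exactly the sizes and layout described above; the names FACE_TILE and face_rect are illustrative only and do not appear in the present disclosure.

```python
# Tile position (column, row) of each cube face inside the 4096x3072
# omnidirectional image 50 of fig. 4, with 1024x1024 tiles.
FACE_TILE = {
    "-x": (0, 1),  # image 52
    "+z": (1, 1),  # image 55
    "+x": (2, 1),  # image 51
    "-z": (3, 1),  # image 56
    "+y": (1, 0),  # image 53, above the +z image
    "-y": (1, 2),  # image 54, below the +z image
}
TILE = 1024

def face_rect(face):
    """Return the (left, top, right, bottom) pixel rectangle of a face."""
    col, row = FACE_TILE[face]
    return (col * TILE, row * TILE, (col + 1) * TILE, (row + 1) * TILE)

print(face_rect("+x"))  # (2048, 1024, 3072, 2048)
```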
(description of the method using equirectangular projection)
Fig. 5 is a perspective view showing a sphere as a 3D model onto which a captured image is mapped by the method using equirectangular projection performed by the mapping processing section 24 of fig. 2.
As shown in fig. 5, in the method using equirectangular projection performed by the mapping processing section 24, the captured image supplied from the stitching processing section 21 is mapped onto the surface of a sphere 70. The surface of the sphere 70 can be divided into, for example, eight faces 71 to 78 of the same size and shape.
In this specification, it is assumed that the axis passing through the center O of the sphere 70 and through the centers of the faces 71 and 72 is the A-axis, the axis passing through the centers of the faces 73 and 74 is the B-axis, the axis passing through the centers of the faces 75 and 76 is the C-axis, and the axis passing through the centers of the faces 77 and 78 is the D-axis. In addition, when the distance between the center O and each of the faces 71 to 78 is assumed to be r, the face 71 where A = r is also referred to as the "+A face 71", and the face 72 where A = -r as the "-A face 72", as appropriate. Likewise, the face 73 where B = r, the face 74 where B = -r, the face 75 where C = r, the face 76 where C = -r, the face 77 where D = r, and the face 78 where D = -r are also referred to as the "+B face 73", "-B face 74", "+C face 75", "-C face 76", "+D face 77", and "-D face 78", respectively, as appropriate.
The +A face 71 faces the -A face 72, the +B face 73 faces the -B face 74, the +C face 75 faces the -C face 76, and the +D face 77 faces the -D face 78.
Fig. 6 is a diagram illustrating an example of an omnidirectional image generated by the method using equirectangular projection performed by the mapping processing section 24 of fig. 2.
As shown in fig. 6, the omnidirectional image 90 generated by the method using equirectangular projection is an image of the sphere 70 obtained by equirectangular projection. The abscissa (horizontal coordinate) and the ordinate (vertical coordinate) of the omnidirectional image 90 therefore correspond to longitude and latitude in the case where the sphere 70 is regarded as the earth.
Specifically, the omnidirectional image 90 is an image in which the image 91 of the +A face 71, the image 93 of the +B face 73, the image 95 of the +C face 75, and the image 97 of the +D face 77 are arranged in the upper row in order from the left, and the image 96 of the -C face 76, the image 98 of the -D face 78, the image 92 of the -A face 72, and the image 94 of the -B face 74 are arranged in the lower row in order from the left.
In this specification, it is assumed that the horizontal size of the omnidirectional image 90 is 1920 pixels, and the vertical size thereof is 960 pixels.
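Because the abscissa and the ordinate of the omnidirectional image 90 correspond to longitude and latitude, a pixel can be converted to a point on the sphere 70 directly. The Python sketch below assumes the 1920x960 size given above and one particular axis convention (y up, longitude measured from the x-axis toward the z-axis); the helper names are illustrative only.

```python
import math

W, H = 1920, 960  # size of the omnidirectional image 90

def pixel_to_lonlat(u, v):
    # Longitude grows to the right across [-pi, pi); latitude decreases
    # downward from pi/2 at the top row to -pi/2 at the bottom row.
    lon = (u / W) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / H) * math.pi
    return lon, lat

def lonlat_to_xyz(lon, lat, r=1.0):
    # Point on a sphere of radius r for the given longitude/latitude.
    x = r * math.cos(lat) * math.cos(lon)
    y = r * math.sin(lat)
    z = r * math.cos(lat) * math.sin(lon)
    return x, y, z

print(pixel_to_lonlat(960, 480))  # (0.0, 0.0): the image center
```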
(example of MPD File)
Fig. 7 is a diagram showing an example of an MPD file generated by the MPD file generating section 29 in fig. 2.
In the MPD file, information on the segment files, such as the encoding scheme, the encoding bit rate, and the image resolution, is described hierarchically in XML format.
Specifically, the MPD file hierarchically contains elements such as Period elements, AdaptationSet elements, Representation elements, and SegmentInfo elements.
In the MPD file, the moving video content corresponding to the segment files managed by the MPD file is divided by predetermined time ranges (for example, by program, CM (commercial), or the like). A Period element is described for each unit of the divided moving video content. Each Period element has information such as the reproduction start clock time of the program or the like (a set of synchronized data such as image data and audio data) of the moving video content.
AdaptationSet elements are contained in each Period element, and each AdaptationSet element groups the Representation elements of the moving video content corresponding to the Period element by media type, attribute, or the like. Each AdaptationSet element has the media type, the attributes, and the like common to the moving video content corresponding to the Representation elements contained in the group.
Representation elements are contained in the AdaptationSet element that groups them, and a Representation element is described for each group of segment files that have the same media type and attributes within the moving video content corresponding to the Period element in the upper layer. Each Representation element has the attributes, the URL (Uniform Resource Locator), and the like common to the segment file group corresponding to that Representation element.
A SegmentInfo element is contained in each Representation element and has information on each segment file in the segment file group corresponding to that Representation element.
In the example of fig. 7, the segment files of the equirectangular projection stream, the cube stream, and the audio stream in the time range corresponding to the Period element are each classified into one group. Accordingly, in the MPD file of fig. 7, the Period element contains three AdaptationSet elements.
The first AdaptationSet element from the top is the element corresponding to the segment files of the equirectangular projection stream, and the second AdaptationSet element is the element corresponding to the segment files of the cube stream. Since both AdaptationSet elements correspond to segment files of omnidirectional images, they are configured similarly.
Specifically, each of the first and second AdaptationSet elements has the horizontal size width and the vertical size height of the corresponding omnidirectional image. As described above, the horizontal size of the omnidirectional image 90 corresponding to the equirectangular projection stream is 1920 pixels, and its vertical size is 960 pixels. Accordingly, as shown in fig. 7, the first AdaptationSet element has a width of 1920 pixels and a height of 960 pixels. Likewise, the horizontal size of the omnidirectional image 50 corresponding to the cube stream is 4096 pixels, and its vertical size is 3072 pixels. Accordingly, as shown in fig. 7, the second AdaptationSet element has a width of 4096 pixels and a height of 3072 pixels.
Further, each of the first and second AdaptationSet elements has SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015", a SupplementalProperty that indicates, by its value, the omnidirectional image generation method corresponding to the AdaptationSet element.
Since the first AdaptationSet element corresponds to the segment files of the equirectangular projection stream, its omnidirectional image generation method is the method using equirectangular projection. Accordingly, as shown in fig. 7, the value of this SupplementalProperty in the first AdaptationSet element is "equirectangular", which indicates the method using equirectangular projection.
Since the second AdaptationSet element corresponds to the segment files of the cube stream, its omnidirectional image generation method is cube mapping. Accordingly, as shown in fig. 7, the value of this SupplementalProperty in the second AdaptationSet element is "cube", which indicates cube mapping.
Further, each of the first and second AdaptationSet elements has SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016", a SupplementalProperty that indicates, by its value, the identification information and the mapping information about the captured image used to generate the omnidirectional image corresponding to the AdaptationSet element.
In the example of fig. 7, the ID and "X, Y, λ" are set as the value of SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016". As shown in fig. 8, the ID is the identification information on the captured image used to generate the omnidirectional image. X is the X coordinate, in the mapping information, of the position of the reference image within the omnidirectional image, and Y is the Y coordinate of that position. Further, λ is the rotation angle in the mapping information.
Since the captured images used to generate the omnidirectional image 90 and the omnidirectional image 50 corresponding to the first and second AdaptationSet elements are the same, the identification information has the same value (1 in the example of fig. 7). In addition, in the example of fig. 7, the position coordinates of the reference image within the omnidirectional image 90 corresponding to the equirectangular projection stream are (960, 480), and the rotation angle is 0 degrees. In the example of fig. 7, "I" is prepended to the block storing the identification information, and "B" is prepended to the block storing the mapping information; the pieces of information within a block are separated by commas (,), and the blocks are separated by a space. Accordingly, the value of SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016" in the first AdaptationSet element is "I1 B960,480,0". In other words, in this case, "1" in "I1" is the identification information, and in "B960,480,0", "960" is the X coordinate, "480" is the Y coordinate, and "0" is the rotation angle.
In addition, in the example of fig. 7, the position coordinates of the reference image within the omnidirectional image 50 corresponding to the cube stream are (1530, 1524), and the rotation angle is 0 degrees. Accordingly, the value of SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016" in the second AdaptationSet element is "I1 B1530,1524,0". In other words, in this case, "1" in "I1" is the identification information, and in "B1530,1524,0", "1530" is the X coordinate, "1524" is the Y coordinate, and "0" is the rotation angle.
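The value format just described ("I" prefixing the identification block, a space, then "B" prefixing the comma-separated X, Y, and λ of the mapping block) can be parsed mechanically. The following Python sketch of such a parser is illustrative; the function name and the returned structure are not part of the present disclosure.

```python
def parse_source_id_value(value):
    # Split the value into space-separated blocks; "I" marks the
    # identification block and "B" the mapping block (X, Y, lambda).
    info = {}
    for block in value.split():
        if block.startswith("I"):
            info["id"] = int(block[1:])
        elif block.startswith("B"):
            x, y, rot = (float(f) for f in block[1:].split(","))
            info["mapping"] = {"x": x, "y": y, "rotation": rot}
    return info

print(parse_source_id_value("I1 B960,480,0"))
# {'id': 1, 'mapping': {'x': 960.0, 'y': 480.0, 'rotation': 0.0}}
```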
In the example of fig. 7, the mapping information is described in both the first and second AdaptationSet elements because, when the omnidirectional image 50 and the omnidirectional image 90 were generated, the reference image was not mapped to the reference position of the 3D model at the reference tilt. In the case where the reference image is mapped at the reference tilt, the mapping information is not described.
The third AdaptationSet element from the top is the element corresponding to the segment files of the audio stream.
In addition, in the example of fig. 7, each AdaptationSet element contains one Representation element. In the Representation elements of the first to third AdaptationSet elements from the top, for example, "equirectangular.mp4", "cube.mp4", and "audio.mp4" are described, respectively, as the URLs (BaseURL) that form the basis of the segment files corresponding to those Representation elements. Note that fig. 7 omits the description of the SegmentInfo elements.
As described above, the MPD file carries both the SupplementalProperty indicating, by its value, the omnidirectional image generation method and the SupplementalProperty indicating, by its value, the identification information and the mapping information.
Accordingly, based on the MPD file, the reproduction device 14 can select as the segment file to be reproduced, from among the segment files for which the same identification information is set, a segment file of an omnidirectional image generated by the generation method corresponding to the mapping that the reproduction device 14 can perform.
In addition, by using the mapping information when mapping the omnidirectional image corresponding to the segment file to be reproduced, the reproduction device 14 can map the reference image to the reference position of the 3D model at the reference tilt.
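As a sketch of the selection just described, the following Python fragment picks the first AdaptationSet whose original_source_id value carries the wanted identification information and whose coordinates property matches the mapping the player supports. The MPD namespace URI and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

MPD_NS = "{urn:mpeg:dash:schema:mpd:2011}"  # assumed DASH MPD namespace
COORD_SCHEME = "http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"
SOURCE_SCHEME = "urn:mpeg:dash:original_source_id:2016"

def select_adaptation_set(mpd_xml, wanted_id, supported_mapping):
    # supported_mapping is "cube" or "equirectangular", mirroring the
    # generation method the player's mapping processing can handle.
    root = ET.fromstring(mpd_xml)
    for aset in root.iter(MPD_NS + "AdaptationSet"):
        props = {p.get("schemeIdUri"): p.get("value")
                 for p in aset.iter(MPD_NS + "SupplementalProperty")}
        blocks = props.get(SOURCE_SCHEME, "").split()
        if f"I{wanted_id}" in blocks and \
                props.get(COORD_SCHEME) == supported_mapping:
            return aset
    return None
```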
Note that the identification information and the mapping information may be described not in a SupplementalProperty but in an EssentialProperty. In the case where they are described in a SupplementalProperty, a reproduction device that cannot interpret this SupplementalProperty can still use the other information in the MPD file. On the other hand, in the case where they are described in an EssentialProperty, a reproduction device that cannot interpret this EssentialProperty cannot use any of the information on the element containing the EssentialProperty.
In addition, the identification information and the mapping information may be contained in an element other than the AdaptationSet element, such as a Representation element.
(description of mapping information)
Fig. 9 is an explanatory diagram of the mapping information for a segment file of the equirectangular projection stream.
As shown in fig. 9, the mapping information for the segment file of the equirectangular projection stream contains the coordinates of the position 111 of the reference image within the omnidirectional image 90 generated by the method using equirectangular projection.
In addition, the mapping information for the segment file of the equirectangular projection stream contains the rotation angle λ, indicated by the arrow 112, by which the omnidirectional image 90 is rotated counterclockwise about the straight line (taken as the axis) connecting the reference position on the sphere 70 and the center O of the sphere 70. This rotation is necessary to set the tilt of the reference image on the sphere 70 to the reference tilt when the omnidirectional image 90 is mapped onto the sphere 70 such that the position 111 is mapped to the reference position.
Note that the mapping information may instead be Euler angles (α, β, γ) or a quaternion (q0, q1, q2, q3) indicating the rotation of the omnidirectional image when the omnidirectional image is mapped onto the 3D model so that the reference image is mapped to the reference position at the reference tilt.
In the case where the mapping information is the Euler angles (α, β, γ), the reproduction device 14 maps the omnidirectional image onto the 3D model as it is, and then rotates the mapped omnidirectional image three times on the 3D model based on the Euler angles (α, β, γ).
Specifically, the reproduction device 14 first rotates the omnidirectional image mapped onto the 3D model by the Euler angle α about the y-axis. Next, the reproduction device 14 rotates the omnidirectional image, already rotated by the Euler angle α about the y-axis, by the Euler angle β about the x-axis. Finally, the reproduction device 14 rotates the omnidirectional image, already rotated by the Euler angle β about the x-axis, by the Euler angle γ about the z-axis. The portion of the omnidirectional image mapped at the reference position of the 3D model after the three rotations thereby becomes the reference image at the reference tilt.
Although the omnidirectional image is described here as being rotated in the order of the y-axis, the x-axis, and the z-axis, the rotation order is not limited to this order. In the case where the mapping information is Euler angles (α, β, γ), a parameter indicating the rotation order may be included in the mapping information.
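A minimal sketch of the three rotations in the y, x, z order described above, written with rotation matrices (NumPy); the sign conventions and helper names are assumptions made for illustration.

```python
import numpy as np

def rot_y(a):  # rotation about the y-axis by angle a (radians)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_x(a):  # rotation about the x-axis
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(a):  # rotation about the z-axis
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def apply_euler(points, alpha, beta, gamma):
    # Rotate mapped 3D-model points by alpha about y, then beta about x,
    # then gamma about z, in the order given above.
    R = rot_z(gamma) @ rot_x(beta) @ rot_y(alpha)
    return points @ R.T

p = np.array([0.0, 0.0, 1.0])
print(apply_euler(p, np.pi / 2, 0.0, 0.0))  # approximately [1, 0, 0]
```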
In addition, in the case where the mapping information is a quaternion (q0, q1, q2, q3), the reproduction device 14 maps the omnidirectional image onto the 3D model as it is, and then rotates the mapped omnidirectional image once on the 3D model based on the quaternion (q0, q1, q2, q3).
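Correspondingly, a single quaternion rotation replaces the three Euler steps. The sketch below uses the standard rotation formula for a unit quaternion; the scalar-first ordering of (q0, q1, q2, q3) is an assumption made for illustration.

```python
def quat_rotate(q, v):
    # Rotate vector v by unit quaternion q = (q0, q1, q2, q3), q0 scalar,
    # via the rotation matrix expanded from q * (0, v) * conj(q).
    q0, q1, q2, q3 = q
    x, y, z = v
    rx = (1 - 2*(q2*q2 + q3*q3))*x + 2*(q1*q2 - q0*q3)*y + 2*(q1*q3 + q0*q2)*z
    ry = 2*(q1*q2 + q0*q3)*x + (1 - 2*(q1*q1 + q3*q3))*y + 2*(q2*q3 - q0*q1)*z
    rz = 2*(q1*q3 - q0*q2)*x + 2*(q2*q3 + q0*q1)*y + (1 - 2*(q1*q1 + q2*q2))*z
    return (rx, ry, rz)

print(quat_rotate((1.0, 0.0, 0.0, 0.0), (1.0, 2.0, 3.0)))  # identity rotation
```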
(description of processing performed by the generating means)
Fig. 10 is a flowchart describing a file generation process executed by the generation apparatus 12 of fig. 2.
In step S11 of fig. 10, the stitching processing section 21 performs the stitching processing on the captured images in the six directions supplied from the cameras 11A of fig. 1. The stitching processing section 21 supplies the captured image in units of frames obtained as a result of the stitching processing to the mapping processing section 22 and the mapping processing section 24.
In step S12, the mapping processing section 22 generates the omnidirectional image 50 from the captured image supplied from the stitching processing section 21 by cube mapping and supplies the omnidirectional image 50 to the encoder 23.
In step S13, the encoder 23 encodes the omnidirectional image 50 generated by cube mapping and supplied from the mapping processing section 22 at one or more bit rates, and generates a cube stream. The encoder 23 supplies the cube stream at each bit rate to the segment file generation section 28.
In step S14, the mapping processing section 24 generates the omnidirectional image 90 from the captured image supplied from the stitching processing section 21 by the method using equirectangular projection and supplies the omnidirectional image 90 to the encoder 25.
In step S15, the encoder 25 encodes the omnidirectional image 90 generated by the method using equirectangular projection and supplied from the mapping processing section 24 at one or more bit rates, and generates an equirectangular projection stream. The encoder 25 supplies the equirectangular projection stream at each bit rate to the segment file generation section 28.
In step S16, the encoder 27 encodes the sound acquired from the microphone 11B of fig. 1 via the audio acquisition section 26 and generates an audio stream. The encoder 27 supplies the audio stream to the segment file generation section 28.
In step S17, the segment file generation section 28 archives the equirectangular projection stream at each bit rate, the cube stream at each bit rate, and the audio stream in units of segments, and generates segment files. The segment file generation section 28 supplies the segment files to the upload section 30.
In step S18, the MPD file generation section 29 sets the same ID as the identification information for the segment files of all the equirectangular projection streams and all the cube streams.
In step S19, the MPD file generation section 29 determines whether the captured image mapped to the reference position of the 3D model by each of the mapping processing section 22 and the mapping processing section 24 is the reference image at the reference tilt.
In the case where it is determined in step S19 that the captured image mapped to the reference position by at least one of the mapping processing section 22 and the mapping processing section 24 is not the reference image at the reference tilt, the processing proceeds to step S20.
In step S20, the MPD file generation section 29 generates an MPD file including the identification information and the mapping information. Specifically, the MPD file generation section 29 sets the identification information set in step S18 in the AdaptationSet elements of the MPD file corresponding to the equirectangular projection stream and the cube stream. In addition, the MPD file generation section 29 sets the mapping information in the AdaptationSet element corresponding to the omnidirectional image generated by whichever of the mapping processing section 22 and the mapping processing section 24 performed mapping onto the 3D model in which the captured image mapped to the reference position is not the reference image at the reference tilt. The MPD file generation section 29 supplies the generated MPD file to the upload section 30, and the processing proceeds to step S22.
On the other hand, in the case where it is determined in step S19 that the captured images mapped to the reference position by both the mapping processing section 22 and the mapping processing section 24 are the reference images at the reference tilt, the processing proceeds to step S21.
In step S21, the MPD file generation section 29 generates an MPD file in which the AdaptationSet elements corresponding to the equirectangular projection stream and the cube stream include the identification information set in step S18 but no mapping information. The MPD file generation section 29 supplies the generated MPD file to the upload section 30, and the processing proceeds to step S22.
In step S22, the upload section 30 uploads the segment files supplied from the segment file generation section 28 and the MPD file supplied from the MPD file generation section 29 to the delivery server 13 of fig. 1, and the processing ends.
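Complementing the parser shown earlier, the following Python sketch builds the value string that steps S18 to S21 set, omitting the mapping block when the reference image is already mapped at the reference tilt; the function name is illustrative only.

```python
def source_id_value(source_id, mapping=None):
    # mapping is (X, Y, rotation) or None when no mapping information
    # needs to be described (the case of step S21).
    value = f"I{source_id}"
    if mapping is not None:
        x, y, rot = mapping
        value += f" B{x},{y},{rot}"
    return value

print(source_id_value(1, (960, 480, 0)))  # "I1 B960,480,0"
print(source_id_value(1))                 # "I1"
```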
(example of configuration of reproduction apparatus)
Fig. 11 is a block diagram showing an example of the configuration of the reproduction apparatus 14 of fig. 1.
The reproduction apparatus 14 of fig. 11 includes a camera 14A, an MPD acquisition section 220, an MPD processing section 221, a section file acquisition section 222, a decoder 223, a mapping processing section 226, a drawing section 227, a receiving section 228, and a line-of-sight detecting section 229.
The MPD acquisition section 220 in the reproduction apparatus 14 issues a request for an MPD file to the delivery server 13 of fig. 1, and acquires the MPD file. The MPD acquisition section 220 supplies the acquired MPD file to the MPD processing section 221.
The MPD processing section 221 analyzes the MPD file supplied from the MPD acquisition section 220. Specifically, the MPD processing section 221 (selecting section) identifies, as the identification information, the value to which "I" is assigned among the values of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"". In addition, the MPD processing section 221 selects an adaptation set element including predetermined identification information based on the identification information on each adaptation set element. For example, in the case where the configuration of the MPD file is the configuration of fig. 7, the MPD processing section 221 selects the first adaptation set element and the second adaptation set element, both of which include the identification information 1.
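The selection logic described above can be sketched as follows. This is an illustrative fragment, not the patent's implementation; the helper names and the minimal dictionary representation of an adaptation set element are assumptions. The value format follows the convention of fig. 7 described later: an "I" block holding the identification information and a "B" block holding the mapping information, with blocks separated by a space.

```python
# A minimal sketch of extracting the identification information from the value
# of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"".
def parse_original_source_id(value: str):
    """Split e.g. 'I1 B1530,1524,0' into ('1', (1530.0, 1524.0, 0.0))."""
    ident, mapping = None, None
    for block in value.split():
        if block.startswith("I"):
            ident = block[1:]                       # identification information
        elif block.startswith("B"):
            mapping = tuple(float(v) for v in block[1:].split(","))
    return ident, mapping

def select_by_id(adaptation_sets, wanted_id: str):
    """Keep only the adaptation set elements whose 'I' block matches wanted_id."""
    selected = []
    for aset in adaptation_sets:                    # aset: dict holding the value string
        ident, _ = parse_original_source_id(aset["original_source_id"])
        if ident == wanted_id:
            selected.append(aset)
    return selected

# Example: both streams of fig. 7 carry the ID 1, so both are selected.
asets = [{"original_source_id": "I1 B480,240,0"},
         {"original_source_id": "I1 B1530,1524,0"}]
assert len(select_by_id(asets, "1")) == 2
```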
Further, the MPD processing section 221 selects, from among the adaptation set elements including the predetermined identification information, the adaptation set element whose omnidirectional image generation method corresponds to the scheme of the mapping performed by the mapping processing section 226, as the adaptation set element to be reproduced.
In the example of fig. 11, the scheme of the mapping performed by the mapping processing section 226 is, as described later, a scheme that performs mapping using a cube as the 3D model. Therefore, the MPD processing section 221 selects, as the adaptation set element to be reproduced, an adaptation set element whose omnidirectional image generation method is cubic mapping, which performs mapping using a cube as the 3D model. In other words, the MPD processing section 221 selects an adaptation set element in which the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"" is "cube".
The MPD processing section 221 acquires, from the Representation element in the selected adaptation set element, information such as the URL of the section file at the reproduction clock time, and supplies the information to the section file acquisition section 222. In the case where the selected adaptation set element includes mapping information, the MPD processing section 221 supplies the mapping information to the mapping processing section 226.
The section file acquisition section 222 issues a request to the delivery server 13 for the section file identified by the URL supplied from the MPD processing section 221, and acquires the section file. The section file acquisition section 222 supplies the cubic stream contained in the acquired section file to the decoder 223.
The decoder 223 decodes the cubic stream supplied from the section file acquisition section 222, and generates the omnidirectional image 50. The decoder 223 supplies the generated omnidirectional image 50 to the mapping processing section 226.
In the case where the MPD processing section 221 supplies mapping information, the mapping processing section 226 places, based on the mapping information, the reference image within the omnidirectional image 50 supplied from the decoder 223 at the reference position, rotates the omnidirectional image 50 by the rotation angle, and maps the omnidirectional image 50 as textures onto the faces 41 to 46 of the cube 40.
On the other hand, in the case where the MPD processing section 221 does not supply mapping information, the mapping processing section 226 maps the omnidirectional image 50 as it is as textures onto the faces 41 to 46 of the cube 40. The mapping processing section 226 supplies the 3D model image obtained as a result of the mapping to the drawing section 227.
The drawing section 227 generates an image within the visual field range of the viewer as a display image by perspectively projecting the 3D model image supplied from the mapping processing section 226 onto the visual field range of the viewer with the viewing position supplied from the line-of-sight detecting section 229 as the focal point. The drawing section 227 supplies the display image to the head-mounted display 15.
The receiving section 228 receives the detection result of the gyro sensor 15B of fig. 1 from the head mounted display 15, and supplies the detection result to the line-of-sight detecting section 229.
The line-of-sight detecting section 229 determines the line-of-sight direction of the viewer in the coordinate system of the 3D model based on the detection result of the gyro sensor 15B supplied from the receiving section 228. Further, the line-of-sight detecting section 229 acquires a captured image of the mark 15A from the camera 14A, and detects the viewing position in the coordinate system of the 3D model based on the captured image. The line-of-sight detecting section 229 determines the visual field range of the viewer in the coordinate system of the 3D model based on the viewing position and the line-of-sight direction in the coordinate system of the 3D model. The line-of-sight detecting section 229 supplies the visual field range and the viewing position of the viewer to the drawing section 227.
Note that the scheme of the mapping performed by the mapping processing section 226 may instead be a scheme that performs mapping using a sphere as the 3D model.
In this case, the MPD processing section 221 selects, as the adaptation set element to be reproduced, an adaptation set element whose omnidirectional image generation method is the method using equirectangular projection, which performs mapping using a sphere as the 3D model. In other words, the MPD processing section 221 selects, as the adaptation set element to be reproduced, an adaptation set element in which the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"" is "equirectangular". As a result, the section file acquisition section 222 acquires the section file of an equirectangular projection stream.
(processing performed by a reproduction apparatus)
Fig. 12 is a flowchart describing the reproduction processing performed by the reproduction apparatus 14 of fig. 11.
In step S31 of fig. 12, the MPD acquisition section 220 in the reproduction apparatus 14 issues a request for an MPD file to the delivery server 13 of fig. 1, and acquires the MPD file. The MPD acquisition section 220 supplies the acquired MPD file to the MPD processing section 221.
In step S32, the MPD processing section 221 identifies, as the identification information, the value to which "I" is assigned among the values of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"".
In step S33, the MPD processing section 221 selects predetermined identification information from the identification information on each adaptation set element as identification information on an omnidirectional image to be reproduced.
In step S34, the MPD processing section 221 acquires, from the MPD file, the URL of the section file of the cubic stream of the omnidirectional image 50 to be reproduced. Specifically, the MPD processing section 221 acquires the URL of the section file at the reproduction clock time from the Representation element in the adaptation set element that is described in the MPD file and contains both the identification information on the omnidirectional image to be reproduced and cubic mapping as the omnidirectional image generation method. The MPD processing section 221 supplies the acquired URL to the section file acquisition section 222.
In step S35, the section file acquisition section 222 issues a request to the delivery server 13 for the section file identified by the URL supplied from the MPD processing section 221, and acquires the section file. The section file acquisition section 222 supplies the cubic stream contained in the acquired section file to the decoder 223.
In step S36, the decoder 223 decodes the cubic stream supplied from the section file acquisition section 222, and generates the omnidirectional image 50. The decoder 223 supplies the generated omnidirectional image 50 to the mapping processing section 226.
In step S37, the receiving section 228 receives the detection result of the gyro sensor 15B from the head-mounted display 15, and supplies the detection result to the line-of-sight detecting section 229.
In step S38, the line-of-sight detecting section 229 determines the line-of-sight direction of the viewer in the coordinate system of the 3D model based on the detection result of the gyro sensor 15B supplied from the receiving section 228.
In step S39, the camera 14A captures an image of the mark 15A and supplies the image to the line-of-sight detecting section 229. In step S40, the line-of-sight detecting section 229 detects the viewing position in the coordinate system of the 3D model based on the captured image of the mark 15A supplied from the camera 14A.
In step S41, the line-of-sight detecting section 229 determines the visual field range of the viewer based on the viewing position and the line-of-sight direction in the coordinate system of the 3D model. The line-of-sight detecting section 229 supplies the visual field range and the viewing position of the viewer to the drawing section 227.
In step S42, the MPD processing section 221 determines whether the adaptation set element that contains the identification information on the omnidirectional image to be reproduced and cubic mapping as the omnidirectional image generation method also contains mapping information.
In the event that determination is made in step S42 that the adaptive set element contains mapping information, the MPD processing section 221 supplies the mapping information to the mapping processing section 226, and the process proceeds to step S43.
In step S43, the mapping processing section 226 maps the omnidirectional image 50 supplied from the decoder 223 as textures onto the faces 41 to 46 of the cube 40 based on the mapping information. The omnidirectional image 50 mapped onto the reference position of the cube 40 thus becomes the reference image at the reference inclination angle. The mapping processing section 226 supplies the 3D model image obtained as a result of the mapping to the drawing section 227, and the process proceeds to step S45.
On the other hand, in the case where it is determined in step S42 that the adaptation set element does not contain mapping information, the mapping processing section 226 maps the omnidirectional image 50 onto the faces 41 to 46 of the cube 40 as textures in step S44.
In the case where the adaptation set element does not contain mapping information, the omnidirectional image 50 corresponding to the adaptation set element is one for which simply mapping the omnidirectional image 50 onto the cube 40 places the reference image at the reference inclination angle onto the reference position of the cube 40. Therefore, through the processing in step S44, the reference image at the reference inclination angle is mapped onto the reference position of the cube 40. The mapping processing section 226 supplies the 3D model image obtained as a result of the mapping to the drawing section 227, and the process proceeds to step S45.
In step S45, the drawing section 227 generates an image within the visual field range of the viewer as a display image by perspectively projecting the 3D model image supplied from the mapping processing section 226 onto the visual field range of the viewer with the viewing position supplied from the line-of-sight detecting section 229 as the focal point.
In step S46, the drawing section 227 transmits the display image to the head-mounted display 15 of fig. 1 to display the display image thereon, and the process ends.
As described above, the generation device 12 sets identification information for the MPD file. Accordingly, the reproduction apparatus 14 can identify the captured image for generating the omnidirectional image based on the identification information. As a result, the reproduction apparatus 14 can select an appropriate omnidirectional image as an object to be reproduced from omnidirectional images generated from the same captured image.
In addition, the generation device 12 generates an omnidirectional image by a plurality of generation methods; therefore, the number of reproduction devices capable of reproducing the omnidirectional image generated by the generation device 12 can be increased.
Note that, in the first embodiment, the mapping information may be omitted from the MPD file regardless of whether or not the reference image is mapped onto the reference position of the 3D model at the reference inclination angle at the time of generating the omnidirectional image.
< second embodiment >
(example of configuration of generating apparatus)
The configuration of the second embodiment of the delivery system to which the present disclosure is applied is the same as that of the delivery system 10 of fig. 1 except for the configurations of the generation apparatus and the reproduction apparatus. Therefore, only the generating device and the reproducing device will be described below.
Fig. 13 is a block diagram showing an example of the configuration of a generating apparatus according to the second embodiment of the delivery system to which the present disclosure is applied.
In the configuration depicted in fig. 13, those configurations that are the same as those in fig. 2 are indicated by the same reference numerals. Duplicate description will be omitted depending on the case.
The generating apparatus 250 of fig. 13 differs in configuration from the generating apparatus 12 of fig. 2 in that the encoders 254-1 to 254-6, the encoder 252, the section file generating section 255, and the MPD file generating section 256 are provided in place of the encoder 23, the encoder 25, the section file generating section 28, and the MPD file generating section 29, respectively, and in that the resolution reducing section 251 and the dividing section 253 are newly provided.
The generating apparatus 250 reduces the resolution of the omnidirectional image 90 generated by the method using equirectangular projection and encodes the resultant image, and divides the omnidirectional image 50 generated by cubic mapping and encodes the divided images.
Specifically, the resolution reducing section 251 in the generating apparatus 250 generates a low-resolution image by halving the resolution of the omnidirectional image 90 supplied from the mapping processing section 24 in each of the horizontal direction and the vertical direction. The resolution reducing section 251 supplies the low-resolution image to the encoder 252.
The encoder 252 encodes the low-resolution image supplied from the resolution reducing section 251 at one or more bit rates and generates low-resolution streams. The encoder 252 supplies the low-resolution streams to the section file generating section 255.
The dividing section 253 divides the omnidirectional image 50 supplied from the mapping processing section 22 into the images 51 to 56 of the six faces 41 to 46. The dividing section 253 supplies the images 51 to 56 as high-resolution images to the encoders 254-1 to 254-6, respectively.
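A minimal sketch of this division, assuming the 4096x3072 cross layout implied by the SRD values of fig. 14 described later; the face offsets, dictionary keys, and function name are illustrative assumptions:

```python
import numpy as np

FACE_OFFSETS = {                     # face label -> (object_x, object_y)
    "image51": (2048, 1024),
    "image52": (0, 1024),
    "image53": (1024, 0),
    "image54": (1024, 2048),
    "image55": (1024, 1024),
    "image56": (3072, 1024),
}

def divide_omnidirectional(image: np.ndarray, size: int = 1024):
    """Cut the cube-mapped omnidirectional image into six face images."""
    faces = {}
    for name, (x, y) in FACE_OFFSETS.items():
        faces[name] = image[y:y + size, x:x + size]
    return faces

omni = np.zeros((3072, 4096, 3), dtype=np.uint8)   # dummy omnidirectional image 50
faces = divide_omnidirectional(omni)
assert all(f.shape == (1024, 1024, 3) for f in faces.values())
```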
The encoders 254-1 to 254-6 encode the high-resolution images supplied from the dividing section 253 at one or more bit rates. The encoders 254-1 to 254-6 supply the high-resolution streams of the faces 41 to 46 generated as a result of the encoding to the section file generating section 255.
The section file generating section 255 archives the low-resolution stream at each bit rate, the high-resolution streams of the faces 41 to 46 at each bit rate, and the audio stream in sections. The section file generating section 255 supplies the section files generated as a result of the archiving to the uploading section 30.
The MPD file generating section 256 generates an MPD file. Specifically, for each of the section files of the low-resolution streams and the high-resolution streams, the MPD file generating section 256 sets, for the MPD file, an ID or the like as identification information for identifying the captured image corresponding to the section file.
Further, for each of the section files of the low-resolution stream and the high-resolution stream, the MPD file generating section 256 sets mapping information corresponding to the section file for the MPD file as necessary. The MPD file generating section 256 supplies the MPD file to the uploading section 30.
(first example of MPD File)
Fig. 14 is a diagram illustrating a first example of an MPD file generated by the MPD file generating section 256 in fig. 13.
In the example of fig. 14, the section files of the low-resolution stream, the high-resolution streams of the faces 41 to 46, and the audio stream in the time range corresponding to each Period element are classified into one group. Therefore, in the MPD file of fig. 14, the Period element contains eight adaptation set (AdaptationSet) elements. Further, in the example of fig. 14, the identification information and the mapping information on the omnidirectional images 50 and 90 are the same as those in the example of fig. 7.
The first adaptation set element from the top is the element corresponding to the section file of the low-resolution stream. This adaptation set element is the same as the first adaptation set element of fig. 7 except for the horizontal size width, the vertical size height, and the value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"".
In other words, the image corresponding to the first adaptation set element is not the omnidirectional image 90 generated by the mapping processing section 24 but the low-resolution image. The horizontal and vertical resolutions of the low-resolution image are half the horizontal and vertical resolutions of the omnidirectional image 90, respectively. Accordingly, the horizontal size width of the first adaptation set element is 960 (= 1920/2) pixels, and the vertical size height is 480 (= 960/2) pixels.
Further, the captured image used for generating the low-resolution image is the captured image used for generating the omnidirectional image 90, and the ID of that captured image as the identification information is "1". However, in the example of fig. 14, the mapping information on the low-resolution image is the coordinates (480, 240) of the position of the reference image within the low-resolution image and the rotation angle of 0 degrees. Thus, the value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"" is "I1 B480,240,0". In other words, in the case of fig. 14, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information in the value, and "B" is added to the top of the block storing the mapping information. Furthermore, the pieces of information within a block are separated by commas (,), and the blocks are separated by a space.
The second to seventh adaptation set elements from the top are the elements corresponding to the section files of the high-resolution streams of the faces 41 to 46, respectively. These adaptation set elements are the same as the second adaptation set element of fig. 7 except for the horizontal size width, the vertical size height, and the newly included "SupplementalProperty schemeIdUri="urn:mpeg:dash:srd:2014"".
In other words, the images corresponding to the second to seventh adaptation set elements from the top are not the omnidirectional image 50 generated by the mapping processing section 22 but the images 51 to 56. Therefore, the horizontal size width and the vertical size height of the second to seventh adaptation set elements are each 1024 pixels, which is the number of pixels in each of the horizontal and vertical directions of the images 51 to 56.
Further, the second to seventh adaptation set elements from the top each have "SupplementalProperty schemeIdUri="urn:mpeg:dash:srd:2014"" as an SRD (spatial relationship description), which indicates that the image corresponding to the adaptation set element is an image obtained by dividing an image of the moving video content.
The values of "supplemental property schema I ═ urn:" urn: mpeg: dash: srd:2014 "are defined by source _ id, object _ x, object _ y, object _ width, object _ height, total _ width, total _ height, and spatial _ set _ id.
"source _ ID" is an ID that identifies the image before being divided into the image corresponding to the adaptation set element (here, the omnidirectional image 50). Further, "object _ x" and "object _ y" are coordinates in the horizontal direction and the vertical direction of the upper left position of the image on the image before the image (here, the omnidirectional image 50) divided into the corresponding adaptive set element, respectively. The "object _ width" and the "object _ height" are the horizontal size and the vertical size of the image corresponding to the adaptation set element, respectively. Further, "total _ width" and "total _ height" are the horizontal size and the vertical size of an image (herein, omnidirectional image 50) before being divided into an image corresponding to the a-adaptation set element. "spatial _ set _ ID" is an ID for identifying the division level of the image corresponding to the adaptive set element. In the example of fig. 14, the ID of the omnidirectional image 50 is 0.
Accordingly, the values of "SupplementalProperty schemeIdUri="urn:mpeg:dash:srd:2014"" possessed by the second to seventh adaptation set elements from the top, corresponding to the images 51 to 56, are "0,2048,1024,1024,1024,4096,3072,0", "0,0,1024,1024,1024,4096,3072,0", "0,1024,0,1024,1024,4096,3072,0", "0,1024,2048,1024,1024,4096,3072,0", "0,1024,1024,1024,1024,4096,3072,0", and "0,3072,1024,1024,1024,4096,3072,0", respectively.
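For illustration, an SRD value can be parsed into its named fields as follows; the field order follows the definition above, and the class and function names are assumptions:

```python
from typing import NamedTuple

class SRD(NamedTuple):
    source_id: int
    object_x: int
    object_y: int
    object_width: int
    object_height: int
    total_width: int
    total_height: int
    spatial_set_id: int

def parse_srd(value: str) -> SRD:
    return SRD(*(int(v) for v in value.split(",")))

srd = parse_srd("0,2048,1024,1024,1024,4096,3072,0")   # image 51 in fig. 14
# The rectangle occupied by image 51 on the 4096x3072 omnidirectional image 50:
rect = (srd.object_x, srd.object_y, srd.object_width, srd.object_height)
assert rect == (2048, 1024, 1024, 1024)
```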
In the example of fig. 14, the mapping information is described in each adaptation set element because the reference image is not mapped onto the reference position of the 3D model at the reference inclination angle when the omnidirectional images 50 and 90 are generated. In the case where the reference image is mapped at the reference inclination angle, however, the mapping information is not described.
The eighth adaptation set element from the top is the element corresponding to the section file of the audio stream.
Further, in the example of fig. 14, each adaptation set element contains one Representation element. In the Representation elements in the first to eighth adaptation set elements from the top, "equirectangular.mp4", "cube1.mp4" to "cube6.mp4", and "audio.mp4" are described, respectively, as the BaseURLs of the section files corresponding to the Representation elements. Note that fig. 14 omits the description of the segment information.
(description of processing performed by the generating apparatus)
Fig. 15 is a flowchart describing a file generation process performed by the generation apparatus 250 of fig. 13.
In step S61 of fig. 15, the stitching processing section 21 performs stitching processing on the captured images in the six directions supplied from the camera 11A in fig. 1 for each frame. The stitching processing section 21 supplies the captured image in frame units obtained as a result of the stitching processing to the mapping processing section 22 and the mapping processing section 24.
In step S62, the mapping processing section 24 generates the omnidirectional image 90 from the captured images supplied from the stitching processing section 21 by the method using equirectangular projection, and supplies the omnidirectional image 90 to the resolution reducing section 251.
In step S63, the resolution reduction section 251 reduces the resolution of the omnidirectional image 90 supplied from the mapping processing section 24 and generates a low-resolution image. The resolution reducing section 251 supplies the low resolution image to the encoder 252.
In step S64, the encoder 252 encodes the low-resolution image supplied from the resolution reducing section 251 and generates a low-resolution stream. The encoder 252 supplies the low-resolution stream to the section file generating section 255.
In step S65, the mapping processing section 22 generates the omnidirectional image 50 from the captured images supplied from the stitching processing section 21 by cubic mapping and supplies the omnidirectional image 50 to the dividing section 253.
In step S66, the dividing section 253 divides the omnidirectional image 50 supplied from the mapping processing section 22 into images 51 to 56 of six planes 41 to 46. The dividing section 253 supplies the images 51 to 56 as high-resolution images to the encoders 254-1 to 254-6, respectively.
In step S67, the encoders 254-1 to 254-6 encode the high-resolution images of the planes 41 to 46 to generate a high-resolution stream and supply the high-resolution stream to the section file generating section 255.
In step S68, the encoder 27 encodes the sound acquired from the microphone 11B of fig. 1 via the audio acquisition section 26 and generates an audio stream. The encoder 27 supplies the audio stream to the section file generating section 255.
In step S69, the section file generating section 255 archives the low-resolution stream at each bit rate, the high-resolution streams of the faces 41 to 46 at each bit rate, and the audio stream in sections, and generates section files. The section file generating section 255 supplies the section files to the uploading section 30.
Since the processing of steps S70 to S74 is similar to that of steps S18 to S22 of fig. 10, description will be omitted.
(example of configuration of reproduction apparatus)
Fig. 16 is a block diagram showing an example of the configuration of a reproduction apparatus according to the second embodiment of the delivery system to which the present disclosure is applied.
In the configuration depicted in fig. 16, the same configuration as that of fig. 11 is denoted by the same reference numeral. Repeated description will be omitted where appropriate.
The reproduction apparatus 270 of fig. 16 differs in configuration from the reproduction apparatus 14 of fig. 11 in that the MPD processing section 271, the section file acquisition section 272, the decoders 273 and 274, the mapping processing section 275, and the line-of-sight detecting section 276 are provided in place of the MPD processing section 221, the section file acquisition section 222, the decoder 223, the mapping processing section 226, and the line-of-sight detecting section 229. The reproduction apparatus 270 generates a display image from the low-resolution image and from the high-resolution image of the face corresponding to the line of sight extending from the viewing position in the line-of-sight direction of the viewer.
The MPD processing section 271 in the reproduction apparatus 270 analyzes the MPD file supplied from the MPD acquisition section 220. Specifically, similarly to the MPD processing section 221 of fig. 11, the MPD processing section 271 identifies the identification information on each adaptation set element and selects the adaptation set elements containing predetermined identification information. For example, in the case where the configuration of the MPD file is the configuration of fig. 14, the MPD processing section 271 selects the first to seventh adaptation set elements from the top, which include the identification information 1.
Further, the MPD processing section 271 selects, from among the selected adaptation set elements including the predetermined identification information, the adaptation set element of the low-resolution image (in the example of fig. 14, the adaptation set element that does not include an SRD) as the adaptation set element of the low-resolution image to be reproduced. For example, in the case where the configuration of the MPD file is the configuration of fig. 14, the MPD processing section 271 selects the first adaptation set element.
Further, the MPD processing section 271 selects, from among the selected adaptation set elements including the predetermined identification information, the adaptation set elements of the high-resolution images (in the example of fig. 14, the adaptation set elements each including an SRD). For example, in the case where the configuration of the MPD file is the configuration of fig. 14, the MPD processing section 271 selects the second to seventh adaptation set elements.
Then, based on selection plane information supplied from the line-of-sight detecting section 276 (the selection plane information will be described in detail later) and on the SRDs, the MPD processing section 271 selects, from among the adaptation set elements of the high-resolution images, the adaptation set element of the selection plane indicated by the selection plane information as the adaptation set element of the high-resolution image to be reproduced.
Specifically, the MPD processing section 271 selects, as the adaptation set element of the high-resolution image to be reproduced, the adaptation set element whose SRD value indicates the position, on the omnidirectional image 50, of the image corresponding to the selection plane indicated by the selection plane information, from among the images 51 to 56. Note that the selection plane information is information indicating, as the selection plane, the one face among the faces 41 to 46 that corresponds to the line of sight of the viewer.
The MPD processing section 271 acquires, from the Representation elements in the adaptation set elements of the low-resolution image and the high-resolution image to be reproduced, information such as the URLs of the section files at the reproduction clock time, and supplies the information to the section file acquisition section 272. Further, in the case where the adaptation set elements of the low-resolution image and the high-resolution image to be reproduced include mapping information, the MPD processing section 271 supplies the mapping information to the mapping processing section 275.
The section file acquisition section 272 issues requests to the delivery server 13 for the section files identified by the URLs supplied from the MPD processing section 271, and acquires the section files. The section file acquisition section 272 supplies one acquired low-resolution stream to the decoder 273 and one acquired high-resolution stream to the decoder 274.
The decoder 273 decodes the one low-resolution stream supplied from the section file acquisition section 272 to generate a low-resolution image and supplies the low-resolution image to the mapping processing section 275.
The decoder 274 decodes the one high-resolution stream supplied from the section file acquisition section 272 to generate a high-resolution image and supplies the high-resolution image to the mapping processing section 275.
In the case where the MPD processing section 271 supplies mapping information on the low-resolution image, the mapping processing section 275 places, based on the mapping information, the reference image within the low-resolution image at the reference position, rotates the low-resolution image by the rotation angle λ, and maps the low-resolution image as a texture onto the faces 71 to 76 of the sphere 70. On the other hand, in the case where the MPD processing section 271 does not supply mapping information, the mapping processing section 275 maps the low-resolution image as it is as a texture onto the faces 71 to 76 of the sphere 70.
Note that the mapping processing section 275 may map only the part of the low-resolution image containing the region to be perspectively projected onto the visual field range of the viewer determined by the line-of-sight detecting section 276, rather than the entire low-resolution image.
Further, the mapping processing section 275 sets the selection plane as a 3D model inside the sphere 70 based on the selection plane information supplied from the line-of-sight detecting section 276. In the case where the MPD processing section 271 supplies mapping information on the high-resolution image, the mapping processing section 275 places, based on the mapping information, the reference image within the high-resolution image at the reference position, rotates the high-resolution image by the rotation angle, and maps the high-resolution image as a texture onto the selection plane set inside the sphere 70. On the other hand, in the case where the MPD processing section 271 does not supply mapping information, the mapping processing section 275 maps the high-resolution image as it is as a texture onto the selection plane set inside the sphere 70.
Further, the mapping processing section 275 supplies, to the drawing section 227, the 3D model image in which the textures are mapped onto the sphere 70 and the selection plane.
The line-of-sight detecting section 276 determines the line-of-sight direction of the viewer in the coordinate system of the 3D model based on the detection result of the gyro sensor 15B supplied from the receiving section 228. Further, the line-of-sight detecting section 276 acquires a captured image of the mark 15A from the camera 14A and detects the viewing position based on the captured image.
Based on the viewing position and the line-of-sight direction in the coordinate system of the 3D model, the line-of-sight detecting section 276 determines, as the selection plane, the one face among the faces 41 to 46 whose normal passing through its center is closest to the line of sight of the viewer. The line-of-sight detecting section 276 supplies the selection plane information to the MPD processing section 271 and the mapping processing section 275.
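The choice of the selection plane can be sketched as a nearest-normal test; the axis assignment of the faces 41 to 46 below is an illustrative assumption, since the text does not fix a coordinate convention here:

```python
import numpy as np

# Outward normals through the centers of the six cube faces (assumed axes).
FACE_NORMALS = {
    "face41": np.array([1.0, 0.0, 0.0]),   "face42": np.array([-1.0, 0.0, 0.0]),
    "face43": np.array([0.0, 1.0, 0.0]),   "face44": np.array([0.0, -1.0, 0.0]),
    "face45": np.array([0.0, 0.0, 1.0]),   "face46": np.array([0.0, 0.0, -1.0]),
}

def select_face(gaze_direction: np.ndarray) -> str:
    gaze = gaze_direction / np.linalg.norm(gaze_direction)
    # The face whose normal has the largest dot product with the gaze is the
    # face closest to the viewer's line of sight.
    return max(FACE_NORMALS, key=lambda f: float(FACE_NORMALS[f] @ gaze))

assert select_face(np.array([0.9, 0.1, 0.2])) == "face41"
```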
Further, the line-of-sight detecting section 276 determines the visual field range of the viewer in the coordinate system of the 3D model based on the viewing position and the line-of-sight direction in the coordinate system of the 3D model. The line-of-sight detecting section 276 supplies the visual field range and the viewing position of the viewer to the drawing section 227.
(processing performed by a reproduction apparatus)
Fig. 17 is a flowchart describing the reproduction processing performed by the reproduction apparatus 270 of fig. 16.
Since the processing from steps S81 to S83 of fig. 17 is similar to the processing from steps S31 to S33 of fig. 12, description will be omitted.
In step S84, the MPD processing section 271 acquires, from the MPD file, the URL of the section file of the low-resolution stream to be reproduced. Specifically, the MPD processing section 271 acquires the URL of the section file at the reproduction clock time from the Representation element in the adaptation set element of the low-resolution image that is described in the MPD file and contains the identification information selected in step S83. The MPD processing section 271 supplies the acquired URL to the section file acquisition section 272.
Since the processing from steps S85 to S88 is similar to that of steps S37 to S40 of fig. 12, description will be omitted.
In step S89, the line-of-sight detecting section 276 determines the selection plane and the visual field range of the viewer based on the viewing position and the line-of-sight direction. The line-of-sight detecting section 276 supplies the visual field range and the viewing position of the viewer to the drawing section 227. Further, the line-of-sight detecting section 276 supplies the selection plane information to the MPD processing section 271 and the mapping processing section 275. The mapping processing section 275 sets the selection plane as a 3D model inside the sphere 70 based on the selection plane information.
In step S90, the MPD processing section 271 acquires, from the MPD file, the URL of the section file of the high-resolution stream to be reproduced. Specifically, the MPD processing section 271 acquires the URL of the section file at the reproduction clock time from the Representation element in the adaptation set element that is described in the MPD file, contains the identification information selected in step S83, and has an SRD whose value indicates the position, on the omnidirectional image 50, of the high-resolution image corresponding to the selection plane indicated by the selection plane information. The MPD processing section 271 supplies the acquired URL to the section file acquisition section 272.
In step S91, the section file acquisition section 272 issues requests to the delivery server 13 for the section files identified by the URLs supplied from the MPD processing section 271, and acquires the section files. The section file acquisition section 272 supplies the low-resolution stream contained in the acquired section files to the decoder 273 and supplies the high-resolution stream to the decoder 274.
In step S92, the decoder 274 decodes the high-resolution stream supplied from the section file acquisition section 272 to generate a high-resolution image and supplies the high-resolution image to the mapping processing section 275.
In step S93, the decoder 273 decodes the low resolution stream supplied from the section file acquisition section 272 to generate a low resolution image and supplies the low resolution image to the mapping processing section 275.
In step S94, the MPD processing section 271 determines whether the adaptation set element of the low resolution stream to be reproduced contains mapping information. In the event that determination is made in step S94 that the adaptive set element contains mapping information, the MPD processing section 271 supplies the mapping information to the mapping processing section 275, and the process proceeds to step S95.
In step S95, the mapping processing section 275 maps the low-resolution image supplied from the decoder 273 as a texture onto the faces 71 to 76 of the sphere 70 based on the mapping information on the low-resolution stream supplied from the MPD processing section 271. Then, the process proceeds to step S97.
On the other hand, in the case where it is determined in step S94 that the adaptation set element does not contain mapping information, the process proceeds to step S96. In step S96, the mapping processing section 275 maps the low-resolution image supplied from the decoder 273 as it is as a texture onto the faces 71 to 76 of the sphere 70. Then, the process proceeds to step S97.
In step S97, the MPD processing section 271 determines whether the adaptation set element of the high-resolution stream to be reproduced contains mapping information. In the event that determination is made in step S97 that the adaptive set element contains mapping information, the MPD processing section 271 supplies the mapping information to the mapping processing section 275, and the process proceeds to step S98.
In step S98, the mapping processing section 275 maps the high-resolution image as a texture onto the selection plane set inside the sphere 70 based on the mapping information on the high-resolution stream. The mapping processing section 275 supplies the 3D model image in which the textures are mapped onto the sphere 70 and the selection plane to the drawing section 227, and the process proceeds to step S100.
On the other hand, in the case where it is determined in step S97 that the adaptation set element does not contain mapping information, the process proceeds to step S99. In step S99, the mapping processing section 275 maps the high-resolution image as it is as a texture onto the selection plane set inside the sphere 70. The mapping processing section 275 supplies the 3D model image in which the textures are mapped onto the sphere 70 and the selection plane to the drawing section 227, and the process proceeds to step S100.
Since the processing in steps S100 and S101 is similar to that in steps S45 and S46 of fig. 12, description will be omitted.
As described so far, the generating apparatus 250 sets the identification information for the MPD file. Accordingly, the reproduction apparatus 270 can set the low-resolution image and the high-resolution image corresponding to the same identification information as images to be reproduced simultaneously, based on the identification information.
Further, the generating apparatus 250 sets the mapping information for the MPD file. Accordingly, the reproduction apparatus 270 maps the low-resolution image and the high-resolution image based on the mapping information, so that the reference image at the reference inclination angle is mapped onto the reference position for both images. Therefore, the low-resolution image and the high-resolution image can be mapped onto the same sphere 70 in an overlapping manner with high accuracy.
Further, the reproduction apparatus 270 acquires the high-resolution stream of only the one selection plane corresponding to the line of sight of the viewer among the faces 41 to 46. Therefore, the amount of transmission between the generating apparatus 250 and the reproduction apparatus 270 can be reduced as compared with the case where the high-resolution streams of all the faces 41 to 46 are acquired.
Further, the reproduction apparatus 270 generates the 3D model image using the high-resolution image of the selection plane corresponding to the line of sight of the viewer and the low-resolution image of all the faces 71 to 76. Therefore, a display image in the visual field range of the viewer can be generated from the high-resolution image, which improves the image quality of the display image. Moreover, even in the case where the region of the 3D model image to be perspectively projected onto the visual field range of the viewer includes a region other than the high-resolution image, or in the case where the visual field range of the viewer changes abruptly, the display image can be generated using the low-resolution image.
(second example of MPD File)
Fig. 18 is a diagram illustrating a second example of an MPD file generated by the MPD file generating section 256 in fig. 13.
The description of the MPD file of fig. 18 differs from that of the MPD file of fig. 14 in that the identification information and the mapping information are set not as a value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"" but as a value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"".
Specifically, the values of "supplementalpropertyexeduuri ═ http:// xmlns.sony. net/metadata/mpeg/dash/cordinates/2015" contained in the adaptive set element of the omnidirectional image are the omnidirectional image generation method, ID, and "X, Y, λ".
Thus, the value of "supplementalproportionality schemehduri" ("http:// xmlns. sony. net/metadata/mpeg/dash/cordinates/2015") contained in the first adaptation set element from the top is "equirectangulari I1B480,240, 0". Further, the value of "SupplementalProperty schemeelduri ═ http:// xmlns. sony. net/metadata/mpeg/dash/cordates/2015" contained in each of the second to seventh adaptation set elements from the top is "cube I1B 1530,1524, 0". In this case, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information, and "B" is added to the top of the block storing the mapping information in the value. Furthermore, each piece of information in a block is separated by a comma (,) and the blocks are separated by a space. Note that nothing is added to the top of the block storing the omnidirectional image generation method.
Further, "supplementalProperty schema I _ Uri ═ urn: mpeg: dash: original _ source _ id: 2016" "is not described in the first through seventh adaptation set elements from the top.
(third example of MPD File)
Fig. 19 is a diagram illustrating a third example of an MPD file generated by the MPD file generating section 256 in fig. 13.
The description of the MPD file of fig. 19 differs from that of the MPD file of fig. 14 in that the mapping information is set not as a value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"" but as a value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"".
Specifically, the value of "SupplementalProperty schemeIdUri ═ http:// xmlns.sony. net/metadata/mpeg/dash/cordinates/2015" contained in the adaptation set element of the omnidirectional image is the omnidirectional image generation method and the mapping information.
Thus, the value of "SupplementalProperty schemehduri" ("http:// xmlns. sony. net/metadata/mpeg/dash/cordinates/2015") contained in the first adaptation set element from the top is "equirectular B480,240, 0". Further, the value of "SupplementalProperty schemeelduri ═ http:// xmlns. sony. net/metadata/mpeg/dash/cordates/2015" contained in each of the second to seventh adaptation set elements from the top is "cube B1530,1524, 0". In this case, similarly to the case of fig. 7, "B" is added to the top of the block storing the identification information in the value. Furthermore, each piece of information in a block is separated by a comma (,) and the blocks are separated by a space. Note that nothing is added to the top of the block storing the omnidirectional image generation method.
In addition, the value of "complementary property schemeiUri ═ urn: mpeg: dash: original _ source _ ID: 2016" is ID. In other words, the value of "supplementalPropertySchemeIdUri" urn "mpeg: dash: original _ source _ id: 2016" in each of the first through seventh adaptation set elements is "I1". In this case, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information in the value.
In the example of fig. 19, since the identification information is set as the value of a SupplementalProperty different from that of the mapping information, the identification information can easily be used in processing other than the mapping.
(fourth example of MPD File)
Fig. 20 is a diagram illustrating a fourth example of an MPD file generated by the MPD file generating section 256 in fig. 13.
The description of the MPD file of fig. 20 differs from that of the MPD file of fig. 14 in that type information indicating the type of the mapping information is newly described.
In other words, the mapping information has been described above as the coordinates (X, Y) and the rotation angle λ. However, as described above, the mapping information may instead be the Euler angles (α, β, γ) or the quaternion (q0, q1, q2, q3).
Therefore, in the example of fig. 20, information indicating which of the coordinates (X, Y) and rotation angle λ, the Euler angles (α, β, γ), and the quaternion (q0, q1, q2, q3) the mapping information represents is described as the type information.
Specifically, the type information is set between "original_source_id" and "2016" in "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"".
As depicted in fig. 21, in the case where the mapping information is the coordinates (X, Y) and the rotation angle λ, the type information (type) is "2dxy-3dr", and in the case where the mapping information is the Euler angles (α, β, γ), the type information is "yxz-euler". Further, in the case where the mapping information is the quaternion (q0, q1, q2, q3), the type information is "quaternion".
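A sketch of how a reproduction apparatus might dispatch on the type information of fig. 21; the parser below merely labels the comma-separated numbers, and the field names are taken from the descriptions above:

```python
def parse_mapping_info(type_info: str, block: str):
    values = tuple(float(v) for v in block.split(","))
    if type_info == "2dxy-3dr":      # coordinates (X, Y) and rotation angle λ
        keys = ("X", "Y", "lambda")
    elif type_info == "yxz-euler":   # Euler angles (α, β, γ)
        keys = ("alpha", "beta", "gamma")
    elif type_info == "quaternion":  # quaternion (q0, q1, q2, q3)
        keys = ("q0", "q1", "q2", "q3")
    else:
        raise ValueError(f"unknown mapping-information type: {type_info}")
    if len(values) != len(keys):
        raise ValueError("field count does not match the declared type")
    return dict(zip(keys, values))

assert parse_mapping_info("2dxy-3dr", "480,240,0") == {"X": 480.0, "Y": 240.0, "lambda": 0.0}
```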
In the example of fig. 20, the mapping information is the coordinates (X, Y) and the rotation angle λ. Thus, the SupplementalProperty that is contained in the first adaptation set element from the top and has the value "I1 B480,240,0" is "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2dxy-3dr:2016"".
Further, the SupplementalProperty that is contained in each of the second to seventh adaptation set elements from the top and has the value "I1 B1530,1524,0" is "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2dxy-3dr:2016"". Note that in the case of fig. 20, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information in the value, and "B" is added to the top of the block storing the mapping information. Further, the pieces of information within a block are separated by commas (,), and the blocks are separated by a space.
(fifth example of MPD File)
Fig. 22 is a diagram illustrating a fifth example of an MPD file generated by the MPD file generating section 256 in fig. 13.
The description of the MPD file of fig. 22 differs from that of the MPD file of fig. 20 in that the type information is set in a value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"".
Specifically, the values of "complementary property scheme ID uri ═ urn: mpeg: dash: original _ source _ ID: 2016" included in the adaptation set element of the omnidirectional image are ID and "type information X, Y, λ".
Thus, the value of "complementary property scheme II ═ urn:" urn: mpeg: dash: original _ source _ id:2016 "contained in the first adaptation set element from the top is" I1B2dxy-3dr,480,240,0 ". Further, the value of "supplementalPropertySchemeIdUri ═ urn: mpeg: dash: origin _ source _ id: 2016" contained in the seventh adaptation set element from the second adaptation set element value from the top is "I1B 2dxy-3dr,1530,1524, 0". In other words, in the case of fig. 20, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information in the value, and "B" is added to the top of the block storing the mapping information. In this case, "B" is added to the type information stored in the top of the block storing the mapping information. Furthermore, each piece of information in a block is separated by a comma (,) and the blocks are separated by a space.
As described so far, in the examples of figs. 20 and 22, the type information is set for the MPD file. Accordingly, the MPD file generating section 256 can select the type of the mapping information set for the MPD file from among a plurality of types. Moreover, even in the case where the MPD file generating section 256 selects the type of the mapping information, the reproduction apparatus 270 can identify the type of the mapping information from the type information. Accordingly, the reproduction apparatus 270 can accurately perform the mapping based on the mapping information.
Note that, in the MPD file of the first embodiment, similarly to the case of fig. 18, the identification information and the mapping information may be set as a value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"". Further, in the MPD file of the first embodiment, similarly to the case of fig. 19, the mapping information may be set as a value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"". Moreover, in the MPD file of the first embodiment, the type of the mapping information may be described similarly to the cases of figs. 20 and 22.
< third embodiment >
(example of configuration of generating apparatus)
The configuration of the third embodiment of the delivery system to which the present disclosure is applied is the same as that of the delivery system 10 of fig. 1 except for the configurations of the generation apparatus and the reproduction apparatus. Therefore, only the generating device and the reproducing device are described below.
Fig. 23 is a block diagram showing an example of the configuration of a generating apparatus according to the third embodiment of the delivery system to which the present disclosure is applied.
In the configuration depicted in fig. 23, the same configuration as that of fig. 13 is denoted by the same reference numeral. Repeated description will be omitted where appropriate.
The generating apparatus 300 of fig. 23 differs in configuration from the generating apparatus 250 of fig. 13 in that the perspective projection sections 302-1 to 302-18, the encoders 303-1 to 303-18, and the section file generating section 304 are provided in place of the dividing section 253, the encoders 254-1 to 254-6, and the section file generating section 255, in that the setting section 301 is newly provided, and in that the mapping processing section 22 is not provided.
The generating apparatus 300 generates 18 high-resolution images by mapping the omnidirectional image generated by the method using equirectangular projection onto the sphere 70 and perspectively projecting the mapped image onto 18 two-dimensional planes.
Specifically, the setting section 301 of the generating apparatus 300 sets 18 pieces of two-dimensional plane information, each indicating the position, inclination, and size of a two-dimensional plane serving as a drawing plane in one of 18 line-of-sight directions. The setting section 301 supplies each piece of two-dimensional plane information to the corresponding one of the perspective projection sections 302-1 to 302-18.
Each of the perspective projection sections 302-1 to 302-18 maps the omnidirectional image 90 generated by the mapping processing section 24 onto the sphere 70. Each of the perspective projection sections 302-1 to 302-18 then generates an image by perspectively projecting the omnidirectional image mapped onto the sphere 70, with the center of the sphere 70 as the focal point, onto the two-dimensional plane indicated by the two-dimensional plane information supplied from the setting section 301. The generated image is thus an image obtained by viewing the omnidirectional image 90 mapped onto the sphere 70 from the center O of the sphere 70 toward a predetermined line-of-sight direction. The perspective projection sections 302-1 to 302-18 supply the generated images as high-resolution images to the encoders 303-1 to 303-18, respectively.
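This perspective projection can be sketched as sampling the equirectangular image through the center O; the axis and angle-sign conventions below are illustrative assumptions, not definitions taken from the patent:

```python
import numpy as np

def perspective_project(equirect: np.ndarray, azimuth: float, elevation: float,
                        fov_deg: float = 90.0, out_size: int = 1024) -> np.ndarray:
    h, w, _ = equirect.shape
    f = (out_size / 2) / np.tan(np.radians(fov_deg) / 2)   # focal length in pixels
    u, v = np.meshgrid(np.arange(out_size) - out_size / 2 + 0.5,
                       np.arange(out_size) - out_size / 2 + 0.5)
    # Ray through each output pixel, in camera coordinates (z forward, y down).
    x, y, z = u, v, np.full_like(u, f)
    # Rotate the rays by elevation (pitch), then by azimuth (yaw).
    el, az = np.radians(elevation), np.radians(azimuth)
    y2 = y * np.cos(el) - z * np.sin(el)
    z2 = y * np.sin(el) + z * np.cos(el)
    x3 = x * np.cos(az) + z2 * np.sin(az)
    z3 = -x * np.sin(az) + z2 * np.cos(az)
    # Back to spherical angles, then to equirectangular pixel coordinates.
    lon = np.arctan2(x3, z3)
    lat = np.arcsin(np.clip(y2 / np.sqrt(x3**2 + y2**2 + z3**2), -1, 1))
    px = ((lon / (2 * np.pi) + 0.5) * w).astype(int) % w
    py = np.clip(((lat / np.pi + 0.5) * h).astype(int), 0, h - 1)
    return equirect[py, px]

omni90 = np.zeros((960, 1920, 3), dtype=np.uint8)  # dummy omnidirectional image 90
high_res = perspective_project(omni90, azimuth=-90.0, elevation=45.0)
assert high_res.shape == (1024, 1024, 3)
```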
The encoders 303-1 to 303-18 encode the high-resolution images supplied from the perspective projection sections 302-1 to 302-18 at one or more bit rates. The encoders 303-1 to 303-18 supply the high-resolution streams of the respective two-dimensional planes generated as a result of the encoding to the section file generating section 304.
Hereinafter, the perspective projection sections 302-1 to 302-18 are collectively referred to as the "perspective projection section 302" where they need not be distinguished from one another. Likewise, the encoders 303-1 to 303-18 are collectively referred to as the "encoder 303".
The section file generating section 304 archives the low-resolution stream at each bit rate, the high-resolution stream of each two-dimensional plane at each bit rate, and the audio stream in sections. The section file generating section 304 supplies the section files generated as a result of the archiving to the uploading section 30.
(example of configuration of two-dimensional plane information)
Fig. 24 is a diagram of an example of the configuration of two-dimensional plane information.
As depicted in fig. 24, the two-dimensional plane information includes an azimuth angle, an elevation angle, a rotation angle, a lateral viewing angle, and a longitudinal viewing angle. The azimuth angle and the elevation angle are information indicating the position of the two-dimensional plane, and the rotation angle is information indicating the inclination of the two-dimensional plane. Further, the lateral viewing angle and the longitudinal viewing angle are information indicating the size of the two-dimensional plane.
Specifically, as depicted in fig. 25, the azimuth angle and the elevation angle are the angles in the horizontal direction (the direction of arrow a in fig. 25) and the vertical direction (the direction of arrow b in fig. 25), respectively, formed between the line connecting the center O of the sphere 70 as the 3D model with the center C of the two-dimensional plane 311 and the line connecting the center O with the reference point B on the sphere 70. Further, as depicted in fig. 25, the rotation angle is the angle of rotation of the two-dimensional plane 311 (in the direction of arrow c in fig. 25) about the line connecting the center C and the center O as an axis. Note that the position of the center C is reached from the position of the reference point B by moving in the order of the azimuth angle, the elevation angle, and the rotation angle. The positive direction of the rotation angle is, for example, counterclockwise.
As depicted in fig. 25, the lateral viewing angle is the angle qOs formed between the lines connecting the midpoints q and s of the two lateral ends of the two-dimensional plane 311 with the center O. Further, as depicted in fig. 25, the longitudinal viewing angle is the angle pOr formed between the lines connecting the midpoints p and r of the two longitudinal ends of the two-dimensional plane 311 with the center O.
Note that the two-dimensional plane 311 is disposed such that a line passing through the center O of the sphere 70 and the center C of the two-dimensional plane 311 serves as a normal line. When the center O is assumed as the viewing position, the normal line is located in the line-of-sight direction of the high-resolution image corresponding to the two-dimensional plane 311.
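The position and size parameters above can be turned into vectors for computation. The following Python sketch derives the normal direction of a two-dimensional plane from its azimuth and elevation angles, and the plane's half extents from its viewing angles; the axis conventions are assumptions for illustration, not taken from the figures.

```python
import numpy as np

def plane_normal(azimuth_deg, elevation_deg):
    """Unit vector from the center O toward the center C of the plane,
    taking the reference point B as azimuth 0, elevation 0 (assumed axes)."""
    a, e = np.radians(azimuth_deg), np.radians(elevation_deg)
    return np.array([np.cos(e) * np.sin(a),    # horizontal component
                     np.sin(e),                # vertical component
                     np.cos(e) * np.cos(a)])   # points at B when a = e = 0

def plane_half_extents(lateral_deg, longitudinal_deg):
    """Half width/height of the plane at unit distance from O, chosen so that
    the angles qos and por of fig. 25 equal the given viewing angles."""
    return (np.tan(np.radians(lateral_deg) / 2),
            np.tan(np.radians(longitudinal_deg) / 2))
```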
(example of two-dimensional plane)
Fig. 26 is a diagram showing an example of the 18 two-dimensional planes set by the setting section 301 of fig. 23, and fig. 27 is a diagram showing an example of two-dimensional plane information on the 18 two-dimensional planes.
As shown in fig. 26, the setting section 301 sets 12 two-dimensional planes whose normals passing through their centers are the lines 321-1, 321-2, 321-4, 321-5, 321-7, 321-9, 321-11, 321-12, 321-14, 321-15, 321-16, and 321-18 connecting the center O of the sphere 70 to the midpoints of the 12 edges of a cube 320 that is disposed within the sphere 70 and centered on the center O.
Further, the setting section 301 sets six two-dimensional planes whose normals passing through their centers are the lines 321-3, 321-6, 321-8, 321-10, 321-13, and 321-17 connecting the centers of the six faces of the cube 320 to the center O.
In this case, as depicted in fig. 27, the azimuth angles in the two-dimensional plane information on the 18 two-dimensional planes whose normals passing through their centers are the lines 321-1 to 321-18 are, in order, -135, -90, -90, -90, -45, 0, 0, 0, 0, 0, 45, 90, 90, 90, 135, 180, 180, and 180, respectively. Further, the elevation angles are, in order, 0, 45, 0, -45, 0, 90, 45, 0, -45, -90, 0, 45, 0, -45, 0, 45, 0, and -45, respectively.
Further, in the example of fig. 27, the rotation angles of all 18 two-dimensional planes are set to 0 degree, and the lateral viewing angle and the longitudinal viewing angle are set to 90 degrees.
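For reference, the 18 (azimuth angle, elevation angle) pairs of fig. 27 can be restated compactly as data, for example as follows (Python; the variable name is illustrative).

```python
# Azimuth/elevation pairs of fig. 27; the rotation angle is 0 and the
# lateral and longitudinal viewing angles are 90 degrees for every plane.
PLANES = [
    (-135, 0), (-90, 45), (-90, 0), (-90, -45), (-45, 0),
    (0, 90), (0, 45), (0, 0), (0, -45), (0, -90),
    (45, 0), (90, 45), (90, 0), (90, -45), (135, 0),
    (180, 45), (180, 0), (180, -45),
]
assert len(PLANES) == 18
```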
As described above, adjacent two-dimensional planes among the 18 two-dimensional planes are arranged to overlap each other. Therefore, the portion of the omnidirectional image mapped to the same region of the sphere 70 is present in the high-resolution images of adjacent two-dimensional planes.
Note that the information indicating the size of each two-dimensional plane need not be the lateral viewing angle and the longitudinal viewing angle but may instead be a diagonal viewing angle. As depicted in fig. 28, the diagonal viewing angle is the angle bod (or the angle aoc) formed between the lines connecting two diagonally opposite vertices b and d (or a and c) of the two-dimensional plane 325 to the center O of the sphere 70.
In the case where the information indicating the size of the two-dimensional plane is the lateral viewing angle and the longitudinal viewing angle, a person viewing the information grasps the size of the two-dimensional plane more easily than in the case where the information is the diagonal viewing angle. On the other hand, in the case where the information indicating the size of the two-dimensional plane is the diagonal viewing angle, the amount of information can be reduced because of the smaller number of parameters, compared with the case where the information is the lateral viewing angle and the longitudinal viewing angle.
(description of method of setting SRD of high resolution image)
Fig. 29 is an explanatory diagram of a method by which the MPD file generating section 256 of fig. 23 sets the SRD for each high-resolution image.
In the example of fig. 29, the perspective projection section 302-i (where i = 1, 2, ..., 18) performs perspective projection of the omnidirectional image onto the two-dimensional plane indicated by the i-th piece of two-dimensional plane information from the top of fig. 27, thereby generating a high-resolution image 331-i.
In this case, as depicted in fig. 29, the MPD file generating section 256 arranges the high-resolution images 331-1 to 331-18 on one virtual omnidirectional image 332 and, regarding the high-resolution images 331-1 to 331-18 as divided images obtained by dividing the virtual omnidirectional image 332, sets the SRDs of the high-resolution images 331-1 to 331-18. The high-resolution images are hereinafter collectively referred to as "high-resolution images 331" without particularly distinguishing the high-resolution images 331-1 to 331-18.
In the third embodiment, the size of the virtual omnidirectional image 332 is 8192 pixels × 5120 pixels, and the size of each high-resolution image 331 is 1024 pixels × 1024 pixels, as depicted in fig. 29.
(first example of MPD File)
Fig. 30 is a diagram showing a first example of an MPD file generated by the MPD file generating section 256 in fig. 23.
In the example of fig. 30, the section files of the low-resolution stream, of the high-resolution stream of each high-resolution image 331, and of the audio stream in the time range corresponding to each period element are each classified into one group. Thus, in the MPD file of fig. 30, the period element contains 20 adaptation set elements.
Since the first adaptation set element from the top, corresponding to the section files of the low-resolution stream, is the same as that of fig. 14, description thereof will be omitted.
The second to 19th adaptation set elements from the top are elements corresponding to the section files of the high-resolution streams of the high-resolution images 331. Each of these adaptation set elements is the same as the second adaptation set element of fig. 14 except for the omnidirectional image generation method, the information set to the value of the SRD, the presentation element, and the mapping information.
Specifically, the omnidirectional image corresponding to the second to 19th adaptation set elements is the virtual omnidirectional image 332. Accordingly, "cube-ex", indicating the method of generating the virtual omnidirectional image 332, is set as the omnidirectional image generation method to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"" possessed by each of the second to 19th adaptation set elements.
Note that in the third embodiment, the 18 two-dimensional planes include the six faces constituting a cube, and the virtual omnidirectional image 332 contains the omnidirectional image generated by cube mapping. Accordingly, the method of generating the virtual omnidirectional image 332 is defined herein as a cube mapping extension method, and the information indicating the cube mapping extension is described as "cube-ex"; however, the information indicating the method of generating the virtual omnidirectional image 332 is not limited to this information.
Further, the ID identifying the virtual omnidirectional image 332, which is the image before being divided into the high-resolution images 331, is set to source_id in the SRD owned by each of the second to 19th adaptation set elements. In the example of fig. 30, the ID is 0.
Further, the coordinates of the upper left position of the high-resolution image 331-1, corresponding to the second adaptation set element, on the virtual omnidirectional image 332 are (0, 2048). Therefore, 0 and 2048 are set for object_x and object_y, respectively, in the SRD owned by the second adaptation set element.
Likewise, 1024, 2048, 3072, 4096, 5120, 6144, 7168, and 7168 are set for object_x in the SRDs owned by the third to 19th adaptation set elements, respectively. Further, 1024, 2048, 3072, 2048, 0, 1024, 2048, 3072, 4096, 2048, 1024, 2048, 3072, 2048, and 3072 are set for object_y in the SRDs owned by the third to 19th adaptation set elements, respectively.
Further, the horizontal size and the vertical size of the high-resolution images 331 corresponding to the second to 19th adaptation set elements are both 1024 pixels. Therefore, 1024 is set for every object_width and object_height in the SRDs owned by the second to 19th adaptation set elements.
Further, the horizontal size of the virtual omnidirectional image 332 before being divided into the high-resolution images 331 corresponding to the second to 19th adaptation set elements is 8192 pixels, and its vertical size is 5120 pixels. Thus, 8192 is set for every total_width in the SRDs owned by the second to 19th adaptation set elements, and 5120 is set for every total_height therein.
Further, each of the high-resolution images 331 corresponding to the second to 19th adaptation set elements is an image obtained by dividing the virtual omnidirectional image 332 once. Therefore, 0 is set for spatial_set_id in the SRDs owned by the second to 19th adaptation set elements. In the example of fig. 30, the coordinates (X, Y) in the mapping information are (2554, 3572). In the case of fig. 30, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information in the value, and "B" is added to the top of the block storing the mapping information. Furthermore, the pieces of information in a block are separated by commas (,), and the blocks are separated by spaces.
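Putting the above together, the SRD value of each adaptation set element can be assembled as in the following illustrative Python sketch, assuming the comma-separated attribute order source_id, object_x, object_y, object_width, object_height, total_width, total_height, spatial_set_id.

```python
def srd_value(object_x, object_y):
    """Assemble the SRD value for one 1024 x 1024 tile of the 8192 x 5120
    virtual omnidirectional image 332 (attribute order is an assumption)."""
    return ",".join(map(str, [0, object_x, object_y,   # source_id 0
                              1024, 1024,              # tile size
                              8192, 5120,              # virtual image 332 size
                              0]))                     # spatial_set_id

# Second adaptation set element: "0,0,2048,1024,1024,8192,5120,0"
print(srd_value(0, 2048))
```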
Further, in the example of fig. 30, each of the second to 19th adaptation set elements from the top contains one presentation element. "equirectangular.mp4" is described as the BaseURL of the section file corresponding to the presentation element in the first adaptation set element, and "cube1.mp4" to "cube18.mp4" are respectively described as the BaseURLs of the section files corresponding to the presentation elements in the second to 19th adaptation set elements from the top.
Since the 20th adaptation set element from the top corresponds to the section file of the audio stream and is the same as the eighth adaptation set element from the top of fig. 14, description thereof will be omitted. Note that fig. 30 omits the description of the section information element.
(description of processing performed by the generating means)
Fig. 31 is a flowchart describing a file generation process executed by the generation apparatus 300 of fig. 23.
Since the processing from steps S121 to S124 of fig. 31 is similar to the processing from steps S61 to S64 of fig. 15, description thereof will be omitted.
In step S125, the setting section 301 sets the two-dimensional plane information corresponding to the 18 line-of-sight directions along the lines 321-1 to 321-18. The setting section 301 supplies each piece of two-dimensional plane information to the corresponding perspective projection section 302.
In step S126, each perspective projection section 302 maps the omnidirectional image 90 generated by the mapping processing section 24 onto the sphere 70 and performs perspective projection of the omnidirectional image 90 mapped onto the sphere 70 onto the two-dimensional plane indicated by the two-dimensional plane information, thereby generating the high-resolution image 331. The perspective projection unit 302 supplies the high-resolution image 331 to the encoder 303.
In step S127, each encoder 303 encodes the high-resolution image supplied from the perspective projection section 302 at one or more bit rates. The encoder 303 supplies the high-resolution stream of the two-dimensional plane generated as a result of the encoding to the section file generating section 304.
In step S128, the encoder 27 encodes the sound acquired from the microphone 11B of fig. 1 via the audio acquisition section 26 and generates an audio stream. The encoder 27 supplies the audio stream to the section file generating section 304.
In step S129, the section file generating section 304 archives the low resolution stream for each bit rate, the high resolution stream for the two-dimensional plane for each bit rate, and the audio stream in sections and generates a section file. The section file generating unit 304 supplies the section file to the uploading unit 30.
Since the processing from steps S130 to S134 is similar to the processing from steps S70 to S74 of fig. 15, description will be omitted.
(example of configuration of reproduction apparatus)
Fig. 32 is a block diagram showing an example of the configuration of a reproduction apparatus according to the third embodiment of the delivery system to which the present disclosure is applied.
In the configuration depicted in fig. 32, the same configuration as that of fig. 16 is denoted by the same reference numeral. Duplicate description will be omitted as appropriate.
The reproduction apparatus 340 of fig. 32 differs in configuration from the reproduction apparatus 270 of fig. 16 in that an MPD processing section 341, a mapping processing section 342, and a line-of-sight detecting section 343 are provided in place of the MPD processing section 271, the mapping processing section 275, and the line-of-sight detecting section 276. The reproduction apparatus 340 generates a display image from the low-resolution image and the high-resolution image of the two-dimensional plane corresponding to the viewer's line of sight.
Specifically, the MPD processing section 341 of the reproduction apparatus 340 analyzes the MPD file supplied from the MPD acquisition section 220. Similarly to the MPD processing section 271 of fig. 16, the MPD processing section 341 identifies the identification information of each adaptation set element and selects the adaptation set elements containing predetermined identification information. For example, in the case where the MPD file has the configuration of fig. 30, the MPD processing section 341 selects the first to 19th adaptation set elements from the top, which contain the identification information "1".
Further, the MPD processing section 341 selects, from among the selected adaptation set elements containing the predetermined identification information, the adaptation set element of the low-resolution image (in the example of fig. 30, the adaptation set element not including an SRD) as the adaptation set element of the low-resolution image to be reproduced. For example, in the case where the MPD file has the configuration of fig. 30, the MPD processing section 341 selects the first adaptation set element.
Further, the MPD processing section 341 selects, from among the selected adaptation set elements containing the predetermined identification information, the adaptation set elements of the high-resolution images (in the example of fig. 30, the adaptation set elements each including an SRD). For example, in the case where the MPD file has the configuration of fig. 30, the MPD processing section 341 selects the second to 19th adaptation set elements.
Then, based on the selected two-dimensional plane information (described in detail later) supplied from the line-of-sight detecting section 343 and on the SRDs, the MPD processing section 341 selects, from among the adaptation set elements of the high-resolution images, the adaptation set element of the selected two-dimensional plane indicated by the selected two-dimensional plane information as the adaptation set element of the high-resolution image to be reproduced.
Specifically, the MPD processing section 341 retains in advance correspondence information indicating the correspondence between each two-dimensional plane and the position, on the virtual omnidirectional image 332, of the high-resolution image of that two-dimensional plane. Based on the correspondence information, the MPD processing section 341 identifies the position on the virtual omnidirectional image 332 corresponding to the selected two-dimensional plane indicated by the selected two-dimensional plane information. The MPD processing section 341 selects the adaptation set element whose object_x and object_y indicate that position as the adaptation set element of the high-resolution image to be reproduced. Note that the selected two-dimensional plane information is information indicating the selected two-dimensional plane, i.e., the one two-dimensional plane among the 18 two-dimensional planes that corresponds to the viewer's line of sight.
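As an illustration of this selection, the following Python sketch maps the selected plane to its position on the virtual omnidirectional image 332 through retained correspondence information and picks the adaptation set whose SRD carries that position; the in-memory structures are assumptions for the example, not an actual API.

```python
def select_adaptation_set(selected_plane_index, adaptation_sets, correspondence):
    """Pick the adaptation set of the high-resolution image to reproduce.

    correspondence: assumed dict mapping plane index -> (object_x, object_y)
    adaptation_sets: assumed list of dicts with a parsed 'srd' entry."""
    x, y = correspondence[selected_plane_index]
    for aset in adaptation_sets:
        srd = aset.get("srd")
        if srd and srd["object_x"] == x and srd["object_y"] == y:
            return aset     # SRD position matches the selected plane
    return None
```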
The MPD processing section 341 acquires information such as a URL of a section file at a reproduction clock time from a presentation element in an adaptation set element of a low resolution image and a high resolution image to be reproduced and supplies the information to the section file acquisition section 272. When the adaptive set elements of the low-resolution image and the high-resolution image to be reproduced include mapping information, the MPD processing unit 341 supplies the mapping information to the mapping processing unit 342.
In the case where the MPD processing section 341 provides mapping information about the low-resolution image, the mapping processing section 342 sets a reference image in the low-resolution image at a reference position based on the mapping information, rotates the low-resolution image by a rotation angle λ, and maps the low-resolution image as a texture onto the faces 71 to 76 of the sphere 70. On the other hand, when the MPD processing unit 341 does not provide mapping information, the mapping processing unit 342 maps the low-resolution image onto the surfaces 71 to 76 of the sphere 70 as it is as a texture.
Note that the mapping processing section 342 may map only a part of the low-resolution image containing the region subjected to perspective projection onto the field of view of the viewer determined by the line-of-sight detecting section 343 without mapping the entire low-resolution image.
Further, the mapping processing section 342 sets the selected two-dimensional plane within the sphere 70 as a 3D model based on the selected two-dimensional plane information supplied from the line-of-sight detecting section 343. In the case where the MPD processing section 341 provides mapping information on a high resolution image, the mapping processing section 342 sets a reference image of the high resolution image at a reference position based on the mapping information, rotates the high resolution image by a rotation angle, and maps the high resolution image as a texture onto a selected two-dimensional plane set within the sphere 70. On the other hand, when the MPD processing unit 341 does not provide mapping information on a high-resolution image, the mapping processing unit 342 maps the high-resolution image as it is onto a selected two-dimensional plane provided in the sphere 70 as a texture.
Further, the mapping processing section 342 supplies the 3D model image in which the texture is mapped onto the sphere 70 and the selected two-dimensional plane to the drawing section 227.
The line-of-sight detecting section 343 determines the direction of the line of sight of the viewer in the coordinate system of the 3D model based on the detection result of the gyro sensor 15B supplied from the receiving section 228. Further, the line-of-sight detecting section 343 acquires a captured image of the mark 15A from the camera 14A and detects the viewing position based on the captured image.
The line-of-sight detecting section 343 sets two-dimensional plane information about 18 two-dimensional planes, similarly to the setting section 301 of fig. 23. The line-of-sight detecting section 343 determines, as the selected two-dimensional plane, one of the 18 two-dimensional planes whose normal line passing through the center thereof is closest to the line of sight of the viewer based on the viewing position and the line-of-sight direction in the coordinate system of the 3D model and the two-dimensional plane information. The line-of-sight detecting section 343 supplies the selected two-dimensional plane information to the MPD processing section 341 and the mapping processing section 342.
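The choice of the closest plane can be sketched as follows in Python; the axis conventions match the earlier plane_normal sketch and are assumptions for illustration.

```python
import numpy as np

def select_plane(view_dir, planes):
    """Return the index of the (azimuth, elevation) pair in `planes` (e.g.,
    the PLANES list above) whose center normal makes the smallest angle
    with the unit line-of-sight vector `view_dir`."""
    best, best_cos = 0, -2.0
    for i, (a_deg, e_deg) in enumerate(planes):
        a, e = np.radians(a_deg), np.radians(e_deg)
        normal = np.array([np.cos(e) * np.sin(a), np.sin(e), np.cos(e) * np.cos(a)])
        c = float(np.dot(view_dir, normal))   # cosine of angle to the line of sight
        if c > best_cos:
            best, best_cos = i, c
    return best
```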
Further, the sight line detection section 343 determines the field of view range of the viewer in the coordinate system of the 3D model based on the viewing position and the sight line direction in the coordinate system of the 3D model. The line-of-sight detecting section 343 supplies the field of view and the viewing position of the viewer to the drawing section 227.
(description of the mapping)
Fig. 33 is an explanatory diagram of mapping performed by the mapping processing unit 342 of fig. 32.
In the example of fig. 33, the viewing position is the center O of the sphere 70. In this case, the selected two-dimensional plane is the two-dimensional plane 363 whose azimuth angle, elevation angle, and rotation angle are closest to, respectively, the angle in the horizontal direction and the angle in the vertical direction of the line of sight 361 extending from the center O in the line-of-sight direction, and the rotation angle formed between the line of sight 361 and the line connecting the reference position and the center O; the normal passing through the center of the two-dimensional plane 363 extends from the center O in the direction 362 of the arrow.
Therefore, the mapping processing section 342 sets the two-dimensional plane 363 within the sphere 70 as a 3D model. Further, the mapping processing section 342 maps the low-resolution image 371 as a texture onto the sphere 70 serving as a 3D model and maps the high-resolution image 372 as a texture onto the two-dimensional plane 363 serving as a 3D model.
As described above, the two-dimensional plane 363 is provided within the sphere 70. Therefore, when perspective projection onto the field of view range is performed from a direction in which both the low-resolution image 371 and the high-resolution image 372 are present, the drawing section 227 can use the high-resolution image 372 in preference to the low-resolution image 371.
(processing performed by a reproduction apparatus)
Fig. 34 is a flowchart describing the reproduction processing performed by the reproduction apparatus 340 of fig. 32.
Since the processing from steps S141 to S148 of fig. 34 is similar to the processing from steps S81 to S88 of fig. 17, description will be omitted.
In step S149, the line-of-sight detecting section 343 determines, as the selected two-dimensional plane, one two-dimensional plane whose normal line passing through the center of the two-dimensional plane is closest to the line of sight of the viewer among the 18 two-dimensional planes, based on the viewing position and the line-of-sight direction in the coordinate system of the 3D model and the two-dimensional plane information. The line-of-sight detecting section 343 supplies the selected two-dimensional plane information to the MPD processing section 341 and the mapping processing section 342.
In step S150, the sight line detection section 343 determines the field of view range of the viewer in the coordinate system of the 3D model based on the viewing position and the sight line direction in the coordinate system of the 3D model. The line-of-sight detecting section 343 provides the drawing section 227 with the viewing range and the viewing position of the viewer.
In step S151, the MPD processing section 341 acquires the URL of the section file of the high-resolution stream to be reproduced from the MPD file. Specifically, the MPD processing section 341 acquires the URL of the section file at the reproduction clock time from the presentation element in the adaptation set element that is described in the MPD file, contains the identification information selected in step S143, and has an SRD whose value indicates the position, on the virtual omnidirectional image 332, of the high-resolution image 331 corresponding to the selected two-dimensional plane indicated by the selected two-dimensional plane information. The MPD processing section 341 supplies the acquired URL to the section file acquisition section 272.
Since the processing from steps S152 to S157 is similar to that of steps S91 to S96 of fig. 17, description will be omitted.
In step S158, the mapping processing section 342 sets the selected two-dimensional plane as a 3D model within the sphere 70 based on the selected two-dimensional plane information supplied from the line-of-sight detecting section 343.
In step S159, the MPD processing section 341 determines whether the adaptation set element of the high-resolution stream to be reproduced includes mapping information. In the case where it is determined in step S159 that the adaptive set element includes the mapping information, the MPD processing section 341 supplies the mapping information to the mapping processing section 342, and the process proceeds to step S160.
In step S160, the mapping processing section 342 maps the high-resolution image as a texture onto the selected two-dimensional plane set in step S158 based on the mapping information on the high-resolution stream. The mapping processing section 342 supplies the 3D model image in which the texture is mapped onto the sphere 70 and the selected two-dimensional plane to the drawing section 227, and the process proceeds to step S162.
On the other hand, in the case where it is determined in step S159 that the adaptation set element does not contain the mapping information, the process proceeds to step S161. In step S161, the mapping processing section 342 maps the high-resolution image as it is to the selected two-dimensional plane set in step S158 as a texture. The mapping processing section 342 supplies the 3D model image in which the texture is mapped onto the sphere 70 and the selected two-dimensional plane to the drawing section 227, and the process proceeds to step S162.
Since the processing in steps S162 and S163 is similar to that in steps S100 and S101 of fig. 17, description will be omitted.
Note that, in the above description, the generation device 300 generates the virtual omnidirectional image 332 provided with the high-resolution images 331, and the reproduction device 340 retains the correspondence information, thereby enabling the generation device 300 to notify the reproduction device 340 of the two-dimensional plane information about each high-resolution image 331 using the SRD. However, the generating device 300 may notify the reproducing device 340 of the two-dimensional plane information itself without using the SRD.
(second example of MPD File)
Fig. 35 is a diagram illustrating an example of an MPD file in this case.
The MPD file of fig. 35 is the same as the MPD file of fig. 30 except that the second to 19th adaptation set elements from the top include neither "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"", which indicates the omnidirectional image generation method by its value, nor an SRD, and instead describe "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"", which indicates the two-dimensional plane information by its value.
Specifically, the ID, azimuth angle, elevation angle, rotation angle, lateral viewing angle, and longitudinal viewing angle are set to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"". The ID is an ID identifying the 3D model onto which the omnidirectional image used to generate the high-resolution images is mapped. The azimuth angle is the azimuth angle in the two-dimensional plane information, and likewise for the other parameters.
For example, the azimuth angle, elevation angle, rotation angle, lateral viewing angle, and longitudinal viewing angle in the two-dimensional plane information corresponding to the high-resolution image 331-1, which corresponds to the second adaptation set element from the top, are -135, 0, 0, 90, and 90, respectively, as depicted in fig. 27. Thus, the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"" owned by the second adaptation set element from the top is "I0 D-135,0,0,90,90".
Similarly, the values of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"" owned by the third to 19th adaptation set elements from the top are "I0 D-90,45,0,90,90", "I0 D-90,0,0,90,90", "I0 D-90,-45,0,90,90", "I0 D-45,0,0,90,90", "I0 D0,90,0,90,90", "I0 D0,45,0,90,90", "I0 D0,0,0,90,90", "I0 D0,-45,0,90,90", "I0 D0,-90,0,90,90", "I0 D45,0,0,90,90", "I0 D90,45,0,90,90", "I0 D90,0,0,90,90", "I0 D90,-45,0,90,90", "I0 D135,0,0,90,90", "I0 D180,45,0,90,90", "I0 D180,0,0,90,90", and "I0 D180,-45,0,90,90", respectively. In other words, in the case of fig. 35, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information in the value, and "D" is added to the top of the block storing the two-dimensional plane information. Furthermore, the pieces of information in a block are separated by commas (,), and the blocks are separated by spaces.
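A value of this form can be split into its tagged blocks as in the following illustrative Python sketch; it assumes purely numeric block contents and therefore does not cover values that carry type information such as "Byxz-euler".

```python
def parse_portion_value(value):
    """Split a space-separated value into blocks keyed by the leading
    letter of each block ("I" for identification information, "D" for
    the two-dimensional plane parameters)."""
    out = {}
    for block in value.split():
        tag, body = block[0], block[1:]
        out[tag] = [float(v) for v in body.split(",")]
    return out

info = parse_portion_value("I0 D-135,0,0,90,90")
# info["I"] -> [0.0]; info["D"] -> azimuth, elevation, rotation angle,
# lateral viewing angle, longitudinal viewing angle
```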
Further, in the case where the two-dimensional plane information itself is described in the MPD file, the mapping information cannot be represented by the coordinates (X, Y) of the position of the reference image within the omnidirectional image and the rotation angle λ. Therefore, in the example of fig. 35, the mapping information is represented by Euler angles. In the example of fig. 35, the Euler angles (α, β, γ) of the low-resolution image are (0, 0, 0), and the Euler angles (α, β, γ) of each high-resolution image 331 are (1, 0, 0). Further, in the example of fig. 35, the type information is also described.
Thus, in the MPD file of fig. 35, the value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"" owned by the first adaptation set element from the top is "I1 Byxz-euler,0,0,0". Further, the value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:original_source_id:2016"" owned by each of the second to 19th adaptation set elements from the top is "I1 Byxz-euler,1,0,0". In other words, in this case, the type information is stored at the top of the block storing the mapping information, and "B" is added to the top of that block.
As described so far, in the case where the two-dimensional plane information itself is described in the MPD file, the generation apparatus 300 need not arrange the high-resolution images 331 on the virtual omnidirectional image 332, and the reproduction apparatus 340 need not retain the correspondence information in advance. Therefore, the degree of freedom in setting the two-dimensional planes increases.
Therefore, the two-dimensional planes can be set as follows, for example. For the forward direction, which tends to draw more attention, the lateral and longitudinal viewing angles of each two-dimensional plane are set to 90 degrees, and the azimuth angles, elevation angles, and rotation angles are set such that the angle between adjacent two-dimensional planes is 45 degrees. For the upward direction and the rearward direction, which tend to draw less attention, the lateral and longitudinal viewing angles of each two-dimensional plane are set to 90 degrees or more. For the downward direction, which tends to draw the least attention, no two-dimensional plane is provided. Moreover, the directions that tend to draw more attention may be changed according to the content of the captured image.
In the third embodiment, each high-resolution image is generated by perspective projection onto a two-dimensional plane. Alternatively, the high-resolution images may be generated by stereographic projection, equidistant projection, or the like. In this case, information indicating the scheme of projection onto the two-dimensional plane used when generating the high-resolution images is set in the MPD file.
Specifically, "transmissive" in "supplementalProperty schemeIdUri of FIG. 35" http:// xmlns. sony. net/metadata/mpeg/dash/360 VR/transmissive-Port/2016 "is changed, for example, to information indicating the type of projection rather than the Perspective projection. Alternatively, "supplementalProperty schemeeIdUri ═ http:// xmlns.sony.net/metadata/mpeg/dash/360 VR/Port/2016" ", which indicates information representing a drawing plane and a projection type by a value, is described in the MPD file as an alternative to" supplementalProperty schemeIdUri ═ http:// xmlns.sony.net/metadata/mpeg/dash/360 VR/Perspection-Port/2016 "", of FIG. 35. Although the information indicating the projection type is indicated by schemeIdUri in the former case, the information is indicated by the value in the latter case.
In the latter case, the ID, the information indicating the projection type, the azimuth angle, the elevation angle, the rotation angle, the lateral viewing angle, and the longitudinal viewing angle are set to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Portion/2016"".
The information indicating the projection type is, for example, "Perspective" for perspective projection, "stereographic" for stereographic projection, and "equidistant" for equidistant projection.
Further, since the 18 two-dimensional planes include the six faces constituting the cube in the third embodiment, the 18 high-resolution images include the images 51 to 56 of the six faces 41 to 46. Accordingly, the MPD file may be set in such a manner that the reproduction apparatus 270 of the second embodiment can reproduce the high-resolution images that are the images 51 to 56 of the six faces 41 to 46 among the 18 high-resolution images (hereinafter referred to as "common images").
(description of two-dimensional plane information on common image)
Fig. 36 is a diagram showing two-dimensional plane information on six common images in the two-dimensional plane information of fig. 27.
As shown in fig. 36, the azimuth and elevation angles in the two-dimensional plane information on the six common images are, among the two-dimensional plane information of fig. 27, (-90, 0), (0, 90), (0, 0), (0, -90), (90, 0), and (180, 0), respectively. In other words, the six common images are the high-resolution images 331-3, 331-6, 331-8, 331-10, 331-13, and 331-17.
(third example of MPD File)
Fig. 37 is a diagram illustrating an example of a part of an MPD file set so that the reproduction apparatus 270 of fig. 16 can use the common images as high-resolution images obtained by dividing the omnidirectional image generated by cube mapping.
Note that fig. 37 shows, of the MPD file, only the adaptation set elements corresponding to the high-resolution image 331-3, which is a common image.
In the MPD file of fig. 37, the section files of the high-resolution stream of the high-resolution image 331-3 are classified into two groups. Thus, in the MPD file of fig. 37, two adaptation set elements are described as corresponding to the high-resolution image 331-3.
In the first adaptation set element, various information is described with the high-resolution image 331-3 regarded as the image 52 of the face 42. Specifically, the first adaptation set element is the same as the third adaptation set element from the top of fig. 14 except for the BaseURL described in the presentation element.
In other words, the high-resolution image corresponding to the first adaptation set element is the high-resolution image 331-3 among the 18 high-resolution images 331-1 to 331-18. Thus, "cube3.mp4", described in the presentation element within the fourth adaptation set element from the top of fig. 30, is described as the BaseURL of the presentation element in the first adaptation set element.
Further, in the second adaptation set element, various information is described with the high-resolution image 331-3 regarded as the high-resolution image of the two-dimensional plane having the line 321-3 as the normal passing through its center. Specifically, the second adaptation set element is the same as the fourth adaptation set element from the top of fig. 30 except for the value of source_id set to the SRD.
In other words, while the image before division is the omnidirectional image 50 when the high-resolution image 331-3 is regarded as the image 52 of the face 42, the image before division is the virtual omnidirectional image 332 when the high-resolution image 331-3 is regarded as the high-resolution image of the two-dimensional plane having the line 321-3 as the normal passing through its center. Therefore, "source_id" of the second adaptation set element is set to 1, which differs from the 0 set to "source_id" of the first adaptation set element.
As shown in fig. 38, the second adaptation set element may be the same as the fourth adaptation set element from the top of fig. 35. Further, although not shown, the adaptation set elements of the high-resolution images 331-6, 331-8, 331-10, 331-13, and 331-17 are described similarly to the adaptation set elements of the high-resolution image 331-3.
In the case where the generation apparatus 300 generates the MPD file of fig. 37 or 38, the reproduction apparatus 270 of the second embodiment extracts only the adaptation set elements having "cube" as the information indicating the omnidirectional image generation method. Further, the reproduction apparatus 270 selects one face corresponding to the line of sight from the six faces 41 to 46 corresponding to the extracted adaptation set elements and generates a display image using the high-resolution image of the selected face and the low-resolution image.
On the other hand, in the case where the generation apparatus 300 generates the MPD file of fig. 37, the reproduction apparatus 340 of the third embodiment extracts only the adaptation set elements having "cube-ex" as the information indicating the omnidirectional image generation method. Further, in the case where the generation apparatus 300 generates the MPD file of fig. 38, the reproduction apparatus 340 extracts only the adaptation set elements having "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"". Then, the reproduction apparatus 340 selects one plane corresponding to the line of sight from the 18 two-dimensional planes corresponding to the extracted adaptation set elements and generates a display image using the high-resolution image of the selected plane and the low-resolution image.
Note that in the third embodiment, capturing a captured image in the shooting direction, inclination, and angle of view corresponding to two-dimensional plane information makes it possible to use the captured image itself as a high-resolution image. In this case, the generation apparatus 300 does not need to perform perspective projection.
Further, although the low-resolution image and the high-resolution image are generated from the same omnidirectional image in the third embodiment, the low-resolution image and the high-resolution image may be generated using different omnidirectional image generation methods. Further, in the case where the low-resolution image and the high-resolution image are generated from the same omnidirectional image, it is not necessary to describe the mapping information.
< fourth embodiment >
(example of configuration of two-dimensional plane information)
The configuration of the fourth embodiment of the delivery system to which the present disclosure is applied is the same as that of the third embodiment except for the configuration of the two-dimensional plane information. Therefore, only two-dimensional plane information is described below.
Fig. 39 is a diagram showing a configuration example of drawing face information indicating two-dimensional plane information in the fourth embodiment.
The configuration of the drawing face information of fig. 39 differs from the configuration of the two-dimensional plane information of fig. 24 in that the drawing face information newly contains FOV_flag, an entire lateral viewing angle, an entire longitudinal viewing angle, and spatial_set_id.
The FOV_flag (area type information) is a flag indicating whether the drawing area (field of view) corresponding to the information containing the FOV_flag is a sphere or a two-dimensional plane. In the case where the drawing face information indicates two-dimensional plane information, the FOV_flag is 1, indicating that the drawing face is a two-dimensional plane.
Information whose drawing face is a sphere can be configured similarly to the two-dimensional plane information, for example in the case where a part of an omnidirectional image generated using equirectangular projection is generated as a high-resolution image. In other words, changing the FOV_flag makes it easy to change the information containing the FOV_flag between two-dimensional plane information and information indicating a sphere as the drawing face. The FOV_flag contained in information indicating a sphere as the drawing face is 0, indicating that the drawing face is a sphere.
From the FOV_flag, the reproduction apparatus 340 can recognize whether the high-resolution image corresponding to the FOV_flag is an image whose drawing face is a two-dimensional plane or an image whose drawing face is a sphere.
In the case where the FOV_flag is 1, the entire lateral viewing angle and the entire longitudinal viewing angle are, respectively, the viewing angle in the lateral direction and the viewing angle in the longitudinal direction of the whole of the two-dimensional planes belonging to the group (described in detail later) indicated by the two-dimensional plane information containing them. The same applies to the case where the FOV_flag is 0. The case where the FOV_flag is 1 is described below; however, unless otherwise specified, the description also holds for the case where the FOV_flag is 0.
Using the entire lateral viewing angle and the entire longitudinal viewing angle, the reproduction apparatus 340 can identify the field of view range that can be represented by the high-resolution images corresponding to a group merely by analyzing the two-dimensional plane information on one two-dimensional plane belonging to that group. Therefore, an unnecessary search of the two-dimensional plane information for a field of view range that cannot be represented by the high-resolution images can be prevented. In the case where a three-dimensional object such as a sphere or a cube is formed by the two-dimensional planes belonging to a group, the entire lateral viewing angle is 360 degrees and the entire longitudinal viewing angle is 180 degrees.
The spatial_set_id is an ID unique to a group and indicates the group to which the two-dimensional plane indicated by the two-dimensional plane information containing the spatial_set_id belongs. The two-dimensional planes are grouped by, for example, lateral viewing angle, longitudinal viewing angle, type of two-dimensional plane, or resolution.
In the case where the two-dimensional planes are grouped by lateral viewing angle, they are grouped by the same lateral viewing angle, for example. Since a display device such as the head mounted display 15 has a different display viewing angle depending on its type, the lateral viewing angle of a two-dimensional plane suitable for display differs depending on the type of the display device. Therefore, it is desirable for the reproduction apparatus 340 to be able to easily recognize the two-dimensional plane information on a lateral viewing angle suitable for display on the head mounted display 15.
In the case where the two-dimensional planes are grouped by the same lateral viewing angle, the reproduction apparatus 340 can easily recognize the two-dimensional plane information on the group corresponding to a lateral viewing angle suitable for display on the head mounted display 15 as the two-dimensional plane information suitable for that display. The same applies to the longitudinal viewing angle.
Further, in the case where the two-dimensional planes are grouped by the type of two-dimensional plane, they are grouped by, for example, whether each two-dimensional plane is a drawing face of an omnidirectional image generated by cube mapping and forms one of the six faces of the cube. In this case, among the 18 two-dimensional planes illustrated in fig. 26, the six planes forming the cube are classified into one group, and the other planes are classified into another group.
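One of the grouping criteria above, grouping by viewing angles, can be sketched as follows (Python; the dictionary representation of a plane is an assumption for the example).

```python
def assign_spatial_set_ids(planes):
    """Assign one spatial_set_id per (lateral, longitudinal) viewing-angle
    pair. Each plane is an assumed dict with 'lateral' and 'longitudinal'
    keys; planes sharing both angles end up in the same group."""
    ids = {}
    for plane in planes:
        key = (plane["lateral"], plane["longitudinal"])
        plane["spatial_set_id"] = ids.setdefault(key, len(ids))
    return planes
```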
(description of FOV_flag)
Fig. 40 is an explanatory diagram of the drawing face in the case where the FOV_flag is 0.
In the case where the FOV_flag is 0, the high-resolution image is a captured image mapped onto a partial region 391 of the surface of the sphere 70 as shown in A of fig. 40, that is, a partial image 392 of the omnidirectional image 90 generated by using equirectangular projection as shown in B of fig. 40.
In this case, the lateral viewing angle of the drawing face of the partial image 392, that is, of the region 391, is the angle poq subtended by the region 391 in the horizontal plane containing the center O of the sphere 70, and its longitudinal viewing angle is the angle aob subtended by the region 391 in the vertical plane that contains the center O and is perpendicular to the horizontal plane. In the example of fig. 40, the angle poq and the angle aob are each 30 degrees.
On the other hand, in the case where FOV _ flag is 1, the drawing surface of the high resolution image is a two-dimensional plane. Thus, the lateral viewing angle is the angle qos depicted in fig. 25 and the longitudinal viewing angle is the angle por depicted in fig. 25.
(first example of MPD File)
Fig. 41 is a diagram illustrating a first example of an MPD file describing the drawing face information of fig. 39.
The MPD file of fig. 41 differs from the MPD file of fig. 35 in that, in the case where the projection type is perspective projection, the spatial_set_id, the entire lateral viewing angle, and the entire longitudinal viewing angle are newly set to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"", the schemeIdUri that indicates the two-dimensional plane information by its value.
Specifically, the ID, spatial_set_id, azimuth angle, elevation angle, rotation angle, lateral viewing angle, longitudinal viewing angle, entire lateral viewing angle, and entire longitudinal viewing angle are set to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"" of fig. 41.
Further, in the example of fig. 41, the two-dimensional planes corresponding to the high-resolution images 331-1 to 331-18 are all classified into the same group, and the ID "1" is assigned to that group. Further, the 18 two-dimensional planes corresponding to the high-resolution images 331-1 to 331-18 include the six faces 41 to 46 forming the cube 40. Thus, the entire lateral viewing angle and the entire longitudinal viewing angle of the whole of the 18 two-dimensional planes are 360 degrees and 180 degrees, respectively.
Thus, the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"" owned by the second adaptation set element from the top is "I0,1 D-135,0,0,90,90,360,180".
Furthermore, the values of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"" owned by the third to 19th adaptation set elements from the top are "I0,1 D-90,45,0,90,90,360,180", "I0,1 D-90,0,0,90,90,360,180", "I0,1 D-90,-45,0,90,90,360,180", "I0,1 D-45,0,0,90,90,360,180", "I0,1 D0,90,0,90,90,360,180", "I0,1 D0,45,0,90,90,360,180", "I0,1 D0,0,0,90,90,360,180", "I0,1 D0,-45,0,90,90,360,180", "I0,1 D0,-90,0,90,90,360,180", "I0,1 D45,0,0,90,90,360,180", "I0,1 D90,45,0,90,90,360,180", "I0,1 D90,0,0,90,90,360,180", "I0,1 D90,-45,0,90,90,360,180", "I0,1 D135,0,0,90,90,360,180", "I0,1 D180,45,0,90,90,360,180", "I0,1 D180,0,0,90,90,360,180", and "I0,1 D180,-45,0,90,90,360,180", respectively. In other words, in the case of fig. 41, similarly to the case of fig. 7, "I" is added to the top of the block storing the identification information in the value, and "D" is added to the top of the block storing the drawing face information. Furthermore, the pieces of information in a block are separated by commas (,), and the blocks are separated by spaces.
Note that "supplementalProperty schemeIdUri ═ http:// xmlns. sony. net/metadata/mpeg/dash/360 VR/Perspective-Port/2016" is a scheme eIdUri in the case where the projection type is a Perspective projection; therefore, the drawing plane is always a two-dimensional plane in the case of describing the schemeIdUri. Therefore, although the FOV _ flag is not set to the value in the example of fig. 41, the FOV _ flag may be set to the value.
Further, similarly to the third embodiment, a scheme other than perspective projection may be used as the projection type. In this case, "Perspective" in "http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016" is changed to other information indicating the projection type (e.g., "stereographic" or "equidistant").
Note that the projection type may instead be indicated by the value. In this case, "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Portion/2016"", which indicates the information representing the drawing face and the projection type by its value, is described in the MPD file in place of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"". Further, the ID, spatial_set_id, information indicating the projection type, FOV_flag, azimuth angle, elevation angle, rotation angle, lateral viewing angle, longitudinal viewing angle, entire lateral viewing angle, and entire longitudinal viewing angle are set to its value.
Thus, in the case where the projection type is, for example, perspective projection, the second adaptation set element from the top of fig. 41 possesses "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Portion/2016"" in place of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"", and its value is "I0,1 Pperspective D1,-135,0,0,90,90,360,180". Note that in this case, "P" is added to the top of the block storing the projection type.
As described above, in the case where the projection type is indicated by the value, an extension of the projection type can be handled merely by changing the value. Therefore, extension of the projection type is handled more easily than in the case where the projection type is indicated by the schemeIdUri. Further, in the case where the projection type is indicated by the value, the schemeIdUri indicating that the information on the drawing face is indicated by the value is common to all projection types; therefore, whether the drawing face is a sphere or a two-dimensional plane cannot be determined from the schemeIdUri alone. Therefore, in this case, the FOV_flag is set to the value.
Further, in the fourth embodiment, similarly to the third embodiment, the MPD file may be set in such a manner that the reproduction device 270 may reproduce a common image.
(second example of MPD File)
Fig. 42 is a diagram showing an example of a part of an MPD file in this case.
The MPD file of fig. 42 differs from the MPD file of fig. 38 in that the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"" in the second adaptation set element is the same as that in the fourth adaptation set element from the top of fig. 41.
In addition, in the fourth embodiment, the drawing face information may be described in the MPD file together with the omnidirectional image generation method and the SRD information.
Fig. 43 is a diagram showing an example of a part of an MPD file in this case. In fig. 43, the omnidirectional image generation method is set to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/coordinates/2015"" in each adaptation set element.
Further, the ID, spatial_set_id, FOV_flag, azimuth angle, elevation angle, rotation angle, lateral viewing angle, longitudinal viewing angle, entire lateral viewing angle, and entire longitudinal viewing angle are set to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Portion/2016"". Since the projection type can be determined from the omnidirectional image generation method, the projection type is not set; however, it may be set.
Further, the SRD information is set to the value of "SupplementalProperty schemeIdUri="urn:mpeg:dash:srd:2014"".
Note that in the case where the drawing face information is described together with the SRD information in each adaptation set element, the ID, spatial_set_id, entire lateral viewing angle, and entire longitudinal viewing angle in the drawing face information described above may be omitted. Further, in the case where the drawing face information described in each adaptation set element is information containing the mapping information, the description of the mapping information may be omitted.
Fig. 44 is a perspective view showing an example of a 3D model of a cube stream set by the MPD file depicted in fig. 43. In this case, the captured image is mapped onto the six faces of the cube (divisions 1 to 6). The SRD information on the unfolding of these faces is, for example, as depicted in fig. 45. Further, the drawing face information is, for example, as depicted in fig. 46.
Further, in the fourth embodiment, distance information indicating the radius of the sphere 70 as a 3D model of an omnidirectional image in meters may be described in an MPD file.
(third example of MPD File)
Fig. 47 is a diagram showing a part of an MPD file in this case.
The MPD file of fig. 47 differs from the MPD file of fig. 41 in that "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/sphere_radius/2016"", which indicates the distance information by its value, is newly described in each adaptation set element.
In the example of fig. 47, the radius of the sphere 70 is ten meters. Therefore, 10 is set to the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/sphere_radius/2016"" in each of the second to 19th adaptation set elements from the top of fig. 47.
Note that the distance information may contain not only information indicating the radius of the sphere 70 itself but also "inf", indicating that the radius of the sphere 70 is infinite, or "und", indicating that the radius of the sphere 70 is unknown. In the case where the distance information is "und", the radius of the sphere 70 is estimated by the reproduction apparatus 340. Further, the unit of the distance information may be other than meters.
In the first to third embodiments, similarly to the fourth embodiment, the radius of the sphere 70 may be described in an MPD file.
As described above, describing the radius of the sphere 70 in the MPD file allows the reproduction apparatus 14(270, 340) to provide a display image with an appropriate sense of depth to the viewer.
In other words, a display device such as the head mounted display 15 can display separate images for the two eyes. The reproduction apparatus 14 (270, 340) therefore generates a display image for the left eye and a display image for the right eye by, for example, shifting the display image to the left and to the right by a predetermined distance, and displays them on the head mounted display 15, whereby a sense of depth of the display image can be produced. In the case of a large displacement between the display image for the left eye and the display image for the right eye, the viewer perceives an object in the display image as being on the near side, and in the case of a small displacement, perceives the object as being on the far side.
However, the reproduction apparatus 14 (270, 340) cannot otherwise determine an appropriate amount of displacement between the display image for the left eye and the display image for the right eye. Thus, the generation apparatus 12 (250, 300) describes the radius of the sphere 70 in the MPD file. The reproduction apparatus 14 (270, 340) can thereby calculate an appropriate amount of displacement between the display image for the left eye and the display image for the right eye based on the radius of the sphere 70. As a result, a display image with an appropriate sense of depth can be provided to the viewer.
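Purely as an illustration of this use of the radius, the following Python sketch derives a displacement that shrinks as the radius grows; the inverse-distance model and all constants are assumptions, not values from the present disclosure.

```python
def stereo_shift_pixels(sphere_radius_m, eye_distance_m=0.064,
                        pixels_per_meter_at_screen=3000):
    """Derive a left/right display-image displacement from the 3D model
    radius read out of the MPD file. A radius of "und" would first have
    to be replaced by an estimated value, as noted in the text."""
    if sphere_radius_m == float("inf"):
        return 0.0                                   # objects at infinity
    disparity_m = eye_distance_m / sphere_radius_m   # simple inverse-distance model
    return disparity_m * pixels_per_meter_at_screen
```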
Although both the method of delimiting the values within a value with spaces and alphabetic prefixes and the method of delimiting them with commas are used in the first to fourth embodiments, only the method of delimiting with commas may be used.
In this case, for example, "ID, projection type, FOV_flag, azimuth angle, elevation angle, rotation angle, lateral view angle, longitudinal view angle, entire lateral view angle, entire longitudinal view angle, spatial_set_id" is set as the value of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Portion/2016"".
Thus, in the case where the projection type is, for example, perspective projection, the second adaptation set element from the top of fig. 41 has, in place of "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Perspective-Portion/2016"", a "SupplementalProperty schemeIdUri="http://xmlns.sony.net/metadata/mpeg/dash/360VR/Portion/2016"" whose value is "0, perspective, 1, -135, 0, 0, 90, 90, 360, 180, 1".
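A minimal sketch of parsing such a comma-delimited value in Python; the function name, field names, and numeric conversions are illustrative assumptions based on the field order given above:

```python
FIELDS = ["id", "projection", "FOV_flag", "object_yaw", "object_pitch",
          "object_roll", "object_width", "object_height",
          "total_width", "total_height", "spatial_set_id"]

def parse_portion_value(value: str) -> dict:
    """Split the comma-delimited value into named fields; every field
    except the projection type is numeric."""
    parts = [p.strip() for p in value.split(",")]
    info = dict(zip(FIELDS, parts))
    for key, raw in info.items():
        if key != "projection":
            info[key] = float(raw)
    return info

print(parse_portion_value("0, perspective, 1, -135, 0, 0, 90, 90, 360, 180, 1"))
```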
< fifth embodiment >
< example of configuration of generating apparatus >
The fifth embodiment of the delivery system to which the present disclosure is applied is different from the fourth embodiment in that: various information on images (low-resolution images, high-resolution images) set for the MPD file in the fourth embodiment is set for the section file. Therefore, the description of the processes other than the processes performed by the generation apparatus and the reproduction apparatus is appropriately omitted hereinafter.
Fig. 48 is a block diagram showing an example of the configuration of the fifth embodiment of the generating apparatus to which the present disclosure is applied.
In the configuration shown in fig. 48, the same configuration as that in fig. 23 is denoted by the same reference numeral. Duplicate description will be omitted where appropriate.
The generation apparatus 400 of fig. 48 differs in configuration from the generation apparatus 300 of fig. 23 in that the section file generation section 401 and the MPD file generation section 402 are provided in place of the section file generation section 304 and the MPD file generation section 256.
The section file generation section 401 archives the low-resolution stream, the high-resolution stream for each two-dimensional plane, and the audio stream for each bit rate and each section according to the MP4 file format, and generates section files.
Specifically, the section file generating section 401 arranges, as samples, the low-resolution stream, the high-resolution stream for each two-dimensional plane, and the audio stream in each section file in predetermined time units, for each bit rate and each section. Further, the section file generating section 401 (setting section) sets, for the section file, various kinds of information about the images corresponding to the samples arranged in the section file. The section file generating section 401 supplies the section files to the uploading section 30.
The MPD file generating unit 402 generates an MPD file. Unlike the fourth embodiment, various information about images is not set for the MPD file. The MPD file generating section 402 supplies the MPD file to the uploading section 30.
In the fifth embodiment, it is assumed that the image coding scheme is HEVC (high efficiency video coding).
(description of setting method of various information about image)
Fig. 49 is an explanatory diagram of a method of setting various information on an image by the section file generating unit 401 in fig. 48.
The section file is a file in the MP4 file format and has the box structure shown in fig. 49. In fig. 49, each group of four letters represents the name of a box, and the lateral position of the name represents the layer of the box having that name; boxes on lower layers appear further to the right.
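For reference, here is a minimal ISO BMFF box walker in Python that prints this kind of layered layout; the container set is a common subset, not an exhaustive list:

```python
import struct

def walk_boxes(buf: bytes, offset: int = 0, end: int | None = None,
               depth: int = 0) -> None:
    """Print the box tree of an ISO BMFF (MP4) buffer, one box name per
    line, indented by layer like fig. 49."""
    end = len(buf) if end is None else end
    # Containers whose payload is itself a sequence of boxes.
    containers = {b"moov", b"trak", b"mdia", b"minf", b"stbl",
                  b"moof", b"traf", b"tref"}
    while offset + 8 <= end:
        size, name = struct.unpack_from(">I4s", buf, offset)
        header = 8
        if size == 1:    # 64-bit largesize follows the type field
            size = struct.unpack_from(">Q", buf, offset + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of the enclosing scope
            size = end - offset
        print("    " * depth + name.decode("latin-1"))
        if name in containers:
            walk_boxes(buf, offset + header, offset + size, depth + 1)
        offset += size
```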
As shown in fig. 49, examples of methods of setting various information about images by the section file generating section 401 of fig. 48 include a first method to a fifth method.
Specifically, in a case where the various information about an image does not change within a track, the various information about the image is set in track units by any one of the first to third methods.
With the first method, the various information about the image is set in the hvcC box (HEVC configuration box), which is the configuration box of HEVC as the image coding scheme, on a layer lower than the trak box provided for each track on a layer lower than the moov box.
With the second method, the various information about the image is set in the visual sample entry (hev1 box) of the HEVC image contained in the stsd box on a layer lower than the trak box.
With the third method, the various information about the image is set in the schi box on a layer lower than the rinf box (restricted scheme information box), which stores information necessary for processing after decoding, on a layer lower than the trak box.
Further, in a case where at least part of the various information about the image changes within a track, the various information about the image is set by the fourth method or the fifth method.
With the fourth method, at least the information that changes within the track, among the various information about the image, is set in sample units by a method called a sample group, which uses an sbgp box (SampleToGroupBox) and an sgpd box (SampleGroupDescriptionBox).
With the fifth method, at least the information that changes within the track, among the various information about the image, is set in sample units as timed metadata in the samples of a track allocated to the various information about the image.
(first method)
Fig. 50 is a diagram showing an example of the configuration of the hvcC box in the case where the section file generating section 401 of fig. 48 sets the various information about the image by the first method.
As shown in fig. 50, in the case where the section file generating section 401 sets the various information about the image by the first method, the hvcC box is extended to include a script box (spherical coordinate region information box) containing the various information about the image in track units.
In the fifth embodiment, the hvcC box is extended because the image coding scheme is HEVC. In the case where the image coding scheme is a scheme other than HEVC, however, the configuration box of that coding scheme, rather than the hvcC box, is extended to include the script box.
Fig. 51 is a diagram showing an example of the configuration of a script box.
As shown in fig. 51, the various information about the image is set in the script box. Specifically, information indicating the type of projection onto the two-dimensional plane used when generating the high-resolution image is set as projection_type.
An ID serving as identification information on the captured image used to generate the omnidirectional image is set as id, and FOV_flag is set. Of the two-dimensional plane information, the azimuth angle is set as object_yaw, the elevation angle as object_pitch, the rotation angle as object_roll, the lateral view angle as object_width, and the longitudinal view angle as object_height.
Further, the entire lateral view angle is set as total_width and the entire longitudinal view angle as total_height. Note that total_width may be omitted in the case where the entire lateral view angle is 360 degrees, and total_height may be omitted in the case where the entire longitudinal view angle is 180 degrees. Further, the distance information is set as sphere_radius, and spatial_set_id is set. In the case where spatial_set_id is 0, spatial_set_id may be omitted.
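As an illustrative sketch, the fields above could be carried in a box serialized as follows. The 'scri' four-character code and the binary field layout are assumptions for illustration; the text names the fields but not their encoding:

```python
import struct
from dataclasses import dataclass

@dataclass
class SphericalCoordinateRegionInfo:
    projection_type: int
    id: int
    fov_flag: int
    object_yaw: float
    object_pitch: float
    object_roll: float
    object_width: float
    object_height: float
    total_width: float = 360.0   # may be omitted when 360 degrees
    total_height: float = 180.0  # may be omitted when 180 degrees
    sphere_radius: float = 10.0
    spatial_set_id: int = 0      # may be omitted when 0

    def to_box(self) -> bytes:
        """Serialize as an MP4 box; the 'scri' 4CC and field order are
        hypothetical, chosen only to mirror the fields of fig. 51."""
        payload = struct.pack(">BIB8fI", self.projection_type, self.id,
                              self.fov_flag, self.object_yaw,
                              self.object_pitch, self.object_roll,
                              self.object_width, self.object_height,
                              self.total_width, self.total_height,
                              self.sphere_radius, self.spatial_set_id)
        return struct.pack(">I4s", 8 + len(payload), b"scri") + payload
```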
Note that a low-resolution image and high-resolution images whose tracks are set so as to correspond to script boxes in which the same id is set may be mapped onto the same 3D model.
Further, even in the case where different tracks are assigned to a plurality of high-resolution images belonging to the group indicated by the same spatial_set_id, the entire lateral view angle and the entire longitudinal view angle are set as total_width and total_height. In other words, in this case, the lateral view angle and the longitudinal view angle of the entire two-dimensional plane corresponding to the plurality of high-resolution images that are assigned to the plurality of tracks and belong to the group indicated by the same spatial_set_id are set as total_width and total_height.
In the case where a plurality of tracks are assigned to a plurality of images (the low-resolution image and the high-resolution images) having the same id as the identification information on the captured image used to generate the omnidirectional image, the section file generating section 401 may group the tracks having the same id without setting id in the script box.
In this case, as shown in fig. 52, the tracks whose id, i.e., the identification information on the captured image used to generate the omnidirectional image, is the same are grouped using a tref box (track reference box) on a layer lower than the trak box.
Specifically, "spid", indicating that the id serving as the identification information on the captured image used to generate the omnidirectional image is the same, is set as reference_type, which indicates the type of reference relationship, in the track reference type box indicating the reference relationship between tracks contained in the tref box. Further, in the track reference type box whose reference_type is "spid", the IDs of the track corresponding to the track reference type box and of the other tracks having the same id as the identification information on the captured image used to generate the omnidirectional image are set as track_IDs, i.e., the track IDs of the tracks to be referred to.
Similarly, in the case where a plurality of tracks are assigned to a plurality of high-resolution images whose spatial_set_id is the same, the section file generating section 401 may group the tracks whose spatial_set_id is the same without setting spatial_set_id in the script box. In this case, reference_type is "spsi", indicating that the spatial_set_id is the same.
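A minimal sketch of building such a tref box in Python; the payload layout follows the standard track reference type box, while the 'spid'/'spsi' reference types are those introduced above:

```python
import struct

def track_reference_box(reference_type: bytes, track_ids: list[int]) -> bytes:
    """Build a tref box holding one track reference type box, e.g. with
    reference_type 'spid' (same capture id) or 'spsi' (same
    spatial_set_id), listing the referred tracks in track_IDs."""
    inner = struct.pack(">I4s", 8 + 4 * len(track_ids), reference_type)
    inner += b"".join(struct.pack(">I", t) for t in track_ids)
    return struct.pack(">I4s", 8 + len(inner), b"tref") + inner

# Group tracks 1, 2 and 3, which carry images sharing the same capture id.
box = track_reference_box(b"spid", [1, 2, 3])
```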
(description of the second method)
Fig. 53 is a diagram showing an example of the configuration of the visual sample entry in the case where the section file generating section 401 of fig. 48 sets the various information about the image by the second method.
As shown in fig. 53, in the case where the section file generating section 401 sets the various information about the image by the second method, the visual sample entry, which is the sample entry of the image, is extended to include the script box of fig. 51. Therefore, with the second method, the box extended to include the script box does not change depending on the image coding scheme.
In the fifth embodiment, since the image coding scheme is HEVC, the visual sample entry in which the various information about the image is set is the visual sample entry for HEVC images (hev1 box).
(description of the third method)
Fig. 54 is a diagram showing an example of the configuration of the schi box in the case where the section file generating section 401 of fig. 48 sets the various information about the image by the third method.
As shown in fig. 54, when the section file generating section 401 sets the various information about the image by the third method, the schi box is extended to include the script box of fig. 51. Further, scheme_type, which represents the type of post-decoding processing using the information stored in the schi box, in the schm box (scheme type box) on the same layer as the extended schi box, is set to "script", indicating the mapping process.
With the third method, the box extended to include the script box does not change depending on the image coding scheme.
With the second method and the third method, similarly to the first method, the section file generating section 401 can group the tracks having the same id, i.e., the identification information on the captured image used to generate the omnidirectional image, or the same spatial_set_id, using the tref box, without setting id or spatial_set_id.
As described previously, with the first to third methods, the various information about the image is set on a layer lower than the trak box provided for each track. Therefore, in the case where the various information about the image does not change within a track, the information can be set efficiently in track units.
(description of the fourth method)
Fig. 55 is an explanatory diagram showing a sample group.
As shown in fig. 55, the method called a sample group uses an sgpd box and an sbgp box.
The sgpd box is a box for grouping consecutive samples having common sample information, i.e., information about samples, and for describing the sample information of each group. The sbgp box is a box for describing information for identifying the samples of each group (hereinafter referred to as "sample identification information").
In the sgpd box, grouping_type, entry_count, GroupEntry, and the like are described. In the sgpd box, grouping_type indicates the type of the sample information forming the basis for grouping into the corresponding groups, and entry_count indicates the number of mutually different pieces of sample information among the sample information of the groups. Further, GroupEntry holds the mutually different pieces of sample information of the groups, and as many GroupEntry as indicated by entry_count are described. The structure of GroupEntry varies depending on grouping_type.
In the sbgp box, grouping_type, entry_count, sample_count, group_description_index, and the like are described. In the sbgp box, grouping_type indicates the type of the sample information forming the basis for grouping into the corresponding groups, and entry_count indicates the number of groups.
sample_count is sample identification information for each group and indicates the number of consecutive samples in the group. As many sample_count as indicated by entry_count are described, and the sum of all sample_count equals the number of all samples of the track corresponding to the sbgp box. group_description_index is information identifying the GroupEntry that holds the sample information of each group.
In the example of fig. 55, the entry_count of the sbgp box and that of the sgpd box having the same grouping_type are 6 and 4, respectively. Therefore, the number of groups is six, and the number of mutually different pieces of sample information among the sample information of the six groups is four. Accordingly, four GroupEntry are described in the sgpd box.
Further, the first to sixth sample_count[1] to sample_count[6] from the top in the sbgp box are 1, 2, 1, 1, 1, and 2 in this order. Therefore, the numbers of samples of the first to sixth groups from the top are 1, 2, 1, 1, 1, and 2, respectively.
In other words, the first sample from the top (sample[1]) is classified into the first group from the top, and the second and third samples from the top (sample[2], sample[3]) are classified into the second group from the top. Further, the fourth sample from the top (sample[4]) is classified into the third group from the top, and the fifth sample from the top (sample[5]) is classified into the fourth group from the top. The sixth sample from the top (sample[6]) is classified into the fifth group from the top, and the seventh and eighth samples from the top (sample[7], sample[8]) are classified into the sixth group from the top.
Further, the first group_description_index[1] to the sixth group_description_index[6] from the top are 1, 3, 2, 0, 4, and 1 in this order. Therefore, the sample information on the first sample from the top, classified into the first group from the top, is the first GroupEntry from the top. Further, the sample information common to the second and third samples from the top, classified into the second group from the top, is the third GroupEntry from the top.
Further, the sample information on the fourth sample from the top, classified into the third group from the top, is the second GroupEntry from the top. Further, there is no sample information on the fifth sample from the top, classified into the fourth group from the top.
Further, the sample information on the sixth sample from the top, classified into the fifth group from the top, is the fourth GroupEntry from the top. Further, the sample information on the seventh and eighth samples from the top, classified into the sixth group from the top, is the first GroupEntry from the top.
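The sbgp-to-sample resolution described above can be reproduced with a few lines of Python; this sketch simply expands the run-length pairs (function and variable names are our own):

```python
def sample_info_indices(sample_counts, group_description_indices):
    """Expand an sbgp box's (sample_count, group_description_index) pairs
    into a per-sample list of 1-based GroupEntry indices (0 = no sample
    information), reproducing the worked example of fig. 55."""
    per_sample = []
    for count, desc_index in zip(sample_counts, group_description_indices):
        per_sample.extend([desc_index] * count)
    return per_sample

# The fig. 55 example: six groups covering eight samples.
print(sample_info_indices([1, 2, 1, 1, 1, 2], [1, 3, 2, 0, 4, 1]))
# -> [1, 3, 3, 2, 0, 4, 1, 1]
```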
In the case where the section file generating section 401 sets the various information about the image by the fourth method, the various information about the image in sample units is set as the sample information using the sample group described above.
Specifically, the sbgp box whose grouping_type is "script" is set so as to correspond to the sgpd box in which the spherical coordinate region information entries of fig. 56 are arranged as GroupEntry and whose grouping_type is "script". The spherical coordinate region information entry of fig. 56 is an extension of VisualSampleGroupEntry that includes the various information about the image in sample units.
In other words, with the fourth method, consecutive samples having common various information about the image in sample units are classified into one group. Further, the mutually different pieces of the various information about the image in sample units of the groups are set in the spherical coordinate region information entries of fig. 56, arranged as GroupEntry in the sgpd box whose grouping_type is "script". Further, sample_count, indicating the sample identification information for each group, is set in the sbgp box whose grouping_type is "script". The various information about the image in sample units can thus be set in each section file.
Although all of the various information about the image may be set in sample units by the fourth method described above, only the information that changes within the track, among the various information about the image, may instead be set in sample units.
Fig. 57 is a diagram showing an example of the configuration of the spherical coordinate region information entry in this case.
In the example of fig. 57, object_yaw, object_pitch, object_roll, object_width, and object_height, among the various information about the image, are the information that changes within the track.
In this case, as shown in fig. 57, only object_yaw, object_pitch, object_roll, object_width, and object_height are set in the spherical coordinate region information entry.
Further, in this case, the information that does not change within the track, among the various information about the image, is set in the script box included in the hvcC box, the hev1 box, or the schi box by a method similar to the first to third methods. The script box at this time is as shown in fig. 58. The configuration of the script box of fig. 58 differs from that of fig. 51 in that object_yaw, object_pitch, object_roll, object_width, and object_height are not set.
Note that, similarly to the first to third methods, the tracks having the same id or the same spatial_set_id may be grouped without setting id or spatial_set_id, among the information that does not change within the track, in the script box.
As described above, in the case where only the information that changes within the track, among the various information about the image, is set using the sample group, the data amount of the section file can be reduced compared with the case where all of the various information about the image is set using the sample group.
Although it is assumed in the examples of figs. 57 and 58 that the information that changes within the track, among the various information about the image, is object_yaw, object_pitch, object_roll, object_width, and object_height, this may be any information.
(description of the fifth method)
Fig. 59 is a diagram showing an example of the configuration of the section file in the case where the section file generating section 401 of fig. 48 sets the various information about the image by the fifth method.
As shown in fig. 59, with the fifth method, a metadata track (spherical coordinate region information track) different from the image track is allocated to the various information about the image. The various information about the image (spherical coordinate region information) in sample units is set as samples of timed metadata in the allocated track.
Further, a track reference type box whose reference_type is "script", indicating the relationship between the various information about the image and the image, is set in the tref box in the spherical coordinate region information track. In this track reference type box, the ID of the image track whose various information about the image is held by the spherical coordinate region information track corresponding to the track reference type box is set as track_IDs.
Fig. 60 is a diagram showing an example of the configuration of SphericalCoordinateRegionInfoSample, the timed metadata in which the various information about the image in sample units is arranged.
As shown in fig. 60, the various information about the image in sample units is set in SphericalCoordinateRegionInfoSample.
Note that, rather than setting all of the information in SphericalCoordinateRegionInfoSample, the various information about the image in sample units may be set in SphericalCoordinateRegionInfoSample only in the case where the information differs from the various information about the image in the sample unit of the previous sample.
In this case, examples of the method of setting the various information about the image in sample units in SphericalCoordinateRegionInfoSample include two methods, namely, an entire information setting method and a partial information setting method.
The entire information setting method is a method of setting all of the various information about the image in sample units in SphericalCoordinateRegionInfoSample in the case where at least one piece of the various information about the image in sample units differs from that of the previous sample.
Specifically, as shown in fig. 61, with the entire information setting method, the spherical coordinate region information sample entry, which is the metadata sample entry for the various information about the image, in the stsd box included on a layer lower than the trak box of the spherical coordinate region information track, includes a spherical coordinate region information configuration box. Further, as shown in fig. 62, the default values of the various information about the image in sample units (default_projection_type, default_id, default_FOV_flag, default_object_yaw, default_object_pitch, default_object_roll, default_object_width, default_object_height, default_total_width, default_total_height, default_sphere_radius, and default_spatial_set_id) are set in the spherical coordinate region information configuration box.
Although it is described here that the default values of the various information about the image in sample units are contained in the stsd box of the spherical coordinate region information track, the default values may instead be included in the stsd box of the image track. The hvcC box in this case is the same as the hvcC box of the first method described with reference to figs. 50 and 51, except that the various information about the image in track units is replaced with the default values of the various information about the image in sample units.
Further, with the entire information setting method, SphericalCoordinateRegionInfoSample is as shown in fig. 63. The SphericalCoordinateRegionInfoSample of fig. 63 differs from that of fig. 60 in that an update_flag, indicating whether the sample differs from the previous sample in at least one piece of the various information about the image in sample units, is set, and in that all of the various information about the image in sample units is set in response to update_flag.
update_flag is 1 if the sample corresponding to update_flag differs from the previous sample in at least one piece of the various information about the image in sample units, and is 0 if the sample is identical to the previous sample. Note that, in the case where the sample corresponding to update_flag is the first sample, the various information about the image in the sample unit of the previous sample is taken to be the default values of the various information about the image in sample units.
In the case where update_flag is 1, the various information about the image in the sample unit of that sample is set in SphericalCoordinateRegionInfoSample. On the other hand, in the case where update_flag is 0, the various information about the image in the sample unit is not set in SphericalCoordinateRegionInfoSample.
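A schematic sketch of the entire information setting method in Python, using dictionaries in place of the binary sample payload (the function name and the dict representation are illustrative only):

```python
def entire_info_sample(current: dict, previous: dict) -> dict:
    """Entire information setting method: if any field changed relative
    to the previous sample (or to the defaults, for the first sample),
    emit update_flag = 1 together with all fields; otherwise emit only
    update_flag = 0."""
    if current == previous:
        return {"update_flag": 0}
    return {"update_flag": 1, **current}

defaults = {"object_yaw": 0, "object_pitch": 0, "object_roll": 0,
            "object_width": 90, "object_height": 90}
print(entire_info_sample(defaults, defaults))                     # unchanged
print(entire_info_sample({**defaults, "object_yaw": -135}, defaults))
```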
Further, the partial information setting method is a method of setting, in SphericalCoordinateRegionInfoSample, only the information that differs in the case where the sample differs from the previous sample in at least one piece of the various information about the image in sample units.
Specifically, similarly to the entire information setting method, the default values of the various information about the image in sample units are set with the partial information setting method. Further, SphericalCoordinateRegionInfoSample is as shown in fig. 64. The SphericalCoordinateRegionInfoSample of fig. 64 differs from that of fig. 60 in that a 16-bit update_flag, indicating, for each piece of the various information about the image in sample units, whether the sample differs from the previous sample, is set, and in that each piece of the various information about the image in sample units is set in response to the corresponding bit of update_flag.
In other words, each piece of the various information about the image in sample units is allocated to one bit of update_flag. In the case where a piece of information differs from that of the previous sample of the sample corresponding to update_flag, the bit allocated to the information is 1, and in the case where it is the same as that of the previous sample, the bit is 0. Note that, in the case where the sample corresponding to update_flag is the first sample, the various information about the image in the sample unit of the previous sample is taken to be the default values of the various information about the image in sample units.
In the example of fig. 64, the bits of update_flag are allocated, in order from the least significant bit, to projection_type, id, FOV_flag, object_yaw, object_pitch, object_roll, object_width, object_height, total_width, total_height, sphere_radius, and spatial_set_id among the various information about the image.
Therefore, in the case where the 16-bit update_flag is, for example, 0000000000000001b, only the projection_type of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000000000000010b, only the id of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample.
Further, in the case where update_flag is 0000000000000100b, only the FOV_flag of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000000000001000b, only the object_yaw of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000000000010000b, only the object_pitch of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample.
In the case where update_flag is 0000000000100000b, only the object_roll of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000000001000000b, only the object_width of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000000010000000b, only the object_height of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample.
Further, in the case where update_flag is 0000000100000000b, only the total_width of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000001000000000b, only the total_height of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000010000000000b, only the sphere_radius of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample. In the case where update_flag is 0000100000000000b, only the spatial_set_id of the sample corresponding to update_flag is set in SphericalCoordinateRegionInfoSample.
Further, in the case where a sample differs from the previous sample only in projection_type and FOV_flag among the various information about the image, 0x0005 (0000000000000101b) is set as update_flag. In this case, only the projection_type and FOV_flag of the sample are set in SphericalCoordinateRegionInfoSample.
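The partial information setting method can be sketched the same way; the bit order follows fig. 64, and again the dict representation stands in for the binary payload:

```python
# Bit assignment follows fig. 64, starting from the least significant bit.
FIELD_BITS = ["projection_type", "id", "FOV_flag", "object_yaw",
              "object_pitch", "object_roll", "object_width",
              "object_height", "total_width", "total_height",
              "sphere_radius", "spatial_set_id"]

def partial_info_sample(current: dict, previous: dict) -> dict:
    """Partial information setting method: set one update_flag bit per
    changed field and carry only the changed fields themselves."""
    flag, changed = 0, {}
    for bit, name in enumerate(FIELD_BITS):
        if current[name] != previous[name]:
            flag |= 1 << bit
            changed[name] = current[name]
    return {"update_flag": flag, **changed}

# A sample differing only in projection_type and FOV_flag yields
# update_flag == 0x0005, matching the example above.
```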
As described previously, the various information about the image in sample units is set in SphericalCoordinateRegionInfoSample only in the case where the sample differs from the previous sample in at least one piece of the various information about the image. In this case, the data amount of the section file can be reduced when the various information about the image in sample units changes infrequently.
Further, with the partial information setting method, only the information that differs from the previous sample, among the various information about the image in sample units, is set in SphericalCoordinateRegionInfoSample; therefore, the data amount of the section file can be reduced compared with the entire information setting method.
Although all of the various information about the image may be set in sample units by the fifth method described above, only the information that changes within the track, among the various information about the image in sample units, may instead be set in sample units.
Fig. 65 is a diagram showing an example of the configuration of the spherical coordinate region information configuration box in this case, and fig. 66 is a diagram showing an example of the configuration of SphericalCoordinateRegionInfoSample in this case.
In the examples of figs. 65 and 66, object_yaw, object_pitch, object_roll, object_width, and object_height, which are the information that changes within the track among the various information about the image in sample units, are set in sample units.
In this case, as shown in fig. 65, projection_type, id, FOV_flag, total_width, total_height, sphere_radius, and spatial_set_id, which do not change within the track among the various information about the image, are set in track units in the spherical coordinate region information configuration box included in the stsd box on a layer lower than the trak box of the spherical coordinate region information track.
Further, as shown in fig. 66, object_yaw, object_pitch, object_roll, object_width, and object_height, which change within the track among the various information about the image, are set in sample units in SphericalCoordinateRegionInfoSample.
Note that projection_type, id, FOV_flag, total_width, total_height, sphere_radius, and spatial_set_id, which do not change within the track among the various information about the image, may instead be included in the stsd box of the image track. The configuration of the hvcC box in this case differs from the hvcC box of the first method described with reference to figs. 50 and 51 in that object_yaw, object_pitch, object_roll, object_width, and object_height are not set.
Further, similarly to the first to third methods, the tracks having the same id or the same spatial_set_id may be grouped without setting id or spatial_set_id, among the information that does not change within the track, in the box.
As described above, in the case where only the information that changes within the track, among the various information about the image, is set in SphericalCoordinateRegionInfoSample, the data amount of the section file can be reduced compared with the case where all of the various information about the image is set in SphericalCoordinateRegionInfoSample.
Although it is assumed in the examples of figs. 65 and 66 that the information that changes within the track, among the various information about the image, is object_yaw, object_pitch, object_roll, object_width, and object_height, this may be any information.
As described previously, with the fourth method or the fifth method, at least the information that changes within the track, among the various information about the image, is set in the section file in sample units. Therefore, even in the case where at least part of the various information about the image changes within the track, the information can be set in the section file.
(description of processing performed by the generating apparatus)
Fig. 67 is a flowchart showing a file generation process executed by the generation apparatus 400 of fig. 48.
Since the processing from steps S201 to S208 of fig. 67 is similar to the processing from steps S121 to S128 of fig. 31, description will be omitted.
In step S209, the section file generating section 401 archives the low-resolution stream, the high-resolution stream for each two-dimensional plane, and the audio stream for each bit rate and each section, and generates section files in which the various information about the images is set using any one of the first to fifth methods.
In step S210, the MPD file generating unit 402 generates an MPD file.
In step S211, the upload section 30 uploads the section file supplied from the section file generation section 401 and the MPD file supplied from the MPD file generation section 402 to the delivery server 13 of fig. 1, and the process ends.
(configuration example of reproduction apparatus)
Fig. 68 is a block diagram showing an example of the configuration of a reproduction apparatus according to the fifth embodiment of the delivery system to which the present disclosure is applied.
In the configuration shown in fig. 68, the same configuration as that of fig. 32 is denoted by the same reference numeral. Duplicate description will be omitted where appropriate.
The reproduction apparatus 430 of fig. 68 differs in configuration from the reproduction apparatus 340 of fig. 32 in that the MPD processing section 431, the section file acquisition section 432, and the mapping processing section 433 are provided in place of the MPD processing section 341, the section file acquisition section 272, and the mapping processing section 342.
Specifically, the MPD processing section 431 in the reproducing apparatus 430 analyzes the MPD file supplied from the MPD acquisition section 220, and acquires information such as the URL of the section file at the reproduction clock time at a predetermined bit rate. The MPD processing section 431 supplies the acquired URL to the section file acquisition section 432.
The section file acquisition section 432 issues a request for the section file identified by the URL supplied from the MPD processing section 431 to the delivery server 13 based on the URL and acquires the section file. The section file acquisition section 432 acquires the various information about the images (the low-resolution image and the high-resolution images) from the acquired section file. The section file acquisition section 432 selects the samples corresponding to a predetermined id based on the acquired various information about the images.
The section file acquisition section 432 selects, as the sample of the low-resolution image to be reproduced, the sample for which the two-dimensional plane information and the like among the various information about the image are not set, from among the selected samples. The section file acquisition section 432 supplies the one low-resolution stream set in the sample of the low-resolution image to be reproduced to the decoder 273.
Further, the section file acquisition section 432 recognizes, as samples of the high-resolution images, the samples for which the two-dimensional plane information and the like among the various information about the image are set, from among the selected samples.
Further, the section file acquisition section 432 selects, as the sample of the high-resolution image to be reproduced, the sample corresponding to the selected two-dimensional plane from among the samples of the high-resolution images, based on the selected two-dimensional plane information generated by the line-of-sight detection section 343 and on the source_id, object_x, object_y, object_width, object_height, total_width, total_height, spatial_set_id, and the like corresponding to the samples of the high-resolution images. The section file acquisition section 432 supplies the one high-resolution stream set in the sample of the high-resolution image to be reproduced to the decoder 274.
The mapping processing section 433 maps the low-resolution image supplied from the decoder 273 as a texture onto the faces 71 to 76 of the sphere 70.
Note that the mapping processing section 433 may map, instead of the entire low-resolution image, only the portion of the low-resolution image containing the region to be perspectively projected onto the field-of-view range of the viewer determined by the line-of-sight detection section 343.
Further, the mapping processing section 433 sets the selected two-dimensional plane within the sphere 70 as a 3D model based on the selected two-dimensional plane information supplied from the line-of-sight detecting section 343. The mapping processing section 433 maps the high-resolution image as a texture onto the selected two-dimensional plane disposed within the sphere 70. Further, the mapping processing section 433 supplies the 3D model image in which the texture is mapped onto the sphere 70 and the selected two-dimensional plane to the drawing section 227.
(processing performed by a reproduction apparatus)
Fig. 69 is a flowchart describing a reproduction process performed by the reproduction apparatus 430 of fig. 68.
In step S231 in fig. 69, the MPD acquisition section 220 in the playback device 430 issues a request for an MPD file to the delivery server 13 in fig. 1 and acquires the MPD file. The MPD acquisition unit 220 supplies the acquired MPD file to the MPD processing unit 431.
In step S232, the MPD processing section 431 analyzes the MPD file supplied from the MPD acquisition section 220, and acquires information such as a URL of the section file at the reproduction clock time at a predetermined bit rate. The MPD processing section 431 supplies the acquired URL to the section file acquisition section 432.
In step S233, the section file acquisition section 432 issues a request for a section file identified by the URL supplied from the MPD processing section 431 to the delivery server 13 based on the URL, and acquires the section file.
In step S234, the section file acquisition unit 432 acquires various information about the image from the acquired section file.
In step S235, the section file acquisition section 432 selects a predetermined id from among the ids in the various information about the image acquired in step S234 as the id of the omnidirectional image to be reproduced. Further, the section file acquisition section 432 selects the samples for which the selected id is set.
In step S236, the section file acquisition section 432 selects, as the sample of the low-resolution image to be reproduced, the sample for which the two-dimensional plane information and the like among the various information about the image are not set, from among the selected samples, and acquires the one low-resolution stream set in that sample from the section file. The section file acquisition section 432 supplies the acquired low-resolution stream to the decoder 273.
Since the processing from steps S237 to S242 is similar to the processing from steps S145 to S150 of fig. 34, description will be omitted.
In step S243, the section file acquisition section 432 recognizes, as samples of the high-resolution images, the samples for which the two-dimensional plane information and the like among the various information about the image are set, from among the selected samples. Further, the section file acquisition section 432 selects, as the sample of the high-resolution image to be reproduced, the sample corresponding to the selected two-dimensional plane from among the samples of the high-resolution images, based on the selected two-dimensional plane information generated by the line-of-sight detection section 343 and on the source_id, object_x, object_y, object_width, object_height, total_width, total_height, spatial_set_id, and the like corresponding to the samples of the high-resolution images.
In step S244, the section file acquisition section 432 acquires, from the section file, the one high-resolution stream set in the sample of the high-resolution image to be reproduced selected in step S243 and supplies the high-resolution stream to the decoder 274.
Since the processing in steps S245 and S246 is similar to the processing in steps S153 and S154 of fig. 34, description will be omitted.
In step S247, the mapping processing part 433 maps the low-resolution image supplied from the decoder 273 onto the faces 71 to 76 of the sphere 70 as a texture.
In step S248, the mapping processing section 433 sets the selected two-dimensional plane within the sphere 70 as a 3D model based on the selected two-dimensional plane information supplied from the line-of-sight detecting section 343.
In step S249, the mapping processing section 433 maps the high-resolution image supplied from the decoder 274 onto the selected two-dimensional plane set in step S248 as a texture. The mapping processing section 433 supplies the 3D model image in which the texture is mapped onto the sphere 70 and the selected two-dimensional plane to the drawing section 227.
The processing in steps S250 and S251 is similar to the processing in steps S162 and S163 of fig. 34.
As described above, the generation apparatus 400 sets various information about the image to the section file. Therefore, effects similar to those of the fourth embodiment can be achieved in the MP4 layer.
Further, the generation apparatus 400 sets the various information about the image on a layer lower than the moov box, which manages the samples of the section file, or in the samples of a track having a reference relationship with the image track. Accordingly, the various information can be identified before the samples of the low-resolution stream or the high-resolution stream are decoded.
Further, the generation apparatus 400 sets the various information about the image in track units or in sample units. Accordingly, the reproduction apparatus 430 can recognize the various information about the image in track units or in sample units, and can therefore easily select the track or the sample to be reproduced based on the various information about the image in track units or sample units and on the selected two-dimensional plane information.
Although the various information about the image that is set in the MPD file in the fourth embodiment is set in the section file in the fifth embodiment, the various information about the image set in the MPD files in the first to third embodiments may likewise be set in the section file. Further, the first to fifth embodiments described above may all be combined, and the various information about the image may be set, for example, in both the section file and the MPD file.
Further, the various information about the image may contain the mapping information. In the case where the various information about the image contains the mapping information, the mapping processing section 433 performs mapping based on the mapping information in the same manner as in the earlier embodiments.
< sixth embodiment >
(description of computer to which the disclosure applies)
The series of processes described above may be executed by hardware or by software. In the case where the series of processes is executed by software, a program configuring the software is installed into a computer. Here, the computer may be a computer incorporated into dedicated hardware, or a computer capable of executing various functions when various programs are installed into it, such as a general-purpose personal computer.
Fig. 70 is a block diagram showing an example of a configuration of hardware of a computer that executes a series of processing described above by a program.
In the computer 900, a Central Processing Unit (CPU)901, a Read Only Memory (ROM)902, and a Random Access Memory (RAM)903 are connected to each other by a bus 904.
An input/output interface 905 is also connected to bus 904. An input section 906, an output section 907, a storage section 908, a communication section 909, and a driver 910 are connected to the input/output interface 905.
The input section 906 includes a keyboard, a mouse, a microphone, and the like. The output portion 907 includes a display, a speaker, and the like. The storage unit 908 includes a hard disk, a nonvolatile memory, and the like. The communication section 909 includes a network interface and the like. The drive 910 drives a removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
In the computer 900 configured as described above, the CPU 901 loads a program stored in, for example, the storage section 908 to the RAM 903 via the input/output interface 905 and the bus 904, and executes the program, whereby a series of processing described above is executed.
The program executed by the computer 900(CPU 901) can be provided by, for example, recording the program in a removable medium 911 serving as a package medium or the like. Alternatively, the program may be provided via a wired or wireless transmission medium such as a local area network, the internet, or a digital satellite service.
In the computer 900, a program can be installed into the storage section 908 via the input/output interface 905 by attaching the removable medium 911 to the drive 910. Alternatively, the program may be received by the communication section 909 and installed into the storage section 908 via a wired or wireless transmission medium. In another alternative, the program may be installed in advance into the ROM 902 or the storage section 908.
The program executed by the computer 900 may be a program for executing processes in chronological order in the order described in the specification, or may be a program for executing processes in parallel or executing processes at necessary timings, for example, at the time of call.
Further, in this specification, a system means a set of a plurality of constituent elements (a device, a module (a component), and the like) regardless of whether all the constituent elements are provided in the same housing. Therefore, a plurality of devices accommodated in different housings and connected to each other via a network and one device in which a plurality of modules are accommodated in one housing are each a "system".
Further, the effects described in this specification are given only as examples, and the effects are not limited to those described in this specification, but may include other effects.
Furthermore, the embodiments of the present disclosure are not limited to the embodiments described above and various changes may be made without departing from the spirit of the present technology.
Note that the present disclosure can be configured as follows.
(1) A file generation apparatus comprising:
a setting section that sets identification information for identifying an image used to generate an omnidirectional image, the omnidirectional image being generated by mapping the image onto a 3D model.
(2) The file generation apparatus according to (1), wherein,
the setting part sets mapping information used when mapping the omnidirectional image onto the 3D model.
(3) The file generation apparatus according to (2), wherein,
the mapping information is the following information: this information is used in mapping the omnidirectional image onto the 3D model such that a reference image within the image is mapped onto a reference position of the 3D model at a predetermined tilt angle.
(4) The file generation apparatus according to (3), wherein,
the mapping information includes a position of the reference image within the omnidirectional image, and a rotation angle for making an inclination of the reference image on the 3D model equal to the predetermined inclination when the omnidirectional image is mapped onto the 3D model in such a manner that the reference image is mapped onto the reference position.
(5) The file generation apparatus according to (3), wherein,
the mapping information is a rotation angle of the omnidirectional image when the omnidirectional image is mapped onto the 3D model in such a manner that the reference image is mapped onto the reference position at the predetermined tilt angle.
(6) The file generation apparatus according to (1), wherein,
the setting section sets the same identification information for a plurality of the omnidirectional images generated by mapping the images onto the 3D model.
(7) The file generation apparatus according to (6), wherein,
the setting part sets the same identification information for the omnidirectional image generated by at least one of the plurality of methods and for the omnidirectional image for each area generated by the remaining methods.
(8) The file generation apparatus according to (1), wherein,
the setting section sets the identification information for an image generated by projecting the omnidirectional image onto a drawing surface.
(9) The file generation apparatus according to (8), wherein,
the setting section sets drawing face information containing information on the drawing face for the image.
(10) The file generation apparatus according to (9), wherein,
the drawing face information includes a position and a view angle of the drawing face.
(11) The file generation apparatus according to (9) or (10), wherein,
the drawing face information includes face type information indicating whether the drawing face of the image is a two-dimensional plane or a sphere.
(12) The file generation apparatus according to any one of (9) to (11), wherein,
the drawing faces are grouped, and
the drawing face information contains information indicating a group to which the drawing face belongs and view angles of all drawing faces belonging to the group, the drawing face information indicating a position and a view angle of the drawing face.
(13) The file generation apparatus according to any one of (8) to (12), wherein,
the setting section sets the same information for each of a plurality of images generated by projecting the omnidirectional image onto the drawing surface as information for identifying the 3D model mapped to the omnidirectional image used to generate the image.
(14) The file generating apparatus according to any one of (1) to (13), wherein,
the 3D model is a sphere, and
the setting section sets information indicating a radius of the 3D model.
(15) The file generation apparatus according to any one of (1) to (14), wherein,
the setting section sets the identification information for a management file that manages a file of the omnidirectional image.
(16) The file generating apparatus according to any one of (1) to (15), wherein,
the setting section sets the identification information for a file of the omnidirectional image.
(17) A file generation method, comprising:
a setting step of causing the file generating means to set identification information for identifying an image used to generate an omnidirectional image, the omnidirectional image being generated by mapping the image onto the 3D model.
(18) A reproduction apparatus comprising:
a selection section that selects an omnidirectional image generated by mapping an image onto a 3D model to be reproduced, based on identification information for identifying the image used to generate the omnidirectional image.
(19) A reproduction method, comprising:
a selection step of causing a reproduction apparatus to select an omnidirectional image generated by mapping an image onto a 3D model to be reproduced, based on identification information for identifying the image used to generate the omnidirectional image.
[ list of reference numerals ]
12: generating device
14: reproducing apparatus
29: MPD file generation unit
40: cube
41 to 46: faces
50: omnidirectional image
70: sphere
90: omnidirectional image
221: MPD processing unit
250: generating device
256: MPD file generation unit
270: reproducing apparatus
271: MPD processing unit
300: generating device
331-1 to 331-18: high resolution images
340: reproducing apparatus
341: MPD processing unit
400: generating device
401: section file generation unit
430: reproducing apparatus
431: MPD processing unit

Claims (14)

1. An information processing apparatus comprising:
a file acquiring section configured to acquire a file including mapping information for mapping the omnidirectional image to the 3D model,
wherein the mapping information comprises rotation angle information of the omnidirectional image at a reference position of the 3D model and a position of a reference image in the omnidirectional image for rotating and mapping the reference image for the reference position, and
wherein the reference position of the 3D model is a position on the 3D model corresponding to a predetermined gaze direction of a viewer if a viewing position of the viewer is a center of the 3D model.
2. The information processing apparatus according to claim 1,
the document further includes drawing side information that,
the drawing face information includes face type information indicating whether the drawing face is a two-dimensional plane or a spherical surface, provided as information on a surface to be drawn of the 3D model.
3. The information processing apparatus according to claim 2,
the drawing side information includes angle information corresponding to the drawing side.
4. The information processing apparatus according to claim 3,
the angle information includes an azimuth angle, an elevation angle, a rotation angle, a lateral viewing angle, and a longitudinal viewing angle.
5. The information processing apparatus according to claim 4,
the drawing faces are grouped, and
the drawing face information further includes information indicating a group to which the drawing face belongs and information on an angle of an entire two-dimensional plane constituted by all drawing faces belonging to the group.
6. The information processing apparatus according to claim 5,
the file also includes information of the radius of the 3D model.
7. A reproduction apparatus comprising:
a file acquiring section configured to acquire a file including mapping information for mapping the omnidirectional image to the 3D model,
a section file acquisition section configured to acquire a section file including a stream corresponding to the omnidirectional image, and
a mapping processing section configured to map a decoded image obtained by decoding the stream included in the section file to the 3D model based on the mapping information to obtain a 3D model image for generating a display image,
wherein the mapping information comprises rotation angle information of the omnidirectional image at a reference position of the 3D model and a position of a reference image in the omnidirectional image for rotating and mapping the reference image for the reference position, and
wherein the reference position of the 3D model is a position on the 3D model corresponding to a predetermined gaze direction of a viewer if a viewing position of the viewer is a center of the 3D model.
8. The reproduction apparatus according to claim 7,
the document further includes drawing side information,
the drawing face information includes face type information indicating whether the drawing face is a two-dimensional plane or a spherical surface, provided as information on a surface to be drawn of the 3D model, and
the mapping processing unit maps the decoded image to the 3D model according to the drawing surface information based on the mapping information.
9. The reproduction apparatus according to claim 8,
wherein the drawing face information includes angle information corresponding to the drawing face.
10. The reproduction apparatus according to claim 9,
wherein the angle information includes an azimuth angle, an elevation angle, a rotation angle, a lateral viewing angle, and a longitudinal viewing angle.
11. The reproduction apparatus according to claim 10,
wherein the drawing faces are grouped, and
the drawing face information further includes information indicating a group to which the drawing face belongs and information on an angle of an entire two-dimensional plane constituted by all drawing faces belonging to the group.
12. The reproduction apparatus according to claim 11,
wherein the file further includes information on the radius of the 3D model.
13. An information processing method comprising:
acquiring a file including mapping information for mapping an omnidirectional image to a 3D model,
wherein the mapping information includes a position of a reference image in the omnidirectional image and rotation angle information used to rotate the omnidirectional image and map the reference image to a reference position of the 3D model, and
wherein the reference position of the 3D model is a position on the 3D model corresponding to a predetermined gaze direction of a viewer when a viewing position of the viewer is the center of the 3D model.
14. A reproduction method, comprising:
acquiring a file including mapping information for mapping an omnidirectional image to a 3D model,
acquiring a section file including a stream corresponding to the omnidirectional image, and
mapping a decoded image, obtained by decoding the stream included in the section file, to the 3D model based on the mapping information to obtain a 3D model image for generating a display image,
wherein the mapping information includes a position of a reference image in the omnidirectional image and rotation angle information used to rotate the omnidirectional image and map the reference image to a reference position of the 3D model, and
wherein the reference position of the 3D model is a position on the 3D model corresponding to a predetermined gaze direction of a viewer when a viewing position of the viewer is the center of the 3D model.
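For illustration only, a minimal sketch of the file metadata described in claims 1 through 6, written as Python data classes. All class and field names below are hypothetical: the patent does not specify a concrete syntax, units, or field layout, and representing the reference image position as normalized coordinates is an assumption made here.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class MappingInformation:
        """Claim 1: the rotation angle and the position, within the
        omnidirectional image, of the reference image to be rotated and
        mapped to the reference position of the 3D model.
        (Hypothetical names and units.)"""
        rotation_angle_deg: float   # rotation applied when mapping
        reference_image_u: float    # horizontal position of the reference image, normalized to [0, 1]
        reference_image_v: float    # vertical position, normalized to [0, 1]

    @dataclass
    class DrawingFaceInfo:
        """Claims 2 to 5: information on a surface of the 3D model to be drawn."""
        face_type: str                    # "plane" or "sphere" (face type information, claim 2)
        azimuth_deg: float                # angle information, claim 4
        elevation_deg: float
        rotation_deg: float
        lateral_view_angle_deg: float
        longitudinal_view_angle_deg: float
        group_id: Optional[int] = None    # group the drawing face belongs to, claim 5

    @dataclass
    class OmnidirectionalFileMetadata:
        """The file of claims 1 and 6, gathering the pieces above."""
        mapping: MappingInformation
        drawing_faces: List[DrawingFaceInfo] = field(default_factory=list)
        model_radius: Optional[float] = None  # radius of the 3D model, claim 6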
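Similarly hedged, a sketch of the spherical mapping step performed by the mapping processing section of claims 7, 8, and 14, reusing the MappingInformation class from the sketch above. The equirectangular pixel layout and the yaw-only rotation are simplifying assumptions, not the patent's prescribed projection; file acquisition, section file acquisition, and stream decoding are omitted.

    import math

    def pixel_to_model_direction(u, v, mapping):
        """Map a normalized pixel (u, v) in [0, 1] x [0, 1] of the decoded
        omnidirectional image to a unit vector on a spherical 3D model,
        rotating so that the reference image lands at the reference
        position (the viewer's predetermined gaze direction when viewing
        from the center of the model)."""
        # Longitude measured from the reference image position, plus the
        # signaled rotation angle; latitude from the vertical position.
        lon = (u - mapping.reference_image_u) * 2.0 * math.pi \
              + math.radians(mapping.rotation_angle_deg)
        lat = (0.5 - v) * math.pi
        return (math.cos(lat) * math.cos(lon),
                math.cos(lat) * math.sin(lon),
                math.sin(lat))

    # Example: with no extra rotation, the reference image position itself
    # maps to longitude and latitude zero, i.e. the reference position.
    m = MappingInformation(rotation_angle_deg=0.0,
                           reference_image_u=0.25, reference_image_v=0.5)
    print(pixel_to_model_direction(0.25, 0.5, m))  # approximately (1.0, 0.0, 0.0)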
CN202210579940.5A 2016-05-13 2017-05-12 Information processing apparatus, information processing method, reproduction apparatus, and reproduction method Pending CN115118952A (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
JP2016097361 2016-05-13
JP2016-097361 2016-05-13
JP2016-162433 2016-08-23
JP2016162433 2016-08-23
JP2016220619 2016-11-11
JP2016-220619 2016-11-11
PCT/JP2017/017963 WO2017195881A1 (en) 2016-05-13 2017-05-12 File generation device and file generation method, and reproduction device and reproduction method
CN201780028140.1A CN109076262B (en) 2016-05-13 2017-05-12 File generation device, file generation method, reproduction device, and reproduction method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201780028140.1A Division CN109076262B (en) 2016-05-13 2017-05-12 File generation device, file generation method, reproduction device, and reproduction method

Publications (1)

Publication Number Publication Date
CN115118952A true CN115118952A (en) 2022-09-27

Family

ID=60267120

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210579940.5A Pending CN115118952A (en) 2016-05-13 2017-05-12 Information processing apparatus, information processing method, reproduction apparatus, and reproduction method
CN201780028140.1A Active CN109076262B (en) 2016-05-13 2017-05-12 File generation device, file generation method, reproduction device, and reproduction method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201780028140.1A Active CN109076262B (en) 2016-05-13 2017-05-12 File generation device, file generation method, reproduction device, and reproduction method

Country Status (5)

Country Link
US (1) US20190200096A1 (en)
EP (2) EP3627844A1 (en)
JP (2) JP7014156B2 (en)
CN (2) CN115118952A (en)
WO (1) WO2017195881A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10587934B2 (en) * 2016-05-24 2020-03-10 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over HTTP
KR102358757B1 (en) 2022-02-07 Method for transmitting omnidirectional video, method for receiving omnidirectional video, omnidirectional video transmission device, and omnidirectional video reception device
CN106412582B (en) * 2016-10-21 2019-01-29 北京大学深圳研究生院 The description method of panoramic video area-of-interest and coding method
CN111133763B (en) * 2017-09-26 2022-05-10 Lg 电子株式会社 Superposition processing method and device in 360 video system
WO2019139052A1 (en) * 2018-01-10 2019-07-18 Sharp Kabushiki Kaisha Systems and methods for signaling source information for virtual reality applications
US11341976B2 (en) 2018-02-07 2022-05-24 Sony Corporation Transmission apparatus, transmission method, processing apparatus, and processing method
KR20220012740A (en) * 2020-07-23 2022-02-04 삼성전자주식회사 Method and apparatus for controlling transmission and reception of content in communication system
US11568574B1 (en) * 2021-08-18 2023-01-31 Varjo Technologies Oy Foveation-based image encoding and decoding

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1491403A (en) * 2001-10-29 2004-04-21 Sony Corp Non-flat image processing apparatus and image processing method, and recording medium and computer program
US20050078192A1 (en) * 2003-10-14 2005-04-14 Casio Computer Co., Ltd. Imaging apparatus and image processing method therefor
US20090282365A1 (en) * 2007-02-06 2009-11-12 Nikon Corporation Image processing apparatus, image reproducing apparatus, imaging apparatus and program recording medium
CN103141078A (en) * 2010-10-05 2013-06-05 索尼电脑娱乐公司 Image display device, and image display method
FR2988964A1 (en) * 2012-03-30 2013-10-04 France Telecom Method for receiving immersive video content by client entity i.e. smartphone, involves receiving elementary video stream, and returning video content to smartphone from elementary video stream associated with portion of plan
US20130326419A1 (en) * 2012-05-31 2013-12-05 Toru Harada Communication terminal, display method, and computer program product
EP3013064A1 (en) * 2013-07-19 2016-04-27 Sony Corporation Information processing device and method

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4475643B2 (en) 2004-06-29 2010-06-09 キヤノン株式会社 Image coding apparatus and method
US7697839B2 (en) * 2006-06-30 2010-04-13 Microsoft Corporation Parametric calibration for panoramic camera systems
JP4403421B2 (en) * 2006-08-17 2010-01-27 ソニー株式会社 Image processing apparatus and image processing method
JP4992597B2 (en) * 2007-08-02 2012-08-08 株式会社ニコン Imaging apparatus and program
JP4924618B2 (en) * 2009-01-05 2012-04-25 ソニー株式会社 Display control apparatus, display control method, and program
JP5464955B2 (en) * 2009-09-29 2014-04-09 株式会社ソニー・コンピュータエンタテインメント Panorama image display device
US8872888B2 (en) * 2010-10-01 2014-10-28 Sony Corporation Content transmission apparatus, content transmission method, content reproduction apparatus, content reproduction method, program and content delivery system
JP5406813B2 (en) * 2010-10-05 2014-02-05 株式会社ソニー・コンピュータエンタテインメント Panorama image display device and panorama image display method
WO2015197818A1 (en) * 2014-06-27 2015-12-30 Koninklijke Kpn N.V. Hevc-tiled video streaming
US10204658B2 (en) * 2014-07-14 2019-02-12 Sony Interactive Entertainment Inc. System and method for use in playing back panorama video content
KR102157655B1 (en) * 2016-02-17 2020-09-18 엘지전자 주식회사 Method for transmitting 360 video, method for receiving 360 video, 360 video transmission device, and 360 video reception device

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1491403A (en) * 2001-10-29 2004-04-21 Sony Corp Non-flat image processing apparatus and image processing method, and recording medium and computer program
US20040247173A1 (en) * 2001-10-29 2004-12-09 Frank Nielsen Non-flat image processing apparatus, image processing method, recording medium, and computer program
US20050078192A1 (en) * 2003-10-14 2005-04-14 Casio Computer Co., Ltd. Imaging apparatus and image processing method therefor
US20090282365A1 (en) * 2007-02-06 2009-11-12 Nikon Corporation Image processing apparatus, image reproducing apparatus, imaging apparatus and program recording medium
CN103141078A (en) * 2010-10-05 2013-06-05 索尼电脑娱乐公司 Image display device, and image display method
FR2988964A1 (en) * 2012-03-30 2013-10-04 France Telecom Method for receiving immersive video content by client entity i.e. smartphone, involves receiving elementary video stream, and returning video content to smartphone from elementary video stream associated with portion of plan
US20130326419A1 (en) * 2012-05-31 2013-12-05 Toru Harada Communication terminal, display method, and computer program product
EP3013064A1 (en) * 2013-07-19 2016-04-27 Sony Corporation Information processing device and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GEOFFREY MARTENS: "Bandwidth management for ODV tiled streaming with MPEG-DASH", UNIVERSITEIT HASSELT, pages 17 *

Also Published As

Publication number Publication date
EP3457706A4 (en) 2019-03-20
WO2017195881A1 (en) 2017-11-16
JP7218824B2 (en) 2023-02-07
CN109076262A (en) 2018-12-21
CN109076262B (en) 2022-07-12
JP2022036289A (en) 2022-03-04
EP3627844A1 (en) 2020-03-25
EP3457706A1 (en) 2019-03-20
JPWO2017195881A1 (en) 2019-03-22
JP7014156B2 (en) 2022-02-03
US20190200096A1 (en) 2019-06-27

Similar Documents

Publication Publication Date Title
CN109076262B (en) File generation device, file generation method, reproduction device, and reproduction method
JP7009606B2 (en) Suggested viewport instructions for panoramic video
CN109565610B (en) Method, apparatus and storage medium for processing omnidirectional video
CN110100435B (en) Generation device, identification information generation method, reproduction device, and image reproduction method
JP6979035B2 (en) How to Improve Streaming of Virtual Reality Media Content, Devices and Computer Programs
US10992919B2 (en) Packed image format for multi-directional video
EP3782368A1 (en) Processing video patches for three-dimensional content
JP7218826B2 (en) Reproduction device and image generation method
TWI768372B (en) Methods and apparatus for spatial grouping and coordinate signaling for immersive media data tracks
US20200329266A1 (en) Information processing apparatus, method for processing information, and storage medium
JP2021502033A (en) How to encode / decode volumetric video, equipment, and streams
CN107851425B (en) Information processing apparatus, information processing method, and program
TWI672947B (en) Method and apparatus for deriving composite tracks
CN114095737A (en) Point cloud media file packaging method, device, equipment and storage medium
CN113852829A (en) Method and device for encapsulating and decapsulating point cloud media file and storage medium
TW202106000A (en) A method and apparatus for delivering a volumetric video content
TWI785458B (en) Method and apparatus for encoding/decoding video data for immersive media
CN114556962B (en) Multi-view video processing method and device
TW202116063A (en) A method and apparatus for encoding, transmitting and decoding volumetric video
US20230421819A1 (en) Media file unpacking method and apparatus, device, and storage medium
CN115885513A (en) Method and apparatus for encoding and decoding volumetric video
Lingyan Presentation of multiple GEO-referenced videos
CN115481280A (en) Data processing method, device and equipment for volume video and readable storage medium
Jain Practical Architectures for Fused Visual and Inertial Mobile Sensing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination