WO2016002494A1 - Information processing apparatus and method - Google Patents
Information processing apparatus and method
- Publication number
- WO2016002494A1 (PCT/JP2015/067232)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- still image
- encoded data
- file
- moving image
- decoding
- Prior art date
Links
- 230000010365 information processing Effects 0.000 title claims abstract description 44
- 238000000034 method Methods 0.000 title abstract description 115
- 238000003860 storage Methods 0.000 claims description 37
- 238000003672 processing method Methods 0.000 claims description 15
- 238000012545 processing Methods 0.000 abstract description 50
- 239000010410 layer Substances 0.000 description 345
- 238000009826 distribution Methods 0.000 description 41
- 238000005516 engineering process Methods 0.000 description 37
- 238000004458 analytical method Methods 0.000 description 31
- 238000010586 diagram Methods 0.000 description 20
- 239000011229 interlayer Substances 0.000 description 19
- 238000004891 communication Methods 0.000 description 18
- 230000006870 function Effects 0.000 description 13
- 239000000284 extract Substances 0.000 description 11
- 230000006978 adaptation Effects 0.000 description 6
- 230000008929 regeneration Effects 0.000 description 4
- 238000011069 regeneration method Methods 0.000 description 4
- 238000012937 correction Methods 0.000 description 3
- 230000003044 adaptive effect Effects 0.000 description 2
- 230000005540 biological transmission Effects 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 238000004321 preservation Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000003825 pressing Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/187—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/438—Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving encoded video stream packets from an IP network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/462—Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
- H04N21/4621—Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/462—Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
- H04N21/4622—Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/633—Control signals issued by server directed to the network components or client
- H04N21/6332—Control signals issued by server directed to the network components or client directed to client
- H04N21/6336—Control signals issued by server directed to the network components or client directed to client directed to decoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/85406—Content authoring involving a specific file format, e.g. MP4 format
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/804—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components
- H04N9/8042—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components involving data reduction
Definitions
- The present technology relates to an information processing apparatus and method, and more particularly, to an information processing apparatus and method capable of controlling the decoding timing of encoded data obtained by hierarchically encoding an image layered into a plurality of layers.
- As image encoding/decoding methods, for example, hierarchical encoding that efficiently encodes an image layered into a plurality of layers using inter-layer prediction or the like has been considered.
- As such a layered image, for example, one in which a still image is a base layer and a moving image is an enhancement layer, with prediction that refers to the still image when the moving image is encoded, has been considered.
- For example, there is MPEG-DASH (Moving Picture Experts Group Dynamic Adaptive Streaming over HTTP).
- In MPEG-DASH, a bit stream of image data encoded by a predetermined encoding method is stored in a file of a predetermined file format, such as the MP4 file format, and distributed.
- URL http://mpeg.chiariglione.org/standards/mpeg-dash/media-presentation-description-and-segment-formats/text-isoiec-23009-12012-dam -1)
- The present technology has been proposed in view of such a situation, and an object thereof is to enable control of the decoding timing of encoded data obtained by hierarchically encoding an image layered into a plurality of layers.
- One aspect of the present technology is an information processing apparatus including: a file generation unit that generates a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and a time information setting unit that sets, in the track storing the moving image encoded data of the file, time information designating the decoding timing of each frame, and sets, in the track storing the still image encoded data of the file, time information designating the decoding timing of the still image, using the time information of the moving image encoded data, based on the reference relationship between the still image and the moving image for the prediction.
- the file generation unit can store, in the file, information indicating a storage destination of the still image encoded data instead of the still image encoded data.
- In one aspect of the present technology, a file is generated that stores, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; time information designating the decoding timing of each frame is set in the track storing the moving image encoded data of the file; and time information designating the decoding timing of the still image is set in the track storing the still image encoded data of the file, using the time information of the moving image encoded data, based on the reference relationship between the still image and the moving image for the prediction.
- Another aspect of the present technology is an information processing apparatus including: a file reproduction unit that reproduces a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image, and extracts the still image encoded data and the moving image encoded data; a still image decoding unit that decodes the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of the still image, the time information being set using time information designating the decoding timing of each frame of the moving image encoded data, based on the reference relationship between the still image and the moving image for the prediction; and a moving image decoding unit that decodes the moving image encoded data extracted from the file at a timing based on the time information designating the decoding timing of each frame, by referring to the still image obtained by decoding the still image encoded data.
- In another aspect of the present technology, a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image is reproduced, and the still image encoded data and the moving image encoded data are extracted; the still image encoded data extracted from the file is decoded at a timing based on time information designating the decoding timing of the still image, the time information being set using time information designating the decoding timing of each frame of the moving image encoded data, based on the reference relationship between the still image and the moving image for the prediction; and the moving image encoded data extracted from the file is decoded at a timing based on the time information designating the decoding timing of each frame, by referring to the still image obtained by decoding the still image encoded data.
- Still another aspect of the present technology is an information processing apparatus including: a file generation unit that generates a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and a table information generation unit that generates table information indicating the reference relationship between the still image and the moving image for the prediction, and stores it in the file.
- the file generation unit can store time information indicating the display timing of the still image in the file.
- In still another aspect of the present technology, a file is generated that stores, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image, and table information indicating the reference relationship between the still image and the moving image for the prediction is generated and stored in the file.
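The table information in this aspect can be pictured as a simple lookup mapping. A minimal Python sketch follows, assuming a hypothetical layout of (still-image sample, frame range) entries; the patent text does not define this exact layout:

```python
# Hypothetical sketch of "table information" describing which still image
# (base layer) each range of moving-image frames (enhancement layer)
# references for inter-layer prediction.

reference_table = [
    # (still_image_sample, first_video_frame, last_video_frame)
    (0,   0,  29),
    (1,  30,  59),
    (2,  60, 119),
]

def still_for_frame(table, frame):
    """Look up which still image a given video frame references."""
    for still, first, last in table:
        if first <= frame <= last:
            return still
    raise KeyError(f"no still image referenced by frame {frame}")

print(still_for_frame(reference_table, 45))  # 1
```

A player could use such a table to decide which base-layer sample must already be decoded before it decodes a given enhancement-layer frame.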
- Still another aspect of the present technology is an information processing apparatus including: a file reproduction unit that reproduces a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image, and extracts the still image encoded data and the moving image encoded data; a still image decoding unit that decodes the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship between the still image and the moving image for the prediction; and a moving image decoding unit that decodes each frame of the moving image encoded data extracted from the file at a timing based on the time information, by referring to the still image obtained by the still image decoding unit decoding the still image encoded data.
- Still another aspect of the present technology is an information processing method in which a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image is reproduced, and the still image encoded data and the moving image encoded data are extracted; the still image encoded data extracted from the file is decoded at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship between the still image and the moving image for the prediction; and each frame of the moving image encoded data is decoded at a timing based on the time information, by referring to the still image obtained by decoding the still image encoded data.
- Still another aspect of the present technology is an information processing apparatus including: a time information generation unit that generates, using a predetermined timeline, time information indicating the decoding timing of still image encoded data in which a still image is encoded and time information indicating the decoding timing of each frame of moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and a metadata generation unit that generates, using the time information, metadata used for providing the still image encoded data and the moving image encoded data.
- In still another aspect of the present technology, time information indicating the decoding timing of still image encoded data in which a still image is encoded and time information indicating the decoding timing of each frame of moving image encoded data in which a moving image is encoded using prediction that refers to the still image are generated using a predetermined timeline, and metadata used for providing the still image encoded data and the moving image encoded data is generated using the time information.
- In one aspect of the present technology, a file is generated that stores, in different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; time information designating the decoding timing of each frame is set in the track storing the moving image encoded data of the file; and time information designating the decoding timing of the still image is set in the track storing the still image encoded data of the file, using the time information of the moving image encoded data, based on the reference relationship between the still image and the moving image for the prediction.
- In another aspect of the present technology, a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image is reproduced; the still image encoded data and the moving image encoded data are extracted; and the still image encoded data extracted from the file is decoded at a timing set based on the reference relationship between the still image and the moving image for the prediction.
- In still another aspect of the present technology, a file is generated that stores, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image, and table information indicating the reference relationship between the still image and the moving image for the prediction is generated and stored in the file.
- In still another aspect of the present technology, a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image is reproduced; the still image encoded data and the moving image encoded data are extracted; the still image encoded data extracted from the file is decoded at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship between the still image and the moving image for the prediction; and each frame of the moving image encoded data is decoded at a timing based on the time information, by referring to the still image obtained by decoding the still image encoded data.
- In still another aspect of the present technology, time information indicating the decoding timing of still image encoded data and time information indicating the decoding timing of each frame of moving image encoded data are generated using a predetermined timeline, and metadata used for providing the still image encoded data and the moving image encoded data is generated using the time information.
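Generating timing metadata on a single timeline for both layers can be roughly illustrated as follows. The dictionary layout, function name, and 90 kHz timescale are invented for this example and do not follow the actual MPEG-DASH MPD schema:

```python
# Sketch: generate timing metadata for providing both the still-image
# (base layer) and moving-image (enhancement layer) encoded data, with
# all timestamps expressed in ticks of one shared timeline.

def build_metadata(num_frames, frame_duration, still_refs, timescale=90000):
    # One entry per video frame: frame index and its decoding timestamp.
    video = [{"frame": i, "dts": i * frame_duration} for i in range(num_frames)]
    # One entry per still image: its DTS equals that of the first video
    # frame that references it for inter-layer prediction.
    stills = [{"still": s, "dts": f * frame_duration}
              for s, f in still_refs.items()]
    return {"timescale": timescale, "video": video, "stills": stills}

# Hypothetical content: 60 video frames at 30 fps on a 90 kHz timescale;
# still image 0 is referenced from frame 0, still image 1 from frame 30.
meta = build_metadata(num_frames=60, frame_duration=3000, still_refs={0: 0, 1: 30})
print(meta["stills"])  # [{'still': 0, 'dts': 0}, {'still': 1, 'dts': 90000}]
```

Because both layers share one timescale, the metadata consumer never needs to convert between per-track clocks when scheduling decoding.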
- According to the present technology, information can be processed. In particular, it is possible to control the decoding timing of encoded data obtained by hierarchically encoding an image layered into a plurality of layers.
- FIG. 1 is a diagram illustrating a configuration example of the MP4 file format. The remaining drawings include a diagram illustrating a main configuration example of an MP4 file, a block diagram illustrating a main configuration example of an MP4 file generation apparatus, a flowchart explaining an example of the flow of MP4 file generation processing, a block diagram illustrating a main configuration example of an MP4 file reproduction apparatus, and a flowchart explaining an example of the flow of MP4 file reproduction processing.
- FIG. 20 is a block diagram illustrating a main configuration example of a computer.
- <First Embodiment> <Hierarchization of Still Images and Moving Images>
- As an image encoding/decoding system, there is a hierarchical encoding/decoding system that efficiently encodes an image layered into a plurality of layers using inter-layer prediction.
- As such a layered image, for example, there is one in which a still image is a base layer and a moving image is an enhancement layer. That is, in hierarchical encoding, prediction that refers to the still image is performed when the moving image is encoded.
- In the present technology, the decoding timing of the still image is designated using a DTS (Decoding Time Stamp), which is the time information designating the decoding timing of each frame of the moving image. That is, the correspondence between still images and moving-image frames is expressed using the DTS, and that information is stored in a file.
- More specifically, a file is generated that stores, in different tracks, still image encoded data in which still images are encoded and moving image encoded data in which moving images are encoded using prediction that refers to the still images.
- Then, time information (DTS) designating the decoding timing of each frame is set in the track storing the moving image encoded data of the file, and in the track storing the still image encoded data of the file, time information designating the decoding timing of the still images is set using the time information of the moving image encoded data, based on the reference relationship between the still images and the moving image for the prediction.
- In this way, the decoding timing of the moving image and the still images can be controlled on a single timeline. That is, it becomes possible to control the decoding timing of encoded data obtained by hierarchically encoding an image layered into a plurality of layers.
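The setting just described, deriving the decoding timestamp (DTS) of each still image from the timeline of the moving-image track, can be sketched as follows. The function names and the 90 kHz timescale are illustrative assumptions, not part of the described technology:

```python
# Illustrative sketch: derive decoding timestamps (DTS) for still-image
# samples from the moving-image track's timeline, based on which video
# frame references each still image for inter-layer prediction.

def video_frame_dts(frame_index, frame_duration):
    """DTS of a video frame on the shared timeline, in timescale ticks."""
    return frame_index * frame_duration

def still_image_dts(reference_map, frame_duration):
    """Set each still image's DTS equal to the DTS of the first video
    frame that references it, so the base-layer sample is decoded no
    later than the enhancement-layer frame that needs it."""
    return {
        still_id: video_frame_dts(first_ref_frame, frame_duration)
        for still_id, first_ref_frame in reference_map.items()
    }

# Hypothetical reference relationship: still image 0 is referenced from
# video frame 0 onward, still image 1 from frame 30 onward.
refs = {0: 0, 1: 30}
dts = still_image_dts(refs, frame_duration=3000)  # 30 fps at a 90 kHz timescale
print(dts)  # {0: 0, 1: 90000}
```

Because both tracks share one timescale, a player can order decoding of base-layer stills and enhancement-layer frames simply by sorting all samples by DTS.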
- the number of image data layers is arbitrary and may be three or more.
- a plurality of still image layers may exist, or a plurality of moving image layers may exist.
- the resolution of each image is arbitrary.
- the still image may have a higher resolution than the moving image, a lower resolution, or the same resolution.
- the values of other parameters such as the bit depth and color gamut of each image are also arbitrary.
- Incidentally, some electronic devices including an image sensor, such as digital still cameras, digital video cameras, mobile phones, smartphones, notebook personal computers, and tablet personal computers, have a function of capturing a still image together with a moving image.
- For example, when the user presses the shutter button to shoot a still image, some devices save not only the still image but also the moving image before and after the shooting timing.
- the electronic device can provide various services to the user using the moving image and the still image stored in this way.
- For example, the electronic device can provide the user with the moving image data and the still image data individually.
- for example, an electronic device can process the moving image using the still image to improve its image quality, or can create, from the moving image, a still image at a timing different from that of the actually captured still image (that is, the capture timing can be shifted in a pseudo manner).
- in such a case, the moving image and the still image are substantially similar images and are highly correlated with each other. That is, the redundancy between the moving image data and the still image data is high. Therefore, the electronic device uses the still image as the base layer and the moving image as the enhancement layer, and performs hierarchical encoding using prediction that refers to the still image (inter-layer prediction) when encoding the moving image.
- in another example, images of some frames of the moving image are extracted as still images (thumbnail images) at regular or irregular intervals and saved along with the moving image.
- the still image stored in this manner is used as a GUI (Graphical User Interface) or the like in a function such as scene search.
- in this case as well, the moving image and the still image are substantially similar images and are highly correlated with each other. That is, the redundancy between the moving image data and the still image data is high. Therefore, the electronic device uses the still image as the base layer and the moving image as the enhancement layer, and performs hierarchical encoding using prediction that refers to the still image (inter-layer prediction) when encoding the moving image.
- the encoding method of still images and moving images in hierarchical encoding is arbitrary.
- for example, a still image may be encoded using the JPEG (Joint Photographic Experts Group) method, and a moving image may be encoded using the SHVC (Scalable High Efficiency Video Coding) method.
- This technology is a technology applied when the encoded data thus hierarchically encoded is transmitted in a predetermined transmission format.
- the present technology will be described by taking as an example the case where the encoded data thus hierarchically encoded is filed in the MP4 file format.
- an MP4 file conforming to MPEG-DASH includes ftyp, moov, and mdat.
- the data of each sample (picture) of HEVC is stored in mdat as AV data.
- management information for each sample (for example, picture) is stored in a sample table box (Sample Table Box (stbl)).
- the sample table box includes a sample description box (Sample Description Box), a time-to-sample box (Time To Sample Box), a sample size box (Sample Size Box), a sample-to-chunk box (Sample To Chunk Box), a chunk offset box (Chunk Offset Box), and a subsample information box (Subsample Information Box).
- the sample description box stores information necessary for decoding, such as an HEVC sample entry (HEVC sample entry).
- the sample size box stores information about the sample size.
- the sample-to-chunk box stores information about the position of sample data.
- the chunk offset box stores information related to data offset.
- the subsample information box stores information about the subsample.
- the time-to-sample box stores information related to the sample time. That is, for example, the above-described DTS is set in the time-to-sample box.
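- as an illustrative sketch (not part of this description; the actual ISO base media file format has additional rules), the following Python code builds a time-to-sample ('stts') box from a list of per-sample DTS values by run-length encoding the decode deltas:

```python
import struct

def build_stts(dts_list):
    """Build a time-to-sample ('stts') box from sorted per-sample DTS values.

    The box run-length encodes decode deltas as (sample_count, sample_delta)
    pairs. The delta of the final sample is assumed equal to the previous
    delta (a simplification; the real value comes from the track duration).
    """
    deltas = [b - a for a, b in zip(dts_list, dts_list[1:])]
    deltas.append(deltas[-1] if deltas else 0)  # delta for the last sample
    entries = []  # run-length encode: [[sample_count, sample_delta], ...]
    for d in deltas:
        if entries and entries[-1][1] == d:
            entries[-1][0] += 1
        else:
            entries.append([1, d])
    body = struct.pack(">I", 0)              # version (0) and flags
    body += struct.pack(">I", len(entries))  # entry_count
    for count, delta in entries:
        body += struct.pack(">II", count, delta)
    return struct.pack(">I4s", 8 + len(body), b"stts") + body  # size + type
```

- reading such a box back yields (sample count, sample delta) pairs from which each sample's DTS can be reconstructed by accumulating the deltas.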
- FIG. 2 shows a main configuration example of an MP4 file that stores encoded data in which still images and moving images are hierarchically encoded.
- the MP4 file compliant with MPEG-DASH shown in Fig. 2 stores the encoded data divided into a separate track for each layer.
- track 1 (Track 1) stores the encoded data (JPG/BL sample) of each sample of the base layer (that is, the still image), and track 2 (Track 2) stores the encoded data (SHVC/EL sample) of each sample of the enhancement layer (that is, the moving image).
- here, a sample of the base layer or the enhancement layer is a predetermined unit, such as a picture, of the encoded data (the moving image or the still image) of each layer.
- this sample entry has a jpgC box (jpgC box) that stores configuration information necessary for decoding the JPEG encoded data.
- the sample entry of track 2 is 'lhv1' (Sample Entry = 'lhv1').
- This sample entry has an lhvC box (lhvC box) for storing configuration information necessary for decoding the SHVC encoded data.
- This lhvC box stores flag information (hevc_baselayer_flag) indicating whether or not the base layer encoding method is the HEVC (High Efficiency Video Coding) method.
- in addition, information on the extended video parameter set (VPS EXT) of the SHVC encoded data is stored in this lhvC box.
- a track reference (Track Reference) for designating a reference destination track is set for the track 2.
- the DTS of each SHVC sample (SHVC/EL Sample) is set in the time-to-sample box of the sample table box (Sample Table Box) of track 2.
- similarly, the DTS of each JPEG sample (JPG/BL Sample) is set in the time-to-sample box of the sample table box of track 1. That is, the DTS of each JPEG sample is set on the same timeline as the DTS of the SHVC samples of track 2.
- more specifically, the DTS of each JPEG sample (JPG/BL Sample) is set to the same value (the same time) as the DTS of the SHVC sample (SHVC/EL Sample) that refers to that JPEG sample. The DTS thus aligns the JPEG timeline with the SHVC timeline, so that the reference relationship between the base layer and the enhancement layer (that is, which sample of the base layer is referred to by which sample of the enhancement layer) is expressed.
- as a result, on the decoding side, the encoded data of the still image can be decoded at an appropriate timing based on this time information (DTS). Furthermore, when decoding the encoded data of the moving image, it is possible to correctly grasp, based on this time information (DTS), which sample of the base layer is referenced by which sample of the enhancement layer. That is, the moving image can be correctly decoded.
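- the way the decoding side can recover the reference relationship from the shared timeline can be sketched as follows (illustrative Python; the function and variable names are assumptions, not taken from this description):

```python
def derive_references(base_dts, el_dts):
    """Recover the inter-layer reference relationship from DTS values that
    were read from the time-to-sample boxes of the base layer track (still
    images) and the enhancement layer track (moving image frames): a base
    layer sample is referenced by the enhancement layer sample that shares
    its DTS on the common timeline.
    """
    base_by_time = {t: i for i, t in enumerate(base_dts)}
    refs = {}
    for el_index, t in enumerate(el_dts):
        if t in base_by_time:          # same value = same time = reference
            refs[el_index] = base_by_time[t]
    return refs
```

- for example, base samples placed at times 0 and 6000 on a 3000-tick frame grid would map to enhancement frames 0 and 2; frames with no same-time base sample simply get no entry.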
- FIG. 3 is a block diagram illustrating a main configuration example of an MP4 file generation apparatus that is an embodiment of an information processing apparatus to which the present technology is applied.
- the MP4 file generation apparatus 100 is an apparatus that hierarchically encodes a still image and a moving image, using the still image as the base layer and the moving image as the enhancement layer, and files the obtained encoded data of each layer into an MP4 file.
- the MP4 file generation apparatus 100 includes a base layer encoding unit 101, an enhancement layer encoding unit 102, a time information generation unit 103, and an MP4 file generation unit 104.
- the MP4 file generation apparatus 100 in FIG. 3 executes MP4 file generation processing to hierarchically encode input still images and moving images to generate an MP4 file. An example of the flow of this MP4 file generation process will be described with reference to the flowchart of FIG.
- the MP4 file generation apparatus 100 starts the MP4 file generation process. Note that it is desirable that the input still image and moving image be highly correlated with each other (images whose content is highly similar), since the encoding efficiency can be improved as the correlation becomes higher.
- the base layer encoding unit 101 encodes the input still image as a base layer in step S101.
- the base layer encoding unit 101 encodes a still image using, for example, the JPEG method, and generates encoded data (JPEG).
- the base layer encoding unit 101 supplies the generated base layer encoded data (JPEG) to the MP4 file generation unit 104.
- the base layer encoding unit 101 supplies a still image as a reference image to the enhancement layer encoding unit 102.
- This still image may be a decoded image obtained by decoding encoded data (JPEG).
- the base layer encoding unit 101 supplies encoding information that is information related to encoding of the still image to the enhancement layer encoding unit 102.
- in step S102, the enhancement layer encoding unit 102 encodes the input moving image as the enhancement layer.
- the enhancement layer encoding unit 102 encodes a moving image using, for example, the SHVC method, and generates encoded data (SHVC).
- the enhancement layer encoding unit 102 performs inter-layer prediction using the reference image of the base layer supplied from the base layer encoding unit 101 as necessary.
- also, the enhancement layer encoding unit 102 appropriately stores the base layer encoding information supplied from the base layer encoding unit 101, or information generated based on that encoding information, in the encoded data (SHVC) of the enhancement layer.
- note that inter-layer prediction may be performed in arbitrary frames and need not be performed in all frames. In frames for which inter-layer prediction that refers to the base layer is not performed, prediction closed within the enhancement layer, such as inter-frame prediction (prediction in the temporal direction), is performed.
- the enhancement layer encoding unit 102 supplies the generated encoded data (SHVC) of the enhancement layer to the MP4 file generation unit 104.
- the enhancement layer encoding unit 102 supplies reference information, which is information related to reference in inter-layer prediction, to the time information generation unit 103.
- the reference information includes, for example, information indicating an image reference source and a reference destination.
- in step S103, the time information generation unit 103 generates the time information of the base layer and the enhancement layer, that is, the DTS, based on the supplied reference information.
- more specifically, the time information generation unit 103 generates a DTS for each frame of the enhancement layer moving image, and generates the DTS of each still image of the base layer using those DTS values, based on the reference relationship between the base layer and the enhancement layer indicated by the reference information. That is, the time information generation unit 103 sets the DTS of each still image of the base layer to the same value (the same time) as the DTS of the enhancement layer frame that refers to that still image.
- the time information generation unit 103 supplies the generated DTS to the MP4 file generation unit 104.
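- the DTS assignment described above can be sketched as follows (illustrative Python; `still_refs`, which maps each still image index to the index of the moving image frame that refers to it, is a hypothetical stand-in for the reference information):

```python
def generate_dts(num_frames, frame_delta, still_refs):
    """Assign DTS values on a single timeline.

    Each enhancement layer frame i gets DTS i * frame_delta. Each base
    layer still image is given the same DTS as the frame that refers to
    it; still_refs maps a still image index to the index of the
    referencing moving image frame.
    """
    el_dts = [i * frame_delta for i in range(num_frames)]
    bl_dts = [el_dts[frame] for _, frame in sorted(still_refs.items())]
    return bl_dts, el_dts
```

- the two returned lists would then be stored in the time-to-sample boxes of track 1 and track 2, respectively.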
- in step S104, the MP4 file generation unit 104 generates a track for each layer, applies the DTS of each layer to its track, and generates an MP4 file. That is, the MP4 file generation unit 104 generates an MP4 file that stores the base layer encoded data (JPEG) supplied from the base layer encoding unit 101 (generated in step S101) and the enhancement layer encoded data (SHVC) supplied from the enhancement layer encoding unit 102 (generated in step S102) in mutually different tracks.
- the MP4 file generation unit 104 stores the base layer DTS supplied from the time information generation unit 103 (generated in step S103) in the time-to-sample box of the track that stores the base layer encoded data (JPEG) (track 1 in the example of FIG. 2). Similarly, the MP4 file generation unit 104 stores the enhancement layer DTS supplied from the time information generation unit 103 (generated in step S103) in the time-to-sample box of the track that stores the enhancement layer encoded data (SHVC) (track 2 in the example of FIG. 2).
- in step S105, the MP4 file generation unit 104 outputs the MP4 file generated in step S104.
- by doing so, the MP4 file generation apparatus 100 can specify the decoding timing of the base layer (still images) using the DTS of the enhancement layer (each frame of the moving image). That is, the decoding timing of the encoded data of each layer can be presented to the decoding side as a single timeline, and the decoding timing can be indicated even though the base layer consists of still images that have no time information of their own. In other words, the reference relationship between the base layer and the enhancement layer can be presented to the decoding side using this time information (DTS).
- the MP4 file generation apparatus 100 can control the decoding timing of encoded data obtained by hierarchically encoding a plurality of hierarchical images.
- FIG. 5 is a block diagram illustrating a main configuration example of an MP4 file playback device that is an embodiment of an information processing device to which the present technology is applied.
- the MP4 file playback device 150 is a device that plays back the MP4 file generated as described above by the MP4 file generation device 100 of FIG. 3, and generates and outputs a decoded image of one or both of the base layer and the enhancement layer.
- the MP4 file playback device 150 includes an MP4 file playback unit 151, a time information analysis unit 152, a base layer decoding unit 153, and an enhancement layer decoding unit 154.
- the MP4 file reproduction device 150 in FIG. 5 reproduces an input MP4 file by executing an MP4 file reproduction process, and generates a decoded image of an arbitrary layer.
- An example of the flow of this MP4 file reproduction process will be described with reference to the flowchart of FIG. 6. Here, the process in the case of obtaining a decoded image of the enhancement layer is described.
- an MP4 file playback device 150 starts MP4 file playback processing.
- in step S151, the MP4 file playback unit 151 extracts the sample to be processed of the enhancement layer from the MP4 file (track 2 in the example of FIG. 2).
- the MP4 file playback unit 151 supplies the extracted enhancement layer sample (SHVC) to the enhancement layer decoding unit 154.
- the MP4 file reproduction unit 151 extracts time information (DTS) of each track (each layer of hierarchical encoding) from the MP4 file and supplies the time information (DTS) to the time information analysis unit 152.
- in step S152, based on the DTS supplied from the MP4 file playback unit 151, the time information analysis unit 152 determines whether a base layer sample whose DTS has the same value (the same time) as that of the enhancement layer sample extracted in step S151 exists. If it is determined that such a sample exists, the process proceeds to step S153.
- in that case, the time information analysis unit 152 analyzes the reference relationship of inter-layer prediction between the base layer and the enhancement layer (for example, which sample of the enhancement layer refers to which sample of the base layer) from the DTS of each layer, and supplies reference information indicating that reference relationship to the enhancement layer decoding unit 154.
- in step S153, the MP4 file playback unit 151 extracts the base layer sample (that is, the base layer sample determined in step S152 to have a DTS at the same time as that of the enhancement layer sample extracted in step S151) from the MP4 file (track 1 in the example of FIG. 2).
- the MP4 file playback unit 151 supplies the extracted base layer sample (JPEG) to the base layer decoding unit 153.
- in step S154, the base layer decoding unit 153 decodes the base layer sample supplied from the MP4 file playback unit 151 (extracted in step S153) at the timing specified by the DTS of that sample, using a decoding method (for example, the JPEG method) corresponding to its encoding method, and generates a decoded image.
- the base layer decoding unit 153 supplies the generated decoded image to the enhancement layer decoding unit 154 as a reference image.
- in step S155, the enhancement layer decoding unit 154 performs motion compensation between layers using the reference image supplied from the base layer decoding unit 153 (generated in step S154), that is, the decoded image of the base layer, based on the reference information supplied from the time information analysis unit 152, and decodes the enhancement layer sample supplied from the MP4 file playback unit 151 (extracted in step S151) to generate a decoded image of the enhancement layer.
- in step S156, the base layer decoding unit 153 outputs the decoded image of the base layer generated in step S154, and the enhancement layer decoding unit 154 outputs the decoded image of the enhancement layer generated in step S155. When the process of step S156 ends, the process proceeds to step S159.
- if it is determined in step S152 that no base layer sample has a DTS of the same value (the same time) as that of the enhancement layer sample extracted in step S151, the process proceeds to step S157.
- in step S157, the enhancement layer decoding unit 154 decodes the enhancement layer sample supplied from the MP4 file playback unit 151 (extracted in step S151) without referring to the base layer, and generates a decoded image of the enhancement layer.
- in step S158, the enhancement layer decoding unit 154 outputs the decoded image of the enhancement layer generated in step S157. When the process of step S158 ends, the process proceeds to step S159.
- in step S159, the MP4 file playback unit 151 determines whether all the samples have been processed. If there is an unprocessed sample, the process returns to step S151 and the subsequent processes are repeated. The processing from step S151 to step S159 is repeated for each sample, and when it is determined in step S159 that all the samples have been processed, the MP4 file playback process ends.
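- the control flow of steps S151 through S159 can be sketched as follows (illustrative Python; `mp4`, `decode_base`, and `decode_enh` are hypothetical stand-ins for the MP4 file playback unit 151, the base layer decoding unit 153, and the enhancement layer decoding unit 154):

```python
def play(mp4, decode_base, decode_enh):
    """Control-flow sketch of the playback process (steps S151 to S159).

    For each enhancement layer sample, a base layer sample with the same
    DTS is decoded first (if one exists) and passed as the reference
    image; otherwise the frame is decoded without an inter-layer
    reference. mp4 maps track names to lists of (dts, sample) pairs.
    """
    base_by_dts = dict(mp4["track1"])
    outputs = []
    for dts, sample in mp4["track2"]:            # step S151: next EL sample
        if dts in base_by_dts:                   # step S152: same-time BL?
            ref = decode_base(base_by_dts[dts])  # steps S153-S154
            outputs.append(decode_enh(sample, ref))   # step S155
        else:
            outputs.append(decode_enh(sample, None))  # step S157
    return outputs
```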
- note that when only the base layer (still image) is to be decoded, the MP4 file playback device 150 only needs to perform the processing of step S153 and step S154 described above.
- the MP4 file playback apparatus 150 can decode the base layer (still image) at an appropriate timing. That is, the MP4 file playback apparatus 150 can correctly decode encoded data obtained by hierarchically encoding a plurality of hierarchical images. In particular, even when the base layer is a still image having no time information, it can be correctly decoded.
- <Second Embodiment> <POC reference table> Instead of the DTS, a POC reference table indicating the reference relationship between the base layer and the enhancement layer may be stored separately.
- Fig. 7 shows an example of the main structure of an MP4 file in that case.
- a POC reference table (BaseLayerPOCSampleEntry) that indicates the reference relationship between the enhancement layer and the base layer using POC (Picture Order Count) is stored in the first track (Track 1) storing the encoded data of the base layer.
- in the POC reference table (BaseLayerPOCSampleEntry), the referring enhancement layer sample (SHVC/EL Sample) and the referenced base layer sample (JPG/BL Sample) are indicated using POC.
- the DTS of track 1 can store the decoding timing that does not depend on inter-layer prediction, that is, the decoding timing that can be used when only the base layer is decoded. For example, when slide show reproduction is performed using a still image of the base layer, a moving image of the enhancement layer is not necessary, so that only the base layer needs to be decoded. In such a case, the decoding timing corresponding to the reproduction timing as the slide show can be stored in the DTS of the track 1.
- the generation of the POC reference table may be performed according to a syntax as shown in FIG. 8, for example.
- the POC of the enhancement layer that refers to the sample is associated with the POC of each sample of the base layer.
- the format of the POC reference table is arbitrary and is not limited to this example.
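- as an illustration, a data structure along the lines of such a POC reference table might look as follows (the class and method names are assumptions for illustration, not the actual syntax of FIG. 8):

```python
class BaseLayerPOCSampleEntry:
    """Illustrative POC reference table: each entry associates the POC of
    a base layer sample with the POC of the enhancement layer sample
    that refers to it."""

    def __init__(self):
        self.entries = []  # list of (base_poc, referring_el_poc) pairs

    def add(self, base_poc, el_poc):
        self.entries.append((base_poc, el_poc))

    def base_for(self, el_poc):
        """Return the base layer POC referenced by the given enhancement
        layer POC, or None if that frame uses no inter-layer prediction."""
        for base_poc, ref_poc in self.entries:
            if ref_poc == el_poc:
                return base_poc
        return None
```

- a player would consult such a table per enhancement layer sample instead of comparing DTS values.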
- FIG. 9 is a block diagram illustrating a main configuration example of an MP4 file generation device that is an embodiment of an information processing device to which the present technology is applied.
- the MP4 file generation apparatus 200 is an apparatus similar to the MP4 file generation apparatus 100 (FIG. 3), and basically has the same configuration as the MP4 file generation apparatus 100.
- the MP4 file generation device 200 includes a time information generation unit 203 instead of the time information generation unit 103 in the MP4 file generation device 100.
- the MP4 file generation device 200 includes an MP4 file generation unit 204 instead of the MP4 file generation unit 104 in the MP4 file generation device 100.
- the time information generation unit 203 generates a POC reference table instead of generating a DTS based on the reference information, and supplies it to the MP4 file generation unit 204.
- the MP4 file generation unit 204 stores the POC reference table in the MP4 file instead of storing the DTS in the MP4 file.
- the processes of step S201 and step S202 are performed in the same manner as the processes of step S101 and step S102 described above.
- the base layer encoding unit 101 supplies the generated base layer encoded data (JPEG) to the MP4 file generation unit 204.
- the enhancement layer encoding unit 102 supplies the generated enhancement layer encoded data (SHVC) to the MP4 file generation unit 204, and supplies reference information that is information related to reference in inter-layer prediction to the time information generation unit 203. To do.
- in step S203, the time information generation unit 203 generates a POC reference table (BaseLayerPOCSampleEntry) based on the supplied reference information.
- the time information generation unit 203 supplies the generated POC reference table (BaseLayerPOCSampleEntry) to the MP4 file generation unit 204.
- in step S204, the MP4 file generation unit 204 generates a track for each layer, applies the DTS of each layer to its track as appropriate, and generates an MP4 file. That is, the MP4 file generation unit 204 generates an MP4 file that stores the base layer encoded data (JPEG) supplied from the base layer encoding unit 101 (generated in step S201) and the enhancement layer encoded data (SHVC) supplied from the enhancement layer encoding unit 102 (generated in step S202) in mutually different tracks.
- in addition, the MP4 file generation unit 204 stores the POC reference table (BaseLayerPOCSampleEntry) supplied from the time information generation unit 203 (generated in step S203) in the track that stores the base layer encoded data (JPEG) (track 1 in the example of FIG. 7).
- the MP4 file generation unit 204 sets the DTS of the track (track 2 in the example of FIG. 7) that stores the enhancement layer encoded data (SHVC). Further, the MP4 file generation unit 204 appropriately sets the DTS of the track (track 1 in the example of FIG. 7) that stores the base layer encoded data (JPEG).
- the MP4 file generation unit 204 appropriately sets other necessary information.
- in step S205, the MP4 file generation unit 204 outputs the MP4 file generated in step S204.
- the MP4 file generation apparatus 200 can specify the decoding timing of the base layer (still image) using the POC reference table. That is, the decoding timing of the encoded data of each layer can be shown on the decoding side as one timeline. Also, the decoding timing can be indicated even if the base layer is a still image having no time information.
- the MP4 file generation apparatus 200 can control the decoding timing of encoded data obtained by hierarchically encoding a plurality of hierarchical images.
- FIG. 11 is a block diagram illustrating a main configuration example of an MP4 file reproduction device that is an embodiment of an information processing device to which the present technology is applied.
- the MP4 file playback device 250 is a device that plays back the MP4 file generated as described above by the MP4 file generation device 200 in FIG. 9, and generates and outputs a decoded image of one or both of the base layer and the enhancement layer.
- the MP4 file playback device 250 basically has the same configuration as the MP4 file playback device 150 (FIG. 5). However, the MP4 file playback device 250 has a time information analysis unit 252 instead of the time information analysis unit 152 in the MP4 file playback device 150.
- in step S251, the MP4 file playback unit 151 extracts the sample to be processed of the enhancement layer from the MP4 file (track 2 in the example of FIG. 7).
- the MP4 file playback unit 151 supplies the extracted enhancement layer sample (SHVC) to the enhancement layer decoding unit 154.
- the MP4 file reproduction unit 151 extracts the POC reference table (BaseLayerPOCSampleEntry) from the MP4 file (track 1 in the example of FIG. 7), and supplies it to the time information analysis unit 252.
- in step S252, based on the POC reference table (BaseLayerPOCSampleEntry) supplied from the MP4 file playback unit 151, the time information analysis unit 252 identifies the base layer sample (POC) corresponding to the enhancement layer sample (POC) extracted in step S251.
- in step S253, the time information analysis unit 252 determines whether or not to perform inter-layer prediction.
- if a corresponding base layer sample has been identified, the time information analysis unit 252 determines that inter-layer prediction is to be performed. In that case, the process proceeds to step S254.
- in that case, the time information analysis unit 252 analyzes the reference relationship of inter-layer prediction between the base layer and the enhancement layer (for example, which sample of the enhancement layer refers to which sample of the base layer) from the POC reference table, and supplies reference information indicating that reference relationship to the enhancement layer decoding unit 154.
- the processes of step S254 to step S257 are executed in the same manner as the processes of step S153 to step S156 described above. When the process of step S257 ends, the process proceeds to step S260.
- if no corresponding base layer sample has been identified in step S252, the time information analysis unit 252 determines in step S253 that inter-layer prediction is not to be performed. In that case, the process proceeds to step S258.
- the processes of step S258 and step S259 are performed in the same manner as the processes of step S157 and step S158 described above. When the process of step S259 ends, the process proceeds to step S260.
- in step S260, the MP4 file playback unit 151 determines whether all the samples have been processed. If there is an unprocessed sample, the process returns to step S251 and the subsequent processes are repeated. The processing from step S251 to step S260 is repeated for each sample, and when it is determined in step S260 that all the samples have been processed, the MP4 file playback process ends.
- note that when only the base layer (still image) is to be decoded, the MP4 file playback device 250 only needs to perform the processing of step S254 and step S255 described above.
- the MP4 file playback apparatus 250 can decode the base layer (still image) at an appropriate timing. That is, the MP4 file reproduction device 250 can correctly decode encoded data obtained by hierarchically encoding a plurality of hierarchical images. In particular, even when the base layer is a still image having no time information, it can be correctly decoded.
- the entity of the base layer encoded data may be outside the MP4 file.
- the MP4 file only needs to store link information indicating the storage location of the entity of the JPEG file.
- Fig. 13 shows an example of the main structure of an MP4 file in that case.
- the configuration of the MP4 file is basically the same as the example of FIG. 2, and the reference relationship between the base layer and the enhancement layer is expressed by DTS.
- in the base layer track (track 1), link information to the entities of the JPEG files (JPG File For sample1, JPG File For sample2, and so on) is stored as the samples of encoded data (JPG/BL sample1, JPG/BL sample2, and so on).
- at the time of decoding, the entity of each JPEG file only needs to be read based on this link information.
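- resolving such link information might look as follows (an illustrative sketch; this description does not fix the link format, so a plain relative file path is assumed here):

```python
import os

def load_base_sample(link_info, search_dir="."):
    """Resolve the link information stored in track 1 (instead of the
    JPEG data itself) and read the entity of the external JPEG file.
    A plain relative file path is assumed as the link format."""
    path = os.path.join(search_dir, link_info)
    with open(path, "rb") as f:
        return f.read()
```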
- the rest is the same as in the case of the first embodiment.
- FIG. 14 is a block diagram illustrating a main configuration example of an MP4 file generation device that is an embodiment of an information processing device to which the present technology is applied.
- the MP4 file generation apparatus 300 is an apparatus similar to the MP4 file generation apparatus 100 (FIG. 3), and basically has the same configuration as the MP4 file generation apparatus 100.
- the MP4 file generation apparatus 300 includes a base layer encoding unit 301 instead of the base layer encoding unit 101 in the MP4 file generation apparatus 100.
- the MP4 file generation apparatus 300 includes an MP4 file generation unit 304 instead of the MP4 file generation unit 104 in the MP4 file generation apparatus 100.
- the base layer encoding unit 301 outputs the entity of the generated base layer encoded data (JPEG) and notifies the MP4 file generation unit 304 of the storage location of that encoded data (JPEG) (for example, supplies it to the MP4 file generation unit 304 as JPEG storage location information).
- instead of the entity of the base layer encoded data (JPEG), the MP4 file generation unit 304 stores its link information (JPEG storage location information) in the MP4 file.
- the base layer encoding unit 301 encodes the input still image as a base layer in step S301.
- the base layer encoding unit 301 encodes a still image using, for example, the JPEG method, and generates encoded data (JPEG).
- the base layer encoding unit 301 outputs the generated base layer encoded data (JPEG) and stores it in a predetermined storage location.
- the base layer encoding unit 301 supplies JPEG storage location information indicating the storage location of the encoded data (JPEG) to the MP4 file generation unit 304.
- the base layer encoding unit 301 supplies a reference image (still image) and encoding information to the enhancement layer encoding unit 102 as in the case of the base layer encoding unit 101.
- the processes of step S303 and step S304 are performed in the same manner as the processes of step S102 and step S103 described above.
- the enhancement layer encoding unit 102 supplies the generated encoded data (SHVC) of the enhancement layer to the MP4 file generation unit 304.
- in step S305, the MP4 file generation unit 304 generates a track for each layer, applies the DTS of each layer to its track, and generates an MP4 file. That is, the MP4 file generation unit 304 stores the JPEG storage location information supplied from the base layer encoding unit 301 in the base layer track (track 1 in the example of FIG. 13), and stores the enhancement layer encoded data (SHVC) supplied from the enhancement layer encoding unit 102 (generated in step S303) in the enhancement layer track (track 2 in the example of FIG. 13).
- the MP4 file generation unit 304 stores the base layer DTS supplied from the time information generation unit 103 (generated in step S304) in the time-to-sample box of the track that stores the base layer encoded data (JPEG) (track 1 in the example of FIG. 13). Similarly, the MP4 file generation unit 304 stores the enhancement layer DTS supplied from the time information generation unit 103 (generated in step S304) in the time-to-sample box of the track that stores the enhancement layer encoded data (SHVC) (track 2 in the example of FIG. 13).
- the MP4 file generation unit 304 appropriately sets other necessary information.
- in step S306, the MP4 file generation unit 304 outputs the MP4 file generated in step S305.
- by doing so, the MP4 file generation apparatus 300 can specify the decoding timing of the base layer (still images) using the DTS of the enhancement layer (each frame of the moving image). That is, the decoding timing of the encoded data of each layer can be presented to the decoding side as a single timeline, and the decoding timing can be indicated even though the base layer consists of still images that have no time information of their own. In other words, the reference relationship between the base layer and the enhancement layer can be presented to the decoding side using this time information (DTS).
- thus, even when the entity of the base layer encoded data (the JPEG file) exists outside the MP4 file, the MP4 file generation apparatus 300 can control the decoding timing of encoded data obtained by hierarchically encoding images of a plurality of layers.
- FIG. 16 is a block diagram illustrating a main configuration example of an MP4 file reproduction device which is an embodiment of an information processing device to which the present technology is applied.
- the MP4 file playback device 350 is a device that plays back the MP4 file generated as described above by the MP4 file generation device 300 in FIG. 14, and generates and outputs a decoded image of one or both of the base layer and the enhancement layer.
- the MP4 file playback device 350 basically has the same configuration as the MP4 file playback device 150 (FIG. 5). However, the MP4 file playback device 350 has an MP4 file playback unit 351 instead of the MP4 file playback unit 151 in the MP4 file playback device 150. Also, the MP4 file playback device 350 has a base layer decoding unit 353 instead of the base layer decoding unit 153 in the MP4 file playback device 150.
- in step S351, the MP4 file playback unit 351 extracts the sample to be processed in the enhancement layer from the MP4 file (track 2 in the example of FIG. 13).
- the MP4 file playback unit 351 supplies the extracted enhancement layer sample (SHVC) to the enhancement layer decoding unit 154.
- the MP4 file reproduction unit 351 extracts time information (DTS) of each track (each layer of hierarchical coding) from the MP4 file, and supplies the time information (DTS) to the time information analysis unit 152.
- in step S352, based on the DTS supplied from the MP4 file playback unit 351, the time information analysis unit 152 determines whether a base layer sample whose DTS has the same value (the same time) as that of the enhancement layer sample extracted in step S351 exists. If it is determined that such a sample exists, the process proceeds to step S353.
- that is, the time information analysis unit 152 analyzes, from the DTS of each layer, the reference relationship of inter-layer prediction between the base layer and the enhancement layer (such as which sample of the enhancement layer refers to which sample of the base layer), and supplies reference information indicating the reference relationship to the enhancement layer decoding unit 154.
- in step S353, the MP4 file playback unit 351 extracts the base layer sample storage location information (JPEG storage location information) from the MP4 file (track 1 in the example of FIG. 13).
- the MP4 file playback unit 351 supplies the extracted storage location information (JPEG storage location information) to the base layer decoding unit 353.
- in step S354, the base layer decoding unit 353 acquires the entity of the base layer encoded data (JPEG) based on the base layer sample storage location information (JPEG storage location information).
- step S355 to step S357 is executed in the same manner as each process from step S154 to step S156 in FIG.
- when the process of step S357 ends, the process proceeds to step S360.
- if it is determined in step S352 that there is no base layer sample whose DTS has the same value (the same time) as that of the enhancement layer sample extracted in step S351, the process proceeds to step S358.
- step S358 and step S359 are performed similarly to each process of step S157 and step S158 of FIG.
- when the process of step S359 ends, the process proceeds to step S360.
- in step S360, the MP4 file playback unit 351 determines whether or not all the samples have been processed. If there is an unprocessed sample, the process returns to step S351 and the subsequent processes are repeated. The processing from step S351 to step S360 is thus repeated for each sample, and when it is determined in step S360 that all samples have been processed, the MP4 file playback processing ends.
- by performing the above-described processing (for example, steps S353 to S355), the MP4 file playback apparatus 350 can decode the base layer (still image) at an appropriate timing. That is, the MP4 file reproduction device 350 can correctly decode encoded data obtained by hierarchically encoding images of a plurality of layers. In particular, decoding can be performed correctly even when the base layer is a still image having no time information, and even when its encoded data is not stored in the MP4 file.
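The playback flow of steps S351 through S360 can be sketched as follows. This is an illustrative outline only: `fetch_jpeg`, `decode_base`, and `decode_enh` are hypothetical stand-ins for the device's units, not interfaces from the patent.

```python
# Hedged sketch of the MP4 playback loop: for each enhancement-layer sample,
# check for a base-layer sample at the same DTS; if one exists, fetch the
# external JPEG entity and decode it first, then decode the SHVC sample with
# inter-layer reference.

def play_mp4(enh_samples, base_samples_by_dts, fetch_jpeg, decode_base, decode_enh):
    """enh_samples: list of (dts, shvc_data); base_samples_by_dts maps a DTS
    to the storage-location info of an external JPEG file."""
    decoded = []
    for dts, shvc in enh_samples:
        ref = None
        location = base_samples_by_dts.get(dts)
        if location is not None:
            # A base-layer sample exists at the same time (S352): fetch the
            # JPEG entity from outside the MP4 file and decode it (S353-S355).
            ref = decode_base(fetch_jpeg(location))
        # Decode the enhancement-layer sample, with the still image as the
        # inter-layer reference when one was decoded for this DTS (S356/S357).
        decoded.append(decode_enh(shvc, ref))
    return decoded

# Demo with trivial stand-in decoders and a hypothetical storage location:
frames = play_mp4(
    [(0, "frame0"), (3003, "frame1")],
    {0: "http://example.com/sample1.jpg"},
    fetch_jpeg=lambda loc: loc,
    decode_base=lambda jpg: ("still", jpg),
    decode_enh=lambda shvc, ref: (shvc, ref),
)
```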
- Control of decoding timing of base layer encoded data may be performed in MPD (Media Presentation Description) of MPEG-DASH (Moving Picture Experts Group-Dynamic Adaptive Streaming over HTTP).
- MPD has a configuration as shown in FIG. 18, for example.
- the client selects an optimum attribute from the representation attribute included in the period of the MPD (Media Presentation in FIG. 18).
- the client reads the first segment (Segment) of the selected representation (Representation), acquires the initialization segment (Initialization Segment), and processes it. Subsequently, the client acquires and reproduces the subsequent segment (Segment).
- in the MPD, a media presentation is divided into periods (Period), which are data units in the time direction. Each period (Period) is managed in segments (Segment), which are also data units in the time direction, and within a period a plurality of representations (Representation) having different attributes such as bit rate can be configured.
- this MPD (also referred to as an MPD file) has a hierarchical structure below the period, as shown in FIG. 20. Further, when these MPD structures are arranged on the time axis, the example shown in FIG. 21 is obtained. As is clear from the example of FIG. 21, a plurality of representations (Representation) exist for the same segment (Segment). By adaptively selecting one of these, the client can acquire and reproduce appropriate stream data according to the communication environment, its own decoding capability, and the like.
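The representation selection described above can be illustrated with a minimal sketch (the selection policy and the `bandwidth` field name are assumptions for illustration, not taken from the patent): pick the highest-bandwidth representation the measured throughput can sustain, falling back to the lowest otherwise.

```python
# Minimal sketch of client-side representation selection for MPEG-DASH.

def select_representation(representations, measured_bps):
    """representations: list of dicts with a 'bandwidth' key (bits/sec).
    Return the best representation the measured throughput can sustain."""
    affordable = [r for r in representations if r["bandwidth"] <= measured_bps]
    if not affordable:
        # Nothing fits: fall back to the lowest-bandwidth representation.
        return min(representations, key=lambda r: r["bandwidth"])
    return max(affordable, key=lambda r: r["bandwidth"])

reps = [{"id": "low", "bandwidth": 500_000},
        {"id": "mid", "bandwidth": 2_000_000},
        {"id": "high", "bandwidth": 8_000_000}]
print(select_representation(reps, 3_000_000)["id"])  # mid
```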
- FIG. 22 shows a configuration example of each file when the decoding timing of base layer encoded data (JPEG file) is controlled using such MPD.
- the encoded data of the base layer is configured as JPEG files (JPG File) (JPG File For sample1, JPG File For sample2), and the encoded data of the enhancement layer is configured as an MP4 file (MP4 File).
- the track of the MP4 file only needs to be track 2 for storing the encoded data of the enhancement layer.
- the configuration of the track 2 is as described in the other embodiments.
- an adaptation set is set for each layer, and a link to the entity of the encoded data is set by segment info.
- the time information of each sample of the base layer encoded data (JPG/BL sample1, JPG/BL sample2) and of the enhancement layer encoded data (SHVC/EL sample) is managed using the MPD timeline. That is, the decoding timing of each layer is aligned with the MPD timeline.
- an example of such an MPD description is shown in the figures.
- the setting of the enhancement layer adaptation set is described, and the decoding timing of the encoded data (SHVC) is expressed by the MPD timeline.
- the setting of the adaptation set of the base layer is described, and the decoding timing of the encoded data (JPEG) is expressed by the MPD timeline.
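Purely as a hedged illustration (the patent's actual MPD descriptions are given in its figures and are not reproduced here), an MPD of the kind described above might look like the following simplified fragment, with one adaptation set per layer; all URLs, codec strings, and attribute values are hypothetical:

```xml
<!-- Hypothetical, simplified MPD fragment: one AdaptationSet per layer.
     Element values are illustrative examples only. -->
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <!-- Base layer: each still image (JPEG) linked as its own segment -->
    <AdaptationSet mimeType="image/jpeg">
      <Representation id="base">
        <SegmentList timescale="90000">
          <SegmentURL media="jpg_file_for_sample1.jpg"/>
          <SegmentURL media="jpg_file_for_sample2.jpg"/>
        </SegmentList>
      </Representation>
    </AdaptationSet>
    <!-- Enhancement layer: the SHVC samples stored in the MP4 file -->
    <AdaptationSet mimeType="video/mp4">
      <Representation id="enhancement">
        <SegmentList timescale="90000">
          <SegmentURL media="enhancement_layer.mp4"/>
        </SegmentList>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
```

Because both adaptation sets share the single MPD timeline, the decoding timing of a still image and of the video frames that reference it can be expressed on one common time axis, as the text describes.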
- FIG. 25 is a block diagram illustrating a main configuration example of a file generation device that is an embodiment of an information processing device to which the present technology is applied.
- the file generation apparatus 400 hierarchically encodes still images and moving images using the still images as base layers and the moving images as enhancement layers, and generates and outputs JPEG files, MP4 files, MPDs, and the like.
- the file generation device 400 basically has the same configuration as the MP4 file generation device 300 (FIG. 14). However, the file generation device 400 includes a time information generation unit 403 instead of the time information generation unit 103 in the MP4 file generation device 300. Furthermore, the file generation device 400 includes an MP4 file generation unit 404 instead of the MP4 file generation unit 304 in the MP4 file generation device 300. Furthermore, the file generation device 400 includes an MPD generation unit 405.
- the base layer encoding unit 301 supplies the JPEG storage location information to the MPD generation unit 405 instead of the MP4 file generation unit 304.
- the enhancement layer encoding unit 102 supplies the encoded data (SHVC) to the MP4 file generation unit 404 and supplies the reference information to the time information generation unit 403.
- the time information generation unit 403 generates time information (DTS) based on the reference information and supplies it to the MPD generation unit 405.
- the MP4 file generation unit 404 generates and outputs an MP4 file that stores the enhancement layer encoded data (SHVC). Further, the MP4 file generation unit 404 supplies the generated MP4 file to the MPD generation unit 405.
- the MPD generation unit 405 generates an MPD that controls playback of the enhancement layer MP4 file and the base layer JPEG file. Then, the MPD generation unit 405 converts the time information (DTS) of each layer into an MPD timeline and describes it in the MPD. The MPD generation unit 405 outputs the generated MPD.
- the processes in steps S401 to S403 are performed in the same manner as the processes in steps S301 to S303 in FIG.
- the base layer encoding unit 301 outputs the generated base layer encoded data (JPEG) and stores it in a predetermined storage location. Further, the base layer encoding unit 301 supplies JPEG storage location information indicating the storage location of the encoded data (JPEG) to the MPD generation unit 405. Furthermore, the base layer encoding unit 301 supplies a reference image (still image) and encoding information to the enhancement layer encoding unit 102.
- the enhancement layer encoding unit 102 supplies the generated enhancement layer encoded data (SHVC) to the MP4 file generation unit 404, and supplies reference information, which is information related to reference in inter-layer prediction, to the time information generation unit 403.
- in step S404, the MP4 file generation unit 404 generates an MP4 file for storing the supplied enhancement layer encoded data (SHVC).
- in step S405, the MP4 file generation unit 404 outputs the generated MP4 file. Further, the MP4 file generation unit 404 supplies the generated MP4 file to the MPD generation unit 405.
- in step S406, the time information generation unit 403 expresses the decoding timing of each sample of the base layer and the enhancement layer on the MPD timeline, based on the reference information supplied from the enhancement layer encoding unit 102 (that is, the reference relationship between the samples of the base layer and the enhancement layer).
- the time information generation unit 403 supplies the time of each sample of the base layer and the enhancement layer shown on the MPD timeline to the MPD generation unit 405 as time information.
- in step S407, the MPD generation unit 405 generates an MPD that controls the base layer and the enhancement layer. That is, the MPD generation unit 405 generates an adaptation set for each layer. Then, the MPD generation unit 405 describes link information (link information of each sample) indicating the storage location of the JPEG files, which are the encoded data of the base layer, in the segment information of the base layer adaptation set. Likewise, the MPD generation unit 405 describes link information indicating the storage location of the MP4 file containing the enhancement layer encoded data in the segment information of the enhancement layer adaptation set.
- the MPD generation unit 405 stores the time information generated in step S406 in the MPD. That is, the MPD generation unit 405 describes the decoding timing of each sample of each layer expressed on the MPD timeline in the MPD.
- in step S408, the MPD generation unit 405 outputs the MPD generated as described above.
- when the process of step S408 ends, the file generation process ends.
- the file generation apparatus 400 can control the decoding timing of each sample of each layer on the MPD timeline. That is, the decoding timing of the encoded data of each layer can be shown on the decoding side as one timeline. Also, the decoding timing can be indicated even if the base layer is a still image having no time information. In other words, the reference relationship between the base layer and the enhancement layer can be shown to the decoding side using such time information.
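The conversion performed in step S406 can be pictured as mapping each sample's DTS, expressed in track-timescale ticks, onto a single MPD timeline in seconds. This is an illustrative sketch under assumed timescale values, not the patent's implementation:

```python
# Hedged sketch of expressing per-layer sample times on one MPD timeline.
from fractions import Fraction

def to_mpd_timeline(dts_ticks, timescale):
    """Convert a DTS in timescale ticks to seconds on the MPD timeline,
    kept exact as a Fraction to avoid floating-point drift."""
    return Fraction(dts_ticks, timescale)

# Example: 30000/1001 fps video with timescale 30000, so frame n has
# DTS n*1001 ticks.
enh_times = [to_mpd_timeline(n * 1001, 30000) for n in range(4)]
# A still image referenced by frame 0 is given the same timeline position,
# so its decoding timing is expressed on the shared MPD timeline.
base_time = enh_times[0]
print(float(base_time), [float(t) for t in enh_times[:2]])
```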
- the file generation device 400 can control the decoding timing of encoded data obtained by hierarchically encoding a plurality of hierarchical images.
- FIG. 27 is a block diagram illustrating a main configuration example of a file reproduction device which is an embodiment of an information processing device to which the present technology is applied.
- the file playback device 450 is a device that reproduces the MPD, MP4 file, and JPEG files generated as described above by the file generation device 400 of FIG. 25, and generates and outputs a decoded image of one or both of the base layer and the enhancement layer.
- the file playback device 450 basically has the same configuration as the MP4 file playback device 350 (FIG. 16). However, the file playback device 450 has an MPD analysis unit 451.
- the file playback device 450 includes an MP4 file playback unit 452 instead of the MP4 file playback unit 351 in the MP4 file playback device 350.
- the file playback device 450 includes an enhancement layer decoding unit 454 instead of the enhancement layer decoding unit 154 in the MP4 file playback device 350.
- the file playback device 450 does not have the time information analysis unit 152 that the MP4 file playback device 350 has.
- the MPD analysis unit 451 analyzes the input MPD and controls the playback of MP4 files and JPEG files.
- that is, the MPD analysis unit 451 supplies JPEG storage location information indicating the storage location of the JPEG file to the base layer decoding unit 353 so that decoding can be performed at the decoding timing designated on the MPD timeline, and supplies MP4 file storage location information indicating the storage location of the MP4 file to the MP4 file playback unit 452.
- the MP4 file playback unit 452 acquires the MP4 file from the location specified by the MP4 file storage location information under the control of the MPD analysis unit 451, plays the MP4 file, and plays the enhancement layer encoded data (SHVC). Extract a sample.
- the MP4 file reproduction unit 452 supplies the extracted sample to the enhancement layer decoding unit 454.
- the base layer decoding unit 353 supplies the reference image and the encoded information to the enhancement layer decoding unit 454 instead of the enhancement layer decoding unit 154.
- the enhancement layer decoding unit 454 decodes enhancement layer encoded data (SHVC) using a reference image and encoding information as necessary, and generates a decoded image of a moving image.
- the enhancement layer decoding unit 454 outputs the moving image (decoded image).
- in step S451, the MPD analysis unit 451 analyzes the input MPD.
- in step S452, the MPD analysis unit 451 determines, based on the time information of each layer described in the MPD, whether a base layer sample corresponding to the time to be processed exists. That is, the MPD analysis unit 451 determines whether a sample with the same timing (decoding timing) as that of the enhancement layer sample to be processed exists in the base layer. In other words, the MPD analysis unit 451 determines whether inter-layer prediction was performed on the enhancement layer sample to be processed at the time of encoding. If it is determined that such a sample exists (inter-layer prediction was performed), the process proceeds to step S453.
- step S453 to step S455 is executed in the same manner as each process of step S353 to step S355 of FIG.
- the base layer decoding unit 353 supplies the still image obtained by decoding to the enhancement layer decoding unit 454 as a reference image. Further, the base layer decoding unit 353 supplies the encoded information to the enhancement layer decoding unit 454.
- in step S456, the MPD analysis unit 451 extracts the MP4 file storage location information (link information to the entity of the MP4 file) described in the MPD and supplies it to the MP4 file playback unit 452.
- in step S457, the MP4 file playback unit 452 acquires the MP4 file based on the MP4 file storage location information.
- in step S458, the MP4 file playback unit 452 extracts the sample to be processed in the enhancement layer from the acquired MP4 file, and supplies the sample to the enhancement layer decoding unit 454.
- step S459 and step S460 are performed similarly to each process of step S356 and step S357 of FIG. When the process of step S460 ends, the process proceeds to step S463.
- if it is determined in step S452 that there is no base layer sample corresponding to the processing target time (inter-layer prediction was not performed), the process proceeds to step S461.
- step S461 and step S462 are performed similarly to each process of step S358 and step S359 of FIG. When the process of step S462 ends, the process proceeds to step S463.
- in step S463, the MPD analysis unit 451 determines whether or not all samples have been processed. If there is an unprocessed sample, the process returns to step S451 and the subsequent processes are repeated. The processing from step S451 to step S463 is thus repeated for each sample, and when it is determined in step S463 that all samples have been processed, the file reproduction processing ends.
- by performing the above-described processing (for example, steps S453 to S455 and step S460), the file playback device 450 can decode the base layer (still image) at an appropriate timing. That is, the file reproduction device 450 can correctly decode encoded data obtained by hierarchically encoding images of a plurality of layers. In particular, decoding can be performed correctly even when the base layer is a still image having no time information, and even when its encoded data is not stored in the MP4 file.
- each device described in the above embodiments can be used, for example, in a distribution system that distributes still images and moving images. Such a case will be described below.
- FIG. 29 is a diagram illustrating a main configuration example of a distribution system to which the present technology is applied.
- a distribution system 500 shown in FIG. 29 is a system for distributing still images and moving images. As illustrated in FIG. 29, the distribution system 500 includes a distribution data generation device 501, a distribution server 502, a network 503, a terminal device 504, and a terminal device 505.
- the distribution data generation device 501 generates distribution data in a distribution format from still image or moving image data to be distributed.
- the distribution data generation device 501 supplies the generated distribution data to the distribution server 502.
- the distribution server 502 stores and manages the distribution data generated by the distribution data generation device 501 in a storage unit or the like, and provides a distribution service of the distribution data to the terminal device 504 and the terminal device 505 via the network 503.
- the network 503 is a communication network serving as a communication medium.
- the network 503 may be any communication network, a wired communication network, a wireless communication network, or both of them.
- for example, the network 503 may be a wired LAN (Local Area Network), a wireless LAN, a public telephone line network, a wide area communication network for wireless mobile bodies such as a so-called 3G line or 4G line, the Internet, or a combination thereof.
- the network 503 may be a single communication network or a plurality of communication networks.
- further, a part or all of the network 503 may be configured by a communication cable of a predetermined standard, such as a USB (Universal Serial Bus) cable or an HDMI (registered trademark) (High-Definition Multimedia Interface) cable.
- the distribution server 502, the terminal device 504, and the terminal device 505 are connected to the network 503 and can communicate with each other.
- the connection method to these networks 503 is arbitrary.
- these devices may be connected to the network 503 by wired communication or may be connected by wireless communication. Further, for example, these devices may be connected to the network 503 via an arbitrary communication device (communication equipment) such as an access point, a relay device, or a base station.
- the terminal device 504 and the terminal device 505 are each an arbitrary electronic device having a communication function such as a mobile phone, a smartphone, a tablet computer, and a notebook computer.
- the terminal device 504 and the terminal device 505 request the distribution server 502 to distribute a distribution file based on an instruction from a user, for example.
- the distribution server 502 transmits the requested distribution data to the request source.
- the terminal device 504 or the terminal device 505 that requested the distribution receives and reproduces the distribution data.
- the present technology described above in each embodiment is applied as the distribution data generation device 501. That is, the MP4 file generation device 100, the MP4 file generation device 200, the MP4 file generation device 300, or the file generation device 400 described above is used as the distribution data generation device 501.
- the present technology described in each embodiment is applied as the terminal device 504 and the terminal device 505. That is, the above-described MP4 file playback device 150, MP4 file playback device 250, MP4 file playback device 350, or file playback device 450 is used as the terminal device 504 or the terminal device 505.
- thereby, the distribution data generation device 501, the terminal device 504, and the terminal device 505 can obtain the same effects as those of the above-described embodiments. That is, the distribution system 500 can control the decoding timing of encoded data obtained by hierarchically encoding images of a plurality of layers, and can realize, for example, the functions and services of the use cases described in the first embodiment.
- the series of processes described above can be executed by hardware or can be executed by software.
- a program constituting the software is installed in the computer.
- here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions by installing various programs.
- FIG. 30 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processing by a program.
- in the computer 600 shown in FIG. 30, a CPU (Central Processing Unit) 601, a ROM (Read Only Memory) 602, and a RAM (Random Access Memory) 603 are connected to one another via a bus 604.
- An input / output interface 610 is also connected to the bus 604.
- An input unit 611, an output unit 612, a storage unit 613, a communication unit 614, and a drive 615 are connected to the input / output interface 610.
- the input unit 611 includes, for example, a keyboard, a mouse, a microphone, a touch panel, an input terminal, and the like.
- the output unit 612 includes, for example, a display, a speaker, and an output terminal.
- the storage unit 613 includes, for example, a hard disk, a RAM disk, and a nonvolatile memory.
- the communication unit 614 is composed of a network interface, for example.
- the drive 615 drives a removable medium 621 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- in the computer configured as described above, the CPU 601 loads, for example, the program stored in the storage unit 613 into the RAM 603 via the input/output interface 610 and the bus 604 and executes it, whereby the above-described series of processing is performed.
- the RAM 603 also appropriately stores data necessary for the CPU 601 to execute various processes.
- the program executed by the computer (CPU 601) can be recorded and applied to, for example, a removable medium 621 as a package medium or the like.
- the program can be installed in the storage unit 613 via the input / output interface 610 by attaching the removable medium 621 to the drive 615.
- This program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting. In that case, the program can be received by the communication unit 614 and installed in the storage unit 613.
- this program can be installed in the ROM 602 or the storage unit 613 in advance.
- the program executed by the computer may be a program in which processing is performed in time series in the order described in this specification, or may be a program in which processing is performed in parallel or at necessary timing such as when a call is made.
- in this specification, the steps describing the program recorded on the recording medium include not only processing performed in time series in the described order, but also processing executed in parallel or individually without necessarily being processed in time series.
- each step described above can be executed in each device described above or any device other than each device described above.
- the device that executes the process may have the functions (functional blocks and the like) necessary for executing the process described above.
- Information necessary for processing may be transmitted to the apparatus as appropriate.
- in this specification, a system means a set of a plurality of components (devices, modules (parts), etc.), regardless of whether all the components are in the same housing. Accordingly, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
- the configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units).
- the configurations described above as a plurality of devices (or processing units) may be combined into a single device (or processing unit).
- a configuration other than that described above may be added to the configuration of each device (or each processing unit).
- conversely, a part of the configuration of a certain device (or processing unit) may be included in the configuration of another device (or another processing unit).
- the present technology can take a configuration of cloud computing in which one function is shared by a plurality of devices via a network and is jointly processed.
- each step described in the above flowchart can be executed by one device or can be shared by a plurality of devices.
- the plurality of processes included in the one step can be executed by being shared by a plurality of apparatuses in addition to being executed by one apparatus.
- the present technology is not limited to this, and can also be implemented as any configuration mounted on such a device or on a device constituting a system, for example, a processor as a system LSI (Large Scale Integration), a module using a plurality of processors, a unit using a plurality of modules, a set obtained by further adding other functions to a unit (that is, a partial configuration of an apparatus), and the like.
- note that the present technology can also take the following configurations.
- an information processing apparatus comprising a time information setting unit that sets time information designating the decoding timing of the still image, using the time information of the moving image encoded data, based on the reference relationship between the still image and the moving image for the prediction.
- a file in which still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction referring to the still image are stored in different tracks.
- a file playback unit for extracting the still image encoded data and the moving image encoded data; Time information for designating the decoding timing of each frame of the moving image encoded data based on the reference relationship between the still image and the moving image for the prediction of the encoded still image extracted from the file
- a still image decoding unit configured to decode at a timing based on time information that specifies the decoding timing of the still image set using The moving image encoded data extracted from the file is obtained by decoding the still image encoded data at a timing based on time information designating a decoding timing of each frame of the moving image encoded data.
- An information processing apparatus comprising: a moving image decoding unit that performs decoding with reference to a still image. (5) A file in which still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction referring to the still image are stored in different tracks.
- the moving image encoded data extracted from the file is obtained by decoding the still image encoded data at a timing based on time information designating a decoding timing of each frame of the moving image encoded data.
- An information processing apparatus comprising: a table information generation unit that generates table information indicating a reference relationship between the still image for prediction and the moving image, and stores the table information in the file.
- a file playback unit for extracting the still image encoded data and the moving image encoded data;
- a still image decoding unit for decoding at a timing based on the table information shown;
- Each frame of the moving image encoded data extracted from the file is referred to the still image obtained by decoding the still image encoded data by the still image decoding unit at a timing based on the time information.
- An information processing apparatus comprising the above. (10) An information processing method comprising: reproducing a file in which still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction referring to the still image are stored in different tracks, and extracting the still image encoded data and the moving image encoded data; decoding the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship between the still image and the moving image for the prediction; and decoding each frame of the moving image encoded data extracted from the file, at a timing based on the time information, with reference to the still image obtained by decoding the still image encoded data.
- a time information generation unit that generates time information indicating the decoding timing using a predetermined timeline
- An information processing apparatus comprising: a metadata generation unit that generates metadata used to provide the still image encoded data and the moving image encoded data using the time information.
- An information processing method comprising: generating, using a predetermined timeline, time information indicating the decoding timing of still image encoded data in which a still image is encoded, and time information indicating the decoding timing of each frame of moving image encoded data in which a moving image is encoded using prediction referring to the still image; and generating, using the time information, metadata used for providing the still image encoded data and the moving image encoded data.
100 MP4 file generation device, 101 base layer encoding unit, 102 enhancement layer encoding unit, 103 time information generation unit, 104 MP4 file generation unit, 150 MP4 file playback device, 151 MP4 file playback unit, 152 time information analysis unit, 153 base layer decoding unit, 154 enhancement layer decoding unit, 200 MP4 file generation device, 203 time information generation unit, 204 MP4 file generation unit, 250 MP4 file playback device, 252 time information analysis unit, 300 MP4 file generation device, 301 base layer encoding unit, 304 MP4 file generation unit, 350 MP4 file playback device, 351 MP4 file playback unit, 353 base layer decoding unit, 400 file generation device, 403 time information generation unit, 404 MP4 file generation unit, 405 MPD generation unit, 450 file playback device, 451 MPD analysis unit, 452 MP4 file playback unit, 454 enhancement layer decoding unit, 500 distribution system, 501 distribution data generation device, 502 distribution server, 503 network, 504 and 505 terminal devices, 600 computer
Abstract
Description
1. First embodiment (using the MP4 DTS)
2. Second embodiment (generating and using a POC reference table)
3. Third embodiment (when the still image is independent)
4. Fourth embodiment (using the MPD timeline)
5. Fifth embodiment (distribution system)
6. Sixth embodiment (computer)
<Layering of still images and moving images>
As an image encoding/decoding scheme, there is hierarchical (layered) encoding/decoding, which efficiently encodes an image layered into multiple layers by using inter-layer prediction and the like. One example of such layering uses a still image as the base layer and a moving image as the enhancement layer. That is, in hierarchical encoding, prediction that refers to the still image is performed when the moving image is encoded.
In the following, the present technology is described using, as an example, the case of hierarchically encoding two layers of image data, consisting of a still image base layer and a moving image enhancement layer, using inter-layer prediction.
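As a toy illustration of this inter-layer prediction (not the patent's actual codec; all values are invented), each enhancement-layer frame can be thought of as a residual against the shared base-layer still image, so the decoder reconstructs frame = still + residual:

```python
# Toy sketch of inter-layer prediction: the base-layer still image serves
# as the prediction reference, and each enhancement-layer frame is carried
# as a residual against it. All values here are invented for illustration.
still = [100, 100, 100, 100]               # base-layer "pixels"
residuals = [[0, 1, 2, 3], [5, 5, 5, 5]]   # one residual list per frame

def reconstruct(still, residual):
    """Rebuild an enhancement-layer frame from the still image reference."""
    return [s + r for s, r in zip(still, residual)]

frames = [reconstruct(still, r) for r in residuals]
print(frames)  # [[100, 101, 102, 103], [105, 105, 105, 105]]
```

The point of the sketch is the dependency it exposes: the still image must be decoded before any enhancement frame that references it, which is exactly what the decoding-timing information described below controls.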
Next, an overview of the MP4 file format is described. As shown in Fig. 1, an MP4 file conforming to MPEG-DASH contains ftyp, moov, and mdat.
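The ftyp/moov/mdat layout above follows the ISO Base Media File Format, where each top-level box is a 4-byte big-endian size followed by a 4-byte type. A minimal sketch (the synthetic file contents are invented for illustration):

```python
import struct

def parse_boxes(data):
    """Parse top-level ISO BMFF (MP4) boxes: each box starts with a 4-byte
    big-endian size followed by a 4-byte type (e.g. ftyp, moov, mdat)."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        boxes.append((box_type.decode("ascii"), size))
        offset += size
    return boxes

def make_box(box_type, payload=b""):
    """Build a box from its type and payload (size covers the 8-byte header)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# A tiny synthetic file: an ftyp with an 8-byte brand payload, empty moov/mdat.
sample = make_box(b"ftyp", b"isom\x00\x00\x02\x00") + make_box(b"moov") + make_box(b"mdat")
print(parse_boxes(sample))  # [('ftyp', 16), ('moov', 8), ('mdat', 8)]
```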
Fig. 2 shows a main configuration example of an MP4 file that stores encoded data in which a still image and a moving image have been hierarchically encoded as described above.
Next, a device that generates such an MP4 file is described. Fig. 3 is a block diagram showing a main configuration example of an MP4 file generation device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 3, the MP4 file generation device 100 hierarchically encodes a still image and a moving image, with the still image as the base layer and the moving image as the enhancement layer, and files the resulting encoded data of each layer to generate an MP4 file.
The MP4 file generation device 100 of Fig. 3 executes MP4 file generation processing to hierarchically encode the input still image and moving image and generate an MP4 file. An example of the flow of this MP4 file generation processing is described with reference to the flowchart of Fig. 4.
Next, a device that plays back an MP4 file generated in this way is described. Fig. 5 is a block diagram showing a main configuration example of an MP4 file playback device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 5, the MP4 file playback device 150 plays back an MP4 file generated as described above by the MP4 file generation device 100 of Fig. 3, and generates and outputs decoded images of the base layer, the enhancement layer, or both.
The MP4 file playback device 150 of Fig. 5 executes MP4 file playback processing to play back an input MP4 file and generate a decoded image of an arbitrary layer. An example of the flow of this MP4 file playback processing is described with reference to the flowchart of Fig. 6. Note that Fig. 6 describes the processing for obtaining a decoded image of the enhancement layer.
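The decoding timing used in the first embodiment rests on the MP4 DTS, which a player derives from the stts (time-to-sample) box's run-length (sample_count, sample_delta) entries. A minimal sketch of that expansion, with invented sample values:

```python
def decode_times(stts_entries, timescale):
    """Expand MP4 stts-style (sample_count, sample_delta) entries into
    per-sample decode timestamps (DTS) in seconds."""
    times, dts = [], 0
    for count, delta in stts_entries:
        for _ in range(count):
            times.append(dts / timescale)
            dts += delta
    return times

# 3 samples at delta 3000, then 2 at delta 1500, with a 30000 units/s timescale.
print(decode_times([(3, 3000), (2, 1500)], 30000))
# [0.0, 0.1, 0.2, 0.3, 0.35]
```

A still-image track sharing the same timescale can then be given DTS values that place each still just before the first enhancement frame that references it.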
<POC reference table>
Instead of the DTS, a POC reference table indicating the reference relationship between the base layer and the enhancement layer may be stored separately.
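A minimal sketch of what such a POC reference table conveys (the table layout, names, and values here are illustrative assumptions, not the patent's stored format): for each picture order count (POC) of an enhancement-layer frame, it records which base-layer still image that frame references, so the player can tell which stills must be decoded first.

```python
# Illustrative POC reference table: enhancement-layer POC -> referenced still.
poc_ref_table = {
    0: "still_0",  # frames 0-2 predict from the first still image
    1: "still_0",
    2: "still_0",
    3: "still_1",  # frames 3-4 predict from the second still image
    4: "still_1",
}

def stills_to_decode_before(poc, table):
    """Return the base-layer stills that must already be decoded
    before decoding the enhancement frame with the given POC."""
    return sorted({table[p] for p in range(poc + 1) if p in table})

print(stills_to_decode_before(3, poc_ref_table))  # ['still_0', 'still_1']
```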
Next, a device that generates such an MP4 file is described. Fig. 9 is a block diagram showing a main configuration example of an MP4 file generation device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 9, the MP4 file generation device 200 is a device similar to the MP4 file generation device 100 (Fig. 3) and basically has the same configuration, except that it has a time information generation unit 203 instead of the time information generation unit 103, and an MP4 file generation unit 204 instead of the MP4 file generation unit 104.
An example of the flow of the MP4 file generation processing executed by the MP4 file generation device 200 of Fig. 9 is described with reference to the flowchart of Fig. 10.
Next, a device that plays back an MP4 file generated in this way is described. Fig. 11 is a block diagram showing a main configuration example of an MP4 file playback device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 11, the MP4 file playback device 250 plays back an MP4 file generated as described above by the MP4 file generation device 200 of Fig. 9, and generates and outputs decoded images of the base layer, the enhancement layer, or both.
An example of the flow of the MP4 file playback processing executed by the MP4 file playback device 250 of Fig. 11 is described with reference to the flowchart of Fig. 12. Note that Fig. 12 describes the processing for obtaining a decoded image of the enhancement layer.
<Linking to JPEG data>
The substance of the base layer encoded data (the JPEG file) may reside outside the MP4 file. In that case, the MP4 file only needs to store link information indicating the storage location of the actual JPEG file.
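A minimal sketch of this indirection (the field names, URL, and fetcher are illustrative assumptions, not the patent's stored format): the MP4 side carries only the link, and the player resolves it to obtain the JPEG bytes.

```python
# Illustrative: the MP4 file's base-layer track stores a link, not image bytes.
mp4_base_track = {"link": "http://example.com/base_layer.jpg"}

def resolve_base_layer(track, fetch):
    """Follow the stored link information to obtain the external JPEG data."""
    return fetch(track["link"])

# Stand-in fetcher; a real player would issue an HTTP (or file) request here.
jpeg = resolve_base_layer(mp4_base_track, fetch=lambda url: b"\xff\xd8jpeg-bytes")
print(jpeg[:2] == b"\xff\xd8")  # True: data starts with the JPEG SOI marker
```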
Next, a device that generates such an MP4 file is described. Fig. 14 is a block diagram showing a main configuration example of an MP4 file generation device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 14, the MP4 file generation device 300 is a device similar to the MP4 file generation device 100 (Fig. 3) and basically has the same configuration, except that it has a base layer encoding unit 301 instead of the base layer encoding unit 101, and an MP4 file generation unit 304 instead of the MP4 file generation unit 104.
An example of the flow of the MP4 file generation processing executed by the MP4 file generation device 300 of Fig. 14 is described with reference to the flowchart of Fig. 15.
Next, a device that plays back an MP4 file generated in this way is described. Fig. 16 is a block diagram showing a main configuration example of an MP4 file playback device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 16, the MP4 file playback device 350 plays back an MP4 file generated as described above by the MP4 file generation device 300 of Fig. 14, and generates and outputs decoded images of the base layer, the enhancement layer, or both.
An example of the flow of the MP4 file playback processing executed by the MP4 file playback device 350 of Fig. 16 is described with reference to the flowchart of Fig. 17. Note that Fig. 17 describes the processing for obtaining a decoded image of the enhancement layer.
<Control via the MPD>
The decoding timing of the base layer encoded data (the JPEG file) may be controlled in the MPD (Media Presentation Description) of MPEG-DASH (Moving Picture Experts Group - Dynamic Adaptive Streaming over HTTP).
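A minimal, illustrative MPD skeleton that places a JPEG base layer and an MP4 enhancement layer on the same presentation timeline can be generated as below. The element names follow the MPEG-DASH MPD schema (MPD/Period/AdaptationSet/Representation/BaseURL), but the attribute values and file names are assumptions, not the patent's exact MPD.

```python
import xml.etree.ElementTree as ET

# Build a skeletal MPD with one Period containing two AdaptationSets:
# a JPEG (base layer) and an MP4 video (enhancement layer).
mpd = ET.Element("MPD", mediaPresentationDuration="PT10S")
period = ET.SubElement(mpd, "Period")
for mime, url in [("image/jpeg", "base_layer.jpg"), ("video/mp4", "enh_layer.mp4")]:
    adaptation_set = ET.SubElement(period, "AdaptationSet", mimeType=mime)
    representation = ET.SubElement(adaptation_set, "Representation")
    ET.SubElement(representation, "BaseURL").text = url

xml_text = ET.tostring(mpd, encoding="unicode")
print(xml_text)
```

Because both AdaptationSets live in the same Period, a client interprets them against one shared timeline, which is the mechanism this embodiment uses to control when the base-layer JPEG is decoded.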
Next, a device that generates such an MPD and MP4 file is described. Fig. 25 is a block diagram showing a main configuration example of a file generation device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 25, the file generation device 400 hierarchically encodes a still image and a moving image, with the still image as the base layer and the moving image as the enhancement layer, and generates and outputs a JPEG file, an MP4 file, an MPD, and the like.
An example of the flow of the file generation processing executed by the file generation device 400 of Fig. 25 is described with reference to the flowchart of Fig. 26.
Next, a device that plays back the MPD, MP4 file, JPEG file, and the like generated in this way is described. Fig. 27 is a block diagram showing a main configuration example of a file playback device, which is one embodiment of an information processing apparatus to which the present technology is applied. In Fig. 27, the file playback device 450 plays back the MPD, MP4 file, and JPEG file generated as described above by the file generation device 400 of Fig. 25, and generates and outputs decoded images of the base layer, the enhancement layer, or both.
An example of the flow of the file playback processing executed by the file playback device 450 of Fig. 27 is described with reference to the flowchart of Fig. 28. Note that Fig. 28 describes the processing for obtaining a decoded image of the enhancement layer.
<Distribution system>
Each of the devices described in the embodiments above can be used, for example, in a distribution system that distributes still images and moving images. That case is described below.
<Computer>
The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, the programs constituting the software are installed on a computer. Here, the computer includes a computer built into dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions when various programs are installed on it.
(1) A file generation unit that generates a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
a time information setting unit that sets, in the track of the file that stores the moving image encoded data, time information designating the decoding timing of each frame, and sets, in the track of the file that stores the still image encoded data, time information designating the decoding timing of the still image, using the time information of the moving image encoded data on the basis of the reference relationship, for the prediction, between the still image and the moving image.
An information processing apparatus comprising the above.
(2) The information processing apparatus according to (1), wherein the file generation unit stores, in the file, information indicating the storage location of the still image encoded data instead of the still image encoded data itself.
(3) An information processing method comprising: generating a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image;
setting, in the track of the file that stores the moving image encoded data, time information designating the decoding timing of each frame; and
setting, in the track of the file that stores the still image encoded data, time information designating the decoding timing of the still image, using the time information of the moving image encoded data on the basis of the reference relationship, for the prediction, between the still image and the moving image.
(4) A file playback unit that plays back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and that extracts the still image encoded data and the moving image encoded data;
a still image decoding unit that decodes the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of the still image, the time information having been set, on the basis of the reference relationship, for the prediction, between the still image and the moving image, using time information designating the decoding timing of each frame of the moving image encoded data; and
a moving image decoding unit that decodes the moving image encoded data extracted from the file at a timing based on the time information designating the decoding timing of each frame of the moving image encoded data, with reference to the still image obtained by decoding the still image encoded data.
An information processing apparatus comprising the above.
(5) An information processing method comprising: playing back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and extracting the still image encoded data and the moving image encoded data;
decoding the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of the still image, the time information having been set, on the basis of the reference relationship, for the prediction, between the still image and the moving image, using time information designating the decoding timing of each frame of the moving image encoded data; and
decoding the moving image encoded data extracted from the file at a timing based on the time information designating the decoding timing of each frame of the moving image encoded data, with reference to the still image obtained by decoding the still image encoded data.
(6) A file generation unit that generates a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
a table information generation unit that generates table information indicating the reference relationship, for the prediction, between the still image and the moving image, and stores it in the file.
An information processing apparatus comprising the above.
(7) The information processing apparatus according to (6), wherein the file generation unit stores, in the file, time information indicating the display timing of the still image.
(8) An information processing method comprising: generating a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
generating table information indicating the reference relationship, for the prediction, between the still image and the moving image, and storing it in the file.
(9) A file playback unit that plays back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and that extracts the still image encoded data and the moving image encoded data;
a still image decoding unit that decodes the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship, for the prediction, between the still image and the moving image; and
a moving image decoding unit that decodes each frame of the moving image encoded data extracted from the file at a timing based on the time information, with reference to the still image obtained by the still image decoding unit decoding the still image encoded data.
An information processing apparatus comprising the above.
(10) An information processing method comprising: playing back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and extracting the still image encoded data and the moving image encoded data;
decoding the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship, for the prediction, between the still image and the moving image; and
decoding each frame of the moving image encoded data extracted from the file at a timing based on the time information, with reference to the still image obtained by decoding the still image encoded data.
(11) A time information generation unit that generates, using a predetermined timeline, time information indicating the decoding timing of still image encoded data in which a still image is encoded, and time information indicating the decoding timing of each frame of moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
a metadata generation unit that generates, using the time information, metadata used to provide the still image encoded data and the moving image encoded data.
An information processing apparatus comprising the above.
(12) An information processing method comprising: generating, using a predetermined timeline, time information indicating the decoding timing of still image encoded data in which a still image is encoded, and time information indicating the decoding timing of each frame of moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
generating, using the time information, metadata used to provide the still image encoded data and the moving image encoded data.
Claims (12)
- A file generation unit that generates a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
a time information setting unit that sets, in the track of the file that stores the moving image encoded data, time information designating the decoding timing of each frame, and sets, in the track of the file that stores the still image encoded data, time information designating the decoding timing of the still image, using the time information of the moving image encoded data on the basis of the reference relationship, for the prediction, between the still image and the moving image.
An information processing apparatus comprising the above.
- The information processing apparatus according to claim 1, wherein the file generation unit stores, in the file, information indicating the storage location of the still image encoded data instead of the still image encoded data itself.
- An information processing method comprising: generating a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image;
setting, in the track of the file that stores the moving image encoded data, time information designating the decoding timing of each frame; and
setting, in the track of the file that stores the still image encoded data, time information designating the decoding timing of the still image, using the time information of the moving image encoded data on the basis of the reference relationship, for the prediction, between the still image and the moving image.
- A file playback unit that plays back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and that extracts the still image encoded data and the moving image encoded data;
a still image decoding unit that decodes the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of the still image, the time information having been set, on the basis of the reference relationship, for the prediction, between the still image and the moving image, using time information designating the decoding timing of each frame of the moving image encoded data; and
a moving image decoding unit that decodes the moving image encoded data extracted from the file at a timing based on the time information designating the decoding timing of each frame of the moving image encoded data, with reference to the still image obtained by decoding the still image encoded data.
An information processing apparatus comprising the above.
- An information processing method comprising: playing back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and extracting the still image encoded data and the moving image encoded data;
decoding the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of the still image, the time information having been set, on the basis of the reference relationship, for the prediction, between the still image and the moving image, using time information designating the decoding timing of each frame of the moving image encoded data; and
decoding the moving image encoded data extracted from the file at a timing based on the time information designating the decoding timing of each frame of the moving image encoded data, with reference to the still image obtained by decoding the still image encoded data.
- A file generation unit that generates a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
a table information generation unit that generates table information indicating the reference relationship, for the prediction, between the still image and the moving image, and stores it in the file.
An information processing apparatus comprising the above.
- The information processing apparatus according to claim 6, wherein the file generation unit stores, in the file, time information indicating the display timing of the still image.
- An information processing method comprising: generating a file storing, in mutually different tracks, still image encoded data in which a still image is encoded and moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
generating table information indicating the reference relationship, for the prediction, between the still image and the moving image, and storing it in the file.
- A file playback unit that plays back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and that extracts the still image encoded data and the moving image encoded data;
a still image decoding unit that decodes the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship, for the prediction, between the still image and the moving image; and
a moving image decoding unit that decodes each frame of the moving image encoded data extracted from the file at a timing based on the time information, with reference to the still image obtained by the still image decoding unit decoding the still image encoded data.
An information processing apparatus comprising the above.
- An information processing method comprising: playing back a file in which still image encoded data, in which a still image is encoded, and moving image encoded data, in which a moving image is encoded using prediction that refers to the still image, are stored in mutually different tracks, and extracting the still image encoded data and the moving image encoded data;
decoding the still image encoded data extracted from the file at a timing based on time information designating the decoding timing of each frame of the moving image encoded data and on table information indicating the reference relationship, for the prediction, between the still image and the moving image; and
decoding each frame of the moving image encoded data extracted from the file at a timing based on the time information, with reference to the still image obtained by decoding the still image encoded data.
- A time information generation unit that generates, using a predetermined timeline, time information indicating the decoding timing of still image encoded data in which a still image is encoded, and time information indicating the decoding timing of each frame of moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
a metadata generation unit that generates, using the time information, metadata used to provide the still image encoded data and the moving image encoded data.
An information processing apparatus comprising the above.
- An information processing method comprising: generating, using a predetermined timeline, time information indicating the decoding timing of still image encoded data in which a still image is encoded, and time information indicating the decoding timing of each frame of moving image encoded data in which a moving image is encoded using prediction that refers to the still image; and
generating, using the time information, metadata used to provide the still image encoded data and the moving image encoded data.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016531239A JP6501127B2 (ja) | 2014-06-30 | 2015-06-16 | Information processing device and method |
US15/309,963 US20170163980A1 (en) | 2014-06-30 | 2015-06-16 | Information processing device and method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014-135146 | 2014-06-30 | ||
JP2014135146 | 2014-06-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016002494A1 true WO2016002494A1 (ja) | 2016-01-07 |
Family
ID=55019041
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2015/067232 WO2016002494A1 (ja) | 2014-06-30 | 2015-06-16 | 情報処理装置および方法 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20170163980A1 (ja) |
JP (1) | JP6501127B2 (ja) |
WO (1) | WO2016002494A1 (ja) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3515067A1 (en) * | 2018-01-19 | 2019-07-24 | Thomson Licensing | A method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream |
EP3515066A1 (en) * | 2018-01-19 | 2019-07-24 | Thomson Licensing | A method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream |
EP3515068A1 (en) | 2018-01-19 | 2019-07-24 | Thomson Licensing | A method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007336573A (ja) * | 2004-08-17 | 2007-12-27 | Matsushita Electric Ind Co Ltd | Image encoding device and image decoding device |
JP2009159615A (ja) * | 1999-07-05 | 2009-07-16 | Hitachi Ltd | Video recording method and device, video playback method and device, and recording medium |
JP2011505780A (ja) * | 2007-12-04 | 2011-02-24 | ソニー株式会社 | Extension of the AVC standard for encoding high-resolution digital still images in series with video |
JP2011050068A (ja) * | 2008-11-17 | 2011-03-10 | Nec Casio Mobile Communications Ltd | Video conversion device, video playback device, video conversion and playback system, and program |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ES2336824T3 (es) * | 2005-03-10 | 2010-04-16 | Qualcomm Incorporated | Arquitectura de decodificador para gestion de errores optimizada en flujo continuo multimedia. |
KR101711009B1 (ko) * | 2010-08-26 | 2017-03-02 | 삼성전자주식회사 | Video storage device, video playback device, video storage method, video provision method, storage medium, and imaging device |
US8856283B2 (en) * | 2011-06-03 | 2014-10-07 | Apple Inc. | Playlists for real-time or near real-time streaming |
US9973764B2 (en) * | 2013-09-09 | 2018-05-15 | Lg Electronics Inc. | Method and device for transmitting and receiving advanced UHD broadcasting content in digital broadcasting system |
2015
- 2015-06-16 WO PCT/JP2015/067232 patent/WO2016002494A1/ja active Application Filing
- 2015-06-16 JP JP2016531239A patent/JP6501127B2/ja not_active Expired - Fee Related
- 2015-06-16 US US15/309,963 patent/US20170163980A1/en not_active Abandoned
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009159615A (ja) * | 1999-07-05 | 2009-07-16 | Hitachi Ltd | Video recording method and device, video playback method and device, and recording medium |
JP2007336573A (ja) * | 2004-08-17 | 2007-12-27 | Matsushita Electric Ind Co Ltd | Image encoding device and image decoding device |
JP2011505780A (ja) * | 2007-12-04 | 2011-02-24 | ソニー株式会社 | Extension of the AVC standard for encoding high-resolution digital still images in series with video |
JP2011050068A (ja) * | 2008-11-17 | 2011-03-10 | Nec Casio Mobile Communications Ltd | Video conversion device, video playback device, video conversion and playback system, and program |
Non-Patent Citations (1)
Title |
---|
"Multimedia Tsushin Kenkyukai", POINT ZUKAI-SHIKI SAISHIN MPEG KYOKASHO, 17 October 2005 (2005-10-17), pages 236, 237 * |
Also Published As
Publication number | Publication date |
---|---|
JP6501127B2 (ja) | 2019-04-17 |
JPWO2016002494A1 (ja) | 2017-04-27 |
US20170163980A1 (en) | 2017-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9478256B1 (en) | Video editing processor for video cloud server | |
JP6908098B2 (ja) | Information processing device and method | |
JP5559430B2 (ja) | Video switching for streaming video data | |
JP6042531B2 (ja) | Identifying parameter sets in video files | |
TW201840201A (zh) | Advanced signaling of regions of interest in omnidirectional visual media | |
JP6555263B2 (ja) | Information processing device and method | |
JP6508206B2 (ja) | Information processing device and method | |
JP2019083555A (ja) | Information processing device, content request method, and computer program | |
WO2018142946A1 (ja) | Information processing device and method | |
US11206386B2 (en) | Information processing apparatus and information processing method | |
WO2016002494A1 (ja) | Information processing device and method | |
KR101944601B1 (ko) | Method for identifying objects across periods and corresponding device | |
CN116601963A (zh) | Method and apparatus for generating/receiving a media file including NAL unit array information, and method for transmitting a media file | |
Kammachi‐Sreedhar et al. | Omnidirectional video delivery with decoder instance reduction | |
US20240056578A1 (en) | Media file generation/reception method and apparatus supporting random access in units of samples, and method for transmitting media file | |
US20240205409A1 (en) | Method and device for generating/receiving media file on basis of eos sample group, and method for transmitting media file | |
US20240064323A1 (en) | Media file generation/reception method and device for signaling subpicture id information, and computer-readable recording medium in which media file is stored | |
US20230328261A1 (en) | Media file processing method and device therefor | |
US20240205429A1 (en) | Media file processing method, and device therefor | |
EP4266689A1 (en) | Method and device for generating/receiving media file including nal unit information, and method for transmitting media file | |
US20230379481A1 (en) | Media file generation/reception method and device for signaling operating point information and output layer set information, and computer-readable recording medium in which media file is stored |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15814840 Country of ref document: EP Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase |
Ref document number: 15309963 Country of ref document: US |
ENP | Entry into the national phase |
Ref document number: 2016531239 Country of ref document: JP Kind code of ref document: A |
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 15814840 Country of ref document: EP Kind code of ref document: A1 |