EP3977750A1 - An apparatus, a method and a computer program for video coding and decoding - Google Patents
- Publication number
- EP3977750A1 (application EP20814633.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- item
- items
- file
- references
- media file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/835—Generation of protective data, e.g. certificates
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8455—Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/85406—Content authoring involving a specific file format, e.g. MP4 format
Definitions
- the present invention relates to an apparatus, a method and a computer program for video coding and decoding.
- the HEIF is a standard developed by the Moving Picture Experts Group (MPEG) for storage of images and image sequences.
- HEIF includes a rich set of features building on top of the ISOBMFF, making HEIF feature-wise superior compared to many other image file formats.
- One such feature of the file format is its capability to store multiple images in the same file. These images, called image items, can have logical relationships to each other.
- a method comprises authoring items into a media file, said items being associated with a plurality of referenced items to be processed in a specific order; and authoring one or more properties of item references of said referenced items into said media file, wherein said properties include one or more of the following: indication if the item references are strictly ordered, indication if the referenced items are removable without making a referencing item invalid, a checksum generated from ID values of the referenced items in the order they are listed.
- An apparatus comprises means for authoring items into a media file, said items being associated with a plurality of referenced items to be processed in a specific order; and means for authoring one or more properties of item references of said referenced items into said media file, wherein said properties include one or more of the following: indication if the item references are strictly ordered, indication if the referenced items are removable without making a referencing item invalid, a checksum generated from ID values of the referenced items in the order they are listed.
- the apparatus further comprises means for authoring said one or more properties of item references of said referenced items into an ItemReferenceBox according to ISO Base Media File Format (ISOBMFF).
- At least one further data structure is included in a syntax of the ItemReferenceBox to indicate the strictly ordered item references.
- flags of the syntax of the ItemReferenceBox are used to indicate the strictly ordered item references.
- At least one further box is defined in accordance with ISOBMFF syntax to indicate the strictly ordered item references.
- the items are image items and the media file format is a High Efficiency Image File Format or a High Efficiency Image File compatible storage format.
- a checksum generation algorithm is pre-defined or indicated in the media file.
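As an illustrative sketch of such a checksum, the following assumes CRC-32 over the item ID values serialized as big-endian 32-bit integers in their listed order; both the algorithm and the serialization are assumptions made for illustration, since the embodiment allows the algorithm to be pre-defined or signalled in the media file:

```python
import struct
import zlib

def item_reference_checksum(item_ids):
    """Checksum over the referenced items' ID values, in the order they
    are listed in the item reference.

    CRC-32 over big-endian 32-bit IDs is an illustrative assumption,
    not the mandated algorithm."""
    data = b"".join(struct.pack(">I", item_id) for item_id in item_ids)
    return zlib.crc32(data)
```

Because the IDs are hashed in listed order, a parser that recomputes the value after an edit can detect that referenced items were removed or reordered.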
- a method comprises: receiving a media file authored with image items comprising a plurality of referenced items to be processed in a specific order; reading one or more properties of item references of said referenced items from said media file, wherein said properties include one or more of the following: indication if the item references are strictly ordered, indication if the referenced items are removable without making a referencing item invalid, a checksum generated from ID values of the referenced items in the order they are listed; and parsing the media file according to said one or more properties.
- An apparatus comprises means for receiving a media file authored with image items comprising a plurality of referenced items to be processed in a specific order; means for reading one or more properties of item references of said referenced items from said media file, wherein said properties include one or more of the following: indication if the item references are strictly ordered, indication if the referenced items are removable without making a referencing item invalid, a checksum generated from ID values of the referenced items in the order they are listed; and means for parsing the media file according to said one or more properties.
- the apparatus further comprises means for receiving an instruction to remove a first item from a media file; means for checking from ItemReferenceBoxes if the first item is referenced by any other item in the media file; and means for removing the first item if it is not referenced by any other item or if it is indicated in the media file that the first item is removable without making the referencing item(s) invalid.
- the apparatus further comprises means for receiving an instruction to reorder a first item in a first item reference; and means for reordering the first item if the first item reference is not indicated to be strictly ordered.
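The two editing checks above can be sketched as follows; the dictionary layout used to represent an item reference (its referenced item IDs, the removable property, and the strictly-ordered property) is illustrative, not a normative structure:

```python
def can_remove_item(item_id, references):
    """An editor may remove an item if no reference lists it, or if
    every reference listing it carries the 'removable without making
    the referencing item invalid' property."""
    listing = [r for r in references if item_id in r["to_item_ids"]]
    return all(r["removable"] for r in listing)  # all([]) is True

def can_reorder_reference(reference):
    """Reordering referenced items is permitted only when the item
    reference is not flagged as strictly ordered."""
    return not reference["strictly_ordered"]
```

An unreferenced item is always removable; a strictly ordered reference (e.g. a derived image whose inputs must be processed in order) rejects reordering.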
- the further aspects relate to apparatuses and computer readable storage media having code stored thereon, which are arranged to carry out the above methods and one or more of the embodiments related thereto.
- Figure 1 shows schematically an electronic device employing embodiments of the invention
- Figure 2 shows schematically a user equipment suitable for employing embodiments of the invention;
- Figure 3 further shows schematically electronic devices employing embodiments of the invention connected using wireless and wired network connections;
- Figure 4 shows a flow chart of a file authoring method according to an embodiment of the invention
- Figure 5 shows schematically an encoder suitable for implementing embodiments of the invention
- Figure 6 shows a schematic diagram of a decoder suitable for implementing embodiments of the invention
- Figure 7 shows a flow chart of a file parsing method according to an embodiment of the invention.
- Figure 8 shows a schematic diagram of an example multimedia communication system within which various embodiments may be implemented.
- Figure 1 shows a block diagram of a video coding system according to an example embodiment as a schematic block diagram of an exemplary apparatus or electronic device 50, which may incorporate a codec according to an embodiment of the invention.
- Figure 2 shows a layout of an apparatus according to an example embodiment. The elements of Figs. 1 and 2 will be explained next.
- the electronic device 50 may for example be a mobile terminal or user equipment of a wireless communication system. However, it would be appreciated that embodiments of the invention may be implemented within any electronic device or apparatus which may require encoding and decoding or encoding or decoding video images.
- the apparatus 50 may comprise a housing 30 for incorporating and protecting the device.
- the apparatus 50 further may comprise a display 32 in the form of a liquid crystal display.
- the display may be any suitable display technology suitable to display an image or video.
- the apparatus 50 may further comprise a keypad 34.
- any suitable data or user interface mechanism may be employed.
- the user interface may be implemented as a virtual keyboard or data entry system as part of a touch-sensitive display.
- the apparatus may comprise a microphone 36 or any suitable audio input which may be a digital or analogue signal input.
- the apparatus 50 may further comprise an audio output device which in embodiments of the invention may be any one of: an earpiece 38, speaker, or an analogue audio or digital audio output connection.
- the apparatus 50 may also comprise a battery (or in other embodiments of the invention the device may be powered by any suitable mobile energy device such as solar cell, fuel cell or clockwork generator).
- the apparatus may further comprise a camera capable of recording or capturing images and/or video.
- the apparatus 50 may further comprise an infrared port for short range line of sight communication to other devices. In other embodiments the apparatus 50 may further comprise any suitable short range communication solution such as for example a Bluetooth wireless connection or a USB/firewire wired connection.
- the apparatus 50 may comprise a controller 56, processor or processor circuitry for controlling the apparatus 50.
- the controller 56 may be connected to memory 58 which in embodiments of the invention may store both data in the form of image and audio data and/or may also store instructions for implementation on the controller 56.
- the controller 56 may further be connected to codec circuitry 54 suitable for carrying out coding and decoding of audio and/or video data or assisting in coding and decoding carried out by the controller.
- the apparatus 50 may further comprise a card reader 48 and a smart card 46, for example a UICC and UICC reader for providing user information and being suitable for providing authentication information for authentication and authorization of the user at a network.
- a card reader 48 and a smart card 46 for example a UICC and UICC reader for providing user information and being suitable for providing authentication information for authentication and authorization of the user at a network.
- the apparatus 50 may comprise radio interface circuitry 52 connected to the controller and suitable for generating wireless communication signals for example for communication with a cellular communications network, a wireless communications system or a wireless local area network.
- the apparatus 50 may further comprise an antenna 44 connected to the radio interface circuitry 52 for transmitting radio frequency signals generated at the radio interface circuitry 52 to other apparatus(es) and for receiving radio frequency signals from other apparatus(es).
- the apparatus 50 may comprise a camera capable of recording or detecting individual frames which are then passed to the codec 54 or the controller for processing.
- the apparatus may receive the video image data for processing from another device prior to transmission and/or storage.
- the apparatus 50 may also receive either wirelessly or by a wired connection the image for coding/decoding.
- the structural elements of apparatus 50 described above represent examples of means for performing a corresponding function.
- the system 10 comprises multiple communication devices which can communicate through one or more networks.
- the system 10 may comprise any combination of wired or wireless networks including, but not limited to a wireless cellular telephone network (such as a GSM, UMTS, CDMA network etc.), a wireless local area network (WLAN) such as defined by any of the IEEE 802.x standards, a Bluetooth personal area network, an Ethernet local area network, a token ring local area network, a wide area network, and the Internet.
- the system 10 may include both wired and wireless communication devices and/or apparatus 50 suitable for implementing embodiments of the invention.
- the system shown in Figure 3 comprises a mobile telephone network 11 and a representation of the internet 28.
- Connectivity to the internet 28 may include, but is not limited to, long range wireless connections, short range wireless connections, and various wired connections including, but not limited to, telephone lines, cable lines, power lines, and similar communication pathways.
- the example communication devices shown in the system 10 may include, but are not limited to, an electronic device or apparatus 50, a combination of a personal digital assistant (PDA) and a mobile telephone 14, a PDA 16, an integrated messaging device (IMD) 18, a desktop computer 20, a notebook computer 22.
- the apparatus 50 may be stationary or mobile when carried by an individual who is moving.
- the apparatus 50 may also be located in a mode of transport including, but not limited to, a car, a truck, a taxi, a bus, a train, a boat, an airplane, a bicycle, a motorcycle or any similar suitable mode of transport.
- the embodiments may also be implemented in a set-top box, i.e. a digital TV receiver, which may/may not have a display or wireless capabilities; in tablets or (laptop) personal computers (PC), which have hardware or software or a combination of encoder/decoder implementations; in various operating systems; and in chipsets, processors, DSPs and/or embedded systems offering hardware/software based coding.
- Some or further apparatus may send and receive calls and messages and communicate with service providers through a wireless connection to a base station 24.
- the base station 24 may be connected to a network server 26 that allows communication between the mobile telephone network 11 and the internet 28.
- the system may include additional communication devices and communication devices of various types.
- the communication devices may communicate using various transmission technologies including, but not limited to, code division multiple access (CDMA), global systems for mobile communications (GSM), universal mobile telecommunications system (UMTS), time divisional multiple access (TDMA), frequency division multiple access (FDMA), transmission control protocol-internet protocol (TCP-IP), short messaging service (SMS), multimedia messaging service (MMS), email, instant messaging service (IMS), Bluetooth, IEEE 802.11 and any similar wireless communication technology.
- communications device involved in implementing various embodiments of the present invention may communicate using various media including, but not limited to, radio, infrared, laser, cable connections, and any suitable connection.
- Available media file format standards include ISO base media file format (ISO/IEC 14496-12, which may be abbreviated ISOBMFF) and file format for NAL unit structured video (ISO/IEC 14496-15), which derives from the ISOBMFF.
- Some concepts, structures, and specifications of ISOBMFF are described below as an example of a container file format, based on which the embodiments may be implemented.
- the aspects of the invention are not limited to ISOBMFF, but rather the description is given for one possible basis on top of which the invention may be partly or fully realized.
- a basic building block in the ISO base media file format is called a box.
- Each box has a header and a payload.
- the box header indicates the type of the box and the size of the box in terms of bytes.
- a box may enclose other boxes, and the ISO file format specifies which box types are allowed within a box of a certain type. Furthermore, the presence of some boxes may be mandatory in each file, while the presence of other boxes may be optional. Additionally, for some box types, it may be allowable to have more than one box present in a file. Thus, the ISO base media file format may be considered to specify a hierarchical structure of boxes.
- a file includes media data and metadata that are encapsulated into boxes. Each box is identified by a four character code (4CC) and starts with a header which informs about the type and size of the box.
- size is an integer that specifies the number of bytes in this box, including all its fields and contained boxes; if size is 1 then the actual size is in the field largesize; if size is 0, then this box must be in a top-level container, and be the last box in that container (typically, a file or data object delivered over a protocol), and its contents extend to the end of that container (normally only used for a MediaDataBox).
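A minimal sketch of reading a box header with these size rules follows; it is illustrative Python, and a real parser would also handle the 'uuid' extended type:

```python
import struct

def parse_box_header(buf, offset=0, container_end=None):
    """Parse one box header from `buf` at `offset`.

    Returns (four_cc, box_size, header_size). size == 1 means the real
    size follows in a 64-bit largesize field; size == 0 means the box
    extends to the end of its container (the end of `buf` unless a
    container end is given)."""
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    header_size = 8
    if size == 1:
        (size,) = struct.unpack_from(">Q", buf, offset + 8)
        header_size = 16
    elif size == 0:
        end = len(buf) if container_end is None else container_end
        size = end - offset
    return box_type.decode("ascii"), size, header_size
```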
- type identifies the box type; user extensions use an extended type, and in this case, the type field is set to 'uuid'.
- a FullBox extends the Box syntax by adding version and flags fields into the box header.
- the version field is an integer that specifies the version of this format of the box.
- the flags field is a map or a bit field of flags. Parsers may be required to ignore and skip boxes that have an unrecognized version value.
- the syntax of a FullBox may be specified as follows:
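The normative syntax listing is not reproduced in this excerpt; as an illustrative sketch, the two fields a FullBox adds after the plain box header can be read as follows:

```python
def parse_fullbox_fields(buf, offset):
    """Read the fields a FullBox adds after the plain box header:
    an 8-bit version followed by a 24-bit flags map."""
    version = buf[offset]
    flags = int.from_bytes(buf[offset + 1:offset + 4], "big")
    return version, flags
```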
- Derived file formats of ISOBMFF include the High Efficiency Image File Format (HEIF), the MPEG-4 file format (ISO/IEC 14496-14, also known as the MP4 format), the file format for NAL unit structured video (ISO/IEC 14496-15), and the 3GPP file format (3GPP TS 26.244, also known as the 3GP format).
- FileTypeBox contains information of the brands labeling the file.
- the ftyp box includes one major brand indication and a list of compatible brands.
- the major brand identifies the most suitable file format specification to be used for parsing the file.
- the compatible brands indicate which file format specifications and/or conformance points the file conforms to. It is possible that a file is conformant to multiple specifications. All brands indicating compatibility to these specifications should be listed, so that a reader only understanding a subset of the compatible brands can get an indication that the file can be parsed.
- Compatible brands also give permission for a file parser of a particular file format specification to process a file containing the same particular file format brand in the ftyp box.
- a file player may check if the ftyp box of a file comprises brands it supports, and may parse and play the file only if any file format specification supported by the file player is listed among the compatible brands.
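That brand check can be sketched like this; the brand strings used in the test are HEIF brands chosen purely as examples:

```python
def player_supports(major_brand, compatible_brands, supported_brands):
    """A file player may parse and play the file only if a brand it
    supports appears among the file's brands. The compatible-brands
    list should also repeat the major brand, but checking both is a
    safe, cheap precaution."""
    return any(b in supported_brands
               for b in [major_brand, *compatible_brands])
```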
- the media data may be provided in one or more instances of MediaDataBox ('mdat') and the MovieBox ('moov') may be used to enclose the metadata for timed media.
- the 'moov' box may include one or more tracks, and each track may reside in one corresponding TrackBox ('trak').
- Each track is associated with a handler, identified by a four-character code, specifying the track type.
- Video, audio, and image sequence tracks can be collectively called media tracks, and they contain an elementary media stream.
- Other track types comprise hint tracks and timed metadata tracks.
- Tracks comprise samples, such as audio or video frames.
- a media sample may correspond to a coded picture or an access unit.
- a media track refers to samples (which may also be referred to as media samples) formatted according to a media compression format (and its encapsulation to the ISO base media file format).
- a hint track refers to hint samples, containing cookbook instructions for constructing packets for transmission over an indicated communication protocol.
- a timed metadata track may refer to samples describing referred media and/or hint samples.
- the 'trak' box includes in its hierarchy of boxes the SampleTableBox (also known as the sample table or the sample table box).
- the SampleTableBox contains the SampleDescriptionBox, which gives detailed information about the coding type used, and any initialization information needed for that coding.
- the SampleDescriptionBox contains an entry-count and as many sample entries as the entry-count indicates.
- the format of sample entries is track-type specific but derives from generic classes (e.g. VisualSampleEntry, AudioSampleEntry). Which type of sample entry form is used to derive the track-type specific sample entry format is determined by the media handler of the track.
- Movie fragments may be used, for example, when recording content to ISO files, for example, in order to avoid losing data if a recording application crashes, runs out of memory space, or some other incident occurs. Without movie fragments, data loss may occur because the file format may require that all metadata, for example, a movie box, be written in one contiguous area of the file. Furthermore, when recording a file, there may not be sufficient amount of memory space to buffer a movie box for the size of the storage available, and re-computing the contents of a movie box when the movie is closed may be too slow. Moreover, movie fragments may enable simultaneous recording and playback of a file using a regular ISO file parser.
- the movie fragment feature may enable splitting the metadata that otherwise might reside in the movie box into multiple pieces. Each piece may correspond to a certain period of time of a track. In other words, the movie fragment feature may enable interleaving file metadata and media data. Consequently, the size of the movie box may be limited and the use cases mentioned above be realized.
- the media samples for the movie fragments may reside in an mdat box.
- a moof box may be provided.
- the moof box may include the information for a certain duration of playback time that would previously have been in the moov box.
- the moov box may still represent a valid movie on its own, but in addition, it may include an mvex box indicating that movie fragments will follow in the same file.
- the movie fragments may extend the presentation that is associated to the moov box in time.
- within the movie fragment there may be a set of track fragments, including anywhere from zero to a plurality per track.
- the track fragments may in turn include anywhere from zero to a plurality of track runs, each of which documents a contiguous run of samples for that track (and hence they are similar to chunks).
- many fields are optional and can be defaulted.
- the metadata that may be included in the moof box may be limited to a subset of the metadata that may be included in a moov box and may be coded differently in some cases. Details regarding the boxes that can be included in a moof box may be found from the ISOBMFF specification.
- Transformed media tracks may have resulted from applying one or more transformations to media tracks.
- a transformed media track may for example be an encrypted or protected media track or an incomplete media track. Incomplete tracks may result, for example, when samples are received partially.
- the ISO Base Media File Format contains three mechanisms for timed metadata that can be associated with particular samples: sample groups, timed metadata tracks, and sample auxiliary information. Derived specifications may provide similar functionality with one or more of these three mechanisms.
- Per-sample sample auxiliary information may be stored anywhere in the same file as the sample data itself; for self-contained media files, this is typically in a MediaDataBox or a box from a derived specification. It is stored either (a) in multiple chunks, with the number of samples per chunk, as well as the number of chunks, matching the chunking of the primary sample data or (b) in a single chunk for all the samples in a movie sample table (or a movie fragment).
- the Sample Auxiliary Information for all samples contained within a single chunk (or track run) is stored contiguously (similarly to sample data).
- Sample Auxiliary Information, when present, is always stored in the same file as the samples to which it relates, as they share the same data reference ('dref') structure.
- this data may be located anywhere within this file, using auxiliary information offsets ('saio') to indicate the location of the data.
- Files conforming to the ISOBMFF may contain any non-timed objects, referred to as items, meta items, or metadata items, in a meta box (fourCC: 'meta'), which may also be called MetaBox. While the name of the meta box refers to metadata, items can generally contain metadata or media data.
- the meta box may reside at the top level of the file, within a movie box (fourCC: 'moov'), and within a track box (fourCC: 'trak'), but at most one meta box may occur at each of the file level, movie level, or track level.
- the meta box may be required to contain a 'hdlr' box indicating the structure or format of the 'meta' box contents.
- the meta box may list and characterize any number of items that can be referred to, and each one of them can be associated with a file name and is uniquely identified within the file by an item identifier (item_id), which is an integer value.
- the metadata items may be for example stored in the 'idat' box of the meta box or in an 'mdat' box or reside in a separate file. If the metadata is located external to the file then its location may be declared by the DataInformationBox (fourCC: 'dinf').
- the metadata may be encapsulated into either the XMLBox (fourCC: 'xml ') or the BinaryXMLBox (fourCC: 'bxml').
- An item may be stored as a contiguous byte range, or it may be stored in several extents, each being a contiguous byte range. In other words, items may be stored fragmented into extents, e.g. to enable interleaving.
- An extent is a contiguous subset of the bytes of the resource; the resource can be formed by concatenating the extents.
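Reconstructing an item from its extents is then a straightforward concatenation; a sketch, with each extent represented as an illustrative (offset, length) pair into the containing resource:

```python
def reconstruct_item(resource, extents):
    """Form an item's data by concatenating its extents, each a
    contiguous (offset, length) byte range, in their listed order.
    Extents need not be adjacent or in file order, which is what
    enables interleaved storage."""
    return b"".join(resource[off:off + length] for off, length in extents)
```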
- a uniform resource identifier may be defined as a string of characters used to identify a name of a resource. Such identification enables interaction with representations of the resource over a network, using specific protocols.
- a URI is defined through a scheme specifying a concrete syntax and associated protocol for the URI.
- the uniform resource locator (URL) and the uniform resource name (URN) are forms of URI.
- a URL may be defined as a URI that identifies a web resource and specifies the means of acting upon or obtaining the representation of the resource, specifying both its primary access mechanism and network location.
- a URN may be defined as a URI that identifies a resource by name in a particular namespace. A URN may be used for identifying a resource without implying its location or how to access it.
- the ISO base media file format does not limit a presentation to be contained in one file.
- a presentation may be comprised within several files.
- one file may include the metadata for the whole presentation and may thereby include all the media data to make the presentation self-contained.
- Other files, if used, may not be required to be formatted to ISO base media file format, and may be used to include media data, and may also include unused media data, or other information.
- the ISO base media file format concerns the structure of the presentation file only.
- the format of the media-data files may be constrained by the ISO base media file format or its derivative formats only in that the media-data in the media files is formatted as specified in the ISO base media file format or its derivative formats.
- a sample description box included in each track may provide a list of sample entries, each providing detailed information about the coding type used, and any initialization information needed for that coding. All samples of a chunk and all samples of a track fragment may use the same sample entry.
- a chunk may be defined as a contiguous set of samples for one track.
- the Data Reference (dref) box which may also be included in each track, may define an indexed list of uniform resource locators (URLs), uniform resource names (URNs), and/or self-references to the file containing the metadata.
- a sample entry may point to one index of the Data Reference box (which, in the syntax, may be referred to as DataReferenceBox), thereby indicating the file containing the samples of the respective chunk or track fragment.
- DataReferenceBox contains a list of boxes that declare the potential location(s) of the media data referred to by the file. DataReferenceBox is contained by DataInformationBox, which in turn is contained by MediaInformationBox or MetaBox.
- each sample entry of the track contains a data reference index referring to a list entry of the list of box(es) in the DataReferenceBox.
- the box(es) in the DataReferenceBox are extended from FullBox, i.e. contain the version and the flags field in the box header.
- Two box types have been specified to be included in the DataReferenceBox: DataEntryUrlBox and DataEntryUrnBox provide a URL and URN data reference, respectively.
- When the flags field of a DataEntryUrlBox or DataEntryUrnBox is equal to 1 (which may be called the "self-containing" flag or self-contained flag), the respective data reference refers to the containing file itself and no URL or URN string is provided within the DataEntryUrlBox or the DataEntryUrnBox.
- the exact location of samples, excluding samples referred to by movie fragments, may be computed using information provided by (a) DataReferenceBox, (b) SampleToChunkBox, (c) ChunkOffsetBox, and (d) SampleSizesBox. Furthermore, the locating of a sample involves an offset calculation using the start of the file. For samples referred to by movie fragments, the exact location may be computed using information provided in TrackFragmentHeaderBox and TrackFragmentRunBox, and the locating of a sample may involve an offset calculation using either the start of the file or the start of the MovieFragmentBox as a reference.
- the use of offsets may render the file fragile to any edits. For example, it may be sufficient to simply add or delete a byte between the start of a file and a MediaDataBox to destroy the computed offsets and render the file non-decodable. This means that any entity editing a file should be careful to ensure that all offsets computed and set in the file remain valid after it completes its editing.
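The fragility described above can be illustrated with a hedged Python sketch: `adjust_chunk_offsets` is a hypothetical helper (not part of any ISOBMFF library) showing the fix-up an editing entity must perform on ChunkOffsetBox entries after inserting bytes before the MediaDataBox.

```python
# Hypothetical sketch: absolute chunk offsets break when bytes are
# inserted before the MediaDataBox, so an editor must rewrite them.
# Box/field names follow ISOBMFF; the helper itself is illustrative only.

def adjust_chunk_offsets(chunk_offsets, insert_position, inserted_bytes):
    """Return ChunkOffsetBox entries fixed up after an edit that inserts
    `inserted_bytes` bytes at absolute file position `insert_position`."""
    return [
        off + inserted_bytes if off >= insert_position else off
        for off in chunk_offsets
    ]

# Original file: chunks start at these absolute offsets (from file start).
stco = [1024, 4096, 9000]

# An editor inserts a 16-byte box before the MediaDataBox at offset 512;
# every offset at or past the insertion point must shift accordingly.
fixed = adjust_chunk_offsets(stco, 512, 16)
print(fixed)  # [1040, 4112, 9016]
```

If an editor skipped this fix-up, every sample lookup computed from the stale offsets would read the wrong bytes, which is exactly the "non-decodable" failure mode noted above.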
- the ItemLocationBox provides, for each item, an indication of whether the item is located in this or other files (and in the latter case the URL/URN of the other files), the base reference of byte offsets within the file, and the extents from which the item is constructed.
- the ItemLocationBox is indicative of the byte offset relative to the base reference and the length of the extent.
- the ItemLocationBox comprises a data_reference_index, which is a 1-based index referring to a list entry of the list of box(es) in the DataReferenceBox.
- the ItemLocationBox also comprises the construction_method field, which indicates the base reference within the file and can be one of the following:
- file_offset: byte offsets are absolute byte offsets into the file (from the start of the file) identified by data_reference_index; a data_reference_index equal to 0 indicates the same file as the file containing the MetaBox.
- the data origin is the first byte of the enclosing MovieFragmentBox.
- idat_offset: byte offsets are relative to the ItemDataBox in the same MetaBox.
- item_offset: byte offsets are relative to the start of the item data for the item indicated by the item_reference_index field.
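As a hedged illustration of the construction_method values listed above, the following Python sketch maps each value (0 = file_offset, 1 = idat_offset, 2 = item_offset) to the base against which byte offsets are resolved. `resolve_base` and its parameter names are illustrative, not a standard API.

```python
# Illustrative resolver for the ItemLocationBox construction_method field
# (0 = file_offset, 1 = idat_offset, 2 = item_offset, per ISOBMFF).
# The function and its arguments are hypothetical helper names.

FILE_OFFSET, IDAT_OFFSET, ITEM_OFFSET = 0, 1, 2

def resolve_base(construction_method, file_start, idat_start, item_starts,
                 item_reference_index=0):
    """Return the base byte position that an extent's offsets are relative to."""
    if construction_method == FILE_OFFSET:
        return file_start          # absolute offsets from the start of the file
    if construction_method == IDAT_OFFSET:
        return idat_start          # relative to the ItemDataBox in the same MetaBox
    if construction_method == ITEM_OFFSET:
        # relative to the start of the data of the item indicated
        # by item_reference_index
        return item_starts[item_reference_index]
    raise ValueError("unknown construction_method")

print(resolve_base(IDAT_OFFSET, 0, 100, [200]))  # 100
```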
- High Efficiency Image File Format is a standard developed by the Moving Picture Experts Group (MPEG) for storage of images and image sequences.
- MPEG Moving Picture Experts Group
- the standard facilitates file encapsulation of data coded according to High Efficiency Video Coding (HEVC) standard.
- HEIF includes features building on top of the ISO Base Media File Format (ISOBMFF).
- the ISOBMFF structures and features are used to a large extent in the design of HEIF.
- the basic design for HEIF comprises that still images are stored as items and image sequences are stored as tracks.
- HEIF enables storing multiple images in the same file. These images, called image items, can have logical relationships to each other.
- the following boxes may be contained within the root-level 'meta' box and may be used as described in the following.
- the handler value of the Handler box of the 'meta' box is 'pict'.
- the resource (whether within the same file, or in an external file identified by a uniform resource identifier) containing the coded media data is resolved through the Data Information ('dinf') box, whereas the Item Location ('iloc') box stores the position and sizes of every item within the referenced file.
- the Item Reference ('iref') box documents relationships between items using typed referencing.
- the 'meta' box is also flexible to include other boxes that may be necessary to describe items.
- Any number of image items can be included in the same file. Given a collection of images stored by using the 'meta' box approach, it sometimes is essential to qualify certain relationships between images. Examples of such relationships include indicating a cover image for a collection, providing thumbnail images for some or all of the images in the collection, and associating some or all of the images in a collection with an auxiliary image such as an alpha plane. A cover image among the collection of images is indicated using the 'pitm' box. A thumbnail image or an auxiliary image is linked to the primary image item using an item reference of type 'thmb' or 'auxl', respectively.
- the Item Reference ('iref') box has the following syntax:
- reference_type contains an indication of the type of the reference
- from_item_ID contains the item_ID of the item that refers to other items
- reference_count is the number of references
- to_item_ID contains the item_ID of the item referred to.
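The fields above can be illustrated with a minimal Python parser for the payload of one SingleItemTypeReferenceBox (version 0, i.e. 16-bit item IDs). This is a sketch under stated assumptions: box-header handling and the 32-bit "large" variant are omitted for brevity.

```python
import struct

# Illustrative parser for one SingleItemTypeReferenceBox body inside an
# 'iref' box: from_item_ID (16 bits), reference_count (16 bits), then
# reference_count to_item_ID values (16 bits each), all big-endian.

def parse_single_item_type_reference(payload: bytes):
    from_item_id, reference_count = struct.unpack_from(">HH", payload, 0)
    to_item_ids = list(struct.unpack_from(">%dH" % reference_count, payload, 4))
    return from_item_id, to_item_ids

# Example payload: from_item_ID = 1 refers to items 2 and 3.
body = struct.pack(">HHHH", 1, 2, 2, 3)
print(parse_single_item_type_reference(body))  # (1, [2, 3])
```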
- While HEIF provides a mechanism to provide item references from one image item to other image items, it does not provide any mechanism to indicate that the referenced items must be processed in the order they are listed in the item reference box. Lack of such a mechanism may cause, for example, the following consequences:
- additional data structures may be specified to indicate the order of items such as properties or entity groups.
- additional data structures may be specific to the type of the item reference and are therefore not generically applicable.
- the additional data structures are located in the file separately from the item references and hence make file parsing more complicated.
- ItemPropertyContainerBox ('ipco'), and ItemPropertyAssociationBox ('ipma'). Item-related features of ISOBMFF are included by reference in the HEIF standard.
- the method comprises authoring (400) items into a media file, said items being associated with a plurality of referenced items to be processed in a specific order; and authoring (402) one or more properties of item references of said referenced items into said media file, wherein said properties include one or more of the following:
- a checksum generation algorithm may be pre-defined e.g. in a standard or may be indicated in the file.
- the MD5 checksum may be used.
- a file parser may derive the checksum from the referenced item ID values and compare the derived checksum value to the checksum value contained in the file to verify that the item references are intact.
- a cryptographic hash function may be defined as a hash function that is intended to be practically impossible to invert, i.e. to create the input data based on the hash value alone.
- A cryptographic hash function may comprise e.g. the MD5 function.
- An MD5 value may be a null-terminated string of UTF-8 characters containing a base64 encoded MD5 digest of the input data.
- One method of calculating the string is specified in IETF RFC 1864. It should be understood that instead of or in addition to MD5, other types of integrity check schemes could be used in various embodiments, such as different forms of the cyclic redundancy check (CRC), such as the CRC scheme used in ITU-T Recommendation H.271.
- CRC cyclic redundancy check
- a checksum or hash sum may be defined as a small-size datum from an arbitrary block of digital data which may be used for the purpose of detecting errors which may have been introduced during its transmission or storage.
- the actual procedure which yields the checksum, given a data input may be called a checksum function or checksum algorithm.
- a checksum algorithm will usually output a significantly different value, even for small changes made to the input. This is especially true of cryptographic hash functions, which may be used to detect many data corruption errors and verify overall data integrity; if the computed checksum for the current data input matches the stored value of a previously computed checksum, there is a high probability the data has not been altered or corrupted.
- checksum may be defined to be equivalent to a cryptographic hash value or alike.
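A base64-encoded MD5 digest of the kind referenced above (one method of calculating the string is specified in IETF RFC 1864) can be computed with the Python standard library; `md5_base64` is an illustrative helper name.

```python
import base64
import hashlib

# Minimal sketch of the RFC 1864-style checksum discussed above:
# a base64-encoded MD5 digest of the input data. How the input bytes are
# formed from item ID values is left open here; this only shows the digest.

def md5_base64(data: bytes) -> str:
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

print(md5_base64(b"hello"))  # XUFAKrxLKna5cZ2REBfFkg==
```

A parser recomputing this digest over the same input and comparing it to a stored string is the integrity check described above: any alteration of the input yields, with high probability, a different digest.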
- the properties are interpreted by a file editor to conclude which types of editing operations may be performed.
- the method enables indicating that the referenced items must be processed in the order they are listed in the item reference box.
- the file editor knows that the indicated item references shall not be reordered.
- the flexibility of the editing processes is increased.
- a new ItemReferenceBox syntax and data structure is defined in order to indicate strictly ordered item references. By introducing a new version, it is possible to indicate that strict ordering of image items is applied. Hence, the required functionality is achieved with no additional data structure definitions other than the item referencing.
- a new version of the ItemReferenceBox is defined as follows:
- versions 2 and 3: two new FullBox versions, referred to as version 2 and 3, are added to the ItemReferenceBox data structure. It is noted that version numbers 2 and 3 are given only as examples, and any other numbering may be used, as well.
- item references carried in SingleItemTypeReferenceBox and SingleItemTypeReferenceLargeBox are strictly ordered and their order shall not be modified/changed, and referenced items shall be processed in the provided order.
- Zero or one ItemReferenceBoxes that have version 0 or 1 are allowed in a MetaBox to carry item references that are not strictly ordered.
- Zero or one ItemReferenceBoxes that have version 2 or 3 are allowed in a MetaBox to carry item references that are strictly ordered.
- a flags value other than 0 indicates that all the items listed in this ItemReferenceBox are strictly ordered and they shall be processed in the provided order.
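The versioning and flags conventions above can be sketched as a small predicate. The version numbers 2 and 3 and the nonzero-flags rule are the example values from the text, not fixed normatively:

```python
# Hedged sketch of the ItemReferenceBox strict-ordering signalling above.
# Versions 2/3 and the nonzero flags value are the example numbering from
# the text; a real specification would pin these down normatively.

def item_references_strictly_ordered(version: int, flags: int) -> bool:
    """True if the item references in an ItemReferenceBox with the given
    FullBox version and flags must be processed in the listed order."""
    if version in (2, 3):          # new versions dedicated to strict ordering
        return True
    return flags != 0              # alternatively signalled via the flags field

print(item_references_strictly_ordered(2, 0))  # True
print(item_references_strictly_ordered(0, 0))  # False
print(item_references_strictly_ordered(1, 1))  # True
```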
- class ItemReferenceBox extends FullBox('iref', version, 0) {
- strict_ordering_flag indicates that the items listed in this ItemReferenceBox are strictly ordered and they shall be processed in the provided order. Again, the name of the flag and its location in the data structure is given as an example.
- class StrictOrderingBox extends Box('stro') {
- StrictOrderingBox encapsulates the strict ordering related data structure, and strict_ordering_flag indicates that the items listed in this ItemReferenceBox are strictly ordered and they shall be processed in the provided order.
- a new flag may be defined for the ItemReferenceBox in order to indicate the presence of the StrictOrderingBox.
- StrictItemReferenceBox may be defined as follows:
- class StrictItemReferenceBox extends FullBox('sref', version, flags) {
- StrictItemReferenceBox is given only as an example.
- a new 4CC code may be used.
- 'sref' is used in the above data structure.
- SingleItemTypeReferenceLargeBox: In this approach, the following two new boxes are defined in order to indicate strict ordering of items.
- ItemReferenceBox can be extended with new versions in order to cover the strict ordering cases as follows:
- ItemReferenceBox is allowed to carry child boxes both indicating strict ordering and not indicating strict ordering of item references.
- the following syntax may be used:
- the order of child boxes is not constrained and the version field of the ItemReferenceBox is merely used to differentiate between 16-bit and 32-bit item ID values.
- ShortItemIdBox may be SingleItemTypeReferenceBox or SingleItemTypeStrictReferenceBox and need not be the same type of box in each entry.
- LongItemIdBox may be SingleItemTypeReferenceBoxLarge or SingleItemTypeStrictReferenceBoxLarge and need not be the same type of box in each entry.
- SingleItemTypeReferenceBox and SingleItemTypeReferenceLargeBox are extended to indicate properties of the references.
- the properties may comprise but are not limited to one or more of the following:
- a checksum generation algorithm may be pre-defined e.g. in a standard or may be indicated in the file. For example, the MD5 checksum may be used.
- a file parser may derive the checksum from the to_item_ID values and compare the derived checksum value to the checksum value contained in the box to verify that the item references are intact.
- SingleItemTypeReferenceBox may be extended as follows. It needs to be understood that SingleItemTypeReferenceBoxLarge could be similarly extended. It is assumed that such extensions are backward-compatible, since existing readers would stop parsing immediately after the loop.
- strict_ordering_flag equal to 0 indicates that the order of the to_item_ID values might or might not affect the interpretation of the item identified by the from_item_ID value. references_essential_flag equal to 0 indicates that item(s) identified by to_item_ID value(s) may be removed without affecting the interpretation of the item identified by the from_item_ID value. references_essential_flag equal to 1 indicates that removal of an item identified by any to_item_ID value affects the interpretation of the item identified by the from_item_ID value.
- the item identified by the from_item_ID shall also be removed from the file.
- md5_string is a null-terminated string of UTF-8 characters containing a base64 encoded MD5 digest of all the to_item_ID values in the order they are listed in the containing box.
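The md5_string derivation can be sketched as follows. The serialization of the to_item_ID values into bytes (big-endian 16-bit, matching the 16-bit fields of SingleItemTypeReferenceBox) is an assumption made for illustration; the text above does not fix a byte layout.

```python
import base64
import hashlib
import struct

# Hypothetical derivation of md5_string as described above: a base64-encoded
# MD5 digest over the to_item_ID values in their listed order. The
# big-endian 16-bit serialization is an illustrative assumption.

def md5_string(to_item_ids):
    data = b"".join(struct.pack(">H", i) for i in to_item_ids)
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

# A parser recomputes the digest and compares it to the stored string.
stored = md5_string([2, 3, 7])
assert md5_string([2, 3, 7]) == stored   # intact references
assert md5_string([3, 2, 7]) != stored   # reordered -> digest mismatch
```

Because the digest covers the IDs in listed order, any reordering or removal of referenced items is detected, which is exactly the integrity property the field is meant to provide.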
- Figure 5 shows a block diagram of a video encoder suitable for obtaining such image items.
- Figure 5 presents an encoder for two layers, but it would be appreciated that presented encoder could be similarly extended to encode more than two layers.
- Figure 5 illustrates an embodiment of a video encoder comprising a first encoder section 500 for a base layer and a second encoder section 502 for an enhancement layer.
- Each of the first encoder section 500 and the second encoder section 502 may comprise similar elements for encoding incoming pictures.
- the encoder sections 500, 502 may comprise a pixel predictor 302, 402, prediction error encoder 303, 403 and prediction error decoder 304, 404.
- Figure 4 also shows an embodiment of the pixel predictor 302, 402 as comprising an inter-predictor 306, 406, an intra-predictor 308, 408, a mode selector 310, 410, a filter 316, 416, and a reference frame memory 318, 418.
- the pixel predictor 302 of the first encoder section 500 receives 300 base layer images of a video stream to be encoded at both the inter predictor 306 (which determines the difference between the image and a motion compensated reference frame 318) and the intra-predictor 308 (which determines a prediction for an image block based only on the already processed parts of current frame or picture).
- the output of both the inter-predictor and the intra-predictor are passed to the mode selector 310.
- the intra predictor 308 may have more than one intra-prediction modes. Hence, each mode may perform the intra-prediction and provide the predicted signal to the mode selector 310.
- the mode selector 310 also receives a copy of the base layer picture 300.
- the pixel predictor 402 of the second encoder section 502 receives 400 enhancement layer images of a video stream to be encoded at both the inter-predictor 406 (which determines the difference between the image and a motion compensated reference frame 418) and the intra predictor 408 (which determines a prediction for an image block based only on the already processed parts of current frame or picture).
- the output of both the inter-predictor and the intra-predictor are passed to the mode selector 410.
- the intra-predictor 408 may have more than one intra-prediction modes. Hence, each mode may perform the intra-prediction and provide the predicted signal to the mode selector 410.
- the mode selector 410 also receives a copy of the enhancement layer picture 400.
- the output of the inter-predictor 306, 406 or the output of one of the optional intra-predictor modes or the output of a surface encoder within the mode selector is passed to the output of the mode selector 310, 410.
- the output of the mode selector is passed to a first summing device 321, 421.
- the first summing device may subtract the output of the pixel predictor 302, 402 from the base layer picture 300/enhancement layer picture 400 to produce a first prediction error signal 320, 420 which is input to the prediction error encoder 303, 403.
- the pixel predictor 302, 402 further receives from a preliminary reconstructor 339, 439 the combination of the prediction representation of the image block 312, 412 and the output 338, 438 of the prediction error decoder 304, 404.
- the preliminary reconstructed image 314, 414 may be passed to the intra-predictor 308, 408 and to a filter 316, 416.
- the filter 316, 416 receiving the preliminary representation may filter the preliminary representation and output a final reconstructed image, which may be saved in the reference frame memory 318, 418.
- the reference frame memory 318 may be connected to the inter-predictor 306 to be used as the reference image against which a future base layer picture 300 is compared in inter-prediction operations.
- the reference frame memory 318 may also be connected to the inter-predictor 406 to be used as the reference image against which a future enhancement layer picture 400 is compared in inter-prediction operations.
- the reference frame memory 418 may be connected to the inter predictor 406 to be used as the reference image against which a future enhancement layer picture 400 is compared in inter-prediction operations.
- Filtering parameters from the filter 316 of the first encoder section 500 may be provided to the second encoder section 502 subject to the base layer being selected and indicated to be source for predicting the filtering parameters of the enhancement layer according to some embodiments.
- the prediction error encoder 303, 403 comprises a transform unit 342, 442 and a quantizer 344, 444.
- the transform unit 342, 442 transforms the first prediction error signal 320, 420 to a transform domain.
- the transform is, for example, the DCT transform.
- the quantizer 344, 444 quantizes the transform domain signal, e.g. the DCT coefficients, to form quantized coefficients.
- the prediction error decoder 304, 404 receives the output from the prediction error encoder 303, 403 and performs the opposite processes of the prediction error encoder 303, 403.
- the prediction error decoder may be considered to comprise a dequantizer 361, 461, which dequantizes the quantized coefficient values, e.g. DCT coefficients, to reconstruct the transform signal, and an inverse transform unit, which applies the inverse transformation to the reconstructed transform signal.
- the prediction error decoder may also comprise a block filter which may filter the reconstructed block(s) according to further decoded information and filter parameters.
- the entropy encoder 330, 430 receives the output of the prediction error encoder 303, 403 and may perform a suitable entropy encoding/variable length encoding on the signal to provide error detection and correction capability.
- the outputs of the entropy encoders 330, 430 may be inserted into a bitstream e.g. by a multiplexer 508.
- Figure 6 shows a block diagram of a video decoder suitable for employing embodiments of the invention.
- Figure 6 depicts a structure of a two-layer decoder, but it would be appreciated that the decoding operations may similarly be employed in a single layer decoder.
- the video decoder 550 comprises a first decoder section 552 for a base layer and a second decoder section 554 for a predicted layer.
- Block 556 illustrates a demultiplexer for delivering information regarding base layer pictures to the first decoder section 552 and for delivering information regarding predicted layer pictures to the second decoder section 554.
- Reference P’n stands for a predicted representation of an image block.
- Reference D’n stands for a reconstructed prediction error signal.
- Blocks 704, 804 illustrate preliminary reconstructed images (I'n).
- Blocks 703, 803 illustrate inverse transform (T⁻¹). Blocks 702, 802 illustrate inverse quantization (Q⁻¹).
- Blocks 701, 801 illustrate entropy decoding (E⁻¹).
- Blocks 705, 805 illustrate a reference frame memory (RFM).
- Blocks 706, 806 illustrate prediction (P) (either inter prediction or intra prediction).
- Blocks 707, 807 illustrate filtering (F).
- Blocks 708, 808 may be used to combine decoded prediction error information with predicted base layer/predicted layer images to obtain the preliminary reconstructed images (I'n).
- Preliminary reconstructed and filtered base layer images may be output 709 from the first decoder section 552 and preliminary reconstructed and filtered predicted layer images may be output 809 from the second decoder section 554.
- the decoder should be interpreted to cover any operational unit capable of carrying out the decoding operations, such as a player, a (file) editor, a receiver, a gateway, a demultiplexer and/or a decoder.
- Figure 7 shows a flow chart of the operation of a file editor upon receiving the media file.
- the method comprises receiving (700) a media file authored with items comprising a plurality of referenced items to be processed in a specific order; reading (702) one or more properties of item references of said referenced items from said media file, wherein said properties include one or more of the following: an indication if the item references are strictly ordered, an indication if the referenced items are removable without making a referencing item invalid, a checksum generated from ID values of the referenced items in the order they are listed; and parsing (704) the media file according to said one or more properties.
- a file editor receives an instruction to remove a first item from a media file.
- the file editor checks from ItemReferenceBoxes if the first item is referenced by any other item in the media file.
- the file editor removes the first item if it is not referenced by any other item or if it is indicated in the media file that the first item is removable without making the referencing item(s) invalid.
- a file editor receives an instruction to reorder a first item in a first item reference.
- the file editor reorders the first item if the first item reference is not indicated to be strictly ordered.
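The editor checks in the two embodiments above can be sketched as follows; `ItemRef`, `can_remove` and `can_reorder` are hypothetical names invented for this illustration, not part of any file-format API.

```python
# Sketch of the file-editor decisions described above: an item may be
# removed only if it is unreferenced or its references are marked
# non-essential, and a reference list may be reordered only if it is
# not marked strictly ordered. All names here are illustrative.
from dataclasses import dataclass


@dataclass
class ItemRef:
    from_item_id: int
    to_item_ids: list
    strictly_ordered: bool = False
    references_essential: bool = False  # removal would invalidate the referrer


def can_remove(item_id, refs):
    """True if the item is unreferenced, or every reference to it is
    marked removable without making the referencing item invalid."""
    return all(item_id not in r.to_item_ids or not r.references_essential
               for r in refs)


def can_reorder(ref):
    """True if the item reference is not indicated to be strictly ordered."""
    return not ref.strictly_ordered


refs = [ItemRef(1, [2, 3], strictly_ordered=True, references_essential=True)]
print(can_remove(2, refs))   # False: essential reference from item 1
print(can_reorder(refs[0]))  # False: strictly ordered
```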
- FIG 8 is a graphical representation of an example multimedia communication system within which various embodiments may be implemented.
- a data source 1510 provides a source signal in an analog, uncompressed digital, or compressed digital format, or any combination of these formats.
- An encoder 1520 may include or be connected with pre-processing, such as data format conversion and/or filtering of the source signal.
- the encoder 1520 encodes the source signal into a coded media bitstream. It should be noted that a bitstream to be decoded may be received directly or indirectly from a remote device located within virtually any type of network. Additionally, the bitstream may be received from local hardware or software.
- the encoder 1520 may be capable of encoding more than one media type, such as audio and video, or more than one encoder 1520 may be required to code different media types of the source signal.
- the encoder 1520 may also get synthetically produced input, such as graphics and text, or it may be capable of producing coded bitstreams of synthetic media.
- only processing of one coded media bitstream of one media type is considered to simplify the description.
- typically real-time broadcast services comprise several streams (typically at least one audio, video and text sub-titling stream).
- the system may include many encoders, but in the figure only one encoder 1520 is represented to simplify the description without a lack of generality.
- the coded media bitstream may be transferred to a storage 1530.
- the storage 1530 may comprise any type of mass memory to store the coded media bitstream.
- the format of the coded media bitstream in the storage 1530 may be an elementary self-contained bitstream format, or one or more coded media bitstreams may be encapsulated into a container file, or the coded media bitstream may be encapsulated into a Segment format suitable for DASH (or a similar streaming system) and stored as a sequence of Segments.
- a file generator (not shown in the figure) may be used to store the one or more media bitstreams in the file and create file format metadata, which may also be stored in the file.
- the encoder 1520 or the storage 1530 may comprise the file generator, or the file generator is operationally attached to either the encoder 1520 or the storage 1530.
- Some systems operate "live", i.e. omit storage and transfer the coded media bitstream from the encoder 1520 directly to the sender 1540. The coded media bitstream may then be transferred to the sender 1540, also referred to as the server, on a need basis.
- the format used in the transmission may be an elementary self-contained bitstream format, a packet stream format, a Segment format suitable for DASH (or a similar streaming system), or one or more coded media bitstreams may be encapsulated into a container file.
- the encoder 1520, the storage 1530, and the server 1540 may reside in the same physical device or they may be included in separate devices.
- the encoder 1520 and server 1540 may operate with live real-time content, in which case the coded media bitstream is typically not stored
- the server 1540 sends the coded media bitstream using a communication protocol stack.
- the stack may include but is not limited to one or more of Real-Time Transport Protocol (RTP), User Datagram Protocol (UDP), Hypertext Transfer Protocol (HTTP), Transmission Control Protocol (TCP), and Internet Protocol (IP).
- RTP Real-Time Transport Protocol
- UDP User Datagram Protocol
- HTTP Hypertext Transfer Protocol
- TCP Transmission Control Protocol
- IP Internet Protocol
- the server 1540 encapsulates the coded media bitstream into packets.
- the sender 1540 may comprise or be operationally attached to a "sending file parser" (not shown in the figure).
- a sending file parser locates appropriate parts of the coded media bitstream to be conveyed over the communication protocol.
- the sending file parser may also help in creating the correct format for the communication protocol, such as packet headers and payloads.
- the multimedia container file may contain encapsulation instructions, such as hint tracks in the ISOBMFF, for encapsulation of the contained media bitstream(s) on the communication protocol.
- the server 1540 may or may not be connected to a gateway 1550 through a communication network, which may e.g. be a combination of a CDN, the Internet and/or one or more access networks.
- the gateway may also or alternatively be referred to as a middle-box.
- the gateway may be an edge server (of a CDN) or a web proxy. It is noted that the system may generally comprise any number of gateways or alike, but for the sake of simplicity, the following description only considers one gateway 1550.
- the gateway 1550 may perform different types of functions, such as translation of a packet stream according to one communication protocol stack to another communication protocol stack, merging and forking of data streams, and manipulation of data stream according to the downlink and/or receiver capabilities, such as controlling the bit rate of the forwarded stream according to prevailing downlink network conditions.
- the system includes one or more receivers 1560, typically capable of receiving, demodulating, and de-capsulating the transmitted signal into a coded media bitstream.
- the coded media bitstream may be transferred to a recording storage 1570.
- the recording storage 1570 may comprise any type of mass memory to store the coded media bitstream.
- the recording storage 1570 may alternatively or additionally comprise computation memory, such as random access memory.
- the format of the coded media bitstream in the recording storage 1570 may be an elementary self-contained bitstream format, or one or more coded media bitstreams may be encapsulated into a container file.
- a container file is typically used and the receiver 1560 comprises or is attached to a container file generator producing a container file from input streams.
- Some systems operate "live," i.e. omit the recording storage 1570 and transfer the coded media bitstream from the receiver 1560 directly to the decoder 1580.
- the most recent part of the recorded stream, e.g. the most recent 10-minute excerpt of the recorded stream, is maintained in the recording storage 1570, while any earlier recorded data is discarded from the recording storage 1570.
- the coded media bitstream may be transferred from the recording storage 1570 to the decoder 1580. If there are many coded media bitstreams, such as an audio stream and a video stream, associated with each other and encapsulated into a container file or a single media bitstream is encapsulated in a container file e.g. for easier access, a file parser (not shown in the figure) is used to decapsulate each coded media bitstream from the container file.
- the recording storage 1570 or a decoder 1580 may comprise the file parser, or the file parser is attached to either the recording storage 1570 or the decoder 1580. It should also be noted that the system may include many decoders, but here only one decoder 1580 is discussed to simplify the description without a lack of generality.
- the coded media bitstream may be processed further by the decoder 1580, whose output is one or more uncompressed media streams.
- a renderer 1590 may reproduce the uncompressed media streams with a loudspeaker or a display, for example.
- the receiver 1560, recording storage 1570, decoder 1580, and renderer 1590 may reside in the same physical device or they may be included in separate devices.
- a sender 1540 and/or a gateway 1550 may be configured to perform switching between different representations e.g. for switching between different viewports of 360- degree video content, view switching, bitrate adaptation and/or fast start-up, and/or a sender 1540 and/or a gateway 1550 may be configured to select the transmitted representation(s). Switching between different representations may take place for multiple reasons, such as to respond to requests of the receiver 1560 or prevailing conditions, such as throughput, of the network over which the bitstream is conveyed. In other words, the receiver 1560 may initiate switching between representations.
- a request from the receiver can be, e.g., a request for a Segment or a Subsegment from a different representation than earlier, a request for a change of transmitted scalability layers and/or sub-layers, or a change of a rendering device having different capabilities compared to the previous one.
- a request for a Segment may be an HTTP GET request.
- a request for a Subsegment may be an HTTP GET request with a byte range.
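A Subsegment request of the kind described above, i.e. an HTTP GET with a byte range, might look as follows in Python; the URL and byte range are placeholder values invented for this illustration.

```python
# Illustrative Subsegment request: an HTTP GET carrying a Range header,
# as used for byte-range requests in DASH-like streaming. The URL and
# byte range below are placeholders, not real resources.
import urllib.request

req = urllib.request.Request(
    "https://example.com/video/rep2/segment1.m4s",
    headers={"Range": "bytes=0-65535"},  # first 64 KiB of the Segment
)

# A server honouring the range would answer with HTTP 206 Partial Content:
# with urllib.request.urlopen(req) as resp:
#     data = resp.read()

print(req.get_header("Range"))  # bytes=0-65535
```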
- bitrate adjustment or bitrate adaptation may be used for example for providing so-called fast start-up in streaming services, where the bitrate of the transmitted stream is lower than the channel bitrate after starting or random-accessing the streaming in order to start playback immediately and to achieve a buffer occupancy level that tolerates occasional packet delays and/or retransmissions.
- Bitrate adaptation may include multiple representation or layer up-switching and representation or layer down-switching operations taking place in various orders.
- a decoder 1580 may be configured to perform switching between different representations e.g. for switching between different viewports of 360-degree video content, view switching, bitrate adaptation and/or fast start-up, and/or a decoder 1580 may be configured to select the transmitted representation(s). Switching between different representations may take place for multiple reasons, such as to achieve faster decoding operation or to adapt the transmitted bitstream, e.g. in terms of bitrate, to prevailing conditions, such as throughput, of the network over which the bitstream is conveyed.
- Faster decoding operation might be needed for example if the device including the decoder 1580 is multi-tasking and uses computing resources for other purposes than decoding the video bitstream.
- faster decoding operation might be needed when content is played back at a faster pace than the normal playback speed, e.g. twice or three times faster than conventional real-time playback rate.
- user equipment may comprise a video codec such as those described in embodiments of the invention above. It shall be appreciated that the term user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers.
- elements of a public land mobile network may also comprise video codecs as described above.
- the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
- some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto.
- While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
- the embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware.
- any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions.
- the software may be stored on such physical media as memory chips, or memory blocks implemented within the processor, magnetic media such as hard disks or floppy disks, and optical media such as, for example, DVD and its data variants, and CD.
- the memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory.
- the data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs) and processors based on multi-core processor architecture, as non-limiting examples.
- Embodiments of the inventions may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
- Programs such as those provided by Synopsys, Inc. of Mountain View, California and Cadence Design, of San Jose, California automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules.
- the resultant design in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or "fab" for fabrication.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Television Signal Processing For Recording (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FI20195453 | 2019-05-29 | ||
PCT/FI2020/050356 WO2020240089A1 (en) | 2019-05-29 | 2020-05-27 | An apparatus, a method and a computer program for video coding and decoding |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3977750A1 true EP3977750A1 (en) | 2022-04-06 |
EP3977750A4 EP3977750A4 (en) | 2023-06-21 |
Family
ID=73552016
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP20814633.2A Pending EP3977750A4 (en) | 2019-05-29 | 2020-05-27 | An apparatus, a method and a computer program for video coding and decoding |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP3977750A4 (en) |
WO (1) | WO2020240089A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024076494A1 (en) * | 2022-10-03 | 2024-04-11 | Bytedance Inc. | Enhanced signalling of preselection in a media file |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201502205D0 (en) * | 2015-02-10 | 2015-03-25 | Canon Kabushiki Kaisha And Telecom Paris Tech | Image data encapsulation |
GB2538997A (en) * | 2015-06-03 | 2016-12-07 | Nokia Technologies Oy | A method, an apparatus, a computer program for video coding |
- 2020
- 2020-05-27 EP EP20814633.2A patent/EP3977750A4/en active Pending
- 2020-05-27 WO PCT/FI2020/050356 patent/WO2020240089A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
WO2020240089A1 (en) | 2020-12-03 |
EP3977750A4 (en) | 2023-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3703384B1 (en) | Media encapsulating and decapsulating | |
EP3092772B1 (en) | Media encapsulating and decapsulating | |
EP3257261B1 (en) | A method, an apparatus and a computer program product for processing image sequence tracks | |
RU2434277C2 (en) | Recording multimedia data stream into track for indicating media file reception | |
JP6649404B2 (en) | Apparatus, method and computer program for image coding / decoding | |
US20090177942A1 (en) | Systems and methods for media container file generation | |
US20210314626A1 (en) | Apparatus, a method and a computer program for video coding and decoding | |
EP2580920A1 (en) | Method and apparatus for encapsulating coded multi-component video | |
US20130097334A1 (en) | Method and apparatus for encapsulating coded multi-component video | |
US20130091154A1 (en) | Method And Apparatus For Encapsulating Coded Multi-Component Video | |
WO2016097482A1 (en) | Media encapsulating and decapsulating | |
US11553258B2 (en) | Apparatus, a method and a computer program for video coding and decoding | |
WO2020240089A1 (en) | An apparatus, a method and a computer program for video coding and decoding | |
EP4068781A1 (en) | File format with identified media data box mapping with track fragment box |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed |
Effective date: 20220103 |
AK | Designated contracting states |
Kind code of ref document: A1 |
Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
A4 | Supplementary search report drawn up and despatched |
Effective date: 20230522 |
RIC1 | Information provided on ipc code assigned before grant |
Ipc: H04N 19/46 20140101ALI20230515BHEP |
Ipc: H04N 19/30 20140101ALI20230515BHEP |
Ipc: H04N 21/835 20110101ALI20230515BHEP |
Ipc: G06F 16/61 20190101ALI20230515BHEP |
Ipc: G06F 16/41 20190101ALI20230515BHEP |
Ipc: H04N 19/70 20140101ALI20230515BHEP |
Ipc: G06F 16/71 20190101ALI20230515BHEP |
Ipc: G06F 16/51 20190101ALI20230515BHEP |
Ipc: H04N 21/854 20110101ALI20230515BHEP |
Ipc: H04N 21/845 20110101AFI20230515BHEP |