GB2582024A - Method and apparatus for encapsulating groups of images in a file - Google Patents


Info

Publication number
GB2582024A
GB2582024A (Application GB1903174.9A)
Authority
GB
United Kingdom
Prior art keywords
property
text
data structure
group
entities
Prior art date
Legal status
Granted
Application number
GB1903174.9A
Other versions
GB2582024B (en)
GB201903174D0 (en)
Inventor
Maze Frédéric
Denoual Franck
Current Assignee
Canon Inc
Original Assignee
Canon Inc
Priority date
Filing date
Publication date
Priority to GB1903174.9A priority Critical patent/GB2582024B/en
Application filed by Canon Inc filed Critical Canon Inc
Priority to CN201980023648.1A priority patent/CN111989932B/en
Priority to PCT/EP2019/058511 priority patent/WO2019193097A1/en
Priority to KR1020207031077A priority patent/KR102465188B1/en
Priority to JP2020551333A priority patent/JP7090730B2/en
Priority to EP19715912.2A priority patent/EP3777221A1/en
Priority to US17/044,763 priority patent/US12008052B2/en
Publication of GB201903174D0 publication Critical patent/GB201903174D0/en
Publication of GB2582024A publication Critical patent/GB2582024A/en
Application granted granted Critical
Publication of GB2582024B publication Critical patent/GB2582024B/en
Priority to JP2022096748A priority patent/JP7307840B2/en
Priority to US18/604,283 priority patent/US20240220548A1/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/85406Content authoring involving a specific file format, e.g. MP4 format
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84Generation or processing of descriptive data, e.g. content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

Encapsulating or de-encapsulating media data to or from a file, the file comprising the media data, a grouping data structure describing a group of entities 402 (where each entity corresponds to at least a part of the media data), and a property data structure containing details of a property associated with a group of entities and details of the association between properties and groups of entities 403, 405. The details of the properties are in a property container data structure, and this container preferably includes a text property and an associated language attribute. Each property may have a number of text attributes including a name, description and tag. The purpose of the invention is to allow other entity (item or track) groupings not previously specified with arbitrary properties describing these groupings. A second disclosed invention involves encapsulating or de-encapsulating media data to or from a file, the file comprising the media data a grouping data structure describing a group of entities (where each entity corresponds to at least a part of the media data), where the grouping data structure includes a text property associated with a group of entities and a language attribute associated with the text attribute.

Description

METHOD AND APPARATUS FOR ENCAPSULATING GROUPS OF IMAGES
IN A FILE
The present disclosure concerns a method and a device for encapsulating multiple images in a file.
Modern cameras provide different capture modes to capture images. Some of these capture modes result in capturing series of images. For example, they offer bracketing modes where several images are captured, the value of one capture parameter varying from one captured image to another. The parameter may be, for example, the exposure time, the white balance, or the focus. The image burst mode provides the ability to take a series of images with no delay; it can be used to capture a fast event in sport, for example. Panorama mode allows obtaining a series of overlapping images to reconstitute a large view of a scene. Modern cameras also provide collection modes that allow users to organize captured images and to create groups of images. Users may create groups of images to identify their favourite images, or collections of images corresponding to some given interests. In fact, any kind of group of images, sequences of images, or both might be contemplated.
Images captured by a camera are stored on a storage device, like a memory card for example. The images are typically encoded to reduce the size of data on the storage device. Many encoding standards may be used, such as JPEG or the more recent HEVC standard.
The HEVC standard defines a profile for the encoding of still images and describes specific tools for compressing single still images or sequences of still images. An extension of the ISO Base Media File Format (ISOBMFF) used for such kind of image data has been proposed for inclusion into the ISO/IEC 23008 standard, in Part 12, under the name "HEIF" or "High Efficiency Image File Format".
HEIF (High Efficiency Image File Format) is a standard developed by the Moving Picture Experts Group (MPEG) for storage and sharing of one or more images and image sequences.
The MIAF (Multi-Image Application Format) is a standard developed by MPEG as ISO/IEC 23000 standard part 22 that defines a set of constraints on the HEIF specification to specify interoperability points for the creation, reading, parsing and decoding of images embedded in the High Efficiency Image File Format (HEIF).
While providing limited grouping mechanisms and limited mechanisms to describe properties of encoded images, the HEIF and MIAF file formats do not provide efficient grouping and property description mechanisms adapted to gather and describe images and groups of images resulting from a capture or edit according to one of the cited capture or collection modes.
The present invention has been devised to address one or more of the foregoing concerns. It concerns the extension of the grouping and the properties description mechanisms in HEIF adapted to capture and editing modes resulting in a plurality of images.
According to an aspect of the invention, there is provided a method of encapsulating media data in a file, wherein the method comprises: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a property container data structure containing a property associated with a group of entities; generating an association data structure comprising association information between the property and the grouping data structure; generating a property data structure comprising the property container data structure and the association data structure; and embedding the grouping data structure, the property data structure, and the media data in the file.
In an embodiment, the method further comprises: generating a text property in the property container data structure comprising at least one text attribute associated with a group of entities and a language attribute associated with the at least one text attribute.
In an embodiment, the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
In an embodiment, the property container data structure comprises first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
In an embodiment, the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
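As a purely illustrative sketch of the idea above (the structure, field names and language tags are assumptions for illustration, not the patent's on-disk syntax), a group text property carrying name, description and tag attributes with per-language variants could be modelled in Python as:

```python
# Hypothetical in-memory model of a group text property: each text attribute
# (name, description, tag) is paired with a language attribute, and the same
# attribute kind may appear once per language.
text_property = {
    "group_id": 100,
    "attributes": [
        {"kind": "name",        "lang": "en-GB", "text": "Holiday burst"},
        {"kind": "name",        "lang": "fr-FR", "text": "Rafale de vacances"},
        {"kind": "description", "lang": "en-GB", "text": "Burst shot at the beach"},
        {"kind": "tag",         "lang": "en-GB", "text": "beach,summer"},
    ],
}

def names_by_language(prop):
    """Collect each 'name' attribute keyed by its language attribute."""
    return {a["lang"]: a["text"] for a in prop["attributes"] if a["kind"] == "name"}

names = names_by_language(text_property)
# names == {"en-GB": "Holiday burst", "fr-FR": "Rafale de vacances"}
```

A reader selecting a display name for the group would pick the entry whose language attribute best matches the user's locale.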
In an embodiment, the method comprises: generating at least one of a name text property, a description text property, and/or a tag text property in the property container data structure, each of the text properties comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute.
In an embodiment, each text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
In an embodiment, the property container data structure comprises first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
In an embodiment, the association data structure further comprises at least an association of a property with an entity.
According to another aspect of the invention, there is provided a method of encapsulating media data in a file, wherein the method comprises: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and embedding the media data, and the grouping data structure in the file.
In an embodiment, the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
In an embodiment, the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
According to another aspect of the invention, there is provided a method of reading media data in a file, wherein the method comprises: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a property container data structure containing a property associated with a group of entities; reading an association data structure comprising association information between the property and the grouping data structure; reading a property data structure comprising the property container data structure and the association data structure; and reading the media data identified in the grouping data structure according to the property.
In an embodiment, the method further comprises: reading a text property in the property container data structure comprising at least one text attribute associated with a group of entities and a language attribute associated with the at least one text attribute.
In an embodiment, the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
In an embodiment, the property container data structure comprises first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
In an embodiment, the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
In an embodiment, the method comprises: reading at least one of a name text property, a description text property, and/or a tag text property in the property container data structure, each of the text properties comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute.
In an embodiment, each text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
In an embodiment, the property container data structure comprises first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
In an embodiment, the association data structure further comprises at least an association of a property with an entity.
According to another aspect of the invention, there is provided a method of reading media data in a file, wherein the method comprises: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and reading the media data identified in the grouping data structure according to the property.
In an embodiment, the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
In an embodiment, the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
According to another aspect of the invention, there is provided a device for encapsulating media data in a file, wherein the device comprises circuitry configured for: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a property container data structure containing a property associated with a group of entities; generating an association data structure comprising association information between the property and the grouping data structure; generating a property data structure comprising the property container data structure and the association data structure; and embedding the grouping data structure, the property data structure, and the media data in the file.
According to another aspect of the invention, there is provided a device for encapsulating media data in a file, wherein the device comprises circuitry configured for: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and embedding the media data, and the grouping data structure in the file.
According to another aspect of the invention, there is provided a device for reading media data in a file, wherein the device comprises circuitry configured for: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a property container data structure containing a property associated with a group of entities; reading an association data structure comprising association information between the property and the grouping data structure; reading a property data structure comprising the property container data structure and the association data structure; and reading the media data identified in the grouping data structure according to the property.
According to another aspect of the invention, there is provided a device for reading media data in a file, wherein the device comprises circuitry configured for: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and reading the media data identified in the grouping data structure according to the property.
According to another embodiment, there is provided a computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to the invention, when loaded into and executed by the programmable apparatus.
According to another embodiment, there is provided a computer-readable storage medium storing instructions of a computer program for implementing a method according to the invention.
According to another embodiment, there is provided a computer program which upon execution causes the method of the invention to be performed.
At least parts of the methods according to the invention may be computer implemented. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit", "module" or "system". Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since the present invention can be implemented in software, the present invention can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible, non-transitory carrier medium may comprise a storage medium such as a floppy disk, a CD-ROM, a hard disk drive, a magnetic tape device or a solid state memory device and the like. A transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal or an electromagnetic signal, e.g. a microwave or RF signal.
Embodiments of the invention will now be described, by way of example only, and with reference to the following drawings, in which:
Figure 1 illustrates an example of an HEIF file that contains several images or sequences of images;
Figure 2 illustrates an example of association of item properties with an item within the scope of a group of entities;
Figure 3 illustrates the different associations between item properties and items, groups of entities, or items within the scope of a group of entities according to some embodiments of the invention;
Figure 4 illustrates the main steps of a process for encapsulating one or more entities in one file using the HEIF format according to some embodiments of the invention;
Figure 5 illustrates the main steps of a parsing process of an HEIF file generated according to some embodiments of the invention;
Figure 6 is a schematic block diagram of a computing device for implementation of one or more embodiments of the invention.
The HEVC standard defines a profile for the encoding of still images and describes specific tools for compressing single still images or sequences of still images. An extension of the ISO Base Media File Format (ISOBMFF) used for such kind of image data has been proposed for inclusion into the ISO/IEC 23008 standard, in Part 12, under the name: "HEIF or High Efficiency Image File Format".
The HEIF and MIAF standards cover two forms of storage corresponding to different use cases: the storage of image sequences, each image being represented by a sample with timing information that is optionally used at the decoder, and in which the images may be dependent on other images, and the storage of single images, and collections of independently coded images.
In the first case, the encapsulation is close to the encapsulation of video tracks in the ISO Base Media File Format (see "Information technology - Coding of audio-visual objects - Part 12: ISO base media file format", ISO/IEC 14496-12:2015, Fifth edition, December 2015), and similar tools and concepts are used, such as the file-level 'moov' box, 'trak' boxes (encapsulated in the 'moov' box), and sample grouping for the description of samples and groups of samples. A sample denotes all timed data associated with a single time (e.g. a frame in a video or an image in an image sequence).
Boxes, also called containers, are metadata structures provided to describe the data in the files. Boxes are object-oriented building blocks defined by a unique type identifier (typically a four-character code, also noted FourCC or 4CC) and a length. All data in a file (media data and metadata describing the media data) is contained in boxes. There is no other data within the file. File-level boxes are boxes that are not contained in other boxes.
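As an illustration of this box model, the following minimal Python sketch (not from the patent; it follows the ISOBMFF convention of a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit 'largesize') parses a box header from a byte buffer:

```python
import struct

def read_box_header(buf, offset=0):
    """Parse an ISOBMFF box header: 32-bit big-endian size, then a 4CC.

    Returns (size, four_cc, header_length). A size of 1 means a 64-bit
    'largesize' field follows the type; a size of 0 means the box extends
    to the end of the file.
    """
    size, = struct.unpack_from(">I", buf, offset)
    four_cc = buf[offset + 4:offset + 8].decode("ascii")
    header_len = 8
    if size == 1:  # 64-bit large size follows the 4CC
        size, = struct.unpack_from(">Q", buf, offset + 8)
        header_len = 16
    return size, four_cc, header_len

# Example: a 16-byte 'ftyp' box with major brand 'heic', minor version 0
ftyp = struct.pack(">I4s4sI", 16, b"ftyp", b"heic", 0)
header = read_box_header(ftyp)  # -> (16, "ftyp", 8)
```

Every structure discussed below ('meta', 'iprp', 'ipma', and so on) begins with such a header; FullBox variants add a one-byte version and 24-bit flags immediately after it.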
The 'moov' box is a file format box that contains 'trak' sub-boxes, each 'trak' box describing a track, that is to say, a timed sequence of related samples.
In the second case, a set of ISOBMFF boxes, the file-level 'meta' box and its sub-boxes, are used. These boxes and their hierarchy offer fewer description tools than the 'track related' boxes (the 'trak' box hierarchy) and relate to "information items", or "items", instead of related samples. It is to be noted that the wording 'box' and the wording 'container' may both be used with the same meaning, to refer to metadata structures that contain metadata describing the organization and/or properties of the image data in the file. The wording 'box' and the wording 'container' may also both be used with the same meaning to refer to metadata structures that contain the image data in the file (e.g. 'mdat' or 'idat' boxes).
Figure 1 illustrates an example of an HEIF file 101 that contains media data like one or more still images and possibly video or sequences of images. This file contains a first 'ftyp' box (FileTypeBox) 111 that contains an identifier of the type of file (typically a set of four-character codes). This file contains a second box called 'meta' (MetaBox) 102 that is used to contain general untimed metadata, including metadata structures describing the one or more still images. This 'meta' box 102 contains an 'iinf' box (ItemInfoBox) 121 that describes several single images. Each single image is described by a metadata structure ItemInfoEntry, also denoted an item (1211 and 1212). Each item has a unique 32-bit identifier, item_ID. The media data corresponding to these items is stored in the container for media data, the 'mdat' box 104.
Optionally, for describing the storage of image sequences or video, the HEIF file 101 may contain a third box called 'moov' (MovieBox) 103 that describes several tracks 131 and 132. Typically, the track 131 is an image sequence ('pict') track designed to describe a set of images for which the temporal information is not necessarily meaningful, and 132 is a video ('vide') track designed to describe video content. Both these tracks describe a series of image samples, an image sample being a set of pixels captured at the same time, for example a frame of a video sequence. The main difference between the two tracks is that in 'pict' tracks the timing information is not necessarily meaningful, whereas for 'vide' tracks the timing information is intended to constrain the timing of the display of the samples. The data corresponding to these samples is stored in the container for media data, the 'mdat' box 104.
The 'mdat' container 104 stores the untimed encoded images corresponding to items as represented by the data portion 141 and 142 and the timed encoded images corresponding to samples as represented by the data portion 143.
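The file layout just described ('ftyp', then 'meta', optionally 'moov', then 'mdat') can be walked with a short, hypothetical Python sketch; the make_box helper below is illustrative only and omits the version/flags fields that a real 'meta' FullBox would carry:

```python
import struct

def top_level_boxes(data):
    """Yield (four_cc, payload) for each file-level box in an ISOBMFF buffer."""
    offset = 0
    while offset + 8 <= len(data):
        size, = struct.unpack_from(">I", data, offset)
        four_cc = data[offset + 4:offset + 8].decode("ascii")
        if size == 0:  # box extends to the end of the file
            size = len(data) - offset
        yield four_cc, data[offset + 8:offset + size]
        offset += size

def make_box(four_cc, payload=b""):
    """Build a box with a 32-bit size header (illustrative helper)."""
    return struct.pack(">I4s", 8 + len(payload), four_cc) + payload

# Hypothetical miniature file mirroring the layout of HEIF file 101
blob = (make_box(b"ftyp", b"heic" + b"\x00" * 4)
        + make_box(b"meta")
        + make_box(b"mdat", b"\x01\x02"))
types = [four_cc for four_cc, _ in top_level_boxes(blob)]  # -> ["ftyp", "meta", "mdat"]
```

A parser would then recurse into the 'meta' payload to find 'iinf', 'iprp' and the other sub-boxes described below.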
The purpose of HEIF file 101 is to illustrate the different alternatives available to store multiple images in one HEIF file. For instance, the multiple images may be stored either as items or as a track of samples, which can be a 'pict' track or a 'vide' track. The actual choice is typically made by the application or device generating the file according to the type of images and the contemplated usage of the file.
The HEIF standard also provides some mechanisms designed to specify properties associated with images, in particular some metadata structures to declare or store properties for images and, more generally, for items (of any kind of media type). Typically, the 'meta' box 102 may contain an 'iprp' box (ItemPropertiesBox) 123 that enables the association of any item with an ordered set of item properties. This 'iprp' box 123 contains an 'ipco' box (ItemPropertyContainerBox) 1231, a property container data structure that contains all property data structures (ItemProperty and ItemFullProperty) 1233 describing properties of all items described in the HEIF file. The 'iprp' box also contains a set of 'ipma' boxes (ItemPropertyAssociationBox), which are association data structures that actually associate one or more item properties with a given item. It is then possible to associate the same property with several items.
The associated syntax is as follows:

aligned(8) class ItemProperty(property_type)
    extends Box(property_type) {
}

aligned(8) class ItemFullProperty(property_type, version, flags)
    extends FullBox(property_type, version, flags) {
}

aligned(8) class ItemPropertyContainerBox
    extends Box('ipco') {
    Box properties[]; // boxes derived from ItemProperty
                      // or ItemFullProperty, to fill box
}

aligned(8) class ItemPropertyAssociationBox
    extends FullBox('ipma', version, flags) {
    unsigned int(32) entry_count;
    for (i = 0; i < entry_count; i++) {
        if (version < 1)
            unsigned int(16) item_ID;
        else
            unsigned int(32) item_ID;
        unsigned int(8) association_count;
        for (i = 0; i < association_count; i++) {
            bit(1) essential;
            if (flags & 1)
                unsigned int(15) property_index;
            else
                unsigned int(7) property_index;
        }
    }
}

aligned(8) class ItemPropertiesBox extends Box('iprp') {
    ItemPropertyContainerBox property_container;
    ItemPropertyAssociationBox association[];
}

The ItemProperty and ItemFullProperty boxes are designed for the description of a property, i.e. all properties shall inherit from either ItemProperty or ItemFullProperty. Compared to ItemProperty, ItemFullProperty allows defining multiple versions of a property with varying syntax, conditionally on the value of the version parameter, and allows defining a map of flags to signal or activate optional features or parameters, conditionally on the value of the flags parameter.
The ItemPropertyContainerBox is designed for describing a set of properties as an array of Item Property or ItemFullProperty boxes.
The ItemPropertyAssociationBox is designed to describe the association between items and their properties. It provides the description of a list of item identifiers, each item identifier (item_ID) being associated with a list of property indices, each referring to a property in the ItemPropertyContainerBox (as a 1-based index value). The index value 0 is reserved to indicate that no property is associated with the item. The essential attribute, when set to 1, indicates that the associated property is essential to the item; otherwise it is non-essential.
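These association rules (1-based property_index, index 0 reserved, essential bit) can be sketched with a hypothetical in-memory model in Python; the property contents shown ('ispe', 'irot') are illustrative examples, not data from the patent:

```python
# Hypothetical in-memory model of the 'iprp' mechanism: properties live in
# a container list (models ItemPropertyContainerBox), and each item maps to
# (essential, 1-based index) pairs (models ItemPropertyAssociationBox).
properties = [
    {"type": "ispe", "width": 1920, "height": 1080},
    {"type": "irot", "angle": 90},
]

associations = {
    1: [(True, 1), (False, 2)],  # item_ID 1: essential 'ispe', optional 'irot'
    2: [(False, 0)],             # index 0 is reserved: no property associated
}

def properties_of(item_id):
    """Resolve an item's properties; index 0 means 'no property'."""
    result = []
    for essential, index in associations.get(item_id, []):
        if index == 0:
            continue
        result.append((essential, properties[index - 1]))  # 1-based index
    return result

props = properties_of(1)
```

Note the off-by-one trap: the same property list is shared by all items, so a reader must subtract 1 from each stored index before looking up the container.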
Finally, the ItemPropertyContainerBox and the ItemPropertyAssociationBox(es) are gathered within an ItemPropertiesBox.
ISO Base Media File Format specifies a grouping mechanism adapted for the grouping of items and/or tracks. In this mechanism, the wording 'entity' is used to refer to media data as items (any type of items, e.g. image or metadata items) or tracks (e.g. video track 'vide', sequence of images track 'pia, audio track, or any other type of tracks). This mechanism specifies the grouping of entities.
The 'meta' box 102 may contain a container box 'grpl' (GroupsListBox) 122 that may contain a set of metadata structures describing groups of entities 1221 and 1222.
A group of entities is described by a grouping data structure called EntityToGroupBox, defined according to the following syntax:

aligned(8) class EntityToGroupBox(grouping_type, version, flags)
    extends FullBox(grouping_type, version, flags) {
    unsigned int(32) group_id;
    unsigned int(32) num_entities_in_group;
    for (i = 0; i < num_entities_in_group; i++)
        unsigned int(32) entity_id;
    // the remaining data may be specified for a particular grouping_type
}

The group_id is a unique identifier of the group of entities; unique in this case must be understood as unique within the file. It shall not be equal to any group_id value of any other EntityToGroupBox, any item_ID value of the hierarchy level (file, movie, or track) that contains the GroupsListBox, or any track_ID value (when the GroupsListBox is contained in the file level). The list of entity_id values then gives all the entities pertaining to the group.
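Under the syntax above, a minimal EntityToGroupBox writer can be sketched in Python (an assumption-laden illustration, not the patent's implementation); as a FullBox, the structure carries a one-byte version and 24-bit flags before the group fields:

```python
import struct

def make_entity_to_group_box(grouping_type, group_id, entity_ids,
                             version=0, flags=0):
    """Serialize a minimal EntityToGroupBox as sketched in the syntax above."""
    payload = struct.pack(">B3s", version, flags.to_bytes(3, "big"))  # FullBox header
    payload += struct.pack(">II", group_id, len(entity_ids))
    for entity_id in entity_ids:
        payload += struct.pack(">I", entity_id)
    # Box header: 32-bit size, then the grouping_type as the box's 4CC
    return struct.pack(">I4s", 8 + len(payload), grouping_type.encode("ascii")) + payload

box = make_entity_to_group_box("altr", group_id=100, entity_ids=[1, 2, 3])
# 8 (box header) + 4 (version/flags) + 8 (group_id/count) + 12 (IDs) = 32 bytes
```

Note that the grouping_type doubles as the box type, which is why EntityToGroupBox takes it as a class parameter rather than storing it as a field.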
The grouping_type is used to specify the type of grouping. HEIF actually defines a limited number of grouping_type values. A first grouping_type 'altr' specifies that the different entities are alternatives that may alternatively be used in an application. A second grouping_type 'eqiv' specifies that a given untimed image relates to a particular position in the timeline of a track. All the items included in an 'eqiv' entity group are 'equivalent', and the tracks in the same 'eqiv' entity group include selected samples that are 'equivalent' to the items. A third grouping_type 'ster' specifies that two entities are a stereo pair, typically left and right views, in a stereoscopic application. No other grouping type of entities is specified.
It is to be noted that this mechanism is very limited as there are only three types of groups specified. Moreover, nothing is specified to provide some further potential information or properties on the group or on the entities within the group.
The invention provides a mechanism for describing a group of entities (items or tracks, for example sequences of images) captured or edited according to a given capture or collection mode. A means to describe the capture or collection mode that has been used is provided. According to some embodiments, some additional information or properties regarding the capture or edit may be described in relation with the entities, the group of entities, or entities within the group.
The invention can also be applied to other groups of entities that can be encapsulated using the HEIF format. For example, it can be applied to groups of metadata items such as groups of Exif data.
For instance, entities to be gathered in a group of entities can be obtained by capturing new entities with a camera according to a capture mode. The capture mode of the series of images describes the kind of relationship between the images of the series. For instance, the capture modes are one of the following:

Bracketing capture mode includes the auto exposure, white balance, focus and flash bracketing modes. All these bracketing modes consist in performing several shots of the same content with different values of one or more parameters of the shooting. These different bracketing modes differ in the parameter whose value varies over the series of captures. The capture system changes one capture parameter to generate the different versions. For example, in auto exposure bracketing the time of exposure is modified for each image capture.
Image burst is a capture mode consisting in capturing successively a series of images with a small interval of time between two image captures.
Panorama is a capture mode where several images are captured with an overlap between each capture. The principle is then to stitch each captured image to form a panorama of higher resolution.
Time-lapse is a capture mode consisting in capturing several images with the same device with a predetermined timing between each shot.
User-defined capture series, also called photo series, is a capture mode where a user associates images in a series that share the same context. For instance, a photographer takes several photos of the same product and wants to store all the images he made in the same file. He starts the user-defined capture series capture mode at the beginning of the session. Once he finishes his shooting session, he stops the capture mode.
Super Resolution is a capture mode consisting in capturing several images at different resolutions that could be processed to generate a new image with a higher resolution.
Multi-Exposure is a capture mode consisting in capturing several images at different exposures with the goal of generating a new image that is the superposition of the multi-exposure set of images.
Noise Reduction is a capture mode consisting in capturing several images of a single scene to reduce the random noise generated by the capture process.
Long-exposure Noise Reduction is a capture mode for removing the sensor-related noise during long exposures. In this mode, in addition to the normal image(s), an image, called a 'dark', is captured with the same exposure duration without letting the light reach the sensor (for example by putting the cap on the lens, or by not opening the shutter). This 'dark' can be used to remove the sensor-related noise from the normal image.
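The use of the 'dark' frame can be sketched as a per-pixel subtraction, clamped at zero. This is a simplified illustration (pixels as flat lists, a hypothetical function name); a real pipeline would operate on image arrays.

```python
def subtract_dark_frame(image, dark):
    """Long-exposure noise reduction sketch: subtract the per-pixel
    'dark' capture (sensor noise recorded with the shutter closed)
    from the normal image, clamping negative results at zero."""
    return [max(pixel - noise, 0) for pixel, noise in zip(image, dark)]
```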
Vignetting Compensation is a capture mode for compensating for the vignetting of the lens. Many lenses have a non-uniform response to light (typically, the corners of the images are darker than the centre, due to less light coming through the lens in the corners than in the centre). To compensate for this non-uniformity, a reference image, called a 'flat', is captured by taking an image of a uniformly lighted surface. This reference image can be used to compensate for the non-uniformity of other images captured with the same lens.
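The use of the 'flat' reference is classically a per-pixel division by the normalized flat field. The sketch below illustrates this under simplified assumptions (flat lists of pixel values, a hypothetical function name):

```python
def flat_field_correct(image, flat):
    """Vignetting compensation sketch: divide each pixel by the 'flat'
    reference normalized by its mean, so darker corners are brightened
    while the overall image brightness is preserved."""
    mean_flat = sum(flat) / len(flat)
    return [pixel * mean_flat / f for pixel, f in zip(image, flat)]
```

With a perfectly uniform flat the image is unchanged; where the flat is darker (e.g. a corner at half response), the pixel is brightened accordingly.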
HDR (High Dynamic Range) is a capture mode for handling very large differences of luminosity in the capture scene. The resulting image is a combination of several images with different exposures. It is similar to the auto-exposure bracketing mode, but in a more general way: the exposure variation between the images may not be regular, or may be unspecified.
Captured images are encoded using a video or still picture codec. For instance, in this example, the codec is H.264, HEVC/H.265 or the Versatile Video Coding (VVC, ISO/IEC 23090-3) format.
Captured images can be encoded either independently as still images to be stored in the file as HEIF items, or as images to be stored in the file as samples of a 'pict' or 'vide' track. The encoding of images may depend on previous images, using for instance an HEVC or VVC encoder similarly to video encoding. Previous images in the track are available as reference images for predictive encoding.
As another example, entities to be gathered in a group of entities can be obtained by selecting some existing or previously captured or created images or sequences of images. For instance, existing images or sequences of images can be extracted from an existing HEIF file or any other type of file, for instance JPEG, GIF, BMP, MKV, MP4 or AVI. In order to be able to perform the storage of images, new types of grouping are necessary, along with new methods to signal these types of grouping in a file with their associated properties.
According to a first embodiment, a new EntityToGroup inherited from EntityToGroupBox is defined with a unique generic grouping_type value covering all capture or collection modes. The particular type of capture or collection mode is defined as an additional attribute of the EntityToGroupBox.
For example, a new EntityToGroup with a generic 'brak' (for bracketing), 'case' (for capture series) or 'lgrp' (for logical grouping) grouping type (or any not-already-used four-character code with equivalent semantics) may be defined for grouping multiple entities according to a specific capture or collection mode. The particular type of capture or collection mode, namely for instance auto exposure bracketing, white balance bracketing, focus bracketing, flash exposure bracketing, depth of field bracketing, ISO bracketing, favourite collection, album collection or user-defined capture series, may be signalled using a new parameter capture_mode (or collection_mode, or grouping_mode, or logical_grouping_type) of the EntityToGroupBox('case') or equivalent grouping_type.
An example of syntax of the Grouping Information when described as EntityToGroup is described below:

    aligned(8) class EntityToGroupBox('case', version, flags)
        extends FullBox('case', version, flags) {
      unsigned int(32) group_id;
      unsigned int(32) num_entities_in_group;
      for(i=0; i<num_entities_in_group; i++)
        unsigned int(32) entity_id;
      // Parameters below provide common parameters for the grouping_type 'case'
      unsigned int(32) capture_mode; // 4CC identifying the capture or collection mode
    }

where capture_mode identifies the capture or collection mode.
Equivalent syntax with explicit inheritance from EntityToGroupBox is described below:

    aligned(8) class CaptureSeriesEntityToGroupBox
        extends EntityToGroupBox('case', version, flags) {
      unsigned int(32) capture_mode;
    }

In some embodiments, CaptureSeriesEntityToGroupBox may also be named LogicalEntityToGroupBox or BracketingEntityToGroupBox.
Alternatively, rather than defining a new 4CC for each capture or collection mode, the capture_mode parameter may be defined as an index in the table below.
The following 4CC codes can be defined for identifying the capture or collection mode:

    Capture mode type              4CC code
    Auto Exposure bracketing       'aebr'
    White balance bracketing       'wbbr'
    Focus bracketing               'fobr'
    Flash Exposure bracketing      'afbr'
    Depth of field bracketing      'dobr'
    Panorama                       'pano'
    User-defined capture series    'udcs'
    Time lapse                     'tila'
    Super Resolution               'sres'
    Multi Exposure                 'mlti'
    Noise Reduction                'nois'
    Long-Exposure noise reduction  'dark'
    Vignetting Compensation        'flat'
    HDR                            'hdr '
    Album Collection               'albc'
    Favourite Collection           'favc'

Alternatively, in a variant, new EntityToGroupBoxes with specific grouping_type may be defined for each particular capture or collection mode listed in the above table: 'aebr' for auto exposure bracketing, 'wbbr' for white balance bracketing, etc.
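A writer or parser could keep such a 4CC table as a lookup, as sketched below. The table here is a hypothetical subset restricted to codes from the list above whose spelling is unambiguous, and the function name is illustrative:

```python
# Hypothetical lookup table for some of the capture/collection-mode 4CCs
# listed above; an encapsulation module could use it to validate the
# capture_mode field of an EntityToGroupBox.
CAPTURE_MODES = {
    b"aebr": "auto exposure bracketing",
    b"wbbr": "white balance bracketing",
    b"pano": "panorama",
    b"udcs": "user-defined capture series",
    b"albc": "album collection",
    b"favc": "favourite collection",
}

def is_known_capture_mode(code):
    """A 4CC is exactly four bytes; unknown or malformed codes are
    rejected here."""
    return len(code) == 4 and code in CAPTURE_MODES
```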
Examples of the syntax of the new EntityToGroupBoxes are described below (similar syntax may be derived for each capture or collection mode):

    aligned(8) class AlbumCollectionEntityToGroupBox
        extends EntityToGroupBox('albc', version, flags) {
    }

    aligned(8) class FavoriteCollectionEntityToGroupBox
        extends EntityToGroupBox('favc', version, flags) {
    }

    aligned(8) class AutoExposureBracketingEntityToGroupBox
        extends EntityToGroupBox('aebr', version, flags) {
    }

According to alternatives of this first embodiment, Grouping Information based on the above new EntityToGroupBoxes may also include one or more additional tag or label parameters providing null-terminated strings in UTF-8 characters that give a human-readable name, tag(s) or description of the content of the group of entities. Optionally they may also include a location information parameter (e.g. GPS coordinates or a human-readable description of the location) and a language information parameter representing the language of the text contained in the other null-terminated string parameters.
Optionally they may also include a parameter providing a unique identifier (e.g. group_uuid or logical_group_id) of the Grouping Information. This unique identifier may be used to associate multiple groups of entities with each other within the HEIF file or across multiple HEIF files (e.g. multiple image files in a directory that belong to the same album). To ensure the uniqueness of this identifier, in particular if it is allocated from different devices, users or vendors, this identifier is defined using a Universally Unique IDentifier (UUID) as specified in RFC 4122.
An example of syntax of the generic Grouping Information when described as EntityToGroup is described below (including all optional parameters):

    aligned(8) class CaptureSeriesEntityToGroupBox
        extends EntityToGroupBox('case', version = 0, flags = 0) {
      unsigned int(32) capture_mode;
      unsigned int(8)[16] group_uuid;
      utf8string group_name;
      utf8string group_description;
      utf8string group_tags;
      utf8string lang;
    }

Where:
group_name is a null-terminated UTF-8 character string containing a human-readable name for the group of entities.
group_description is a null-terminated UTF-8 character string containing a human-readable description of the group of entities.
group_tags is a null-terminated UTF-8 character string containing comma-separated tags related to the group of entities.
lang is a character string containing an RFC 5646 compliant language tag string, such as "en-US", "fr-FR", or "zh-CN", representing the language of the text contained in group_name, group_description and group_tags. When lang is empty, the language is unknown/undefined.
In case specific grouping_types are used instead of a generic grouping_type, examples of the syntax of the specific EntityToGroupBoxes are as follows (with the same parameter semantics as above):

    aligned(8) class AlbumCollectionEntityToGroupBox
        extends EntityToGroupBox('albc', version, flags) {
      unsigned int(8)[16] group_uuid;
      utf8string group_name;
      utf8string group_description;
      utf8string group_tags;
      utf8string lang;
    }

    aligned(8) class FavouriteCollectionEntityToGroupBox
        extends EntityToGroupBox('favc', version, flags) {
      unsigned int(8)[16] group_uuid;
      utf8string group_name;
      utf8string group_description;
      utf8string group_tags;
      utf8string lang;
    }

    aligned(8) class AutoExposureBracketingEntityToGroupBox
        extends EntityToGroupBox('aebr', version, flags) {
      unsigned int(8)[16] group_uuid;
      utf8string group_name;
      utf8string group_description;
      utf8string group_tags;
      utf8string lang;
    }

Similar syntax may be derived for each capture or collection mode.
The above syntax allows expressing only one language for the human-readable string parameters. It may be desirable to provide such human-readable strings in multiple languages to support alternative internationalizations of the user-presentable text (e.g. both Japanese and French).
In a variant, the new EntityToGroupBoxes contain a list of alternative human-readable string parameters with their associated language.
The syntax of the generic Grouping Information when described as EntityToGroup is described below (including all optional parameters):

    aligned(8) class CaptureSeriesEntityToGroupBox
        extends EntityToGroupBox('case', version = 0, flags = 0) {
      unsigned int(32) capture_mode;
      unsigned int(8)[16] group_uuid;
      unsigned int(16) entry_count;
      for (i = 0; i < entry_count; i++) {
        utf8string group_name;
        utf8string group_description;
        utf8string group_tags;
        utf8string lang;
      }
    }

Where entry_count provides the number of alternative user-presentable texts.
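A reader presented with such a list of language alternatives would typically select the entry matching the user's locale. The sketch below illustrates one plausible selection policy (exact-match with fallback to the first entry); the function name and tuple layout mirror the loop fields above but are otherwise assumptions:

```python
def select_language_alternative(entries, preferred_lang):
    """Pick one (group_name, group_description, group_tags, lang) tuple,
    as parsed from the entry_count loop of the box, whose lang matches
    the reader's preferred RFC 5646 tag; fall back to the first entry
    when no alternative matches."""
    for entry in entries:
        if entry[3] == preferred_lang:
            return entry
    return entries[0]
```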
Similarly, the syntax of EntityToGroupBoxes with specific grouping_type may be as follows (with the same parameter semantics as above):

    aligned(8) class AlbumCollectionEntityToGroupBox
        extends EntityToGroupBox('albc', version, flags) {
      unsigned int(8)[16] group_uuid;
      unsigned int(16) entry_count;
      for (i = 0; i < entry_count; i++) {
        utf8string group_name;
        utf8string group_description;
        utf8string group_tags;
        utf8string lang;
      }
    }

It may be desirable to share group properties between groups of entities (thus possibly grouping single images, sequences of images, or both) to avoid duplicating the same information in multiple groups.
According to a second embodiment the new EntityToGroupBox with generic grouping_type (e.g. 'case' or 'Igrp') or with specific grouping_type (e.g. 'aebr', 'albc'...) may only contain the optional group_uuid parameter previously described. All other group properties are described as follows.
Group properties are defined as Box or FullBox rather than as parameters in EntityToGroupBoxes. The box type of the group property specifies the property type. Group properties can be descriptive or transformative. Transformative group properties apply to each item in the group of entities with preceding transformations applied: transformations associated with items (via the 'ipma' box) are applied first, and then transformations associated with the group of entities (as described below) are applied to each item in the group.
A new container box GroupPropertiesBox('gprp') is created in the MetaBox. The GroupPropertiesBox enables the association of any group with an ordered set of group properties. This GroupPropertiesBox consists of two parts: a GroupPropertyContainerBox('gpco') that contains an implicitly indexed list of group properties, and one or more GroupPropertyAssociationBox(es) ('gpma') that associate groups of entities with group properties.
In an example, each GroupPropertyAssociationBox shall be ordered by increasing group_id, and there shall be at most one occurrence of a given group_id in the set of GroupPropertyAssociationBox boxes. Version 0 should be used unless 32-bit group_id values are needed; similarly, flags should be equal to 0 unless there are more than 127 properties in the GroupPropertyContainerBox. There shall be at most one GroupPropertyAssociationBox with a given pair of values of version and flags.
The associated syntax is as follows:

    aligned(8) class GroupProperty(property_type)
        extends Box(property_type) {
    }

    aligned(8) class GroupFullProperty(property_type, version, flags)
        extends FullBox(property_type, version, flags) {
    }

    aligned(8) class GroupPropertyContainerBox extends Box('gpco') {
      Box properties[]; // boxes derived from GroupProperty or
                        // GroupFullProperty, or FreeSpaceBox(es)
                        // to fill the box
    }

    aligned(8) class GroupPropertyAssociationBox
        extends FullBox('gpma', version, flags) {
      unsigned int(32) entry_count;
      for(i = 0; i < entry_count; i++) {
        if (version < 1)
          unsigned int(16) group_id;
        else
          unsigned int(32) group_id;
        unsigned int(8) association_count;
        for (i=0; i<association_count; i++) {
          bit(1) essential;
          if (flags & 1)
            unsigned int(15) property_index;
          else
            unsigned int(7) property_index;
        }
      }
    }

    Box Type:  'gprp'
    Container: MetaBox ('meta')
    Mandatory: No
    Quantity:  Zero or one

    aligned(8) class GroupPropertiesBox extends Box('gprp') {
      GroupPropertyContainerBox property_container;
      GroupPropertyAssociationBox association[];
    }

Where:
group_id identifies the EntityToGroupBox with which properties are associated.
essential when set to 1 indicates that the associated property is essential to the group, otherwise it is non-essential.
property_index is either 0 indicating that no property is associated (the essential indicator shall also be 0), or is the 1-based index (counting all boxes, including FreeSpace boxes) of the associated property box in the GroupPropertyContainerBox contained in the same GroupPropertiesBox.
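The 1-based indexing rule above can be captured in a few lines. This is an illustrative resolver (hypothetical function name) that maps a property_index to the ordered list of boxes in the container, treating index 0 as "no property":

```python
def resolve_property(properties, property_index):
    """Resolve a 1-based property_index against the ordered list of
    boxes in a GroupPropertyContainerBox (FreeSpace boxes included in
    the count). Index 0 is reserved to mean that no property is
    associated."""
    if property_index == 0:
        return None
    return properties[property_index - 1]
```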
According to this second embodiment, human-readable label(s) such as a name, tag(s) or description of the content of the group of entities may be defined as a specific GroupDescriptionProperty with for instance a specific FourCC 'gdes' as follows:

    aligned(8) class GroupDescriptionProperty
        extends GroupFullProperty('gdes', version = 0, flags = 0) {
      utf8string name;
      utf8string description;
      utf8string tags;
      utf8string lang;
    }

Where:
name is a null-terminated UTF-8 character string containing a human-readable name for the group of entities.
description is a null-terminated UTF-8 character string containing a human-readable description of the group of entities.
tags is a null-terminated UTF-8 character string containing comma-separated tags related to the group of entities.
lang is a character string containing an RFC 5646 compliant language tag string, such as "en-US", "fr-FR", or "zh-CN", representing the language of the text contained in name, description and tags. When lang is empty, the language is unknown/undefined.
According to this second embodiment, it is possible to associate multiple GroupDescriptionProperty boxes with a group via the GroupPropertyAssociationBox to represent different language alternatives.
Alternatively to associating multiple GroupDescriptionProperty boxes with alternative languages with a group, the GroupDescriptionProperty may contain a list of alternative name, description and tags as follows:

    aligned(8) class GroupDescriptionProperty
        extends GroupFullProperty('gdes', version = 0, flags = 0) {
      unsigned int(16) entry_count;
      for (i = 0; i < entry_count; i++) {
        utf8string group_name;
        utf8string group_description;
        utf8string group_tags;
        utf8string lang;
      }
    }

In a variant, each human-readable string can be defined as a separate group property with its associated language for more flexibility. For example, there may be one property for a 'tag' string, one for a 'label' string, and one for a 'description' string.
In the above variant, each group property may also contain a list of alternative text/language couples similarly to the description above.
In some cases, the same properties may apply to either items or groups of entities. For instance, the above GroupDescriptionProperty box can be useful to provide a human-presentable description for either items or groups of entities. Similarly, item properties may apply to items or to a group of items as a whole.
In a variant, rather than defining new boxes to associate group properties with groups of entities, group properties are defined as item property boxes (ItemProperty or ItemFullProperty) and the semantics of the ItemPropertyAssociationBox are modified to be able to refer to items or groups of entities.
For example, the ItemPropertyAssociationBox allows referring to an identifier of a group (for example to the EntityToGroup::group_id parameter) that groups a series of items or tracks. The item_ID field of the 'ipma' box is then replaced by an item_or_group_ID which may refer either to an identifier of an item (item_ID) or to a group identifier (group_id). The advantage of this variant is that the description of the properties is more compact, since groups of entities and items may share the same properties and repetition of property definitions and associations is avoided.
Below is an example of a new syntax of the ItemPropertyAssociationBox:

    aligned(8) class ItemPropertyAssociationBox
        extends FullBox('ipma', version, flags) {
      unsigned int(32) entry_count;
      for(i = 0; i < entry_count; i++) {
        if (version < 1)
          unsigned int(16) item_or_group_ID;
        else
          unsigned int(32) item_or_group_ID;
        unsigned int(8) association_count;
        for (i=0; i<association_count; i++) {
          bit(1) essential;
          if (flags & 1)
            unsigned int(15) property_index;
          else
            unsigned int(7) property_index;
        }
      }
    }

The semantics of the different parameters of the ItemPropertyAssociationBox remain unchanged except for the item_ID field, which is renamed item_or_group_ID. The item_or_group_ID parameter may refer either to one item or to one EntityToGroup. Thus, the value of item_or_group_ID should be equal to one group_id value or to one item_ID value. It is to be noted that, by definition of the group_id in the standard, a group_id shall not be equal to any other group_id, any item_ID, or any track_ID. Thus, given an item_or_group_ID, there cannot be any ambiguity as to whether it refers to an item or a group.
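Because the ID spaces for items and groups never collide, a parser can disambiguate an item_or_group_ID with simple set lookups. The sketch below is illustrative (hypothetical function name; ID sets assumed to have been collected while parsing the file):

```python
def resolve_item_or_group(item_or_group_id, item_ids, group_ids):
    """Decide whether an item_or_group_ID from the modified 'ipma' box
    refers to an item or to an entity group. The standard guarantees
    group_id values never equal any item_ID, so at most one of the two
    lookups can succeed."""
    if item_or_group_id in group_ids:
        return "group"
    if item_or_group_id in item_ids:
        return "item"
    return "unknown"
```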
As an alternative, to keep backward compatibility with the existing versions 0 and 1 of the ItemPropertyAssociationBox, new versions 2 and 3 of the ItemPropertyAssociationBox may be proposed and defined as follows:

    aligned(8) class ItemPropertyAssociationBox
        extends FullBox('ipma', version, flags) {
      unsigned int(32) entry_count;
      for(i = 0; i < entry_count; i++) {
        if (version < 1)
          unsigned int(16) item_ID;
        else if (version == 1)
          unsigned int(32) item_ID;
        else if (version == 2)
          unsigned int(16) group_id;
        else
          unsigned int(32) group_id;
        unsigned int(8) association_count;
        for (i=0; i<association_count; i++) {
          bit(1) essential;
          if (flags & 1)
            unsigned int(15) property_index;
          else
            unsigned int(7) property_index;
        }
      }
    }

Each ItemPropertyAssociationBox shall be ordered by increasing item_ID or group_id, and there shall be at most one occurrence of a given item_ID or group_id in the set of ItemPropertyAssociationBox boxes. Version 0 should be used for associating properties with items unless 32-bit item_ID values are needed, in which case version 1 should be used. Similarly, version 2 should be used for associating properties with groups of entities unless 32-bit group_id values are needed, in which case version 3 should be used. Flags should be equal to 0 unless there are more than 127 properties in the ItemPropertyContainerBox. There shall be at most one ItemPropertyAssociationBox with a given pair of values of version and flags.
In another variant, rather than creating new versions of the existing ItemPropertyAssociationBox, a new box EntityToGroupAssociationBox 'epma' is created directly in the ItemPropertiesBox, for example as follows:

    aligned(8) class EntityToGroupAssociationBox
        extends FullBox('epma', version, flags) {
      unsigned int(32) entry_count;
      for(i = 0; i < entry_count; i++) {
        if (version < 1)
          unsigned int(16) group_id;
        else
          unsigned int(32) group_id;
        unsigned int(8) association_count;
        for (i=0; i<association_count; i++) {
          bit(1) essential;
          if ((flags & 0x1) == 0x1)
            unsigned int(15) property_index;
          else
            unsigned int(7) property_index;
        }
      }
    }

    Box Type:  'iprp'
    Container: MetaBox ('meta')
    Mandatory: No
    Quantity:  Zero or one

    aligned(8) class ItemPropertiesBox extends Box('iprp') {
      ItemPropertyContainerBox property_container;
      ItemPropertyAssociationBox association[];
      EntityToGroupAssociationBox groupAssociation[];
    }

Each EntityToGroupAssociationBox shall be ordered by increasing group_id, and there shall be at most one occurrence of a given group_id in the set of EntityToGroupAssociationBox boxes. Version 0 should be used for associating properties with EntityToGroups unless 32-bit group_id values are needed, in which case version 1 should be used. Flags should be equal to 0 unless there are more than 127 properties in the ItemPropertyContainerBox. There shall be at most one EntityToGroupAssociationBox with a given pair of values of version and flags.
In these variants, human-readable label(s) such as a name, tag(s) or description may apply either to items or groups of entities. Such labels are defined as a specific descriptive Item Property ItemDescriptionProperty with for instance a specific FourCC 'ides' as follows:

    aligned(8) class ItemDescriptionProperty
        extends ItemFullProperty('ides', version = 0, flags = 0) {
      utf8string name;
      utf8string description;
      utf8string tags;
      utf8string lang;
    }

Where:
name is a null-terminated UTF-8 character string containing a human-readable name for the item or group of entities.
description is a null-terminated UTF-8 character string containing a human-readable description of the item or group of entities.
tags is a null-terminated UTF-8 character string containing comma-separated tags related to the item or group of entities.
lang is a character string containing an RFC 5646 compliant language tag string, such as "en-US", "fr-FR", or "zh-CN", representing the language of the text contained in name, description and tags. When lang is empty, the language is unknown/undefined.
In a variant, each human-readable string can be defined as a separate descriptive item property with its associated language for more flexibility. For example, there may be one property for a 'tag' string, one for a 'label' string, and one for a 'description' string.
Item Properties are ordered, and Item Properties can be either descriptive item properties or transformative item properties as described in the standard. Examples of transformative item properties are image rotation or image crop. Examples of descriptive Item Properties are colour information or pixel information.
According to the above embodiments, when a descriptive item property is associated with a group of entities, it describes common properties that apply to each entity in the group, or it applies to the group of entities as a whole, depending on the semantics of the Item Property. Alternatively, a box flag value of the 'ipma' box (or 'epma' box, depending on the embodiment) allows signalling whether the descriptive Item Property associations described by the box apply to the group as a whole or to each item in the group. When a transformative Item Property is associated with a group of entities, it applies to each entity in the group with preceding transformations applied, i.e. transformations associated with items (via the 'ipma' box) are applied first, and then transformations associated with the group of entities (as described above) are applied in order to each item in the group.
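The ordering rule for transformative properties (item-level transformations first, then group-level transformations) can be sketched as follows, modelling each transformation as a callable; the function name and representation are illustrative, not part of the specification:

```python
def apply_transformations(pixel_data, item_transforms, group_transforms):
    """Apply transformative properties in the order required above:
    transforms associated with the item itself (via 'ipma') run first,
    then transforms associated with the enclosing group of entities,
    each modelled here as a callable on the decoded data."""
    for transform in item_transforms + group_transforms:
        pixel_data = transform(pixel_data)
    return pixel_data
```

For example, an item-level rotation followed by a group-level crop would be applied in exactly that order.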
According to previous embodiments, one or more item and/or group properties can be associated with each item or group of entities to describe either the properties of the item by itself or the common properties that apply to a whole group.
However, there are cases where it is also desirable to be able to associate properties with an entity within a limited scope, i.e. to associate properties with an entity that would apply only in the context of a given group of entities.
For instance, the same image may pertain to two different album collections, and the user may want to associate different human-presentable texts describing this image in the scope of each album collection. For instance, an image representing a landscape with a car in front of a mountain may be associated with two different album collections, one dedicated to cars and another dedicated to holidays. In the first album collection, the user may want to associate the text "Nice red car!" with the image, while in the second album collection, the user may want to associate the text "My nice holidays at Mountain".
As another example, outside of any collection, an image may be associated in the HEIF file with a transformative Item Property Clean Aperture that realizes a crop of the image. But the user may want to add this image to different collections in the same file with different crop parameters.
In both of the above use cases, it is useful to associate properties with images in the context of a particular group.
In an embodiment, when considering an item pertaining to a group, this item is associated with all the properties that are associated with this item unconditionally (meaning not within the scope of a group) and the properties associated with this item within the scope of the group. When the same property is defined both unconditionally and within the scope of the group, the property defined within the scope of the group takes precedence and its value overwrites the value defined in the unconditional property.
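This precedence rule behaves like a dictionary merge: group-scoped properties overwrite unconditional ones of the same type. A minimal sketch, assuming properties are keyed by their 4CC type (hypothetical function name and representation):

```python
def effective_properties(unconditional, group_scoped):
    """Compute the properties seen for an item viewed inside a group:
    start from the unconditionally associated properties, then let
    group-scoped properties of the same type overwrite them, per the
    precedence rule described above."""
    merged = dict(unconditional)
    merged.update(group_scoped)
    return merged
```

For instance, a group-scoped 'clap' crop replaces the unconditional one, while an unconditional 'irot' rotation not redefined in the group is kept.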
According to a third embodiment, the new EntityToGroupBoxes that describe a capture series or collection group, either the generic ones (thus CaptureSeriesEntityToGroupBox or LogicalEntityToGroupBox) or the specific ones (thus FavouriteCollectionEntityToGroupBox, AlbumCollectionEntityToGroupBox or AutoExposureBracketingEntityToGroupBox) introduced in the first embodiment, are extended to associate a list of property indexes with each entity in the group as illustrated in the examples below:

    aligned(8) class CaptureSeriesEntityToGroupBox
        extends EntityToGroupBox('case', version, flags) {
      unsigned int(32) capture_mode;
      for(i = 0; i < num_entities_in_group; i++) {
        unsigned int(32) entity_id;
        unsigned int(8) association_count;
        for (i=0; i<association_count; i++) {
          bit(1) essential;
          if ((flags & 0x1) == 0x1)
            unsigned int(15) property_index;
          else
            unsigned int(7) property_index;
        }
      }
    }

Where capture_mode identifies the capture or collection mode.
num_entities_in_group is inherited from EntityToGroupBox and represents the number of entities (item or track) in the group.
entity_id identifies the entity with which properties are associated.
essential when set to 1 indicates that the associated property is essential to the item, otherwise it is non-essential.
property_index is either 0, indicating that no property is associated (the essential indicator shall also be 0), or is the 1-based index (counting all boxes, including FreeSpace boxes) of the associated property box in the ItemPropertyContainerBox contained in the ItemPropertiesBox present at the same level as the enclosing GroupsListBox.
A similar example and parameter semantics illustrate the case of an EntityToGroupBox with a specific grouping_type per type of capture or collection mode, as follows:

    aligned(8) class AlbumCollectionEntityToGroupBox
        extends EntityToGroupBox('albc', version, flags) {
      for(i = 0; i < num_entities_in_group; i++) {
        unsigned int(32) entity_id;
        unsigned int(8) association_count;
        for (i=0; i<association_count; i++) {
          bit(1) essential;
          if ((flags & 0x1) == 0x1)
            unsigned int(15) property_index;
          else
            unsigned int(7) property_index;
        }
      }
    }

In a variant as illustrated on Figure 2, the association of item properties 240 with an item 210 within the scope of a group 220 is performed directly in the ItemPropertiesBox 230 in the MetaBox 20. A new box 233 is defined in the ItemPropertiesBox as follows:

    aligned(8) class ItemPropertyInGroupAssociationBox
        extends FullBox('gpma', version, flags) {
      unsigned int(32) group_entry_count;
      for (i = 0; i < group_entry_count; i++) {
        if (version < 1)
          unsigned int(16) group_id;
        else
          unsigned int(32) group_id;
        unsigned int(32) entry_count;
        for(i = 0; i < entry_count; i++) {
          if (version < 1)
            unsigned int(16) item_ID;
          else
            unsigned int(32) item_ID;
          unsigned int(8) association_count;
          for (i = 0; i < association_count; i++) {
            bit(1) essential;
            if ((flags & 0x1) == 0x1)
              unsigned int(15) property_index;
            else
              unsigned int(7) property_index;
          }
        }
      }
    }

    Box Type:  'iprp'
    Container: MetaBox ('meta')
    Mandatory: No
    Quantity:  Zero or one

    aligned(8) class ItemPropertiesBox extends Box('iprp') {
      ItemPropertyContainerBox property_container;
      ItemPropertyAssociationBox association[];
      ItemPropertyInGroupAssociationBox itemInGroupAssociation[];
    }

Where group_entry_count provides the number of entries in the list of groups.
group_id identifies the group that defines the context in which properties are associated.
entry_count provides the number of entries in the list of items in the group. item_ID identifies the item with which properties are associated.
essential, when set to 1, indicates that the associated property is essential to the item; otherwise it is non-essential. property_index is either 0, indicating that no property is associated (the essential indicator shall also be 0), or is the 1-based index (counting all boxes, including FreeSpace boxes) of the associated property box in the ItemPropertyContainerBox contained in the same ItemPropertiesBox.
In a variant, item_ID may be replaced with entity_ID to designate either an item_ID or a track_ID in the associated EntityToGroupBox with EntityToGroupBox::group_id == group_id.
Each ItemPropertyInGroupAssociationBox shall be ordered by increasing group_id and item_ID, and there shall be at most one occurrence of a given group_id in the set of ItemPropertyInGroupAssociationBox boxes. Version 0 should be used unless 32-bit item_ID values are needed; similarly, flags should be equal to 0 unless there are more than 127 properties in the ItemPropertyContainerBox. There shall be at most one ItemPropertyInGroupAssociationBox with a given pair of values of version and flags.
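To make the layout above concrete, the following is a minimal Python sketch of a reader for the body of the proposed ItemPropertyInGroupAssociationBox. It is illustrative only: the function name is ours, the FullBox header is assumed to have been consumed already, and only the version/flags combinations shown in the syntax above are handled.

```python
import struct

def parse_gpma_payload(data, version=0, flags=0):
    """Parse the body of the sketched ItemPropertyInGroupAssociationBox
    ('gpma'): a list of groups, each holding per-item property associations.
    Returns {group_id: {item_ID: [(essential, property_index), ...]}}."""
    pos = 0
    # group_id and item_ID are 16-bit for version 0, 32-bit otherwise
    id_fmt, id_size = ('>H', 2) if version < 1 else ('>I', 4)
    (group_entry_count,) = struct.unpack_from('>I', data, pos); pos += 4
    groups = {}
    for _ in range(group_entry_count):
        (group_id,) = struct.unpack_from(id_fmt, data, pos); pos += id_size
        (entry_count,) = struct.unpack_from('>I', data, pos); pos += 4
        items = {}
        for _ in range(entry_count):
            (item_id,) = struct.unpack_from(id_fmt, data, pos); pos += id_size
            assoc_count = data[pos]; pos += 1
            assocs = []
            for _ in range(assoc_count):
                if flags & 0x1:
                    # 1-bit essential + 15-bit property_index
                    (v,) = struct.unpack_from('>H', data, pos); pos += 2
                    essential, index = bool(v & 0x8000), v & 0x7FFF
                else:
                    # 1-bit essential + 7-bit property_index
                    v = data[pos]; pos += 1
                    essential, index = bool(v & 0x80), v & 0x7F
                assocs.append((essential, index))
            items[item_id] = assocs
        groups[group_id] = items
    return groups
```

For example, a version-0 payload declaring one group (group_id 10) in which item 4 is essentially associated with property index 2 parses to `{10: {4: [(True, 2)]}}`.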
In another variant, rather than defining a new box in the ItemPropertiesBox, the existing ItemPropertyAssociationBox ('ipma') may be extended with a new version equal to 2 as follows:

aligned(8) class ItemPropertyAssociationBox
extends FullBox('ipma', version, flags) {
    if (version == 2)
        unsigned int(32) group_id;
    unsigned int(32) entry_count;
    for (i = 0; i < entry_count; i++) {
        if (version < 1)
            unsigned int(16) item_ID;
        else
            unsigned int(32) item_ID;
        unsigned int(8) association_count;
        for (j = 0; j < association_count; j++) {
            bit(1) essential;
            if (flags & 0x1 == 0x1)
                unsigned int(15) property_index;
            else
                unsigned int(7) property_index;
        }
    }
}

Where, when the version equals 2, a new attribute group_id limits the scope of the association between item_ID and property_index to the context of the EntityToGroupBox with the same group_id value.
Each ItemPropertyAssociationBox shall be ordered by increasing item_ID (and group_id if present).
There shall be at most one occurrence of a given group_id in the set of ItemPropertyAssociationBox boxes.
There shall be at most one occurrence of a given item_ID in the set of ItemPropertyAssociationBox boxes with version 0 and 1.
The first bit (LSB) of flags should be equal to 0 unless there are more than 127 properties in the ItemPropertyContainerBox.
For versions 0 and 1, there shall be at most one ItemPropertyAssociationBox with a given pair of values of version and flags.
In the HEIF standard, item properties are ordered, and item properties can be either descriptive or transformative. According to some embodiments, multiple item properties of the same type can be associated with the same item, either globally or in the scope of a particular group. For instance, a transformative item property Image rotation (of type 'irot') may be associated with an item in the general scope using an ItemPropertyAssociationBox with version 0 or 1. Simultaneously, the same item may be associated with another Image rotation item property (of type 'irot') in the limited scope of a given group, for instance using an ItemPropertyAssociationBox with version 2.
In such a case, according to some embodiments, transformative item properties apply to the item with the preceding transformations already applied, and for items in the scope of a given group, transformative item properties in the general scope apply before transformative item properties in the scope of that group. On the contrary, for descriptive item properties, a descriptive item property associated with an item in the scope of a group supersedes the descriptive item properties of the same type associated with the same item in the general scope.
Alternatively, an additional 1-bit attribute supersede_in_group_flag is added in the new version of the ItemPropertyAssociationBox to signal whether item properties associated in the scope of a group supersede item properties of the same type associated with the same item in the general scope.
An example of the syntax of the ItemPropertyAssociationBox with this additional attribute is described below:

aligned(8) class ItemPropertyAssociationBox
extends FullBox('ipma', version, flags) {
    if (version == 2)
        unsigned int(32) group_id;
    unsigned int(32) entry_count;
    for (i = 0; i < entry_count; i++) {
        if (version < 1)
            unsigned int(16) item_ID;
        else
            unsigned int(32) item_ID;
        unsigned int(8) association_count;
        for (j = 0; j < association_count; j++) {
            bit(1) essential;
            if (version == 2) {
                bit(1) supersede_in_group_flag;
                if (flags & 0x1 == 0x1)
                    unsigned int(14) property_index;
                else
                    unsigned int(6) property_index;
            } else {
                if (flags & 0x1 == 0x1)
                    unsigned int(15) property_index;
                else
                    unsigned int(7) property_index;
            }
        }
    }
}

Where, when the version equals 2, a new attribute group_id limits the scope of the association between item_ID and property_index to the context of the EntityToGroupBox with the same group_id value. Moreover, a new 1-bit attribute supersede_in_group_flag, when set to 1, indicates that the associated property supersedes the property of the same type in the general scope, if any; otherwise the associated property applies in order after properties of the same type associated with the same item in the general scope.
Alternatively, the supersede_in_group_flag value may be signalled as a particular flag value in the flags parameter of the ItemPropertyAssociationBox. In such a case, the supersede_in_group_flag value applies to all property associations declared in the ItemPropertyAssociationBox.
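A reader applying these rules could resolve the effective property list for an item in a group roughly as follows. This is a sketch of the supersede/apply-after behaviour only; the function name and the tuple shapes are ours, not part of any box syntax.

```python
def effective_properties(general, in_group):
    """Combine general-scope and group-scope property associations for one
    item. `general` is a list of (property_type, value) pairs in declaration
    order; `in_group` is a list of (property_type, value,
    supersede_in_group_flag) triples. Group-scope entries with the flag set
    replace same-typed general-scope entries; others apply in order after
    them. Returns the effective ordered list of (property_type, value)."""
    result = list(general)
    for ptype, value, supersede in in_group:
        if supersede:
            # drop same-typed properties inherited from the general scope
            result = [(t, v) for (t, v) in result if t != ptype]
        result.append((ptype, value))
    return result
```

For example, a general-scope 'irot' rotation superseded by a group-scope 'irot' leaves only the group-scope rotation, while a group-scope 'clap' with the flag unset is simply appended after the general-scope properties.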
According to a fourth embodiment, new versions (version = 2 and version = 3) of the ItemPropertyAssociationBox ('ipma') with new flags are defined so that this single metadata structure provides all possible associations of item properties with items, groups of entities, or items within a group of entities.
Figure 3 illustrates the different associations between item properties and items, groups of entities, or items within the scope of a group of entities according to some embodiments of the invention.
Relation 301 between an item property 1 and an item1 illustrates an association of an item property with an item. This association is valid within the scope of the file; this is a general association.
Relation 302, between an item property and the group of items composed of item3 and item4, illustrates an association of an item property with a group of items.
Relation 303 between item4 and item property 5 illustrates an association of an item property with an item within the scope of the group of items composed of item3 and item4.
It may be noted that, independently of its belonging to the group of items, item4 is associated unconditionally with item property 1 and item property 2.
According to this embodiment, the ItemPropertyAssociationBox may be defined for example as follows. The following flags are allowed to be set in the 'ipma' box flags:

0x000001 indicates the size of the property_index attribute:
    o when this bit is set, property_index_size = 15;
    o otherwise, property_index_size = 7.
0x000002 indicates the size of the item_ID and group_id attributes:
    o when this bit is set, item_or_group_ID_size = 32;
    o otherwise, item_or_group_ID_size = 16.

aligned(8) class ItemPropertyAssociationBox
extends FullBox('ipma', version, flags) {
    if (version == 2)
        unsigned int(item_or_group_ID_size) context_group_id;
    unsigned int(32) entry_count;
    for (i = 0; i < entry_count; i++) {
        if (version < 1)
            unsigned int(16) item_ID;
        else if (version == 1)
            unsigned int(32) item_ID;
        else if (version == 2)
            unsigned int(item_or_group_ID_size) item_ID;
        else
            unsigned int(item_or_group_ID_size) group_id;
        unsigned int(8) association_count;
        for (j = 0; j < association_count; j++) {
            bit(1) essential;
            if (version == 2) {
                bit(1) supersede_in_group_flag;
                unsigned int(property_index_size - 1) property_index;
            } else
                unsigned int(property_index_size) property_index;
        }
    }
}

Where context_group_id limits the scope of the association between item_ID and property_index to the context of the EntityToGroupBox with the same group_id value. group_id identifies the EntityToGroupBox with which properties are associated. essential, when set to 1, indicates that the associated property is essential to the group; otherwise it is non-essential.
supersede_in_group_flag, when set to 1, indicates that the associated property supersedes the property of the same type in the general scope, if any; otherwise the associated property applies in order after properties of the same type associated with the same item in the general scope.
property_index is either 0, indicating that no property is associated (the essential indicator shall also be 0), or is the 1-based index (counting all boxes, including FreeSpace boxes) of the associated property box in the ItemPropertyContainerBox contained in the same ItemPropertiesBox.
The definition of versions 2 and 3 of the 'ipma' box is backward compatible and does not modify the existing versions 0 and 1 of the 'ipma' box.
Each ItemPropertyAssociationBox shall be ordered by increasing item_ID or group_id (and context_group_id if present).
There shall be at most one occurrence of a given context_group_id in the set of ItemPropertyAssociationBox boxes.
There shall be at most one occurrence of a given item_ID and group_id in the set of ItemPropertyAssociationBox boxes with version 0, 1 or 3.
The first bit (LSB) of flags should be equal to 0 unless there are more than 127 properties in the ItemPropertyContainerBox.
There shall be at most one ItemPropertyAssociationBox with a given pair of values of version and flags (with flags values 0x1 or 0x2), except for version = 2.
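The flag-dependent field widths above can be sketched in Python as follows. The helper names are ours; the decoding assumes an association entry has been read as a single 8- or 16-bit big-endian integer, matching the bit layout in the syntax above.

```python
def ipma_field_sizes(flags):
    """Derive the flag-dependent field widths of the extended 'ipma' box:
    returns (property_index_size, item_or_group_ID_size)."""
    property_index_size = 15 if flags & 0x000001 else 7
    item_or_group_id_size = 32 if flags & 0x000002 else 16
    return property_index_size, item_or_group_id_size

def parse_association(word, version, flags):
    """Decode one association entry. For version 2, the layout is
    essential (1 bit), supersede_in_group_flag (1 bit), then a
    (property_index_size - 1)-bit index; otherwise essential (1 bit)
    followed by a property_index_size-bit index."""
    pi_size, _ = ipma_field_sizes(flags)
    total = pi_size + 1  # essential bit + index field width
    essential = bool((word >> (total - 1)) & 0x1)
    if version == 2:
        supersede = bool((word >> (total - 2)) & 0x1)
        index = word & ((1 << (pi_size - 1)) - 1)
        return essential, supersede, index
    return essential, None, word & ((1 << pi_size) - 1)
```

For instance, with flags = 0x1 and version 2, the 16-bit entry 0xC003 decodes as essential = 1, supersede_in_group_flag = 1, property_index = 3.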
Alternatively to the previous embodiment, the ItemPropertyAssociationBox may be defined for example as follows. The following flags are allowed to be set in the 'ipma' box flags:

0x000001 indicates the size of the property_index attribute:
    o when this bit is set, property_index_size = 15;
    o otherwise, property_index_size = 7.
0x000002 indicates the size of the item_ID and group_id attributes:
    o when this bit is set, item_or_group_ID_size = 32;
    o otherwise, item_or_group_ID_size = 16.
0x000004 group_limited_scope:
    o when this bit is set, it indicates that the associations declared in this box are limited to the scope of the group identified by the attribute context_group_id.
aligned(8) class ItemPropertyAssociationBox
extends FullBox('ipma', version, flags) {
    if (version > 1 && group_limited_scope)
        unsigned int(item_or_group_ID_size) context_group_id;
    unsigned int(32) entry_count;
    for (i = 0; i < entry_count; i++) {
        if (version < 1)
            unsigned int(16) item_ID;
        else if (version == 1)
            unsigned int(32) item_ID;
        else if (version > 1 && group_limited_scope)
            unsigned int(item_or_group_ID_size) item_ID;
        else
            unsigned int(item_or_group_ID_size) group_id;
        unsigned int(8) association_count;
        for (j = 0; j < association_count; j++) {
            bit(1) essential;
            if (version > 1 && group_limited_scope) {
                bit(1) supersede_in_group_flag;
                unsigned int(property_index_size - 1) property_index;
            } else
                unsigned int(property_index_size) property_index;
        }
    }
}

Where context_group_id limits the scope of the association between item_ID and property_index to the context of the EntityToGroupBox with the same group_id value. group_id identifies the EntityToGroupBox with which properties are associated. essential, when set to 1, indicates that the associated property is essential to the group; otherwise it is non-essential.
supersede_in_group_flag, when set to 1, indicates that the associated property supersedes the property of the same type in the general scope, if any; otherwise the associated property applies in order after properties of the same type associated with the same item in the general scope.
property_index is either 0, indicating that no property is associated (the essential indicator shall also be 0), or is the 1-based index (counting all boxes, including FreeSpace boxes) of the associated property box in the ItemPropertyContainerBox contained in the same ItemPropertiesBox.
The definition of version 2 of the 'ipma' box is backward compatible and does not modify the existing versions 0 and 1 of the 'ipma' box.
Each ItemPropertyAssociationBox shall be ordered by increasing item_ID or group_id (and context_group_id if present).
There shall be at most one occurrence of a given context_group_id in the set of ItemPropertyAssociationBox boxes.
There shall be at most one occurrence of a given item_ID and group_id in the set of ItemPropertyAssociationBox boxes for which group_limited_scope is not set. The first bit (LSB) of flags should be equal to 0 unless there are more than 127 properties in the ItemPropertyContainerBox.
There shall be at most one ItemPropertyAssociationBox with a given pair of values of version and flags, except for boxes with the flag group_limited_scope set.
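On the writer side, the flags value for such a box follows directly from the file being written. The helper below is a hypothetical illustration of that choice, using the three flag bits defined above.

```python
def choose_ipma_flags(num_properties, max_id, group_limited):
    """Pick the 'ipma' flags bits for the variant described above.
    num_properties: number of boxes in the ItemPropertyContainerBox;
    max_id: largest item_ID or group_id to be written;
    group_limited: whether the associations are scoped to one group."""
    flags = 0
    if num_properties > 127:   # 7-bit property_index is not enough
        flags |= 0x000001
    if max_id > 0xFFFF:        # 16-bit item_ID / group_id is not enough
        flags |= 0x000002
    if group_limited:          # group_limited_scope
        flags |= 0x000004
    return flags
```

A file with few properties, small identifiers and no group-limited associations thus keeps flags = 0, as the constraints above recommend.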
Figure 4 illustrates the main steps of a process for encapsulating one or more entities in one file using HEIF format. A computing device 600 (Figure 6) may for instance apply this processing.
First, one or more entities are obtained in step 401. They can be obtained according to a capture mode by a capturing device as described above.
In an alternative, the one or more entities obtained in step 401 can be selected among existing or previously captured or created entities. For instance, existing entities can be extracted from an existing HEIF file or from any other type of files, for instance JPEG, GIF, BMP, MKV, MP4 or AVI.
At step 402, one or more new groups of entities are created. Those groups may result from the capture mode selected at capture time, i.e. a new group is created for each set of entities resulting from applying a capture mode during the capture of images. For example, at capture time, a time-lapse is realized containing an image every second; the time-lapse images are grouped together in a same group at the end of the capture. As another example, an auto-exposure bracketing group is created to group together all images resulting from an auto-exposure bracketing capture. New groups of entities may also be created during an editing operation performed by an automatic process, by a user, or by the creator of an HEIF file. For instance, groups of entities can be created to logically group entities, such as for creating user collections, user favourites, photo series, sets of photos, or user-defined capture series. A user may for instance group several entities to form a collection of images. An automatic process may for instance group the photos shot in the same location and/or within a predetermined interval of time. For instance, the capture device uses the location information (for instance from a GPS sensor) of the captured image to determine the name of the town corresponding to the location; all the images taken in the same town then form one group of images. The capture or editing device may also use artificial intelligence algorithms (e.g. face or object recognition) to categorize images and group them together.
At step 403, several properties may be associated with each group to describe more precisely the purpose of the group. For instance, a group may be associated with tags, labels, names, descriptive texts of the content, location coordinates, or common parameters of a capture mode. For instance, common parameters of a capture mode may be related to the timing (e.g. acquisition frame rate or time delta between successive images), to variations in capture parameters (e.g. exposure step) between successive captures, or to properties of the bracketing mode (e.g. continuous, single or auto bracketing, or panorama direction).
In some embodiments, in an optional step 405, several properties may be associated with entities pertaining to a group of entities, the association being only valid within the scope of the group. This means that the property applies to the entity only when the entity is considered as being part of the group. If the entity is considered independently of the group, then the property does not apply to the entity.
Finally, at step 404, images and sequences of entities are encapsulated in the HEIF file format with metadata structures describing them and describing the created groups of entities with their associated properties. For the storage of images, two main alternative possibilities are available.
In the first alternative, images are encoded independently and stored in the file as HEIF items. During this encapsulation step, additional information on the condition of capture may be provided in the file. For example, for auto exposure bracketing mode, the exposure data used for the capture of each image may be provided. This description is provided using properties in an ItemProperty box.
In a second alternative, images are stored in a 'pict' or 'vide' track. Additional information may be provided using SampleEntry or SampleGroupEntry boxes. The encoding of images may depend on previous images, using an HEVC encoder similarly to video encoding; previous images in the track are then available as reference images for predictive encoding.
The storage of the captured images as a group of images is signaled in the file using the available EntityToGroup grouping mechanism previously described.
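The flow of steps 401 to 404 can be outlined as follows. This is an illustrative sketch only: the function, the dictionary layout standing in for the written metadata, and the grouping_type value used in the example are our own choices, not an actual HEIF writing API.

```python
def encapsulate(entities, capture_mode, group_properties):
    """Outline of the encapsulation flow: obtain entities (step 401),
    create a group for the capture mode (step 402), associate properties
    with the group (step 403), and return a dictionary standing in for
    the metadata embedded in the HEIF file (step 404)."""
    group_id = 1  # single group in this sketch
    file_meta = {
        # step 401: the obtained entities, identified by their IDs
        'items': [e['id'] for e in entities],
        # step 402: one EntityToGroup-like entry per capture-mode group
        'groups': [{'group_id': group_id,
                    'grouping_type': capture_mode,
                    'entity_ids': [e['id'] for e in entities]}],
        # step 403: properties attached to the group as a whole
        'group_property_associations': {group_id: group_properties},
    }
    return file_meta  # step 404: metadata to be written to the file
```

For an auto-exposure bracketing capture, for example, all captured items end up referenced by one group carrying the bracketing parameters as group-level properties.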
Figure 5 illustrates the main steps of a parsing process of an HEIF file generated by the encapsulating process of Figure 4, to determine the properties associated with a group of entities and the properties associated with an entity (item or track) within the scope of a group. The decoding process starts with the parsing of an HEIF file containing one or more images (items) or sequences of images ('pict' or 'vide' tracks). In a step 501, the Grouping Information contained in the GroupsListBox is parsed to determine the groups of entities. The grouping type of each group of entities is determined when the Grouping Information is present, i.e. the HEIF file includes an EntityToGroupBox with a grouping_type equal to one of the values previously described. In a first alternative, the grouping_type parameter specifies directly the grouping mode. In a second alternative, the Grouping Information signals that the set of images belongs to a generic capture series or collection group (the grouping type is equal to 'case', 'brak', 'udcs', or any other FourCC with a similar meaning). In this case, an additional attribute of the EntityToGroupBox provides the particular grouping mode.
Property Information associated with each item is parsed at step 502 by parsing the ItemPropertiesBox ('iprp') and using the ItemPropertyAssociationBoxes (version 0 or version 1) to retrieve each ItemProperty or ItemFullProperty from the ItemPropertyContainerBox that is associated with a particular item.
Property Information associated with each group of entities is parsed at step 503 by parsing the additional attributes defined in the respective EntityToGroupBox. In an alternative, Property Information associated with each group of entities is obtained by parsing a dedicated box GroupPropertiesBox ('gprp') and using the GroupPropertyAssociationBoxes to retrieve each ItemProperty or ItemFullProperty from the GroupPropertyContainerBox that is associated with a particular group. In another alternative, Property Information associated with each group of entities is obtained by parsing the ItemPropertiesBox ('iprp') and using dedicated ItemPropertyAssociationBoxes (version >= 2) to retrieve each ItemProperty or ItemFullProperty from the ItemPropertyContainerBox that is associated with a particular group.
In some embodiments, Property Information associated with each item in the scope of a particular group of entities is parsed in an optional step 504 by parsing the additional attributes defined in the respective EntityToGroupBox, providing the index of the ItemProperties or directly the ItemProperties associated with each item in that group. In an alternative, Property Information associated with each item in the scope of a particular group of entities is obtained by parsing the ItemPropertiesBox ('iprp') and using dedicated ItemPropertyAssociationBoxes (version >= 2) to retrieve each ItemProperty or ItemFullProperty from the ItemPropertyContainerBox that is associated with the item in the scope of a particular group.
At step 505, the decoder provides to the player all grouping information and property information associated with items, groups of entities, and entities within a group of entities. The GUI interface may provide specific GUI elements to allow a user to navigate between the images, sequences of images and groups of such entities using the respective associated Property Information. From this Property Information, the user may select for rendering part of the images contained in the HEIF file.
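The resolution performed across steps 501 to 505 can be sketched as follows. All container shapes here are ours, chosen for illustration; a real parser would populate them from the boxes described above.

```python
def resolve_for_player(groups, item_props, group_props, in_group_props):
    """For every parsed group, gather its grouping_type (step 501), the
    general-scope properties of its items (step 502), its group-level
    properties (step 503), and the per-item properties valid only within
    that group (step 504); return the view handed to the player (505).
    `groups` is a list of dicts with 'group_id', 'grouping_type' and
    'entity_ids'; the other arguments map IDs (or (group_id, item_id)
    pairs) to lists of properties."""
    view = []
    for g in groups:
        gid = g['group_id']
        view.append({
            'group_id': gid,
            'grouping_type': g['grouping_type'],           # step 501
            'group_properties': group_props.get(gid, []),  # step 503
            'items': {
                # step 502 (general scope) + step 504 (group scope)
                item: item_props.get(item, []) +
                      in_group_props.get((gid, item), [])
                for item in g['entity_ids']
            },
        })
    return view                                            # step 505
```

The player can then drive its GUI (bracketing selection, panorama navigation, slideshows) directly from the returned per-group view.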
For instance, the HEIF file contains a series of bracketing images. In such a case, the application provides a GUI interface that makes it possible to view the different bracketing alternatives. In one embodiment, the interface uses the information provided in the Property Information, such as the ItemProperties, to extract the characteristics of the capture associated with each image (including both Property Information from the group and Property Information of each image within the scope of the group). In particular, for auto-exposure bracketing, the exposure stop of each shot is displayed at step 505 in order to allow a user to select the appropriate shot. Upon selection of the preferred exposure, the decoding device may modify the HEIF file to mark the selected image as "primary item".
When the capture mode corresponds to a Panorama image, the decoder notifies the player in step 505 that the HEIF file contains a series of images in which the user may navigate. The GUI interface may provide specific GUI elements to allow a user to navigate between the images as a spatial composition. The player parses the Property Information of the group to extract the pattern of capture of the set of images (for example from left to right) in order to generate a navigation interface adapted to the pattern. For example, if the pattern of capture is from left to right, the GUI interface provides horizontal navigation arrows to navigate between the items of the HEIF file.
When the capture mode corresponds to a Photo Series or an Image Burst, the decoder notifies the player in step 505, for example, to start a slideshow of all the images of the Photo Series or Image Burst group. In one embodiment, the display time of each image in the slideshow is a function of the timing interval specified in the Property Information of the Image Burst group.
In one embodiment, the player displays the label or name information provided in the Property Information at the beginning of the slideshow, or as a watermark in each image, to allow the user to rapidly identify the content of the Photo Series. In another embodiment, the user may select one image as the preferred image from the series of images of the Photo Series group. In such a case, the preferred image is marked as the Primary Item. In another embodiment, the user may select several images as preferred images from the Photo Series. In such a case, the player creates a new Photo Series group with the selected images and associates the same label Property Information.
Figure 6 is a schematic block diagram of a computing device 600 for implementation of one or more embodiments of the invention. The computing device 600 may be a device such as a micro-computer, a workstation or a light portable device, for instance a mobile phone, tablet or still or video camera. The computing device 600 comprises a communication bus connected to:
- a central processing unit 601, such as a microprocessor, denoted CPU;
- a random access memory 602, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method according to embodiments of the invention, the memory capacity thereof can be expanded by an optional RAM connected to an expansion port for example;
- a read only memory 603, denoted ROM, for storing computer programs for implementing embodiments of the invention;
- a network interface 604, typically connected to a communication network over which digital data to be processed are transmitted or received. The network interface 604 can be a single network interface, or composed of a set of different network interfaces (for instance wired and wireless interfaces, or different kinds of wired or wireless interfaces). Data packets are written to the network interface for transmission or are read from the network interface for reception under the control of the software application running in the CPU 601;
- a user interface 605, which may be used for receiving inputs from a user or to display information to a user;
- a hard disk 606, denoted HD, which may be provided as a mass storage device;
- an I/O module 607, which may be used for receiving/sending data from/to external devices such as a video source or display.
The executable code may be stored either in read only memory 603, on the hard disk 606 or on a removable digital medium such as for example a disk. According to a variant, the executable code of the programs can be received by means of a communication network, via the network interface 604, in order to be stored in one of the storage means of the communication device 600, such as the hard disk 606, before being executed.
The central processing unit 601 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to embodiments of the invention, which instructions are stored in one of the aforementioned storage means. After powering on, the CPU 601 is capable of executing instructions from main RAM memory 602 relating to a software application after those instructions have been loaded from the program ROM 603 or the hard-disc (HD) 606 for example. Such a software application, when executed by the CPU 601, causes the steps of the flowcharts of the invention to be performed.
Any step of the algorithms of the invention may be implemented in software by execution of a set of instructions or program by a programmable computing machine, such as a PC ("Personal Computer"), a DSP ("Digital Signal Processor") or a microcontroller; or else implemented in hardware by a machine or a dedicated component, such as an FPGA ("Field-Programmable Gate Array") or an ASIC ("Application-Specific Integrated Circuit").
Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications will be apparent to a skilled person in the art which lie within the scope of the present invention.
Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular the different features from different embodiments may be interchanged, where appropriate.
Each of the embodiments of the invention described above can be implemented solely or as a combination of a plurality of the embodiments. Also, features from different embodiments can be combined where necessary or where the combination of elements or features from individual embodiments in a single embodiment is beneficial.
Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.

Claims (31)

1. A method of encapsulating media data in a file, wherein the method comprises: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a property container data structure containing a property associated with a group of entities; generating an association data structure comprising association information between the property and the grouping data structure; generating a property data structure comprising the property container data structure and the association data structure; and embedding the grouping data structure, the property data structure, and the media data in the file.
2. The method of claim 1, further comprising: generating a text property in the property container data structure comprising at least one text attribute associated with a group of entities and a language attribute associated with the at least one text attribute.
3. The method of claim 2, wherein the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
4. The method of claim 2, wherein the property container data structure comprises a first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
  5. The method of claim 2 or 3, wherein the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
6. The method of claim 2, wherein the method comprises: generating at least one of a name text property, a description text property, and/or a tag text property in the property container data structure, each of the text properties comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute.
7. The method of claim 6, wherein each text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
8. The method of claim 6, wherein the property container data structure comprises a first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
9. The method of any one of claims 1 to 8, wherein the association data structure further comprises at least an association of a property with an entity.
10. A method of encapsulating media data in a file, wherein the method comprises: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and embedding the media data and the grouping data structure in the file.
11. The method of claim 10, wherein the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
  12. 12. The method of claim 10 or 11, wherein the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
  13. A method of reading media data in a file, wherein the method comprises: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a property container data structure containing a property associated with a group of entities; reading an association data structure comprising association information between the property and the grouping data structure; reading a property data structure comprising the property container data structure and the association data structure; and reading the media data identified in the grouping data structure according to the property.
  14. The method of claim 13, further comprising: reading a text property in the property container data structure comprising at least one text attribute associated with a group of entities and a language attribute associated with the at least one text attribute.
  15. The method of claim 14, wherein the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
  16. The method of claim 14, wherein the property container data structure comprises first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
  17. The method of claim 14 or 15, wherein the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
  18. The method of claim 14, wherein the method comprises: reading at least one of a name text property, a description text property, and a tag text property in the property container data structure, each of the text properties comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute.
  19. The method of claim 18, wherein each text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
  20. The method of claim 18, wherein the property container data structure comprises first and second text properties, the first text property comprising a text attribute associated with a first language attribute, and the second text property comprising the text attribute associated with a second language attribute, the first language being different from the second language, and wherein the association data structure comprises association information between the first and second text properties, and the grouping data structure.
  21. The method of any one of claims 13 to 20, wherein the association data structure further comprises at least an association of a property with an entity.
  22. A method of reading media data in a file, wherein the method comprises: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and reading the media data identified in the grouping data structure according to the text property.
  23. The method of claim 22, wherein the text property comprises a plurality of the same text attribute associated with a respective plurality of language attributes.
  24. The method of claim 22 or 23, wherein the text property comprises a plurality of text attributes comprising a name text attribute, a description text attribute, and a tag text attribute.
  25. A device for encapsulating media data in a file, wherein the device comprises circuitry configured for: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a property container data structure containing a property associated with a group of entities; generating an association data structure comprising association information between the property and the grouping data structure; generating a property data structure comprising the property container data structure and the association data structure; and embedding the grouping data structure, the property data structure, and the media data in the file.
  26. A device for encapsulating media data in a file, wherein the device comprises circuitry configured for: generating a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; generating a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and embedding the media data and the grouping data structure in the file.
  27. A device for reading media data in a file, wherein the device comprises circuitry configured for: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a property container data structure containing a property associated with a group of entities; reading an association data structure comprising association information between the property and the grouping data structure; reading a property data structure comprising the property container data structure and the association data structure; and reading the media data identified in the grouping data structure according to the property.
  28. A device for reading media data in a file, wherein the device comprises circuitry configured for: reading a grouping data structure describing a group of entities, each entity corresponding to at least a portion of media data; reading a text property in the grouping data structure comprising at least a text attribute associated with a group of entities and a language attribute associated with the text attribute; and reading the media data identified in the grouping data structure according to the text property.
  29. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to any one of claims 1 to 24, when loaded into and executed by the programmable apparatus.
  30. A computer-readable storage medium storing instructions of a computer program for implementing a method according to any one of claims 1 to 24.
  31. A computer program which upon execution causes the method of any one of claims 1 to 24 to be performed.
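The claims describe three cooperating ISOBMFF-style structures: a grouping data structure listing entity IDs, a property container holding properties (including language-tagged text properties), and an association structure linking properties to the group. The sketch below serializes such boxes in Python. It is illustrative only: the four-character codes 'grpl', 'altr', 'iprp', 'ipco', and 'ipma' follow HEIF/ISOBMFF conventions, but the 'gnam' group-name text property and its payload layout are hypothetical stand-ins for the text property the claims describe, not codes defined by this patent or the standard.

```python
import struct

def plain_box(box_type: bytes, payload: bytes) -> bytes:
    # ISOBMFF Box: 32-bit size (header included) + 4-char type + payload
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def full_box(box_type: bytes, version: int, flags: int, payload: bytes) -> bytes:
    # FullBox adds a 1-byte version and 24-bit flags before the payload
    body = struct.pack(">I", (version << 24) | flags) + payload
    return struct.pack(">I4s", 8 + len(body), box_type) + body

def entity_to_group(grouping_type: bytes, group_id: int, entity_ids) -> bytes:
    # Grouping data structure: group_id, entity count, then the entity IDs
    payload = struct.pack(">II", group_id, len(entity_ids))
    payload += b"".join(struct.pack(">I", e) for e in entity_ids)
    return full_box(grouping_type, 0, 0, payload)

def group_name_property(name: str, lang: str) -> bytes:
    # Hypothetical 'gnam' text property: NUL-terminated language tag
    # (the language attribute) followed by a NUL-terminated UTF-8 name
    payload = lang.encode() + b"\x00" + name.encode("utf-8") + b"\x00"
    return full_box(b"gnam", 0, 0, payload)

def property_association(entries) -> bytes:
    # Association data structure ('ipma', version 0): each entry maps a
    # 16-bit id to 1-based property indices; top bit marks "essential"
    payload = struct.pack(">I", len(entries))
    for item_id, assocs in entries:
        payload += struct.pack(">HB", item_id, len(assocs))
        for essential, index in assocs:
            payload += struct.pack(">B", (0x80 if essential else 0) | index)
    return full_box(b"ipma", 0, 0, payload)

# A group of entities 1, 2, 3 (e.g. an 'altr' alternatives group) ...
grpl = plain_box(b"grpl", entity_to_group(b"altr", 100, [1, 2, 3]))
# ... a property container holding one text property ...
ipco = plain_box(b"ipco", group_name_property("Holiday burst", "en-GB"))
# ... and a property data structure combining container and association
iprp = plain_box(b"iprp", ipco + property_association([(100, [(False, 1)])]))
```

A reader reverses the process: it parses 'grpl' to discover the group and its entity IDs, then follows the 'ipma' entry for group_id 100 to index 1 of 'ipco' to recover the group's name in the declared language.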
GB1903174.9A 2018-04-05 2019-03-08 Method and apparatus for encapsulating groups of images in a file Active GB2582024B (en)

Priority Applications (9)

Application Number Priority Date Filing Date Title
GB1903174.9A GB2582024B (en) 2019-03-08 2019-03-08 Method and apparatus for encapsulating groups of images in a file
US17/044,763 US12008052B2 (en) 2018-04-05 2019-04-04 Method and apparatus for encapsulating images in a file
PCT/EP2019/058511 WO2019193097A1 (en) 2018-04-05 2019-04-04 Method and apparatus for encapsulating images in a file
KR1020207031077A KR102465188B1 (en) 2018-04-05 2019-04-04 Method and apparatus for encapsulating images in files
JP2020551333A JP7090730B2 (en) 2018-04-05 2019-04-04 Methods and devices for encapsulating images in files
EP19715912.2A EP3777221A1 (en) 2018-04-05 2019-04-04 Method and apparatus for encapsulating images in a file
CN201980023648.1A CN111989932B (en) 2018-04-05 2019-04-04 Method and apparatus for packaging images in files
JP2022096748A JP7307840B2 (en) 2018-04-05 2022-06-15 Method and apparatus for generating a file encapsulating one or more images
US18/604,283 US20240220548A1 (en) 2018-04-05 2024-03-13 Method and apparatus for encapsulating images in a file


Publications (3)

Publication Number Publication Date
GB201903174D0 GB201903174D0 (en) 2019-04-24
GB2582024A true GB2582024A (en) 2020-09-09
GB2582024B GB2582024B (en) 2022-06-08

Family

ID=66380251


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010007513A1 (en) * 2008-07-16 2010-01-21 Nokia Corporation Method and apparatus for track and track subset grouping

