WO2020043003A1 - Method and device for processing and transmitting media data and specifying reference images - Google Patents


Info

Publication number
WO2020043003A1
WO2020043003A1 · PCT/CN2019/102025
Authority
WO
WIPO (PCT)
Prior art keywords
media data
media
samples
time
information
Prior art date
Application number
PCT/CN2019/102025
Other languages
English (en)
French (fr)
Inventor
虞露
于化龙
袁锜超
林翔宇
Original Assignee
浙江大学 (Zhejiang University)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201811487546.9A (CN110876083B)
Application filed by 浙江大学 (Zhejiang University)
Priority to US17/418,703 (US11716505B2)
Priority to EP19853701.1A (EP3866478A4)
Publication of WO2020043003A1
Priority to US18/342,526 (US12052464B2)

Classifications

    All classifications fall under Section H (Electricity) → H04 (Electric communication technique) → H04N (Pictorial communication, e.g. television):

    • H04N 21/440227: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display by decomposing into layers, e.g. base layer and one or more enhancement layers
    • H04N 21/234327: Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • H04N 19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N 21/2362: Generation or processing of Service Information [SI]
    • H04N 21/26258: Content or additional data distribution scheduling for generating a list of items to be played back in a given order, e.g. playlist
    • H04N 21/4345: Extraction or processing of SI, e.g. extracting service information from an MPEG stream
    • H04N 21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/6125: Network physical structure; signal processing specially adapted to the downstream path of the transmission network involving transmission via Internet
    • H04N 21/6547: Transmission by server directed to the client comprising parameters, e.g. for client setup
    • H04N 21/8456: Structuring of content by decomposing the content in the time domain, e.g. in time segments

Definitions

  • the present invention relates to the technical field of image or video compression and the field of system transmission, and more particularly, to a method and device for processing media data and a method and device for transmitting media data.
  • a file format is a specific way of storing encoded data in a computer file. It can separate metadata and media data to solve problems such as random access and network streaming.
  • Media data includes video data, audio data, time-series metadata, non-time-series images, and so on.
  • Media data is divided into multiple access units. Each access unit contains a non-time series image or a group of non-time series images or at least one random access fragment.
  • the access unit of the media data is carried in a sample.
  • the access unit of the media data is carried in the metadata item.
  • Metadata is auxiliary data used to describe media data, such as sample entries, data boxes describing tracks, and so on. Metadata can be divided into time-series metadata and non-time-series metadata. Time-series metadata is stored in the media data box together with the media data; non-time-series metadata is stored in the metadata box. Metadata boxes are stored at different levels of the file.
  • the file format stores these data in a prescribed structure. Files in the file format will contain media data boxes and metadata boxes.
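  • The file structure just described can be illustrated with a minimal box walker. The 4-byte size / 4-byte type header is per the ISO base media file format; the function name and return shape are illustrative assumptions:

```python
import struct

def parse_boxes(data, offset=0, end=None):
    """Walk an ISO BMFF byte stream and return (box_type, payload) pairs.
    Each box starts with a 4-byte big-endian size and a 4-byte type."""
    end = len(data) if end is None else end
    boxes = []
    while offset + 8 <= end:
        size, = struct.unpack(">I", data[offset:offset + 4])
        box_type = data[offset + 4:offset + 8].decode("ascii")
        header = 8
        if size == 1:   # 64-bit 'largesize' follows the type field
            size, = struct.unpack(">Q", data[offset + 8:offset + 16])
            header = 16
        elif size == 0: # box extends to the end of the enclosing container
            size = end - offset
        boxes.append((box_type, data[offset + header:offset + size]))
        offset += size
    return boxes

# A file is a sequence of such boxes, e.g. 'ftyp', 'moov' (metadata box)
# and 'mdat' (media data box).
sample = struct.pack(">I4s", 16, b"ftyp") + b"isom\x00\x00\x00\x01"
print(parse_boxes(sample))  # [('ftyp', b'isom\x00\x00\x00\x01')]
```

  Nested containers such as 'moov' can be handled by calling parse_boxes recursively on a container box's payload.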
  • The Movie Box ('moov') is an important metadata box, which stores the various tracks and a number of metadata boxes.
  • Tracks have a logical structure and a temporal structure. Logically, tracks can be divided into media tracks and hint tracks. Temporally, each track is a timed sequence of samples, and the different tracks of a media data stream share the same time axis.
  • sample offset
  • sample size
  • sample entry
  • Sample groups can represent characteristics common to some sample data information in the same track.
  • Sample auxiliary information: sample auxiliary information boxes and sample auxiliary information offset boxes
  • the auxiliary type determines the type of this auxiliary information.
  • In addition to the metadata describing the media data, a track contains many data boxes describing the track itself.
  • Dependency relationships between different data streams can be stored in a tref data box (Track Reference Box).
  • The tref data box of one track contains the identifier and reference type (reference_type) of another track.
  • The reference type values are: 'hint', 'cdsc', 'font', 'hind', 'vdep', 'dplx', 'subt', 'thmb', 'auxl'. They determine the dependency and its type; for example, 'cdsc' indicates that the current track describes the referenced track, and 'hint' indicates that the referenced track is the media data track pointed to by the 'hint' track.
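  • A minimal sketch of serializing such a 'tref' box may clarify the layout; the container / TrackReferenceTypeBox structure follows the ISO base media file format, while the helper name is hypothetical:

```python
import struct

def make_tref(reference_type, track_ids):
    """Build a Track Reference Box: a 'tref' container holding one
    TrackReferenceTypeBox whose 4-character type (e.g. b'hint', b'cdsc')
    names the dependency and whose payload lists referenced track IDs."""
    assert len(reference_type) == 4
    payload = b"".join(struct.pack(">I", tid) for tid in track_ids)
    inner = struct.pack(">I4s", 8 + len(payload), reference_type) + payload
    return struct.pack(">I4s", 8 + len(inner), b"tref") + inner

# A track whose tref contains 'cdsc' -> track 2 declares that it
# describes (is content description for) track 2.
box = make_tref(b"cdsc", [2])
print(box.hex())  # 00000014747265660000000c6364736300000002
```

  Parsing the box back is the reverse: read the 'tref' header, then each inner box's type (the reference_type) and its list of 32-bit track IDs.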
  • In interdependent data streams, the index of the second sample (in the other data stream) on which a first sample depends is implicit: it is obtained by matching the presentation time of the second sample to that of the first sample. The existing reference types therefore assume a shared time axis, i.e. a time-aligned dependency. Dependencies over non-aligned time periods can neither be expressed correctly by the existing types, nor do those types support the reuse of non-time-series data or flexible operation.
  • MMT MPEG Media Transport
  • a package is a logical entity that consists of composition information (CI) and one or more assets.
  • MMT assets are logical data entities that contain encoded media data.
  • the encoded media of MMT assets can be timed data or non-timed data.
  • Timed data is audiovisual media data that requires specific data units to be decoded and presented synchronously at a specified time.
  • Non-timed data is any other type of data that can be decoded and presented at an arbitrary time, based on the context of the service or the user's interaction.
  • Composition Information (CI) describes the relationships between assets, enabling the synchronized transmission of files carried in different assets.
  • MMT uses MPU (Media Processing Unit) to encapsulate files on the basis of ISO file format.
  • Media processing unit (MPU) is independent and fully processed data that conforms to the MMT entity.
  • the processing includes encapsulation and packetization.
  • the MPU is uniquely identified within the MMT package and has a serial number and an associated MMT asset ID to distinguish it from other MPUs.
  • MMT adds a hint track to the MPU to guide the sender in packetizing media data into smaller media fragment units (MFUs).
  • MFU: media fragment unit
  • The sample is used as MFU header information, which describes the scalability layer of the MFU.
  • the existing MMT is mainly designed for media data generated by existing video encoding methods.
  • A video sequence includes at least one random access segment.
  • Each random access segment corresponds to a display period and includes a random access image and multiple non-random access images.
  • Each image has its own display time to describe the time when the image is displayed or played.
  • An image in a random access segment may be intra-coded, or may be encoded using inter prediction with reference to other images in the random access segment, where the referenced image may be an image to be displayed or a synthesized image that cannot be displayed, etc.
  • An image displayed after a random access image (excluding leading pictures) can only refer to other images in the random access segment to which it belongs, and cannot refer to images before or after that segment.
  • The images in a random access segment are shown in Figure 1. Specifically, there are several ways to describe the dependency relationship between the current image and its candidate reference images:
  • The dependency relationship between the current image and a candidate reference image is described by the reference image configuration set of the video compression layer, where the configuration set records the number difference between the reference image and the current image.
  • Only the number difference is described in the reference image configuration set because, in existing video coding schemes, the candidate reference image and the current image belong to the same independently decodable random access segment and can only use the same numbering rule (for example, chronological order); the candidate reference image can therefore be located accurately from the difference between the current image number and the candidate reference image number.
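  • A toy sketch of this number-difference lookup may help; the function name and the decoded-picture-buffer-as-dict structure are illustrative assumptions, not the patent's or any standard's actual syntax:

```python
def resolve_by_delta(current_number, delta, dpb):
    """Locate a candidate reference image by the difference between the
    current image number and the reference image number, assuming both
    images use the same numbering rule within one random access segment."""
    ref_number = current_number - delta
    return dpb.get(ref_number)  # None if the image is not in the buffer

# dpb maps image number -> decoded image (placeholder strings here)
dpb = {7: "img7", 8: "img8", 9: "img9"}
print(resolve_by_delta(10, 2, dpb))  # img8
```

  This only works while both images share one numbering rule, which is exactly the limitation the later aspects of the invention address.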
  • SVC Scalable Video Coding Scheme
  • MVC Multiview Video Coding Scheme
  • Existing inter-frame prediction uses only reference images in the same layer / the same viewpoint.
  • SVC/MVC uses inter-layer / inter-view prediction to expand the range of candidate reference images for the current image, where an extended candidate reference image has the same number as the current image (for example, the same time stamp) but does not belong to the same independently decodable segment of the same layer.
  • SVC / MVC uses the layer identifier to describe the dependency relationship of the code streams of different layers / views, and uses the same number of images to describe the dependency relationship of images between layers / views.
  • AVS2 uses identifiers to describe special scene image types (G images and GB images), uses a specific reference cache (the scene image cache) to manage G/GB images, uses an identifier to describe whether the current image refers to a G/GB image, and uses a specific reference image queue construction method (G/GB images are placed by default in the last reference image position of the queue). In this way, a current image numbered according to the rule can refer to a candidate reference image that does not follow the numbering rule (a GB image), or a candidate reference image that uses the same numbering rule as the current image but whose number difference exceeds the constrained range (a G image).
  • However, this technology limits the scene image cache to holding only one candidate reference image at any moment, and that candidate reference image must still belong to the same independently decodable random access segment.
  • the aforementioned mechanism of the prior art will limit the number of available reference images of the image to be encoded, and cannot effectively improve the efficiency of image encoding and image decoding.
  • When encoding (or decoding) an image, the encoder (or decoder) can select from a database images whose textures are similar to the image currently being encoded (or decoded) to serve as reference images.
  • This type of reference image is called a knowledge base image.
  • the database that stores the set of reference images is called a knowledge base.
  • At least one image in this video refers to at least one knowledge base image.
  • This method of encoding and decoding is called library-based video coding.
  • Encoding a video sequence with library-based video coding generates a knowledge layer code stream, containing the coded knowledge base images, and a video layer code stream, containing the coded images of each frame of the video sequence encoded with reference to the knowledge base images.
  • These two code streams are similar to the base layer and enhancement layer code streams generated by scalable video coding (SVC), in that the sequence (video) layer code stream depends on the knowledge layer code stream.
  • SVC scalable video coding
  • However, the dependency structure of the dual-stream organization used by library-based video coding differs from that of the hierarchical stream organization of SVC: in SVC, the layers depend on each other over aligned time periods, whereas in the dual streams of library-based video coding the video layer depends on the knowledge layer over non-aligned time periods.
  • Knowledge base-based video coding schemes bring problems to the storage, transmission, and reference image management of code stream data encoded using knowledge base-based video coding schemes.
  • the knowledge image is acquired and used to provide additional candidate reference images for the encoding and decoding of the image.
  • Figure 4 shows the dependency relationship between sequence images and knowledge images in knowledge-image coding and decoding technology.
  • Knowledge images enable sequence images to use related information over a large time span, improving encoding and decoding efficiency.
  • the existing technical solutions cannot effectively support the description of the dependency relationship between sequence images and knowledge images and the efficient management of knowledge images.
  • The scalability level in the foregoing MMT can describe the level information of SVC data.
  • Combined with time information, the scalability level can describe the dependency relationships between SVC data at different levels at the same time instant, but it cannot describe the non-aligned time-period dependencies of a library-based encoded video stream.
  • The present invention aims to provide multiple methods and devices, such as for processing to obtain media data, transmitting media data, processing media data, processing reference image requests, and specifying reference images, so as to implement correct decoding and efficient transmission of the code streams produced by library-based video coding and to improve transmission and storage efficiency.
  • the present invention adopts the following technical solutions:
  • a first aspect of the present invention provides a method for specifying a reference image, the method including:
  • the decoder extracts first identification information from a reference mapping table to determine whether the reference image numbers corresponding to the reference indices in the table use at least two numbering rules;
  • when the numbers corresponding to the reference indices use at least two numbering rules, the decoder extracts, from the reference mapping table, second identification information corresponding to at least one reference index j to obtain the numbering rule used by the reference image number corresponding to reference index j;
  • the decoder extracts the reference image number corresponding to reference index j from the reference mapping table;
  • when the numbering rule used by the reference image number is the first numbering rule (the same rule as the current picture), the decoder uses the reference image number to determine the reference image of the current picture;
  • when the numbering rule used by the reference image number is the second numbering rule, the decoder determines the reference image of the current picture using reference image information returned from outside the decoder for that reference image number.
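  • The decoder steps above can be sketched as follows. This is a hedged illustration: the table layout, field names, and the dict-based decoded picture buffer (DPB) and external library store are all assumptions, not the patent's actual syntax:

```python
def resolve_reference(ref_index, mapping_table, dpb, library):
    """Resolve a reference index via the reference mapping table.
    Rule-1 numbers index images in the decoder's own buffer (DPB);
    rule-2 numbers index images supplied from outside the decoder
    (e.g. a knowledge/library image store)."""
    entry = mapping_table[ref_index]
    ref_number = entry["ref_number"]
    if entry["rule"] == 1:   # first rule: same numbering as current image
        return dpb[ref_number]
    else:                    # second rule: fetched from outside the decoder
        return library[ref_number]

mapping_table = {
    0: {"rule": 1, "ref_number": 41},  # ordinary temporal reference
    1: {"rule": 2, "ref_number": 3},   # knowledge/library image
}
dpb = {41: "decoded image 41"}
library = {3: "library image 3"}
print(resolve_reference(1, mapping_table, dpb, library))  # library image 3
```

  The first identification information of the claim corresponds to checking, once per table, whether any entry can carry `"rule": 2` at all; only then is the per-index second identification information present.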
  • the method further includes:
  • the decoder obtains the reference image number and the second identification information corresponding to the at least one reference index j from the reference mapping update table;
  • the method further includes:
  • When the decoder decodes the current image using a reference image whose reference image number uses the second numbering rule, the decoder sets the distance between that reference image and the current image to a non-time-domain distance.
  • a second aspect of the present invention provides a method for processing a reference image request, the method including:
  • the method further includes:
  • Using the positioning information of the second-type segment to send the decoder information about the reference image included in the second-type segment pointed to by the positioning information further includes:
  • a third aspect of the present invention provides a device for specifying a reference image, the device including:
  • One or more programs are used to:
  • the processor extracts first identification information from a reference mapping table to determine whether the reference image numbers corresponding to the reference indices in the table use at least two numbering rules;
  • when they do, the processor extracts second identification information corresponding to at least one reference index j from the reference mapping table to obtain the numbering rule used by the reference image number corresponding to reference index j;
  • the processor extracts the reference image number corresponding to reference index j from the reference mapping table;
  • when the numbering rule used by the reference image number is the first numbering rule (the same rule as the current image), the processor uses the reference image number to determine the reference image of the current image;
  • when the numbering rule used by the reference image number is the second numbering rule, the processor determines the reference image of the current image using reference image information returned from outside the decoder for that reference image number;
  • the above reference mapping table and reference image processed by the processor exist in the memory.
  • the device further includes:
  • the processor obtains the reference image number and second identification information corresponding to at least one reference index j from the reference mapping update table;
  • when reference index j from the reference mapping update table already exists in the reference mapping table, the processor replaces the reference image number and second identification information corresponding to reference index j in the reference mapping table with the reference image number and second identification information obtained from the reference mapping update table;
  • otherwise, the processor adds reference index j, together with its corresponding reference image number and second identification information obtained from the reference mapping update table, to the reference mapping table.
  • the device further includes:
  • the processor sets the distance between the reference image and the current image to a non-time domain distance.
  • a fourth aspect of the present invention also provides an apparatus for processing a reference image request, the apparatus includes:
  • One or more programs are used to:
  • the processor obtains a dependency mapping table of at least one first-type segment, which records the mapping between the reference image number of at least one reference image on which the first-type segment depends and the positioning information of the second-type segment to which that reference image belongs;
  • the processor receives reference image request information sent by a decoder to obtain the reference image number of at least one reference image on which the current image depends, the current image being contained in its first-type segment;
  • the processor obtains, from the dependency mapping table of the first-type segment to which the current image belongs, the positioning information of the second-type segment containing the reference image pointed to by the reference image number in the reference image request information;
  • the transmitter uses the positioning information of the second-type segment to send the decoder information about the reference image included in the second-type segment pointed to by the positioning information;
  • the device further includes:
  • the processor obtains a dependency mapping table of at least one fragment of the first type from the media description information.
  • the sending unit further includes:
  • the processor searches a cache for the second-type segment pointed to by the positioning information of the second-type segment, or for a reference image included in that segment;
  • if found, the processor obtains the second-type segment, or the reference image it includes, from the cache;
  • otherwise, the processor downloads the second-type segment from the server.
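  • A hedged sketch of this cache-then-download flow; all names, the dict-based dependency mapping table, and the fetch callback are illustrative assumptions rather than the patent's interfaces:

```python
def handle_reference_request(ref_numbers, dependency_map, cache, fetch):
    """For each requested reference image number, look up the positioning
    information (here, a URL-like string) of the second-type segment that
    carries it; serve it from the cache when present, otherwise download
    it from the server and cache it."""
    results = {}
    for n in ref_numbers:
        location = dependency_map[n]           # positioning information
        if location not in cache:
            cache[location] = fetch(location)  # download from the server
        results[n] = cache[location]
    return results

dependency_map = {3: "segments/library_3.bin"}
cache = {}
fetched = handle_reference_request(
    [3], dependency_map, cache,
    fetch=lambda url: f"<bytes of {url}>",
)
print(fetched[3])  # <bytes of segments/library_3.bin>
```

  A second request for the same reference image number would then be served from the cache without touching the server, which is the storage/transmission saving the aspect describes.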
  • a fifth aspect of the present invention also provides a device for specifying a reference image, where the device includes:
  • a first extraction unit configured to extract first identification information in a reference mapping table to obtain whether there are at least two numbering rules for a reference image number corresponding to a reference index in the reference mapping table;
  • a second extraction unit configured to extract second identification information corresponding to at least one reference index j from the reference mapping table when the number corresponding to the reference index in the reference mapping table uses at least two numbering rules to acquire the The numbering rule adopted for the reference image number corresponding to the reference index j;
  • a third extraction unit configured to extract a reference image number corresponding to the reference index j from the reference mapping table
  • a first determining unit configured to, when the numbering rule adopted by the reference image number is the first numbering rule, use the same numbering rule as the current image to determine the reference image of the current image by the reference image number;
  • a second determining unit configured to, when the numbering rule adopted by the reference image number is the second numbering rule, determine the reference image of the current image by using the reference image information returned from outside the decoder for the reference image number.
  • the device further includes:
  • a fourth extraction unit configured to obtain a reference image number and second identification information corresponding to at least one reference index j from the reference map update table;
  • a replacement unit configured to: when the reference index j in the reference mapping update table exists in the reference mapping table, replace the reference image number and the second identifier corresponding to the reference index j in the reference mapping table Replacing the information with the reference number corresponding to the reference index j and the second identification information obtained from the reference mapping update table;
  • an adding unit configured to, when the reference index j in the reference mapping update table does not exist in the reference mapping table, add the reference index j and its corresponding reference image number and second identification information obtained from the reference mapping update table to the reference mapping table.
  • the device further includes:
  • a setting unit configured to set the distance between the reference image and the current image to a non-time-domain distance when the decoder decodes the current image using the reference image pointed to by a reference image number that uses the second numbering rule.
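The reference-mapping logic in the units above can be sketched in a few lines of Python. This is a minimal illustration, not a normative decoder implementation; the rule constants, the dictionary layout of the tables, and the two picture stores are assumptions made for the example.

```python
# Illustrative sketch of the reference mapping table logic. Names such as
# RULE_SEQUENTIAL and RULE_KNOWLEDGE are hypothetical, not from any spec.

RULE_SEQUENTIAL = 0   # first numbering rule: same numbering as the current image
RULE_KNOWLEDGE = 1    # second numbering rule: image returned from outside the decoder

def resolve_reference(mapping_table, j, decoded_pictures, external_pictures):
    """Resolve the reference image for reference index j.

    mapping_table maps j -> (reference_image_number, numbering_rule),
    where numbering_rule plays the role of the second identification information.
    """
    number, rule = mapping_table[j]
    if rule == RULE_SEQUENTIAL:
        # First rule: look the number up among normally decoded pictures.
        return decoded_pictures[number]
    # Second rule: the reference image (e.g. a knowledge image) is supplied
    # from outside the decoder.
    return external_pictures[number]

def apply_update(mapping_table, update_table):
    """Apply a reference mapping update table: replace existing entries,
    add entries whose reference index is not yet in the mapping table."""
    for j, entry in update_table.items():
        mapping_table[j] = entry   # replace or add in one step
    return mapping_table
```

Note that replacing and adding collapse into a single dictionary assignment here; a bitstream-level implementation would instead parse the update table fields explicitly.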
  • a sixth aspect of the present invention further provides an apparatus for processing a reference image request, the apparatus includes:
  • a first obtaining unit configured to obtain a dependency mapping table of at least one first-type fragment, to obtain the mapping relationship between the reference image number of at least one reference image on which the at least one first-type fragment depends and the positioning information of the second-type fragment to which the at least one reference image belongs;
  • a receiving unit configured to receive reference image request information sent by a decoder to obtain the reference image number of at least one reference image on which a current image depends, the current image being contained in a first-type fragment;
  • a second obtaining unit configured to obtain, from the dependency mapping table of the first-type fragment to which the current image belongs, the positioning information of the second-type fragment corresponding to the reference image number of the at least one reference image in the reference image request information;
  • a sending unit configured to use the positioning information of the second-type segment to send to the decoder information of a reference image included in the second-type segment pointed to by the positioning information.
  • the device further includes:
  • a third obtaining unit configured to obtain the dependency mapping table of the at least one first-type fragment from media description information.
  • the sending unit further includes:
  • a searching unit configured to search a cache for a second type fragment pointed to by the positioning information of the second type fragment or a reference image included in the second type fragment;
  • a fourth obtaining unit configured to obtain the second-type fragment, or the reference image included in the second-type fragment, from the cache if the second-type fragment or a reference image included in the second-type fragment exists in the cache;
  • a download unit configured to download the second-type fragment from a server if the second-type fragment or a reference image included in the second-type fragment does not exist in the cache.
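The request-handling flow above (look up the dependency mapping table, prefer the cache, fall back to downloading from the server) can be sketched as follows. The table layout, the cache structure, and the `download_fragment` callback are hypothetical stand-ins for the units described in this aspect.

```python
# Illustrative sketch of processing a reference image request. The
# dependency_table maps a reference image number to the positioning
# information (e.g. a URL) of the second-type fragment containing it.

def handle_reference_request(requested_numbers, dependency_table, cache,
                             download_fragment):
    """Return the reference images for the requested reference image numbers."""
    images = {}
    for number in requested_numbers:
        location = dependency_table[number]          # locate the fragment
        if location in cache:                        # search the cache first
            fragment = cache[location]
        else:
            fragment = download_fragment(location)   # fall back to the server
            cache[location] = fragment               # avoid repeat downloads
        images[number] = fragment[number]            # reference image inside it
    return images
```

Caching by positioning information means a second-type fragment shared by several requests is downloaded only once, which is the point of the dependency mapping table.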
  • a seventh aspect of the present invention also provides a method for processing to obtain media data, the method including:
  • A sample entry of the first media data is placed in a first media track, the first media data is time-series media data, and the sample entry includes metadata pointing to samples of the first media data;
  • an access unit entry of the second media data is placed in a second media data box, the access unit entry includes metadata pointing to access units of the second media data, and the second media data is time-series media data or non-time-series media data;
  • at least two temporally discontinuous samples in the first media data are marked as a sample group, and the at least two temporally discontinuous samples satisfy one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data.
  • the method further includes:
  • when the second media data is time-series media data, track dependency information pointing to the second media data box is placed in the first media track, and the track dependency information includes an identifier indicating that the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples.
  • the method further includes:
  • an identifier indicating that the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units is placed in the description information of the sample group in the first media track.
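The sample-group marking described in this aspect can be illustrated with a small sketch. The dictionaries stand in for the actual file-format boxes, and field names such as `access_unit_index` are assumptions; the point is that the group collects temporally discontinuous samples and its description carries the index of the shared access-unit set.

```python
# Sketch of marking temporally discontinuous samples of the first media
# data as one sample group whose description indexes a shared set of
# access units in the second media data.

def mark_sample_group(track, sample_ids, access_unit_index):
    """Group samples that all reference the same set of access units."""
    group_id = len(track["sample_groups"]) + 1
    track["sample_groups"].append({
        "group_id": group_id,
        "samples": sorted(sample_ids),            # need not be contiguous
        "description": {
            # Index information other than presentation time, so the group
            # resolves even when the access units are not time-aligned.
            "access_unit_index": access_unit_index,
        },
    })
    return group_id
```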
  • An eighth aspect of the present invention also provides a method for processing to obtain media data, the method includes:
  • A sample entry of the first media data is placed in a first media track, the first media data is time-series media data, and the sample entry includes metadata pointing to samples of the first media data;
  • an access unit entry of the second media data is placed in a second media data box, the access unit entry includes metadata pointing to access units of the second media data, and the second media data is time-series media data or non-time-series media data;
  • respective dependent metadata is separately placed for each of at least two temporally discontinuous samples in the first media data, and the at least two temporally discontinuous samples satisfy one of the following conditions:
  • the second media data is time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, the index information is information other than the presentation time information of the samples of the first media data, the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, the index information is information other than the presentation time information of the samples of the first media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units.
  • the method further includes:
  • a sample entry of the time-series metadata is placed in the time-series metadata track.
  • the method further includes:
  • the dependent metadata is placed in a fragment index data box.
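A sketch of the per-sample dependent metadata of this aspect follows. Each sample of the first media data carries index information that is not presentation time; field names and the flat dictionary layout are assumptions for illustration, whether the metadata ultimately lives in a time-series metadata track or in a fragment index data box.

```python
# Sketch of attaching dependent metadata per sample (eighth aspect) and of
# recovering which samples share a set of access units.

def attach_dependency_metadata(metadata_track, sample_id, access_unit_index):
    """Place dependent metadata for one sample of the first media data."""
    metadata_track[sample_id] = {"access_unit_index": access_unit_index}

def samples_sharing_units(metadata_track):
    """Group sample ids by the access-unit set they depend on."""
    groups = {}
    for sample_id, meta in metadata_track.items():
        groups.setdefault(meta["access_unit_index"], []).append(sample_id)
    return {k: sorted(v) for k, v in groups.items()}
```

Unlike the sample-group approach of the seventh aspect, the shared dependency here is implicit: it emerges from several samples carrying the same index value.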
  • a ninth aspect of the present invention also provides a method for processing media data, the method includes:
  • First media data and second media data are extracted, the first media data is time-series media data, and the second media data is time-series media data or non-time-series media data;
  • a sample group is extracted from the track to which the first media data belongs, the sample group including at least two temporally discontinuous samples;
  • a set of access units in the second media data is located for each of the at least two temporally discontinuous samples according to the description information of the sample group, and the index information of the set of access units is included in the description information of the sample group; the second media data satisfies one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the second media data is non-time-series media data, and the at least two samples of the first media data are located to the same access unit of the second media data.
  • the method further includes:
  • an identifier in the track dependency information pointing to the data box to which the second media data belongs is parsed from the track to which the first media data belongs, to obtain the information that the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples.
  • the method further includes:
  • the identifier is parsed to obtain the information that the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units.
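The receiving-side resolution of the ninth aspect can be sketched as below: for every sample in the extracted group, the index information in the group description (not presentation time) locates the shared access units in the second media data. The data layout is hypothetical.

```python
# Illustrative resolution of a sample group's shared access units. The
# second_media_data mapping stands in for the second media data box,
# keyed by the same index carried in the group description.

def locate_access_units(sample_group, second_media_data):
    """Return {sample_id: access_unit_set} for every sample in the group."""
    index = sample_group["description"]["access_unit_index"]
    unit_set = second_media_data[index]   # the same set for every sample
    return {sample_id: unit_set for sample_id in sample_group["samples"]}
```

Because the lookup key is an index rather than a timestamp, the resolution works even when the access units overlap none of the samples' time periods.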
  • a tenth aspect of the present invention also provides a method for processing media data, the method includes:
  • First media data and second media data are extracted, the first media data is time-series media data, and the second media data is time-series media data or non-time-series media data;
  • at least two temporally discontinuous samples are extracted from the first media data, and dependent metadata is extracted for each of the at least two temporally discontinuous samples;
  • a set of access units in the second media data is located for each of the at least two temporally discontinuous samples according to the dependent metadata, and the index information of the set of access units is included in the dependent metadata; the second media data satisfies one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the second media data is non-time-series media data, and the at least two samples of the first media data are located to the same access unit of the second media data.
  • extracting the dependent metadata for each of the at least two samples that are temporally discontinuous in the first media data further includes:
  • the dependent metadata is extracted from time-series metadata.
  • extracting the dependent metadata for each of the at least two samples that are temporally discontinuous in the first media data further includes:
  • the dependent metadata is extracted from a fragment index data box.
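For the fragment-index variant of this aspect, extraction can be sketched as a range lookup. The box layout below is a stand-in for a real fragment index data box, not the actual box syntax; `first_sample`, `last_sample`, and `dependency` are invented field names.

```python
# Sketch of extracting dependent metadata from a fragment-index-like box:
# find the index entry whose sample range covers the requested sample.

def extract_dependency_metadata(fragment_index_box, sample_id):
    """Return the dependency entry covering sample_id, or None."""
    for entry in fragment_index_box["entries"]:
        if entry["first_sample"] <= sample_id <= entry["last_sample"]:
            return entry["dependency"]
    return None
```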
  • An eleventh aspect of the present invention also provides a method for transmitting media data, the method includes:
  • The first media data is segmented into media fragmentation units, the first media data is time-series media data, and the first media data includes at least two temporally discontinuous samples;
  • the second media data is time-series media data, the at least two temporally discontinuous samples in the first media data are located to the same second media data access unit, and the second media data access unit is not aligned with the time period of at least one of the at least two samples of the first media data;
  • when the second media data access unit does not exist in the analog cache, the second media data access unit is segmented into a media fragmentation unit;
  • extracting the dependency index information corresponding to the media fragmentation unit of the first media data further includes:
  • the dependency index information corresponding to the media fragmentation unit is extracted from the time-series metadata corresponding to the media fragmentation unit.
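The transmission loop of this aspect can be sketched as follows: before a video-layer fragment is sent, its dependency index is resolved and the referenced knowledge-layer access unit is sent only if the analog (simulated) cache indicates the receiver does not already hold it. The fragment dictionaries and the `send` callback are assumptions for the example.

```python
# Sketch of transmission with an analog cache: a referenced second media
# data access unit is transmitted at most once.

def transmit(fragments, second_media_data, send):
    simulated_cache = set()
    for fragment in fragments:
        dep = fragment["dependency_index"]          # index info, not a timestamp
        if dep is not None and dep not in simulated_cache:
            # The receiver lacks this access unit: send it first, then
            # record that the receiver now holds it.
            send(("knowledge", second_media_data[dep]))
            simulated_cache.add(dep)
        send(("video", fragment["payload"]))
```

The analog cache is what prevents the same knowledge-layer data from being transmitted repeatedly when many temporally discontinuous samples depend on it.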
  • a twelfth aspect of the present invention further provides a device for processing to obtain media data, where the device includes:
  • One or more programs are used to:
  • the processor places a sample entry of the first media data in the first media track, the first media data is time-series media data, and the sample entry includes metadata that points to a sample of the first media data;
  • the processor puts an access unit entry of the second media data in the second media data box, the access unit entry includes metadata pointing to the access unit of the second media data, and the second media data is time-series media data Or non-temporal media data;
  • the processor marks at least two samples that are temporally discontinuous in the first media data as a sample group, and the at least two samples that are temporally discontinuous meet one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data;
  • the media data obtained by the processor as described above is stored in the memory.
  • a thirteenth aspect of the present invention also provides a device for processing and obtaining media data, where the device includes:
  • One or more programs are used to:
  • the processor places a sample entry of the first media data in the first media track, the first media data is time-series media data, and the sample entry includes metadata that points to a sample of the first media data;
  • the processor puts an access unit entry of the second media data in the second media data box, the access unit entry includes metadata pointing to the access unit of the second media data, and the second media data is time-series media data Or non-temporal media data;
  • the processor places separate dependent metadata for each of the at least two temporally discontinuous samples in the first media data, and the at least two temporally discontinuous samples satisfy one of the following conditions :
  • the second media data is time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, the index information is information other than the presentation time information of the samples of the first media data, the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, the index information is information other than the presentation time information of the samples of the first media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units;
  • the media data obtained by the processor as described above is stored in the memory.
  • a fourteenth aspect of the present invention also provides a device for processing media data, where the device includes:
  • One or more programs are used to:
  • the processor processes the media data existing in the memory
  • the processor extracts first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • the processor extracts a sample group from the track to which the first media data belongs, the sample group including at least two samples that are discontinuous in time;
  • the processor locates a set of access units in the second media data for each of the at least two temporally discontinuous samples according to the description information of the sample group, and the index information of the set of access units is included in In the description information of the sample group; wherein the second media data meets one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the second media data is non-time-series media data, and the at least two samples of the first media data are located to the same access unit of the second media data.
  • the fifteenth aspect of the present invention also provides a device for processing media data, where the device includes:
  • One or more programs are used to:
  • the processor processes the media data existing in the memory
  • the processor extracts first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • the processor extracts at least two temporally discontinuous samples from the first media data
  • the processor extracts dependent metadata for each of the at least two samples that are temporally discontinuous in the first media data
  • the processor locates a set of access units in the second media data for each of the at least two temporally discontinuous samples according to the dependent metadata, and index information of the set of access units is included in In the dependent metadata; the second media data satisfies one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the second media data is non-time-series media data, and the at least two samples of the first media data are located to the same access unit of the second media data.
  • a sixteenth aspect of the present invention also provides a device for transmitting media data, where the device includes:
  • One or more programs are used to:
  • the processor processes the media data existing in the memory
  • the processor divides the first media data into media fragmentation units, where the first media data is time-series media data, and the first media data includes at least two samples that are discontinuous in time;
  • the processor extracts dependency index information corresponding to the first media data media fragmentation unit, where the dependency index information is information other than presentation time information of a sample to which the media fragmentation unit belongs;
  • the transmitter transmits the extracted first media data media fragmentation unit
  • the processor locates a second media data access unit according to the dependency index information corresponding to the media fragmentation unit of the first media data, and the second media data access unit is referenced during encoding or decoding of the first media data sample to which the media fragmentation unit belongs; the second media data satisfies one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples in the first media data are located to the same second media data access unit, and the second media data access unit is not aligned with the time period of at least one of the at least two samples of the first media data;
  • the processor searches for the second media data access unit in an analog cache
  • if the second media data access unit does not exist in the analog cache, the processor segments the second media data access unit into a media fragmentation unit;
  • the transmitter transmits the media fragmentation unit into which the second media data access unit is segmented.
  • a seventeenth aspect of the present invention further provides a device for processing and obtaining media data, where the device includes:
  • a first placing unit configured to place a sample entry of the first media data in a first media track, the first media data being time-series media data, and the sample entry including metadata pointing to samples of the first media data;
  • a second placing unit configured to place an access unit entry of the second media data in a second media data box, the access unit entry containing metadata pointing to the access unit of the second media data, the second The media data is time-series media data or non-time-series media data;
  • a marking unit configured to mark at least two samples that are discontinuous in time in the first media data as a sample group, and the at least two samples that are discontinuous in time meet one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data.
  • An eighteenth aspect of the present invention further provides a device for processing and obtaining media data, where the device includes:
  • a first placing unit configured to place a sample entry of the first media data in a first media track, the first media data being time-series media data, and the sample entry including metadata pointing to samples of the first media data;
  • a second placing unit configured to place an access unit entry of the second media data in a second media data box, the access unit entry containing metadata pointing to the access unit of the second media data, the second The media data is time-series media data or non-time-series media data;
  • a third placing unit configured to separately place respective dependent metadata for each of at least two temporally discontinuous samples in the first media data, where the at least two temporally discontinuous samples satisfy one of the following conditions:
  • the second media data is time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, the index information is information other than the presentation time information of the samples of the first media data, the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, the index information is information other than the presentation time information of the samples of the first media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units.
  • a nineteenth aspect of the present invention also provides a device for processing media data, where the device includes:
  • a first extraction unit configured to extract first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • a second extraction unit configured to extract a sample group from a track to which the first media data belongs, where the sample group includes at least two samples that are discontinuous in time;
  • a positioning unit configured to locate a set of access units in the second media data for each of the at least two temporally discontinuous samples according to the description information of the sample group, the index information of the set of access units being included in the description information of the sample group; the second media data satisfies one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the second media data is non-time-series media data, and the at least two samples of the first media data are located to the same access unit of the second media data.
  • the twentieth aspect of the present invention also provides a device for processing media data, where the device includes:
  • a first extraction unit configured to extract first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • a second extraction unit configured to extract at least two temporally discontinuous samples from the first media data
  • a third extraction unit configured to extract dependent metadata for each of at least two samples that are temporally discontinuous in the first media data
  • a positioning unit configured to locate a set of access units in the second media data for each of the at least two temporally discontinuous samples according to the dependent metadata, the index information of the set of access units being included in the dependent metadata; the second media data satisfies one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the second media data is non-time-series media data, and the at least two samples of the first media data are located to the same access unit of the second media data.
  • the twenty-first aspect of the present invention also provides a device for transmitting media data, where the device includes:
  • a first segmentation unit configured to segment the first media data into media fragmentation units, where the first media data is time-series media data and includes at least two temporally discontinuous samples;
  • An extraction unit configured to extract dependency index information corresponding to the first media data media fragmentation unit, where the dependency index information is information other than presentation time information of a sample to which the media fragmentation unit belongs;
  • a first transmission unit configured to transmit the extracted first media data media fragmentation unit
  • a positioning unit configured to locate a second media data access unit according to the dependency index information corresponding to the media fragmentation unit of the first media data, the second media data access unit being referenced during encoding or decoding of the first media data sample to which the media fragmentation unit belongs; the second media data satisfies one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples in the first media data are located to the same second media data access unit, and the second media data access unit is not aligned with the time period of at least one of the at least two samples of the first media data;
  • a searching unit configured to search the second media data access unit in an analog cache
  • a second segmentation unit configured to segment the second media data access unit into a media fragmentation unit if the second media data access unit does not exist in the analog cache;
  • a second transmission unit configured to transmit the media fragmentation unit into which the second media data access unit is segmented.
  • The invention discloses methods and devices for processing to obtain media data, for transmitting media data, and for processing media data. Together they constitute a complete set of methods and devices from the encoding end to the decoding end, ensure the correct decoding and efficient transmission of the video layer code stream data and the knowledge layer code stream data in a code stream produced by the knowledge-base-based video encoding method, and improve transmission efficiency and storage efficiency.
  • Through the method for processing to obtain media data, the video layer code stream, the knowledge layer code stream, and the dependency index relationship between them are placed in the media data or in the file to which the media data belongs. Through the method for transmitting media data, the video layer data and the knowledge layer data are accurately synchronized according to the dependency information between them in media data encoded with the knowledge-base-based encoding method, avoiding duplicate storage and duplicate downloads of the knowledge layer data. Through the method for processing media data, the receiving end extracts the video layer data and the referenced knowledge layer data from media data encoded with the knowledge-base-based encoding method.
  • A reference image is obtained from the processed knowledge layer code stream data and provided for decoding.
  • The decoder assigns a knowledge image in the knowledge layer data as the reference image of an image in the video layer data according to the dependency index information, where the knowledge image does not belong to the set of images to be displayed in the random access segment to which the current image belongs or in the most recent previous adjacent random access segment.
  • FIG. 1 is a schematic diagram of the image dependency relationships of a video sequence segmented into random access segments using the first prior art.
  • FIG. 2 is a schematic diagram of the image dependency relationships of a video sequence segmented into random access segments using the second prior art.
  • FIG. 3 is a schematic diagram of the image dependency relationships of a video sequence segmented into random access segments using the third prior art.
  • FIG. 4 is a schematic diagram of the image dependency relationships of a video sequence segmented into random access segments using the fourth prior art.
  • FIG. 5 is a flowchart of a method for specifying a reference image according to an embodiment of the present invention.
  • FIG. 6 is another flowchart of a method for specifying a reference image according to an embodiment of the present invention.
  • FIG. 7 is a flowchart of a method for processing a reference image request according to an embodiment of the present invention.
  • FIG. 8 is another flowchart of a method for processing a reference image request according to an embodiment of the present invention.
  • FIG. 9 is a schematic system diagram of a method for specifying a reference image and a method for processing a request for a reference image according to an embodiment of the present invention.
  • FIG. 10 is a structural diagram of a device for specifying a reference image according to an embodiment of the present invention.
  • FIG. 11 is a structural diagram of another apparatus for specifying a reference image according to an embodiment of the present invention.
  • FIG. 12 is a structural diagram of a device for processing a reference image request according to an embodiment of the present invention.
  • FIG. 13 is a structural diagram of another device for processing a reference image request according to an embodiment of the present invention.
  • FIG. 14 is a schematic system diagram of a method for specifying a reference image and a method for processing a request for a reference image according to an embodiment of the present invention.
  • FIG. 15 is a schematic system diagram of a method for specifying a reference image and a method for processing a request for a reference image according to an embodiment of the present invention.
  • FIG. 16 is a schematic system diagram of a method for specifying a reference image and a method for processing a request for a reference image according to an embodiment of the present invention.
  • FIG. 17 is a schematic system diagram of a method for specifying a reference image and a method for processing a request for a reference image according to an embodiment of the present invention.
  • FIG. 18 is a schematic structural relationship diagram of media data using a knowledge base encoding method according to an embodiment of the present invention.
  • FIG. 19 is a schematic diagram of a method for processing and obtaining media data according to an embodiment of the present invention.
  • FIG. 20 is a schematic diagram of a method for processing and obtaining media data according to an embodiment of the present invention.
  • FIG. 21 is a schematic diagram of a method for processing and obtaining media data according to an embodiment of the present invention.
  • FIG. 22 is a schematic diagram of a method for processing and obtaining media data according to an embodiment of the present invention.
  • FIG. 23 is a schematic diagram of a method for processing and obtaining media data according to an embodiment of the present invention.
  • FIG. 24 is a schematic diagram of a method for transmitting media data according to an embodiment of the present invention.
  • FIG. 25 is a schematic diagram of a method for transmitting media data according to an embodiment of the present invention.
  • FIG. 26 is a schematic diagram of a method for transmitting media data according to an embodiment of the present invention.
  • FIG. 27 is a schematic diagram of a method for transmitting media data according to an embodiment of the present invention.
  • FIG. 28 is a schematic diagram of a method for transmitting media data according to an embodiment of the present invention.
  • FIG. 29 is a schematic diagram of a method for transmitting media data according to an embodiment of the present invention.
  • a knowledge image is an image outside the set of images to be displayed in the random access segment to which the current image belongs and in the immediately adjacent random access segment.
  • the knowledge image is a reference image that provides a reference for the image to be encoded or the image to be decoded.
  • FIG. 5 shows an example of a process in this embodiment.
  • the embodiment includes:
  • Step 101 The decoder extracts the first identification information in the reference mapping table to determine whether the reference image numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules;
  • Step 102 When the reference image numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules, the decoder extracts the second identification information corresponding to at least one reference index j from the reference mapping table to obtain the numbering rule used by the corresponding reference image number;
  • Step 103 The decoder extracts the reference image number corresponding to the reference index j from the reference mapping table;
  • Step 104 When the numbering rule adopted by the reference picture number is the first numbering rule, the decoder determines the reference picture of the current picture from the reference picture number, using the same numbering rule as the current picture;
  • Step 105 When the numbering rule adopted by the reference picture number is the second numbering rule, the decoder determines the reference picture of the current picture using the reference picture information returned from outside the decoder according to the reference picture number.
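A minimal sketch of steps 101-105 in Python may help fix the logic; the table representation and all names here are hypothetical illustrations, not the actual bitstream syntax:

```python
# Hypothetical sketch of steps 101-105: resolving the reference pictures of
# the current picture from a reference mapping table that may mix two
# numbering rules. The data structures are illustrative only.

FIRST_RULE = 0   # number is relative to the current picture (decoding order)
SECOND_RULE = 1  # number indexes an external knowledge image buffer

def resolve_reference_pictures(ref_map, current_doi, dpb, knowledge_buffer):
    """ref_map: per reference index j, a (rule_flag, number) pair, i.e. the
    second identification information and the reference image number."""
    refs = []
    for rule_flag, number in ref_map:          # steps 102-103
        if rule_flag == FIRST_RULE:            # step 104
            # same numbering rule as the current picture: decoding-order offset
            refs.append(dpb[current_doi - number])
        else:                                  # step 105
            # second rule: reference image information from outside the decoder
            refs.append(knowledge_buffer[number])
    return refs

# toy usage: one short-term reference and one knowledge image
dpb = {9: "pic9"}               # decoded picture buffer keyed by decoding order
knowledge_buffer = {3: "lib3"}  # knowledge image buffer keyed by library_pid
print(resolve_reference_pictures([(FIRST_RULE, 1), (SECOND_RULE, 3)],
                                 10, dpb, knowledge_buffer))  # → ['pic9', 'lib3']
```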
  • the second embodiment provides a method for specifying a reference image. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is:
  • the syntax reference_configuration_set in the syntax table is used to indicate the reference mapping table;
  • the syntax reference_to_library_enable_flag is used to indicate the first identification information;
  • the syntax is_library_pid_flag is used to indicate the second identification information;
  • the syntax library_pid is used to indicate a number using the second numbering rule;
  • the syntax delta_doi_of_reference_picture is used to indicate the difference between a number using the first numbering rule and the number of the current image.
  • Table 1 A syntax example of reference_configuration_set carrying identification information and numbering information
  • Reference knowledge image flag reference_to_library_enable_flag[i]: a binary variable.
  • a value of '1' indicates that at least one reference image of the current image is a knowledge image in the knowledge image buffer area; a value of '0' indicates that the reference image of the current image is not a knowledge image in the knowledge image buffer area.
  • i is the number of the reference image configuration set.
  • the value of ReferenceToLibraryEnableFlag [i] is equal to the value of reference_to_library_enable_flag [i]. If reference_to_library_enable_flag [i] does not exist in the bitstream, the value of ReferenceToLibraryEnableFlag [i] is equal to 0.
  • Number of reference pictures num_of_reference_picture[i]: a 3-bit unsigned integer. Indicates the number of reference images of the current image. The number of reference images should not exceed the size of the reference image buffer. The value of NumOfRefPic[i] is equal to the value of num_of_reference_picture[i]. i is the number of the reference image configuration set.
  • Knowledge image number flag is_library_pid_flag [i] [j]: a binary variable.
  • a value of '1' indicates that the j-th reference image in the current image reference queue is a knowledge image in the knowledge image buffer, and library_pid [i] [j] is used to locate the knowledge image in the knowledge image buffer according to the knowledge reference image index;
  • a value of '0' indicates that the j-th reference image in the current image reference queue is not a knowledge image in the knowledge image buffer, and the image is located in the decoded image buffer according to delta_doi_of_reference_picture [i] [j].
  • i is the number of the reference image configuration set
  • j is the number of the reference image.
  • IsLibraryPidFlag [i] [j] is equal to the value of is_library_pid_flag [i] [j].
  • the value of ReferenceToLibraryOnlyFlag [i] is 1.
  • Knowledge image number library_pid[i][j]: a 6-bit unsigned integer with a value ranging from 0 to 63.
  • Indicates the number, in the knowledge image buffer, of the j-th reference image in the current image reference queue. i is the number of the reference image configuration set, and j is the number of the reference image.
  • the value of LibraryPid[i][j] is equal to the value of library_pid[i][j].
  • Reference image decoding order offset delta_doi_of_reference_picture[i][j]: a 6-bit unsigned integer with a value ranging from 1 to 63. Indicates the difference in decoding order between the j-th reference picture in the current picture's reference picture queue and the current picture. i is the number of the reference image configuration set, and j is the number of the reference image. For the same reference picture configuration set, the decoding order offsets of differently numbered reference pictures should be different. The value of DeltaDoiOfRefPic[i][j] is equal to the value of delta_doi_of_reference_picture[i][j].
  • delta_doi_of_reference_picture[i][j] represents the relative number of the reference image, where delta_doi_of_reference_picture[i][j] is an integer coded with a fixed-length code of a specific length, for example a 6-bit fixed-length code;
  • the numbering uses the second numbering rule, such as library_pid[i][j] representing the number of the reference image, where library_pid[i][j] is an integer coded with a fixed-length code of a specific length, for example a 6-bit fixed-length code.
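Using the field widths stated above (1-bit flags, a 3-bit reference count, 6-bit numbers), the reference_configuration_set of Table 1 could be parsed roughly as follows; the BitReader helper and the exact field order are assumptions for illustration, not the normative parsing process:

```python
class BitReader:
    """Tiny MSB-first bit reader over a '0'/'1' string (illustrative only)."""
    def __init__(self, bits):
        self.bits = bits.replace(" ", "")
        self.pos = 0

    def read(self, n=1):
        val = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return val

def parse_reference_configuration_set(r):
    # field widths taken from the semantics above; the ordering is assumed
    cfg = {"reference_to_library_enable_flag": r.read(1),
           "num_of_reference_picture": r.read(3),
           "refs": []}
    for _ in range(cfg["num_of_reference_picture"]):
        # is_library_pid_flag is only present under the mixed numbering rule
        is_lib = r.read(1) if cfg["reference_to_library_enable_flag"] else 0
        field = "library_pid" if is_lib else "delta_doi_of_reference_picture"
        cfg["refs"].append((field, r.read(6)))
    return cfg

# enable=1, two references: (is_lib=1, library_pid=3), (is_lib=0, delta_doi=1)
r = BitReader("1 010 1 000011 0 000001")
print(parse_reference_configuration_set(r)["refs"])
# → [('library_pid', 3), ('delta_doi_of_reference_picture', 1)]
```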
  • the third embodiment provides a method for specifying a reference image. This embodiment is obtained by changing the first embodiment.
  • the difference from the first embodiment is that, in the H.265 standard, the first numbering rule uses the delta_poc_s0_minus1 or delta_poc_s1_minus1 syntax elements to indicate the relative number of images in output order, where the relative number indicates the difference in output-order number between the referenced image and the current image.
  • a method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is that the first numbering rule is a numbering rule related to the display order;
  • for example, numbers are assigned to images according to rules including, but not limited to, the display order, decoding order, and output order of images.
  • a method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is that the second numbering rule is a numbering rule unrelated to the display order, such as assigning numbers to images according to rules including, but not limited to, the order in which images are generated, extracted, or used, or a random order.
  • Sixth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is that the image set using the first numbering rule refers to the set of images used for display or output in the video sequence to which the current image belongs.
  • a method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is that the image set using the first numbering rule includes at least one of an intra-coded image and an inter-coded image.
  • An eighth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is that the image set using the second numbering rule refers to the set of knowledge images.
  • the ninth embodiment A method for specifying a reference image is provided. This embodiment is obtained based on the eighth embodiment.
  • a knowledge image may include, but is not limited to, at least one of: a background image in the video sequence, a scene switching image in the video sequence, an image obtained by modeling images in the video sequence, and an image synthesized from images in the video sequence, where the background image can be obtained by background modeling of the video sequence.
  • the scene switching image is obtained by performing scene switching detection on the video sequence.
  • Tenth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the eighth embodiment. The difference from the eighth embodiment is that the knowledge image is stored in a second cache different from the first cache that stores images numbered using the first numbering rule; for example, the second cache is a knowledge image cache.
  • Eleventh embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the tenth embodiment. The difference from the tenth embodiment is that the maximum buffer capacity is the sum of the maximum capacities of the first cache and the second cache.
  • Twelfth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is that, in the image set included in the bitstream to which the reference mapping table belongs, the number corresponding to the reference index in the reference mapping table of at least one image uses a mixed numbering rule; that is, the at least one image uses at least one knowledge image as a reference image.
  • a thirteenth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment, and differs from the first embodiment in that, in the image set included in the bitstream to which the reference mapping table belongs, the number corresponding to the reference index in the reference mapping table of at least one image A uses the first numbering rule, and the number corresponding to the reference index in the reference mapping table of at least one other image B uses the second numbering rule; that is, image B uses only knowledge images as reference images.
  • a method for specifying a reference image is provided. This embodiment is obtained by changing the first embodiment. The difference from the first embodiment is that the reference mapping table is carried in at least one of the sequence header, the picture header, and the slice header.
  • FIG. 6 shows a flow example of this embodiment, which is the same as the first embodiment. The difference is that before performing step 101, the method further includes an update method of a reference mapping table, including:
  • Step 201 The decoder extracts a reference mapping update table to obtain a number corresponding to at least one reference index j and second identification information.
  • Step 202 When the reference index j in the reference mapping update table exists in the reference mapping table, replace the number and the second identification information corresponding to the reference index j in the reference mapping table with the number and the second identification information corresponding to the reference index j in the reference mapping update table;
  • Step 203 When the reference index j in the reference mapping update table does not exist in the reference mapping table, add the reference index j and the reference index in the reference mapping update table to the reference mapping table. Its corresponding number and second identification information.
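Steps 201-203 amount to a keyed merge of the update table into the reference mapping table; a minimal sketch follows (the dict representation is an assumption for illustration, not the coded form of the tables):

```python
def update_reference_mapping_table(ref_map, update_table):
    """Both tables map a reference index j to a (number,
    second_identification_info) pair. Replacing an existing entry
    (step 202) and adding a missing one (step 203) collapse into a
    single assignment per index."""
    for j, entry in update_table.items():  # step 201: entries of the update table
        ref_map[j] = entry                 # steps 202-203
    return ref_map

ref_map = {0: (1, "first_rule"), 1: (3, "second_rule")}
update = {1: (5, "second_rule"), 2: (2, "first_rule")}
print(update_reference_mapping_table(ref_map, update))
# index 1 is replaced, index 2 is added, index 0 is untouched
```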
  • a method for specifying a reference image is provided. This embodiment is obtained by changing the fifteenth embodiment.
  • the difference from the fifteenth embodiment is that the reference mapping update table includes only at least one reference index and its number using the second numbering rule; in this case, when the number corresponding to a reference index in the reference mapping table is updated by the reference mapping update table, the number in the reference mapping table is identified as using the second numbering rule.
  • Seventeenth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the fifteenth embodiment. The difference from the fifteenth embodiment is that the reference mapping update table is carried in the picture header or the slice header.
  • An eighteenth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing from the first embodiment. Unlike the first embodiment, the method further includes:
  • Step 301 When the decoder decodes the current image using the reference image pointed to by the number using the second numbering rule, the decoder sets the distance between the reference image and the current image to a non-time domain distance.
  • a method for specifying a reference image is provided. This embodiment is obtained by changing the eighteenth embodiment. The difference from the eighteenth embodiment is that the non-time domain distance is a given fixed non-zero value.
  • the twentieth embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing the eighteenth embodiment. The difference from the eighteenth embodiment is that the non-time domain distance is a non-zero value calculated according to the similarity between the reference image pointed to by the number using the second numbering rule and the current image.
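One hypothetical mapping from similarity to a non-zero, non-time-domain distance is sketched below; the scale factor and clamping are illustrative assumptions, not part of the embodiment:

```python
def non_temporal_distance(similarity, scale=16, min_distance=1):
    """Map a similarity in [0, 1] between the knowledge image and the
    current image to a non-zero distance: more similar images get a
    smaller distance, clamped so it never reaches zero."""
    return max(min_distance, round(scale * (1.0 - similarity)))

print(non_temporal_distance(0.9))  # → 2
print(non_temporal_distance(1.0))  # clamped to the minimum → 1
```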
  • the twenty-first embodiment A method for specifying a reference image is provided. This embodiment is obtained by changing from the first embodiment. The difference from the first embodiment is that before step 101 is performed, The method also includes:
  • Step 401 The decoder extracts the third identification information to determine whether the first identification information exists in the reference mapping table.
  • the twenty-second embodiment a method for specifying a reference image is provided. This embodiment is obtained by changing the second embodiment. The difference from the second embodiment is:
  • library_picture_enable_flag is used to indicate the third identification information.
  • the syntax example is shown in italics in Table 2, and the corresponding reference_configuration_set is used to indicate the reference mapping table.
  • Table 2 carries a syntax example of the third identification information
  • Knowledge image enable flag library_picture_enable_flag: a binary variable.
  • a value of '1' means that the video sequence may contain knowledge images and images are allowed to use images in the knowledge buffer as reference images; a value of '0' means that the video sequence should not contain knowledge images and images are not allowed to use images in the knowledge buffer as reference images.
  • the value of LibraryPictureEnableFlag is equal to the value of library_picture_enable_flag.
  • reference_to_library_enable_flag [i] exists in the reference_configuration_set (i).
  • when the value of reference_to_library_enable_flag[i] is 1, it indicates that the numbering described by reference_configuration_set(i) uses a mixed numbering rule.
  • delta_doi_of_reference_picture[i][j] represents the relative number of the reference image, where delta_doi_of_reference_picture[i][j] is an integer coded with a fixed-length code of a specific length, for example a 6-bit fixed-length code;
  • the numbering uses the second numbering rule, such as library_pid[i][j] representing the number of the reference image, where library_pid[i][j] is an integer coded with a fixed-length code of a specific length, for example a 6-bit fixed-length code.
  • FIG. 7 shows a flowchart of this embodiment, and the embodiment includes:
  • Step 501 Obtain a dependency mapping table of at least one first-type fragment, where the dependency mapping table describes the mapping relationship between the number of at least one reference image on which the first-type fragment depends and the positioning information of the second-type fragment to which the at least one reference image belongs;
  • Step 502 Receive the reference image request information sent by the decoder to obtain the number of at least one reference image on which the current image depends;
  • Step 503 Obtain, from the dependency mapping table of the first-type fragment to which the current image belongs, the positioning information of the second-type fragment to which the reference image pointed to by at least one reference image number in the reference image request information belongs;
  • Step 504 Use the positioning information of the second-type fragment to send to the decoder the information of the knowledge image contained in the second-type fragment pointed to by the positioning information.
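Steps 502-504 can be sketched as a lookup-then-fetch loop; the dict-based dependency table and the `fetch` callable are hypothetical stand-ins for the real dependency mapping table and transport layer:

```python
def handle_reference_image_request(dependency_map, requested_numbers, fetch):
    """dependency_map: reference image number -> positioning info (e.g. a URL)
    of the second-type fragment holding that knowledge image.
    requested_numbers: numbers from the decoder's request (step 502).
    fetch: callable returning the knowledge image info for a location."""
    results = []
    for number in requested_numbers:
        location = dependency_map[number]   # step 503: look up positioning info
        results.append(fetch(location))     # step 504: send image info back
    return results

# toy usage with a fake fetcher
dep = {7: "http://example.com/lib_seg_7.mp4"}
print(handle_reference_image_request(dep, [7], lambda url: "image-from-" + url))
```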
  • a method for processing a reference image request is provided. This embodiment is obtained by changing from the twenty-third embodiment. The difference from the twenty-third embodiment is:
  • Step 601 Obtain a dependency mapping table of at least one fragment of the first type from the media description information.
  • the twenty-fifth embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing the twenty-fourth embodiment. The difference from the twenty-fourth embodiment is:
  • the fragment dependency descriptor is used to indicate the dependency mapping information of the fragment to which it belongs, where the fragment dependency descriptor is indicated by the dependent_segment descriptor;
  • the attribute of the dependent_segment descriptor is represented by @dependent_segment_indicator, where the @dependent_segment_indicator attribute describes the positioning information of the second-type fragment on which the first-type fragment to which the dependent_segment descriptor belongs depends, and the knowledge image it contains;
  • the numbering information is described by the attribute @pictureID, and the positioning information is described by the attribute @dependentSegmentURL.
  • Table 4 shows an example of the syntax of the fragment dependency descriptor.
  • the twenty-sixth embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing from the twenty-fifth embodiment. The difference from the twenty-fifth embodiment is:
  • the sample entry data box LayerSMTHintSampleEntry is used to describe the sample entry of the code stream where the knowledge images and sequence images are located, and the syntax is_library_layer identifies whether the code stream contains knowledge images or sequence images.
  • the sample data box LayerMediaSample describes a sample of the code stream to which a sequence image belongs, and the data box LayerInfo describes the code stream and sample number information of the knowledge image on which the code stream sample depends, where library_layer_in_ceu_sequence_number describes the number of the packaging unit (CEU) to which the dependent knowledge image belongs, and library_layer_in_mfu_sequence_number describes the number of the packaging unit (MFU) within it.
  • a value of 1 indicates that the CEU contains one MFU per hint sample; a value of 0 indicates that each CEU contains only one sample.
  • a value of 1 indicates that the media data is knowledge layer media data and contains the code stream of the knowledge image; a value of 0 indicates that the media data is sequence layer media data and contains the code stream of the sequence image.
  • samplenumber: the number of the sample from which the MFU was extracted. samplenumber represents the sample corresponding to the n-th 'moof' box accumulated in the CEU. The samplenumber of the first sample in the CEU should be 0.
  • library_layer_in_ceu_sequence_number: describes the number, in the knowledge layer media resource, of the CEU containing the MFU on which this MFU depends for decoding.
  • library_layer_in_mfu_sequence_number: describes the number, within its CEU, of the MFU needed for decoding this MFU.
  • the twenty-seventh embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing the twenty-third embodiment. As shown in FIG. 8, it differs from the twenty-third embodiment in that
  • step 504, which uses the positioning information of the second-type fragment to send to the decoder the information of the knowledge image contained in the second-type fragment pointed to by the positioning information, further includes:
  • Step 701 Search the cache for the second-type fragment pointed to by the positioning information of the second-type fragment, or for the knowledge image contained in the second-type fragment;
  • Step 702 If the second-type fragment or the knowledge image contained in the second-type fragment exists in the cache, obtain the second-type fragment or the knowledge image contained in the second-type fragment from the cache;
  • Step 703 If the second-type fragment or the knowledge image contained in the second-type fragment does not exist in the cache, download the second-type fragment from the server.
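Steps 701-703 describe a standard cache-aside pattern; a minimal sketch follows (the `download` callable is a hypothetical stand-in for the HTTP request of the thirty-second embodiment):

```python
def get_knowledge_fragment(location, cache, download):
    """Return the second-type fragment for `location`, preferring the
    local cache (steps 701-702) and falling back to a server download
    (step 703), which is then cached for later requests."""
    if location in cache:
        return cache[location]       # steps 701-702: cache hit
    fragment = download(location)    # step 703: cache miss, download
    cache[location] = fragment
    return fragment

cache = {}
first = get_knowledge_fragment("lib_seg_7", cache, lambda u: "data(" + u + ")")
second = get_knowledge_fragment("lib_seg_7", cache, lambda u: "fresh")
print(first, second)  # second call is served from the cache, not re-downloaded
```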
  • the twenty-eighth embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing the twenty-third embodiment. The difference from the twenty-third embodiment is that the second-type fragment contains a knowledge image.
  • the twenty-ninth embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing the twenty-third embodiment. The difference from the twenty-third embodiment is that the positioning information may be one of, including but not limited to, a Uniform Resource Locator (URL) and a Uniform Resource Identifier (URI).
  • URL: Uniform Resource Locator
  • URI: Uniform Resource Identifier
  • the thirtieth embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing the twenty-third embodiment. The difference from the twenty-third embodiment is that the information of the knowledge image contained in the second-type fragment pointed to by the positioning information sent to the decoder is the pixel values of the knowledge image.
  • the thirty-first embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing the twenty-third embodiment. The difference from the twenty-third embodiment is that the information of the knowledge image contained in the second-type fragment pointed to by the positioning information sent to the decoder is the storage location of the knowledge image.
  • the thirty-second embodiment A method for processing a reference image request is provided. This embodiment is obtained by changing the twenty-third embodiment. The difference from the twenty-third embodiment is that an HTTP request is sent to the server using the HTTP transport protocol to download the second-type fragment from the server.
  • the thirty-third embodiment A system method for specifying a reference image and processing a reference image request is provided. This embodiment is obtained by changing the first embodiment and the twenty-third embodiment. The differences from the first embodiment and the twenty-third embodiment are:
  • the sequence encoder 1002 receives the video sequence to be encoded and encodes the images in encoding order; if the image currently to be encoded refers to at least one knowledge image, the sequence encoder 1002 selects at least one knowledge image from the locally available knowledge image set to construct a reference image queue for the image to be encoded, and informs the knowledge image encoder 1003 of the local number of the knowledge image; the knowledge image encoder 1003 encodes and reconstructs the knowledge image according to the knowledge image number, providing the reconstructed knowledge image to the sequence encoder 1002; the server manager 1004 receives from the sequence encoder 1002 the local number of the knowledge image and the number of the knowledge image in the bitstream (for example, represented by LibPID), receives from the sequence fragment organizer 1005 the positioning information (for example, represented by SeqURL) of the random access segment to which the currently encoded image belongs, and receives the local number of the knowledge image from the knowledge image encoder 1003.
  • the MPD parser 1009 receives the MPD sent by the server 1001 and parses and obtains a dependency mapping table of at least one sequence segment.
  • the client manager 1010 determines, according to the current playback time, the SeqURL of the sequence fragment that needs to be downloaded; the sequence fragment downloader 1011 downloads the sequence fragment from the server 1001 according to the SeqURL; the sequence decoder 1012 receives the sequence fragment and parses the bitstream therein, and judges, according to the reference mapping table carried in the bitstream, whether the current image to be decoded depends on a knowledge image.
  • the sequence decoder 1012 sends the knowledge image request information to the client manager 1010 according to the LibPID of the dependent knowledge image in the reference mapping table; the client manager 1010, according to the LibPID of the knowledge image in the request information, finds the LibURL corresponding to the LibPID in the dependency mapping table of the sequence segment to which the currently decoded image belongs; the knowledge image manager 1013 receives the LibURL and, in one possible method, searches the local knowledge cache for whether the knowledge image contained in the knowledge segment pointed to by the LibURL exists, and if so, extracts the corresponding knowledge image from the knowledge cache and provides it to the sequence decoder 1012.
  • otherwise, the knowledge segment is downloaded from the server 1001 according to the LibURL, and the knowledge image contained in it is decoded and provided to the sequence decoder 1012; the sequence decoder 1012 decodes the current image using the obtained knowledge image, and displays or outputs the current image.
  • a thirty-fourth embodiment A device for specifying a reference image is provided. As shown in FIG. 10, the device includes:
  • a first extraction unit 11, configured to extract the first identification information in a reference mapping table to determine whether the numbers corresponding to the reference indices in the reference mapping table use a mixed numbering rule;
  • a second extraction unit 12, configured to extract, when the numbers corresponding to the reference indices in the reference mapping table use a mixed numbering rule, the second identification information corresponding to at least one reference index j from the reference mapping table to obtain the numbering rule used by the corresponding number;
  • a third extraction unit 13, configured to extract the reference image number corresponding to the reference index j from the reference mapping table;
  • a first determining unit 14, configured to determine the reference image of the current image using the reference image number when the numbering rule adopted by the reference image number is the first numbering rule;
  • a second determining unit 15, configured to determine the reference image of the current image using the reference image information returned from outside the decoder according to the reference image number when the numbering rule adopted by the reference image number is the second numbering rule.
  • the thirty-fifth embodiment A device for specifying a reference image is provided. This embodiment is changed from the thirty-fourth embodiment. The difference from the thirty-fourth embodiment is:
  • the syntax reference_configuration_set is used to indicate the reference mapping table.
  • the first extraction unit 11 is used to extract the syntax reference_to_library_enable_flag from the reference_configuration_set to determine whether the numbers corresponding to the reference indices in the reference mapping table use a mixed numbering rule; when they do, the second extraction unit 12 is configured to extract the second identification information corresponding to at least one reference index j from the reference_configuration_set to obtain the numbering rule used by the number corresponding to the reference index j; the third extraction unit 13 is configured to extract the reference image number library_pid or delta_doi_of_reference_picture corresponding to the reference index j from the reference_configuration_set; if the third extraction unit 13 extracts the reference image number delta_doi_of_reference_picture, the first determining unit 14 is configured to determine the reference image of the current image from the reference image number, using the same numbering rule as the current image; if the third extraction unit 13 extracts the reference image number library_pid, the second determining unit 15 is configured to determine the reference image of the current image using the reference image information returned from outside the decoder.
  • the thirty-sixth embodiment A device for specifying a reference image is provided. This embodiment is changed from the thirty-fourth embodiment. The difference from the thirty-fourth embodiment is that the reference mapping table used by the first extraction unit 11, the second extraction unit 12, and the third extraction unit 13 is carried in at least one of the sequence header, the picture header, and the slice header.
  • a device for specifying a reference image is provided. This embodiment is changed from the thirty-fourth embodiment. As shown in FIG. 11, the difference from the thirty-fourth embodiment is that the device further comprises:
  • a fourth extraction unit 21 configured to extract a reference mapping update table to obtain a number corresponding to at least one reference index j and second identification information;
  • a replacing unit 22 configured to: when the reference index j in the reference mapping update table exists in the reference mapping table, replace the number corresponding to the reference index j in the reference mapping table and the second identification information Replacing with the number corresponding to the reference index j and the second identification information in the reference mapping update table;
  • an adding unit 23 configured to, when the reference index j in the reference mapping update table does not exist in the reference mapping table, add the reference index j and its corresponding number and second identification information from the reference mapping update table to the reference mapping table.
  • The thirty-eighth embodiment provides a device for specifying a reference image. This embodiment is changed from the thirty-seventh embodiment. The difference from the thirty-seventh embodiment is that the replacement unit 22 is further configured to replace the number corresponding to the reference index j in the reference mapping table with the number corresponding to the reference index j in the reference mapping update table, and to mark the second identification information corresponding to the reference index j in the reference mapping table as using the second numbering rule.
  • The thirty-ninth embodiment provides a device for specifying a reference image. This embodiment is changed from the thirty-seventh embodiment. The difference from the thirty-seventh embodiment is that, when the reference mapping update table includes only at least one reference index and its number using the second numbering rule, the adding unit 23 is further configured to add the reference index j and its corresponding number from the reference mapping update table to the reference mapping table, and to mark the second identification information corresponding to the reference index j in the reference mapping table as using the second numbering rule.
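The replace-or-add behavior of units 22 and 23 across the thirty-seventh to thirty-ninth embodiments can be modeled with a short sketch. The table representation, a dict from reference index to a (number, uses_second_rule) pair, is an assumption for illustration only.

```python
def apply_update(reference_map, update_map):
    """Merge a reference mapping update table into the reference mapping table.

    Both tables map a reference index j to a (number, uses_second_rule)
    pair, where uses_second_rule models the second identification
    information. Entries whose index already exists are replaced
    (replacing unit 22); new indices are added (adding unit 23).
    Hypothetical model, not normative syntax.
    """
    for j, (number, uses_second_rule) in update_map.items():
        reference_map[j] = (number, uses_second_rule)  # replace or add
    return reference_map
```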
  • The fortieth embodiment provides a device for specifying a reference image. This embodiment is changed from the thirty-fourth embodiment. The difference from the thirty-fourth embodiment is that the device further includes:
  • a setting unit 33 is configured to set the distance between the reference image and the current image to a non-time domain distance when the decoder decodes the current image using the reference image pointed by the number using the second numbering rule.
  • The forty-first embodiment provides a device for specifying a reference image. This embodiment is changed from the fortieth embodiment. The difference from the fortieth embodiment is that the setting unit 33 is further configured to set the distance between the reference image and the current image to a given fixed non-zero value.
  • The forty-second embodiment provides a device for specifying a reference image. This embodiment is changed from the fortieth embodiment. The difference from the fortieth embodiment is that the setting unit 33 is further configured to set the distance between the reference image and the current image to a non-zero value calculated according to the similarity between the reference image pointed to by the number using the second numbering rule and the current image.
  • The forty-third embodiment provides a device for specifying a reference image. This embodiment is changed from the thirty-fourth embodiment. The difference from the thirty-fourth embodiment is that the device further includes:
  • the fifth extraction unit 41 is configured to extract third identification information to obtain whether the first identification information exists in the reference mapping table.
  • The forty-fourth embodiment provides a device for specifying a reference image. This embodiment is changed from the forty-third embodiment. The difference from the forty-third embodiment is that, in the AVS3 standard, the fifth extraction unit 41 is further configured to extract the third identification information represented by library_picture_enable_flag from the sequence header.
  • The forty-fifth embodiment provides an apparatus for processing a reference image request. As shown in FIG. 12, the apparatus includes:
  • the first obtaining unit 51, configured to obtain a dependency mapping table of at least one first-type fragment, where the dependency mapping table describes the mapping relationship between the number of the at least one reference image on which the first-type fragment depends and the positioning information of the second-type fragment to which the at least one reference image belongs;
  • a receiving unit 52 configured to receive reference image request information sent by a decoder to obtain a number of at least one reference image on which a current image depends;
  • a second obtaining unit 53, configured to obtain, from the dependency mapping table of the first-type fragment to which the current image belongs, the positioning information of the second-type fragment to which the reference image pointed to by at least one reference image number in the reference image request information belongs;
  • a sending unit 54, configured to use the positioning information of the second-type fragment to send to the decoder information of a knowledge image included in the second-type fragment pointed to by the positioning information.
  • The forty-sixth embodiment provides a device for processing a reference image request. This embodiment is changed from the forty-fifth embodiment. The difference from the forty-fifth embodiment is that the device further includes:
  • the third obtaining unit 61 is configured to obtain a dependency mapping table of at least one fragment of the first type from the media description information.
  • The forty-seventh embodiment provides a device for processing a reference image request. This embodiment is changed from the forty-sixth embodiment. The difference from the forty-sixth embodiment is:
  • the third obtaining unit 61 is further configured to obtain at least one dependent_segment descriptor of the first-type fragment from the MPD, and to obtain the dependency mapping table of the first-type fragment from at least one dependent_segment_indicator attribute of the dependent_segment descriptor.
  • The forty-eighth embodiment provides a device for processing a reference image request. This embodiment is changed from the forty-fifth embodiment. As shown in FIG. 13, the difference from the forty-fifth embodiment is that the sending unit 54 further includes:
  • a searching unit 71 configured to search, according to the positioning information of the second type fragment, a second type fragment pointed to by the positioning information or a knowledge image included in the second type fragment;
  • a fourth obtaining unit 72, configured to obtain the second-type fragment, or the knowledge image contained in the second-type fragment, from the cache if it exists in the cache;
  • the downloading unit 73 is configured to download the second-type fragment from the server if the second-type fragment or the knowledge image contained in the second-type fragment does not exist in the cache.
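The behavior of the searching unit 71, the fourth obtaining unit 72, and the downloading unit 73 amounts to a cache-or-download lookup, which can be sketched as follows; `cache` and `download` are hypothetical stand-ins for the local knowledge cache and the server request.

```python
def fetch_knowledge_segment(url, cache, download):
    """Return the second-type (knowledge) segment for `url`.

    `cache` is a dict acting as the local knowledge cache and
    `download` is a callable that fetches the segment from the server
    (e.g. via an HTTP request). Hypothetical sketch of units 71-73.
    """
    if url in cache:            # searching unit 71: hit in the cache
        return cache[url]       # fourth obtaining unit 72
    segment = download(url)     # downloading unit 73: fetch from server
    cache[url] = segment        # keep the segment for later reuse
    return segment
```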
  • The forty-ninth embodiment provides a device for processing a reference image request. This embodiment is changed from the forty-fifth embodiment. The difference from the forty-fifth embodiment is that the sending unit 54 is further configured to send the pixel values of a knowledge image contained in the second-type fragment pointed to by the positioning information to the decoder.
  • The fiftieth embodiment provides a device for processing a reference image request. This embodiment is changed from the forty-fifth embodiment. The difference from the forty-fifth embodiment is that the sending unit 54 is further configured to send the storage location of a knowledge image included in the second-type fragment pointed to by the positioning information to the decoder.
  • The fifty-first embodiment provides a device for processing a reference image request. This embodiment is changed from the forty-eighth embodiment. The difference from the forty-eighth embodiment is that the download unit 73 is further configured to send an HTTP request to the server using the HTTP transmission protocol to download the second-type fragment from the server.
  • The fifty-second embodiment provides a system for specifying a reference image and processing a reference image request. This embodiment is obtained by combining and changing the thirty-fourth embodiment and the forty-fifth embodiment. The differences from the thirty-fourth embodiment and the forty-fifth embodiment are:
  • The MPD parser 2001 receives the MPD, parses it, and obtains a dependency mapping table of at least one sequence segment; the manager 2002 determines the SeqURL of the sequence segment to be downloaded according to the current playback time; the sequence segment downloader 2003 downloads the sequence segment according to the SeqURL; the sequence decoder 2004 receives the sequence segment, parses the bit stream in it, and determines whether the current image to be decoded depends on a knowledge image according to the reference mapping table carried in the bit stream. If so, the sequence decoder 2004 sends knowledge image request information to the manager 2002 according to the LibPID of the dependent knowledge image in the reference mapping table; according to the LibPID of the knowledge image in the request information, the manager 2002 finds and obtains the LibURL corresponding to the LibPID in the dependency mapping table of the sequence segment to which the currently decoded image belongs. The knowledge image manager 2005 receives the LibURL and searches the local knowledge cache for the knowledge image contained in the knowledge segment pointed to by the LibURL. If it exists, the knowledge image manager extracts the corresponding knowledge image from the knowledge cache and provides it to the sequence decoder 2004; if it does not exist, it downloads the knowledge segment according to the LibURL, decodes it to obtain the contained knowledge image, and provides the knowledge image to the sequence decoder 2004. The sequence decoder 2004 uses the obtained knowledge image to decode the currently decoded image and displays or outputs the current image.
  • The fifty-third embodiment provides a system for specifying a reference image and processing a reference image request. This embodiment is obtained by combining and changing the thirty-fourth embodiment and the forty-fifth embodiment. The differences from the thirty-fourth embodiment and the forty-fifth embodiment are:
  • The MPD parser 3001 receives the MPD, parses it, and obtains a dependency mapping table of at least one sequence segment; the manager 3002 determines the SeqURL of the sequence segment to be downloaded according to the current playback time; the sequence segment downloader 3003 downloads the sequence segment according to the SeqURL; the sequence decoder 3004 receives the sequence segment, parses the bit stream in it, and determines whether the current image to be decoded depends on a knowledge image according to the reference mapping table carried in the bit stream. If so, the sequence decoder 3004 sends knowledge image request information to the manager 3002 according to the LibPID of the dependent knowledge image in the reference mapping table; according to the LibPID of the knowledge image in the request information, the manager 3002 finds and obtains the LibURL corresponding to the LibPID in the dependency mapping table of the sequence segment to which the currently decoded image belongs. The manager 3002 then uses the LibURL to check whether the knowledge image contained in the knowledge segment pointed to by the LibURL exists in the local knowledge cache 3005. If it exists, the manager returns the storage address of the knowledge image in the knowledge cache 3005 to the sequence decoder 3004; if it does not exist, it downloads the knowledge segment using the LibURL, decodes it to obtain the contained knowledge image, stores the reconstructed knowledge image in the knowledge cache 3005, and returns the storage address of the knowledge image in the knowledge cache 3005 to the sequence decoder 3004. The sequence decoder 3004 uses the returned storage address to obtain the knowledge image from the knowledge cache 3005 for decoding the currently decoded image, and displays or outputs the current image.
  • The fifty-fourth embodiment provides a system for specifying a reference image and processing a reference image request. This embodiment is obtained by combining and changing the thirty-fourth embodiment and the forty-fifth embodiment. The differences from the thirty-fourth embodiment and the forty-fifth embodiment are:
  • The MPD parser 4001 receives the MPD, parses it, and obtains a dependency mapping table of at least one sequence segment; the manager 4002 determines the SeqURL of the sequence segment to be downloaded according to the current playback time; the sequence segment downloader 4003 downloads the sequence segment according to the SeqURL; the sequence decoder 4004 receives the sequence segment, parses the bit stream in it, and determines whether the current image to be decoded depends on a knowledge image according to the reference mapping table carried in the bit stream. If so, the sequence decoder 4004 sends knowledge image request information to the manager 4002 according to the LibPID of the dependent knowledge image in the reference mapping table; according to the LibPID of the knowledge image in the request information, the manager 4002 finds and obtains the LibURL corresponding to the LibPID in the dependency mapping table of the sequence segment to which the currently decoded image belongs. The manager 4002 then uses the LibURL to check whether the knowledge image contained in the knowledge segment pointed to by the LibURL exists in the local knowledge cache 4005. If it exists, the manager obtains the knowledge image from the knowledge cache 4005 and returns it to the sequence decoder 4004; if it does not exist, it downloads the knowledge segment using the LibURL, decodes it to obtain the knowledge image, stores the reconstructed knowledge image in the knowledge cache 4005, and returns the knowledge image to the sequence decoder 4004. The sequence decoder 4004 uses the returned knowledge image to decode the currently decoded image, and displays or outputs the current image.
  • The fifty-fifth embodiment provides a system for specifying a reference image and processing a reference image request. This embodiment is obtained by combining and changing the thirty-fourth embodiment and the forty-fifth embodiment. The differences from the thirty-fourth embodiment and the forty-fifth embodiment are:
  • The MPD parser 5001 receives the MPD, parses it, and obtains a dependency mapping table of at least one sequence segment; the manager 5002 determines the SeqURL of the sequence segment to be downloaded according to the current playback time; the sequence segment downloader 5003 downloads the sequence segment according to the SeqURL; the sequence decoder 5004 receives the sequence segment, parses the bit stream in it, and determines whether the current image to be decoded depends on a knowledge image according to the reference mapping table carried in the bit stream. If so, the sequence decoder 5004 sends knowledge image request information to the manager 5002 according to the LibPID of the dependent knowledge image in the reference mapping table; according to the LibPID of the knowledge image in the request information, the manager 5002 finds and obtains the LibURL corresponding to the LibPID in the dependency mapping table of the sequence segment to which the currently decoded image belongs. The manager 5002 then uses the LibURL to check whether the knowledge image code stream contained in the knowledge segment pointed to by the LibURL exists in the local knowledge cache 5005. If it exists, the manager obtains the knowledge image code stream from the knowledge cache 5005, decodes the knowledge image, and returns it to the sequence decoder 5004; if it does not exist, it downloads the knowledge segment using the LibURL, stores the knowledge image code stream contained in the knowledge segment in the knowledge cache 5005, decodes the knowledge image, and returns it to the sequence decoder 5004. The sequence decoder 5004 uses the returned knowledge image to decode the currently decoded image, and displays or outputs the current image.
  • The fifty-sixth embodiment provides a method for processing and obtaining media data.
  • FIG. 18 shows the dependency structure relationship of media data generated using a knowledge base-based video encoding method.
  • the media data generated by the knowledge base-based encoding method includes two types of video data, the first type of video data and the second type of video data.
  • the first type of video data is called video layer data
  • the video layer data contains the video stream of the video layer image.
  • the second type of video data is knowledge layer data, which contains the code stream of the knowledge layer image.
  • The video data includes at least one sample, and a sample includes an image or a group of images. Samples of the first type of video data are assigned numbers and arranged in order according to the first numbering rule, which assigns numbers in chronological, playback, or decoding order. Samples of the second type of video data are assigned numbers and arranged in order according to the second numbering rule, which assigns numbers in the order of use, generation, or storage.
  • At least one sample in the second type of video data is dependent on at least two discontinuous samples in the first type of video data and provides reference information for the encoding and decoding of at least two discontinuous samples in the first type of video data. This kind of dependency relationship is called dependency relationship of non-aligned time period.
  • Because video 1 data depends on video 2 data, the video 2 data must be encoded and decoded in synchronization with the video 1 data, and multiple samples in the video 1 data may depend on the same sample in the video 2 data.
  • The dashed arrows indicate the dependency relationships:
  • sample 1, sample 2 and sample 4 in video 1 data depend on sample 1 in video 2 data
  • sample 3 and sample 5 in video 1 data depend on sample 2 in video 2 data.
  • the dependent video 2 data samples need to be synchronized with the dependent video 1 data samples to ensure the correct decoding of the video 1 data samples.
  • Video 2 data samples that multiple samples in the video 1 data depend on are not repeatedly stored or transmitted, but are shared. For example, after video 2 data sample 1 is used in synchronization with video 1 data sample 1, it is still reused for the subsequent video 1 data sample 2 and sample 4.
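This decode-once, reuse-many sharing can be illustrated with a small cache sketch; the sample numbering and the `decode` callable are hypothetical.

```python
class KnowledgeCache:
    """Decode each video 2 (knowledge) sample once and reuse it.

    Hypothetical sketch of the sharing described above: several
    video 1 samples depend on the same video 2 sample, which is
    decoded on first use and then served from the cache.
    """
    def __init__(self, decode):
        self._decode = decode      # callable: sample id -> decoded image
        self._decoded = {}
        self.decode_count = 0

    def get(self, sample_id):
        if sample_id not in self._decoded:
            self._decoded[sample_id] = self._decode(sample_id)
            self.decode_count += 1
        return self._decoded[sample_id]
```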
  • the present invention provides a method for storing media data and a method for extracting a media data stream.
  • The structural relationship example in FIG. 18 is also applicable to the structural relationship descriptions in the subsequent embodiments.
  • FIG. 19 shows an embodiment of a method for processing to obtain media data.
  • the media data box and the metadata box Movie Box are stored in one file.
  • the media data box and the Movie box may be stored in different files respectively.
  • Two tracks are used in the Movie Box to describe the samples of the video 1 data and the video 2 data respectively. As shown in FIG. 19, the structure of the video 1 data samples is described by video track 1, and the structure of the video 2 data samples is described by video track 2.
  • Tref data box (Track Reference Box) is used in video track 1 to describe the dependency relationship between video track 1 and video track 2.
  • A sample group data box (Sample Group Box) and a sample group description data box (Sample Group Description Box) are used to group the video 1 data samples that depend on the same video 2 data sample. For example, sample group 2 points to video 1 data sample entry 3 and sample entry 5, and records the number 2 at the same time: the samples pointed to by video 1 data sample entry 3 and sample entry 5 depend on the sample pointed to by video 2 data sample entry 2. Therefore, the sample group needs to describe the information of the depended video 2 data sample entry, which requires the following syntax:
  • num_library_samples indicates the number of video 2 data samples pointed to by this group.
  • library_sample_index indicates the number of the video 2 data sample entry pointed to by this group.
  • the track to which the sample entry of the video 2 data sample pointed to by library_sample_index belongs is described by the tref data box of the current track.
  • When the video 2 data samples are described in at least two tracks, in order to locate the video 2 data samples pointed to by the sample group, the following syntax is needed:
  • num_library_samples indicates the number of video 2 data samples pointed to by this group.
  • library_track_ID indicates the track number where the video 2 data sample entry pointed to by this group is located.
  • library_sample_index indicates the number of the video 2 data sample entry pointed to by this group.
  • In this way, the depended video 2 data sample can be uniquely determined, thereby establishing the dependency relationship between the video 1 data samples and the video 2 data samples.
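Resolving a sample group entry of the second syntax form to concrete video 2 data samples can be sketched as follows; the in-memory representation of the tracks is an assumption for illustration, not ISO/IEC 14496-12 code.

```python
def locate_library_samples(group_entry, tracks):
    """Resolve the video 2 data samples referenced by a sample group entry.

    `group_entry` models the syntax above: a list of
    (library_track_ID, library_sample_index) pairs, one pair per
    num_library_samples. `tracks` maps a track ID to its ordered list
    of sample entries.
    """
    return [tracks[track_id][sample_index]
            for track_id, sample_index in group_entry]
```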
  • FIG. 20 shows another embodiment of a method for processing to obtain media data.
  • the media data box and the metadata box Movie Box are stored in one file.
  • the media data box and the Movie box may be stored in different files respectively.
  • a track is used in the metadata to describe the video 1 data and sample auxiliary information.
  • Sample auxiliary information (the sample auxiliary information sizes box and the sample auxiliary information offsets box) is used to describe the dependency relationship between the video 1 data and the video 2 data, and the temporal correspondence between the sample auxiliary information and the video 1 data sample entries.
  • A new value needs to be added to the sample auxiliary information type (aux_info_type), for example the 'libi' identifier. When the value of the information type is 'libi', it indicates that the current data box is sample auxiliary information that includes the video 2 reference relationship corresponding to the video 1 data and the position, in the media data box, of the video 2 data.
  • FIG. 21 shows still another embodiment of a method for storing media data.
  • the media data box and the metadata box Movie Box are stored in one file.
  • the media data box and the Movie box may be stored in different files respectively.
  • A time-series metadata track is also used to describe the relationship between the video tracks.
  • the structure of the data sample of video 1 is described by video track 1
  • the structure of the data sample of video 2 is described by video track 2
  • the structure of the time-series metadata sample is described by track 3. In video track 1, the tref data box (Track Reference Box) is used to describe the dependency relationship between video track 1 and track 3.
  • When the reference type (reference_type) of the tref data box takes the value 'libr', it indicates that the data samples pointed to by the current video 1 track depend on the data samples pointed to by the video 2 track identified by the track identifier under tref.
  • Since the video 1 data samples and the time-series metadata samples use the same sequential numbering rule (both use chronological order), the dependency relationship between the samples can be directly described using time stamps.
  • The time-series metadata sample pointed to by a time-series metadata sample entry describes the dependency relationship between the video 1 data sample pointed to by the video 1 data sample entry and the video 2 data sample pointed to by the video 2 data sample entry. To do this, a sample syntax for the time-series metadata describing the dependency relationship needs to be added:
  • number_of_library_sample indicates the number of referenced video 2 data samples.
  • library_sample_index indicates the number of the video 2 data sample entry.
  • The track to which the video 2 data sample entry pointed to by library_sample_index belongs is described by the tref data box of the track to which the video 1 data pointed to by the current track's tref data box belongs.
  • A segment index data box can also be used to describe the dependency relationship between video 1 data samples and video 2 data samples. The syntax of the segment index data box is:
  • syntax elements in italics are the syntax elements newly added in this embodiment, and their semantics are:
  • reference_library_flag a value of 1 indicates that the current item references a knowledge image, and a value of 0 indicates no reference;
  • reference_sample_number indicates the number of knowledge images referenced by the current item
  • sample_track_ID indicates the track number to which the sample of the currently referenced knowledge image belongs
  • sample_ID indicates the number of the sample of the knowledge image currently being referenced.
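A possible serialization of the newly added fields could look like the sketch below. The field widths (one byte for the flag and count, four big-endian bytes for each ID) are assumptions for illustration only, not the normative segment index syntax.

```python
import struct

def pack_reference_info(refs):
    """Pack the newly added segment-index fields sketched above.

    `refs` is a list of (sample_track_ID, sample_ID) pairs for the
    referenced knowledge images. When the list is empty,
    reference_library_flag is 0 and nothing else is written.
    """
    flag = 1 if refs else 0                   # reference_library_flag
    out = struct.pack('>B', flag)
    if flag:
        out += struct.pack('>B', len(refs))   # reference_sample_number
        for track_id, sample_id in refs:
            out += struct.pack('>II', track_id, sample_id)
    return out
```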
  • FIG. 22 shows still another embodiment of a method for processing and obtaining media data. Compared with the fifty-ninth embodiment, this embodiment uses a different sample syntax for the time-series metadata describing the dependency relationship:
  • number_of_library_sample indicates the number of referenced video 2 data samples.
  • library_sample_URL Uniform resource locator indicating a video 2 data sample.
  • library_sample_offset indicates the byte offset of the video 2 data sample.
  • library_sample_size indicates the byte size of the video 2 data sample.
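Locating a video 2 data sample from this URL/offset/size triple can be sketched as follows; `fetch` is a hypothetical stand-in for retrieving the resource at library_sample_URL (for example over HTTP).

```python
def read_library_sample(ref, fetch):
    """Return the bytes of a video 2 data sample located by the syntax above.

    `ref` carries library_sample_URL, library_sample_offset and
    library_sample_size; `fetch` maps a URL to the resource bytes.
    The byte offset and size select the sample within the resource.
    """
    data = fetch(ref['library_sample_URL'])
    start = ref['library_sample_offset']
    return data[start:start + ref['library_sample_size']]
```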
  • FIG. 23 shows another embodiment of a method for processing to obtain media data.
  • the media data box and the metadata box Movie Box are stored in one file.
  • the media data box and the Movie box may be stored in different files respectively.
  • a sample group is used to describe the dependency relationship between video 1 data and video 2 data.
  • A new value needs to be added to the grouping type (grouping_type) of the sample group, for example the 'libg' identifier. When the value of the grouping type is 'libg', it indicates that the current data box is a sample group with a dependency relationship, including the video 2 reference relationship corresponding to the video 1 data and the position of the video 2 data in the metadata box.
  • the syntax of the sample group is as follows:
  • meta_box_handler_type the type of the metadata item, where adding 'libi' indicates that the type of the metadata item is a knowledge image
  • num_items the number of metadata items
  • item_id [i] the number of the i-th metadata item
  • library_pid [i] The number of the knowledge image corresponding to the i-th metadata item.
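Interpreting a 'libg' sample group entry as a mapping from metadata items to knowledge image numbers can be sketched as follows; the dict-based entry model is an assumption for illustration, not the normative box layout.

```python
def parse_libg_entries(entry):
    """Map metadata items to knowledge image numbers for a 'libg' group.

    `entry` models the syntax above: meta_box_handler_type plus the
    parallel item_id[i] / library_pid[i] arrays of length num_items.
    Returns {item_id: library_pid}, or an empty dict when the handler
    type does not mark knowledge-image items ('libi').
    """
    if entry['meta_box_handler_type'] != 'libi':
        return {}                  # not a knowledge-image item group
    return dict(zip(entry['item_id'], entry['library_pid']))
```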
  • FIG. 24 shows an embodiment of a method for transmitting media data.
  • The relationships between the tracks are determined according to each track's tref data box, thereby determining video track 1 pointing to the video 1 data samples, video track 2 pointing to the video 2 data samples (if present), and metadata track 3 pointing to the time-series metadata samples (if present); then the video 1 data samples are extracted from video track 1 in chronological order; then the video 2 data samples that the video 1 data samples depend on are located and extracted based on the auxiliary information of the video 1 data samples, where the description manner of the auxiliary information may be the description manner of the dependency relationship between the video 1 data samples and the video 2 data samples in any of the embodiments of FIG. 19 to FIG. 22; and then the video 1 data samples and the depended video 2 data samples are transmitted synchronously to the receiving end for decoding or playback.
  • FIG. 25 shows an embodiment for transmitting SVC media data.
  • This embodiment encapsulates SVC media data in a package.
  • The package contains two assets, asset 1 and asset 2, and also contains composition information (CI).
  • Each asset contains an MPU, and each MPU contains a type of data for SVC media data.
  • MPU1 of asset 1 contains data at the base layer
  • MPU2 of asset 2 contains data at the enhancement layer.
  • The composition information records information such as the dependency relationships between the assets. For example, the composition information describes the dependency of asset 1 on asset 2.
  • Each MPU contains at least one MFU, and a hint track describes the segmentation information of the MFU in the MPU. For example, MPU2 is segmented into MFU1-4, and MPU1 is segmented into MFU1-4.
  • The dotted lines represent the dependency relationships between MFUs.
  • MFU1-4 in Asset 1 corresponds to MFU1-4 in Asset 2 respectively.
  • Since the base layer data and the enhancement layer data are media data of the aligned time period, the interdependent MFUs need to be transmitted to the client synchronously, as shown by the transmission times of the MFUs described by the solid arrows on the timeline in FIG. 25. It can be seen that using MMT to transmit SVC media data is simply a matter of segmenting the SVC media data and transmitting it in the same aligned time period. This simple segmented transmission is obviously not applicable to media data that has dependencies over non-aligned time periods.
  • FIG. 26 shows an embodiment in which media is segmented and transmitted. Compared with the sixty-third embodiment, this embodiment uses different methods to describe the dependency relationship between MFUs.
  • the knowledge base encoded media data is packaged in a package.
  • The package includes three assets: asset 1, asset 2, and asset 3. It also contains composition information.
  • Each asset contains an MPU.
  • Each MPU contains a type of data encoded by the knowledge base. For example, the MPU of asset 1 contains video layer data, the MPU2 of asset 2 contains dependent metadata, and the MPU3 of asset 3 contains knowledge layer data.
  • The composition information records the time-domain, spatial-domain, or dependency relationships between the assets. For example, the composition information describes the dependency of asset 1 on asset 2 and the dependency of asset 2 on asset 3.
  • Each MPU contains at least one MFU, and a hint track describes the segmentation information of the MFU in the MPU.
  • MPU1 is segmented into MFU1-5, MPU2 is segmented into MFU1-5, and MPU3 is segmented into MFU1-2, where the dashed lines indicate the dependencies between MFUs. For example, MFU1-5 in asset 1 depend on MFU1-5 in asset 2; MFU1 in asset 2 depends on MFU1 in asset 3; and MFU3 and MFU5 in asset 2 depend on MFU2 in asset 3.
  • this implementation uses timed metadata to describe the dependencies between MFUs.
  • The timed metadata has the same aligned time period as the video layer data, so its timing is maintained through the aligned time period.
  • time series metadata describes the knowledge layer data that needs to be synchronized for the corresponding period, so that the video layer data is indirectly related to the knowledge layer data.
  • The advantage of this method is that adding, deleting, and multiplexing the time-series metadata track is very flexible, and there is no need to modify the data of the video track. The disadvantage is that the time-series metadata is stored in the media data of the file, so the time-series metadata must first be parsed before the dependent knowledge layer data can be obtained from the file based on the positioning information, which brings additional operational load to the MMT sender. A sample of time-series metadata describing the dependency relationships is needed, with the following syntax:
  • reference_MFU_flag indicates whether the MFU is referenced, a value of "0" means not referenced.
  • number_of_reference_MFU indicates the number of referenced MFU.
  • depended_MFU_asset_id indicates the Asset number to which the referenced MFU belongs.
  • depended_MFU_sequence_number indicates the number of the referenced MFU.
  • Alternatively, the dependency relationships can be described at the sample level, with the following syntax:
  • reference_sample_flag indicates whether the sample is referenced, the value "0" means not referenced.
  • number_of_reference_sample indicates the number of samples for reference.
  • depended_sample_MPU_id indicates the MPU number to which the referenced sample belongs.
  • depended_sample_id indicates the number of the referenced sample.
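Reading the dependency lists out of the MFU-level syntax above can be sketched as follows; the dict-based model of the parsed fields is a hypothetical illustration, not normative MMT syntax.

```python
def mfu_dependencies(hint):
    """List the (asset id, MFU number) pairs one MFU depends on.

    `hint` models the MFU-level syntax above: reference_MFU_flag plus,
    when the flag is nonzero, parallel lists of depended_MFU_asset_id
    and depended_MFU_sequence_number (one element per
    number_of_reference_MFU).
    """
    if hint['reference_MFU_flag'] == 0:
        return []                  # value "0" means: nothing is referenced
    return list(zip(hint['depended_MFU_asset_id'],
                    hint['depended_MFU_sequence_number']))
```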
  • FIG. 27 shows another embodiment of transmitting media data. Compared with the sixty-fourth embodiment, this embodiment uses different methods to describe the dependency relationship between MFUs.
  • the knowledge base encoded media data is packaged in a package.
  • the package asset includes three assets: asset 1, asset 2, and asset 3. It also contains organization information (Composition Information).
  • Each asset contains an MPU.
  • Each MPU contains one type of knowledge base encoded data.
  • The MPU of asset 1 contains video layer data.
  • The knowledge layer data is divided between at least two assets: MPU2 of asset 2 contains knowledge layer data, and MPU3 of asset 3 contains knowledge layer data.
  • The composition information records the temporal, spatial, or dependency relationships between assets.
  • The composition information describes the dependence of asset 1 on assets 2 and 3; assets 2 and 3 can be independent or interdependent.
  • Each MPU contains at least one MFU, and a hint track describes the segmentation information of the MFU in the MPU.
  • MPU1 is segmented into MFU1-5, MPU2 is segmented into MFU1-2, and MPU3 contains only MFU1.
  • The dotted lines indicate dependency relationships between MFUs: for example, MFU1 and MFU5 in asset 1 depend on MFU1 in asset 2, MFU2 in asset 1 depends on MFU1 in asset 3, and MFU3 and MFU5 in asset 1 depend on MFU2 in asset 2. Because the MFU numbers in asset 2 and asset 3 may be duplicated, positioning information for the MFU must be added.
  • The interdependent MFUs need to be transmitted synchronously to the client, for example at the MFU transmission times depicted by the solid arrows on the timeline in FIG. 27. Since the video layer data is media data in the aligned time period and the knowledge layer data is media data in the non-aligned time period, the dependencies between the MFUs need to be clearly marked.
  • The advantage of this method is that the MMT sender can obtain the dependence of the video layer data samples on the knowledge layer data samples by analyzing the hint track of the video layer data, and then extract the video layer MFUs and knowledge layer MFUs according to the hint tracks of the video layer data and the knowledge layer data.
  • this method does not affect the hint track information of the knowledge layer data, and maintains the independence and flexibility of the knowledge layer data.
  • The disadvantage is that MFU numbers in different assets may be duplicated, which adds some redundant knowledge layer data sample positioning information to the hint track of the video layer data.
  • Based on the MMT standard MFU sample, the syntax of the sample describing the MFUs referenced by the current MFU (referred to as DMFUs, dependent MFUs), together with the additional positioning information of those MFUs, is extended as follows:
  • referenceMFU_flag indicates whether an MFU is referenced; the value "0" means no MFU is referenced.
  • number_of_depended_MFU indicates the number of referenced MFUs.
  • depended_MFU_asset_id indicates the Asset number to which the referenced MFU belongs.
  • depended_MFU_sequence_number indicates the number of the referenced MFU.
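The role of the extra positioning field can be sketched with a small lookup. The in-memory table and function name below are hypothetical; the point is that depended_MFU_asset_id is what disambiguates duplicated MFU numbers across assets.

```python
# Hypothetical in-memory index built while packaging: asset_id -> {MFU number: payload}.
# MFU number 1 is reused by asset 2 and asset 3, as in FIG. 27.
assets = {
    2: {1: b"knowledge-MFU1", 2: b"knowledge-MFU2"},
    3: {1: b"knowledge-MFU1'"},
}

def resolve_dmfu(depended_MFU_asset_id: int, depended_MFU_sequence_number: int) -> bytes:
    """Locate a depended MFU using the extended positioning information."""
    try:
        return assets[depended_MFU_asset_id][depended_MFU_sequence_number]
    except KeyError:
        raise LookupError(
            f"MFU {depended_MFU_sequence_number} not found in asset {depended_MFU_asset_id}"
        )
```

Without the asset ID, sequence number 1 alone would be ambiguous between asset 2 and asset 3.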
  • FIG. 28 shows another embodiment of transmitting media data. Compared with the sixty-fourth and sixty-fifth embodiments, this embodiment uses a different method to describe the dependency relationships between MFUs.
  • The knowledge base encoded media data is packaged in a package, which contains two assets, asset 1 and asset 2, and also contains composition information (Composition Information, CI).
  • Each asset contains an MPU.
  • Each MPU contains a type of data encoded by the knowledge base. For example, MPU1 of asset 1 contains video layer data, and MPU2 of asset 2 contains knowledge layer data.
  • The composition information records the temporal, spatial, or dependency relationships between assets. For example, the composition information describes the dependence of asset 1 on asset 2.
  • Each MPU contains at least one MFU, and a hint track describes the segmentation information of the MFU in the MPU.
  • MPU2 is segmented into MFU1 and MFU4, and MPU1 is segmented into MFU2, MFU3, and MFU5-7.
  • the dotted line indicates the dependency relationship between MFUs.
  • MFU2, MFU3, and MFU6 in Asset 1 depend on MFU1 in Asset 2
  • MFU5 and MFU7 in Asset 1 depend on MFU4 in Asset 2.
  • The interdependent MFUs need to be transmitted synchronously to the client, for example at the MFU transmission times depicted by the solid arrows on the timeline in FIG. 28.
  • The MMT sender can obtain the dependence of video layer data samples on knowledge layer data samples by analyzing the hint track of the video layer data, and then extract the video layer MFUs and knowledge layer MFUs according to the hint tracks of the video layer data and the knowledge layer data. At the same time, this method does not affect the hint track information of the knowledge layer data, maintaining the independence and flexibility of the knowledge layer data.
  • DMFU: dependent MFU
  • referenceMFU_flag indicates whether an MFU is referenced; the value "0" means no MFU is referenced.
  • number_of_depended_MFU indicates the number of referenced MFUs.
  • depended_MFU_sequence_number indicates the number of the referenced MFU.
  • RMFU: reference MFU
  • dependedMFU_flag indicates whether the current MFU is depended on by other MFUs; the value "0" means it is not depended on.
  • number_of_reference_MFU indicates the number of MFUs that reference the current MFU.
  • reference_MFU_sequence_number indicates the number of an MFU that references the current MFU.
  • number_of_consequent_MFU indicates the number of consecutive MFUs, following that referencing MFU, that also depend on the current MFU.
  • The above syntax can be used to obtain the dependencies between MFUs. It should be noted that, in one case, the numbers of the DMFUs and RMFUs and the number of the current MFU use the same set of sequential numbers and do not duplicate each other; the DMFU and RMFU can then be uniquely determined. In another case, the numbers of the DMFUs and RMFUs and the number of the current MFU use different sequence numbers and may duplicate each other; in that case, the asset (and thus the MPU) to which a DMFU or RMFU belongs must be determined according to the dependency relationships, described in the composition information, between the MPU to which the current MFU belongs and the other MPUs, in order to uniquely determine the DMFU and RMFU.
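The two numbering cases just described can be sketched as one lookup routine. Everything here is illustrative: the index structures, the function name, and the rule of scanning depended assets in composition-information order are assumptions, since the text only says the owning asset must be determined from the CI dependency relationships.

```python
def resolve_referenced_mfu(seq, current_asset, global_index=None,
                           per_asset_index=None, ci_dependencies=None):
    """Resolve a DMFU/RMFU sequence number to an MFU payload.

    Case 1: all MFUs share one non-repeating number space -> a flat lookup suffices.
    Case 2: numbers are per-asset and may repeat -> use the composition
    information to find which depended asset (and hence MPU) owns the number.
    """
    if global_index is not None:                       # case 1: unique numbering
        return global_index[seq]
    for asset_id in ci_dependencies[current_asset]:    # case 2: scan depended assets
        if seq in per_asset_index.get(asset_id, {}):
            return per_asset_index[asset_id][seq]
    raise LookupError(f"MFU {seq} not resolvable from asset {current_asset}")

# Asset 1 depends on assets 2 and 3 (as in FIG. 27); both reuse MFU number 1.
ci = {1: [2, 3]}
index = {2: {1: "asset2/MFU1", 2: "asset2/MFU2"}, 3: {1: "asset3/MFU1"}}
```
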
  • FIG. 29 shows another embodiment of transmitting media data. Compared with the sixty-third, sixty-fourth, sixty-fifth, and sixty-sixth embodiments, this embodiment adds an operation to avoid MFU retransmission.
  • the dependent MFU needs to be transmitted synchronously according to the dependency.
  • FIG. 29 describes the process of transmitting MFU.
  • The current MFU is obtained from the video layer data, with an aligned time period, in asset 1 according to the current transmission order, such as MFU2 in asset 1 in FIG. 28. According to the sample information of the current MFU, determine whether the current MFU depends on a DMFU.
  • If it does not depend on a DMFU, transmit the current MFU and either obtain the next MFU in sequence or terminate the transmission. If it does depend on a DMFU, obtain the dependent MFU from the knowledge layer data, with a non-aligned time period, in asset 2 according to the DMFU number described in the current MFU. Since multiple aligned-time-period MFUs may depend on the same non-aligned-time-period MFU, three situations need to be considered when transmitting a DMFU in order to determine its availability on the client, as shown in FIG. 29. In the first case, according to the historical transmission list of DMFUs, the DMFU that the current MFU depends on has not yet been transmitted; the DMFU and the current MFU then need to be transmitted synchronously, such as MFU1 in asset 2 and MFU2 in asset 1 in FIG. 28.
  • In the second case, according to the historical transmission list of DMFUs, the DMFU that the current MFU depends on has already been transmitted; only the current MFU then needs to be transmitted, without the DMFU, such as MFU3, MFU6, and MFU7 in asset 1 in FIG. 28, where MFU1 in asset 2, on which MFU3 and MFU6 depend, has already been transmitted synchronously with MFU2 in asset 1, and MFU4 in asset 2, on which MFU7 depends, has already been transmitted synchronously with MFU5 in asset 1. In the third case, according to the historical transmission list, the DMFU that the current MFU depends on has been transmitted, but according to the signaling message fed back by the client, the DMFU is unavailable on the client for possible reasons such as frequency of use, storage, and management methods; the DMFU then needs to be retransmitted synchronously with the current MFU.
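The three cases above amount to a transmission loop over a historical transmission list plus a client-reported unavailability set. The sketch below is an assumption about how a sender might realize this; the function and parameter names are hypothetical, and MFUs are represented by opaque keys.

```python
def transmit_mfus(video_mfus, get_dmfu, send, client_unavailable, history=None):
    """Transmit aligned-time-period MFUs, sending each depended non-aligned
    (knowledge layer) MFU at most once unless the client reports it lost.

    video_mfus:         iterable of (mfu, dmfu_keys) in transmission order
    get_dmfu:           maps a DMFU key to the depended MFU payload
    send:               transmits one MFU
    client_unavailable: set of DMFU keys the client reported as evicted
    history:            historical transmission list of DMFUs (a set)
    """
    history = set() if history is None else history
    for mfu, dmfu_keys in video_mfus:
        for key in dmfu_keys:
            if key not in history or key in client_unavailable:
                send(get_dmfu(key))          # case 1 / case 3: (re)transmit the DMFU
                history.add(key)
                client_unavailable.discard(key)
            # case 2: DMFU already transmitted and still available -> skip it
        send(mfu)
    return history

# FIG. 28 order: MFU2/MFU3/MFU6 depend on A2/MFU1, MFU5/MFU7 depend on A2/MFU4.
sent = []
order = [("A1/MFU2", ["A2/MFU1"]), ("A1/MFU3", ["A2/MFU1"]),
         ("A1/MFU5", ["A2/MFU4"]), ("A1/MFU6", ["A2/MFU1"]), ("A1/MFU7", ["A2/MFU4"])]
transmit_mfus(order, lambda k: k, sent.append, set())
# -> each of A2/MFU1 and A2/MFU4 is sent exactly once, before its first dependent MFU
```
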
  • The sixty-eighth embodiment provides another embodiment of transmitting media data.
  • signaling messages need to be used in the transmission.
  • The server informs the client of information such as the optimal storage size of the knowledge layer data in the non-aligned time period and the storage management method (such as FIFO (First In First Out), LFU (Least Frequently Used), LRU (Least Recently Used), or other possible storage management methods).
  • LBM: knowledge layer data buffer model
  • message_id indicates that the message is an LBM message.
  • version indicates the version of the LBM message, so that the client can check whether the LBM message is new or old.
  • length indicates the byte length of the LBM message.
  • required_buffer_size indicates the size in bytes of the knowledge layer data buffer that the client needs to prepare in order to receive this data.
  • required_buffer_Manage indicates how the client should manage the knowledge layer data cache: for example, a value of 0 means the FIFO method, 1 means the LFU method, 2 means the LRU method, and so on.
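A client-side cache honoring required_buffer_size and required_buffer_Manage could be sketched as follows. This is an illustrative model, not part of the standard: the class name and the byte-size accounting are assumptions; only the policy codes (0 = FIFO, 1 = LFU, 2 = LRU) come from the text above.

```python
from collections import OrderedDict

class KnowledgeCache:
    """Client-side knowledge layer data cache, policy chosen by the LBM message."""

    def __init__(self, capacity_bytes, policy):
        self.capacity = capacity_bytes   # required_buffer_size
        self.policy = policy             # required_buffer_Manage: 0=FIFO, 1=LFU, 2=LRU
        self.items = OrderedDict()       # key -> payload, kept in insertion order
        self.hits = {}                   # key -> use count (for LFU)
        self.size = 0

    def get(self, key):
        payload = self.items.get(key)
        if payload is not None:
            self.hits[key] = self.hits.get(key, 0) + 1
            if self.policy == 2:
                self.items.move_to_end(key)      # LRU: refresh recency
        return payload

    def _victim(self):
        if self.policy == 1:                     # LFU: evict least frequently used
            return min(self.items, key=lambda k: self.hits.get(k, 0))
        return next(iter(self.items))            # FIFO/LRU: evict oldest entry

    def put(self, key, payload):
        while self.items and self.size + len(payload) > self.capacity:
            victim = self._victim()
            self.size -= len(self.items.pop(victim))
            self.hits.pop(victim, None)
        self.items[key] = payload
        self.size += len(payload)
```

Under LRU, `get()` moves an entry to the end of the ordered dict, so the front entry is always the least recently used one; under FIFO the order never changes after insertion.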
  • The client feeds back the management operations of the knowledge layer data cache to the server through a signaling message and informs the server which transmitted knowledge layer data is no longer available on the client, so that the server can retransmit the unavailable knowledge layer data that subsequent data depends on. This requires a knowledge layer data cache feedback message.
  • message_id indicates that the message is an LBM message.
  • version indicates the version of the LBM message, so that the client can check whether the LBM message is new or old.
  • length indicates the byte length of the LBM message.
  • unavailable_mfu_number indicates the number of MFUs to which the unavailable data in the knowledge layer data cache belongs.
  • asset_id indicates the asset number to which the i-th unavailable MFU belongs.
  • sample_id indicates the sample number to which the i-th unavailable MFU belongs.
  • mfu_id indicates the number of the i-th unavailable MFU.
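The feedback message above can be sketched as a serializer over (asset_id, sample_id, mfu_id) triples. The message ID value and all field widths below (16-bit message_id, 8-bit version, 32-bit length and entry fields) are illustrative assumptions; the text does not fix them.

```python
import struct

LBM_FEEDBACK_MESSAGE_ID = 0x8200   # hypothetical ID; the real value is standard-assigned

def build_cache_feedback(version, unavailable):
    """Serialize the knowledge layer data cache feedback message.

    unavailable: list of (asset_id, sample_id, mfu_id) triples identifying the
    MFUs whose data is no longer available in the client cache.
    """
    body = struct.pack(">I", len(unavailable))           # unavailable_mfu_number
    for asset_id, sample_id, mfu_id in unavailable:
        body += struct.pack(">III", asset_id, sample_id, mfu_id)
    # header: message_id, version, then length as the byte length of the body
    return struct.pack(">HBI", LBM_FEEDBACK_MESSAGE_ID, version, len(body)) + body
```
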
  • The sixty-ninth embodiment adds a new relationship type.
  • SMT: Smart Media Transport
  • The corresponding flags are dependency_flag, composition_flag, equivalence_flag, and similarity_flag, respectively.
  • the new relationship type added in this embodiment is the knowledge base dependency type of the non-aligned time period, and the corresponding flag is library_flag. This relationship type is used to describe the dependency relationship between the current Asset and the knowledge base Asset of the non-aligned time period.
  • the corresponding syntax is shown in Table 3:
  • descriptor_tag A tag value used to indicate this type of descriptor.
  • descriptor_length indicates the byte length of this descriptor, calculated from the next field to the last field.
  • dependency_flag Indicates whether dependencies need to be added in this descriptor. The value "0" means that no addition is required.
  • composition_flag Indicates whether a composition relationship needs to be added in this descriptor. The value "0" means that no addition is required.
  • equivalence_flag Indicates whether equivalence needs to be added in this descriptor. The value "0" means that no addition is required.
  • similarity_flag Indicates whether or not a similarity relationship needs to be added in this descriptor. The value "0" means that no addition is required.
  • library_flag indicates whether a knowledge base dependency for non-aligned time periods needs to be added in this descriptor. The value "0" means that no addition is required.
  • num_dependencies indicates the number of Assets that the Asset described by this descriptor depends on.
  • asset_id indicates the ID of the Asset on which the Asset described by this descriptor depends.
  • The order of the Asset IDs provided in this descriptor corresponds to their internal coding dependency levels.
  • num_compositions indicates the number of Assets that have a combination relationship with the Assets described by this descriptor.
  • asset_id indicates the ID of an Asset that has a combination relationship with the Asset described by this descriptor.
  • equivalence_selection_level indicates the rendering level of the corresponding Asset in the equivalence group. A "0" value indicates that the asset is rendered by default. When the default Asset cannot be selected, the Asset with the lower rendering level will be selected and rendered as an alternative.
  • num_equivalences indicates the number of Assets that are equivalent to the Assets described by this descriptor.
  • asset_id indicates the ID of an Asset that is equivalent to the Asset described by this descriptor.
  • similarity_selection_level indicates the rendering level of the corresponding Asset in the similarity group. A "0" value indicates that the asset is rendered by default. When the default Asset cannot be selected, the Asset with the lower rendering level will be selected and rendered as an alternative.
  • num_similarities indicates the number of Assets that have similar relationships with the Assets described by this descriptor.
  • asset_id indicates the ID of an Asset that has a similar relationship with the Asset described by this descriptor.
  • num_libraries indicates the number of knowledge base Assets in the non-aligned time period on which the Asset described by this descriptor depends.
  • asset_id indicates the ID of a non-aligned-time-period knowledge base Asset on which the Asset described by this descriptor depends.
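The part of the descriptor gated by library_flag (num_libraries followed by asset IDs) can be sketched as a small serializer/parser pair. An 8-bit count and 32-bit asset IDs are assumptions made for illustration; Table 3 in the source defines the real field widths.

```python
import struct

def build_library_section(asset_ids):
    """Serialize the library_flag-gated fields: num_libraries, then the IDs of
    the non-aligned-time-period knowledge base Assets being depended on."""
    out = struct.pack(">B", len(asset_ids))      # num_libraries
    for aid in asset_ids:
        out += struct.pack(">I", aid)            # asset_id
    return out

def parse_library_section(buf):
    """Inverse of build_library_section."""
    (n,) = struct.unpack_from(">B", buf, 0)
    return [struct.unpack_from(">I", buf, 1 + 4 * i)[0] for i in range(n)]
```
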
  • a first placing unit, configured to place a sample entry of the first media data in a first media track, where the first media data is time-series media data, and the sample entry contains metadata pointing to samples of the first media data;
  • a second placing unit, configured to place an access unit entry of the second media data in a second media data box, where the access unit entry contains metadata pointing to access units of the second media data, and the second media data is time-series media data or non-time-series media data;
  • a third placing unit, configured to mark at least two temporally discontinuous samples in the first media data as a sample group, where the at least two temporally discontinuous samples meet one of the following conditions:
  • the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units in the second media data.
  • a first placing unit, configured to place a sample entry of the first media data in a first media track, where the first media data is time-series media data, and the sample entry contains metadata pointing to samples of the first media data;
  • a second placing unit, configured to place an access unit entry of the second media data in a second media data box, where the access unit entry contains metadata pointing to access units of the second media data, and the second media data is time-series media data or non-time-series media data;
  • a third inserting unit, configured to place respective dependent metadata for each of at least two temporally discontinuous samples in the first media data, where the at least two temporally discontinuous samples meet one of the following conditions:
  • the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, where the index information is information other than the presentation time information of the samples of the first media data; the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, where the index information is information other than the presentation time information of the samples of the first media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units.
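The two alternative conditions above can be sketched as a small predicate. This is purely illustrative: the (start, end) period representation, the function name, and the use of simple equality as the "aligned" test are assumptions, not part of the file format.

```python
def dependency_condition_holds(sample_periods, au_period, second_is_timed):
    """Check whether temporally discontinuous samples may share one set of
    access units: non-time-series second media data always qualifies; for
    time-series data the shared access-unit period must be misaligned with
    at least one of the referencing samples."""
    if not second_is_timed:
        return True                                   # non-time-series case
    # time-series case: at least one sample period differs from the AU period
    return any(p != au_period for p in sample_periods)

# Two discontinuous samples at (0, 1) and (5, 6) sharing a knowledge-layer
# access unit presented at (0, 1): the second sample is misaligned, so the
# condition holds.
assert dependency_condition_holds([(0, 1), (5, 6)], (0, 1), True)
```
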
  • a first extraction unit configured to extract first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • a second extraction unit configured to extract a sample group from a track to which the first media data belongs, where the sample group includes at least two samples that are discontinuous in time;
  • a positioning unit, configured to locate a set of access units in the second media data for each of the at least two temporally discontinuous samples according to the description information of the sample group, where the index information of the set of access units is included in the description information of the sample group; wherein the second media data satisfies one of the following conditions:
  • the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the at least two samples of the first media data are located to the same access unit of the second media data.
  • a first extraction unit configured to extract first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • a second extraction unit configured to extract at least two temporally discontinuous samples from the first media data
  • a third extraction unit configured to extract dependent metadata for each of at least two samples that are temporally discontinuous in the first media data
  • a positioning unit, configured to respectively locate a group of access units in the second media data for each of the at least two temporally discontinuous samples according to the dependent metadata, where the index information of the group of access units is included in the dependent metadata; the second media data satisfies one of the following conditions:
  • the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the at least two samples of the first media data are located to the same access unit of the second media data.
  • a first segmentation unit, used to divide the first media data into media fragmentation units, where the first media data is time-series media data, and the first media data includes at least two temporally discontinuous samples;
  • a first extraction unit configured to extract dependency index information corresponding to the first media data media fragmentation unit, where the dependency index information is information other than presentation time information of a sample to which the media fragmentation unit belongs;
  • a first transmission unit configured to transmit the extracted first media data media fragmentation unit
  • a positioning unit, configured to locate a second media data access unit according to the dependency index information corresponding to the first media data media fragmentation unit, where the second media data access unit is referenced for encoding or decoding by the first media data sample to which the media fragmentation unit belongs; wherein the second media data meets one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples in the first media data are located to the same second media data access unit, and the time period of the second media data access unit is not aligned with at least one of the at least two samples of the first media data; or
  • the at least two samples of the first media data are located to the same second media data access unit.
  • a searching unit, configured to search for the second media data access unit in a simulated cache;
  • a second segmentation unit, configured to segment the second media data access unit into media fragmentation units if the second media data access unit does not exist in the simulated cache; and
  • a second transmission unit, configured to transmit the media fragmentation units into which the second media data access unit is segmented.
  • One or more programs are used to:
  • the processor places a sample entry of the first media data in the first media track, the first media data is time-series media data, and the sample entry includes metadata that points to a sample of the first media data;
  • the processor puts an access unit entry of the second media data in the second media data box, where the access unit entry contains metadata pointing to access units of the second media data, and the second media data is time-series media data or non-time-series media data;
  • the processor places respective dependent metadata for each of at least two temporally discontinuous samples in the first media data, where the at least two temporally discontinuous samples meet one of the following conditions:
  • the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, where the index information is information other than the presentation time information of the samples of the first media data; the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, where the index information is information other than the presentation time information of the samples of the first media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units.
  • the media data obtained by the processor as described above is stored in the memory.
  • One or more programs are used to:
  • the processor places a sample entry of the first media data in the first media track, the first media data is time-series media data, and the sample entry includes metadata that points to a sample of the first media data;
  • the processor puts an access unit entry of the second media data in the second media data box, where the access unit entry contains metadata pointing to access units of the second media data, and the second media data is time-series media data or non-time-series media data;
  • the processor places respective dependent metadata for each of at least two temporally discontinuous samples in the first media data, where the at least two temporally discontinuous samples meet one of the following conditions:
  • the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, where the index information is information other than the presentation time information of the samples of the first media data; the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units, and the same set of access units is not aligned in time with at least one of the at least two temporally discontinuous samples; or
  • the second media data is non-time-series media data, the dependent metadata corresponding to each sample includes index information pointing to the same set of access units in the second media data, where the index information is information other than the presentation time information of the samples of the first media data, and the at least two temporally discontinuous samples are encoded or decoded with reference to the same set of access units.
  • the media data obtained by the processor as described above is stored in the memory.
  • One or more programs are used to:
  • the processor processes the media data stored in the memory
  • the processor extracts first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • the processor extracts a sample group from the track to which the first media data belongs, the sample group including at least two samples that are discontinuous in time;
  • the processor locates a set of access units in the second media data for each of the at least two temporally discontinuous samples according to the description information of the sample group, where the index information of the set of access units is included in the description information of the sample group; wherein the second media data meets one of the following conditions:
  • the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the at least two samples of the first media data are located to the same access unit of the second media data.
  • One or more programs are used to:
  • the processor processes the media data stored in the memory
  • the processor extracts first media data and second media data, wherein the first media data is time-series media data and the second media data is time-series media data or non-time-series media data;
  • the processor extracts at least two temporally discontinuous samples from the first media data
  • the processor extracts dependent metadata for each of the at least two samples that are temporally discontinuous in the first media data
  • the processor locates a set of access units in the second media data for each of the at least two temporally discontinuous samples according to the dependent metadata, where the index information of the set of access units is included in the dependent metadata; the second media data satisfies one of the following conditions:
  • the at least two temporally discontinuous samples are located to the same set of access units in the second media data, and the same set of access units is not aligned in time with at least one of the at least two samples of the first media data; or
  • the at least two samples of the first media data are located to the same access unit of the second media data.
  • One or more programs are used to:
  • the processor processes the media data stored in the memory
  • the processor divides the first media data into media fragmentation units, where the first media data is time-series media data, and the first media data includes at least two samples that are discontinuous in time;
  • the processor extracts dependency index information corresponding to the first media data media fragmentation unit, where the dependency index information is information other than presentation time information of a sample to which the media fragmentation unit belongs;
  • the transmitter transmits the extracted first media data media fragmentation unit
  • the processor locates a second media data access unit according to the dependency index information corresponding to the first media data media fragmentation unit, where the second media data access unit is referenced for encoding or decoding by the first media data sample to which the media fragmentation unit belongs; wherein the second media data meets one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples in the first media data are located to the same second media data access unit, and the time period of the second media data access unit is not aligned with at least one of the at least two samples of the first media data; or
  • the at least two samples of the first media data are located to the same second media data access unit.
  • the processor searches for the second media data access unit in a simulated cache;
  • if the second media data access unit does not exist in the simulated cache, the processor segments the second media data access unit into media fragmentation units; and
  • the transmitter transmits the media fragmentation units into which the second media data access unit is segmented.
  • a first segmentation unit, used to divide the first media data into media fragmentation units, where the first media data is time-series media data, and the first media data includes at least two temporally discontinuous samples;
  • a first extraction unit configured to extract dependency index information corresponding to the first media data media fragmentation unit, where the dependency index information is information other than presentation time information of a sample to which the media fragmentation unit belongs;
  • a first transmission unit configured to transmit the extracted first media data media fragmentation unit
  • a positioning unit, configured to locate a second media data access unit according to the dependency index information corresponding to the first media data media fragmentation unit, where the second media data access unit is referenced for encoding or decoding by the first media data sample to which the media fragmentation unit belongs; wherein the second media data meets one of the following conditions:
  • the second media data is time-series media data, the at least two temporally discontinuous samples in the first media data are located to the same second media data access unit, and the time period of the second media data access unit is not aligned with at least one of the at least two samples of the first media data; or
  • the at least two samples of the first media data are located to the same second media data access unit.
  • a searching unit, configured to search for the second media data access unit in a simulated cache;
  • a second segmentation unit, configured to segment the second media data access unit into media fragmentation units if the second media data access unit does not exist in the simulated cache; and
  • a second transmission unit, configured to transmit the media fragmentation units into which the second media data access unit is segmented.
  • The first containing unit includes at least two assets and also contains composition information (CI); each asset contains an MPU, each MPU contains one type of media data, and the composition information records asset dependency information.
  • CI composition information
  • a first segmentation unit, used to divide the first media data into media fragmentation units, where the first media data is time-series media data, and the first media data includes at least two temporally discontinuous samples;
  • a first extraction unit configured to extract dependency index information corresponding to the first media data media fragmentation unit, where the dependency index information is information other than presentation time information of a sample to which the media fragmentation unit belongs;
  • a first transmission unit configured to transmit the extracted first media data media fragmentation unit
  • a positioning unit configured to locate a second media data access unit according to the dependent index information corresponding to the first media data media segment unit, where the second media data access unit is a first media to which the media segment unit belongs Reference is made to the encoding or decoding of the data samples; wherein the second media data meets one of the following conditions:
  • the second media data is time-series media data
  • the at least two temporally discontinuous samples in the first media data are located as the same second media data access unit, and the first The time periods of the two media data access units are not aligned with at least one of the at least two samples of the first media data;
  • the two samples of the first media data are located to the same second media data access unit.
  • the first containing unit includes at least two assets, and also contains a composition information (CI), the asset contains MPU, and each of the MPUs contains a type of data of media data, the composition information Asset dependency information is recorded.
  • CI composition information
  • a first segmentation unit used to split the first media data into media fragment units, where the first media data is timed media data and includes at least two temporally discontinuous samples;
  • a first extraction unit configured to extract the dependency index information corresponding to the media fragment units of the first media data, where the dependency index information is information other than the presentation time information of the sample to which each media fragment unit belongs;
  • a first transmission unit configured to transmit the extracted media fragment units of the first media data;
  • a first positioning unit configured to locate the asset number to which the referenced MFU belongs;
  • a second positioning unit configured to locate a second media data access unit according to the dependency index information corresponding to a media fragment unit of the first media data, where the second media data access unit is referenced by the encoding or decoding of a first media data sample; and where the second media data satisfies one of the following conditions:
  • if the second media data is timed media data, the at least two temporally discontinuous samples in the first media data are located to the same second media data access unit, and the time period of that second media data access unit is not aligned with that of at least one of the at least two samples of the first media data;
  • if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit.
  • the first containing unit includes at least two assets and also contains composition information (CI); each asset contains MPUs, each MPU contains one type of media data, and the composition information records the dependency information between assets.
  • a synchronization unit for describing the dependency relationships between MFUs, where the timed metadata has the same non-aligned time periods as the first media data, and the timed metadata and the video-layer data are kept synchronized through time-aligned periods;
  • the timed metadata describes the second media data that needs to be used synchronously in the corresponding period, so that the first media data is indirectly associated with the second media data.


Abstract

The present invention discloses multiple methods and apparatuses for processing to obtain media data, transmitting media data, processing media data, processing reference picture requests, and specifying reference pictures. The methods of processing to obtain media data, transmitting media data, and processing media data use the dependency index relationship between video-layer bitstream data and library-layer bitstream data to guarantee the synchronization, correct processing, and correct transmission of the bitstreams, and to efficiently provide the correct data to the decoder. The methods of processing reference picture requests and specifying reference pictures provide the current picture with a set of library pictures to be displayed that belong neither to the random access segment containing the current picture nor to the immediately preceding random access segment, guaranteeing correct encoding and decoding of the current picture while avoiding repeated downloads of library pictures, thereby ensuring correct decoding and efficient transmission of bitstreams produced by library-based video coding and improving transmission and storage efficiency.

Description

Methods and apparatuses for processing and transmitting media data and specifying reference pictures
Technical Field
The present invention relates to the technical fields of image or video compression and system transport, and more specifically to methods and apparatuses for processing media data and methods and apparatuses for transmitting media data.
Background Art
1. File format
A file format is a specific way of storing encoded data in a computer file. It allows metadata to be separated from media data, which addresses problems such as random access and network streaming.
Media data includes video data, audio data, timed metadata, non-timed images, and so on. Media data is divided into multiple access units; each access unit contains one non-timed image, a group of non-timed images, or at least one random access segment. When the media data is timed media, its access units are carried in samples; when the media data is non-timed media, its access units are carried in metadata items. Metadata is auxiliary data used to describe media data, such as sample entries and boxes describing tracks. Metadata can be divided into timed metadata and non-timed metadata. Timed metadata is stored together with the media data in the media data box, while non-timed metadata is stored in metadata boxes placed at different levels of the file.
A file format stores this data in a prescribed structure. A file in such a format contains a media data box and several metadata boxes.
The "Movie Box" is an important metadata box that holds the various tracks and some metadata boxes. Tracks have both logical and temporal structure. Logically, tracks can be divided into media tracks and hint tracks. Temporally, the tracks form a series of time-parallel tracks that all share the same timeline of the media data stream.
A track stores various metadata boxes describing the media data. For example, the position of media data within the media data box can be determined from the sample offset, sample size, and sample entry. Sample groups express characteristics shared by certain sample data in the same track. Sample auxiliary information (the sample auxiliary information sizes box and the sample auxiliary information offsets box) describes the auxiliary information of samples, and the auxiliary type (aux_info_type) determines the type of this auxiliary information.
Besides metadata describing the media data, a track also contains many boxes describing the track itself. In existing standards, dependency relationships between different data streams can be stored in such a tref box (Track Reference Box). The tref box of one track contains the identifier of another track and a reference type (reference_type). The reference type takes values such as 'hint', 'cdsc', 'font', 'hind', 'vdep', 'dplx', 'subt', 'thmb', and 'auxl', which determine the kind of dependency between data streams; for example, 'cdsc' means the current track describes the referenced track, and 'hint' means the referenced track is the media data track that the hint track points to. However, the index information of a second sample in another data stream on which a first sample depends is implicit and identical to the presentation time information of the first sample; the dependency between samples is thus obtained through the synchronization of presentation time information between the first and second samples. All existing reference types therefore use the same timeline and describe dependencies under timing. For dependencies over non-aligned time periods, the existing types can neither express them correctly nor allow the reuse and flexible manipulation of non-timed data.
2. Media transport schemes
Several media transport methods exist. Among the standardized ones, MPEG Media Transport (MMT), developed by the MPEG Systems subgroup, is a new standard technology for storing and delivering multimedia content; SMT (smart media transport) is another.
The main function of these media transports is to packetize media files and transmit them to the receiver. A Package is a logical entity composed of one piece of Composition Information (CI) and one or more Assets. An MMT asset is a logical data entity containing encoded media data. The encoded media of an MMT asset can be timed data or non-timed data. Timed data is audio-visual media data that requires synchronized decoding and presentation of specific data units at specified times. Non-timed data is any other type of data that can be decoded and presented at an arbitrary time based on the service or user-interaction context. The Composition Information (CI) describes the relationships between assets, enabling synchronized transport of files in different assets. On top of the ISO file format, MMT encapsulates files using MPUs (Media Processing Units); a Media Processing Unit is independent, completely processed data conforming to an MMT entity, where the processing includes encapsulation and packetization. An MPU is uniquely identified within an MMT package, carrying a sequence number and an associated MMT asset ID that distinguish it from other MPUs. To allow each package to be transported flexibly according to network conditions, MMT adds a hint track to the MPU to guide the sender in packetizing the media data into smaller media fragment units (MFUs); the hint samples pointed to by the hint track serve as MFU header information and describe the MFU's scalable layer.
Existing MMT is designed mainly for media data produced by existing video coding methods.
3. Traditional video coding schemes
In existing video sequence processing, to let the encoded video sequence support random access, the video sequence is split into multiple segments with random access capability (random access segments for short). As shown in Figure 1, a video sequence includes at least one random access segment; each random access segment corresponds to one display period and includes one random access picture and several non-random-access pictures, and each picture has its own display time describing when the picture is displayed or played. A picture in a random access segment may be intra-coded, or coded with inter prediction referencing other pictures in that random access segment, where the referenced picture may be a picture to be displayed or a synthesized picture that cannot be displayed. In the prior art, however, a picture whose display order follows the random access picture (leading pictures excluded) can only reference other pictures in its own random access segment, and cannot reference pictures in random access segments before or after it, as shown in Figure 1. Specifically, the dependency between the current picture and its candidate reference pictures is described in the following ways:
In existing video coding schemes (such as H.264/AVC or H.265/HEVC), the dependency between the current picture and candidate reference pictures is described by the reference picture configuration set of the video compression layer, which records the numbering difference between the reference picture and the current picture. Only the numbering difference is recorded because, in existing schemes, the candidate reference picture and the current picture belong to the same independently decodable random access segment and must use the same numbering rule (for example, numbering in temporal order), so the difference between the numbers of the current picture and a candidate reference picture locates the candidate precisely. If the reference picture and the current picture used different numbering rules, the same numbering difference would point to different candidate reference pictures; since existing video coding schemes provide no way to describe different numbering rules in the bitstream, the codec could not use the correct candidate reference picture.
In Scalable Video Coding (SVC) and Multiview Video Coding (MVC), as shown in Figure 2, on top of existing inter prediction (which uses only candidate reference pictures within the same layer/view), SVC/MVC uses inter-layer/inter-view prediction to extend the range of candidate reference pictures of the current picture, where the extended candidate reference pictures have the same number as the current picture (for example, the same timestamp) and do not belong to the same level of the independently decodable segment. SVC/MVC uses layer identifiers in the video compression layer to describe the dependency between bitstreams of different layers/views, combined with the shared picture number to describe the dependency between inter-layer/inter-view pictures.
In the background-frame technique of AVS2, as shown in Figure 3, the dependency between coded pictures and scene pictures is described by reference picture type identifiers in the video compression layer. Specifically, AVS2 uses identifiers to describe special scene picture types (namely G pictures and GB pictures), manages G/GB pictures with a dedicated reference buffer (the scene picture buffer), uses an identifier to describe whether the current picture references a G/GB picture, and uses a specific reference picture list construction method (by default placing the G/GB picture in the last reference position of the reference picture list). This finally allows a regularly numbered current picture to reference a candidate reference picture that is not regularly numbered (a GB picture), or one that uses the same numbering rule as the current picture but whose numbering difference exceeds the constrained range (a G picture). However, this technique restricts the scene picture buffer to hold only one candidate reference picture at any time, and that candidate reference picture still belongs to the same independently decodable segment as the current picture.
4. Library-based video coding schemes
The mechanisms of the prior art described above limit the number of reference pictures available to the picture to be encoded and cannot effectively improve the efficiency of picture encoding and decoding.
To mine and exploit the information that pictures across multiple random access segments can reference each other during encoding, when encoding (or decoding) a picture, the encoder (or decoder) may select from a database a picture whose texture content is close to the current picture being encoded (or decoded) as a reference picture. Such a reference picture is called a library picture; the database storing the set of these reference pictures is called a library; and a method in which at least one picture of a video references at least one library picture for encoding or decoding is called library-based video coding. Encoding a video sequence with library-based video coding produces a library-layer bitstream containing the coded library pictures and a video-layer bitstream containing the bitstream obtained by coding each frame of the video sequence with reference to library pictures. These two bitstreams are respectively similar to the base-layer and enhancement-layer bitstreams produced by scalable video coding (SVC): the sequence-layer bitstream depends on the library-layer bitstream. However, the dual-bitstream organization of library-based video coding differs from SVC's layered bitstream organization in the dependency between the layer bitstreams: SVC's two layers depend on each other over aligned time periods, while in the dual bitstreams of library-based video coding the video layer depends on the library layer over non-aligned time periods.
Library-based video coding therefore raises problems for the storage, transport, and reference picture management of bitstream data encoded with it.
In codec techniques that use library pictures, library pictures are obtained and used to provide additional candidate reference pictures for encoding and decoding. Figure 4 shows the dependency between sequence pictures and library pictures in such techniques. Library pictures let sequence pictures exploit correlated information over large spans, improving coding efficiency. However, existing technical solutions cannot effectively support the description of the dependency between sequence pictures and library pictures, nor the efficient management of library pictures.
The scalable layer in the aforementioned MMT can describe the layer information of SVC data, and together with time information it can describe the dependency between different layers of SVC data at the same instant; it cannot, however, describe the non-aligned-time-period dependency of library-based coded video bitstreams.
Summary of the Invention
Given the above defects of the prior art, the present invention aims to provide multiple methods and apparatuses for processing to obtain media data, transmitting media data, processing media data, processing reference picture requests, and specifying reference pictures, so as to achieve correct decoding and efficient transmission of bitstreams produced by library-based video coding and to improve transmission and storage efficiency.
To achieve the above goal, the present invention adopts the following technical solutions:
A first aspect of the present invention provides a method of specifying a reference picture, the method comprising:
the decoder extracts first identification information from a reference mapping table to determine whether the reference picture numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules;
when the numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules, the decoder extracts from the reference mapping table second identification information corresponding to at least one reference index j to determine the numbering rule used by the reference picture number corresponding to reference index j;
the decoder extracts from the reference mapping table the reference picture number corresponding to reference index j;
when the numbering rule used by the reference picture number is the first numbering rule, the decoder determines the reference picture of the current picture from that reference picture number using the same numbering rule as the current picture;
when the numbering rule used by the reference picture number is the second numbering rule, the decoder determines the reference picture of the current picture using that reference picture number from reference picture information returned from outside the decoder.
Further, the method also comprises:
the decoder obtains from a reference mapping update table the reference picture number and second identification information corresponding to at least one reference index j;
when reference index j from the update table exists in the reference mapping table, the reference picture number and second identification information corresponding to reference index j in the reference mapping table are replaced with the reference number and second identification information corresponding to reference index j obtained from the update table;
when reference index j from the update table does not exist in the reference mapping table, reference index j and its corresponding reference picture number and second identification information obtained from the update table are added to the reference mapping table.
Further, the method also comprises:
when the decoder decodes the current picture using a reference picture pointed to by a reference picture number that uses the second numbering rule, the decoder sets the distance between that reference picture and the current picture to a non-temporal distance.
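The decoder-side selection logic of the first aspect can be sketched as follows. This is a minimal illustrative model rather than a normative decoding process; the names `RefEntry` and `resolve_reference` are hypothetical, and the string placeholders stand in for actual decoded pictures.

```python
# Hedged sketch: resolving one reference index under mixed numbering rules.
from dataclasses import dataclass
from typing import Dict


@dataclass
class RefEntry:
    uses_library_rule: bool  # the "second identification information" for index j
    number: int              # library number, or delta to the current picture


def resolve_reference(entry: RefEntry, current_doi: int,
                      library_buffer: Dict[int, str]) -> str:
    """Return the reference picture for one reference index."""
    if entry.uses_library_rule:
        # Second numbering rule: the number indexes a library picture
        # supplied from outside the decoder (the library picture buffer).
        return library_buffer[entry.number]
    # First numbering rule: the same rule as the current picture, here a
    # decoding-order delta relative to the current picture's number.
    return f"decoded_picture_{current_doi - entry.number}"
```

Separating the two rules this way is what lets the same reference index space address both ordinary decoded pictures and externally managed library pictures.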
A second aspect of the present invention provides a method of processing a reference picture request, the method comprising:
obtaining a dependency mapping table of at least one first-type segment to obtain the mapping between the reference picture numbers of at least one reference picture that the at least one first-type segment depends on and the locating information of the second-type segments to which those reference pictures belong;
receiving reference picture request information sent by the decoder to obtain the reference picture number of at least one reference picture that the current picture depends on, the current picture being contained in its first-type segment;
obtaining, from the dependency mapping table of the first-type segment containing the current picture, the locating information of the second-type segment containing the reference picture pointed to by at least one reference picture number in the request information;
using the locating information of the second-type segment to send the decoder the information of the reference pictures contained in the second-type segment pointed to by that locating information.
Further, the method also comprises:
obtaining the dependency mapping table of at least one first-type segment from media description information.
Further, using the locating information of the second-type segment to send the decoder the information of the reference pictures contained in the second-type segment pointed to by that locating information also comprises:
searching a cache for the second-type segment pointed to by the locating information of the second-type segment, or for the reference pictures it contains;
if the second-type segment or the reference pictures it contains exist in the cache, obtaining the second-type segment or the reference pictures it contains from the cache;
if the second-type segment or the reference pictures it contains do not exist in the cache, downloading the second-type segment from the server.
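The request-handling steps above (dependency lookup, cache check, download fallback) can be sketched as a small function. This is an illustrative model under assumed types: the dependency map is a plain dict from reference picture number to segment locator, and `download` is a caller-supplied callback, none of which are defined by the specification itself.

```python
# Hedged sketch: serving a decoder's reference picture request from a cache,
# downloading the second-type segment only on a cache miss.
def handle_reference_request(requested_numbers, dependency_map, cache, download):
    """Return reference-picture info for each requested reference picture number."""
    results = {}
    for number in requested_numbers:
        locator = dependency_map[number]   # locating info of the second-type segment
        if locator not in cache:           # not cached: fetch once from the server
            cache[locator] = download(locator)
        results[number] = cache[locator]   # info of the contained reference picture(s)
    return results
```

Because the cache is keyed by the segment locator, two pictures that depend on the same library segment trigger at most one download, which is the duplicate-download avoidance the method aims at.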
A third aspect of the present invention provides an apparatus for specifying a reference picture, the apparatus comprising:
a processor;
a memory; and
one or more programs for performing the following method:
the processor extracts first identification information from a reference mapping table to determine whether the reference picture numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules;
when the numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules, the processor extracts from the reference mapping table second identification information corresponding to at least one reference index j to determine the numbering rule used by the reference picture number corresponding to reference index j;
the processor extracts from the reference mapping table the reference picture number corresponding to reference index j;
when the numbering rule used by the reference picture number is the first numbering rule, the processor determines the reference picture of the current picture from that reference picture number using the same numbering rule as the current picture;
when the numbering rule used by the reference picture number is the second numbering rule, the processor determines the reference picture of the current picture using that reference picture number from reference picture information returned from outside the decoder;
the reference mapping table and reference pictures processed by the processor reside in the memory.
Further, the apparatus also comprises:
the processor obtains from a reference mapping update table the reference picture number and second identification information corresponding to at least one reference index j;
when reference index j from the update table exists in the reference mapping table, the processor replaces the reference picture number and second identification information corresponding to reference index j in the reference mapping table with the reference number and second identification information corresponding to reference index j obtained from the update table;
when reference index j from the update table does not exist in the reference mapping table, the processor adds reference index j and its corresponding reference picture number and second identification information obtained from the update table to the reference mapping table.
Further, the apparatus also comprises:
when the decoder decodes the current picture using a reference picture pointed to by a reference picture number that uses the second numbering rule, the processor sets the distance between that reference picture and the current picture to a non-temporal distance.
A fourth aspect of the present invention further provides an apparatus for processing a reference picture request, the apparatus comprising:
a processor;
a memory;
a transmitter; and
one or more programs for performing the following method:
the processor obtains a dependency mapping table of at least one first-type segment to obtain the mapping between the reference picture numbers of at least one reference picture that the at least one first-type segment depends on and the locating information of the second-type segments to which those reference pictures belong;
the processor receives reference picture request information sent by the decoder to obtain the reference picture number of at least one reference picture that the current picture depends on, the current picture being contained in its first-type segment;
the processor obtains, from the dependency mapping table of the first-type segment containing the current picture, the locating information of the second-type segment containing the reference picture pointed to by at least one reference picture number in the request information;
the transmitter uses the locating information of the second-type segment to send the decoder the information of the reference pictures contained in the second-type segment pointed to by that locating information;
the dependency mapping table and reference pictures processed by the processor reside in the memory.
Further, the apparatus also comprises:
the processor obtains the dependency mapping table of at least one first-type segment from media description information.
Further, the sending unit also comprises:
the processor searches a cache for the second-type segment pointed to by the locating information of the second-type segment, or for the reference pictures it contains;
if the second-type segment or the reference pictures it contains exist in the cache, the processor obtains the second-type segment or the reference pictures it contains from the cache;
if the second-type segment or the reference pictures it contains do not exist in the cache, the processor downloads the second-type segment from the server.
A fifth aspect of the present invention further provides an apparatus for specifying a reference picture, the apparatus comprising:
a first extraction unit for extracting first identification information from a reference mapping table to determine whether the reference picture numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules;
a second extraction unit for, when the numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules, extracting from the reference mapping table second identification information corresponding to at least one reference index j to determine the numbering rule used by the reference picture number corresponding to reference index j;
a third extraction unit for extracting from the reference mapping table the reference picture number corresponding to reference index j;
a first determination unit for, when the numbering rule used by the reference picture number is the first numbering rule, determining the reference picture of the current picture from that reference picture number using the same numbering rule as the current picture;
a second determination unit for, when the numbering rule used by the reference picture number is the second numbering rule, determining the reference picture of the current picture using that reference picture number from reference picture information returned from outside the decoder.
Further, the apparatus also comprises:
a fourth extraction unit for obtaining from a reference mapping update table the reference picture number and second identification information corresponding to at least one reference index j;
a replacement unit for, when reference index j from the update table exists in the reference mapping table, replacing the reference picture number and second identification information corresponding to reference index j in the reference mapping table with the reference number and second identification information corresponding to reference index j obtained from the update table;
an addition unit for, when reference index j from the update table does not exist in the reference mapping table, adding reference index j and its corresponding reference picture number and second identification information obtained from the update table to the reference mapping table.
Further, the apparatus also comprises:
a setting unit for, when the decoder decodes the current picture using a reference picture pointed to by a reference picture number that uses the second numbering rule, setting the distance between that reference picture and the current picture to a non-temporal distance.
A sixth aspect of the present invention further provides an apparatus for processing a reference picture request, the apparatus comprising:
a first obtaining unit for obtaining a dependency mapping table of at least one first-type segment to obtain the mapping between the reference picture numbers of at least one reference picture that the at least one first-type segment depends on and the locating information of the second-type segments to which those reference pictures belong;
a receiving unit for receiving reference picture request information sent by the decoder to obtain the reference picture number of at least one reference picture that the current picture depends on, the current picture being contained in its first-type segment;
a second obtaining unit for obtaining, from the dependency mapping table of the first-type segment containing the current picture, the locating information of the second-type segment containing the reference picture pointed to by at least one reference picture number in the request information;
a sending unit for using the locating information of the second-type segment to send the decoder the information of the reference pictures contained in the second-type segment pointed to by that locating information.
Further, the apparatus also comprises:
a third obtaining unit for obtaining the dependency mapping table of at least one first-type segment from media description information.
Further, the sending unit also comprises:
a searching unit for searching a cache for the second-type segment pointed to by the locating information of the second-type segment, or for the reference pictures it contains;
a fourth obtaining unit for, if the second-type segment or the reference pictures it contains exist in the cache, obtaining the second-type segment or the reference pictures it contains from the cache;
a downloading unit for, if the second-type segment or the reference pictures it contains do not exist in the cache, downloading the second-type segment from the server.
A seventh aspect of the present invention further provides a method of processing to obtain media data, the method comprising:
placing a sample entry of first media data into a first media track, where the first media data is timed media data and the sample entry contains metadata pointing to the samples of the first media data;
placing an access unit entry of second media data into a second media data box, where the access unit entry contains metadata pointing to the access units of the second media data, and the second media data is timed media data or non-timed media data;
marking at least two temporally discontinuous samples of the first media data as one sample group, where the at least two temporally discontinuous samples satisfy one of the following conditions:
if the second media data is timed media data, the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units in the second media data, and that group of access units is not aligned in time with at least one of the at least two temporally discontinuous samples;
if the second media data is non-timed media data, the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units in the second media data.
Further, the method also comprises:
if the second media data is timed media data, placing into the first media track track dependency information pointing to the second media data box, the track dependency information containing an identifier indicating that the group of access units is not aligned in time with at least one of the two temporally discontinuous samples.
Further, the method also comprises:
placing description information of the sample group into the first media track, the description information containing an identifier indicating that the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units.
An eighth aspect of the present invention further provides a method of processing to obtain media data, the method comprising:
placing a sample entry of first media data into a first media track, where the first media data is timed media data and the sample entry contains metadata pointing to the samples of the first media data;
placing an access unit entry of second media data into a second media data box, where the access unit entry contains metadata pointing to the access units of the second media data, and the second media data is timed media data or non-timed media data;
placing respective dependency metadata for each of at least two temporally discontinuous samples of the first media data, where the at least two temporally discontinuous samples satisfy one of the following conditions:
if the second media data is timed media data, the dependency metadata of each sample contains index information pointing to the same group of access units in the second media data, the index information being information other than the presentation time information of the samples of the first media data; the encoding or decoding of the at least two temporally discontinuous samples references that group of access units, and that group of access units is not aligned in time with at least one of the at least two temporally discontinuous samples;
if the second media data is non-timed media data, the dependency metadata of each sample contains index information pointing to the same group of access units in the second media data, the index information being information other than the presentation time information of the samples of the first media data, and the encoding or decoding of the at least two temporally discontinuous samples references that group of access units.
Further, placing respective dependency metadata for each of the at least two temporally discontinuous samples of the first media data also comprises:
placing the dependency metadata into timed metadata;
placing a sample entry of the timed metadata into a timed metadata track.
Further, placing respective dependency metadata for each of the at least two temporally discontinuous samples of the first media data also comprises:
placing the dependency metadata into a segment index box.
A ninth aspect of the present invention further provides a method of processing media data, the method comprising:
extracting first media data and second media data, where the first media data is timed media data and the second media data is timed media data or non-timed media data;
extracting a sample group from the track to which the first media data belongs, the sample group containing at least two temporally discontinuous samples;
locating, according to the description information of the sample group, a group of access units in the second media data for each of the at least two temporally discontinuous samples, the index information of the group of access units being included in the description information of the sample group; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples are located to the same group of access units in the second media data, and that group of access units is not aligned in time period with at least one of the at least two samples of the first media data; or
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit.
Further, the method also comprises:
if the second media data is timed media data, parsing, from the track to which the first media data belongs, the identifier of the track dependency information pointing to the box to which the second media data belongs, to obtain the information that the group of access units is not aligned in time with at least one of the two temporally discontinuous samples.
Further, the method also comprises:
parsing an identifier from the description information of the sample group in the first media track to obtain the information that the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units.
A tenth aspect of the present invention further provides a method of processing media data, the method comprising:
extracting first media data and second media data, where the first media data is timed media data and the second media data is timed media data or non-timed media data;
extracting at least two temporally discontinuous samples from the first media data;
extracting dependency metadata for each of the at least two temporally discontinuous samples of the first media data;
locating, according to the dependency metadata, a group of access units in the second media data for each of the at least two temporally discontinuous samples, the index information of the group of access units being included in the dependency metadata; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples are located to the same group of access units in the second media data, and that group of access units is not aligned in time period with at least one of the at least two samples of the first media data; or
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit.
Further, extracting dependency metadata for each of the at least two temporally discontinuous samples of the first media data also comprises:
extracting the timed metadata pointed to by the sample entries in the timed metadata track;
extracting the dependency metadata from the timed metadata.
Further, extracting dependency metadata for each of the at least two temporally discontinuous samples of the first media data also comprises:
extracting the dependency metadata from a segment index box.
An eleventh aspect of the present invention further provides a method of transmitting media data, the method comprising:
splitting first media data into media fragment units, where the first media data is timed media data and includes at least two temporally discontinuous samples;
extracting the dependency index information corresponding to the media fragment units of the first media data, the dependency index information being information other than the presentation time information of the sample to which each media fragment unit belongs;
transmitting the extracted media fragment units of the first media data;
locating, according to the dependency index information corresponding to a media fragment unit of the first media data, a second media data access unit that is referenced by the encoding or decoding of the first media data sample to which that media fragment unit belongs; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples of the first media data are located to the same second media data access unit, and that second media data access unit is not aligned in time period with at least one of the at least two samples of the first media data;
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit;
searching a simulated cache for the second media data access unit;
if the second media data access unit does not exist in the simulated cache, splitting the second media data access unit into media fragment units;
transmitting the media fragment units into which the second media data access unit was split.
Further, extracting the dependency index information corresponding to the media fragment units of the first media data also comprises:
extracting the dependency index information corresponding to a media fragment unit from the hint track sample containing the fragmentation information of that media fragment unit.
Further, extracting the dependency information corresponding to the media fragment units of the first media data also comprises:
extracting the dependency index information corresponding to a media fragment unit from the timed metadata corresponding to that media fragment unit.
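The transmission loop of the eleventh aspect can be sketched as follows. This is a simplified illustration under assumed data shapes (samples as `(sample_id, mfu_list)` pairs, a dict as dependency index, callables for sending and splitting); it is not the normative sender behavior.

```python
# Hedged sketch: transmit video-layer MFUs, and send each referenced
# library-layer access unit at most once using a simulated cache.
def transmit_stream(samples, dependency_index, simulated_cache, send, split):
    """samples: list of (sample_id, mfu_list); dependency_index maps a
    sample_id to the second-media access unit its decoding references."""
    for sample_id, mfus in samples:
        for mfu in mfus:
            send(mfu)                         # transmit the first media data MFUs
        access_unit = dependency_index[sample_id]
        if access_unit not in simulated_cache:
            # Cache miss: the receiver has not been sent this access unit yet,
            # so split it into MFUs and transmit them once.
            simulated_cache.add(access_unit)
            for mfu in split(access_unit):
                send(mfu)
```

The simulated cache mirrors what the receiver is assumed to hold, which is how repeated transmission of the same library-layer data is avoided when temporally discontinuous samples share one access unit.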
A twelfth aspect of the present invention further provides an apparatus for processing to obtain media data, the apparatus comprising:
a processor;
a memory; and
one or more programs for performing the following method:
the processor places a sample entry of first media data into a first media track, where the first media data is timed media data and the sample entry contains metadata pointing to the samples of the first media data;
the processor places an access unit entry of second media data into a second media data box, where the access unit entry contains metadata pointing to the access units of the second media data, and the second media data is timed media data or non-timed media data;
the processor marks at least two temporally discontinuous samples of the first media data as one sample group, where the at least two temporally discontinuous samples satisfy one of the following conditions:
if the second media data is timed media data, the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units in the second media data, and that group of access units is not aligned in time with at least one of the at least two temporally discontinuous samples;
if the second media data is non-timed media data, the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units in the second media data;
the media data obtained by the above processing of the processor resides in the memory.
A thirteenth aspect of the present invention further provides an apparatus for processing to obtain media data, the apparatus comprising:
a processor;
a memory; and
one or more programs for performing the following method:
the processor places a sample entry of first media data into a first media track, where the first media data is timed media data and the sample entry contains metadata pointing to the samples of the first media data;
the processor places an access unit entry of second media data into a second media data box, where the access unit entry contains metadata pointing to the access units of the second media data, and the second media data is timed media data or non-timed media data;
the processor places respective dependency metadata for each of at least two temporally discontinuous samples of the first media data, where the at least two temporally discontinuous samples satisfy one of the following conditions:
if the second media data is timed media data, the dependency metadata of each sample contains index information pointing to the same group of access units in the second media data, the index information being information other than the presentation time information of the samples of the first media data; the encoding or decoding of the at least two temporally discontinuous samples references that group of access units, and that group of access units is not aligned in time with at least one of the at least two temporally discontinuous samples;
if the second media data is non-timed media data, the dependency metadata of each sample contains index information pointing to the same group of access units in the second media data, the index information being information other than the presentation time information of the samples of the first media data, and the encoding or decoding of the at least two temporally discontinuous samples references that group of access units;
the media data obtained by the above processing of the processor resides in the memory.
A fourteenth aspect of the present invention further provides an apparatus for processing media data, the apparatus comprising:
a processor;
a memory; and
one or more programs for performing the following method:
the processor processes the media data residing in the memory;
the processor extracts first media data and second media data, where the first media data is timed media data and the second media data is timed media data or non-timed media data;
the processor extracts a sample group from the track to which the first media data belongs, the sample group containing at least two temporally discontinuous samples;
the processor locates, according to the description information of the sample group, a group of access units in the second media data for each of the at least two temporally discontinuous samples, the index information of the group of access units being included in the description information of the sample group; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples are located to the same group of access units in the second media data, and that group of access units is not aligned in time period with at least one of the at least two samples of the first media data; or
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit.
A fifteenth aspect of the present invention further provides an apparatus for processing media data, the apparatus comprising:
a processor;
a memory; and
one or more programs for performing the following method:
the processor processes the media data residing in the memory;
the processor extracts first media data and second media data, where the first media data is timed media data and the second media data is timed media data or non-timed media data;
the processor extracts at least two temporally discontinuous samples from the first media data;
the processor extracts dependency metadata for each of the at least two temporally discontinuous samples of the first media data;
the processor locates, according to the dependency metadata, a group of access units in the second media data for each of the at least two temporally discontinuous samples, the index information of the group of access units being included in the dependency metadata; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples are located to the same group of access units in the second media data, and that group of access units is not aligned in time period with at least one of the at least two samples of the first media data; or
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit.
A sixteenth aspect of the present invention further provides an apparatus for transmitting media data, the apparatus comprising:
a processor;
a memory;
a transmitter; and
one or more programs for performing the following method:
the processor processes the media data residing in the memory;
the processor splits the first media data into media fragment units, where the first media data is timed media data and includes at least two temporally discontinuous samples;
the processor extracts the dependency index information corresponding to the media fragment units of the first media data, the dependency index information being information other than the presentation time information of the sample to which each media fragment unit belongs;
the transmitter transmits the extracted media fragment units of the first media data;
the processor locates, according to the dependency index information corresponding to a media fragment unit of the first media data, a second media data access unit that is referenced by the encoding or decoding of the first media data sample to which that media fragment unit belongs; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples of the first media data are located to the same second media data access unit, and that second media data access unit is not aligned in time period with at least one of the at least two samples of the first media data; or
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit;
the processor searches a simulated cache for the second media data access unit;
if the second media data access unit does not exist in the simulated cache, the processor splits the second media data access unit into media fragment units;
the transmitter transmits the media fragment units into which the second media data access unit was split.
A seventeenth aspect of the present invention further provides an apparatus for processing to obtain media data, the apparatus comprising:
a first placing unit for placing a sample entry of first media data into a first media track, where the first media data is timed media data and the sample entry contains metadata pointing to the samples of the first media data;
a second placing unit for placing an access unit entry of second media data into a second media data box, where the access unit entry contains metadata pointing to the access units of the second media data, and the second media data is timed media data or non-timed media data;
a marking unit for marking at least two temporally discontinuous samples of the first media data as one sample group, where the at least two temporally discontinuous samples satisfy one of the following conditions:
if the second media data is timed media data, the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units in the second media data, and that group of access units is not aligned in time with at least one of the at least two temporally discontinuous samples;
if the second media data is non-timed media data, the encoding or decoding of the at least two temporally discontinuous samples references the same group of access units in the second media data.
An eighteenth aspect of the present invention further provides an apparatus for processing to obtain media data, the apparatus comprising:
a first placing unit for placing a sample entry of first media data into a first media track, where the first media data is timed media data and the sample entry contains metadata pointing to the samples of the first media data;
a second placing unit for placing an access unit entry of second media data into a second media data box, where the access unit entry contains metadata pointing to the access units of the second media data, and the second media data is timed media data or non-timed media data;
a third placing unit for placing respective dependency metadata for each of at least two temporally discontinuous samples of the first media data, where the at least two temporally discontinuous samples satisfy one of the following conditions:
if the second media data is timed media data, the dependency metadata of each sample contains index information pointing to the same group of access units in the second media data, the index information being information other than the presentation time information of the samples of the first media data; the encoding or decoding of the at least two temporally discontinuous samples references that group of access units, and that group of access units is not aligned in time with at least one of the at least two temporally discontinuous samples;
if the second media data is non-timed media data, the dependency metadata of each sample contains index information pointing to the same group of access units in the second media data, the index information being information other than the presentation time information of the samples of the first media data, and the encoding or decoding of the at least two temporally discontinuous samples references that group of access units.
A nineteenth aspect of the present invention further provides an apparatus for processing media data, the apparatus comprising:
a first extraction unit for extracting first media data and second media data, where the first media data is timed media data and the second media data is timed media data or non-timed media data;
a second extraction unit for extracting a sample group from the track to which the first media data belongs, the sample group containing at least two temporally discontinuous samples;
a locating unit for locating, according to the description information of the sample group, a group of access units in the second media data for each of the at least two temporally discontinuous samples, the index information of the group of access units being included in the description information of the sample group; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples are located to the same group of access units in the second media data, and that group of access units is not aligned in time period with at least one of the at least two samples of the first media data; or
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit.
A twentieth aspect of the present invention further provides an apparatus for processing media data, the apparatus comprising:
a first extraction unit for extracting first media data and second media data, where the first media data is timed media data and the second media data is timed media data or non-timed media data;
a second extraction unit for extracting at least two temporally discontinuous samples from the first media data;
a third extraction unit for extracting dependency metadata for each of the at least two temporally discontinuous samples of the first media data;
a locating unit for locating, according to the dependency metadata, a group of access units in the second media data for each of the at least two temporally discontinuous samples, the index information of the group of access units being included in the dependency metadata; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples are located to the same group of access units in the second media data, and that group of access units is not aligned in time period with at least one of the at least two samples of the first media data; or
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit.
A twenty-first aspect of the present invention further provides an apparatus for transmitting media data, the apparatus comprising:
a first splitting unit for splitting first media data into media fragment units, where the first media data is timed media data and includes at least two temporally discontinuous samples;
an extraction unit for extracting the dependency index information corresponding to the media fragment units of the first media data, the dependency index information being information other than the presentation time information of the sample to which each media fragment unit belongs;
a first transmission unit for transmitting the extracted media fragment units of the first media data;
a locating unit for locating, according to the dependency index information corresponding to a media fragment unit of the first media data, a second media data access unit that is referenced by the encoding or decoding of the first media data sample to which that media fragment unit belongs; where the second media data satisfies one of the following conditions:
1) if the second media data is timed media data, the at least two temporally discontinuous samples of the first media data are located to the same second media data access unit, and that second media data access unit is not aligned in time period with at least one of the at least two samples of the first media data;
2) if the second media data is non-timed media data, the two samples of the first media data are located to the same second media data access unit;
a searching unit for searching a simulated cache for the second media data access unit;
a second splitting unit for splitting the second media data access unit into media fragment units if the second media data access unit does not exist in the simulated cache;
a second transmission unit for transmitting the media fragment units into which the second media data access unit was split.
The present invention discloses methods and apparatuses for processing to obtain media data, transmitting media data, and processing media data. Together they form a complete set of methods and apparatuses from the encoder side to the decoder side, guaranteeing correct decoding and efficient transmission of the video-layer and library-layer bitstream data in bitstreams produced by library-based video coding, and improving transmission and storage efficiency.
First, the method of processing to obtain media data places the video-layer bitstream, the library-layer bitstream, and the dependency index relationship between them into the media data or the file containing the media data. Then the method of transmitting media data uses the dependency information between the video-layer data and library-layer data in media data encoded with the library-based coding method to synchronize the video-layer data and library-layer data accurately and to avoid duplicate storage and duplicate downloads of library-layer data. Next, the method of processing media data extracts, at the receiver, the video-layer data and the library-layer data it references from media data encoded with the library-based coding method. Then the method of processing reference picture requests obtains reference pictures from the processed library-layer bitstream data, according to the decoder's reference picture request and the dependency index relationship between the video-layer and library-layer bitstream data, and provides them to the decoder. Finally, the method of specifying reference pictures lets the decoder, according to the dependency index information, specify library pictures in the library-layer data as reference pictures for pictures in the video-layer data, where the library pictures do not belong to the set of pictures to be displayed of the random access segment containing the current picture or of the immediately preceding random access segment.
These methods solve the problem that existing methods cannot provide library pictures as reference pictures for the current picture, guarantee that pictures in the video-layer data are encoded and decoded with the correct library pictures, improve transmission and storage efficiency, and ensure correct decoding of the video-layer data at the receiver.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed in describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of this application; a person of ordinary skill in the art may derive other drawings from them without creative effort.
Figure 1 is a diagram of the picture dependencies of a video sequence split into random access segments under prior art scheme one.
Figure 2 is a diagram of the picture dependencies of a video sequence split into random access segments under prior art scheme two.
Figure 3 is a diagram of the picture dependencies of a video sequence split into random access segments under prior art scheme three.
Figure 4 is a diagram of the picture dependencies of a video sequence split into random access segments under prior art scheme four.
Figure 5 is a flowchart of a method of specifying a reference picture provided by an embodiment of the present invention.
Figure 6 is another flowchart of a method of specifying a reference picture provided by an embodiment of the present invention.
Figure 7 is a flowchart of a method of processing a reference picture request provided by an embodiment of the present invention.
Figure 8 is another flowchart of a method of processing a reference picture request provided by an embodiment of the present invention.
Figure 9 is a system diagram of the method of specifying a reference picture and the method of processing a reference picture request provided by an embodiment of the present invention.
Figure 10 is a structural diagram of an apparatus for specifying a reference picture provided by an embodiment of the present invention.
Figure 11 is another structural diagram of an apparatus for specifying a reference picture provided by an embodiment of the present invention.
Figure 12 is a structural diagram of an apparatus for processing a reference picture request provided by an embodiment of the present invention.
Figure 13 is another structural diagram of an apparatus for processing a reference picture request provided by an embodiment of the present invention.
Figure 14 is a system diagram of the method of specifying a reference picture and the method of processing a reference picture request provided by an embodiment of the present invention.
Figure 15 is a system diagram of the method of specifying a reference picture and the method of processing a reference picture request provided by an embodiment of the present invention.
Figure 16 is a system diagram of the method of specifying a reference picture and the method of processing a reference picture request provided by an embodiment of the present invention.
Figure 17 is a system diagram of the method of specifying a reference picture and the method of processing a reference picture request provided by an embodiment of the present invention.
Figure 18 is a diagram of the structural relationships of media data using the library-based coding method, provided by an embodiment of the present invention.
Figure 19 is a diagram of a method of processing to obtain media data provided by an embodiment of the present invention.
Figure 20 is a diagram of a method of processing to obtain media data provided by an embodiment of the present invention.
Figure 21 is a diagram of a method of processing to obtain media data provided by an embodiment of the present invention.
Figure 22 is a diagram of a method of processing to obtain media data provided by an embodiment of the present invention.
Figure 23 is a diagram of a method of processing to obtain media data provided by an embodiment of the present invention.
Figure 24 is a diagram of a method of transmitting media data provided by an embodiment of the present invention.
Figure 25 is a diagram of a method of transmitting media data provided by an embodiment of the present invention.
Figure 26 is a diagram of a method of transmitting media data provided by an embodiment of the present invention.
Figure 27 is a diagram of a method of transmitting media data provided by an embodiment of the present invention.
Figure 28 is a diagram of a method of transmitting media data provided by an embodiment of the present invention.
Figure 29 is a diagram of a method of transmitting media data provided by an embodiment of the present invention.
Detailed Description
To make the purpose, technical solutions, and advantages of the present invention clearer, the invention is further described in detail below with reference to the drawings.
Before describing the embodiments, necessary definitions are stated:
Library picture: a library picture is a picture outside the set of pictures to be displayed of the random access segment containing the current picture and of the immediately preceding random access segment. A library picture is a kind of reference picture used to provide reference for pictures to be encoded or decoded.
First embodiment: a method of specifying a reference picture is provided. Figure 5 shows an example flow of this embodiment, which includes:
Step 101: the decoder extracts first identification information from a reference mapping table to determine whether the reference picture numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules;
Step 102: when the reference picture numbers corresponding to the reference indices in the reference mapping table use at least two numbering rules, the decoder extracts from the reference mapping table second identification information corresponding to at least one reference index j to determine the numbering rule used by the reference picture number corresponding to reference index j, where j is a natural number;
Step 103: the decoder extracts from the reference mapping table the reference picture number corresponding to reference index j;
Step 104: when the numbering rule used by the reference picture number is the first numbering rule, the decoder determines the reference picture of the current picture from that reference picture number using the same numbering rule as the current picture;
Step 105: when the numbering rule used by the reference picture number is the second numbering rule, the decoder determines the reference picture of the current picture using that reference picture number from reference picture information returned from outside the decoder.
Second embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs from the first embodiment as follows:
In the AVS3 standard, reference_configuration_set in the syntax table represents the reference mapping table, the syntax reference_to_library_enable_flag represents the first identification information, the syntax is_library_pid_flag represents the second identification information, the syntax library_pid represents a number using the second numbering rule, and the syntax delta_doi_of_reference_picture represents the difference between a number using the first numbering rule and the number of the current picture. A syntax example is shown in Table 1.
Table 1: a syntax example of reference_configuration_set carrying identification information and numbering information
[Syntax table shown as image PCTCN2019102025-appb-000001 in the original]
The semantics of the syntax are as follows:
Library reference flag reference_to_library_enable_flag[i]: a binary variable. The value '1' indicates that at least one reference picture of the current picture is a library picture in the library picture buffer; the value '0' indicates that none of the current picture's reference pictures is a library picture in the library picture buffer. i is the number of the reference picture configuration set. ReferenceToLibraryEnableFlag[i] equals the value of reference_to_library_enable_flag[i]; if reference_to_library_enable_flag[i] is absent from the bitstream, ReferenceToLibraryEnableFlag[i] equals 0.
Number of reference pictures num_of_reference_picture[i]: a 3-bit unsigned integer indicating the number of reference pictures of the current picture, which shall not exceed the size of the reference picture buffer. NumOfRefPic[i] equals the value of num_of_reference_picture[i]. i is the number of the reference picture configuration set.
A bitstream conforming to this part shall satisfy the following requirements:
- if the PictureType of the current picture equals 0, the value of num_of_reference_picture[i] shall be 0;
- if the PictureType of the current picture equals 1 or 3, the value of num_of_reference_picture[i] shall be greater than or equal to 1;
- if the PictureType of the current picture equals 2, the value of num_of_reference_picture[i] shall be 2.
Library picture number flag is_library_pid_flag[i][j]: a binary variable. The value '1' indicates that the j-th reference picture in the current picture's reference list is a library picture in the library picture buffer, located in the library picture buffer by the library reference picture index library_pid[i][j]; the value '0' indicates that the j-th reference picture in the current picture's reference list is not a library picture in the library picture buffer, and the picture is located in the decoded picture buffer by delta_doi_of_reference_picture[i][j]. i is the number of the reference picture configuration set and j is the number of the reference picture. IsLibraryPidFlag[i][j] equals the value of is_library_pid_flag[i][j]. For a given i-th reference picture configuration, when the value of IsLibraryPidFlag[i][j] of any j-th reference picture is 1, the value of ReferenceToLibraryOnlyFlag[i] is 1.
Library picture number library_pid[i][j]: a 6-bit unsigned integer with value range 0 to 63, the number in the library picture buffer of the j-th reference picture in the current picture's reference list. i is the number of the reference picture configuration set and j is the number of the reference picture. LibraryPid[i][j] equals the value of library_pid[i][j].
Reference picture decoding-order offset delta_doi_of_reference_picture[i][j]: a 6-bit unsigned integer with value range 1 to 63, giving the difference in decoding order between the j-th reference picture in the current picture's reference list and the current picture. i is the number of the reference picture configuration set and j is the number of the reference picture. Within the same reference picture configuration set, the decoding-order offsets of differently numbered reference pictures shall all be different. DeltaDoiOfRefPic[i][j] equals the value of delta_doi_of_reference_picture[i][j].
From Table 1 above: for the i-th reference_configuration_set, when the value of reference_to_library_enable_flag[i] is 1, the numbers described by reference_configuration_set(i) use a mixed numbering rule. In that case, for the j-th reference index, when the value of is_library_pid_flag[i][j] is 0, the number uses the first numbering rule, for example delta_doi_of_reference_picture[i][j] gives the relative number of the reference picture, coded as a fixed-length code of an integer number of bits, e.g. a 6-bit fixed-length code; when the value of is_library_pid_flag[i][j] is 1, the number uses the second numbering rule, for example library_pid[i][j] gives the number of the reference picture, coded as a fixed-length code of an integer number of bits, e.g. a 6-bit fixed-length code.
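The parsing of one reference index of reference_configuration_set described above can be sketched with a toy bit reader. This is an illustrative simplification (bits modeled as a list of 0/1 integers, only the is_library_pid_flag and the 6-bit number fields parsed), not the full AVS3 syntax parser.

```python
# Hedged sketch: reading is_library_pid_flag (1 bit) followed by either
# library_pid or delta_doi_of_reference_picture (6-bit fixed-length codes).
def read_bits(bits, pos, n):
    """Read n bits (MSB first) from a list of 0/1 values; return (value, new_pos)."""
    value = 0
    for b in bits[pos:pos + n]:
        value = (value << 1) | b
    return value, pos + n


def parse_reference_entry(bits, pos, reference_to_library_enable_flag):
    """Parse one reference index entry of reference_configuration_set (sketch)."""
    entry = {}
    if reference_to_library_enable_flag:
        flag, pos = read_bits(bits, pos, 1)      # is_library_pid_flag: 1 bit
        entry["is_library_pid_flag"] = flag
    else:
        entry["is_library_pid_flag"] = 0         # flag absent, inferred as 0
    if entry["is_library_pid_flag"]:
        entry["library_pid"], pos = read_bits(bits, pos, 6)   # u(6), range 0..63
    else:
        entry["delta_doi"], pos = read_bits(bits, pos, 6)     # u(6), range 1..63
    return entry, pos
```

The point the sketch makes explicit is that the 1-bit flag switches which numbering rule the following 6-bit field is interpreted under.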
Third embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs from the first embodiment in that, in the H.265 standard, the first numbering rule uses the delta_poc_s0_minus1 or delta_poc_s1_minus1 fields to represent the relative number of a picture in output order, the relative number being the difference in output-order numbering between the pointed-to picture and the current picture.
Fourth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that the first numbering rule is a numbering rule related to display order, for example assigning numbers to pictures according to rules including but not limited to the display order, decoding order, or output order of the pictures.
Fifth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that the second numbering rule is a numbering rule unrelated to display order, for example assigning numbers to pictures according to rules including but not limited to the generation order, extraction order, or usage order of the pictures, or at random.
Sixth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that the picture set using the first numbering rule is the set of pictures for display or output in the video sequence containing the current picture.
Seventh embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that the picture set using the first numbering rule includes at least one of intra-coded pictures and inter-coded pictures.
Eighth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that the picture set using the second numbering rule is the library picture set.
Ninth embodiment: a method of specifying a reference picture, derived from the eighth embodiment; it differs in that library pictures may include, but are not limited to, at least one of background pictures of the video sequence, scene-cut pictures of the video sequence, pictures obtained by modeling pictures of the video sequence, and pictures synthesized from pictures of the video sequence, where background pictures can be obtained by background modeling of the video sequence and scene-cut pictures by scene-change detection on the video sequence.
Tenth embodiment: a method of specifying a reference picture, derived from the eighth embodiment; it differs in that library pictures are stored in a second buffer different from the first buffer that stores pictures using the first numbering rule; for example, the second buffer is the library picture buffer.
Eleventh embodiment: a method of specifying a reference picture, derived from the tenth embodiment; it differs in that the maximum buffer capacity is the sum of the maximum capacities of the first buffer and the second buffer.
Twelfth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that, in the picture set contained in the bitstream to which the reference mapping table belongs, the numbers corresponding to the reference indices in the reference mapping table of at least one picture use the mixed numbering rule, i.e. that picture uses at least one library picture as a reference picture.
Thirteenth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that, in the picture set contained in the bitstream to which the reference mapping table belongs, the numbers corresponding to the reference indices in the reference mapping table of at least one picture A use the first numbering rule and those of at least one other picture B use the second numbering rule, i.e. picture B uses only library pictures as reference pictures.
Fourteenth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that the reference mapping table is carried in the sequence header, picture header, or slice header.
Fifteenth embodiment: a method of specifying a reference picture, derived from the first embodiment; Figure 6 shows an example flow of this embodiment. It differs from the first embodiment in that, before step 101 is performed, the method also includes a method of updating the reference mapping table, comprising:
Step 201: the decoder extracts a reference mapping update table to obtain the number and second identification information corresponding to at least one reference index j;
Step 202: when reference index j from the update table exists in the reference mapping table, the number and second identification information corresponding to reference index j in the reference mapping table are replaced with the number and second identification information corresponding to reference index j in the update table;
Step 203: when reference index j from the update table does not exist in the reference mapping table, reference index j and its corresponding number and second identification information from the update table are added to the reference mapping table.
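Steps 201 to 203 amount to an upsert over the reference mapping table. A minimal illustrative sketch, with entries modeled as `(number, second_id_info)` tuples keyed by reference index (a representation chosen here for clarity, not mandated by the method):

```python
# Hedged sketch: applying a reference mapping update table to the
# reference mapping table (replace existing indices, add new ones).
def update_reference_map(ref_map, update_table):
    """ref_map / update_table: dict mapping reference index j to a
    (number, second_identification_info) tuple."""
    for j, entry in update_table.items():
        ref_map[j] = entry  # replaces when j exists (step 202), adds otherwise (step 203)
    return ref_map
```

Indices absent from the update table keep their previous entries, so the update table only needs to carry the indices that change.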
Sixteenth embodiment: a method of specifying a reference picture, derived from the fifteenth embodiment; it differs in that the reference mapping update table contains only at least one reference index and the number it points to that uses the second numbering rule. In this case, when updating the number corresponding to at least one reference index of the update table in the reference mapping table, the number in the reference mapping table is simultaneously marked as using the second numbering rule.
Seventeenth embodiment: a method of specifying a reference picture, derived from the fifteenth embodiment; it differs in that the reference mapping update table is carried in the picture header or slice header.
Eighteenth embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that the method also includes:
Step 301: when the decoder decodes the current picture using a reference picture pointed to by a number that uses the second numbering rule, the decoder sets the distance between that reference picture and the current picture to a non-temporal distance.
Nineteenth embodiment: a method of specifying a reference picture, derived from the eighteenth embodiment; it differs in that the non-temporal distance is a given fixed non-zero value.
Twentieth embodiment: a method of specifying a reference picture, derived from the eighteenth embodiment; it differs in that the non-temporal distance is a non-zero value computed from the similarity between the current picture and the reference picture pointed to by the number that uses the second numbering rule.
Twenty-first embodiment: a method of specifying a reference picture, derived from the first embodiment; it differs in that, before step 101 is performed, the method also includes:
Step 401: the decoder extracts third identification information to determine whether the first identification information exists in the reference mapping table.
Twenty-second embodiment: a method of specifying a reference picture, derived from the second embodiment; it differs from the second embodiment as follows:
In the sequence header of the AVS3 standard, library_picture_enable_flag represents the third identification information (shown in italics in the syntax example of Table 2), and correspondingly reference_configuration_set represents the reference mapping table, with the syntax example shown in Table 3.
Table 2: a syntax example carrying the third identification information
[Syntax table shown as image PCTCN2019102025-appb-000002 in the original]
Table 3: another syntax example of reference_configuration_set carrying identification information and numbering information
[Syntax table shown as image PCTCN2019102025-appb-000003 in the original]
The semantics of the syntax are as follows:
Library picture enable flag library_picture_enable_flag: a binary variable. The value '1' indicates that the video sequence may contain library pictures and that pictures are allowed to use pictures in the library buffer as reference pictures; the value '0' indicates that the video sequence shall not contain library pictures and that pictures are not allowed to use pictures in the library buffer as reference pictures. LibraryPictureEnableFlag equals the value of library_picture_enable_flag.
From the tables above: when the value of library_picture_enable_flag is 1, LibraryPictureEnableFlag is 1, and in that case reference_to_library_enable_flag[i] is present in reference_configuration_set(i). For the i-th reference_configuration_set, when the value of reference_to_library_enable_flag[i] is 1, the numbers described by reference_configuration_set(i) use a mixed numbering rule. In that case, for the j-th reference index, when the value of is_library_pid_flag[i][j] is 0, the number uses the first numbering rule, for example delta_doi_of_reference_picture[i][j] gives the relative number of the reference picture as a fixed-length code of an integer number of bits, e.g. a 6-bit fixed-length code; when the value of is_library_pid_flag[i][j] is 1, the number uses the second numbering rule, for example library_pid[i][j] gives the number of the reference picture as a fixed-length code of an integer number of bits, e.g. a 6-bit fixed-length code.
Twenty-third embodiment: a method of processing a reference picture request is provided; Figure 7 shows a flowchart of this embodiment, which includes:
Step 501: obtaining a dependency mapping table of at least one first-type segment, where the dependency mapping table describes the mapping between the numbers of at least one reference picture that its first-type segment depends on and the locating information of the second-type segments containing those reference pictures;
Step 502: receiving reference picture request information sent by the decoder to obtain the number of at least one reference picture that the current picture depends on;
Step 503: obtaining, from the dependency mapping table of the first-type segment containing the current picture, the locating information of the second-type segment containing the reference picture pointed to by at least one reference picture number in the request information;
Step 504: using the locating information of the second-type segment to send the decoder the information of the library pictures contained in the second-type segment pointed to by that locating information.
Twenty-fourth embodiment: a method of processing a reference picture request, derived from the twenty-third embodiment; it differs in that:
Step 601: the dependency mapping table of at least one first-type segment is obtained from media description information.
Twenty-fifth embodiment: a method of processing a reference picture request, derived from the twenty-fourth embodiment; it differs in that:
In the Media Presentation Description (MPD) of the Dynamic Adaptive Streaming over HTTP (DASH) standard, a segment dependency descriptor represents the dependency mapping table information of the segment it belongs to. The segment dependency descriptor is represented by the dependent_segment descriptor, whose attribute is represented by @dependent_segment_indicator; the @dependent_segment_indicator attribute describes the locating information of one second-type segment that the first-type segment containing the dependent_segment descriptor depends on, together with the numbering information of the library pictures it contains. The numbering information is described by the attribute @pictureID and the locating information by the attribute @dependentSegmentURL. Table 4 shows an example of the syntax of the segment dependency descriptor.
Table 4: a syntax example of the segment dependency descriptor
[Syntax table shown as images PCTCN2019102025-appb-000004 and PCTCN2019102025-appb-000005 in the original]
The semantics of the syntax are given in Table 5.
Table 5: an example of the syntax semantics of the segment dependency descriptor
[Table shown as image PCTCN2019102025-appb-000006 in the original]
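Turning the parsed dependent_segment descriptor attributes into per-segment dependency mapping tables can be sketched as below. The input shape is an assumption for illustration (already-parsed attribute dicts rather than the MPD XML itself), and the field names mirror the @pictureID and @dependentSegmentURL attributes.

```python
# Hedged sketch: building LibPID -> LibURL dependency mapping tables, one
# per first-type (sequence) segment, from parsed descriptor attributes.
def build_dependency_maps(segments):
    """segments: list of dicts, each with the sequence segment's URL and the
    dependent_segment descriptor entries attached to it."""
    maps = {}
    for seg in segments:
        maps[seg["segment_url"]] = {
            dep["pictureID"]: dep["dependentSegmentURL"]  # number -> locator
            for dep in seg["deps"]
        }
    return maps
```

The resulting table is exactly what the client-side lookup needs: given the segment being decoded and a requested library picture number, it returns the URL of the library segment to fetch.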
第二十六个实施例:提供一种处理参考图像请求的方法,本实施例在第二十五个实施例的基础上变化得到,与第二十五个实施例不同的是:
在传输文件或传输封包单元的文件格式层级,使用样本入口数据盒LayerSMTHintSampleEntry描述知识图像和序列图像所在码流的样本入口,并以语法is_library_layer标识码流包含知识图像还是序列图像,使用样本数据盒LayerMediaSample描述序列图像所属码流的样本,以数据盒LayerInfo描述其所属码流样本依赖的知识图像所属码流和样本的编号信息,其中以library_layer_in_ceu_sequence_number描述所述被依赖的知识图像所述码流样本所属通用封装单元的编号,以library_layer_in_mfu_sequence_number描述所述被依赖的知识图像所述码流样本在通用封装单元中最小分片单元的编号。详细语法语义见下文。
Figure PCTCN2019102025-appb-000007
Figure PCTCN2019102025-appb-000008
语义:
has_mfus_flag–标识通用封装单元CEU是否被分片为最小分片单元MFU。值为1,表示CEU被每个提示样本为一个MFU;值为0,表示每个每个CEU仅包含一个样本。
is_library_layer-标识该轨道提示的媒体数据是否为知识层媒体数据。值为1,表示媒体数据是知识层媒体数据,包含知识图像的码流;值为0,表示媒体数据是序列层媒体数据,包含序列图像的码流。
Figure PCTCN2019102025-appb-000009
语义:
sequence_number–CEU中MFU的序列编号。
Trackrefindex–提取该MFU的媒体轨道的编号。
Samplenumber-提取该MFU的样本编号。Samplenumber n表示CEU中累积第n个‘moof’盒子对应的样本。CEU中第一个样本的samplenumber应为0。
offset-描述该MFU包含的媒体数据的从‘mdat’数据盒开始的位置偏移量。
length-描述该MFU包含的媒体数据的字节长度。
library_layer_in_ceu_sequence_number–描述该MFU解码所依赖的MFU在知识层媒体资源中的CEU的编号。
library_layer_in_mfu_sequence_number–描述该MFU解码所需的MFU在其CEU中的编号。
第二十七个实施例:提供一种处理参考图像请求的方法,本实施例在第二十三个实施例的基础上变化得到,如图8所示,与第二十三个实施例不同的是,步骤404使用所述第二片段的定位信息向解码器发送所述定位信息指向的所述第二类片段包含的知识图像的信息还包括:
步骤701:在缓存中查找所述第二类片段的所述定位信息指向的第二类片段或所述第二类片段包含的知识图像;
步骤702:如果所述缓存中存在所述第二类片段或所述第二类片段包含的知识图像,从所述缓存中获取所述第二类片段或所述第二类片段包含的知识图像;
步骤703:如果所述缓存中不存在所述第二类片段或所述第二类片段包含的知识图像,从服务端下载所述第二类片段。
第二十八个实施例:提供一种处理参考图像请求的方法,本实施例在第二十三个实施例的基础上变化得到,与第二十三个实施例不同的是:所述第二类片段包含一个知识图像。
第二十九个实施例:提供一种处理参考图像请求的方法,本实施例在第二十三个实施例的基础上变化得到,与第二十三个实施例不同的是:所述定位信息可以是包括但不限于统一资源定位符(Uniform Resource Locator,URL)和统一资源标识符(Uniform Resource Identifier,URI)的一种。
第三十个实施例:提供一种处理参考图像请求的方法,本实施例在第二十三个实施例的基础上变化得到,与第二十三个实施例不同的是:向解码器发送的所述定位信息指向的所述第二类片段包含的知识图像的信息为知识图像像素值。
第三十一个实施例:提供一种处理参考图像请求的方法,本实施例在第二十三个实施例的基础上变化得到,与第二十三个实施例不同的是:向解码器发送的所述定位信息指向的所述第二类片段包含的知识图像的信息为知识图像的存储位置。
第三十二个实施例:提供一种处理参考图像请求的方法,本实施例在第二十三个实施例的基础上变化得到,与第二十三个实施例不同的是:使用HTTP传输协议向服务端发送HTTP-request从服务端下载所述第二类片段。
第三十三个实施例:提供一种指定参考图像和处理参考图像请求的系统方法,本实施例在第一个实施例和第二十三个实施例的基础上变化得到,与第一个实施例和第二十三个实施例不同的是:
在图9展示的实施例中的服务端1001,序列编码器1002接收待编码视频序列并按照编码顺序对待编码图像进行编码;如果当前待编码图像参考至少一个知识图像,则序列编码器1002在本地可用的知识图像集中选择至少一个知识图像为当前待编码图像构建参考图像队列,并将知识图像的本地编号告知知识图像编码器1003;知识图像编码器1003将根据知识图像编号对所述知识图像编码并重建,向序列编码器1002提供重建知识图像;服务端管理器1004从序列编码器1002接收知识图像的本地编号和知识图像在位流中的编号(例如以LibPID表示),从序列片段组织器1005接收当前编码图像所属的随机访问片段的定位信息(例如以SeqURL表示),从知识图像编码器1003接收知识图像的本地编号,从知识片段组织器1006接收知识图像所属片段的定位信息(例如以LibURL表示),并根据以上信息建立每个序列片段的依赖映射表,所述映射表描述每个序列片段依赖的至少一个知识图像的LibPID与其所属片段的LibURL;MPD生成器1007接收服务端管理器1004的依赖映射表,并根据映射表生成MPD;
在图9展示的实施例中的客户端1008,MPD解析器1009接收服务端1001发送的MPD,并解析获取至少一个序列片段的依赖映射表;客户端管理器1010根据当前播放时间,决定需要下载的序列片段的SeqURL;序列片段下载器1011根据SeqURL从服务端1001下载序列片段;序列解码器1012接收序列片段,并解析其中的位流,根据位流中携带的参考映射表,判断当前待解码图像是否依赖知识图像,如果当前待解码图像依赖知识图像,根据参考映射表中被依赖知识图像的LibPID,向客户端管理器1010发送知识图像请求信息;客户端管理器1010根据请求信息中知识图像的LibPID,在当前解码图像所属序列片段的依赖映射表中,查找获取LibPID对应的LibURL;知识图像管理器1013接收LibURL,在一种可能的方法中,在本地知识缓存中查找LibURL指向的知识片段包含的知识图像是否存在,如果存在,从知识缓存中提取所属知识图像并提供给序列解码器1012,如果不存在,根据LibURL从服务端1001下载知识片段,解码获取其包含的知识图像,并提供给序列解码器1012;序列解码器1012使用获得的知识图像解码当前解码图像,并显示或输出当前图像。
第三十四个实施例:提供一种指定参考图像的装置,如图10所示,所述装置包括:
第一提取单元11,用于提取参考映射表中的第一标识信息以获取所述参考映射表中参考索引对应的编号是否使用混合编号规则;
第二提取单元12,用于当所述参考映射表中参考索引对应的编号使用混合编号规则时,从所述参考映射表中提取至少一个参考索引j对应的第二标识信息以获取所述参考索引j对应的编号采用的编号规则;
第三提取单元13,用于从所述参考映射表中提取所述参考索引j对应的参考图像编号;
第一确定单元14,用于当所述参考图像编号采用的编号规则为第一编号规则时,采用与当前图像相同的编号规则来使用所述参考图像编号确定当前图像的参考图像;
第二确定单元15,用于当所述参考图像编号采用的编号规则为第二编号规则时,使用所述参考图像编号从解码器外部返回的参考图像信息确定当前图像的参考图像。
第三十五个实施例:提供一种指定参考图像的装置,本实施例在第三十四个实施例的基础上变化而来,与第三十四个实施例不同的是:
在AVS3标准中使用语法reference_configuration_set表示所述参考映射表,第一提取单元11用于从reference_configuration_set中提取语法reference_to_library_enable_flag以获取所述参考映射表中参考索引对应的编号是否使用混合编号规则;当所述参考映射表中参考索引对应的编号使用混合编号规则时,第二提取单元12用于从reference_configuration_set中提取至少一个参考索引j对应的第二标识信息以获取所述参考索引j对应的编号采用的编号规则;第三提取单元13用于从reference_configuration_set中提取所述参考索引j对应的参考图像编号library_pid或delta_doi_of_reference_picture;如果第三提取单元13提取了参考图像编号delta_doi_of_reference_picture,第一确定单元14用于采用与当前图像相同的编号规则来使用所述参考图像编号确定当前图像的参考图像;如果第三提取单元13提取了参考图像编号library_pid,第二确定单元15用于使用所述参考图像编号从解码器外部返回的参考图像信息确定当前图像的参考图像。
第三十六个实施例:提供一种指定参考图像的装置,本实施例在第三十四个实施例的基础上变化而来,与第三十四个实施例不同的是:第一提取单元11、第二提取单元12、第三提取单元13使用的所述参考映射表携带在序列头、图像头、条带头中。
第三十七个实施例:提供一种指定参考图像的装置,本实施例在第三十四个实施例的基础上变化而来,如图11所示,与第三十四个实施例不同的是,所述装置还包括:
第四提取单元21,用于提取参考映射更新表以获取至少一个参考索引j对应的编号和第二标识信息;
替换单元22,用于当所述参考映射更新表中的所述参考索引j存在于所述参考映射表中时,将所述参考映射表中所述参考索引j对应的编号和第二标识信息替换为所述参考映射更新表中的所述参考索引j对应的编号和第二标识信息;
增加单元23,用于当所述参考映射更新表中的所述参考索引j不存在于所述参考映射表中时,在所述参考映射表中增加所述参考映射更新表中的所述参考索引j及其对应的编号和第二标识信息。
第三十八个实施例:提供一种指定参考图像的装置,本实施例在第三十七个实施例的基础上变化而来,与第三十七个实施例不同的是:当所述参考映射更新表仅包括至少一条参考索引及其指向的使用第二编号规则的编号时,替换单元22还用于将所述参考映射表中所述参考索引j对应的编号替换为所述参考映射更新表中的所述参考索引j对应的编号,并将所述参考映射表中所述参考索引j对应的第二标识信息标识为使用第二编号规则。
第三十九个实施例:提供一种指定参考图像的装置,本实施例在第三十七个实施例的基础上变化而来,与第三十七个实施例不同的是:当所述参考映射更新表仅包括至少一条参考索引及其指向的使用第二编号规则的编号时,增加单元23还用于在所述参考映射表中增加所述参考映射更新表中的所述参考索引j及其对应的编号,并将所述参考映射表中所述参考索引j对应的第二标识信息标识为使用第二编号规则。
第四十个实施例:提供一种指定参考图像的装置,本实施例在第三十四个实施例的基础上变化而来,与第三十四个实施例不同的是,所述装置还包括:
设置单元33,用于当解码器使用采用第二编号规则的编号指向的参考图像对当前图像解码时,将所述参考图像与当前图像的距离设置为非时域距离。
第四十一个实施例:提供一种指定参考图像的装置,本实施例在第四十个实施例的基础上变化而来,与第四十个实施例不同的是:设置单元33还用于将所述参考图像与当前图像的距离设置为给定的一个固定非零值。
第四十二个实施例:提供一种指定参考图像的装置,本实施例在第四十个实施例的基础上变化而来,与第四十个实施例不同的是:设置单元33还用于将所述参考图像与当前图像的距离设置为根据所述采用第二编号规则的编号指向的参考图像与当前图像之间的相似性计算得到的一个非零值。
第四十三个实施例:提供一种指定参考图像的装置,本实施例在第三十四个实施例的基础上变化而来,与第三十四个实施例不同的是,所述装置还包括:
第五提取单元41,用于提取第三标识信息以获取所述参考映射表中是否存在第一标识信息。
第四十四个实施例:提供一种指定参考图像的装置,本实施例在第四十三个实施例的基础上变化而来,与第四十三个实施例不同的是:在AVS3标准中,第五提取单元41还用于从序列头中提取library_picture_enable_flag表示的第三标识信息。
第四十五个实施例:提供一种处理参考图像请求的装置,如图12所示,所述装置包括:
第一获取单元51,用于获取至少一个第一类片段的依赖映射表,其中所述依赖映射表描述其所属的所述第一类片段依赖的至少一个参考图像的编号与所述至少一个参考图像所属的第二类片段的定位信息的映射关系;
接收单元52,用于接收解码器发送的参考图像请求信息以获取当前图像依赖的至少一个参考图像的编号;
第二获取单元53,用于从所述当前图像所属的第一类片段的依赖映射表中,获取所述参考图像请求信息中的至少一个所述参考图像的编号指向的参考图像所属的第二类片段的定位信息;
发送单元54,用于使用所述第二类片段的定位信息向解码器发送所述定位信息指向的所述第二类片段包含的知识图像的信息。
第四十六个实施例:提供一种处理参考图像请求的装置,本实施例在第四十五个实施例的基础上变化而来,与第四十五个实施例不同的是:
第三获取单元61,用于从媒体描述信息中获取至少一个第一类片段的依赖映射表。
第四十七个实施例:提供一种处理参考图像请求的装置,本实施例在第四十六个实施例的基础上变化而来,与第四十六个实施例不同的是:
在DASH中,第三获取单元61还用于从MPD中获取至少一个第一类片段的片段依赖描述子dependent_segment,从所述dependent_segment描述子的至少一个属性dependent_segment_indicator中获取所述第一类片段依赖的一个第二类片段的定位信息dependentSegmentURL及其包含的知识图像的编号信息pictureID。
第四十八个实施例:提供一种处理参考图像请求的装置,本实施例在第四十五个实施例的基础上变化而来,如图13所示,与第四十五个实施例不同的是,所述发送单元54还包括:
查找单元71,用于根据所述第二类片段的所述定位信息,在缓存中查找所述定位信息指向的第二类片段或所述第二类片段包含的知识图像;
第四获取单元72,用于如果所述缓存中存在所述第二类片段或所述第二类片段包含的知识图像,从所述缓存中获取所述第二类片段或所述第二类片段包含的知识图像;
下载单元73,用于如果所述缓存中不存在所述第二类片段或所述第二类片段包含的知识图像,从服务端下载所述第二类片段。
第四十九个实施例:提供一种处理参考图像请求的装置,本实施例在第四十五个实施例的基础上变化而来,与第四十五个实施例不同的是:发送单元54还用于向解码器发送所述定位信息指向的所述第二类片段包含的知识图像的像素值。
第五十个实施例:提供一种处理参考图像请求的装置,本实施例在第四十五个实施例的基础上变化而来,与第四十五个实施例不同的是:发送单元54还用于向解码器发送所述定位信息指向的所述第二类片段包含的知识图像的存储位置。
第五十一个实施例:提供一种处理参考图像请求的装置,本实施例在第四十八个实施例的基础上变化而来,与第四十八个实施例不同的是:下载单元73还用于使用HTTP传输协议向服务端发送HTTP-request从服务端下载所述第二类片段。
第五十二个实施例:提供一种指定参考图像和处理参考图像请求的系统方法,本实施例在第三十四个实施例和四十五个实施例的基础上变化得到,与第三十四个实施例和第四十五个实施例不同的是:
在图14展示的实施例中,MPD解析器2001接收MPD,并解析获取至少一个序列片段的依赖映射表;管理器2002根据当前播放时间,决定需要下载的序列片段的SeqURL;序列片段下载器2003根据SeqURL下载序列片段;序列解码器2004接收序列片段,并解析其中的位流,根据位流中携带的参考映射表,判断当前待解码图像是否依赖知识图像,如果当前待解码图像依赖知识图像,根据参考映射表中被依赖知识图像的LibPID,向管理器2002发送知识图像请求信息;管理器2002根据请求信息中知识图像的LibPID,在当前解码图像所属序列片段的依赖映射表中,查找获取LibPID对应的LibURL;知识图像管理器2005接收LibURL,在一种可能的方法中,在本地知识缓存中查找LibURL指向的知识片段包含的知识图像是否存在,如果存在,从知识缓存中提取所属知识图像并提供给序列解码器2004,如果不存在,根据LibURL下载知识片段,解码获取其包含的知识图像,并提供给序列解码器2004;序列解码器2004使用获得的知识图像解码当前解码图像,并显示或输出当前图像。
第五十三个实施例:提供一种指定参考图像和处理参考图像请求的系统方法,本实施例在第三十四个实施例和四十五个实施例的基础上变化得到,与第三十四个实施例和第四十五个实施例不同的是:
在图15展示的实施例中,MPD解析器3001接收MPD,并解析获取至少一个序列片段的依赖映射表;管理器3002根据当前播放时间,决定需要下载的序列片段的SeqURL;序列片段下载器3003根据SeqURL下载序列片段;序列解码器3004接收序列片段,并解析其中的位流,根据位流中携带的参考映射表,判断当前待解码图像是否依赖知识图像,如果当前待解码图像依赖知识图像,根据参考映射表中被依赖知识图像的LibPID,向管理器3002发送知识图像请求信息;管理器3002根据请求信息中知识图像的LibPID,在当前解码图像所属序列片段的依赖映射表中,查找获取LibPID对应的LibURL;管理器3002使用LibURL,在本地知识缓存3005中查找LibURL指向的知识片段包含的知识图像是否存在,如果存在,将所述知识图像在知识缓存3005中的存储地址返回序列解码器3004,如果不存在,使用LibURL下载知识片段,解码获取其包含的知识图像,将重建知识图像存储在知识缓存3005,将所述知识图像在知识缓存3005中的存储地址返回序列解码器3004;序列解码器3004使用返回的知识图像存储地址从知识缓存3005中获得知识图像用于解码当前解码图像,并显示或输出当前图像。
第五十四个实施例:提供一种指定参考图像和处理参考图像请求的系统方法,本实施例在第三十四个实施例和四十五个实施例的基础上变化得到,与第三十四个实施例和第四十五个实施例不同的是:
在图16展示的实施例中,MPD解析器4001接收MPD,并解析获取至少一个序列片段的依赖映射表;管理器4002根据当前播放时间,决定需要下载的序列片段的SeqURL;序列片段下载器4003根据SeqURL下载序列片段;序列解码器4004接收序列片段,并解析其中的位流,根据位流中携带的参考映射表,判断当前待解码图像是否依赖知识图像,如果当前待解码图像依赖知识图像,根据参考映射表中被依赖知识图像的LibPID,向管理器4002发送知识图像请求信息;管理器4002根据请求信息中知识图像的LibPID,在当前解码图像所属序列片段的依赖映射表中,查找获取LibPID对应的LibURL;管理器4002使用LibURL,在本地知识缓存4005中查找LibURL指向的知识片段包含的知识图像是否存在,如果存在,从知识缓存4005中获取所述知识图像,并将知识图像返回序列解码器4004,如果不存在,使用LibURL下载知识片段,解码获取其包含的知识图像,将重建知识图像存储在知识缓存4005,并将知识图像返回序列解码器4004;序列解码器4004使用返回的知识图像解码当前解码图像,并显示或输出当前图像。
第五十五个实施例:提供一种指定参考图像和处理参考图像请求的系统方法,本实施例在第三十四个实施例和四十五个实施例的基础上变化得到,与第三十四个实施例和第四十五个实施例不同的是:
在图17展示的实施例中,MPD解析器5001接收MPD,并解析获取至少一个序列片段的依赖映射表;管理器5002根据当前播放时间,决定需要下载的序列片段的SeqURL;序列片段下载器5003根据SeqURL下载序列片段;序列解码器5004接收序列片段,并解析其中的位流,根据位流中携带的参考映射表,判断当前待解码图像是否依赖知识图像,如果当前待解码图像依赖知识图像,根据参考映射表中被依赖知识图像的LibPID,向管理器5002发送知识图像请求信息;管理器5002根据请求信息中知识图像的LibPID,在当前解码图像所属序列片段的依赖映射表中,查找获取LibPID对应的LibURL;管理器5002使用LibURL,在本地知识缓存5005中查找LibURL指向的知识片段包含的知识图像码流是否存在,如果存在,从知识缓存5005中获取所述知识图像码流,解码知识图像,并将知识图像返回序列解码器5004,如果不存在,使用LibURL下载知识片段,将知识片段包含的知识图像码流存储在知识缓存5005中,解码知识图像,并将知识图像返回序列解码器5004;序列解码器5004使用返回的知识图像解码当前解码图像,并显示或输出当前图像。
第五十六个实施例:提供一种处理得到媒体数据的方法。图18展示了使用基于知识库的视频编码方法产生的媒体数据的依赖结构关系。基于知识库的编码方法产生的媒体数据包含第一类视频数据和第二类视频数据两部分,其中称第一类视频数据为视频层数据,视频层数据包含视频层图像的码流,称第二类视频数据为知识层数据,知识层数据包含知识层图像的码流。视频数据包含至少一个样本(sample),所述样本包含一张图像或一组图像。第一类视频数据的样本按照第一编号规则被分配编号并顺序排列,第一编号规则为按照时间顺序或播放顺序或解码顺序分配编号的规则,而第二类视频数据的样本按照第二编号规则被分配编号并顺序排列,第二编号规则为按照使用顺序或生成顺序或存储顺序分配编号的规则。第二类视频数据中的至少一个样本被第一类视频数据中至少两个不连续的样本依赖并为所述第一类视频数据中至少两个不连续的样本的编解码提供参考信息,这种依赖关系被称为非对齐时间段的依赖关系。
由于视频1数据依赖视频2数据,视频1数据需要与视频2数据同步进行编解码,且视频1数据中多个样本依赖视频2数据中同一个样本,例如在图18中,虚线箭头表示了样本之间的依赖关系,视频1数据中的样本1、样本2和样本4依赖视频2数据中的样本1,视频1数据中的样本3和样本5依赖视频2数据中的样本2。当视频1数据按照时间呈现时,例如图18中实线箭头表示的呈现时间顺序,被依赖的视频2数据样本需要和依赖其的视频1数据样本同步以确保视频1数据样本的正确解码。为了避免存储空间或传输带宽的浪费,被视频1数据中多个样本依赖的视频2数据样本并不会被重复存储或传输,而是被共享,例如图18中视频2数据样本1在与视频1数据样本1同步使用后,仍然会为后续的视频1样本2和样本4重复使用。基于以上使用知识库编码方法编码的媒体数据的结构关系,本发明提供一种存储媒体数据的方法和一种提取媒体数据流的方法,图18中的结构关系示例同样适用于后续实施例中的结构关系描述。
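图18所示的共享依赖关系可以用如下示意代码说明:多个视频1样本依赖同一个视频2样本时,被依赖的视频2样本只需随首个依赖它的视频1样本存储或传输一次,其后被复用(映射关系取自图18示例,仅为示意):

```python
# 图18示例:视频1样本编号 -> 其依赖的视频2样本编号
DEPENDENCY = {1: 1, 2: 1, 3: 2, 4: 1, 5: 2}

def library_schedule(dependency):
    """按呈现顺序遍历视频1样本:首次遇到的视频2样本随其同步传输,
    已传输过的视频2样本被共享复用(记为None,表示无需重复传输)。"""
    sent, schedule = set(), []
    for sample in sorted(dependency):
        lib = dependency[sample]
        schedule.append((sample, lib if lib not in sent else None))
        sent.add(lib)
    return schedule
```

按该示例,视频2样本1仅随视频1样本1传输一次,样本2和样本4复用它;视频2样本2仅随视频1样本3传输一次。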
第五十七个实施例:图19展示了一种处理得到媒体数据方法的一种实施例。在本实施例中媒体数据盒和元数据盒Movie Box存储在一个文件中,在另一种情况下,媒体数据盒和Movie Box可以分别存储在不同的文件中。为了描述视频1数据和视频2数据之间的依赖关系,在“Movie Box”中使用两个轨道(track)分别描述视频1数据和视频2数据的样本,如图19所示,由视频轨道1描述视频1数据样本的结构,由视频轨道2描述视频2数据样本的结构。在视频轨道1中使用tref数据盒(Track Reference Box)描述视频轨道1和视频轨道2之间的依赖关系,为了标识两个轨道之间的依赖关系为视频1数据和视频2数据之间的依赖关系,需要为tref数据盒的参考类型(reference_type)增加一个新的值,例如使用’libr’标识。当参考类型的值是’libr’时,表示这是一个特殊的参考类型,即当前视频1轨道指向的数据样本依赖tref下的轨道标识指向的视频2轨道指向的数据样本。
在描述了视频1数据所属轨道和视频2数据所属轨道之间的依赖关系之后,需要描述样本之间的依赖关系。由于视频1数据样本和视频2数据样本使用不同的顺序编号规则,例如视频1数据样本使用时间顺序而视频2数据样本使用非对齐时间顺序,样本之间的依赖关系不能使用时间戳来描述。本实施例使用样本群组数据盒(Sample Group Box)和样本群组描述数据盒(Sample Group Description Box)描述多个视频1数据样本依赖同一个视频2数据样本,如图19所示,在视频轨道1中,样本群组1指向视频1数据样本入口1、样本入口2和样本入口4,同时记录所依赖的视频2数据样本入口在视频轨道2中的编号1。这表示视频1数据样本入口1、样本入口2和样本入口4指向的样本依赖视频2数据样本入口1指向的样本。样本群组2指向视频1数据样本入口3和样本入口5,同时记录编号2,表示视频1数据样本入口3和样本入口5指向的样本依赖视频2数据样本入口2指向的样本。因此,样本群组中需要描述被依赖的视频2数据样本入口的信息,需要如下语法:
Figure PCTCN2019102025-appb-000010
相应的语义如下:
num_library_samples:指示此群组所指向的视频2数据样本数目。
library_sample_index:指示此群组指向的视频2数据样本条目的编号。
其中library_sample_index指向的视频2数据样本的样本入口所属的轨道由当前轨道的tref数据盒描述。在另一种情况下,视频2数据样本被描述在至少两个轨道中,此时,为了定位样本群组指向的视频2数据样本,需要如下语法:
Figure PCTCN2019102025-appb-000011
相应的语义如下:
num_library_samples:指示此群组所指向的视频2数据样本数目。
library_track_ID:指示此群组指向的视频2数据样本条目所在的轨道编号。
library_sample_index:指示此群组指向的视频2数据样本条目的编号。
根据视频2数据样本的样本入口所属的轨道编号和编号能够唯一确定被依赖的视频2数据样本,从而建立视频1数据样本和视频2数据样本之间的依赖关系。
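上述多轨道情形下样本群组条目的序列化可以按如下方式示意(假设num_library_samples、library_track_ID、library_sample_index均为32位无符号大端整数,仅为演示):

```python
import struct

def pack_group_entry(entries):
    """entries: [(library_track_ID, library_sample_index), ...];
    先写入num_library_samples,再逐项写入轨道编号与样本条目编号。"""
    out = struct.pack(">I", len(entries))
    for track_id, sample_index in entries:
        out += struct.pack(">II", track_id, sample_index)
    return out

def unpack_group_entry(buf):
    """从字节串还原(轨道编号, 样本条目编号)列表。"""
    (n,) = struct.unpack_from(">I", buf, 0)
    return [struct.unpack_from(">II", buf, 4 + 8 * i) for i in range(n)]
```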
第五十八个实施例:图20展示了一种处理得到媒体数据方法的另一种实施例。在本实施例中媒体数据盒和元数据盒Movie Box存储在一个文件中,在另一种情况下,媒体数据盒和Movie Box可以分别存储在不同的文件中。为了描述视频1数据和视频2数据之间的依赖关系,在元数据中使用一个轨道(track)来描述视频1数据和样本辅助信息。如图20所示,在视频轨道中,使用样本辅助信息(Sample auxiliary information sizes box和sample auxiliary information offsets box)来描述视频1数据和视频2数据的依赖关系,样本辅助信息和视频1数据样本入口时序上一一对应。为了描述视频1数据样本入口对应的视频1数据样本所依赖的视频2数据样本在媒体数据中的定位,需要为样本辅助信息的信息类型(aux_info_type)增加一个新的值,例如使用‘libi’标识。当信息类型的值为’libi’时,表示当前数据盒是样本辅助信息,包含对应视频1数据的视频2参考关系和视频2数据所在媒体数据盒中的位置。
由于样本辅助信息和视频1数据样本入口在时序上是一一对应的,在获得'libi'类型时,对于一个视频1数据样本条目,可以同时获得对应视频层数据所参考的知识层数据在视频层数据的媒体数据中所在的位置。因此,在此实施例下,知识层数据和视频层数据必须在同一个文件中。
第五十九个实施例:图21展示了一种存储媒体数据方法的又一种实施例。在本实施例中媒体数据盒和元数据盒Movie Box存储在一个文件中,在另一种情况下,媒体数据盒和Movie Box可以分别存储在不同的文件中。为了描述视频1数据和视频2数据之间的依赖关系,在元数据中使用两个轨道(track)分别描述视频1数据和视频2数据的样本,同时,还使用一个时序化元数据轨道描述视频轨道和视频轨道之间的关系。如图21所示,由视频轨道1描述视频1数据样本的结构,由视频轨道2描述视频2数据样本的结构,由视频轨道3描述时序化元数据样本的结构,在视频轨道1和视频轨道3中使用tref数据盒(Track Reference Box)描述视频轨道1和视频轨道3所需要的依赖关系。为了标识视频轨道1和视频轨道2之间的依赖关系为视频1数据和视频2数据之间的依赖关系,需要为tref数据盒的参考类型(reference_type)增加一个新的值,例如使用’libr’标识。当参考类型的值是’libr’时,表示这是一个特殊的参考类型,即当前视频1轨道指向的数据样本依赖tref下的轨道标识指向的视频2轨道指向的数据样本。
由于视频1数据样本和时序化元数据样本使用相同的顺序编号规则,视频1数据样本和时序化元数据样本均使用时间顺序,样本之间的依赖关系可以直接使用时间戳来描述。同时,由时序化元数据样本入口指向的时序化元数据样本描述视频1数据样本入口指向的视频1数据样本与视频2数据样本入口指向的视频2数据样本的依赖关系。为此,需要增加描述依赖关系的时序化元数据的样本语法:
Figure PCTCN2019102025-appb-000012
相应的语义如下:
number_of_library_sample:指示参考的视频2数据样本的数目。
library_sample_index:指示视频2数据样本条目的编号。
其中library_sample_index指向的视频2数据样本的样本入口所属的轨道由当前轨道的tref数据盒指向的视频1数据所属轨道的tref数据盒描述。
又一个实施例提供一种处理得到媒体数据方法:与第五十九个实施例不同的是,使用片段索引数据盒(segment index box)描述视频1数据样本和视频2数据样本之间的依赖关系,所述片段索引数据盒的语法为:
Figure PCTCN2019102025-appb-000013
Figure PCTCN2019102025-appb-000014
其中斜体的语法元素为本实施例新增的语法元素,其语义为:
reference_library_flag:值为1表示当前项目参考知识图像,值为0表示不参考;
reference_sample_number:表示当前项目参考的知识图像的数目;
sample_track_ID:表示当前被参考的知识图像的样本所属的轨道编号;
sample_ID:表示当前被参考的知识图像的样本的编号。
第六十个实施例:图22展示了一种处理得到媒体数据方法的又一种实施例,与第五十九个实施例不同的是,描述依赖关系的时序化元数据的样本语法为:
Figure PCTCN2019102025-appb-000015
Figure PCTCN2019102025-appb-000016
相应的语义如下:
number_of_library_sample:指示参考的视频2数据样本的数目。
library_sample_URL:指示视频2数据样本的统一资源定位符。
library_sample_offset:指示视频2数据样本的字节偏移量。
library_sample_size:指示视频2数据样本的字节大小。
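依据library_sample_URL、library_sample_offset与library_sample_size三元组定位视频2数据样本的过程可以这样示意(以字典模拟按URL可寻址的资源,仅为演示):

```python
def locate_library_sample(resources, url, offset, size):
    """resources: url -> 字节串。按URL取得资源后,
    以字节偏移量和字节大小截取被依赖的视频2数据样本。"""
    data = resources[url]
    return data[offset:offset + size]
```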
第六十一个实施例:图23展示了一种处理得到媒体数据方法的另一种实施例。在本实施例中媒体数据盒和元数据盒Movie Box存储在一个文件中,在另一种情况下,媒体数据盒和Movie Box可以分别存储在不同的文件中。为了描述视频1数据和视频2数据之间的依赖关系,如图23所示,在视频轨道中,使用样本群组来描述视频1数据和视频2数据的依赖关系。为了描述视频1数据样本条目对应的视频1数据样本所依赖的视频2数据样本在元数据盒中的定位,需要为样本群组的群组类型(grouping_type)增加一个新的值,例如使用‘libg’标识。当群组类型的值为’libg’时,表示当前数据盒是含依赖关系的样本群组,包含对应视频1数据的视频2参考关系和视频2数据在元数据盒中的位置。所述样本群组的语法如下:
Figure PCTCN2019102025-appb-000017
其中语法元素的语义为:
meta_box_handler_type:元数据item的类型,其中增加’libi’表示所述元数据item的类型为知识图像;
num_items:元数据item的数目;
item_id[i]:第i个元数据item的编号;
library_pid[i]:第i个元数据item对应的知识图像的编号。
第六十二个实施例:图24展示了一种传输媒体数据的方法的一种实施例。首先,根据轨道的tref数据盒确定轨道之间的关系,从而确定指向视频1数据样本的视频轨道1、指向视频2数据样本的视频轨道2(如果存在的话)、指向时序化元数据样本的元数据轨道3(如果存在的话);然后从视频轨道1中按照时间顺序提取视频1数据样本;再根据视频1数据样本的辅助信息,定位并提取视频1数据样本依赖的视频2数据样本,辅助信息的描述方式可以是图19~图22的任一种实施例中视频1数据样本和视频2数据样本的依赖关系描述方式;然后,将视频1数据样本和被依赖的视频2数据样本,同步传输到接收端以解码或播放。
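上述提取与同步传输流程可以概括为如下示意代码(辅助信息简化为"视频1样本编号 -> 被依赖的视频2样本编号"的映射,无依赖时为None,仅为演示):

```python
def sync_transmit(video1_samples, aux_info, video2_samples):
    """按时间顺序提取视频1数据样本;若辅助信息指出该样本依赖视频2数据样本,
    则定位被依赖样本并与之同步传输(以二元组表示)。"""
    out = []
    for index, sample in enumerate(video1_samples, start=1):
        dep = aux_info.get(index)
        out.append((sample, video2_samples[dep] if dep is not None else None))
    return out
```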
第六十三个实施例:图25示出了传输SVC媒体数据的一种实施例。该实施例将SVC媒体数据封装在一个包裹内。该包裹包含资产1和资产2两个资产,同时还包含一个组织信息(Composition Information,CI)。每个资产包含一个MPU,每个MPU包含SVC媒体数据的一类数据,例如资产1的MPU1包含基本层数据,资产2的MPU2包含增强层数据。组织信息记录了资产之间的依赖关系等信息,例如组织信息描述了资产1对资产2的依赖性。每个MPU中包含了至少一个MFU,并由提示轨道(hint track)描述MFU在MPU中的分段信息,例如MPU2被分段为MFU1-4,而MPU1被分段为MFU1-4,其中虚线表示MFU之间的依赖关系,例如,资产1中MFU1-4分别对应依赖资产2中MFU1-4,同时,由于基本层数据和增强层数据都为对齐时间段媒体数据,相互依赖的MFU在客户端需要被同步传输,例如图25中实线箭头在时间线上描述的MFU的传输时间。可以看到,运用MMT传输SVC媒体数据仅仅是简单地对SVC媒体数据进行分段并按照同一对齐时间段进行传输,对于有非对齐时间段依赖关系的媒体数据,这种简单的分段传输方法明显不适用。
第六十四个实施例:图26展示了将媒体分段并传输的一种实施例,相较于第六十三个实施例,本实施例使用不同的方式描述MFU之间的依赖关系。该实施例将知识库编码媒体数据封装在一个包裹(package)中,该包裹包含资产1、资产2和资产3三个资产,同时还包含一个组织信息(Composition Information,CI)。每个资产包含一个MPU,每个MPU包含知识库编码媒体数据的一类数据,例如资产1的MPU1包含视频层数据,资产2的MPU2包含依赖元数据,资产3的MPU3包含知识层数据。组织信息记录了资产之间的时域、空域或依赖关系等信息,例如组织信息描述了资产1对资产2的依赖性,资产2对资产3的依赖性。每个MPU中包含了至少一个MFU,并由提示轨道(hint track)描述MFU在MPU中的分段信息,例如MPU1被分段为MFU1-5,MPU2被分段为MFU1-5,而MPU3被分段为MFU1-2,其中虚线表示MFU之间的依赖关系,例如资产1中MFU1-5分别依赖资产2中MFU1-5,资产2中MFU1-5依赖资产3中MFU1,资产2中MFU3、MFU5依赖资产3中MFU2。与前述实施例不同的是,该实施例使用时序化元数据(timed metadata)描述MFU之间的依赖关系,其中时序化元数据拥有与视频层数据相同的对齐时间段,通过对齐时间段保持时序化元数据和视频层数据的同步,同时,时序化元数据中描述其对应时段需要同步使用的知识层数据,从而使得视频层数据间接地与知识层数据相关联。这种方法的优点是时序化元数据轨道的增删复用很灵活,不需要修改视频轨道的数据,但是缺点是时序化元数据存储在文件的媒体数据中,MMT发送器需要先根据hint sample定位时序化元数据再解析时序化元数据之后才能根据定位信息去文件中获取被依赖的知识层数据,这给MMT发送器带来了额外操作负载。需要使用描述依赖关系的时序化元数据样本,语法如下:
Figure PCTCN2019102025-appb-000018
相应的语义如下:
reference_MFU_flag:指示是否参考MFU,值“0”意味着不参考。
number_of_reference_MFU:指示参考的MFU数目。
depended_MFU_asset_id:指示参考的MFU所属的Asset编号。
depended_MFU_sequence_number:指示参考的MFU的编号。
在又一种情况下,语法表示如下:
Figure PCTCN2019102025-appb-000019
相应的语义如下:
reference_sample_flag:指示是否参考样本,值“0”意味着不参考。
number_of_reference_sample:指示参考的样本数目。
depended_sample_MPU_id:指示参考的样本所属的MPU编号。
depended_sample_id:指示参考的样本的编号。
第六十五个实施例:图27展示了传输媒体数据的另一种实施例,相较于第六十四个实施例,本实施例使用不同的方式描述MFU之间的依赖关系。该实施例将知识库编码媒体数据封装在一个包裹(package)中,该包裹包含资产1、资产2和资产3三个资产,同时还包含一个组织信息(Composition Information,CI)。每个资产包含一个MPU,每个MPU包含知识库编码媒体数据的一类数据,例如资产1的MPU1包含视频层数据,知识层数据被分割为至少两个资产,例如资产2的MPU2包含知识层数据,资产3的MPU3包含知识层数据。组织信息记录了资产之间的时域、空域或依赖关系等信息,例如组织信息描述了资产1对资产2和资产3的依赖性,且资产2和资产3可以相互独立或相互依赖。每个MPU中包含了至少一个MFU,并由提示轨道(hint track)描述MFU在MPU中的分段信息,例如MPU1被分段为MFU1-5,MPU2被分段为MFU1-2,而MPU3仅包含MFU1,其中虚线表示MFU之间的依赖关系,例如资产1中MFU1、MFU5依赖资产2中MFU1,资产1中MFU2依赖资产3中MFU1,资产1中MFU3、MFU5依赖资产2中MFU2,此时由于资产2和资产3中MFU的编号可能重复,需要增加对MFU的定位信息。同时,相互依赖的MFU在客户端需要被同步传输,例如图27中实线箭头在时间线上描述的MFU的传输时间。由于视频层数据为对齐时间段媒体数据,而知识层数据为非对齐时间段媒体数据,MFU之间的依赖关系需要被明确标记。这种方法的优点是MMT发送端通过分析视频层数据的hint track即可获得视频层数据样本对知识层数据样本的依赖关系,然后根据视频层数据和知识层数据的hint track提取视频层MFU和知识层MFU,同时该方法不影响知识层数据的hint track信息,保持了知识层数据的独立性和灵活性;缺点是不同资产中MFU的编号可能重复导致视频层数据的hint sample中会增加一些冗余的知识层数据样本定位信息。在MMT标准MFU样本的基础上,扩展得到在MFU中描述当前MFU参考的MFU(称为DMFU,depended MFU)样本和增加的对MFU的定位信息的语法为:
Figure PCTCN2019102025-appb-000020
Figure PCTCN2019102025-appb-000021
相应的语义如下:
referenceMFU_flag:指示是否参考MFU,值“0”意味着不参考。
number_of_depended_MFU:指示参考的MFU数目。
depended_MFU_asset_id:指示参考的MFU所属的Asset编号。
depended_MFU_sequence_number:指示参考的MFU的编号。
第六十六个实施例:图28展示了传输媒体数据的另一种实施例,相较于第六十四、第六十五个实施例,本实施例使用不同的方式描述MFU之间的依赖关系。将知识库编码媒体数据封装在一个包裹(package)中,该包裹包含资产1和资产2两个资产,同时还包含一个组织信息(Composition Information,CI)。每个资产包含一个MPU,每个MPU包含知识库编码媒体数据的一类数据,例如资产1的MPU1包含视频层数据,资产2的MPU2包含知识层数据。组织信息记录了资产之间的时域、空域或依赖关系等信息,例如组织信息描述了资产1对资产2的依赖性。每个MPU中包含了至少一个MFU,并由提示轨道(hint track)描述MFU在MPU中的分段信息,例如MPU2被分段为MFU1和MFU4,而MPU1被分段为MFU2、MFU3、MFU5-7,其中虚线表示MFU之间的依赖关系,例如,资产1中MFU2、MFU3和MFU6依赖资产2中MFU1,资产1中MFU5和MFU7依赖资产2中MFU4,同时,相互依赖的MFU在客户端需要被同步传输,例如图28中实线箭头在时间线上描述的MFU的传输时间。由于视频层数据为对齐时间段媒体数据,而知识层数据为非对齐时间段媒体数据,MFU之间的依赖关系需要被明确标记,这种方法的优点是MMT发送端通过分析视频层数据的hint track即可获得视频层数据样本对知识层数据样本的依赖关系,然后根据视频层数据和知识层数据的hint track提取视频层MFU和知识层MFU,同时该方法不影响知识层数据的hint track信息,保持了知识层数据的独立性和灵活性。在MMT标准MFU样本的基础上,扩展得到在MFU中描述当前MFU参考的MFU(称为DMFU,depended MFU)样本的语法为:
Figure PCTCN2019102025-appb-000022
Figure PCTCN2019102025-appb-000023
相应的语义如下:
referenceMFU_flag:指示是否参考MFU,值“0”意味着不参考。
number_of_depended_MFU:指示参考的MFU数目。
depended_MFU_sequence_number:指示参考的MFU的编号。
上述语法描述了MFU依赖的DMFU,类似的,可以在一个被依赖的MFU中描述依赖当前MFU的MFU(称为RMFU,reference MFU),例如:
Figure PCTCN2019102025-appb-000024
相应的语义如下:
dependedMFU_flag:指示是否被MFU依赖,值“0”意味着不被依赖。
number_of_reference_MFU:指示依赖当前MFU的MFU数目。
reference_MFU_sequence_number:指示依赖当前MFU的MFU的编号。
number_of_consequent_MFU:指示参考的MFU之后依赖当前MFU的连续MFU数目。
通过上述语法可以获得MFU之间的依赖关系。需要注意的是,在一种情况下,DMFU和RMFU的编号与当前MFU的编号使用同一组顺序编号且互不重复,此时DMFU和RMFU可以被唯一确定;在另一种情况下,当DMFU和RMFU的编号和当前MFU的编号使用不同的顺序编号且可以互相重复时,需要根据组织信息描述的MFU所属的MPU所属的Asset之间的依赖关系,确定DMFU和RMFU所属的MPU所属的Asset,从而唯一确定DMFU和RMFU。
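当DMFU编号可能跨资产重复时,可结合组织信息CI记录的资产间依赖关系将其唯一化,示意代码如下(其中ci_dependency是一个假设的"资产编号 -> 其依赖的知识层资产编号"映射,仅为演示):

```python
def resolve_dmfu(ci_dependency, current_asset, dmfu_sequence_number):
    """根据组织信息CI记录的资产依赖关系,确定DMFU所属的资产,
    再以(所属资产编号, MFU编号)二元组唯一确定该DMFU。"""
    depended_asset = ci_dependency[current_asset]
    return (depended_asset, dmfu_sequence_number)
```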
第六十七个实施例:图29展示了传输媒体数据的另一种实施例,相较于第六十三、第六十四、第六十五和第六十六个实施例,本实施例增加了避免MFU重传的操作。在确定MFU的依赖关系并能够唯一定位MFU之后,当需要传输MFU时,需要按照依赖关系同步传输有依赖关系的MFU。图29描述了传输MFU的流程,首先根据当前传输顺序从有对齐时间段的资产1中视频层数据中获取当前MFU,例如图28中资产1中的MFU2。根据当前MFU的样本信息,判断当前MFU是否依赖DMFU,如果不依赖DMFU,那么传输当前MFU并继续按照顺序获取下一个MFU或终止传输,如果依赖DMFU,那么根据当前MFU中描述的DMFU的编号,从非对齐时间段的资产2中知识层数据中获取所述被依赖的MFU。由于多个对齐时间段MFU依赖同一个非对齐时间段MFU,为了避免DMFU的重复传输,在传输DMFU时,需要考虑三种情况,以判断DMFU在客户端的可用性,如图29所示。在一种情况下,根据DMFU的历史传输列表,当前MFU依赖的DMFU没有被传输过,那么需要将DMFU和当前MFU同步传输,例如图28中被依赖的资产2中MFU1和资产1中MFU2需要被同步传输;在另一种情况下,根据DMFU的历史传输列表,当前MFU依赖的DMFU已经被传输过,那么只需要传输当前MFU而不需要传输DMFU,例如图28中资产1中MFU3、MFU6、MFU7,其中MFU3、MFU6依赖的资产2中MFU1已经与资产1中MFU2同步传输,MFU7依赖的资产2中MFU4已经与资产1中MFU5同步传输;在又一种情况下,根据DMFU的历史传输列表,当前MFU依赖的DMFU已经被传输过,但是,根据客户端反馈的信令消息,该DMFU由于使用频次、存储、管理方法等多种可能的原因,在客户端已经不可用,此时需要将DMFU和当前MFU同步传输,例如客户端只能缓存1个资产2中MFU,当传输资产1中MFU5时,同步传输的资产2中MFU4替换掉已有的资产2中MFU1,这导致资产2中MFU1的不可用,因此在传输资产1中MFU6时,需要同步再次传输资产2中MFU1。
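图29描述的三种情况可以用如下示意代码模拟(客户端的知识层缓存以容量受限的FIFO队列模拟,FIFO替换策略仅为演示假设;取文中"客户端只能缓存1个知识层MFU"的示例数据):

```python
def transmit_with_dedup(mfu_order, deps, client_buffer_size):
    """模拟避免MFU重传的传输流程。
    mfu_order: 按传输顺序排列的视频层MFU编号;
    deps: 视频层MFU -> 其依赖的知识层MFU(无依赖为None);
    返回 [(视频层MFU, 同步传输的知识层MFU或None), ...]。"""
    buffer, log = [], []
    for mfu in mfu_order:
        dmfu = deps.get(mfu)
        if dmfu is not None and dmfu not in buffer:
            # 情况一/三:DMFU未被传输过,或已因客户端缓存替换而不可用,需同步传输
            if len(buffer) >= client_buffer_size:
                buffer.pop(0)  # FIFO替换,最早进入缓存的DMFU变为不可用
            buffer.append(dmfu)
            log.append((mfu, dmfu))
        else:
            # 情况二:DMFU仍在客户端缓存中可用(或无依赖),仅传输当前MFU
            log.append((mfu, None))
    return log
```

以文中示例运行:视频层MFU2、3、6依赖知识层MFU1,MFU5、7依赖知识层MFU4,缓存容量为1时,知识层MFU1会在传输视频层MFU6时被再次同步传输。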
第六十八个实施例:提供传输媒体数据的又一个实施例,为了能够了解并模拟客户端对非对齐时间段知识层数据的管理结果,在传输中需要使用信令消息。
在一种情况下,服务端通过信令消息告知客户端对非对齐时间段知识层数据的最佳存储大小、存储管理方法(例如FIFO(First In First Out)、LFU(Least Frequently Used)、LRU(Least Recently Used)等各种可能的存储管理方法)等信息,这需要使用知识层数据缓存模型(Library Buffer Model,LBM)消息,语法定义如下:
Figure PCTCN2019102025-appb-000025
相应的语义如下:
message_id:指示该消息为LBM消息。
version:指示LBM消息的版本,客户端可以检查该LBM消息是新消息或旧消息。
length:指示LBM消息的字节长度。
required_buffer_size:指示客户端为了接收该数据,需要准备的知识层数据的缓存的字节大小。
required_buffer_Manage:指示客户端管理知识层数据缓存的方法,例如值为0表示使用FIFO方法,值为1表示使用LFU方法,值为2表示使用LRU方法等等。
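LBM消息的一种可能的打包与解析方式如下(各字段位宽为演示假设:message_id与version各为8位,length为16位,required_buffer_size为32位,required_buffer_Manage为8位,均为大端序,实际位宽以标准语法表为准):

```python
import struct

# required_buffer_Manage取值与存储管理方法的对应关系(见上文语义)
MANAGE_METHODS = {0: "FIFO", 1: "LFU", 2: "LRU"}

def pack_lbm_message(message_id, version, buffer_size, manage_method):
    """按假设的字段位宽打包LBM消息:头部(message_id/version/length)+载荷。"""
    payload = struct.pack(">IB", buffer_size, manage_method)
    return struct.pack(">BBH", message_id, version, len(payload)) + payload

def unpack_lbm_message(buf):
    """解析LBM消息,返回各字段的字典。"""
    message_id, version, length = struct.unpack_from(">BBH", buf, 0)
    buffer_size, manage = struct.unpack_from(">IB", buf, 4)
    return {"message_id": message_id, "version": version, "length": length,
            "required_buffer_size": buffer_size,
            "required_buffer_Manage": MANAGE_METHODS[manage]}
```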
在一种情况下,客户端通过信令消息将知识层数据缓存的管理操作反馈给服务端,告知哪些已经传输的知识层数据在客户端已经不可用,从而使得服务端再次传输依赖不可用知识层数据的视频层数据时,能够再次重传知识层数据,这需要使用知识层数据缓存反馈消息,语法定义如下:
Figure PCTCN2019102025-appb-000026
Figure PCTCN2019102025-appb-000027
相应的语义如下:
message_id:指示该消息为LBM消息。
version:指示LBM消息的版本,客户端可以检查该LBM消息是新消息或旧消息。
length:指示LBM消息的字节长度。
unavailable_mfu_number:指示知识层数据缓存中不可用的数据所属的MFU的数目。
asset_id:指示第i个不可用MFU所属的资产编号。
sample_id:指示第i个不可用MFU所属的样本编号。
mfu_id:指示第i个不可用MFU的编号。
第六十九个实施例:该实施例添加了一个新的关系类型,例如在SMT(Smart Media Transport)中,原来只有四种关系类型,分别是依赖关系、组合关系、等同关系和相似关系,相应的flag分别是dependency_flag、composition_flag、equivalence_flag和similarity_flag。本实施例添加的新的关系类型是非对齐时间段的知识库依赖关系类型,对应的flag是library_flag,该关系类型是用来描述当前Asset与非对齐时间段的知识库Asset的依赖关系。相应的语法如表格3所示:
Figure PCTCN2019102025-appb-000028
Figure PCTCN2019102025-appb-000029
相应的语义如下:
descriptor_tag:用于指示此类型描述符的标签值。
descriptor_length:指示此描述符的字节长度,从下一个字段计算至最后一个字段。
dependency_flag:指示在此描述符中是否需要添加依赖关系。值“0”意味着不需要添加。
composition_flag:指示在此描述符中是否需要添加组合关系。值“0”意味着不需要添加。
equivalence_flag:指示在此描述符中是否需要添加等同关系。值“0”意味着不需要添加。
similarity_flag:指示在此描述符中是否需要添加相似关系。值“0”意味着不需要添加。
library_flag:指示在此描述符中是否需要添加非对齐时间段的知识库依赖关系。值“0”意味着不需要添加。
num_dependencies:指示此描述符所描述的Asset所依赖的Asset的数目。
asset_id:指示此描述符所描述的Asset所依赖的Asset的ID,此描述符中提供的Asset ID顺序与其内部编码依赖层次相对应。
num_compositions:指示与此描述符所描述的Asset有组合关系的Asset的数目。
asset_id:指示与此描述符所描述的Asset有组合关系的Asset的ID。
equivalence_selection_level:指示所对应的Asset在等同关系组中的呈现等级。“0”值表示该Asset被默认呈现。当默认Asset无法被选择时,拥有呈现等级较小的Asset会作为替代被选择和呈现。
num_equivalences:指示与此描述符所描述的Asset有等同关系的Asset的数目。
asset_id:指示与此描述符所描述的Asset有等同关系的Asset的ID。
similarity_selection_level:指示所对应的Asset在相似关系组中的呈现等级。“0”值表示该Asset被默认呈现。当默认Asset无法被选择时,拥有呈现等级较小的Asset会作为替代被选择和呈现。
num_similarities:指示与此描述符所描述的Asset有相似关系的Asset的数目。
asset_id:指示与此描述符所描述的Asset有相似关系的Asset的ID。
num_libraries:指示此描述符所描述的Asset所依赖的非对齐时间段的知识库Asset的数目。
asset_id:指示与此描述符所描述的Asset有非对齐时间段的知识库依赖关系的Asset的ID。
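上述五个关系标志若按位打包在一个字节中,可以这样解析(标志在字节中的排列顺序仅为演示假设:从最高位起依次为dependency、composition、equivalence、similarity、library五个标志):

```python
FLAG_NAMES = ("dependency_flag", "composition_flag", "equivalence_flag",
              "similarity_flag", "library_flag")

def parse_relation_flags(byte):
    """按假设的位序从单字节中取出五个关系标志,返回标志名到取值(0/1)的字典。"""
    return {name: (byte >> (7 - i)) & 1 for i, name in enumerate(FLAG_NAMES)}
```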
一种实施例提供了处理得到媒体数据的装置:
第一放入单元,用于在第一媒体轨道中放入第一媒体数据的样本条目,所述第一媒体数据为时序媒体数据,所述样本条目包含指向所述第一媒体数据的样本的元数据;
第二放入单元,用于在第二媒体数据盒中放入第二媒体数据的访问单元条目,所述访问单元条目包含指向所述第二媒体数据的访问单元的元数据,所述第二媒体数据为时序媒体数据或非时序媒体数据;
第三放入单元,用于将所述第一媒体数据中至少两个在时间上不连续的样本标记为一个样本群组,所述至少两个在时间上不连续的样本满足以下条件之一:
如果第二媒体数据为时序媒体数据,所述至少两个在时间上不连续的样本编码或解码参考第二媒体数据中同一组访问单元,所述同一组访问单元和所述至少两个在时间上不连续的样本中至少一个样本在时间上不对齐;如果
第二媒体数据为非时序媒体数据,所述至少两个在时间上不连续的样本编码或解码参考第二媒体数据中同一组访问单元。
又一种实施例提供了处理得到媒体数据的装置:
第一放入单元,用于在第一媒体轨道中放入第一媒体数据的样本条目,所述第一媒体数据为时序媒体数据,所述样本条目包含指向所述第一媒体数据的样本的元数据;
第二放入单元,用于在第二媒体数据盒中放入第二媒体数据的访问单元条目,所述访问单元条目包含指向所述第二媒体数据的访问单元的元数据,所述第二媒体数据为时序媒体数据或非时序媒体数据;
第三放入单元,用于为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本分别放入各自的依赖元数据,所述至少两个在时间上不连续的样本满足以下条件之一:
如果所述第二媒体数据为时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元,所述同一组访问单元和所述至少两个在时间上不连续的样本中至少一个样本在时间上不对齐;如果
第二媒体数据为非时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元。
一种实施例提供了处理媒体数据的装置:
第一提取单元,用于提取第一媒体数据和第二媒体数据,其中所述的第一媒体数据为时序媒体数据,第二媒体数据为时序媒体数据或非时序媒体数据;
第二提取单元,用于从所述第一媒体数据所属的轨道中提取样本群组,所述样本群组包含至少两个时间上不连续的样本;
定位单元,用于根据样本群组的描述信息,为所述至少两个时间上不连续的样本中每一个样本分别定位第二媒体数据中的一组访问单元,所述一组访问单元的索引信息包括在所述样本群组的描述信息中;其中所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述至少两个在时间上不连续的样本定位到的为第二媒体数据中同一组访问单元,并且所述同一组访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
又一种实施例提供了处理媒体数据的装置:
第一提取单元,用于提取第一媒体数据和第二媒体数据,其中所述的第一媒体数据为时序媒体数据,第二媒体数据为时序媒体数据或非时序媒体数据;
第二提取单元,用于从所述第一媒体数据中提取至少两个时间上不连续的样本;
第三提取单元,用于为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本提取依赖元数据;
定位单元,用于根据所述依赖元数据,为所述至少两个在时间上不连续的样本中每一个样本分别定位第二媒体数据中的一组访问单元,所述一组访问单元的索引信息包括在所述依赖元数据中;所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述至少两个在时间上不连续的样本定位到的为第二媒体数据中同一组访问单元,并且所述同一组访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
一种实施例提供了传输媒体数据的装置:
第一切分单元,用于将第一媒体数据切分为媒体分片单元,其中所述的第一媒体数据为时序媒体数据,所述第一媒体数据包括至少两个在时间上不连续的样本;
第一提取单元,用于提取所述第一媒体数据媒体分片单元对应的依赖索引信息,所述依赖索引信息为除所述媒体分片单元所属的样本的呈现时间信息以外的信息;
第一传输单元,用于传输所述提取的第一媒体数据媒体分片单元;
定位单元,用于根据所述第一媒体数据媒体分片单元对应的依赖索引信息,定位第二媒体数据访问单元,所述第二媒体数据访问单元被所述媒体分片单元所属的第一媒体数据样本的编码或解码所参考;其中所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述的第一媒体数据中所述的至少两个在时间上不连续的样本定位到的为同一个第二媒体数据访问单元,并且所述第二媒体数据访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个 第二媒体数据访问单元。
查找单元,用于在模拟缓存中查找所述第二媒体数据访问单元;
第二切分单元,用于如果所述模拟缓存中不存在所述第二媒体数据访问单元,将所述第二媒体数据访问单元切分为媒体分片单元;
第二传输单元,用于传输所述第二媒体数据访问单元被切分的媒体分片单元。
一种实施例提供一种处理得到媒体数据的装置:
处理器;
存储器;以及
一个或多个程序用于完成以下方法:
处理器在第一媒体轨道中放入第一媒体数据的样本条目,所述第一媒体数据为时序媒体数据,所述样本条目包含指向所述第一媒体数据的样本的元数据;
处理器在第二媒体数据盒中放入第二媒体数据的访问单元条目,所述访问单元条目包含指向所述第二媒体数据的访问单元的元数据,所述第二媒体数据为时序媒体数据或非时序媒体数据;
处理器为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本分别放入各自的依赖元数据,所述至少两个在时间上不连续的样本满足以下条件之一:
如果所述第二媒体数据为时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元,所述同一组访问单元和所述至少两个在时间上不连续的样本中至少一个样本在时间上不对齐;如果
第二媒体数据为非时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元。
处理器上述处理得到的媒体数据存入存储器。
又一种实施例提供一种处理得到媒体数据的装置:
处理器;
存储器;以及
一个或多个程序用于完成以下方法:
处理器在第一媒体轨道中放入第一媒体数据的样本条目,所述第一媒体数据为时序媒体数据,所述样本条目包含指向所述第一媒体数据的样本的元数据;
处理器在第二媒体数据盒中放入第二媒体数据的访问单元条目,所述访问单元条目包含指向所述第二媒体数据的访问单元的元数据,所述第二媒体数据为时序媒体数据或非时序媒体数据;
处理器为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本分别放入各自的依赖元数据,所述至少两个在时间上不连续的样本满足以下条件之一:
如果所述第二媒体数据为时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元,所述同一组访问单元和所述至少两个在时间上不连续的样本中至少一个样本在时间上不对齐;如果
第二媒体数据为非时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元。
处理器上述处理得到的媒体数据存入存储器。
一种实施例提供一种处理媒体数据的装置:
处理器;
存储器;以及
一个或多个程序用于完成以下方法:
处理器处理存储器存入的媒体数据;
处理器提取第一媒体数据和第二媒体数据,其中所述的第一媒体数据为时序媒体数据,第二媒体数据为时序媒体数据或非时序媒体数据;
处理器从所述第一媒体数据所属的轨道中提取样本群组,所述样本群组包含至少两个时间上不连续的样本;
处理器根据样本群组的描述信息,为所述至少两个时间上不连续的样本中每一个样本分别定位第二媒体数据中的一组访问单元,所述一组访问单元的索引信息包括在所述样本群组的描述信息中;其中所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述至少两个在时间上不连续的样本定位到的为第二媒体数据中同一组访问单元,并且所述同一组访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
又一种实施例提供一种处理媒体数据的装置:
处理器;
存储器;以及
一个或多个程序用于完成以下方法:
处理器处理存储器存入的媒体数据;
处理器提取第一媒体数据和第二媒体数据,其中所述的第一媒体数据为时序媒体数据,第二媒体数据为时序媒体数据或非时序媒体数据;
处理器从所述第一媒体数据中提取至少两个时间上不连续的样本;
处理器为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本提取依赖元数据;
处理器根据所述依赖元数据,为所述至少两个在时间上不连续的样本中每一个样本分别定位第二媒体数据中的一组访问单元,所述一组访问单元的索引信息包括在所述依赖元数据中;所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述至少两个在时间上不连续的样本定位到的为第二媒体数据中同一组访问单元,并且所述同一组访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
一种实施例提供一种传输媒体数据的装置:
处理器;
存储器;
传输器;以及
一个或多个程序用于完成以下方法:
处理器处理存储器存入的媒体数据;
处理器将第一媒体数据切分为媒体分片单元,其中所述的第一媒体数据为时序媒体数据,所述第一媒体数据包括至少两个在时间上不连续的样本;
处理器提取所述第一媒体数据媒体分片单元对应的依赖索引信息,所述依赖索引信息为除所述媒体分片单元所属的样本的呈现时间信息以外的信息;
传输器传输所述提取的第一媒体数据媒体分片单元;
处理器根据所述第一媒体数据媒体分片单元对应的依赖索引信息,定位第二媒体数据访问单元,所述第二媒体数据访问单元被所述媒体分片单元所属的第一媒体数据样本的编码或解码所参考;其中所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述的第一媒体数据中所述的至少两个在时间上不连续的样本定位到的为同一个第二媒体数据访问单元,并且所述第二媒体数据访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
处理器在模拟缓存中查找所述第二媒体数据访问单元;
处理器如果所述模拟缓存中不存在所述第二媒体数据访问单元,将所述第二媒体数据访问单元切分为媒体分片单元;
传输器传输所述第二媒体数据访问单元被切分的媒体分片单元。
一种实施例提供了传输媒体数据的装置:
第一切分单元,用于将第一媒体数据切分为媒体分片单元,其中所述的第一媒体数据为时序媒体数据,所述第一媒体数据包括至少两个在时间上不连续的样本;
第一提取单元,用于提取所述第一媒体数据媒体分片单元对应的依赖索引信息,所述依赖索引信息为除所述媒体分片单元所属的样本的呈现时间信息以外的信息;
第一传输单元,用于传输所述提取的第一媒体数据媒体分片单元;
定位单元,用于根据所述第一媒体数据媒体分片单元对应的依赖索引信息,定位第二媒体数据访问单元,所述第二媒体数据访问单元被所述媒体分片单元所属的第一媒体数据样本的编码或解码所参考;其中所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述的第一媒体数据中所述的至少两个在时间上不连续的样本定位到的为同一个第二媒体数据访问单元,并且所述第二媒体数据访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
查找单元,用于在模拟缓存中查找所述第二媒体数据访问单元;
第二切分单元,用于如果所述模拟缓存中不存在所述第二媒体数据访问单元,将所述第二媒体数据访问单元切分为媒体分片单元;
第二传输单元,用于传输所述第二媒体数据访问单元被切分的媒体分片单元。
又一种实施例提供了传输媒体数据的装置:
第一包含单元,该包含单元包含至少两个资产,同时还包含一个组织信息(Composition Information,CI),所述资产包含MPU,每个所述MPU包含媒体数据的一类数据,所述组织信息记录了资产依赖关系信息。
第一切分单元,用于将第一媒体数据切分为媒体分片单元,其中所述的第一媒体数据为时序媒体数据,所述第一媒体数据包括至少两个在时间上不连续的样本;
第一提取单元,用于提取所述第一媒体数据媒体分片单元对应的依赖索引信息,所述依赖索引信息为除所述媒体分片单元所属的样本的呈现时间信息以外的信息;
第一传输单元,用于传输所述提取的第一媒体数据媒体分片单元;
定位单元,用于根据所述第一媒体数据媒体分片单元对应的依赖索引信息,定位第二媒体数据访问单 元,所述第二媒体数据访问单元被所述媒体分片单元所属的第一媒体数据样本的编码或解码所参考;其中所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述的第一媒体数据中所述的至少两个在时间上不连续的样本定位到的为同一个第二媒体数据访问单元,并且所述第二媒体数据访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
又一种实施例提供了传输媒体数据的装置:
第一包含单元,该包含单元包含至少两个资产,同时还包含一个组织信息(Composition Information,CI),所述资产包含MPU,每个所述MPU包含媒体数据的一类数据,所述组织信息记录了资产依赖关系信息。
第一切分单元,用于将第一媒体数据切分为媒体分片单元,其中所述的第一媒体数据为时序媒体数据,所述第一媒体数据包括至少两个在时间上不连续的样本;
第一提取单元,用于提取所述第一媒体数据媒体分片单元对应的依赖索引信息,所述依赖索引信息为除所述媒体分片单元所属的样本的呈现时间信息以外的信息;
第一传输单元,用于传输所述提取的第一媒体数据媒体分片单元;
第一定位单元,用于定位参考的MFU所属的所述资产编号。
第二定位单元,用于根据所述第一媒体数据媒体分片单元对应的依赖索引信息,定位第二媒体数据访问单元,所述第二媒体数据访问单元被所述媒体分片单元所属的第一媒体数据样本的编码或解码所参考;其中所述第二媒体数据满足以下条件之一:
● 如果第二媒体数据为时序媒体数据,则所述的第一媒体数据中所述的至少两个在时间上不连续的样本定位到的为同一个第二媒体数据访问单元,并且所述第二媒体数据访问单元与所述第一媒体数据的至少两个样本中的至少一个样本的时间段不对齐;或者,
● 如果第二媒体数据为非时序媒体数据,则所述第一媒体数据的所述的两个样本定位到的为同一个第二媒体数据访问单元。
又一种实施例提供了传输媒体数据的装置:
第一包含单元,该包含单元包含至少两个资产,同时还包含一个组织信息(Composition Information,CI),所述资产包含MPU,每个所述MPU包含媒体数据的一类数据,所述组织信息记录了资产依赖关系信息。
第一切分单元,用于将第一媒体数据切分为媒体分片单元,其中所述的第一媒体数据为时序媒体数据,所述第一媒体数据包括至少两个在时间上不连续的样本;
第一提取单元,用于提取所述第一媒体数据媒体分片单元对应的依赖索引信息,所述依赖索引信息为除所述媒体分片单元所属的样本的呈现时间信息以外的信息;
第一传输单元,用于传输所述提取的第一媒体数据媒体分片单元;
同步单元,用于描述MFU之间的依赖关系,其中时序化元数据拥有与所述第一媒体数据相同的对齐时间段,通过对齐时间段保持时序化元数据和所述第一媒体数据的同步,同时,时序化元数据中描述其对应时段需要同步使用的所述第二媒体数据,从而使得所述第一媒体数据间接地与所述第二媒体数据相关联。

Claims (32)

  1. 一种指定参考图像的方法,其特征在于,所述方法包括:
    解码器提取参考映射表中的第一标识信息以获取所述参考映射表中参考索引对应的参考图像编号是否使用至少两种编号规则;
    当所述参考映射表中参考索引对应的编号使用至少两种编号规则时,解码器从所述参考映射表中提取至少一个参考索引j对应的第二标识信息以获取所述参考索引j对应的参考图像编号采用的编号规则;
    解码器从所述参考映射表中提取所述参考索引j对应的参考图像编号;
    当所述参考图像编号采用的编号规则为第一编号规则时,解码器采用与当前图像相同的编号规则来使用所述参考图像编号确定当前图像的参考图像;
    当所述参考图像编号采用的编号规则为第二编号规则时,解码器使用所述参考图像编号从解码器外部返回的参考图像信息确定当前图像的参考图像。
  2. 根据权利要求1所述方法,其特征在于,所述方法还包括:
    解码器从参考映射更新表中获取至少一个参考索引j对应的参考图像编号和第二标识信息;
    当所述参考映射更新表中的所述参考索引j存在于所述参考映射表中时,将所述参考映射表中所述参考索引j对应的参考图像编号和第二标识信息替换为从所述参考映射更新表中获取的所述参考索引j对应的参考编号和第二标识信息;
    当所述参考映射更新表中的所述参考索引j不存在于所述参考映射表中时,在所述参考映射表中增加从所述参考映射更新表中获取的所述参考索引j及其对应的参考图像编号和第二标识信息。
  3. 根据权利要求1所述方法,其特征在于,所述方法还包括:
    当解码器使用采用第二编号规则的参考图像编号指向的参考图像对当前图像解码时,解码器将所述参考图像与当前图像的距离设置为非时域距离。
  4. 一种处理参考图像请求的方法,其特征在于,所述方法包括:
    获取至少一个第一类片段的依赖映射表以获取所述至少一个第一类片段依赖的至少一个参考图像的参考图像编号与所述至少一个参考图像所属的第二类片段的定位信息的映射关系;
    接收解码器发送的参考图像请求信息以获取当前图像依赖的至少一个参考图像的参考图像编号,所述当前图像包含在所属第一类片段中;
    从所述当前图像所属的第一类片段的依赖映射表中,获取所述参考图像请求信息中的至少一个所述参考图像的参考图像编号指向的参考图像所属的第二类片段的定位信息;
    使用所述第二类片段的定位信息向解码器发送所述定位信息指向的所述第二类片段包含的参考图像的信息。
  5. 根据权利要求4所述方法,其特征在于,所述方法还包括:
    从媒体描述信息中获取至少一个第一类片段的依赖映射表。
  6. 根据权利要求4所述方法,其特征在于,使用所述第二类片段的定位信息向解码器发送所述定位信息指向的所述第二类片段包含的参考图像的信息还包括:
    在缓存中查找所述第二类片段的所述定位信息指向的第二类片段或所述第二类片段包含的参考图像;
    如果所述缓存中存在所述第二类片段或所述第二类片段包含的参考图像,从所述缓存中获取所述第二类片段或所述第二类片段包含的参考图像;
    如果所述缓存中不存在所述第二类片段或所述第二类片段包含的参考图像,从服务端下载所述第二类片段。
  7. 一种指定参考图像的装置,其特征在于,所述装置包括:
    处理器;
    存储器;以及
    一个或多个程序用于完成以下方法:
    处理器提取参考映射表中的第一标识信息以获取所述参考映射表中参考索引对应的参考图像编号是否使用至少两种编号规则;
    当所述参考映射表中参考索引对应的编号使用至少两种编号规则时,处理器从所述参考映射表中提取至少一个参考索引j对应的第二标识信息以获取所述参考索引j对应的参考图像编号采用的编号规则;
    处理器从所述参考映射表中提取所述参考索引j对应的参考图像编号;
    当所述参考图像编号采用的编号规则为第一编号规则时,处理器采用与当前图像相同的编号规则来使用所述参考图像编号确定当前图像的参考图像;
    当所述参考图像编号采用的编号规则为第二编号规则时,处理器使用所述参考图像编号从解码器外部返回的参考图像信息确定当前图像的参考图像;
    处理器处理的上述参考映射表和参考图像存在于存储器中。
  8. 根据权利要求7所述装置,其特征在于,所述装置还包括:
    处理器从参考映射更新表中获取至少一个参考索引j对应的参考图像编号和第二标识信息;
    当所述参考映射更新表中的所述参考索引j存在于所述参考映射表中时,处理器将所述参考映射表中所述参考索引j对应的参考图像编号和第二标识信息替换为从所述参考映射更新表中获取的所述参考索引j对应的参考编号和第二标识信息;
    当所述参考映射更新表中的所述参考索引j不存在于所述参考映射表中时,处理器在所述参考映射表中增加从所述参考映射更新表中获取的所述参考索引j及其对应的参考图像编号和第二标识信息。
  9. 根据权利要求7所述装置,其特征在于,所述装置还包括:
    当解码器使用采用第二编号规则的参考图像编号指向的参考图像对当前图像解码时,处理器将所述参考图像与当前图像的距离设置为非时域距离。
  10. 一种处理参考图像请求的装置,其特征在于,所述装置包括:
    处理器;
    存储器;
    传输器;以及
    一个或多个程序用于完成以下方法:
    处理器获取至少一个第一类片段的依赖映射表以获取所述至少一个第一类片段依赖的至少一个参考图像的参考图像编号与所述至少一个参考图像所属的第二类片段的定位信息的映射关系;
    处理器接收解码器发送的参考图像请求信息以获取当前图像依赖的至少一个参考图像的参考图像编号,所述当前图像包含在所属第一类片段中;
    处理器从所述当前图像所属的第一类片段的依赖映射表中,获取所述参考图像请求信息中的至少一个所述参考图像的参考图像编号指向的参考图像所属的第二类片段的定位信息;
    传输器使用所述第二类片段的定位信息向解码器发送所述定位信息指向的所述第二类片段包含的参考图像的信息;
    处理器处理的上述依赖映射表和参考图像存在于存储器中。
  11. 根据权利要求10所述装置,其特征在于,所述装置还包括:
    处理器从媒体描述信息中获取至少一个第一类片段的依赖映射表。
  12. 根据权利要求10所述装置,其特征在于,所述装置还包括:
    处理器在缓存中查找所述第二类片段的所述定位信息指向的第二类片段或所述第二类片段包含的参考图像;
    如果所述缓存中存在所述第二类片段或所述第二类片段包含的参考图像,处理器从所述缓存中获取所述第二类片段或所述第二类片段包含的参考图像;
    如果所述缓存中不存在所述第二类片段或所述第二类片段包含的参考图像,处理器从服务端下载所述第二类片段。
  13. 一种处理得到媒体数据的方法,所述方法包括:
    在第一媒体轨道中放入第一媒体数据的样本条目,所述第一媒体数据为时序媒体数据,所述样本条目包含指向所述第一媒体数据的样本的元数据;
    在第二媒体数据盒中放入第二媒体数据的访问单元条目,所述访问单元条目包含指向所述第二媒体数据的访问单元的元数据,所述第二媒体数据为时序媒体数据或非时序媒体数据;
    将所述第一媒体数据中至少两个在时间上不连续的样本标记为一个样本群组,所述至少两个在时间上不连续的样本满足以下条件之一:
    如果第二媒体数据为时序媒体数据,所述至少两个在时间上不连续的样本编码或解码参考第二媒体数据中同一组访问单元,所述同一组访问单元和所述至少两个在时间上不连续的样本中至少一个样本在时间上不对齐;如果
    第二媒体数据为非时序媒体数据,所述至少两个在时间上不连续的样本编码或解码参考第二媒体数据中同一组访问单元。
  14. 根据权利要求13所述方法,所述方法还包括:
    如果第二媒体数据为时序媒体数据,在第一媒体轨道中放入指向所述第二媒体数据盒的轨道依赖信息,所述轨道依赖信息包含表明所述同一组访问单元和所述两个在时间上不连续的样本中至少一个样本在时间上不对齐的标识。
  15. 根据权利要求13所述方法,所述方法还包括:
    在所述第一媒体轨道中放入所述样本群组的描述信息,所述样本群组的描述信息包含表明所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元的标识。
  16. 一种处理得到媒体数据的方法,所述方法包括:
    在第一媒体轨道中放入第一媒体数据的样本条目,所述第一媒体数据为时序媒体数据,所述样本条目包含指向所述第一媒体数据的样本的元数据;
    在第二媒体数据盒中放入第二媒体数据的访问单元条目,所述访问单元条目包含指向所述第二媒体数据的访问单元的元数据,所述第二媒体数据为时序媒体数据或非时序媒体数据;
    为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本分别放入各自的依赖元数据,所述至少两个在时间上不连续的样本满足以下条件之一:
    如果所述第二媒体数据为时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元,所述同一组访问单元和所述至少两个在时间上不连续的样本中至少一个样本在时间上不对齐;如果
    第二媒体数据为非时序媒体数据,所述每个样本对应的依赖元数据包含指向所述第二媒体数据中同一组访问单元的索引信息,所述索引信息为除所述第一媒体数据的样本的呈现时间信息以外的信息,所述至少两个在时间上不连续的样本编码或解码参考所述同一组访问单元。
  17. 根据权利要求16所述方法,其特征在于,其中为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本分别放入各自的依赖元数据还包括:
    在时序化元数据中放入所述依赖元数据;
    在时序化元数据轨道中放入所述时序化元数据的样本条目。
  18. 根据权利要求16所述方法,其特征在于,其中为所述第一媒体数据中至少两个在时间上不连续的样本中每一个样本分别放入各自的依赖元数据还包括:
    在片段索引数据盒中放入所述依赖元数据。
  19. A method for processing media data, the method comprising:
    extracting first media data and second media data, wherein the first media data is timed media data and the second media data is timed media data or non-timed media data;
    extracting a sample group from the track to which the first media data belongs, wherein the sample group contains at least two temporally non-contiguous samples;
    locating, according to description information of the sample group, a group of access units in the second media data for each of the at least two temporally non-contiguous samples, wherein index information of the group of access units is included in the description information of the sample group; and wherein the second media data satisfies one of the following conditions:
    if the second media data is timed media data, the at least two temporally non-contiguous samples are located to a same group of access units in the second media data, and the same group of access units is not aligned with the time period of at least one of the at least two samples of the first media data; or
    if the second media data is non-timed media data, the two samples of the first media data are located to a same access unit of the second media data.
  20. The method according to claim 19, further comprising:
    if the second media data is timed media data, parsing, from the track to which the first media data belongs, an identifier in the track dependency information pointing to the data box to which the second media data belongs, so as to obtain the information that the same group of access units is not aligned in time with at least one of the two temporally non-contiguous samples.
  21. The method according to claim 19, further comprising:
    parsing an identifier from the description information of the sample group in the first media track, so as to obtain the information that the encoding or decoding of the at least two temporally non-contiguous samples references the same group of access units.
  22. A method for processing media data, the method comprising:
    extracting first media data and second media data, wherein the first media data is timed media data and the second media data is timed media data or non-timed media data;
    extracting at least two temporally non-contiguous samples from the first media data;
    extracting dependency metadata for each of the at least two temporally non-contiguous samples of the first media data;
    locating, according to the dependency metadata, a group of access units in the second media data for each of the at least two temporally non-contiguous samples, wherein index information of the group of access units is included in the dependency metadata; and wherein the second media data satisfies one of the following conditions:
    if the second media data is timed media data, the at least two temporally non-contiguous samples are located to a same group of access units in the second media data, and the same group of access units is not aligned with the time period of at least one of the at least two samples of the first media data; or
    if the second media data is non-timed media data, the two samples of the first media data are located to a same access unit of the second media data.
  23. The method according to claim 22, wherein extracting dependency metadata for each of the at least two temporally non-contiguous samples of the first media data further comprises:
    extracting the timed metadata pointed to by a sample entry in a timed metadata track;
    extracting the dependency metadata from the timed metadata.
  24. The method according to claim 22, wherein extracting dependency metadata for each of the at least two temporally non-contiguous samples of the first media data further comprises:
    extracting the dependency metadata from a segment index box.
  25. A method for transmitting media data, the method comprising:
    splitting first media data into media fragment units, wherein the first media data is timed media data and includes at least two temporally non-contiguous samples;
    extracting dependency index information corresponding to a media fragment unit of the first media data, the dependency index information being information other than presentation time information of the sample to which the media fragment unit belongs;
    transmitting the extracted media fragment unit of the first media data;
    locating, according to the dependency index information corresponding to the media fragment unit of the first media data, a second media data access unit that is referenced by the encoding or decoding of the first media data sample to which the media fragment unit belongs; wherein the second media data satisfies one of the following conditions:
    if the second media data is timed media data, the at least two temporally non-contiguous samples of the first media data are located to a same second media data access unit, and the second media data access unit is not aligned with the time period of at least one of the at least two samples of the first media data; or
    if the second media data is non-timed media data, the two samples of the first media data are located to a same second media data access unit;
    searching a simulated cache for the second media data access unit;
    if the second media data access unit does not exist in the simulated cache, splitting the second media data access unit into media fragment units;
    transmitting the media fragment units into which the second media data access unit has been split.
  26. The method according to claim 25, wherein extracting the dependency index information corresponding to the media fragment unit of the first media data further comprises:
    extracting the dependency index information corresponding to the media fragment unit from a hint track sample containing fragmentation information of the media fragment unit.
  27. The method according to claim 25, wherein extracting the dependency index information corresponding to the media fragment unit of the first media data further comprises:
    extracting the dependency index information corresponding to the media fragment unit from the timed metadata corresponding to the media fragment unit.
  28. An apparatus for processing to obtain media data, wherein the apparatus comprises:
    a processor;
    a memory; and
    one or more programs for performing the following method:
    the processor places a sample entry of first media data in a first media track, wherein the first media data is timed media data and the sample entry contains metadata pointing to samples of the first media data;
    the processor places an access unit entry of second media data in a second media data box, wherein the access unit entry contains metadata pointing to access units of the second media data, and the second media data is timed media data or non-timed media data;
    the processor marks at least two temporally non-contiguous samples of the first media data as one sample group, wherein the at least two temporally non-contiguous samples satisfy one of the following conditions:
    if the second media data is timed media data, the encoding or decoding of the at least two temporally non-contiguous samples references a same group of access units in the second media data, and the same group of access units is not aligned in time with at least one of the at least two temporally non-contiguous samples; or
    if the second media data is non-timed media data, the encoding or decoding of the at least two temporally non-contiguous samples references a same group of access units in the second media data; and
    the media data obtained by the above processing of the processor is stored in the memory.
  29. An apparatus for processing to obtain media data, wherein the apparatus comprises:
    a processor;
    a memory; and
    one or more programs for performing the following method:
    the processor places a sample entry of first media data in a first media track, wherein the first media data is timed media data and the sample entry contains metadata pointing to samples of the first media data;
    the processor places an access unit entry of second media data in a second media data box, wherein the access unit entry contains metadata pointing to access units of the second media data, and the second media data is timed media data or non-timed media data;
    the processor places respective dependency metadata for each of at least two temporally non-contiguous samples of the first media data, wherein the at least two temporally non-contiguous samples satisfy one of the following conditions:
    if the second media data is timed media data, the dependency metadata corresponding to each sample contains index information pointing to a same group of access units in the second media data, the index information being information other than presentation time information of the samples of the first media data, the encoding or decoding of the at least two temporally non-contiguous samples references the same group of access units, and the same group of access units is not aligned in time with at least one of the at least two temporally non-contiguous samples; or
    if the second media data is non-timed media data, the dependency metadata corresponding to each sample contains index information pointing to a same group of access units in the second media data, the index information being information other than presentation time information of the samples of the first media data, and the encoding or decoding of the at least two temporally non-contiguous samples references the same group of access units; and
    the media data obtained by the above processing of the processor is stored in the memory.
  30. An apparatus for processing media data, wherein the apparatus comprises:
    a processor;
    a memory; and
    one or more programs for performing the following method:
    the processor processes media data stored in the memory;
    the processor extracts first media data and second media data, wherein the first media data is timed media data and the second media data is timed media data or non-timed media data;
    the processor extracts a sample group from the track to which the first media data belongs, wherein the sample group contains at least two temporally non-contiguous samples;
    the processor locates, according to description information of the sample group, a group of access units in the second media data for each of the at least two temporally non-contiguous samples, wherein index information of the group of access units is included in the description information of the sample group; and wherein the second media data satisfies one of the following conditions:
    if the second media data is timed media data, the at least two temporally non-contiguous samples are located to a same group of access units in the second media data, and the same group of access units is not aligned with the time period of at least one of the at least two samples of the first media data; or
    if the second media data is non-timed media data, the two samples of the first media data are located to a same access unit of the second media data.
  31. An apparatus for processing media data, wherein the apparatus comprises:
    a processor;
    a memory; and
    one or more programs for performing the following method:
    the processor processes media data stored in the memory;
    the processor extracts first media data and second media data, wherein the first media data is timed media data and the second media data is timed media data or non-timed media data;
    the processor extracts at least two temporally non-contiguous samples from the first media data;
    the processor extracts dependency metadata for each of the at least two temporally non-contiguous samples of the first media data;
    the processor locates, according to the dependency metadata, a group of access units in the second media data for each of the at least two temporally non-contiguous samples, wherein index information of the group of access units is included in the dependency metadata; and wherein the second media data satisfies one of the following conditions:
    if the second media data is timed media data, the at least two temporally non-contiguous samples are located to a same group of access units in the second media data, and the same group of access units is not aligned with the time period of at least one of the at least two samples of the first media data; or
    if the second media data is non-timed media data, the two samples of the first media data are located to a same access unit of the second media data.
  32. An apparatus for transmitting media data, wherein the apparatus comprises:
    a processor;
    a memory;
    a transmitter; and
    one or more programs for performing the following method:
    the processor processes media data stored in the memory;
    the processor splits first media data into media fragment units, wherein the first media data is timed media data and includes at least two temporally non-contiguous samples;
    the processor extracts dependency index information corresponding to a media fragment unit of the first media data, the dependency index information being information other than presentation time information of the sample to which the media fragment unit belongs;
    the transmitter transmits the extracted media fragment unit of the first media data;
    the processor locates, according to the dependency index information corresponding to the media fragment unit of the first media data, a second media data access unit that is referenced by the encoding or decoding of the first media data sample to which the media fragment unit belongs; wherein the second media data satisfies one of the following conditions:
    if the second media data is timed media data, the at least two temporally non-contiguous samples of the first media data are located to a same second media data access unit, and the second media data access unit is not aligned with the time period of at least one of the at least two samples of the first media data; or
    if the second media data is non-timed media data, the two samples of the first media data are located to a same second media data access unit;
    the processor searches a simulated cache for the second media data access unit;
    if the second media data access unit does not exist in the simulated cache, the processor splits the second media data access unit into media fragment units;
    the transmitter transmits the media fragment units into which the second media data access unit has been split.
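Claims 13-15 describe marking temporally non-contiguous samples of a timed media track as one sample group when their decoding references the same group of access units in a second media data box. The sketch below models that grouping step in Python; the `Sample`/`SampleGroup` structures and field names are illustrative assumptions, not the normative box syntax of the claims or of ISOBMFF.

```python
# Hypothetical model of claims 13-15: group samples of the first (timed)
# media data by the set of second-media access units their decoding
# references. Members of one group may be non-contiguous in time.
from dataclasses import dataclass, field

@dataclass
class Sample:
    decode_time: int              # time position of the sample in the track
    ref_access_units: frozenset   # indices of referenced second-media access units

@dataclass
class SampleGroup:
    ref_access_units: frozenset   # shared "description information" (claim 15)
    sample_indices: list = field(default_factory=list)

def build_sample_groups(samples):
    """Map each distinct referenced access-unit set to one sample group."""
    groups = {}
    for i, s in enumerate(samples):
        groups.setdefault(s.ref_access_units,
                          SampleGroup(s.ref_access_units)).sample_indices.append(i)
    return list(groups.values())

samples = [
    Sample(0, frozenset({7})),    # references access unit 7 ...
    Sample(1, frozenset({3})),
    Sample(2, frozenset({7})),    # ... as does this temporally non-contiguous sample
]
groups = build_sample_groups(samples)
shared = next(g for g in groups if g.ref_access_units == frozenset({7}))
```

Samples 0 and 2 land in one group even though sample 1 sits between them in time, which is the non-contiguous grouping the claims rely on.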
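Claims 16 and 22 replace the sample group with per-sample dependency metadata whose index information identifies the referenced access units by something other than presentation time. A minimal sketch, assuming a dictionary-shaped metadata layout and the field name `index_info` (both hypothetical):

```python
# Sketch of claims 16/22: resolve each sample's dependency metadata to
# concrete access units of the second media data. The index information
# names access units directly (here, by access-unit number), not by the
# first media data sample's presentation time.
def locate_access_units(dependency_metadata, second_media):
    """Return, per sample, the list of access units its metadata points to."""
    located = {}
    for sample_id, meta in dependency_metadata.items():
        located[sample_id] = [second_media[i] for i in meta["index_info"]]
    return located

second_media = {0: "AU0", 5: "AU5", 9: "AU9"}   # access units keyed by index
dependency_metadata = {
    "sample_10": {"index_info": [5]},            # two non-contiguous samples
    "sample_42": {"index_info": [5]},            # pointing at the same access unit
}
located = locate_access_units(dependency_metadata, second_media)
```

Both samples resolve to the same access unit, matching the "same group of access units" condition of the claims.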
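The transmission method of claims 25 and 32 fragments the first media data, resolves each fragment's dependency index to a second-media access unit, and sends that access unit's fragments only when a simulated cache does not already hold it. The helper shape below (fixed-size string fragments, a `set` as the simulated cache) is an assumed simplification for illustration:

```python
# Sketch of the claim 25/32 flow: transmit first-media fragments, then the
# referenced second-media access unit, skipping access units already
# recorded in the simulated cache.
def transmit(first_media_samples, second_media, fragment_size=4):
    sent, sim_cache = [], set()
    for sample in first_media_samples:
        data, au_index = sample["data"], sample["dep_index"]
        # split the sample into media fragment units and queue them for sending
        sent += [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]
        if au_index not in sim_cache:             # simulated-cache lookup
            au = second_media[au_index]
            sent += [au[i:i + fragment_size] for i in range(0, len(au), fragment_size)]
            sim_cache.add(au_index)               # the referenced AU is now cached
    return sent

second_media = {7: "REFPICT"}                     # a shared reference access unit
samples = [
    {"data": "SAMPLE_A", "dep_index": 7},
    {"data": "SAMPLE_B", "dep_index": 7},         # AU 7 already cached: not resent
]
sent = transmit(samples, second_media)
```

The second sample reuses access unit 7 from the simulated cache, so its fragments are transmitted only once, which is the bandwidth saving the claims target.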
PCT/CN2019/102025 2018-08-29 2019-08-22 Methods and apparatus for media data processing and transmitting and reference picture specifying WO2020043003A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US17/418,703 US11716505B2 (en) 2018-08-29 2019-08-22 Methods and apparatus for media data processing and transmitting and reference picture specifying
EP19853701.1A EP3866478A4 (en) 2018-08-29 2019-08-22 METHODS AND DEVICES FOR PROCESSING AND TRANSFERRING MEDIA DATA AND FOR SPECIFICATION OF A REFERENCE IMAGE
US18/342,526 US12052464B2 (en) 2018-08-29 2023-06-27 Methods and apparatus for media data processing and transmitting and reference picture specifying

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
CN201810992086.9 2018-08-29
CN201810992086 2018-08-29
CN201811488779.0 2018-12-06
CN201811487546.9A CN110876083B (zh) 2018-08-29 2018-12-06 Method and apparatus for specifying reference pictures, and method and apparatus for processing reference picture requests
CN201811488779.0A CN110876084B (zh) 2018-08-29 2018-12-06 Methods and apparatus for processing and transmitting media data
CN201811487546.9 2018-12-06

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US17/418,703 A-371-Of-International US11716505B2 (en) 2018-08-29 2019-08-22 Methods and apparatus for media data processing and transmitting and reference picture specifying
US18/342,526 Continuation US12052464B2 (en) 2018-08-29 2023-06-27 Methods and apparatus for media data processing and transmitting and reference picture specifying

Publications (1)

Publication Number Publication Date
WO2020043003A1 true WO2020043003A1 (zh) 2020-03-05

Family

ID=69643903

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/102025 WO2020043003A1 (zh) 2018-08-29 2019-08-22 处理传输媒体数据和指定参考图像的方法和装置

Country Status (2)

Country Link
US (2) US11716505B2 (zh)
WO (1) WO2020043003A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11589032B2 (en) * 2020-01-07 2023-02-21 Mediatek Singapore Pte. Ltd. Methods and apparatus for using track derivations to generate new tracks for network based media processing applications

Citations (4)

Publication number Priority date Publication date Assignee Title
CN104902279A (zh) * 2015-05-25 2015-09-09 浙江大学 Video processing method and apparatus
CN107634928A (zh) * 2016-07-18 2018-01-26 华为技术有限公司 Method and apparatus for processing bitstream data
CN107634930A (zh) * 2016-07-18 2018-01-26 华为技术有限公司 Method and apparatus for obtaining media data
CN108243339A (zh) * 2016-12-27 2018-07-03 浙江大学 Image encoding and decoding method and apparatus

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
GB2516825B (en) * 2013-07-23 2015-11-25 Canon Kk Method, device, and computer program for encapsulating partitioned timed media data using a generic signaling for coding dependencies
US10284867B2 (en) * 2014-12-18 2019-05-07 Nokia Technologies Oy Apparatus, a method and a computer program for video coding and decoding
GB2534136A (en) * 2015-01-12 2016-07-20 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
CN108141636B (zh) * 2015-10-07 2021-03-23 松下知识产权经营株式会社 接收装置以及接收方法
US10587934B2 (en) * 2016-05-24 2020-03-10 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over HTTP

Non-Patent Citations (1)

Title
YU, HUALONG ET AL.: "Improved DASH for Cross-Random-Access Prediction Structure in Video Coding", 2018 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 4 May 2018 (2018-05-04), pages 1-5, XP033434474, DOI: 10.1109/ISCAS.2018.8351027 *

Also Published As

Publication number Publication date
US12052464B2 (en) 2024-07-30
US11716505B2 (en) 2023-08-01
US20220078515A1 (en) 2022-03-10
US20230353824A1 (en) 2023-11-02

Similar Documents

Publication Publication Date Title
CN110876084B (zh) Methods and apparatus for processing and transmitting media data
US12047661B2 (en) Method, device, and computer program for encapsulating partitioned timed media data
US10547914B2 (en) Method, device, and computer program for encapsulating partitioned timed media data using sub-track feature
KR101918658B1 (ko) Method and apparatus for transmitting and receiving composite media content in a multimedia system
US10110654B2 (en) Client, a content creator entity and methods thereof for media streaming
US20190037256A1 (en) Method, device, and computer program for encapsulating partitioned timed media data using a generic signaling for coding dependencies
CN113949938B (zh) Encapsulation method and apparatus, processing method and apparatus, and storage medium
JP4392442B2 (ja) Apparatus and method for forming, receiving, and processing FlexMux streams
US11403804B2 (en) Method for real time texture adaptation
CN113170239A (zh) Method, apparatus and computer program for encapsulating media data into a media file
JP2013532441A (ja) Method and apparatus for encapsulating encoded multi-component video
US20220366611A1 (en) Three-dimensional content processing methods and apparatus
US12052464B2 (en) Methods and apparatus for media data processing and transmitting and reference picture specifying
EP3908001A1 (en) Video encoding and decoding method and device
CN115396647A (zh) Data processing method, apparatus, device and storage medium for immersive media
US20230336602A1 (en) Addressable resource index events for cmaf and dash multimedia streaming
CN118118694A (zh) Point cloud encapsulation and decapsulation methods, apparatus, medium and electronic device
CN115866258A (zh) Methods and apparatus for generating and processing transport streams and program streams
WO2024015256A1 (en) Method for bandwidth switching by cmaf and dash clients using addressable resource index tracks and events
JP2022546894A (ja) Method, apparatus, and computer program for encapsulating media data into a media file

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19853701

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019853701

Country of ref document: EP

Effective date: 20210329