WO2007035151A1 - Multimedia stream scaling - Google Patents

Multimedia stream scaling

Info

Publication number
WO2007035151A1
WO2007035151A1 (PCT/SE2006/001056)
Authority
WO
WIPO (PCT)
Prior art keywords
content
identifier
scalability
description
importance
Prior art date
Application number
PCT/SE2006/001056
Other languages
English (en)
Inventor
Anisse Taleb
Jonas Svedberg
Attila Takacs
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Publication of WO2007035151A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/26603 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel for automatically generating descriptors from content, e.g. when it is not made available by its provider, using content analysis techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/70 Media network packetisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/64322 IP
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/647 Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
    • H04N21/64784 Data processing by the network
    • H04N21/64792 Controlling the complexity of the content stream, e.g. by dropping packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8451 Structuring of content, e.g. decomposing content into time segments using Advanced Video Coding [AVC]

Definitions

  • the present invention relates to flexible scaling of packetized media streams.
  • Adaptation at intermediate gateways: if a part of the network becomes congested, or has a different service capability, a dedicated network entity performs a transcoding of the service. With scalable codecs this could be as simple as dropping or truncating media frames.
  • Audio coding (non-conversational, streaming/download)
  • Standardization efforts within MPEG have resulted in a scalable-to-lossless extension tool, MPEG4-SLS.
  • MPEG4-SLS provides progressive enhancements to the core Advanced Audio Coding/Bit Slice Arithmetic Coding (AAC/BSAC) all the way up to lossless coding with a granularity step down to 0.4 kbps.
  • AAC/BSAC Advanced Audio Coding/Bit Slice Arithmetic Coding
  • An overview of the MPEG4 toolset can be found in [15]. Furthermore, within MPEG a Call for Information (CfI) was issued in January 2005 [6] targeting the area of scalable speech and audio coding. The key issues addressed in the CfI are scalability, consistent performance across content types (e.g. speech and music) and encoding quality at low bit rates (below 24 kbps).
  • CfI Call for Information
  • VMR-WB Variable-rate Multimode Wideband
  • ITU-T the multirate G.722.1 audio/video conferencing codec has been extended with two new modes providing super-wideband (14 kHz audio bandwidth, 32 kHz sampling) capability operating at 24, 32 and 48 kbps.
  • G.729 With respect to scalable conversational speech coding, the main standardization effort is taking place in ITU-T (Working Party 3, Study Group 16). There the requirements for a scalable extension of G.729 have been defined recently (Nov. 2004), and the qualification process ended in July 2005. This new G.729 extension will be scalable from 8 to 32 kbps with at least 2 kbps granularity steps from 12 kbps.
  • the main target application for the G.729 scalable extension is conversational speech over shared and bandwidth limited xDSL-links, i.e. the scaling is likely to take place in a Digital Residential Gateway that passes the Voice over IP (VoIP) packets through specific controlled Voice channels (Vc's).
  • VoIP Voice over IP
  • ITU-T is also in the process of defining the requirements for a completely new scalable conversational codec in SG16/WP3/Question 9.
  • the requirements for the Q.9/Embedded Variable rate (EV) codec were finalized in July 2005; currently the Q.9/EV requirements state a core rate of 8.0 kbps and a maximum rate of 32 kbps.
  • the Q.9/EV core is not restricted to narrowband (8 kHz sampling) like the G.729 extension will be, i.e. Q.9/EV may provide wideband (16 kHz sampling) from the core layer and onwards.
  • audio scalability can be achieved by:
  • Dropping audio channels: e.g., mono consists of 1 channel, stereo of 2 channels, surround of 5 channels. This is called spatial scalability.
  • AAC-BSAC fine-grained scalable audio codec
  • a bit-slicing scheme is applied to the quantized spectral data.
  • the quantized spectral values are grouped into frequency bands, each of these groups containing the quantized spectral values in their binary representation.
  • the bits of the group are processed in slices according to their significance and spectral content.
  • MSB most significant bits
  • scalability can be achieved in a two-dimensional space. Quality, corresponding to a certain signal bandwidth, can be enhanced by transmitting more LSBs, or the bandwidth of the signal can be extended by providing more bit-slices to the receiver. Moreover, a third dimension of scalability is available by adapting the number of channels available for decoding. For example, surround audio (5 channels) could be scaled down to stereo (2 channels), which, in turn, can be scaled to mono (1 channel) if, e.g., transport conditions make it necessary.
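As a toy sketch of these three scalability dimensions (the data layout below is invented for illustration and is not the actual AAC-BSAC bit-stream format), a frame can be modelled as channels containing frequency bands, each band holding its bit-slices ordered MSB-first:

```python
def scale_frame(frame, keep_channels, keep_bands, keep_slices):
    """Scale a frame along the three audio scalability dimensions:
    channel (spatial), bandwidth (frequency bands), quality (bit-slices)."""
    return [
        [band[:keep_slices]                   # quality: drop LSB slices
         for band in channel[:keep_bands]]    # bandwidth: drop upper bands
        for channel in frame[:keep_channels]  # spatial: drop channels
    ]

# Toy stereo frame: 2 channels, 3 frequency bands, 4 bit-slices per band.
frame = [[[f"c{c}b{b}s{s}" for s in range(4)] for b in range(3)]
         for c in range(2)]

# Scale down to mono, reduced bandwidth, reduced SNR.
mono_low = scale_frame(frame, keep_channels=1, keep_bands=2, keep_slices=2)
```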
  • H.264/MPEG-4 Advanced Video Codec is the current state-of-the-art in video coding [1].
  • the design of H.264/MPEG-4 AVC is based on the traditional concept of hybrid video coding using motion- compensated temporal and spatial prediction in conjunction with block- based residual transform coding.
  • H.264/MPEG-4 AVC contains a large number of innovative technical features, both in terms of improved coding efficiency and network friendliness [2].
  • Recently, a new standardization initiative has been launched by the Joint Video Team of ITU-T VCEG and ISO/IEC MPEG with the objective of extending the H.264/MPEG-4 AVC standard towards scalable video coding (SVC).
  • a scalable bit-stream consists of a base or core layer and one or more nested enhancement layers.
  • video scalability can be achieved by:
  • RTP Payload format for the H.264 video codec is specified.
  • the RTP payload format allows for packetization of one or more Network Abstraction Layer Units (NALUs) in each RTP payload, see [9].
  • NALUs are the basic transport entities of the H.264/AVC framework.
  • SVC Scalable Video Coding
  • Layers are used to increase spatial resolution of a scalable stream. For example, slices corresponding to Layer-0 describe the scene at a certain resolution. If an additional set of Layer-1 slices is available, the scene can be decoded at a higher spatial resolution. The next three bits (T2, T1, T0) indicate a temporal resolution. Slices assigned to temporal resolution 0 (TR-0) correspond to the lowest temporal resolution, that is only I-frames are available. If TR-1 slices are also available, the frame-rate can be increased (temporal scalability). The last two bits (Q1, Q0) specify a quality level (QL). QL-0 corresponds to the lowest quality. If additional QL slices are available, the quality can be increased (SNR scalability).
  • T2, T1, T0 the three bits indicating the temporal resolution of a slice; TR-0 corresponds to the lowest temporal resolution (only I-frames), and TR-1 slices allow the frame rate to be increased
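Assuming the bit layout described above (three layer bits, three temporal bits and two quality bits packed MSB-first into one byte; the field names follow the text, the function itself is an illustrative sketch), the scalability information of a slice could be parsed as:

```python
def parse_scalability_byte(b: int) -> dict:
    """Split an 8-bit scalability field L2 L1 L0 T2 T1 T0 Q1 Q0."""
    return {
        "layer":    (b >> 5) & 0b111,  # spatial resolution layer (L2..L0)
        "temporal": (b >> 2) & 0b111,  # temporal resolution TR (T2..T0)
        "quality":  b & 0b11,          # quality level QL (Q1, Q0)
    }

# Example: layer 2, temporal resolution 1, quality level 2.
info = parse_scalability_byte(0b010_001_10)
```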
  • network entities e.g., routers, Radio Network Controllers (RNCs), Media Gateways (MGWs), etc.
  • RNCs Radio Network Controllers
  • MGWs Media Gateways
  • the vision for MPEG-21 is to "define a multimedia framework to enable transparent and augmented use of multimedia resources across a wide range of networks and devices used by different communities", see [13].
  • the key components of adaptation are: (i) the adaptation engine, and (ii) standardized descriptors for adaptation.
  • the adaptation engine has the role of bridging the gap between media format, terminal, network, and user characteristics.
  • Content providers permit access to multimedia content through various connections such as Internet, Ethernet, DSL, W-LAN, cable, satellite, and broadcast networks.
  • users with various terminals such as desktop computers, handheld devices, mobile phones, and TV sets are allowed to access the content. This high diversity in content delivery to various users demands a system that resolves the complexity of service provisioning, service delivery, and service access.
  • MPEG-7 For adaptation of the content to the user, three types of descriptions, namely multimedia content description, service provider environment description and a user environment description, are necessary. To allow for wide deployment and good interoperability these descriptors must follow a standardized form. While the MPEG-7 standard plays a key role in content description [11, 12], the MPEG-21 standard, especially Part 7 Digital Item Adaptation (DIA), in addition to standardized descriptions provides tools for adaptation engines as well, see [13].
  • DIA adaptation tools are divided into seven groups. In the following we highlight the most relevant groups.
  • the usage environment includes the description of user characteristics and preferences, terminal capabilities, network characteristics and limitations, and natural environment characteristics.
  • the standards provide means to specify the preferences of the user related to the type and content of the media. It can be used to specify the interest of the user, e.g., in sport events, or movies of a certain actor. Based on the usage preference information a user agent can search for appropriate content or might call the attention of the user to a relevant multimedia broadcast content.
  • the user can set the "Audio Presentation Preferences" and the "Display Presentation Preferences". These descriptors specify certain properties of the media, like audio volume and color saturation, which reflect the preferred rendering of multimedia content.
  • Bit-stream Syntax Description The BSD describes the syntax (high level structure) of a binary media resource. Based on the description the adaptation engine can perform the necessary adaptation as all required information about the bit-stream is available through the description. The description is based on the XML language. This way, the description is very flexible but, on the other hand, the result is a quite extensive specification (Examples of using BSD for multimedia resource adaptation are available in [17].).
  • Terminal and Network Quality of Service There are descriptors specified that aid the adaptation decisions at the adaptation engine.
  • the adaptation engine has the task to find the best trade-off among network and terminal constraints, feasible adaptation operations satisfying these constraints, and quality degradation associated to each adaptation operation.
  • the main constraints in media resource adaptation are bandwidth and computation time.
  • Adaptation methods include selection of frame dropping and/or coefficient dropping, requantization, and MPEG-4 Fine-Granular Scalability (FGS).
  • a system and a method for delivery of scalable media is described.
  • the system is based on rate-distortion packet selection and organization.
  • the method used by the system consists of scanning the encoded scalable media and scoring each data unit based on a rate-distortion score.
  • the scored data units are then organized from the highest to the lowest into network packets, which are transmitted to the receiver based on the available network bandwidth.
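The scoring-and-packing scheme described above can be sketched as follows; the data-unit fields and the concrete scores are assumptions for illustration, not the format used by the cited system:

```python
def organize(units, budget_bytes):
    """Rate-distortion packet organization: sort data units by their R-D
    score (highest first) and emit them until the bandwidth budget runs out."""
    ranked = sorted(units, key=lambda u: u["score"], reverse=True)
    packets, used = [], 0
    for u in ranked:
        if used + u["size"] > budget_bytes:
            break  # remaining units would exceed the available bandwidth
        packets.append(u)
        used += u["size"]
    return packets

# Illustrative units: a base layer and two enhancements with decreasing
# distortion-reduction-per-bit scores.
units = [
    {"id": "base", "size": 400, "score": 10.0},
    {"id": "enh1", "size": 300, "score": 4.0},
    {"id": "enh2", "size": 300, "score": 1.5},
]
sent = organize(units, budget_bytes=800)
```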
  • An objective of the present invention is to more efficiently use a scalable source codec.
  • the present invention forms a packetized scalable media stream by determining a media scalability description and a media content preference description.
  • the determined descriptions are mapped into an importance identifier included in and controlling the scalability of the media stream, thereby providing a content-aware scalability.
  • Fig. 1 is a block diagram of an apparatus for forming a packetized scalable media stream illustrating the principles of content-aware scalability mapping in accordance with the present invention
  • Fig. 2 is a block diagram of a first embodiment of an apparatus for forming a packetized scalable media stream in accordance with the present invention
  • Fig. 3 is a block diagram of a second embodiment of an apparatus for forming a packetized scalable media stream in accordance with the present invention
  • Fig. 4 is a block diagram of a third embodiment of an apparatus for forming a packetized scalable media stream in accordance with the present invention
  • Fig. 5 is a block diagram of a fourth embodiment of an apparatus for forming a packetized scalable media stream in accordance with the present invention
  • Fig. 6 is a video example illustrating how an importance identifier may be stored in an IP header
  • Fig. 7 is a video example illustrating how an importance identifier may be stored in an Ethernet header
  • Fig. 8 is an audio example illustrating how an importance identifier may be stored in an IP header
  • Fig. 9 is a block diagram of an embodiment of a media stream scaling apparatus in accordance with the present invention.
  • Fig. 10 is a flow chart illustrating a method of forming a packetized scalable media stream in accordance with the present invention
  • Fig. 11 is a flow chart illustrating a packetized media stream scaling method in accordance with the present invention.
  • Fig. 12 illustrates an example of a content- aware mapping in accordance with the present invention.
  • Scalable codecs permit data loss while maintaining good user-perceived quality. However, which data (packet) is truncated or lost matters greatly. Hence, packets are assigned a certain importance for decodability.
  • user-perceived quality is affected differently by spatial, temporal, and quality (SNR) scalability for video, and by quality, bandwidth, and channel scalability for audio.
  • SNR quality
  • the scalable audiovisual data is ordered for scaling by taking into account the content or context of the data as well.
  • the joint prioritization of media frames of inherently related media flows is possible. That is, joint scaling of the video and audio part of a scene can be realized taking into account the importance of the video and audio part for the user.
  • scalable video codecs permit scalability along three distinct measures, in terms of spatial, temporal, and SNR scalability, while audio codecs can be scaled by SNR, bandwidth, and channel scalability.
  • the data-rate can be reduced by dropping spatial enhancement packets (Layers), or temporal resolution enhancement packets (TR), or quality enhancement packets (QL).
  • Layers spatial enhancement packets
  • TR temporal resolution enhancement packets
  • QL quality enhancement packets
  • the user-perceived quality is affected differently by these adaptation methods.
  • different adaptation approaches are actually appropriate. For example, in the case of broadcasting of sport events, the motion may be more important than the quality of single video frames, since extensive and fast movements are likely to be present in the scenes. On the other hand, news or documentary content may contain slow motion and the quality of the frames is likely to be more important than the frame rate used for play-out.
  • besides video, audio data is also an integral part of multimedia streams.
  • the corresponding audio sequence also has to be transported to the users.
  • the audio part may also be less important than the motion, as seeing a "goal" is usually more satisfying than hearing a commentator describe one. Conversely, for news, hearing the report is more important than watching an announcer who cannot be heard.
  • mapping of content and codec- specific packet importance may be dynamic as well. That is, as the content changes during communication, the importance of different scalability measures may change as well. For example, in the sport broadcast scenario, usually the motion is most important but there may be certain scenes when the reporters are shown, e.g., during game-breaks. For this case quality and voice may become more important than the motion just as in the news scenario.
  • a Content Type may be assigned to video scenes and audio sequences or a joint Content Type description may be provided to a combined audiovisual scene (e.g., a movie or video phone).
  • Each different Content Type is associated with a Content Preference Description that specifies the relative importance of the quality, the movements, the audio part, the video part, etc., to the corresponding scene.
  • Fig. 1 is a block diagram illustrating the principles of content-aware scalability mapping in accordance with the present invention.
  • An audio, video or audiovisual signal is encoded by a scalable encoder 10.
  • the encoded signal may be stored in a media store 12.
  • Each encoded media frame 14 includes a header with scalability information and actual coded data.
  • the media frames 14 are forwarded to a unit 16 for UDP/RTP packetization.
  • a scalability information extractor 18 extracts a Scalability Description from the header of each media frame 14, and forwards this information to a content-aware scalability mapper 20 in unit 16.
  • a content type identifier 22 provides a Content Preference Description to the content-aware scalability mapper 20.
  • the content-aware scalability-mapper 20 uses the Scalability Description of a media frame and the Content Preference Description associated with the content type identification to perform a mapping of the possible scaling operations to an Importance Identifier in each IP packet.
  • the Importance Identifier is used to indicate specific priorities and/or QoS classes to the underlying network.
  • a Content Type value is assigned to each media frame 14, although the same Content Type value may be assigned to several consecutive media frames. Thus, the Content Type may change even during a continuous media stream to address significant changes in the content or context characteristics.
  • each Content Type is associated with a Content Preference Description.
  • the purpose of a Content Preference Description is to define the mapping of codec-specific importance of media frames to network-specific priorities or QoS classes. The objective of introducing this mapping is to permit different priority or QoS assignments to media frames with the same codec-specific importance when the context in which the frame is encoded makes different scaling implementations more desirable from the user-perceived quality point of view.
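A minimal sketch of such a Content Preference Description might rank the scalability dimensions per content type; the content-type names and priority values below are invented for illustration, not taken from the patent:

```python
# Each Content Type orders the scalability dimensions, most important first.
CONTENT_PREFERENCES = {
    "sport": ("temporal", "quality"),  # motion matters most (assumed example)
    "news":  ("quality", "temporal"),  # frame quality matters most (assumed)
}

def qos_class(content_type: str, dimension: str) -> int:
    """Priority of an enhancement packet of the given scalability dimension
    under the given content type (higher = kept longer under congestion)."""
    order = CONTENT_PREFERENCES[content_type]
    return len(order) - order.index(dimension)
```

The same codec-specific packet (say, a quality enhancement) thus gets a different network priority depending on the Content Type, which is exactly the differentiation the description is meant to enable.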
  • the Content Type identification may be performed in several ways.
  • the Content Type may be assigned to the stream in a separate description file 24, as illustrated by the embodiment in Fig. 2.
  • This file may either be stored in media store 12 or in separate storage.
  • a priori information about the Content Type may also be calculated based on the media stream itself, as illustrated by the embodiment in Fig. 3.
  • The embodiment illustrated in Fig. 4 is similar to the embodiment of Fig. 3.
  • the information included in the headers of media frames 14 is assumed to carry enough information to estimate the relevant properties of the content. For example, if motion compensation frames require high bit-rates, then high motion is to be expected in the media. Hence, protecting these frames is a feasible solution. To derive a more accurate estimate of the content, the properties of several frames could be incorporated in the decision about the actual Content Type association.
  • the Content Type may be assigned based on user defined preferences.
  • mapping will use the same Content Type value for all media frames associated with that service.
  • It may be assigned to certain service-contexts. That is, for example, in a video telephony service the context when the user is speaking may be assigned a different Content Type than the context when the user is listening.
  • a service binding storage 26 may optionally be combined with the other previously described embodiments.
  • Content-aware scalability-mapping may also be used to allow content-based relative differentiation among services in a network domain. That is, certain Content Types may be used, for example, in gold services while others are used for silver and bronze differentiation. In this way, not only the packets of a single service are scaled based on the actual content, but content based differentiation is also realized among different services and media streams.
  • Content-aware scalability-mapping may be used to derive importance identifiers for audiovisual media that require joint handling to increase the user- perceived quality.
  • audiovisual media include movies, broadcast events where video is accompanied by audio data.
  • the mapping must consider, besides the distinct scaling of video and audio data, the coherence of the audio and video streams. For example, under severe conditions the entire loss of audio is acceptable for sport events, while for video conferencing the loss/degradation of the video signal is more appropriate.
  • the content-aware scalability-mapping to produce an importance identifier may be applied at the media source, as described in the above embodiments, or at any appropriate network entity (node).
  • This identifier may be encoded in an optional IP header extension, or Differentiated Services code-point, or may be signaled out-of-band with an appropriate signaling protocol.
  • if a media frame is packetized into more than one IP packet, all the packets are classified according to the importance identifier derived for the corresponding media frame.
  • the importance identifier should be examined by network elements such as switches, routers, RNCs, MGWs when they predict or currently experience undesirable conditions.
  • Undesirable conditions include, but are not limited to, network congestion, buffer overflow, undesirable wireless channel conditions.
  • Network elements may initiate local adaptation procedures to prevent or recover from undesirable conditions. That is, packets or in general data may be treated differently, e.g., discarded based on the value of the importance identifier.
  • Intermediate gateways may initiate transcoding procedures to prevent or recover from undesirable conditions. That is, packets or in general data may be treated differently, e.g., discarded based on the value of the importance identifier.
  • the lower layers (i.e., below the Application Layer)
  • the lower layers may also initiate adaptation procedures to prevent or recover from undesirable conditions. That is, packets or in general data may be treated differently, e.g., discarded based on the value of the importance identifier.
  • Fig. 6 is a video example illustrating how an importance identifier may be stored in an IP header.
  • network entities can make use of the information encoded in the newly proposed NALU header extension for H.264/AVC SVC.
  • to access this information, network elements must dig deep into higher-layer protocol headers.
  • routers operate at the Network Layer, therefore they can easily access the IP header.
  • the protocol stack is as follows: IP/UDP/RTP/NALU.
  • One option for network elements to scale the media traffic by accessing the NALU header would be to parse higher-layer headers, but this has the disadvantage, among others, of introducing a lot of processing overhead and reducing robustness and transparency.
  • a mapping is used to calculate an importance identifier based on the NALU header extension NALU-H and the content of the stream to derive a Network- or Link-Layer specific priority or QoS class association. For example, if motion is more important for the perception of the current scene than quality, the importance identifier could be set to L2,L1,L0,T2,T1,T0,Q1,Q0, where a higher identifier corresponds to a lower drop preference or probability. If quality is more important, then the value L2,L1,L0,Q1,Q0,T2,T1,T0 could be used as the importance identifier. As illustrated in Fig. 6, the importance identifier may be stored in the DS field (Differentiated Services field) of the IP header IP-H.
  • DS field Differentiated Services field
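The two bit orderings given above can be sketched directly; the bit packing follows the text, while the function name and the example slice values are illustrative:

```python
def importance_identifier(layer, temporal, quality, motion_important):
    """Reassemble NALU scalability bits into an 8-bit importance identifier.
    A higher identifier corresponds to a lower drop preference."""
    if motion_important:
        # L2 L1 L0 T2 T1 T0 Q1 Q0 — temporal scalability ranked above quality
        return (layer << 5) | (temporal << 2) | quality
    # L2 L1 L0 Q1 Q0 T2 T1 T0 — quality ranked above temporal scalability
    return (layer << 5) | (quality << 3) | temporal

# Same slice (layer 1, TR 2, QL 1) ranked under both content preferences.
sport = importance_identifier(1, 2, 1, motion_important=True)
news  = importance_identifier(1, 2, 1, motion_important=False)
```

Note how the same slice receives a different identifier under the two preferences, so intermediate nodes drop quality enhancements first for motion-heavy content and temporal enhancements first otherwise.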
  • the importance identifier may, for example, be encoded in accordance with the Assured Forwarding (AF) services defined by the Internet Engineering Task Force (IETF). This defines four priority classes and three drop precedence levels for each class. Thus, it is possible to differentiate between, for example, conversational, audio and video traffic.
  • the video class may then have three distinct packet drop precedence levels. The information in the NALU header and the content description may thus be used to map video packets into one of these drop precedence levels.
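The AF encoding itself is standardized (RFC 2597): the DSCP value is 8*class + 2*drop_precedence. How video packets are assigned to a class and a drop precedence remains a policy choice; the example assignment in the comment is an assumption:

```python
def af_dscp(af_class: int, drop_precedence: int) -> int:
    """RFC 2597 Assured Forwarding DSCP: class 1-4, drop precedence 1-3
    (1 = lowest drop probability, 3 = highest)."""
    assert 1 <= af_class <= 4 and 1 <= drop_precedence <= 3
    return 8 * af_class + 2 * drop_precedence

# Illustrative policy: video in AF class 4 — base-layer packets get drop
# precedence 1 (AF41, DSCP 34), quality enhancements get 3 (AF43, DSCP 38).
```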
  • Fig. 7 is a video example illustrating how an importance identifier may be stored in an Ethernet header.
  • This header includes a Tag Control Information (TCI) field with a Priority (3 bits) field and a Drop Eligible (1 bit) field.
  • TCI Tag Control Information
  • 3 bits may be used to encode up to 8 service classes.
  • the Drop Eligible bit may be used in conjunction with the priority classes to mark frames that should be dropped first.
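The 802.1Q Tag Control Information word packs these fields into 16 bits (the 3-bit priority and 1-bit Drop Eligible indicator are followed by a 12-bit VLAN ID); a minimal packing sketch:

```python
def pack_tci(priority: int, drop_eligible: bool, vlan_id: int) -> int:
    """Pack an IEEE 802.1Q Tag Control Information word:
    3-bit priority (PCP) | 1-bit drop-eligible (DEI) | 12-bit VLAN ID."""
    assert 0 <= priority <= 7 and 0 <= vlan_id <= 0xFFF
    return (priority << 13) | (int(drop_eligible) << 12) | vlan_id

# Example: a quality-enhancement frame marked drop-eligible in class 5.
tci = pack_tci(priority=5, drop_eligible=True, vlan_id=100)
```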
  • the draft RTP transport format update for MPEG-4 AAC-BSAC [14] proposes a mode for RFC 3640 to support the MPEG-4 AAC-BSAC audio codec format with an optional attached bit-stream description.
  • the bit stream description employs the MPEG-21 generalized Bit stream Syntax Description Language (gBSDL).
  • the description is attached as auxiliary header and can be used to support adaptation.
  • An example gBSDL description is given in APPENDIX 1 (corresponds to Example 2 from Annex C in reference [16]).
  • the gBSDL conveys information on how the AAC-BSAC layering is composed.
  • AdaptationQoS conveys information on the bit rates of the Layers ("BANDWIDTH"), the channel-dropping possibility ("NUMBER_OF_CHANNELS"), and for SNR scalability the "ODG" (Objective Difference Grade) relation given in the ODG operator.
  • Fig. 8 is an audio example illustrating how an importance identifier may be stored in an IP header.
  • a version of the gBSDL Scalability Description (APPENDIX 1) is transported in each RTP frame according to [14] (Note: the chatty gBSD could be compressed.).
  • An example solution would be to only use the gBSD Scalability Description transported in the RTP frames, together with a statistical analysis of the gBSD-values from previous RTP frames. This is required since the example gBSD Scalability description does not reveal any details about what specific enhancement is provided for each layer (SNR, bandwidth, channels/ spatial).
  • the Importance Identifier Mapping cannot rely only on intra-RTP-packet information to set the Importance (priority).
  • the Importance Identifier may instead use the rate of change of some features of the Scalability Description (i.e., inter-RTP-packet information may be used). Examples:
  • Base Layer change analysis: a higher number of bits allocated to the base layer than the long-term average of base layer bits may be used as an indication of increased importance.
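The base-layer change analysis can be sketched with a simple exponential moving average over previous RTP frames. The smoothing factor and the strict greater-than test are illustrative choices, not prescribed by the text:

```python
class BaseLayerMonitor:
    """Flag frames whose base-layer bit count exceeds the long-term
    average of previous frames (a sketch of the inter-RTP-packet
    analysis described above; alpha is an assumed smoothing factor)."""

    def __init__(self, alpha: float = 0.05):
        self.alpha = alpha
        self.avg = None  # long-term average of base-layer bits

    def update(self, base_bits: int) -> bool:
        """Return True if this frame looks more important than average."""
        if self.avg is None:
            self.avg = float(base_bits)
            return False
        important = base_bits > self.avg
        # Update the running average after the comparison.
        self.avg = (1 - self.alpha) * self.avg + self.alpha * base_bits
        return important
```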
  • the XML AdaptationQoS Description may be used together with the mapping function to provide an Importance Identifier for the Network Layer (e.g., IP).
  • the channel importance could be given variable priority, depending on whether the audio content is speech/news or music.
  • if the speech/news content type is indicated, the mono layers will get the highest Importance Identifier value (e.g., 7) and the enhancement layers will get a somewhat lower Importance Identifier (e.g., 4).
  • if the music content type is indicated, the mono layers will still get the highest Importance Identifier value (e.g., 7), but the enhancement layers will get a higher Importance Identifier than in the speech case (e.g., 6).
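The content-dependent assignment above can be captured in a small lookup table. The tuple keys are a hypothetical encoding of (content type, layer kind); the numeric values come from the examples in the text:

```python
# Illustrative mapping from (content type, layer kind) to an Importance
# Identifier value, using the example values given above.
IMPORTANCE = {
    ("speech", "mono"):        7,  # mono layers always most important
    ("speech", "enhancement"): 4,  # enhancements matter less for speech
    ("music",  "mono"):        7,
    ("music",  "enhancement"): 6,  # enhancements matter more for music
}
```

The point of the table is that the same scalability layer receives a different network priority depending on the identified content type.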
  • Fig. 9 is a block diagram of an embodiment of a media stream scaling apparatus in accordance with the present invention.
  • Such an apparatus may be installed at a network node and includes a scaling unit 30, which may also include the routing function. Packets received on an input are scaled either by truncation or by discarding the entire packet (indicated by the dashed lines). The scaling is controlled by an importance identifier obtained from the IP header of the packets by an importance identifier extractor 32. As an alternative, this extractor may be included in unit 30.
  • Fig. 10 is a flow chart illustrating the method of forming a packetized scalable media stream in accordance with the present invention.
  • Step S1 determines a media scalability description.
  • Step S2 determines a media content preference description.
  • Step S3 maps the scalability description and the content preference description into an importance identifier included in and controlling the scalability of the media stream.
  • Fig. 11 is a flow chart illustrating a packetized media stream scaling method in accordance with the present invention.
  • Step S4 receives packets with a content dependent importance identifier.
  • Step S5 scales packets based on the value of the importance identifier.
  • Fig. 12 illustrates an example of a content- aware mapping in accordance with the present invention. This example is based on the video example described with reference to Fig. 6-7.
  • the scalability description includes the layer identifiers L2, L1, L0, the temporal identifiers T2, T1, T0 and the SNR identifiers Q1, Q0.
  • Each identifier is a binary number (0 or 1), and only one identifier has the value 1 in each scalability dimension (layer, temporal, SNR). That is, a frame may have, for example, the following scalability description:
  • the content preference description includes the possible values "high motion", "high quality" and "high resolution".
  • the content preference description of the current frame indicates "high motion”.
  • This is used to order the different identifiers in the scalability description and form an 8 bit number (3+3+2 bits). This number could in principle be used directly as an importance identifier. However, it is likely that fewer bits are available for storing an importance identifier. Hence the 8 bits may have to be mapped into, for example, 3 groups, as indicated in Fig. 12.
  • the two lowest priority groups include all frames with either SNR or temporal enhancements.
  • the other two content preference descriptions "high quality" and "high resolution" would give a different ordering of L2, L1, L0, T2, T1, T0, Q1, Q0 and a different mapping of the resulting 8 bits into the 3 possible values of the importance identifier.
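The Fig. 12 mapping can be sketched as follows. The text only states that each content preference reorders the eight one-hot identifiers and that the resulting 8-bit number is collapsed into 3 groups; the specific orderings and thresholds below are illustrative assumptions:

```python
# Sketch of the content-aware mapping of Fig. 12. The per-preference bit
# orderings and the group thresholds are assumptions for illustration.
ORDERINGS = {
    "high motion":     ["T2", "T1", "T0", "L2", "L1", "L0", "Q1", "Q0"],
    "high quality":    ["Q1", "Q0", "L2", "L1", "L0", "T2", "T1", "T0"],
    "high resolution": ["L2", "L1", "L0", "T2", "T1", "T0", "Q1", "Q0"],
}

def importance(desc: dict, preference: str) -> int:
    """Map a one-hot scalability description (e.g. {"T0": 1, "L0": 1,
    "Q0": 1}) to one of 3 importance groups for the given preference."""
    bits = 0
    for name in ORDERINGS[preference]:
        bits = (bits << 1) | desc.get(name, 0)
    # Collapse the 8-bit value into 3 groups (thresholds are assumed).
    if bits >= 0x80:
        return 2  # highest importance group
    if bits >= 0x10:
        return 1
    return 0      # lowest importance group
```

With the "high motion" ordering, a frame whose most significant set identifier is T2 lands in the highest group, while a frame carrying only low-order SNR identifiers falls into the lowest group.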
  • a QoS mapping function can map the possible values of an importance identifier to the possible classes of the QoS architecture.
  • the QoS mapping could be the following in the case of IP DiffServ. We make use of the AF service class.
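One possible concrete instance of this QoS mapping, assuming the 3 importance identifier values are mapped onto the AF4x class of IP DiffServ (the choice of AF class is an assumption; the DSCP codepoints are the standard AF41/AF42/AF43 values):

```python
# Illustrative QoS mapping from the 3 importance values to DiffServ
# Assured Forwarding DSCP codepoints. Higher importance maps to a lower
# drop precedence within the class.
AF4_DSCP = {
    2: 34,  # AF41: lowest drop precedence, kept longest
    1: 36,  # AF42: medium drop precedence
    0: 38,  # AF43: highest drop precedence, dropped first
}
```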

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to an apparatus for forming a packetized scalable media stream, comprising a scalability information extractor (18) that determines a media scalability description and a content type identifier (22) that determines a media content preference description. A mapper (20) maps the scalability description and the content preference description into an importance identifier that is included in, and controls the scalability of, the media stream.
PCT/SE2006/001056 2005-09-23 2006-09-15 Mesure de flux multimedia WO2007035151A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US71955505P 2005-09-23 2005-09-23
US60/719,555 2005-09-23

Publications (1)

Publication Number Publication Date
WO2007035151A1 true WO2007035151A1 (fr) 2007-03-29

Family

ID=37546782

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2006/001056 WO2007035151A1 (fr) 2005-09-23 2006-09-15 Mesure de flux multimedia

Country Status (1)

Country Link
WO (1) WO2007035151A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2034736A1 (fr) * 2007-09-07 2009-03-11 Nokia Siemens Networks Oy Procédé et dispositif de données de traitement et système de communication comprenant un tel dispositif
EP2129128A1 (fr) * 2008-05-28 2009-12-02 Broadcom Corporation Dispositif de bordure qui permet la livraison efficace de vidéo sur un dispositif portatif
US8255962B2 (en) 2008-05-28 2012-08-28 Broadcom Corporation Edge device reception verification/non-reception verification links to differing devices
WO2023163802A1 (fr) * 2022-02-25 2023-08-31 Futurewei Technologies, Inc. Abandon de paquet rtf sensible au contenu multimédia

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002035844A2 (fr) * 2000-10-24 2002-05-02 Eyeball Networks Inc. Adaptation de qualite mecanique utilisant des profils de qualite personnels et des mesures de performances composites
WO2002056563A2 (fr) * 2001-01-12 2002-07-18 Ericsson Telefon Ab L M Services multimedia intelligents
US20030135631A1 (en) * 2001-12-28 2003-07-17 Microsoft Corporation System and method for delivery of dynamically scalable audio/video content over a network
US20040194142A1 (en) * 1999-12-22 2004-09-30 Zhimei Jiang Method and system for adaptive transmission of smoothed data over wireless channels

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040194142A1 (en) * 1999-12-22 2004-09-30 Zhimei Jiang Method and system for adaptive transmission of smoothed data over wireless channels
WO2002035844A2 (fr) * 2000-10-24 2002-05-02 Eyeball Networks Inc. Adaptation de qualite mecanique utilisant des profils de qualite personnels et des mesures de performances composites
WO2002056563A2 (fr) * 2001-01-12 2002-07-18 Ericsson Telefon Ab L M Services multimedia intelligents
US20030135631A1 (en) * 2001-12-28 2003-07-17 Microsoft Corporation System and method for delivery of dynamically scalable audio/video content over a network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AHMED, T.; MEHAOUA, A.; BOUTABA, R.; IRAQI, Y.: "Adaptive packet video streaming over IP networks: a cross-layer approach", SELECTED AREAS IN COMMUNICATIONS, IEEE JOURNAL ON, vol. 23, February 2005 (2005-02-01), pages 385 - 401, XP002413053 *
JITAE SHIN ET AL: "Quality-of-Service Mapping Mechanism for Packet Video in Differentiated Services Network", IEEE TRANSACTIONS ON MULTIMEDIA, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 3, no. 2, June 2001 (2001-06-01), XP011036246, ISSN: 1520-9210 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2034736A1 (fr) * 2007-09-07 2009-03-11 Nokia Siemens Networks Oy Procédé et dispositif de données de traitement et système de communication comprenant un tel dispositif
EP2129128A1 (fr) * 2008-05-28 2009-12-02 Broadcom Corporation Dispositif de bordure qui permet la livraison efficace de vidéo sur un dispositif portatif
US8209733B2 (en) 2008-05-28 2012-06-26 Broadcom Corporation Edge device that enables efficient delivery of video to handheld device
US8255962B2 (en) 2008-05-28 2012-08-28 Broadcom Corporation Edge device reception verification/non-reception verification links to differing devices
CN101594529B (zh) * 2008-05-28 2015-09-30 美国博通公司 视频处理系统和用于处理视频数据的方法
WO2023163802A1 (fr) * 2022-02-25 2023-08-31 Futurewei Technologies, Inc. Abandon de paquet rtf sensible au contenu multimédia

Similar Documents

Publication Publication Date Title
TWI432035B (zh) 可縮放視訊編碼之圖像反向相容聚合技術
Radha et al. Scalable internet video using MPEG-4
Schierl et al. Using H. 264/AVC-based scalable video coding (SVC) for real time streaming in wireless IP networks
US20070183494A1 (en) Buffering of decoded reference pictures
AU2007230602B2 (en) System and method for management of scalability information in scalable video and audio coding systems using control messages
Wenger et al. RTP payload format for scalable video coding
US20090222855A1 (en) Method and apparatuses for hierarchical transmission/reception in digital broadcast
EP1638333A1 (fr) Codage video à débit adaptif
TW200829032A (en) Generic indication of adaptation paths for scalable multimedia
US20110274180A1 (en) Method and apparatus for transmitting and receiving layered coded video
WO2007035147A1 (fr) Codage de signal source adaptatif
CA2647823A1 (fr) Systeme et procede de gestion d'informations d'evolutivite dans des systemes de codage audio-video evolutif utilisant des messages de commande
US20100161823A1 (en) A streaming service system and method for universal video access based on scalable video coding
Huusko et al. Cross-layer architecture for scalable video transmission in wireless network
US20040139219A1 (en) Transcaling: a video coding and multicasting framework for wireless IP multimedia services
WO2007035151A1 (fr) Mesure de flux multimedia
EP1230802B1 (fr) Paquet de commande specifique a un contenu video mpeg-4 permettant d'obtenir un ensemble personnalise d'outils de codage
KR100799592B1 (ko) 스케일러블 비디오 비트스트림의 계층 변조 송수신을 위한장치 및 그 방법
Seo et al. A practical RTP packetization scheme for SVC video transport over IP networks
JP2024503647A (ja) メディアデータのバックグラウンドデータトラフィック配信
Nafaa et al. RTP4mux: a novel MPEG-4 RTP payload for multicast video communications over wireless IP
Itakura et al. A scalable delivery system based on RTP JPEG2000 video stream format
Takács et al. Forward information-a general approach for scalable audiovisual service delivery
Dunte Efficient Transmission Infrastructure for Scalable Coded MPEG-4/H. 264 Video
IT et al. SUIT Doc Number SUIT_517 Project Number IST-4-028042

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06799683

Country of ref document: EP

Kind code of ref document: A1
