WO2016172328A1

WO2016172328A1 - Content protection and modification detection in adaptive streaming and transport streams

Info

Publication number: WO2016172328A1
Application number: PCT/US2016/028620
Authority: WO
Inventors: Alexander GILADI
Original assignee: Vid Scale, Inc.
Priority date: 2015-04-24
Filing date: 2016-04-21
Publication date: 2016-10-27
Also published as: TW201713095A

Abstract

Systems, methods, and instrumentalities are disclosed for content protection and modification detection in adaptive streaming and transport streams. Content protection may be multi-level, e.g., payload signatures and interval signatures. Content protection may be multi-layered, e.g., overlapping signatures. Signatures may be carried inband, e.g., in transport segments. Content protection may be used for modification detection. Modification detection may be multi-level, e.g., container level detection and bitstream level detection. Types of modifications and sources may be detected and distinguished, e.g., detection of reordering, detection of benign and/or malicious modification of one or more types of content (e.g., bitstream, metadata) by insertion and/or removal of content.

Description

CONTENT PROTECTION AND MODIFICATION DETECTION IN ADAPTIVE STREAMING AND TRANSPORT STREAMS

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims priority to U.S. Provisional Patent Application No. 62/152,639, filed April 24, 2015, which is incorporated herein by reference in its entirety.

BACKGROUND

[0002] Adaptive streaming and transport streams may not be supported by anti-modification techniques (e.g., HTTPS, out-of-band segment integrity verification, and HLS). Anti-modification techniques may have undesirable effects. For example, anti-modification techniques may fail to permit "benign" modifications (e.g., insertion of events, PGR restamping), remultiplexing, and/or scalability. Anti -modification techniques may incur extra requests per segment, create a single point of failure, and/or may be incompatible with adaptive streaming and transport streams.

[0003] However, there is a need for content protection and modification detection in adaptive streaming and transport streams.

SUMMARY

[0004] Systems, methods, and instrumentalities are disclosed for confirming the authenticity of content in adaptive streaming, comprising receiving a media presentation description (MPD) file, receiving a key, requesting content based on the MPD, receiving the content comprising a plurality of packets and an inband signature, determining the authenticity of the content (e.g., by performing a key/signature confirmation check), and upon confirming the authenticity of the content, decoding at least one packet of the content.

[0005] Systems, methods, and instrumentalities are disclosed for protecting content in adaptive streaming, comprising receiving a request for content, the request based upon a media presentation description (MPD) file, and sending content based upon the request, the content comprising a plurality of packets and an inband signature, wherein the authenticity of the inband signature is determined using a key, thereby confirming that no unauthorized addition or removal was performed to the content. [0006] Systems, methods, and instrumentalities are disclosed for inserting an advertisement into content in an adaptive stream, comprising receiving a content, the content comprising a plurality of packets, and inserting an advertisement and an inbound signature into the content, wherein the authenticity of the inband signature is determined using a key, thereby confirming that the advertisement insertion into the content was authorized.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] FIG. 1 is a diagram of an example of a DASH system model.

[0008] FIG. 2 is a diagram of an example of payload signatures in Transport Stream (TS) packets.

[0009] FIG. 3 is a diagram of an example of a marker framework for interval signatures.

[0010] FIG. 4 is a diagram of an example of interval signatures for continuous TS packets.

[0011] FIG. 5 is a diagram of an example of layered interval signatures for continuous TS packets.

[0012] FIG. 6 is a diagram of an example of layered interval signatures for segmented TS packets.

[0013] FIG. 7 A is a system diagram of an example communications system in which one or more disclosed embodiments may be implemented.

[0014] FIG. 7B is a system diagram, of an example wireless transmit/receive unit (WTRU) that may be used within the communications system illustrated in FIG. 7A.

[0015] FIG. 7C is a system diagram of an example radio access network and an example core network that may be used within the communications system illustrated in FIG. 7A.

[0016] FIG 7D is a system diagram of another example radio access network and an example core network that may be used within the communications system illustrated in FIG. 7A.

[0017] FIG. 7E is a system diagram of another example radio access network and an example core network that may be used within the communications system illustrated in FIG. 7A.

DETAILED DESCRIPTION

[0018] A detailed description of illustrative embodiments will now be described with reference to the various figures. Although this description provides a detailed example of possible

implementations, it should be noted that the details are intended to be exemplary and in no way limit the scope of the application. [0019] Systems, methods, and instrumentalities are disclosed for content protection and modification detection in adaptive streaming and transport streams. Content protection may be multi-level, e.g., pay load signatures and interval signatures. Content protection may be multi- layered, e.g., overlapping signatures. Signatures may be carried inband, e.g., in segments such as transport stream (TS) segments. Content protection may be used for modification detection.

Modification detection may be multi-level, e.g., container level detection and bitstream level detection. Types of modifications and sources may be detected and distinguished, e.g., detection of reordering, detection of benign and/or malicious modification of one or more types of content (e.g., bitstream, metadata) by insertion and/or removal of content.

[0020] Content subject to malicious modification may be protected. Protection may be desirable, for example, when media is transferred over an open or otherwise attack-prone network. Content may be protected, for example, by inclusion of signatures in MPEG-2 TS. Signatures may protect content from unauthorized tampering. Content may be wholly or partially protected. Protection may be backwards compatible. Protection (e.g., by including signatures in content) may be minimally intrusive. Signatures may be added, for example, for one or more content component units. A content component unit may be, for example, a PES packet or section. Interval signatures may be added to sign data intervals (e.g., portions) of a transport stream. The signed data intervals may be applied to data within segments (e.g. DASH content segments), or to data within continuous transport streams. A data interval may comprise bytes between two markers, for example. Signed data intervals may, for example, disallow addition and/or removal of content. Benign stream modifications may be detectable. Content protection and detection of content modification may be applicable, for example, to MPEG DASH, SCTE ATS, 3 GPP SA4 and SA3, and ETSI (HbbTV).

[0021] "Over-the-top" (OTT) streaming may, for example, utilize the Internet as a delivery medium for video content. Video content may comprise high-quality video content. Video-capable devices comprise, for example, mobile devices, Internet set-top boxes (STBs), and network TVs.

[0022] "Closed" networks may be controlled by a multi-system operator (MSO). The Internet may be a "best effort" environment. Bandwidth and latency may change. Network conditions may be volatile, for example, in mobile networks. Dynamic adaptation to network changes may be used, for example, to provide a tolerable user experience.

[0023] Adaptive (e.g., scalable, rate-adaptive) streaming may be implemented, for example, by HTTP streaming or UDP streaming. Internet video streaming may, for example, take advantage of HTTP infrastructure, such as content distribution networks (CDNs) and HTTP support on multiple platforms and devices. A firewall may disallow UDP traffic while permitting HTTP traffic behind firewalls. HTTP adaptive streaming of an asset may, for example, be segmented (e.g., virtually or physically) and published to CDN's.

[0024] A client may acquire knowledge of published alternative encodings (e.g., representations) of a streamed asset. A client may construct or use a URL, for example, to download a segment from a given representation. As an example, an Adaptive Bit Rate (ABR) client may observe network conditions. An ABR client may, for example, determine one or more parameters, e.g., bitrate, quality, and/or resolution, to provide a desired quality of experience on the client device at one or more instances of time. As the client determines the optimal URL to use, a client may i ssue an HTTP GET request to download a segment, for example, based on a URL selected to provide a desired quality.

[0025] MPEG DASH may, for example, be built on top of an HTTP/TCP/IP stack, MPEG DASH may define a manifest format, e.g., Media Presentation Description (MPD). MPEG D ASH may, for example, define a segment format for ISO Base Media File Format and/or MPEG-2 Transport Streams. Transport streams (TS) may include MPEG-2 Transport Streams. DASH may define a set of quality metrics at network, client operation, and/or media presentation levels. One or more quali ty metrics may enable an interoperable technique to monitor Quality of Experience and Quality of Service.

[0026] A DASH representation may be defined as an encoded version of a portion or entirety of an asset. A representation may, for example, encode a complete asset or a subset of its components. Examples of representations may be, for example, ISO-BMFF comprising unmultiplexed 2.5 Mbps 720p AVC video and ISO-BMFF representations for 96 Kbps MPEG-4 AAC audio in different languages. A TS comprising video, audio, and subtitles may be a single multiplexed representation. A structure may be combined. For example, video and English audio may be a single multiplexed representation while Spanish and Chinese audio tracks may be separate unmultiplexed

representations.

[0027] A segment (e .g., a DASH segment) may be defined as an addressable unit of media data. In an example, a segment may be a minimal or a smallest individually addressable unit. A segment may be an entity that may be downloaded, e.g., using URLs advertised via the MPD. An example of a media segment may be a 4-second part of a live broadcast at playout time 0:42:38 to 0:42:42, available within a 3-min time window. An example of a media segment may be a complete on- demand movie available during a period the movie is licensed.

[0028] An MPD may comprise an XML document. An MPD may advertise available media. An MPD may provide information that a client may use, for example, to select a representation, make adaptation decisions, and/or retrieve segments from a network. An MPD may be independent of a segment. An MPD may signal one or more properties. One or more properties may be used to determine whether a representation may be played. One or more properties may be used to determine one or more functional properties of a representation, such as whether a segment starts at a random access point. An MPD may use a hierarchical data model, for example, to describe a presentation.

[0029] A representation may be a conceptual level (e.g., a low level) of a hierarchical data model. MPD signal information may comprise, for example, bandwidth, codecs for presentation and techniques to construct URLs to access segments. Additional information may be provided at a conceptual level. Additional information may comprise, for example, starting from trick mode, random access information, layer and view information for scalable and multiview codecs, and/or schemes that may be supported by a client that may play a given representation.

[0030] DASH may provide flexible URL construction functionality. DASH may provide a monolithic per-segment URL, DASH may allow dynamic construction of URLs, for example, by combining parts of the URL, such as base URL's. Base URL's may appear at different levels of a hierarchical data model. Multiple base URL's may be used. Segments may have multi-path functionality. Segments may be requested from one or more locations, which may improve performance and reliability.

[0031 ] DASH may permit use of predefined variables (e.g., segment number, segment time) and/or printf-style syntax, for example, for on-the-fly construction of URLs. URLs may be constructed using templates. This may be helpful, for example, when there would otherwise be explicit URLs or byte ranges for many short segments in one or more representations

[0032] As an example of a template representation, seg $ i ndex% 05 $ . ts, may express one or more segments (e.g., s eg_00001 . t s , seg_00002 . ts , . . . seg_03600 . ts). Segments may be expressed, for example, regardless whether they may be retrieved at the time an MPD is fetched. Multi-segment representations may be used in templates. [0033] Different representations of an asset and/or component may be grouped into adaptation sets. Different representations in an adaptation set may render the same content. A client may switch between representations.

[0034] An example of an adaptation set may be a collection of 10 representations with video encoded at different bitrates and resolutions. A client may switch between representations, for example, at segment or subsegment granularity, while presenting content, e.g., to a viewer.

Presentation may be seamless while switching between representations, for example, under a segment-level restriction. One or more restrictions may be implemented, for example, in a DASH profile and/or as DASH subsets adopted by one or more SDO's. Segment restrictions may be applied, for example, to one or more representations in an adaptation set.

[0035] A period may be a time-limited subset of a presentation. Adaptation sets may be valid in a period. Adaptation sets in different periods may have the same, similar, and/or different representations, for example, in terms of codecs, bitrates, etc. An MPD may have one or more periods for a duration of an asset. A period may be used, for example, for ad markup. Separate periods may be dedicated to one or more parts of an asset and/or to one or more advertisements.

[0036] An MPD may be an XML document that presents a hierarchy. A hierarchy may, for example, have global presentation-level properties (e.g., timing) and/or period-level properties. A hierarchy may have one or more adaptation sets available for a period. A representation may be a lowest level of a hierarchy.

[0037] DASH may use a version (e.g., simplified version) of XLink, for example, to allow loading parts of an MPD (e.g., periods) in real time from a remote location. In an example regarding ad insertion, parts of an MPD may be loaded from a remote location, for example, when precise timing of ad breaks is known ahead of time. An ad server may determine an ad in real time.

[0038] A dynamic MPD may change. A dynamic MPD may be periodically reloaded by a client. A static MPD may be valid for all or part of a presentation. A static MPD may be used, for example, for a VoD application. A dynamic MPD may be used, for example, for live and/or PVR applications.

[0039] A media segment may be a time-bounded part of a representation. An approximate segment duration may appear in an MPD. Segment duration may or may not be the same for all segments. Segment durations may have segments with a tolerance margin, which may be expressed as a percentage (e.g., 25% tolerance margin). [0040] An MPD may comprise information regarding media segments that are unavailable at the time an MPD is read by a client, for example, in a live broadcast scenario. As an example, one or more segments may be available in a defined availability time window. A time window may be calculated, for example, based on wall-clock time and segment duration.

[0041] A segment type may be an index segment. An index segment may appear, for example, as a side file or in a media segment. An index segment may comprise timing and/or random access information. An index may be used, for example, for random access, trick modes, and/or bitstream switching, which may render bitstream switching more efficient. Indexing may be used for VoD and/or PVR type applications.

[0042] Segment-level and representation-level properties may be used, for example, to implement bitstream switching. DASH may indicate functionality requirements for properties, which may be expressed in an MPD. A segment format specification may have format-level restrictions corresponding to functionality requirements.

[0043] A media segment i of a representation R may be denoted asS_R (i). A duration of a media segment may be denoted as D(S_R (i)). A media segment may have an earliest presentation time, which may be denoted as EPT(S_R (i)). EPT may correspond with, for example, an earliest presentation time of a segment or a time at which a segment may be played out at random access.

[0044] A time alignment of segments for representations in an adaptation set may be used, for example, in support of efficient switching. A relationship between a pair of representation R and R and a segment i may be expressed, for example, in accordance with Equation 1 :

[0045] Switching may occur at segment borders without overlapped downloads and dual decoding, for example, when the relationship provided in Eq. 1 is satisfied and when a segment starts with a random access point. Bitstream switching may occur at a subsegment level for example, when indexing is used, a subsegment starts with a random access point and Eq. 1 is satisfied. [0046] Time alignment and random access point placement may be restricted. Restrictions may translate, for example, into encodings with matching IDR frames at segment borders and closed GOP's.

[0047] A DASH client may conceptually comprise an access client, a media engine and an application. An access client may be an HTTP client. A media engine may decode and present media provided to it. An application may receive events from an access client. One or more interfaces may be defined, e.g., an on- the- wire format for MPD and segments.

[0048] FIG. 1 is a diagram of an example of a DASH system model. Timing behavior of a DASH client may vary from Apple HLS, for example. In an example of Apple HLS, segments mentioned in a manifest may be valid and a client may poll for new manifests. A DASH MPD may reduce polling behavior, for example, by defining MPD update frequency and/or allowing calculation of segment availability.

[0049] A static MPD may be valid (e.g., always valid). A dynamic MPD may be valid, for example, beginning at a time it was fetched by a client or for an explicitly stated refresh period. An MPD may implement versioning, for example, by explicitly exposing its publication time.

[0050] An MPD may provide an availability time of the earliest segment of a period, which may^¬ be denoted as T_A (0). A media segment n may be available starting at a time, for example, as provided by Equation 2:

[0051] The relationship given by Eq. 2 may be available, for example, for a duration of a timeshift buffer T_ts, which may be explicitly stated in an MPD. Availability window size may have an impact (e.g., direct impact) on catch-up TV functionality, which may be implemented in a DASH deployment. Segment availability time may be relied on by an access client, for example, when it falls within an MPD validity period.

[0052] An MPD may declare bandwidth B_R for a representation R. An MPD may define a global minimum buffering time, which may be denoted as BT_min An access client may pass a segment to a media engine, for example, after bits are downloaded. A time (e.g., an earliest time) that

segment n may be passed to a media engine may be given by for example, when

a segment starts with a random access point. may represent a download time of segment n. A

DASH client may start a playout immediately, for example, to minimize a delay. An MPD may propose a presentation delay (e.g., as an offset from , for example, to synchronize between

different clients. Segment HTTP GET requests may be synchronized.

[0053] MPD validity and/or segment availability may be calculated using absolute (e.g., wall- clock) time. Media time may be expressed within segments. Drift may develop between an encoder and client clocks, for example, in a live case. Drift may be addressed at a container level, for example, when MPEG-2 TS and ISO-BMFF provide synchronization functionality.

[0054] Events may be an extension of DASH. "Push"-style events may be emulated, for example, using frequent polls in stateless and client-driven HTTP. Upcoming ad breaks may be signaled in advance (e.g., 3 to 8 seconds before their start), for example, in ad insertion practice in cahle/lPTV systems.

[0055] Events may be "blobs," for example, with explicit time and duration information and application-specific payloads. Inband events may be message boxes, which may, for example, appear at a beginning of media segments. MPD events may be a period-level list of timed elements. DASH may define an MPD validity expiration event, which may identify the earliest MPD version as valid after a given presentation time.

[0056] DASH may be used in conjunction with digital rights management (DRM). DASH may support a signaling DRM scheme and its properties, for example, within an MPD. A DRM scheme may be signaled, for example, via a ContentProtection descriptor. An opaque value may be passed within a ContentProtection descriptor. A DRM scheme may be signaled, for example, with a unique identifier for a given scheme and a definition of an opaque value. A DRM scheme may be signaled for example, with a scheme-specific namespace.

[0057] Content may be protected, for example, with Common Encryption for ISO-BMFF (CENG) or Segment Encryption and Authentication. Encryption may, for example, designate which parts of a sample may be encrypted and/or how encryption metadata is signaled within a track. A DRM module may deliver keys to a client, for example, when encryption metadata, is in a segment. Decryption may use, for example, AES-CTR or AES-CBC modes. A CENC framework may, for example, be extensible to use other encryption algorithms. Common Encryption may be used with one or more DRM systems. [0058] DASH Segment Encryption and Authentication (DASH-SEA) may be used with a segment format. Encryption metadata may be passed via an MPD. For example, the MPD may comprise information on which key may be used for decryption of a segment and/or how to obtain a key. A system may comprise, for example, HLS with AES-CBC encryption and HTTPS-based key transport. MPEG-2 TS media segments may be compatible with encrypted HLS segments. A system may be extensible, permitting other encryption algorithms and more DRM systems.

[0059] DASH-SEA may provide a segment authenticity framework. A segment authenticity framework may, for example, confirm that a segment received by a client is a segment an MPD author intended the client to receive. Confirmation may be performed, for example, using a MAC or a digest algorithm. A segment authenticity framework may prevent content modification in a network. Modification may be, for example, ad replacement or alteration of inband events.

[0060] A TS may specify a container format encapsulating packetized elementary streams. A TS may have error correction and stream synchronization features.

[0061] A packet may represent a unit of data in a TS. A packet may comprise a sync byte and a header. A packet may comprise one or more transport fields, for example, when signaled in an adaptation field. A packet may comprise a payload. A packet may have a fixed length, e.g., 1 88 bytes.

[0062] A TS may have a concept of programs. A program may be described by a Program Map Table (PMT). A PMT may have a unique identifier (PID). Elementaiy streams associated with a program may have PIDs listed in a PMT. A TS used in digital television may, for example, comprise three programs that may represent three television channels. In an example, a channel may comprise a video stream, one or more audio streams, and metadata. A receiver may decode a "channel," for example, by decoding pay loads for a PID associated with a program. A receiver may- discard the contents of other PIDs. A TS having more than one program may be referred to as a Multi Program Transport Stream (MPTS). A TS having one program may be referred to as a Single Program Transport Stream (SPTS).

[0063] A program may have program specific information (PSI). For example, there may be four PSI tables: Program Association Table (PAT), Program Map Table (PMT), Conditional Access Table (CAT), and Network Information Table (NIT). [0064] A PAT may list one or more programs available in a transport stream. A program listed in a PAT may be identified, for example, by a 16-bit value, which may be referred to as

program number. A program listed in a PAT may have an associated value of PID for a PMT.

[0065] A program number value 0x0000 may be reserved, for example, to specify a PID to look for a Network Information Table (NIT). A default PID value, e.g., (0x0010), may be used for an NIT, for example, when a PID is not specified. A TS Packet comprising PAT information may, for example, have a PID value 0x0000.

[0066] A PMT may comprise information about one or more programs. In an example, a program may have a one-to-one correspondence with a PMT. More than one PMT section may be transmitted on a PID, for example, in MPEG-2. A single TS PID may comprise PMT information for more than one program. In an example, a PMT may be transmitted on a separate PID that is not used for other packets (e.g., ATSC and SCTE). A PMT may provide information on a program present in a transport stream. Information may comprise, for example, a program_number and a list of elementary streams for a described MPEG-2 program.

[0067] There may be locations for one or more descriptors, for example, that describe an MPEG- 2 program (e.g., in its entirety) and/or one or more descriptors for one or more elementary streams. An elementary stream may be labeled with a stream_type value.

[0068] A Program Clock Reference (PGR) may be transmitted, for example, in an adaptation field of an MPEG-2 TS packet. A PGR may enable a decoder to present synchronized content, such as audio tracks matching associated video. A PGR may be transmitted periodically, e.g., every 100 ms. A PID having a PGR, e.g., for an MPEG-2 program, may be identified, for example, by a per pid value in an associated PMT. A value of a PGR may be employed, for example, to generate a system timing clock in a decoder. A System Time Clock (STC) decoder may provide an accurate time base to synchronize audio and video elementary streams. As an example, timing in MPEG2 may reference an STC. A presentation time stamp (PTS) may be relative to a PGR. A number of bits (e.g., 33 bits) may be based on a PGR (e.g., a 90 kHz clock). A number of bits (e.g., last 9 bits) may be based on the same or different PGR (e.g., a 27 MHz clock). A PGR may have a maximum jitter (e.g., +/- 500 ns).

[0069] A transmission scheme may have a constant bitrate for a transport stream. A multiplexer may insert one or more packets (e.g., null packets), for example, to maintain a constant bitrate in a transport stream. Null packets may not comprise data. A receiver may ignore contents of null packets. A PID value (e.g., OxlFFF) may be reserved for null packets.

[0070] A TS may deliver HTTP links, DASH MPD and/or HTML content. A framework for delivery of a timeline for external data (e.g., ΊΈΜΙ as defined in MPEG-2 systems) may, for example, allow transport of HTTP links embedded in a TS. DSM-CC (e.g., in MPEG-2 TS) may, for example, carry files in a TS. HTML inband may be carried, for example, by HbbTV. Advanced television sets (e.g., smart TVs) may, for example, render HTML, run JavaScript, etc. MPDs may be carried mband, for example, by DASH MPD Update and MPD Patch events. DASH events may be carried, for example, in MPEG-2 transport streams (e.g., on PID 0x004).

[0071] An Adaptive Transport Stream (ATS) may provide backwards-compatible virtual segmentation. An ATS may add constraints to an MPEG-2 TS stream, for example, to allow conversion to one or more adaptive streaming technologies. A single-program MPEG-2 TS stream may be playable in its regular context. A stream may be adapted into an adaptive streaming workflow. An ATS stream may be recognized as an MPEG-2 TS.

[0072] An ATS may have, for example, an encoder boundary point (EBP) structure, an EBP descriptor, and a source description. An EBP structure may be a marker inside a TS packet. An EBP descriptor may describe a multiplex as part of in-band system information. A source description may be a manifest that defines an ATS set. An ATS set may be a set of multiplexes and/or files associated with content.

[0073] An EBP structure may, for example, be carried in an adaptation field of a TS packet. An EBP structure may be inserted, for example, at a transcoding stage. An EBP structure may provide information, such as one or more of a stream mark-up indicator, a wall clock time indicator, information about an upcoming segment, labeling, and a unique identifier.

[0074] A stream mark-up indicator may indicate segmentation points (e.g., virtual segments). A wail clock time indicator may accompany a segment (e.g., each segment). Information on an upcoming segment may comprise, for example, a segment access point (SAP) type. Labeling (e.g., generic labeling) may be provided in an EBP structure. A unique identification of current segment may be, for example, a segment number.

[0075] An EBP descriptor in a PMT may describe one or more elements of an ATS set carried in a multiplex. A description may comprise information, such as one or more locations of EBP structures in one or more elementary streams, one or more random access characteristics, and expected segment size.

[0076] Media segments may be sent over insecure links (e. g., HTTP). Insecure links may be used, for example, for performance reasons. An entity in a network may modify content in an HTTP response. A fake transmitter may be used to produce modified content for broadcast delivery over the air.

[0077] A man-in-the-middle attack may be dangerous, for example in audiovisual content, due to a potential for exploitation of one or more weaknesses of equipment to maliciously crafted parameters and/or replacement of provider-inserted advertising. A man-in-the-middle attack may be dangerous, for example, when content leads to receiver action (e.g., issuing HTTP GET, parsing documents, executing scripts, etc.). As an example, a computer (e.g., a smart TV such as an HbbTV), may be tricked into running malicious JavaScript scripts, or a session may be hijacked (e.g., as a result of altering embedded MPD or TEMI data may be altered or replaced). Severe man- in-the-middle attacks have been reported for HbbTV.

[0078] DRM may not mitigate a man-in-the-middle attack. DRM techniques protect the media content from unauthorized viewing. DRM techniques (perhaps other than full-segment encryption via Apple HLS) do not provide protection against modification of media.

[0079] Content changes may be benign. Examples of benign changes may be, for example, addition of a track, ad insertion markup, addition of NULL packets to pad a TS to a specific bitrate, extraction of a single stream from a multi-program transport stream, changing timestamps, changing PMT or PAT, etc. (e.g., in MPEG-2 TS). A change may result in invalid content, and so detection of such a change may be useful in order to isolate of the workflow stage where noncompliance first appeared.

[0080] Adaptive streaming is not well supported by data anti-modification techniques (e.g., HTTPS, out-of-band segment integrity verification, and HLS), for example, because they fail to permit "benign" modifications (e.g., insertion of events, PGR restamping), fail to permit remultipiexing and, further, fail to provide scalability, incur extra requests per segment, create a single point of failure, and/or are incompatible with continuous streams.

[0081] HTTPS for segment download provides security, but lacks scalability.

[0082] Out-of-band (outband) segment integrity verification (e.g., DASH Part 4 ISO/IEC 23009- 4) may protect a segment by preventing an attacker from modifying a segment or events (e.g., modifications by CDNs or remultiplexing), for example, when there is a single entity (or group of entities) having the same key and generating bitwise identical segments. Outband segment integrity verification may create a single point of failure and may incur extra requests per segment (e.g., HTTPS client requests for each individually requested and retrieved segment). A large live audience (eg. ., an audience viewing a broadcast sporting event) may access the same server at the same time to create a point of contention and failure. Outband segment integrity verification may not work (e.g., may not be supported) for continuous streams.

[0083] HLS may define full-segment encryption, which may protect media from modification. Continuous delivery of MPEG-2 TS may make data protection techniques relying on discreet fully- encrypted content segments or use of HTTPS irrelevant, for example, as IP multicast may be used for the purpose. Fully encrypted content may not be an MPEG-2 TS, and for example, may not be processed by existing equipment and/or changes, such as insertion of ad mark-up or remultiplexing, may not be possible.

[0084] Signatures may be carried inband. As an example, a signature may be placed in a transport stream, such as an MPEG-2 TS. A signature may be carried by TS segments (e.g., TS segments in MPEG DASH). A signature may be carried in parts of (e.g., in one or more TS in) a continuous stream, such as broadcast TV (terrestrial, cable, satellite, etc.).

[0085] A signature may be a message authentication code (MAC), which may be carried in one or more (e.g., each) segment, subsegment, at an interval in a continuous stream, or in a content component in a continuous stream.

[0086] FIG. 2 is a diagram of an example of payload signatures in TS packets. A "pavload" signature may be used to sign, for example, a payload of a partial (incomplete) packetized elementary stream (PES) packet in a TS, a whole (complete) PES packet (e.g., a PES packet with timing), or a section. A section may be used, for example, for PSI (PAT/PMT) and/or SCTE 35. In the example illustration of FIG. 2, a payload signature for a complete PES packet is computed and carried in a TS packet in association with the one or more TS payloads which carry the PES packet data. The payload signature may be carried in an adaptation field of a TS packet, for example. The payload signature may be carried in the first TS packet associated with the PES packet, for example as illustrated in FIG. 2. The payload signature may be carried in the last TS packet associated with the PES packet, or in any of the TS packets associated with the PES packet. A descriptor may be used to carry the payload signature. [0087] A pay load signature may improve or guarantee the integrity and/or authenticity of content. Content may be, for example, one or more carried portions of content (e.g., audio, video, textual, data), such as content carried in a PES packet or a section.

[0088] A composition of a stream (e.g. , a TS) with a payload signature may be modified. As an example, links and MPDs in a stream may be signed by a content author. A packager may create a new stream comprising content signed by an author in addition to other content.

[0089] As an example, multiple content authors or content sources may sign respective portions of content they authored or sourced. A packager may combine content from different sources. For example, a packager may combine first content signed by a first content author, second content signed by a second content author, and third content, which may be unsigned. Keys (e.g. , keys used to verify signatures at a client) may be specified for the different signatures, for example, by signaling Key ID's.

[0090] FIG. 3 is a diagram of an example of a marker framework for interval signatures. An "interval" signature may be carried in a "marker" packet, A marker TS packet may be carried, for example, in a payload, in an adaptation field of a TS packet, or as a part of a virtual segmentation structure (e.g., EBP). A marker TS packet may be on the same PID as the media itself (e.g., video PID) or on a separate PID, for example, using private or section syntax. A marker packet or interval signature may refer to packets (e.g., all packets) between "marker" packets. For example, interval signatures may be carried in TS packets in MPEG-2. A "marker" packet may be, for example, any MPEG-2 TS packet carrying an interval signature. As an example, an EBP, which may be used to mark content (e.g., a content file or content stream) with virtual boundary points for segmentation, may have an interval signature.

[0091 ] FIG. 4 is a diagram of an example of an interval signature in a continuous TS packet stream. In FIG.4, an interval signature may be carried inband in the TS (e.g., m an adaptation field of a marker TS packet), and the data interval over which the interval signature is computed may be based on the location of the current and previous marker TS packets. A marker TS packet may be any TS packet which carries a signature, an interval signature, or a descriptor which identifies or carries signatures for the transport stream. In one or more embodiments, the interval signature may be computed over a data interval consisting of all of the TS packets between the previous and current marker packets. In one or more embodiments (e.g., as illustrated in FIG. 4), the data interval over which the interval signature may be computed may include portions of the marker packets themselves.

[0092] FIG. 5 is a diagram of an example of layered interval signatures for a continuous TS packet stream. In FIG. 5, a pair of overlapping interval signatures (e.g. , which may be denoted as "Signature" and "Overlap Signature" in FIG. 5) may be carried inband in the TS (e.g., in an adaptation field of a marker TS packet). As illustrated in FIG. 5, "Signature" may be an interval signature computed over a data interval which may comprise the TS packets between the previous and current marker packets, and which may include portions of the previous and/or current marker packets themselves. "Overlap Signature" may be an interval signature computed over a data interval which overlaps with the data interval for "Signature," for example. As illustrated in FIG. 5, the data interval for "Overlap Signature" may comprise the previous marker packet. Although not shown in FIG. 5, the data interval for "Overlap Signature" may be extended to include additional data from the TS packets which precede or which follow the previous marker packet.

[0093] FIG. 6 is a diagram of an example of layered interval signatures for segmented TS packets. As illustrated in FIG. 6, the overlapping signature scheme of FIG. 5 may be applied to a range of TS packets which span one or more segments (e.g., one or more DASH segments). As shown in FIG. 6, the signature intervals may be arranged so that one set of signatures covers part or all of the content in several segments. A single segment may be partitioned arbitrarily using multiple marker packets, so that an arbitrary number of "Signature" and "Overlap Signature" pairs may define signed overlapped signatures within a single segment (not shown). In any of FIGS. 4, 5, and 6 it should be understood that the patterns shown may be extended from additional previous marker packets and to additional subsequent marker packets, which may each carry interval signature values such as "Signature" and "Overlap Signature." In addition, any or all of the TS Packets shown in these figures may additionally carry Payload Signatures, as previously discussed (e.g., as illustrated in FIG. 2).

[0094] An interval signature may be applied to continuous (non-segmented) content (e.g., continuous MPEG-2 TS) and/or to one or more segments in segmented content (e.g., segmented MPEG-2 TS or MPEG DASH segments). Interval signatures may span multiple segments, which may be referred to as a multi-segment interval. An interval signature may be mixed. For example, an interval signature for a segment may be carried in a different segment (e.g., a signature of a previous segment may be carried in a current segment, or a signature of a current segment may be carried in a previous segment).

[0095] Signatures may overlap. Different levels of signatures may be applied to packets. As an example, an interval signature may be applied from the first byte of a previous marker packet to the last byte of a signature in a current marker packet. As an example, a pay load signature may be carried that signs data for an upcoming PES packet and a previous PES packet.

[0096] An externally predictable identifier (such as segment number per MPD or per EBP structure) may be used (e.g., prepended or communicated separately), for example, to support random access. A first overlapping signature may be unverifiable, for example, if it is the first overlapping signature after a random access operation (e.g., random access in MPEG-2 DASH, random access for a continuous MPEG-2 TS, channel switching between streams, file access, etc.). As an example, a segment number may be an identifier that may be embedded in an EBP structure, which may be used in random access of continuous streams.

[0097] Benign stream modifications may reassemble content and void a signature. In an example scenario, components may be intact (e.g., payload signatures are accurate) and there may be no unsigned elements, but there may not be an interval signature or the provided interval signature may not match the modified (e.g., benignly modified, remultiplexed or reassembled) content. This scenario may occur for several reasons. Remultiplexing may occur. Null packets may be added, for example, so that a stream has a constant bitrate. PSI and SCTE 35 data may be moved. Different stages of processing may use the same key. Payload signatures may be correct, but an interval signature may be wrong (e.g., it may not match the data interval over which it was apparently- computed). Such a scenario may be averted. As an example, a signature of a previous and/or next TS packet (e.g., neighbor signatures) may be carried, for example, in a location where a "payload" signature may be carried.

[0098] A signature may be, for example, a symmetrical HMAC or AES-GMAC (Galois Message Authentication Code). An HMAC may be based on SHA. Signature overhead may be relatively small. An HMAC-MD5 may be 128 bit long. An HMAC-SHAl may be 160 bits. An HMAC- SHA256 may be 256 bits long. GMAC overhead may be decreased, for example, by signaling an initialization vector.

[0099] A key used for one or more signatures may be identified, for example, in one or more packets. A key may not be carried inband (e.g., in content or in an inband MPD), for example, for security reasons. There may be more than one set of keys. Different keys may be used to sign intervals, different content components and PSI tables. A key exchange mechanism may be used. An example comprises an HTTPS GET to a key URL provided in a manifest (e.g., MPD or m3u8). A key may be provided (e.g., to a media client) in a body of a response to an HTTP GET.

[0100] A receiver may compute a signature from received content segments or a received content stream, for example, during a time between reception of the signature and content decoding. A receiver may compare a computed signature to a signature received from the stream. Multiple keys may exist. Different keys may correspond, for example, to different entities in the network or different stages in a content generation and/or distribution chain, such as an encoder and a packager. Signature timestamps may be transmitted together with key identifiers, for example, so that signing entity and/or signature time may be identifiable.

[0101] A receiver may detect addition or removal of components, for example, by detecting that one or more pavload signatures are correct and that one or more interval signatures are incorrect or missing. A receiver may discover a stage that introduced one or more changes. An encoder may generate audiovisual content in a stage. Another entity may insert advertisement content in another stage. Events may be added by another entity in another stage.

[0102] In an example, layered signatures may be applied to MPEG-2 TS. In an example, layered signatures may be applied to ISO-BMFF. In an example, "markers" may be ISO-BMFF boxes.

Signatures may cover current and previous fragments and/or tracks within segments.

[0103] A descriptor structure may, for example, carry a signature, signature type and key ID information. A descriptor may be carried as a first descriptor in a table (e.g., PAT or PMT). A signature may be applied, for example, to a key id field through an end of a table (e.g., PAT or

PMT). A descriptor may be carried in a TS packet adaptation field. A descriptor may have its signature apply to a payload unit (e.g., PES packet) in which a descriptor appears.

[0104] TABLE 1 presents an example interval signature descriptor syntax for MPEG-2 TS.

Table I Example interval signature descriptor syntax for MPEG-2 TS

[0105] A signaturejevel may be a value indicating which component(s) (e.g., over a defined data interval ) are signed. A value of 0 may indicate all TS packets in a multiplex are signed. A value of 1 may indicate that all TS packets in the multiplex are signed except for those with a PID value of OxlFFF or NULL packets. A value of 2 may indicate that TS packets (e.g., all TS packets) on the current PID are signed. A value of 3 may indicate that concatenated TS packet payloads on the current PID are signed. A value of 4 may indicate concatenated PES packet payloads (e.g., a complete elementary stream) on a PID are signed, e.g., for PIDs carrying PES packets.

[0106] A signature may be an "interval" signature. An interval signature may, for example, cover bytes (e.g., all bytes which match a current signature level) starting from a byte following a previous descriptor until an end of a key id field of a current descriptor. The descriptors may be instances of interval signature extension descriptor ), for example, as defined above.

[0107] An overlap signature may be an "interval" signature computed over a data interval including part or all of a previous "marker" packet. An overlap signature may apply to a data interval consisting entirely of the previous "marker" packet, or to a data interval comprising the previous marker packet and additional data from previous and/or subsequent TS packets. The additional data may be selected to match a current signature level, for example. An overlap signature may apply, for example, starting from a key id field to the end of a (PAT or PMT) table.

[0108] A segment number signature may be a signed segment number or a signed segment earliest presentation time. A segment number signature may be taken from an MPD or EBP structure.

[0109] A timestamp may be the absolute time at which the signature(s) were generated. [0110] A TS packet may be used, for example, to demarcate a border between two intervals. Such a TS packet may be referred to as a "marker packet." A marker packet may, for example, have or include a descriptor structure that carriers signatures. The descriptor structure may be carried in a packet payload or in an adaptation field. A descriptor structure carried in a packet payload may, for example, use sections or private syntax.

[0111] TABLE 2 presents an example payload signature descriptor syntax for MPEG-2 TS.

- Example payload signature descriptor syntax for MPEG-2 TS

[0112] A signaturejevel may be a value indicating which component(s) (e.g., which payload components) are signed. A value of 0 may indicate a complete payload unit is signed. A complete payload unit may comprise, for example, concatenated payloads of TS packets from a current FID between a "payload start" packet and a "payload end" packet, A value of 1 may indicate a PES packet payload is signed, for example, when a PID uses PES syntax.

[0113] A "payload start" packet may be a closest preceding TS packet having a same value of PID as a current packet. A value of payload unit start indicator in this packet may be 1. A payload start packet may be a current packet or a preceding packet, for example, when a signature is carried in a "payload end" packet. [0114] A "payload end" packet may be a last TS packet having a same value of PID as a current packet. A value of payload unit start indicator in this packet may be 0. The first payload-carrying TS packet following a payload end packet may have a payload unit start indicator value of 1.

[0115] A payload signature may cover bytes (e.g., all bytes) in concatenated payloads of TS packets between "start" and "end" packets. A payload signature may cover bytes associated only with the payload components specified by a current signature level. A payload signature may exclude PES headers, for example, to comply with signature level.

[0116] A previous payload signature may be a signature of a previous

payload_signature_extension_descriptor() having a same PID and level. A previous- payload_signature may carry a signature field from a previous descriptor.

[0117] A descriptor_signature may be a signature of bytes (e.g., all bytes) in a descriptor starting from signature_type to a start of a descriptor field.

[0118] A PMT descriptor may provide information on signature type and keys. A signature may be carried in adaptation fields (e.g., only in adaptation fields), for example, to conserve bytes and/or provide optimization.

[0119] Use of public key encryption may eliminate key exchange mechanisms. Efficiency of public encryption may improve with longer intervals and/or higher bitrate streams. Certificates may be carried, for example, in MPEG-2 systems sections, in PES packets or by using private syntax. A certificate may cover a subset of TS packets in an interval, for example, when an interval comprises multiple packets containing parts of a certificate.

[0120] FIG. 7A is a diagram of an example communications system 100 in which one or more disclosed embodiments may be implemented. The communications system 100 may be a multiple access system that provides content, such as voice, data, video, messaging, broadcast, etc., to multiple wireless users. The communications system 100 may enable multiple wireless users to access such content through the sharing of system resources, including wireless bandwidth. For example, the communications systems 100 may employ one or more channel access methods, such as code division multiple access (CDMA), time division multiple access (TDMA), frequency- division multiple access (FDMA), orthogonal FDMA (OFDMA), single-carrier FDMA (SC- FDMA), and the like.

[0121] As shown in FIG. 7A, the communications system 100 may include wireless

transmit/receive units (WTRUs) 102a, 102b, 102c, and/or 102d (which generally or collectively may be referred to as WTRU 102), a radio access network (RAN) 103/104/105, a core network

106/107/109, a public switched telephone network (PSTN) 108, the Internet 110, and other networks 112, though it will be appreciated that the disclosed embodiments contemplate any number of WTRUs, base stations, networks, and/or network elements. WTRUs 102a, 102b, 102c, 102d may be any type of device configured to operate and/or communicate in a wireless environment. By way of example, the WTRUs 102a, 102b, 102c, 102d may be configured to transmit and/or receive wireless signals and may include user equipment (UE), a mobile station, a fixed or mobile subscriber unit, a pager, a cellular telephone, a personal digital assistant (PDA), a smartphone, a laptop, a netbook, a personal computer, a wireless sensor, consumer electronics, and the like.

[0122] The communications systems 100 may also include a base station 114a and a base station 114b. Base stations 1 14a, 114b may be any type of device configured to wirelessly interface with at least one of the WTRUs 102a, 102b, 102c, 102d to facilitate access to one or more communication networks, such as the core network 106/107/109, the Internet 110, and/or the networks 112. By way of example, the base stations 114a, 114b may be a base transceiver station (BTS), a Node-B, an eNode B, a Home Node B, a Home eNode B, a site controller, an access point (AP), a wireless router, and the like. While the base stations 1 14a, 1 14b are depicted as a single element, it will be appreciated that the base stations 1 14a, 1 14b may include any number of interconnected base stations and/or network elements.

[0123] The base station 114a may be part of the RAN 103/104/105, which may also include other base stations and/or network elements (not shown), such as a base station controller (BSC), a radio network controller (RNC), relay nodes, etc. The base station 114a and/or the base station 114b may^¬ be configured to transmit and/ or receive wireless signals within a particular geographic region, which may be referred to as a cell (not shown). The cell may further be divided into cell sectors. For example, the cell associated with the base station 114a may be divided into three sectors. Thus, in one embodiment, the base station 114a may include three transceivers, e.g., one for each sector of the cell. In another embodiment, the base station 1 14a may employ multiple-input multiple output (MIMO) technology and, therefore, may utilize multiple transceivers for each sector of the cell.

[0124] The base stations 114a, 114b may communicate with one or more of the WTRUs 102a, 102b, 102c, 102d over an air interface 115/116/117, which may be any suitable wireless

communication link (e.g., radio frequency (RF), microwave, infrared (IR), ultraviolet (UV), visible light, etc.). The an interface 115/1 16/1 17 may be established using any suitable radio access technology (RAT).

[0125] More specifically, as noted above, the communications system 100 may be a multiple access system and may employ one or more channel access schemes, such as CDMA, TDMA, FDMA, OFDMA, SC-FDMA, and the like. For example, the base station 1 14a in the RAN

103/104/105 and the WTRUs 102a, 102b, 102c may implement a radio technology such as

Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (UTRA), which may establish the air interface 1 15/116/117 using wideband CDMA (WCDMA). WCDMA may include communication protocols such as High-Speed Packet Access (HSPA) and/or Evolved HSPA (HSPA+). HSPA may include High-Speed Downlink Packet Access (HSDPA) and/or High-Speed Uplink Packet Access (HSUPA).

[0126] In another embodiment, the base station 1 14a and the WTRUs 102a, 102b, 102c may implement a radio technology such as Evolved UMTS Terrestrial Radio Access (E-UTRA), which may establish the air interface 115/116/1 17 using Long Term Evolution (LTE) and/or LTE- Advanced (LTE- A).

[0127] In other embodiments, the base station 114a and the WTRUs 102a, 102b, 102c may implement radio technologies such as IEEE 802. 16 (e.g., Worldwide Interoperability for Microwave Access (WiMAX)), CDMA2000, CDMA2000 I X, CDMA2000 EV-DO, Interim Standard 2000 (IS- 2000), interim Standard 95 (IS-95), Interim Standard 856 (IS-856), Global System for Mobile communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), GSM EDGE (GERAN), and the like.

[0128] The base station 114b in FIG. 7A may be a wireless router, Home Node B, Home eNode B, or access point, for example, and may utilize any suitable RAT for facilitating wireless connectivity in a localized area, such as a place of business, a home, a vehicle, a campus, and the like. In one embodiment, the base station 114b and the WTRUs 102c, 102d may implement a radio technology such as IEEE 802.1 1 to establish a wireless local area network (WLAN). In another embodiment, the base station 114b and the WTRUs 102c, 102d may implement a radio technology- such as IEEE 802.15 to establish a wireless personal area network (WPAN). In yet another embodiment, the base station 114b and the WTRUs 102c, 102d may utilize a cellular-based RAT (e.g., WCDMA, CDMA2000, GSM, LTE, LTE- A, etc.) to establish a picoceli or femtocell. As shown in FIG. 7A, the base station 1 14b may have a direct connection to the Internet 1 10. Thus, the base station 114b may not be used to access the Internet 110 via the core network 106/107/109.

[0129] The RAN 103/104/105 may be in communication with the core network 106/107/109, which may be any type of network configured to provide voice, data, applications, and/or voice over internet protocol (VoIP) services to one or more of the WTRUs 102a, 102b, 102c, 102d. For example, the core network 106/107/109 may provide call control, billing services, mobile location- based services, pre-paid calling, Internet connectivity, video distribution, etc., and/or perform high- level security functions, such as user authentication. Although not shown in FIG. 7A, it will be appreciated that the RAN 103/104/105 and/or the core network 106/107/109 may be in direct or indirect communication with other RANs that employ the same RAT as the RAN 103/104/105 or a different RAT. For example, in addition to being connected to the RAN 103/104/105, which may be utilizing an E-UTRA radio technology, the core network 106/107/109 may also be in

communication with another RAN (not shown) employing a GSM radio technology.

[0130] The core network 106/107/109 may also serve as a gateway for the WTRUs 102a, 102b, 102c, 102d to access the PSTN 108, the internet 1 10, and/or other networks 1 12. The PSTN 108 may include circuit-switched telephone networks that provide plain old telephone service (POTS). The Internet 1 10 may include a global system of interconnected computer networks and devices that use common communication protocols, such as the transmission control protocol (TCP), user datagram protocol (UDP) and the internet protocol (IP) in the TCP/IP internet protocol suite. The networks 1 12 may include wired or wireless communications networks owned and/or operated by other service providers. For example, the networks 1 12 may include another core network connected to one or more RANs, which may employ the same RAT as the RAN 103/104/105 or a different RAT.

[0131] One or more of the WTRUs 102a, 102b, 102c, 102d in the communications system 100 may include multi-mode capabilities, e.g., the WTRUs 102a, 102b, 102c, 102d may include multiple transceivers for communicating with different wireless networks over different wireless links. For example, the WTRU 102c shown in FIG. 7 A may be configured to communicate with the base station 114a, which may employ a cellular-based radio technology, and with the base station 114b, which may employ an IEEE 802 radio technology.

[0132] FIG. 7B is a system diagram of an example WTRU 102. As shown in FIG. 7B, the WTRU 102 may include a processor 118, a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, non-removable memory 130, removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and other peripherals 138. It will be appreciated that the WTRU 102 may include any sub-combination of the foregoing elements while remaining consistent with an embodiment. Also, embodiments contemplate that the base stations 114a and 114b, and/or the nodes that base stations 114a and 114b may represent, such as but not limited to transceiver station (BTS), a Node-B, a site controller, an access point (AP), a home node-B, an evolved home node-B (eNodeB), a home evolved node-B (HeNB), a home evolved node-B gateway, and proxy nodes, among others, may include one or more of the elements depicted in FIG. 7B.

[0133] The processor 1 18 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like. The processor 118 may perform signal coding, data, processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment. The processor 118 may be coupled to the transceiver 120, which may be coupled to the transmit/receive element 122, While FIG. 7B depicts the processor 1 18 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.

[0134] The transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station g(.e,. the base station 1 14a) over the air interface 115/1 16/117. For example, in one embodiment, the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals. In another embodiment, the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, for example. In yet another embodiment, the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.

[0135] In addition, although the transmit/receive element 122 is depicted in FIG. 7B as a single element, the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MEMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 1 15/116/1 17.

[0136] The transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122. As noted above, the WTRU 102 may have multi-mode capabilities. Thus, the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, for example.

[0137] The processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g. , a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit). The processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128. In addition, the processor 1 1 8 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132. The non-removable memory 130 may include random-access memory (RAM), readonly memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like. In other embodiments, the processor 1 1 8 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).

[0138] The processor 1 18 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102. The power source 134 may be any suitable device for powering the WTRU 102. For example, the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), etc. ), solar cells, fuel cells, and the like.

[0139] The processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102. In addition to, or in lieu of, the information from the GPS chipset 136, the WTRU 102 may receive location information over the air interface 1 15/116/117 from a base station (e.g., base stations 114a, 114b) and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.

[0140] The processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity. For example, the peripherals 138 may include an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth© module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.

[0141] FIG. 7C is a system diagram of the RAN 103 and the core network 106 according to an embodiment. As noted above, the RAN 103 may employ a UTRA radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 1 1 5. The RAN 103 may also be in communication with the core network 106. As shown in FIG. 7C, the RAN 103 may include Node- Bs 140a, 140b, 140c, which may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 115. The Node-Bs 140a, 140b, 140c may each be associated with a particular cell (not shown) within the RAN 103. The RAN 103 may also include RNCs 142a, 142b. It will be appreciated that the RAN 103 may include any number of Node-Bs and RNCs while remaining consistent with an embodiment.

[0142] As shown in FIG. 7C, the Node-Bs 140a, 140b may be in communication with the RNC 142a. Additionally, the Node-B 140c may be in communication with the RNC142b. The Node-Bs 140a, 140b, 140c may communicate with the respective RNCs 142a, 142b via an lub interface. The RNCs 142a, 142b may be in communication with one another via an lur interface. Each of the RNCs 142a, 142b may be configured to control the respective Node-Bs 140a, 140b, 140c to which it is connected. In addition, each of the RNCs 142a, 142b may be configured to carry out or support other functionality, such as outer loop power control, load control, admission control, packet scheduling, handover control, macrodiversity, security functions, data encryption, and the like.

[0143] The core network 106 shown in FIG. 7C may include a media gateway (MGW) 144, a mobile switching center (MSC) 146, a serving GPRS support node (SGSN) 148, and/or a gateway GPRS support node (GGSN) 150. While each of the foregoing elements are depicted as part of the core network 106, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator. [0144] The RNC 142a in the RAN 103 may be connected to the MSG 146 in the core network 106 via an luCS interface. The MSG 146 may be connected to the MGW 144. The MSG 146 and the MGW 144 may provide the WTRUs 102a, 102b, 102c with access to circuit-switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and land-line communications devices.

[0145] The RNC 142a in the RAN 103 may also be connected to the SGSN 148 in the core network 106 via an luPS interface. The SGSN 148 may be connected to the GGSN 150. The SGSN 148 and the GGSN 150 may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 1 10, to facilitate communications between and the WTRUs 102a, 102b, 102c and IP-enabled devices.

[0146] As noted above, the core network 106 may also be connected to the networks 1 12, which may include other wired or wireless networks that are owned and/or operated by other service providers.

[0147] FIG. 7D is a system diagram of the RAN 104 and the core network 107 according to an embodiment. As noted above, the RAN 104 may employ an E-UTRA radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 1 16. The RAN 104 may also be in communication with the core network 107.

[0148] The RAN 104 may include eNode-Bs 160a, 160b, 160c, though it will be appreciated that the RAN 104 may include any number of eNode-Bs while remaining consistent with an

embodiment. The eNode-Bs 160a, 160b, 160c may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 116. In one embodiment, the eNode-Bs 160a, 160b, 160c may implement MIMO technology. Thus, the eNode-B 160a, for example, may use multiple antennas to transmit wireless signals to, and receive wireless signals from, the WTRU 102a.

[0149] Each of the eNode-Bs 160a, 160b, 160c may be associated with a particular cell (not shown) and may be configured to handle radio resource management decisions, handover decisions, scheduling of users in the uplink and/or downlink, and the like. As shown in FIG. 7D, the eNode-Bs 160a, 160b, 160c may communicate with one another over an X2 interface.

[0150] The core network 107 shown in FIG. 7D may include a mobility management gateway (MME) 162, a serving gateway 164, and a packet data network (PDN) gateway 166. While each of the foregoing elements are depicted as part of the core network 107, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.

[0151] The MME 162 may be connected to each of the eNode-Bs 160a, 160b, 160c in the RAN 104 via an S1 interface and may serve as a control node. For example, the MME 162 may be responsible for authenticating users of the WTRUs 102a, 102b, 102c, bearer activatiori/deactivation, selecting a particular serving gateway during an initial attach of the WTRU s 102a, 102b, 102c, and the like. The MME. 162 may also provide a control plane function for switching between the RAN 104 and other RANs (not shown) that employ other radio technologies, such as GSM or WCDMA.

[0152] The serving gateway 164 may be connected to each of the eNode-Bs 160a, 160b, 160c in the RAN 104 via the SI interface. The serving gateway 164 may generally route and forward user data packets to/from the WTRUs 102a, 102b, 102c. The serving gateway 164 may also perform other functions, such as anchoring user planes during inter-eNode B handovers, triggering paging when downlink data is available for the WTRUs 102a, 102b, 102c, managing and storing contexts of the WTRUs 102a, 102b, 102c, and the like.

[0153] The serving gateway 164 may also be connected to the PDN gateway 166, which may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices.

[0154] The core network 107 may facilitate communications with other networks. For example, the core network 107 may provide the WTRUs 102a, 102b, 102c with access to circuit-switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and land-line communications devices. For example, the core network 107 may include, or may communicate with, an IP gateway (e.g., an IP multimedia subsystem (IMS) server) that serves as an interface between the core network 107 and the PSTN 108. In addition, the core network 107 may provide the WTRUs 102a, 102b, 102c with access to the networks 1 12, which may include other wired or wireless networks that are owned and/or operated by other service providers.

[0155] FIG. 7E is a system diagram of the RAN 105 and the core network 109 according to an embodiment. The RAN 105 may be an access service network (ASN) that employs IEEE 802.16 radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 1 17. As will be further discussed below, the communication links between the different functional entities of the WTRUs 102a, 102b, 102c, the RAN 105, and the core network 109 may be defined as reference points. [0156] As shown in FIG. 7E, the RAN 105 may include base stations 180a, 180b, 180c, and an ASN gateway 182, though it will be appreciated that the RAN 105 may include any number of base stations and ASN gateways while remaining consistent with an embodiment. The base stations 180a, 180b, 180c may each be associated with a particular cell (not shown) in the RAN 105 and may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 117. In one embodiment the base stations 180a, 180b, 180c may implement MIMQ technology. Thus, the base station 180a, for example, may use multiple antennas to transmit wireless signals to, and receive wireless signals from, the WTRU 102a. The base stations 180a, 180b, 180c may also provide mobility management functions, such as handoff triggering, tunnel establishment, radio resource management, traffic classification, quality of service (QoS) policy enforcement, and the like. The ASN gateway 182 may serve as a traffic aggregation point and may be responsible for paging, caching of subscriber profiles, routing to the core network 109, and the like.

[0157] The air interface 117 between the WTRUs 102a, 102b, 102c and the RAN 105 may be defined as an Rl reference point that implements the IEEE 802, 16 specification. In addition, each of the WTRUs 102a, 102b, 102c may establish a logical interface (not shown) with the core network 109. The logical interface between the WTRUs 102a, 102b, 102c and the core network 109 may be defined as an R2 reference point, which may be used for authentication, authorization, IP host configuration management, and/or mobility management.

[0158] The communication link between each of the base stations 180a, 180b, 180c may be defined as an R8 reference point that includes protocols for facilitating WTRU handovers and the transfer of data between base stations. The communication link between the base stations 180a, 180b, 180c and the ASN gateway 182 may be defined as an R6 reference point. The R6 reference point may include protocols for facilitating mobility management based on mobility events associated with each of the WTRUs 102a, 102b, 102c.

[0159] As shown in FIG. 7E, the RAN 105 may be connected to the core network 109. The communication link between the RAN 105 and the core network 109 may defined as an R3 reference point that includes protocols for facilitating data transfer and mobility management capabilities, for example. The core network 109 may include a mobile IP home agent (ΜΪΡ-ΗΑ) 184, an authentication, authorization, accounting (AAA) server 186, and a gateway 188. While each of the foregoing elements are depicted as part of the core network 109, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.

[0160] The MIP-HA may be responsible for IP address management, and may enable the WTRUs 102a, 102b, 102c to roam between different ASNs and/or different core networks. The MIP-HA 184 may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP- enabled devices. The AAA server 186 may be responsible for user authentication and for supporting user services. The gateway 188 may facilitate interworking with other networks. For example, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to circuit- switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and land-line communications devices. In addition, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to the networks 1 12, which may include other wired or wireless networks that are owned and/or operated by other service providers.

[0161] Although not shown in FIG. 7E, it will be appreciated that the RAN 105 may be connected to other ASNs and the core network 109 may be connected to other core networks. The communication link between the RAN 105 the other ASNs may be defined as an R4 reference point, which may include protocols for coordinating the mobility of the WTRUs 102a, 102b, 102c between the RAN 105 and the other ASNs. The communication link between the core network 109 and the other core networks may be defined as an R5 reference, which may include protocols for facilitating interworking between home core networks and visited core networks.

[0162] Systems, methods, and instrumentalities have been disclosed for content protection and modification detection in adaptive streaming and transport streams. Content protection may be multi-level, e.g., pay load signatures and interval signatures. Content protection may be multi- layered, e.g., overlapping signatures. Signatures may be carried inband, e.g., in transport segments. Content protection may be used for modification detection. Modification detection may be multilevel, e.g., container level detection and bitstream level detection. Types of modifications and sources may be detected and distinguished, e.g., detection of reordering, detection of benign and/or malicious modification of one or more types of content (e.g., bitstream, metadata) by insertion and/or removal of content.

[0163] Although features and elements are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in any combination with the other features and elements. In addition, techniques described herein may be implemented in a computer program, software, or firmware incorporated in a computer-readable medium for execution by a computer or processor. Examples of computer-readable media include electronic signals (transmitted over wired or wireless connections) and computer-readable storage media. Examples of computer-readable storage media include, but are not limited to, a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks, and digital versatile disks (DVDs). A processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

Claims

1. A method of confirming the authenticity of content in adaptive streaming, comprising:

receiving a media presentation description (MPD) file;

receiving a key;

requesting content based on the MPD;

receiving the content comprising a plurality of packets and an inband signature; determining the authenticity of the content using the inband signature and the key; and upon confirming the authenticity of the content, decoding at least one packet of the content.

2. The method of claim 1, wherein the at least one packet comprises the inband signature,

3. The method of claim 2, wherein an adaptation field of the packet comprises the inband signature.

4. The method of claim 1, further comprising determining the authenticity of the inband signature and an additional inband signature using the key, wherein the content comprises the additional inband signature.

5. The method of claim 4, further comprising, upon confirming the authenticity of the inband signature and the additional inband signature, decoding a plurality of packets of the content.

6. The method of claim 5, wherein the plurality of packets of the content are between the inband signature and the additional inband signature.

7. The method of claim 6, wherein the inband signature and the additional inband signature are carried in individual MPEG-2 transport stream packets.

8. The method of claim 6, wherein the inband signature and the additional inband signature are carried in an encoder boundary point packet to provide a virtual segmentation structure.

9. The method of claim 8, wherein the encoder boundary point packet provides at least one virtual boundary point for segmentation.

10. The method of claim 4, further comprising, upon failing to confirm the authenticity of the inband signature and the additional inband signature, determining the addition or removal of components of the content.

11. The method of claim 1, wherein the inband signature is a symmetrical keyed-hash message authentication code (HMAC) or an Advanced Encryption Standard Galois Message Authentication Code (AES-GMAC).

12. The method of claim 1, wherein the key is received out of band from the content.

13. The method of claim 1, wherein the key is a entity- specific key.

14. The method of claim 1, further comprising receiving an additional key, wherein the content comprises an additional inband signature, and determining the authenticity of the additional mband signature using the additional key.

15. A device, comprising:

a processor configured to:

receive a media presentation description (MPD) file;

receive a key;

request a content based on the MPD;

receive the content in a transport stream, wherein the content comprises comprising a plurality of packets and an inband signature;

determine the authenticity of the content using the inband signature and the key; and upon confirming the authenticity of the content, decode at least one packet of the content.

16. The device of claim 15, wherein the at least one packet comprises the inband signature.

17. The device of claim 16, wherein an adaptation field of the packet comprises the inband signature.

18. The device of claim 15, wherein the processor is further configured to determine the authenticitj' of the mband signature and an additional inband signature using the key, wherein the content comprises the additional inband signature.

19. The device of claim 18, wherein the processor is further configured to, upon confirming the authenticity of the inband signature and the ad ditional inband signature, decod e a plurality of packets of the content.

20. The device of claim 19, wherein the plurality of packets of the content are between the inband signature and the additional inband signature.

21 . The device of claim 20, wherein the inband signature and the additional inband signature are carried in individual MPEG-2 transport stream packets.

22. The device of claim 20, wherein the inband signature and the additional inband signature are carried in an encoder boundary point packet to provide a virtual segmentation structure.

23. The device of claim 22, wherein the encoder boundary point packet provides at least one virtual boundary point for segmentation.

24. The device of claim 18, wherein the processor is further configured to, upon failing to confirm the authenticity of the inband signature and the additional inband signature, determine the addition or removal of components of the content.

25. The device of claim 15, wherein the inband signature is a symmetrical keyed-hash message authentication code (HMAC) or an Advanced Encryption Standard Galois Message Authentication Code (AES-GMAC).

26. The device of claim 15, wherein the key is received out of band from the content.

27. The device of claim 15, wherein the key is a entity-specific key.

28. The device of claim 15, wherein the processor is further configured to receive an additional key, wherein the content comprises an additional inband signature, and determine the authenticity of the additional inband signature using the additional key.

29. A method of protecting content in adaptive streaming, comprising:

receiving a request for content, the request based upon a media presentation description (MPD) file; and

sending content based upon the request, the content comprising a plurality of packets and an inband signature;

wherein the authenticity of the inband signature is determined using a key, thereby confirming that no unauthorized addition or removal was performed to the content.

30. A method of inserting an advertisement into content in an adaptive stream, comprising:

receiving a content, the content comprising a plurality of packets; and

inserting an advertisement and an inbound signature into the content;

wherein the authenticity of the inband signature is determined using a key, thereby confirming that the advertisement insertion into the content was authorized.