TWI492598B

TWI492598B - Enhanced block-request streaming system for handling low-latency streaming

Info

Publication number: TWI492598B
Application number: TW102115099A
Authority: TW
Inventors: Michael G Luby; Mark Watson; Lorenzo Vicisano; Payam Pakzad; Bin Wang; Ying Chen; Thomas Stockhammer; Jaber Mohammad Borran
Original assignee: Qualcomm Inc
Priority date: 2012-04-26
Filing date: 2013-04-26
Publication date: 2015-07-11
Also published as: BR112014026741A8; RU2014147463A; CA2869311A1; IL234872A; KR20150003296A; TW201408020A; JP2015519813A; EP2842336A1; CN104221390B; PH12014502203A1; BR112014026741B1; JP6105717B2; MY166917A; CN104221390A; RU2629001C2; PH12014502203B1; WO2013163448A1; KR101741484B1; CA2869311C; BR112014026741A2

Description

Enhanced block-request streaming system for handling low-latency streaming

The present invention relates to improved media streaming systems and methods, and more particularly to systems and methods that adapt to network and buffer conditions in order to optimize the presentation of streamed media and that allow efficient concurrent, or time-distributed, delivery of streamed media data.

Streaming media delivery may become increasingly important as it becomes more common for high-quality audio and video to be delivered over packet-based networks, such as the Internet, cellular and wireless networks, powerline networks, and other types of networks. The quality with which the delivered streaming media can be presented may depend on a number of factors, including the resolution (or other attributes) of the original content, the encoding quality of the original content, the capability of the receiving device to decode and present the media, and the timeliness and quality of the signal received at the receiver. To create a perceived good streaming media experience, the transport and timeliness of the signal received at the receiver may be especially important. Good transport may provide fidelity of the stream received at the receiver relative to the stream sent by the sender, while timeliness may represent how quickly a receiver can start playing out the content after an initial request for that content.

A media delivery system can be characterized as a system having media sources, media destinations, and channels (in time and/or space) separating the sources from the destinations. Typically, a source includes a transmitter with access to media in an electronically manageable form, and a receiver with the ability to electronically control receipt of the media (or an approximation of the media) and to provide the media to a media consumer (e.g., a user having a display device coupled in some way to the receiver, a storage device or element, another channel, etc.).

While many variations are possible, in a common example a media delivery system has one or more servers with access to media content in electronic form. One or more client systems or devices make requests for media to the servers, and the servers convey the media using a transmitter, which is part of the server, that transmits to a receiver at the client, so that the received media can be consumed by the client in some way. In a simple example, there is one server and one client for a given request and response, but that need not be the case.

Traditionally, media delivery systems may be characterized as following either a "download" model or a "streaming" model. The "download" model may be characterized by timing independence between the delivery of the media data and the playout of the media to the user or recipient device.

As an example, media is downloaded far enough in advance of when it is needed or will be used that, when it is used, as much of it as is needed is already available at the recipient. Delivery in the download context is often performed using a file transport protocol such as HTTP, FTP, or File Delivery over Unidirectional Transport (FLUTE), and the delivery rate may be determined by an underlying flow and/or congestion control protocol, such as TCP/IP. The operation of the flow or congestion control protocol may be independent of the playout of the media to the user or destination device, and the playout may take place concurrently with the download or at some other time.

The "streaming" model may be characterized by a tight coupling between the timing of the delivery of the media data and the playout of the media to the user or recipient device. Delivery in this context is often performed using a streaming protocol, such as the Real Time Streaming Protocol (RTSP) for control and the Real-time Transport Protocol (RTP) for the media data. The delivery rate may be determined by a streaming server, often matching the playout rate of the data.

Some disadvantages of the "download" model may be that, because of the timing independence of delivery and playout, either media data may not be available when it is needed for playout (for example, because the available bandwidth is less than the media data rate), causing playout to stop momentarily ("stalling") and resulting in a poor user experience; or media data may have to be downloaded very far in advance of playout (for example, because the available bandwidth is greater than the media data rate), consuming storage resources on the receiving device, which may be scarce, and consuming valuable network resources for delivery, which are wasted if the content is ultimately not played out or otherwise used.

An advantage of the "download" model may be that the technology needed to perform such downloads (for example, HTTP) is very mature, widely deployed, and applicable across a wide range of applications. Download servers and solutions for massive scalability of such file downloads (for example, HTTP web servers and content delivery networks) may be readily available, making deployment of services based on this technology simple and low in cost.

Some disadvantages of the "streaming" model may be that, generally, the delivery rate of the media data is not adapted to the available bandwidth on the connection from server to client, and that specialized streaming servers, or a more complex network architecture providing bandwidth and delay guarantees, are required. Although streaming systems exist that support varying the delivery data rate according to the available bandwidth (for example, Adobe Flash Adaptive Streaming), these systems are generally not as efficient at utilizing all of the available bandwidth as download transport flow control protocols such as TCP.

Recently, new media delivery systems based on a combination of the "streaming" and "download" models have been developed and deployed. An example of such a model is referred to herein as a "block-request streaming" model, wherein a media client requests blocks of media data from a serving infrastructure using a download protocol, such as HTTP. A concern in such systems may be the ability to start playing out a stream, for example decoding and rendering received audio and video streams using a personal computer, displaying the video on a computer screen and playing the audio through built-in speakers, or, as another example, decoding and rendering received audio and video streams using a set-top box, displaying the video on a television display device and playing the audio through a stereo system.

Other concerns, such as being able to decode source blocks fast enough to keep up with the source streaming rate, minimizing decoding latency, and reducing the use of available CPU resources, are also issues. Another concern is to provide a robust and scalable streaming delivery solution that allows components of the system to fail without adversely affecting the quality of the streams delivered to receivers. Other problems may occur because information about a presentation changes rapidly while the presentation is being distributed. Thus, it is desirable to have improved processes and apparatus.

A block-request streaming system provides improvements in the user experience and bandwidth efficiency of such systems, typically using an ingestion system that generates data in a form to be served by a conventional file server (e.g., HTTP, FTP, or the like), wherein the ingestion system intakes content and prepares it as files or data elements to be served by the file server, which may or may not include a cache.

According to embodiments, a media server of a block-request streaming system allows for low-latency streaming of media presentation content. Relatively large media segments for live profile streaming may be aggregated from relatively small media fragments for low-latency streaming. The media segments and the media fragments are encoded according to the same encoding protocol.

The following detailed description, together with the accompanying drawings, will provide a better understanding of the nature and advantages of the present invention.

100‧‧‧Block streaming system
101‧‧‧Block serving infrastructure
101(1)‧‧‧Block serving infrastructure
102‧‧‧Ingested content
103‧‧‧Ingestion system
104‧‧‧HTTP streaming server
106‧‧‧HTTP cache
108‧‧‧Client
110‧‧‧Content store
112‧‧‧Request
114‧‧‧Response
122‧‧‧Network
123‧‧‧Block selector
124‧‧‧Block requester
125‧‧‧Block buffer
126‧‧‧Buffer monitor
127‧‧‧Media decoder
128‧‧‧Media transducer
270‧‧‧Repair segment generator
300‧‧‧Bus
302‧‧‧Ingestion processor
304‧‧‧Memory
306‧‧‧Disk storage
308‧‧‧Video display unit
310‧‧‧Alphanumeric input device
312‧‧‧Network interface device
400‧‧‧Bus
402‧‧‧Client processor
404‧‧‧Memory
406‧‧‧Disk storage
408‧‧‧Video display unit
410‧‧‧Alphanumeric input device
412‧‧‧Network interface device
500‧‧‧MPD
501‧‧‧Period record
502‧‧‧Representation record
503‧‧‧Segment information
504‧‧‧Initialization segment
505(1)‧‧‧Media segment
510‧‧‧Source segment
512‧‧‧Repair segment
700‧‧‧Simple index
702‧‧‧Hierarchical index
900‧‧‧Metadata table
902‧‧‧HTTP streaming client
904‧‧‧Media block
906‧‧‧HTTP streaming server
1000‧‧‧Video stream
1002‧‧‧Block
1004‧‧‧RAP
1200‧‧‧Transmitter
1202‧‧‧Metadata
1204‧‧‧Scalable layer 1
1206‧‧‧Scalable layer 2
1208‧‧‧Scalable layer 3
1210‧‧‧Receiver
1212‧‧‧Media presentation
1300‧‧‧Step
1310‧‧‧Step
1320‧‧‧Step
1330‧‧‧Step
1340‧‧‧Step
1410‧‧‧Step
1420‧‧‧Step
1430‧‧‧Step
1440‧‧‧Step
1450‧‧‧Step
1710‧‧‧Step
1720‧‧‧Step
1730‧‧‧Step
1740‧‧‧Step
1750‧‧‧Step
1760‧‧‧Step
1770‧‧‧Step
3002‧‧‧Media segment
3004‧‧‧Media fragment
3006‧‧‧Media fragment
3008‧‧‧Media fragment

FIG. 1 illustrates elements of a block-request streaming system according to embodiments of the present invention.

FIG. 2 illustrates the block-request streaming system of FIG. 1, showing greater detail in the elements of a client system that is coupled to a block serving infrastructure ("BSI") to receive data processed by a content ingestion system.

FIG. 3 illustrates a hardware/software implementation of an ingestion system.

FIG. 4 illustrates a hardware/software implementation of a client system.

FIG. 5 illustrates possible structures of the content store shown in FIG. 1, including segments and media presentation descriptor ("MPD") files, and a breakdown of segments, timing, and other structure within an MPD file.

FIG. 6 illustrates details of a typical source segment, as might be stored in the content store illustrated in FIGS. 1 and 5.

FIGS. 7a and 7b illustrate simple and hierarchical indexing within files.

FIG. 8(a) illustrates variable block sizing with aligned seek points over a plurality of versions of a media stream.

FIG. 8(b) illustrates variable block sizing with non-aligned seek points over a plurality of versions of a media stream.

FIG. 9(a) illustrates a metadata table.

FIG. 9(b) illustrates the transmission of blocks and a metadata table from a server to a client.

FIG. 10 illustrates blocks that are independent of RAP boundaries.

FIG. 11 illustrates continuous and discontinuous timing across segments.

FIG. 12 is a figure showing an aspect of scalable blocks.

FIG. 13 depicts a graphical representation of the evolution of certain variables within a block-request streaming system over time.

FIG. 14 depicts another graphical representation of the evolution of certain variables within a block-request streaming system over time.

FIG. 15 depicts a cell grid of states as a function of threshold values.

FIG. 16 is a flowchart of a process that might be performed in a receiver that can request single blocks and multiple blocks per request.

FIG. 17 is a flowchart of a flexible pipeline process.

FIG. 18 illustrates an example of a candidate set of requests, their priorities, and the connections on which they can be issued, at a certain time.

FIG. 19 illustrates an example of a candidate set of requests, their priorities, and the connections on which they can be issued, as it has evolved from one time to another.

FIG. 20 is a flowchart of consistent caching server proxy selection based on a file identifier.

FIG. 21 illustrates a syntax definition for a suitable expression language.

FIG. 22 illustrates an example of a suitable hash function.

FIG. 23 illustrates examples of file identifier construction rules.

FIGS. 24(a) through 24(e) illustrate bandwidth fluctuations of TCP connections.

FIG. 25 illustrates multiple HTTP requests for source and repair data.

FIG. 26 illustrates example channel zapping time with and without FEC.

FIG. 27 illustrates details of a repair segment generator that, as part of the ingestion system shown in FIG. 1, generates repair segments from source segments and control parameters.

FIG. 28 illustrates relationships between source blocks and repair blocks.

FIG. 29 illustrates a procedure for live services at different times at the client.

FIG. 30 illustrates the relationship between media segments and the media fragments used for low-latency streaming.

In the figures, like items are referenced with like numbers, and sub-indices are provided in parentheses to indicate multiple instances of like or identical items. Unless otherwise indicated, the final sub-index (e.g., "N" or "M") is not intended to be limited to any particular value, and the number of instances of one item can differ from the number of instances of another item, even when the same number is illustrated and the sub-index is reused.

As described herein, a goal of a streaming system is to move media from its storage location (or the location where it is being generated) to the location where it is being consumed, i.e., presented to a user or otherwise "used up" by a human or electronic consumer. Ideally, the streaming system can provide uninterrupted playback (or, more generally, uninterrupted "consumption") at the receiving end and can begin playing a stream or a collection of streams shortly after the user has requested the stream or streams. For efficiency reasons, it is also desirable that each stream be halted once the user indicates that the stream is no longer needed, such as when the user is switching from one stream to another, or when the stream follows the presentation of another stream, e.g., a "subtitle" stream. If a media component, such as the video, continues to be presented but a different stream is selected to present that media component, it is often preferable to occupy the limited bandwidth with the new stream and stop the old stream.

A block-request streaming system according to the embodiments described herein provides many benefits. It should be understood that a viable system need not include all of the features described herein, as some applications may provide a suitably satisfying experience with fewer than all of the features described herein.

HTTP Streaming

HTTP streaming is a specific type of streaming. With HTTP streaming, the sources may be standard web servers and content delivery networks (CDNs), and standard HTTP may be used. This technique may involve stream segmentation and the use of multiple streams, all within the context of standardized HTTP requests. The media, such as video, may be encoded at multiple bit rates to form different versions, or representations. The terms "version" and "representation" are used synonymously herein. Each version or representation may be broken into smaller pieces, perhaps on the order of a few seconds each, to form segments. Each segment may then be stored on a web server or CDN as a separate file.

On the client side, requests may then be made, using HTTP, for individual segments, which the client splices together seamlessly. The client may switch to different data rates based on the available bandwidth. The client may also request multiple representations, each presenting a different media component, and may present the media in these representations jointly and synchronously. Triggers for switching may include, for example, buffer occupancy and network measurements. When operating in the steady state, the client may pace its requests to the server to maintain a target buffer occupancy.
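The switching behavior described above can be sketched as a small selection function. This is a minimal illustration under stated assumptions, not the method of the disclosure: the function name, the `safety_factor`, the default target buffer level, and the linear scaling of the bandwidth budget by buffer occupancy are all hypothetical choices made for the example.

```python
from typing import Sequence


def choose_representation(bitrates_bps: Sequence[int],
                          measured_bandwidth_bps: float,
                          buffer_seconds: float,
                          target_buffer_seconds: float = 10.0,
                          safety_factor: float = 0.8) -> int:
    """Return the bit rate (bits/s) of the representation to request next.

    Hypothetical policy: spend only a fraction of the measured bandwidth,
    and shrink that budget further while the buffer is below its target.
    """
    if not bitrates_bps:
        raise ValueError("at least one representation is required")
    ordered = sorted(bitrates_bps)
    budget = measured_bandwidth_bps * safety_factor
    if buffer_seconds < target_buffer_seconds:
        # Below target occupancy: scale the budget down linearly.
        budget *= max(buffer_seconds, 0.0) / target_buffer_seconds
    affordable = [rate for rate in ordered if rate <= budget]
    # Fall back to the lowest bit rate when nothing fits the budget.
    return affordable[-1] if affordable else ordered[0]
```

With representations at 0.5, 1 and 2 Mbit/s and 3 Mbit/s of measured bandwidth, this sketch picks the highest representation when the buffer is at or above target and steps down as the buffer drains.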

Advantages of HTTP streaming may include bit-rate adaptation, fast startup and seek, and minimal unnecessary delivery. These advantages come from controlling the delivery so that it is only a short time ahead of playout, making maximum use of the available bandwidth (through variable bit rate media), and optimizing stream segmentation and intelligent client procedures.

A media presentation description may be provided to an HTTP streaming client so that the client can use a collection of files (for example, in formats specified by 3GPP, herein called 3gp segments) to provide a streaming service to the user. A media presentation description, and possibly updates of that media presentation description, describe a media presentation that is a structured collection of segments, each containing media components, such that the client can present the included media in a synchronized manner and can provide advanced features, such as seeking, switching bit rates, and joint presentation of media components in different representations. The client may use the media presentation description information in different ways to provision the service. In particular, from the media presentation description, the HTTP streaming client may determine which segments in the collection can be accessed, so that the data within the streaming service is useful for the client capabilities and the user.

In some embodiments, the media presentation description may be static, although the segments may be created dynamically. The media presentation description may be as compact as possible to minimize the access and download time for the service. Other dedicated server connectivity may be minimized, for example regular or frequent timing synchronization between client and server.

The media presentation may be constructed to permit access by terminals with different capabilities, such as access to different access network types, different current network conditions, display sizes, access bit rates, and codec support. The client may then extract the appropriate information to provide the streaming service to the user.

The media presentation description may also allow deployment flexibility and compactness according to the requirements.

In the simplest case, each alternative representation may be stored in a single 3GP file, i.e., a file conforming to 3GPP TS 26.244, or any other file that conforms to the ISO base media file format as defined in ISO/IEC 14496-12 or derived specifications (such as the 3GP file format described in 3GPP Technical Specification 26.244). In the remainder of this document, when referring to a 3GP file, it should be understood that all described features can be mapped to the more general ISO base media file format as defined in ISO/IEC 14496-12 or any derived specification. The client may then request an initial portion of the file to learn the media metadata (which is typically stored in the movie header box, also referred to as the "moov" box) together with movie fragment times and byte offsets. The client may then issue HTTP partial GET requests to obtain the movie fragments as required.
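The partial GET step can be illustrated with a short sketch that turns a fragment's byte offset and length into an HTTP `Range` header. It only builds the request object and sends nothing; the function name and the example URL are hypothetical, and the offset and length are assumed to have been learned from the file's metadata as described above.

```python
import urllib.request


def build_fragment_request(url: str, byte_offset: int,
                           length: int) -> urllib.request.Request:
    """Build (but do not send) an HTTP partial GET for one movie fragment."""
    if byte_offset < 0 or length <= 0:
        raise ValueError("byte_offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends.
    last_byte = byte_offset + length - 1
    return urllib.request.Request(
        url, headers={"Range": f"bytes={byte_offset}-{last_byte}"})


# Hypothetical usage: a 500-byte fragment starting at offset 1000.
request = build_fragment_request("http://example.com/rep1.3gp", 1000, 500)
```

A server that honors the range would answer with status 206 (Partial Content) and only the requested bytes; a server that ignores it may return the whole file with status 200, so a real client would need to handle both cases.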

In some embodiments, it may be desirable to split each representation into several segments. Where the segment format is based on the 3GP file format, the segments contain non-overlapping time slices of the movie fragments, referred to as "time-wise splitting". Each of these segments may contain multiple movie fragments, and each may be a valid 3GP file in its own right. In another embodiment, the representation is split into an initial segment containing the metadata (typically the movie header "moov" box) and a set of media segments, each containing media data, such that the concatenation of the initial segment and any media segment forms a valid 3GP file, and the concatenation of the initial segment and all media segments of one representation also forms a valid 3GP file. The entire presentation may be formed by playing out each segment in turn, mapping the local timestamps within the file to the global presentation time according to the start time of each representation.
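The timestamp mapping mentioned above can be written as a one-line computation. The sketch below assumes, for illustration only, that local timestamps are integer ticks of a per-track `timescale` (ticks per second) and that the start time is known in seconds; the function and parameter names are hypothetical.

```python
from fractions import Fraction


def to_global_time(local_timestamp: int, timescale: int,
                   start_time_s: Fraction) -> Fraction:
    """Map a local timestamp (in ticks) to global presentation time (seconds)."""
    if timescale <= 0:
        raise ValueError("timescale must be positive")
    # Exact rational arithmetic avoids drift from repeated float rounding.
    return start_time_s + Fraction(local_timestamp, timescale)
```

For instance, with a timescale of 90000 ticks per second and a start time of 30 s, local timestamp 90000 maps to global time 31 s.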

It should be noted that, throughout this description, references to a "segment" should be understood to include any data object that is fully or partially constructed or read from a storage medium, or otherwise obtained, as a result of a file download protocol request, including, for example, an HTTP request. For example, in the case of HTTP, the data objects may be stored in actual files residing on a disk or other storage medium connected to, or forming part of, an HTTP server, or the data objects may be constructed by a CGI script or other dynamically executed program that is run in response to the HTTP request. Unless otherwise indicated, the terms "file" and "segment" are used synonymously herein. In the case of HTTP, a segment may be considered to be the entity body of an HTTP response.

The terms "presentation" and "content item" are used synonymously herein. In many examples, the presentation is an audio, video, or other media presentation that has a defined "playout" time, but other variations are possible.

Unless otherwise indicated, the terms "block" and "fragment" are used synonymously herein and generally refer to the smallest aggregation of data that is indexed. Based on the available indexing, a client can request different portions of a fragment in different HTTP requests, or can request one or more consecutive fragments or portions of fragments in a single HTTP request. Where segments based on the ISO base media file format or on the 3GP file format are used, a fragment typically refers to a movie fragment, defined as the combination of a movie fragment header ('moof') box and a media data ('mdat') box.
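To make the 'moof' plus 'mdat' pairing concrete, the following sketch walks the top-level boxes of an ISO base media file held in memory and reports the byte range of each movie fragment. It relies only on the general box header layout (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to end of file"); the function names are hypothetical, and the sketch is not the indexing method of the disclosure.

```python
import struct
from typing import Iterator, List, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # truncated or malformed box; stop scanning
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


def find_fragments(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, length) of every 'moof' box directly followed by 'mdat'."""
    boxes = list(iter_boxes(data))
    fragments = []
    for (type1, off1, size1), (type2, _off2, size2) in zip(boxes, boxes[1:]):
        if type1 == "moof" and type2 == "mdat":
            fragments.append((off1, size1 + size2))
    return fragments
```

Each returned pair is a byte range that a client could request on its own, which is what makes the fragment a natural unit for indexed block requests.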

Herein, to simplify the description, a network carrying data is assumed to be packet-based, with the recognition that, after reading this disclosure, one skilled in the art can apply the embodiments of the present invention described herein to other types of transmission networks, such as continuous bit-stream networks.

Herein, to simplify the description, FEC codes are assumed to provide protection against long and variable delivery times of data, with the recognition that, after reading this disclosure, one skilled in the art can apply embodiments of the present invention to other types of data transmission issues, such as bit-flip corruption of data. For example, without FEC, if the last portion of a requested fragment arrives much later than the earlier portions of the fragment, or the arrival time of the last portion varies widely, then the content zapping time can be long and variable. With FEC and parallel requests, by contrast, only the majority of the data requested for a fragment needs to arrive before the fragment can be recovered, thereby reducing both the content zapping time and its variability. In this description, it may be assumed that the data to be encoded (i.e., the source data) has been broken into equal-length "symbols", which may be of any length (down to a single bit), but symbols could be of different lengths for different parts of the data; e.g., different symbol sizes might be used for different blocks of data.

In this description, to simplify the discussion, it is assumed that FEC is applied to one "block" or "fragment" of data at a time, i.e., a "block" is a "source block" for FEC encoding and decoding purposes. A client device can use the segment indexing described herein to help determine the source block structure of a segment. One skilled in the art can apply embodiments of the present invention to other types of source block structures; e.g., a source block may be a portion of a fragment, or may encompass one or more fragments or portions of fragments.

The FEC codes considered for use with block-request streaming are typically systematic FEC codes, i.e., the source symbols of a source block may be included as part of the encoding of that source block, and thus the source symbols are transmitted. As those skilled in the art will recognize, the embodiments described herein apply equally to FEC codes that are not systematic. A systematic FEC encoder generates some number of repair symbols from a source block of source symbols, and the combination of at least some of the source and repair symbols forms the encoded symbols that are sent over the channel to represent the source block. Some FEC codes may be useful for efficiently generating as many repair symbols as needed, such as "information additive codes" or "fountain codes"; examples of these codes include "chain reaction codes" and "multi-stage chain reaction codes". Other FEC codes, such as Reed-Solomon codes, may in practice generate only a limited number of repair symbols for each source block.

In many of these examples, it is assumed that a client is coupled to a media server or a plurality of media servers, and that the client requests streaming media from the media server or servers over a channel or a plurality of channels. However, more complex arrangements are also possible.

Examples of Benefits

With block-request streaming, the media client maintains a coupling between the timing of the block requests and the timing of the media playout to the user. This model can retain the advantages of the "download" model described above while avoiding some of the disadvantages that stem from the usual decoupling of media playout from data delivery. The block-request streaming model uses the rate and congestion control mechanisms available in transport protocols such as TCP to ensure that the maximum available bandwidth is used for media data. In addition, dividing the media presentation into blocks allows each block of encoded media data to be selected from a set of multiple available encodings.

This selection may be based on any number of criteria, including matching the media data rate to the available bandwidth, even when the available bandwidth changes over time; matching the media resolution or decoding complexity to client capabilities or configuration; or matching user preferences such as language. The selection may also include the download and presentation of auxiliary components, such as accessibility components, closed captioning, subtitles, sign language video, and the like. Examples of existing systems that use the block-request streaming model include Move Networks™, Microsoft Smooth Streaming, and the Apple iPhone™ Streaming Protocol.
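As one illustration of matching the media data rate to the available bandwidth, the following minimal Python sketch picks, for each block, the highest-rate encoding that fits within a measured bandwidth. The function name `select_bitrate`, the `safety` margin, and the example rates are assumptions made for illustration; the description does not prescribe any particular selection rule.

```python
from typing import Sequence


def select_bitrate(available_kbps: Sequence[int],
                   measured_kbps: float,
                   safety: float = 0.8) -> int:
    """Pick the encoding rate to request for the next block.

    available_kbps: rates of the available encodings (hypothetical values).
    measured_kbps:  the client's current bandwidth estimate.
    safety:         assumed headroom factor, not taken from the description.
    """
    budget = measured_kbps * safety
    # Encodings whose rate fits inside the bandwidth budget.
    candidates = [rate for rate in available_kbps if rate <= budget]
    # If nothing fits, fall back to the lowest-rate encoding.
    return max(candidates) if candidates else min(available_kbps)


# Because the choice is made per block, it can follow bandwidth changes.
rates = [300, 800, 1500, 3000]
choices = [select_bitrate(rates, bw) for bw in (2000, 4500, 200)]
```

Because the decision is repeated for every block, a change in the bandwidth estimate takes effect at the next block boundary.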

Commonly, each block of media data may be stored on a server as an individual file, and a protocol such as HTTP is then used, together with HTTP server software running on the server, to request the file as a unit. Typically, the client is provided with metadata files, which may for example be in Extensible Markup Language (XML) format, playlist text format, or binary format. The metadata files describe features of the media presentation, such as the available encodings (for example, required bandwidth, resolutions, encoding parameters, media type, language), typically referred to as "representations" in this document, and the manner in which the encodings have been divided into blocks. For example, the metadata may include a Uniform Resource Locator (URL) for each block. The URL itself may provide a scheme, for example by being prepended with the string "http://" to indicate that the protocol to be used to access the documented resource is HTTP. Another example is "ftp://", which indicates that the protocol to be used is FTP.
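The kind of metadata described above can be sketched as a small Python structure that records the features of one representation and one URL per block. The class name `Representation`, its field names, and the `example.com` URLs are hypothetical and chosen only for illustration; actual metadata formats (XML, playlist text, or binary) would differ.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass
class Representation:
    """One available encoding of the presentation (hypothetical layout)."""
    bandwidth_bps: int
    width: int
    height: int
    language: str
    block_urls: List[str]  # one URL per block, in playout order


def protocol_for(url: str) -> str:
    """Return the URL scheme, which names the access protocol."""
    return urlparse(url).scheme


rep = Representation(
    bandwidth_bps=1_500_000,
    width=1280,
    height=720,
    language="en",
    # Placeholder URLs; a real presentation would list its own.
    block_urls=[f"http://example.com/video/720p/block{i}.bin" for i in range(3)],
)
```

Here `protocol_for(rep.block_urls[0])` yields the scheme of the first block's URL, which tells the client which protocol to use for the request.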

In other systems, media blocks may be constructed "on the fly" by the server, for example in response to a request from the client that indicates, by time, the portion of the media presentation that is requested. For example, in the case of HTTP with the scheme "http://", executing a request for the URL yields a request response that carries some specific data in its entity body. How this request response is generated within the network can vary widely, depending on the implementation of the server servicing such requests.

Typically, each block may be independently decodable. For example, in the case of video media, each block may begin with a "seek point". In some coding schemes, a seek point is referred to as a "random access point" or "RAP", although not all RAPs may be designated as seek points. Similarly, in other coding schemes, a seek point starts at an "Independent Data Refresh" frame, or "IDR", in the case of H.264 video encoding, although not all IDRs may be designated as seek points. A seek point is a position in video (or other) media at which a decoder can begin decoding without requiring any data about prior frames, data, or samples. Such prior data would be required where the frame or sample being decoded was not encoded in a stand-alone fashion but, for example, as the difference between the current frame and the prior frame.

A concern in such systems may be the ability to start playing out a stream, for example decoding and rendering received audio and video streams using a personal computer, displaying the video on a computer screen and playing the audio through built-in speakers, or, as another example, decoding and rendering received audio and video streams using a set-top box, displaying the video on a television display device and playing the audio through a stereo system. A primary concern may be to minimize the delay between the time a user decides to watch new content delivered as a stream and takes an action expressing that decision (for example, clicking a link in a browser window or the play button on a remote control device) and the time the content starts playing on the user's screen, hereinafter called the "content zapping time". Each of these concerns can be addressed by elements of the enhanced system described herein.

One example of content zapping is a user who is watching first content delivered via a first stream, then decides to watch second content delivered via a second stream, and initiates an action to start watching the second content. The second stream may be sent from the same set of servers as the first stream or from a different set. Another example of content zapping is a user who is visiting a website and decides to start watching first content delivered via a first stream by clicking a link in a browser window. In a similar manner, a user may decide to start playing the content not from the beginning but from some time within the stream. The user instructs the client device to seek to a time position, and the user may expect the selected time to be rendered immediately. Minimizing content zapping time is important for video watching, as it gives users a high-quality, fast content-surfing experience when searching and sampling a wide range of available content.

Recently, it has become customary to consider using Forward Error Correction (FEC) codes to protect streaming media during transmission. When sent over a packet network, examples of which include the Internet and wireless networks such as those standardized by groups such as 3GPP, 3GPP2, and DVB, the source stream is placed into packets as it is generated or made available, and thus the packets may be used to carry the source or content stream to receivers in the order in which it is generated or made available.

In a typical application of FEC codes to these types of scenarios, an encoder may use the FEC code to create repair packets, which are then sent in addition to the original source packets containing the source stream. The repair packets have the property that, when source packet loss occurs, received repair packets can be used to recover the data contained in the lost source packets. Repair packets can be used to recover the content of source packets that are entirely lost, but they can also be used to recover from partial packet loss, using either entirely received repair packets or even partially received repair packets. Thus, wholly or partially received repair packets can be used to recover wholly or partially lost source packets.

In yet other examples, other types of corruption can occur to the sent data; for example, bit values may be flipped, and repair packets may then be used to correct such corruption and provide as accurate a recovery of the source packets as possible. In other examples, the source stream is not necessarily sent in discrete packets but may instead be sent, for example, as a continuous bit stream.

There are many examples of FEC codes that can be used to protect a source stream. Reed-Solomon codes are well-known codes for error and erasure correction in communication systems. For erasure correction over, for example, packet data networks, a well-known efficient implementation of Reed-Solomon codes uses Cauchy or Vandermonde matrices as described in L. Rizzo, "Effective Erasure Codes for Reliable Computer Communication Protocols", Computer Communication Review, 27(2):24-36 (April 1997) (hereinafter "Rizzo"), in Bloemer et al., "An XOR-Based Erasure-Resilient Coding Scheme", Technical Report TR-95-48, International Computer Science Institute, Berkeley, California (1995) (hereinafter "XOR-Reed-Solomon"), or elsewhere.

Other examples of FEC codes include LDPC codes, chain reaction codes such as those described in Luby I, and multi-stage chain reaction codes such as those described in Shokrollahi I.

Examples of the FEC decoding process for variants of Reed-Solomon codes are described in Rizzo and XOR-Reed-Solomon. In those examples, decoding can be applied after sufficient source and repair data packets have been received. The decoding process may be computationally intensive and, depending on the CPU resources available, may take considerable time to complete relative to the length of time spanned by the media in the block. The receiver can take this decoding time into account when calculating the delay required between the start of reception of the media stream and playout of the media. The user perceives this decoding delay as a delay between the request for a particular media stream and the start of playback. It is thus desirable to minimize this delay.

In many applications, packets may be further subdivided into symbols to which the FEC process is applied. A packet can contain one or more symbols (or less than one symbol, but symbols are usually not split across groups of packets unless the error conditions among groups of packets are known to be highly correlated). A symbol can have any size, but often the size of a symbol is at most equal to the size of the packet. Source symbols are the symbols that encode the data to be transmitted. Repair symbols are symbols generated, directly or indirectly, from source symbols and sent in addition to the source symbols (i.e., the data to be transmitted can be entirely recovered if all of the source symbols are available and none of the repair symbols are available).

Some FEC codes may be block-based, in that the encoding operations depend on the symbol(s) that are in a block and can be independent of symbols that are not in that block. With block-based encoding, an FEC encoder can generate repair symbols for a block from the source symbols in that block, and then move on to the next block without needing to refer to source symbols other than those of the current block being encoded. In transmission, a source block comprising source symbols may be represented by an encoded block comprising encoded symbols (which may be some source symbols, some repair symbols, or both). When repair symbols are present, not all of the source symbols are required in every encoded block.
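The relationship between source symbols, repair symbols, and block-based encoding can be illustrated with the simplest possible erasure code: a single XOR parity symbol per source block. This is a deliberately simplified stand-in, not the Reed-Solomon or chain reaction codes named in this description, and the function names are assumptions. It recovers at most one missing source symbol per block.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length symbols."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_repair(source_symbols: List[bytes]) -> bytes:
    """Generate one repair symbol from the source symbols of one block."""
    return reduce(xor_bytes, source_symbols)


def recover(received: List[Optional[bytes]], repair: bytes) -> List[bytes]:
    """Restore a block in which exactly one source symbol (None) was lost."""
    missing = [i for i, sym in enumerate(received) if sym is None]
    if len(missing) != 1:
        raise ValueError("single-parity code repairs exactly one erasure")
    present = [sym for sym in received if sym is not None]
    # XOR of the repair symbol with all received symbols yields the lost one.
    restored = reduce(xor_bytes, present, repair)
    block = list(received)
    block[missing[0]] = restored
    return block


source_block = [b"abcd", b"efgh", b"ijkl"]  # equal-length source symbols
repair_symbol = make_repair(source_block)   # depends only on this block
decoded = recover([b"abcd", None, b"ijkl"], repair_symbol)
```

The encoder touches only the symbols of the current block, which is the block-based property described above; stronger codes follow the same pattern but tolerate more than one loss.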

For some FEC codes, notably Reed-Solomon codes, the encoding and decoding time may grow impractically as the number of encoded symbols per source block grows. Thus, in practice, there is often a practical upper bound on the total number of encoded symbols that can be generated per source block (255 is an approximate practical limit for some applications), especially in the typical case where the Reed-Solomon encoding or decoding process is performed by custom hardware. For example, the MPE-FEC processes that use the Reed-Solomon codes included as part of the DVB-H standard to protect streams against packet loss are implemented in specialized hardware within a cell phone, and that hardware is limited to 255 Reed-Solomon total encoded symbols per source block. Since symbols are often required to be placed into separate packet payloads, this places a practical upper bound on the maximum length of the source block being encoded. For example, if a packet payload is limited to 1024 bytes or less and each packet carries one encoded symbol, then an encoded source block can be at most 255 kilobytes, and this is, of course, also an upper bound on the size of the source block itself.
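The bound in the example above is a simple product, shown here as a short Python check. The function name is an assumption; the numbers 255 and 1024 are the ones given in the text.

```python
def max_encoded_block_bytes(max_symbols: int, payload_bytes: int) -> int:
    """Upper bound on an encoded block when each packet carries one symbol."""
    return max_symbols * payload_bytes


limit = max_encoded_block_bytes(255, 1024)
print(limit, limit // 1024)  # 261120 bytes, i.e. 255 kilobytes
```

Since the encoded block contains the source symbols plus any repair symbols, the source block itself cannot exceed the same bound.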

Other concerns, such as being able to decode the source blocks fast enough to keep up with the source streaming rate, minimizing the decoding latency introduced by FEC decoding, and using only a small fraction of the available CPU on the receiving device at any point during FEC decoding, are addressed by elements described herein.

There is a need to provide a robust and scalable streaming delivery solution that allows components of the system to fail without adversely affecting the quality of the streams delivered to receivers.

A block-request streaming system needs to support changes to the structure or metadata of the presentation, for example changes to the number of available media encodings, changes to the parameters of the media encodings (such as bit rate, resolution, aspect ratio, audio or video codecs, or codec parameters), or changes to other metadata associated with the content files, such as URLs. Such changes may be required for a number of reasons, including editing together content from different sources, such as advertising or different segments of a larger presentation, and modifying URLs or other parameters when that becomes necessary as a result of changes in the serving infrastructure, for example due to configuration changes, equipment failures, recovery from equipment failures, or other reasons.

Methods exist in which a presentation may be controlled by a continuously updated playlist file. Since this file is continuously updated, at least some of the changes described above can be made within these updates. A disadvantage of the conventional method is that client devices must continually retrieve (also referred to as "polling") the playlist file, placing load on the serving infrastructure, and that the file may not be cached for longer than the update interval, which makes the task of the serving infrastructure much more difficult. This is addressed by elements herein, so that updates of the kind described above are provided without the need for continuous polling of the metadata file by clients.

Another problem, especially in live services and typically known from broadcast distribution, is the user's inability to view content that was broadcast earlier than the time the user joined the program. Typically, local personal recording consumes unnecessary local storage, or is not possible because the client was not tuned to the program or because it is prohibited by content protection rules. Network recording and time-shift viewing are preferred, but they require individual connections from the user to the server and a delivery protocol and infrastructure separate from the live services, resulting in duplicated infrastructure and significant server costs. This is also addressed by elements described herein.

System Overview

One embodiment of the invention is described with reference to FIG. 1, which shows a simplified diagram of a block-request streaming system embodying the invention.

FIG. 1 illustrates a block-streaming system 100 comprising a block serving infrastructure ("BSI") 101, which in turn comprises an ingestion system 103 for ingesting content 102, preparing that content, and packaging it for service by an HTTP streaming server 104 by storing it in a content store 110 that is accessible to both ingestion system 103 and HTTP streaming server 104. As shown, system 100 may also include an HTTP cache 106. In operation, a client 108, such as an HTTP streaming client, sends requests 112 to HTTP streaming server 104 and receives responses 114 from HTTP streaming server 104 or HTTP cache 106. In each case, the elements shown in FIG. 1 may be implemented, at least in part, in software comprising program code that is executed on a processor or other electronic device.

The content may comprise movies, audio, 2D planar video, 3D video, other types of video, images, timed text, timed metadata, or the like. Some content may involve data that is to be presented or consumed in a timed manner, such as data for presenting auxiliary information (station identification, advertising, stock quotes, Flash™ sequences, etc.) along with other media being played out. Other hybrid presentations that combine other media and/or go beyond merely audio and video may also be used.

As illustrated in FIG. 2, media blocks may be stored within a block serving infrastructure 101(1), which may be, for example, an HTTP server, a content delivery network device, an HTTP proxy, an FTP proxy or server, or some other media server or system. Block serving infrastructure 101(1) is connected to a network 122, which may be, for example, an Internet Protocol ("IP") network such as the Internet. A block-request streaming system client is shown having six functional components: a block selector 123, which is provided with the metadata described above and selects the blocks or partial blocks to be requested from among the plurality of available blocks indicated by the metadata; a block requester 124, which receives request instructions from block selector 123 and performs the operations necessary to send a request for the specified block, portions of a block, or multiple blocks to block serving infrastructure 101(1) over network 122 and to receive the data comprising the block in return; a block buffer 125; a buffer monitor 126; a media decoder 127; and one or more media transducers 128 that facilitate media consumption.

Block data received by block requester 124 is passed for temporary storage to block buffer 125, which stores the media data. Alternatively, the received block data can be stored directly into block buffer 125 as illustrated in FIG. 1. Media decoder 127 is provided with media data by block buffer 125 and performs on this data the transformations necessary to provide suitable input to media transducers 128, which render the media in a form suitable for user or other consumption. Examples of media transducers include visual display devices such as those found in mobile phones, computer systems, or televisions, and may also include audio rendering devices such as speakers or headphones.

An example of a media decoder would be a function that transforms data in the format described in the H.264 video coding standard into analog or digital representations of video frames, such as a YUV-format pixel map with an associated presentation timestamp for each frame or sample.

Buffer monitor 126 receives information concerning the contents of block buffer 125 and, based on this information and possibly other information, provides input to block selector 123, which is used to determine the selection of blocks to request, as described herein.
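The interplay between the block requester, the block buffer, and the buffer monitor can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the names `Block`, `BlockBuffer`, and `fill_buffer`, the injected `fetch` callable, and the "request until a target amount of playout time is buffered" rule are all hypothetical and stand in for the richer selection logic described in this document.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque


@dataclass
class Block:
    index: int
    duration_s: float  # playout time of the media in the block
    data: bytes


@dataclass
class BlockBuffer:
    blocks: Deque[Block] = field(default_factory=deque)

    def buffered_seconds(self) -> float:
        """What a buffer monitor would report: buffered playout time."""
        return sum(block.duration_s for block in self.blocks)


def fill_buffer(buffer: BlockBuffer,
                fetch: Callable[[int], Block],
                next_index: int,
                target_s: float) -> int:
    """Request blocks in order until target_s seconds are buffered.

    fetch stands in for the block requester (for example an HTTP request)
    and must return blocks with a positive duration.
    Returns the index of the next block to request.
    """
    while buffer.buffered_seconds() < target_s:
        buffer.blocks.append(fetch(next_index))
        next_index += 1
    return next_index


# Usage with a fake requester that returns 2-second blocks.
buf = BlockBuffer()
nxt = fill_buffer(buf, lambda i: Block(i, 2.0, b"\x00"), 0, target_s=5.0)
```

In this sketch the monitor's output only decides whether to request another block; it could equally feed a rate-selection rule that chooses among encodings.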

In the terminology used herein, each block has a "playout time" or "duration", which represents the amount of time it would take for the receiver to play the media included in that block at normal speed. In some cases, the playout of the media within a block may depend on having already received data from previous blocks. In rare cases, the playout of some of the media in a block may depend on a subsequent block, in which case the playout time of the block is defined with respect to the media that can be played out within the block without reference to the subsequent block, and the playout time of the subsequent block is increased by the playout time of the media within this block that can only be played out after the subsequent block has been received. Since including media in a block that depends on subsequent blocks is a rare case, the remainder of this disclosure assumes that the media in one block does not depend on subsequent blocks, but those skilled in the art will recognize that this variant can easily be added to the embodiments described below.

A receiver may have controls such as "pause", "fast forward", "reverse", etc., that may result in the block being consumed by playout at a different rate. However, if the receiver can obtain and decode each consecutive sequence of blocks in an aggregate time equal to or less than the aggregate playout time of that sequence excluding its last block, then the receiver can present the media to the user without stalling. In some descriptions herein, a particular position in the media stream is referred to as a particular "time" in the media, corresponding to the time that would have elapsed between the beginning of the media playout and the moment the particular position in the video stream is reached. The time or position in a media stream is a conventional concept. For example, where the video stream comprises 24 frames per second, the first frame can be said to have a position or time of t = 0.0 seconds and the 241st frame can be said to have a position or time of t = 10.0 seconds. Naturally, in a frame-based video stream, position or time need not be continuous, as each of the bits in the stream from the first bit of the 241st frame to just before the first bit of the 242nd frame may all have the same time value.
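The frame-to-time mapping in the example above can be written out as a short Python function. The function name is an assumption; the numbering convention (the first frame is frame 1 and has time 0.0) follows the example in the text.

```python
def frame_time_seconds(frame_number: int, frames_per_second: float) -> float:
    """Media time of a frame, counting frames from 1 as in the text."""
    if frame_number < 1:
        raise ValueError("frames are numbered starting at 1")
    return (frame_number - 1) / frames_per_second


first = frame_time_seconds(1, 24)    # 0.0 seconds
later = frame_time_seconds(241, 24)  # 10.0 seconds
```

Every bit of the 241st frame maps to the same value, which is why time in a frame-based stream advances in steps of 1/24 second rather than continuously.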

Adopting the above terminology, a block-request streaming system (BRSS) comprises one or more clients that make requests to one or more content servers (for example, HTTP servers, FTP servers, etc.). An ingestion system comprises one or more ingestion processors, wherein an ingestion processor receives content (in real time or not), processes the content for use by the BRSS, and stores it, possibly along with metadata generated by the ingestion processor, into storage accessible to the content servers.

The BRSS may also contain content caches that coordinate with the content servers. The content servers and content caches may be conventional HTTP servers and HTTP caches that receive requests for files or segments in the form of HTTP requests that include a URL, and the requests may also include a byte range in order to request less than all of the file or segment indicated by the URL. The clients may include a conventional HTTP client that makes requests of HTTP servers and handles the responses to those requests. The HTTP client is driven by a novel client system that formulates requests, passes them to the HTTP client, gets the responses from the HTTP client, and processes (or stores, transforms, etc.) those responses in order to provide them to a presentation player for playout by a client device. Typically, the client system does not know in advance what media will be needed (since the needs may depend on user input, changes in user input, etc.), so it is said to be a "streaming" system in that the media is "consumed" as soon as it is received, or shortly thereafter. As a result, response delays and bandwidth constraints can cause delays in a presentation, such as a pause in the presentation while the stream catches up to the point the user has reached in consuming the presentation.
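A request that includes a URL and a byte range, as described above, can be sketched with Python's standard `urllib.request` module. The helper name `build_range_request` and the `example.com` URL are hypothetical; the request object is only constructed here, not sent.

```python
from urllib.request import Request


def build_range_request(url: str, offset: int, length: int) -> Request:
    """Build an HTTP GET for `length` bytes starting at `offset`.

    HTTP byte ranges are inclusive on both ends, so the last byte
    position is offset + length - 1.
    """
    if offset < 0 or length < 1:
        raise ValueError("offset must be >= 0 and length >= 1")
    last = offset + length - 1
    return Request(url, headers={"Range": f"bytes={offset}-{last}"})


# Hypothetical segment URL: ask for the first 1024 bytes only.
req = build_range_request("http://example.com/media/segment1.bin", 0, 1024)
```

Passing `req` to `urllib.request.urlopen` would send it; a server that honors the range typically answers with status 206 (Partial Content) and only the requested bytes.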

In order to provide a presentation that is perceived to be of good quality, a number of details can be implemented in the BRSS, at the client end, at the ingestion end, or both. In some cases, the details that are implemented are done in consideration of, and to deal with, the client-server interface at the network. In some embodiments, both the client system and the ingestion system are aware of the enhancement, whereas in other embodiments only one side is aware of the enhancement. In some such cases, the entire system benefits from the enhancement even though one side is not aware of it, while in others the benefit accrues only if both sides are aware of it, but when one side is not aware, the system still operates without failing.

As illustrated in FIG. 3, the ingestion system may be implemented as a combination of hardware and software components, according to various embodiments. The ingestion system may comprise a set of instructions that can be executed to cause the system to perform any one or more of the methodologies discussed herein. The system may be realized as a specific machine in the form of a computer. The system may be a server computer, a personal computer (PC), or any system capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that system. Further, while only a single system is illustrated, the term "system" shall also be taken to include any collection of systems that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The ingestion system may include an ingestion processor 302 (for example, a central processing unit (CPU)), a memory 304 that may store program code during execution, and disk storage 306, all of which communicate with each other via a bus 300. The system may further include a video display unit 308 (for example, a liquid crystal display (LCD) or cathode ray tube (CRT)). The system may also include an alphanumeric input device 310 (for example, a keyboard) and a network interface device 312 for receiving content from a content source and delivering it to content storage.

The disk storage unit 306 may include a machine-readable medium on which may be stored one or more sets of instructions (for example, software) embodying any one or more of the methodologies or functions described herein. The instructions may also reside, completely or at least partially, within the memory 304 and/or within the ingestion processor 302 during their execution by the system, with the memory 304 and the ingestion processor 302 also constituting machine-readable media.

As illustrated in FIG. 4, the client system may be implemented as a combination of hardware and software components, according to various embodiments. The client system may comprise a set of instructions that can be executed to cause the system to perform any one or more of the methodologies discussed herein. The system may be realized as a specific machine in the form of a computer. The system may be a server computer, a personal computer (PC), or any system capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that system. Further, while only a single system is illustrated, the term "system" shall also be taken to include any collection of systems that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The client system may include a client processor 402 (for example, a central processing unit (CPU)), a memory 404 that may store program code during execution, and disk storage 406, all of which communicate with each other via a bus 400. The system may further include a video display unit 408 (for example, a liquid crystal display (LCD) or cathode ray tube (CRT)). The system may also include an alphanumeric input device 410 (for example, a keyboard) and a network interface device 412 for sending requests and receiving responses.

The disk storage unit 406 may include a machine-readable medium on which may be stored one or more sets of instructions (for example, software) embodying any one or more of the methodologies or functions described herein. The instructions may also reside, completely or at least partially, within the memory 404 and/or within the client processor 402 during their execution by the system, with the memory 404 and the client processor 402 also constituting machine-readable media.

Use of the 3GPP File Format

The 3GPP file format, or any other file based on the ISO base media file format (such as the MP4 file format or the 3GPP2 file format), may be used as the container format for HTTP streaming with the following features. A segment index may be included in each segment to signal time offsets and byte ranges, so that the client can download the appropriate pieces of files or media segments as needed. The global presentation timing of the entire media presentation and the local timing within each 3GP file or media segment may be accurately aligned. Tracks within one 3GP file or media segment may be accurately aligned. Tracks across representations may also be aligned by assigning each of them to a global timeline, so that switching across representations may be seamless and the joint presentation of media components in different representations may be synchronous.

The file format may contain a profile for adaptive streaming with the following properties. All movie data may be contained in movie fragments; the "moov" box may not contain any sample information. Audio and video sample data may be interleaved, with requirements similar to those for the progressive download profile as specified in TS 26.244. The "moov" box may be placed at the start of the file, followed by fragment offset data, also referred to as a segment index, containing offset information in time and byte ranges for each fragment, or at least a subset of the fragments, in the containing segment.

It may also be possible for the media presentation description to reference files that follow the existing progressive download profile. In this case, the client may use the media presentation description simply to select the appropriate alternative version from among multiple available versions. Clients may also use HTTP partial GET requests with files compliant with the progressive download profile to request subsets of each alternative version and thereby implement a somewhat less efficient form of adaptive streaming. In this case, the different representations containing the media in the progressive download profile may still adhere to a common global timeline to enable seamless switching across representations.

Advanced Methods Overview

In the following sections, methods for improved block-request streaming systems are described. It should be understood that some of these improvements can be used with or without others of these improvements, depending on the needs of the application. In general operation, a receiver makes requests of a server or other transmitter for specific blocks or portions of blocks of data. Files, also called segments, may contain multiple blocks and are associated with one representation of a media presentation.

Preferably, indexing information, also called a "segment index" or "segment map", is generated that provides a mapping from playout or decode times to byte offsets of the corresponding blocks or fragments within a segment. This segment index may be included within the segment, typically at the beginning of the segment (at least some of the segment map is at the beginning), and is often small. The segment index may also be provided in a separate index segment or file. Especially where the segment index is contained in the segment, the receiver may quickly download some or all of this segment map and subsequently use it to determine the mapping between time offsets and the corresponding byte positions, within the file, of the fragments associated with those time offsets.

A receiver can use the byte offset to request data from the fragments associated with particular time offsets, without having to download all of the data associated with other fragments not associated with the time offsets of interest. In this way, the segment map or segment index can greatly improve the ability of a receiver to directly access the portions of the segment that are relevant to the current time offsets of interest, with benefits including improved content zapping times, the ability to change quickly from one representation to another as network conditions vary, and reduced waste of network resources on downloading media that is not played out at the receiver.

Where switching from one representation (referred to herein as the "switch-from" representation) to another representation (referred to herein as the "switch-to" representation) is considered, the segment index may also be used to identify the start time of a random access point in the switch-to representation, and thereby to identify the amount of data to be requested in the switch-from representation to ensure that the switch is seamless, in the sense that media in the switch-from representation is downloaded up to a presentation time such that playout of the switch-to representation can begin seamlessly from the random access point.
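As a minimal sketch of the switching step described above, and assuming (hypothetically) that the client has already extracted a sorted list of random access point start times, in milliseconds, from the segment index of the switch-to representation, the switch time could be chosen as follows. The function name `next_rap_time` and the sample values are illustrative only and do not come from the specification.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def next_rap_time(rap_times_ms: Sequence[int], earliest_ms: int) -> Optional[int]:
    """Return the first RAP start time at or after earliest_ms, or None if there is none.

    rap_times_ms must be sorted in ascending order.
    """
    i = bisect_left(rap_times_ms, earliest_ms)
    return rap_times_ms[i] if i < len(rap_times_ms) else None


# Hypothetical RAP start times (ms) taken from the switch-to segment index.
switch_to_raps = [20000, 22000, 24000]

# The client is currently playing at 20.9 s, so the earliest usable RAP is at 22 s.
switch_at = next_rap_time(switch_to_raps, 20900)
print(switch_at)  # 22000
```

Under these assumptions, the switch-from representation would be requested up to presentation time `switch_at`, and the switch-to representation from `switch_at` onward.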

Those blocks represent segments of video media or other media that the requesting receiver needs in order to generate output for the user of the receiver. The receiver of the media can be a client device, such as when the receiver receives content from a server that transmits the content. Examples include set-top boxes, computers, game consoles, specially equipped televisions, handheld devices, specially equipped mobile phones, or other client receivers.

Many advanced buffer management methods are described herein. For example, a buffer management method enables clients to request blocks of the highest media quality that may be received in time to be played out with continuity. A variable block size feature improves compression efficiency. The ability to have multiple connections for transmitting blocks to the requesting device, while limiting the frequency of the requests, provides improved transmission performance. Partially received blocks of data can be used to continue the media presentation. A connection can be reused for multiple blocks without having to commit the connection at the start to a particular set of blocks. Consistency in the selection of servers from among multiple possible servers by multiple clients is improved, which reduces the frequency of duplicate content in nearby servers and improves the probability that a server contains an entire file. Clients can request media blocks based on metadata (such as available media encodings) that is embedded in the URLs for the files containing the media blocks. A system can provide for the calculation and minimization of the amount of buffering time required before playout of the content can begin without incurring subsequent pauses in media playout. Available bandwidth can be shared among multiple media blocks and adjusted as the playout time of each block approaches, so that, if necessary, a greater share of the available bandwidth can be allocated toward the block with the nearest playout time.
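The last point, sharing available bandwidth so that blocks nearer to their playout time receive a larger share, can be illustrated with a simple weighting. The sketch below is only one possible policy and is not taken from the specification: it assumes, hypothetically, that each outstanding block is weighted by the inverse of its time to playout. The function name `share_bandwidth` and the numbers are illustrative.

```python
from typing import List


def share_bandwidth(total_bps: float, time_to_playout_s: List[float]) -> List[float]:
    """Split total_bps among blocks, weighting each block by 1 / time-to-playout.

    Assumed policy (not from the source text): the closer a block is to its
    playout time, the larger its weight. All times must be positive.
    """
    if any(t <= 0 for t in time_to_playout_s):
        raise ValueError("time to playout must be positive")
    weights = [1.0 / t for t in time_to_playout_s]
    total_weight = sum(weights)
    return [total_bps * w / total_weight for w in weights]


# Three outstanding blocks due in 1 s, 2 s and 4 s, sharing 7000 bit/s.
shares = share_bandwidth(7000.0, [1.0, 2.0, 4.0])
print(shares)
```

Recomputing the shares periodically as playout times approach would shift bandwidth toward the most urgent block, which is the behavior the paragraph describes.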

HTTP streaming may employ metadata. Presentation-level metadata includes, for example, stream duration, available encodings (bit rates, codecs, spatial resolutions, frame rates, language, media types), pointers to stream metadata for each encoding, and content protection (digital rights management (DRM) information). Stream metadata may be URLs for the segment files.

Segment metadata may include byte range versus time information for requests within a segment and for identification of random access points (RAPs) or other seek points, where some or all of this information may be part of a segment index or segment map.

Streams may comprise multiple encodings of the same content. Each encoding may then be broken into segments, where each segment corresponds to a storage unit or file. In the case of HTTP, a segment is typically a resource that can be referenced by a URL, and a request for such a URL results in the segment being returned as the entity body of the response message. Segments may comprise multiple groups of pictures (GoPs). Each GoP may further comprise multiple fragments, where the segment index provides time/byte-offset information for each fragment; that is, the unit of indexing is a fragment.

Fragments or portions of fragments may be requested over parallel TCP connections to increase throughput. This can mitigate problems that arise when sharing connections on a bottleneck link or when connections are lost due to congestion, thereby increasing the overall speed and reliability of delivery, which can substantially improve the speed and reliability of content zapping. Bandwidth can be traded for latency by over-requesting, but care should be taken to avoid making requests too far into the future, which can increase the risk of starvation.

Multiple requests for segments on the same server may be pipelined (making the next request before the current request completes) to avoid repeated TCP startup delays. Requests for consecutive fragments may be aggregated into one request.
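Aggregating requests for consecutive fragments can be sketched as merging adjacent byte ranges before any request is issued. The sketch below assumes, for illustration, that each fragment is described by a half-open byte range `(start, end)` within one segment; the helper name `coalesce` and the sample ranges are hypothetical.

```python
from typing import List, Tuple

ByteRange = Tuple[int, int]  # half-open: [start, end)


def coalesce(ranges: List[ByteRange]) -> List[ByteRange]:
    """Merge byte ranges that touch or overlap so each run needs only one request."""
    merged: List[ByteRange] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # This range begins where the previous one ends (or inside it): extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Two consecutive fragments plus one later, non-adjacent fragment.
print(coalesce([(0, 100), (100, 250), (400, 500)]))  # [(0, 250), (400, 500)]
```

Each merged range could then be fetched with a single request, reducing the request count at the cost of less fine-grained cancellation.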

Some CDNs prefer large files and may trigger a background fetch of an entire file from an origin server when first seeing a range request. Most CDNs will, however, serve range requests from cache if the data is available. It may therefore be advantageous to have some portion of the client requests be for a whole segment file. These requests can later be cancelled if necessary.

Valid switch points may be seek points, specifically RAPs for example, in the target stream. Different implementations are possible, such as fixed GoP structures or alignment of RAPs across streams (based on the beginning of the media or based on the GoPs).

In one embodiment, segments and GoPs may be aligned across streams of different rates. In this embodiment, GoPs may be of variable size and may contain multiple fragments, but fragments are not aligned between the streams of different rates.

In some embodiments, file redundancy may be employed to advantage. In these embodiments, an erasure code is applied to each fragment to generate redundant versions of the data. Preferably, the source formatting is not changed by the use of FEC, and additional repair segments containing the FEC repair data, for example as a dependent representation of the source representation, are generated and made available as an additional step in the ingestion system. A client that is able to reconstruct a fragment using only the source data for that fragment may request only the source data for the fragment within the segment from the servers. If the servers are unavailable or the connection to the servers is slow, which may be determined either before or after the request for source data, additional repair data for the fragment may be requested from the repair segment. This decreases the time needed to reliably deliver enough data to recover the fragment, possibly using FEC decoding to recover the source data of the fragment from a combination of received source data and repair data. Furthermore, additional repair data can be requested to allow recovery of the fragment if a fragment becomes urgent, that is, its playout time becomes imminent. This increases the share of data for that fragment on a link, but is more efficient than closing other connections on the link to free up bandwidth. It may also mitigate the risk of starvation arising from the use of parallel connections.

The fragment format may be a stored stream of Real-time Transport Protocol (RTP) packets, with audio/video synchronization achieved through the RTP Control Protocol (RTCP).

The segment format may also be a stored stream of MPEG-2 TS packets, with audio/video synchronization achieved through MPEG-2 TS internal timing.

Using Signaling and/or Block Creation to Make Streaming More Efficient

A number of features may or may not be used in a block-request streaming system to provide improved performance. Performance may relate to the ability to play out a presentation without stalling, to obtain media data within bandwidth constraints, and/or to do so within limited processor resources at a client, server, and/or ingestion system. Some of these features will now be described.

Indexing within Segments

In order to formulate partial GET requests for movie fragments, the client may be informed of the byte offset and the start time, in decoding or presentation time, of all media components contained in the fragments within the file or segment, and also of which fragments begin with or contain a random access point (and so are suitable for use as switch points between alternative representations). This information is often referred to as the segment index or segment map. The start time in decoding or presentation time may be expressed directly or may be expressed as a delta relative to a reference time.

This time and byte-offset indexing information may require at least 8 bytes of data per movie fragment. As an example, for a two-hour movie contained within a single file with 500 ms movie fragments, this would be a total of about 112 kilobytes of data. Downloading all of this data when starting a presentation may result in a significant additional startup delay. However, the time and byte-offset data can be encoded hierarchically, so that the client can quickly find a small chunk of time and offset data relevant to the point in the presentation at which it wishes to start. The information may also be distributed within a segment, so that some refinement of the segment index may be interleaved with the media data.

Note that if a representation is segmented in time into multiple segments, the use of this hierarchical coding may not be necessary, since the complete time and offset data for each segment may already be quite small. For example, if segments are one minute long instead of the two hours in the above example, the time and byte-offset indexing information is around 1 kilobyte of data, which can typically fit within a single TCP/IP packet.
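The two figures quoted above follow directly from the stated assumptions of 8 bytes per movie fragment and 500 ms fragments. The short sketch below reproduces the arithmetic; the function name `index_size_bytes` is illustrative and not part of any file format.

```python
def index_size_bytes(duration_s: int, fragment_ms: int, bytes_per_entry: int = 8) -> int:
    """Minimum size of a flat time/byte-offset index: one entry per fragment."""
    fragments = (duration_s * 1000) // fragment_ms
    return fragments * bytes_per_entry


two_hours = index_size_bytes(2 * 3600, 500)  # 14,400 fragments
one_minute = index_size_bytes(60, 500)       # 120 fragments

print(two_hours, two_hours / 1024)  # 115200 bytes, i.e. 112.5 KiB ("about 112 kilobytes")
print(one_minute)                   # 960 bytes, i.e. roughly 1 kilobyte
```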

Different options may be available for adding fragment time and byte-offset data to a 3GPP file. First, the Movie Fragment Random Access box ("MFRA") may be used for this purpose. The MFRA provides a table that may assist readers in finding random access points in a file using movie fragments. In support of this function, the MFRA incidentally contains the byte offsets of MFRA boxes containing random access points. The MFRA may be placed at or near the end of the file, but this is not necessarily the case. By scanning from the end of the file for a Movie Fragment Random Access Offset box and using the size information in it, one may be able to locate the beginning of a Movie Fragment Random Access box. However, placing the MFRA at the end for HTTP streaming typically requires at least three to four HTTP requests to access the desired data: at least one to request the MFRA from the end of the file, one to obtain the MFRA, and finally one to obtain the desired fragment in the file. Placing it at the beginning may therefore be desirable, since the MFRA can then be downloaded together with the first media data in a single request. In addition, using the MFRA for HTTP streaming may be inefficient, since none of the information in the MFRA other than the time and moof_offset is needed, and specifying offsets instead of lengths may require more bits.

Second, the Item Location box ("ILOC") may be used. The ILOC provides a directory of metadata resources in this or other files by locating their containing file, their offset within that file, and their length. For example, a system might integrate all of the externally referenced metadata resources into one file, readjusting file offsets and file references accordingly. However, the ILOC is intended to give the location of metadata, so it may be difficult for it to coexist with real metadata.

Last, and perhaps most suitable, is the specification of a new box, referred to as the Time Index box ("TIDX"), dedicated to providing exact fragment times or durations and byte offsets in an efficient manner. This is described in more detail in the next section. An alternative box with the same functionality may be the Segment Index box ("SIDX"). Herein, unless otherwise indicated, the two may be interchangeable, as both boxes provide the ability to supply exact fragment times or durations and byte offsets in an efficient manner. The differences between TIDX and SIDX are provided below. It should be apparent how to interchange TIDX boxes and SIDX boxes, as both boxes implement a segment index.

Segment Indexing

A segment has an identified start time and an identified number of bytes. Multiple fragments may be concatenated into a single segment, and clients may issue requests that identify the specific byte range within the segment that corresponds to the required fragment or subset of fragments. For example, when HTTP is used as the request protocol, the HTTP Range header may be used for this purpose. This approach requires that the client have access to a "segment index" of the segment that specifies the position within the segment of the different fragments. This segment index may be provided as part of the metadata. The approach has the result that far fewer files need to be created and managed than in an approach where every block is kept in a separate file. Managing the creation, transfer, and storage of very large numbers of files (which could extend to many thousands for, say, a one-hour presentation) can be complex and error-prone, so a reduction in the number of files represents an advantage.
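As a minimal sketch of the byte-range request mentioned above, the following builds an HTTP Range header from a fragment's byte offsets. HTTP byte ranges are inclusive at both ends, so an exclusive end offset (such as the start of the next fragment) has to be reduced by one. The helper name `range_header` and the example URL are hypothetical, and no request is actually sent.

```python
import urllib.request


def range_header(first_byte: int, end_byte_exclusive: int) -> dict:
    """Build a Range header for the half-open byte interval [first_byte, end_byte_exclusive)."""
    if not 0 <= first_byte < end_byte_exclusive:
        raise ValueError("empty or negative byte range")
    # HTTP ranges name the last byte inclusively, hence the "- 1".
    return {"Range": f"bytes={first_byte}-{end_byte_exclusive - 1}"}


# A fragment occupying bytes 0 up to (but not including) 50245 of the segment.
headers = range_header(0, 50245)

# Hypothetical segment URL; the Request object is only constructed, not opened.
req = urllib.request.Request("http://example.com/media/segment1.3gp", headers=headers)
print(req.get_header("Range"))  # bytes=0-50244
```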

If a client knows only the desired start time of a smaller portion of a segment, it might request the whole file and then read the file through to determine the appropriate playout starting location. To improve bandwidth usage, segments can include an index file as metadata, where the index file maps the byte ranges of individual blocks to the time ranges to which the blocks correspond; this is called a segment index or segment map. This metadata can be formatted as XML data, or it may be binary, for example following the atom and box structure of the 3GPP file format. The indexing can be simple, wherein the time and byte ranges of each block are absolute relative to the start of the file, or it can be hierarchical, wherein some blocks are grouped into parent blocks (and those into grandparent blocks, and so on) and the time and byte range of a given block is expressed relative to the time and/or byte range of the block's parent block.
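The hierarchical form of indexing can be sketched as a small tree in which each node stores its time and byte offset relative to its parent, and absolute values are recovered by summing along the path from the root. The structure below is an illustrative assumption rather than any defined file format; the class name `IndexNode`, the function `flatten`, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class IndexNode:
    rel_time_ms: int   # start time relative to the parent block
    rel_byte: int      # byte offset relative to the parent block
    children: List["IndexNode"] = field(default_factory=list)


def flatten(node: IndexNode, base_time: int = 0, base_byte: int = 0) -> List[Tuple[int, int]]:
    """Return absolute (time_ms, byte_offset) pairs for all leaf blocks, in order."""
    t = base_time + node.rel_time_ms
    b = base_byte + node.rel_byte
    if not node.children:
        return [(t, b)]
    leaves: List[Tuple[int, int]] = []
    for child in node.children:
        leaves.extend(flatten(child, t, b))
    return leaves


# A root block starting at 20 s with two parent blocks beneath it.
root = IndexNode(20000, 0, [
    IndexNode(0, 0, [IndexNode(0, 0), IndexNode(485, 50245)]),
    IndexNode(1000, 100000, [IndexNode(0, 0)]),
])
print(flatten(root))  # [(20000, 0), (20485, 50245), (21000, 100000)]
```

A client interested in only one part of the presentation would descend a single branch rather than flattening the whole tree, which is what makes the hierarchical form quick to search.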

Example Indexing Map Structure

In one embodiment, the original source data for one representation of a media stream may be contained in one or more media files, herein called "media segments", wherein each media segment contains the media data used to play back a continuous time segment of the media, for example five minutes of media playback.

FIG. 6 illustrates an example overall structure of a media segment. Within each segment, either at the beginning or spread throughout the source segment, there can also be indexing information that comprises a time/byte-offset segment map. The time/byte-offset segment map in one embodiment may be a list of time/byte-offset pairs (T(0), B(0)), (T(1), B(1)), ..., (T(i), B(i)), ..., (T(n), B(n)), wherein T(i-1) represents the start time within the segment for playback of the i-th fragment of media, relative to the initial start time of the media among all media segments, and T(i) represents the end time of the i-th fragment (and thus the start time of the next fragment). The byte offset B(i-1) is the corresponding byte index of the beginning of the data within this source segment where the i-th fragment of media starts, relative to the beginning of the source segment, and B(i) is the corresponding end byte index of the i-th fragment (and thus the index of the first byte of the next fragment). If the segment contains multiple media components, then T(i) and B(i) may be provided for each component in the segment in an absolute way, or they may be expressed relative to another media component that serves as a reference media component.
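A minimal sketch of how a client might use such a list follows, assuming the map is held as a sorted list of (T(i), B(i)) pairs with times in milliseconds. Given a playback time t, it finds the fragment i with T(i-1) <= t < T(i) and returns that fragment's byte interval [B(i-1), B(i)). The function name `fragment_for_time` and the third map entry are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple


def fragment_for_time(seg_map: List[Tuple[int, int]], t_ms: int) -> Tuple[int, int, int]:
    """Return (i, first_byte, end_byte_exclusive) for the fragment i covering t_ms.

    seg_map holds (T(0), B(0)) ... (T(n), B(n)); fragment i (numbered from 1)
    spans times [T(i-1), T(i)) and bytes [B(i-1), B(i)).
    """
    times = [t for t, _ in seg_map]
    i = bisect_right(times, t_ms)
    if i < 1 or i > len(seg_map) - 1:
        raise ValueError("time is outside this segment")
    return i, seg_map[i - 1][1], seg_map[i][1]


# T values in ms, B values in bytes; the last pair is a made-up continuation.
segment_map = [(20000, 0), (20485, 50245), (21000, 100000)]

print(fragment_for_time(segment_map, 20300))  # (1, 0, 50245)
print(fragment_for_time(segment_map, 20485))  # (2, 50245, 100000)
```

The returned byte interval is half-open, so a byte-range request for the fragment would name `end_byte_exclusive - 1` as its last byte.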

In this embodiment, the number of fragments in the source segment is n, where n may vary from segment to segment.
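By way of illustration only, the following Python sketch shows how a client might look up the time range and byte range of the i-th fragment from such a list of time/byte-offset pairs. The names `FragmentRange`, `fragment_range`, and `SEGMENT_MAP` are hypothetical and are not part of any embodiment. The first two pairs use the values given for FIG. 6 in this description (times in milliseconds); the third pair is an invented placeholder.

```python
from typing import List, NamedTuple, Tuple


class FragmentRange(NamedTuple):
    start_ms: int    # T(i-1): start time of the i-th fragment
    end_ms: int      # T(i): end time of the i-th fragment
    first_byte: int  # B(i-1): first byte of the i-th fragment
    last_byte: int   # B(i) - 1: last byte of the i-th fragment


def fragment_range(segment_map: List[Tuple[int, int]], i: int) -> FragmentRange:
    """Return the time and byte range of the i-th fragment (1-based).

    segment_map holds the pairs (T(0), B(0)), ..., (T(n), B(n)).
    """
    if not 1 <= i < len(segment_map):
        raise IndexError("fragment number out of range")
    t_start, b_start = segment_map[i - 1]
    t_end, b_end = segment_map[i]
    return FragmentRange(t_start, t_end, b_start, b_end - 1)


# (20000, 0) and (20485, 50245) follow the FIG. 6 values in the text;
# (21100, 98000) is a hypothetical placeholder.
SEGMENT_MAP = [(20000, 0), (20485, 50245), (21100, 98000)]

first = fragment_range(SEGMENT_MAP, 1)
size_in_bytes = first.last_byte - first.first_byte + 1
```

Because B(i) is both the end index of fragment i and the first byte of fragment i+1, the sketch reports the last byte of a fragment as B(i) - 1.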

In another embodiment, the time offset in the segment index for each fragment may be determined from the absolute start time of the first fragment and the duration of each fragment. In this case, the segment index may document the start time of the first fragment and the durations of all fragments that are included in the segment. The segment index may also document only a subset of the fragments. In that case, the segment index documents the duration of a subsegment, defined as one or more consecutive fragments, ending either at the end of the containing segment or at the beginning of the next subsegment.
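As a minimal sketch of this duration-based variant, the fragment boundaries can be recovered as a running sum over the documented durations. The function name `fragment_boundaries` is hypothetical. The 485 ms duration is the FIG. 6 value given in this description; the other two durations are invented so that the three fragments total 1.623 seconds.

```python
from itertools import accumulate
from typing import List


def fragment_boundaries(first_start_ms: int, durations_ms: List[int]) -> List[int]:
    """Return the start time of every fragment plus the end time of the last one.

    The result has len(durations_ms) + 1 entries: entry k is the start time
    of fragment k + 1, and the final entry is the end of the last fragment.
    """
    return list(accumulate(durations_ms, initial=first_start_ms))


# 485 ms is taken from the text; 615 ms and 523 ms are hypothetical.
boundaries = fragment_boundaries(20000, [485, 615, 523])
```

The `initial` argument of `itertools.accumulate` requires Python 3.8 or later.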

For each fragment, there may also be a value that indicates whether or not the fragment starts at or contains a seek point, i.e., a point wherein the media after that point does not depend on any media previous to that point, so that the media from that fragment forward can be played out independently of previous fragments. Seek points are, in general, points in the media where playout can start independently of all previous media. FIG. 6 also shows a simple example of a possible segment index for a source segment. In that example, the time offset values are in units of milliseconds, and thus the first fragment of this source segment starts 20 seconds from the beginning of the media, and the first fragment has a playout time of 485 milliseconds. The byte offset of the start of the first fragment is 0, and the byte offset of the end of the first fragment/start of the second fragment is 50245, and thus the first fragment is of size 50245 bytes. If the fragment or subsegment does not start with a random access point, but a random access point is contained in the fragment or subsegment, then the decoding time or presentation time difference between the start time and the actual RAP time may be given. This enables the client, when switching to this media segment, to know accurately the time up to which the "switch-from" representation needs to be presented.

Additionally or alternatively, simple or hierarchical indexing, daisy-chain indexing, and/or hybrid indexing may be used.

Because the sample durations for different tracks might not be the same (for example, video samples might be displayed for 33 ms, whereas an audio sample might last 80 ms), the different tracks in a movie fragment might not begin and end at precisely the same time, i.e., the audio may begin slightly before or slightly after the video, with the opposite being true of the preceding fragment, to compensate. To avoid ambiguity, the timestamps specified in the time and byte offset data may be specified relative to a particular track, and this may be the same track for each representation. Usually this will be the video track. This allows the client to identify exactly the next video frame when it is switching representations.

Despite the above issue, care may be taken during presentation to maintain a strict relationship between track timescales and presentation time, to ensure smooth playout and to maintain audio/video synchronization.

FIG. 7 illustrates some examples, such as a simple index 700 and a hierarchical index 702.

Two specific examples of a box that contains a segment map are provided below, one referred to as a time index box ('TIDX') and one referred to as a segment index box ('SIDX'). The definition follows the box structure according to the ISO base media file format. Other designs for such boxes that define similar syntax and have the same semantics and functionality should be apparent to the reader.

Time Index Box

Definition

Box Type: 'tidx'

Container: File

Mandatory: No

Quantity: Any number 0 or 1

The Time Index Box may provide a set of time and byte offset indices that associate certain regions of the file with certain time intervals of the presentation. The Time Index Box may include a targettype field, which indicates the type of the referenced data. For example, a Time Index Box with targettype "moof" provides an index to the media fragments contained in the file in terms of both time and byte offsets. A Time Index Box with a targettype of Time Index Box can be used to construct a hierarchical time index, allowing users of the file to quickly navigate to the required portion of the index.

The segment index may, for example, contain the following syntax:

Semantics

targettype: the type of the box data referenced by this Time Index Box. This can be either Movie Fragment Header ("moof") or Time Index Box ("tidx").

time-reference_track_id: indicates the track with respect to which the time offsets in this index are specified.

number_of_elements: the number of elements indexed by this Time Index Box.

first_element_offset: the byte offset, from the start of the file, of the first indexed element.

first_element_time: the start time of the first indexed element, using the timescale specified in the Media Header box of the track identified by time-reference_track_id.

random_access_flag: 1 if the start time of the element is a random access point; 0 otherwise.

length: the length of the indexed element in bytes.

deltaT: the difference between the start time of this element and the start time of the next element, expressed in the timescale specified in the Media Header box of the track identified by time-reference_track_id.
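The fields above can be pictured with the following hedged Python sketch, which resolves each indexed element to an absolute start time and byte offset. It is not the box syntax, whose field widths and ordering are not reproduced in this description. The class and method names are hypothetical, the hyphenated field name is spelled `time_reference_track_id` to make it a valid identifier, and the sketch assumes that indexed elements are stored back to back so that each `length` can be added to the previous byte offset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TidxElement:
    random_access_flag: int  # 1 if the element starts at a random access point
    length: int              # length of the indexed element in bytes
    delta_t: int             # time to the start of the next element (track timescale)


@dataclass
class TimeIndexBox:
    targettype: str               # "moof" or "tidx"
    time_reference_track_id: int  # track whose timescale the times use
    first_element_offset: int     # byte offset of the first element in the file
    first_element_time: int       # start time of the first element
    elements: List[TidxElement]

    @property
    def number_of_elements(self) -> int:
        return len(self.elements)

    def resolve(self) -> List[Tuple[int, int, int]]:
        """Return (start_time, byte_offset, random_access_flag) per element.

        Assumes contiguous elements (an assumption of this sketch).
        """
        resolved = []
        time, offset = self.first_element_time, self.first_element_offset
        for element in self.elements:
            resolved.append((time, offset, element.random_access_flag))
            time += element.delta_t
            offset += element.length
        return resolved


# Hypothetical index with a timescale of 1000 ticks per second.
example_box = TimeIndexBox(
    targettype="moof",
    time_reference_track_id=1,
    first_element_offset=0,
    first_element_time=20000,
    elements=[TidxElement(1, 50245, 485), TidxElement(0, 47755, 615)],
)
resolved_elements = example_box.resolve()
```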

Segment Index Box

The Segment Index Box ('sidx') provides a compact index of the movie fragments and other Segment Index Boxes in a segment. There are two loop structures in the Segment Index Box. The first loop documents the first sample of the subsegment, that is, the sample in the first movie fragment referenced by the second loop. The second loop provides an index of the subsegment. The container for the 'sidx' box is the file or the segment directly.

Syntax

Semantics:

reference_track_ID: provides the track_ID of the reference track.

track_count: the number of tracks indexed in the following loop (1 or greater).

reference_count: the number of elements indexed by the second loop (1 or greater).

track_ID: the ID of a track for which a track fragment is included in the first movie fragment identified by this index; exactly one track_ID in this loop is equal to the reference_track_ID.

decoding_time: the decoding time of the first sample in the track identified by track_ID in the movie fragment referenced by the first item in the second loop, expressed in the timescale of the track (as documented in the timescale field of the Media Header box of the track).

reference_type: when set to 0, indicates that the reference is to a movie fragment ('moof') box; when set to 1, indicates that the reference is to a segment index ('sidx') box.

reference_offset: the distance in bytes from the first byte following the containing Segment Index Box to the first byte of the referenced box.

subsegment_duration: when the reference is to a Segment Index Box, this field carries the sum of the subsegment_duration fields in the second loop of that box; when the reference is to a movie fragment, this field carries the sum of the sample durations of the samples in the reference track, in the indicated movie fragment and subsequent movie fragments up to either the first movie fragment documented by the next entry in the loop or the end of the subsegment, whichever is earlier; the duration is expressed in the timescale of the track (as documented in the timescale field of the Media Header box of the track).

contains_RAP: when the reference is to a movie fragment, this bit may be 1 if the track fragment within that movie fragment for the track with track_ID equal to reference_track_ID contains at least one random access point, otherwise this bit is set to 0; when the reference is to a segment index, this bit is set to 1 only if any of the references in that segment index have this bit set to 1, and to 0 otherwise.

RAP_delta_time: if contains_RAP is 1, provides the presentation (composition) time of a random access point (RAP); reserved with the value 0 if contains_RAP is 0. The time is expressed as the difference between the decoding time of the first sample of the subsegment documented by this entry and the presentation (composition) time of the random access point, in the track with track_ID equal to reference_track_ID.

Differences between TIDX and SIDX

TIDX and SIDX provide the same functionality with respect to indexing. The first loop of SIDX additionally provides global timing for the first movie fragment, but the global timing may as well be contained in the movie fragment itself, either as absolute timing or relative to the reference track.

The second loop of SIDX implements the functionality of TIDX. Specifically, SIDX permits a mixture of reference targets, with the target of each entry indicated by reference_type, whereas TIDX references either only TIDX boxes or only MOOF boxes. The number_of_elements in TIDX corresponds to the reference_count in SIDX; the time-reference_track_id in TIDX corresponds to the reference_track_ID in SIDX; the first_element_offset in TIDX corresponds to the reference_offset in the first entry of the second loop; the first_element_time in TIDX corresponds to the decoding_time of the reference track in the first loop; the random_access_flag in TIDX corresponds to contains_RAP in SIDX, with the additional freedom that in SIDX the RAP need not be placed at the start of the fragment, which is why RAP_delta_time is needed; the length in TIDX corresponds to the reference_offset in SIDX; and finally the deltaT in TIDX corresponds to the subsegment_duration in SIDX. The functionality of the two boxes is therefore equivalent.

Variable Block Sizing and Sub-GoP Blocks

For video media, the relationship between the video encoding structure and the block structure for requests can be important. For example, if each block begins with a seek point, such as a random access point ("RAP"), and each block represents an equal period of video time, then the positioning of at least some seek points in the video media is fixed and seek points will occur at regular intervals within the video encoding. As is well known to those of skill in the art of video encoding, compression efficiency may be improved if seek points are placed according to the relationships between video frames, and in particular if they are placed at frames that have little in common with previous frames. The requirement that blocks represent equal amounts of time thus places a restriction on the video encoding, such that compression may be sub-optimal.

It is desirable to allow the position of seek points within a video presentation to be chosen by the video encoding system, rather than requiring seek points at fixed positions. Allowing the video encoding system to choose the seek points results in improved video compression, so that a higher quality of video media can be provided using a given available bandwidth, resulting in an improved user experience. Current block-request streaming systems may require that all blocks be of the same duration (in video time) and that each block begin with a seek point, and this is thus a disadvantage of existing systems.

A novel block-request streaming system that provides advantages over the systems described above is now described. In one embodiment, the video encoding process of a first version of the video component may be configured to choose the positions of seek points in order to optimize compression efficiency, subject to a maximum on the duration between seek points. This latter requirement does restrict the choice of seek points by the encoding process and thus reduces compression efficiency. However, provided the maximum on the duration between seek points is not too small (for example, greater than around 1 second), the reduction in compression efficiency is small compared to that incurred if regular fixed positions are required for seek points. Furthermore, if the maximum on the duration between seek points is a few seconds, then the reduction in compression efficiency compared to completely free positioning of seek points is generally very small.

In many embodiments, including this embodiment, some RAPs may not be seek points, i.e., there may be a frame that is a RAP lying between two consecutive seek points that is not chosen to be a seek point, for example because the RAP is too close in time to the surrounding seek points, or because the amount of media data between the RAP and the seek point preceding or following it is too small.

The positions of seek points within all other versions of the media presentation may be constrained to be the same as the seek points in the first (for example, the highest media data rate) version. This does reduce the compression efficiency for these other versions compared to allowing the encoder a free choice of seek points.

The use of seek points typically requires a frame to be independently decodable, which generally results in a low compression efficiency for that frame. Frames that are not required to be independently decodable can be encoded with reference to data in other frames, which generally increases the compression efficiency for that frame by an amount that depends on the commonality between the frame to be encoded and the reference frames. An efficient choice of seek point positioning preferentially chooses as a seek point frame a frame that has low commonality with previous frames, and thereby minimizes the compression efficiency penalty incurred by encoding the frame in an independently decodable way.

However, the level of commonality between a frame and potential reference frames is highly correlated across different representations of the content, since the original content is the same. As a result, constraining the seek points in other variants to be at the same positions as the seek points in the first variant does not make a large difference in compression efficiency.

The seek point structure is preferably used to determine the block structure. Preferably, each seek point determines the start of a block, and there may be one or more blocks that encompass the data between two consecutive seek points. Since the duration between seek points is not fixed, for the purpose of encoding with good compression, not all blocks are required to have the same playout duration. In some embodiments, blocks are aligned between versions of the content; that is, if there is a block spanning a specific group of frames in one version of the content, then there is a block spanning the same group of frames in another version of the content. The blocks for a given version of the content do not overlap, and every frame of the content is contained within exactly one block of each version.

An enabling feature that allows the efficient use of variable durations between seek points, and thus variable-duration GoPs, is the segment index or segment map that can be included in a segment or provided to a client by other means, i.e., metadata associated with this segment in this representation that may be provided and that comprises the start time and duration of each block of the presentation. The client may use this segment indexing data when determining the block at which to start the presentation when the user has requested that the presentation start at a particular point within a segment. If such metadata is not provided, then the presentation can begin only at the beginning of the content, or at a random or approximate point close to the desired point (for example, by choosing the starting block by dividing the requested starting point (in time) by the average block duration to give the index of the starting block).
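The difference between the approximate and the indexed choice of starting block can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the block start times (milliseconds from the start of the segment) and the segment duration are invented values, and seek point constraints are deliberately ignored.

```python
from bisect import bisect_right
from typing import List


def approximate_start_block(requested_ms: int, average_block_ms: float) -> int:
    """Guess the 0-based starting block without any segment index."""
    return int(requested_ms // average_block_ms)


def indexed_start_block(block_starts_ms: List[int], requested_ms: int) -> int:
    """Return the 0-based block that contains requested_ms, using indexed start times."""
    return bisect_right(block_starts_ms, requested_ms) - 1


# Hypothetical variable-duration blocks within a 4000 ms segment.
BLOCK_STARTS_MS = [0, 485, 1100, 1623, 1965, 2500, 3300]
SEGMENT_DURATION_MS = 4000

average_ms = SEGMENT_DURATION_MS / len(BLOCK_STARTS_MS)
guess = approximate_start_block(2000, average_ms)
exact = indexed_start_block(BLOCK_STARTS_MS, 2000)
```

With these invented values the estimate lands one block before the block that actually contains the requested time, which illustrates why the approximate starting point is only close to the desired point when block durations vary.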

In one embodiment, each block may be provided as a separate file. In another embodiment, multiple consecutive blocks may be aggregated into a single file to form a segment. In this second embodiment, metadata for each version may be provided comprising the start time and duration of each block and the byte offset within the file at which the block begins. This metadata may be provided in response to an initial protocol request, i.e., it may be available separately from the segment or file, or it may be contained within the same file or segment as the blocks themselves, for example at the beginning of the file. As will be clear to those of skill in the art, this metadata may be encoded in a compressed form, such as gzip or delta encoding, or in binary form, in order to reduce the network resources required to transport the metadata to the client.

FIG. 6 shows an example of a segment index where the blocks are of variable size, and where the scope of a block is a partial GoP, i.e., a partial amount of the media data between one RAP and the next RAP. In this example, the seek points are indicated by the RAP indicator, wherein a RAP indicator value of 1 indicates that the block starts with or contains a RAP, or seek point, and a RAP indicator value of 0 indicates that the block does not contain a RAP or seek point. In this example, the first three blocks, i.e., bytes 0 through 157033, comprise the first GoP, which has a presentation duration of 1.623 seconds, with a presentation time running from 20 seconds into the content to 21.623 seconds. In this example, the first of the first three blocks comprises 0.485 seconds of presentation time and comprises the first 50245 bytes of the media data in the segment. In this example, blocks 4, 5, and 6 comprise the second GoP, blocks 7 and 8 comprise the third GoP, and blocks 9, 10, and 11 comprise the fourth GoP. Note that there may be other RAPs in the media data that are not designated as seek points and are thus not signaled as RAPs in the segment map.

Referring again to FIG. 6, if the client or receiver wants to access the content starting at a time offset of approximately 22 seconds into the media presentation, then the client may first use other information, such as the MPD described in more detail below, to determine that the relevant media data is indeed within this segment. The client can download the first portion of the segment to obtain the segment index, which in this case is just a few bytes, for example using an HTTP byte range request. Using the segment index, the client may determine that the first block that it should download is the first block with a time offset that is at most 22 seconds and that starts at a RAP, i.e., is a seek point. In this example, although block 5 has a time offset that is smaller than 22 seconds, i.e., its time offset is 21.965 seconds, the segment index indicates that block 5 does not start at a RAP. Thus, based on the segment index, the client instead selects block 4 for download, since its start time is at most 22 seconds, i.e., its time offset is 21.623 seconds, and it starts at a RAP. Thus, based on the segment index, the client will make an HTTP range request starting at byte offset 157034.
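The selection just described can be sketched in Python as follows. The sketch reads "the first block with a time offset that is at most 22 seconds and that starts at a RAP" as the latest such block, which is the block the worked example selects; that reading, and the names `IndexEntry`, `find_start_block`, and `range_header`, are assumptions of this sketch. Entries marked hypothetical are invented; the remaining values are stated in, or follow by addition from, the FIG. 6 figures given in this description.

```python
from typing import Dict, List, NamedTuple, Optional


class IndexEntry(NamedTuple):
    block: int           # 1-based block number
    start_ms: int        # time offset of the block in milliseconds
    byte_offset: int     # byte offset of the block within the segment
    starts_at_rap: bool  # True if the block starts at a RAP (seek point)


def find_start_block(index: List[IndexEntry], target_ms: int) -> Optional[IndexEntry]:
    """Return the latest block that starts at a RAP no later than target_ms."""
    candidates = [e for e in index if e.starts_at_rap and e.start_ms <= target_ms]
    return max(candidates, key=lambda e: e.start_ms) if candidates else None


def range_header(entry: IndexEntry) -> Dict[str, str]:
    """Build an open-ended HTTP Range header starting at the block's offset."""
    return {"Range": f"bytes={entry.byte_offset}-"}


FIG6_INDEX = [
    IndexEntry(1, 20000, 0, True),
    IndexEntry(2, 20485, 50245, False),
    IndexEntry(3, 21100, 98000, False),   # hypothetical time and offset
    IndexEntry(4, 21623, 157034, True),
    IndexEntry(5, 21965, 199045, False),  # offset = 157034 + 42011
    IndexEntry(6, 22500, 250000, False),  # hypothetical time and offset
    IndexEntry(7, 23300, 301245, True),   # hypothetical time; offset = 157034 + 144211
]

start = find_start_block(FIG6_INDEX, 22000)
headers = range_header(start)
```

A real client would normally bound the range with an end offset taken from the index rather than leave it open-ended; the open-ended form is used here only to keep the sketch short.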

If segment indexing were not available, the client might have to download all of the preceding 157034 bytes of data before downloading this data, leading to a much longer startup time, or channel zapping time, and to wasteful downloading of data that is not useful. Alternatively, if segment indexing were not available, the client might approximate where the desired data starts within the segment, but the approximation might be poor: it may miss the appropriate time and then have to go backward, which again increases the startup delay.

Generally, each block encompasses a portion of the media data that, together with previous blocks, can be played out by a media player. Thus, the blocking structure, and the mechanism for signaling the segment indexing blocking structure to the client (whether contained within the segment or provided to the client through other means), can significantly improve the ability of the client to provide fast channel zapping and seamless playout in the face of network variations and disruptions. The support of variable-duration blocks, and of blocks that encompass only portions of a GoP, as enabled by the segment indexing, can significantly improve the streaming experience. For example, referring again to FIG. 6 and the example described above where the client wants to start playout at approximately 22 seconds into the presentation, the client may request, through one or more requests, the data within block 4, and then feed this data into the media player as soon as it is available in order to start playback. Thus, in this example, playout begins as soon as the 42011 bytes of block 4 are received at the client, thus enabling a fast channel zapping time. If instead the client needed to request the entire GoP before playout could commence, the channel zapping time would be longer, as this is 144211 bytes of data.

In other embodiments, RAPs or seek points may also occur in the middle of a block, and there may be data in the segment index that indicates where that RAP or seek point is within the block or fragment. In other embodiments, the time offset may be the decoding time of the first frame within the block, instead of the presentation time of the first frame within the block.

FIGS. 8(a) and 8(b) illustrate examples of variable block sizing with seek point structures across a plurality of versions or representations; FIG. 8(a) illustrates variable block sizing with aligned seek points over a plurality of versions of a media stream, while FIG. 8(b) illustrates variable block sizing with non-aligned seek points over a plurality of versions of a media stream.

Time is shown across the top in seconds, and the blocks and seek points of the two segments for the two representations are shown from left to right in terms of their timing with respect to this time line, and thus the length of each block shown is proportional to its playout time and not to the number of bytes in the block. In this example, the segment indexes for both segments of the two representations would have the same time offsets for the seek points, but potentially differing numbers of blocks or fragments between seek points, and different byte offsets to blocks due to the different amounts of media data in each block. In this example, if the client wants to switch from representation 1 to representation 2 at a presentation time of approximately 23 seconds, then the client could request up through block 1.2 in the segment for representation 1, and start requesting the segment for representation 2 starting at block 2.2, and thus the switch would occur at the presentation coinciding with seek point 1.2 in representation 1, which is at the same time as seek point 2.2 in representation 2.

As should be clear from the foregoing, the block-request streaming system described does not constrain the video encoding to place seek points at specific positions within the content, and this mitigates one of the problems of existing systems.

In the embodiments described above, the seek points for the various representations of the same content presentation are organized so that they are aligned. However, in many cases it is preferable to relax this alignment requirement. For example, it is sometimes the case that the representations have been generated with encoding tools that do not have the capability to generate seek-point-aligned representations. As another example, the content presentation may be encoded into different representations independently, without seek point alignment between the representations. As another example, a representation may contain more seek points because it has a lower rate and more commonly needs to be switched to, or because it contains seek points to support trick modes such as fast forward, rewind, or fast seeking. Thus, it is desirable to provide methods that make a block-request streaming system capable of efficiently and seamlessly dealing with non-aligned seek points across the various representations of a content presentation.

In this embodiment, the positions of seek points across representations may not be aligned. Blocks are constructed such that a new block starts at each seek point, and thus there might be no alignment between the blocks of different versions of the presentation. An example of such a non-aligned seek point structure between different representations is shown in FIG. 8(b). Time is shown across the top in seconds, and the blocks and seek points of the two segments for the two representations are shown from left to right in terms of their timing with respect to this time line, and thus the length of each block shown is proportional to its playout time and not to the number of bytes in the block. In this example, the segment indexes for both segments of the two representations would have potentially different time offsets for the seek points, and also potentially differing numbers of blocks or fragments between seek points, and different byte offsets to blocks due to the different amounts of media data in each block. In this example, if the client wants to switch from representation 1 to representation 2 at a presentation time of approximately 25 seconds, then the client could request up through block 1.3 in the segment for representation 1, and start requesting the segment for representation 2 starting at block 2.3, and thus the switch would occur at the presentation coinciding with seek point 2.3 in representation 2, which is in the middle of the playout of block 1.3 in representation 1, and thus some of the media for block 1.3 would not be played out (although the media data for the frames of block 1.3 that are not played out may have been loaded into the receiver buffer for decoding the other frames of block 1.3 that are played out).

In this embodiment, the operation of block selector 123 may be modified so that, whenever it is required to select a block from a representation different from the previously selected version, it chooses the latest block whose first frame is no later than the frame following the last frame of the last selected block.
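The selection rule above can be sketched as a small search over the block start positions of the new representation. This is only an illustrative sketch: the function name, the use of integer frame indexes on a shared timeline, and the sorted list of block start frames are assumptions made for the example and are not part of the described system.

```python
from bisect import bisect_right


def select_switch_block(block_start_frames, last_played_frame):
    """Pick the block to request when switching to a new representation.

    block_start_frames: sorted frame indexes (hypothetical) at which the
        blocks of the new representation start, i.e. its seek points.
    last_played_frame: index of the last frame of the block most recently
        selected from the old representation.

    Returns the index of the latest block whose first frame is no later
    than the frame following last_played_frame.
    """
    limit = last_played_frame + 1
    idx = bisect_right(block_start_frames, limit) - 1
    if idx < 0:
        raise ValueError("no block of the new representation starts early enough")
    return idx
```

When the seek points happen to be aligned, the rule selects the block that starts exactly at the frame after the last selected block; otherwise it selects the block that is already in progress at that frame, so no presentation time is skipped.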

The embodiment just described may eliminate the requirement to constrain the positions of seek points within versions other than the first version, and thus increases the compression efficiency of those versions, resulting in a higher-quality presentation for a given available bandwidth and hence an improved user experience. A further consideration is that video encoding tools that perform seek point alignment across multiple encodings (versions) of the content may not be widely available, so an advantage of this embodiment is that currently available video encoding tools can be used. Another advantage is that the different versions of the content can be encoded in parallel, with no coordination needed between the encoding processes for the different versions. Another advantage is that additional versions of the content can be encoded and added to the presentation at a later time without having to provide the encoding tools with lists of specific seek point positions.

In general, where pictures are encoded as groups of pictures (GoPs), the first picture in the sequence can be a seek point, but that need not always be the case.

Optimal block partitioning

One issue of concern in a block-request streaming system is the interaction between the structure of encoded media, for example video media, and the block structure used for block requests. As those skilled in the art of video encoding will know, the number of bits required for the encoded representation of each video frame often varies from frame to frame, sometimes substantially. As a result, the relationship between the amount of data received and the duration of media encoded by that data may not be straightforward. Furthermore, the division of media data into blocks within a block-request streaming system adds a further dimension of complexity. In particular, in some systems the media data of a block may not be played out until the entire block has been received; for example, the arrangement of media data within a block that uses erasure codes, or dependencies between media samples within a block, may result in this property. Because of these complex interactions between block size and block duration, and the possible need to receive an entire block before beginning play-out, client systems commonly adopt a conservative approach in which media data is buffered before play-out begins. Such buffering results in a long channel zapping time and thus a poor user experience.

Pakzad describes "block partitioning methods", which are new and efficient methods for determining how to partition a data stream into contiguous blocks based on the underlying structure of the data stream, and further describes several advantages of these methods in the context of a streaming system. A further embodiment of the invention, which applies the block partitioning methods of Pakzad to a block-request streaming system, is now described. This method may comprise arranging the media data to be presented into approximate presentation time order, such that the play-out time of any given element of media data (for example, a video frame or audio sample) differs from that of any adjacent media data element by less than a provided threshold. The media data so ordered may be considered a data stream in the language of Pakzad, and any of the methods of Pakzad applied to this data stream identifies block boundaries within the data stream. The data between any pair of adjacent block boundaries is considered a "block" in the language of this disclosure, and the methods of this disclosure are applied to provide for presentation of this media data within a block-request streaming system. As will be clear to those skilled in the art on reading this disclosure, the several advantages of the methods disclosed in Pakzad may thereby be realized in the context of a block-request streaming system.

As described in Pakzad, the determination of the block structure of a segment, including blocks that encompass partial GoPs or portions of more than one GoP, can affect the ability of the client to achieve fast channel zapping times. Pakzad provides methods that, given a target startup time, produce a block structure and a target download rate that ensure the following: if the client starts downloading the representation at any seek point and starts play-out after the target startup time has elapsed, then play-out continues seamlessly as long as, at each point in time, the amount of data the client has downloaded is at least the target download rate multiplied by the time elapsed since the download began. It is advantageous for the client to have access to the target startup time and the target download rate, because this gives the client a means of determining the earliest point in time at which to start playing out the representation, and allows the client to continue playing out the representation as long as the download meets the condition described above. The method described later therefore provides a means for including the target startup time and the target download rate within the media presentation description, so that they can be used for the purposes described above.
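The condition above amounts to a simple inequality checked at each point in time. The following is a minimal sketch under stated assumptions: the function names, the use of bits per second for the target download rate, and the sampling of download progress as (elapsed seconds, bytes downloaded) pairs are all illustrative choices and are not taken from Pakzad or from this disclosure.

```python
def earliest_playout_time(download_start, target_startup_time):
    """Earliest time at which play-out may begin (same clock as download_start)."""
    return download_start + target_startup_time


def meets_download_condition(progress_samples, target_rate_bps):
    """Check that downloaded data never falls below rate * elapsed time.

    progress_samples: iterable of (elapsed_seconds, bytes_downloaded) pairs
        observed since the download began (hypothetical measurement format).
    target_rate_bps: target download rate in bits per second (assumed unit).
    """
    return all(
        8 * bytes_downloaded >= target_rate_bps * elapsed
        for elapsed, bytes_downloaded in progress_samples
    )
```

A client might evaluate `meets_download_condition` periodically and, if it returns false, switch to a lower-rate representation or rebuffer.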

Media presentation data model

Figure 5 illustrates possible structures of the content store shown in Figure 1, including segments and media presentation description ("MPD") files, and the breakdown of segments, timing, and other structure within an MPD file. Details of possible implementations of MPD structures or files will now be described. In many examples, the MPD is described as a file, but non-file structures can be used as well.

As illustrated there, content store 110 holds a plurality of source segments 510, MPDs 500, and repair segments 512. An MPD may comprise period records 501, which in turn may comprise representation records 502, which contain segment information 503 such as references to initialization segments 504 and media segments 505.

Figure 9(a) illustrates an example metadata table 900, and Figure 9(b) illustrates an example of how an HTTP streaming client 902 obtains metadata table 900 and media blocks 904 over a connection to an HTTP streaming server 906.

In the methods described herein, a "media presentation description" is provided that comprises information about the representations of the media presentation that are available to the client. Representations may be alternatives, in the sense that the client selects one of the different alternatives, or they may be complementary, in the sense that the client selects several of the representations, each possibly also from a set of alternatives, and presents them jointly. The representations may advantageously be assigned to groups, with the client programmed or configured to understand that the representations within one group are alternatives to each other, whereas representations from different groups are such that more than one representation is presented jointly. In other words, if a group contains more than one representation, the client picks one representation from that group, one representation from the next group, and so on, to form a presentation.

The information describing representations may advantageously include details of the applied media codecs, including the profiles and levels of those codecs that are required to decode the representation, video frame rates, video resolution, and data rates. A client receiving the media presentation description may use this information to determine in advance whether a representation is suitable for decoding or presentation. This is an advantage because, if the differentiating information were contained only in the binary data of the representation, it would be necessary to request the binary data from all representations and to parse and extract the relevant information in order to discover its suitability. These multiple requests, and the parsing and extraction of the data, may take some time, which would result in a long startup time and therefore a poor user experience.
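The use of this descriptive information, together with the grouping of representations, can be sketched as follows. The sketch is illustrative only: the `Representation` record, its field names, and the policy of taking the highest data rate that fits are assumptions for the example, and the bandwidth check is applied per representation rather than to the jointly presented total, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    # Hypothetical record of fields a client might read from the MPD.
    rep_id: str
    group: str
    codec: str
    width: int          # 0 for non-video representations
    height: int         # 0 for non-video representations
    data_rate_bps: int


def usable(rep, supported_codecs, max_width, max_height, bandwidth_bps):
    """Decide from MPD information alone whether rep can be decoded and shown."""
    return (
        rep.codec in supported_codecs
        and rep.width <= max_width
        and rep.height <= max_height
        and rep.data_rate_bps <= bandwidth_bps
    )


def choose_per_group(reps, supported_codecs, max_width, max_height, bandwidth_bps):
    """Pick one usable representation from each group (highest data rate)."""
    chosen = {}
    for rep in reps:
        if not usable(rep, supported_codecs, max_width, max_height, bandwidth_bps):
            continue
        best = chosen.get(rep.group)
        if best is None or rep.data_rate_bps > best.data_rate_bps:
            chosen[rep.group] = rep
    return chosen
```

Because the decision uses only fields carried in the description, no media data has to be requested or parsed before a representation is ruled in or out.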

Additionally, the media presentation description may comprise information restricting client requests based on the time of day. For example, for a live service the client may be restricted to requesting parts of the presentation that are close to the "current broadcast time". This is an advantage because, for live broadcast, it may be desirable to purge from the serving infrastructure the data for content that was broadcast more than a provided threshold before the current broadcast time. This is desirable for the reuse of storage resources within the serving infrastructure. It may also be desirable depending on the type of service offered. For example, in some cases a presentation may be made available only live because of a certain subscription model of the receiving client devices, whereas other media presentations may be made available both live and on demand, and still other presentations may be made available only live to a first class of client devices, only on demand to a second class of client devices, and as a combination of live or on demand to a third class of client devices. The methods described in the "Media presentation data model" section (below) allow the client to be informed of such policies, so that the client can avoid making requests for data that may not be available in the serving infrastructure and can adjust what it offers to the user. As an alternative, for example, the client may present a notification to the user that this data is not available.

In a further embodiment of the invention, the media segments may conform to the ISO base media file format described in ISO/IEC 14496-12 or derived specifications (such as the 3GP file format described in 3GPP Technical Specification 26.244). The "Usage of 3GPP file format" section (above) describes novel enhancements to the ISO base media file format that permit efficient use of the data structures of this file format within a block-request streaming system. As described there, information may be provided within the file that permits fast and efficient mapping between time segments of the media presentation and byte ranges within the file. The media data itself may be structured according to the movie fragment construction defined in ISO/IEC 14496-12. The information providing time and byte offsets may be structured hierarchically or as a single block of information, and may be provided at the start of the file. Providing this information with the efficient encoding described in the "Usage of 3GPP file format" section lets the client retrieve it quickly, for example with HTTP partial GET requests when the file download protocol used by the block-request streaming system is HTTP. This results in short startup, seek, or stream switching times and therefore an improved user experience.

The representations in a media presentation are synchronized on a global timeline to ensure seamless switching across representations (typically alternatives) and to ensure synchronous presentation of two or more representations. The sample timing of the media contained in the representations within an adaptive HTTP streaming media presentation can therefore be related to a continuous global timeline across multiple segments.

A block of encoded media containing media of multiple types (for example, audio and video) may have different presentation end times for the different types of media. In a block-request streaming system, such media blocks may be played out consecutively in such a way that each media type is played continuously, so that media samples of one type from one block may be played out before media samples of another type from the preceding block; this is referred to herein as "continuous block splicing". Alternatively, such media blocks may be played out in such a way that the earliest sample of any type of one block is played after the latest sample of any type of the preceding block; this is referred to herein as "discontinuous block splicing". Continuous block splicing may be appropriate when both blocks contain media from the same content item and the same representation, encoded in sequence, or in other cases. Typically, within one representation, continuous block splicing can be applied when splicing two blocks. This is advantageous because existing encodings can be used and segmentation can be done without aligning media tracks at block boundaries. This is illustrated in Figure 10, where video stream 1000 comprises block 1202 and other blocks, with RAPs such as RAP 1204.
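The difference between the two splicing modes can be made concrete with a small timing sketch. It is a simplification offered under assumptions: the function name and the per-track dictionary of presentation end times are hypothetical, and the result is only the earliest time at which each track of the next block may begin.

```python
def next_block_start_times(prev_end_by_track, continuous):
    """Earliest start time for each track of the next block.

    prev_end_by_track: mapping of track name to the presentation end time
        of that track in the preceding block (hypothetical input format).
    continuous: True for continuous block splicing, False for
        discontinuous block splicing.
    """
    if continuous:
        # Each media type continues exactly where its own track ended.
        return dict(prev_end_by_track)
    # No sample of the next block plays before every track of the
    # preceding block has finished.
    latest_end = max(prev_end_by_track.values())
    return {track: latest_end for track in prev_end_by_track}
```

With continuous splicing, a track that ended early in the preceding block resumes early in the next block, which is why the media tracks need not be aligned at block boundaries.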

Media presentation description

A media presentation may be viewed as a structured collection of files on an HTTP streaming server. The HTTP streaming client can download sufficient information to present the streaming service to the user. Alternative representations may comprise one or more 3GP files or parts of 3GP files, where these files conform to the 3GPP file format or at least to a well-defined set of data structures that can easily be converted from or to a 3GP file.

A media presentation may be described by a media presentation description. The media presentation description (MPD) may contain metadata that the client can use to construct appropriate file requests, for example HTTP GET requests, to access the data at the appropriate time and to provide the streaming service to the user. The media presentation description may provide sufficient information for the HTTP streaming client to select the appropriate 3GPP files and pieces of files. The units that are signaled to the client as accessible are referred to as segments.

Among other things, the media presentation description may contain the following elements and attributes.

Media presentation description element (MediaPresentationDescription)

An element that encapsulates the metadata used by the HTTP streaming client to provide the streaming service to the end user. The MediaPresentationDescription element may contain one or more of the following attributes and elements.

Version: The version number of the protocol, to ensure extensibility.

PresentationIdentifier: Information that allows the presentation to be uniquely identified among other presentations. It may also contain private fields or names.

UpdateFrequency: The update frequency of the media presentation description, that is, how often the client may reload the actual media presentation description. If this element is not present, the media presentation may be static. Updating the media presentation may mean that the media presentation cannot be cached.

MediaPresentationDescriptionURL: The URI agreed upon for the media presentation description.

Stream: Describes the type of the stream or media presentation: video, audio, or text. A video stream type may contain audio and may contain text.

Service: Describes the service type with additional attributes. The service type may be live or on demand. This may be used to inform the client that seeking and access beyond some current time is not permitted.

MaximumClientPreBufferTime: The maximum amount of time for which the client may pre-buffer the media stream. This timing may differentiate streaming from progressive download, in that the client is restricted from downloading beyond this maximum pre-buffer time. The value may be absent, indicating that no pre-buffering restriction applies.

SafetyGuardIntervalLiveService: Information about the maximum turnaround time of a live service at the server. It indicates to the client which information is already accessible at the current time. This information may be necessary if the client and the server are expected to operate on UTC time and no tight time synchronization is provided.

TimeShiftBufferDepth: Information about how far back the client may move in a live service relative to the current time. By extending this depth, time-shift viewing and catch-up services may be permitted without specific changes in service provisioning.
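One possible reading of SafetyGuardIntervalLiveService and TimeShiftBufferDepth is that together they bound the part of a live presentation that a client should request at a given moment. The sketch below follows that reading and is an assumption, not a definition from the text: the function names are hypothetical, all values are taken to be seconds on a shared clock, and the window is taken to run from the current time minus the time-shift depth up to the current time minus the safety guard interval.

```python
def live_request_window(now, safety_guard_interval, time_shift_buffer_depth):
    """Return (earliest, latest) media times a client may request.

    Assumes that media newer than now - safety_guard_interval may not yet
    be available on the server, and that media older than
    now - time_shift_buffer_depth may already have been purged.
    """
    earliest = now - time_shift_buffer_depth
    latest = now - safety_guard_interval
    if earliest > latest:
        raise ValueError("time-shift depth is smaller than the safety guard interval")
    return earliest, latest


def may_request(media_time, window):
    """True if media_time lies inside the (earliest, latest) window."""
    earliest, latest = window
    return earliest <= media_time <= latest
```

A client that checks `may_request` before issuing a block request avoids asking for data that the serving infrastructure does not yet hold or no longer holds.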

LocalCachingPermitted: A flag indicating whether the HTTP client may cache the downloaded data locally after it has been played.

LivePresentationInverval: Contains the time intervals during which the presentation is available, specified by StartTimes and EndTimes. The start time indicates the start time of the service and the end time indicates the end time of the service. If no end time is specified, the end time is unknown at the current time, and UpdateFrequency may ensure that the client gets access to the end time before the actual end time of the service.

OnDemandAvailabilityInterval: This presentation interval indicates the availability of the service on the network. Multiple presentation intervals may be provided. The HTTP client may not be able to access the service outside any specified time window. By providing an on-demand interval, additional time-shift viewing may be specified. This attribute may also be present for a live service. If it is present for a live service, the server may ensure that the client can access the service as an on-demand service during all of the provided availability intervals. The live presentation interval therefore may not overlap with any on-demand availability interval.
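The non-overlap requirement between the live presentation interval and the on-demand availability intervals can be checked mechanically. The following sketch assumes a representation that the text does not specify: each interval is a (start, end) pair of numeric times, treated as half-open, and an end of `None` stands for an end time that is not yet known.

```python
def _overlaps(a, b):
    """True if half-open intervals a and b share any time.

    An end value of None is treated as an unknown (unbounded) end time.
    """
    a_start, a_end = a
    b_start, b_end = b
    a_end = float("inf") if a_end is None else a_end
    b_end = float("inf") if b_end is None else b_end
    return a_start < b_end and b_start < a_end


def intervals_are_consistent(live_interval, on_demand_intervals):
    """Check that the live interval overlaps no on-demand interval."""
    return not any(_overlaps(live_interval, od) for od in on_demand_intervals)
```

Note that a live interval with no end time overlaps every on-demand interval that starts after it, so under this reading the on-demand intervals could only be validated once the end time is known.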

MPDFileInfoDynamic: Describes the default dynamic construction of the files in the media presentation. More details are provided below. A default specification at the MPD level may save unnecessary repetition if the same rules are used for several or all alternative representations.

MPDCodecDescription: Describes the main default codecs in the media presentation. More details are provided below. A default specification at the MPD level may save unnecessary repetition if the same codecs are used for several or all representations.

MPDMoveBoxHeaderSizeDoexNotChange: A flag indicating whether the size of the MoveBox header changes among the individual files within the entire media presentation. This flag can be used to optimize the download and may be present only for specific segment formats, especially those whose segments contain the moov header.

FileURIPattern: A pattern used by the client to generate request messages for files within the media presentation. The different attributes permit generation of a unique URI for each file within the media presentation. The base URI may be an HTTP URI.

Alternative representation: Describes a list of representations.

AlternativeRepresentation element: An XML element that encapsulates all metadata for one representation. The AlternativeRepresentation element may contain the following attributes and elements.

RepresentationID: A unique ID for this specific alternative representation within the media presentation.

FilesInfoStatic: Provides an explicit list of the start times and URIs of all files of one alternative presentation. Static provisioning of the file list offers the advantage of an exact timing description of the media presentation, but it may not be as compact, especially if the alternative representation contains many files. The files may also have arbitrary names.

FilesInfoDynamic: Provides an implicit way to construct the list of start times and URIs of one alternative presentation. Dynamic provisioning of the file list offers the advantage of a more compact representation. If only the sequence of start times is provided, the timing advantages also hold here, but the file names are constructed dynamically based on the FilePatternURI. If only the duration of each segment is provided, the representation is compact and may be suited for use within live services, but the generation of the files may be governed by global timing.
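The implicit construction that FilesInfoDynamic describes can be sketched as deriving start times from segment durations and file URIs from a pattern. The pattern syntax is not specified in the text, so the sketch below assumes a purely hypothetical one: a Python format string with `{rep_id}` and `{index}` placeholders. The function name and the example URL are likewise illustrative.

```python
from itertools import accumulate


def build_file_list(uri_pattern, rep_id, durations, first_start=0.0):
    """Build (start_time, uri) pairs for one alternative representation.

    uri_pattern: hypothetical pattern with {rep_id} and {index} placeholders.
    durations: duration of each segment in seconds, in presentation order.
    first_start: presentation time at which the first segment starts.
    """
    # Running totals give each segment's start; the final total is the
    # end of the last segment and is dropped.
    starts = list(accumulate(durations, initial=first_start))[:-1]
    return [
        (start, uri_pattern.format(rep_id=rep_id, index=index))
        for index, start in enumerate(starts, start=1)
    ]
```

For example, `build_file_list("http://example.com/{rep_id}/seg{index}.3gp", "rep1", [10, 10, 8])` yields three entries starting at 0, 10, and 20 seconds, so only the pattern and the durations need to be carried in the description instead of one entry per file.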

APMoveBoxHeaderSizeDoesNotChange: A flag indicating whether the size of the MoveBox header changes among the individual files within the alternative description. This flag can be used to optimize the download and may be present only for specific segment formats, especially those whose segments contain the moov header.

APCodecDescription: Describes the main codecs of the files in the alternative presentation.

Media description element

MediaDescription: An element that may encapsulate all metadata for the media contained in this representation. Specifically, it may contain information about the tracks in this alternative presentation as well as the recommended grouping of tracks, if applicable. The MediaDescription attribute contains the following attributes:

TrackDescription: An XML attribute that encapsulates all metadata for the media contained in this representation. The TrackDescription attribute contains the following attributes:

TrackID: A unique ID for the track within the alternative representation. This may be used where the track is part of a grouping description.

Bitrate: The bit rate of the track.

TrackCodecDescription: An XML attribute that contains all attributes of the codec used in this track. The TrackCodecDescription attribute contains the following attributes:

MediaName: An attribute defining the media type. The media types include "audio", "video", "text", "application", and "message".

Codec: The codec type, including profile and level.

LanguageTag: The language tag, if applicable.

MaxWidth, MaxHeight: For video, the width and height of the contained video in pixels.

SamplingRate: For audio, the sampling rate.

GroupDescription: An attribute that gives the client a recommendation for appropriate grouping based on different parameters.

GroupType: The type based on which the client may decide how to group tracks.

The information in a media presentation description is advantageously used by an HTTP streaming client to request files or segments, or parts of them, at appropriate times, selecting segments from suitable representations that match its capabilities (for example, access bandwidth, display capabilities, and codec capabilities) as well as user preferences such as language. Furthermore, because the media presentation description describes representations that are time-aligned and mapped to a global timeline, the client may also use the information in the MPD during an ongoing media presentation to initiate appropriate actions to switch across representations, to present representations jointly, or to seek within the media presentation.

Signaling segment start times

A representation may be split, timewise, into multiple segments. An inter-track timing issue exists between the last fragment of one segment and the next fragment of the following segment. In addition, a further timing issue exists when segments of constant duration are used.

Using the same duration for every segment may have the advantage that the MPD is both compact and static. However, every segment may still begin at a random access point. Thus, either the video encoding is constrained to provide random access points at these specific points, or the actual segment durations may not be exactly as specified in the MPD. It may be desirable for the streaming system not to place unnecessary restrictions on the video encoding process, so the second option may be preferred.

Specifically, if the file duration is specified in the MPD as d seconds, then the n-th file may begin with the random access point at or immediately following time (n-1)d.
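
The rule above can be sketched as follows. This is a minimal illustration, assuming the random access point times are known as a sorted list of presentation times in seconds; the function name `segment_start` and the sample values are hypothetical and do not come from the source.

```python
from bisect import bisect_left


def segment_start(n, d, random_access_points):
    """Return the start time of the n-th file (1-based).

    The file starts at the first random access point at or after the
    nominal time (n - 1) * d. `random_access_points` is a sorted list of
    presentation times in seconds (hypothetical input).
    """
    nominal = (n - 1) * d
    i = bisect_left(random_access_points, nominal)
    if i == len(random_access_points):
        raise ValueError("no random access point at or after the nominal start")
    return random_access_points[i]


# Illustrative random access point times, not taken from the source.
raps = [0.0, 4.8, 10.4, 15.0, 20.0, 25.6, 31.2]
starts = [segment_start(n, 10.0, raps) for n in range(1, 5)]
print(starts)
```

With d = 10 seconds, the second file here would start at 10.4 rather than exactly 10.0, which is the imprecision relative to the MPD that the second option accepts.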

In this approach, each file may include information as to the exact start time of the segment in terms of global presentation time. Three possible ways to signal this include:

(1) First, restrict the start time of each segment to the exact timing as specified in the MPD. The media encoder may then have no flexibility in the placement of IDR frames, and file streaming may require special encoding.

(2)第二，為每個段向MPD添加確切開始時間。對於點播情形，MPD的緊湊性可能降低。對於實況情形，此可能要求對MPD的定期更新，此會降低可伸縮性。(2) Second, add the exact start time to the MPD for each segment. For the on-demand case, the compactness of the MPD may be reduced. For live situations, this may require periodic updates to the MPD, which reduces scalability.

(3) Third, add to the segment either the global time or the exact start time relative to the announced start time of the representation or the announced start time of the segment in the MPD, in the sense that the segment contains this information. This information may be added to a new box dedicated to adaptive streaming. This box may also include information as provided by the "TIDX" or "SIDX" box. A consequence of this third approach is that when seeking to a particular position near the beginning of one of the segments, the client may, based on the MPD, choose the segment subsequent to the one containing the requested seek point. A simple response in this case may be to move the seek point forward to the start of the retrieved segment (i.e., to the next random access point after the seek point). Usually, random access points are provided at least every few seconds (and there is often little encoding gain from making them less frequent), so in the worst case the seek point may be moved to be a few seconds later than specified. Alternatively, the client could determine, on retrieving the header information for the segment, that the requested seek point is in fact in the previous segment and request that segment instead. This may occasionally increase the time required to execute the seek operation.
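
The seek behavior described for the third approach can be sketched as below. This is only an illustration under stated assumptions: `actual_start` is a hypothetical callable standing in for reading the exact start time from a retrieved segment's header, and `resolve_seek` is not a name used in the source.

```python
def resolve_seek(t, d, actual_start, allow_extra_request=False):
    """Resolve a seek to time t (seconds) into (segment index, playout time).

    The segment is first chosen from the nominal MPD timing (duration d).
    If its exact start, read from the segment itself, is later than t, the
    client either moves the seek point forward to that start or, if
    permitted, requests the previous segment instead.
    """
    n = int(t // d) + 1           # 1-based index from nominal MPD timing
    start = actual_start(n)       # exact start signaled inside the segment
    if t >= start:
        return n, t               # seek point really is in segment n
    if allow_extra_request and n > 1:
        return n - 1, t           # costs one more request, keeps the seek point
    return n, start               # simple response: move seek point forward


# Illustrative exact start times keyed by segment index (hypothetical).
exact_starts = {1: 0.0, 2: 10.4, 3: 20.0}
print(resolve_seek(10.2, 10.0, exact_starts.get))
print(resolve_seek(10.2, 10.0, exact_starts.get, allow_extra_request=True))
```

The first call moves the seek point from 10.2 to 10.4; the second keeps 10.2 at the price of fetching the previous segment.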

Accessible Segments List

媒體呈現包括一組表示，每個表示提供對原始媒體內容的某個不同版本的編碼。該等表示本身有利地包含關於該表示相比於其他參數的區別參數的資訊。該等表示亦顯式地或隱式地包含可存取段的列表。The media presentation includes a set of representations, each representation providing an encoding of a different version of the original media content. The representations themselves advantageously contain information about the distinguishing parameters of the representation compared to other parameters. The representations also explicitly or implicitly contain a list of accessible segments.

Segments may be differentiated into time-independent segments containing metadata only and media segments that primarily contain media data. The media presentation description ("MPD") advantageously identifies each of the segments and assigns different attributes to them, either implicitly or explicitly. Attributes advantageously assigned to each segment comprise the period during which the segment is accessible and the resources and protocols through which the segment is accessible. In addition, media segments are advantageously assigned attributes such as the start time of the segment in the media presentation and the duration of the segment in the media presentation.

Where the media presentation is of type "on-demand", as advantageously indicated by an attribute in the media presentation description such as the on-demand availability interval, the media presentation description typically describes all of the segments and also indicates when the segments are accessible and when they are not. The start times of segments are advantageously expressed relative to the start of the media presentation, so that two clients that start playback of the same media presentation at different times can use the same media presentation description as well as the same media segments. This advantageously improves the ability to cache the segments.

Where the media presentation is of type "live", as advantageously indicated by an attribute in the media presentation description such as the "Service" attribute, the segments comprising the portion of the media presentation beyond the actual time of day are generally not generated, or at least not accessible, even though the segments are fully described in the MPD. However, given the indication that the media presentation service is of type "live", the client may, based on the information contained in the MPD and the download time of the MPD, produce a list of the segments accessible at the client's internal time "now" in wall-clock time, together with their timing attributes. The server advantageously operates in the sense that it makes resources accessible such that a reference client operating with the instance of the MPD at wall-clock time "now" can access the resources.

Specifically, the reference client produces, based on the information contained in the MPD and the download time of the MPD, a list of the segments accessible at the client's internal time "now" in wall-clock time, together with their timing attributes. As time advances, the client uses the same MPD and creates a new accessible segment list that can be used to play out the media presentation continuously. Therefore, the server can announce segments in the MPD before these segments are actually accessible. This is advantageous, as it reduces frequent updating and downloading of the MPD.

Assume that a list of segments, each with start time tS, is described either explicitly by a playlist in an element such as "static file information" or implicitly by using an element such as "dynamic file information". An advantageous method of generating a segment list using "dynamic file information" is described below. Based on this construction rule, the client has access to a list of URIs for each representation r, referred to herein as FileURI(r, i), and a start time tS(r, i) for each segment with index i.

使用MPD中的資訊來建立段的可存取時間訊窗可使用以下規則來執行。The use of information in the MPD to establish a segment's accessible time window can be performed using the following rules.

For a service of type "on-demand", as advantageously indicated by an attribute such as "Service", if the current wall-clock time "now" at the client is within any range of availability, advantageously expressed by an MPD element such as "on-demand availability interval", then all described segments of this on-demand presentation are accessible. If the current wall-clock time "now" at the client is outside every range of availability, then none of the described segments of this on-demand presentation are accessible.
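
The on-demand rule amounts to an interval membership test. A minimal sketch follows, assuming times are plain numbers (for example seconds since some epoch) and that interval endpoints are inclusive; both are assumptions for illustration, and `on_demand_accessible` is a hypothetical name.

```python
def on_demand_accessible(now, availability_intervals):
    """Return True if all described on-demand segments are accessible.

    `availability_intervals` is a list of (start, end) wall-clock pairs.
    The segments are accessible when `now` lies in any interval and
    inaccessible otherwise. Inclusive endpoints are an assumption.
    """
    return any(start <= now <= end for start, end in availability_intervals)


# Illustrative availability ranges (hypothetical values).
intervals = [(1000, 2000), (5000, 6000)]
print(on_demand_accessible(1500, intervals))
print(on_demand_accessible(3000, intervals))
```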

For a service of type "live", as advantageously indicated by an attribute such as "Service", the start time tS(r, i) advantageously expresses the time of availability in wall-clock time. The availability start time may be derived as a combination of the live service time of the event and some turnaround time at the server for capturing, encoding, and publishing. The time for this process may, for example, be specified in the MPD, for example using a guard interval tG specified in the MPD as "guard interval live service". This would provide the minimum difference between UTC time and the availability of the data on the HTTP streaming server. In another embodiment, the MPD explicitly specifies the availability time of the segments in the MPD without providing the turnaround time as a difference between the event live time and the turnaround time. In the following description, it is assumed that any global times are specified as availability times. One of ordinary skill in the art of live media broadcasting can derive this information from suitable information in the media presentation description after reading this description.

If the current wall-clock time "now" at the client is outside every range of the live presentation interval, advantageously expressed by an MPD element such as "live presentation interval", then none of the described segments of this live presentation are accessible. If the current wall-clock time "now" at the client is within the live presentation interval, then at least some of the described segments of this live presentation may be accessible.

The restriction on the accessible segments is governed by the following values:

(1) The wall-clock time "now" (as available to the client).

(2) The permitted time-shift buffer depth tTSB, for example specified as "time-shift buffer depth" in the media presentation description.

At relative event time t₁, the client may only be allowed to request segments with start times tS(r, i) in the interval from (now - tTSB) to "now", or in an interval such that the end time of a segment with duration d is also included, resulting in an interval from (now - tTSB - d) to "now".
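
The live window can be sketched as a filter over segment start times. This is an illustration only: start times are assumed to be already expressed as wall-clock availability times in seconds, the endpoints are treated as inclusive, and the function name `accessible_segments` is hypothetical.

```python
def accessible_segments(start_times, now, t_tsb, d=0.0):
    """Return indices of segments a live client may request at time `now`.

    A segment qualifies when its start time tS lies in
    [now - t_tsb - d, now]. With d = 0 this is the plain window
    (now - tTSB) to now; passing the segment duration d widens the window
    so that a segment whose end time falls inside it is also included.
    """
    lower = now - t_tsb - d
    return [i for i, ts in enumerate(start_times) if lower <= ts <= now]


# Illustrative start times for 10-second segments (hypothetical values).
ts = [0, 10, 20, 30, 40, 50]
print(accessible_segments(ts, now=45, t_tsb=20))
print(accessible_segments(ts, now=45, t_tsb=20, d=10))
```

As "now" advances, calling the function again with the same start times yields the next window without any change to the MPD.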

Updating the MPD

In some embodiments, the server does not know in advance the file or segment locators and start times of the segments because, for example, the server location will change, or the media presentation includes some advertisements from a different server, or the duration of the media presentation is unknown, or the server wants to obfuscate the locators of subsequent segments.

In such embodiments, the server might describe only segments that are already accessible or that become accessible shortly after this instance of the MPD has been published. Furthermore, in some embodiments, the client advantageously consumes media close to the media described in the MPD, so that the user experiences the included media program as close as possible to the generation of the media content. As soon as the client anticipates that it is reaching the end of the media segments described in the MPD, it advantageously requests a new instance of the MPD to continue continuous playout, in the expectation that the server has published a new MPD describing new media segments. The server advantageously generates new instances of the MPD and updates the MPD such that clients can rely on these procedures for continuous updates. The server may adapt its MPD update procedures, along with segment generation and publishing, to the procedures of a reference client that acts as a common client may act.

若MPD的新實例僅描述很短的時間提前量，則客戶端需要頻繁地請求MPD的新實例。此可能由於不必要的頻繁請求而導致可伸縮性問題及不必要的上行鏈路和下行鏈路訊務。If the new instance of the MPD describes only a short amount of time advancement, the client needs to frequently request a new instance of the MPD. This can lead to scalability issues and unnecessary uplink and downlink traffic due to unnecessary frequent requests.

It is therefore relevant, on the one hand, to describe segments as far into the future as possible without requiring them to be accessible yet and, on the other hand, to enable unforeseen updates in the MPD to express new server locations, to permit insertion of new content such as advertisements, or to provide changes in codec parameters.

Furthermore, in some embodiments, the duration of the media segments may be small, such as in the range of several seconds. The duration of media segments is advantageously flexible, so as to adjust to suitable segment sizes that can be optimized for delivery or caching properties, to compensate for end-to-end delay in live services, to deal with other aspects of the storage or delivery of segments, or for other reasons. Especially where the segments are small compared to the media presentation duration, a significant number of media segment resources and start times need to be described in the media presentation description. As a result, the media presentation description may be large, which may adversely affect its download time and therefore the start-up delay of the media presentation, as well as the bandwidth usage on the access link. Therefore, it is advantageous to permit the description of a list of media segments not only by playlists but also by templates or URL construction rules. The terms "template" and "URL construction rule" are used synonymously in this description.
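
To illustrate why a construction rule keeps the MPD compact, the sketch below expands one template into a list of (FileURI(r, i), tS(r, i)) pairs. The placeholder syntax `{rep}` and `{index}`, the example URL, and the function name `build_segment_list` are all assumptions made for illustration; the source does not define a template syntax in this passage.

```python
def build_segment_list(template, rep_id, duration, count, first_index=1):
    """Expand a URL construction rule into (uri, start_time) pairs.

    `template` uses hypothetical placeholders {rep} and {index}. Start
    times are nominal, assuming every segment has the same `duration`
    (seconds) and the first segment starts at time 0.
    """
    return [
        (template.format(rep=rep_id, index=i), (i - first_index) * duration)
        for i in range(first_index, first_index + count)
    ]


# Hypothetical template and representation identifier.
segments = build_segment_list(
    "http://example.com/{rep}/seg-{index}.3gp", "video-500k", 10.0, 3
)
for uri, start in segments:
    print(uri, start)
```

One short template string here stands in for an arbitrarily long playlist, and it can equally describe segments that do not exist yet.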

In addition, templates may advantageously be used to describe segment locators in live cases beyond the current time. In such cases, updates of the MPD are not necessary per se, as the locators as well as the segment list are described by the templates. However, unforeseen events may still happen that require changes in the description of the representations or the segments. Changes in an adaptive HTTP streaming media presentation description may be needed when content from multiple different sources is spliced together, for example when advertising has been inserted. The content from different sources may differ in a variety of ways. Another reason, during live presentations, is that it may be necessary to change the URLs used for content files to provide for failover from one live origin server to another.

In some embodiments, it is advantageous that, if the MPD is updated, the update is carried out such that the updated MPD is compatible with the previous MPD in the following sense: for any time up to the validity time of the previous MPD, the reference client, and therefore any implemented client, generates from the updated MPD an accessible segment list that functions identically to the list it would have generated from the previous instance of the MPD. This requirement ensures that (a) clients may immediately begin using the new MPD without synchronization with the old MPD, since the new MPD is compatible with the old MPD before the update time; and (b) the update time need not be synchronized with the time at which the actual change to the MPD takes place. In other words, updates to the MPD can be advertised in advance, and the server can replace the old instance of the MPD once new information is available without having to maintain different versions of the MPD.
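
The compatibility requirement can be checked mechanically on a simplified model. In the sketch below an MPD instance is reduced to a list of (uri, start_time) pairs, a segment is taken to be accessible once its start time has passed, and compatibility is probed at a finite set of times; all three simplifications, and the names `accessible` and `is_compatible`, are assumptions for illustration.

```python
def accessible(mpd, now):
    """Accessible segment list of a simplified MPD at time `now`."""
    return [(uri, ts) for uri, ts in mpd if ts <= now]


def is_compatible(old_mpd, new_mpd, old_valid_until, probe_times):
    """Check that both MPD instances yield the same accessible segment
    list at every probed time up to the old instance's validity time."""
    return all(
        accessible(old_mpd, t) == accessible(new_mpd, t)
        for t in probe_times
        if t <= old_valid_until
    )


# Hypothetical MPD instances: the update only appends later segments.
old = [("s1", 0), ("s2", 10), ("s3", 20)]
new = old + [("s4", 30), ("s5", 40)]
broken = [("s1", 0), ("x2", 10), ("s3", 20), ("s4", 30)]
print(is_compatible(old, new, 25, range(0, 50, 5)))
print(is_compatible(old, broken, 25, range(0, 50, 5)))
```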

Two possibilities may exist for the media timing across an MPD update for a set of representations or for all representations: (a) the existing global timeline continues across the MPD update (referred to herein as a "continuous MPD update"), or (b) the current timeline ends and a new timeline begins with the segment following the change (referred to herein as a "discontinuous MPD update").

The difference between these options may be evident when considering that the tracks of a media fragment, and hence of a segment, generally do not begin and end at the same time because of the differing sample granularity across tracks. During normal presentation, samples of one track of a fragment may be rendered before some samples of another track of the previous fragment; that is, there may be some overlap between fragments, although there may be no overlap within a single track.

The difference between (a) and (b) is whether such overlap may be allowed across an MPD update. When the MPD update is due to the splicing of completely separate content, such overlap is generally difficult to achieve, as the new content would need new encoding to be spliced with the previous content. It is therefore advantageous to provide the ability to update the media presentation discontinuously by restarting the timeline for certain segments, and possibly also to define a new set of representations after the update. Also, if the content has been independently encoded and segmented, adjusting timestamps to fit within the global timeline of the previous piece of content is avoided.

在更新是出於次要原因，諸如僅僅是向所描述媒體段的列表添加新媒體段時，或者若URL的位置被改變，則重疊和連續更新可被允許。Overlap and continuous updates may be allowed when the update is for secondary reasons, such as merely adding a new media segment to the list of described media segments, or if the location of the URL is changed.

In the case of a discontinuous MPD update, the timeline of the last segment of the previous representation ends at the latest presentation end time of any sample in the segment. The timeline of the next representation (or, more accurately, the first presentation time of the first media segment of the new part of the media presentation, also referred to as the new period) typically and advantageously begins at the same instant as the end of the presentation of the last period, so as to ensure seamless and continuous playout.

The two cases are illustrated in FIG. 11.

It is preferred and advantageous to restrict MPD updates to segment boundaries. The rationale for restricting such changes or updates to segment boundaries is as follows. First, changes to the binary metadata for each representation, typically the movie header, may take place at least at segment boundaries. Second, the media presentation description may contain the pointers (URLs) to the segments. In a sense, the MPD is the "umbrella" data structure grouping together all the segment files associated with the media presentation. To maintain this containment relationship, each segment may be referenced by a single MPD, and when the MPD is updated, it is advantageously updated only at a segment boundary.

Segment boundaries are generally not required to be aligned. However, for content spliced from different sources, and for discontinuous MPD updates generally, it makes sense to align the segment boundaries (specifically, the last segment of each representation may end at the same video frame and may not contain audio samples with a presentation start time later than the presentation time of that frame). A discontinuous update may then start a new set of representations at a common time instant, referred to as a period. The start time of the validity of this new set of representations is provided, for example, by a period start time. The relative start time of each representation is reset to zero, and the start time of the period places the set of representations in this new period on the global media presentation timeline.

For continuous MPD updates, segment boundaries are not required to be aligned. Each segment of each alternative representation may be governed by a single media presentation description, and thus the update requests for new instances of the media presentation description, generally triggered by the anticipation that no additional media segments are described in the operating MPD, may take place at different times depending on the consumed set of representations, including the set of representations that are anticipated to be consumed.

To support updates of MPD elements and attributes in a more general case, any element, not only representations or sets of representations, may be associated with a validity time. Thus, if certain elements of the MPD need to be updated, for example where the number of representations has changed or the URL construction rules have changed, each of these elements may be updated individually at specified times by providing multiple copies of the element with disjoint validity times.

Validity is advantageously associated with the global media time, such that the described element associated with a validity time is valid during a period of the global timeline of the media presentation.

As discussed above, in one embodiment the validity times are added only to a full set of representations. Each full set then forms a period, and the validity time then forms the start time of the period. In other words, in the specific case of using the validity element, a full set of representations may be valid for a period of time indicated by a global validity time for the set of representations. The validity time of a set of representations is referred to as a period. At the start of a new period, the validity of the previous set of representations expires and the new set of representations becomes valid. Note again that the validity times of periods are preferably disjoint.
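
Because the validity times are disjoint and each period runs until the next one starts, finding the set of representations valid at a given media time reduces to a sorted lookup. The sketch below assumes the period start times are given as a sorted list of seconds on the global timeline; the name `valid_period` and the sample values are hypothetical.

```python
from bisect import bisect_right


def valid_period(period_starts, t):
    """Return the index of the period whose representations are valid at
    global media time t.

    `period_starts` is a sorted list of period start times. Period k is
    valid from period_starts[k] up to, but not including, the start of
    period k + 1, so the validity intervals are disjoint.
    """
    i = bisect_right(period_starts, t) - 1
    if i < 0:
        raise ValueError("t precedes the first period")
    return i


# Illustrative timeline: main content, a 30-second advertisement, content.
period_starts = [0.0, 600.0, 630.0]
print(valid_period(period_starts, 599.9))
print(valid_period(period_starts, 600.0))
```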

如上所述，對媒體呈現描述的改變發生在段邊界處，且因此對於每個表示，元素改變實際上發生在下一段邊界處。客戶端隨後可構成有效MPD，有效MPD包括媒體的呈現時間內的每個時刻的段列表。As described above, the change to the media presentation description occurs at the segment boundaries, and thus for each representation, the element change actually occurs at the next segment boundary. The client can then form a valid MPD, which includes a list of segments at each moment in the presentation time of the media.

Discontinuous block splicing may be appropriate where blocks contain media data from different representations or from different content (for example, from a segment of content and an advertisement), or in other cases. A block-request streaming system may require that changes to presentation metadata take place only at block boundaries. This may be advantageous for implementation reasons, because updating media decoder parameters within a block may be more complex than updating them only between blocks. In this case, it may advantageously be specified that validity intervals as described above may be interpreted as approximate, such that an element is considered valid from the first block boundary not earlier than the start of the specified validity interval to the first block boundary not earlier than the end of the specified validity interval.
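
The approximate interpretation of a validity interval can be sketched as snapping both ends forward to block boundaries. The block boundary times are assumed to be available as a sorted list of seconds; the function name `effective_validity` and the sample values are hypothetical.

```python
from bisect import bisect_left


def effective_validity(block_boundaries, start, end):
    """Snap a specified validity interval to block boundaries.

    Both ends move to the first block boundary that is not earlier than
    the specified time, so element changes only happen between blocks.
    """
    def snap(t):
        i = bisect_left(block_boundaries, t)
        if i == len(block_boundaries):
            raise ValueError("no block boundary at or after the given time")
        return block_boundaries[i]

    return snap(start), snap(end)


# Illustrative block boundaries every 4 seconds (hypothetical values).
boundaries = [0, 4, 8, 12, 16]
print(effective_validity(boundaries, 5, 11))
```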

以上描述的對區塊請求串流系統的新穎增強的實例實施例在稍後提供的題為「對媒體呈現的改變」的小節中描述。A novel enhanced example embodiment of the block request stream system described above is described in the section entitled "Changes to Media Presentation" provided later.

Segment Duration Signaling

Discontinuous updates effectively divide the presentation into a series of disjoint intervals referred to as periods. Each period has its own timeline for media sample timing. The media timing of the representations within a period may advantageously be indicated by specifying a separate compact list of segment durations for each period or for each representation in a period.

與MPD內的元素相關聯的例如稱為時段開始時間之類的屬性可指定媒體呈現時間內的某些元素的有效性時間。該屬性可被添加到MPD的任何元素(可對元素換上可被指派有效性的屬性)。An attribute associated with an element within the MPD, such as a time period start time, may specify the validity time of certain elements during the media presentation time. This attribute can be added to any element of the MPD (the element can be replaced with an attribute that can be assigned validity).

對於非連續MPD更新，所有表示的段可在非連續點結束。此一般至少意味著該非連續點之前的最末段與先前各段具有不同歷時。訊號傳遞通知段歷時可涉及指示所有段具有相同的歷時或者為每個段指示單獨的歷時。可能希望具有關於段歷時列表的緊湊表示，此在有許多段具有相同歷時的情形中是高效的。For non-contiguous MPD updates, all represented segments can end at non-contiguous points. This generally means at least that the last segment before the discontinuous point has a different duration than the previous segment. Signaling the notification segment duration may involve indicating that all segments have the same duration or indicating a separate duration for each segment. It may be desirable to have a compact representation of the segment duration list, which is efficient in situations where there are many segments with the same duration.

The duration of each segment for one representation or a set of representations may advantageously be carried in a single string that specifies all segment durations for a single interval from the start of the discontinuous update (i.e., the start of the period) until the last media segment described in the MPD. In one embodiment, the format of this element is a text string conforming to a production that contains a list of segment duration entries, where each entry contains a duration attribute dur and an optional multiplier mult of the attribute, indicating that this representation contains <mult> segments of duration <dur> of the first entry, then <mult> segments of duration <dur> of the second entry, and so on.

Each duration entry specifies the duration of one or more segments. If the <dur> value is followed by the "^＊" character and a number, then this number specifies the number of consecutive segments with this duration, in seconds. If the multiplier sign "^＊" is absent, the number of segments is one. If "^＊" is present with no following number, then all subsequent segments have the specified duration and there may be no further entries in the list. For example, the string "30^＊" means that all segments have a duration of 30 seconds. The string "30^＊ 12 10.5" indicates 12 segments of duration 30 seconds, followed by one segment of duration 10.5 seconds.
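
A parser for this compact list might look like the sketch below. It rests on an assumption that should be checked against the authoritative text: the multiplier sign printed above as "^＊" is taken to be a plain asterisk written directly between the duration and the count, so the second example is spelled "30*12 10.5". The function name `parse_durations` and the returned shape are hypothetical.

```python
import re

# One entry: a duration, optionally followed by "*" and an optional count.
_ENTRY = re.compile(r"(\d+(?:\.\d+)?)(\*(\d*))?$")


def parse_durations(text):
    """Parse a compact segment duration string.

    Returns (durations, open_ended): `durations` lists the explicitly
    counted segment durations in seconds, and `open_ended` is the duration
    shared by all subsequent segments when the string ends with "<dur>*",
    else None.
    """
    durations, open_ended = [], None
    for token in text.split():
        if open_ended is not None:
            raise ValueError("no entries may follow an open-ended entry")
        m = _ENTRY.match(token)
        if m is None:
            raise ValueError(f"malformed entry: {token!r}")
        dur = float(m.group(1))
        if m.group(2) is None:
            durations.append(dur)             # no multiplier: one segment
        elif m.group(3) == "":
            open_ended = dur                  # "*" alone: all later segments
        else:
            durations.extend([dur] * int(m.group(3)))
    return durations, open_ended


explicit, rest = parse_durations("30*12 10.5")
print(len(explicit), explicit[-1], rest)
print(parse_durations("30*"))
```

Summing the `durations` list also gives a quick way to confirm that separately specified representations cover the same interval.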

若針對每個替換表示分開地指定段歷時，則每個區間內的段歷時的總和對於每個表示而言可以是相同的。在視訊軌跡的情形中，該區間在每個替換表示中可結束於相同的訊框。If the segment duration is specified separately for each replacement representation, the sum of the segment durations within each interval may be the same for each representation. In the case of a video track, the interval may end in the same frame in each of the alternate representations.

本領域一般技藝人士在閱讀本案之際可發現以緊湊方式來表達段歷時的類似和等效途徑。One of ordinary skill in the art, upon reading this disclosure, will find similar and equivalent ways of expressing segment duration in a compact manner.

In another embodiment, the segment duration is signaled to be constant for all segments in the representation except the last one, by a signal "duration attribute <duration>". The duration of the last segment before a discontinuous update may be shorter, as long as the start point of the next discontinuous update or the start of a new period is provided, which then implies the duration of the last segment as reaching up to the start of the next period.
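
Under this constant-duration signaling, the segment durations of a period follow from the constant duration and the distance to the next period start. A minimal sketch, assuming both values are known in seconds; `segment_durations` is a hypothetical name.

```python
def segment_durations(period_length, d):
    """Return the segment durations implied by a constant duration d.

    All segments last d seconds except possibly the last one, which is
    shortened so that it ends where the next period starts.
    """
    full = int(period_length // d)        # number of full-length segments
    rest = period_length - full * d       # what remains for the last segment
    return [d] * full + ([rest] if rest > 0 else [])


# Illustrative period of 95 seconds with 10-second segments.
durations = segment_durations(95.0, 10.0)
print(durations)
```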

Changes and Updates to Representation Metadata

Indicating changes to binary-coded representation metadata, such as movie header "moov" changes, may be accomplished in different ways: (a) there may be one moov box for all representations in a separate file referenced in the MPD; (b) there may be one moov box for each alternative representation in a separate file referenced in each alternative representation; (c) each segment may contain a moov box and is therefore self-contained; (d) there may be a moov box for all representations in one 3GP file together with the MPD.

注意在(a)和(b)的情形中，可有利地將單個「moov」與來自上文的有效性概念相組合，組合的意義是指在MPD中可以引用更多的「moov」包，只要該等「moov」包的有效性不相交即可。例如，有了時段邊界的定義，舊時段中的‘moov’的有效性可隨著新時段的開始而過期。Note that in the case of (a) and (b), it is advantageous to combine a single "moov" with the concept of validity from above, which means that more "moov" packets can be referenced in the MPD. As long as the validity of these "moov" packages does not intersect. For example, with the definition of a time period boundary, the validity of the 'moov' in the old time period may expire as the new time period begins.

在選項(a)的情形中，對單個moov包的引用可被指派有效性元素。可允許多個呈現頭部，但每個時間僅可有一個呈現頭部有效。在另一實施例中，如以上界定的時段中的整組表示或整個時段的有效性時間可被用作該表示中繼資料的有效性時間，典型地作為moov頭部來提供。In the case of option (a), a reference to a single moov package can be assigned a validity element. Multiple presentation headers can be allowed, but only one presentation header can be active at a time. In another embodiment, the full set of representations in the time period as defined above or the validity time of the entire time period may be used as the validity time of the representation relay material, typically provided as a moov header.

在選項(b)的情形中，對每個表示的moov包的引用可被指派有效性元素。可允許多個表示頭部，但每個時間僅可有一個表示頭部有效。在另一實施例中，如以上界定的整個表示或整個時段的有效性時間可被用作該表示中繼資料的有效性時間，典型地作為moov頭部來提供。In the case of option (b), a reference to each represented moov package can be assigned a validity element. Multiple representation headers may be allowed, but only one representation per time is valid for the header. In another embodiment, the validity time of the entire representation or the entire time period as defined above may be used as the validity time of the representation relay material, typically provided as a moov header.

在選項(c)的情形中，可以不在MPD中添加訊號傳遞，但可在媒體串流中添加額外訊號傳遞以指示moov包對於任何即將到來的段是否將改變。此在下文「訊號傳遞通知段中繼資料內的更新」該小節的上下文中進一步解釋。In the case of option (c), you can not add signal passing in the MPD. However, additional signal passing can be added to the media stream to indicate whether the moov packet will change for any upcoming segments. This is further explained in the context of this section below, "Updates in Signal Delivery Notification Segment Relay Data".

Signaling Updates Within Segment Metadata

To avoid frequent updates of the media presentation description merely to obtain knowledge of potential updates, it is advantageous to signal any such updates together with the media segments. One or more additional elements may be provided within the media segments themselves to indicate that updated metadata (such as a media presentation description) is available and must be accessed within a certain amount of time in order to continue successfully building accessible segment lists. In addition, such elements may provide a file identifier (such as a URL) for the updated metadata file, or information from which the file identifier can be constructed. The updated metadata file may include metadata equal to that provided in the original metadata file for the presentation, modified to indicate validity intervals, together with additional metadata that is also accompanied by validity intervals. Such an indication may be provided in the media segments of all available representations of the media presentation. A client accessing the block-request streaming system may, upon detecting such an indication within a media block, use a file download protocol or other means to retrieve the updated metadata file. The client is thereby informed of changes in the media presentation description and of the time at which they will occur or have occurred. Advantageously, each client requests the updated media presentation description only once when such a change occurs, rather than "polling" and receiving the file many times to check for possible updates or changes.

Examples of changes include the addition or removal of representations; changes to one or more representations, such as changes in bit rate, resolution, aspect ratio, included tracks, or codec parameters; and changes to URL construction rules, for example a different origin server for advertisements. Some changes may affect only the initialization segment associated with a representation, such as the movie header ("moov") atom, whereas other changes may affect the media presentation description (MPD).

In the case of on-demand content, these changes and their timing may be known in advance and may be signaled in the media presentation description.

For live content, changes may not be known until the point in time at which they occur. One solution is to allow the media presentation description available at a specific URL to be updated dynamically and to require clients to request this MPD regularly in order to detect changes. This solution has disadvantages in terms of scalability (origin server load and cache efficiency). In a scenario with a large number of viewers, caches may receive many requests for the MPD after the previous version has expired from the cache and before the new version has been received, and all of these requests may be forwarded to the origin server. The origin server may need to constantly process requests from caches for each updated version of the MPD. Also, the updates may not be easily time-aligned with changes in the media presentation.

Since one of the advantages of HTTP streaming is the ability to use standard web infrastructure and services for scalability, a preferred solution may involve only "static" (i.e., cacheable) files and not rely on clients "polling" files to see whether they have changed.

Solutions are discussed and proposed for handling updates of metadata, including the media presentation description and binary representation metadata such as "moov" atoms, in an adaptive HTTP streaming media presentation.

For the case of live content, the points in time at which the MPD or "moov" may change might not be known when the MPD is constructed. Since frequent "polling" of the MPD to check for updates should generally be avoided for bandwidth and scalability reasons, updates to the MPD may be indicated "in band" in the segment files themselves, i.e., each media segment may have an option to indicate updates. Depending on which of segment formats (a) to (c) above is used, different updates may be signaled.

In general, the following indication may advantageously be provided in a signal within the segment: an indicator that the MPD may be updated before requesting the next segment within this representation, or any next segment whose start time is later than the start time of the current segment. The update may be announced in advance, indicating that the update need only happen at any segment later than the next one. This MPD update may also be used to update binary representation metadata, such as movie headers, in case the locator of the media segment has changed. Another signal may indicate that, upon completion of this segment, no further segments that advance time should be requested.

In case segments are formatted according to segment format (c), i.e., each media segment may contain self-initializing metadata such as the movie header, another signal may be added to indicate that the subsequent segment contains an updated movie header (moov). This advantageously allows the movie header to be included in the segment, while the client needs to request it only if the previous segment indicated a movie header update, or in the case of seeking or random access when switching representations. In other cases, the client may issue a byte-range request for the segment that excludes the movie header from the download, thereby advantageously saving bandwidth.

In yet another embodiment, if the MPD update indication is signaled, the signal may also contain a locator, such as a URL, for the updated media presentation description. In the case of discontinuous updates, the updated MPD may describe the presentation both before and after the update, using validity attributes such as a new period and an old period. This may advantageously be used to permit time-shift viewing as described further below, and it also advantageously allows the MPD update to be signaled at any time before the changes it contains take effect. The client may download the new MPD immediately and apply it to the ongoing presentation.

In a specific realization, the signaling of any change to the media presentation description, the moov header, or the end of the presentation may be contained in a streaming information box that is formatted according to the rules of the segment format, using the box structure of the ISO base media file format. This box may provide a specific signal for each of the different updates.

Streaming Information Box

Definition

Box Type: 'sinf'

Container: None

Mandatory: No

Quantity: Zero or one.

The Streaming Information Box contains information about which streaming presentation the file is a part of.

Syntax

Semantics

streaming_information_flags contains the logical OR of zero or more of the following:

0x00000001 Movie header update follows

0x00000002 Presentation Description update

0x00000004 End-of-presentation

mpd_location is present if and only if the Presentation Description update flag is set, and provides a Uniform Resource Locator for the new media presentation description.
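As a minimal illustration of how a client might interpret the flag values listed above, the following Python sketch decodes a streaming_information_flags value. Only the three numeric values come from the text; the class name, member names, and helper functions are hypothetical, and the sketch does not attempt to parse the box itself, since its syntax is not reproduced here.

```python
from enum import IntFlag
from typing import List


class StreamingInformationFlags(IntFlag):
    # Numeric values as listed in the semantics; member names are illustrative.
    MOVIE_HEADER_UPDATE_FOLLOWS = 0x00000001
    PRESENTATION_DESCRIPTION_UPDATE = 0x00000002
    END_OF_PRESENTATION = 0x00000004


def set_flag_names(flags: int) -> List[str]:
    """Return the names of all flags set in the given value."""
    return [f.name for f in StreamingInformationFlags if flags & f]


def mpd_location_present(flags: int) -> bool:
    """mpd_location is present if and only if the update flag is set."""
    return bool(flags & StreamingInformationFlags.PRESENTATION_DESCRIPTION_UPDATE)
```

For instance, a value of 0x00000003 signals both a following movie header update and a presentation description update, so under this reading an mpd_location would be expected in the box.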

Example Use Case for MPD Updates for Live Services

Suppose a service provider wants to provide a live football event using the enhanced block-request streaming described herein. Perhaps millions of users might want to access the presentation of the event. The live event is sporadically interrupted by breaks when a timeout is called or by other lulls in the action, during which advertisements may be inserted. Typically, there is little or no advance notice of the exact timing of the breaks.

The service provider may need to provide redundant infrastructure (e.g., encoders and servers) to enable a seamless switch-over in case any component fails during the live event.

Suppose a user, Anna, accesses the service on a bus with her mobile device, and the service is available immediately. Next to her sits another user, Paul, who is watching the event on his laptop. A goal is scored and both celebrate the event at the same time. Paul tells Anna that the first goal of the game was even more exciting, and Anna uses the service so that she can view the event from 30 minutes earlier. After watching the goal, she returns to the live event.

To address this use case, the service provider should be able to update the MPD, signal to clients that an updated MPD is available, and permit clients to access the streaming service so that the data can be presented in close to real time.

Updating the MPD asynchronously to the delivery of segments is feasible, as explained elsewhere herein. The server can provide the receiver with a guarantee that the MPD will not be updated for some time. The server may rely on the current MPD. However, no explicit signaling is needed when the MPD is updated before some minimum update period.

Completely synchronous playout is hardly achievable, since clients may be operating on different MPD update instances and may therefore drift. Using MPD updates, the server can convey changes and clients can be alerted to changes, even during a presentation. In-band signaling on a segment-by-segment basis can be used to indicate an update of the MPD, so updates might be limited to segment boundaries, but that should be acceptable in most applications.

An MPD element can be added that provides the publishing time of the MPD in wall-clock time, together with an optional MPD update box that is added at the beginning of segments to signal that an MPD update is required. The updates can be done hierarchically, as with the MPD.

The MPD "Publish time" provides a unique identifier for the MPD and indicates when the MPD was issued. It also provides an anchor for the update procedures.

The MPD update box may be found in the MPD after the "styp" box and is defined by Box Type = "mupe"; it needs no container, is not mandatory, and has a quantity of zero or one. The MPD update box contains information about which media presentation the segment is a part of.

Example syntax is as follows:

The semantics of the various objects of the class MPDUpdateBox may be as follows:

mpd_information_flags: the logical OR of zero or more of the following:

i. 0x00 Media Presentation Description update now

ii. 0x01 Media Presentation Description update ahead

iii. 0x02 End-of-presentation

iv. 0x03-0x07 Reserved

new_location flag: if set to 1, a new media presentation description is available at the new location specified in mpd_location.

latest_mpd_update time: specifies the time (in ms) by which the MPD update is necessary at the latest, relative to the MPD issue time of the latest MPD. The client may choose to update the MPD at any time between now and that time.

mpd_location: present if and only if the new_location flag is set, in which case mpd_location provides a Uniform Resource Locator for the new media presentation description.

If the bandwidth used by updates is an issue, the server may offer MPDs for certain device capabilities such that only those parts are updated.

Time-Shift Viewing and Network PVR

When time-shift viewing is supported, it may happen that two or more MPDs or movie headers are valid over the lifetime of the session. In this case, by updating the MPD when necessary but adding the validity mechanism or the period concept, a valid MPD may exist for the entire time window. This means that the server can ensure that any MPD and movie header are announced for any period of time that lies within the valid time window for time-shift viewing. It is up to the client to ensure that the MPD and metadata available for its current presentation time are valid. Migration of a live session to a network PVR session using only minor MPD updates may also be supported.

Special Media Segments

An issue when the file format of ISO/IEC 14496-12 is used within a block-request streaming system is that, as described above, it may be advantageous to store the media data for a single version of the presentation in multiple files arranged in consecutive time segments. It may furthermore be advantageous to arrange that each file begins with a random access point. It may further be advantageous to choose the positions of the seek points during the video encoding process and to segment the presentation into multiple files, each beginning with a seek point, based on the choice of seek points made during the encoding process, wherein each random access point may or may not be placed at the beginning of a file but wherein each file begins with a random access point. In one embodiment with the properties described above, the presentation metadata, or media presentation description, may contain the exact duration of each file, where duration is taken, for example, to mean the difference between the start time of the video media of a file and the start time of the video media of the next file. Based on this information in the presentation metadata, the client is able to construct a mapping between the global timeline of the media presentation and the local timeline of the media within each file.

In another embodiment, the size of the presentation metadata may advantageously be reduced by specifying instead that every file or segment has the same duration. However, in this case, and where media files are constructed according to the method above, the duration of each file may not be exactly equal to the duration specified in the media presentation description, because a random access point may not exist at the point that is exactly the specified duration from the start of the file.

A further embodiment of the invention is now described that provides for correct operation of the block-request streaming system despite the discrepancy mentioned above. In this method, an element may be provided within each file that specifies the mapping of the local timeline of the media within the file to the global presentation timeline. The local timeline is the timeline starting from timestamp zero against which the decoding and composition timestamps of the media samples in the file are specified according to ISO/IEC 14496-12. This mapping information may comprise a single timestamp in global presentation time that corresponds to the zero timestamp in the local file timeline. Alternatively, the mapping information may comprise an offset value that specifies the difference between the global presentation time corresponding to the zero timestamp in the local file timeline and the global presentation time corresponding to the start of the file according to the information provided in the presentation metadata.
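The two alternative forms of mapping information can be sketched in Python as follows. This is only an illustration under stated assumptions: all times are plain numbers in seconds on a common scale, and the function and parameter names are hypothetical rather than taken from the disclosure or from ISO/IEC 14496-12.

```python
def global_time_from_timestamp(local_time: float,
                               global_time_at_local_zero: float) -> float:
    """First form: the file carries the global presentation time
    that corresponds to local timestamp zero."""
    return global_time_at_local_zero + local_time


def global_time_from_offset(local_time: float,
                            nominal_file_start: float,
                            offset: float) -> float:
    """Second form: the file carries an offset relative to the file start
    time that the client derives from the presentation metadata.

    offset = (global time of local timestamp zero) - nominal_file_start
    """
    return nominal_file_start + offset + local_time
```

As an example of the second form, if the presentation metadata implies that a file starts at 60 seconds but its first random access point actually lies 0.5 seconds later, an offset of 0.5 maps local timestamp 2.0 to global time 62.5.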

Examples of such boxes may be the track fragment decode time ('tfdt') box, or the track fragment adjustment box ('tfad') together with the track fragment media adjustment ('tfma') box.

Example Client Including Segment List Generation

An example client will now be described. It may be used as a reference client for the server to ensure proper generation and updating of the MPD.

An HTTP streaming client is guided by the information provided in the MPD. It is assumed that the client has access to the MPD that it received at time T, i.e., the time at which it was able to successfully receive an MPD. Determining successful reception may include the client obtaining an updated MPD or the client verifying that the MPD has not been updated since the previous successful reception.

Example client behavior is introduced below. To provide a continuous streaming service to the user, the client first parses the MPD and, for each representation, creates a list of segments accessible at the client-local time given by the current system time, taking into account the segment list generation procedures detailed below, possibly using playlists or URL construction rules. Then, the client selects one or more representations based on the information in the representation attributes and on other information, such as available bandwidth and client capabilities. Depending on grouping, representations may be presented standalone or jointly with other representations.

For each representation, the client acquires the binary metadata, such as the "moov" header for the representation, if present, and the media segments of the selected representations. The client accesses the media content by requesting segments or byte ranges of segments, possibly using the segment list. The client may initially buffer media before starting the presentation and, once the presentation has started, continues consuming the media content by continuously requesting segments or parts of segments, taking into account the MPD update procedures.

The client may switch representations, taking into account updated MPD information and/or updated information from its environment (e.g., a change in available bandwidth). With any request for a media segment containing a random access point, the client may switch to a different representation. When moving forward, i.e., as the current system time (referred to as the "NOW time", expressing time relative to the presentation) advances, the client consumes the accessible segments. With each advance in the "NOW time", the client possibly expands the list of accessible segments for each representation according to the procedures specified herein.

If the end of the media presentation has not yet been reached, and if the current playback time gets within a threshold of the point at which the client anticipates running out of the media described in the MPD for any representation it is consuming or will consume, then the client may request an update of the MPD, with a new fetch time, reception time T. Once received, the client then takes into account the possibly updated MPD and the new time T in generating the accessible segment lists. FIG. 29 illustrates a procedure for live services at different times at the client.
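The MPD update trigger described above can be sketched as a simple predicate. The sketch assumes that the playback time and the end of the media described in the current MPD are expressed in seconds on the same timeline; the function name and all parameter names are hypothetical.

```python
def should_request_mpd_update(playback_time: float,
                              described_media_end: float,
                              threshold: float,
                              presentation_ended: bool) -> bool:
    """Return True when the client should fetch a possibly updated MPD.

    described_media_end is the latest media time described by the current
    MPD for the representation being consumed (an assumed input).
    """
    if presentation_ended:
        # End of presentation reached: no further MPD updates are needed.
        return False
    # Request an update once the remaining described media is within the threshold.
    return described_media_end - playback_time <= threshold
```

A client following this sketch would call the predicate as playback advances and, when it returns True, fetch the MPD and record the new reception time T before regenerating its accessible segment lists.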

Accessible Segment List Generation

Assume that the HTTP streaming client has access to an MPD and may want to generate a list of segments that are accessible for a wall-clock time NOW. The client is synchronized to a global time reference with a certain precision, but advantageously no direct synchronization to the HTTP streaming server is required.

The accessible segment list for each representation is preferably defined as a list of pairs of a segment start time and a segment locator, where the segment start time may be defined relative to the start of the representation without loss of generality. The start of the representation may be aligned with the start of a period, if that concept is applied. Otherwise, the representation start may be at the start of the media presentation.

The client uses URL construction rules and timing as, for example, defined further herein. Once a list of described segments is obtained, this list is further restricted to the accessible ones, which may be a subset of the segments of the complete media presentation. The construction is governed by the current value of the clock at the client NOW time. Generally, segments are available only for any NOW time within a set of availability times. For NOW times outside this window, no segments are available. In addition, for live services, assume that some time, checktime, provides information on how far into the future the media is described. The checktime is defined on the MPD-documented media time axis; when the client's playback time reaches checktime, the client advantageously requests a new MPD.

Then, the segment list is further restricted by the checktime together with the MPD attribute TimeShiftBufferDepth, such that the only media segments available are those for which the sum of the segment start time and the representation start time falls in the interval between NOW minus TimeShiftBufferDepth minus the duration of the last described segment, and the smaller of checktime and NOW.
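The restriction of the segment list to the accessible window can be sketched in Python. The sketch assumes that NOW, checktime, and the representation start have already been converted to the same time axis in seconds, and that the interval is closed at both ends; the function name, the `Segment` alias, and the parameter names are hypothetical.

```python
from typing import List, Tuple

# (segment start time relative to the representation start, segment locator)
Segment = Tuple[float, str]


def accessible_segments(segments: List[Segment],
                        representation_start: float,
                        now: float,
                        check_time: float,
                        time_shift_buffer_depth: float,
                        last_segment_duration: float) -> List[Segment]:
    """Keep the segments whose absolute start time lies in the window
    [NOW - TimeShiftBufferDepth - last segment duration, min(checktime, NOW)]."""
    lower = now - time_shift_buffer_depth - last_segment_duration
    upper = min(check_time, now)
    return [(start, locator) for start, locator in segments
            if lower <= representation_start + start <= upper]
```

With 10-second segments, a representation starting at time 100, NOW equal to 135, a checktime of 150, and a TimeShiftBufferDepth of 20, the window is [105, 135], so only the segments starting at 110, 120, and 130 remain accessible.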

Scalable Blocks

Sometimes the available bandwidth falls so low that the block or blocks currently being received at the receiver become unlikely to be received completely in time to be played out without pausing the presentation. The receiver might detect such situations in advance. For example, the receiver might determine that it is receiving blocks encoding 5 units of media for every 6 units of time, and has a buffer of 4 units of media, so that the receiver might expect to have to stall, or pause, the presentation about 24 units of time later. With sufficient notice, the receiver can react to such a situation by, for example, abandoning the current stream of blocks and starting to request one or more blocks from a different representation of the content, such as one that uses less bandwidth per unit of playout time. For example, if the receiver switched to a representation in which blocks encode at least 20% more video time for the same block size, the receiver might be able to eliminate the need to stall until the bandwidth situation improves.
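The arithmetic behind the stall estimate can be made explicit with a short Python sketch. It assumes playout consumes one unit of media per unit of time, so receiving 5 units of media per 6 units of time drains the buffer at 1 - 5/6 = 1/6 units per unit of time, and a buffer of 4 units lasts 4 / (1/6) = 24 units of time. The function name and parameters are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def time_until_stall(buffered_media: int,
                     media_per_interval: int,
                     interval_length: int) -> Optional[Fraction]:
    """Time until the buffer empties, or None if it never drains.

    Assumes playout consumes exactly one unit of media per unit of time.
    """
    fill_rate = Fraction(media_per_interval, interval_length)
    drain_rate = 1 - fill_rate
    if drain_rate <= 0:
        # Media arrives at least as fast as it is played out.
        return None
    return Fraction(buffered_media) / drain_rate
```

The same model is consistent with the 20% figure in the text: blocks of the same size that encode 20% more video time deliver 6 units of media per 6 units of time, so the drain rate drops to zero and no stall is expected.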

However, it might be wasteful to have the receiver entirely discard the data already received from the abandoned representation. In an embodiment of the block-streaming system described herein, the data within each block can be encoded and arranged in such a way that certain prefixes of the data within the block can be used to continue the presentation without the remainder of the block having been received. For example, the well-known techniques of scalable video coding may be used. Examples of such video coding methods include H.264 Scalable Video Coding (SVC) and the temporal scalability of H.264 Advanced Video Coding (AVC). Advantageously, this method allows the presentation to continue based on the portion of a block that has been received, even when reception of a block or blocks is abandoned, for example due to changes in the available bandwidth. Another advantage is that a single data file may be used as the source for multiple different representations of the content. This is possible, for example, by making use of HTTP partial GET requests that select the subset of a block corresponding to the required representation.

One improvement detailed herein is an enhanced segment: the scalable segment map. The scalable segment map contains the locations of the different layers in the segment, so that the client can access the corresponding parts of the segments and extract the layers. In another embodiment, the media data in the segment is ordered such that the quality of the segment increases as data is downloaded progressively from the beginning of the segment. In another embodiment, the gradual increase in quality is applied to each block or fragment contained in the segment, so that fragment requests can be made to address the scalable approach.
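One way a client might use a scalable segment map is sketched below in Python. The map layout is an assumption made purely for illustration: a list of (layer number, byte offset, size) entries in which the layers are stored contiguously in increasing order, so that the lower layers form a prefix of the segment. Only the `bytes=first-last` form of the HTTP Range header value is standard; the function name and data layout are hypothetical.

```python
from typing import List, Tuple

# (layer number, byte offset within the segment, size in bytes)
LayerEntry = Tuple[int, int, int]


def range_header_for_layers(layer_map: List[LayerEntry], max_layer: int) -> str:
    """Build an HTTP Range header value covering layers 1..max_layer.

    Assumes the selected layers occupy one contiguous byte span.
    """
    wanted = [(offset, size) for layer, offset, size in layer_map
              if layer <= max_layer]
    if not wanted:
        raise ValueError("no layer at or below the requested level")
    first = min(offset for offset, _ in wanted)
    last = max(offset + size for offset, size in wanted) - 1  # inclusive end
    return f"bytes={first}-{last}"
```

A client that can only sustain two of three layers would then send the resulting value in the Range header of a partial GET request and decode the prefix it receives.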

FIG. 12 is a diagram illustrating an aspect of scalable blocks. In the figure, transmitter 1200 outputs metadata 1202, scalable layer 1 (1204), scalable layer 2 (1206) and scalable layer 3 (1208), the last of which is delayed. Receiver 1210 can then use metadata 1202, scalable layer 1 (1204) and scalable layer 2 (1206) to render media presentation 1212.

Independent scalability layers

As explained above, it is undesirable for a block-request streaming system to have to stall when the receiver cannot receive a requested block of a particular representation of the media data in time for playout, because this often creates a poor user experience. Stalls can be avoided, reduced or mitigated by limiting the data rate of the chosen representation to be much lower than the available bandwidth, so that it becomes very unlikely that any given portion of the presentation is not received in time. This strategy, however, has the disadvantage that the media quality is necessarily much lower than the available bandwidth could in principle support. A presentation of lower quality than is possible may also be perceived as a poor user experience. The designer of a block-request streaming system therefore faces a choice in the design of the client procedures, client programming or hardware configuration: either request a version of the content with a much lower data rate than the available bandwidth, in which case the user may suffer poor media quality, or request a version with a data rate close to the available bandwidth, in which case the user has a high probability of experiencing pauses during the presentation as the available bandwidth changes.

To handle such situations, the block-streaming system described herein may be configured to handle multiple scalability layers independently, so that the receiver can make layered requests and the transmitter can respond to layered requests.

In such embodiments, the encoded media data for each block may be partitioned into multiple disjoint pieces (referred to herein as "layers"), such that the combination of the layers comprises the whole of the media data for the block and such that a client that has received certain subsets of the layers can decode and present a representation of the content. In this approach, the data in the stream is ordered so that contiguous ranges increase in quality, and the metadata reflects this.

An example of a technique that can be used to generate layers with the above property is scalable video coding, as described for example in the ITU-T standard H.264/SVC. Another example is the temporal scalability layer technique provided in the ITU-T standard H.264/AVC.

In these embodiments, metadata may be provided in the MPD or in the segment itself that enables the construction of requests for individual layers of any given block, for combinations of layers, for a given layer of multiple blocks, and/or for a combination of layers of multiple blocks. For example, the layers comprising a block may be stored within a single file, and metadata may be provided specifying the byte ranges within the file that correspond to the individual layers.

A file download protocol capable of specifying byte ranges (for example HTTP 1.1) can be used to request individual layers or multiple layers. Furthermore, as will be clear to one of skill in the art upon reviewing this disclosure, the techniques described above concerning the construction, request and download of blocks of variable size and of variable combinations of blocks may be applied in this context as well.
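As an illustration of how per-layer byte-range metadata could be turned into a request, the following minimal sketch builds an HTTP/1.1 Range header that covers the first few layers of one block. The function name, the shape of the metadata (a list of inclusive byte ranges ordered by increasing quality) and the example offsets are assumptions made for illustration; the disclosure does not prescribe any of them.

```python
from typing import Dict, List, Tuple

# Hypothetical metadata for one block: the inclusive (first, last) byte
# range of each layer within a single file, in order of increasing quality.
LayerRanges = List[Tuple[int, int]]


def range_header_for_layers(layer_ranges: LayerRanges, num_layers: int) -> Dict[str, str]:
    """Build an HTTP/1.1 Range header covering the first `num_layers` layers."""
    if not 1 <= num_layers <= len(layer_ranges):
        raise ValueError("num_layers must be between 1 and the number of layers")
    merged: List[List[int]] = []
    for first, last in layer_ranges[:num_layers]:
        if merged and first == merged[-1][1] + 1:
            # This layer starts right after the previous one: extend the range.
            merged[-1][1] = last
        else:
            merged.append([first, last])
    spec = ",".join(f"{first}-{last}" for first, last in merged)
    return {"Range": f"bytes={spec}"}


# Example: three layers stored back to back in one file (offsets are made up).
example_ranges = [(0, 999), (1000, 2499), (2500, 3999)]
headers = range_header_for_layers(example_ranges, 2)
```

Because the layers in this example are stored contiguously in order of increasing quality, requesting layers 1 and 2 collapses into a single byte range, which matches the ordering property described above.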

Combinations

A number of embodiments are now described that may advantageously be employed by a block-request streaming client in order to achieve an improvement in user experience and/or a reduction in serving infrastructure capacity requirements compared to existing techniques, through the use of media data partitioned into layers as described above.

In a first embodiment, the known techniques of block-request streaming systems may be applied, with the modification that different versions of the content are in some cases replaced by different combinations of layers. In other words, where an existing system might provide two distinct representations of the content, the enhanced system described here might provide two layers, where one representation of the content in the existing system is similar in bit rate, quality and possibly other metrics to the first layer in the enhanced system, and the second representation in the existing system is similar in bit rate, quality and possibly other metrics to the combination of the two layers in the enhanced system. As a result, the storage capacity required within the enhanced system is reduced compared to that required in the existing system. Furthermore, whereas the clients of the existing system may issue requests for blocks of one representation or the other, clients of the enhanced system may issue requests for either the first layer or both layers of a block. The user experience in the two systems is therefore similar. In addition, caching is improved, because common segments are used even for different qualities, so the likelihood that a segment is already cached is higher.

In a second embodiment, a client in an enhanced block-request streaming system employing the method of layers now described may maintain a separate data buffer for each of several layers of the media encoding. As will be clear to those of skill in the art of data management within client devices, these "separate" buffers may be implemented by allocating physically or logically separate memory regions for the separate buffers, or by other techniques in which the buffered data is stored in a single or multiple memory regions and the separation of data from different layers is achieved logically through data structures containing references to the storage locations of data from the separate layers. In what follows, the term "separate buffers" should therefore be understood to include any method in which the data of the distinct layers can be separately identified. The client issues requests for individual layers of each block based on the occupancy of each buffer. For example, the layers may be ordered in a priority order such that a request for data from one layer is not issued if the occupancy of any buffer for a layer lower in the priority order is below a threshold for that lower layer. In this method, priority is given to receiving data from the layers lower in the priority order, so that if the available bandwidth falls below that required to also receive the layers higher in the priority order, only the lower layers are requested. Furthermore, the thresholds associated with the different layers may differ, so that, for example, the lower layers have higher thresholds. If the available bandwidth changes such that the data for a higher layer cannot be received before the playout time of the block, the data for the lower layers will necessarily already have been received, and so the presentation can continue with the lower layers alone. Thresholds for buffer occupancy may be defined in terms of bytes of data, playout duration of the data contained in the buffer, number of blocks, or any other suitable measure.
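The request rule of the second embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: layers are indexed from 0 (lowest in the priority order, requested first), occupancy and thresholds are given in seconds of playout, and the function names and example numbers are hypothetical.

```python
from typing import Optional, Sequence


def may_request(layer: int, occupancy: Sequence[float], thresholds: Sequence[float]) -> bool:
    """Return True if a request for `layer` may be issued.

    A request for a layer is held back whenever the buffer of any layer
    lower in the priority order is below that lower layer's threshold.
    """
    return all(occupancy[j] >= thresholds[j] for j in range(layer))


def next_layer(occupancy: Sequence[float], thresholds: Sequence[float]) -> Optional[int]:
    """Return the lowest layer whose own buffer is below its threshold.

    Returns None when every buffer has reached its threshold (an assumed
    policy: in that case no request is needed right now).
    """
    for layer, (have, need) in enumerate(zip(occupancy, thresholds)):
        if have < need:
            return layer
    return None


# Hypothetical thresholds: the lower layers get the higher thresholds.
thresholds_s = [10.0, 6.0, 4.0]
occupancy_s = [12.0, 2.0, 0.0]
layer_to_fetch = next_layer(occupancy_s, thresholds_s)  # layer 1 in this example
```

With these numbers the base layer is above its threshold, so layer 1 may be requested, while layer 2 is held back until layer 1 has also reached its threshold.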

In a third embodiment, the methods of the first and second embodiments may be combined, such that multiple media representations are provided, each comprising a subset of the layers (as in the first embodiment), and such that the second embodiment is applied to the subset of layers within a representation.

In a fourth embodiment, the methods of the first, second and/or third embodiments may be combined with embodiments in which multiple independent representations of the content are provided, such that, for example, at least one of the independent representations comprises multiple layers to which the techniques of the first, second and/or third embodiments are applied.

Advanced buffer manager

In combination with buffer monitor 126 (see FIG. 2), an advanced buffer manager can be used to optimize the client-side buffer. A block-request streaming system wants to ensure that media playout can start quickly and continue smoothly, while at the same time providing the maximum media quality to the user or destination device. This may require that the client request blocks that have the highest media quality but that can also be started quickly and thereafter received in time to be played out without forcing a pause in the presentation.

In embodiments that use the advanced buffer manager, the manager determines which blocks of media data to request and when to make those requests. The advanced buffer manager might, for example, be provided with a set of metadata for the content to be presented, this metadata including a list of the representations available for the content and metadata for each representation. Metadata for a representation may comprise information about the data rate of the representation and other parameters, such as video, audio or other codecs and codec parameters, video resolution, decoding complexity, audio language and any other parameters that might affect the choice of representation at the client.

Metadata for a representation may also include identifiers for the blocks into which the representation has been segmented, these identifiers providing the information the client needs to request a block. For example, where the request protocol is HTTP, the identifier might be an HTTP URL, possibly together with additional information identifying a byte range or time span within the file identified by the URL, this byte range or time span identifying the specific block within that file.

In a specific implementation, the advanced buffer manager determines when the receiver makes a request for new blocks and may itself handle the sending of the requests. In a novel aspect, the advanced buffer manager makes requests for new blocks according to the value of a balancing ratio that balances between using too much bandwidth and running out of media during a streaming playout.

The information received by buffer monitor 126 from block buffer 125 can include indications of each event of when media data is received, how much has been received, when playout of media data has started or stopped, and the speed of the media playout. Based on this information, buffer monitor 126 may calculate a variable B_current representing the current buffer size. In these examples, B_current represents the amount of media contained in the buffer or buffers of the client or other device(s) and may be measured in units of time, so that B_current represents the amount of time it would take to play out all of the media represented by the blocks or partial blocks stored in the buffer or buffers if no further blocks or partial blocks were received. B_current thus represents the "playout duration", at normal playout speed, of the media data available at the client but not yet played.

As time passes, the value of B_current decreases as media is played out and may increase each time new data for a block is received. Note that, for the purposes of this explanation, a block is assumed to be received when the entire data of that block is available at block requester 124, but other measures might be used instead, for example to take into account the reception of partial blocks. In practice, reception of a block may take place over a period of time.

FIG. 13 illustrates how the value of B_current varies over time as media is played out and blocks are received. As shown in FIG. 13, the value of B_current is zero for times less than t_0, indicating that no data has been received. At t_0, the first block is received and the value of B_current increases to equal the playout duration of the received block. At this time, playout has not yet begun, so the value of B_current remains constant until time t_1, when a second block arrives and B_current increases by the size of this second block. At this time, playout begins and the value of B_current begins to decrease linearly until time t_2, when a third block arrives.

The progression of B_current continues in this "sawtooth" manner, increasing stepwise each time a block is received (at times t_2, t_3, t_4, t_5 and t_6) and decreasing smoothly in between as data is played out. Note that in this example playout proceeds at the normal playout rate for the content, so the slope of the curve between block receptions is exactly -1, meaning that one second of media data is played for each second of real time that passes. With frame-based media played out at a given number of frames per second (for example 24 frames per second), the slope of -1 is approximated by small step functions that indicate the playout of each individual frame of data, for example steps of -1/24 of a second as each frame is played out.
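The sawtooth behaviour of the buffer level can be reproduced with a small event-driven calculation. The sketch below is illustrative only: the function name, the representation of arrivals as (arrival time, playout duration) pairs and the example numbers are assumptions, playout is assumed to run at normal speed, and a stalled playout is assumed to resume as soon as the next block arrives.

```python
from typing import List, Tuple


def b_current_at(t: float, arrivals: List[Tuple[float, float]], playout_start: float) -> float:
    """Buffered playout time (seconds) at time t.

    arrivals: (arrival_time, playout_duration) pairs sorted by arrival time;
    each block is counted as received in full at its arrival time.
    playout_start: time at which playout begins.
    """
    buffered = 0.0
    clock = playout_start  # playout has been accounted for up to this time
    for when, duration in arrivals:
        if when > t:
            break
        if when > clock:
            # Drain at slope -1 between events, never going below zero.
            buffered = max(0.0, buffered - (when - clock))
            clock = when
        buffered += duration  # step increase when the block is received
    if t > clock:
        buffered = max(0.0, buffered - (t - clock))
    return buffered


# Hypothetical timeline in the style of FIG. 13: two-second blocks arrive
# at t = 0, 1 and 2.5, and playout starts when the second block arrives.
timeline = [(0.0, 2.0), (1.0, 2.0), (2.5, 2.0)]
level = b_current_at(2.0, timeline, playout_start=1.0)  # 3.0 seconds buffered
```

Evaluating the function at closely spaced times traces the step-up, ramp-down curve described above; clamping at zero corresponds to the stall case in which no media remains to be played.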

FIG. 14 illustrates another example of the evolution of B_current over time. In this example, the first block arrives at t_0 and playout begins immediately. Block arrival and playout continue until time t_3, at which point the value of B_current reaches zero. When this happens, no further media data is available for playout, forcing the media presentation to pause. At time t_4, a fourth block is received and playout can resume. This example therefore shows a case in which reception of the fourth block was later than desired, resulting in a pause in playout and thus a poor user experience. A goal of the advanced buffer manager and other features is therefore to reduce the probability of this event while simultaneously maintaining high media quality.

Buffer monitor 126 may also calculate another metric, B_ratio(t), which is the ratio of the media received in a given period of time to the length of that period. More specifically, B_ratio(t) is equal to T_received / (T_now - t), where T_received is the amount of media (measured by its playout time) received in the period from t until the current time T_now, and t is some time earlier than the current time.

B_ratio(t) can be used to measure the rate of change of B_current. B_ratio(t) = 0 is the case where no data has been received since time t; assuming media is being played out, B_current will have decreased by (T_now - t) since that time. B_ratio(t) = 1 is the case where, over the time (T_now - t), media has been received in the same amount as it has been played out; B_current will have the same value at time T_now as at time t. B_ratio(t) > 1 is the case where more data has been received than is needed for playout over the time (T_now - t); B_current will have increased from time t to time T_now.
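The definition B_ratio(t) = T_received / (T_now - t) can be written down directly. In the minimal sketch below, the function name and the representation of received media as (arrival time, playout duration) pairs are assumptions made for illustration.

```python
from typing import List, Tuple


def b_ratio(received: List[Tuple[float, float]], t: float, t_now: float) -> float:
    """Media received in the window (t, t_now], as playout seconds per second.

    received: (arrival_time, playout_duration) pairs for received media.
    """
    if t >= t_now:
        raise ValueError("t must be earlier than t_now")
    t_received = sum(duration for when, duration in received if t < when <= t_now)
    return t_received / (t_now - t)


# Hypothetical example: two 2-second blocks received within an 8-second window.
ratio = b_ratio([(1.0, 2.0), (3.0, 2.0)], t=0.0, t_now=8.0)  # 4 / 8 = 0.5
```

If playout runs at normal speed throughout the window, the three cases above are summarized by the change in buffer level being (B_ratio(t) - 1) * (T_now - t): negative for a ratio below 1, zero at exactly 1, and positive above 1. In the example, the buffer would have shrunk by 4 seconds.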

Buffer monitor 126 further calculates a value State, which may take a discrete number of values. Buffer monitor 126 is further equipped with a function NewState(B_current, B_ratio) which, given the current value of B_current and values of B_ratio for t < T_now, provides a new State value as output. Whenever B_current and B_ratio cause this function to return a value different from the current value of State, the new value is assigned to State and this new State value is indicated to block selector 123.

The function NewState may be evaluated with reference to the space of all possible values of the pair (B_current, B_ratio(T_now - T_x)), where T_x may be a fixed (configured) value, or may be derived from B_current, for example by a configuration table that maps from values of B_current to values of T_x, or may depend on the previous value of State. Buffer monitor 126 is supplied with one or more partitionings of this space, where each partitioning comprises a set of disjoint regions, each region being annotated with a State value. Evaluating the function NewState then comprises the operation of identifying a partitioning and determining the region into which the pair (B_current, B_ratio(T_now - T_x)) falls. The return value is then the annotation associated with that region. In a simple case, only one partitioning is provided. In a more complex case, the partitioning may depend on the pair (B_current, B_ratio(T_now - T_x)) at the previous evaluation of the NewState function, or on other factors.

In a specific embodiment, the partitioning described above may be based on a configuration table containing a number of threshold values for B_current and a number of threshold values for B_ratio. Specifically, let the threshold values for B_current be B_thresh(0) = 0, B_thresh(1), ..., B_thresh(n_1), B_thresh(n_1 + 1) = ∞, where n_1 is the number of non-zero threshold values for B_current. Let the threshold values for B_ratio be B_ratio-thresh(0) = 0, B_ratio-thresh(1), ..., B_ratio-thresh(n_2), B_ratio-thresh(n_2 + 1) = ∞, where n_2 is the number of threshold values for B_ratio. These threshold values define a partitioning comprising a grid of (n_1 + 1) × (n_2 + 1) cells, where the i-th cell of the j-th row corresponds to the region in which B_thresh(i - 1) <= B_current < B_thresh(i) and B_ratio-thresh(j - 1) <= B_ratio < B_ratio-thresh(j). Each cell of the grid described above is annotated with a State value, for example by being associated with a particular value stored in memory, and the function NewState then returns the State value associated with the cell indicated by the values B_current and B_ratio(T_now - T_x).
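The threshold-grid lookup described above maps naturally onto a binary search in each dimension. The following sketch is one possible realization, not the disclosed one; the threshold values and the labelling of the cells are invented for illustration (they are not the values of FIG. 15), and the lists hold only the finite non-zero thresholds, with 0 and infinity implied at the ends.

```python
from bisect import bisect_right
from typing import List, Sequence

# Hypothetical configuration table (illustrative values only).
B_THRESH = [2.0, 10.0]   # seconds of buffered media
R_THRESH = [0.8, 1.5]    # thresholds for the ratio metric
LABELS = [               # LABELS[j][i]: row j = ratio band, column i = buffer band
    ["low", "low", "stable"],
    ["low", "stable", "full"],
    ["stable", "full", "full"],
]


def new_state(b_current: float, b_ratio: float,
              b_thresh: Sequence[float] = B_THRESH,
              r_thresh: Sequence[float] = R_THRESH,
              labels: List[List[str]] = LABELS) -> str:
    """Return the state label of the grid cell containing (b_current, b_ratio).

    bisect_right counts the thresholds that are <= the value, which yields
    the index of the half-open interval [lower, upper) holding the value.
    """
    i = bisect_right(b_thresh, b_current)
    j = bisect_right(r_thresh, b_ratio)
    return labels[j][i]


state = new_state(1.0, 2.0)  # small buffer but fast arrival: "stable" here
```

Using bisect_right makes each interval closed at its lower threshold and open at its upper threshold, matching the inequalities that define the cells.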

In another embodiment, a hysteresis value may be associated with each threshold value. In this enhanced method, the evaluation of the function NewState may be based on a temporary partitioning constructed using a set of temporarily modified threshold values, as follows. Each B_current threshold value that is less than the B_current range corresponding to the cell chosen at the last evaluation of NewState is reduced by subtracting the hysteresis value associated with that threshold. Each B_current threshold value that is greater than that B_current range is increased by adding the hysteresis value associated with that threshold. Likewise, each B_ratio threshold value that is less than the B_ratio range corresponding to the cell chosen at the last evaluation of NewState is reduced by subtracting the hysteresis value associated with that threshold, and each B_ratio threshold value that is greater than that B_ratio range is increased by adding the hysteresis value associated with that threshold. The modified threshold values are used to evaluate the value of NewState, and the threshold values are then returned to their original values.
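The hysteresis rule can be sketched for a single dimension; the same helper applies unchanged to the buffer-level thresholds and to the ratio thresholds. This is a hedged illustration: the function names and numbers are hypothetical, the threshold list holds only the finite non-zero thresholds, and the hysteresis values are assumed small enough that the shifted thresholds stay in ascending order.

```python
from bisect import bisect_right
from typing import List, Sequence


def shifted_thresholds(thresholds: Sequence[float], hysteresis: Sequence[float],
                       prev_index: int) -> List[float]:
    """Temporarily modified thresholds for one dimension.

    prev_index is the interval index chosen at the previous evaluation.
    Thresholds bounding that interval from below (list positions before
    prev_index) are lowered; those bounding it from above are raised.
    """
    return [th - h if k < prev_index else th + h
            for k, (th, h) in enumerate(zip(thresholds, hysteresis))]


def locate(value: float, thresholds: Sequence[float], hysteresis: Sequence[float],
           prev_index: int) -> int:
    """Interval index of value under the temporarily shifted thresholds."""
    return bisect_right(shifted_thresholds(thresholds, hysteresis, prev_index), value)


# Hypothetical buffer-level thresholds (seconds) and hysteresis values.
b_thresh = [2.0, 10.0]
b_hyst = [0.5, 1.0]
index = locate(1.8, b_thresh, b_hyst, prev_index=1)  # stays in interval 1
```

In the example, a buffer level of 1.8 s would fall into the lowest interval under the unmodified thresholds, but because the previous evaluation chose the middle interval, its lower boundary is temporarily moved to 1.5 s and the interval is retained. This widening of the current cell is what keeps the state from oscillating when the metrics hover near a threshold.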

Other ways of defining partitionings of the space will become apparent to those of skill in the art upon reading this disclosure. For example, a partitioning may be defined through inequalities based on linear combinations of B_ratio and B_current, for example linear inequality thresholds of the form α1·B_ratio + α2·B_current ≤ α0 for real-valued α0, α1 and α2, in order to define half-spaces within the overall space and to define each disjoint set as the intersection of a number of such half-spaces.
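A region defined as an intersection of half-spaces can be tested with a few lines of code. The sketch below is illustrative; the function names, the tuple layout (α0, α1, α2) and the example coefficients are assumptions.

```python
from typing import Iterable, Tuple

# One half-space: alpha1 * b_ratio + alpha2 * b_current <= alpha0,
# stored as the tuple (alpha0, alpha1, alpha2).
HalfSpace = Tuple[float, float, float]


def in_half_space(b_ratio: float, b_current: float, half_space: HalfSpace) -> bool:
    """True if the point satisfies the linear inequality of the half-space."""
    alpha0, alpha1, alpha2 = half_space
    return alpha1 * b_ratio + alpha2 * b_current <= alpha0


def in_region(b_ratio: float, b_current: float, half_spaces: Iterable[HalfSpace]) -> bool:
    """True if the point lies in the intersection of all given half-spaces."""
    return all(in_half_space(b_ratio, b_current, hs) for hs in half_spaces)


# Hypothetical region: b_ratio + 0.125 * b_current <= 1.5 and b_current <= 8.
low_region = [(1.5, 1.0, 0.125), (8.0, 0.0, 1.0)]
inside = in_region(0.5, 4.0, low_region)  # 0.5 + 0.5 = 1.0 <= 1.5, and 4 <= 8
```

A slanted boundary of this kind lets a high arrival ratio compensate for a small buffer, and the reverse, which a rectangular grid can only approximate.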

The above description is illustrative of the basic process. As will be clear to those of skill in the art of real-time programming upon reading this disclosure, efficient implementations are possible. For example, each time new information is provided to buffer monitor 126, it is possible to calculate the future time at which NewState will transition to a new value if, for example, no further data for blocks is received. A timer is then set for this time and, in the absence of further inputs, expiry of this timer will cause the new State value to be sent to block selector 123. As a result, computations need only be performed when new information is provided to buffer monitor 126 or when a timer expires, rather than continuously.

Suitable values of State could be "Low", "Stable" and "Full". An example of a suitable set of threshold values and the resulting grid of cells is shown in FIG. 15.

In FIG. 15, the B_current thresholds are shown on the horizontal axis in milliseconds, with the hysteresis values shown below them as "+/-value". The B_ratio thresholds are shown on the vertical axis in permille (that is, multiplied by 1000), with the hysteresis values shown below them as "+/-value". The State values "Low", "Stable" and "Full" are annotated in the grid cells as "L", "S" and "F" respectively.

Block selector 123 receives a notification from block requester 124 whenever there is an opportunity to request a new block. As described above, block selector 123 is provided with information about the plurality of blocks available and metadata for those blocks, including for example information about the media data rate of each block.

Information about the media data rate of a block may comprise the actual media data rate of the specific block (that is, the block size in bytes divided by the playout time in seconds), the average media data rate of the representation to which the block belongs, a measure of the available bandwidth required on a sustained basis in order to play out the representation to which the block belongs without pauses, or a combination of the above.

Block selector 123 selects blocks based on the State value last indicated by buffer monitor 126. When this State value is "Stable", block selector 123 selects a block from the same representation as the previously selected block. The block selected is the first block (in playout order) containing media data for a time period of the presentation for which no media data has previously been requested.

When this State value is "Low", block selector 123 selects a block from a representation with a lower media data rate than that of the previously selected block. A number of factors can influence the exact choice of representation in this case. For example, block selector 123 might be provided with an indication of the aggregate rate of incoming data and may choose a representation with a media data rate that is less than this value.

When this State value is "Full", block selector 123 selects a block from a representation with a higher media data rate than that of the previously selected block. A number of factors can influence the exact choice of representation in this case. For example, block selector 123 may be provided with an indication of the aggregate rate of incoming data and may choose a representation with a media data rate that is not more than this value.

A number of additional factors may further influence the operation of block selector 123. In particular, the frequency with which the media data rate of the selected block is increased may be limited, even if buffer monitor 126 continues to indicate the "Full" state. Furthermore, block selector 123 may receive a "Full" state indication when no block of higher media data rate is available (for example because the last selected block was already at the highest available media data rate). In this case, block selector 123 may delay the selection of the next block by a time chosen such that the overall amount of media data buffered in block buffer 125 is bounded above.
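The state-driven choice of representation can be sketched as below. The source leaves the exact choice open, so several policies here are assumptions made for illustration: in the "Low" state the highest rate below both the current rate and the incoming rate is taken, falling back to the lowest available rate; in the "Full" state the highest rate that is above the current rate and not above the incoming rate is taken; and the current rate is kept when no suitable representation exists. Rate limiting of upward switches and the delayed selection described above are not modelled. Function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def select_rate(state: str, current_rate: int, rates: Sequence[int],
                incoming_rate: Optional[int] = None) -> int:
    """Choose the media data rate of the representation for the next block.

    rates: media data rates of the available representations (same unit
    as current_rate and incoming_rate, for example kbit/s).
    """
    if state == "Low":
        lower = [r for r in rates if r < current_rate]
        if not lower:
            return current_rate          # already at the lowest rate
        if incoming_rate is not None:
            fitting = [r for r in lower if r < incoming_rate]
            # Assumed fallback: drop to the lowest rate if nothing fits.
            return max(fitting) if fitting else min(lower)
        return max(lower)
    if state == "Full":
        higher = [r for r in rates if r > current_rate]
        if incoming_rate is not None:
            higher = [r for r in higher if r <= incoming_rate]
        return max(higher) if higher else current_rate
    return current_rate                  # "Stable": same representation


# Hypothetical ladder of representations in kbit/s.
ladder = [500, 1000, 2000, 4000]
chosen = select_rate("Low", 2000, ladder, incoming_rate=1200)  # picks 1000
```

The next block requested is then the first block, in playout order, of the chosen representation that covers media time not yet requested.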

Additional factors may influence the set of blocks that are considered during the selection process. For example, the available blocks may be limited to those from representations whose encoding resolution falls within a specific range provided to block selector 123.

Block selector 123 may also receive input from other components that monitor other aspects of the system, such as the availability of computational resources for media decoding. If such resources become scarce, block selector 123 may choose blocks whose decoding is indicated within the metadata to be of lower computational complexity (for example, representations with lower resolution or frame rate generally have lower decoding complexity).

A substantial advantage of the embodiment described above is that using the value B_ratio when evaluating the NewState function within buffer monitor 126 allows quality to rise more quickly at the start of the presentation than a method that considers only B_current. Without considering B_ratio, a large amount of buffered data may have to accumulate before the system can select blocks with a higher media data rate and hence higher quality. However, when B_ratio is large, this indicates that the available bandwidth is much higher than the media data rate of the previously received blocks, so that even with relatively little buffered data (that is, a low value of B_current) it remains safe to request blocks of higher media data rate and hence higher quality. Equally, if B_ratio is low (for example, <1), this indicates that the available bandwidth has dropped below the media data rate of the previously requested blocks, and so even if B_current is high the system will switch to a lower media data rate and hence lower quality, for example to avoid reaching the point where B_current = 0 and playout of the media stalls. This improved behavior may be especially important in environments where network conditions, and therefore delivery speeds, may vary quickly and dynamically, for example when users stream to mobile devices.

Another advantage comes from using configuration data to specify the partition of the space of values of (B_current, B_ratio). Such configuration data can be provided to buffer monitor 126 as part of the presentation metadata or by other dynamic means. In practical deployments, because the behavior of user network connections can be highly variable between users and over time for a single user, it may be difficult to predict partitions that will work well for all users. The possibility of providing such configuration information to users dynamically allows good configuration settings to be developed over time according to accumulated experience.

Variable request size control

If each request is for a single block and each block encodes a short media segment, a high frequency of requests may be required. If the media blocks are short, video playout moves rapidly from block to block, which gives the receiver more frequent opportunities to adjust or change its selected data rate by changing representation, improving the likelihood that playout can continue without stalling. A downside of a high frequency of requests, however, is that it may not be sustainable on certain networks in which the available bandwidth from client to server is constrained, for example wireless WAN networks such as 3G and 4G wireless WANs, where the capacity of the data link from the client to the network is limited or can become limited for short or long periods because of changes in radio conditions.

A high frequency of requests also implies a high load on the serving infrastructure, which brings associated costs in terms of capacity requirements. It would therefore be desirable to have some of the benefits of a high request frequency without all of its disadvantages.

Some embodiments of a block-streaming system combine the flexibility of a high request frequency with less frequent requests. In this embodiment, blocks may be constructed as described above and, also as described above, aggregated into segments containing multiple blocks. At the beginning of the presentation, the processes described above, in which each request references a single block or multiple concurrent requests are made for parts of a block, are applied to ensure a fast channel zapping time and therefore a good user experience at the start of the presentation. Subsequently, when a certain condition described below is met, the client may issue requests that each cover multiple blocks. This is possible because the blocks have been aggregated into larger files or segments and can be requested using byte or time ranges. Consecutive byte or time ranges can be aggregated into a single larger byte or time range, so that a single request covers multiple blocks; even non-contiguous blocks can be requested in one request.

One basic configuration for deciding whether to request a single block (or partial block) or multiple consecutive blocks bases the decision on whether the requested blocks are likely to be played out. For example, if it is likely that a change to another representation will soon be needed, the client is better off making requests for single blocks, that is, for small amounts of media data. One reason is that if a request for multiple blocks is made when a switch to another representation may be imminent, the switch may be made before the last few blocks of the request are played out. Downloading those last few blocks may then delay delivery of media data of the representation switched to, which could cause playout to stall.

However, requests for single blocks do result in a higher frequency of requests. On the other hand, if it is unlikely that a change to another representation will soon be needed, it may be preferable to make requests for multiple blocks, since all of those blocks are likely to be played out. This results in a lower frequency of requests, which can substantially reduce request overhead, particularly in the typical case where no change of representation is imminent.

In conventional block aggregation systems, the amount requested in each request is not adjusted dynamically: typically either each request is for an entire file, or each request is for approximately the same amount of the file of a representation (sometimes measured in time, sometimes in bytes). Thus, if all requests are small, the request overhead is high, whereas if all requests are large, this increases the chance of media stall events, and/or of lower-quality playout where a lower-quality representation is chosen to avoid having to change representation quickly as network conditions vary.

An example of a condition which, when met, may cause subsequent requests to reference multiple blocks is a threshold on the buffer size B_current. If B_current is below the threshold, each request issued references a single block. If B_current is greater than or equal to the threshold, each request issued references multiple blocks. If a request referencing multiple blocks is issued, the number of blocks requested in each single request may be determined in one of several ways. For example, the number may be a constant, such as two. Alternatively, the number of blocks requested in a single request may depend on the buffer state and in particular on B_current. For example, a number of thresholds may be set, with the number of blocks requested in a single request derived from the highest of those thresholds that is less than B_current.
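The multi-threshold rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `blocks_per_request`, the threshold table, and the block counts are hypothetical and do not come from the specification, which leaves the mapping from threshold to block count open.

```python
def blocks_per_request(b_current, thresholds):
    """Return how many blocks one request should cover.

    thresholds: list of (threshold_seconds, block_count) pairs sorted by
    ascending threshold (hypothetical configuration values).
    The count tied to the highest threshold strictly below b_current wins;
    if b_current is below every threshold, a single block is requested.
    """
    count = 1
    for threshold, block_count in thresholds:
        if threshold < b_current:
            count = block_count
        else:
            break
    return count


# Hypothetical configuration: more buffered media allows larger requests.
THRESHOLDS = [(4.0, 2), (8.0, 3), (16.0, 5)]
```

With this table, a client holding 10 seconds of buffered media would request three blocks at a time, while a client holding 3 seconds would fall back to single-block requests.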

Another example of a condition which, when met, may cause requests to reference multiple blocks is the "state" value variable described above. For example, when the state is "stable" or "full", requests may be issued for multiple blocks, but when the state is "low", all requests may be for one block.

Another embodiment is illustrated in FIG. 16. In this embodiment, when the next request is to be issued (determined in step 1300), the current state value and B_current are used to determine the size of the next request. If the current state value is "low", or the current state value is "full" and the current representation is not the highest available representation (determined in step 1310, answer "yes"), the next request is chosen to be short, for example for just the next block (block determined and request made in step 1320). The rationale is that these are conditions under which a change of representation is likely quite soon. If the current state value is "stable", or the current state value is "full" and the current representation is the highest available representation (determined in step 1310, answer "no"), the duration of the consecutive blocks requested in the next request is chosen to be proportional to an α-fraction of B_current for some fixed α < 1 (blocks determined in step 1330, request made in step 1340). For example, for α = 0.4, if B_current = 5 seconds the next request might be for approximately 2 seconds of blocks, and if B_current = 10 seconds the next request might be for approximately 4 seconds of blocks. One rationale is that under these conditions a switch to a new representation is unlikely within an amount of time proportional to B_current.
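The decision of FIG. 16 can be sketched as a small function. The names `next_request_duration`, `at_highest_representation`, and `block_duration` are hypothetical, and taking the proportionality constant as exactly α, with a floor of one block, is an assumption made for this illustration rather than something the text prescribes.

```python
def next_request_duration(state, at_highest_representation, b_current,
                          block_duration, alpha=0.4):
    """Return the media duration (seconds) the next request should cover.

    state: "low", "stable" or "full", as reported by the buffer monitor.
    at_highest_representation: True if no higher-rate representation exists.
    b_current: buffered media, in seconds.
    block_duration: duration of one block, in seconds (assumed constant here).
    """
    # Step 1310: a representation change is likely soon, so stay short.
    if state == "low" or (state == "full" and not at_highest_representation):
        return block_duration
    # Steps 1330/1340: request roughly an alpha-fraction of the buffer,
    # but never less than one block (the floor is an assumption).
    return max(block_duration, alpha * b_current)
```

For α = 0.4 this reproduces the figures in the text: about 2 seconds of blocks when B_current is 5 seconds and about 4 seconds when B_current is 10 seconds.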

Flexible pipelining

Block-streaming systems may use a file request protocol that has a particular underlying transport protocol, for example TCP/IP. At the beginning of a TCP/IP or other transport protocol connection, it may take considerable time to achieve utilization of the full available bandwidth. This may result in a "connection startup penalty" every time a new connection is started. For example, in the case of TCP/IP, the connection startup penalty arises from both the time taken by the initial TCP handshake to establish the connection and the time taken by the congestion control protocol to achieve full utilization of the available bandwidth.

In this case, it may be desirable to issue multiple requests using a single connection, to reduce the frequency with which the connection startup penalty is incurred. However, some file transfer protocols, for example HTTP, do not provide a mechanism to cancel a request other than closing the transport layer connection altogether, and therefore incur a connection startup penalty when a new connection is established in place of the old one. An issued request may need to be cancelled if it is determined that the available bandwidth has changed and a different media data rate is required instead, that is, if there is a decision to switch to a different representation. Another reason for cancelling an issued request may be that the user has requested that the media presentation be ended and a new presentation begun (perhaps the same content item at a different point in the presentation, or perhaps a new content item).

As is known, the connection startup penalty can be avoided by keeping the connection open and reusing it for subsequent requests and, as is also known, the connection can be kept fully utilized if multiple requests are issued at the same time on the same connection (a technique known as "pipelining" in the context of HTTP). However, a disadvantage of issuing multiple requests at the same time, or more generally in such a way that requests are issued on a connection before previous requests have completed, may be that the connection is then committed to carrying the responses to those requests. If it becomes desirable to change which requests should be issued, the connection may have to be closed in order to cancel requests that have been issued but are no longer wanted.

The probability that an issued request needs to be cancelled may depend in part on the duration of the time interval between issuing the request and the playout time of the requested block, in the sense that when this interval is large, the probability that the issued request needs to be cancelled is also high (because the available bandwidth is likely to change during the interval).

As is known, some file download protocols have the property that a single underlying transport layer connection can advantageously be used for multiple download requests. HTTP, for example, has this property, since reusing a single connection for multiple requests avoids the "connection startup penalty" described above for all requests but the first. However, a disadvantage of this approach is that the connection is committed to transporting the data requested in each issued request. Therefore, if one or more requests need to be cancelled, either the connection may be closed, incurring the connection startup penalty when a replacement connection is established, or the client may wait to receive data that is no longer needed, incurring a delay in receiving subsequent data.

An embodiment is now described that retains the advantages of connection reuse without incurring this disadvantage and that additionally increases the frequency with which connections can be reused.

The embodiments of block-streaming systems described herein are configured to reuse a connection for multiple requests without having to commit the connection at the outset to a particular set of requests. Essentially, a new request is issued on an existing connection when already issued requests on that connection have not yet completed but are close to completion. One reason for not waiting until the existing requests complete is that, if the previous requests complete, the connection speed could be degraded: the underlying TCP session might enter an idle state, or the TCP cwnd variable might be substantially reduced, thereby substantially reducing the initial download speed of a new request issued on that connection. One reason for waiting until close to completion before issuing an additional request is that if a new request is issued long before previous requests complete, the newly issued request may not even commence for a substantial period of time, and during that period the decision to make the new request may cease to be valid, for example because of a decision to switch representations. Thus, embodiments of clients implementing this technique issue a new request on a connection as late as possible without slowing down the download capabilities of the connection.

The method comprises monitoring the number of bytes received on a connection in response to the latest request issued on that connection and applying a test to this number. This can be done by configuring the receiver (or the transmitter, if applicable) to perform the monitoring and testing.

If the test passes, a further request may be issued on the connection. One example of a suitable test is whether the number of bytes received is greater than a fixed fraction of the size of the data requested. For example, this fraction could be 80%. Another example of a suitable test is based on the following calculation, as illustrated in FIG. 17. In the calculation, let R be an estimate of the data rate of the connection, T be an estimate of the round trip time ("RTT"), and X be a numeric factor that could, for example, be a constant set to a value between 0.5 and 2, where the estimates of R and T are updated on a regular basis (updated in step 1410). Let S be the size of the data requested in the latest request and B be the number of bytes of the requested data received so far (calculated in step 1420).

A suitable test is to have the receiver (or the transmitter, if applicable) execute a routine that evaluates the inequality (S - B) < X · R · T (tested in step 1430) and takes an action if the answer is "yes". For example, a test can be made to see whether another request is ready to be issued on the connection (tested in step 1440); if "yes", that request is issued on the connection (step 1450), and if "no", the process returns to step 1410 to continue updating and testing. If the result of the test in step 1430 is "no", the process returns to step 1410 to continue updating and testing.
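The test of step 1430 reduces to a one-line comparison. The sketch below is illustrative only: the function name `ready_for_next_request` is hypothetical, and it assumes that S and B are counted in bytes and that R is expressed in bytes per second so that both sides of the inequality share a unit.

```python
def ready_for_next_request(s, b, r, t, x=1.0):
    """Evaluate the step 1430 inequality (S - B) < X * R * T.

    s: size of the data requested in the latest request (bytes)
    b: bytes of that request received so far
    r: estimated data rate of the connection (bytes per second, assumed unit)
    t: estimated round trip time (seconds)
    x: tuning factor, for example a constant between 0.5 and 2
    """
    return (s - b) < x * r * t
```

For example, with R = 500,000 bytes per second and T = 0.125 seconds, one RTT carries 62,500 bytes. With X = 1 the next request is released once fewer than 62,500 bytes of the current response remain outstanding; with X = 2 it is released twice as early.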

The inequality test in step 1430 (performed, for example, by appropriately programmed elements) causes each subsequent request to be issued when the amount of data remaining to be received equals X times the amount of data that can be received within one RTT at the currently estimated reception rate. Several methods of estimating the data rate R in step 1410 are known in the art. For example, the data rate may be estimated as Dt/t, where Dt is the number of bits received in the preceding t seconds and t may be, for example, 1 second, 0.5 seconds, or some other interval. Another method is an exponentially weighted average, or first-order infinite impulse response (IIR) filter, of the incoming data rate. Several methods of estimating the RTT, T, in step 1410 are known in the art.
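Both rate estimates mentioned above can be sketched together: each sample is the windowed estimate Dt/t, and successive samples are smoothed with a first-order IIR filter. The class name `RateEstimator` and the smoothing weight are hypothetical choices for this illustration; the text does not specify a weight.

```python
class RateEstimator:
    """Exponentially weighted average of windowed rate samples Dt / t."""

    def __init__(self, weight=0.25):
        # weight: share given to the newest sample (hypothetical default).
        self.weight = weight
        self.rate = None  # no estimate until the first sample arrives

    def update(self, bits_received, interval_seconds):
        """Fold in one measurement window and return the new estimate."""
        sample = bits_received / interval_seconds  # Dt / t
        if self.rate is None:
            self.rate = sample
        else:
            # First-order IIR: new = (1 - w) * old + w * sample
            self.rate = (1 - self.weight) * self.rate + self.weight * sample
        return self.rate
```

A larger weight tracks bandwidth changes faster at the cost of a noisier estimate, which in turn makes the step 1430 test fire earlier or later from one window to the next.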

The test in step 1430 can be applied to the aggregate of all the active connections on an interface, as explained in more detail below.

The method further comprises constructing a candidate request list, associating each candidate request with a set of suitable servers to which the request can be made, and ordering the candidate request list in order of priority. Some entries in the candidate request list may have the same priority. The servers in the list of suitable servers associated with each candidate request are identified by hostnames. Each hostname corresponds to a set of Internet Protocol addresses that can be obtained from the Domain Name System, as is well known. Each possible request on the candidate request list is therefore associated with a set of Internet Protocol addresses, specifically the union of the sets of Internet Protocol addresses associated with the hostnames of the servers associated with the candidate request. Whenever a connection passes the test described in step 1430 and no new request has yet been issued on that connection, the highest-priority request on the candidate request list that is associated with the Internet Protocol address of the connection's destination is chosen and issued on that connection. The request is also removed from the candidate request list.
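The selection rule above can be sketched as follows. The `Candidate` type, the `pick_request` function, the convention that a lower numeric value means a higher priority, and the example addresses (taken from the documentation range 192.0.2.0/24) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    priority: int          # lower value = higher priority (assumed convention)
    addresses: frozenset   # union of IP addresses able to serve this request


def pick_request(candidates, connection_address):
    """Pop and return the best candidate that the connection can carry.

    candidates: mutable list of Candidate objects (the candidate request list).
    connection_address: destination IP address of the connection that has
    just passed the step 1430 test.
    Returns None if no candidate is associated with that address.
    """
    eligible = [c for c in candidates if connection_address in c.addresses]
    if not eligible:
        return None
    chosen = min(eligible, key=lambda c: c.priority)  # ties: first listed wins
    candidates.remove(chosen)  # an issued request leaves the candidate list
    return chosen
```

Because the list is consulted only at the moment a connection becomes ready, any additions, removals, or priority changes made in the meantime are automatically reflected in the choice.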

Candidate requests may be removed (cancelled) from the candidate request list, new requests with a priority higher than that of existing requests on the list may be added, and the priority of existing requests on the list may change. The dynamic nature of which requests are on the candidate request list, and of their priorities, can alter which request is issued next, depending on when a test of the type described in step 1430 is satisfied.

For example, if the answer to the test described in step 1430 is "yes" at some time t, the next request issued might be request A, whereas if the answer to that test is not "yes" until some time t′ > t, the next request issued might instead be request B. This can happen because request A was removed from the candidate request list between times t and t′, because a request B with higher priority than request A was added to the candidate request list between times t and t′, or because request B was already on the candidate list at time t with lower priority than request A and its priority was raised above that of request A between times t and t′.

FIG. 18 illustrates an example of a list of requests on the candidate request list. In this example there are three connections and six requests on the candidate list, labeled A, B, C, D, E and F. Each request on the candidate list can be issued on a subset of the connections as indicated; for example, request A can be issued on connection 1, whereas request F can be issued on connection 2 or connection 3. The priority of each request is also indicated in FIG. 18, and a lower priority value indicates that a request has higher priority. Thus, requests A and B, with priority 0, are the highest-priority requests, whereas request F, with priority value 3, has the lowest priority among the requests on the candidate list.

If, at this point in time t, connection 1 passes the test described in step 1430, either request A or request B is issued on connection 1. If instead connection 3 passes the test described in step 1430 at this time t, request D is issued on connection 3, since request D is the highest-priority request that can be issued on connection 3.

Suppose that, for all connections, the answer to the test described in step 1430 is "no" from time t to some later time t′, and that between times t and t′ the priority of request A changes from 0 to 5, request B is removed from the candidate list, and a new request G with priority 0 is added to the candidate list. Then, at time t′, the new candidate list might be as shown in FIG. 19.

If connection 1 passes the test described in step 1430 at time t′, request C, with priority 4, is issued on connection 1, since it is the highest-priority request on the candidate list that can be issued on connection 1 at that point in time.

Suppose that in this same scenario request A had instead been issued on connection 1 at time t (request A being one of the two highest-priority choices for connection 1 at time t, as shown in FIG. 18). Since the answer to the test described in step 1430 is "no" for all connections from time t to some later time t′, connection 1 is still delivering data for requests issued before time t until at least time t′, and so request A would not have commenced until at least time t′. Issuing request C at time t′ is a better decision than issuing request A at time t would have been, because request C commences after t′ at the same time that request A would have commenced, and because by that time request C has higher priority than request A.

As another alternative, if a test of the type described in step 1430 is applied to the aggregate of the active connections, a connection may be chosen whose destination Internet Protocol address is associated with the first request on the candidate request list, or with another request of the same priority as that first request.

A number of methods are possible for constructing the candidate request list. For example, the candidate list could contain n requests representing requests for the next n portions of data of the current representation of the presentation in time sequence order, where the request for the earliest portion of data has the highest priority and the request for the latest portion has the lowest priority. In some cases n may be one. The value of n may depend on the buffer size B_current, on the state variable, or on another measure of the state of client buffer occupancy. For example, a number of threshold values may be set for B_current with a value associated with each threshold, and the value of n is then taken to be the value associated with the highest threshold that is less than B_current.

The embodiment described above ensures flexible allocation of requests to connections, giving preference to reusing an existing connection even if the highest-priority request is not suitable for that connection (because the connection's destination IP address is not one allocated to any of the hostnames associated with the request). The dependence of n on B_current, or on the state or another measure of client buffer occupancy, ensures that such "out of priority order" requests are not issued when the client urgently needs to issue and complete the request associated with the next portion of data to be played out in time sequence.

These methods can be advantageously combined with cooperative HTTP and FEC.

Consistent server selection

As is well known, files to be downloaded using a file download protocol are commonly identified by an identifier comprising a hostname and a filename. This is the case, for example, for the HTTP protocol, in which the identifier is a Uniform Resource Identifier (URI). A hostname may correspond to multiple hosts, identified by Internet Protocol addresses. This is, for example, a common method of spreading the load of requests from multiple clients across multiple physical machines, and the approach is commonly taken by Content Delivery Networks (CDNs). In this case, a request issued on a connection to any of the physical hosts is expected to succeed. A number of methods are known by which a client may select from amongst the Internet Protocol addresses associated with a hostname. For example, these addresses are typically provided to the client via the Domain Name System in priority order, and the client may then choose the highest-priority (first) Internet Protocol address. In general, however, there is no coordination between clients as to how this choice is made, so different clients may request the same file from different servers. This may result in the same file being stored in the caches of multiple nearby servers, which lowers the efficiency of the cache infrastructure.

This may be handled by a system that advantageously increases the probability that two clients requesting the same block will request that block from the same server. The novel method described here comprises selecting from among the available Internet Protocol addresses in a manner determined by the identifier of the file to be requested, and in such a way that different clients presented with the same or similar choices of Internet Protocol addresses and file identifiers will make the same choice.

A first embodiment of the method is described with reference to FIG. 20. The client first obtains a set of Internet Protocol addresses IP₁, IP₂, ..., IP_n, as shown in step 1710. If there is a file for which requests are to be issued, as decided in step 1720, the client determines which Internet Protocol address to use to issue requests for that file, as determined in steps 1730 through 1770. Given a set of Internet Protocol addresses and the identifier of the file to be requested, the method comprises ordering the Internet Protocol addresses in a manner determined by the file identifier. For example, for each Internet Protocol address, a byte string is constructed comprising the concatenation of the Internet Protocol address and the file identifier, as shown in step 1730. A hash function is applied to this byte string, as shown in step 1740, and the resulting hash values are arranged according to a fixed ordering, for example increasing numerical order, as shown in step 1750, thereby inducing an ordering on the Internet Protocol addresses. The same hash function can be used by all clients, which guarantees that the hash function produces the same result for a given input at every client.

The hash function might be statically configured into all clients in a set of clients; or all clients in the set might obtain a partial or full description of the hash function when they obtain the list of Internet Protocol addresses; or all clients in the set might obtain a partial or full description of the hash function when they obtain the file identifier; or the hash function may be determined by other means. The Internet Protocol address that is first in this ordering is chosen, and this address is then used to establish a connection and issue requests for all or portions of the file, as shown in steps 1760 and 1770.
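The ordering in steps 1730 through 1760 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: SHA-256 stands in for the unspecified shared hash function, addresses and file identifiers are assumed to be text strings, and the names `order_addresses` and `select_address` are hypothetical.

```python
import hashlib


def order_addresses(addresses, file_id):
    """Order IP addresses by the hash of (address + file identifier).

    SHA-256 is an assumed stand-in for the hash function shared by all clients.
    """
    def sort_key(address):
        # Step 1730: byte string from the concatenation of address and file identifier.
        data = (address + file_id).encode("utf-8")
        # Step 1740: apply the hash. Fixed-length digests compare in numerical order.
        # The address is a tie-breaker so the result never depends on input order.
        return (hashlib.sha256(data).digest(), address)

    # Step 1750: arrange by increasing hash value.
    return sorted(addresses, key=sort_key)


def select_address(addresses, file_id):
    """Step 1760: choose the address that is first in the induced ordering."""
    return order_addresses(addresses, file_id)[0]
```

Two clients holding the same address list, in any order, obtain the same result from `select_address` for a given file identifier, which is the property the method relies on.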

The above method may be applied when a new connection is established to request a file. It may also be applied when several established connections are available and one of them may be chosen to issue a new request.

Furthermore, when an established connection is available and a request may be chosen from among a set of candidate requests with equal priority, an ordering on the candidate requests is induced, for example by the same hash-value method described above, and the candidate request appearing first in the ordering is chosen. The methods may be combined to select both a connection and a candidate request from among a set of connections and requests of equal priority, again by computing a hash for each combination of connection and request, sorting these hash values according to a fixed ordering, and choosing the combination that occurs first in the ordering induced on the set of combinations of requests and connections.
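The two selections described in this paragraph can be sketched as below. The sketch assumes SHA-256 as the shared hash, represents each connection by its destination address and each request by its file identifier, and uses hypothetical function names.

```python
import hashlib
from itertools import product


def _hash(text):
    # SHA-256 is an assumed stand-in for the hash function shared by all clients.
    return hashlib.sha256(text.encode("utf-8")).digest()


def select_request(candidate_file_ids, connection_addr):
    """Pick one of several equal-priority requests for a given established connection."""
    return min(candidate_file_ids, key=lambda f: (_hash(connection_addr + f), f))


def select_connection_and_request(connection_addrs, candidate_file_ids):
    """Pick the (connection, request) combination that is first in the hash ordering."""
    return min(
        product(connection_addrs, candidate_file_ids),
        key=lambda pair: (_hash(pair[0] + pair[1]), pair),
    )
```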

This method has advantages for the following reason. A typical approach taken by a block serving infrastructure such as that shown in FIG. 1 (BSI 101) or FIG. 2 (BSI 101), and in particular an approach commonly taken by CDNs, is to provide multiple caching proxy servers that receive client requests. A caching proxy server may not hold the file requested in a given request, and in that case such a server typically forwards the request to another server, receives the response from that server (typically including the requested file), and forwards the response to the client. The caching proxy server may also store (cache) the requested file so that it can respond immediately to subsequent requests for the file. The common approach described above has the property that the set of files stored on a given caching proxy server is largely determined by the set of requests that the caching proxy server has received.

The method described above has the following advantage. If all clients in a set of clients are provided with the same list of Internet Protocol addresses, then these clients will use the same Internet Protocol address for all requests issued for the same file. If there are two different lists of Internet Protocol addresses and each client is provided with one of these two lists, then the clients will use at most two different Internet Protocol addresses for all requests issued for the same file. In general, if the lists of Internet Protocol addresses provided to clients are similar, then the clients will use a small set of the provided Internet Protocol addresses for all requests issued for the same file. Since clients in proximity tend to be provided with similar lists of Internet Protocol addresses, it is likely that nearby clients will issue requests for a file to only a small portion of the caching proxy servers available to them. Thus, only a small fraction of the caching proxy servers will cache the file, which advantageously minimizes the amount of caching resources used to cache it.

Preferably, the hash function has the property that only a very small fraction of different inputs are mapped to the same output and that different inputs are mapped to essentially random outputs. This ensures that, for a given set of Internet Protocol addresses, the proportion of files for which a given address is first in the sorted list produced by step 1750 is approximately the same for all addresses in the list. On the other hand, it is important that the hash function is deterministic, in the sense that its output for a given input is the same for all clients.

Another advantage of the method described above is the following. Suppose that all clients in a set of clients are provided with the same list of Internet Protocol addresses. Because of the properties of the hash function just described, it is likely that requests for different files from these clients will be spread evenly across the set of Internet Protocol addresses, which in turn means that the requests will be spread evenly across the caching proxy servers. Thus, the caching resources used to store these files are spread evenly across the caching proxy servers, and the requests for files are spread evenly across them as well. The method therefore provides both storage balancing and load balancing across the cache infrastructure.
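The balancing property can be checked empirically with a small sketch. It assumes SHA-256 as the shared hash function and uses hypothetical names; the counts it produces illustrate the expected near-uniform spread and are not a guarantee for any particular hash or workload.

```python
import hashlib
from collections import Counter


def select_address(addresses, file_id):
    """Choose the address whose hash with the file identifier sorts first."""
    return min(
        addresses,
        key=lambda a: (hashlib.sha256((a + file_id).encode("utf-8")).digest(), a),
    )


def distribution(addresses, file_ids):
    """Count how many files each address would be asked to serve."""
    return Counter(select_address(addresses, f) for f in file_ids)


if __name__ == "__main__":
    servers = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
    files = [f"http://example.com/seg{i}.3gp" for i in range(2000)]
    for server, count in sorted(distribution(servers, files).items()):
        print(server, count)
```

With a hash whose outputs are essentially random, each of the four addresses should receive roughly a quarter of the files.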

Numerous variations of the approach described above are known to those of skill in the art, and in many cases these variations retain the property that the set of files stored on a given proxy is determined at least in part by the set of requests that the caching proxy server has received. In the common case where a given host name resolves to multiple physical caching proxy servers, it will be common for all of these servers to eventually store a copy of any given file that is frequently requested. Such duplication may be undesirable, since storage resources on the caching proxy servers are limited and, as a result, files may on occasion be removed (purged) from the cache. The novel method described here ensures that requests for a given file are directed to caching proxy servers in a way that reduces this duplication, thereby reducing the need to remove files from the cache and increasing the likelihood that any given file is present in (that is, has not been purged from) the proxy cache.

When a file is present in the proxy cache, the response sent to the client is faster. This has the advantage of reducing the probability that the requested file arrives late, which would result in a pause in media playout and therefore a poor user experience. Additionally, when a file is not present in the proxy cache, the request may be sent to another server, causing additional load on both the serving infrastructure and the network connections between servers. In many cases the server to which the request is sent may be at a distant location, and the transmission of the file from this server back to the caching proxy server may incur transmission costs. The novel method described here therefore allows these transmission costs to be reduced.

Probabilistic whole-file requests

A particular concern when the HTTP protocol is used with range requests is the behavior of the cache servers that are commonly used to provide scalability in the serving infrastructure. While it may be common for HTTP cache servers to support the HTTP Range header, the exact behavior of different HTTP cache servers varies by implementation. Most cache server implementations serve range requests from the cache when the file is available in the cache. A common implementation of HTTP cache servers always forwards downstream HTTP requests containing a Range header to an upstream node (a cache server or the origin server) unless the cache server has a copy of the file. In some implementations, the upstream response to the range request is the entire file; this entire file is cached, and the response to the downstream range request is extracted from it and sent. However, in at least one implementation, the upstream response to the range request is just the data bytes in the requested range, and these data bytes are not cached but are simply sent as the response to the downstream range request. As a result, a possible consequence of clients using Range headers is that the file itself is never brought into caches and the desirable scalability properties of the network are lost.

In the foregoing, the operation of caching proxy servers was described, as was a method of requesting blocks from a file that is an aggregation of multiple blocks. This can be achieved, for example, by use of the HTTP Range request header. Such requests are called "partial requests" in the following. A further embodiment is now described which has an advantage in the case that the block serving infrastructure 101 does not provide complete support for the HTTP Range header. Commonly, servers within a block serving infrastructure, for example a content delivery network, support partial requests but may not store the response to a partial request in local storage (cache). Such a server may fulfill a partial request by forwarding the request to another server, unless the complete file is stored in local storage, in which case the response may be sent without forwarding the request to another server.

A block-request streaming system that makes use of the novel enhancement of block aggregation described above may perform poorly if the block serving infrastructure exhibits this behavior, since all requests, being partial requests, will be forwarded to another server and no requests will be served by the caching proxy servers, which defeats the purpose of providing caching proxy servers in the first place. During the block-request streaming process as described above, a client may at some point request a block that is at the beginning of a file.

According to the novel method described here, whenever a certain condition is met, such a request may be converted from a request for the first block in a file to a request for the entire file. When a caching proxy server receives a request for the whole file, the proxy server typically stores the response. The use of these requests therefore causes the file to be brought into the cache of the local caching proxy server, so that subsequent requests, whether for the full file or partial requests, may be served directly by that caching proxy server. The condition may be such that, among a set of requests associated with a given file (for example, the set of requests generated by a set of clients viewing the content item in question), the condition will be met for at least a specified fraction of these requests.

An example of a suitable condition is that a randomly chosen number is above a specified threshold. The threshold may be set such that the conversion of a single-block request into a whole-file request occurs on average for a specified fraction of the requests, for example one time out of ten (in which case the random number may be chosen from the interval [0,1] and the threshold may be 0.9). Another example of a suitable condition is that a hash function computed over some information associated with the block and some information associated with the client takes one of a specified set of values. This method has the advantage that a frequently requested file will be brought into the cache of the local caching proxy server, yet the operation of the block-request streaming system is not altered significantly from the standard operation in which each request is for a single block. In many cases where the conversion from a single-block request to a whole-file request occurs, the client procedures would otherwise go on to request the other blocks within the file. If this is the case, such requests may be suppressed, because the blocks in question will be received in any case as a result of the whole-file request.
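Both example conditions can be sketched briefly. The sketch makes several assumptions that are not in the text: SHA-256 as the hash, ten hash buckets of which one is selected, a "|" separator between block and client information, and hypothetical function names. The partial request is expressed with the standard HTTP `Range` header via Python's `urllib.request.Request`.

```python
import hashlib
import random
import urllib.request


def convert_by_random(threshold=0.9, rng=random):
    """Random condition: convert when a number drawn from [0, 1) exceeds the threshold."""
    return rng.random() > threshold


def convert_by_hash(block_info, client_info, buckets=10, selected=(0,)):
    """Hash condition: convert when the hash falls in a specified set of values."""
    data = (block_info + "|" + client_info).encode("utf-8")
    value = int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % buckets
    return value in selected


def build_request(url, first_byte, last_byte, whole_file):
    """Build either a whole-file request or a partial (Range) request for one block."""
    if whole_file:
        return urllib.request.Request(url)
    return urllib.request.Request(
        url, headers={"Range": f"bytes={first_byte}-{last_byte}"}
    )
```

A client might call `build_request(url, 0, 999, convert_by_random())` for the first block of a file and, when the whole file was requested, suppress its later requests for the remaining blocks of that file.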

URL construction and segment list generation and seeking

Segment list generation addresses the issue of how a client may generate a segment list from the MPD at a specific client-local time "NOW" for a specific representation that starts at some start time "starttime", which is expressed either relative to the start of the media presentation (for on-demand cases) or in wall-clock time. A segment list may comprise a locator, for example a URL pointing to optional initial representation metadata, as well as a list of media segments. Each media segment may have been assigned a start time, a duration, and a locator. The start time typically expresses an approximation of the media time of the media contained in the segment, but not necessarily a sample-accurate time. The start time is used by the HTTP streaming client to issue the download request at the appropriate time. The generation of the segment list, including the start time of each segment, may be done in different ways. The URLs may be provided as a playlist, or a URL construction rule may be advantageously used for a compact representation of the segment list.

A URL-based segment list construction may, for example, be carried out if the MPD signals this by a specific attribute or element such as FileDynamicInfo or an equivalent signal. A generic way to create a segment list from a URL construction is provided below in the "URL construction overview" section. A playlist-based construction may, for example, be signaled by a different signal. Seeking in the segment list and arriving at the accurate media time is also advantageously implemented in this context.

URL constructor overview

As previously described, in one embodiment of the present invention there may be provided a metadata file containing URL construction rules that allow client devices to construct the file identifiers for blocks of the presentation. A further novel enhancement to the block-request streaming system is now described, which provides for changes in the metadata file, including changes to the URL construction rules, changes to the number of available encodings, and changes to the metadata associated with the available encodings, such as bit rate, aspect ratio, resolution, audio or video codec or codec parameters, or other parameters.

In this novel enhancement, additional data may be provided, associated with each element of the metadata file, that indicates a time interval within the overall presentation. Within this time interval the element may be considered valid, and outside it the element may be ignored. Furthermore, the syntax of the metadata may be enhanced so that elements previously allowed to appear only once, or at most once, may appear multiple times. An additional restriction may be applied in this case, providing that for such elements the specified time intervals must be disjoint. At any given time instant, considering only the elements whose time interval contains that instant results in a metadata file that is consistent with the original metadata syntax. Such time intervals are called validity intervals. The method therefore provides a means of signaling, within a single metadata file, changes of the kind described above. Advantageously, such a method can be used to provide a media presentation that supports changes of the kind described at specified points within the presentation.
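The validity-interval rule can be sketched as a filter over timed metadata elements. The representation below is hypothetical: the text does not define a concrete syntax, so the `TimedElement` fields, the half-open interval convention [start, end), and the assumption that every element name is of the "at most once" kind are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedElement:
    name: str     # element name, for example "bitrate" (hypothetical)
    value: str    # element content
    start: float  # start of the validity interval, in presentation time
    end: float    # end of the validity interval (exclusive, by assumption)


def effective_metadata(elements, t):
    """Return the elements valid at time t as a name-to-value mapping.

    Raises ValueError if two elements of the same name are both valid at t,
    which would violate the disjoint-interval restriction.
    """
    active = {}
    for element in elements:
        if element.start <= t < element.end:
            if element.name in active:
                raise ValueError(f"overlapping validity intervals for {element.name!r}")
            active[element.name] = element.value
    return active
```

Evaluating `effective_metadata` at a given instant yields a view in which each element appears at most once, consistent with the original, untimed metadata syntax.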

URL constructor

As described herein, a common feature of block-request streaming systems is the need to provide the client with "metadata" that identifies the available media encodings and provides the information needed by the client to request the blocks from those encodings. For example, in the case of HTTP this information might comprise URLs for the files containing the media blocks. A playlist file may be provided which lists the URLs for the blocks of a given encoding. Multiple playlist files are provided, one for each encoding, together with a master "playlist of playlists" that lists the playlists corresponding to the different encodings. A disadvantage of this system is that the metadata can become quite large and therefore takes some time to be requested when the client begins the stream. A further disadvantage of this system is evident in the case of live content, when the files corresponding to the media data blocks are generated "on the fly" from a media stream that is being captured in real time (live), for example a live sports event or news program. In this case the playlist files may be updated each time a new block is available (for example, every few seconds). Client devices may repeatedly fetch the playlist file to determine whether new blocks are available and to obtain their URLs. This may place a significant load on the serving infrastructure, and in particular means that metadata files cannot be cached for longer than the update interval, which is equal to the block size and is commonly on the order of a few seconds.

One important aspect of a block-request streaming system is the method used to inform clients of the file identifiers, for example URLs, that should be used together with the file download protocol to request blocks. One example is a method in which a playlist file is provided for each representation of a presentation, listing the URLs of the files containing the blocks of media data. A disadvantage of this method is that at least some of the playlist file itself needs to be downloaded before playout can begin, which increases the channel zapping time and therefore causes a poor user experience. For a long media presentation with several or many representations, the list of file URLs may be large and hence the playlist file may be large, which further increases the channel zapping time.

A further disadvantage of this method occurs in the case of live content. In this case, the complete list of URLs is not available in advance; the playlist file is periodically updated as new blocks become available, and clients periodically request the playlist file in order to receive the updated version. Because the file is frequently updated, it cannot be stored for long within the caching proxy servers. This means that very many of the requests for this file will be forwarded to other servers and eventually to the server that generates the file. In the case of a popular media presentation, this may result in a high load on that server and on the network, which may in turn result in slow response times and therefore a long channel zapping time and a poor user experience. In the worst case the server becomes overloaded, and as a result some users are unable to view the presentation.

It is desirable in the design of a block-request streaming system to avoid placing restrictions on the form of the file identifiers that may be used. This is because a number of considerations may motivate the use of identifiers of a particular form. For example, in the case that the block serving infrastructure is a content delivery network, there may be file naming or storage conventions related to a desire to distribute storage or serving load across the network, or to other requirements, which lead to particular forms of file identifier that cannot be predicted at system design time.

A further embodiment is now described that mitigates the disadvantages mentioned above while retaining the flexibility to choose appropriate file identification conventions. In this method, metadata comprising a file identifier construction rule may be provided for each representation of the media presentation. The file identifier construction rule may, for example, comprise a text string. To determine the file identifier for a given block of the presentation, a method of interpreting the file identifier construction rule may be provided, comprising determining input parameters and evaluating the file identifier construction rule together with those input parameters. The input parameters may, for example, include an index of the file to be identified, where the first file has index zero, the second has index one, the third has index two, and so on. For example, in the case that every file spans the same time duration (or approximately the same time duration), the index of the file associated with any given time within the presentation can easily be determined. Alternatively, the time within the presentation spanned by each file may be provided within the presentation or version metadata.
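The two ways of finding the file index for a presentation time can be sketched as follows. The function names are hypothetical, and the sketch assumes that files are contiguous, that times are measured from the start of the presentation, and that a file covers its start time but not its end time.

```python
from bisect import bisect_right
from itertools import accumulate


def index_for_time(t, duration):
    """File index when every file spans the same duration (first file has index 0)."""
    if t < 0 or duration <= 0:
        raise ValueError("time must be non-negative and duration positive")
    return int(t // duration)


def index_for_time_variable(t, durations):
    """File index when the metadata lists the time spanned by each file."""
    end_times = list(accumulate(durations))  # end time of each file
    if not end_times or not 0 <= t < end_times[-1]:
        raise ValueError("time is outside the presentation")
    # The first file whose end time lies strictly after t contains t.
    return bisect_right(end_times, t)
```

For instance, with 10-second files a seek to 25 seconds maps to the file with index 2, which then becomes the input parameter for the construction rule.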

In one embodiment, the file identifier construction rule may comprise a text string that may contain certain special identifiers corresponding to the input parameters. The method of evaluating the file identifier construction rule comprises determining the positions of the special identifiers within the text string and replacing each such special identifier with a string representation of the value of the corresponding input parameter.
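A minimal sketch of this substitution follows. The placeholder syntax `%name%` is purely hypothetical, since the text does not specify how special identifiers are written, and `evaluate_rule` is an illustrative name.

```python
import re

# Hypothetical syntax: a special identifier is a parameter name between '%' signs.
_PLACEHOLDER = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def evaluate_rule(rule, params):
    """Replace each special identifier with the string form of its parameter value."""
    def replace(match):
        # A KeyError here means the rule names a parameter that was not supplied.
        return str(params[match.group(1)])

    # A single pass, so substituted values are never themselves re-scanned.
    return _PLACEHOLDER.sub(replace, rule)
```

For example, `evaluate_rule("http://example.com/rep1/seg_%index%.3gp", {"index": 7})` yields `http://example.com/rep1/seg_7.3gp`.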

In another embodiment, the file identifier construction rule may comprise a text string conforming to an expression language. An expression language comprises a definition of the syntax to which expressions in the language may conform and a set of rules for evaluating a string conforming to that syntax.

A specific example will now be described with reference to FIG. 21 et seq. An example of a syntax definition for a suitable expression language, defined in Augmented Backus-Naur Form, is shown in FIG. 21. An example of rules for evaluating a string conforming to the <expression> production in FIG. 21 comprises recursively transforming the string conforming to the <expression> production (an <expression>) into a string conforming to the <literal> production, as follows:

An <expression> conforming to the <literal> production is unchanged.

An <expression> conforming to the <variable> production is replaced with the value of the variable identified by the <token> string of the <variable> production.

An <expression> conforming to the <function> production is evaluated by evaluating each of its arguments according to these rules and applying to these arguments a transformation that depends on the <token> element of the <function> production, as described below.

An <expression> conforming to the last alternative of the <expression> production is evaluated by evaluating the two <expression> elements and applying to these arguments an operation that depends on the <operator> element of that last alternative, as described below.

In the method described above, it is assumed that evaluation takes place in a context in which a plurality of variables may be defined. A variable is a (name, value) pair, where "name" is a string conforming to the <token> production and "value" is a string conforming to the <literal> production. Some variables may be defined outside the evaluation process before evaluation begins. Other variables may be defined within the evaluation process itself. All variables are "global", in the sense that only one variable exists for each possible "name".

An example of a function is the "printf" function. This function accepts one or more arguments. The first argument may conform to the <string> production (hereinafter a "string"). The printf function evaluates to a transformed version of its first argument. The transformation applied is the same as that of the "printf" function of the C standard library, with the additional arguments included in the <function> production supplying the additional arguments expected by the C standard library printf function.

Another example of a function is the "hash" function. This function accepts two arguments, the first of which may be a string and the second of which may follow the <number> production (hereinafter a "number"). The "hash" function applies a hash algorithm to the first argument and returns a result that is a non-negative integer less than the second argument. An example of a suitable hash function is given in the C function shown in Figure 22, whose arguments are the input string (excluding the enclosing quotation marks) and the numeric input value. Other examples of hash functions are well known to those skilled in the art.

Another example of a function is the "Subst" function, which takes one, two, or three string arguments. If one argument is supplied, the result of the "Subst" function is the first argument. If two arguments are supplied, the result of the "Subst" function is computed by erasing every occurrence of the second argument (excluding the enclosing quotation marks) within the first argument and returning the first argument so modified. If three arguments are supplied, the result of the "Subst" function is computed by replacing every occurrence of the second argument (excluding the enclosing quotation marks) within the first argument with the third argument (excluding the enclosing quotation marks) and returning the first argument so modified.
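The three cases of "Subst" can be sketched as follows. This is only an illustrative sketch in Python, not the patent's implementation: the function name `subst` is hypothetical, the arguments are assumed to be string contents with the enclosing quotation marks already removed, and leaving the input unchanged for an empty second argument is an assumption, since the text does not cover that case.

```python
def subst(first, second=None, third=None):
    """Illustrative Subst on string contents (quotes already stripped)."""
    if second is None:
        # One argument: the result is the first argument itself.
        return first
    if second == "":
        # Assumption: an empty pattern leaves the first argument unchanged.
        return first
    # Two arguments: erase every occurrence of `second`.
    # Three arguments: replace every occurrence of `second` with `third`.
    return first.replace(second, "" if third is None else third)
```

For instance, under these assumptions `subst("a-b-c", "-")` erases the hyphens and `subst("a-b-c", "-", "+")` replaces them with plus signs.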

Some examples of operators are the addition, subtraction, division, multiplication, and modulus operators, identified by the <operator> productions '+', '-', '/', '*', and '%', respectively. These operators require that the <expression> productions on either side of the <operator> production evaluate to numbers. Evaluating the operator comprises applying the appropriate arithmetic operation (addition, subtraction, division, multiplication, or modulus, respectively) to the two numbers in the usual way and returning the result in a form that follows the <number> production.

Another example of an operator is the assignment operator, identified by the <operator> production '='. This operator requires that the left argument evaluate to a string whose content follows the <token> production. The content of a string is defined as the character string within the enclosing quotation marks. The assignment operator causes the variable whose name is the <token> equal to the content of the left argument to be assigned a value equal to the result of evaluating the right argument. This value is also the result of evaluating the operator expression.

Another example of an operator is the sequence operator, identified by the <operator> production ';'. The result of evaluating this operator is the right argument. Note that, as with all operators, both arguments are evaluated and the left argument is evaluated first.

In one embodiment of the invention, the identifier of a file may be obtained by evaluating a file identifier construction rule according to the rules above, with a specific set of input variables that identify the required file. An example of an input variable is the variable named "index" whose value equals the numeric index of the file within the presentation. Another example of an input variable is the variable named "bitrate" whose value equals the average bitrate of the required version of the presentation.

Figure 23 illustrates some examples of file identifier construction rules, in which the input variables are "id", which gives the identifier of the desired representation of the presentation, and "seq", which gives the sequence number of the file.

As will be clear to those skilled in the art upon reading this disclosure, many variations of the above method are possible. For example, not all of the functions and operators described above need be provided, or additional functions or operators may be provided.

URL construction rules and timing

This section provides basic URI construction rules for assigning a file or segment URI, as well as a start time for each segment within a representation and the media presentation.

For this clause, it is assumed that a media presentation description is available at the client.

Assume that the HTTP streaming client is playing out media that has been downloaded within a media presentation. The actual presentation time of the HTTP client may be defined by where the presentation time is relative to the start of the presentation. At initialization, the presentation time t=0 can be assumed.

At any point t, the HTTP client may download any data whose play time tP (also relative to the start of the presentation) is at most "MaximumClientPreBufferTime" ahead of the actual presentation time t, as well as any data required because of user interaction (e.g., seek, fast forward, etc.). In some embodiments, "MaximumClientPreBufferTime" might not even be specified, in the sense that the client can download data ahead of the current play time (tP) without restriction.

The HTTP client may avoid downloading unnecessary data. For example, segments from representations that are not expected to be played out would typically not be downloaded.

The basic process of providing a streaming service may be to download data by generating appropriate requests for entire files/segments or subsets of files/segments, for example using HTTP GET requests or HTTP partial GET requests. This description addresses how to access the data for a specific play time tP, but in general the client may download data for a larger range of play times to avoid inefficient requests. The HTTP client may minimize the number/frequency of HTTP requests made in providing the streaming service.

To access media data at play time tP, or at least close to play time tP, in a specific representation, the client determines the URL of the file that contains this play time and, in addition, determines the byte range within the file needed to access this play time.

The media presentation description may assign a representation id "r" to each representation, for example by using the RepresentationID attribute. In other words, this content of the MPD, when written by the ingestion system or read by the client, is interpreted such that an assignment exists. To download data for a specific play time tP of a specific representation with id r, the client may construct an appropriate URI for a file.

The media presentation description may assign the following attributes to each file or segment of each representation r: (a) the sequence number i of the file within representation r, where i = 1, 2, ..., Nr; (b) the relative start time of the file with representation id r and file index i, relative to the presentation time, defined as ts(r,i); (c) the file URI of the file/segment with representation id r and file index i, denoted FileURI(r,i).

In one embodiment, the start time and the file URI of each file may be provided explicitly for a representation. In another embodiment, a list of file URIs may be provided explicitly, where each file URI is inherently assigned the index i according to its position in the list, and the start time of a segment is derived as the sum of the durations of all segments from 1 to i-1. The duration of each segment may be provided according to any of the rules discussed above. For example, anyone skilled in basic mathematics may use other methods to derive a way of readily computing the start time from a single element or attribute and the position/index of the file URI in the representation.
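As a minimal sketch of the second embodiment, the start time of segment i can be accumulated from the durations of segments 1 to i-1. The function name `start_times` and the use of seconds as the unit are assumptions made for illustration only.

```python
def start_times(durations):
    """Return the start time of each segment, in list order.

    durations[k] is the duration of segment i = k + 1 (1-based in the text),
    so the returned starts[k] is the sum of the durations of segments 1..i-1.
    """
    starts = []
    elapsed = 0.0
    for duration in durations:
        starts.append(elapsed)   # segment starts where the previous ones end
        elapsed += duration
    return starts
```

With hypothetical durations of 2.0, 2.0, and 1.5 seconds, the three segments start at 0.0, 2.0, and 4.0 seconds respectively.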

If dynamic URI construction rules are provided in the MPD, the start time of each file and each file URI may be constructed dynamically by using a construction rule, the index of the requested file, and potentially some additional parameters provided in the media presentation description. This information may be provided, for example, in MPD attributes and elements such as FileURIPattern and FileInfoDynamic. FileURIPattern provides information on how to construct the URIs based on the file index sequence number i and the representation ID r. FileURIFormat is constructed as: FileURIFormat = sprintf("%s%s%s%s%s.%s", base URI, base file name, representation ID format, separator format, file sequence ID format, file extension); and FileURI(r,i) is constructed as: FileURI(r,i) = sprintf(FileURIFormat, r, i). The relative start time ts(r,i) of each file/segment may be derived from some attribute contained in the MPD that describes the duration of the segments in this representation, for example the FileInfoDynamic attribute. The MPD may also contain a sequence of FileInfoDynamic attributes that is global to all representations in the media presentation, or at least to all representations in a period, in the same way as specified above. If media data for a specific play time tP in representation r is requested, the corresponding index i(r,tP) may be derived such that the play time falls in the interval between the start times ts(r, i(r,tP)) and ts(r, i(r,tP)+1). Segment access may be further restricted by the cases above, for example where the segment is not accessible.
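The two-stage sprintf construction can be illustrated with Python's printf-style `%` formatting. This is a sketch under stated assumptions: the function names, the example host, and the particular format strings ("%s" for the representation ID, "%05d" for the file sequence number) are hypothetical and do not come from the MPD syntax. It also assumes the base URI and base file name contain no literal '%' characters, which would otherwise be interpreted in the second formatting step.

```python
def file_uri_format(base_uri, base_file_name, rep_id_format,
                    separator_format, file_seq_id_format, file_extension):
    """First stage: assemble the per-representation URI format string."""
    return "%s%s%s%s%s.%s" % (base_uri, base_file_name, rep_id_format,
                              separator_format, file_seq_id_format,
                              file_extension)


def file_uri(uri_format, r, i):
    """Second stage: fill in representation id r and file index i."""
    return uri_format % (r, i)


# Hypothetical parameter values, for illustration only.
fmt = file_uri_format("http://example.com/", "movie_", "%s", "_", "%05d", "3gp")
uri = file_uri(fmt, "rep1", 7)
# fmt is "http://example.com/movie_%s_%05d.3gp"
# uri is "http://example.com/movie_rep1_00007.3gp"
```

Keeping the format string as an intermediate value means it only has to be built once per representation and can then be reused for every file index.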

Once the index and the URI of the corresponding segment are obtained, accessing the exact play time tP depends on the actual segment format. In this example, without loss of generality, assume that media segments have a local timeline that starts at 0. To access and present the data at play time tP, the client may download the data corresponding to the local time from the file/segment that can be accessed through the URI FileURI(r,i) with i = i(r,tP).

In general, the client may download the entire file and then access play time tP. However, not all of the information in a 3GP file necessarily needs to be downloaded, because the 3GP file provides structures that map the local timing to byte ranges. Therefore, as long as sufficient random access information is available, accessing only the specific byte ranges for play time tP is sufficient to play the media. Likewise, sufficient information on the structure and mapping of the byte ranges and the local timing of the media segment may be provided in the initial part of the segment, for example using a segment index. By accessing the initial, e.g., 1200 bytes of the segment, the client may have sufficient information to directly access the byte range needed for play time tP.
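Fetching only the initial bytes of a segment corresponds to an HTTP partial GET with a Range header. The sketch below only builds such a request with Python's standard `urllib.request` and does not send it; the function name and the example URL are hypothetical, and the default of 1200 bytes is simply the figure used as an example in the text.

```python
import urllib.request


def initial_bytes_request(url, n_bytes=1200):
    """Build (but do not send) a partial GET for the first n_bytes of a segment."""
    # HTTP byte ranges are inclusive, so the first n bytes are 0..n-1.
    byte_range = "bytes=0-%d" % (n_bytes - 1)
    return urllib.request.Request(url, headers={"Range": byte_range})


req = initial_bytes_request("http://example.com/movie_rep1_00007.3gp")
```

A server that supports range requests would normally answer such a request with status 206 and only the requested bytes; whether 1200 bytes actually covers the index information depends on the segment.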

In another example, assume that a segment index (possibly specified below as a 'tidx' box) may be used to identify the byte offsets of the required fragment or fragments. A partial GET request may be formed for the required fragment or fragments. Other alternatives exist; for example, the client may issue a standard request for the file and cancel it once the first 'tidx' box has been received.

Seeking

A client may attempt to seek to a specific presentation time tp in a representation. Based on the MPD, the client has access to the media segment start time and the media segment URL of each segment in the representation. The client may obtain the segment index, segment_index, of the segment most likely to contain media samples for presentation time tp as the maximum segment index i for which the start time tS(r,i) is less than or equal to the presentation time tp, i.e., segment_index = max{i | tS(r,i) <= tp}. The segment URL is obtained as FileURI(r,i).
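The rule segment_index = max{i | tS(r,i) <= tp} can be sketched as a binary search over the sorted segment start times. The function name and the choice to raise an error when tp precedes the first segment are assumptions for illustration; the text itself does not say how a client should handle that case.

```python
from bisect import bisect_right


def segment_index(segment_start_times, tp):
    """Return the 1-based index of the last segment whose start time is <= tp.

    segment_start_times[k] holds tS(r, k + 1) and must be sorted ascending.
    """
    # bisect_right counts how many start times are <= tp, which is exactly
    # the largest 1-based index i with tS(r, i) <= tp.
    i = bisect_right(segment_start_times, tp)
    if i == 0:
        raise ValueError("tp is earlier than the start of the first segment")
    return i
```

For hypothetical start times of 0, 2, 4, and 6 seconds, a seek to tp = 4.5 selects segment 3, and a seek to exactly tp = 4.0 also selects segment 3.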

Note that the timing information in the MPD may be approximate, because of issues related to the placement of random access points, the alignment of media tracks, and media timing drift. As a result, the segment identified by the procedure above may begin at a time slightly after tp, and the media data for presentation time tp may be in the previous media segment. In the case of seeking, either the seek time may be updated to equal the first sample time of the retrieved file, or the preceding file may be retrieved instead. Note, however, that during continuous playout, including cases where there is a switch between alternative representations/versions, the media data for the time between tp and the start of the retrieved segment is still available.

To seek accurately to presentation time tp, the HTTP streaming client needs to access a random access point (RAP). To determine the random access point in a media segment in the case of 3GPP adaptive HTTP streaming, the client may, for example, use the information in the 'tidx' or 'sidx' box, if present, to locate the random access points and the corresponding presentation times in the media presentation. In cases where a segment is a 3GPP movie fragment, the client may also use information within the 'moof' and 'mdat' boxes, for example, to locate RAPs and to obtain the necessary presentation time from the information in the movie fragment and the segment start time derived from the MPD. If no RAP with a presentation time before the requested presentation time tp is available, the client may either access the previous segment or simply use the first random access point as the seek result. When media segments start with a RAP, these procedures are simple.

Note also that not all of the information in a media segment necessarily needs to be downloaded to access presentation time tp. The client may, for example, first request the 'tidx' or 'sidx' box from the beginning of the media segment using a byte range request. By using the 'tidx' or 'sidx' box, segment timing can be mapped to byte ranges of the segment. By successively using partial HTTP requests, only the relevant parts of the media segment need be accessed, which improves the user experience and lowers start-up delay.

Segment list generation

As described herein, it should be apparent how to implement a straightforward HTTP streaming client that uses the information provided by the MPD to create a list of segments for a representation that has a signaled approximate segment duration dur. In some embodiments, the client may assign consecutive indices i = 1, 2, 3, ... to the media segments within a representation, i.e., the first media segment is assigned index i=1, the second media segment is assigned index i=2, and so on. Then each media segment with segment index i in the list is assigned startTime[i] and URL[i], generated for example as follows. First, the index i is set to 1. The start time of the first media segment is obtained as 0, i.e., startTime[1]=0. The URL of media segment i, URL[i], is obtained as FileURI(r,i). The process continues for all described media segments with index i: startTime[i] of media segment i is obtained as (i-1)*dur, and URL[i] is obtained as FileURI(r,i).
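The list-generation procedure above reduces to a single loop over the segment indices. In this sketch the function name, the `file_uri` callback (standing in for FileURI(r,i) with r fixed), and the example URL pattern are assumptions made for illustration.

```python
def build_segment_list(num_segments, dur, file_uri):
    """Return [(startTime[i], URL[i])] for i = 1..num_segments.

    dur is the signaled approximate segment duration; file_uri(i) stands in
    for FileURI(r, i) of the chosen representation r.
    """
    segments = []
    for i in range(1, num_segments + 1):
        start_time = (i - 1) * dur          # startTime[1] = 0
        segments.append((start_time, file_uri(i)))
    return segments


# Hypothetical URL pattern and duration, for illustration only.
segment_list = build_segment_list(
    3, 2.0, lambda i: "http://example.com/rep1_%05d.3gp" % i)
```

Because dur is only approximate, start times computed this way may drift from the true segment boundaries, which is why a seek based on such a list may land in a neighboring segment.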

Concurrent HTTP/TCP requests

One concern in a block-request streaming system is the desire always to request the highest-quality blocks that can be completely received in time for playout. However, the data arrival rate may not be known in advance, so a requested block may happen not to arrive in time to be played out. This leads to a need to pause the media playout, which results in a poor user experience. The problem can be mitigated by client algorithms that take a conservative approach to selecting the blocks to request, requesting blocks of lower quality (and therefore smaller size) that are more likely to be received in time even if the data arrival rate falls while the block is being received. However, this conservative approach has the disadvantage of possibly delivering a lower-quality playout to the user or destination device, which is also a poor user experience. The problem may be magnified when multiple HTTP connections are used at the same time to download different blocks, as described below, because the available network resources are shared across connections and are thus used simultaneously for blocks with different playout times.

It may be advantageous for the client to issue requests for multiple blocks concurrently, where in this context "concurrently" means that the responses to the requests occur in overlapping time intervals; the requests are not necessarily made at precisely, or even approximately, the same time. In the case of the HTTP protocol, this approach may improve utilization of the available bandwidth because of the well-known behavior of the TCP protocol. This can be especially important for improving the content change time, because when new content is first requested, the HTTP/TCP connections over which the data for the blocks is requested might be slow to start, and using several HTTP/TCP connections at this point can therefore dramatically speed up delivery of the first blocks. However, requesting different blocks or fragments over different HTTP/TCP connections can also degrade performance. The requests for the blocks to be played out first compete with the requests for subsequent blocks, the competing HTTP/TCP downloads vary greatly in delivery speed, and the completion time of a request can therefore be highly variable. It is generally not possible to control which HTTP/TCP connections will complete quickly and which will be slower, so it is likely that, at least some of the time, the HTTP/TCP downloads of the first few blocks will be the last to complete, leading to large and variable channel change times.

Suppose that each block or fragment of a segment is downloaded over a separate HTTP/TCP connection, that the number of parallel connections is n, that the playout duration of each block is t seconds, and that the streaming rate of the content associated with the segment is S. When the client first begins to stream the content, requests may be issued for the first n blocks, representing n*t seconds of media data.

As is known to those skilled in the art, the data rate of TCP connections varies widely. To simplify this discussion, however, suppose that ideally all connections proceed in parallel, so that the first block is completely received at about the same time as the other n-1 requested blocks. To simplify the discussion further, assume that the aggregate bandwidth used by the n download connections is fixed at a value B for the entire duration of the download, and that the streaming rate S is constant over the entire representation. Suppose further that the media data structure is such that playout of a block can proceed only when the entire block is available at the client, i.e., playout of a block can start only after the entire block has been received, for example because of the structure of the underlying video encoding, or because encryption is used to encrypt each fragment or block separately, so that the entire fragment or block must be received before it can be decrypted. Thus, to simplify the discussion below, it is assumed that an entire block must be received before any part of the block can be played out. The time required before the first block has arrived and can be played out is then approximately n*t*S/B.

Since it is desirable to minimize the content change time, it is desirable to minimize n*t*S/B. The value of t may be determined by factors such as the underlying video encoding structure and how the ingestion methods are used, so t can be reasonably small, but very small values of t lead to an overly complicated segment map and may be incompatible with efficient video encoding and decryption, if used. The value of n may also affect the value of B, i.e., B may be larger for a larger number n of connections, so reducing the number of connections has the negative side effect of potentially reducing the amount of available bandwidth that is used, B, and may therefore not be effective in reducing the content change time. The value of S depends on which representation is chosen for download and playout, and ideally S should be as close to B as possible in order to maximize the playout quality of the media for the given network conditions. Thus, to simplify this discussion, assume that S is approximately equal to B. The channel change time is then proportional to n*t. Consequently, using more connections to download different fragments degrades the channel change time if the aggregate bandwidth used by the connections scales sub-linearly with the number of connections, as is typically the case.

As an example, suppose t = 1 second, with B = 500 Kbps when n = 1, B = 700 Kbps when n = 2, and B = 800 Kbps when n = 3. Suppose the representation with S = 700 Kbps is chosen. Then the download time for the first block is 1*700/500 = 1.4 seconds with n = 1, 2*700/700 = 2 seconds with n = 2, and 3*700/800 = 2.625 seconds with n = 3. Furthermore, as the number of connections increases, the variability in the individual download speeds of the connections is likely to increase (although significant variability is likely even with one connection). Thus, in this example, both the channel change time and the variability in the channel change time increase as the number of connections increases. Intuitively, the blocks being delivered have different priorities, i.e., the first block has the earliest delivery deadline, the second block has the second earliest deadline, and so on, whereas the download connections over which the blocks are being delivered compete for network resources during delivery, so the blocks with the earliest deadlines are delayed more and more as more competing blocks are requested. On the other hand, even in this case, using more than one download connection ultimately allows a sustainably higher streaming rate to be supported: in this example, a streaming rate of up to 800 Kbps can be supported with three connections, whereas only a 500 Kbps stream can be supported with one connection.
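The arithmetic of this example follows directly from the n*t*S/B approximation and can be reproduced with a few lines of Python. The function name is hypothetical, and the bandwidth table is just the example's assumed values; the idealizations of the preceding discussion (constant S, fixed aggregate B, all blocks finishing together) still apply.

```python
def first_block_delay(n, t, S, B):
    """Approximate time until the first block is fully received: n*t*S/B.

    n: number of parallel connections, t: block playout duration in seconds,
    S: streaming rate, B: aggregate bandwidth (same unit as S).
    """
    return n * t * S / B


# Aggregate bandwidth in Kbps for each connection count, from the example.
bandwidth_by_connections = {1: 500, 2: 700, 3: 800}
delays = {n: first_block_delay(n, 1, 700, B)
          for n, B in bandwidth_by_connections.items()}
# delays[1] is 1.4, delays[2] is 2.0, delays[3] is 2.625 (seconds)
```

The delay grows with n here because B grows more slowly than n, which is the sub-linear scaling the text identifies as the cause of the degradation.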

In practice, as noted above, the data rate of a connection may be highly variable both within the same connection over time and between connections. As a result, the n requested blocks generally do not complete at the same time, and it can commonly happen that one block completes in half the time of another. This effect results in unpredictable behavior: in some cases the first block may complete much sooner than the other blocks, and in other cases much later, so the start of playout may occur relatively quickly in some cases and slowly in others. Such unpredictable behavior can be frustrating for the user and may therefore be considered a poor user experience.

What is needed, therefore, are methods that can use multiple TCP connections to improve the channel change time and the variability in channel change time, while at the same time supporting a good-quality streaming rate where possible. What is also needed are methods that allow the share of available bandwidth allocated to each block to be adjusted as the playout time of the block approaches, so that, where necessary, a greater share of the available bandwidth can be allocated preferentially to the block with the nearest playout time.

Coordinated HTTP/TCP requests

Methods for using concurrent HTTP/TCP requests in a coordinated fashion are now described. A receiver may employ multiple concurrent coordinated HTTP/TCP requests, for example using a plurality of HTTP byte range requests, where each such request is for a portion of a fragment in a source segment, or all of a fragment of a source segment, or a portion of a repair fragment of a repair segment, or all of a repair fragment of a repair segment.

The advantages of coordinated HTTP/TCP requests together with the use of FEC repair data may be especially important for providing consistently fast channel change times. For example, at channel change time the TCP connections are likely either to have just been started or to have been idle for some time, in which case the congestion window, cwnd, is at its minimum value for the connections. The delivery speed of these TCP connections therefore takes several round-trip times (RTTs) to ramp up, and during this ramp-up time the delivery speeds over the different TCP connections are highly variable.

An overview of the no-FEC method is now described. The no-FEC method is a coordinated HTTP/TCP request method in which multiple concurrent HTTP/TCP connections are used to request only the media data of the source blocks, i.e., no FEC repair data is requested. With the no-FEC method, portions of the same fragment are requested over different connections, for example using HTTP byte range requests for portions of the fragment, so that each HTTP byte range request is, for example, for a portion of the byte range indicated in the segment map for the fragment. It may be the case that the delivery speed of an individual HTTP/TCP request ramps up over several RTTs (round-trip times) before it fully uses the available bandwidth, so that for a relatively long period the delivery speed is less than the available bandwidth. If a single HTTP/TCP connection is used to download, for example, the first fragment of content to be played out, the channel change time may therefore be large. With the no-FEC method, downloading different portions of the same fragment over different HTTP/TCP connections can significantly reduce the channel change time.
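One way to picture the no-FEC method is to split a fragment's byte range into one sub-range per connection. The sketch below uses near-equal sub-ranges, which is an assumption made for illustration; the text does not prescribe how the portions are sized. The function name and the example byte range are likewise hypothetical.

```python
def split_byte_range(first, last, n):
    """Split the inclusive byte range [first, last] into up to n contiguous,
    near-equal inclusive sub-ranges, one per connection."""
    total = last - first + 1
    base, extra = divmod(total, n)
    ranges = []
    start = first
    for k in range(n):
        # The first `extra` sub-ranges carry one additional byte each.
        size = base + (1 if k < extra else 0)
        if size == 0:
            continue  # fewer bytes than connections: skip empty portions
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# A hypothetical 1000-byte fragment at offset 0, fetched over 3 connections.
parts = split_byte_range(0, 999, 3)
# parts is [(0, 333), (334, 666), (667, 999)]
```

Each pair could then be turned into a "Range: bytes=a-b" header on its own connection, with the fragment reassembled once all portions have arrived.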

An overview of the FEC method is now described. The FEC method is a coordinated HTTP/TCP request method in which multiple concurrent HTTP/TCP connections are used to request the media data of a source segment together with FEC repair data generated from that media data. With the FEC method, portions of the same fragment, and FEC repair data generated from that fragment, are requested over different connections using HTTP byte-range requests, so that, for example, each HTTP byte-range request is for a portion of the byte range indicated in the segment map for the fragment. It may be the case that the delivery speed of an individual HTTP/TCP request ramps up over several RTTs before it fully utilizes the available bandwidth, so that for a relatively long period the delivery speed is less than the available bandwidth. Thus, if a single HTTP/TCP connection is used to download, for example, the first fragment of the content to be played out, the channel change time may be large. The FEC method has the same advantages as the no-FEC method, and has the additional advantage that not all of the requested data needs to arrive before the fragment can be recovered, which further reduces both the channel change time and its variability. By making requests over different TCP connections, and by over-requesting, i.e., also requesting FEC repair data on at least one of the connections, the time it takes to deliver enough data to recover, for example, the first requested fragment, so that media playout can begin, can be greatly reduced and made more consistent than if coordinated TCP connections and FEC repair data were not used.

Figures 24(a) through 24(e) show an example of the delivery rate fluctuations of 5 TCP connections running over the same link, from the same HTTP web server to the same client, on a simulated Evolution-Data Optimized (EVDO) network. In Figures 24(a) through 24(e), the X-axis shows time in seconds and the Y-axis shows the rate at which bits are received at the client on each of the 5 TCP connections, with the rate on each connection measured over 1-second intervals. In this particular simulation, a total of 12 TCP connections were running over the link, so the network was relatively loaded during the time shown, which is typical when more than one client is streaming within the same cell of a mobile network. Note that although the delivery rates are somewhat correlated over time, the delivery rates of the 5 connections differ widely at many points in time.

Figure 25 shows a possible request structure for a fragment of size 250,000 bits (approximately 31.25 kilobytes), in which 4 HTTP byte-range requests are made in parallel for different portions of the fragment: the first HTTP connection requests the first 50,000 bits, the second HTTP connection requests the next 50,000 bits, the third HTTP connection requests the next 50,000 bits, and the fourth HTTP connection requests the next 50,000 bits. If FEC is not used, i.e., with the no-FEC method, these are the only 4 requests for the fragment in this example. If FEC is used, i.e., with the FEC method, then in this example there is one additional HTTP request, which requests an additional 50,000 bits of FEC repair data from the repair segment generated from the fragment.
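The request structure described for Figure 25 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `split_byte_range` and `range_header` are hypothetical, and the sketch works in bytes because HTTP byte-range requests are expressed in bytes. It splits an inclusive byte range into contiguous parts, one per connection, and formats each part as an HTTP `Range` header value.

```python
def split_byte_range(start: int, end: int, parts: int) -> list[tuple[int, int]]:
    """Split the inclusive byte range [start, end] into `parts` contiguous sub-ranges.

    Any remainder is spread over the first sub-ranges, so sizes differ by at most 1.
    """
    total = end - start + 1
    if parts < 1 or total < parts:
        raise ValueError("need at least one byte per part")
    base, extra = divmod(total, parts)
    ranges = []
    offset = start
    for k in range(parts):
        size = base + (1 if k < extra else 0)
        ranges.append((offset, offset + size - 1))
        offset += size
    return ranges


def range_header(first: int, last: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={first}-{last}"


if __name__ == "__main__":
    # Four parallel requests of 6,250 bytes (50,000 bits) each.
    for first, last in split_byte_range(0, 24_999, 4):
        print(range_header(first, last))
```

Each printed value would be sent on a different HTTP/TCP connection; with the FEC method, one further request would target the repair segment.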

Figure 26 is a magnification of the first few seconds of the 5 TCP connections shown in Figures 24(a) through 24(e), in which the X-axis shows time at intervals of 100 milliseconds and the Y-axis shows the rate at which bits are received at the client on each of the 5 TCP connections, measured over 100-millisecond intervals. One line shows the aggregate number of bits that have been received at the client for the fragment from the first 4 HTTP connections (excluding the HTTP connection over which FEC data is requested), i.e., the bits that arrive using the no-FEC method. Another line shows the aggregate number of bits that have been received at the client for the fragment from all 5 HTTP connections (including the HTTP connection over which FEC data is requested), i.e., the bits that arrive using the FEC method. For the FEC method, it is assumed that the fragment can be FEC decoded once any 200,000 of the 250,000 requested bits have been received, which can be achieved if, for example, a Reed-Solomon FEC code is used, and which can essentially be achieved if, for example, the RaptorQ code described in Luby IV is used. For the FEC method in this example, enough data to recover the fragment using FEC decoding is received after 1 second, allowing a channel change time of 1 second (assuming that the data for subsequent fragments can be requested and received before the first fragment is fully played out). For the no-FEC method in this example, all the data for the 4 requests has to be received before the fragment can be recovered, which occurs after 1.7 seconds, giving a channel change time of 1.7 seconds. Thus, in the example shown in Figure 26, the no-FEC method is 70% worse than the FEC method in terms of channel change time. One reason for the advantage of the FEC method in this example is that, with the FEC method, receipt of any 80% of the requested data allows recovery of the fragment, whereas with the no-FEC method 100% of the requested data must be received. The no-FEC method therefore has to wait for the slowest TCP connection to finish delivery, and because of natural variations in the TCP delivery rate, the delivery speed of the slowest TCP connection can easily deviate widely from that of an average TCP connection. With the FEC method in this example, one slow TCP connection does not determine when the fragment can be recovered. Instead, with the FEC method, the delivery of enough data depends much more on the average TCP delivery rate than on the worst-case TCP delivery rate.
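The difference between waiting for every source request and waiting for any sufficient amount of data can be sketched with a small calculation. The function `completion_times` and the arrival trace below are hypothetical; the trace is made-up illustrative data, not the measurements behind Figure 26. The sketch assumes, as the example does, an FEC code that recovers the fragment from any sufficiently large subset of the requested bits.

```python
def completion_times(cumulative, requested, source_count, needed):
    """Return (no_fec_index, fec_index): the first sample at which each method can recover.

    cumulative[t][c] is the number of bits received on connection c by the end of
    sample t. The first `source_count` connections carry source data; the rest
    carry FEC repair data. An index is None if recovery never becomes possible.
    """
    no_fec = fec = None
    for t, row in enumerate(cumulative):
        got = [min(bits, requested) for bits in row]
        # No-FEC: every source request must have completed.
        if no_fec is None and all(b >= requested for b in got[:source_count]):
            no_fec = t
        # FEC: any `needed` bits across all connections are enough.
        if fec is None and sum(got) >= needed:
            fec = t
    return no_fec, fec


if __name__ == "__main__":
    # Hypothetical trace: connection 2 is slow, connection 4 carries repair data.
    trace = [
        [20_000, 30_000, 10_000, 25_000, 30_000],
        [50_000, 50_000, 20_000, 50_000, 50_000],
        [50_000, 50_000, 50_000, 50_000, 50_000],
    ]
    print(completion_times(trace, requested=50_000, source_count=4, needed=200_000))
```

In this made-up trace the FEC method can recover one sample earlier than the no-FEC method, because the slow connection no longer gates recovery.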

There are many variations of the no-FEC and FEC methods described above. For example, coordinated HTTP/TCP requests may be used only for the first few fragments after a channel change, after which only a single HTTP/TCP request is used to download further fragments, multiple fragments, or entire segments. As another example, the number of coordinated HTTP/TCP connections used may be a function both of the urgency of the fragments being requested (i.e., how imminent their playout time is) and of the current network conditions.

In some variations, multiple HTTP connections may be used to request repair data from repair segments. In other variations, different amounts of data may be requested over different HTTP connections, depending, for example, on the current size of the media buffer and on the rate of data reception at the client. In another variation, the source representations are not independent of one another but instead represent a layered media coding, where, for example, an enhanced source representation may depend on a base source representation. In this case, there may be one repair representation corresponding to the base source representation, and another repair representation corresponding to the combination of the base and enhancement source representations.

Additional overall elements add to the advantages that can be realized by the methods disclosed above. For example, the number of HTTP connections used may vary depending on the current amount of media in the media buffer and/or on the rate of reception into the media buffer. Coordinated HTTP requests using FEC, i.e., the FEC method described above and its variants, may be used aggressively when the media buffer is relatively empty: for example, more coordinated HTTP requests are made in parallel for different portions of the first fragment, requesting all of the source fragment and a relatively large fraction of the repair data from the corresponding repair fragment. As the media buffer grows, the client then transitions to a reduced number of concurrent HTTP requests, each requesting a larger portion of the media data, and to requesting a smaller fraction of repair data, e.g., transitioning to 1, 2, or 3 concurrent HTTP requests, to requesting full fragments or multiple consecutive fragments per request, and to requesting no repair data.

As another example, the amount of FEC repair data may vary as a function of the media buffer size: when the media buffer is small, more FEC repair data may be requested; as the media buffer grows, the amount of FEC repair data requested may be reduced gradually; and at some point, when the media buffer is sufficiently large, no FEC repair data is requested at all, only data from the source segments of the source representations. The benefit of such enhanced techniques is that they may allow faster and more consistent channel change times and greater resilience against potential media stutters or stalls, while minimizing the additional bandwidth used beyond what would have been consumed by delivering only the media in the source segments, by reducing both request message traffic and FEC repair data, and at the same time enabling support of the highest media rates possible for the given network conditions.
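One way to picture a buffer-dependent policy of this kind is the sketch below. Everything in it is an assumption made for illustration: the name `plan_requests`, the `RequestPlan` type, the buffer thresholds (2 and 10 seconds), the connection counts, and the repair fractions are hypothetical and are not values given in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestPlan:
    connections: int        # number of concurrent HTTP requests to issue
    repair_fraction: float  # repair data requested, as a fraction of source data


def plan_requests(buffered_seconds: float) -> RequestPlan:
    """Pick a request plan from the amount of media currently buffered.

    Thresholds and values are illustrative assumptions only.
    """
    if buffered_seconds < 2.0:
        # Buffer nearly empty: many parallel requests and generous repair data.
        return RequestPlan(connections=5, repair_fraction=0.25)
    if buffered_seconds < 10.0:
        # Buffer growing: fewer requests, less repair data.
        return RequestPlan(connections=3, repair_fraction=0.10)
    # Buffer comfortably full: a single request and no repair data.
    return RequestPlan(connections=1, repair_fraction=0.0)


if __name__ == "__main__":
    for level in (0.5, 5.0, 30.0):
        print(level, plan_requests(level))
```

A real client would likely also take the reception rate into account, as the text notes; the sketch keeps a single input to stay short.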

Additional enhancements when using concurrent HTTP connections

An HTTP/TCP request may be abandoned if a suitable condition is satisfied, and another HTTP/TCP request may be made to download data that can replace the data requested in the abandoned request. The second HTTP/TCP request may request exactly the same data as the original request, e.g., source data; or overlapping data, e.g., some of the same source data together with repair data that was not requested in the first request; or completely disjoint data, e.g., repair data that was not requested in the first request. An example of a suitable condition is that a request fails because no response is received from the block server infrastructure (BSI) within a specified time, because a failure occurs while establishing a transport connection to the BSI, because an explicit failure message is received from the server, or because of another failure condition.

Another example of a suitable condition is that receipt of the data is proceeding unusually slowly, as determined by comparing a measure of the connection speed (the data arrival rate in response to the request in question) either with the expected connection speed, or with an estimate of the connection speed needed to receive the response before the playout time of the media data it contains, or before another time that depends on that playout time.
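The abandonment conditions described above can be sketched as a single predicate. This is a simplified illustration under stated assumptions: the function name `should_abandon`, its parameters, and the 2-second default response timeout are hypothetical, and the speed check compares the observed arrival rate only with the rate needed to finish before the data is required.

```python
def should_abandon(bytes_received: int, bytes_requested: int, elapsed_s: float,
                   seconds_until_needed: float, got_error: bool = False,
                   response_timeout_s: float = 2.0) -> bool:
    """Decide whether an outstanding request should be abandoned and replaced."""
    if got_error:
        # Explicit failure message or connection failure.
        return True
    if elapsed_s <= 0:
        # Nothing can be inferred yet.
        return False
    if bytes_received == 0:
        # No response at all within the specified time.
        return elapsed_s >= response_timeout_s
    remaining = bytes_requested - bytes_received
    if remaining <= 0:
        return False
    if seconds_until_needed <= 0:
        # The data is already due and the request is still incomplete.
        return True
    observed_rate = bytes_received / elapsed_s
    required_rate = remaining / seconds_until_needed
    # Unusually slow: cannot finish before the data is needed at this rate.
    return observed_rate < required_rate


if __name__ == "__main__":
    print(should_abandon(100, 10_000, elapsed_s=1.0, seconds_until_needed=2.0))
```

The replacement request could then ask for the same data, overlapping data, or disjoint repair data, as described above.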

This approach has an advantage in cases where the BSI sometimes exhibits failures or poor performance. In such cases, the approach increases the probability that the client can continue to play out the media data reliably despite failures or poor performance within the BSI. Note that in some cases there may be an advantage in designing the BSI in such a way that it does sometimes exhibit such failures or poor performance; for example, such a design may cost less than an alternative design that does not exhibit such failures or poor performance, or exhibits them less often. In this case, the method described herein has a further advantage in that it permits the use of such a lower-cost design for the BSI without degrading the user experience.

In another embodiment, the number of requests issued for data corresponding to a given block may depend on whether a suitable condition with respect to the block is met. If the condition is not met, the client may be prohibited from making further requests for the block, provided that successful completion of all currently outstanding data requests for the block would allow the block to be recovered with high probability. If the condition is met, a larger number of requests for the block may be issued, i.e., the above prohibition does not apply. An example of a suitable condition is that the time remaining until the scheduled playout time of the block, or until another time that depends on that time, falls below a specified threshold. This method has an advantage because additional requests for the data of a block are issued only after receipt of the block has become more urgent, because the playout time of the media data comprising the block is close. With common transport protocols such as HTTP/TCP, these additional requests have the effect of increasing the share of the available bandwidth dedicated to data that contributes to reception of the block in question. This reduces the time needed to complete reception of enough data to recover the block, and therefore reduces the probability that the block cannot be recovered before the scheduled playout time of the media data comprising the block. As described above, if the block cannot be recovered before the scheduled playout time of the media data comprising the block, playout will pause, resulting in a poor user experience; the method described here therefore advantageously reduces the probability of such a poor user experience.
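The request-limiting rule of this embodiment can be sketched as follows. The function name `may_request_more` and its parameters are hypothetical, and `bytes_needed` stands in for whatever amount of data the client considers sufficient to recover the block with high probability, which this description does not quantify.

```python
def may_request_more(received_bytes: int, outstanding_bytes: int, bytes_needed: int,
                     seconds_to_playout: float, urgency_threshold_s: float) -> bool:
    """Return True if the client may issue further requests for a block."""
    if seconds_to_playout < urgency_threshold_s:
        # The block is urgent, so the prohibition does not apply.
        return True
    # Not urgent: further requests are allowed only if the outstanding
    # requests, once completed, would still leave the block unrecoverable.
    return received_bytes + outstanding_bytes < bytes_needed


if __name__ == "__main__":
    print(may_request_more(10_000, 20_000, 25_000,
                           seconds_to_playout=8.0, urgency_threshold_s=2.0))
```

With HTTP/TCP, allowing the extra requests only when playout is close concentrates the additional bandwidth share on the blocks that need it most.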

It should be understood that, throughout this specification, references to the scheduled playout time of a block refer to the time at which the encoded media data comprising the block must first be available at the client in order to achieve playout of the presentation without pausing. As will be clear to those skilled in the art of media presentation systems, this time is in practice slightly earlier than the actual time at which the media comprising the block appears at the physical transducers used for playout (screen, speaker, etc.), because several transformation functions may need to be applied to the media data comprising the block to effect its actual playout, and these functions may take a certain amount of time to complete. For example, media data is generally transported in compressed form, and a decompression transformation may be applied.

Methods for generating file structures that support coordinated HTTP/FEC methods

An embodiment for generating a file structure that can be used advantageously by a client employing a coordinated HTTP/FEC method is now described. In this embodiment, for each source segment there is a corresponding repair segment, generated as follows. The parameter R indicates how much FEC repair data is generated, on average, relative to the source data in the source segments. For example, R = 0.33 indicates that if a source segment contains 1,000 kilobytes of data, then the corresponding repair segment contains approximately 330 kilobytes of repair data. The parameter S indicates the symbol size, in bytes, used for FEC encoding and decoding. For example, S = 64 indicates that the source data and the repair data comprise symbols of 64 bytes each for the purposes of FEC encoding and decoding.

The repair segment for a source segment can be generated as follows. Each fragment of the source segment is treated as a source block for FEC encoding purposes, i.e., each fragment is treated as the sequence of source symbols of a source block from which repair symbols are generated. The total number of repair symbols generated for the first i fragments is computed as TNRS(i) = ceiling(R * B(i) / S), where ceiling(x) is the function that outputs the smallest integer that is at least x. Thus, the number of repair symbols generated for fragment i is NRS(i) = TNRS(i) - TNRS(i-1).

The repair segment comprises a concatenation of the repair symbols for the fragments, where the order of the repair symbols within the repair segment follows the order of the fragments from which they are generated and, within a fragment, the repair symbols are in order of their encoding symbol identifier (ESI). The repair segment structure corresponding to a source segment structure is shown in Figure 27, which includes a repair segment generator 2700.

Note that by defining the number of repair symbols for a fragment as described above, the total number of repair symbols for all previous fragments, and thus the byte index into the repair segment, depends only on R, S, B(i-1), and B(i), and not on any of the previous or subsequent structure of the fragments within the source segment. This is advantageous because it allows a client to quickly compute where a repair block begins within the repair segment, and also to quickly compute the number of repair symbols within that repair block, using only local information about the structure of the corresponding fragment of the source segment from which the repair block is generated. Thus, if a client decides to begin downloading and playing out a fragment from the middle of a source segment, it can also quickly generate and access the corresponding repair block within the corresponding repair segment.

The number of source symbols in the source block corresponding to fragment i is computed as NSS(i) = ceiling((B(i) - B(i-1)) / S). If B(i) - B(i-1) is not a multiple of S, the last source symbol is padded with zero bytes for the purposes of FEC encoding and decoding, so that it is S bytes in size, but these zero padding bytes are not stored as part of the source segment. In this embodiment, the ESIs of the source symbols are 0, 1, ..., NSS(i)-1, and the ESIs of the repair symbols are NSS(i), ..., NSS(i)+NRS(i)-1.
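The partitioning of a fragment into source symbols can be sketched as below. The function name `source_symbols` and its arguments are hypothetical; the sketch assumes that the source segment is available as a byte string and that B(i-1) and B(i) are known, and it pads only the in-memory copy of the last symbol, leaving the stored segment untouched.

```python
def source_symbols(segment: bytes, b_prev: int, b_end: int, s: int) -> list[bytes]:
    """Split fragment bytes [b_prev, b_end) of a source segment into S-byte symbols.

    The list index of each symbol is its ESI. The last symbol is padded with
    zero bytes up to S bytes for FEC purposes only.
    """
    if s < 1 or not 0 <= b_prev <= b_end <= len(segment):
        raise ValueError("invalid symbol size or byte offsets")
    fragment = segment[b_prev:b_end]
    return [fragment[off:off + s].ljust(s, b"\x00")
            for off in range(0, len(fragment), s)]


if __name__ == "__main__":
    data = bytes(range(256)) * 4           # stand-in for a source segment
    symbols = source_symbols(data, 100, 300, 64)
    print(len(symbols), [len(sym) for sym in symbols])
```

The number of symbols returned equals ceiling((B(i) - B(i-1)) / S), matching the NSS(i) formula above.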

In this embodiment, the URL of a repair segment can be generated from the URL of the corresponding source segment by simply appending the suffix ".repair" to the URL of the source segment.

The repair indexing information and the FEC information for a repair segment are implicitly defined by the indexing information for the corresponding source segment and by the values of R and S, as described herein. The time offsets and the fragment structure of the repair segment are determined by the time offsets and structure of the corresponding source segment. The byte offset to the end of the repair symbols in the repair segment corresponding to fragment i can be computed as RB(i) = S * ceiling(R * B(i) / S). The number of bytes in the repair segment corresponding to fragment i is then RB(i) - RB(i-1), and thus the number of repair symbols corresponding to fragment i is computed as NRS(i) = (RB(i) - RB(i-1)) / S. The number of source symbols corresponding to fragment i can be computed as NSS(i) = ceiling((B(i) - B(i-1)) / S). Thus, in this embodiment, the repair indexing information for a repair block within a repair segment, and the corresponding FEC information, can be derived implicitly from R, S, and the indexing information for the corresponding fragment of the corresponding source segment.

As an example, consider Figure 28, which shows a fragment 2 that starts at byte offset B(1) = 6,410 and ends at byte offset B(2) = 6,770. In this example, the symbol size is S = 64 bytes, and the dotted vertical lines show the byte offsets within the source segment that correspond to multiples of S. The overall repair segment size, as a fraction of the source segment size, is set to R = 0.5 in this example. The number of source symbols in the source block for fragment 2 is computed as NSS(2) = ceiling((6,770 - 6,410) / 64) = ceiling(5.625) = 6, and these 6 source symbols have ESIs 0, ..., 5, respectively, where the first source symbol is the first 64 bytes of fragment 2, starting at byte index 6,410 within the source segment, the second source symbol is the next 64 bytes of fragment 2, starting at byte index 6,474 within the source segment, and so on. The ending byte offset of the repair block corresponding to fragment 2 is computed as RB(2) = 64 * ceiling(0.5 * 6,770 / 64) = 64 * ceiling(52.89...) = 64 * 53 = 3,392, and the starting byte offset of the repair block corresponding to fragment 2 is computed as RB(1) = 64 * ceiling(0.5 * 6,410 / 64) = 64 * ceiling(50.07...) = 64 * 51 = 3,264. Thus, in this example, the repair block corresponding to fragment 2 contains two repair symbols, with ESIs 6 and 7, respectively, starting at byte offset 3,264 within the repair segment and ending at byte offset 3,392.
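The calculation in this example can be reproduced with a short sketch. The function names `repair_block` and `repair_request`, the dictionary keys, and the URL `http://example.com/seg1` are hypothetical. The sketch takes R as a `Fraction` to avoid floating-point rounding inside the ceiling, and it assumes that the ending offset RB(i) is exclusive when forming an HTTP byte range, which the description does not state explicitly.

```python
from fractions import Fraction
from math import ceil


def repair_block(r: Fraction, s: int, b_prev: int, b_end: int) -> dict:
    """Locate the repair block for a fragment spanning source bytes [b_prev, b_end)."""
    rb_start = s * ceil(r * b_prev / s)   # RB(i-1)
    rb_end = s * ceil(r * b_end / s)      # RB(i)
    nss = ceil(Fraction(b_end - b_prev, s))
    nrs = (rb_end - rb_start) // s
    return {
        "rb_start": rb_start,
        "rb_end": rb_end,
        "nss": nss,
        "nrs": nrs,
        "repair_esis": list(range(nss, nss + nrs)),
    }


def repair_request(source_url: str, block: dict) -> tuple[str, str]:
    """Return the repair segment URL and a Range header value for the repair block."""
    # Assumption: rb_end is exclusive, so the last requested byte is rb_end - 1.
    return source_url + ".repair", f"bytes={block['rb_start']}-{block['rb_end'] - 1}"


if __name__ == "__main__":
    block = repair_block(Fraction(1, 2), 64, 6410, 6770)
    print(block)
    print(repair_request("http://example.com/seg1", block))
```

Only R, S, B(i-1), and B(i) are needed, which is why a client that starts in the middle of a source segment can find the matching repair block without reading the rest of the segment structure. A block with zero repair symbols would yield an empty range and should not be requested.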

Note that in the example shown in Figure 28, even though R = 0.5 and there are 6 source symbols corresponding to fragment 2, the number of repair symbols is not 3, as one might expect if the number of repair symbols were simply computed from the number of source symbols, but instead works out to 2 according to the method described herein. In contrast to simply using the number of source symbols of a fragment to determine the number of repair symbols, the embodiment described above makes it possible to compute the position of a repair block within the repair segment solely from the index information associated with the corresponding source block of the corresponding source segment. Furthermore, as the number K of source symbols in a source block grows, the number KR of repair symbols of the corresponding repair block is closely approximated by K * R, since in general KR is at most ceil(K * R) and at least floor((K-1) * R), where floor(x) is the largest integer that is at most x.

As those skilled in the art will recognize, there are many variations of the above embodiment for generating a file structure that can be used advantageously by a client employing a coordinated HTTP/FEC method. As an example of an alternative embodiment, an original segment of a representation may be partitioned into N > 1 parallel segments, where for i = 1, ..., N a specified fraction F_i of the original segment is contained in the i-th parallel segment, and where the sum of F_i over i = 1, ..., N is equal to 1. In this embodiment, there may be one master segment map that is used to derive the segment maps for all of the parallel segments, similar to how the repair segment map is derived from the source segment map in the embodiment described above. For example, the master segment map may indicate the fragment structure that would apply if the media data were not partitioned into parallel segments but were instead contained in one original segment, and the segment map for the i-th parallel segment can then be derived from the master segment map by computing that, if the amount of media data in a first prefix of fragments of the original segment is L bytes, then the total number of bytes of this prefix, in aggregate, among the first i parallel segments is ceil(L * G_i), where G_i is the sum of F_j over j = 1, ..., i. As another example of an alternative embodiment, a segment may consist of the combination of the original source media data for each fragment followed immediately by the repair data for that fragment, resulting in a segment that contains a mixture of source media data and repair data generated from that source media data using an FEC code. As another example of an alternative embodiment, a segment that contains a mixture of source media data and repair data may be partitioned into multiple parallel segments, each containing a mixture of source media data and repair data.
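The derivation of parallel segment maps from a master segment map can be sketched as follows. The function names are hypothetical, and the per-segment sizes in the second function are an inference, obtained by differencing consecutive aggregate values in the same way that repair block sizes are obtained from RB(i) - RB(i-1); the description itself states only the aggregate formula ceil(L * G_i).

```python
from fractions import Fraction
from math import ceil


def aggregate_prefix_sizes(prefix_len: int, fractions: list[Fraction]) -> list[int]:
    """For each i, the bytes of an L-byte prefix held in aggregate by the first
    i parallel segments, computed as ceil(L * G_i) with G_i = F_1 + ... + F_i."""
    if sum(fractions) != 1:
        raise ValueError("the fractions F_i must sum to 1")
    sizes = []
    g = Fraction(0)
    for f in fractions:
        g += f
        sizes.append(ceil(prefix_len * g))
    return sizes


def per_segment_prefix_sizes(prefix_len: int, fractions: list[Fraction]) -> list[int]:
    """Bytes of the prefix held by each individual parallel segment (inferred)."""
    agg = aggregate_prefix_sizes(prefix_len, fractions)
    return [agg[0]] + [b - a for a, b in zip(agg, agg[1:])]


if __name__ == "__main__":
    thirds = [Fraction(1, 3)] * 3
    print(aggregate_prefix_sizes(1000, thirds))
    print(per_segment_prefix_sizes(1000, thirds))
```

Because the last aggregate value is always exactly L, the per-segment sizes add up to the original prefix length.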

Methods for handling low-latency streaming

In some deployment scenarios, low-latency streaming of a live service may be desirable. For example, when content is distributed within the local venue of an event (such as a sports event or a concert), it is desirable for the delay between the live action and the presentation of the live service on the client to be as short as possible. For example, a maximum delay of 1 second may be desired.

As noted above, it may be advantageous to arrange for each file storing a segment of a media presentation to begin with a random access point (RAP). Some profiles (in particular the ISO base media file format live profile) require each media segment to begin with a RAP.

However, in environments where low end-to-end latency is required, the duration of each segment must be short in order to minimize the delay between the live action and the presentation of the live event on the client. It is desirable to avoid inserting a RAP into every segment that will be used for low-latency streaming. For example, a RAP in video is typically realized by an IDR frame. By avoiding the use of IDR frames within the short segments desired for low-latency streaming, coding efficiency can be improved.

According to an embodiment, both a live-profile-compliant representation and a low-latency representation of a media presentation are generated. The live-profile-compliant representation has a relatively large media segment duration, and each of its media segments has a RAP at the beginning of the media segment. The low-latency representation has relatively short segments (which may be referred to as "media fragments") that need not contain a RAP. A client that supports low-latency streaming may receive the media fragments generated for the low-latency representation of the media presentation, while a client that does not support low-latency streaming may be able to receive the media segments generated for the live-profile-compliant representation of the media presentation.

FIG. 30 illustrates the relationship between media segments and media fragments for low-latency streaming. Media segment 3002, generated for live profile streaming, includes RAP 3004 at the beginning of the media data ("mdat"). In contrast, among media fragments 3004, 3006, and 3008 generated for low-latency streaming, only media fragment 3004 contains a RAP.

The media fragments are generated on the fly and are made available for clients to download via HTTP. The media fragments can be accumulated into media segments that conform to the ISO base media file format live profile without any modification to the media fragments. For example, the media fragments can be concatenated into a media segment.

Both the media segments and the media fragments can be created using the same encoding process. In this way, media can be encoded efficiently for consumption both by clients operating in environments that require low end-to-end latency and by clients using protocols that require a RAP in each segment.

In some embodiments, a segment index (SIDX) is generated for each media fragment. The SIDX may include the presentation time range within the media segment and the corresponding byte range of the media segment that is occupied by the media fragment. In some embodiments, the SIDX indicates whether a RAP is present within the fragment. In FIG. 30, the "contains_RAP" field of the SIDX box of media fragment 3004 is set to 1, indicating that media fragment 3004 contains a RAP. The "contains_RAP" field of the SIDX boxes of media fragments 3006 and 3008 is set to 0, indicating that media fragments 3006 and 3008 do not contain a RAP. The SIDX may further indicate the presentation time of the first RAP within the fragment.
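The per-fragment index described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the binary SIDX box layout of any file format: the names `Fragment`, `IndexEntry`, and `build_segment` are hypothetical, and presentation times are plain seconds. The sketch concatenates unmodified fragments into one segment and records, for each fragment, its presentation time range, its byte range within the segment, and a RAP flag.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fragment:
    """A media fragment (hypothetical structure for illustration)."""
    data: bytes          # encoded bytes of the fragment
    start_time: float    # presentation start time, in seconds
    duration: float      # presentation duration, in seconds
    contains_rap: bool   # True if the fragment contains a RAP


@dataclass
class IndexEntry:
    """One per-fragment entry of a simplified segment index."""
    time_range: Tuple[float, float]  # (start, end) presentation time
    byte_range: Tuple[int, int]      # (first, last) byte offset in the segment
    contains_rap: int                # 1 if a RAP is present, else 0


def build_segment(fragments: List[Fragment]) -> Tuple[bytes, List[IndexEntry]]:
    """Concatenate fragments without modifying them and index each one."""
    index: List[IndexEntry] = []
    offset = 0
    for frag in fragments:
        last = offset + len(frag.data) - 1
        index.append(IndexEntry(
            time_range=(frag.start_time, frag.start_time + frag.duration),
            byte_range=(offset, last),
            contains_rap=1 if frag.contains_rap else 0,
        ))
        offset = last + 1
    return b"".join(f.data for f in fragments), index


# Three half-second fragments; only the first contains a RAP.
fragments = [
    Fragment(b"\x01" * 100, 0.0, 0.5, True),
    Fragment(b"\x02" * 80, 0.5, 0.5, False),
    Fragment(b"\x03" * 90, 1.0, 0.5, False),
]
segment, index = build_segment(fragments)
```

With an index of this shape, a client could request a single fragment by byte range and could tell from the flag whether decoding may start at that fragment.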

According to an embodiment, a media server may generate fragments for low-latency streaming and push the fragments into a cache. The cache may concatenate the fragments to produce media segments that are compatible with the live profile. After a media segment has been generated, the cache may flush the media fragments that were concatenated to produce that media segment.
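The cache behavior described above might look like the following sketch. It assumes, purely for illustration, that every media segment is built from a fixed number of fragments; the text does not state how the cache decides that a segment is complete. The class name `FragmentCache` and its members are hypothetical.

```python
from typing import Dict, List, Optional


class FragmentCache:
    """Accumulates pushed fragments and concatenates them into segments."""

    def __init__(self, fragments_per_segment: int) -> None:
        # Assumption: a segment is complete after a fixed number of fragments.
        self.fragments_per_segment = fragments_per_segment
        self.pending: List[bytes] = []       # fragments not yet in a segment
        self.segments: Dict[int, bytes] = {}  # completed segments by number
        self._next_segment = 0

    def push(self, fragment: bytes) -> Optional[int]:
        """Store a fragment; return the segment number if one was completed."""
        self.pending.append(fragment)
        if len(self.pending) < self.fragments_per_segment:
            return None
        number = self._next_segment
        # Concatenate the fragments unmodified to form the media segment.
        self.segments[number] = b"".join(self.pending)
        # Flush the fragments that were used to produce the segment.
        self.pending.clear()
        self._next_segment += 1
        return number


cache = FragmentCache(fragments_per_segment=3)
results = [cache.push(f) for f in (b"aa", b"bb", b"cc", b"dd")]
```

In this sketch the third push completes segment 0 and empties the pending list, so the fourth fragment starts the next segment.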

A single media presentation description (MPD) may store information about a first representation of the media presentation, which has live-profile-compliant media segments, and a second representation, which has media fragments for low-latency streaming. Time-shifted viewing can be provided by using the media segments for the time-shift buffer and using the media fragments to handle viewing at the near-live edge of the stream. A client can switch between the representations, for example by starting in the time-shift buffer and moving closer to the live edge by skipping chapters of the media presentation. Each representation in the MPD may be assigned an attribute to express the array of representations available for the single media presentation.
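A client-side choice between the two representations could be sketched as below. The switching rule is an assumption made for illustration only: the client uses the low-latency representation when its playback position is less than one media segment duration behind the live edge, on the reasoning that a complete media segment for that period may not yet exist. The function name and the representation labels are hypothetical.

```python
def choose_representation(playback_time: float,
                          live_edge_time: float,
                          segment_duration: float) -> str:
    """Pick a representation from the distance behind the live edge (seconds)."""
    behind = live_edge_time - playback_time
    if behind < 0:
        raise ValueError("playback position is ahead of the live edge")
    # Assumed rule: near the live edge, use short fragments;
    # further back, use the time-shift buffer of full media segments.
    if behind < segment_duration:
        return "low-latency"
    return "live-profile"


near_edge = choose_representation(100.0, 101.0, 4.0)
time_shifted = choose_representation(60.0, 101.0, 4.0)
```

A real client would likely also account for buffer level and download rate; those factors are left out here to keep the switching idea visible.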

In an MPD that stores information about a first representation having media segments and a second representation having media fragments, it may be advantageous to provide information indicating which media fragments of the second representation begin with a RAP. For example, the MPD may include an attribute that indicates the frequency at which RAPs occur within the plurality of media fragments. In one embodiment, the MPD includes an attribute that indicates the frequency as a number of fragments (i.e., every x-th media fragment contains a RAP). In another embodiment, the attribute indicates the frequency as the distance in time between adjacent RAPs.
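The two ways of expressing RAP frequency can be illustrated with a short sketch. It assumes that fragments are numbered from 0, that fragment 0 begins with a RAP, that all fragments have the same duration, and that times are integer milliseconds; none of these details come from the text, and the function names are hypothetical.

```python
from typing import List


def rap_fragments_by_count(num_fragments: int, every_x: int) -> List[int]:
    """Indices of fragments with a RAP when every x-th fragment has one."""
    return [i for i in range(num_fragments) if i % every_x == 0]


def rap_fragments_by_period(num_fragments: int,
                            fragment_duration_ms: int,
                            rap_period_ms: int) -> List[int]:
    """Indices of fragments that start on a RAP, given the RAP spacing in time."""
    return [i for i in range(num_fragments)
            if (i * fragment_duration_ms) % rap_period_ms == 0]


def next_rap_fragment(index: int, every_x: int) -> int:
    """Smallest fragment index >= index that contains a RAP."""
    return -(-index // every_x) * every_x  # ceiling division, then scale


by_count = rap_fragments_by_count(10, 4)
by_period = rap_fragments_by_period(8, 500, 2000)
join_point = next_rap_fragment(5, 4)
```

A client joining the low-latency representation mid-stream could use such a calculation to find the first fragment at which decoding can begin, without downloading fragments just to inspect them.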

Alternatively, information about the media fragments may be stored in a first MPD, and information about the media segments may be stored in a second MPD.

In some embodiments, the MPD may signal specific parameters that apply to a specific representation, such as the maximum duration of the media segments or media fragments of that representation.

Further embodiments can be envisioned by one of ordinary skill in the art after reading this disclosure. In other embodiments, combinations or sub-combinations of the invention disclosed above can be advantageously made. The example arrangements of components are shown for purposes of illustration, and it should be understood that combinations, additions, rearrangements, and the like are contemplated in alternative embodiments of the present invention. Thus, while the invention has been described with respect to exemplary embodiments, one skilled in the art will recognize that numerous modifications are possible.

For example, the processes described herein may be implemented using hardware components, software components, and/or any combination thereof. In such cases, the software components may be provided on tangible, non-transitory media for execution on hardware that is provided with the media or is separate from the media. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims, and that the invention is intended to cover all modifications and equivalents within the scope of the appended claims.

Claims

A method for constructing content to be served using a media server, comprising the steps of: obtaining the content to be served; generating a plurality of media segments representing the content and encoded according to an encoding protocol, the encoding protocol including encoding one or more frames of a media presentation into each media segment, wherein a random access point is available in each media segment; generating a plurality of media fragments encoded according to the encoding protocol, wherein a media segment includes the plurality of media fragments, and wherein at least some of the plurality of media fragments include random access points and at least some of the media fragments do not include random access points, a random access point including a position in a fragment at which a decoder can decode the fragments following the random access point without being affected by fragments preceding the random access point; and generating a segment index for the media segment, the segment index including: a presentation time range of each media fragment within the media segment, a corresponding byte range of the media segment occupied by each media fragment, and a random access point presence indicator, the random access point presence indicator indicating whether a random access point is present within each media fragment.

The method of claim 1, wherein the media segment is generated by concatenating the plurality of media fragments.

The method of claim 2, further comprising the step of: generating the media segment in a cache memory, wherein the plurality of media fragments used to generate the media segment are flushed from the cache memory after the media segment is generated in the cache memory.

The method of claim 1, further comprising the step of: generating a single media presentation description (MPD) file, the MPD file storing information about a first representation of the media presentation that includes the plurality of media segments and about a second representation of the media presentation that includes the plurality of media fragments.

The method of claim 4, wherein the MPD includes an attribute to indicate a frequency at which a random access point occurs within the second representation.

The method of claim 5, wherein the frequency is a time period.

The method of claim 5, wherein the frequency is a number of media fragments.

A media server, comprising: a processor configured to: obtain the content to be served; generate a plurality of media segments representing the content and encoded according to an encoding protocol, the encoding protocol including encoding one or more frames of a media presentation into each media segment, wherein a random access point is available in each media segment; generate a plurality of media fragments encoded according to the encoding protocol, wherein a media segment includes the plurality of media fragments, and wherein at least some of the plurality of media fragments include random access points and at least some of the media fragments do not include random access points, a random access point including a position in a fragment at which a decoder can decode the fragments following the random access point without being affected by fragments preceding the random access point; and generate a segment index for the media segment, the segment index including: a presentation time range of each media fragment within the media segment, a corresponding byte range of the media segment occupied by each media fragment, and a random access point presence indicator, the random access point presence indicator indicating whether a random access point is present within each media fragment.

The media server of claim 8, wherein the media segment is generated by concatenating the plurality of media fragments.

The media server of claim 9, wherein the processor is further configured to: generate the media segment in a cache memory, wherein the plurality of media fragments used to generate the media segment are flushed from the cache memory once the media segment has been generated in the cache memory.

The media server of claim 8, wherein the processor is further configured to: generate a single media presentation description (MPD) file, the MPD file storing information about a first representation of the media presentation that includes the plurality of media segments and about a second representation of the media presentation that includes the plurality of media fragments.

The media server of claim 11, wherein the MPD includes an attribute to indicate a frequency at which a random access point occurs within the second representation.

The media server as recited in claim 12, wherein the frequency is a time period.

The media server as recited in claim 13, wherein the frequency is a number of media fragments.

A non-transitory computer-readable medium storing computer-executable instructions that, when executed, cause one or more computing devices to: obtain the content to be served; generate a plurality of media segments representing the content and encoded according to an encoding protocol, the encoding protocol including encoding one or more frames of a media presentation into each media segment, wherein a random access point is available in each media segment; generate a plurality of media fragments encoded according to the encoding protocol, wherein a media segment includes the plurality of media fragments, and wherein at least some of the plurality of media fragments include random access points and at least some of the media fragments do not include random access points, a random access point including a position in a fragment at which a decoder can decode the fragments following the random access point without being affected by fragments preceding the random access point; and generate a segment index for the media segment, the segment index including: a presentation time range of each media fragment within the media segment, a corresponding byte range of the media segment occupied by each media fragment, and a random access point presence indicator, the random access point presence indicator indicating whether a random access point is present within each media fragment.