TW202240431A - Object collision data for virtual camera in virtual interactive scene defined by streamed media data - Google Patents
- Publication number: TW202240431A
- Application number: TW111108833A
- Authority: TW (Taiwan)
- Prior art keywords: data, virtual, camera, collision, solid object
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/2387—Stream processing in response to a playback request from an end-user, e.g. for trick-play
- H04N21/44012—Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
- G06T19/003—Navigation within 3D models or images (G—PHYSICS; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL)
- H04N21/21805—Source of audio or video content enabling multiple viewpoints, e.g. using a plurality of cameras
- H04N21/234318—Reformatting operations of video signals by decomposing into objects, e.g. MPEG-4 objects
- H04N21/4431—OS processes characterized by the use of Application Program Interface [API] libraries
- H04N21/4728—End-user interface for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
- H04N21/6587—Control parameters, e.g. trick play commands, viewpoint selection
- H04N21/816—Monomedia components involving special video data, e.g. 3D video
- H04N21/85406—Content authoring involving a specific file format, e.g. MP4 format
- H04N21/8541—Content authoring involving branching, e.g. to different story endings
Abstract
Description
This application claims the benefit of U.S. Provisional Application No. 63/159,379, filed March 10, 2021, the entire contents of which are hereby incorporated by reference.
This disclosure relates to the storage and transport of encoded video data.
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video teleconferencing devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, or ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265 (also referred to as High Efficiency Video Coding (HEVC)), and extensions of such standards, to transmit and receive digital video information more efficiently.
Video compression techniques perform spatial prediction and/or temporal prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video frame or slice may be partitioned into macroblocks. Each macroblock can be further partitioned. Macroblocks in an intra-coded (I) frame or slice are encoded using spatial prediction with respect to neighboring macroblocks. Macroblocks in an inter-coded (P or B) frame or slice may use spatial prediction with respect to neighboring macroblocks in the same frame or slice, or temporal prediction with respect to other reference frames.
After video data has been encoded, the video data may be packetized for transmission or storage. The video data may be assembled into a video file conforming to any of a variety of standards, such as the International Organization for Standardization (ISO) base media file format and extensions thereof, such as AVC.
In general, this disclosure describes techniques related to streaming interactive media data. Such interactive media data may be, for example, virtual reality, augmented reality, or other such interactive content, such as other three-dimensional video content. The latest MPEG scene description elements include support for timed media in glTF 2.0. A media access function (MAF) offers an application programming interface (API) to a presentation engine, through which the presentation engine can request timed media. A retrieval unit executing the MAF may process retrieved timed media data and pass the processed media data in a desired format to the presentation engine through circular buffers. The current MPEG scene description allows a user to consume scene media data with six degrees of freedom (6DoF). Thus, the user is generally able to move freely within the 3D scene (e.g., through a wall displayed in the 3D scene). However, a content author may wish to impose restrictions on a viewer's movement in certain areas, for example, to prevent movement through a displayed wall or other object. This disclosure describes techniques for imposing such restrictions, which may improve the user's experience by making the experience more realistic, because the user can be prevented from passing through obstacles in the virtual world.
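The hand-off between a MAF and a presentation engine through a circular buffer can be sketched minimally as follows. The class, method names, and frame format here are illustrative assumptions, not part of any MPEG-defined API:

```python
from collections import deque
from threading import Condition

class CircularFrameBuffer:
    """Hypothetical ring buffer through which a media access function (MAF)
    could hand processed media frames to a presentation engine."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._frames = deque()
        self._cond = Condition()

    def push(self, frame) -> None:
        # Producer side (MAF): evict the oldest frame when full, so the
        # presentation engine always sees the freshest media data.
        with self._cond:
            if len(self._frames) == self.capacity:
                self._frames.popleft()
            self._frames.append(frame)
            self._cond.notify()

    def pop(self, timeout: float = 1.0):
        # Consumer side (presentation engine): block briefly for a frame.
        with self._cond:
            if not self._frames:
                self._cond.wait(timeout)
            return self._frames.popleft() if self._frames else None

buf = CircularFrameBuffer(capacity=2)
buf.push("frame-0")
buf.push("frame-1")
buf.push("frame-2")   # capacity reached: evicts frame-0
print(buf.pop())      # frame-1
```

A real implementation would carry decoded texture or audio buffers rather than strings, but the producer/consumer discipline is the same.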
In one example, a method of retrieving media data includes: receiving, by a presentation engine, streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the presentation engine, camera control data for the three-dimensional scene, the camera control data including data defining restrictions to prevent a virtual camera from passing through the at least one virtual solid object; receiving, by the presentation engine, camera movement data from a user, the camera movement data requesting that the virtual camera move through the at least one virtual solid object; and preventing, by the presentation engine, using the camera control data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
In another example, a device for retrieving media data includes: a memory configured to store media data; and one or more processors implemented in circuitry and configured to execute a presentation engine, the presentation engine being configured to: receive streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receive camera control data for the three-dimensional scene, the camera control data including data defining restrictions to prevent a virtual camera from passing through the at least one virtual solid object; receive camera movement data from a user, the camera movement data requesting that the virtual camera move through the at least one virtual solid object; and prevent, using the camera control data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
In another example, a computer-readable storage medium has stored thereon instructions that, when executed, cause a processor of a client device to: receive streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receive camera control data for the three-dimensional scene, the camera control data including data defining restrictions to prevent a virtual camera from passing through the at least one virtual solid object; receive camera movement data from a user, the camera movement data requesting that the virtual camera move through the at least one virtual solid object; and prevent, using the camera control data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
In another example, a device for retrieving media data includes: means for receiving streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; means for receiving camera control data for the three-dimensional scene, the camera control data including data defining restrictions to prevent a virtual camera from passing through the at least one virtual solid object; means for receiving camera movement data from a user, the camera movement data requesting that the virtual camera move through the at least one virtual solid object; and means for preventing, using the camera control data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
In another example, a method of retrieving media data includes: receiving, by a presentation engine, streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the presentation engine, object collision data representing boundaries of the at least one virtual solid object; receiving, by the presentation engine, camera movement data from a user, the camera movement data requesting that a virtual camera move through the at least one virtual solid object; and preventing, by the presentation engine, using the object collision data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
In another example, a device for retrieving media data includes: a memory configured to store media data; and one or more processors implemented in circuitry and configured to execute a presentation engine, the presentation engine being configured to: receive streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receive object collision data representing boundaries of the at least one virtual solid object; receive camera movement data from a user, the camera movement data requesting that a virtual camera move through the at least one virtual solid object; and prevent, using the object collision data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
In another example, a computer-readable storage medium has stored thereon instructions that, when executed, cause a processor of a client device to: receive streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receive object collision data representing boundaries of the at least one virtual solid object; receive camera movement data from a user, the camera movement data requesting that a virtual camera move through the at least one virtual solid object; and prevent, using the object collision data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
In another example, a device for retrieving media data includes: means for receiving streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; means for receiving object collision data representing boundaries of the at least one virtual solid object; means for receiving camera movement data from a user, the camera movement data requesting that a virtual camera move through the at least one virtual solid object; and means for preventing, using the object collision data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.
Interactive media data may be streamed over a network. For example, a client device may retrieve interactive media data using unicast, broadcast, multicast, or the like. Interactive media data may be, for example, three-dimensional (3D) media data for extended reality (XR), augmented reality (AR), virtual reality (VR), and the like. Thus, when the media data is presented to a user, the user can navigate a 3D virtual scene rendered from the interactive media data.
An MPEG scene description may describe a three-dimensional (3D) scene for a virtual world or experience, e.g., for an XR, VR, AR, or other interactive media experience. According to the techniques of this disclosure, the MPEG scene description may describe objects within the 3D scene, such as chairs, walls, tables, counters, doors, windows, or other solid objects. This disclosure describes techniques by which an MPEG scene description (or other such set of description data) may be enhanced to impose restrictions on movement of a virtual camera, e.g., to prevent the camera from passing through solid objects such as walls.
In particular, the scene description may describe a set of paths along which camera movement is allowed. A path may be described as a set of anchor points connected by path segments. To enhance the expressiveness of the camera control, each path segment may be augmented with a bounding volume that allows a certain freedom of movement along the path.
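A presentation engine enforcing such a path could clamp each requested camera position to the nearest allowed position. The sketch below models each path segment as a pair of anchor points with a radius standing in for the bounding volume; this data layout and the geometry are illustrative assumptions, not something defined by the scene description format:

```python
import math

def closest_point_on_segment(a, b, p):
    """Project p onto segment ab, clamped to the segment's endpoints."""
    ab = tuple(bc - ac for ac, bc in zip(a, b))
    ap = tuple(pc - ac for ac, pc in zip(a, p))
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(u * v for u, v in zip(ab, ap)) / denom))
    return tuple(ac + abc * t for ac, abc in zip(a, ab))

def clamp_camera_to_path(segments, requested):
    """Return the allowed camera position nearest to the requested one.

    Each segment is (anchor_a, anchor_b, radius); the radius plays the role
    of the bounding volume permitting some freedom around the segment.
    """
    best, best_excess = None, math.inf
    for a, b, radius in segments:
        q = closest_point_on_segment(a, b, requested)
        d = math.dist(q, requested)
        if d <= radius:          # request already inside this bounding volume
            return requested
        # Otherwise pull the camera back to the surface of the bounding volume.
        scale = radius / d
        candidate = tuple(qc + (rc - qc) * scale for qc, rc in zip(q, requested))
        if d - radius < best_excess:
            best, best_excess = candidate, d - radius
    return best

path = [((0, 0, 0), (10, 0, 0), 1.0)]          # one corridor-like segment
print(clamp_camera_to_path(path, (5, 0.5, 0))) # inside volume: unchanged
print(clamp_camera_to_path(path, (5, 3, 0)))   # pulled back onto the volume
```

A production engine would use the actual bounding-volume shapes signaled in the scene description rather than a simple radius, but the clamping idea is the same.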
Additionally or alternatively, the scene description may describe virtual solid objects in the scene. The scene description may provide information representing, for example, boundaries of an object, whether the object can be affected by a collision with the user or another object (e.g., whether the object moves or remains static in response to such a collision), a material for the object representing how colliding objects interact with the object, and/or animation data representing an animation to be played or applied to the object in response to a collision.
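The kinds of per-object collision properties just listed can be sketched as a small data structure together with a move-resolution check. The field names, the axis-aligned bounding box, and the resolution policy are assumptions made for illustration only:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SolidObject:
    """Illustrative per-object collision data, loosely modeled on the
    properties described above; field names are hypothetical."""
    name: str
    aabb_min: tuple                 # bounding-box corner (x, y, z)
    aabb_max: tuple
    static: bool = True             # True: object does not move on collision
    material: str = "default"       # how colliding objects interact with it
    collision_animation: Optional[str] = None  # animation triggered on impact

def camera_collides(obj: SolidObject, camera_pos: tuple) -> bool:
    """True when the camera position lies inside the object's bounding box."""
    return all(lo <= c <= hi
               for lo, c, hi in zip(obj.aabb_min, camera_pos, obj.aabb_max))

def resolve_camera_move(objects, current, requested):
    """Refuse any move that would place the camera inside a solid object."""
    for obj in objects:
        if camera_collides(obj, requested):
            return current, obj     # stay put and report what was hit
    return requested, None

wall = SolidObject("wall", (4, 0, -5), (5, 3, 5), collision_animation="shake")
pos, hit = resolve_camera_move([wall], (0, 1, 0), (4.5, 1, 0))
print(pos, hit.name if hit else None)   # (0, 1, 0) wall
```

On a reported hit, the engine could then consult `material` and `collision_animation` to decide what feedback to render.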
The techniques of this disclosure may be applied to video files conforming to video data encapsulated according to any of the ISO base media file format, the Scalable Video Coding (SVC) file format, the Advanced Video Coding (AVC) file format, the Third Generation Partnership Project (3GPP) file format, and/or the Multiview Video Coding (MVC) file format, or other similar video file formats.
In HTTP streaming, frequently used operations include HEAD, GET, and partial GET. The HEAD operation retrieves a header of a file associated with a given uniform resource locator (URL) or uniform resource name (URN), without retrieving a payload associated with the URL or URN. The GET operation retrieves a whole file associated with a given URL or URN. The partial GET operation receives a byte range as an input parameter and retrieves a continuous number of bytes of a file, where the number of bytes corresponds to the received byte range. Thus, movie fragments may be provided for HTTP streaming, because a partial GET operation can obtain one or more individual movie fragments. Within a movie fragment, there can be several track fragments of different tracks. In HTTP streaming, a media presentation may be a structured collection of data that is accessible to the client. The client may request and download media data information to present a streaming service to a user.
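The semantics of these three operations can be mimicked against an in-memory file store. This toy dispatcher is only an illustration of the request/response shapes (a real server would also return status codes such as 206 Partial Content for range requests); the function and variable names are assumptions:

```python
def handle_request(files, method, url, byte_range=None):
    """Toy dispatcher mirroring HEAD / GET / partial GET semantics.

    `files` maps a URL to its payload bytes. `byte_range` is an inclusive
    (start, end) pair, like an HTTP Range header's byte offsets.
    """
    payload = files[url]
    headers = {"Content-Length": len(payload)}
    if method == "HEAD":
        return headers, b""                 # headers only, no payload
    if method == "GET" and byte_range is None:
        return headers, payload             # the whole file
    if method == "GET":
        start, end = byte_range
        return headers, payload[start:end + 1]
    raise ValueError(f"unsupported method {method!r}")

files = {"/video.mp4": b"moov-box-then-fragments"}
print(handle_request(files, "HEAD", "/video.mp4")[1])         # b""
print(handle_request(files, "GET", "/video.mp4", (0, 3))[1])  # b"moov"
```

In practice a DASH client issues the partial GET so that one request fetches exactly one movie fragment (or one index box) out of a larger file.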
In the example of streaming 3GPP data using HTTP streaming, there may be multiple representations for video and/or audio data of multimedia content. As explained below, different representations may correspond to different coding characteristics (e.g., different profiles or levels of a video coding standard), different coding standards or extensions of coding standards (such as multiview and/or scalable extensions), or different bitrates. The manifest of such representations may be defined in a media presentation description (MPD) data structure. A media presentation may correspond to a structured collection of data that is accessible to an HTTP streaming client device. The HTTP streaming client device may request and download media data information to present a streaming service to a user of the client device. A media presentation may be described in the MPD data structure, which may include updates of the MPD.
A media presentation may contain a sequence of one or more periods. Each period may extend until the start of the next period, or until the end of the media presentation, in the case of the last period. Each period may contain one or more representations for the same media content. A representation may be one of a number of alternative encoded versions of audio, video, timed text, or other such data. The representations may differ by encoding type, e.g., by bitrate, resolution, and/or codec for video data, and by bitrate, language, and/or codec for audio data. The term representation may be used to refer to a section of encoded audio or video data corresponding to a particular period of the multimedia content and encoded in a particular way.
Representations of a particular period may be assigned to a group indicated by an attribute in the MPD indicative of an adaptation set to which the representations belong. Representations in the same adaptation set are generally considered alternatives to each other, in that a client device can dynamically and seamlessly switch between these representations, e.g., to perform bandwidth adaptation. For example, each representation of video data for a particular period may be assigned to the same adaptation set, such that any of the representations may be selected for decoding to present media data, such as video data or audio data, of the multimedia content for the corresponding period. In some examples, the media content within one period may be represented by either one representation from group 0, if present, or the combination of at most one representation from each non-zero group. Timing data for each representation of a period may be expressed relative to the start time of the period.
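The bandwidth-adaptation switch described above can be sketched as a selection over an adaptation set: pick the highest-bitrate representation that fits the measured throughput, falling back to the lowest one when nothing fits. The dictionary layout is a simplification of the MPD's Representation element, assumed here for illustration:

```python
def select_representation(adaptation_set, available_bps):
    """Pick the highest-bitrate representation affordable at the measured
    bandwidth; fall back to the lowest-bitrate one if none fits."""
    affordable = [r for r in adaptation_set if r["bandwidth"] <= available_bps]
    if affordable:
        return max(affordable, key=lambda r: r["bandwidth"])
    return min(adaptation_set, key=lambda r: r["bandwidth"])

video_set = [
    {"id": "v-360p",  "bandwidth": 1_000_000},
    {"id": "v-720p",  "bandwidth": 3_000_000},
    {"id": "v-1080p", "bandwidth": 6_000_000},
]
print(select_representation(video_set, 4_000_000)["id"])  # v-720p
print(select_representation(video_set, 500_000)["id"])    # v-360p
```

Because representations in one adaptation set are time-aligned alternatives, the client can re-run this selection per segment without interrupting playback.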
A representation may include one or more segments. Each representation may include an initialization segment, or each segment of a representation may be self-initializing. When present, the initialization segment may contain initialization information for accessing the representation. In general, the initialization segment does not contain media data. A segment may be uniquely referenced by an identifier, such as a uniform resource locator (URL), uniform resource name (URN), or uniform resource identifier (URI). The MPD may provide identifiers for each segment. In some examples, the MPD may also provide byte ranges in the form of a range attribute, which may correspond to the data for a segment within a file accessible by the URL, URN, or URI.
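As a rough illustration of how a client might use such a byte range, the sketch below builds an HTTP partial GET request (a GET with a Range header) for a segment. The URL and byte offsets are invented for illustration; a real client would read them from the MPD.

```python
from urllib.request import Request

def build_partial_get(url: str, first_byte: int, last_byte: int) -> Request:
    """Build an HTTP partial GET for one segment.

    The URL would come from the manifest; the byte range from its range attribute.
    """
    req = Request(url)
    req.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return req

# Hypothetical segment URL and byte range, for illustration only.
req = build_partial_get("http://example.com/rep1/seg3.m4s", 0, 4095)
```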
Different representations may be selected for substantially simultaneous retrieval of different types of media data. For example, a client device may select an audio representation, a video representation, and a timed text representation from which to retrieve segments. In some examples, the client device may select particular adaptation sets to perform bandwidth adaptation. That is, the client device may select an adaptation set including video representations, an adaptation set including audio representations, and/or an adaptation set including timed text. Alternatively, the client device may select adaptation sets for certain types of media (e.g., video) and directly select representations for other types of media (e.g., audio and/or timed text).
FIG. 1 is a block diagram illustrating an example system 10 that implements techniques for streaming media data over a network. In this example, system 10 includes content preparation device 20, server device 60, and client device 40. Client device 40 and server device 60 are communicatively coupled by network 74, which may comprise the Internet. In some examples, content preparation device 20 and server device 60 may also be coupled by network 74 or another network, or may be directly communicatively coupled. In some examples, content preparation device 20 and server device 60 may comprise the same device.
In the example of FIG. 1, content preparation device 20 comprises audio source 22 and video source 24. Audio source 22 may comprise, for example, a microphone that produces electrical signals representative of captured audio data to be encoded by audio encoder 26. Alternatively, audio source 22 may comprise a storage medium storing previously recorded audio data, an audio data generator such as a computerized synthesizer, or any other source of audio data. Video source 24 may comprise a video camera that produces video data to be encoded by video encoder 28, a storage medium encoded with previously recorded video data, a video data generating unit such as a computer graphics source, or any other source of video data. Content preparation device 20 is not necessarily communicatively coupled to server device 60 in all examples, but may store multimedia content to a separate medium that is read by server device 60.
Raw audio and video data may comprise analog or digital data. Analog data may be digitized before being encoded by audio encoder 26 and/or video encoder 28. Audio source 22 may obtain audio data from a speaking participant while the speaking participant is speaking, and video source 24 may simultaneously obtain video data of the speaking participant. In other examples, audio source 22 may comprise a computer-readable storage medium comprising stored audio data, and video source 24 may comprise a computer-readable storage medium comprising stored video data. In this manner, the techniques described in this disclosure may be applied to live, streaming, real-time audio and video data or to archived, pre-recorded audio and video data.
An audio frame that corresponds to a video frame is generally an audio frame containing audio data that was captured (or generated) by audio source 22 contemporaneously with video data captured (or generated) by video source 24 that is contained within the video frame. For example, while a speaking participant generally produces audio data by speaking, audio source 22 captures the audio data, and video source 24 captures video data of the speaking participant at the same time, that is, while audio source 22 is capturing the audio data. Hence, an audio frame may temporally correspond to one or more particular video frames. Accordingly, an audio frame corresponding to a video frame generally corresponds to a situation in which audio data and video data were captured at the same time, and for which the audio frame and the video frame comprise, respectively, the audio data and the video data that were captured at the same time.
In some examples, audio encoder 26 may encode a timestamp in each encoded audio frame that represents a time at which the audio data for the encoded audio frame was recorded, and similarly, video encoder 28 may encode a timestamp in each encoded video frame that represents a time at which the video data for the encoded video frame was recorded. In such examples, an audio frame corresponding to a video frame may comprise an audio frame comprising a timestamp and a video frame comprising the same timestamp. Content preparation device 20 may include an internal clock from which audio encoder 26 and/or video encoder 28 may generate the timestamps, or that audio source 22 and video source 24 may use to associate audio and video data, respectively, with a timestamp.
In some examples, audio source 22 may send data corresponding to a time at which audio data was recorded to audio encoder 26, and video source 24 may send data corresponding to a time at which video data was recorded to video encoder 28. In some examples, audio encoder 26 may encode a sequence identifier in encoded audio data to indicate a relative temporal ordering of the encoded audio data, but without necessarily indicating an absolute time at which the audio data was recorded, and similarly, video encoder 28 may also use sequence identifiers to indicate a relative temporal ordering of encoded video data. Similarly, in some examples, a sequence identifier may be mapped or otherwise correlated with a timestamp.
Audio encoder 26 generally produces a stream of encoded audio data, while video encoder 28 produces a stream of encoded video data. Each individual stream of data (whether audio or video) may be referred to as an elementary stream. An elementary stream is a single, digitally coded (possibly compressed) component of a representation. For example, the coded video or audio part of the representation can be an elementary stream. An elementary stream may be converted into a packetized elementary stream (PES) before being encapsulated within a video file. Within the same representation, a stream ID may be used to distinguish the PES packets belonging to one elementary stream from those belonging to another. The basic unit of data of an elementary stream is a packetized elementary stream (PES) packet. Thus, coded video data generally corresponds to elementary video streams. Similarly, audio data corresponds to one or more respective elementary streams.
Many video coding standards, such as ITU-T H.264/AVC and the upcoming High Efficiency Video Coding (HEVC) standard, define the syntax, semantics, and decoding process for error-free bitstreams, any of which conform to a certain profile or level. Video coding standards typically do not specify the encoder, but the encoder is tasked with guaranteeing that the generated bitstreams are standard-compliant for a decoder. In the context of video coding standards, a "profile" corresponds to a subset of algorithms, features, or tools and constraints that apply to them. As defined by the H.264 standard, for example, a "profile" is a subset of the entire bitstream syntax that is specified by the H.264 standard. A "level" corresponds to limitations of decoder resource consumption, such as, for example, decoder memory and computation, which are related to the resolution of the pictures, the bit rate, and the block processing rate. A profile may be signaled with a profile_idc (profile indicator) value, while a level may be signaled with a level_idc (level indicator) value.
The H.264 standard, for example, recognizes that, within the bounds imposed by the syntax of a given profile, it is still possible to require a large variation in the performance of encoders and decoders, depending upon the values taken by syntax elements in the bitstream, such as the specified size of the decoded pictures. The H.264 standard further recognizes that, in many applications, it is neither practical nor economical to implement a decoder capable of dealing with all hypothetical uses of the syntax within a particular profile. Accordingly, the H.264 standard defines a "level" as a specified set of constraints imposed on values of the syntax elements in the bitstream. These constraints may be simple limits on values. Alternatively, the constraints may take the form of constraints on arithmetic combinations of values (e.g., picture width multiplied by picture height multiplied by the number of pictures decoded per second). The H.264 standard further provides that individual implementations may support a different level for each supported profile.
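To make the "arithmetic combination" form of constraint concrete, the sketch below checks whether width × height × pictures-per-second fits under a sample-rate budget. The numeric limit is an assumed, illustrative value, not one taken from the H.264 specification.

```python
# Assumed illustrative budget (luma samples per second); not an H.264 value.
MAX_SAMPLES_PER_SECOND = 62_914_560

def fits_level(width: int, height: int, fps: float,
               limit: int = MAX_SAMPLES_PER_SECOND) -> bool:
    """Level-style check on an arithmetic combination of values:
    picture width x picture height x pictures decoded per second."""
    return width * height * fps <= limit
```

A decoder could apply a family of such checks (one per constrained quantity) to decide whether it can decode a given bitstream.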
A decoder conforming to a profile ordinarily supports all the features defined in the profile. For example, as a coding feature, B-picture coding is not supported in the baseline profile of H.264/AVC but is supported in other profiles of H.264/AVC. A decoder conforming to a level should be capable of decoding any bitstream that does not require resources beyond the limitations defined in the level. Definitions of profiles and levels may be helpful for interoperability. For example, during video transmission, a pair of profile and level definitions may be negotiated and agreed for a whole transmission session. More specifically, in H.264/AVC, a level may define limitations on the number of macroblocks that need to be processed, decoded picture buffer (DPB) size, coded picture buffer (CPB) size, vertical motion vector range, maximum number of motion vectors per two consecutive MBs, and whether a B-block can have sub-macroblock partitions smaller than 8x8 pixels. In this manner, a decoder may determine whether the decoder is capable of properly decoding the bitstream.
In the example of FIG. 1, encapsulation unit 30 of content preparation device 20 receives elementary streams comprising coded video data from video encoder 28 and elementary streams comprising coded audio data from audio encoder 26. In some examples, video encoder 28 and audio encoder 26 may each include packetizers for forming PES packets from encoded data. In other examples, video encoder 28 and audio encoder 26 may each interface with respective packetizers for forming PES packets from encoded data. In still other examples, encapsulation unit 30 may include packetizers for forming PES packets from encoded audio and video data.
Video encoder 28 may encode video data of multimedia content in a variety of ways, to produce different representations of the multimedia content at various bit rates and with various characteristics, such as pixel resolutions, frame rates, conformance to various coding standards, conformance to various profiles and/or levels of profiles for various coding standards, representations having one or multiple views (e.g., for two-dimensional or three-dimensional playback), or other such characteristics. A representation, as used in this disclosure, may comprise one of audio data, video data, text data (e.g., for closed captions), or other such data. A representation may include an elementary stream, such as an audio elementary stream or a video elementary stream. Each PES packet may include a stream_id that identifies the elementary stream to which the PES packet belongs. Encapsulation unit 30 is responsible for assembling elementary streams into video files (e.g., segments) of the various representations.
Encapsulation unit 30 receives PES packets for elementary streams of a representation from audio encoder 26 and video encoder 28 and forms corresponding network abstraction layer (NAL) units from the PES packets. Coded video segments may be organized into NAL units, which provide a "network-friendly" video representation addressing applications such as video telephony, storage, broadcast, or streaming. NAL units can be categorized as video coding layer (VCL) NAL units and non-VCL NAL units. VCL units may contain the core compression engine and may include block, macroblock, and/or slice level data. Other NAL units may be non-VCL NAL units. In some examples, a coded picture in one time instance, normally presented as a primary coded picture, may be contained in an access unit, which may include one or more NAL units.
Non-VCL NAL units may include parameter set NAL units and SEI NAL units, among others. Parameter sets may contain sequence-level header information (in sequence parameter sets (SPS)) and infrequently changing picture-level header information (in picture parameter sets (PPS)). With parameter sets (e.g., PPS and SPS), infrequently changing information need not be repeated for each sequence or picture; hence, coding efficiency may be improved. Furthermore, the use of parameter sets may enable out-of-band transmission of important header information, avoiding the need for redundant transmissions for error resilience. In out-of-band transmission examples, parameter set NAL units may be transmitted on a different channel than other NAL units, such as SEI NAL units.
Supplemental enhancement information (SEI) may contain information that is not necessary for decoding the coded picture samples from VCL NAL units, but may assist in processes related to decoding, display, error resilience, and other purposes. SEI messages may be contained in non-VCL NAL units. SEI messages are a normative part of some standard specifications, and thus are not always mandatory for standard-compliant decoder implementations. SEI messages may be sequence-level SEI messages or picture-level SEI messages. Some sequence-level information may be contained in SEI messages, such as scalability information SEI messages in the example of SVC and view scalability information SEI messages in MVC. These example SEI messages may convey information on, e.g., extraction of operation points and characteristics of the operation points. In addition, encapsulation unit 30 may form a manifest file, such as a media presentation descriptor (MPD) that describes characteristics of the representations. Encapsulation unit 30 may format the MPD according to extensible markup language (XML).
Encapsulation unit 30 may provide data for one or more representations of multimedia content, along with the manifest file (e.g., the MPD), to output interface 32. Output interface 32 may comprise a network interface or an interface for writing to a storage medium, such as a universal serial bus (USB) interface, a CD or DVD writer or burner, an interface to magnetic or flash storage media, or other interfaces for storing or transmitting media data. Encapsulation unit 30 may provide data of each of the representations of multimedia content to output interface 32, which may send the data to server device 60 via network transmission or storage media. In the example of FIG. 1, server device 60 includes storage medium 62 that stores various multimedia contents 64, each including a respective manifest file 66 and one or more representations 68A-68N (representations 68). In some examples, output interface 32 may also send data directly to network 74.
In some examples, representations 68 may be separated into adaptation sets. That is, various subsets of representations 68 may include respective common sets of characteristics, such as codec, profile and level, resolution, number of views, file format for segments, text type information that may identify a language or other characteristics of text to be displayed with the representation and/or audio data to be decoded and presented, e.g., by speakers, camera angle information that may describe a camera angle or real-world camera perspective of a scene for representations in the adaptation set, rating information that describes content suitability for particular audiences, and the like.
Manifest file 66 may include data indicative of the subsets of representations 68 corresponding to particular adaptation sets, as well as common characteristics for the adaptation sets. Manifest file 66 may also include data representative of individual characteristics, such as bit rates, for individual representations of the adaptation sets. In this manner, an adaptation set may provide for simplified network bandwidth adaptation. Representations in an adaptation set may be indicated using child elements of an adaptation set element of manifest file 66.
Server device 60 includes request processing unit 70 and network interface 72. In some examples, server device 60 may include a plurality of network interfaces. Furthermore, any or all of the features of server device 60 may be implemented on other devices of a content delivery network, such as routers, bridges, proxy devices, switches, or other devices. In some examples, intermediate devices of a content delivery network may cache data of multimedia content 64 and include components that conform substantially to those of server device 60. In general, network interface 72 is configured to send and receive data via network 74.
Request processing unit 70 is configured to receive network requests from client devices, such as client device 40, for data of storage medium 62. For example, request processing unit 70 may implement hypertext transfer protocol (HTTP) version 1.1, as described in RFC 2616, "Hypertext Transfer Protocol – HTTP/1.1," by R. Fielding et al., Network Working Group, IETF, June 1999. That is, request processing unit 70 may be configured to receive HTTP GET or partial GET requests and provide data of multimedia content 64 in response to the requests. The requests may specify a segment of one of representations 68, e.g., using a URL of the segment. In some examples, the requests may also specify one or more byte ranges of the segment, thus comprising partial GET requests. Request processing unit 70 may further be configured to service HTTP HEAD requests to provide header data of a segment of one of representations 68. In any case, request processing unit 70 may be configured to process the requests to provide the requested data to a requesting device, such as client device 40.
Additionally or alternatively, request processing unit 70 may be configured to deliver media data via a broadcast or multicast protocol, such as eMBMS. Content preparation device 20 may create DASH segments and/or sub-segments in substantially the same way as described, but server device 60 may deliver these segments or sub-segments using eMBMS or another broadcast or multicast network transport protocol. For example, request processing unit 70 may be configured to receive a multicast group join request from client device 40. That is, server device 60 may advertise an Internet protocol (IP) address associated with a multicast group to client devices, including client device 40, the multicast group being associated with particular media content (e.g., a broadcast of a live event). Client device 40, in turn, may submit a request to join the multicast group. This request may be propagated throughout network 74, e.g., through the routers making up network 74, such that the routers are caused to direct traffic destined for the IP address associated with the multicast group to subscribing client devices, such as client device 40.
As illustrated in the example of FIG. 1, multimedia content 64 includes manifest file 66, which may correspond to a media presentation description (MPD). Manifest file 66 may contain descriptions of different alternative representations 68 (e.g., video services with different qualities), and the description may include, e.g., codec information, profile values, level values, bit rates, and other descriptive characteristics of representations 68. Client device 40 may retrieve the MPD of a media presentation to determine how to access segments of representations 68.
In particular, retrieval unit 52 may retrieve configuration data (not shown) of client device 40 to determine decoding capabilities of video decoder 48 and rendering capabilities of video output 44. Video output 44 may be included in a display device, such as a headset, for extended reality, augmented reality, or virtual reality. Likewise, the configuration data may indicate whether video output 44 is capable of presenting three-dimensional video data, e.g., for extended reality, augmented reality, virtual reality, or the like. The configuration data may also include any or all of a language preference selected by a user of client device 40, one or more camera perspectives corresponding to depth preferences set by the user of client device 40, and/or a rating preference selected by the user of client device 40.
Retrieval unit 52 may comprise, for example, a web browser or a media client configured to submit HTTP GET and partial GET requests. Retrieval unit 52 may correspond to software instructions executed by one or more processors or processing units (not shown) of client device 40. In some examples, all or portions of the functionality described with respect to retrieval unit 52 may be implemented in hardware, or in a combination of hardware, software, and/or firmware, where requisite hardware may be provided to execute instructions for the software or firmware.
Retrieval unit 52 may compare the decoding and rendering capabilities of client device 40 to characteristics of representations 68 indicated by information of manifest file 66. Retrieval unit 52 may initially retrieve at least a portion of manifest file 66 to determine characteristics of representations 68. For example, retrieval unit 52 may request a portion of manifest file 66 that describes characteristics of one or more adaptation sets. Retrieval unit 52 may select a subset of representations 68 (e.g., an adaptation set) having characteristics that can be satisfied by the coding and rendering capabilities of client device 40. Retrieval unit 52 may then determine bit rates for the representations in the adaptation set, determine a currently available amount of network bandwidth, and retrieve segments from one of the representations having a bit rate that the network bandwidth can satisfy.
In general, higher-bit-rate representations may yield higher-quality video playback, while lower-bit-rate representations may provide sufficient-quality video playback when available network bandwidth decreases. Accordingly, when available network bandwidth is relatively high, retrieval unit 52 may retrieve data from relatively high-bit-rate representations, whereas when available network bandwidth is low, retrieval unit 52 may retrieve data from relatively low-bit-rate representations. In this manner, client device 40 may stream multimedia data over network 74 while also adapting to changing network bandwidth availability of network 74.
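A minimal rate-selection rule consistent with this behavior is sketched below; the bit-rate values are hypothetical, and real DASH clients typically layer buffering heuristics on top of a simple rule like this.

```python
def select_representation(bitrates_bps: list[int], bandwidth_bps: int) -> int:
    """Pick the highest-bit-rate representation that fits the measured
    bandwidth; fall back to the lowest representation when none fits."""
    fitting = [b for b in sorted(bitrates_bps) if b <= bandwidth_bps]
    return fitting[-1] if fitting else min(bitrates_bps)
```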
Additionally or alternatively, retrieval unit 52 may be configured to receive data in accordance with a broadcast or multicast network protocol, such as eMBMS or IP multicast. In such examples, retrieval unit 52 may submit a request to join a multicast network group associated with particular media content. After joining the multicast group, retrieval unit 52 may receive data of the multicast group without further requests issued to server device 60 or content preparation device 20. Retrieval unit 52 may submit a request to leave the multicast group when data of the multicast group is no longer needed, e.g., to stop playback or to change channels to a different multicast group.
Network interface 54 may receive data of segments of a selected representation and provide the data to retrieval unit 52, which may in turn provide the segments to decapsulation unit 50. Decapsulation unit 50 may decapsulate elements of a video file into constituent PES streams, depacketize the PES streams to retrieve encoded data, and send the encoded data to either audio decoder 46 or video decoder 48, depending on whether the encoded data is part of an audio or video stream, e.g., as indicated by PES packet headers of the stream. Audio decoder 46 decodes encoded audio data and sends the decoded audio data to audio output 42, while video decoder 48 decodes encoded video data and sends the decoded video data, which may include a plurality of views of a stream, to video output 44.
Video encoder 28, video decoder 48, audio encoder 26, audio decoder 46, encapsulation unit 30, retrieval unit 52, and decapsulation unit 50 each may be implemented as any of a variety of suitable processing circuitry, as applicable, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic circuitry, software, hardware, firmware, or any combinations thereof. Each of video encoder 28 and video decoder 48 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined video encoder/decoder (CODEC). Likewise, each of audio encoder 26 and audio decoder 46 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined CODEC. An apparatus including video encoder 28, video decoder 48, audio encoder 26, audio decoder 46, encapsulation unit 30, retrieval unit 52, and/or decapsulation unit 50 may comprise an integrated circuit, a microprocessor, and/or a wireless communication device, such as a cellular telephone.
Client device 40, server device 60, and/or content preparation device 20 may be configured to operate in accordance with the techniques of this disclosure. For purposes of example, this disclosure describes these techniques with respect to client device 40 and server device 60. However, it should be understood that content preparation device 20 may be configured to perform these techniques, instead of (or in addition to) server device 60.
Encapsulation unit 30 may form NAL units comprising a header that identifies a program to which the NAL unit belongs, as well as a payload, e.g., audio data, video data, or data that describes the transport or program stream to which the NAL unit corresponds. For example, in H.264/AVC, a NAL unit includes a 1-byte header and a payload of varying size. A NAL unit including video data in its payload may comprise various granularity levels of video data. For example, a NAL unit may comprise a block of video data, a plurality of blocks, a slice of video data, or an entire picture of video data. Encapsulation unit 30 may receive encoded video data from video encoder 28 in the form of PES packets of elementary streams. Encapsulation unit 30 may associate each elementary stream with a corresponding program.
Encapsulation unit 30 may also assemble access units from a plurality of NAL units. In general, an access unit may comprise one or more NAL units for representing a frame of video data, as well as audio data corresponding to the frame when such audio data is available. An access unit generally includes all NAL units for one output time instance, e.g., all audio and video data for one time instance. For example, if each view has a frame rate of 20 frames per second (fps), then each time instance may correspond to a time interval of 0.05 seconds. During this time interval, the specific frames for all views of the same access unit (the same time instance) may be rendered simultaneously. In one example, an access unit may comprise a coded picture in one time instance, which may be presented as a primary coded picture.
Accordingly, an access unit may comprise all audio and video frames of a common temporal instance, e.g., all views corresponding to time X. This disclosure also refers to an encoded picture of a particular view as a "view component." That is, a view component may comprise an encoded picture (or frame) for a particular view at a particular time. Accordingly, an access unit may be defined as comprising all view components of a common temporal instance. The decoding order of access units need not necessarily be the same as the output or display order.
A media presentation may include a media presentation description (MPD), which may contain descriptions of different alternative representations (e.g., video services with different qualities), and the description may include, e.g., codec information, profile values, and level values. An MPD is one example of a manifest file, such as manifest file 66. Client device 40 may retrieve the MPD of a media presentation to determine how to access movie fragments of the various presentations. Movie fragments may be located in movie fragment boxes (moof boxes) of video files.
Manifest file 66 (which may comprise, for example, an MPD) may advertise availability of segments of representations 68. That is, the MPD may include information indicating the wall-clock time at which a first segment of one of representations 68 becomes available, as well as information indicating the durations of segments within representations 68. In this manner, retrieval unit 52 of client device 40 may determine when each segment is available, based on the starting time as well as the durations of the segments preceding a particular segment.
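The availability computation described here amounts to a running sum: each segment becomes available at the first segment's availability time plus the durations of all preceding segments. A sketch, using invented times:

```python
def segment_availability_times(first_available_at: float,
                               durations: list[float]) -> list[float]:
    """Wall-clock availability time of each segment, derived from the MPD's
    first-segment availability time and the per-segment durations."""
    times = []
    t = first_available_at
    for d in durations:
        times.append(t)  # segment i is available once all earlier ones have elapsed
        t += d
    return times
```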
After encapsulation unit 30 has assembled NAL units and/or access units into a video file based on received data, encapsulation unit 30 passes the video file to output interface 32 for output. In some examples, encapsulation unit 30 may store the video file locally or send the video file to a remote server via output interface 32, rather than sending the video file directly to client device 40. Output interface 32 may comprise, for example, a transmitter, a transceiver, a device for writing data to a computer-readable medium such as, for example, an optical drive or a magnetic media drive (e.g., a floppy drive), a universal serial bus (USB) port, a network interface, or another output interface. Output interface 32 outputs the video file to a computer-readable medium, such as, for example, a transmission signal, a magnetic medium, an optical medium, a memory, a flash drive, or another computer-readable medium.
Network interface 54 may receive a NAL unit or access unit via network 74 and provide the NAL unit or access unit to decapsulation unit 50, via retrieval unit 52. Decapsulation unit 50 may decapsulate elements of a video file into constituent PES streams, depacketize the PES streams to retrieve encoded data, and send the encoded data to either audio decoder 46 or video decoder 48, depending on whether the encoded data is part of an audio or video stream, e.g., as indicated by PES packet headers of the stream. Audio decoder 46 decodes encoded audio data and sends the decoded audio data to audio output 42, while video decoder 48 decodes encoded video data and sends the decoded video data, which may include a plurality of views of a stream, to video output 44.
In accordance with the techniques of this disclosure, a user of client device 40 may obtain media data related to a 3D virtual scene, e.g., for extended reality (XR), augmented reality (AR), virtual reality (VR), or the like. The user may navigate through the 3D virtual scene using one or more devices, such as controllers, in communication with client device 40. Additionally or alternatively, client device 40 may include sensors, cameras, or the like for determining that the user has moved in real-world space, and client device 40 may translate such real-world movements into virtual-space movements.
The 3D virtual scene may include one or more virtual solid objects. Such objects may include, for example, walls, windows, tables, chairs, or any other such objects that may appear in a virtual scene. In accordance with the techniques of this disclosure, the media data retrieved by retrieval unit 52 may include a scene description describing such virtual solid objects. The scene description may conform to, e.g., the MPEG scene description elements of glTF 2.0.
In some examples, the scene description may include a description of permissible camera movements. For example, the scene description may describe one or more bounding volumes (e.g., volumes according to shapes, such as spheres, cubes, cones, frustums, or the like) in which a virtual camera is permitted to move, such that the virtual camera is not permitted to move beyond the boundaries of the shape. That is, the bounding volume may describe a permissible camera movement volume in which the virtual camera is permitted to move. Additionally or alternatively, the scene description may describe one or more vertices or anchor points and permissible paths (e.g., segments) between the vertices or anchor points. Client device 40 may only permit the virtual camera to move along the permissible paths and/or within the bounding volumes.
In some examples, additionally or alternatively, the scene description may describe one or more virtual solid objects in the scene through which the virtual camera cannot pass.
FIG. 2 is a block diagram illustrating an example set of components of retrieval unit 52 of FIG. 1 in greater detail. In this example, retrieval unit 52 includes eMBMS middleware unit 100, DASH client 110, media application 112, and presentation engine 114.
In this example, eMBMS middleware unit 100 further includes eMBMS reception unit 106, cache 104, and proxy server unit 102. In this example, eMBMS reception unit 106 is configured to receive data via eMBMS, e.g., according to File Delivery over Unidirectional Transport (FLUTE), described in "FLUTE—File Delivery over Unidirectional Transport," by T. Paila et al., Network Working Group, RFC 6726, November 2012, available at tools.ietf.org/html/rfc6726. That is, eMBMS reception unit 106 may receive files via broadcast from, e.g., server device 60, which may act as a broadcast/multicast service center (BM-SC).
As eMBMS middleware unit 100 receives data for files, the eMBMS middleware unit may store the received data in cache 104. Cache 104 may comprise a computer-readable storage medium, such as flash memory, a hard disk drive, RAM, or any other suitable storage medium.
Proxy server unit 102 may act as a server for DASH client 110. For example, proxy server unit 102 may provide an MPD file or other manifest file to DASH client 110. Proxy server unit 102 may advertise availability times for segments in the MPD file, as well as hyperlinks from which the segments can be retrieved. These hyperlinks may include a localhost address prefix corresponding to client device 40 (e.g., 127.0.0.1 for IPv4). In this manner, DASH client 110 may request segments from proxy server unit 102 using HTTP GET or partial GET requests. For example, for a segment available from the link http://127.0.0.1/rep1/seg3, DASH client 110 may construct an HTTP GET request that includes a request for http://127.0.0.1/rep1/seg3, and submit the request to proxy server unit 102. Proxy server unit 102 may retrieve the requested data from cache 104 and provide the data to DASH client 110 in response to such requests.
DASH client 110 provides retrieved media data to media application 112. Media application 112 may be, for example, a web browser, a gaming engine, or another application that receives and presents media data. Presentation engine 114, furthermore, represents an application that interacts with media application 112 to present the retrieved media data in a 3D virtual environment. Presentation engine 114 may, for example, map two-dimensional media data onto a 3D projection. Presentation engine 114 may also receive input from other elements of client device 40 to determine a location of a user in the 3D virtual environment and an orientation the user is facing at that location. For example, presentation engine 114 may determine X, Y, and Z coordinates of the user's location, as well as the orientation in which the user is looking, in order to determine appropriate media data to be displayed to the user. Furthermore, presentation engine 114 may receive camera movement data representing real-world user movement data and translate the real-world user movement data into 3D virtual-space movement data.
In accordance with the techniques of this disclosure, eMBMS middleware unit 100 may receive media data (e.g., according to glTF 2.0) via broadcast or multicast, and DASH client 110 may then retrieve the media data from eMBMS middleware unit 100. The media data may include a scene description including camera control information indicating how a virtual camera can move through a virtual scene. For example, the scene description may include data describing permissible paths through the virtual scene, e.g., along defined paths between anchor points. Additionally or alternatively, the scene description may include data describing bounding volumes, which represent volumes in which the virtual camera is permitted to move. Additionally or alternatively, the scene description may include data describing one or more solid virtual objects in the 3D virtual environment, such as walls, tables, chairs, or the like. For example, data of the scene description may define collision boundaries for the 3D virtual objects. The scene description may further include data representing what happens in the event of a collision with such an object, such as an animation to be played using the object, and whether the object is static (e.g., in the case of a wall) or dynamic (e.g., in the case of a chair).
Presentation engine 114 may use the scene description to determine what to present in the event of a collision with a 3D virtual object and/or an attempted movement beyond a permissible path or volume. For example, if the scene description includes data for a permissible path or bounding volume, and the user attempts to move beyond the permissible path or bounding volume, presentation engine 114 may simply avoid updating the display, thereby indicating that such movement is not permitted. As another example, if the scene description includes data for a 3D virtual solid object, and the user attempts to move through the 3D virtual solid object, presentation engine 114 may avoid updating the display if the 3D virtual solid object is static. If the 3D virtual solid object is not static, presentation engine 114 may determine an animation to display for the object, e.g., translational and/or rotational movements to be applied to the object. For example, if the 3D virtual solid object were a chair, the animation data may indicate that the chair is to be pushed along the floor or tipped over in the event of a collision.
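The branching just described can be sketched as a small decision function. The string return values and the per-object animation flag are illustrative stand-ins, not structures defined by this disclosure.

```python
def resolve_camera_collision(is_static: bool, has_animation: bool) -> str:
    """Decide the renderer's response when the virtual camera collides with a
    virtual solid object: static objects (e.g., walls) simply block movement,
    so the display is not updated; dynamic objects may instead trigger an
    animation (e.g., a chair pushed along the floor or tipped over)."""
    if is_static:
        return "block"
    return "animate" if has_animation else "block"
```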
FIG. 3 is a conceptual diagram illustrating elements of example multimedia content 120. Multimedia content 120 may correspond to multimedia content 64 (FIG. 1), or to another multimedia content stored in storage medium 62. In the example of FIG. 3, multimedia content 120 includes media presentation description (MPD) 122 and a plurality of representations 124A-124N (representations 124). Representation 124A includes optional header data 126 and segments 128A-128N (segments 128), while representation 124N includes optional header data 130 and segments 132A-132N (segments 132). The letter N is used to designate the last movie fragment in each of representations 124 as a matter of convenience. In some examples, there may be different numbers of movie fragments between representations 124.
MPD 122 may comprise a data structure separate from representations 124. MPD 122 may correspond to manifest file 66 of FIG. 1. Likewise, representations 124 may correspond to representations 68 of FIG. 1. In general, MPD 122 may include data that generally describes characteristics of representations 124, such as coding and rendering characteristics, adaptation sets, a profile to which MPD 122 corresponds, text type information, camera angle information, rating information, trick mode information (e.g., information indicative of representations that include temporal sub-sequences), and/or information for retrieving remote periods (e.g., for targeted advertisement insertion into media content during playback).
Header data 126, when present, may describe characteristics of segments 128, e.g., temporal locations of random access points (RAPs, also referred to as stream access points (SAPs)), which of segments 128 include random access points, byte offsets to random access points within segments 128, uniform resource locators (URLs) of segments 128, or other aspects of segments 128. Header data 130, when present, may describe similar characteristics for segments 132. Additionally or alternatively, such characteristics may be fully included within MPD 122.
Segments 128, 132 include one or more coded video samples, each of which may include frames or slices of video data. Each of the coded video samples of segments 128 may have similar characteristics, e.g., height, width, and bandwidth requirements. Such characteristics may be described by data of MPD 122, although such data is not illustrated in the example of FIG. 3. MPD 122 may include characteristics as described by the 3GPP specification, with the addition of any or all of the signaled information described in this disclosure.
Each of segments 128, 132 may be associated with a unique uniform resource locator (URL). Thus, each of segments 128, 132 may be independently retrievable using a streaming network protocol, such as DASH. In this manner, a destination device, such as client device 40, may use an HTTP GET request to retrieve segments 128 or 132. In some examples, client device 40 may use HTTP partial GET requests to retrieve specific byte ranges of segments 128 or 132.
FIG. 4 is a block diagram illustrating elements of an example video file 150, which may correspond to a segment of a representation, such as one of segments 128, 132 of FIG. 3. Each of segments 128, 132 may include data that conforms substantially to the arrangement of data illustrated in the example of FIG. 4. Video file 150 may be said to encapsulate a segment. As described above, video files in accordance with the ISO base media file format and extensions thereof store data in a series of objects, referred to as "boxes." In the example of FIG. 4, video file 150 includes file type (FTYP) box 152, movie (MOOV) box 154, segment index (sidx) boxes 162, movie fragment (MOOF) boxes 164, and movie fragment random access (MFRA) box 166. Although FIG. 4 represents an example of a video file, it should be understood that other media files may include other types of media data (e.g., audio data, timed text data, or the like) that is structured similarly to the data of video file 150, in accordance with the ISO base media file format and its extensions.
File type (FTYP) box 152 generally describes a file type for video file 150. File type box 152 may include data that identifies a specification that describes a best use for video file 150. File type box 152 may alternatively be placed before MOOV box 154, movie fragment boxes 164, and/or MFRA box 166.
In some examples, a segment, such as video file 150, may include an MPD update box (not shown) before FTYP box 152. The MPD update box may include information indicating that an MPD corresponding to a representation including video file 150 is to be updated, along with information for updating the MPD. For example, the MPD update box may provide a URI or URL for a resource to be used to update the MPD. As another example, the MPD update box may include data for updating the MPD. In some examples, the MPD update box may immediately follow a segment type (STYP) box (not shown) of video file 150, where the STYP box may define a segment type for video file 150.
In the example of FIG. 4, MOOV box 154 includes movie header (MVHD) box 156, track (TRAK) box 158, and one or more movie extends (MVEX) boxes 160. In general, MVHD box 156 may describe general characteristics of video file 150. For example, MVHD box 156 may include data that describes when video file 150 was originally created, when video file 150 was last modified, a timescale for video file 150, a duration of playback for video file 150, or other data that generally describes video file 150.
TRAK box 158 may include data for a track of video file 150. TRAK box 158 may include a track header (TKHD) box that describes characteristics of the track corresponding to TRAK box 158. In some examples, TRAK box 158 may include coded video pictures, while in other examples, the coded video pictures of the track may be included in movie fragments 164, which may be referenced by data of TRAK box 158 and/or sidx boxes 162.
In some examples, video file 150 may include more than one track. Accordingly, MOOV box 154 may include a number of TRAK boxes equal to the number of tracks in video file 150. TRAK box 158 may describe characteristics of a corresponding track of video file 150. For example, TRAK box 158 may describe temporal and/or spatial information for the corresponding track. When encapsulation unit 30 (FIG. 1) includes a parameter set track in a video file, such as video file 150, a TRAK box similar to TRAK box 158 of MOOV box 154 may describe characteristics of the parameter set track. Encapsulation unit 30 may signal the presence of sequence-level SEI messages in the parameter set track within the TRAK box describing the parameter set track.
MVEX boxes 160 may describe characteristics of corresponding movie fragments 164, e.g., to signal that video file 150 includes movie fragments 164, in addition to video data included within MOOV box 154, if any. In the context of streaming video data, coded video pictures may be included in movie fragments 164 rather than in MOOV box 154. Accordingly, all coded video samples may be included in movie fragments 164, rather than in MOOV box 154.
MOOV box 154 may include a number of MVEX boxes 160 equal to the number of movie fragments 164 in video file 150. Each of MVEX boxes 160 may describe characteristics of a corresponding one of movie fragments 164. For example, each MVEX box may include a movie extends header (MEHD) box that describes a temporal duration for the corresponding one of movie fragments 164.
As noted above, encapsulation unit 30 may store a sequence data set in a video sample that does not include actual coded video data. A video sample may generally correspond to an access unit, which is a representation of a coded picture at a specific time instance. In the context of AVC, the coded picture includes one or more VCL NAL units, which contain the information to construct all the pixels of the access unit, and other associated non-VCL NAL units, such as SEI messages. Accordingly, encapsulation unit 30 may include a sequence data set, which may include sequence-level SEI messages, in one of movie fragments 164. Encapsulation unit 30 may further signal the presence of the sequence data set and/or sequence-level SEI messages as being present in one of movie fragments 164, within the one of MVEX boxes 160 corresponding to that one of movie fragments 164.
SIDX boxes 162 are optional elements of video file 150. That is, video files conforming to the 3GPP file format, or other such file formats, do not necessarily include SIDX boxes 162. In accordance with the example of the 3GPP file format, a SIDX box may be used to identify a sub-segment of a segment (e.g., a segment contained within video file 150). The 3GPP file format defines a sub-segment as "a self-contained set of one or more consecutive movie fragment boxes with corresponding Media Data box(es) and a Media Data Box containing data referenced by a Movie Fragment Box must follow that Movie Fragment box and precede the next Movie Fragment box containing information about the same track." The 3GPP file format also indicates that a SIDX box "contains a sequence of references to subsegments of the (sub)segment documented by the box. The referenced subsegments are contiguous in presentation time. Similarly, the bytes referred to by a Segment Index box are always contiguous within the segment. The referenced size gives the count of the number of bytes in the material referenced."
SIDX boxes 162 generally provide information representative of one or more sub-segments of a segment included in video file 150. For instance, such information may include playback times at which sub-segments begin and/or end, byte offsets for the sub-segments, whether the sub-segments include (e.g., start with) a stream access point (SAP), a type for the SAP (e.g., whether the SAP is an instantaneous decoder refresh (IDR) picture, a clean random access (CRA) picture, a broken link access (BLA) picture, or another picture type), a position of the SAP in the sub-segment (in terms of playback time and/or byte offset), and the like.
Movie fragments 164 may include one or more coded video pictures. In some examples, movie fragments 164 may include one or more groups of pictures (GOPs), each of which may include a number of coded video pictures, e.g., frames or pictures. In addition, as described above, movie fragments 164 may include sequence data sets in some examples. Each of movie fragments 164 may include a movie fragment header (MFHD) box (not shown in FIG. 4). The MFHD box may describe characteristics of the corresponding movie fragment, such as a sequence number for the movie fragment. Movie fragments 164 may be included in order of sequence number in video file 150.
MFRA box 166 may describe random access points within movie fragments 164 of video file 150. This may assist with performing trick modes, such as performing seeks to particular temporal locations (i.e., playback times) within a segment encapsulated by video file 150. MFRA box 166 is generally optional and need not be included in video files, in some examples. Likewise, a client device, such as client device 40, does not necessarily need to reference MFRA box 166 to correctly decode and display video data of video file 150. MFRA box 166 may include a number of track fragment random access (TFRA) boxes (not shown) equal to the number of tracks of video file 150, or in some examples, equal to the number of media tracks (e.g., non-hint tracks) of video file 150.
In some examples, movie fragments 164 may include one or more stream access points (SAPs), such as IDR pictures. Likewise, MFRA box 166 may provide indications of locations within video file 150 of the SAPs. Accordingly, a temporal sub-sequence of video file 150 may be formed from the SAPs of video file 150. The temporal sub-sequence may also include other pictures, such as P-frames and/or B-frames that depend from SAPs. Frames and/or slices of the temporal sub-sequence may be arranged within the segments such that frames/slices of the temporal sub-sequence that depend on other frames/slices of the sub-sequence can be properly decoded. For example, in a hierarchical arrangement of data, data used for prediction of other data may also be included in the temporal sub-sequence.
FIG. 5 is a conceptual diagram illustrating an example camera path segment 212 having a bounding volume, in accordance with the techniques of this disclosure. In particular, in 3D scene 200, camera 202 represents a viewpoint from which a user can view a portion of 3D scene 200. In this example, path segment 212 is defined between points 204 and 206. Furthermore, a bounding volume is defined by extruding points from bounding box 208 to bounding box 210 along path segment 212. Thus, in this example, camera 202 is allowed to move within the bounding volume along path segment 212, but is restricted from moving beyond the bounding volume.
A scene description may describe a set of paths along which a camera, such as camera 202, is allowed to move. A path may be described as a set of anchor points, such as points 204, 206, that are connected through path segments, such as path segment 212. In some examples, such as the example of FIG. 5, each path segment may be augmented with a bounding volume that allows for some freedom of movement along the path.
Thus, the scene camera, and consequently the viewer, will be able to move freely along the path segments within the bounding volume. Path segments may be described using more complex geometric shapes to allow for finer control over the path.
Furthermore, camera parameters may be constrained at each point along the path. Parameters may be provided for each anchor point and then used together with an interpolation function to calculate corresponding parameters for each point along the path segment. The interpolation function may apply to all parameters, including the bounding volume.
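A minimal sketch of such per-anchor interpolation is shown below. It is illustrative only, not taken from the specification; the parameter names (`fov`, `radius`) are hypothetical examples of values that might be attached to anchor points, and linear interpolation stands in for whatever interpolation function the scene defines.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between two scalar values."""
    return a + (b - a) * t

def interpolate_params(anchor_a: dict, anchor_b: dict, t: float) -> dict:
    """Interpolate every camera parameter (including bounding-volume
    parameters such as a sphere radius) along a path segment.
    t = 0 yields anchor_a's parameters, t = 1 yields anchor_b's."""
    return {key: lerp(anchor_a[key], anchor_b[key], t) for key in anchor_a}

# Hypothetical anchors: vertical field of view and bounding-sphere radius.
a = {"fov": 60.0, "radius": 1.0}
b = {"fov": 90.0, "radius": 2.0}
mid = interpolate_params(a, b, 0.5)  # {"fov": 75.0, "radius": 1.5}
```

The same function can then be evaluated at any parameter t along the segment to obtain the constraints in effect at that point.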
The camera control extension mechanism of this disclosure may be implemented as a glTF 2.0 extension that defines camera control for a scene. The camera control extension may be identified by an "MPEG_camera_control" tag, which may be included in the extensionsUsed element, and may be included in the extensionsRequired element for a 3D scene.
An example "MPEG_camera_control" extension is shown in Table 1 below, and may be defined on the "camera" element of the scene description.
Table 1
Camera control information may be structured as follows:
- For each anchor point, the (x, y, z) coordinates of the anchor point may be represented using floating-point values.
- For each path segment, the (i, j) indices of the first and second anchor points of the path segment may be represented as integer values.
- For the bounding volume:
  - If the bounding volume is BV_CONE, the (r1, r2) radii of the circles at the first and second anchor points may be provided.
  - If the bounding volume is BV_FRUSTUM, ((x, y, z)_topleft, w, h) may be provided for each anchor point of the path segment.
  - If the bounding volume is BV_SPHERE, r, the radius of the sphere, may be provided for each anchor point of the path segment.
- If the intrinsics flag is true, an intrinsic parameters object may be modified.
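As a concrete illustration of the structure above, the sketch below shows how such camera control information might be laid out once parsed: anchor coordinates as floats, segment index pairs as integers, and a per-segment BV_CONE bounding volume with two radii. The field names are hypothetical; the normative layout is given by Table 1.

```python
# Hypothetical parsed camera-control information following the
# structure described above. Field names are illustrative only.
camera_control = {
    "anchors": [
        {"position": [0.0, 1.6, 0.0]},    # (x, y, z) as floating-point values
        {"position": [5.0, 1.6, -2.0]},
    ],
    "segments": [
        {"anchors": [0, 1],               # (i, j) integer anchor indices
         "boundary": {"type": "BV_CONE", "radii": [0.5, 1.0]}},
    ],
    "intrinsics": False,                  # intrinsic parameters not modifiable
}

seg = camera_control["segments"][0]
first_anchor, second_anchor = seg["anchors"]
```

A presentation engine would walk this structure to build its movement restrictions: one allowed segment between the two anchors, with the allowed radius widening from 0.5 to 1.0 along the segment.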
A presentation engine (e.g., presentation engine 114 of FIG. 2, or another element of client device 40, which may differ from the components shown in FIGS. 1 and 2) may support the MPEG_camera_control extension or other such data structures. If a scene provides camera control information, the presentation engine may restrict camera movement to the indicated paths, such that the (x, y, z) coordinates of the camera always lie on a path segment or within the bounding volume of a path segment. The presentation engine may provide visual, audible, and/or haptic feedback to viewers as they approach the edge of the bounding volume.
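One way an engine might enforce such a restriction is to project a requested camera position onto the nearest point of the path segment and clamp it to the interpolated radius there. The sketch below assumes a BV_CONE-style volume (a radius at each end of the segment, linearly interpolated); it is an illustrative implementation choice, not mandated by the specification.

```python
import math

def closest_point_on_segment(p, a, b):
    """Return (point, t): the closest point on segment a-b to p,
    and the segment parameter t in [0, 1] of that point."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(ap[i] * ab[i] for i in range(3)) / denom))
    return [a[i] + t * ab[i] for i in range(3)], t

def clamp_to_bounding_volume(p, a, b, r1, r2):
    """Clamp a requested camera position p to a BV_CONE-style volume:
    within the interpolated radius of the closest point on segment a-b."""
    q, t = closest_point_on_segment(p, a, b)
    r = r1 + (r2 - r1) * t          # radius interpolated along the segment
    d = math.dist(p, q)
    if d <= r:
        return p                    # already inside the bounding volume
    scale = r / d                   # pull the point back onto the boundary
    return [q[i] + (p[i] - q[i]) * scale for i in range(3)]
```

For example, with a segment from (0, 0, 0) to (10, 0, 0) and a uniform radius of 1, a requested position (5, 3, 0) is clamped back to (5, 1, 0); the moment the clamp fires is also a natural point at which to trigger the visual, audible, or haptic feedback described above.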
FIG. 6 is a conceptual diagram illustrating an example virtual object 220, which in this example is a chair. To provide an immersive experience to a viewer, it is important that the viewer interacts correctly with objects in the scene. The viewer should not be able to walk through solid objects in the scene, such as walls, chairs, and tables, or other such solid objects.
FIG. 6 depicts a 3D mesh representation of the chair, as well as collision boundaries defined as a set of cuboids. An MPEG_mesh_collision extension data structure may be defined to provide a description of the collision boundaries of such 3D meshes. The extension data structure may be defined on a mesh object as a set of cuboids around the mesh geometry. Table 2 below represents an example set of attributes that may be included in such an extension data structure.
Table 2
Mesh collision information may include cuboid vertex coordinates (x, y, z) for cuboid boundaries, or a sphere center and radius for spherical boundaries. These values may be provided as floating-point numbers.
The presentation engine may support the MPEG_mesh_collision extension or other such data structures. The presentation engine may ensure that the camera position (x, y, z) does not, at any point in time, become contained within one of the defined mesh cuboids. A collision may be signaled to the viewer through visual, audible, and/or haptic feedback. The presentation engine may use the information about the boundaries for a node to initialize and configure a 3D physics engine that will detect collisions.
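The containment test itself reduces to a point-in-box check per collision cuboid. The sketch below assumes each cuboid has been reduced to axis-aligned min/max corner points (one plausible simplification of the per-vertex boundaries of Table 2, not the normative representation):

```python
def inside_cuboid(p, cmin, cmax):
    """True if point p lies strictly within the axis-aligned cuboid
    spanned by corner points cmin and cmax."""
    return all(cmin[i] < p[i] < cmax[i] for i in range(3))

def collides(camera_pos, cuboids):
    """Return True if the camera position falls inside any collision
    cuboid, in which case the engine should reject the move and
    signal feedback to the viewer."""
    return any(inside_cuboid(camera_pos, cmin, cmax) for cmin, cmax in cuboids)

# Hypothetical single collision cuboid around a chair mesh.
chair = [([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])]
blocked = collides([0.5, 0.5, 0.5], chair)   # inside: move rejected
allowed = collides([2.0, 0.5, 0.5], chair)   # outside: move permitted
```

A full physics engine would run this test against every collision shape each frame before committing the new camera position.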
FIG. 7 is a flowchart illustrating an example method of retrieving media data in accordance with the techniques of this disclosure. The method of FIG. 7 is explained with respect to client device 40 of FIG. 1 and retrieval unit 52 of FIG. 2. Other such devices may be configured to perform this or a similar method.
Initially, client device 40 may retrieve media data (250). For example, retrieval unit 52 may retrieve media data conforming to, e.g., glTF 2.0. In some examples, retrieval unit 52 may retrieve the media data directly, e.g., via unicast, such as using DASH. In some examples, a middleware unit of retrieval unit 52, such as eMBMS middleware 100 of FIG. 2, may receive the media data via broadcast or multicast, and a DASH client (e.g., DASH client 110 of FIG. 2) may then retrieve the media data from the middleware unit.
The media data may include a scene description. Thus, retrieval unit 52 or another component of client device 40 may extract the scene description from the media data (252). In accordance with the techniques of this disclosure, the scene description may be an MPEG scene description including camera control data. Retrieval unit 52 may provide the scene description to presentation engine 114. Presentation engine 114 may thus receive the scene description and, in turn, determine camera control data for the three-dimensional scene from the scene description (254). The camera control data may conform to Table 1 above. That is, for example, the camera control data may include one or more anchor points for a camera path, one or more segments between the anchor points of the camera path, a bounding volume such as a cone, frustum, or sphere, intrinsic parameters that may be modified at each anchor point, and/or an accessor that provides the camera control information.
Presentation engine 114 may further determine movement restrictions from the camera control data (256). For example, presentation engine 114 may determine two or more anchor points and permitted paths between the anchor points from the movement restrictions of the camera control data. Additionally or alternatively, presentation engine 114 may determine a bounding volume, such as a cube, sphere, frustum, cone, or the like, from the movement restrictions of the camera control data. Presentation engine 114 may use the permitted paths to determine paths along which the virtual camera is allowed to move, and/or paths that allow the virtual camera to move within, but not outside of, the bounding volume. The permitted paths and/or bounding volumes may be defined to ensure that the virtual camera does not pass beyond 3D solid virtual objects, such as walls. That is, the bounding volumes or permitted paths may be defined to lie within one or more 3D solid virtual objects, such as walls, floors, ceilings, or other objects of the 3D virtual scene.
Presentation engine 114 may then receive camera movement data (258). For example, presentation engine 114 may receive data from one or more controllers, such as handheld controllers and/or a headset including a display, representing an orientation of the headset and movements (such as directional movements and/or rotational movements) of the headset and/or the virtual camera. Presentation engine 114 may determine that the camera movement data requests movement of the camera through a 3D solid virtual object, such as beyond a boundary of the bounding volume or along a path that is not one of the defined permitted paths (260). In response, presentation engine 114 may prevent the virtual camera from passing through the 3D solid virtual object (262).
In this manner, the method of FIG. 7 represents an example of a method of retrieving media data including: receiving, by a presentation engine, streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the presentation engine, camera control data for the three-dimensional scene, the camera control data including data defining restrictions to prevent a virtual camera from passing through the at least one virtual solid object; receiving, by the presentation engine, camera movement data from a user, the camera movement data requesting movement of the virtual camera through the at least one virtual solid object; and preventing, by the presentation engine, using the camera control data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
FIG. 8 is a flowchart illustrating an example method of retrieving media data in accordance with the techniques of this disclosure. The method of FIG. 8 is explained with respect to client device 40 of FIG. 1 and retrieval unit 52 of FIG. 2. Other such devices may be configured to perform this or a similar method.
Initially, client device 40 may retrieve media data (280). For example, retrieval unit 52 may retrieve media data conforming to, e.g., glTF 2.0. In some examples, retrieval unit 52 may retrieve the media data directly, e.g., via unicast, such as using DASH. In some examples, a middleware unit of retrieval unit 52, such as eMBMS middleware 100 of FIG. 2, may receive the media data via broadcast or multicast, and a DASH client (e.g., DASH client 110 of FIG. 2) may then retrieve the media data from the middleware unit.
The media data may include a scene description. Thus, retrieval unit 52 or another component of client device 40 may extract the scene description from the media data (282). In accordance with the techniques of this disclosure, the scene description may be an MPEG scene description including object collision data. Retrieval unit 52 may provide the scene description to presentation engine 114. Presentation engine 114 may thus receive the scene description and, in turn, determine object collision data for one or more 3D solid virtual objects from the scene description (284). The object collision data may conform to Table 2 above. That is, the object collision data may include data representing, for example: boundaries representing an array of boundary shapes defining collision boundaries of a mesh (3D virtual solid) object; data indicating whether the object is static (i.e., not movable); a material representing a collision material for the object; and/or an animation to be rendered for the object in the event of a collision.
Presentation engine 114 may further process the object collision data (286). For example, presentation engine 114 may determine boundaries representing an array of boundary shapes defining collision boundaries of a mesh (3D virtual solid) object, data indicating whether the object is static (i.e., not movable), a material representing a collision material for the object, and/or an animation to be rendered for the object in the event of a collision. Presentation engine 114 may use the object collision data to determine how to react in the event of a collision with a 3D solid virtual object.
Presentation engine 114 may then receive camera movement data (288). For example, presentation engine 114 may receive data from one or more controllers, such as handheld controllers and/or a headset including a display, representing an orientation of the headset and movements (such as directional movements and/or rotational movements) of the headset and/or the virtual camera. Presentation engine 114 may determine that the camera movement data requests movement of the camera through a 3D solid virtual object, such as into a 3D solid virtual object defined by the object collision data (290). In response, presentation engine 114 may prevent the virtual camera from passing through the 3D solid virtual object (292). For example, if the object is static (as indicated by the object collision data), presentation engine 114 may prevent the virtual camera from moving into and through the object. As another example, if the object is not static (e.g., is movable), presentation engine 114 may determine a reaction to the collision with the object from the object collision data, such as an animation to be played on the object, e.g., if the object is to tip over or move.
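The branching just described can be sketched as a small dispatch on the collision attributes. The field names below (`static`, `animation`) mirror the attributes described for Table 2 but are illustrative, as is the policy that a non-static object lets the move proceed while its collision animation plays:

```python
def resolve_collision(obj, requested_pos, current_pos):
    """Decide the response to a camera move into obj. Returns the
    position to apply and an animation to trigger (or None).
    obj is a dict of collision attributes; field names are illustrative."""
    if obj.get("static", True):
        # Static object: block the move, keep the camera where it was.
        return current_pos, None
    # Movable object: allow the move and trigger the object's
    # collision animation, if one is defined (e.g., tipping over).
    return requested_pos, obj.get("animation")

pos, anim = resolve_collision({"static": True}, [1, 0, 0], [0, 0, 0])
# Static: the camera stays at [0, 0, 0] and no animation plays.
```

An engine could equally hand the non-static case to its physics engine rather than play a canned animation; the point is only that the static flag and animation attributes drive the choice of response.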
In this manner, the method of FIG. 8 represents an example of a method of retrieving media data including: receiving, by a presentation engine, streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the presentation engine, object collision data representing boundaries of the at least one virtual solid object; receiving, by the presentation engine, camera movement data from a user, the camera movement data requesting movement of a virtual camera through the at least one virtual solid object; and preventing, by the presentation engine, using the object collision data, the virtual camera from passing through the at least one virtual solid object in response to the camera movement data.
Certain examples of the techniques of this disclosure are summarized in the following clauses:
Clause 1: A method of retrieving media data, the method comprising: receiving, by a presentation engine, streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the presentation engine, camera control data for the three-dimensional scene, the camera control data including data defining allowable positions for a virtual camera; receiving, by the presentation engine, camera movement data from a user, the camera movement data requesting movement of the virtual camera through the at least one virtual solid object; and updating, by the presentation engine, a position of the virtual camera using the camera control data to ensure that the virtual camera remains within the allowable positions.
Clause 2: The method of Clause 1, wherein updating the position of the virtual camera comprises preventing the virtual camera from passing through the at least one virtual solid object.
Clause 3: The method of Clause 1, wherein the streamed media data comprises glTF 2.0 media data.
Clause 4: The method of Clause 1, wherein receiving the streamed media data comprises requesting the streamed media data from a retrieval unit via an application programming interface (API).
Clause 5: The method of Clause 1, wherein the camera control data is included in an MPEG scene description.
Clause 6: The method of Clause 1, wherein the camera control data includes data defining two or more anchor points and one or more segments between the anchor points, the segments representing allowable camera movement vectors for the virtual camera, and wherein updating the position of the virtual camera comprises allowing the virtual camera to traverse only the segments between the anchor points.
Clause 7: The method of Clause 1, wherein the camera control data includes data defining a bounding volume representing an allowable camera movement volume for the virtual camera, and wherein updating the position of the virtual camera comprises allowing the virtual camera to traverse only the allowable camera movement volume.
Clause 8: The method of Clause 7, wherein the data defining the bounding volume comprises data defining at least one of a cone, a frustum, or a sphere.
Clause 9: The method of Clause 1, wherein the camera control data is included in an MPEG_camera_control extension.
Clause 10: The method of Clause 9, wherein the MPEG_camera_control extension includes one or more of: anchor point data representing a number of anchor points for an allowable path for the virtual camera; segment data representing a number of path segments for the allowable path between the anchor points; bounding volume data representing a bounding volume for the virtual camera; an intrinsics parameter indicating whether camera parameters are modified at each of the anchor points; or accessor data representing an index of an accessor that provides the camera control data.
Clause 11: The method of Clause 1, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
Clause 12: The method of Clause 1, further comprising determining allowable paths for the virtual camera from the camera control data, wherein updating the position of the virtual camera comprises ensuring that the virtual camera moves only along virtual paths within the allowable paths defined in the camera control data.
Clause 13: The method of Clause 1, wherein the camera control data is included in an MPEG_mesh_collision extension.
Clause 14: A device for retrieving media data, the device comprising: a memory configured to store media data; and one or more processors implemented in circuitry and configured to execute a presentation engine, the presentation engine being configured to: receive streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receive camera control data for the three-dimensional scene, the camera control data including data defining allowable positions for a virtual camera; receive camera movement data from a user, the camera movement data requesting movement of the virtual camera through the at least one virtual solid object; and update a position of the virtual camera using the camera control data to ensure that the virtual camera remains within the allowable positions.
Clause 15: The device of Clause 14, wherein the presentation engine is configured to prevent the virtual camera from passing through the at least one virtual solid object.
Clause 16: The device of Clause 14, wherein the streamed media data comprises glTF 2.0 media data.
Clause 17: The device of Clause 14, wherein the presentation engine is configured to request the streamed media data from a retrieval unit via an application programming interface (API).
Clause 18: The device of Clause 14, wherein the camera control data is included in an MPEG scene description.
Clause 19: The device of Clause 14, wherein the camera control data includes data defining two or more anchor points and one or more segments between the anchor points, the segments representing allowable camera movement vectors for the virtual camera, and wherein, to update the position of the virtual camera, the presentation engine is configured to allow the virtual camera to traverse only the segments between the anchor points.
Clause 20: The device of Clause 14, wherein the camera control data includes data defining a bounding volume representing an allowable camera movement volume for the virtual camera, and wherein, to update the position of the virtual camera, the presentation engine is configured to allow the virtual camera to traverse only the allowable camera movement volume.
Clause 21: The device of Clause 20, wherein the data defining the bounding volume comprises data defining at least one of a cone, a frustum, or a sphere.
Clause 22: The device of Clause 14, wherein the camera control data is included in an MPEG_camera_control extension.
Clause 23: The device of Clause 22, wherein the MPEG_camera_control extension includes one or more of: anchor point data representing a number of anchor points for an allowable path for the virtual camera; segment data representing a number of path segments for the allowable path between the anchor points; bounding volume data representing a bounding volume for the virtual camera; an intrinsics parameter indicating whether camera parameters are modified at each of the anchor points; or accessor data representing an index of an accessor that provides the camera control data.
Clause 24: The device of Clause 14, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
Clause 25: The device of Clause 14, wherein the presentation engine is further configured to determine allowable paths for the virtual camera from the camera control data, and wherein, to update the position of the virtual camera, the presentation engine is configured to ensure that the virtual camera moves only along virtual paths within the allowable paths defined in the camera control data.
Clause 26: The device of Clause 14, wherein the camera control data is included in an MPEG_mesh_collision extension.
Clause 27: A computer-readable storage medium having stored thereon instructions that, when executed, cause a processor executing a presentation engine to: receive streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receive camera control data for the three-dimensional scene, the camera control data including data defining allowable positions for a virtual camera; receive camera movement data from a user, the camera movement data requesting movement of the virtual camera through the at least one virtual solid object; and update a position of the virtual camera using the camera control data to ensure that the virtual camera remains within the allowable positions.
Clause 28: The computer-readable storage medium of Clause 27, wherein the instructions that cause the processor to update the position of the virtual camera comprise instructions that cause the processor to prevent the virtual camera from passing through the at least one virtual solid object.
Clause 29: The computer-readable medium of Clause 27, wherein the streamed media data comprises glTF 2.0 media data.
Clause 30: The computer-readable medium of Clause 27, wherein the instructions that cause the processor to receive the streamed media data comprise instructions that cause the processor to request the streamed media data from a retrieval unit via an application programming interface (API).
Clause 31: The computer-readable medium of Clause 27, wherein the camera control data is included in an MPEG scene description.
Clause 32: The computer-readable medium of Clause 27, wherein the camera control data includes data defining two or more anchor points and one or more segments between the anchor points, the segments representing allowable camera movement vectors for the virtual camera, and wherein the instructions that cause the processor to update the position of the virtual camera comprise instructions that cause the processor to allow the virtual camera to traverse only the segments between the anchor points.
Clause 33: The computer-readable medium of Clause 27, wherein the camera control data includes data defining a bounding volume representing an allowable camera movement volume for the virtual camera, and wherein the instructions that cause the processor to update the position of the virtual camera comprise instructions that cause the processor to allow the virtual camera to traverse only the allowable camera movement volume.
Clause 34: The computer-readable medium of Clause 33, wherein the data defining the bounding volume comprises data defining at least one of a cone, a frustum, or a sphere.
Clause 35: The computer-readable medium of Clause 27, wherein the camera control data is included in an MPEG_camera_control extension.
Clause 36: The computer-readable medium of Clause 35, wherein the MPEG_camera_control extension includes one or more of: anchor point data representing a number of anchor points for an allowable path for the virtual camera; segment data representing a number of path segments for the allowable path between the anchor points; bounding volume data representing a bounding volume for the virtual camera; an intrinsics parameter indicating whether camera parameters are modified at each of the anchor points; or accessor data representing an index of an accessor that provides the camera control data.
Clause 37: The computer-readable medium of Clause 27, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
Clause 38: The computer-readable medium of Clause 27, further comprising instructions that cause the processor to determine allowable paths for the virtual camera from the camera control data, wherein the instructions that cause the processor to update the position of the virtual camera comprise instructions that cause the processor to ensure that the virtual camera moves only along virtual paths within the allowable paths defined in the camera control data.
Clause 39: The computer-readable medium of Clause 27, wherein the camera control data is included in an MPEG_mesh_collision extension.
Clause 40: A device for retrieving media data, the device comprising: means for receiving streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; means for receiving camera control data for the three-dimensional scene, the camera control data including data defining allowable positions for a virtual camera; means for receiving camera movement data from a user, the camera movement data requesting movement of the virtual camera through the at least one virtual solid object; and means for updating a position of the virtual camera using the camera control data to ensure that the virtual camera remains within the allowable positions.
Clause 41: A method of retrieving media data, the method comprising: receiving, by a presentation engine, streamed media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the presentation engine, object collision data representing boundaries of the at least one virtual solid object; receiving, by the presentation engine, camera movement data from a user, the camera movement data requesting movement of a virtual camera through the at least one virtual solid object; and updating, by the presentation engine, a position of the virtual camera using the object collision data to ensure that the virtual camera remains outside the at least one virtual solid object in response to the camera movement data.
Clause 42: The method of Clause 41, wherein updating the position of the virtual camera comprises preventing the virtual camera from passing through the at least one virtual solid object.
Clause 43: The method of Clause 41, wherein receiving the object collision data comprises receiving an MPEG_mesh_collision extension.
Clause 44: The method of Clause 43, wherein the MPEG_mesh_collision extension includes data defining at least one 3D mesh for the at least one virtual solid object.
Clause 45: The method of Clause 44, wherein the MPEG_mesh_collision extension includes data defining at least one of: boundaries of the 3D mesh for the at least one virtual solid object, a material for the 3D mesh, or an animation to be rendered in response to the virtual camera contacting the 3D mesh.
Clause 46: The method of Clause 41, wherein receiving the object collision data comprises receiving data including one or more of: boundary data representing one or more collision boundaries of the at least one virtual solid object; static data representing whether the at least one virtual solid object is affected by collisions; material data representing how colliding objects interact with the at least one virtual solid object; or animation data representing an animation triggered by a collision with the at least one virtual solid object.
Clause 47: The method of Clause 41, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
Clause 48: The method of Clause 41, wherein the streamed media data comprises glTF 2.0 media data.
Clause 49: The method of Clause 41, wherein receiving the streamed media data comprises requesting the streamed media data from a retrieval unit via an application programming interface (API).
Clause 50: The method of Clause 41, wherein the object collision data is included in an MPEG scene description.
條款51:一種用於檢索媒體數據的裝置,該裝置包含:記憶體,用於儲存媒體數據;以及一個或多個處理器,其在電路中實現並且被組態以執行呈現引擎,該呈現引擎被組態以:接收串流媒體數據,該串流媒體數據表示包括至少一個虛擬固體物體的虛擬三維場景;接收表示該至少一個虛擬固體物體之邊界的物體碰撞數據;從用戶接收相機移動數據,該相機移動數據請求虛擬相機移動穿過該至少一個虛擬固體物體;以及使用該物體碰撞數據,更新該虛擬相機之位置,以確保該虛擬相機響應於該相機移動數據而保持在該至少一個虛擬固體物體之外。Clause 51: An apparatus for retrieving media data, the apparatus comprising: a memory for storing the media data; and one or more processors implemented in a circuit and configured to execute a rendering engine, the rendering engine configured to: receive streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; receive object collision data representing boundaries of the at least one virtual solid object; receive camera movement data from a user, the camera movement data requests a virtual camera to move through the at least one virtual solid object; and updating the position of the virtual camera using the object collision data to ensure that the virtual camera remains on the at least one virtual solid in response to the camera movement data outside the object.
條款52:如條款51之裝置,其中,為了更新該虛擬相機之該位置,該呈現引擎被組態以:防止該虛擬相機穿越該至少一個虛擬固體物體。Clause 52: The device of Clause 51, wherein, to update the position of the virtual camera, the rendering engine is configured to: prevent the virtual camera from passing through the at least one virtual solid object.
條款53:如條款51之裝置,其中,為了接收該物體碰撞數據,該呈現引擎被組態以:接收MPEG_mesh_collision延伸。Clause 53: The apparatus of Clause 51, wherein, to receive the object collision data, the rendering engine is configured to: receive an MPEG_mesh_collision extension.
條款54:如條款53之裝置,其中,該MPEG_mesh_collision延伸包括定義用於該至少一個虛擬固體物體的至少一個3D網格的數據。Clause 54: The apparatus of Clause 53, wherein the MPEG_mesh_collision extension includes data defining at least one 3D mesh for the at least one virtual solid object.
條款55:如條款54之裝置,其中,該MPEG_mesh_collision延伸包括定義以下各項中的至少一項的數據:用於該至少一個虛擬固體物體的3D網格之邊界、用於該3D網格的材料、或將響應於該虛擬相機接觸該3D網格而呈現的動畫。Clause 55: The apparatus of
條款56:如條款51之裝置,其中,為了接收該物體碰撞數據,該呈現引擎被組態以接收包括以下各項中的一項或多項的數據:邊界數據,其表示該至少一個虛擬固體物體之一個或多個碰撞邊界;靜態數據,其表示該至少一個虛擬固體物體是否受到碰撞影響;材料數據,其表示碰撞物體如何與該至少一個虛擬固體物體互動;或者動畫數據,其表示由與該至少一個虛擬固體物體的碰撞觸發的動畫。Clause 56: The apparatus of Clause 51, wherein, to receive the object collision data, the rendering engine is configured to receive data comprising one or more of the following: boundary data representing the at least one virtual solid object one or more collision boundaries; static data indicating whether the at least one virtual solid object is affected by the collision; material data indicating how the collision object interacts with the at least one virtual solid object; An animation triggered by the collision of at least one virtual solid object.
Clause 57: The apparatus of Clause 51, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
Clause 58: The apparatus of Clause 51, wherein the streaming media data comprises glTF 2.0 media data.
Clause 59: The apparatus of Clause 51, wherein, to receive the streaming media data, the rendering engine is configured to request the streaming media data from a retrieval unit via an application programming interface (API).
Clause 60: The apparatus of Clause 51, wherein the object collision data is included in an MPEG scene description.
Clause 61: A computer-readable storage medium having stored thereon instructions that, when executed, cause a processor to: receive streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; receive object collision data representing boundaries of the at least one virtual solid object; receive camera movement data from a user, the camera movement data requesting that a virtual camera move through the at least one virtual solid object; and update a position of the virtual camera using the object collision data to ensure that the virtual camera remains outside the at least one virtual solid object in response to the camera movement data.
Clause 62: The computer-readable medium of Clause 61, wherein the instructions that cause the processor to update the position of the virtual camera comprise instructions that cause the processor to prevent the virtual camera from passing through the at least one virtual solid object.
Clause 63: The computer-readable medium of Clause 61, wherein the instructions that cause the processor to receive the object collision data comprise instructions that cause the processor to receive an MPEG_mesh_collision extension.
Clause 64: The computer-readable medium of Clause 62, wherein the MPEG_mesh_collision extension includes data defining at least one 3D mesh for the at least one virtual solid object.
Clause 65: The computer-readable medium of Clause 63, wherein the MPEG_mesh_collision extension includes data defining at least one of: boundaries of the 3D mesh for the at least one virtual solid object, a material for the 3D mesh, or an animation to be rendered in response to the virtual camera contacting the 3D mesh.
Clause 66: The computer-readable medium of Clause 61, wherein the instructions that cause the processor to receive the object collision data comprise instructions that cause the processor to receive data including one or more of: boundary data representing one or more collision boundaries of the at least one virtual solid object; static data representing whether the at least one virtual solid object is affected by collisions; material data representing how a colliding object interacts with the at least one virtual solid object; or animation data representing an animation triggered by a collision with the at least one virtual solid object.
Clause 67: The computer-readable medium of Clause 61, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
Clause 68: The computer-readable medium of Clause 61, wherein the streaming media data comprises glTF 2.0 media data.
Clause 69: The computer-readable medium of Clause 61, wherein the instructions that cause the processor to receive the streaming media data comprise instructions that cause the processor to request the streaming media data from a retrieval unit via an application programming interface (API).
Clause 70: The computer-readable medium of Clause 61, wherein the object collision data is included in an MPEG scene description.
Clause 71: An apparatus for retrieving media data, the apparatus comprising: means for receiving streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; means for receiving object collision data representing boundaries of the at least one virtual solid object; means for receiving camera movement data from a user, the camera movement data requesting that a virtual camera move through the at least one virtual solid object; and means for updating a position of the virtual camera to ensure that the virtual camera remains outside the at least one virtual solid object in response to the camera movement data.
Clause 72: A method of retrieving media data, the method comprising: receiving, by a rendering engine, streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the rendering engine, camera control data for the three-dimensional scene, the camera control data including data defining allowable positions for a virtual camera; receiving, by the rendering engine, camera movement data from a user, the camera movement data requesting that the virtual camera move through the at least one virtual solid object; and updating, by the rendering engine, a position of the virtual camera using the camera control data to ensure that the virtual camera remains within the allowable positions.
Clause 73: The method of Clause 72, wherein updating the position of the virtual camera comprises preventing the virtual camera from passing through the at least one virtual solid object.
Clause 74: The method of any one of Clauses 72 and 73, wherein the streaming media data comprises glTF 2.0 media data.
Clause 75: The method of any one of Clauses 72-74, wherein receiving the streaming media data comprises requesting the streaming media data from a retrieval unit via an application programming interface (API).
Clause 76: The method of any one of Clauses 72-75, wherein the camera control data is included in an MPEG scene description.
Clause 77: The method of any one of Clauses 72-76, wherein the camera control data includes data defining two or more anchor points and one or more segments between the anchor points, the segments representing allowable camera movement vectors for the virtual camera, and wherein updating the position of the virtual camera comprises allowing the virtual camera to traverse only the segments between the anchor points.
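A minimal sketch of the anchor-and-segment constraint of Clause 77: a requested camera position is projected onto the nearest point of the allowable path, so the camera can only traverse segments between anchor points. The function and parameter names are illustrative, not from any normative schema:

```python
def closest_point_on_segments(p, anchors, segments):
    """Clamp a requested camera position to the nearest point on the
    allowable path, defined as segments between anchor points."""
    def clamp_to_segment(p, a, b):
        # Project p onto the line through a and b, clamped to [a, b].
        ab = [bb - aa for aa, bb in zip(a, b)]
        ap = [pp - aa for aa, pp in zip(a, p)]
        denom = sum(c * c for c in ab)
        t = 0.0 if denom == 0 else max(
            0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
        return tuple(aa + t * c for aa, c in zip(a, ab))

    best, best_d2 = None, float("inf")
    for i, j in segments:               # each segment joins two anchor indices
        q = clamp_to_segment(p, anchors[i], anchors[j])
        d2 = sum((qq - pp) ** 2 for qq, pp in zip(q, p))
        if d2 < best_d2:
            best, best_d2 = q, d2
    return best
```

A request to leave the path sideways snaps back onto the segment, and a request past an endpoint stops at the anchor.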
Clause 78: The method of any one of Clauses 72-77, wherein the camera control data includes data defining a bounding volume representing an allowable camera movement volume for the virtual camera, and wherein updating the position of the virtual camera comprises allowing the virtual camera to traverse only the allowable camera movement volume.
Clause 79: The method of Clause 78, wherein the data defining the bounding volume comprises data defining at least one of a cone, a frustum, or a sphere.
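For a spherical bounding volume — one of the shapes named in Clause 79 — constraining the camera reduces to clamping its position onto or inside the sphere; a minimal sketch under that assumption:

```python
import math

def clamp_to_sphere(p, center, radius):
    """Keep the camera within a spherical allowable movement volume:
    positions inside the sphere pass through unchanged; positions
    outside are pulled back to the sphere surface along the radius."""
    v = [pp - cc for pp, cc in zip(p, center)]
    d = math.sqrt(sum(c * c for c in v))
    if d <= radius:
        return tuple(p)                 # already inside the allowed volume
    s = radius / d                      # scale factor back onto the surface
    return tuple(cc + c * s for cc, c in zip(center, v))
```

Cone and frustum volumes would follow the same pattern with their own containment and projection tests.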
Clause 80: The method of any one of Clauses 72-79, wherein the camera control data is included in an MPEG_camera_control extension.
Clause 81: The method of Clause 80, wherein the MPEG_camera_control extension includes one or more of: anchor point data representing a number of anchor points for an allowable path for the virtual camera; segment data representing a number of path segments for the allowable path between the anchor points; bounding volume data representing a bounding volume for the virtual camera; intrinsic parameters indicating whether camera parameters are modified at each of the anchor points; and accessor data representing an index of an accessor that provides the camera control data.
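The fields of Clause 81 could be grouped into an extension object shaped roughly as follows. This is a hypothetical sketch whose property names are illustrative, not the normative MPEG_camera_control schema:

```python
# Hypothetical shape of the MPEG_camera_control extension fields
# listed in Clause 81. Property names are illustrative only.
camera_control_extension = {
    "MPEG_camera_control": {
        "anchorCount": 4,          # anchor point data: anchors on the allowable path
        "segmentCount": 3,         # path segments between those anchors
        "boundingVolume": {        # bounding volume for the virtual camera
            "type": "sphere",
            "center": [0.0, 1.6, 0.0],
            "radius": 5.0,
        },
        "intrinsicsPerAnchor": False,  # camera parameters not modified per anchor
        "accessor": 7,             # index of the accessor supplying the control data
    }
}
```

In a glTF 2.0 scene the accessor index would point at a buffer view holding the actual anchor and segment coordinates, in the usual glTF accessor pattern.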
條款82:如條款72-81中任一項之方法,其中,該至少一個虛擬固體物體包含虛擬牆、虛擬椅子或虛擬桌子中的一者。Clause 82: The method of any one of clauses 72-81, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
條款83:如條款72之方法,進一步包含:根據該相機控制數據來決定用於該虛擬相機的可允許路徑,其中,更新該虛擬相機之該位置包含確保該虛擬相機僅沿著在該相機控制數據中定義的該可允許路徑內的虛擬路徑移動。Clause 83: The method of
條款84:如條款72-83中任一項之方法,其中,該相機控制數據被包括在MPEG_mesh_collision延伸中。Clause 84: The method of any of clauses 72-83, wherein the camera control data is included in an MPEG_mesh_collision extension.
條款85:一種用於檢索媒體數據的裝置,該裝置包含:記憶體,其被組態以儲存媒體數據;以及一個或多個處理器,其在電路中實現並且被組態以執行呈現引擎,該呈現引擎被組態以:接收串流媒體數據,該串流媒體數據表示包括至少一個虛擬固體物體的虛擬三維場景;接收用於該三維場景的相機控制數據,該相機控制數據包括定義用於虛擬相機的可允許位置的數據;從用戶接收相機移動數據,該相機移動數據請求該虛擬相機移動穿過該至少一個虛擬固體物體;以及使用該相機控制數據,更新該虛擬相機之位置,以確保該虛擬相機保持在該可允許位置內。Clause 85: An apparatus for retrieving media data, the apparatus comprising: a memory configured to store the media data; and one or more processors implemented in a circuit and configured to execute a rendering engine, The rendering engine is configured to: receive streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; receive camera control data for the three-dimensional scene, the camera control data including definitions for data of allowable positions of the virtual camera; receiving camera movement data from a user requesting the virtual camera to move through the at least one virtual solid object; and using the camera control data, updating the position of the virtual camera to ensure The virtual camera remains within the allowable position.
條款86:如條款85之裝置,其中,該呈現引擎被組態以:防止該虛擬相機穿越該至少一個虛擬固體物體。Clause 86: The device of Clause 85, wherein the rendering engine is configured to: prevent the virtual camera from passing through the at least one virtual solid object.
條款87:如條款85及86中任一項之裝置,其中,該串流媒體數據包含glTF 2.0媒體數據。Clause 87: The device of any one of clauses 85 and 86, wherein the streaming media data comprises glTF 2.0 media data.
條款88:如條款85-87中任一項之裝置,其中,該呈現引擎被組態以:經由應用程式介面(API)從檢索單元請求該串流媒體數據。Clause 88: The device of any one of clauses 85-87, wherein the presentation engine is configured to: request the streaming media data from the retrieval unit via an application programming interface (API).
條款89:如條款85-88中任一項之裝置,其中,該相機控制數據被包括在MPEG場景描述中。Clause 89: The apparatus of any one of clauses 85-88, wherein the camera control data is included in an MPEG scene description.
條款90:如條款85-89中任一項之裝置,其中,該相機控制數據包括定義兩個或更多個錨點及該等錨點之間的一個或多個分段的數據,該分段表示用於該虛擬相機的可允許相機移動向量,並且其中,為了更新該虛擬相機之該位置,該呈現引擎被組態以允許該虛擬相機僅越過該等錨點之間的該分段。Clause 90: The device of any of Clauses 85-89, wherein the camera control data includes data defining two or more anchor points and one or more segments between the anchor points, the segment Segments represent allowable camera movement vectors for the virtual camera, and wherein, to update the position of the virtual camera, the rendering engine is configured to allow the virtual camera to only traverse the segment between the anchor points.
條款91:如條款85-90中任一項之裝置,其中,該相機控制數據包括定義邊界體積的數據,該邊界體積表示用於該虛擬相機的可允許相機移動體積,並且其中,為了更新該虛擬相機之該位置,該呈現引擎被組態以允許該虛擬相機僅越過該可允許相機移動體積。Clause 91: The apparatus of any of clauses 85-90, wherein the camera control data includes data defining a bounding volume representing an allowable camera movement volume for the virtual camera, and wherein, to update the At the location of the virtual camera, the rendering engine is configured to allow the virtual camera to only cross the allowable camera movement volume.
條款92:如條款91之裝置,其中,定義該邊界體積的該數據包含定義圓錐體、平截頭體或球體中的至少一者的數據。Clause 92: The device of Clause 91, wherein the data defining the bounding volume comprises data defining at least one of a cone, frustum, or sphere.
條款93:如條款85-92中任一項之裝置,其中,該相機控制數據被包括在MPEG_camera_control延伸中。Clause 93: The apparatus of any one of clauses 85-92, wherein the camera control data is included in an MPEG_camera_control extension.
條款94:如條款93之裝置,其中,該MPEG_camera_control延伸包括以下各項中的一項或多項:錨點數據,其表示對於用於該虛擬相機的可允許路徑的錨點數量;分段數據,其表示對於該等錨點之間的該可允許路徑的路徑分段數量;邊界體積數據,其表示用於該虛擬相機的邊界體積;固有參數,其指示相機參數是否在該等錨點中的每個錨點處被修改;以及存取器數據,其表示提供該相機控制數據的存取器之索引。Clause 94: The apparatus of Clause 93, wherein the MPEG_camera_control extension includes one or more of: anchor point data representing the number of anchor points for allowable paths for the virtual camera; segment data, It represents the number of path segments for the allowable path between the anchor points; bounding volume data, which represents the bounding volume for the virtual camera; intrinsic parameters, which indicate whether the camera parameters are within the anchor points modified at each anchor point; and accessor data representing the index of the accessor that provided the camera control data.
條款95:如條款85-94中任一項之裝置,其中,該至少一個虛擬固體物體包含虛擬牆、虛擬椅子或虛擬桌子中的一者。Clause 95: The apparatus of any of clauses 85-94, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
條款96:如條款85-95中任一項之裝置,其中,該呈現引擎進一步被組態以:根據該相機控制數據來決定用於該虛擬相機的可允許路徑,其中,為了更新該虛擬相機之該位置,該呈現引擎被組態以確保該虛擬相機僅沿著在該相機控制數據中定義的該可允許路徑內的虛擬路徑移動。Clause 96: The apparatus of any one of clauses 85-95, wherein the rendering engine is further configured to: determine an allowable path for the virtual camera based on the camera control data, wherein, for updating the virtual camera For this position, the rendering engine is configured to ensure that the virtual camera moves only along virtual paths within the allowable paths defined in the camera control data.
條款97:如條款85-96中任一項之裝置,其中,該相機控制數據被包括在MPEG_mesh_collision延伸中。Clause 97: The apparatus of any one of clauses 85-96, wherein the camera control data is included in an MPEG_mesh_collision extension.
條款98:一種具有儲存在其上的指令的計算機可讀儲存媒體,該指令在被執行時使得執行呈現引擎的處理器進行以下操作:接收串流媒體數據,該串流媒體數據表示包括至少一個虛擬固體物體的虛擬三維場景;接收用於該三維場景的相機控制數據,該相機控制數據包括定義用於虛擬相機的可允許位置的數據;從用戶接收相機移動數據,該相機移動數據請求該虛擬相機移動穿過該至少一個虛擬固體物體;以及使用該相機控制數據,更新該虛擬相機之位置,以確保該虛擬相機保持在該可允許位置內。Clause 98: A computer-readable storage medium having stored thereon instructions that, when executed, cause a processor executing a rendering engine to: receive streaming media data representing at least one A virtual three-dimensional scene of virtual solid objects; receiving camera control data for the three-dimensional scene, the camera control data including data defining allowable positions for the virtual camera; receiving camera movement data from a user, the camera movement data requesting the virtual moving a camera through the at least one virtual solid object; and updating a position of the virtual camera using the camera control data to ensure that the virtual camera remains within the allowable position.
條款99:如條款98之計算機可讀儲存媒體,其中,使得該處理器更新該虛擬相機之該位置的該指令包含:使得該處理器防止該虛擬相機穿越該至少一個虛擬固體物體的指令。Clause 99: The computer-readable storage medium of Clause 98, wherein the instructions causing the processor to update the position of the virtual camera comprise instructions causing the processor to prevent the virtual camera from passing through the at least one virtual solid object.
條款100:如條款98及99中任一項之計算機可讀媒體,其中,該串流媒體數據包含glTF 2.0媒體數據。Clause 100: The computer-readable medium of any one of clauses 98 and 99, wherein the streaming media data comprises glTF 2.0 media data.
條款101:如條款98-100中任一項之計算機可讀媒體,其中,使得該處理器接收該串流媒體數據的該指令包含使得該處理器經由應用程式介面(API)從檢索單元請求該串流媒體數據的指令。Clause 101: The computer-readable medium of any one of clauses 98-100, wherein the instructions causing the processor to receive the streaming media data comprise causing the processor to request the stream from a retrieval unit via an application programming interface (API). Instructions for streaming media data.
條款102:如條款98-101中任一項之計算機可讀媒體,其中,該相機控制數據被包括在MPEG場景描述中。Clause 102: The computer-readable medium of any one of Clauses 98-101, wherein the camera control data is included in an MPEG scene description.
條款103:如條款98-102中任一項之計算機可讀媒體,其中,該相機控制數據包括定義兩個或更多個錨點及該等錨點之間的一個或多個分段的數據,該分段表示用於該虛擬相機的可允許相機移動向量,並且其中,使得該處理器更新該虛擬相機之該位置的該指令包含使得該處理器允許該虛擬相機僅越過該等錨點之間的該分段的指令。Clause 103: The computer-readable medium of any one of clauses 98-102, wherein the camera control data includes data defining two or more anchor points and one or more segments between the anchor points , the segment represents an allowable camera movement vector for the virtual camera, and wherein the instruction causing the processor to update the position of the virtual camera includes causing the processor to allow the virtual camera to pass only between the anchor points Instructions for this segment between.
條款104:如條款103之計算機可讀媒體,其中,該相機控制數據包括定義邊界體積的數據,該邊界體積表示用於該虛擬相機的可允許相機移動體積,並且其中,使得該處理器更新該虛擬相機之該位置的該指令包含使得該處理器允許該虛擬相機僅越過該可允許相機移動體積的指令。Clause 104: The computer-readable medium of Clause 103, wherein the camera control data includes data defining a bounding volume representing an allowable camera movement volume for the virtual camera, and wherein the processor is caused to update the The instructions for the position of the virtual camera include instructions for causing the processor to allow the virtual camera to only cross the allowable camera movement volume.
條款105:如條款98-104中任一項之計算機可讀媒體,其中,定義該邊界體積的該數據包含定義圓錐體、平截頭體或球體中的至少一者的數據。Clause 105: The computer-readable medium of any one of clauses 98-104, wherein the data defining the bounding volume comprises data defining at least one of a cone, frustum, or sphere.
條款106:如條款105之計算機可讀媒體,其中,該相機控制數據被包括在MPEG_camera_control延伸中。Clause 106: The computer-readable medium of Clause 105, wherein the camera control data is included in an MPEG_camera_control extension.
條款107:如條款98-106中任一項之計算機可讀媒體,其中,該MPEG_camera_control延伸包括以下各項中的一項或多項:錨點數據,其表示對於用於該虛擬相機的可允許路徑的錨點數量;分段數據,其表示對於該等錨點之間的該可允許路徑的路徑分段數量;邊界體積數據,其表示用於該虛擬相機的邊界體積;固有參數,其指示相機參數是否在該等錨點中的每個錨點處被修改;以及存取器數據,其表示提供該相機控制數據的存取器之索引。Clause 107: The computer-readable medium of any one of clauses 98-106, wherein the MPEG_camera_control extension includes one or more of: anchor point data representing allowable paths for the virtual camera number of anchor points; segment data, which represents the number of path segments for the allowable path between the anchor points; bounding volume data, which represents the bounding volume for the virtual camera; intrinsic parameters, which indicate the camera whether the parameter is modified at each of the anchor points; and accessor data representing the index of the accessor that provided the camera control data.
條款108:如條款98-107中任一項之計算機可讀媒體,其中,該至少一個虛擬固體物體包含虛擬牆、虛擬椅子或虛擬桌子中的一者。Clause 108: The computer-readable medium of any one of clauses 98-107, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
條款109:如條款98-108中任一項之計算機可讀媒體,進一步包含:使得該處理器根據該相機控制數據來決定用於該虛擬相機的可允許路徑的指令,其中,使得該處理器更新該虛擬相機之該位置的該指令包含使得該處理器確保該虛擬相機僅沿著在該相機控制數據中定義的該可允許路徑內的虛擬路徑移動的指令。Clause 109: The computer-readable medium of any one of clauses 98-108, further comprising: instructions that cause the processor to determine an allowable path for the virtual camera based on the camera control data, wherein the processor is caused to The instructions to update the position of the virtual camera include instructions to cause the processor to ensure that the virtual camera moves only along virtual paths within the allowable paths defined in the camera control data.
條款110:如條款98-109中任一項之計算機可讀媒體,其中,該相機控制數據被包括在MPEG_mesh_collision延伸中。Clause 110: The computer-readable medium of any one of clauses 98-109, wherein the camera control data is included in an MPEG_mesh_collision extension.
條款111:一種用於檢索媒體數據的裝置,該裝置包含:用於接收串流媒體數據的構件,該串流媒體數據表示包括至少一個虛擬固體物體的虛擬三維場景;用於接收用於該三維場景的相機控制數據的構件,該相機控制數據包括定義用於虛擬相機的可允許位置的數據;用於從用戶接收相機移動數據的構件,該相機移動數據請求該虛擬相機移動穿過該至少一個虛擬固體物體;以及用於使用該相機控制數據來更新該虛擬相機之位置以確保該虛擬相機保持在該可允許位置內的構件。Clause 111: An apparatus for retrieving media data, the apparatus comprising: means for receiving streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; means for camera control data of the scene, the camera control data including data defining allowable positions for the virtual camera; means for receiving camera movement data from the user, the camera movement data requesting the virtual camera to move through the at least one a virtual solid object; and means for updating the position of the virtual camera using the camera control data to ensure that the virtual camera remains within the allowable position.
條款112:一種檢索媒體數據之方法,該方法包含:由呈現引擎接收串流媒體數據,該串流媒體數據表示包括至少一個虛擬固體物體的虛擬三維場景;由該呈現引擎接收表示該至少一個虛擬固體物體之邊界的物體碰撞數據;由該呈現引擎從用戶接收相機移動數據,該相機移動數據請求虛擬相機移動穿過該至少一個虛擬固體物體;以及使用該物體碰撞數據,由該呈現引擎更新該虛擬相機之位置,以確保該虛擬相機響應於該相機移動數據而保持在該至少一個虛擬固體物體之外。Clause 112: A method of retrieving media data, the method comprising: receiving, by a rendering engine, streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; receiving, by the rendering engine, representation of the at least one virtual object collision data for boundaries of solid objects; receiving, by the rendering engine, camera movement data from a user, the camera movement data requesting a virtual camera to move through the at least one virtual solid object; and using the object collision data, updating, by the rendering engine, the A position of the virtual camera to ensure that the virtual camera remains outside the at least one virtual solid object in response to the camera movement data.
條款113:一種方法,包含如條款72-84中任一項之方法及如條款112之方法之組合。Clause 113: A method comprising a combination of the method of any one of clauses 72-84 and the method of
條款114:如條款112及113中任一項之方法,其中,更新該虛擬相機之該位置包含:防止該虛擬相機穿越該至少一個虛擬固體物體。Clause 114: The method of any one of
條款115:如條款112-114中任一項之方法,其中,接收該物體碰撞數據包含:接收MPEG_mesh_collision延伸。Clause 115: The method of any one of clauses 112-114, wherein receiving the object collision data comprises: receiving an MPEG_mesh_collision extension.
條款116:如條款115之方法,其中,該MPEG_mesh_collision延伸包括定義用於該至少一個虛擬固體物體的至少一個3D網格的數據。Clause 116: The method of Clause 115, wherein the MPEG_mesh_collision extension includes data defining at least one 3D mesh for the at least one virtual solid object.
條款117:如條款116之方法,其中,該MPEG_mesh_collision延伸包括定義以下各項中的至少一項的數據:用於該至少一個虛擬固體物體的3D網格之邊界、用於該3D網格的材料、或將響應於該虛擬相機接觸該3D網格而呈現的動畫。Clause 117: The method of Clause 116, wherein the MPEG_mesh_collision extension includes data defining at least one of: a boundary of a 3D mesh for the at least one virtual solid object, a material for the 3D mesh , or an animation that will be rendered in response to the virtual camera touching the 3D mesh.
條款118:如條款112-117中任一項之方法,其中,接收該物體碰撞數據包含接收包括以下各項中的一項或多項的數據:邊界數據,其表示該至少一個虛擬固體物體之一個或多個碰撞邊界;靜態數據,其表示該至少一個虛擬固體物體是否受到碰撞影響;材料數據,其表示碰撞物體如何與該至少一個虛擬固體物體互動;或者動畫數據,其表示由與該至少一個虛擬固體物體的碰撞觸發的動畫。Clause 118: The method of any of clauses 112-117, wherein receiving the object collision data comprises receiving data comprising one or more of: boundary data representing one of the at least one virtual solid object or a plurality of collision boundaries; static data representing whether the at least one virtual solid object is affected by the collision; material data representing how the collision object interacts with the at least one virtual solid object; or animation data representing the interaction between the at least one virtual solid object Animations triggered by collisions of virtual solid objects.
條款119:如條款112-118中任一項之方法,其中,該至少一個虛擬固體物體包含虛擬牆、虛擬椅子或虛擬桌子中的一者。Clause 119: The method of any one of clauses 112-118, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
條款120:如條款112-119中任一項之方法,其中,該串流媒體數據包含glTF 2.0媒體數據。Clause 120: The method of any one of clauses 112-119, wherein the streaming media data comprises glTF 2.0 media data.
條款121:如條款112-120中任一項之方法,其中,接收該串流媒體數據包含:經由應用程式介面(API)從檢索單元請求該串流媒體數據。Clause 121: The method of any one of clauses 112-120, wherein receiving the streaming media data comprises: requesting the streaming media data from a retrieval unit via an application programming interface (API).
條款122:如條款112-121中任一項之方法,其中,該物體碰撞數據被包括在MPEG場景描述中。Clause 122: The method of any of clauses 112-121, wherein the object collision data is included in an MPEG scene description.
條款123:一種用於檢索媒體數據的裝置,該裝置包含:記憶體,用於儲存媒體數據;以及一個或多個處理器,其在電路中實現並且被組態以執行呈現引擎,該呈現引擎被組態以:接收串流媒體數據,該串流媒體數據表示包括至少一個虛擬固體物體的虛擬三維場景;接收表示該至少一個虛擬固體物體之邊界的物體碰撞數據;從用戶接收相機移動數據,該相機移動數據請求虛擬相機移動穿過該至少一個虛擬固體物體;以及使用該物體碰撞數據,更新該虛擬相機之位置,以確保該虛擬相機響應於該相機移動數據而保持在該至少一個虛擬固體物體之外。Clause 123: An apparatus for retrieving media data, the apparatus comprising: memory for storing media data; and one or more processors implemented in circuitry and configured to execute a rendering engine, the rendering engine configured to: receive streaming media data representing a virtual three-dimensional scene including at least one virtual solid object; receive object collision data representing boundaries of the at least one virtual solid object; receive camera movement data from a user, the camera movement data requests a virtual camera to move through the at least one virtual solid object; and updating the position of the virtual camera using the object collision data to ensure that the virtual camera remains on the at least one virtual solid in response to the camera movement data outside the object.
條款124:一種裝置,包含如條款85-97中任一項之裝置及如條款123之裝置之組合。Clause 124: A device comprising a combination of the device of any one of clauses 85-97 and the device of clause 123.
條款125:如條款123及124中任一項之裝置,其中,為了更新該虛擬相機之該位置,該呈現引擎被組態以:防止該虛擬相機穿越該至少一個虛擬固體物體。Clause 125: The device of any one of clauses 123 and 124, wherein, to update the position of the virtual camera, the rendering engine is configured to: prevent the virtual camera from passing through the at least one virtual solid object.
條款126:如條款123-125中任一項之裝置,其中,為了接收該物體碰撞數據,該呈現引擎被組態以:接收MPEG_mesh_collision延伸。Clause 126: The apparatus of any one of clauses 123-125, wherein, to receive the object collision data, the rendering engine is configured to: receive an MPEG_mesh_collision extension.
條款127:如條款126之裝置,其中,該MPEG_mesh_collision延伸包括定義用於該至少一個虛擬固體物體的至少一個3D網格的數據。Clause 127: The device of Clause 126, wherein the MPEG_mesh_collision extension includes data defining at least one 3D mesh for the at least one virtual solid object.
條款128:如條款127之裝置,其中,該MPEG_mesh_collision延伸包括定義以下各項中的至少一項的數據:用於該至少一個虛擬固體物體的3D網格之邊界、用於該3D網格的材料、或將響應於該虛擬相機接觸該3D網格而呈現的動畫。Clause 128: The apparatus of Clause 127, wherein the MPEG_mesh_collision extension includes data defining at least one of: a boundary of a 3D mesh for the at least one virtual solid object, a material for the 3D mesh , or an animation that will be rendered in response to the virtual camera touching the 3D mesh.
條款129:如條款123-128中任一項之裝置,其中,為了接收該物體碰撞數據,該呈現引擎被組態以接收包括以下各項中的一項或多項的數據:邊界數據,其表示該至少一個虛擬固體物體之一個或多個碰撞邊界;靜態數據,其表示該至少一個虛擬固體物體是否受到碰撞影響;材料數據,其表示碰撞物體如何與該至少一個虛擬固體物體互動;或者動畫數據,其表示由與該至少一個虛擬固體物體的碰撞觸發的動畫。Clause 129: The apparatus of any of Clauses 123-128, wherein, to receive the object collision data, the rendering engine is configured to receive data comprising one or more of the following: boundary data representing One or more collision boundaries of the at least one virtual solid object; static data indicating whether the at least one virtual solid object is affected by the collision; material data indicating how the collision object interacts with the at least one virtual solid object; or animation data , which represents an animation triggered by a collision with the at least one virtual solid object.
條款130:如條款123-129中任一項之裝置,其中,該至少一個虛擬固體物體包含虛擬牆、虛擬椅子或虛擬桌子中的一者。Clause 130: The apparatus of any of clauses 123-129, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
條款131:如條款123-130中任一項之裝置,其中,該串流媒體數據包含glTF 2.0媒體數據。Clause 131: The device of any one of clauses 123-130, wherein the streaming media data comprises glTF 2.0 media data.
條款132:如條款123-131中任一項之裝置,其中,為了接收該串流媒體數據,該呈現引擎被組態以:經由應用程式介面(API)從檢索單元請求該串流媒體數據。Clause 132: The device of any one of clauses 123-131, wherein, to receive the streaming media data, the rendering engine is configured to: request the streaming media data from the retrieval unit via an application programming interface (API).
條款133:如條款123-132中任一項之裝置,其中,該物體碰撞數據被包括在MPEG場景描述中。Clause 133: The apparatus of any of Clauses 123-132, wherein the object collision data is included in an MPEG scene description.
條款134:一種具有儲存在其上的指令的計算機可讀儲存媒體,該指令在被執行時使得處理器進行以下操作:接收串流媒體數據,該串流媒體數據表示包括至少一個虛擬固體物體的虛擬三維場景;接收表示該至少一個虛擬固體物體之邊界的物體碰撞數據;從用戶接收相機移動數據,該相機移動數據請求虛擬相機移動穿過該至少一個虛擬固體物體;以及使用該物體碰撞數據,更新該虛擬相機之位置,以確保該虛擬相機響應於該相機移動數據而保持在該至少一個虛擬固體物體之外。Clause 134: A computer-readable storage medium having stored thereon instructions that, when executed, cause a processor to: receive streaming media data representing a a virtual three-dimensional scene; receiving object collision data representing a boundary of the at least one virtual solid object; receiving camera movement data from a user requesting the virtual camera to move through the at least one virtual solid object; and using the object collision data, The position of the virtual camera is updated to ensure that the virtual camera remains outside the at least one virtual solid object in response to the camera movement data.
條款135:一種計算機可讀儲存媒體,其包含如條款98-110中任一項之計算機可讀儲存媒體及如條款134之計算機可讀儲存媒體之組合。Clause 135: A computer-readable storage medium comprising a combination of the computer-readable storage medium of any one of clauses 98-110 and the computer-readable storage medium of clause 134.
條款136:如條款134及135中任一項之計算機可讀媒體,其中,使得該處理器更新該虛擬相機之該位置的該指令包含:使得該處理器防止該虛擬相機穿越該至少一個虛擬固體物體的指令。Clause 136: The computer-readable medium of any one of clauses 134 and 135, wherein the instruction causing the processor to update the position of the virtual camera comprises: causing the processor to prevent the virtual camera from traversing the at least one virtual solid object instructions.
條款137:如條款134-136中任一項之計算機可讀媒體,其中,使得該處理器接收該物體碰撞數據的該指令包含:使得該處理器接收MPEG_mesh_collision延伸的指令。Clause 137: The computer-readable medium of any one of clauses 134-136, wherein the instructions causing the processor to receive the object collision data comprise instructions causing the processor to receive an MPEG_mesh_collision extension.
條款138:如條款134-137中任一項之計算機可讀媒體,其中,該MPEG_mesh_collision延伸包括定義用於該至少一個虛擬固體物體的至少一個3D網格的數據。Clause 138: The computer-readable medium of any one of clauses 134-137, wherein the MPEG_mesh_collision extension includes data defining at least one 3D mesh for the at least one virtual solid object.
條款139:如條款134-138中任一項之計算機可讀媒體,其中,該MPEG_mesh_collision延伸包括定義以下各項中的至少一項的數據:用於該至少一個虛擬固體物體的3D網格之邊界、用於該3D網格的材料、或將響應於該虛擬相機接觸該3D網格而呈現的動畫。Clause 139: The computer-readable medium of any of Clauses 134-138, wherein the MPEG_mesh_collision extension includes data defining at least one of: a boundary of a 3D mesh for the at least one virtual solid object , a material for the 3D mesh, or an animation to be rendered in response to the virtual camera touching the 3D mesh.
條款140:如條款134-139中任一項之計算機可讀媒體,其中,使得該處理器接收該物體碰撞數據的該指令包含使得該處理器接收包括以下各項中的一項或多項的數據的指令:邊界數據,其表示該至少一個虛擬固體物體之一個或多個碰撞邊界;靜態數據,其表示該至少一個虛擬固體物體是否受到碰撞影響;材料數據,其表示碰撞物體如何與該至少一個虛擬固體物體互動;或者動畫數據,其表示由與該至少一個虛擬固體物體的碰撞觸發的動畫。Clause 140: The computer-readable medium of any one of clauses 134-139, wherein the instructions causing the processor to receive the object collision data comprise causing the processor to receive data comprising one or more of the following Instructions for: boundary data, which represents one or more collision boundaries of the at least one virtual solid object; static data, which represents whether the at least one virtual solid object is affected by collision; material data, which represents how the collision object interacts with the at least one virtual solid object interaction; or animation data representing an animation triggered by a collision with the at least one virtual solid object.
Clause 141: The computer-readable medium of any one of Clauses 134-140, wherein the at least one virtual solid object comprises one of a virtual wall, a virtual chair, or a virtual table.
Clause 142: The computer-readable medium of any one of Clauses 134-141, wherein the streamed media data comprises glTF 2.0 media data.
Clause 143: The computer-readable medium of any one of Clauses 134-142, wherein the instructions that cause the processor to receive the streamed media data comprise instructions that cause the processor to request the streamed media data from a retrieval unit via an application programming interface (API).
Clause 144: The computer-readable medium of any one of Clauses 134-143, wherein the object collision data is included in an MPEG scene description.
Clause 145: A method of retrieving media data, the method comprising: receiving, by a rendering engine, streamed media data representing a virtual three-dimensional scene that includes at least one virtual solid object; receiving, by the rendering engine, camera control data for the three-dimensional scene, the camera control data including data defining restrictions that prevent a virtual camera from traversing the at least one virtual solid object; receiving, by the rendering engine, camera movement data from a user, the camera movement data requesting that the virtual camera move through the at least one virtual solid object; and preventing, using the camera control data, the virtual camera from traversing the at least one virtual solid object in response to the camera movement data.
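The steps of Clause 145 reduce, on the client side, to a single camera-update decision: apply the requested movement unless the camera control data forbids it. This is a minimal sketch under that assumption; the `blocks` predicate stands in for whatever restriction the camera control data actually encodes.

```python
def update_camera_position(position, movement, blocks):
    """Apply a requested camera movement unless camera control data forbids it.

    position: current camera position (x, y, z)
    movement: requested displacement (dx, dy, dz) from the user's input
    blocks:   predicate derived from the camera control data; returns True when
              the proposed position would traverse a virtual solid object
    """
    proposed = tuple(p + d for p, d in zip(position, movement))
    # Prevent the virtual camera from traversing the solid object: keep the
    # old position when the restriction applies.
    return position if blocks(proposed) else proposed

# Example: a virtual wall occupying the half-space x >= 5 blocks movement across it.
wall = lambda pos: pos[0] >= 5.0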
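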
Clause 146: The method of Clause 145, wherein the streamed media data comprises glTF 2.0 media data.
Clause 147: The method of any one of Clauses 145 and 146, wherein receiving the streamed media data comprises requesting the streamed media data from a retrieval unit via an application programming interface (API).
Clause 148: The method of any one of Clauses 145-147, wherein the camera control data is included in an MPEG scene description.
Clause 149: The method of any one of Clauses 145-148, wherein the camera control data is included in an MPEG_camera_control extension.
Clause 150: The method of Clause 149, wherein the MPEG_camera_control extension includes data defining two or more anchor points and one or more segments between the anchor points, the segments representing allowable camera movement vectors.
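One way a client might honor the anchor/segment data of Clause 150 is to project a requested camera position onto the nearest point of an allowed segment. The sketch below is illustrative: the actual MPEG_camera_control syntax is defined by Table 1 of the disclosure (not reproduced here), and a real client would clamp against the nearest of possibly many segments.

```python
def clamp_to_segment(p, a, b):
    """Project point p onto the segment between anchor points a and b.

    Keeps the camera on the allowable movement vector: the parameter t is
    clamped to [0, 1] so the result never leaves the two anchor points.
    """
    ab = [bb - aa for aa, bb in zip(a, b)]
    ap = [pp - aa for aa, pp in zip(a, p)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    return tuple(aa + t * c for aa, c in zip(a, ab))
```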
Clause 151: The method of any one of Clauses 145-148, wherein the MPEG_camera_control extension includes data defining a bounding volume representing an allowable camera movement volume.
Clause 152: The method of Clause 151, wherein the data defining the bounding volume comprises data defining at least one of a cone, a frustum, or a sphere.
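For the sphere case of Clause 152, the containment test is a plain distance comparison; the cone and frustum cases need plane and angle tests but follow the same pattern. This sketch assumes nothing beyond a center and radius for the bounding volume:

```python
import math

def inside_sphere(position, center, radius):
    """True when the camera position lies within the spherical bounding volume."""
    return math.dist(position, center) <= radius

def constrain_to_sphere(position, center, radius):
    """Pull a position back onto the sphere's surface when it would escape the
    allowable camera movement volume."""
    d = math.dist(position, center)
    if d <= radius:
        return position
    scale = radius / d  # shrink the offset from the center back to the radius
    return tuple(c + (p - c) * scale for p, c in zip(position, center))
```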
Clause 153: The method of any one of Clauses 149-152, wherein the MPEG_camera_control extension conforms to Table 1 above.
Clause 154: The method of any one of Clauses 149-153, wherein the at least one virtual solid object comprises a virtual wall.
Clause 155: The method of any one of Clauses 149-154, wherein preventing the virtual camera from traversing the at least one virtual solid object comprises preventing the virtual camera from moving along a virtual path that exceeds an allowable path defined in the MPEG_camera_control extension.
Clause 156: The method of any one of Clauses 145-155, wherein the camera control data is included in an MPEG_mesh_collision extension.
Clause 157: The method of any one of Clauses 145-155, wherein the MPEG_mesh_collision extension includes data defining at least one 3D mesh for the at least one virtual solid object.
Clause 158: The method of Clause 157, wherein the MPEG_mesh_collision extension includes data defining at least one of: a boundary of the 3D mesh, a material for the 3D mesh, or an animation to be presented in response to the virtual camera contacting the 3D mesh.
Clause 159: The method of any one of Clauses 156-158, wherein the MPEG_mesh_collision extension conforms to Table 2 above.
Clause 160: The method of any one of Clauses 156-159, wherein preventing the virtual camera from traversing the at least one virtual solid object comprises using the MPEG_mesh_collision extension to prevent the virtual camera from entering the at least one virtual solid object.
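In practice, a client checking Clause 160's "prevent the camera from entering the object" condition often approximates the 3D mesh by its axis-aligned bounding box. A sketch under that simplifying assumption, with the box derived from the mesh's vertex positions:

```python
def mesh_aabb(vertices):
    """Axis-aligned bounding box of a mesh, as (min_corner, max_corner) triples."""
    mins = tuple(min(v[i] for v in vertices) for i in range(3))
    maxs = tuple(max(v[i] for v in vertices) for i in range(3))
    return mins, maxs

def camera_enters(position, aabb):
    """True when the camera position falls inside the mesh's bounding box,
    i.e., the movement should be rejected to keep the camera outside the
    virtual solid object."""
    mins, maxs = aabb
    return all(lo <= p <= hi for p, lo, hi in zip(position, mins, maxs))
```

A more faithful client would test against the actual mesh surface (or the boundary data signaled in the MPEG_mesh_collision extension), but the accept/reject logic is the same.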
Clause 161: An apparatus for retrieving media data, the apparatus comprising one or more means for performing the method of any one of Clauses 145-160.
Clause 162: The apparatus of Clause 161, wherein the one or more means comprise one or more processors implemented in circuitry.
Clause 163: The apparatus of Clause 161, wherein the apparatus comprises at least one of: an integrated circuit; a microprocessor; or a wireless communication device.
Clause 164: An apparatus for retrieving media data, the apparatus comprising: means for receiving streamed media data representing a virtual three-dimensional scene that includes at least one virtual solid object; means for receiving camera control data for the three-dimensional scene, the camera control data including data defining restrictions that prevent a virtual camera from traversing the at least one virtual solid object; means for receiving camera movement data from a user, the camera movement data requesting that the virtual camera move through the at least one virtual solid object; and means for preventing, using the camera control data, the virtual camera from traversing the at least one virtual solid object in response to the camera movement data.
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that are non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Various examples have been described. These and other examples are within the scope of the following claims.
10: system
20: content preparation device
22: audio source
24: video source
26: audio encoder
28: video encoder
30: encapsulation unit
32: output interface
40: client device
42: audio output
44: video output
46: audio decoder
48: video decoder
50: decapsulation unit
52: retrieval unit
54: network interface
60: server device
62: storage medium
64: multimedia content
66: manifest file
68A-68N: representations
70: request processing unit
72: network interface
74: network
100: eMBMS middleware unit
102: proxy server unit
104: cache
106: eMBMS reception unit
110: DASH client
112: media application
114: rendering engine
120: multimedia content
122: media presentation description
124A-124N: representations
126, 130: header data
128A-128N, 132A-132N: segments
150: video file
152: file type (FTYP) box
154: movie (MOOV) box
156: movie header (MVHD) box
158: track (TRAK) box
160: movie extends (MVEX) box
162: segment index (SIDX) box
164: movie fragment (MOOF) box
166: movie fragment random access (MFRA) box
200: 3D scene
202: camera
204, 206: points
208, 210: bounding boxes
212: path segment
220: virtual object
250: retrieve media data
252: extract scene description
254: determine camera control data from scene description
256: determine movement restrictions from camera control data
258: receive camera movement data
260: determine that camera movement data requests movement through 3D solid virtual object
262: prevent virtual camera from traversing 3D solid virtual object
280: retrieve media data
282: extract scene description
284: determine camera control data from scene description
286: determine object collision data from camera control data
288: receive camera movement data
290: determine that camera movement data requests movement through 3D solid virtual object
292: prevent virtual camera from traversing 3D solid virtual object
FIG. 1 is a block diagram illustrating an example system that implements techniques for streaming media data over a network.
FIG. 2 is a block diagram illustrating an example set of components of the retrieval unit of FIG. 1 in greater detail.
FIG. 3 is a conceptual diagram illustrating elements of example multimedia content.
FIG. 4 is a block diagram illustrating elements of an example video file, which may correspond to a segment of a representation.
FIG. 5 is a conceptual diagram illustrating example camera path segments with bounding volumes, according to the techniques of this disclosure.
FIG. 6 is a conceptual diagram illustrating an example virtual object, in this example a chair.
FIG. 7 is a flowchart illustrating an example method of retrieving media data according to the techniques of this disclosure.
FIG. 8 is a flowchart illustrating an example method of retrieving media data according to the techniques of this disclosure.
Claims (31)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163159379P | 2021-03-10 | 2021-03-10 | |
US63/159,379 | 2021-03-10 | ||
US17/654,023 | 2022-03-08 | ||
US17/654,023 US20220292770A1 (en) | 2021-03-10 | 2022-03-08 | Object collision data for virtual camera in virtual interactive scene defined by streamed media data |
Publications (1)
Publication Number | Publication Date |
---|---|
TW202240431A true TW202240431A (en) | 2022-10-16 |
Family
ID=80978776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW111108833A TW202240431A (en) | 2021-03-10 | 2022-03-10 | Object collision data for virtual camera in virtual interactive scene defined by streamed media data |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP4305848A1 (en) |
JP (1) | JP2024509524A (en) |
KR (1) | KR20230155445A (en) |
BR (1) | BR112023017541A2 (en) |
TW (1) | TW202240431A (en) |
WO (1) | WO2022192886A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI833560B (en) * | 2022-11-25 | 2024-02-21 | 大陸商立訊精密科技(南京)有限公司 | Image scene construction method, apparatus, electronic equipment and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8044953B2 (en) * | 2002-06-28 | 2011-10-25 | Autodesk, Inc. | System for interactive 3D navigation for proximal object inspection |
JP4489800B2 (en) * | 2007-08-30 | 2010-06-23 | 株式会社スクウェア・エニックス | Image generating apparatus and method, program, and recording medium |
2022
- 2022-03-09: KR application KR1020237030132A (published as KR20230155445A)
- 2022-03-09: EP application EP22713836.9A (published as EP4305848A1, pending)
- 2022-03-09: BR application BR112023017541A (published as BR112023017541A2)
- 2022-03-09: WO application PCT/US2022/071056 (published as WO2022192886A1)
- 2022-03-09: JP application JP2023552338A (published as JP2024509524A, pending)
- 2022-03-10: TW application TW111108833A (published as TW202240431A)
Also Published As
Publication number | Publication date |
---|---|
EP4305848A1 (en) | 2024-01-17 |
JP2024509524A (en) | 2024-03-04 |
WO2022192886A1 (en) | 2022-09-15 |
BR112023017541A2 (en) | 2024-01-23 |
KR20230155445A (en) | 2023-11-10 |