CN115484476A - VR live video transmission method and device and storage medium - Google Patents


Info

Publication number
CN115484476A
CN115484476A (application CN202110600096.5A)
Authority
CN
China
Prior art keywords
data
unicast
fragment
hls
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110600096.5A
Other languages
Chinese (zh)
Inventor
陈戈
唐宏
梁洁
叶何亮
庄一嵘
陈步华
余媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202110600096.5A
Publication of CN115484476A
Legal status: Pending


Classifications

    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD] (H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION)
    • H04N 21/234309: Reformatting of video elementary streams for distribution by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N 21/2365: Multiplexing of several video streams into a multiplex stream
    • H04N 21/4347: Demultiplexing of several video streams at the client
    • H04N 21/440218: Client-side reformatting of video elementary streams by transcoding between formats or standards
    • H04N 21/6405: Multicasting

Abstract

The disclosure provides a VR live video transmission method, apparatus, and storage medium, relating to the field of communications technology. The method includes: encoding video data sent by a VR live source to generate sliced encoded data; encoding the sliced encoded data to generate panoramic encoded data and main-view encoded data; encapsulating the panoramic encoded data, the main-view encoded data, the unicast fragment index information, and the audio to generate a multicast stream and adding a PMT (Program Map Table); and sending the multicast stream to the multicast group and, upon receiving a unicast switching request sent by a terminal in the multicast group based on the PMT table, sending the corresponding unicast fragment data to the terminal. The method, apparatus, and medium combine multicast and unicast transmission, saving network bandwidth, exploiting operators' multicast bearer advantages, reducing CDN bearer cost, remaining compatible with a variety of terminals, and improving user experience.

Description

VR live video transmission method and device and storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a VR live video transmission method, apparatus, and storage medium.
Background
Existing VR live video schemes generally encapsulate the live stream in DASH or HLS files and serve users in unicast mode through a CDN (Content Delivery Network). They use VR FOV technology to reduce bandwidth and terminal decoding requirements, but existing VR FOV technology can only be carried over unicast, fails to fully exploit the advantages of operators' multicast networks, and incurs high CDN bearer cost.
Disclosure of Invention
In view of the above, an object of the present disclosure is to provide a VR live video transmission method, an apparatus, and a storage medium.
According to a first aspect of the disclosure, a VR live video transmission method is provided, including: encoding video data sent by a virtual reality (VR) live source to generate sliced encoded data; encoding the sliced encoded data to generate panoramic encoded data and main-view encoded data; encapsulating the panoramic encoded data, the main-view encoded data, the corresponding unicast fragment index information, and the audio to generate a multicast stream; adding a PMT table corresponding to the multicast stream into the multicast stream; sending the multicast stream to all terminals in a multicast group so that every terminal plays the panoramic encoded data or the main-view encoded data; and, upon receiving a unicast switching request sent by a terminal in the multicast group based on the PMT table, sending the unicast fragment data corresponding to the request to the terminal, so that the terminal receives the multicast stream and the unicast fragment data simultaneously and performs the corresponding display processing.
Optionally, encoding the video data sent by the virtual reality (VR) live source to generate the sliced encoded data includes: encoding the video data with a preset coding scheme to generate the sliced encoded data; the coding scheme includes motion-constrained tile set (MCTS) coding.
Optionally, the unicast fragment data includes HLS fragment data; the method further includes: encapsulating the sliced encoded data with the HLS protocol to generate the HLS fragment data.
Optionally, the unicast fragment index information includes: the URL link and offset of the HLS fragment data. Encapsulating the panoramic encoded data, the main-view encoded data, the corresponding unicast fragment index information, and the audio to generate a multicast stream includes: performing TS (Transport Stream) encapsulation on the panoramic encoded data, the main-view encoded data, the URL link and offset of the HLS fragment data, and the audio to generate a TS stream, which serves as the multicast stream.
Optionally, the information carried by the PMT table includes: a first PID for the panoramic encoded data, a second PID for the main-view encoded data, a third PID for the HLS fragment index together with the URL link and offset of the HLS fragment data, and a fourth PID for the audio.
Optionally, the terminal extracts and parses the PMT table from the TS stream and, based on the parsing result, acquires the corresponding encoded data and audio for display; upon receiving the TS stream, the terminal first displays the main-view encoded data.
Optionally, the unicast switching request includes: the URL link and offset of the HLS fragment data; sending the unicast fragment data corresponding to the unicast switching request to the terminal includes: acquiring the corresponding HLS fragment data according to the URL link and offset of the HLS fragment index and sending the HLS fragment data to the terminal in unicast mode.
According to a second aspect of the present disclosure, there is provided a VR live video transmission apparatus, including: a first encoding module, configured to encode video data sent by a virtual reality (VR) live source to generate sliced encoded data; a second encoding module, configured to encode the sliced encoded data to generate panoramic encoded data and main-view encoded data; a multicast stream encapsulation module, configured to encapsulate the panoramic encoded data, the main-view encoded data, the corresponding unicast fragment index information, and the audio to generate a multicast stream, and to add a PMT table corresponding to the multicast stream into the multicast stream; a multicast stream sending module, configured to send the multicast stream to all terminals in a multicast group so that every terminal plays the panoramic encoded data or the main-view encoded data; and a unicast stream sending module, configured to receive a unicast switching request sent by a terminal in the multicast group based on the PMT table and to send the unicast fragment data corresponding to the request to the terminal, so that the terminal receives the multicast stream and the unicast fragment data simultaneously and performs the corresponding display processing.
Optionally, the first encoding module is specifically configured to encode the video data with a preset coding scheme to generate the sliced encoded data; the coding scheme includes motion-constrained tile set (MCTS) coding.
Optionally, the unicast fragment data includes HLS fragment data; the apparatus further includes a unicast encapsulation module configured to encapsulate the sliced encoded data with the HLS protocol to generate the HLS fragment data.
Optionally, the unicast fragment index information includes: the URL link and offset of the HLS fragment data; the multicast stream encapsulation module is specifically configured to perform TS encapsulation on the panoramic encoded data, the main-view encoded data, the URL link and offset of the HLS fragment data, and the audio to generate a TS stream as the multicast stream.
Optionally, the information carried by the PMT table includes: a first PID for the panoramic encoded data, a second PID for the main-view encoded data, a third PID for the HLS fragment index together with the URL link and offset of the HLS fragment data, and a fourth PID for the audio.
Optionally, the unicast switching request includes: the URL link and offset of the HLS fragment data; the unicast stream sending module is specifically configured to obtain the corresponding HLS fragment data according to the URL link and offset of the HLS fragment index and to send the HLS fragment data to the terminal in unicast mode.
According to a third aspect of the present disclosure, there is provided a VR live video transmission apparatus, including: a memory; and a processor coupled to the memory, the processor configured to perform the method as described above based on instructions stored in the memory.
According to a fourth aspect of the present disclosure, there is provided a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the method described above.
The VR live video transmission method, apparatus, and storage medium of the present disclosure encode video data sent by a VR live source to generate sliced encoded data; encode the sliced encoded data to generate panoramic encoded data and main-view encoded data; encapsulate the panoramic encoded data, the main-view encoded data, the unicast fragment index information, and the audio to generate a multicast stream and add a PMT (Program Map Table); send the multicast stream to all terminals in the multicast group; and, upon receiving a unicast switching request sent by a terminal in the multicast group based on the PMT table, send the unicast fragment data corresponding to the request to the terminal. The method combines multicast and unicast transmission, saving network bandwidth, exploiting operators' multicast bearer advantages, reducing CDN bearer cost, remaining compatible with a variety of terminals, and improving user experience.
Drawings
In order to illustrate the embodiments of the present disclosure or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present disclosure, and those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a prior VR live broadcasting technical scheme;
fig. 2 is a flow diagram of one embodiment of a method of transmitting VR live video in accordance with the present disclosure;
fig. 3 is an application architecture diagram of one embodiment of a method of transmitting VR live video according to the present disclosure;
fig. 4 is a data processing diagram of one embodiment of a VR live video transmission method according to the present disclosure;
fig. 5A is a schematic diagram of initial multicast reception and decoding at a terminal side; fig. 5B is a schematic diagram of switching from multicast to VR FOV at the terminal side; fig. 5C is a schematic diagram illustrating stable playing after the terminal side viewing angle is switched;
fig. 6 is a schematic diagram of an actual application of an embodiment of a VR live video transmission method according to the present disclosure;
fig. 7 is a block schematic diagram of one embodiment of a VR live video transmission apparatus according to the present disclosure;
fig. 8 is a block schematic diagram of another embodiment of a VR live video transmission apparatus according to the present disclosure;
fig. 9 is a block diagram of yet another embodiment of an apparatus for transmitting VR live video according to the present disclosure.
Detailed Description
The present disclosure will now be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments of the disclosure are shown. The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings; obviously, the described embodiments are only a part of the embodiments of the present disclosure, not all of them. All other embodiments derived by a person skilled in the art from the embodiments disclosed herein without creative effort fall within the protection scope of the present disclosure. The technical solution of the present disclosure is described below with reference to the drawings and embodiments.
VR (Virtual Reality) live broadcast refers to a real-time virtual reality audio/video content service in which a live source projects and encodes 360-degree panoramic video collected in real time. VR live video provides the end user with an immersive 360-degree experience: it supports panoramic video content and interactive view switching, and the terminal can dynamically render images, video, and the associated audio according to the user's viewing angle.
As shown in fig. 1, in the existing VR live broadcast technical solution, the live stream is generally encapsulated in DASH (Dynamic Adaptive Streaming over HTTP) or HLS (HTTP Live Streaming) file mode and served to users in unicast mode through a CDN. The existing VR live video scheme has the following problems:
1. To match the effect of existing high-definition live video, VR live broadcast requires very high bandwidth (for example, 8K 60P reaches 120 to 180 Mb/s), which the access bandwidth of a large number of existing IPTV (Internet Protocol Television) users cannot meet.
2. To reduce bandwidth requirements and terminal decoding requirements, the existing VR live scheme uses VR FOV (Field of View) technology, but existing VR FOV technology can only use unicast bearer, fails to fully exploit the multicast network advantages of telecom operators, and incurs very high CDN bearer cost.
3. Existing legacy IPTV terminals do not support VR projection and therefore cannot play VR live content.
Fig. 2 is a schematic flowchart of an embodiment of a VR live video transmission method according to the present disclosure, as shown in fig. 2:
step 201, encoding video data sent by a Virtual Reality (VR) live source to generate fragment encoded data.
Step 202, the coded data of the slice is coded to generate the coded data of the panorama and the coded data of the main view angle.
Step 203, packaging the panoramic coded data, the main view coded data, the corresponding unicast fragment data index information and the audio to generate a multicast stream; a PMT (Program Map Table) Table corresponding to the multicast stream is added to the multicast stream.
And step 204, sending the multicast stream to all terminals in the multicast group so that all terminals play the panoramic coded data or the main view coded data.
Step 205, receiving a unicast switching request sent by the terminal in the multicast group based on the PMT table, and sending unicast fragment data corresponding to the unicast switching request to the terminal, so that the terminal receives the multicast stream and the unicast fragment data at the same time, and performs corresponding display processing. The panoramic coding data or the main view coding data can be played by adopting various methods, and the multicast stream and the unicast fragment data are displayed.
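For illustration only (not part of the claimed method), steps 201 to 205 can be sketched as a toy Python pipeline. Every name below (`mcts_encode`, `encode_panorama`, the PID values, and the byte-slicing stand-ins for real codecs) is a hypothetical placeholder, not the patent's actual encoding:

```python
# Illustrative sketch of steps 201-205; all helpers are invented
# placeholders, not real video codecs.

def mcts_encode(video_data: bytes, n_tiles: int = 4) -> list:
    """Step 201: split video into tile 'slices' (placeholder for MCTS coding)."""
    size = max(1, len(video_data) // n_tiles)
    return [video_data[i:i + size] for i in range(0, len(video_data), size)]

def encode_panorama(tiles: list) -> bytes:
    """Step 202a: low-definition panoramic stream (placeholder transcode)."""
    return b"".join(t[: len(t) // 2] for t in tiles)  # pretend downscaling

def encode_main_view(tiles: list, view: int) -> bytes:
    """Step 202b: high-definition main-view stream (placeholder projection)."""
    return tiles[view]

def build_multicast_stream(pano: bytes, main: bytes,
                           index: bytes, audio: bytes) -> dict:
    """Step 203: bundle the streams plus a PMT mapping PIDs to payloads."""
    return {
        "pmt": {0x101: "panorama", 0x102: "main_view",
                0x103: "hls_index", 0x104: "audio"},
        0x101: pano, 0x102: main, 0x103: index, 0x104: audio,
    }

video = bytes(range(64))                      # stand-in for VR source frames
tiles = mcts_encode(video)
stream = build_multicast_stream(
    encode_panorama(tiles),
    encode_main_view(tiles, view=0),
    b"http://cdn.example/seg42.ts#offset=0",  # invented index entry
    b"audio-frames")
```

Steps 204 and 205 would then send `stream` to the multicast group and answer unicast switching requests against the index entry.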
The VR live video transmission method is compatible with existing IPTV multicast, exploits operators' multicast bearer advantages, reduces CDN bearer cost, is compatible with a variety of terminals, and helps broaden the audience of VR live video.
In one embodiment, the video data sent by the virtual reality VR live source can be encoded in a number of ways. For example, the video data is encoded with a preset coding scheme to generate the sliced encoded data; the coding scheme may be, for example, MCTS (Motion-Constrained Tile Set) coding. The unicast fragment data includes HLS fragment data and the like: the sliced encoded data is encapsulated with the HLS protocol to generate the HLS fragment data, for which existing encapsulation methods can be used.
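A minimal sketch of the HLS fragment indexing described above: the sliced encoded data is packaged into numbered fragments, and each index entry records the fragment URL together with its byte offset, which is what the multicast stream later carries. The fragment naming and base URL are invented for the example:

```python
# Hypothetical sketch: build the (URL, offset) index for HLS fragments.

def make_hls_index(fragments: list, base_url: str) -> list:
    index, offset = [], 0
    for seq, frag in enumerate(fragments):
        index.append({
            "url": f"{base_url}/live_{seq}.ts",  # URL of one live fragment
            "offset": offset,                    # byte offset carried with it
            "length": len(frag),
        })
        offset += len(frag)
    return index

index = make_hls_index([b"a" * 188, b"b" * 376], "http://cdn.example/vr")
```

A terminal that later requests fragment 1 would use its `url` and `offset` fields, matching the offset-carrying URL described for the unicast switching request.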
The unicast fragment index information includes the URL link and offset of the HLS fragment data. The multicast stream can be generated by various methods; for example, the panoramic encoded data, the main-view encoded data, the URL link and offset of the HLS fragment data, and the audio are TS (Transport Stream) encapsulated to generate a TS stream, which serves as the multicast stream. A variety of existing TS encapsulation methods can be used.
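To make the TS encapsulation concrete, here is a minimal illustrative packetizer for one elementary stream: standard 188-byte TS packets with the 0x47 sync byte and a 13-bit PID. A real mux also needs PAT/PMT sections, PCR, and adaptation fields; this only shows how a payload is split onto one PID:

```python
# Minimal illustrative TS packetization (payload-only packets, no
# adaptation field); not a complete TS multiplexer.

TS_PACKET = 188

def ts_packets(payload: bytes, pid: int) -> list:
    assert 0 <= pid <= 0x1FFF
    packets, cc = [], 0
    for i in range(0, len(payload), TS_PACKET - 4):
        chunk = payload[i:i + TS_PACKET - 4]
        pusi = 0x40 if i == 0 else 0x00        # payload_unit_start_indicator
        header = bytes([
            0x47,                              # sync byte
            pusi | (pid >> 8),                 # PUSI flag + PID high 5 bits
            pid & 0xFF,                        # PID low 8 bits
            0x10 | cc,                         # payload present + continuity counter
        ])
        packets.append(header + chunk.ljust(TS_PACKET - 4, b"\xff"))
        cc = (cc + 1) & 0x0F
    return packets

pkts = ts_packets(b"x" * 500, pid=0x101)  # 500 bytes -> 3 packets
```

Each of the patent's four streams (panorama, main view, HLS index, audio) would be packetized onto its own PID in this way.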
The PMT table may take any of a variety of existing formats. For example, the data carried by the PMT table includes: a first PID for the panoramic encoded data, a second PID for the main-view encoded data, a third PID for the HLS fragment index together with the URL link and offset of the HLS fragment data, and a fourth PID for the audio.
In the above embodiment, to address the problems that existing VR FOV technology can only use unicast bearer, fails to exploit telecom operators' multicast networks, and incurs high CDN bearer cost, the VR live video transmission method adopts a new hybrid encapsulation format that packages the VR live background stream, the main view, and the FOV (Field of View) stream using the TS protocol, combined with mixed multicast and unicast transmission.
In one embodiment, as shown in fig. 3, the VR live video transmission method of the present disclosure is applied in a VR live video transmission apparatus. As shown in fig. 4, the VR stream fusion encapsulation process first decapsulates and decodes the data sent by the VR live source.
TILE high-definition slice coding: MCTS-based TILE coding is performed to facilitate the subsequent main-view coding.
Panoramic low-definition coding: the sliced encoded data output by TILE high-definition coding is transcoded to reduce the resolution and achieve a low bit rate.
Main-view high-definition projection coding: the TILEs are projected into the main-view plane according to the current main view and then encoded in high definition.
TS coupled encapsulation: based on the TS format standard, the VR panoramic low-definition video, the main-view high-definition video, the HLS index information (including the URL link and offset of the HLS fragment data), etc. are fusion-encapsulated; various existing fusion-encapsulation methods can be used.
After TS coupled encapsulation, a TS coupled stream is output so that the video streams can be synchronized. For example, whatever the source format of the VR live source, it is decoded first and then re-encoded with MCTS through TILE high-definition coding; during panoramic low-definition, main-view, and DASH packaging, the frame rate, timing, and other GOP (Group of Pictures) information must not be changed; TS coupled encapsulation performs time-synchronized fusion encapsulation based on the same audio, and rendering and display must be synchronized based on PTS; the TS coupled encapsulation must conform to the TS standard, with the information of each video stream carried in a PMT table whose filling rules are shown in Table 1 below:
Code stream                  PID     Corresponding payload
Panoramic low definition     PID1    Panoramic video stream
Main-view high definition    PID2    Projection-encoded main-view stream
HLS index                    PID3    URL links of HLS fragments, carrying offsets
Audio                        PID4    Audio common to all video streams; audio time serves as the synchronization reference

Table 1: PMT table for TS coupled encapsulation
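Table 1 can be read as a capability-driven lookup: a plain IPTV set-top box only needs the projected main view and the audio, while a VR-capable terminal also takes the panorama and the HLS index. A small hypothetical sketch of that selection (descriptions paraphrase Table 1; the selection function is illustrative, not from the patent):

```python
# Table 1 as data, plus a hypothetical terminal-side PID selection.

PMT = {
    "PID1": "panoramic low-definition video stream",
    "PID2": "projection-encoded main-view stream",
    "PID3": "HLS fragment URL links, carrying offsets",
    "PID4": "common audio (audio time is the sync reference)",
}

def select_pids(vr_capable: bool) -> list:
    """Pick the PIDs a terminal demultiplexes from the TS coupled stream."""
    if vr_capable:
        return ["PID1", "PID2", "PID3", "PID4"]  # panorama + FOV path
    return ["PID2", "PID4"]                      # flat main view only

legacy = select_pids(vr_capable=False)
```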
The terminal extracts and parses the PMT table from the TS stream and, based on the parsing result, acquires the corresponding encoded data and audio for display. Upon receiving the TS stream, the terminal first displays the main-view encoded data. The unicast switching request includes the URL link and offset of the HLS fragment data; the corresponding HLS fragment data is obtained according to the URL link and offset of the HLS fragment index and sent to the terminal in unicast mode.
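One plausible way for a terminal to honor the offset-carrying fragment URL is an HTTP Range request starting at that offset. The sketch below only builds the request description; the host name is invented and no real request is sent, since the patent does not specify the exact fetch mechanism:

```python
# Hypothetical unicast switching request built from one HLS index entry
# (url + offset as carried in the PMT's HLS index PID).

def build_switch_request(entry: dict) -> dict:
    return {
        "method": "GET",
        "url": entry["url"],
        "headers": {"Range": f"bytes={entry['offset']}-"},  # start mid-fragment
    }

req = build_switch_request(
    {"url": "http://cdn.example/vr/live_7.ts", "offset": 94000})
```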
In one embodiment, as shown in fig. 5A, initial multicast reception and decoding on the terminal side proceeds as follows: different types of terminals receive the same multicast stream and first acquire and parse the PMT table; each terminal obtains the video and audio stream PIDs from the PMT table according to its own decoding and VR processing capability; initially, all terminals decode and display the main-view video.
As shown in fig. 5B, multicast on the terminal side switches to VR FOV as follows: when the user switches the field of view, a switch from multicast to the unicast VR FOV stream is triggered (legacy IPTV terminals do not support this switch); the terminal parses the FOV unicast URL from the FOV index (the FOV index is the HLS index, containing the URL link and offset of the HLS fragment data) and continues receiving the panoramic low-definition stream. The FOV unicast URL (the URL link of the HLS fragment data) points to a specific live fragment and carries an offset; it does not point to the overall index URL.
As shown in fig. 5C, stable playback after the view switch on the terminal side proceeds as follows: once stable, the terminal simultaneously receives the multicast panoramic low-definition stream and the FOV unicast high-definition stream and displays them superimposed; during FOV unicast playback, the terminal renders and overlays the background stream and the FOV stream according to the audio clock. Various existing methods can be used for the superimposed display and overlay rendering.
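The audio-clock overlay described above can be sketched as follows: for the current audio PTS, pick the background (panorama) frame and the FOV frame whose PTS is closest, then composite them. The frame lists and PTS values are invented for the example; real playback would also handle PTS wraparound and frame dropping:

```python
# Illustrative audio-clock synchronization for superimposed display.

def nearest_frame(frames: list, audio_pts: int) -> dict:
    """Pick the frame whose PTS is closest to the audio clock."""
    return min(frames, key=lambda f: abs(f["pts"] - audio_pts))

background = [{"pts": 0, "img": "bg0"}, {"pts": 3000, "img": "bg1"}]
fov = [{"pts": 0, "img": "fov0"}, {"pts": 3003, "img": "fov1"}]

audio_pts = 3001  # current audio clock (invented 90 kHz-style value)
composite = (nearest_frame(background, audio_pts)["img"],
             nearest_frame(fov, audio_pts)["img"])
```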
As shown in fig. 6, an IPTV user receives the VR fusion-encapsulated video stream and separates out the main-view high-definition video stream according to the PID indication of the PMT table. Since the main-view high-definition stream is projection-encoded into an ordinary flat video stream, a common IPTV terminal can decode it. Due to the functional limitations of a common IPTV terminal, such users cannot use functions such as VR live panoramic playback and view switching.
For a VR IPTV terminal or a VR head-mounted display (HMD), the terminal initially plays the ordinary flat video stream; when the user triggers a view switch, the terminal separates the VR background stream and the FOV index according to the indication of the PMT table, requests the FOV live fragment files from the CDN according to the FOV index, and finally achieves FOV playback.
In one embodiment, as shown in fig. 7, the present disclosure provides a VR live video transmission apparatus 70, comprising: a first encoding module 71, a second encoding module 72, a multicast stream encapsulation module 73, a multicast stream transmission module 74, and a unicast stream transmission module 75.
The first encoding module 71 encodes the video data sent by the virtual reality VR live source to generate sliced encoded data. The second encoding module 72 encodes the sliced encoded data to generate panoramic encoded data and main-view encoded data. The multicast stream encapsulation module 73 encapsulates the panoramic encoded data, the main-view encoded data, the corresponding unicast fragment index information, and the audio to generate a multicast stream, wherein a PMT table corresponding to the multicast stream is added into the multicast stream.
The multicast stream sending module 74 sends the multicast stream to all the terminals in the multicast group, so that all the terminals play the panoramic encoded data or the main view encoded data. The unicast stream sending module 75 receives a unicast switching request sent by a terminal in the multicast group based on the PMT table, and sends unicast fragment data corresponding to the unicast switching request to the terminal, so that the terminal receives the multicast stream and the unicast fragment data at the same time and performs corresponding display processing.
In one embodiment, the first encoding module 71 encodes the video data with a preset coding scheme to generate the sliced encoded data; the coding scheme includes MCTS coding and the like. As shown in fig. 8, the unicast fragment data includes HLS fragment data; the VR live video transmission apparatus 70 further includes a unicast encapsulation module 76, which encapsulates the sliced encoded data with the HLS protocol to generate the HLS fragment data.
In one embodiment, the unicast fragment data index information includes the URL link, offset, and the like of the HLS fragment data, and the multicast stream encapsulation module 73 performs TS encapsulation on the panorama encoded data, the main-view encoded data, the URL link and offset of the HLS fragment data, and the audio to generate a TS stream, which serves as the multicast stream. The PMT table carries a first PID for the panorama encoded data, a second PID for the main-view encoded data, a third PID for the HLS fragment data index, the URL link and offset of the HLS fragment data index, and a fourth PID for the audio.
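One way the URL link and offset of the HLS fragment data index could travel inside the TS stream is as a small length-prefixed binary record. The patent does not specify a wire format, so the descriptor tag `0xA0` and field layout below are purely hypothetical, chosen to resemble MPEG-2 descriptor conventions (tag byte, length byte, payload).

```python
import struct

def pack_index_descriptor(url, offset):
    """Pack a (URL, offset) pair into a length-prefixed binary record.
    Layout (hypothetical): 1-byte tag 0xA0, 1-byte payload length,
    4-byte big-endian byte offset, then the UTF-8 URL."""
    url_bytes = url.encode("utf-8")
    return struct.pack("!BBI", 0xA0, 4 + len(url_bytes), offset) + url_bytes

def unpack_index_descriptor(blob):
    """Recover the tag, offset, and URL from a packed index record."""
    tag, length = struct.unpack("!BB", blob[:2])
    (offset,) = struct.unpack("!I", blob[2:6])
    url = blob[6:2 + length].decode("utf-8")
    return {"tag": tag, "offset": offset, "url": url}
```

A real implementation would register such a record under the third PID of the PMT so the terminal can extract it alongside the audio and video elementary streams.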
The unicast stream sending module 75 obtains the corresponding HLS fragment data according to the URL link and offset of the HLS fragment data index, and sends the obtained HLS fragment data to the terminal in a unicast manner.
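Serving a fragment "according to the URL link and offset" amounts to a byte-range read at the named offset, which a client could equally express as an HTTP Range request against the fragment URL. The helper names `read_fragment` and `range_header` are assumptions of this sketch; the patent does not prescribe HTTP Range semantics.

```python
def read_fragment(data, offset, length):
    """Serve the requested HLS fragment bytes: seek to the byte offset
    named in the index and return `length` bytes from the fragment store."""
    if offset >= len(data):
        raise ValueError("offset beyond end of fragment store")
    return data[offset:offset + length]

def range_header(offset, length):
    """HTTP Range header (RFC 7233) a client would send for the same
    read: an inclusive byte range starting at `offset`."""
    return f"bytes={offset}-{offset + length - 1}"
```

For example, a 376-byte read at offset 188 maps to the inclusive range `bytes=188-563`.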
Fig. 9 is a block diagram of yet another embodiment of a VR live video transmission apparatus according to the present disclosure. As shown in fig. 9, the apparatus may include a memory 91, a processor 92, a communication interface 93, and a bus 94. The memory 91 stores instructions; the processor 92 is coupled to the memory 91 and is configured to execute, based on the instructions stored in the memory 91, the VR live video transmission method described above.
The memory 91 may be a high-speed RAM, a non-volatile memory, or the like, and may also be a memory array. The memory 91 may further be partitioned into blocks, which may be combined into virtual volumes according to certain rules. The processor 92 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the VR live video transmission method of the present disclosure.
In one embodiment, the present disclosure provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement a method of transmitting VR live video as in any one of the above embodiments.
In the VR live video transmission method and apparatus and the storage medium of the above embodiments, video data sent by a VR live source is encoded to generate fragment encoded data; the fragment encoded data is encoded to generate panorama encoded data and main-view encoded data; the panorama encoded data, the main-view encoded data, the unicast fragment data index information, and the audio are encapsulated to generate a multicast stream, to which a program map table (PMT) is added; the multicast stream is sent to all terminals in the multicast group, and when a unicast switching request sent by a terminal in the multicast group based on the PMT is received, the unicast fragment data corresponding to the request is sent to that terminal. In this way, multicast and unicast can be combined for transmission, saving network bandwidth while allowing switching between the two; the scheme is compatible with existing IPTV multicast, exploits the carrying advantage of operator multicast to reduce CDN carrying costs, and supports multiple types of terminals, which helps broaden the VR live audience and improves the user experience.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or to limit the disclosure to the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical application, and to enable others of ordinary skill in the art to understand the disclosure and devise various embodiments with various modifications suited to the particular use contemplated.

Claims (15)

1. A VR live video transmission method comprises the following steps:
coding video data sent by a virtual reality VR live source to generate fragment coded data;
coding the fragment coded data to generate panoramic coded data and main visual angle coded data;
packaging the panoramic coding data, the main view coding data, the corresponding unicast fragment data index information and the audio to generate a multicast stream; adding a PMT table corresponding to the multicast stream into the multicast stream;
sending the multicast stream to all terminals in a multicast group so that all terminals play the panoramic coded data or the main view coded data;
and receiving a unicast switching request sent by a terminal in the multicast group based on the PMT, and sending unicast fragment data corresponding to the unicast switching request to the terminal so that the terminal receives the multicast stream and the unicast fragment data at the same time and performs corresponding display processing.
2. The method of claim 1, wherein the encoding of the video data sent by the virtual reality (VR) live source to generate the fragment encoded data comprises:
encoding the video data based on a preset coding mode to generate the fragment encoded data; wherein the coding mode comprises: motion-constrained tile set (MCTS) coding.
3. The method of claim 1 or 2, the unicast fragmentation data comprising: HLS slicing data; the method further comprises the following steps:
and packaging the fragment coded data by using an HLS protocol to generate the HLS fragment data.
4. The method of claim 3, the unicast fragment data index information comprising: URL link and offset of the HLS fragment data; the encapsulating the panoramic encoded data, the main view encoded data, and the corresponding unicast fragment data index information and audio, and generating a multicast stream includes:
and performing TS packaging processing on the panoramic coding data, the main view coding data, the URL link and offset of the HLS slicing data and the audio to generate a TS stream which is used as the multicast stream.
5. The method of claim 4, wherein,
the information carried by the PMT table comprises: a first PID of the panorama encoded data, a second PID of the main view encoded data, a third PID of the HLS sliced data index, a URL link and offset of the HLS sliced data index, and a fourth PID of the audio.
6. The method of claim 5, further comprising:
the terminal extracts and analyzes the PMT table from the TS stream, and acquires corresponding encoded data and audio for display processing based on the analysis result; and after receiving the TS stream, the terminal firstly displays the main view coding data.
7. The method of claim 5, the unicast handover request comprising: URL link and offset of the HLS fragment data; the sending the unicast fragment data corresponding to the unicast switching request to the terminal comprises:
and acquiring corresponding HLS fragment data according to the URL link and the offset of the HLS fragment data index, and sending the HLS fragment data to the terminal in a unicast mode.
8. A VR live video transmission device, comprising:
the first coding module is used for coding video data sent by a virtual reality VR live source to generate fragment coded data;
the second coding module is used for coding the fragment coded data to generate panoramic coded data and main view coded data;
a multicast stream encapsulation module, configured to encapsulate the panoramic encoded data, the primary view encoded data, and corresponding unicast fragment data index information and audio, and generate a multicast stream; adding a PMT table corresponding to the multicast stream in the multicast stream;
a multicast stream sending module, configured to send the multicast stream to all terminals in a multicast group, so that all terminals play the panoramic encoded data or the primary view encoded data;
and the unicast stream sending module is used for receiving a unicast switching request sent by a terminal in the multicast group based on the PMT, and sending unicast fragment data corresponding to the unicast switching request to the terminal so that the terminal can simultaneously receive the multicast stream and the unicast fragment data and perform corresponding display processing.
9. The apparatus of claim 8, wherein,
the first encoding module is specifically configured to encode the video data based on a preset coding mode to generate the fragment encoded data; wherein the coding mode comprises: motion-constrained tile set (MCTS) coding.
10. The apparatus of claim 8 or 9, the unicast fragmentation data comprising: HLS slicing data; the device further comprises:
and the unicast encapsulation module is used for encapsulating the fragment coded data by using an HLS protocol to generate the HLS fragment data.
11. The apparatus of claim 10, the unicast fragment data index information comprising: URL link and offset of the HLS fragment data;
the multicast stream encapsulation module is specifically configured to perform TS encapsulation on the panoramic encoded data, the primary view encoded data, the URL link and offset of the HLS sliced data, and the audio, and generate a TS stream, which is used as the multicast stream.
12. The apparatus of claim 11, wherein,
the information carried by the PMT table comprises: a first PID of the panorama encoded data, a second PID of the main view encoded data, a third PID of the HLS sliced data index, a URL link and offset of the HLS sliced data index, and a fourth PID of the audio.
13. The apparatus of claim 12, the unicast handover request comprising: URL link and offset of the HLS fragment data;
the unicast stream sending module is specifically configured to obtain corresponding HLS fragment data according to the URL link and the offset of the HLS fragment data index, and send the HLS fragment data to the terminal in a unicast manner.
14. A VR live video transmission device, comprising:
a memory; and a processor coupled to the memory, the processor configured to perform the method of any of claims 1-7 based on instructions stored in the memory.
15. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement the method of any one of claims 1 to 7.
CN202110600096.5A 2021-05-31 2021-05-31 VR live video transmission method and device and storage medium Pending CN115484476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110600096.5A CN115484476A (en) 2021-05-31 2021-05-31 VR live video transmission method and device and storage medium


Publications (1)

Publication Number Publication Date
CN115484476A true CN115484476A (en) 2022-12-16

Family

ID=84420084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110600096.5A Pending CN115484476A (en) 2021-05-31 2021-05-31 VR live video transmission method and device and storage medium

Country Status (1)

Country Link
CN (1) CN115484476A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116668779A (en) * 2023-08-01 2023-08-29 中国电信股份有限公司 Virtual reality view field distribution method, system, device, equipment and medium
CN116668779B (en) * 2023-08-01 2023-10-10 中国电信股份有限公司 Virtual reality view field distribution method, system, device, equipment and medium

Similar Documents

Publication Publication Date Title
US10587883B2 (en) Region-wise packing, content coverage, and signaling frame packing for media content
US10565463B2 (en) Advanced signaling of a most-interested region in an image
US10582201B2 (en) Most-interested region in an image
US20190104326A1 (en) Content source description for immersive media data
US10567734B2 (en) Processing omnidirectional media with dynamic region-wise packing
US10575018B2 (en) Enhanced high-level signaling for fisheye virtual reality video in dash
CN110035331B (en) Media information processing method and device
US10659760B2 (en) Enhanced high-level signaling for fisheye virtual reality video
US11089285B2 (en) Transmission device, transmission method, reception device, and reception method
US10931980B2 (en) Method and apparatus for providing 360 degree virtual reality broadcasting service
US20200228837A1 (en) Media information processing method and apparatus
RU2767300C2 (en) High-level transmission of service signals for video data of "fisheye" type
KR102176404B1 (en) Communication apparatus, communication data generation method, and communication data processing method
CN115484476A (en) VR live video transmission method and device and storage medium
US10587904B2 (en) Processing media data using an omnidirectional media format
KR20170130883A (en) Method and apparatus for virtual reality broadcasting service based on hybrid network
KR101656193B1 (en) MMT-based Broadcasting System and Method for UHD Video Streaming over Heterogeneous Networks
US20190014362A1 (en) Enhanced region-wise packing and viewport independent hevc media profile

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination