WO2023273675A1 - Method for processing video scene of free visual angle, and client and server - Google Patents

Info

Publication number
WO2023273675A1
Authority
WO
WIPO (PCT)
Prior art keywords
video frame
camera position
viewing angle
information
index file
Prior art date
Application number
PCT/CN2022/093592
Other languages
French (fr)
Chinese (zh)
Inventor
江平
赵俊哲
高元仲
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2023273675A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments

Definitions

  • The embodiments of the present application relate to the field of multimedia technologies, and in particular to a processing method, a client, and a server for a free-view video scene.
  • With a 360-degree free viewing angle, users can choose any viewing angle, which gives them a customized experience. Free-view video is now widely used in sports events, education and training, and entertainment performances, providing new video scenarios for 5G applications.
  • Embodiments of the present application provide a method for processing a free-view video scene, a client, and a server.
  • An embodiment of the present application provides a method for processing a free-view video scene, applied to a client. The method includes: obtaining an index file, where the index file includes video frame information; parsing the index file to obtain the video frame information and the camera position values of all camera positions; and obtaining the camera position value of the viewing angle being switched to, and downloading video frames according to the video frame information and that camera position value.
  • An embodiment of the present application provides a method for processing a free-view video scene, applied to a server. The method includes: slicing and encapsulating a media file containing multiple code streams to obtain segments, where the segments include video frame information; generating an index file corresponding to the segments; and extracting the video frame information in the segments into the index file.
  • An embodiment of the present application provides a client, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the first method above for processing a free-view video scene when executing the computer program.
  • An embodiment of the present application provides a server, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the second method above for processing a free-view video scene when executing the computer program.
  • An embodiment of the present application provides a computer-readable storage medium that stores a computer-executable program, and the computer-executable program is used to make a computer perform the method of the first aspect above.
  • FIG. 1 is the main flowchart (client side) of a method for processing a free-view video scene provided by an embodiment of the present application;
  • FIG. 2 is a sub-flowchart of a method for processing a free-view video scene provided by an embodiment of the present application;
  • FIG. 3 is a sub-flowchart of a method for processing a free-view video scene provided by an embodiment of the present application;
  • FIG. 4 is a free-view camera position diagram provided by an embodiment of the present application;
  • FIG. 5 is a free-view switching frame diagram provided by an embodiment of the present application;
  • FIG. 6 is a free-view bullet-time switching frame diagram provided by an embodiment of the present application;
  • FIG. 7 is the main flowchart (server side) of a method for processing a free-view video scene provided by an embodiment of the present application;
  • FIG. 8 is a free-view live broadcast flowchart provided by an embodiment of the present application;
  • FIG. 9 is a free-view on-demand flowchart provided by an embodiment of the present application;
  • FIG. 10 is a schematic structural diagram of a client provided by an embodiment of the present application;
  • FIG. 11 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • "Multiple" means two or more; "greater than", "less than", "exceeding", etc. are understood as not including the stated number; and "above", "below", "within", etc. are understood as including the stated number. Descriptions such as "first" and "second" serve only to distinguish technical features and cannot be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features.
  • the embodiment of the present application provides a processing method, a client and a server for a free viewing angle video scene.
  • The client obtains an index file, parses out the video frame information and the camera position values of all camera positions from the index file, obtains the camera position value of the viewing angle being switched to, and downloads video frames according to the video frame information and that camera position value. This achieves low-latency viewing-angle interaction while keeping picture switching smooth.
  • Introducing the index file as an auxiliary file reduces unnecessary downloads while preserving picture quality, and makes it easy to add other viewing-angle information later.
  • FIG. 1 is a flowchart of a method for processing a free-view video scene provided by an embodiment of the present application.
  • the method for processing a free-view video scene can be applied to a client, and the method for processing a free-view video scene includes, but is not limited to, the following steps 101 , 102 and 103 .
  • Step 101: acquire an index file.
  • Step 102: parse out the video frame information and the camera position values of all camera positions according to the index file.
  • Step 103: acquire the camera position value of the viewing angle being switched to, and download the video frame according to the video frame information and that camera position value.
  • The client obtains the index file, which includes the video frame information and the camera position values of all camera positions, parses out that information from the index file, obtains the camera position value of the viewing angle being switched to, and downloads the video frame according to the video frame information and that camera position value, thereby switching the free viewing angle. This achieves low-latency viewing-angle interaction while keeping picture switching smooth.
  • Introducing the index file as an auxiliary file reduces unnecessary downloads while preserving picture quality, and makes it easy to add other viewing-angle information later.
  • the video frame information includes, but is not limited to, video frame starting position information, video frame size, and camera position value corresponding to the video frame.
  • The client can download the index file from the server and parse out the video frame information. According to the video frame information in the index file, it downloads the next frame of the current camera position, frame by frame, and decodes and renders each downloaded frame. When the user performs a viewing-angle switching operation, the client modifies the current camera position value, downloads the next frame of the switched camera position, frame by frame, and decodes and renders it, and so on until the viewing-angle switching operation ends.
  • The client responds to the interaction, modifies the camera position value, and then downloads frames from the stream of the modified camera position. Because downloading is done frame by frame, playback achieves low-latency viewing-angle interaction, and switching camera positions neither disturbs picture rendering nor causes camera position jumps, which keeps picture switching smooth.
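The frame-by-frame download with camera switching described above can be sketched as follows. This is only an illustration under stated assumptions, not the application's implementation: `FrameEntry`, `build_lookup`, `byte_range` and `playback_plan` are hypothetical names, and the index layout (one entry per camera position and frame number, carrying a start offset and a size) is assumed from the fields the text lists. The sketch plans HTTP byte-range requests rather than performing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameEntry:
    camera: int    # camera position value the frame belongs to
    frame_no: int  # frame number within the segment
    offset: int    # starting byte position of the frame in the segment
    size: int      # frame size in bytes


def build_lookup(entries):
    """Map (camera, frame_no) to its index entry for direct access."""
    return {(e.camera, e.frame_no): e for e in entries}


def byte_range(entry):
    """HTTP Range header value covering exactly one frame (end is inclusive)."""
    return f"bytes={entry.offset}-{entry.offset + entry.size - 1}"


def playback_plan(lookup, start_camera, frame_count, switches):
    """Return one (camera, frame_no, range) tuple per frame.

    `switches` maps a frame number to the camera position value the user
    selected; the new camera takes effect from that frame onward.
    """
    camera = start_camera
    plan = []
    for frame_no in range(frame_count):
        camera = switches.get(frame_no, camera)
        plan.append((camera, frame_no, byte_range(lookup[(camera, frame_no)])))
    return plan
```

Because each request covers a single frame, a switch takes effect at the very next frame instead of at the next segment boundary, which is the low-latency property the text relies on.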
  • Step 103 may further include, but is not limited to, the following sub-steps.
  • A downloaded video frame must be decoded and rendered by the client before it can be displayed, so that the user can see the picture of the video frame on the client.
  • Step 101 may include, but is not limited to, the following sub-steps: step 1011, step 1012 and step 1013.
  • Step 1011: acquire the media presentation description file, which is generated by the server.
  • Step 1012: obtain the information of the index file according to the media presentation description file.
  • Step 1013: download the index file from the server according to the information of the index file.
  • To obtain the index file, the client can, for example, acquire the Media Presentation Description (MPD) file that the server generates based on DASH (Dynamic Adaptive Streaming over HTTP), obtain the information of the index file from the MPD file, and then download the index file from the server according to that information.
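Steps 1011 to 1013 can be illustrated with a short parsing sketch. The text only says that the MPD carries information about the index file, so the way it is signalled here is an assumption: a `SupplementalProperty` descriptor with the made-up scheme `urn:example:free-view:frame-index`, whose `value` holds the index file's relative URL. `find_index_url` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace of the DASH MPD schema.
DASH_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical scheme identifier; the source does not name the custom field.
INDEX_SCHEME = "urn:example:free-view:frame-index"


def find_index_url(mpd_text, mpd_url):
    """Return the absolute URL of the frame index file, or None if absent."""
    root = ET.fromstring(mpd_text)
    for prop in root.iterfind(".//mpd:SupplementalProperty", DASH_NS):
        if prop.get("schemeIdUri") == INDEX_SCHEME:
            # Resolve the index file location relative to the MPD's own URL.
            return urljoin(mpd_url, prop.get("value"))
    return None
```

A client would then fetch the returned URL over HTTP and parse the index file before requesting any frames.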
  • Step 103 may include, but is not limited to, the following sub-steps: step 1031 to step 1034.
  • Step 1031: obtain the current camera position value.
  • Step 1032: determine the current-viewing-angle video frame to be downloaded according to the video frame information and the current camera position value.
  • Step 1033: acquire the target camera position value.
  • Step 1034: determine the target-viewing-angle video frame to be downloaded according to the video frame information and the target camera position value.
  • Each camera position corresponds to a camera position value, and the user switches the viewing angle by modifying the camera position value on the client. For example, the client obtains the current camera position value, determines the current camera position from it, and determines the current-viewing-angle video frame to be downloaded according to the video frame information and the current camera position.
  • To switch, the current camera position value is changed to the target camera position value: the client obtains the target camera position value input by the user, determines the target camera position from it so as to switch from the current camera position to the target camera position, and determines the target-viewing-angle video frame to be downloaded according to the video frame information and the target camera position, thereby switching the free viewing angle.
  • FIG. 7 is a flowchart of a method for processing a free-view video scene provided by an embodiment of the present application.
  • the method for processing a free-view video scene may be applied to a server, and the method for processing a free-view video scene includes, but is not limited to, the following steps 201 , 202 and 203 .
  • Step 201: slice and encapsulate a media file containing multiple code streams to obtain segments, where the segments include video frame information.
  • Step 202: generate an index file corresponding to the segments.
  • Step 203: extract the video frame information in the segments into the index file.
  • The server slices and encapsulates the media file containing multiple code streams to obtain segments that include video frame information, generates an index file corresponding to the segments, and extracts the video frame information in the segments into the index file, so that the client can download the index file from the server. This achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads while preserving picture quality, and makes it easy to add other viewing-angle information later.
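Steps 201 to 203 can be illustrated by a sketch of how a server might derive index entries while writing frames into a segment. The names `build_frame_index` and `camera_values`, the dictionary keys, and the idea of storing the index as a list of records are all assumptions for illustration; the source does not specify the index file format.

```python
def build_frame_index(frames, payload_start=0):
    """Build index entries for frames in the order they are written.

    `frames` is an iterable of (camera, frame_no, payload_bytes).
    `payload_start` is the byte position where the first frame begins,
    e.g. after any segment header (an assumed parameter).
    """
    index = []
    offset = payload_start
    for camera, frame_no, payload in frames:
        index.append({
            "camera": camera,      # camera position value of the frame
            "frame": frame_no,     # frame number within the segment
            "offset": offset,      # starting position of the frame
            "size": len(payload),  # frame size in bytes
        })
        offset += len(payload)
    return index


def camera_values(index):
    """All camera position values present in the index, sorted."""
    return sorted({entry["camera"] for entry in index})
```

The three recorded fields match the video frame information the text lists: starting position, size, and the camera position value of each frame.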
  • the server may generate a media presentation description file based on the DASH protocol, where the media presentation description file includes information about fragments and information about index files.
  • the client can acquire the media presentation description file generated by the server, obtain the information of the index file according to the media presentation description file, and then download the index file from the server according to the information of the index file.
  • The server obtains the live stream or on-demand stream, performs slice encapsulation based on the DASH protocol, extracts the frame information in the segments into the index file, and writes the segment information and the index file information into the media presentation description file.
  • the client obtains the media presentation description file, downloads the index file according to the custom fields of the index file, and parses the video frame information in the index file.
  • The client downloads frames one at a time according to the video frame information in the index file, and decodes and renders each downloaded frame. If the user switches the viewing angle, the client responds to the interaction, modifies the camera position value, and then downloads frames from the stream of the modified camera position. Because downloading is done frame by frame, playback achieves low-latency viewing-angle interaction, and switching camera positions neither disturbs picture rendering nor causes camera position jumps, which keeps picture switching smooth.
  • The video capture module collects the multi-camera video streams; the server synchronizes the video frames of these streams and merges the synchronized streams into a single code stream.
  • The merged stream is sliced according to DASH, and the corresponding frame index files are generated synchronously.
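The synchronize-and-merge step can be sketched as below. The source does not say how frames are aligned or ordered in the merged stream, so this sketch assumes alignment by presentation timestamp and interleaving in camera order; timestamps missing from any camera are dropped, which is also an assumption. `merge_synchronized` is a hypothetical name.

```python
def merge_synchronized(streams):
    """Interleave per-camera frames into one stream.

    `streams` maps a camera position value to a list of (pts, payload)
    tuples and must contain at least one camera. Only timestamps present
    for every camera are kept, so each output instant has one frame per
    camera.
    """
    by_camera = {camera: dict(frames) for camera, frames in streams.items()}
    common = set.intersection(*(set(frames) for frames in by_camera.values()))
    merged = []
    for pts in sorted(common):
        for camera in sorted(by_camera):
            merged.append((pts, camera, by_camera[camera][pts]))
    return merged
```

With frames grouped this way, every time instant has exactly one frame per camera position, which is what lets a client move to another camera at any frame.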
  • The server slices and encapsulates the media file containing multiple streams and generates the corresponding index files; each index file records the information of all frames in the corresponding segment.
  • The server generates a media presentation description file.
  • The client downloads the media presentation description file and parses the index file, video segment, and audio segment information in it.
  • The client downloads the index file and parses the video frame information.
  • According to the video frame information in the index file, the client downloads the next frame of the current camera position, frame by frame.
  • The client decodes and renders the downloaded frame.
  • When the user performs a viewing-angle switching operation, the client modifies the current camera position value and downloads the next frame of the switched camera position, frame by frame; the client decodes and renders the downloaded frame, and this repeats until the viewing-angle switching operation ends.
  • The server performs DASH slicing on the recorded merged stream and synchronously generates the corresponding frame index files; each index file records the information of all frames in the corresponding segment.
  • The server generates the media presentation description file.
  • The client downloads the media presentation description file and parses the index file, video segment, and audio segment information in it.
  • The client downloads the index file and parses the video frame information.
  • According to the video frame information in the index file, the client downloads the next frame of the current camera position, frame by frame.
  • The client decodes and renders the downloaded frame.
  • When the user switches the viewing angle, the client modifies the current camera position value and downloads the next frame of the switched camera position, frame by frame; the client decodes and renders the downloaded frame, and this repeats until the viewing-angle switching operation ends.
  • The server slices and encapsulates the media file containing multiple streams and generates the corresponding index files; each index file records the information of all frames in the corresponding segment.
  • The server generates the media presentation description file.
  • The client downloads the media presentation description file and parses the index file, video segment, and audio segment information in it.
  • The client downloads the index file and parses the video frame information.
  • According to the video frame information in the index file, the client downloads the next frame of the current camera position, frame by frame.
  • The client decodes and renders the downloaded frame.
  • When the user performs a bullet-time operation, the client increments the current camera position value by 1 and downloads the same frame of the switched camera position; the client decodes and renders the downloaded frame, and this repeats until the bullet-time operation ends.
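The difference between bullet time and ordinary switching is that the frame number stays fixed while the camera position value advances by 1 per step. A minimal sketch follows, with `bullet_time_requests` as a hypothetical name; wrapping from the last camera back to the first is an assumption suited to a full ring of cameras and is not stated in the source.

```python
def bullet_time_requests(start_camera, frame_no, steps, camera_count):
    """List the (camera, frame_no) pairs to download during bullet time.

    The frame number is frozen; only the camera position value changes,
    increasing by 1 per step and wrapping around (assumed behaviour).
    """
    return [((start_camera + i) % camera_count, frame_no)
            for i in range(1, steps + 1)]
```

For example, starting at camera 6 on frame 42 with 8 cameras, three steps request frame 42 from cameras 7, 0 and 1 in turn.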
  • the embodiment of the present application also provides a client.
  • The client includes one or more processors and a memory; FIG. 10 takes one processor and one memory as an example.
  • the processor and the memory may be connected through a bus or in other ways, and connection through a bus is taken as an example in FIG. 10 .
  • the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the method for processing free-view video scenes in the above-mentioned embodiments of the present application.
  • the processor executes the non-transitory software program and the program stored in the memory, so as to realize the processing method of the free-view angle video scene in the above-mentioned embodiment of the present application.
  • The memory may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store the data required to execute the method for processing a free-view video scene in the above embodiments of the present application.
  • the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices.
  • the memory may include memory located remotely from the processor, and these remote memories may be connected to the terminal through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The non-transitory software programs and programs required to implement the method for processing a free-view video scene in the above embodiments of the present application are stored in the memory. When executed by one or more processors, they perform that method, for example method steps 101 to 103 in FIG. 1, method steps 1011 to 1013 in FIG. 2, and method steps 1031 to 1034 in FIG. 3 described above: the client obtains the index file, parses out the video frame information and the camera position values of all camera positions according to the index file, obtains the camera position value of the viewing angle being switched to, and downloads video frames according to the video frame information and that camera position value. This achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads while preserving picture quality, and makes it easy to add other viewing-angle information later.
  • the embodiment of the present application also provides a server.
  • The server includes one or more processors and a memory; FIG. 11 takes one processor and one memory as an example.
  • the processor and the memory may be connected through a bus or in other ways, and connection through a bus is taken as an example in FIG. 11 .
  • the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the method for processing free-view video scenes in the above-mentioned embodiments of the present application.
  • the processor executes the non-transitory software program and the program stored in the memory, so as to implement the method for processing the free-view video scene in the above-mentioned embodiments of the present application.
  • The memory may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store the data required to execute the method for processing a free-view video scene in the above embodiments of the present application.
  • the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices.
  • the memory may include memory located remotely from the processor, and these remote memories may be connected to the terminal through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The non-transitory software programs and programs required to implement the method for processing a free-view video scene in the above embodiments of the present application are stored in the memory. When executed by one or more processors, they perform that method, for example method steps 201 to 203 in FIG. 7 described above: the server slices and encapsulates the media file containing multiple code streams to obtain segments that include video frame information, generates an index file corresponding to the segments, and extracts the video frame information in the segments into the index file, so that the client can download the index file from the server. This achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads while preserving picture quality, and makes it easy to add other viewing-angle information later.
  • The embodiment of the present application also provides a computer-readable storage medium that stores a computer-executable program. When the computer-executable program is executed by one or more control processors, for example by one of the processors shown in FIG. 11, it causes the one or more processors to execute the method for processing a free-view video scene in the above embodiments of the present application.
  • For example, it executes method steps 101 to 103 in FIG. 1, method steps 1011 to 1013 in FIG. 2, and method steps 1031 to 1034 in FIG. 3 described above: the client obtains the index file, parses out the video frame information and the camera position values of all camera positions according to the index file, obtains the camera position value of the viewing angle being switched to, and downloads video frames according to the video frame information and that camera position value.
  • Alternatively, it executes method steps 201 to 203 in FIG. 7 described above: the server slices and encapsulates the media file containing multiple code streams to obtain segments that include video frame information, generates an index file corresponding to the segments, and extracts the video frame information in the segments into the index file, so that the client can download the index file from the server.
  • In both cases, this achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads while preserving picture quality, and makes it easy to add other viewing-angle information later.
  • The embodiment of the present application includes: the client obtains the index file, parses out the video frame information and the camera position values of all camera positions according to the index file, obtains the camera position value of the viewing angle being switched to, and downloads video frames according to the video frame information and that camera position value. This achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads while preserving picture quality, and makes it easy to add other viewing-angle information later.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media typically embody computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Abstract

A method for processing a video scene of a free visual angle, and a client and a server. The method, which is applied to a client, comprises: a client acquiring an index file (101); parsing video frame information and camera position values of all camera positions according to the index file (102); and acquiring a camera position value of a switched visual angle, and downloading a video frame according to the video frame information and the camera position value of the switched visual angle (103).

Description

Method for processing a free-view video scene, client, and server
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202110722259.7, filed on June 28, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of multimedia technologies, and in particular to a method for processing a free-view video scene, a client, and a server.
Background
With the arrival of the 5G era and growing user demand for entertainment, single-view video can no longer meet users' expectations, while multi-view video offers only a small number of highlight viewing angles, which limits the user's interactive choices. Free-view video lets the user pick any viewing angle across a full 360 degrees and so provides a customized experience. Free-view video is now widely used in sports events, education and training, and entertainment performances, providing new video scenarios for 5G applications.
In some cases, users experiencing free-view video place high demands on the latency of viewing-angle interaction and on the smoothness of picture switching. Mainstream free-viewpoint solutions in the industry currently use either stitching or real-time synthesis. Stitching-based solutions occupy high transmission bandwidth and degrade the picture quality of the original video frames, while the visual quality of real-time synthesis cannot be guaranteed and its performance cost is high.
Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present application provide a method for processing a free-view video scene, a client, and a server.
In a first aspect, an embodiment of the present application provides a method for processing a free-view video scene, applied to a client. The method includes: obtaining an index file, the index file including video frame information; parsing the video frame information and camera position values of all camera positions from the index file; and obtaining the camera position value of a switched viewing angle, and downloading a video frame according to the video frame information and the camera position value of the switched viewing angle.
In a second aspect, an embodiment of the present application provides a method for processing a free-view video scene, applied to a server. The method includes: slicing and encapsulating a media file containing multiple code streams to obtain segments, the segments including video frame information; generating an index file corresponding to the segments; and extracting the video frame information in the segments into the index file.
In a third aspect, an embodiment of the present application provides a client, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the method for processing a free-view video scene described in the first aspect above.
In a fourth aspect, an embodiment of the present application provides a server, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the method for processing a free-view video scene described in the second aspect above.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer-executable program, the computer-executable program being used to cause a computer to execute the method for processing a free-view video scene described in the first aspect above, or the method for processing a free-view video scene described in the second aspect above.
Other features and advantages of the present application will be set forth in the description that follows and will in part become apparent from the description or be learned by practicing the application. The objectives and other advantages of the application can be realized and attained by the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings provide a further understanding of the technical solution of the present application and constitute a part of the specification. Together with the embodiments of the present application, they serve to explain the technical solution of the present application and do not limit it.
FIG. 1 is a main flowchart (client side) of a method for processing a free-view video scene provided by an embodiment of the present application;
FIG. 2 is a sub-flowchart of a method for processing a free-view video scene provided by an embodiment of the present application;
FIG. 3 is a sub-flowchart of a method for processing a free-view video scene provided by an embodiment of the present application;
FIG. 4 is a diagram of free-view camera positions provided by an embodiment of the present application;
FIG. 5 is a diagram of free-view frame switching provided by an embodiment of the present application;
FIG. 6 is a diagram of free-view frame switching in bullet time provided by an embodiment of the present application;
FIG. 7 is a main flowchart (server side) of a method for processing a free-view video scene provided by an embodiment of the present application;
FIG. 8 is a flowchart of free-view live streaming provided by an embodiment of the present application;
FIG. 9 is a flowchart of free-view video on demand provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a client provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a server provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present application and are not intended to limit it.
It should be understood that, in the description of the embodiments of the present application, "a plurality of" (or "multiple") means two or more; "greater than", "less than", "exceeding", and the like are understood as excluding the stated number; and "above", "below", "within", and the like are understood as including the stated number. Terms such as "first" and "second", where used, serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features.
With the arrival of the 5G era and growing user demand for entertainment, single-view video can no longer meet users' expectations, while multi-view video offers only a small number of highlight viewing angles, which limits the user's interactive choices. Free-view video lets the user pick any viewing angle across a full 360 degrees and so provides a customized experience. Free-view video is now widely used in sports events, education and training, and entertainment performances, providing new video scenarios for 5G applications.
In some cases, users experiencing free-view video place high demands on the latency of viewing-angle interaction and on the smoothness of picture switching. Mainstream free-viewpoint solutions in the industry currently use either stitching or real-time synthesis. Stitching-based solutions occupy high transmission bandwidth and degrade the picture quality of the original video frames, while the visual quality of real-time synthesis cannot be guaranteed and its performance cost is high.
Embodiments of the present application provide a method for processing a free-view video scene, a client, and a server. The client obtains an index file, parses the video frame information and camera position values of all camera positions from the index file, obtains the camera position value of the switched viewing angle, and downloads video frames according to the video frame information and the camera position value of the switched viewing angle. This achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add other viewing-angle information later.
As shown in FIG. 1, FIG. 1 is a flowchart of a method for processing a free-view video scene provided by an embodiment of the present application. The method can be applied to a client and includes, but is not limited to, the following steps 101, 102, and 103.
Step 101: obtain an index file.
Step 102: parse the video frame information and camera position values of all camera positions from the index file.
Step 103: obtain the camera position value of the switched viewing angle, and download video frames according to the video frame information and the camera position value of the switched viewing angle.
It can be understood that the client obtains an index file that contains the video frame information and camera position values of all camera positions, parses this information from the index file, obtains the camera position value of the switched viewing angle, and downloads video frames according to the video frame information and that camera position value, thereby switching the free viewing angle. This achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add other viewing-angle information later.
It should be noted that the video frame information includes, but is not limited to, the start position of the video frame, the size of the video frame, and the camera position value corresponding to the video frame.
Taking live streaming as an example, the client downloads the index file from the server and parses the video frame information from it. Using that information, it downloads the next frame of the current camera position frame by frame, then decodes and renders each downloaded frame. When the user performs a viewing-angle switching operation, the client modifies the current camera position value, downloads the next frame of the switched-to camera position, and decodes and renders it, and so on until the switching operation ends. In other words, when the user switches the viewing angle, the client responds to the interaction by modifying the camera position value and thereafter downloads frames from the stream of the modified camera position. Because downloading is done frame by frame, playback achieves low-latency viewing-angle interaction; switching camera positions does not disturb picture rendering or cause the camera position to jump, so picture switching stays smooth.
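The frame-by-frame download described above can be illustrated with a minimal sketch. It assumes that each index entry carries the three fields named in the description (start position, size, camera position value) plus a frame number, and that a single frame is fetched with an HTTP byte-range request. The names `FrameInfo`, `build_lookup`, and `range_header`, the field layout, and the use of byte ranges are illustrative assumptions, not details taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameInfo:
    """One index-file entry (hypothetical field names)."""
    frame_no: int  # position of the frame on the shared timeline
    camera: int    # camera position value the frame belongs to
    offset: int    # start position of the frame inside the segment, in bytes
    size: int      # frame size in bytes


def build_lookup(entries):
    """Map (camera, frame_no) to its index entry for direct lookup."""
    return {(e.camera, e.frame_no): e for e in entries}


def range_header(entry):
    """HTTP Range header value covering exactly one frame (end is inclusive)."""
    return f"bytes={entry.offset}-{entry.offset + entry.size - 1}"


# Two cameras with two frames each, packed back to back in one segment.
entries = [
    FrameInfo(frame_no=0, camera=0, offset=0, size=1000),
    FrameInfo(frame_no=0, camera=1, offset=1000, size=1200),
    FrameInfo(frame_no=1, camera=0, offset=2200, size=300),
    FrameInfo(frame_no=1, camera=1, offset=2500, size=350),
]
lookup = build_lookup(entries)

# Frame 0 of camera position 1 occupies bytes 1000..2199 of the segment.
print(range_header(lookup[(1, 0)]))  # bytes=1000-2199
```

Under these assumptions, only the bytes of the frame for the currently selected camera position are requested, which is one way to read the claim that the index file reduces unnecessary downloads.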
It should be noted that, after step 103, the method may further include, but is not limited to, the following sub-step.
Decode and render the video frame.
It can be understood that a downloaded video frame must be decoded and rendered on the client before it can be displayed, so that the user can see the picture of the video frame on the client.
As shown in FIG. 2, step 101 may include, but is not limited to, the following sub-steps: step 1011, step 1012, and step 1013.
Step 1011: obtain a media presentation description file, the media presentation description file being generated by the server.
Step 1012: obtain the information of the index file from the media presentation description file.
Step 1013: download the index file from the server according to the information of the index file.
It can be understood that, as one way for the client to obtain the index file, the client can obtain the Media Presentation Description (MPD) file generated by the server based on DASH (Dynamic Adaptive Streaming over HTTP), obtain the information of the index file from the media presentation description file, and then download the index file from the server according to that information.
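Steps 1011 to 1013 can be sketched as follows. The application does not specify how the index file is described inside the MPD, so this sketch assumes a DASH `SupplementalProperty` descriptor with a made-up scheme URI (`urn:example:free-view:frame-index`) whose `value` is a URL template. The sample MPD, the scheme URI, the template syntax, and the function name `index_url` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Real MPEG-DASH MPD namespace; the scheme URI below is a made-up placeholder.
MPD_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}
INDEX_SCHEME = "urn:example:free-view:frame-index"

SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <SupplementalProperty schemeIdUri="urn:example:free-view:frame-index"
                            value="index/seg_$Number$.idx"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def index_url(mpd_text, mpd_url, segment_number):
    """Return the absolute URL of the index file for one segment, or None."""
    root = ET.fromstring(mpd_text)
    for prop in root.iterfind(".//mpd:SupplementalProperty", MPD_NS):
        if prop.get("schemeIdUri") == INDEX_SCHEME:
            relative = prop.get("value").replace("$Number$", str(segment_number))
            # Resolve the template against the location of the MPD itself.
            return urljoin(mpd_url, relative)
    return None


print(index_url(SAMPLE_MPD, "https://example.com/live/stream.mpd", 7))
# https://example.com/live/index/seg_7.idx
```

A client following this sketch would then fetch the returned URL with an ordinary HTTP GET and parse the result as the index file.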
As shown in FIG. 3, step 103 may include, but is not limited to, the following sub-steps: step 1031 to step 1034.
Step 1031: obtain the current camera position value.
Step 1032: determine the current-view video frame to be downloaded according to the video frame information and the current camera position value.
Step 1033: obtain the target camera position value.
Step 1034: determine the target-view video frame to be downloaded according to the video frame information and the target camera position value.
It can be understood that, as shown in FIG. 4, achieving a free viewing angle requires multiple cameras arranged through 360 degrees around the subject, with the video frames to download located by switching between camera positions. Specifically, each camera position has a corresponding camera position value, and the user switches the viewing angle by modifying the camera position value on the client. For example, the client obtains the current camera position value, determines the current camera position from it, and determines the current-view video frame to download from the video frame information and the current camera position. As another example, when the user performs a viewing-angle switching operation, the current camera position value can be changed to a target camera position value: the client obtains the target camera position value input by the user, determines the target camera position from it, thereby switching from the current camera position to the target camera position, and determines the target-view video frame to download from the video frame information and the target camera position, thereby switching the free viewing angle.
It can be understood that, as shown in FIG. 5, taking live streaming as an example, the client downloads the next frame of the current camera position frame by frame according to the video frame information in the index file. When the user performs a viewing-angle switching operation, the client changes the current camera position value to the target camera position value and downloads the next frame of the target camera position; this repeats until the switching operation ends. As shown in FIG. 5, taking video on demand as an example, the client likewise downloads the next frame of the current camera position frame by frame according to the video frame information in the index file; when the user performs a viewing-angle switching operation, the client changes the current camera position value to the target camera position value and downloads the next frame of the target camera position, repeating until the switching operation ends. As shown in FIG. 6, taking bullet time as an example, the client downloads the next frame of the current camera position frame by frame according to the video frame information in the index file. When the user performs a bullet-time operation, the client increments the current camera position value by 1 toward the target camera position value and downloads the same frame from the switched-to camera position; this repeats until the switching operation ends.
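The difference between an ordinary viewing-angle switch and a bullet-time sweep can be made concrete with a small sketch: an ordinary switch changes the camera position and advances to the next frame, whereas a bullet-time step moves to the adjacent camera position and keeps the same frame. The function names and the wrap-around at the last camera (modulo the number of cameras, on the assumption that the cameras form a closed ring) are illustrative assumptions.

```python
def play_step(camera, frame_no, target=None):
    """Live/VOD playback: optionally jump to the target camera, then advance one frame."""
    if target is not None:
        camera = target
    return camera, frame_no + 1


def bullet_time_step(camera, frame_no, num_cameras):
    """Bullet time: move to the adjacent camera position and keep the same frame.

    Wrapping with modulo assumes the cameras form a closed 360-degree ring.
    """
    return (camera + 1) % num_cameras, frame_no


# Ordinary playback on camera 2, then a switch to camera 5.
state = (2, 10)
state = play_step(*state)            # (2, 11): same camera, next frame
state = play_step(*state, target=5)  # (5, 12): switched camera, next frame

# Bullet-time sweep across 4 adjacent cameras in an 8-camera ring.
sweep = []
for _ in range(4):
    state = bullet_time_step(*state, num_cameras=8)
    sweep.append(state)
print(sweep)  # [(6, 12), (7, 12), (0, 12), (1, 12)]
```

Each resulting (camera, frame) pair would then be looked up in the index file to find the bytes to download.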
As shown in FIG. 7, FIG. 7 is a flowchart of a method for processing a free-view video scene provided by an embodiment of the present application. The method can be applied to a server and includes, but is not limited to, the following steps 201, 202, and 203.
Step 201: slice and encapsulate a media file containing multiple code streams to obtain segments, the segments including video frame information.
Step 202: generate an index file corresponding to the segments.
Step 203: extract the video frame information in the segments into the index file.
It can be understood that the server slices and encapsulates the media file containing multiple code streams to obtain segments that include video frame information, generates an index file corresponding to the segments, and extracts the video frame information in the segments into the index file, so that the client can download the index file from the server. This achieves low-latency viewing-angle interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add other viewing-angle information later.
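Steps 201 to 203 can be sketched in simplified form. Real DASH packaging involves a container format such as fragmented MP4, which is omitted here: the byte strings stand in for encoded frames, and the interleaving order, the JSON serialization, and the names `package_segment`, `frame`, `camera`, `offset`, and `size` are assumptions made for illustration only.

```python
import json


def package_segment(frames_by_camera):
    """Pack frames of all cameras into one segment and build its index.

    frames_by_camera maps a camera position value to a list of encoded
    frames (bytes); every camera is assumed to supply the same number of
    frames. Returns (segment_bytes, index_entries).
    """
    segment = bytearray()
    index = []
    frame_count = len(next(iter(frames_by_camera.values())))
    for frame_no in range(frame_count):
        for camera in sorted(frames_by_camera):
            payload = frames_by_camera[camera][frame_no]
            # Record where this frame starts and how long it is.
            index.append({
                "frame": frame_no,
                "camera": camera,
                "offset": len(segment),
                "size": len(payload),
            })
            segment += payload
    return bytes(segment), index


# Placeholder payloads standing in for encoded video frames.
frames = {0: [b"aaaa", b"bb"], 1: [b"ccc", b"d"]}
segment, index = package_segment(frames)

print(json.dumps(index[2]))
# {"frame": 1, "camera": 0, "offset": 7, "size": 2}
```

The index produced this way records exactly the fields the client needs to locate a single frame of a given camera position inside the segment.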
It can be understood that the server can generate the media presentation description file based on the DASH protocol, where the media presentation description file includes the information of the segments and the information of the index file. The client can thus obtain the media presentation description file generated by the server, obtain the information of the index file from it, and then download the index file from the server according to that information.
The video streams of multiple camera positions are transcoded and merged into a single unified code stream, which is either streamed live or recorded for on-demand playback. The server obtains the live stream or on-demand stream, performs DASH-based slicing and encapsulation, and extracts the frame information in the segments into the index file. The segment information and the index file information are described in the media presentation description file. The client obtains the media presentation description file, downloads the index file according to the custom index-file field, and parses the video frame information in the index file. The client then downloads frames one at a time according to the video frame information in the index file, and decodes and renders the downloaded frames. When the user switches the viewing angle, the client responds to the interaction by modifying the camera position value and thereafter downloads frames from the stream of the modified camera position. Because downloading is done frame by frame, playback achieves low-latency viewing-angle interaction; switching camera positions does not disturb picture rendering or cause the camera position to jump, so picture switching stays smooth.
The method for processing a free-view video scene provided by the present application is further described below with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 8, taking live streaming as an example, the video capture module captures video streams from multiple camera positions. The server synchronizes the video frames of these streams, merges the synchronized streams into a single code stream, performs DASH slicing on the merged code stream, and generates the corresponding frame index file at the same time. That is, the server slices and encapsulates the media file containing multiple code streams and generates the corresponding index file, which records the information of all frames in the corresponding segment. The server generates the media presentation description file. The client downloads the media presentation description file and parses from it the information of the index file, the video segments, and the audio segments. The client downloads the index file and parses the video frame information. According to the video frame information in the index file, the client downloads the next frame of the current camera position frame by frame, and decodes and renders the downloaded frame. When the user performs a viewing-angle switching operation, the client modifies the current camera position value, downloads the next frame of the switched-to camera position, and decodes and renders it; this repeats until the switching operation ends.
As shown in FIG. 9, taking video on demand as an example, the server performs DASH slicing on the recorded merged code stream and generates the corresponding frame index file at the same time. That is, the server slices and encapsulates the media file containing multiple code streams and generates the corresponding index file, which records the information of all frames in the corresponding segment. The server generates the media presentation description file. The client downloads the media presentation description file and parses from it the information of the index file, the video segments, and the audio segments. The client downloads the index file and parses the video frame information. According to the video frame information in the index file, the client downloads the next frame of the current camera position frame by frame, and decodes and renders the downloaded frame. When the user performs a viewing-angle switching operation, the client modifies the current camera position value, downloads the next frame of the switched-to camera position, and decodes and renders it; this repeats until the switching operation ends.
Taking bullet time as an example, the server slices and encapsulates the media file containing multiple code streams and generates the corresponding index file, which records the information of all frames in the corresponding segment. The server generates the media presentation description file. The client downloads the media presentation description file and parses from it the information of the index file, the video segments, and the audio segments. The client downloads the index file and parses the video frame information. According to the video frame information in the index file, the client downloads the next frame of the current camera position frame by frame, and decodes and renders the downloaded frame. When the user performs a bullet-time operation, the client increments the current camera position value by 1, downloads the same frame from the switched-to camera position, and decodes and renders it; this repeats until the bullet-time operation ends.
如图10所示,本申请实施例还提供了一种客户端。As shown in FIG. 10 , the embodiment of the present application also provides a client.
具体地,该终端包括:一个或多个处理器和存储器,图10中以一个处理器及存储器为例。处理器和存储器可以通过总线或者其他方式连接,图10中以通过总线连接为例。Specifically, the terminal includes: one or more processors and memories, and one processor and memories are taken as an example in FIG. 10 . The processor and the memory may be connected through a bus or in other ways, and connection through a bus is taken as an example in FIG. 10 .
存储器作为一种非暂态计算机可读存储介质,可用于存储非暂态软件程序以及非暂态性计算机可执行程序,如上述本申请实施例中的自由视角视频场景的处理方法。处理器通过运行存储在存储器中的非暂态软件程序以及程序,从而实现上述本申请实施例中的自由视角视 频场景的处理方法。As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the method for processing free-view video scenes in the above-mentioned embodiments of the present application. The processor executes the non-transitory software program and the program stored in the memory, so as to realize the processing method of the free-view angle video scene in the above-mentioned embodiment of the present application.
存储器可以包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需要的应用程序;存储数据区可存储执行上述本申请实施例中的自由视角视频场景的处理方法所需的数据等。此外,存储器可以包括高速随机存取存储器,还可以包括非暂态存储器,例如至少一个磁盘存储器件、闪存器件、或其他非暂态固态存储器件。在一些实施方式中,存储器可包括相对于处理器远程设置的存储器,这些远程存储器可以通过网络连接至该终端。上述网络的实例包括但不限于互联网、企业内部网、局域网、移动通信网及其组合。The memory may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function; the data storage area may store the processing method for executing the free-view video scene in the above-mentioned embodiments of the present application required data etc. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices. In some embodiments, the memory may include memory located remotely from the processor, and these remote memories may be connected to the terminal through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and programs required to implement the method for processing a free-view video scene in the above embodiments of the present application are stored in the memory. When executed by one or more processors, they perform that method, for example method steps 101 to 103 in FIG. 1, method steps 1011 to 1013 in FIG. 2, and method steps 1031 to 1034 in FIG. 3 described above: the client obtains the index file, parses the video frame information and camera position values of all camera positions from the index file, obtains the camera position value of the switched viewing angle, and downloads video frames according to the video frame information and the camera position value of the switched viewing angle. On this basis, low-latency viewing-angle interaction can be achieved while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add other viewing-angle information later.
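The client-side flow described above can be sketched as follows. This is a minimal illustration, not the format defined by the application: the JSON layout of the index file and the names `FrameInfo`, `parse_index`, `frames_for_camera` and `byte_range` are assumptions made for the example. The three fields per frame (start position, size, camera position value) follow the video frame information listed in the claims.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameInfo:
    offset: int  # start position of the frame inside the segment, in bytes
    size: int    # frame size in bytes
    camera: int  # camera position value the frame belongs to


def parse_index(index_text: str) -> list[FrameInfo]:
    """Parse a hypothetical JSON index file into frame records."""
    entries = json.loads(index_text)["frames"]
    return [FrameInfo(e["offset"], e["size"], e["camera"]) for e in entries]


def frames_for_camera(frames: list[FrameInfo], camera: int) -> list[FrameInfo]:
    """Keep only the frames recorded for the requested camera position."""
    return [f for f in frames if f.camera == camera]


def byte_range(frame: FrameInfo) -> str:
    """HTTP Range header value covering exactly one frame (end is inclusive)."""
    return f"bytes={frame.offset}-{frame.offset + frame.size - 1}"


# Hypothetical index contents for one segment holding two camera positions.
index_text = json.dumps({"frames": [
    {"offset": 0, "size": 100, "camera": 1},
    {"offset": 100, "size": 80, "camera": 2},
    {"offset": 180, "size": 120, "camera": 1},
]})

frames = parse_index(index_text)
target = frames_for_camera(frames, camera=2)
ranges = [byte_range(f) for f in target]
```

Requesting only the byte ranges of the selected camera position is one way the client could avoid downloading frames of views it does not display, which is the download saving the paragraph above attributes to the index file.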
As shown in FIG. 11, an embodiment of the present application further provides a server.
Specifically, the electronic device includes one or more processors and a memory; FIG. 11 takes one processor and one memory as an example. The processor and the memory may be connected through a bus or by other means; FIG. 11 takes a bus connection as an example.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the method for processing a free-view video scene in the above embodiments of the present application. The processor implements that method by running the non-transitory software programs and programs stored in the memory.
The memory may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store the data required to execute the method for processing a free-view video scene in the above embodiments of the present application. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory may include memory located remotely from the processor, and such remote memory may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and programs required to implement the method for processing a free-view video scene in the above embodiments of the present application are stored in the memory. When executed by one or more processors, they perform that method, for example method steps 201 to 203 in FIG. 7 described above: the server slices and encapsulates a media file containing multiple code streams to obtain segments, each segment including video frame information; generates an index file corresponding to the segments; and extracts the video frame information in the segments into the index file, so that the client can download the index file from the server. On this basis, low-latency viewing-angle interaction can be achieved while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add other viewing-angle information later.
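The server-side steps can be illustrated with a simplified sketch. It assumes a segment is just the concatenation of encoded frames from several camera positions, which ignores the container structure a real segment format would carry; the function name `build_segment_and_index` and the dictionary keys are hypothetical.

```python
def build_segment_and_index(frames):
    """Pack frames into one segment and record where each frame lives.

    frames: iterable of (camera_position_value, encoded_frame_bytes).
    Returns the segment bytes and a list of index entries.
    """
    segment = bytearray()
    index = []
    for camera, payload in frames:
        # The frame starts at the current end of the segment.
        index.append({"offset": len(segment), "size": len(payload), "camera": camera})
        segment += payload
    return bytes(segment), index


# Three dummy "frames" from two camera positions.
segment, index = build_segment_and_index([
    (1, b"aaaa"),
    (2, b"bb"),
    (1, b"ccc"),
])
```

Each index entry lets a client locate one frame without parsing the segment itself: `segment[entry["offset"]:entry["offset"] + entry["size"]]` returns that frame's bytes.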
In addition, an embodiment of the present application further provides a computer-readable storage medium that stores a computer-executable program. The computer-executable program is executed by one or more control processors, for example by one processor in FIG. 11, and causes the one or more processors to perform the method for processing a free-view video scene in the above embodiments of the present application. For example, it performs method steps 101 to 103 in FIG. 1, method steps 1011 to 1013 in FIG. 2, and method steps 1031 to 1034 in FIG. 3 described above: the client obtains the index file, parses the video frame information and camera position values of all camera positions from the index file, obtains the camera position value of the switched viewing angle, and downloads video frames according to the video frame information and the camera position value of the switched viewing angle. Alternatively, it performs method steps 201 to 203 in FIG. 7 described above: the server slices and encapsulates a media file containing multiple code streams to obtain segments, each segment including video frame information; generates an index file corresponding to the segments; and extracts the video frame information in the segments into the index file, so that the client can download the index file from the server. In either case, low-latency viewing-angle interaction can be achieved while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add other viewing-angle information later.
The embodiments of the present application include the following: the client obtains the index file, parses the video frame information and camera position values of all camera positions from the index file, obtains the camera position value of the switched viewing angle, and downloads video frames according to the video frame information and the camera position value of the switched viewing angle. On this basis, low-latency viewing-angle interaction can be achieved while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add other viewing-angle information later.
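A viewing-angle switch can be pictured as choosing, for each point in time, the frame of either the current or the target camera position. The sketch below is only an illustration under stated assumptions: the per-frame time index `t`, the `Frame` tuple and the function `plan_downloads` are hypothetical and do not come from the application, and a real player would likely also have to respect decoding dependencies between frames.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    t: int       # hypothetical presentation-order index of the frame
    camera: int  # camera position value
    offset: int  # start position in the segment, in bytes
    size: int    # frame size in bytes


def plan_downloads(frames, current_camera, target_camera, switch_t):
    """Pick current-view frames before switch_t and target-view frames from switch_t on."""
    plan = []
    for f in sorted(frames, key=lambda f: f.t):
        wanted = current_camera if f.t < switch_t else target_camera
        if f.camera == wanted:
            plan.append(f)
    return plan


# Two camera positions, three time steps, ten bytes per frame.
frames = [
    Frame(0, 1, 0, 10), Frame(0, 2, 10, 10),
    Frame(1, 1, 20, 10), Frame(1, 2, 30, 10),
    Frame(2, 1, 40, 10), Frame(2, 2, 50, 10),
]
plan = plan_downloads(frames, current_camera=1, target_camera=2, switch_t=1)
offsets = [f.offset for f in plan]
```

With this plan only three of the six frames are fetched, one per time step, rather than every frame of every camera position.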
Those of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above can be implemented as software, firmware, hardware, or an appropriate combination thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media covers volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information, such as computer-readable programs, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Several implementations of the present application have been described in detail above, but the present application is not limited to these implementations. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present application, and all such equivalent modifications or substitutions fall within the scope defined by the claims of the present application.

Claims (10)

  1. A method for processing a free-view video scene, applied to a client, wherein the method comprises:
    obtaining an index file;
    parsing video frame information and camera position values of all camera positions from the index file; and
    obtaining the camera position value of a switched viewing angle, and downloading video frames according to the video frame information and the camera position value of the switched viewing angle.
  2. The method according to claim 1, wherein the obtaining an index file comprises:
    obtaining a media presentation description file, the media presentation description file being generated by a server;
    obtaining information of the index file from the media presentation description file; and
    downloading the index file from the server according to the information of the index file.
  3. The method according to claim 2, wherein the obtaining the camera position value of a switched viewing angle, and downloading the video frames according to the video frame information and the camera position value of the switched viewing angle, comprises:
    obtaining a current camera position value;
    determining a current-view video frame to be downloaded according to the video frame information and the current camera position value;
    obtaining a target camera position value; and
    determining a target-view video frame to be downloaded according to the video frame information and the target camera position value.
  4. The method according to claim 3, wherein, after the obtaining the camera position value of a switched viewing angle and downloading the video frames according to the video frame information and the camera position value of the switched viewing angle, the method further comprises:
    decoding and rendering the video frames.
  5. The method according to any one of claims 1 to 4, wherein the video frame information comprises any one of the following:
    video frame start position information;
    video frame size;
    the camera position value corresponding to the video frame.
  6. A method for processing a free-view video scene, applied to a server, wherein the method comprises:
    slicing and encapsulating a media file containing multiple code streams to obtain segments, the segments including video frame information;
    generating an index file corresponding to the segments; and
    extracting the video frame information in the segments into the index file.
  7. The method according to claim 6, further comprising:
    generating a media presentation description file, the media presentation description file including information of the segments and information of the index file.
  8. A client, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for processing a free-view video scene according to any one of claims 1 to 5.
  9. A server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for processing a free-view video scene according to any one of claims 6 to 7.
  10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer-executable program, and the computer-executable program is used to cause a computer to perform the method for processing a free-view video scene according to any one of claims 1 to 5, or the method for processing a free-view video scene according to any one of claims 6 to 7.
PCT/CN2022/093592 2021-06-28 2022-05-18 Method for processing video scene of free visual angle, and client and server WO2023273675A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110722259.7A CN115604523A (en) 2021-06-28 2021-06-28 Processing method of free visual angle video scene, client and server
CN202110722259.7 2021-06-28

Publications (1)

Publication Number Publication Date
WO2023273675A1 true WO2023273675A1 (en) 2023-01-05

Family

ID=84690329

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/093592 WO2023273675A1 (en) 2021-06-28 2022-05-18 Method for processing video scene of free visual angle, and client and server

Country Status (2)

Country Link
CN (1) CN115604523A (en)
WO (1) WO2023273675A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105872570A (en) * 2015-12-11 2016-08-17 乐视网信息技术(北京)股份有限公司 Method and apparatus for implementing multi-camera video synchronous playing
CN109257611A (en) * 2017-07-12 2019-01-22 阿里巴巴集团控股有限公司 A kind of video broadcasting method, device, terminal device and server
CN109891906A (en) * 2016-04-08 2019-06-14 维斯比特股份有限公司 View perceives 360 degree of video streamings
US20190206128A1 (en) * 2017-12-28 2019-07-04 Rovi Guides, Inc. Systems and methods for changing a users perspective in virtual reality based on a user-selected position
CN110035316A (en) * 2018-01-11 2019-07-19 华为技术有限公司 The method and apparatus for handling media data
CN112188219A (en) * 2020-09-29 2021-01-05 北京达佳互联信息技术有限公司 Video receiving method and device and video transmitting method and device
CN112771884A (en) * 2018-04-13 2021-05-07 华为技术有限公司 Immersive media metrics for virtual reality content with multiple machine positions

Also Published As

Publication number Publication date
CN115604523A (en) 2023-01-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22831501

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE