CN115604523A - Processing method of free visual angle video scene, client and server - Google Patents


Info

Publication number
CN115604523A
Authority
CN
China
Prior art keywords
video frame
index file
information
camera position
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110722259.7A
Other languages
Chinese (zh)
Inventor
江平
赵俊哲
高元仲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN202110722259.7A
Priority to PCT/CN2022/093592 (WO2023273675A1)
Publication of CN115604523A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a method for processing a free-view video scene, a client, and a server. The method applied to the client comprises: acquiring an index file; parsing, from the index file, the video frame information and the camera position values of all camera positions; acquiring the camera position value of the view to switch to; and downloading video frames according to the video frame information and that camera position value. This achieves low-latency view interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add further view information later.

Description

Processing method of free visual angle video scene, client and server
Technical Field
Embodiments of the invention relate to the technical field of multimedia, and in particular to a method for processing a free-view video scene, a client, and a server.
Background
With the arrival of the 5G era, users' entertainment demands keep growing. Single-view video can no longer meet their expectations, and multi-view video offers only a few highlight views, which limits how much users can interact. A free view, covering the full 360 degrees, lets the user choose the viewing angle freely and provides a customized experience. Free-view video is now widely used in sports events, education and training, and entertainment performances, and provides a new video scenario for 5G applications.
In the related art, a free-view experience places high demands on the latency of view interaction and the smoothness of picture switching. Mainstream free-viewpoint solutions in the industry use either stitching or real-time synthesis. The stitched-view scheme requires high transmission bandwidth and degrades the image quality of the original video frames; with real-time synthesis, the quality of the synthesized view cannot be guaranteed and the performance cost is high.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
Embodiments of the invention provide a method for processing a free-view video scene, a client, and a server, which achieve low-latency view interaction while keeping picture switching smooth.
In a first aspect, an embodiment of the present invention provides a method for processing a free-view video scene, where the method is applied to a client, and the method includes:
acquiring an index file, wherein the index file comprises video frame information;
analyzing video frame information and camera position values of all camera positions according to the index file;
and acquiring the camera position value of the switching visual angle, and downloading the video frame according to the video frame information and the camera position value of the switching visual angle.
In a second aspect, an embodiment of the present invention provides a method for processing a free-view video scene, which is applied to a server, and the method includes:
slicing and encapsulating a media file containing multiple code streams to obtain fragments, wherein the fragments comprise video frame information;
generating an index file corresponding to the fragments;
and extracting the video frame information in the fragments into the index file.
In a third aspect, an embodiment of the present invention provides a client, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the computer program implementing the method of processing a freeview video scene as described above in the first aspect.
In a fourth aspect, an embodiment of the present invention provides a server, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of processing a freeview video scene as described above in the second aspect when executing the computer program.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer-executable program for causing a computer to perform the method for processing a freeview video scene as described in the first aspect above or the method for processing a freeview video scene as described in the second aspect above.
In the embodiments of the invention, the client acquires the index file, parses the video frame information and the camera position values of all camera positions from the index file, acquires the camera position value of the view to switch to, and downloads video frames according to the video frame information and that camera position value. This achieves low-latency view interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add further view information later.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.
Fig. 1 is a main flowchart (client side) of a method for processing a free-view video scene according to an embodiment of the present invention;
Fig. 2 is a sub-flowchart of a method for processing a free-view video scene according to an embodiment of the present invention;
Fig. 3 is a sub-flowchart of a method for processing a free-view video scene according to an embodiment of the present invention;
Fig. 4 is a diagram of free-view camera positions according to an embodiment of the present invention;
Fig. 5 is a diagram of free-view switching frames according to an embodiment of the present invention;
Fig. 6 is a diagram of free-view bullet-time switching frames according to an embodiment of the present invention;
Fig. 7 is a main flowchart (server side) of a method for processing a free-view video scene according to an embodiment of the present invention;
Fig. 8 is a flowchart of free-view live broadcast according to an embodiment of the present invention;
Fig. 9 is a flowchart of free-view on-demand playback according to an embodiment of the present invention;
Fig. 10 is a schematic diagram of a client structure according to an embodiment of the present invention;
Fig. 11 is a schematic diagram of a server structure according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It should be understood that, in the description of the embodiments of the present invention, "a plurality" (or "multiple") means two or more; "greater than", "less than", "exceeding", and the like are understood as excluding the stated number, while "at least", "at most", "within", and the like are understood as including it. Where "first", "second", etc. are used, they serve only to distinguish technical features and are not intended to indicate or imply relative importance, the number of the technical features indicated, or their order.
Embodiments of the invention provide a method for processing a free-view video scene, a client, and a server. The client acquires an index file, parses the video frame information and the camera position values of all camera positions from the index file, acquires the camera position value of the view to switch to, and downloads video frames according to the video frame information and that camera position value. This achieves low-latency view interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add further view information later.
Fig. 1 is a flowchart of a method for processing a free-view video scene according to an embodiment of the present invention. The method can be applied to a client and includes, but is not limited to, the following steps:
step 101, acquiring an index file;
step 102, parsing the video frame information and the camera position values of all camera positions from the index file;
step 103, acquiring the camera position value of the view to switch to, and downloading video frames according to the video frame information and that camera position value.
It can be understood that the client acquires the index file, which contains the video frame information and the camera position values of all camera positions. The client parses both from the index file, acquires the camera position value of the view to switch to, and downloads video frames according to the video frame information and that camera position value, thereby switching the free view. This achieves low-latency view interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add further view information later.
It should be noted that the video frame information includes, but is not limited to, the start position of each video frame, the size of each video frame, and the camera position value corresponding to each video frame.
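The three fields listed above can be sketched as a small record plus a lookup table. The patent does not specify the on-disk format of the index file, so the JSON layout, the field names, and the extra `frame_index` field below are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameInfo:
    frame_index: int  # position of the frame on the timeline (assumed field)
    camera: int       # camera position value the frame belongs to
    offset: int       # start position of the frame, in bytes
    size: int         # size of the frame, in bytes


def parse_index(text: str) -> dict[tuple[int, int], FrameInfo]:
    """Parse a hypothetical JSON index into a (frame_index, camera) lookup."""
    table = {}
    for entry in json.loads(text)["frames"]:
        info = FrameInfo(entry["frame"], entry["camera"],
                         entry["offset"], entry["size"])
        table[(info.frame_index, info.camera)] = info
    return table


# Hypothetical index covering two frames for each of two camera positions.
sample = json.dumps({"frames": [
    {"frame": 0, "camera": 1, "offset": 0, "size": 100},
    {"frame": 0, "camera": 2, "offset": 100, "size": 120},
    {"frame": 1, "camera": 1, "offset": 220, "size": 90},
    {"frame": 1, "camera": 2, "offset": 310, "size": 95},
]})
index = parse_index(sample)
print(index[(1, 2)])
```

Keying the table by frame and camera position lets the client find the frame for any view at any time in a single lookup.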
Taking live broadcast as an example, the client downloads the index file from the server and parses the video frame information. Following that information, it downloads the next frame of the current camera position, then decodes and renders it. When the user performs a view switching operation, the client modifies the current camera position value, downloads the next frame of the switched-to camera position frame by frame, and decodes and renders it, until the view switching operation ends. In other words, when the user switches views, the client responds to the interaction by modifying the camera position value and then continues downloading frames from the stream of the modified camera position. Because downloading is done frame by frame, view interaction during playback has low latency; and because a camera position switch neither disturbs picture rendering nor causes the camera position to jump, picture switching stays smooth.
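One plausible way to download a single frame, given its start position and size, is an HTTP byte-range request. The patent does not state how the frame-level download is carried out, so the use of the `Range` header here is an assumption; the helper name `range_header` is hypothetical.

```python
def range_header(offset: int, size: int) -> dict[str, str]:
    """Build an HTTP Range header covering exactly one frame.

    HTTP byte ranges are inclusive at both ends, so the last byte
    of the frame is offset + size - 1.
    """
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    return {"Range": f"bytes={offset}-{offset + size - 1}"}


# A frame that starts at byte 100 and is 120 bytes long.
print(range_header(100, 120))
```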
It should be noted that, after step 103, the following sub-steps may be included, but are not limited to:
and decoding and rendering the video frame.
It can be understood that the client must decode and render each downloaded video frame before the user can see its picture.
As shown in fig. 2, step 101 may include, but is not limited to, the following sub-steps:
step 1011, acquiring a media presentation description file, wherein the media presentation description file is generated by a server;
step 1012, obtaining the information of the index file according to the media presentation description file;
step 1013, downloading the index file from the server according to the information of the index file.
It can be understood that the client may obtain the index file as follows: it acquires a Media Presentation Description (MPD) file that the server generates based on DASH (Dynamic Adaptive Streaming over HTTP), obtains the information of the index file from the MPD file, and then downloads the index file from the server according to that information.
As shown in fig. 3, step 103 may include, but is not limited to, the following sub-steps:
step 1031, acquiring a current camera position value;
step 1032, determining a current view video frame to be downloaded according to the video frame information and the current camera position value;
step 1033, obtaining a target camera location value;
step 1034, determining a target view video frame to be downloaded according to the video frame information and the target camera position value.
It will be appreciated that, to provide a free view, multiple cameras must surround the photographed object through 360 degrees, as shown in Fig. 4, and the video frames to download are found by switching camera positions. Specifically, each camera position corresponds to a camera position value, and the user switches views by modifying the camera position value at the client. For example, the client acquires the current camera position value, determines the current camera position from it, and determines the current-view video frame to be downloaded according to the video frame information and the current camera position. As another example, when the user performs a view switching operation, the current camera position value is modified to the target camera position value: the client acquires the target camera position value input by the user, determines the target camera position from it, thereby switching from the current camera position to the target camera position, and determines the target-view video frame to be downloaded according to the video frame information and the target camera position. The free view is switched in this way.
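A minimal sketch of steps 1031 to 1034, assuming camera position values run from 1 to the number of cameras and that frames can be looked up by frame index and camera position value. The names `ViewState`, `switch_to`, and `pick_frame` are hypothetical, and the frame table holds placeholder strings rather than real frame information.

```python
from dataclasses import dataclass


@dataclass
class ViewState:
    camera: int        # current camera position value
    camera_count: int  # number of camera positions around the object

    def switch_to(self, target: int) -> None:
        """Replace the current camera position value with the target value."""
        if not 1 <= target <= self.camera_count:
            raise ValueError(f"camera position {target} is out of range")
        self.camera = target


# Placeholder frame table: 3 frames for each of 4 camera positions.
FRAMES = {(f, c): f"frame{f}_cam{c}" for f in range(3) for c in range(1, 5)}


def pick_frame(frames: dict, frame_index: int, state: ViewState) -> str:
    """Determine the frame to download for the current view."""
    return frames[(frame_index, state.camera)]


state = ViewState(camera=1, camera_count=4)
print(pick_frame(FRAMES, 0, state))  # frame of the current view
state.switch_to(3)                   # user selects camera position 3
print(pick_frame(FRAMES, 1, state))  # frame of the target view
```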
It can be understood that, as shown in Fig. 5, taking live broadcast as an example, the client downloads the next frame of the current camera position frame by frame according to the video frame information in the index file. When the user performs a view switching operation, the client modifies the current camera position value to the target camera position value and downloads the next frame of the target camera position; this repeats until the view switching operation ends. On-demand playback, also shown in Fig. 5, proceeds in the same way: the client downloads the next frame of the current camera position frame by frame and, when the user switches views, modifies the current camera position value to the target camera position value and downloads the next frame of the target camera position, repeating until the view switching operation ends. Bullet time, shown in Fig. 6, differs in one respect: when the user performs a bullet-time operation, the client increments the current camera position value by 1 to obtain the target camera position value and downloads the same frame of the target camera position rather than the next one, repeating until the bullet-time operation ends.
Fig. 7 is a flowchart of a method for processing a free-view video scene according to an embodiment of the present invention. The method can be applied to a server and includes, but is not limited to, the following steps:
step 201, slicing and encapsulating a media file containing multiple code streams to obtain fragments, wherein the fragments comprise video frame information;
step 202, generating an index file corresponding to the fragments;
step 203, extracting the video frame information in the fragments into the index file.
It can be understood that the server slices and encapsulates the media file containing multiple code streams to obtain fragments that include video frame information, generates an index file corresponding to the fragments, and extracts the video frame information in the fragments into the index file, so that the client can download the index file from the server. This achieves low-latency view interaction while keeping picture switching smooth. Introducing the index file as an auxiliary file reduces unnecessary downloads without sacrificing picture quality, and makes it easy to add further view information later.
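Steps 202 and 203 can be sketched as a pass over the frames of one fragment that records where each frame starts and how large it is. This sketch assumes the frames are stored back to back in the fragment, which ignores the container overhead a real DASH segment would have; `build_index` and the output field names are hypothetical.

```python
def build_index(fragment_frames: list[tuple[int, int, bytes]]) -> dict:
    """Build index entries for one fragment.

    fragment_frames lists (frame_index, camera, payload) in storage order.
    Offsets assume the payloads are stored contiguously (an assumption).
    """
    entries = []
    offset = 0
    for frame_index, camera, payload in fragment_frames:
        entries.append({
            "frame": frame_index,
            "camera": camera,
            "offset": offset,       # start position of the frame
            "size": len(payload),   # size of the frame in bytes
        })
        offset += len(payload)
    return {"frames": entries}


# Hypothetical fragment: two camera positions, dummy payloads.
fragment = [(0, 1, b"aaaa"), (0, 2, b"bb"), (1, 1, b"ccc"), (1, 2, b"d")]
fragment_index = build_index(fragment)
for entry in fragment_index["frames"]:
    print(entry)
```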
It is to be understood that the server may generate the media presentation description file based on the DASH protocol, wherein the media presentation description file includes the information of the fragments and the information of the index file. The client can therefore acquire the media presentation description file generated by the server, obtain the information of the index file from it, and then download the index file from the server according to that information.
The video streams of multiple camera positions are transcoded and merged into a single code stream, which is either broadcast live or recorded for on-demand playback. The server obtains the live stream or the on-demand stream, slices and encapsulates it based on the DASH protocol, and extracts the frame information in the fragments into an index file. The fragment information and the index file information are described in a media presentation description file. The client acquires the media presentation description file, downloads the index file according to the custom index file field, and parses the video frame information in the index file. The client then downloads frames one at a time according to the video frame information in the index file, and decodes and renders the downloaded frames. When the user switches views, the client responds to the interaction by modifying the camera position value and then continues downloading frames from the stream of the modified camera position. Because downloading is done frame by frame, view interaction during playback has low latency; and because a camera position switch neither disturbs picture rendering nor causes the camera position to jump, picture switching stays smooth.
The processing method for the free-view video scene provided by the invention is further described in the following with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 8, taking live broadcast as an example: the video acquisition module acquires video streams from multiple camera positions. The server synchronizes the video frames of these streams, merges the synchronized streams into a single code stream, performs DASH slicing on the merged code stream, and synchronously generates the corresponding frame index file. That is, the server slices and encapsulates the media file containing multiple code streams, generates the corresponding index file, and marks the information of all frames of the corresponding fragment in the index file. The server generates a media presentation description file. The client downloads the media presentation description file and parses from it the information of the index file, the video fragments, and the audio fragments. The client downloads the index file and parses the video frame information. The client downloads the next frame of the current camera position according to the video frame information in the index file, and decodes and renders the downloaded frame. When the user performs a view switching operation, the client modifies the current camera position value, downloads the next frame of the switched-to camera position, and decodes and renders it; this repeats until the view switching operation ends.
As shown in Fig. 9, taking on-demand playback as an example: the server performs DASH slicing on the recorded merged code stream and synchronously generates the corresponding frame index file. That is, the server slices and encapsulates the media file containing multiple code streams, generates the corresponding index file, and marks the information of all frames of the corresponding fragment in the index file. The server generates a media presentation description file. The client downloads the media presentation description file and parses from it the information of the index file, the video fragments, and the audio fragments. The client downloads the index file and parses the video frame information. The client downloads the next frame of the current camera position according to the video frame information in the index file, and decodes and renders the downloaded frame. When the user performs a view switching operation, the client modifies the current camera position value, downloads the next frame of the switched-to camera position, and decodes and renders it; this repeats until the view switching operation ends.
Taking bullet time as an example: the server slices and encapsulates the media file containing multiple code streams, generates the corresponding index file, and marks the information of all frames of the corresponding fragment in the index file. The server generates a media presentation description file. The client downloads the media presentation description file and parses from it the information of the index file, the video fragments, and the audio fragments. The client downloads the index file and parses the video frame information. The client downloads the next frame of the current camera position according to the video frame information in the index file, and decodes and renders the downloaded frame. When the user performs a bullet-time operation, the client increments the current camera position value by 1, downloads the same frame of the switched-to camera position, and decodes and renders it; this repeats until the bullet-time operation ends.
As shown in fig. 10, an embodiment of the present invention further provides a client.
Specifically, the client includes one or more processors and a memory; one processor and one memory are taken as an example in fig. 10. The processor and the memory may be connected by a bus or in another manner; connection by a bus is taken as an example in fig. 10.
The memory, as a non-transitory computer-readable storage medium, may be used to store a non-transitory software program and a non-transitory computer-executable program, such as the method for processing a free-view video scene in the embodiment of the present invention described above. The processor implements the method for processing a free-view video scene in the embodiment of the present invention described above by running the non-transitory software program and the program stored in the memory.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data and the like required to perform the processing method of the free-view video scene in the above-described embodiment of the present invention. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software program and the program required to implement the method for processing a free-view video scene in the embodiment of the present invention are stored in the memory. When executed by one or more processors, they perform the method for processing a free-view video scene in the embodiment of the present invention, for example method steps 101 to 103 in fig. 1, method steps 1011 to 1013 in fig. 2, and method steps 1031 to 1034 in fig. 3 described above: the client acquires the index file, parses the video frame information and the camera position values of all camera positions according to the index file, acquires the camera position value of the switched view, and downloads the video frame according to the video frame information and the camera position value of the switched view. On this basis, low-latency view interaction is achieved while the smoothness of picture switching is ensured. By introducing the index file as an auxiliary file, unnecessary downloads are reduced without degrading picture quality, and information for additional views can easily be added later.
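As an illustration of the client-side parsing step, the following Python sketch reads an index file and derives the byte range of one frame. The on-disk layout is an assumption: this section does not specify the index file format, so a hypothetical text layout with one `frame_no,camera,offset,size` record per line is used. Fetching a single frame through an HTTP `Range` header is likewise an assumed transport detail; the inclusive `bytes=first-last` form is standard HTTP.

```python
from typing import NamedTuple


class FrameInfo(NamedTuple):
    frame_no: int  # frame number on the shared timeline
    camera: int    # camera position value corresponding to the frame
    offset: int    # video frame start position within the slice, in bytes
    size: int      # video frame size in bytes


def parse_index(text: str) -> list[FrameInfo]:
    """Parse the hypothetical comma-separated index layout, skipping blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        frame_no, camera, offset, size = (int(field) for field in line.split(","))
        entries.append(FrameInfo(frame_no, camera, offset, size))
    return entries


def camera_values(entries) -> list[int]:
    """Collect the camera position values of all camera positions in the index."""
    return sorted({e.camera for e in entries})


def range_header(entry: FrameInfo) -> str:
    """Build an HTTP Range header value covering exactly one frame (inclusive end)."""
    return f"bytes={entry.offset}-{entry.offset + entry.size - 1}"
```

Because the start position and size of every frame are known before any media data is requested, the client can ask for exactly the frame of the selected camera position instead of the whole multi-view slice.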
As shown in fig. 11, an embodiment of the present invention further provides a server.
Specifically, the server includes one or more processors and a memory; one processor and one memory are taken as an example in fig. 11. The processor and the memory may be connected by a bus or in another manner; connection by a bus is taken as an example in fig. 11.
The memory, as a non-transitory computer-readable storage medium, may be used to store a non-transitory software program and a non-transitory computer-executable program, such as the method for processing a free-view video scene in the embodiment of the present invention described above. The processor implements the method for processing a free-view video scene in the embodiment of the present invention described above by running the non-transitory software program and the program stored in the memory.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data and the like required to perform the processing method of the free-view video scene in the above-described embodiment of the present invention. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software program and the program required to implement the method for processing a free-view video scene in the embodiment of the present invention are stored in the memory. When executed by one or more processors, they perform the method for processing a free-view video scene in the embodiment of the present invention, for example method steps 201 to 203 in fig. 7 described above: the server slices and encapsulates a media file containing multi-path code streams to obtain slices, where the slices include video frame information, generates an index file corresponding to the slices, and extracts the video frame information in the slices to the index file, so that the client can download the index file from the server. On this basis, low-latency view interaction is achieved while the smoothness of picture switching is ensured. By introducing the index file as an auxiliary file, unnecessary downloads are reduced without degrading picture quality, and information for additional views can easily be added later.
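The server-side steps can be sketched in Python as follows. This is a simplified illustration, not the encapsulation used by the embodiment: a real DASH slice carries container headers, whereas here the frame payloads are simply concatenated, and the function name `package_slice` and the tuple layout of the index records are hypothetical.

```python
def package_slice(frames):
    """Pack frames into one slice and extract their information into an index.

    `frames` is an iterable of (frame_no, camera, payload_bytes) tuples.
    Returns (slice_bytes, index), where each index record is
    (frame_no, camera, start_offset, size). Plain concatenation stands in
    for real slice encapsulation (an assumption).
    """
    data = bytearray()
    index = []
    for frame_no, camera, payload in frames:
        # Record where this frame starts and how large it is before appending it.
        index.append((frame_no, camera, len(data), len(payload)))
        data.extend(payload)
    return bytes(data), index
```

Each index record lets a client locate one frame of one camera position inside the slice, for example `data[offset:offset + size]`, without parsing the slice itself.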
Furthermore, an embodiment of the present invention further provides a computer-readable storage medium storing a computer-executable program. When the computer-executable program is executed by one or more control processors, for example by one processor in fig. 11, the one or more processors are caused to execute the method for processing a free-view video scene in the foregoing embodiment of the present invention, for example method steps 101 to 103 in fig. 1, method steps 1011 to 1013 in fig. 2, and method steps 1031 to 1034 in fig. 3 described above: the client acquires an index file, parses the video frame information and the camera position values of all camera positions according to the index file, acquires the camera position value of the switched view, and downloads the video frame according to the video frame information and the camera position value of the switched view. Alternatively, the server slices and encapsulates a media file containing multi-path code streams to obtain slices, where the slices include video frame information, generates an index file corresponding to the slices, and extracts the video frame information in the slices to the index file, so that the client can download the index file from the server.
In either case, low-latency view interaction is achieved while the smoothness of picture switching is ensured. By introducing the index file as an auxiliary file, unnecessary downloads are reduced without degrading picture quality, and information for additional views can easily be added later.
It will be understood by those of ordinary skill in the art that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable programs, data structures, program modules, or other data, as is well known to those of ordinary skill in the art. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embody computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media, as known to those skilled in the art.
While the preferred embodiments of the present invention have been described in detail, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.

Claims (10)

1. A method for processing a free-view video scene, applied to a client, the method comprising:
acquiring an index file;
parsing video frame information and camera position values of all camera positions according to the index file;
and acquiring a camera position value for switching a view, and downloading a video frame according to the video frame information and the camera position value for switching the view.
2. The method of claim 1, wherein obtaining the index file comprises:
acquiring a media presentation description file, wherein the media presentation description file is generated by a server;
obtaining the information of the index file according to the media presentation description file;
and downloading the index file from the server according to the information of the index file.
3. The method of claim 2, wherein the obtaining a camera position value for switching a view, and the downloading the video frame according to the video frame information and the camera position value for switching a view comprises:
acquiring a current camera position value;
determining a current view video frame to be downloaded according to the video frame information and the current camera position value;
acquiring a target camera position value;
and determining a target view angle video frame to be downloaded according to the video frame information and the target camera position value.
4. The method according to claim 3, further comprising, after the acquiring of the camera position value for switching a view and the downloading of the video frame according to the video frame information and the camera position value for switching the view:
decoding and rendering the video frame.
5. The method according to any of claims 1 to 4, wherein the video frame information comprises any of:
video frame start position information;
a video frame size;
a camera position value corresponding to the video frame.
6. A method for processing a free-view video scene, applied to a server, the method comprising:
slicing and encapsulating a media file containing multi-path code streams to obtain slices, wherein the slices comprise video frame information;
generating an index file corresponding to the slices;
and extracting the video frame information in the slices to the index file.
7. The method of claim 6, further comprising:
generating a media presentation description file, wherein the media presentation description file comprises the information of the slices and the information of the index file.
8. A client, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of processing the freeview video scene according to any one of claims 1 to 5 when executing the computer program.
9. A server, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of processing freeview video scenes according to any one of claims 6 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer-executable program for causing a computer to execute the method of processing a freeview video scene according to any one of claims 1 to 5 or the method of processing a freeview video scene according to any one of claims 6 to 7.
CN202110722259.7A 2021-06-28 2021-06-28 Processing method of free visual angle video scene, client and server Pending CN115604523A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110722259.7A CN115604523A (en) 2021-06-28 2021-06-28 Processing method of free visual angle video scene, client and server
PCT/CN2022/093592 WO2023273675A1 (en) 2021-06-28 2022-05-18 Method for processing video scene of free visual angle, and client and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110722259.7A CN115604523A (en) 2021-06-28 2021-06-28 Processing method of free visual angle video scene, client and server

Publications (1)

Publication Number Publication Date
CN115604523A true CN115604523A (en) 2023-01-13

Family

ID=84690329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110722259.7A Pending CN115604523A (en) 2021-06-28 2021-06-28 Processing method of free visual angle video scene, client and server

Country Status (2)

Country Link
CN (1) CN115604523A (en)
WO (1) WO2023273675A1 (en)

Also Published As

Publication number Publication date
WO2023273675A1 (en) 2023-01-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination