WO2019076356A1 - Video processing method, server, virtual reality device and system based on virtual reality scene

Video processing method, server, virtual reality device and system based on virtual reality scene

Info

Publication number
WO2019076356A1
Authority
WIPO (PCT)
Prior art keywords
current; virtual reality; video frame; video; view
Application number
PCT/CN2018/110935
Other languages
English (en)
French (fr)
Inventor
曾新海
涂远东
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2019076356A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/4402 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H04N 13/156 Mixing image signals

Definitions

  • The present application relates to the field of computer technologies, and in particular to a video processing method, server, virtual reality device, and system based on a virtual reality scenario.
  • In the traditional method, a user watching a virtual reality live video is not static; the user often moves to some extent, which causes the viewing angle to change.
  • In the traditional method, after the user's perspective changes, the video image from before the change is still played for a period of time, and only after that period can a clear video of the new perspective be seen. The traditional method therefore usually takes some time to switch to clear video at a new perspective, introducing a noticeable display delay.
  • A video processing method based on a virtual reality scene, a server, a virtual reality device, a storage medium, and a system are provided.
  • A video processing method based on a virtual reality scene comprises:
  • the server obtains the current perspective after the perspective switch in the virtual reality scene, and determines the current play time node;
  • the server outputs a key frame that corresponds to the current perspective and to the current play time node;
  • the server searches for video frames in sequence in the mixed video frame sequence corresponding to the current perspective, starting from the video frame following the one that corresponds to the current play time node; and
  • the server outputs the found video frames. A sketch of this flow follows the list.
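  • The server-side flow above can be pictured with a short sketch. This is an illustrative reading of the claim, not code from the patent; the data structures, the millisecond time nodes, and the send() helper are all assumptions.

```python
# Per-view frame stores, indexed by play time node (assumed to be in ms).
KEY_FRAMES = {}    # {view_id: {time_node: key_frame_bytes}}
MIXED_FRAMES = {}  # {view_id: [(time_node, frame_bytes), ...]} sorted by time_node


def send(frame: bytes) -> None:
    """Stand-in for adding a frame to the sending queue."""
    print(f"queued {len(frame)} bytes")


def handle_view_switch(current_view: int, current_time_node: int) -> None:
    # 1. Output the key frame of the new view at the current play time node.
    send(KEY_FRAMES[current_view][current_time_node])
    # 2. In the mixed sequence of the new view, output every frame that
    #    follows the frame corresponding to the current play time node.
    for time_node, frame in MIXED_FRAMES[current_view]:
        if time_node > current_time_node:
            send(frame)
```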
  • A server comprising a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
  • One or more storage media storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • A video processing method based on a virtual reality scene comprises:
  • the virtual reality device plays the mixed video frame sequence corresponding to the original view in the virtual reality scene;
  • the virtual reality device generates and outputs a video frame acquisition request when the viewing angle is switched;
  • the virtual reality device receives a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current perspective after the view switch and to the current play time node;
  • the virtual reality device plays the key frame in place of the mixed video frame sequence corresponding to the original view; and
  • the virtual reality device receives and plays video frames in the mixed video frame sequence corresponding to the current view, the video frames being searched for in sequence starting from the video frame following the one that corresponds to the current play time node.
  • A virtual reality device comprising a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
  • the key frame corresponds to the current perspective after the view switch, and to the current play time node;
  • the video frames are searched for in sequence in the mixed video frame sequence corresponding to the current view, starting from the video frame following the one that corresponds to the current play time node.
  • One or more storage media storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • the key frame corresponds to the current perspective after the view switch, and to the current play time node;
  • the video frames are searched for in sequence in the mixed video frame sequence corresponding to the current view, starting from the video frame following the one that corresponds to the current play time node.
  • A video processing system based on a virtual reality scene comprises a virtual reality device and a content distribution network server;
  • the virtual reality device is configured to acquire the original view in the virtual reality scenario, obtain the mixed video frame sequence corresponding to the original view from the content distribution network server and play it, and generate a video frame acquisition request when the view is switched and send it to the content distribution network server;
  • the content distribution network server is configured to, in response to the video frame acquisition request, acquire the current view after the view switch and the current play time node, acquire a key frame corresponding to the current view and to the current play time node, and transmit the key frame to the virtual reality device;
  • the content distribution network server is further configured to search for video frames in sequence in the mixed video frame sequence corresponding to the current view, starting from the video frame following the one that corresponds to the current play time node, and to send the found video frames to the virtual reality device; and
  • the virtual reality device is further configured to play the key frame in place of the mixed video frame sequence corresponding to the original view, and to sequentially play the subsequently received video frames. A sketch of this exchange follows the list.
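  • As a companion to the server sketch above, the following is a hedged sketch of the device/server exchange described by this system. The request fields and the device and server methods are hypothetical names, not APIs from the patent.

```python
def on_view_switch(device, cdn_server) -> None:
    # Device side: build and send the video frame acquisition request.
    request = {
        "current_view": device.current_view,   # or a head pose, per the text
        "switch_time_point": device.clock_ms(),
    }
    key_frame, following_frames = cdn_server.handle_request(request)

    # Replace the original view's mixed sequence with the returned key frame,
    # then play the subsequently received frames in order.
    device.stop_playing_original_sequence()
    device.play(key_frame)
    for frame in following_frames:
        device.play(frame)
```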
  • FIG. 1 is an application scenario diagram of a video processing method based on a virtual reality scene in an embodiment
  • FIG. 2 is a schematic flowchart of a video processing method based on a virtual reality scenario in an embodiment
  • FIGS. 3A-3B are schematic diagrams of interfaces of a virtual reality video image corresponding to a viewing angle in an embodiment;
  • FIG. 4 is a schematic diagram of a key frame sequence and a mixed video frame sequence in one embodiment
  • FIG. 5 is a schematic flow chart of a video processing method based on a virtual reality scene in another embodiment
  • FIG. 6 is an architectural diagram of a video processing system based on a virtual reality scenario in an embodiment
  • FIG. 7 is an architectural diagram of a video processing system based on a virtual reality scenario in another embodiment
  • FIG. 8 is a block diagram of a video processing system based on a virtual reality scene in still another embodiment
  • FIG. 9 is a sequence diagram of a video processing method based on a virtual reality scene in an embodiment
  • FIG. 10 is a structural block diagram of a video processing apparatus based on a virtual reality scene in an embodiment
  • FIG. 11 is a structural block diagram of a video processing apparatus based on a virtual reality scene in another embodiment
  • FIG. 12 is a structural block diagram of a video processing device based on a virtual reality scene in still another embodiment.
  • FIG. 13 is a block diagram of a computer device in one embodiment.
  • FIG. 14 is a block diagram of a computer device in another embodiment.
  • FIG. 1 is an application scenario diagram of a video processing method based on a virtual reality scene in an embodiment.
  • the application scenario includes a head mounted display 110, a virtual reality device 120, and a server 130 connected through a network.
  • the head mounted display (HMD) 110 may be a head mounted display device having a function of displaying a virtual reality scene screen. It can be understood that the head mounted display can also be replaced by other devices having the function of displaying a virtual reality scene.
  • the virtual reality device 120 is a device having a function of realizing a virtual reality scene.
  • the virtual reality device 120 may be a desktop computer or a mobile terminal, and the mobile terminal may include at least one of a mobile phone, a tablet computer, a personal digital assistant, and a wearable device.
  • the server 130 can be implemented by a stand-alone server or a server cluster composed of a plurality of physical servers.
  • a sequence of mixed video frames corresponding to the original perspective can be played in the virtual reality device 120.
  • the user can view the virtual reality scene picture formed by the virtual reality device 120 playing the mixed video frame sequence corresponding to the original viewing angle through the head mounted display 110.
  • the virtual reality device 120 can generate a video frame acquisition request and send it to the server 130 when the view is switched.
  • the server 130 may acquire the current perspective after the perspective switching in the virtual reality scene, and determine the current playing time node.
  • the server 130 may output a key frame corresponding to the current viewing angle and corresponding to the current playing time node to the virtual reality device 120.
  • the server 130 may further search for a video frame from the next video frame of the video frame corresponding to the current play time node in the mixed video frame sequence corresponding to the current view, and send the searched video frame to the virtual reality device 120.
  • the virtual reality device 120 may play the key frame instead of the mixed video frame sequence corresponding to the original view, and sequentially play the subsequently received video frames to implement video playback under the current view after the switch.
  • the user can view the virtual reality scene picture corresponding to the current view angle of the switch formed by the virtual reality device 120 when playing the key frame and sequentially playing the subsequently received video frame through the head mounted display 110.
  • In an embodiment, the head mounted display 110 may be omitted from the application scenario.
  • When the virtual reality device 120 itself has the function of displaying the virtual reality scene screen that the head mounted display 110 provides, the head mounted display 110 can be omitted.
  • FIG. 2 is a schematic flow chart of a video processing method based on a virtual reality scene in an embodiment. This embodiment is mainly illustrated by applying the video processing method based on the virtual reality scene to the server 130 in FIG. 1 .
  • the method specifically includes the following steps:
  • the virtual reality scene is a virtual world created by computer simulation in three-dimensional space. It provides the user with simulations of visual, auditory, tactile and other senses, allowing the user to observe things in the virtual world in three-dimensional space as if immersed in it. It can be understood that the virtual reality scene is three-dimensional.
  • the virtual reality scene may be a virtual reality game live scene.
  • the Field of View (FOV) is the angle formed by the lines connecting the observation point to the edges of the visible area.
  • the viewing angle is used to characterize the extent of the visible area that the viewing point can see. It can be understood that the range of visible areas that can be seen by different viewing angles is different.
  • the server may divide the total field of view of the three-dimensional virtual reality video into a plurality of different perspectives in advance.
  • the total field of view of the three-dimensional virtual reality video is the full view of the three-dimensional virtual reality video.
  • the total field of view of the 3D virtual reality video may be a 360 degree panoramic view or a view of less than 360 degrees, such as a 180 degree field of view.
  • each of the divided views corresponds to a virtual reality video image sequence of a partial scene in the three-dimensional virtual reality video.
  • the virtual reality video image sequence of the local scene is a virtual reality video image sequence that presents local features of the virtual reality scene displayed by the three-dimensional virtual reality video.
  • the virtual reality video image sequence of the local scene corresponding to each view is the sequence of virtual reality video images of the local scene presented in the visible area corresponding to that view; that is, within that view, it is the local scene seen from the observation point.
  • FIG. 3A-3B are schematic diagrams of interfaces of a virtual reality video image corresponding to a viewing angle in one embodiment.
  • FIG. 3A is a three-dimensional virtual reality video image in a panoramic view.
  • FIG. 3B is a virtual reality video image from one perspective.
  • the virtual reality video image in one view presented in FIG. 3B is a partial scene of the three-dimensional virtual reality video image.
  • the server may directly receive the current perspective after the perspective switching in the virtual reality scene sent by the virtual reality device.
  • the server can also obtain the current perspective after the switch according to the view switching related information.
  • the view switching related information is information that is associated with the view switching.
  • the view switching related information may include a head pose. It can be understood that a change in head pose can cause the viewing angle to switch, and is thus associated with the view switching, so the head pose belongs to the view switching related information.
  • the play time node is a time node preset for playing a video frame. It can be understood that each video frame has a corresponding playing time node. When the playing time node is reached, the video frame corresponding to the playing time node can be output for playing.
  • a play time node may correspond to multiple video frames under different perspectives, and when the play time node is reached, the video frame corresponding to that play time node under one of the perspectives is output. For example, suppose view 1, view 2, and view 3 each have a video frame corresponding to the play time node t1. When the play time node t1 is reached, one view is determined from among view 1, view 2, and view 3, and the video frame corresponding to t1 under the determined view is output.
  • the current play time node is the play time node currently to be played.
  • the server may directly acquire the current play time node sent by the virtual reality device.
  • the server may also acquire a switching time point of the view switching, and determine a current playing time node according to the switching time point.
  • the switching time point is a time point when the viewing angle is switched.
  • the server may also determine the current play time node based on the current local system time. In an embodiment, the server may search the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view for a play time node matching the current local system time, and use it as the current play time node.
  • In an embodiment, the server may select the play time node closest to the switching time point from among the play time nodes that have already been played, and use the play time node immediately after the selected closest node as the current play time node.
  • the key frame is a frame that is encoded using only its own information, without reference to other frames.
  • the key frame may be an intra-coded picture (I frame).
  • outputting the key frame means sending the key frame out so that it is played.
  • the server may add the key frame to the sending queue and send it to the virtual reality device through the sending queue; the virtual reality device may decode and play the key frame corresponding to the current view to generate the virtual reality video image corresponding to the key frame under the current view. It can be understood that the virtual reality video image corresponding to the key frame under the current perspective belongs to the virtual reality video images of the local scene.
  • the server stores the correspondence between the view angle and the key frame, and stores the correspondence between the key frame and the play time node.
  • the server may determine and output a key frame corresponding to the current viewing angle and corresponding to the current playing time node according to the foregoing correspondence.
  • step S206 includes: acquiring a key frame sequence corresponding to the current view angle; and searching for a key frame corresponding to the current play time node in the key frame sequence.
  • a key frame sequence is a sequence of video frames consisting of multiple key frames.
  • When the key frames in the key frame sequence are output frame by frame, the virtual reality video image sequence corresponding to the corresponding view may be generated.
  • the server may obtain the key frame sequence corresponding to the current view.
  • In the key frame sequence, the key frame corresponding to the current play time node is searched for according to the preset correspondence between key frames and play time nodes.
  • the video frame is sequentially searched from the next video frame of the video frame corresponding to the current play time node.
  • the mixed video frame sequence is a compressed sequence of mixed video frames consisting of key frames and inter-frame predicted frames.
  • An inter prediction frame is a video frame that is encoded with reference to other video frames using correlation between video image frames.
  • the inter prediction frame includes a P frame (predictive-coded picture) and/or a B frame (bidirectionally predicted picture). It can be understood that the mixed video frame sequence is the sequence of video frames used when playing the virtual reality video under normal conditions.
  • the sequence of mixed video frames is generated frame by frame from live video frames.
  • a correspondence between a view angle and a sequence of mixed video frames is preset in the server.
  • the server may acquire a sequence of mixed video frames corresponding to the current perspective according to the correspondence. It can be understood that each video frame in the mixed video frame sequence has a corresponding play time node.
  • the server may determine a video frame corresponding to the current play time node in the mixed video frame sequence corresponding to the current view, and sequentially search for the video frame from the next video frame of the determined video frame.
  • a key frame and a video frame in the mixed video frame sequence can correspond to the same play time node.
  • When they do, the virtual reality video images they generate are the same.
  • "The same" here means virtual reality video images with the same picture content; it does not exclude differences in quality and the like between the virtual reality video image output from the key frame and the one output from the video frame in the mixed video frame sequence.
  • the key frame is a key frame in the key frame sequence corresponding to the current perspective, and corresponds to the current play time node.
  • the key frame sequence and the mixed video frame sequence are different representations of the virtual reality video image sequence corresponding to the current perspective. That is, the virtual reality video image sequence corresponding to the current view may be represented by a key frame sequence, and each key frame in the key frame sequence is outputted frame by frame, and a virtual reality video image sequence corresponding to the current view may be generated.
  • the virtual reality video image sequence corresponding to the current view may also be represented by a mixed video frame sequence, and each video frame in the mixed video frame sequence is output frame by frame, and a virtual reality video image sequence corresponding to the current view may be generated. It can be understood that the key frame sequence corresponding to the current view and the virtual reality video image sequence corresponding to the mixed video frame sequence are all virtual reality video image sequences of the local scene.
  • FIG. 4 is a schematic diagram of a sequence of key frames and a sequence of mixed video frames in one embodiment.
  • shown are a key frame sequence and a mixed video frame sequence corresponding to the same view.
  • the key frame sequence is composed of individual key frames (I frames).
  • the mixed video frame sequence is composed of key frames (I frames) and inter prediction frames (such as the P frames in FIG. 4).
  • the video frames in the same dashed box in the key frame sequence and the mixed video frame sequence correspond to the same play time node; for example, the I frame 402a and the P frame 402b correspond to the same play time node t1, and the I frame 404a and the P frame 404b correspond to the same play time node.
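  • To make the alignment in FIG. 4 concrete, here is an illustrative sketch of the two representations; the Frame type, the integer time nodes, and the byte payloads are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    time_node: int   # play time node, e.g. t1, t2, ... (assumed integers)
    kind: str        # "I" (key frame) or "P" (inter-predicted frame)
    payload: bytes


# Same view, same play time nodes, two encodings: the key frame sequence
# carries only I frames, while the mixed sequence mostly carries P frames.
key_frame_sequence = [Frame(t, "I", b"...") for t in (1, 2, 3, 4)]
mixed_sequence = [Frame(1, "I", b"..."),
                  Frame(2, "P", b"..."),  # pairs with the I frame at node 2
                  Frame(3, "P", b"..."),
                  Frame(4, "P", b"...")]

# Frames in the same "dashed box" of FIG. 4 pair up by play time node.
pairs = {kf.time_node: (kf, mf)
         for kf, mf in zip(key_frame_sequence, mixed_sequence)}
```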
  • the server may sequentially output the found video frames frame by frame. It can be understood that the server can add the found video frames to the sending queue, and send the video frames to the virtual reality device frame by frame through the sending queue.
  • the virtual reality device can decode the video frame frame by frame to achieve normal playback of the virtual reality video image sequence in the current perspective.
  • an inter-predicted frame in the mixed video frame sequence is decoded and played with reference to the video image generated from the previously output video frame. As noted above, a key frame and a video frame corresponding to the same viewing angle and the same play time node generate the same virtual reality video image.
  • Therefore, when video frames are searched for in sequence starting from the video frame following the one corresponding to the current play time node, and the found video frames are output in order after the key frame, the output video frames can rely on the key frame to achieve normal playback of the virtual reality video image sequence under the current perspective.
  • For example, suppose FIG. 4 shows the key frame sequence and the mixed video frame sequence corresponding to the current viewing angle, and the current play time node is t1.
  • the server may select and output the key frame 402a corresponding to t1 from the key frame sequence; the server may then determine the video frame 402b corresponding to t1 in the mixed video frame sequence, search for video frames in sequence starting from the next video frame 404b, and output the found video frames frame by frame.
  • In the video processing method based on the virtual reality scene, after the view switch, the key frame corresponding to the current play time node under the switched current view can be output directly; after the key frame is output, video frames are searched for and output in sequence, starting from the video frame following the one corresponding to the current play time node in the mixed video frame sequence of the current view. Normal playback of the virtual reality video image under the new view is thus achieved quickly, without waiting out the virtual reality video image of the view before the change, which shortens the display delay when the angle of view is switched.
  • In an embodiment, step S202 includes: acquiring the current head pose after the head pose changes; and mapping the current head pose to the current perspective in the virtual reality scene according to a preset mapping relationship between head poses and perspectives in the virtual reality scene.
  • the head pose is the relative position of the head, taking a preset upright head position as the reference.
  • the upright head position is the position when the head is erect and not skewed.
  • the head pose includes the relative position when the head turns left or right, the relative position when the head tilts left or right, and the relative position when the head is raised or lowered.
  • a plurality of different perspectives are preset in the server, and a preset mapping relationship between the head pose and the perspective in the virtual reality scene is set.
  • the server may acquire a current head posture after detecting a change in the head posture, and map the current head posture to a current perspective in the virtual reality scene according to the preset mapping relationship.
  • the server can acquire the current head posture sent by the virtual reality device after detecting a change in the head posture.
  • the virtual reality device may detect the head posture, and when detecting that the head posture changes, send the current head posture after the head posture change to the server, and the server directly acquires the current head posture sent by the virtual reality device.
  • the server may also obtain the current head pose after detecting a change in the head pose from other devices.
  • Determining the current view from the current head pose and the preset mapping relationship between head poses and views is fast and accurate. A user usually watches the virtual reality video through a head mounted display that displays the virtual reality scene picture, so a change in head pose is one of the main causes of a view switch, and the current view determined from the current head pose after the switch is therefore accurate. In addition, determining the current view through the mapping relationship between head poses and views avoids complicated calculation, further shortening the display delay when the view is switched.
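  • For illustration, a minimal sketch of such a lookup, assuming the preset mapping simply quantises head yaw into a number of equal views; the patent leaves the concrete mapping open.

```python
NUM_VIEWS = 6  # assumed preset total number of views


def view_for_head_pose(yaw_degrees: float) -> int:
    """Map a head yaw angle to one of NUM_VIEWS equal views of a 360° panorama."""
    span = 360.0 / NUM_VIEWS
    return int((yaw_degrees % 360.0) // span)


print(view_for_head_pose(75.0))   # -> 1 (the 60°-120° view)
print(view_for_head_pose(350.0))  # -> 5 (the 300°-360° view)
```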
  • In an embodiment, step S204 includes: acquiring the switching time point of the view switch; searching, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and taking the matching play time node as the current play time node.
  • the switching time point is a time point when the viewing angle is switched.
  • the server may acquire the switching time point of the view switch sent by the virtual reality device. It can be understood that the server can also monitor the virtual reality device in real time and, when a view switch is detected on the virtual reality device, acquire the switching time point of the view switch.
  • each key frame in the key frame sequence has a corresponding play time node, and each video frame in the mixed video frame sequence also has a corresponding play time node.
  • the server may search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for a play time node matching the switching time point.
  • the server can use the matching play time node as the current play time node.
  • In an embodiment, the server may use the play time node closest to the switching time point, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, as the play time node matching the switching time point.
  • the play time node closest to the switching time point may be earlier than, later than, or equal to the switching time point.
  • For example, suppose the play time nodes corresponding to the key frames in the key frame sequence of the current view are 20 ms (milliseconds), 21 ms, and 22 ms of playback. If the switching time point is at 20.4 ms of playback, the play time node closest to it is the 20 ms node, which is earlier than the switching time point. If the switching time point is at 20.8 ms, the closest play time node is the 21 ms node, which is later than the switching time point. If the switching time point is at 22 ms, the closest play time node is the 22 ms node, which is equal to the switching time point.
  • In an embodiment, the server may instead filter out, from the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, the play time nodes later than the switching time point, and select from them the play time node closest to the switching time point as the play time node matching the switching time point.
  • For example, suppose the play time nodes corresponding to the key frames in the key frame sequence of the current view are 20 ms (milliseconds), 21 ms, and 22 ms of playback, and the switching time point is at 20.4 ms. The play time nodes later than 20.4 ms are the 21 ms and 22 ms nodes, and the one closest to the 20.4 ms switching time point, the 21 ms node, is selected.
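  • The two matching strategies can be written down directly; this sketch (with assumed names) reproduces the 20/21/22 ms examples above.

```python
def closest_node(nodes, switch_point):
    """Strategy 1: the node nearest the switch point, which may be earlier
    than, later than, or equal to it."""
    return min(nodes, key=lambda n: abs(n - switch_point))


def next_later_node(nodes, switch_point):
    """Strategy 2: consider only nodes later than the switch point and
    take the earliest of them."""
    later = [n for n in nodes if n > switch_point]
    return min(later) if later else None


nodes = [20, 21, 22]                  # play time nodes in ms
print(closest_node(nodes, 20.4))      # 20 (earlier than the switch point)
print(closest_node(nodes, 20.8))      # 21 (later than the switch point)
print(closest_node(nodes, 22.0))      # 22 (equal to the switch point)
print(next_later_node(nodes, 20.4))   # 21
```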
  • By taking the play time node matching the switching time point of the view switch as the current play time node, and then determining and outputting the key frame and the video frames in the mixed video frame sequence of the switched current view according to the current play time node, the output key frame and video frames are better correlated in time with the video image that was playing under the original view when the view switched. The virtual reality video images are thus closely joined across the view switch, which ensures the quality of the virtual reality video playback.
  • In an embodiment, the method further includes: acquiring the three-dimensional virtual reality video in the virtual reality scene; acquiring the different perspectives corresponding to the three-dimensional virtual reality video; and, for each perspective, generating a corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
  • the three-dimensional virtual reality video is a video showing a video image in the form of a three-dimensional virtual reality scene.
  • the different perspectives corresponding to the three-dimensional virtual reality video are different perspectives into which the total field of view of the three-dimensional virtual reality video is to be divided.
  • the server may directly acquire different perspectives corresponding to the three-dimensional virtual reality video set in advance.
  • the server may also acquire a preset total number of views, and divide the panoramic view of the three-dimensional virtual reality video into different perspectives according to the total number of views.
  • the preset total number of views is the total number of preset perspectives to be divided.
  • the server may divide the panoramic view of the three-dimensional virtual reality video into different perspectives that satisfy the total number of views. For example, if the preset total number of views is 6, the server can divide the panoramic view of the 3D virtual reality video into 6 different perspectives.
  • In an embodiment, the server may divide the panoramic view of the three-dimensional virtual reality video into equal parts according to the preset total number of views, obtaining different views of equal or approximately equal range.
  • the server may also divide the panoramic view of the three-dimensional virtual reality video according to the primary and secondary regions of the field of view, into the preset total number of views, wherein a primary region of the panoramic view may be divided into larger views and relatively secondary regions into smaller views. A sketch of the equal-split variant follows.
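  • A minimal sketch of the equal-split division, assuming the panorama is parameterised by yaw in degrees; the primary/secondary weighting variant would simply use unequal spans.

```python
def divide_panorama(total_fov_degrees: float, total_views: int):
    """Split a panoramic field of view into equal (start, end) yaw ranges."""
    span = total_fov_degrees / total_views
    return [(i * span, (i + 1) * span) for i in range(total_views)]


# A 360° panorama split into the 6 views of the example in the text:
for view_id, (start, end) in enumerate(divide_panorama(360.0, 6)):
    print(f"view {view_id}: {start:.0f}° to {end:.0f}°")
```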
  • each of the divided views corresponds to a virtual reality video image sequence of a partial scene in the three-dimensional virtual reality video.
  • the different perspectives correspond to different virtual reality video image sequences of different local scenes in the three-dimensional virtual reality video.
  • the server may determine a virtual reality video image sequence of the local scene corresponding to each perspective in the three-dimensional virtual reality video, and generate a video frame sequence representing the virtual reality video image sequence of the local scene.
  • the server may generate, for the virtual reality video image sequence of the local scene corresponding to each view, a corresponding key frame sequence and mixed video frame sequence. That is, the virtual reality video image sequence of the local scene corresponding to the same perspective is represented in two forms: a key frame sequence and a mixed video frame sequence. It can be understood that decoding the key frame sequence generates the virtual reality video image sequence of the corresponding local scene, and decoding the mixed video frame sequence also generates the virtual reality video image sequence of the corresponding local scene.
  • In this way, a corresponding key frame sequence and mixed video frame sequence are generated for each view.
  • When the view switches, the key frame under the current play time node corresponding to the switched current view is obtained from the key frame sequence and output; then, in the mixed video frame sequence corresponding to the current view, video frames are searched for in sequence and output. The video frames output after the key frame can be decoded and played with reference to the key frame, so normal playback of the virtual reality video image under the new view after the switch is achieved quickly, without waiting out the virtual reality video image from before the view change, which shortens the display delay when the angle of view is switched.
  • a video processing method based on a virtual reality scene is provided.
  • the method is applied to the virtual reality device in FIG. 1 for illustration.
  • the method specifically includes the following steps:
  • a mixed video frame sequence is a compressed sequence of mixed video frames consisting of key frames and inter-frame predicted frames.
  • An inter prediction frame is a video frame that is encoded with reference to other video frames using correlation between video image frames.
  • the inter prediction frame includes a P frame (predictive-coded picture) and/or a B frame (bidirectionally predicted picture). It can be understood that the mixed video frame sequence is the sequence of video frames used when playing the virtual reality video under normal conditions. Each video frame in the mixed video frame sequence has a corresponding play time node.
  • the virtual reality device may acquire a mixed video frame sequence corresponding to the original view from the server, and play the mixed video frame sequence in the virtual reality scene to present a continuous virtual reality video image.
  • the sequence of mixed video frames is generated frame by frame from live video frames.
  • the virtual reality device may detect whether the view angle is switched and, when it is, generate and output a video frame acquisition request.
  • In an embodiment, the virtual reality device may acquire the current view after the switch when the view is switched, generate a video frame acquisition request including the current view, and output it.
  • the virtual reality device may acquire a current head pose after the view switching, generate a video frame acquisition request including the current head pose, and output.
  • In an embodiment, the virtual reality device may further acquire the switching time point of the view switch and generate a video frame acquisition request including the current view and the switching time point of the view switch, or a video frame acquisition request including the current head pose and the switching time point of the view switch.
  • the virtual reality device can send the generated video frame acquisition request to the server.
  • the virtual reality device can receive a key frame returned by the server in response to the video frame acquisition request.
  • the server may determine a current perspective after the perspective switching according to the current head pose in the video frame acquisition request in response to the video frame acquisition request.
  • the server may determine the current play time node, find the key frame corresponding to the current view after the view switch and to the current play time node, and return it.
  • the virtual reality device can receive key frames returned by the server.
  • the server may determine the current play time node based on a switching time point of the view switching included in the video frame acquisition request.
  • the server can also determine the current play time node according to the local current system time.
  • In an embodiment, the server may search the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view for a play time node matching the current local system time, and use it as the current play time node.
  • the server may acquire a sequence of key frames corresponding to the current view, from which the key frames corresponding to the current play time node are searched and returned.
  • the key frame sequence is a sequence of video frames composed of multiple key frames.
  • When the key frames in the key frame sequence are output frame by frame, the virtual reality video image sequence corresponding to the corresponding view may be generated.
  • Each key frame in the key frame sequence has a corresponding play time node.
  • the key frame is played in place of the mixed video frame sequence corresponding to the original view.
  • Playing the key frame in place of the mixed video frame sequence corresponding to the original view means stopping playback of the mixed video frame sequence corresponding to the original view and playing the key frame instead.
  • S510: Receive and play video frames in the mixed video frame sequence corresponding to the current view; these video frames are found in the mixed video frame sequence corresponding to the current view by searching in sequence from the video frame following the one that corresponds to the current play time node.
  • In an embodiment, in response to the video frame acquisition request, after determining the current view and the current play time node, the server may further determine the mixed video frame sequence corresponding to the current view, search it for video frames in sequence starting from the video frame following the one corresponding to the current play time node, and return the found video frames to the virtual reality device. After the virtual reality device receives the video frames of the returned mixed video frame sequence, it plays them in order, frame by frame.
  • the key frame sequence and the mixed video frame sequence corresponding to the current view are different representations of the virtual reality video image sequence corresponding to the current view. That is, the virtual reality video image sequence corresponding to the current view may be represented by a key frame sequence, and each key frame in the key frame sequence is outputted frame by frame, and a virtual reality video image sequence corresponding to the current view may be generated.
  • the virtual reality video image sequence corresponding to the current view may also be represented by a mixed video frame sequence corresponding to the current view, and each video frame in the mixed video frame sequence is output frame by frame, and the virtual reality video corresponding to the current view may be generated. Image sequence.
  • a key frame and a video frame in the mixed video frame sequence may correspond to the same play time node.
  • When they do, the virtual reality video images they generate are the same.
  • "The same" here means virtual reality video images with the same picture content; it does not exclude differences in quality and the like between the virtual reality video image output from the key frame and the one output from the video frame in the mixed video frame sequence.
  • Therefore, when the key frame corresponding to the current view and the current play time node, and then the video frames in the mixed video frame sequence, are output, the generated virtual reality video images are consistent. After the virtual reality device plays the key frame in place of the mixed video frame sequence corresponding to the original view and then plays the found video frames in sequence, the subsequently output video frames can rely on the key frame to achieve normal playback of the virtual reality video image sequence under the current perspective.
  • the sequence of mixed video frames is generated frame by frame from live video frames.
  • In the video processing method based on the virtual reality scene, the key frame corresponding to the current play time node under the switched current view can be played directly in place of the mixed video frame sequence corresponding to the original view; after the key frame is played, the video frames found in sequence from the video frame following the one corresponding to the current play time node in the mixed video frame sequence of the current view are played. Normal playback of the virtual reality video image under the new view after the switch is thus achieved quickly, without waiting out the virtual reality video image from before the view change, which shortens the display delay when the angle of view is switched.
  • In an embodiment, step S504 includes: when a change in the head pose is detected, determining that the view has switched; acquiring the changed current head pose; and generating and outputting a video frame acquisition request according to the current head pose, the current head pose being used to determine the current view.
  • the head pose is the relative position of the head, taking a preset upright head position as the reference.
  • the upright head position is the position when the head is erect and not skewed.
  • the head pose includes the relative position when the head turns left or right, the relative position when the head tilts left or right, and the relative position when the head is raised or lowered.
  • the virtual reality device can detect whether the head gesture has changed.
  • the virtual reality device can monitor the gyro sensor; acquire the current head pose by the gyro sensor; compare the current head pose with the previously acquired head pose; and determine the head pose according to the comparison result Whether it has changed.
  • When the comparison result shows that the difference between the current head pose and the previously acquired head pose exceeds a preset range, it is determined that the head pose has changed. A sketch of this check follows.
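  • A minimal sketch of the change check, assuming the pose is a (yaw, pitch, roll) tuple in degrees and an arbitrary threshold as the "preset range"; both are assumptions, not values from the patent.

```python
THRESHOLD_DEGREES = 5.0  # assumed value for the preset range


def pose_changed(current_pose, previous_pose) -> bool:
    """Report a change when any axis differs by more than the preset range."""
    return any(abs(c - p) > THRESHOLD_DEGREES
               for c, p in zip(current_pose, previous_pose))


if pose_changed((12.0, 0.0, 0.0), (3.0, 0.0, 0.0)):
    print("head pose changed: view switch, generate a video frame acquisition request")
```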
  • When the virtual reality device detects a change in the head pose, it determines that the view has switched. It can be understood that a head pose change can cause a view switch, so the current head pose after the change can be used to determine the current view.
  • the virtual reality device may acquire the changed current head pose, generate a video frame acquisition request according to the current head pose for determining the current perspective, and output.
  • the current head pose is included in the video frame acquisition request generated according to the current head pose for determining the current perspective.
  • the virtual reality device may map the current head pose to the current perspective in the virtual reality scene according to a preset mapping relationship between the head gesture and the perspective in the virtual reality scene; generate a video frame acquisition request according to the current perspective. Output.
  • a plurality of different perspectives are preset in the virtual reality device, and a preset mapping relationship between the head pose and the perspective in the virtual reality scene is set.
  • the virtual reality device may acquire the current head posture after detecting the change of the head posture, and map the current head posture to the current perspective in the virtual reality scene according to the preset mapping relationship.
  • the virtual reality device may generate a video frame acquisition request according to the current perspective and output.
  • the generated video frame acquisition request includes a current perspective.
  • In this embodiment, the virtual reality device determines that the view has switched by detecting a change in the head pose. Since a change in head pose is one of the main causes of a view switch, determining the switch from the head pose change is relatively accurate. In addition, generating and outputting the video frame acquisition request according to the current head pose after the change makes the switched current view determined from it more accurate.
  • a virtual reality scene based video processing system 600 is provided, the system including a virtual reality device 602 and a content distribution network server 604, wherein:
  • the virtual reality device 602 is configured to acquire the original view in the virtual reality scenario, to acquire the mixed video frame sequence corresponding to the original view from the content distribution network server 604 and play it, and to generate a video frame acquisition request and send it to the content distribution network server 604.
  • the content distribution network server 604 is configured to, in response to the video frame acquisition request, acquire the current view after the view switch and the current play time node, acquire the key frame corresponding to the current view and to the current play time node, and send the key frame to the virtual reality device 602.
  • the Content Delivery Network (CDN) server redirects a user's request to the service node nearest to the user, according to the network traffic, the connection and load status of each node, the distance to the user, and the response time.
  • the content distribution network server 604 is further configured to: in the mixed video frame sequence corresponding to the current view, search for the video frame sequentially from the next video frame of the video frame corresponding to the current play time node; and send the found video frame to the virtual Reality device 602.
  • the virtual reality device 602 is further configured to play the key frame instead of the mixed video frame sequence corresponding to the original view; and sequentially play the subsequently received video frames.
  • In an embodiment, the virtual reality device 602 is further configured to: when a change in the head pose is detected, determine that the view has switched; acquire the changed current head pose; and generate a video frame acquisition request according to the current head pose, used to determine the current view, and send it to the content distribution network server 604.
  • In an embodiment, the virtual reality device 602 is further configured to map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene, and to generate a video frame acquisition request according to the current view and send it to the content distribution network server 604.
  • In an embodiment, the content distribution network server 604 is further configured to acquire the current head pose after the head pose changes, and to map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
  • the content distribution network server 604 is further configured to acquire a key frame sequence corresponding to the current view angle; and in the key frame sequence, find a key frame corresponding to the current play time node.
  • In an embodiment, the content distribution network server 604 is further configured to acquire the switching time point of the view switch; to search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and to take the matching play time node as the current play time node.
  • the system 600 further includes:
  • the push stream server 606 is configured to acquire the three-dimensional virtual reality video in the virtual reality scene; to acquire the different views corresponding to the three-dimensional virtual reality video; to generate, for each view, a corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video; and to push the key frame sequence and mixed video frame sequence generated for each view to the content distribution network server 604.
  • the push stream server 606 is further configured to acquire a preset total number of views; and divide the panoramic view of the three-dimensional virtual reality video into different views according to the total number of views.
  • the system 600 further includes a streaming media receiving management server 605.
  • the push stream server 606 is further configured to push the corresponding key frame sequence and the mixed video frame sequence generated corresponding to each view to the streaming media receiving management server 605.
  • the streaming media receiving management server 605 is configured to send the corresponding key frame sequence and the mixed video frame sequence generated corresponding to each view to the content distribution network server 604, and manage the transmission state of the key frame sequence and the mixed video frame sequence.
  • the transmission state of the key frame sequence and the mixed video frame sequence is status information describing whether the key frame sequence and the mixed video frame sequence are transmitted successfully or fail during transmission.
  • In an embodiment, the transmission state of the key frame sequence and the mixed video frame sequence includes at least one of states such as success, packet loss, and out-of-order delivery.
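  • For illustration, a sketch of how the streaming media receiving management server might record these states; the enum values follow the text, everything else (names, keying) is assumed.

```python
from enum import Enum


class TxState(Enum):
    SUCCESS = "success"
    PACKET_LOSS = "packet loss"
    OUT_OF_ORDER = "out of order"


# {(view_id, sequence_kind): TxState}
transmission_state = {}


def record_state(view_id: int, sequence_kind: str, state: TxState) -> None:
    transmission_state[(view_id, sequence_kind)] = state


record_state(0, "key_frame_sequence", TxState.SUCCESS)
record_state(0, "mixed_frame_sequence", TxState.PACKET_LOSS)
```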
  • the content distribution web server 604 can be a content distribution web server for live video streaming.
  • the content distribution network server 604 is further configured to store the received corresponding key frame sequence and the mixed video frame sequence generated corresponding to each view.
  • In an embodiment, the server in FIG. 1 may be a server cluster including a push stream server, a streaming media receiving management server, and a Content Delivery Network (CDN) server.
  • In the video processing system based on the virtual reality scene, the key frame corresponding to the current play time node under the switched current view can be played directly in place of the mixed video frame sequence corresponding to the original view; after the key frame is played, the video frames found in sequence from the video frame following the one corresponding to the current play time node are played. Normal playback of the virtual reality video image under the new view after the switch is thus achieved quickly, without waiting out the virtual reality video image from before the view change, which shortens the display delay when the angle of view is switched.
  • a timing diagram of a video processing method based on a virtual reality scene is provided.
  • the timing diagram specifically includes the following steps:
  • the push stream server obtains the three-dimensional virtual reality video of the virtual reality scene and the preset total number of views, and divides the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
  • for each view, the push stream server generates the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
  • the push stream server pushes the corresponding key frame sequence and the mixed video frame sequence generated corresponding to each view to the streaming media receiving management server frame by frame.
  • the streaming media receiving management server respectively transmits the corresponding key frame sequence and the mixed video frame sequence generated corresponding to each view to the content distribution network server frame by frame.
  • the streaming media receiving management server manages the transmission status of the key frame sequence and the mixed video frame sequence.
  • the virtual reality device acquires an original perspective in the virtual reality scenario and initiates an access request to the content distribution network server.
  • the content distribution network server transmits a sequence of mixed video frames corresponding to the original perspective to the virtual reality device.
  • the virtual reality device plays the mixed video frame sequence corresponding to the original view in the virtual reality scene.
  • when the virtual reality device detects a change in the head pose, it determines that the view has switched, and acquires the changed current head pose.
  • a video frame acquisition request is generated according to the current head pose, which is used to determine the current view, and sent to the content distribution network server.
  • the content distribution network server maps the current head pose in the video frame acquisition request to the current perspective in the virtual reality scene.
  • the content distribution network server acquires the switching time point of the view switch; among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, it searches for the play time node matching the switching time point and obtains the current play time node.
  • the content distribution network server acquires a key frame sequence corresponding to the current view angle; and searches for a key frame corresponding to the current play time node in the key frame sequence.
  • the content distribution network server returns the key frames to the virtual reality device.
  • the virtual reality device plays the key frame in place of the mixed video frame sequence corresponding to the original view.
  • the content distribution network server sequentially searches for the video frame from the next video frame of the video frame corresponding to the current play time node in the mixed video frame sequence corresponding to the current view.
  • the content distribution network server sequentially returns the found video frames to the virtual reality device.
  • the virtual reality device sequentially plays subsequent received video frames after playing the key frame.
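The switch-handling portion of the sequence above (mapping the head pose, matching the time node, returning the key frame, then the follow-on frames) can be summarized in a short sketch. This is only an illustration of the lookup logic, not code from the application: the `ViewStream` type, the `pose_to_view` callback, and the list-based frame storage are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class ViewStream:
    play_time_nodes: List[float]   # node i is the play time of frame i
    key_frames: List[bytes]        # key-frame-only sequence, aligned with nodes
    mixed_frames: List[bytes]      # mixed I/P sequence, aligned with nodes

def handle_view_switch(head_pose: float, switch_time: float,
                       streams: Dict[int, ViewStream],
                       pose_to_view: Callable[[float], int]) -> Tuple[bytes, List[bytes]]:
    """Map the pose to the current view, match the time node, and return the
    key frame plus the follow-on mixed-sequence frames."""
    view = pose_to_view(head_pose)                        # map pose -> current view
    nodes = streams[view].play_time_nodes
    i = min(range(len(nodes)),                            # node matching the switch time
            key=lambda k: abs(nodes[k] - switch_time))
    key_frame = streams[view].key_frames[i]               # key frame for that node
    follow_on = streams[view].mixed_frames[i + 1:]        # frames from the next node on
    return key_frame, follow_on
```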
  • a computer device is provided, the internal structure of which may be as shown in FIG. 13.
  • the computer device can be a server.
  • the computer device includes a video processing device based on a virtual reality scene, and the video processing device based on the virtual reality scene includes various modules, each of which may be implemented in whole or in part by software, hardware, or a combination thereof.
  • a virtual reality scene-based video processing device 1000 is provided.
  • the device 1000 includes a current view obtaining module 1004, a play time node determining module 1006, a video frame output module 1008, and a video frame search module 1010, wherein:
  • the current view obtaining module 1004 is configured to obtain a current view angle after the view switch is performed in the virtual reality scene.
  • the play time node determining module 1006 is configured to determine a current play time node.
  • the video frame output module 1008 is configured to output a key frame corresponding to the current viewing angle and corresponding to the current playing time node.
  • the video frame search module 1010 is configured to sequentially search for a video frame from the next video frame of the video frame corresponding to the current play time node in the mixed video frame sequence corresponding to the current view.
  • the video frame output module 1008 is also used to output the found video frame.
  • the current view obtaining module 1004 is further configured to acquire the current head pose after the head pose changes, and to map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
  • the video frame output module 1008 is further configured to acquire a key frame sequence corresponding to the current view angle; and in the key frame sequence, search for a key frame corresponding to the current play time node.
  • the play time node determining module 1006 is further configured to acquire the switching time point of the view switch; search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and take the matching play time node as the current play time node.
  • the apparatus 1000 further includes:
  • the view dividing module 1002 is configured to acquire a three-dimensional virtual reality video in a virtual reality scene, and acquire different perspectives corresponding to the three-dimensional virtual reality video.
  • the video frame sequence generating module 1003 is configured to generate a corresponding key frame sequence and a mixed video frame sequence according to the three-dimensional virtual reality video corresponding to each view.
  • the view dividing module 1002 is further configured to obtain a preset total number of views; and divide the panoramic view of the three-dimensional virtual reality video into different views according to the total number of views.
  • the mixed video frame sequence is generated frame by frame from live video frames.
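As a minimal sketch of this frame-by-frame generation, assume a view's two sequences are kept as three aligned lists (play time nodes, key-frame-only copies, mixed I/P copies). The byte tags, the 25 fps rate, and the GOP length are placeholders standing in for a real encoder, not details from the application.

```python
# Illustrative only: b"I"/b"P" tags stand in for real intra/inter encoding,
# and a dict of lists stands in for the stored sequences of one view.
def ingest_live_frame(raw_frame: bytes, stream: dict, idx: int,
                      fps: float = 25.0, gop: int = 30) -> None:
    """Append one live frame to both sequences of a single view."""
    stream["key_frames"].append(b"I" + raw_frame)            # key frame sequence copy
    is_gop_start = idx % gop == 0
    stream["mixed_frames"].append((b"I" if is_gop_start else b"P") + raw_frame)
    stream["play_time_nodes"].append(idx / fps)              # this frame's play time node

view_stream = {"key_frames": [], "mixed_frames": [], "play_time_nodes": []}
for i, frame in enumerate([b"f0", b"f1", b"f2"]):            # stand-in live frames
    ingest_live_frame(frame, view_stream, i)
```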
  • a computer device is provided, the internal structure of which may be as shown in FIG. 14.
  • the computer device can be a virtual reality device.
  • the computer device includes a video processing device based on a virtual reality scene, and the virtual reality scene-based video processing device includes various modules, each of which may be implemented in whole or in part by software, hardware, or a combination thereof.
  • a virtual reality scene-based video processing apparatus 1200 is provided.
  • the apparatus 1200 includes: a playing module 1202, a video frame requesting module 1204, and a video frame receiving module 1206, where:
  • the playing module 1202 is configured to play a sequence of mixed video frames corresponding to the original view in the virtual reality scene.
  • the video frame requesting module 1204 is configured to generate a video frame acquisition request and output when the view angle is switched.
  • the video frame receiving module 1206 is configured to receive a key frame returned in response to the video frame acquisition request; the key frame corresponds to a current perspective after the view switching, and corresponds to a current play time node.
  • the playing module 1202 is further configured to play the key frame instead of the mixed video frame sequence corresponding to the original view.
  • the video frame receiving module 1206 is further configured to receive the video frames in the mixed video frame sequence corresponding to the current view and notify the playing module 1202 to play the received video frames in sequence; the video frames are found sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
  • the video frame requesting module 1204 is further configured to: when a head pose change is detected, determine that the view has switched; acquire the changed current head pose; and generate and output a video frame acquisition request according to the current head pose used to determine the current view.
  • the video frame requesting module 1204 is further configured to map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views, and to generate and output the video frame acquisition request according to the current view.
  • the key frame is a key frame corresponding to the current play time node in the key frame sequence corresponding to the current view.
  • the sequence of mixed video frames is generated frame by frame from live video frames.
  • Figure 13 is a block diagram showing the internal structure of a computer device in an embodiment.
  • the computer device can be the server shown in Figure 1, which includes a processor, memory and network interface connected by a system bus.
  • the memory comprises a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device can store operating system and computer readable instructions.
  • the computer readable instructions when executed, may cause the processor to perform a video processing method based on a virtual reality scene.
  • the processor of the computer device is used to provide computing and control capabilities to support the operation of the entire computer device.
  • the internal memory can store computer readable instructions that, when executed by the processor, cause the processor to perform a video processing method based on a virtual reality scene.
  • the network interface of the computer device is used for network communication.
  • FIG. 13 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • the virtual reality scene-based video processing apparatus may be implemented in the form of computer readable instructions executable on a computer device as shown in FIG. 13.
  • the non-volatile storage medium may store the program modules constituting the virtual reality scene-based video processing apparatus, such as the current view obtaining module 1004, the play time node determining module 1006, the video frame output module 1008, and the video frame search module 1010 shown in FIG. 10.
  • the computer readable instructions constituted by the program modules cause the computer device to perform the steps of the virtual reality scene-based video processing methods of the embodiments of the present application described in this specification. For example, the computer device may obtain the current view after the view switch in the virtual reality scene through the current view obtaining module 1004 of the video processing apparatus 1000 shown in FIG. 10, and determine the current play time node through the play time node determining module 1006.
  • the computer device may output the key frame corresponding to the current view and to the current play time node through the video frame output module 1008, and, through the video frame search module 1010, search for video frames sequentially in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
  • the computer device can output the found video frame through the video frame output module 1008.
  • Figure 14 is a block diagram showing the internal structure of a computer device in an embodiment.
  • the computer device can be the virtual reality device shown in Figure 1, including a processor, memory, network interface, display screen, and input device connected by a system bus.
  • the memory comprises a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device can store operating system and computer readable instructions.
  • the computer readable instructions when executed, may cause the processor to perform a video processing method based on a virtual reality scene.
  • the processor of the computer device is used to provide computing and control capabilities to support the operation of the entire computer device.
  • the internal memory can store computer readable instructions that, when executed by the processor, cause the processor to perform a video processing method based on a virtual reality scene.
  • the network interface of the computer device is used for network communication.
  • the display of the computer device can be a liquid crystal display or an electronic ink display.
  • the input device of the computer device may be a touch layer covered on the display screen, a button, a trackball or a touchpad provided on the terminal casing, or an external keyboard, a touchpad or a mouse.
  • the computer device may be a personal computer, a mobile terminal, or an in-vehicle device, and the mobile terminal includes at least one of a mobile phone, a tablet, a personal digital assistant, or a wearable device.
  • FIG. 14 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • the virtual reality scene-based video processing apparatus may be implemented in the form of computer readable instructions executable on a computer device as shown in FIG. 14.
  • the non-volatile storage medium may store the program modules constituting the virtual reality scene-based video processing apparatus, such as the playing module 1202, the video frame requesting module 1204, and the video frame receiving module 1206 shown in FIG. 12.
  • the computer readable instructions constituted by the program modules cause the computer device to perform the steps of the virtual reality scene-based video processing methods of the embodiments of the present application described in this specification. For example, the computer device may play, in the virtual reality scene, the mixed video frame sequence corresponding to the original view through the playing module 1202 of the video processing apparatus 1200 shown in FIG. 12, and generate and output a video frame acquisition request through the video frame requesting module 1204 upon a view switch.
  • the computer device may receive, through the video frame receiving module 1206, the key frame returned in response to the video frame acquisition request; the key frame corresponds to the current view after the view switch and to the current play time node.
  • the computer device can play the key frame in place of the mixed video frame sequence corresponding to the original view through the playing module 1202.
  • the computer device may receive, through the video frame receiving module 1206, the video frames in the mixed video frame sequence corresponding to the current view and notify the playing module 1202 to play them in sequence; the video frames are found sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
  • a computer device is provided, which may be a server.
  • the computer device includes a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring the current view after a view switch in a virtual reality scene; determining the current play time node; outputting the key frame corresponding to the current view and to the current play time node; searching, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and outputting the found video frames.
  • acquiring the current view after the view switch in the virtual reality scene includes: acquiring the current head pose after the head pose changes; and mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
  • outputting the key frame corresponding to the current view and to the current play time node includes: acquiring the key frame sequence corresponding to the current view; and searching, in the key frame sequence, for the key frame corresponding to the current play time node.
  • determining the current play time node includes: acquiring the switching time point of the view switch; searching, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and taking the matching play time node as the current play time node.
  • when the computer readable instructions are executed by the processor, the processor is further caused to perform the steps of: acquiring the three-dimensional virtual reality video of the virtual reality scene; acquiring the different views corresponding to the three-dimensional virtual reality video; and generating, for each view, the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
  • acquiring the different views corresponding to the three-dimensional virtual reality video includes: acquiring the preset total number of views; and dividing the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
  • the mixed video frame sequence is generated frame by frame from live video frames.
  • a computer device is provided, which may be a virtual reality device.
  • the computer device includes a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: playing, in a virtual reality scene, the mixed video frame sequence corresponding to the original view; generating and outputting a video frame acquisition request upon a view switch; receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node; playing the key frame in place of the mixed video frame sequence corresponding to the original view; and receiving and playing the video frames in the mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in that sequence, from the next video frame after the video frame corresponding to the current play time node.
  • generating and outputting the video frame acquisition request includes: when a head pose change is detected, determining that the view has switched; acquiring the changed current head pose; and generating and outputting a video frame acquisition request according to the current head pose used to determine the current view.
  • generating and outputting the video frame acquisition request according to the current head pose includes: mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views; and generating and outputting the video frame acquisition request according to the current view.
  • the key frame is a key frame corresponding to the current play time node in the key frame sequence corresponding to the current view.
  • the sequence of mixed video frames is generated frame by frame from live video frames.
  • one or more storage media storing computer readable instructions are provided; when executed by one or more processors, the computer readable instructions cause the one or more processors to perform the following steps: acquiring the current view after a view switch in a virtual reality scene; determining the current play time node; outputting the key frame corresponding to the current view and to the current play time node; searching, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and outputting the found video frames.
  • acquiring the current view after the view switch in the virtual reality scene includes: acquiring the current head pose after the head pose changes; and mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
  • outputting the key frame corresponding to the current view and to the current play time node includes: acquiring the key frame sequence corresponding to the current view; and searching, in the key frame sequence, for the key frame corresponding to the current play time node.
  • determining the current play time node includes: acquiring the switching time point of the view switch; searching, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and taking the matching play time node as the current play time node.
  • when the computer readable instructions are executed by the processor, the processor is further caused to perform the steps of: acquiring the three-dimensional virtual reality video of the virtual reality scene; acquiring the different views corresponding to the three-dimensional virtual reality video; and generating, for each view, the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
  • acquiring the different views corresponding to the three-dimensional virtual reality video includes: acquiring the preset total number of views; and dividing the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
  • the mixed video frame sequence is generated frame by frame from live video frames.
  • one or more storage media storing computer readable instructions are provided; when executed by one or more processors, the computer readable instructions cause the one or more processors to perform the following steps: playing, in a virtual reality scene, the mixed video frame sequence corresponding to the original view; generating and outputting a video frame acquisition request upon a view switch; receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node; playing the key frame in place of the mixed video frame sequence corresponding to the original view; and receiving and playing the video frames in the mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in that sequence, from the next video frame after the video frame corresponding to the current play time node.
  • the step of generating and outputting the video frame acquisition request includes: when a head pose change is detected, determining that the view has switched; acquiring the changed current head pose; and generating and outputting a video frame acquisition request according to the current head pose used to determine the current view.
  • the step of generating and outputting the video frame acquisition request according to the current head pose includes: mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views; and generating and outputting the video frame acquisition request according to the current view.
  • the key frame is a key frame corresponding to the current play time node in the key frame sequence corresponding to the current view.
  • the sequence of mixed video frames is generated frame by frame from live video frames.
  • the steps in the embodiments of the present application are not necessarily performed in the order indicated by the step numbers. Unless explicitly stated herein, the execution of these steps is not strictly ordered, and they may be performed in other orders. Moreover, at least some of the steps in the embodiments may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is also not necessarily sequential, and they may be performed in turns or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
  • Non-volatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Processing Or Creating Images (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A control method in a virtual reality environment, comprising: displaying, in the virtual reality environment, a three-dimensional interactive object comprising a movable part and a fixed part; monitoring movement of a virtual operating body in the virtual reality environment; after the virtual operating body moves into contact with the movable part, controlling the movable part to move relative to the fixed part and to follow the virtual operating body; and outputting a control instruction corresponding to the three-dimensional interactive object according to the relative position of the movable part with respect to the fixed part.

Description

Virtual reality scene-based video processing method, server, virtual reality device, and system
This application claims priority to Chinese Patent Application No. 2017109820014, filed with the China Patent Office on October 20, 2017 and entitled "Virtual reality scene-based video processing method, apparatus, and system", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a virtual reality scene-based video processing method, server, virtual reality device, and system.
Background
With the rapid development of science and technology, virtual reality (VR) live video streaming is gaining popularity thanks to the strong sense of participation it offers.
A user watching a live virtual reality video is not motionless; the user often moves to some extent, which changes the viewing angle. In conventional methods, after the user's view changes, the video image of the pre-switch view is still played for a period of time, and a clear video in the new view becomes visible only after that period. Conventional methods therefore usually take a while after a view switch before clear video in the new view can be seen, producing a noticeable display delay.
Summary
According to various embodiments of this application, a virtual reality scene-based video processing method, a server, a virtual reality device, storage media, and a system are provided.
A virtual reality scene-based video processing method includes:
a server acquiring the current view after a view switch in a virtual reality scene;
the server determining the current play time node;
the server outputting the key frame corresponding to the current view and to the current play time node;
the server searching, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and
the server outputting the found video frames.
A server includes a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
acquiring the current view after a view switch in a virtual reality scene;
determining the current play time node;
outputting the key frame corresponding to the current view and to the current play time node;
searching, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and
outputting the found video frames.
One or more storage media storing computer readable instructions are provided; when the computer readable instructions are executed by one or more processors, the one or more processors perform the following steps:
acquiring the current view after a view switch in a virtual reality scene;
determining the current play time node;
outputting the key frame corresponding to the current view and to the current play time node;
searching, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and
outputting the found video frames.
A virtual reality scene-based video processing method includes:
a virtual reality device playing, in a virtual reality scene, the mixed video frame sequence corresponding to the original view;
the virtual reality device generating and outputting a video frame acquisition request upon a view switch;
the virtual reality device receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node;
the virtual reality device playing the key frame in place of the mixed video frame sequence corresponding to the original view; and
the virtual reality device receiving and playing the video frames in the mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
A virtual reality device includes a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
playing, in a virtual reality scene, the mixed video frame sequence corresponding to the original view;
generating and outputting a video frame acquisition request upon a view switch;
receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node;
playing the key frame in place of the mixed video frame sequence corresponding to the original view; and
receiving and playing the video frames in the mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in that sequence, from the next video frame after the video frame corresponding to the current play time node.
One or more storage media storing computer readable instructions are provided; when the computer readable instructions are executed by one or more processors, the one or more processors perform the following steps:
playing, in a virtual reality scene, the mixed video frame sequence corresponding to the original view;
generating and outputting a video frame acquisition request upon a view switch;
receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node;
playing the key frame in place of the mixed video frame sequence corresponding to the original view; and
receiving and playing the video frames in the mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in that sequence, from the next video frame after the video frame corresponding to the current play time node.
A virtual reality scene-based video processing system includes a virtual reality device and a content distribution network server;
the virtual reality device is configured to acquire the original view in a virtual reality scene, acquire the mixed video frame sequence corresponding to the original view from the content distribution network server and play it, and, upon a view switch, generate a video frame acquisition request and send it to the content distribution network server;
the content distribution network server is configured to, in response to the video frame acquisition request, acquire the current view after the view switch and the current play time node, acquire the key frame corresponding to the current view and to the current play time node, and send the key frame to the virtual reality device;
the content distribution network server is further configured to search, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node, and to send the found video frames to the virtual reality device; and
the virtual reality device is further configured to play the key frame in place of the mixed video frame sequence corresponding to the original view and to play the subsequently received video frames in sequence.
Details of one or more embodiments of this application are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of this application will become more apparent from the specification, the drawings, and the claims.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the drawings in the following description show only some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a diagram of an application scenario of a virtual reality scene-based video processing method in an embodiment;
FIG. 2 is a schematic flowchart of a virtual reality scene-based video processing method in an embodiment;
FIG. 3A and FIG. 3B are schematic interface diagrams of virtual reality video images corresponding to a view in an embodiment;
FIG. 4 is a schematic diagram of a key frame sequence and a mixed video frame sequence in an embodiment;
FIG. 5 is a schematic flowchart of a virtual reality scene-based video processing method in another embodiment;
FIG. 6 is an architecture diagram of a virtual reality scene-based video processing system in an embodiment;
FIG. 7 is an architecture diagram of a virtual reality scene-based video processing system in another embodiment;
FIG. 8 is an architecture diagram of a virtual reality scene-based video processing system in yet another embodiment;
FIG. 9 is a sequence diagram of a virtual reality scene-based video processing method in an embodiment;
FIG. 10 is a structural block diagram of a virtual reality scene-based video processing apparatus in an embodiment;
FIG. 11 is a structural block diagram of a virtual reality scene-based video processing apparatus in another embodiment;
FIG. 12 is a structural block diagram of a virtual reality scene-based video processing apparatus in yet another embodiment;
FIG. 13 is a block diagram of a computer device in an embodiment; and
FIG. 14 is a block diagram of a computer device in another embodiment.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described below in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain this application and are not intended to limit it.
FIG. 1 is a diagram of an application scenario of a virtual reality scene-based video processing method in an embodiment. Referring to FIG. 1, the scenario includes a head mounted display 110, a virtual reality device 120, and a server 130 connected through a network. The head mounted display (HMD) 110 may be a head-worn display device capable of displaying virtual reality scene images; it may also be replaced by another device with that capability. The virtual reality device 120 is a device capable of implementing virtual reality scenes; it may be a desktop computer or a mobile terminal, where the mobile terminal may include at least one of a mobile phone, a tablet computer, a personal digital assistant, a wearable device, and the like. The server 130 may be implemented as an independent server or as a server cluster composed of multiple physical servers.
The virtual reality device 120 may play the mixed video frame sequence corresponding to an original view. Through the head mounted display 110, the user can watch the virtual reality scene images formed as the virtual reality device 120 plays that sequence. Upon a view switch, the virtual reality device 120 may generate a video frame acquisition request and send it to the server 130. The server 130 may acquire the current view after the view switch in the virtual reality scene and determine the current play time node. The server 130 may output to the virtual reality device 120 the key frame corresponding to the current view and to the current play time node. The server 130 may further search, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node, and send the found video frames to the virtual reality device 120. The virtual reality device 120 may play the key frame in place of the mixed video frame sequence corresponding to the original view and play the subsequently received video frames in sequence, thereby playing video in the switched current view. Through the head mounted display 110, the user can watch the virtual reality scene images of the switched current view formed as the virtual reality device 120 plays the key frame and then the subsequently received video frames in sequence.
It can be understood that in other embodiments the scenario may not include the head mounted display 110. For example, if the virtual reality device 120 itself can display virtual reality scene images, the head mounted display 110 can be dispensed with.
FIG. 2 is a schematic flowchart of a virtual reality scene-based video processing method in an embodiment. This embodiment is described mainly with the method applied to the server 130 in FIG. 1. Referring to FIG. 2, the method specifically includes the following steps:
S202: Acquire the current view after a view switch in a virtual reality scene.
A virtual reality scene is a computer-simulated three-dimensional virtual world that provides the user with simulations of senses such as sight, hearing, and touch, letting the user observe things in the three-dimensional space of the virtual world as if present in person. It can be understood that a virtual reality scene is three-dimensional.
In an embodiment, the virtual reality scene may be a live virtual reality game streaming scene.
A view (field of view) is the angle formed by the lines connecting the observation point to the edges of the visible region. The view characterizes the range of the visible region that can be seen from the observation point; different views see different ranges of the visible region.
Specifically, the server may divide the total field of view of the three-dimensional virtual reality video into multiple different views in advance. The total field of view is the entire field of view of the three-dimensional virtual reality video; it may be a 360-degree panoramic field of view, or a field of view of less than 360 degrees, for example 180 degrees.
It can be understood that each divided view corresponds to a virtual reality video image sequence of a local scene in the three-dimensional virtual reality video, that is, an image sequence presenting the local features of the virtual reality scene shown by the video. The image sequence of the local scene corresponding to a view is the image sequence presented within the visible region of that view; within that view, the observation point can see the virtual reality video image sequence of the corresponding local scene.
FIG. 3A and FIG. 3B are schematic interface diagrams of virtual reality video images corresponding to a view in an embodiment. FIG. 3A shows a three-dimensional virtual reality video image in the panoramic field of view. FIG. 3B shows a virtual reality video image in one view; the image presented in FIG. 3B is a local scene of the three-dimensional virtual reality video image.
Specifically, the server may directly receive, from the virtual reality device, the current view after the view switch in the virtual reality scene. The server may also derive the current view itself from view switch association information, that is, information associated with the view switch. In an embodiment, the view switch association information may include the head pose. It can be understood that a change in head pose can cause a view switch and is therefore associated with it, so the head pose belongs to the view switch association information.
S204: Determine the current play time node.
A play time node is a preset time node for playing a video frame. It can be understood that each video frame has a corresponding play time node; when a play time node is reached, the video frame corresponding to it can be output for playback.
It should be noted that one play time node may correspond to video frames in multiple different views; when the node is actually reached, the frame of one of those views corresponding to the node is output. For example, suppose there are views 1, 2, and 3, and each has a video frame corresponding to play time node t1. When t1 is reached, one of the three views is determined, and the video frame of that view corresponding to t1 is output.
The current play time node is the play time node to be played currently.
Specifically, the server may directly acquire the current play time node sent by the virtual reality device. The server may also acquire the switching time point of the view switch, that is, the time point at which the view switch occurs, and determine the current play time node from it.
In an embodiment, the server may also determine the current play time node from its local current system time. In an embodiment, the server may search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the local current system time and use it as the current play time node.
In an embodiment, the server may select, from the play time nodes already played, the one closest to the switching time point, and take the next play time node after the selected closest node as the current play time node.
S206: Output the key frame corresponding to the current view and to the current play time node.
A key frame is a frame encoded using only the information of the frame itself, without referring to other frames. In an embodiment, the key frame may be an I-frame (intra-coded picture).
In an embodiment, outputting the key frame may mean sending the key frame so that it gets played. Specifically, the server may add the key frame to a sending queue and send it to the virtual reality device through the queue; the virtual reality device may decode and play the key frame corresponding to the current view to generate the virtual reality video image of that key frame in the current view. It can be understood that this image belongs to the virtual reality video images of a local scene.
Specifically, the server stores the correspondence between views and key frames and the correspondence between key frames and play time nodes. From these correspondences, the server can determine and output the key frame corresponding to the current view and to the current play time node.
In an embodiment, step S206 includes: acquiring the key frame sequence corresponding to the current view, and searching, in the key frame sequence, for the key frame corresponding to the current play time node.
Specifically, the server stores in advance the correspondence between views and key frame sequences. A key frame sequence is a video frame sequence consisting of multiple key frames; when its key frames are output frame by frame, the virtual reality video image sequence of the corresponding view can be generated. When the view switches to the current view, the server may acquire the key frame sequence corresponding to the current view and search in it, according to the preset correspondence between key frames and play time nodes, for the key frame corresponding to the current play time node.
S208: In the mixed video frame sequence corresponding to the current view, search for video frames sequentially from the next video frame after the video frame corresponding to the current play time node.
A mixed video frame sequence is a compressed sequence consisting of key frames and inter-predicted frames. An inter-predicted frame is a video frame encoded with reference to other video frames using the correlation between video image frames; inter-predicted frames include P-frames (predictive-coded pictures) and/or B-frames (bidirectionally predicted pictures). It can be understood that the mixed video frame sequence is the video frame sequence used when playing virtual reality video in the normal state.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
Specifically, the server presets the correspondence between views and mixed video frame sequences and can acquire the mixed video frame sequence corresponding to the current view from it. It can be understood that each video frame in the mixed video frame sequence has a corresponding play time node. The server may determine, in the mixed video frame sequence of the current view, the video frame corresponding to the current play time node and search for video frames sequentially from the next video frame after it.
It can be understood that a key frame and a video frame in the mixed video frame sequence can correspond to the same play time node. When a key frame and a mixed-sequence video frame corresponding to the same view and the same play time node are output separately, the generated virtual reality video images are the same, that is, images with the same picture content; this does not exclude quality differences, such as in definition, between the image output from the key frame and the image output from the mixed-sequence video frame.
In an embodiment, the key frame is the key frame, in the key frame sequence corresponding to the current view, that corresponds to the current time node. In this embodiment, the key frame sequence and the mixed video frame sequence are different representations of the virtual reality video image sequence of the current view: that image sequence can be represented by the key frame sequence, whose key frames output frame by frame generate it, or by the mixed video frame sequence, whose video frames output frame by frame also generate it. It can be understood that both sequences correspond to the virtual reality video image sequence of a local scene.
FIG. 4 is a schematic diagram of a key frame sequence and a mixed video frame sequence in an embodiment; the two sequences in FIG. 4 correspond to the same view. The key frame sequence consists of key frames (I-frames), and the mixed video frame sequence consists of key frames (I-frames) and inter-predicted frames (the P-frames in FIG. 4). Frames inside the same dashed box in the two sequences correspond to the same play time node; for example, I-frame 402a and P-frame 402b correspond to play time node t1, and I-frame 404a and P-frame 404b correspond to play time node t2.
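The alignment in FIG. 4 can be pictured with a small sketch. This is purely illustrative: the frame labels follow the 402a/402b/404a/404b naming in the figure, while the node times and the 25 fps spacing are invented for the example.

```python
# Two representations of the same view, index-aligned on play time nodes.
play_time_nodes = [0.00, 0.04, 0.08, 0.12]            # seconds; e.g. t1 = 0.00, t2 = 0.04
key_frame_seq   = ["I-402a", "I-404a", "I3", "I4"]    # key frames only
mixed_frame_seq = ["P-402b", "P-404b", "P3", "I4"]    # normal mixed I/P sequence

def frames_for_switch(node_index: int):
    """Key frame at the matched node, then mixed frames from the next node on."""
    return key_frame_seq[node_index], mixed_frame_seq[node_index + 1:]

print(frames_for_switch(0))   # ('I-402a', ['P-404b', 'P3', 'I4'])
```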
S210: Output the found video frames.
Specifically, the server may output the found video frames one by one in order. It can be understood that the server may add the found video frames to a sending queue and send them frame by frame to the virtual reality device through the queue; the virtual reality device can decode and play them frame by frame, achieving normal playback of the virtual reality video image sequence in the current view.
It can be understood that an inter-predicted frame in the mixed video frame sequence is decoded and played depending on the video image generated from the output of the previous video frame. As stated above, when a key frame and a mixed-sequence video frame corresponding to the same view and the same play time node are output separately, the generated virtual reality video images are the same; this therefore also holds for the key frame and the mixed-sequence video frame corresponding to the current view and the current play time node. Accordingly, if video frames are searched sequentially, in the mixed video frame sequence of the current view, from the next video frame after the video frame corresponding to the current play time node, and the found frames are output in order after the key frame, the subsequently output video frames can rely on the key frame to achieve normal playback of the virtual reality video image sequence in the current view.
The video processing method is now illustrated with FIG. 4. After acquiring the current view after the view switch in the virtual reality scene and determining the current play time node t1, the server may acquire the key frame sequence and the mixed video frame sequence corresponding to the current view; assume FIG. 4 shows them. The server may select the key frame 402a corresponding to t1 from the key frame sequence and output it; it may then determine the video frame 402b corresponding to t1 in the mixed video frame sequence, search for video frames sequentially from the next video frame 404b, and output the found video frames one by one in order.
With the above virtual reality scene-based video processing method, after a view switch the key frame corresponding to the current play time node in the switched current view can be output and played directly, and after the key frame is output and played, video frames are searched sequentially, in the mixed video frame sequence of the switched current view, from the next video frame after the video frame corresponding to the current play time node, and output. Normal playback of the virtual reality video images in the new view after the switch is achieved quickly, without a waiting period in which the pre-switch virtual reality video images continue to play, shortening the display delay upon view switching.
In an embodiment, step S202 includes: acquiring the current head pose after the head pose changes, and mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
The head pose is the relative position of the head with respect to a preset upright head position, that is, the position when the head is upright and not tilted. Head poses include the relative position when the head turns left or right, the relative position when the head tilts left or right, and the relative position when the head is raised or lowered, among others.
Specifically, the server presets multiple different views and the preset mapping relationship between head poses and views in the virtual reality scene. The server may acquire the current head pose detected after the head pose changes and map it to the current view in the virtual reality scene according to the preset mapping relationship.
It can be understood that the server may acquire the current head pose sent by the virtual reality device after it detects a head pose change. Specifically, the virtual reality device may detect the head pose and, when a change is detected, send the changed current head pose to the server, which directly acquires it. In other embodiments, the server may also acquire the current head pose detected after the change from another device.
In the above embodiment, the current view can be determined quickly and accurately from the current head pose and the mapping relationship between head poses and views. Since virtual reality video in a virtual reality scene is usually watched through a head-worn display device, head pose changes are one of the main causes of view switches, so the switched current view determined from the current head pose is accurate. Moreover, determining the current view through the mapping relationship avoids complex computation, further shortening the display delay upon view switching.
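A minimal sketch of such a preset mapping follows, assuming for illustration that views are identified by equal yaw slices of the panorama; the six-view split and the bare yaw angle are assumptions made for the example, not details from the text.

```python
def head_pose_to_view(yaw_degrees: float, total_views: int = 6) -> int:
    """Map a head yaw angle (0-360 degrees) to the index of the current view."""
    span = 360.0 / total_views              # each view covers an equal slice
    return int((yaw_degrees % 360.0) // span)

assert head_pose_to_view(10.0) == 0         # looking roughly straight ahead
assert head_pose_to_view(100.0) == 1        # head turned into the next slice
```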
In an embodiment, step S204 includes: acquiring the switching time point of the view switch; searching, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and taking the matching play time node as the current play time node.
The switching time point is the time point at which the view switch occurs.
Specifically, the server may acquire the switching time point of the view switch sent by the virtual reality device. It can be understood that the server may also monitor the virtual reality device in real time and, upon detecting a view switch through the monitoring, acquire the switching time point itself.
As stated above, each key frame in the key frame sequence has a corresponding play time node and each video frame in the mixed video frame sequence also has a corresponding play time node, so the server may search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point, and take the matching play time node as the current play time node.
In an embodiment, the server may take, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, the node closest to the switching time point as the play time node matching the switching time point. It can be understood that the node closest to the switching time point may be earlier than, later than, or equal to the switching time point.
For example, suppose the key frames in the key frame sequence of the current view correspond to play time nodes at the 20th ms (millisecond), the 21st ms, and the 22nd ms. If the switching time point is the 20.4th ms of playback, the node closest to it is the 20 ms node, which is earlier than the switching time point. If the switching time point is the 20.8th ms, the closest node is the 21 ms node, which is later than the switching time point. If the switching time point is the 22nd ms, the closest node is the 22 ms node, which is equal to it.
In an embodiment, the server may filter, from the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, the nodes later than the switching time point, and choose, from the filtered later nodes, the one closest to the switching time point as the play time node matching the switching time point.
Likewise, with play time nodes at the 20th ms, the 21st ms, and the 22nd ms and a switching time point at the 20.4th ms of playback, the nodes later than the 20.4th ms are the 21 ms and 22 ms nodes, and the one closest to the 20.4th ms is the 21 ms node.
In the above embodiment, the play time node matching the switching time point of the view switch is used as the current play time node, and the key frame and mixed-sequence video frames of the switched current view are then output according to that node. The output key frame and mixed-sequence video frames of the current view thus correlate more strongly with the video images that were being played in the original view when the view switch occurred, achieving a tight join of the virtual reality video images across the switch and ensuring the playback quality of the virtual reality video.
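Both matching strategies can be sketched directly from the worked example above (nodes at 20 ms, 21 ms, and 22 ms); the values come from the text, while the function names are illustrative.

```python
import bisect

def nearest_node(nodes, switch_time):
    """Strategy 1: the node closest to the switch time, earlier or later."""
    return min(nodes, key=lambda t: abs(t - switch_time))

def next_later_node(nodes, switch_time):
    """Strategy 2: the earliest node at or after the switch time."""
    return nodes[bisect.bisect_left(nodes, switch_time)]

nodes = [20.0, 21.0, 22.0]                 # play time nodes, in ms
assert nearest_node(nodes, 20.4) == 20.0   # earlier node wins
assert nearest_node(nodes, 20.8) == 21.0   # later node wins
assert next_later_node(nodes, 20.4) == 21.0
```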
In an embodiment, the method further includes: acquiring the three-dimensional virtual reality video of the virtual reality scene; acquiring the different views corresponding to the three-dimensional virtual reality video; and generating, for each view, the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
A three-dimensional virtual reality video is a video presenting video images in the form of a three-dimensional virtual reality scene. The different views corresponding to it are the different views into which its total field of view is to be divided.
Specifically, the server may directly acquire preset views corresponding to the three-dimensional virtual reality video.
In an embodiment, the server may also acquire the preset total number of views and divide the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
The preset total number of views is the preset total count of views to be divided.
Specifically, the server may divide the panoramic field of view of the three-dimensional virtual reality video into different views satisfying the total number of views. For example, with a preset total of 6, the server may divide the panoramic field of view into 6 different views.
In an embodiment, the server may divide the panoramic field of view of the three-dimensional virtual reality video equally according to the preset total number of views, obtaining different views of equal or approximately equal ranges.
In an embodiment, the server may also divide the panoramic field of view into different views according to the importance of positions in the field of view and the preset total number of views, where relatively major positions in the panoramic field of view can be given larger views and relatively secondary positions can be given smaller views.
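The equal division can be sketched as follows; the weighted variant would simply use non-uniform slice widths. Representing views as yaw-degree ranges is an assumption made for illustration.

```python
def divide_panorama(total_views: int = 6):
    """Split a 360-degree panorama into equal yaw ranges, one per view.
    Returns [(start_deg, end_deg), ...]."""
    span = 360.0 / total_views
    return [(i * span, (i + 1) * span) for i in range(total_views)]

print(divide_panorama(6))
# [(0.0, 60.0), (60.0, 120.0), (120.0, 180.0),
#  (180.0, 240.0), (240.0, 300.0), (300.0, 360.0)]
```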
It can be understood that each divided view corresponds to a virtual reality video image sequence of a local scene in the three-dimensional virtual reality video, and different views correspond to the image sequences of different local scenes.
Specifically, the server may determine the virtual reality video image sequence of the local scene corresponding to each view in the three-dimensional virtual reality video and generate the video frame sequences representing that image sequence. For the image sequence of the local scene corresponding to each view, the server may generate the corresponding key frame sequence and mixed video frame sequence, that is, represent the same view's local-scene image sequence in the two forms of a key frame sequence and a mixed video frame sequence. It can be understood that decoding the corresponding key frame sequence can generate the image sequence of the corresponding local scene, and decoding the mixed video frame sequence can also generate it.
In the above embodiment, the field of view of the complete three-dimensional virtual reality video is divided into different views, and the corresponding key frame sequence and mixed video frame sequence are generated for each view. In this way, upon a view switch, the key frame at the current play time node of the switched current view can be fetched from the key frame sequence and output; then, in the mixed video frame sequence of the current view, video frames are searched sequentially from the next video frame after the video frame corresponding to the current play time node and output. The video frames output after the key frame can be decoded and played based on it, quickly achieving normal playback of the virtual reality video images in the new view after the switch, without a waiting period in which the pre-switch images continue to play, shortening the display delay upon view switching.
As shown in FIG. 5, in an embodiment a virtual reality scene-based video processing method is provided, illustrated here as applied to the virtual reality device in FIG. 1. The method specifically includes the following steps:
S502: Play, in the virtual reality scene, the mixed video frame sequence corresponding to the original view.
The original view is the view before the view switch in the virtual reality scene. A mixed video frame sequence is a compressed sequence consisting of key frames and inter-predicted frames. An inter-predicted frame is a video frame encoded with reference to other video frames using inter-frame correlation; inter-predicted frames include P-frames (predictive-coded pictures) and/or B-frames (bidirectionally predicted pictures). It can be understood that the mixed video frame sequence is the sequence used when playing virtual reality video in the normal state, and each of its video frames has a corresponding play time node.
Specifically, the virtual reality device may acquire the mixed video frame sequence corresponding to the original view from the server and play it in the virtual reality scene to present continuous virtual reality video images.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
S504: Upon a view switch, generate and output a video frame acquisition request.
Specifically, the virtual reality device may detect whether the view switches and, when it does, generate and output a video frame acquisition request.
In an embodiment, upon a view switch the virtual reality device may acquire the switched current view and generate and output a video frame acquisition request including the current view.
In an embodiment, the virtual reality device may acquire the current head pose after the view switch and generate and output a video frame acquisition request including the current head pose.
In an embodiment, the virtual reality device may also acquire the switching time point of the view switch and generate a video frame acquisition request including the current view and the switching time point, or a request including the current head pose and the switching time point.
In an embodiment, the virtual reality device may send the generated video frame acquisition request to the server.
S506: Receive the key frame returned in response to the video frame acquisition request; the key frame corresponds to the current view after the view switch and to the current play time node.
In an embodiment, the virtual reality device may receive the key frame returned by the server in response to the video frame acquisition request. When the request includes the current head pose used to determine the current view, the server may, in response to the request, determine the switched current view from the current head pose in the request. The server may determine the current play time node, find the key frame corresponding to the switched current view and to the current play time node, and return it. The virtual reality device may receive the key frame returned by the server.
In an embodiment, the server may determine the current play time node from the switching time point of the view switch included in the request. The server may also determine the current play time node from its local current system time. In an embodiment, the server may search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the local current system time.
In an embodiment, the server may acquire the key frame sequence corresponding to the current view, find in it the key frame corresponding to the current play time node, and return it.
A key frame sequence is a video frame sequence consisting of multiple key frames; when its key frames are output frame by frame, the virtual reality video image sequence of the corresponding view can be generated. Each key frame in the key frame sequence has a corresponding play time node.
S508: Play the key frame in place of the mixed video frame sequence corresponding to the original view.
It can be understood that playing the key frame in place of the mixed video frames of the original view means stopping playback of the mixed video frame sequence corresponding to the original view and switching to playing the key frame.
S510: Receive and play the video frames of the mixed video frame sequence corresponding to the current view; the video frames are found sequentially, in the mixed video frame sequence of the current view, from the next video frame after the video frame corresponding to the current play time node.
In an embodiment, in responding to the video frame acquisition request, after determining the current view and the current play time node, the server may also determine the mixed video frame sequence corresponding to the current view, search in it for video frames sequentially from the next video frame after the video frame corresponding to the current play time node, and return the found video frames to the virtual reality device. After receiving the returned video frames of the mixed video frame sequence, the virtual reality device plays them frame by frame in order.
It should be noted that the key frame sequence and the mixed video frame sequence corresponding to the current view are different representations of the virtual reality video image sequence of the current view: that image sequence can be represented by the key frame sequence, whose key frames output frame by frame generate it, or by the mixed video frame sequence, whose video frames output frame by frame also generate it.
A key frame and a video frame in the mixed video frame sequence can correspond to the same play time node. When a key frame and a mixed-sequence video frame corresponding to the same view and the same play time node are output separately, the generated virtual reality video images are the same, that is, images with the same picture content; this does not exclude quality differences, such as in definition, between the image output from the key frame and the image output from the mixed-sequence video frame.
As described above, since outputting the key frame and the mixed-sequence video frame of the same view and the same play time node generates the same virtual reality video image, this also holds for the key frame and the mixed-sequence video frame corresponding to the current view and the current play time node. Therefore, after the virtual reality device plays the key frame in place of the mixed video frame sequence of the original view and then plays the found video frames in order, the subsequently output video frames can rely on the key frame to achieve normal playback of the virtual reality video image sequence in the current view.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
With the above virtual reality scene-based video processing method, after a view switch the key frame corresponding to the current play time node in the switched current view can be played directly in place of the mixed video frame sequence of the original view, and after the key frame is played, the video frames found sequentially, in the mixed video frame sequence of the current view, from the next video frame after the video frame corresponding to the current play time node are played. Normal playback of the virtual reality video images in the new view after the switch is achieved quickly, without a waiting period in which the pre-switch images continue to play, shortening the display delay upon view switching.
In an embodiment, step S504 includes: when a head pose change is detected, determining that the view has switched; acquiring the changed current head pose; and generating and outputting a video frame acquisition request according to the current head pose used to determine the current view.
The head pose is the relative position of the head with respect to a preset upright head position, that is, the position when the head is upright and not tilted. Head poses include the relative position when the head turns left or right, the relative position when the head tilts left or right, and the relative position when the head is raised or lowered, among others.
Specifically, the virtual reality device may detect whether the head pose changes. In an embodiment, the virtual reality device may listen to a gyroscope sensor, acquire the current head pose through the gyroscope sensor, compare the current head pose with the previously acquired head pose, and judge from the comparison whether the head pose has changed. In an embodiment, when the comparison shows that the difference between the current head pose and the previously acquired head pose exceeds a preset range, a head pose change is determined.
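A minimal sketch of this comparison, assuming a pose is a (yaw, pitch, roll) tuple and a 2-degree preset range; both the tuple shape and the threshold are assumptions made for illustration.

```python
def pose_changed(current_pose, previous_pose, threshold_deg: float = 2.0) -> bool:
    """True when any axis (yaw, pitch, roll) moved more than the preset range."""
    return any(abs(c - p) > threshold_deg
               for c, p in zip(current_pose, previous_pose))

last = (0.0, 0.0, 0.0)
now = (5.0, 0.5, 0.0)          # e.g. the latest gyroscope reading
if pose_changed(now, last):    # head moved far enough -> the view has switched
    print("view switch: generate a video frame acquisition request")
```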
When the virtual reality device detects a head pose change, it determines that the view has switched. It can be understood that a head pose change can cause a view switch, so the changed current head pose can be used to determine the current view. The virtual reality device may acquire the changed current head pose and generate and output a video frame acquisition request according to the current head pose used to determine the current view.
In an embodiment, the video frame acquisition request generated according to the current head pose used to determine the current view includes the current head pose.
In an embodiment, the virtual reality device may map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene, and generate and output a video frame acquisition request according to the current view.
Specifically, the virtual reality device presets multiple different views and the preset mapping relationship between head poses and views in the virtual reality scene. The virtual reality device may acquire the current head pose detected after the head pose changes, map it to the current view in the virtual reality scene according to the preset mapping relationship, and generate and output a video frame acquisition request according to the current view; the generated request includes the current view.
In the above embodiment, the virtual reality device determines that the view has switched by detecting a head pose change. Since head pose changes are one of the main causes of view switches, determining a view switch from a change in the current head pose is relatively accurate. Moreover, generating and outputting the video frame acquisition request according to the current head pose after determining the switch makes the switched current view determined from the current head pose more accurate.
As shown in FIG. 6, in an embodiment a virtual reality scene-based video processing system 600 is provided; the system includes a virtual reality device 602 and a content distribution network server 604, where:
the virtual reality device 602 is configured to acquire the original view in the virtual reality scene, acquire the mixed video frame sequence corresponding to the original view from the content distribution network server 604 and play it, and, upon a view switch, generate a video frame acquisition request and send it to the content distribution network server 604; and
the content distribution network server 604 is configured to, in response to the video frame acquisition request, acquire the current view after the view switch and the current play time node, acquire the key frame corresponding to the current view and to the current play time node, and send the key frame to the virtual reality device 602.
A content distribution network (CDN) server is a server that redirects a user's request, in real time and based on comprehensive information such as network traffic, the connections and load of each node, and the distance and response time to the user, to the service node closest to the user.
The content distribution network server 604 is further configured to search, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node, and to send the found video frames to the virtual reality device 602.
The virtual reality device 602 is further configured to play the key frame in place of the mixed video frame sequence corresponding to the original view and to play the subsequently received video frames in sequence.
In an embodiment, the virtual reality device 602 is further configured to determine, upon detecting a head pose change, that the view has switched; acquire the changed current head pose; and generate a video frame acquisition request according to the current head pose used to determine the current view and send it to the content distribution network server 604.
In an embodiment, the virtual reality device 602 is further configured to map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views, and to generate the video frame acquisition request according to the current view and send it to the content distribution network server 604.
In an embodiment, the content distribution network server 604 is further configured to acquire the current head pose after the head pose changes and to map it to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views.
In an embodiment, the content distribution network server 604 is further configured to acquire the key frame sequence corresponding to the current view and to search in it for the key frame corresponding to the current play time node.
In an embodiment, the content distribution network server 604 is further configured to acquire the switching time point of the view switch; to search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and to take the matching play time node as the current play time node.
As shown in FIG. 7, in an embodiment the system 600 further includes:
a push stream server 606 configured to acquire the three-dimensional virtual reality video of the virtual reality scene; acquire the different views corresponding to the three-dimensional virtual reality video; generate, for each view, the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video; and push the key frame sequence and mixed video frame sequence generated for each view to the content distribution network server 604.
In an embodiment, the push stream server 606 is further configured to acquire the preset total number of views and to divide the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
As shown in FIG. 8, in an embodiment the system 600 further includes a streaming media receiving management server 605.
The push stream server 606 is further configured to push the key frame sequence and mixed video frame sequence generated for each view to the streaming media receiving management server 605.
The streaming media receiving management server 605 is configured to send the key frame sequence and mixed video frame sequence generated for each view to the content distribution network server 604, and to manage the transmission status of the key frame sequences and mixed video frame sequences.
The transmission status of the key frame sequence and the mixed video frame sequence is status information such as whether the sequences were transmitted successfully and how transmission failed; it includes at least one state among success, packet loss, and out-of-order delivery.
In an embodiment, the content distribution network server 604 may be a content distribution network server for live video streaming.
The content distribution network server 604 is further configured to store the received key frame sequences and mixed video frame sequences generated for each view.
It can be understood that the server in FIG. 1 may be a server cluster including a push stream server, a streaming media receiving management server, and a content delivery network (CDN) server.
With the above virtual reality scene-based video processing system, after a view switch the key frame corresponding to the current play time node in the switched current view can be played directly in place of the mixed video frame sequence of the original view, and after the key frame is played, the video frames found sequentially, in the mixed video frame sequence of the current view, from the next video frame after the video frame corresponding to the current play time node are played. Normal playback of the virtual reality video images in the new view after the switch is achieved quickly, without a waiting period in which the pre-switch images continue to play, shortening the display delay upon view switching.
As shown in FIG. 9, in an embodiment a sequence diagram of a virtual reality scene-based video processing method is provided. The sequence diagram specifically includes the following steps:
1) The push stream server acquires the three-dimensional virtual reality video of the virtual reality scene and the preset total number of views, and divides the panoramic field of view of the video into different views according to the total number of views.
2) For each view, the push stream server generates the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
3) The push stream server pushes the key frame sequence and mixed video frame sequence generated for each view, frame by frame, to the streaming media receiving management server.
4) The streaming media receiving management server sends the key frame sequence and mixed video frame sequence generated for each view, frame by frame, to the content distribution network server.
5) The streaming media receiving management server manages the transmission status of the key frame sequences and mixed video frame sequences.
6) The virtual reality device acquires the original view in the virtual reality scene and initiates an access request to the content distribution network server.
7) The content distribution network server sends the mixed video frame sequence corresponding to the original view to the virtual reality device.
8) The virtual reality device plays the mixed video frame sequence corresponding to the original view in the virtual reality scene.
9) When the virtual reality device detects a head pose change, it determines that the view has switched and acquires the changed current head pose.
10) A video frame acquisition request is generated according to the current head pose used to determine the current view and sent to the content distribution network server.
11) The content distribution network server maps the current head pose in the video frame acquisition request to the current view in the virtual reality scene.
12) The content distribution network server acquires the switching time point of the view switch; among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, it searches for the play time node matching the switching time point and obtains the current play time node.
13) The content distribution network server acquires the key frame sequence corresponding to the current view and searches in it for the key frame corresponding to the current play time node.
14) The content distribution network server returns the key frame to the virtual reality device.
15) The virtual reality device plays the key frame in place of the mixed video frame sequence corresponding to the original view.
16) The content distribution network server searches, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node.
17) The content distribution network server returns the found video frames to the virtual reality device in order.
18) The virtual reality device plays the subsequently received video frames in order after playing the key frame.
In an embodiment, a computer device is provided, the internal structure of which may be as shown in FIG. 13. The computer device may be a server. The computer device includes a virtual reality scene-based video processing apparatus, which includes modules that may each be implemented in whole or in part by software, hardware, or a combination thereof. As shown in FIG. 10, in an embodiment a virtual reality scene-based video processing apparatus 1000 is provided; the apparatus 1000 includes a current view obtaining module 1004, a play time node determining module 1006, a video frame output module 1008, and a video frame search module 1010, where:
the current view obtaining module 1004 is configured to acquire the current view after a view switch in a virtual reality scene;
the play time node determining module 1006 is configured to determine the current play time node;
the video frame output module 1008 is configured to output the key frame corresponding to the current view and to the current play time node; and
the video frame search module 1010 is configured to search, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node.
The video frame output module 1008 is further configured to output the found video frames.
In an embodiment, the current view obtaining module 1004 is further configured to acquire the current head pose after the head pose changes and to map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
In an embodiment, the video frame output module 1008 is further configured to acquire the key frame sequence corresponding to the current view and to search in it for the key frame corresponding to the current play time node.
In an embodiment, the play time node determining module 1006 is further configured to acquire the switching time point of the view switch; to search, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and to take the matching play time node as the current play time node.
As shown in FIG. 11, in an embodiment the apparatus 1000 further includes:
a view dividing module 1002 configured to acquire the three-dimensional virtual reality video of the virtual reality scene and to acquire the different views corresponding to the three-dimensional virtual reality video; and
a video frame sequence generating module 1003 configured to generate, for each view, the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
In an embodiment, the view dividing module 1002 is further configured to acquire the preset total number of views and to divide the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
In an embodiment, a computer device is provided, the internal structure of which may be as shown in FIG. 14. The computer device may be a virtual reality device. The computer device includes a virtual reality scene-based video processing apparatus, which includes modules that may each be implemented in whole or in part by software, hardware, or a combination thereof. As shown in FIG. 12, in an embodiment a virtual reality scene-based video processing apparatus 1200 is provided; the apparatus 1200 includes a playing module 1202, a video frame requesting module 1204, and a video frame receiving module 1206, where:
the playing module 1202 is configured to play, in a virtual reality scene, the mixed video frame sequence corresponding to the original view;
the video frame requesting module 1204 is configured to generate and output a video frame acquisition request upon a view switch; and
the video frame receiving module 1206 is configured to receive a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node.
The playing module 1202 is further configured to play the key frame in place of the mixed video frame sequence corresponding to the original view.
The video frame receiving module 1206 is further configured to receive the video frames in the mixed video frame sequence corresponding to the current view and to notify the playing module 1202 to play the received video frames in sequence; the video frames are found sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
In an embodiment, the video frame requesting module 1204 is further configured to determine, upon detecting a head pose change, that the view has switched; acquire the changed current head pose; and generate and output a video frame acquisition request according to the current head pose used to determine the current view.
In an embodiment, the video frame requesting module 1204 is further configured to map the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views, and to generate and output the video frame acquisition request according to the current view.
In an embodiment, the key frame is the key frame, in the key frame sequence corresponding to the current view, that corresponds to the current play time node.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
FIG. 13 is a schematic diagram of the internal structure of a computer device in an embodiment. Referring to FIG. 13, the computer device may be the server shown in FIG. 1; it includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device may store an operating system and computer readable instructions; when executed, the computer readable instructions may cause the processor to perform a virtual reality scene-based video processing method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The internal memory may store computer readable instructions that, when executed by the processor, may cause the processor to perform a virtual reality scene-based video processing method. The network interface of the computer device is used for network communication.
A person skilled in the art can understand that the structure shown in FIG. 13 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer devices to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In an embodiment, the virtual reality scene-based video processing apparatus provided in this application may be implemented in the form of computer readable instructions that can run on a computer device as shown in FIG. 13. The non-volatile storage medium of the computer device may store the program modules constituting the apparatus, for example the current view obtaining module 1004, the play time node determining module 1006, the video frame output module 1008, and the video frame search module 1010 shown in FIG. 10. The computer readable instructions constituted by the program modules cause the computer device to perform the steps of the virtual reality scene-based video processing methods of the embodiments of this application described in this specification. For example, the computer device may acquire the current view after the view switch in the virtual reality scene through the current view obtaining module 1004 of the apparatus 1000 shown in FIG. 10 and determine the current play time node through the play time node determining module 1006; output the key frame corresponding to the current view and to the current play time node through the video frame output module 1008; search for video frames through the video frame search module 1010 sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node; and output the found video frames through the video frame output module 1008.
FIG. 14 is a schematic diagram of the internal structure of a computer device in an embodiment. Referring to FIG. 14, the computer device may be the virtual reality device shown in FIG. 1; it includes a processor, a memory, a network interface, a display screen, and an input apparatus connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium may store an operating system and computer readable instructions; when executed, the computer readable instructions may cause the processor to perform a virtual reality scene-based video processing method. The processor provides computing and control capabilities and supports the operation of the entire computer device. The internal memory may store computer readable instructions that, when executed by the processor, may cause the processor to perform a virtual reality scene-based video processing method. The network interface is used for network communication. The display screen may be a liquid crystal display or an electronic ink display. The input apparatus may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the terminal housing, or an external keyboard, touchpad, mouse, or the like. The computer device may be a personal computer, a mobile terminal, or an in-vehicle device, where the mobile terminal includes at least one of a mobile phone, a tablet computer, a personal digital assistant, a wearable device, and the like.
A person skilled in the art can understand that the structure shown in FIG. 14 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer devices to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In an embodiment, the virtual reality scene-based video processing apparatus provided in this application may be implemented in the form of computer readable instructions that can run on a computer device as shown in FIG. 14. The non-volatile storage medium of the computer device may store the program modules constituting the apparatus, for example the playing module 1202, the video frame requesting module 1204, and the video frame receiving module 1206 shown in FIG. 12. The computer readable instructions constituted by the program modules cause the computer device to perform the steps of the virtual reality scene-based video processing methods of the embodiments of this application described in this specification. For example, the computer device may play, in the virtual reality scene, the mixed video frame sequence corresponding to the original view through the playing module 1202 of the apparatus 1200 shown in FIG. 12 and generate and output a video frame acquisition request through the video frame requesting module 1204 upon a view switch; receive, through the video frame receiving module 1206, the key frame returned in response to the request, the key frame corresponding to the current view after the view switch and to the current play time node; play the key frame in place of the mixed video frame sequence corresponding to the original view through the playing module 1202; and receive, through the video frame receiving module 1206, the video frames in the mixed video frame sequence corresponding to the current view and notify the playing module 1202 to play them in sequence, the video frames being found sequentially, in that sequence, from the next video frame after the video frame corresponding to the current play time node.
In an embodiment, a computer device is provided, which may be a server. The computer device includes a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring the current view after a view switch in a virtual reality scene; determining the current play time node; outputting the key frame corresponding to the current view and to the current play time node; searching, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and outputting the found video frames.
In an embodiment, acquiring the current view after the view switch in the virtual reality scene includes: acquiring the current head pose after the head pose changes; and mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
In an embodiment, outputting the key frame corresponding to the current view and to the current play time node includes: acquiring the key frame sequence corresponding to the current view; and searching, in the key frame sequence, for the key frame corresponding to the current play time node.
In an embodiment, determining the current play time node includes: acquiring the switching time point of the view switch; searching, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and taking the matching play time node as the current play time node.
In an embodiment, the computer readable instructions, when executed by the processor, further cause the processor to perform the following steps: acquiring the three-dimensional virtual reality video of the virtual reality scene; acquiring the different views corresponding to the three-dimensional virtual reality video; and generating, for each view, the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
In an embodiment, acquiring the different views corresponding to the three-dimensional virtual reality video includes: acquiring the preset total number of views; and dividing the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
In an embodiment, a computer device is provided, which may be a virtual reality device. The computer device includes a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: playing, in a virtual reality scene, the mixed video frame sequence corresponding to the original view; generating and outputting a video frame acquisition request upon a view switch; receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node; playing the key frame in place of the mixed video frame sequence corresponding to the original view; and receiving and playing the video frames in the mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in that sequence, from the next video frame after the video frame corresponding to the current play time node.
In an embodiment, generating and outputting the video frame acquisition request upon the view switch includes: when a head pose change is detected, determining that the view has switched; acquiring the changed current head pose; and generating and outputting a video frame acquisition request according to the current head pose used to determine the current view.
In an embodiment, generating and outputting the video frame acquisition request according to the current head pose used to determine the current view includes: mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views; and generating and outputting the video frame acquisition request according to the current view.
In an embodiment, the key frame is the key frame, in the key frame sequence corresponding to the current view, that corresponds to the current play time node.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
In an embodiment, one or more storage media storing computer readable instructions are provided; when executed by one or more processors, the computer readable instructions cause the one or more processors to perform the following steps: acquiring the current view after a view switch in a virtual reality scene; determining the current play time node; outputting the key frame corresponding to the current view and to the current play time node; searching, in the mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and outputting the found video frames.
In an embodiment, acquiring the current view after the view switch in the virtual reality scene includes: acquiring the current head pose after the head pose changes; and mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views in the virtual reality scene.
In an embodiment, outputting the key frame corresponding to the current view and to the current play time node includes: acquiring the key frame sequence corresponding to the current view; and searching, in the key frame sequence, for the key frame corresponding to the current play time node.
In an embodiment, determining the current play time node includes: acquiring the switching time point of the view switch; searching, among the play time nodes corresponding to the key frame sequence or the mixed video frame sequence of the current view, for the play time node matching the switching time point; and taking the matching play time node as the current play time node.
In an embodiment, the computer readable instructions, when executed by the processor, further cause the processor to perform the following steps: acquiring the three-dimensional virtual reality video of the virtual reality scene; acquiring the different views corresponding to the three-dimensional virtual reality video; and generating, for each view, the corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
In an embodiment, acquiring the different views corresponding to the three-dimensional virtual reality video includes: acquiring the preset total number of views; and dividing the panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
In an embodiment, one or more storage media storing computer readable instructions are provided; when executed by one or more processors, the computer readable instructions cause the one or more processors to perform the following steps: playing, in a virtual reality scene, the mixed video frame sequence corresponding to the original view; generating and outputting a video frame acquisition request upon a view switch; receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to the current view after the view switch and to the current play time node; playing the key frame in place of the mixed video frame sequence corresponding to the original view; and receiving and playing the video frames in the mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in that sequence, from the next video frame after the video frame corresponding to the current play time node.
In an embodiment, the step of generating and outputting the video frame acquisition request upon the view switch includes: when a head pose change is detected, determining that the view has switched; acquiring the changed current head pose; and generating and outputting a video frame acquisition request according to the current head pose used to determine the current view.
In an embodiment, the step of generating and outputting the video frame acquisition request according to the current head pose used to determine the current view includes: mapping the current head pose to the current view in the virtual reality scene according to the preset mapping relationship between head poses and views; and generating and outputting the video frame acquisition request according to the current view.
In an embodiment, the key frame is the key frame, in the key frame sequence corresponding to the current view, that corresponds to the current play time node.
In an embodiment, the mixed video frame sequence is generated frame by frame from live video frames.
It should be understood that the steps in the embodiments of this application are not necessarily performed in the order indicated by the step numbers. Unless explicitly stated herein, the execution of these steps is not strictly ordered, and they may be performed in other orders. Moreover, at least some of the steps in each embodiment may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is also not necessarily sequential, and they may be performed in turns or alternately with other steps or with at least part of the sub-steps or stages of other steps.
A person of ordinary skill in the art can understand that all or part of the processes of the above method embodiments can be completed by computer readable instructions instructing the relevant hardware; the program may be stored in a non-volatile computer readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of this application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the patent scope of the invention. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, all of which fall within its protection scope. Therefore, the protection scope of this application patent shall be subject to the appended claims.

Claims (20)

  1. A virtual reality scene-based video processing method, comprising:
    a server acquiring a current view after a view switch in a virtual reality scene;
    the server determining a current play time node;
    the server outputting a key frame corresponding to the current view and to the current play time node;
    the server searching, in a mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and
    the server outputting the found video frames.
  2. The method according to claim 1, wherein the server acquiring the current view after the view switch in the virtual reality scene comprises:
    the server acquiring a current head pose after a head pose changes; and
    the server mapping the current head pose to the current view in the virtual reality scene according to a preset mapping relationship between head poses and views in the virtual reality scene.
  3. The method according to claim 1, wherein the server outputting the key frame corresponding to the current view and to the current play time node comprises:
    the server acquiring a key frame sequence corresponding to the current view; and
    the server searching, in the key frame sequence, for the key frame corresponding to the current play time node.
  4. The method according to claim 3, wherein the server determining the current play time node comprises:
    the server acquiring a switching time point of the view switch;
    the server searching, among play time nodes corresponding to the key frame sequence or the mixed video frame sequence corresponding to the current view, for a play time node matching the switching time point; and
    the server taking the matching play time node as the current play time node.
  5. The method according to claim 3, further comprising:
    the server acquiring a three-dimensional virtual reality video of the virtual reality scene;
    the server acquiring different views corresponding to the three-dimensional virtual reality video; and
    the server generating, for each of the views, a corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
  6. The method according to claim 5, wherein the server acquiring the different views corresponding to the three-dimensional virtual reality video comprises:
    the server acquiring a preset total number of views; and
    the server dividing a panoramic field of view of the three-dimensional virtual reality video into different views according to the total number of views.
  7. The method according to claim 1, wherein the mixed video frame sequence is generated frame by frame from live video frames.
  8. A virtual reality scene-based video processing method, comprising:
    a virtual reality device playing, in a virtual reality scene, a mixed video frame sequence corresponding to an original view;
    the virtual reality device generating and outputting a video frame acquisition request upon a view switch;
    the virtual reality device receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to a current view after the view switch and to a current play time node;
    the virtual reality device playing the key frame in place of the mixed video frame sequence corresponding to the original view; and
    the virtual reality device receiving and playing video frames of a mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
  9. The method according to claim 8, wherein the virtual reality device generating and outputting the video frame acquisition request upon the view switch comprises:
    the virtual reality device determining, upon detecting a head pose change, that the view has switched;
    the virtual reality device acquiring the changed current head pose; and
    the virtual reality device generating and outputting the video frame acquisition request according to the current head pose used to determine the current view.
  10. The method according to claim 9, wherein the virtual reality device generating and outputting the video frame acquisition request according to the current head pose used to determine the current view comprises:
    the virtual reality device mapping the current head pose to the current view in the virtual reality scene according to a preset mapping relationship between head poses and views in the virtual reality scene; and
    the virtual reality device generating and outputting the video frame acquisition request according to the current view.
  11. The method according to claim 8, wherein the key frame is a key frame, in a key frame sequence corresponding to the current view, that corresponds to the current play time node.
  12. The method according to claim 8, wherein the mixed video frame sequence is generated frame by frame from live video frames.
  13. A server, comprising a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring a current view after a view switch in a virtual reality scene;
    determining a current play time node;
    outputting a key frame corresponding to the current view and to the current play time node;
    searching, in a mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and
    outputting the found video frames.
  14. The server according to claim 13, wherein outputting the key frame corresponding to the current view and to the current play time node comprises:
    acquiring a key frame sequence corresponding to the current view; and
    searching, in the key frame sequence, for the key frame corresponding to the current play time node.
  15. The server according to claim 14, wherein the computer readable instructions, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring a three-dimensional virtual reality video of the virtual reality scene;
    acquiring different views corresponding to the three-dimensional virtual reality video; and
    generating, for each of the views, a corresponding key frame sequence and mixed video frame sequence from the three-dimensional virtual reality video.
  16. A virtual reality device, comprising a memory and one or more processors, the memory storing computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    playing, in a virtual reality scene, a mixed video frame sequence corresponding to an original view;
    generating and outputting a video frame acquisition request upon a view switch;
    receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to a current view after the view switch and to a current play time node;
    playing the key frame in place of the mixed video frame sequence corresponding to the original view; and
    receiving and playing video frames of a mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
  17. The virtual reality device according to claim 16, wherein generating and outputting the video frame acquisition request upon the view switch comprises:
    determining, upon detecting a head pose change, that the view has switched;
    acquiring the changed current head pose; and
    generating and outputting the video frame acquisition request according to the current head pose used to determine the current view.
  18. One or more storage media storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring a current view after a view switch in a virtual reality scene;
    determining a current play time node;
    outputting a key frame corresponding to the current view and to the current play time node;
    searching, in a mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node; and
    outputting the found video frames.
  19. One or more storage media storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    playing, in a virtual reality scene, a mixed video frame sequence corresponding to an original view;
    generating and outputting a video frame acquisition request upon a view switch;
    receiving a key frame returned in response to the video frame acquisition request, the key frame corresponding to a current view after the view switch and to a current play time node;
    playing the key frame in place of the mixed video frame sequence corresponding to the original view; and
    receiving and playing video frames of a mixed video frame sequence corresponding to the current view, the video frames being found sequentially, in the mixed video frame sequence corresponding to the current view, from the next video frame after the video frame corresponding to the current play time node.
  20. A virtual reality scene-based video processing system, comprising a virtual reality device and a content distribution network server, wherein:
    the virtual reality device is configured to acquire an original view in a virtual reality scene, acquire a mixed video frame sequence corresponding to the original view from the content distribution network server and play it, and, upon a view switch, generate a video frame acquisition request and send it to the content distribution network server;
    the content distribution network server is configured to, in response to the video frame acquisition request, acquire a current view after the view switch and a current play time node, acquire a key frame corresponding to the current view and to the current play time node, and send the key frame to the virtual reality device;
    the content distribution network server is further configured to search, in a mixed video frame sequence corresponding to the current view, for video frames sequentially from the next video frame after the video frame corresponding to the current play time node, and to send the found video frames to the virtual reality device; and
    the virtual reality device is further configured to play the key frame in place of the mixed video frame sequence corresponding to the original view and to play the subsequently received video frames in sequence.
PCT/CN2018/110935 2017-10-20 2018-10-19 基于虚拟现实场景的视频处理方法、服务器、虚拟现实设备和系统 WO2019076356A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710982001.4 2017-10-20
CN201710982001.4A CN109698949B (zh) 2017-10-20 2017-10-20 基于虚拟现实场景的视频处理方法、装置和系统

Publications (1)

Publication Number Publication Date
WO2019076356A1 true WO2019076356A1 (zh) 2019-04-25

Family

ID=66173539

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/110935 WO2019076356A1 (zh) 2017-10-20 2018-10-19 基于虚拟现实场景的视频处理方法、服务器、虚拟现实设备和系统

Country Status (2)

Country Link
CN (1) CN109698949B (zh)
WO (1) WO2019076356A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111935557B (zh) * 2019-05-13 2022-06-28 华为技术有限公司 Video processing method, apparatus, and system
CN111309236B (zh) * 2020-02-13 2021-06-29 微幻科技(北京)有限公司 Method and apparatus for transforming the view of a three-dimensional scene
CN111372145B (zh) * 2020-04-15 2021-07-27 烽火通信科技股份有限公司 Viewpoint switching method and system for multi-viewpoint video
CN114125516B (zh) * 2020-08-26 2024-05-10 Oppo(重庆)智能科技有限公司 Video playback method, wearable device, and storage medium
CN114584769A (zh) * 2020-11-30 2022-06-03 华为技术有限公司 View switching method and apparatus
CN115529449A (zh) * 2021-06-26 2022-12-27 华为技术有限公司 Virtual reality video transmission method and apparatus
CN114339134B (zh) * 2022-03-15 2022-06-21 深圳市易扑势商友科技有限公司 Remote online conference system based on the Internet and VR technology

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050156817A1 (en) * 2002-08-30 2005-07-21 Olympus Corporation Head-mounted display system and method for processing images
CN1561111A * 2004-02-26 2005-01-05 晶晨半导体(上海)有限公司 Method for quickly indexing playback information in a compressed digital video stream
CN105872698B * 2016-03-31 2019-03-22 宇龙计算机通信科技(深圳)有限公司 Playback method, playback system, and virtual reality terminal
CN106909221B * 2017-02-21 2020-06-02 北京小米移动软件有限公司 Image processing method and apparatus based on a VR system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170287227A1 (en) * 2013-06-03 2017-10-05 Microsoft Technology Licensing, Llc Mixed reality data collaboration
US20170078742A1 (en) * 2014-03-10 2017-03-16 Nokia Technologies Oy Method and apparatus for video processing
WO2016191694A1 (en) * 2015-05-27 2016-12-01 Google Inc. Streaming spherical video
CN105791882A * 2016-03-22 2016-07-20 腾讯科技(深圳)有限公司 Video encoding method and apparatus
CN106998409A * 2017-03-21 2017-08-01 华为技术有限公司 Image processing method, head-mounted display, and rendering device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114900740A (zh) * 2022-04-14 2022-08-12 北京奇艺世纪科技有限公司 Continuous playback control method, system, and apparatus for multimedia objects
CN114900740B (zh) * 2022-04-14 2024-02-23 北京奇艺世纪科技有限公司 Continuous playback control method, system, and apparatus for multimedia objects

Also Published As

Publication number Publication date
CN109698949A (zh) 2019-04-30
CN109698949B (zh) 2020-08-21

Similar Documents

Publication Title
WO2019076356A1 (zh) Video processing method based on virtual reality scene, server, virtual reality device, and system
AU2022204875B2 (en) Multi-view audio and video interactive playback
WO2015070694A1 (zh) Screen splicing system and method for processing video data streams
US11694316B2 (en) Method and apparatus for determining experience quality of VR multimedia
CN113163230B (zh) Video message generation method and apparatus, electronic device, and storage medium
JP2008140271A (ja) Dialogue apparatus and method
CN115802076A (zh) Distributed cloud rendering method and system for three-dimensional models, and electronic device
CN110415293B (zh) Interaction processing method, apparatus, system, and computer device
CN112188219B (zh) Video receiving method and apparatus, and video transmitting method and apparatus
KR101063153B1 (ko) Synchronization control system and method for stereoscopic image presentation
TWI705692B (zh) Method and device for sharing information in a three-dimensional scene model
WO2023088104A1 (zh) Video processing method and apparatus, electronic device, and storage medium
JP6149967B1 (ja) Video distribution server, video output device, video distribution system, and video distribution method
JP2015070418A (ja) Information processing device and program
Jinjia et al. Networked VR: State of the Art
WO2023103875A1 (zh) View switching method, apparatus, and system for free-viewpoint video
WO2018178510A2 (en) Video streaming
US20230412877A1 (en) Systems and methods for recommending content items based on an identified posture
JP6359870B2 (ja) Information processing device and moving image data playback method
CN115174978A (zh) Audio and picture synchronization method for a 3D digital human, and electronic device
JP2015070417A (ja) Communication system, information processing device, and program
JP2015070419A (ja) Terminal device and program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 18867929
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 18867929
    Country of ref document: EP
    Kind code of ref document: A1