WO2019100247A1 - Image display method, apparatus, device, and system applied to virtual reality - Google Patents

Image display method, apparatus, device, and system applied to virtual reality

Info

Publication number
WO2019100247A1
WO2019100247A1, PCT/CN2017/112307, CN2017112307W
Authority
WO
WIPO (PCT)
Prior art keywords
action
content
category
image frame
virtual reality
Prior art date
Application number
PCT/CN2017/112307
Other languages
English (en)
French (fr)
Inventor
贾伟杰
赵其勇
王娟娟
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2017/112307
Publication of WO2019100247A1

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/03: Arrangements for converting the position or the displacement of a member into a coded form
    • G06F 3/033: Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F 3/0346: Pointing devices with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • G06F 3/14: Digital output to display device; Cooperation and interconnection of the display device with other functional units

Definitions

  • The present application relates to the field of communications, and in particular to an image display method, apparatus, device, and system applied to virtual reality.
  • Virtual reality technology is a computer technology for creating and experiencing a virtual world. It uses a content providing device, such as a computer, to generate a virtual environment, and uses a sensor-equipped head-mounted display device (referred to as a helmet) as the virtual display device. The device allows the user to enter the virtual space and perceive and manipulate various virtual objects in the virtual environment in real time, thereby obtaining an immersive experience integrating vision, touch, and hearing.
  • a virtual reality device has two ways of displaying an image.
  • In the first way, the virtual reality device is connected to the content providing device through a High Definition Multimedia Interface (HDMI) for transmitting images and a Universal Serial Bus (USB) for transmitting action sequences, so that after the virtual reality device captures an action, the action sequence is sent to the content providing device over USB; the content providing device sequentially performs action analysis, coordinate calculation, scene matching, and screen rendering, and the YUV or RGB (color coding modes) sequence obtained after rendering is sent over HDMI to the virtual reality device, which displays it.
  • In the second way, a smart device such as a mobile phone is embedded in the virtual reality device. The smart device receives the video stream transmitted from the cloud (the video stream being obtained by the content providing device encoding and compressing a panoramic video) and decodes and decompresses it. After the virtual reality device captures an action, it sends the action sequence to the smart device, and the smart device extracts the field of view (FOV) image frame from the decoded and decompressed data according to the action, then renders and displays it. The virtual reality device magnifies the field of view image frame displayed by the smart device, so that the enlarged image fills the human visual field to achieve immersion.
  • In the second way, each FOV image frame covers approximately one eighth of the panoramic image frame, so the resolution of the FOV image frame is close to one eighth of the resolution of the panoramic image frame, only about 1280×720, that is, 720p. The view image frame displayed in response to each action therefore has low resolution, and resolution becomes a bottleneck.
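As a rough sanity check of the figures above, the pixel counts can be compared directly; the 4K panorama size below is an illustrative assumption, not a value from the application:

```python
# Rough arithmetic behind the "one eighth" claim (illustrative figures):
# a 4K equirectangular panorama versus a 720p field-of-view frame.
PANORAMA_W, PANORAMA_H = 3840, 2160   # assumed 4K panoramic frame
FOV_W, FOV_H = 1280, 720              # 720p FOV frame stated in the text

panorama_pixels = PANORAMA_W * PANORAMA_H   # total panorama pixels
fov_pixels = FOV_W * FOV_H                  # pixels actually in view

# The panorama holds roughly eight to nine FOV frames' worth of pixels,
# which is why per-action FOV extraction caps out near 720p.
ratio = panorama_pixels / fov_pixels
```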
  • An embodiment of the present application provides an image display method, apparatus, device, and system applied to virtual reality, to solve the problems that responding to each action requires a large transmission bit rate and that the view image frame displayed in response to each action has low resolution, which becomes a bottleneck.
  • A first aspect provides an image display method applied to virtual reality, the method comprising: capturing an action currently performed by a user; classifying the action according to the feature of the image change caused by the action to obtain a category identifier of the action; sending a content acquisition request including the category identifier to the content providing device; receiving the content transmitted by the content providing device; and displaying a visual field image frame according to the content.
  • The content is determined by the content providing device, which determines the category of the action according to the category identifier and finds the content acquisition policy corresponding to that category in a differentiated policy; the differentiated policy includes a content acquisition policy corresponding to each category.
  • For each category of action, a content acquisition policy may be set according to the feature of the image change, so that the content providing device, following that policy, sends only the changed image content to the virtual reality device. When only part of the image changes, the data amount of the changed content is usually small, and the transmission bit rate can be reduced; moreover, when the data amount of the changed content is small, increasing the resolution of the image frame does not occupy too much bandwidth, so the resolution of the field of view image can be raised, thereby improving the clarity of the user experience.
  • Regarding the content acquisition strategy: when the category is an action category that does not cause a data update of the panoramic image frame, the strategy is to acquire the panoramic video stream; when the category is an action category that causes a local data update of the view image frame, the strategy is to acquire the local difference pixels; and when the category is an action category that causes an overall data update of the view image frame, the strategy is to acquire the view image frame.
  • An action that does not cause a data update of the panoramic image frame may be a turning action or a low-speed moving action.
  • An action that causes a local data update of the view image frame may be a handle action or a gesture action.
  • An action that causes an overall data update of the view image frame may be a fast moving action or a sitting action.
  • The area outside the user's field of view may display a static picture, so that only the data within the user's field of view needs to be updated, reducing the data amount of the updated panoramic image frame.
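The three-way differentiated policy described above can be sketched as a simple lookup table; the category names and the helper function below are illustrative stand-ins, not the application's actual identifiers:

```python
# Sketch of the differentiated policy: each action category maps to a
# content acquisition strategy. Names are assumptions for this sketch.
PANORAMIC_STREAM = "acquire_panoramic_video_stream"
LOCAL_DIFF_PIXELS = "acquire_local_difference_pixels"
VIEW_IMAGE_FRAME = "acquire_view_image_frame"

# Differentiated policy: category identifier -> content acquisition strategy
DIFFERENTIATED_POLICY = {
    "no_panorama_update": PANORAMIC_STREAM,   # e.g. turning, slow movement
    "local_view_update": LOCAL_DIFF_PIXELS,   # e.g. handle or gesture action
    "full_view_update": VIEW_IMAGE_FRAME,     # e.g. fast movement, sitting
}

def content_acquisition_strategy(category_id: str) -> str:
    """Look up the strategy for a category identifier carried in a
    content acquisition request."""
    return DIFFERENTIATED_POLICY[category_id]
```

The content providing device would consult such a table when a content acquisition request arrives, so that only the changed content for that category is transmitted.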
  • Optionally, the method further includes: calculating the desired visual field coordinates according to the action, and detecting whether a panoramic video stream is stored locally; when a panoramic video stream is stored locally, reading the previous panoramic image frame from the panoramic video stream, cropping the rendered previous panoramic image frame according to the visual field coordinates, and displaying the view image frame obtained after cropping; when no panoramic video stream is stored locally, triggering the step of sending the content acquisition request including the category identifier to the content providing device.
  • When the action does not cause a data update of the panoramic image frame, the content providing device only needs to send the panoramic video stream to the virtual reality device the first time; after that, the virtual reality device does not need to obtain update data from the content providing device, that is, it crops the panoramic image frame in the locally stored panoramic video stream to complete the response to the current action, thereby avoiding the time consumed by data transmission and improving the response speed to the action.
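A minimal sketch of this client-side decision, with hypothetical names and a plain 2-D pixel array standing in for a decoded panoramic frame:

```python
# Client-side decision for actions that do not update the panorama:
# crop the cached panoramic frame if one is stored locally, otherwise
# fall back to a content acquisition request. Names are illustrative.
def respond_to_turn(local_panorama, fov_coords, request_content):
    """local_panorama: 2-D list of pixel rows, or None if not cached.
    fov_coords: (top, left, height, width) of the desired field of view.
    request_content: callable used when no panorama is stored locally."""
    if local_panorama is None:
        # No cached stream: trigger the content acquisition request path.
        return request_content()
    top, left, h, w = fov_coords
    # Crop the locally stored panoramic frame: no transmission needed.
    return [row[left:left + w] for row in local_panorama[top:top + h]]
```

Because the crop happens entirely on the device, a turn of the head costs no network round trip after the stream is first delivered.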
  • When the category is an action category that causes a local data update of the view image frame and the content is local difference pixels, the method includes: calculating the desired visual field coordinates and the action execution position coordinates according to the action; displaying the visual field image frame according to the content then includes: replacing the pixels at the position indicated by the action execution position coordinates in the previous visual field image frame with the local difference pixels, and displaying the visual field image frame obtained after the replacement.
  • The local difference pixels are pixels calculated by the content providing device according to the action sequence, the action execution position, and the content description determined by scene matching, when it determines that the desired visual field coordinates calculated from the action sequence of the action are the same as the visual field coordinates corresponding to the previous visual field image frame. When the content acquisition request carries the action sequence of the action, the action execution position coordinates and the visual field coordinates are calculated by the content providing device according to the action sequence; when the content acquisition request carries the action sequence, the action execution position coordinates, and the visual field coordinates, the action execution position coordinates and the visual field coordinates are read by the content providing device from the content acquisition request.
  • When the action causes a local data update of the view image frame, the virtual reality device needs to acquire only the changed local data, that is, the local difference pixels, from the content providing device. Since the local difference pixels have a small data amount, transmitting them requires a shorter duration, which also increases the response speed to the action.
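The pixel replacement step on the device side can be illustrated as follows; the data layout and names are assumptions for the sketch:

```python
# Applying local difference pixels: keep the previous field-of-view
# frame and overwrite only the region at the action execution position.
def apply_local_diff(prev_frame, diff_pixels, position):
    """prev_frame: mutable 2-D list of pixels (rows of values).
    diff_pixels: 2-D list holding only the changed region.
    position: (row, col) where the changed region starts."""
    row0, col0 = position
    for r, diff_row in enumerate(diff_pixels):
        for c, pixel in enumerate(diff_row):
            prev_frame[row0 + r][col0 + c] = pixel   # in-place patch
    return prev_frame
```

Only the patch travels over the link; the rest of the frame is reused as-is, which is where the bit-rate saving comes from.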
  • When the category is an action category that causes an overall data update of the view image frame and the content is a view image frame, displaying the visual field image frame according to the content includes: displaying the visual field image frame in the region indicated by the visual field coordinates.
  • The view image frame is an image frame calculated by the content providing device according to the visual field coordinates, the action sequence of the action, and the content description determined by scene matching. When the content acquisition request carries the action sequence of the action, the visual field coordinates are calculated by the content providing device according to the action sequence; when the content acquisition request carries the action sequence and the visual field coordinates, the visual field coordinates are read by the content providing device from the content acquisition request.
  • When the action causes an overall data update of the view image frame, the virtual reality device needs to acquire the view image frame from the content providing device. Since the data amount of the view image frame is smaller than that of the panoramic image frame, the time required to transmit it is shorter, and the response speed to the action can be improved; in addition, since the data amount of the view image frame is small, increasing its resolution does not occupy too much bandwidth, so the resolution of the view image frame can also be raised to improve the clarity of the user experience.
  • Optionally, the local difference pixels are content that is neither encoded nor compressed.
  • Since the data amount of the local difference pixels is small, they can be transmitted without encoding or compression, thereby saving the time for encoding and compression and improving the response speed to the action.
  • Optionally, the view image frame is content obtained by intra-frame compression.
  • When intra-frame compression is performed on the view image frame, the compression ratio is not as high as that of inter-frame compression algorithms such as Advanced Video Coding (AVC) and High Efficiency Video Coding (HEVC), but the intra-frame compression algorithm takes less time and can improve the response speed to the action.
  • A second aspect provides an image display method applied to virtual reality, the method comprising: receiving a content acquisition request sent by a virtual reality device, where the content acquisition request includes a category identifier of an action, the action being the action currently performed by the user as captured by the virtual reality device, and the category identifier being obtained by classifying the action according to the feature of the image change caused by the action; determining the category of the action according to the category identifier; finding the content acquisition policy corresponding to the category in a differentiated policy, the differentiated policy including a content acquisition policy corresponding to each category; determining the content to send to the virtual reality device according to the content acquisition policy; and sending the content to the virtual reality device for display.
  • For each category of action, a content acquisition policy may be set according to the feature of the image change, so that the content providing device, following that policy, sends only the changed image content to the virtual reality device. When only part of the image changes, the data amount of the changed content is usually small, and the transmission bit rate can be reduced; moreover, when the data amount of the changed content is small, increasing the resolution of the image frame does not occupy too much bandwidth, so the resolution of the field of view image can be raised, thereby improving the clarity of the user experience.
  • Regarding the content acquisition strategy: when the category is an action category that does not cause a data update of the panoramic image frame, the strategy is to acquire the panoramic video stream; when the category is an action category that causes a local data update of the view image frame, the strategy is to acquire the local difference pixels; and when the category is an action category that causes an overall data update of the view image frame, the strategy is to acquire the view image frame.
  • An action that does not cause a data update of the panoramic image frame may be a turning action or a low-speed moving action.
  • An action that causes a local data update of the view image frame may be a handle action or a gesture action.
  • An action that causes an overall data update of the view image frame may be a fast moving action or a sitting action.
  • The area outside the user's field of view (such as the left and right sides and the back) can display a static picture; in this way, only the data within the user's field of view needs to be updated, reducing the data amount for updating the panoramic image frame.
  • Determining the content to send to the virtual reality device according to the content acquisition policy includes: acquiring the panoramic video stream according to the content acquisition policy, and determining the panoramic video stream as the content to send to the virtual reality device.
  • When the action does not cause a data update of the panoramic image frame, the content providing device only needs to send the panoramic video stream to the virtual reality device the first time; after that, the virtual reality device does not need to obtain update data from the content providing device, that is, it crops the panoramic image frame in the locally stored panoramic video stream to complete the response to the current action, thereby avoiding the time consumed by data transmission and improving the response speed to the action.
  • When the content is local difference pixels, determining the content to send to the virtual reality device according to the content acquisition policy includes: when the content acquisition request carries the action sequence of the action, calculating the action execution position coordinates and the desired visual field coordinates according to the action sequence, determining that the content acquisition strategy is to acquire the local difference pixels when the visual field coordinates are determined to be the same as the visual field coordinates corresponding to the previous visual field image frame displayed by the virtual reality device, calculating the local difference pixels according to the action sequence, the action execution position, and the content description determined by scene matching, and determining the local difference pixels as the content to send to the virtual reality device; when the content acquisition request carries the action sequence, the action execution position coordinates, and the desired visual field coordinates, reading the action execution position coordinates and the visual field coordinates from the content acquisition request and, when the visual field coordinates are determined to be the same as the visual field coordinates corresponding to the previous visual field image frame of the virtual reality device, calculating the local difference pixels in the same way and determining them as the content to send to the virtual reality device.
  • When the action causes a local data update of the view image frame, the virtual reality device needs to acquire the changed local data, that is, the local difference pixels, from the content providing device. Since the data amount of the local difference pixels is small, transmitting them requires a shorter duration, which can also increase the response speed to the action.
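On the server side, the local difference pixels can be found by comparing the newly rendered frame with the previous one once the visual field coordinates are confirmed to match; the bounding-box approach and names below are one plausible realization, not claim language:

```python
# Server-side sketch: find the smallest rectangle of pixels that differ
# between the previous and newly rendered frames of the same view.
def local_difference(prev_frame, new_frame):
    """Return ((row, col), diff_pixels) covering the smallest rectangle
    of changed pixels, or None when the frames are identical."""
    changed = [(r, c)
               for r, (pr, nr) in enumerate(zip(prev_frame, new_frame))
               for c, (p, n) in enumerate(zip(pr, nr)) if p != n]
    if not changed:
        return None                      # nothing to transmit at all
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    r0, r1 = min(rows), max(rows)
    c0, c1 = min(cols), max(cols)
    # Slice the changed rectangle out of the new frame.
    diff = [new_frame[r][c0:c1 + 1] for r in range(r0, r1 + 1)]
    return (r0, c0), diff
```

The returned position plays the role of the action execution position coordinates, and the rectangle is what would be sent, unencoded, to the device.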
  • Optionally, sending the content to the virtual reality device for display comprises: transmitting the local difference pixels, without encoding or compression, to the virtual reality device for display.
  • Since the data amount of the local difference pixels is small, they can be transmitted without encoding or compression, thereby saving the time for encoding and compression and improving the response speed to the action.
  • When the content is a view image frame, determining the content to send to the virtual reality device according to the content acquisition policy includes: when the content acquisition request carries the action sequence of the action, determining that the content acquisition strategy is to acquire the view image frame, calculating the desired visual field coordinates according to the action sequence, calculating the view image frame according to the visual field coordinates, the action sequence, and the content description determined by scene matching, and determining the view image frame as the content to send to the virtual reality device; when the content acquisition request carries the action sequence of the action and the desired visual field coordinates, determining that the content acquisition strategy is to acquire the view image frame, reading the visual field coordinates from the content acquisition request, calculating the view image frame according to the visual field coordinates, the action sequence, and the content description determined by scene matching, and determining the view image frame as the content to send to the virtual reality device.
  • When the action causes an overall data update of the view image frame, the virtual reality device needs to acquire the view image frame from the content providing device. Since the data amount of the view image frame is smaller than that of the panoramic image frame, the time required to transmit it is shorter, and the response speed to the action can be improved; in addition, since the data amount of the view image frame is small, increasing its resolution does not occupy too much bandwidth, so the resolution of the view image frame can also be raised to improve the clarity of the user experience.
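A toy illustration of the server-side calculation and of why the view frame is cheaper to send than the panorama; the flat pixel arrays are an assumption for the sketch, since a real system would crop on the panorama's projection (e.g. equirectangular):

```python
# Server-side sketch: produce a view image frame by cropping the
# rendered panoramic frame at the requested visual field coordinates,
# then compare payload sizes. Names are illustrative.
def view_image_frame(panorama, fov_coords):
    """panorama: 2-D list of pixel rows; fov_coords: (top, left, h, w)."""
    top, left, h, w = fov_coords
    return [row[left:left + w] for row in panorama[top:top + h]]

def payload_ratio(panorama, frame):
    """How many times smaller the view frame is than the full panorama."""
    pano_px = len(panorama) * len(panorama[0])
    frame_px = len(frame) * len(frame[0])
    return pano_px / frame_px
```

Because only the cropped frame crosses the network, the same link budget can carry a higher-resolution view frame than it could a full panorama.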
  • Optionally, sending the content to the virtual reality device for display includes: performing intra-frame compression on the view image frame, and transmitting the compressed view image frame to the virtual reality device for display.
  • When intra-frame compression is performed on the view image frame, the compression ratio is not as high as that of inter-frame compression algorithms such as AVC and HEVC, but the time required for intra-frame compression is small, and the response speed to the action can be improved.
  • A third aspect provides an image display apparatus applied to virtual reality, the apparatus having the function of implementing the image display method applied to virtual reality provided by the first aspect and the possible implementations of the first aspect.
  • The functions may be implemented by hardware, or by hardware executing corresponding software.
  • The hardware or software includes one or more units corresponding to the functions described above.
  • A fourth aspect provides an image display apparatus applied to virtual reality, the apparatus having the function of implementing the image display method applied to virtual reality provided by the second aspect and the possible implementations of the second aspect.
  • The functions may be implemented by hardware, or by hardware executing corresponding software.
  • The hardware or software includes one or more units corresponding to the functions described above.
  • A fifth aspect provides an image display device applied to virtual reality, comprising a processor and a memory connected to the processor; the processor implements, by executing the programs or instructions stored in the memory, the image display method applied to virtual reality provided by the first aspect and the possible implementations of the first aspect.
  • A sixth aspect provides an image display device applied to virtual reality, comprising a processor and a memory connected to the processor; the processor implements, by executing the programs or instructions stored in the memory, the image display method applied to virtual reality provided by the second aspect and the possible implementations of the second aspect.
  • A computer readable storage medium is provided, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the image display method applied to virtual reality provided by the first aspect and the possible implementations of the first aspect.
  • A computer readable storage medium is provided, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the image display method applied to virtual reality provided by the second aspect and the possible implementations of the second aspect.
  • An image display system applied to virtual reality is provided, comprising the image display apparatus applied to virtual reality according to the third aspect and the image display apparatus applied to virtual reality according to the fourth aspect.
  • An image display system applied to virtual reality is provided, comprising the image display device applied to virtual reality according to the fifth aspect and the image display device applied to virtual reality according to the sixth aspect.
  • FIG. 1 is a schematic structural diagram of a virtual reality system according to an exemplary embodiment of the present application.
  • FIG. 2 is a block diagram of a virtual reality device and a content providing device provided by an exemplary embodiment of the present application
  • FIG. 3 is a schematic structural diagram of a virtual reality device or a content providing device according to an exemplary embodiment of the present application
  • FIG. 4A is a block diagram of an implementation of a virtual reality device according to an exemplary embodiment of the present application.
  • FIG. 4B is a block diagram of an implementation of a virtual reality device according to an exemplary embodiment of the present application.
  • FIG. 5 is a flowchart of an image display method applied to virtual reality according to an exemplary embodiment of the present application.
  • FIG. 6 is a flowchart of a process after a virtual reality device captures an action according to an exemplary embodiment of the present application
  • FIG. 7 is a flowchart of a process after the content providing device receives the content obtaining request according to an exemplary embodiment of the present application
  • FIG. 8 is a structural diagram of an image display apparatus applied to virtual reality according to an exemplary embodiment of the present application.
  • FIG. 9 is a structural diagram of an image display device applied to virtual reality according to an exemplary embodiment of the present application.
  • In this application, "unit" refers to a functional structure divided logically; a "unit" can be implemented by pure hardware or by a combination of hardware and software.
  • FIG. 1 is a schematic structural diagram of a virtual reality system 100 provided by an exemplary embodiment of the present application.
  • the virtual reality system includes a virtual reality device 110, a handle 120, and a content providing device 130.
  • the virtual reality device 110 is connected to the handle 120 and the content providing device 130, respectively.
  • In this embodiment, the virtual reality device 110 is described by taking a head mounted display as an example.
  • A head mounted display is a display worn on the user's head to present images.
  • The head mounted display generally includes a wearing portion and a display portion: the wearing portion includes temples and an elastic band for wearing the head mounted display on the user's head, and the display portion includes a left-eye display and a right-eye display.
  • the head-mounted display can display different images on the left-eye display and the right-eye display, thereby simulating a three-dimensional virtual environment for the user.
  • A head mounted display is provided with a motion sensor for capturing the user's head motion, so that a smart device such as a mobile phone changes the picture displayed in the head mounted display accordingly.
  • the head mounted display is electrically connected to the smart device through a flexible circuit board or a hardware interface or a data line.
  • The smart device is configured to collect data reported by local sensors (on the virtual reality device and/or the human body) to determine the action performed by the user, receive the video stream sent by the content providing device 130, decode the video stream, and render (reconstruct) and display the frame.
  • the smart device may be integrated in the interior of the head mounted display, or may be integrated in other devices than the head mounted display, which is not limited in this embodiment.
  • the smart device is integrated into the interior of the head mounted display as an example for description.
  • the other device may be a desktop computer or a server, etc., which is not limited in this embodiment.
  • the smart device receives an input signal of the handle 120 and generates a display screen of the head mounted display based on the input signal.
  • Smart devices are typically implemented by electronics such as a processor, a memory, and an image processing device placed on a circuit board.
  • Optionally, the smart device further includes an image capture device for capturing the user's head motion and changing the picture displayed in the head mounted display according to the user's head motion.
  • the content providing device 130 can be implemented as a server, which is a background server of the virtual reality device 110 at this time.
  • The content providing device 130 can be one server, a server cluster composed of multiple servers, or a cloud computing center.
  • the virtual reality device 110 can be electrically connected to the content providing device 130 through a flexible circuit board or a hardware interface or a data line or a wireless network.
  • FIG. 2 shows a block diagram of a virtual reality device and a content providing device.
  • The virtual reality device includes a motion capture module, an action classification module, a desired visual field calculation module, a cropping and difference acquisition decision module, a frame rendering (reconstruction) module, and a frame scan output module (including vertical synchronization); the content providing device includes a scene matching module and a differentiated policy execution module.
  • the action classification module, the cropping and difference acquisition decision module, and the frame rendering (reconstruction) module in the virtual reality device are newly added modules, and the differentiated policy execution module in the content providing device is a newly added module.
  • the action classification module is configured to classify the captured actions and output the category identifier of the category to which the action belongs.
  • The cropping and difference acquisition decision module is configured to crop the locally stored panoramic image frame in response to the action, or, when the difference is a local difference pixel or a view image frame, to acquire the local difference pixels or the whole view image frame.
  • "local” appearing here and below refers to a virtual reality device, which will not be described below.
  • the differentiation policy execution module is configured to determine the classification to which the action belongs according to the category identifier of the action, determine the content acquisition policy according to the classification, and determine the content sent to the virtual reality device according to the content acquisition policy.
  • the frame rendering (reconstruction) module generates a new field of view image frame by combining the content acquired by the differentiation strategy with the local content.
  • the virtual reality device does not need to perform frame rendering (reconstruction), and its rendering function can be integrated on the content providing device, and is realized by the powerful graphics processing capability of the cloud.
  • FIG. 3 is a schematic structural diagram of a virtual reality device or content providing device 300 according to another exemplary embodiment of the present application.
  • the virtual reality device 300 can be the virtual reality device 140 shown in FIG. 1 , and the virtual reality device includes a processor 320 and a transceiver 340 connected to the processor 320 .
  • the transceiver 340 can be comprised of one or more antennas that enable the virtual reality device 300 to transmit or receive radio signals.
• the transceiver 340 can be coupled to a communication circuit 360 that performs various processing on signals received or transmitted via the transceiver 340, such as modulating signals to be transmitted and demodulating received signals. In actual implementation, the communication circuit 360 can be composed of a radio frequency (RF) chip and a baseband chip.
  • Communication circuit 360 can be coupled to processor 320.
  • the optional communication circuit 360 can also be integrated in the processor 320.
• the processor 320 is the control center of the virtual reality device, and the processor 320 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
  • Processor 320 may also further include a hardware chip.
  • the hardware chip may be an application specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
• the above PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
  • the memory 380 is connected to the processor 320 by a bus or other means.
  • the memory 380 may be a volatile memory, a non-volatile memory, or a combination thereof.
• the volatile memory can be a random access memory (RAM), such as static random access memory (SRAM) or dynamic random access memory (DRAM).
• the non-volatile memory can be a read-only memory (ROM), such as programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), or electrically erasable programmable read-only memory (EEPROM).
• the non-volatile memory can also be a flash memory, or a magnetic memory such as a magnetic tape, a floppy disk, or a hard disk.
  • the non-volatile memory can also be an optical disc.
  • the panoramic video stream, the panoramic image frame, the view image frame, the type identification of the action, the action sequence, and the like may be stored in the memory 380 of the virtual reality device.
  • the memory 380 of the content providing device may store a differentiation policy, a virtual reality content source, an action sequence, a view coordinate, and the like.
• the virtual reality content source may be a panoramic video captured by a camera, or may be computer graphics (CG) content, which is not limited in this embodiment.
• FIG. 4A and FIG. 4B illustrate two practical modes of the virtual reality device in the prior art, where FIG. 4A is a block diagram of the first display mode, and FIG. 4B is a block diagram of the second display mode.
  • the virtual reality device obtains content through the network, which is a huge challenge to bandwidth and delay, and the industry has been studying corresponding optimization solutions.
  • the first optimization scheme is to compress a large panoramic video stream, but there is still a stable bandwidth requirement of about 500 Mbps (megabits per second) after compression.
  • Table 1 below illustrates two devices as an example.
• the second optimization scheme is to encode the panoramic video stream with AVC or HEVC.
• for a 4K panoramic video stream at 30 FPS (frames transmitted per second), the resolution of the image frame within the field of view is only about 720P (1280*720), so the definition is very poor.
  • Table 2 below illustrates two panoramic video stream resolutions as an example.
• This embodiment provides an image display method applied to virtual reality to solve the above problem.
  • FIG. 5 is a flowchart of an image display method applied to virtual reality provided by an exemplary embodiment of the present application.
• This embodiment is illustrated with the method applied to the virtual reality system shown in FIG. 1. The method includes the following steps:
  • step 501 the virtual reality device captures an action currently performed by the user.
  • the virtual reality device captures the action performed by the user on the handle. For example, if the virtual reality device is currently displaying the shot screen and the user operates the handle to shoot, the virtual reality device captures that the user performs the shooting action.
• the virtual reality device captures the action performed by the user through a sensor. For example, if the virtual reality device is currently displaying a picture of the user sitting at a table and the user makes a gesture of picking up the cup on the table, the virtual reality device captures that the user performs a cup-picking action. For another example, if the user turns his head, the virtual reality device captures that the user performs a turning action. For another example, if the user is walking slowly, the virtual reality device captures that the user performs a low-speed moving action, where the speed of the low-speed movement is less than a preset threshold.
• For another example, if the user moves quickly, the virtual reality device captures that the user performs a fast moving action, where the speed of the fast movement is higher than the preset threshold. For another example, if the user is sitting quietly watching a movie or a TV show, the virtual reality device captures that the user performs a sitting-still action.
• the virtual reality device can also capture other actions performed by the user through a handle or a gesture, for example, a moving action, a weapon-waving action, a throwing action, or a tapping operation; this embodiment does not limit the actions.
• an object has 6 degrees of freedom in space, namely translation along the three orthogonal coordinate axes x, y, and z and rotation around the three axes; when motion capture is performed with 6 degrees of freedom, the immersive experience of virtual reality is better.
  • the sensor can capture the action performed by the user based on 3 degrees of freedom or 6 degrees of freedom.
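• The low-speed/fast-movement distinction above hinges on a single preset speed threshold. A minimal sketch follows; the threshold value of 1.0 m/s is an illustrative assumption, not a figure from this document:

```python
def classify_movement(speed_m_per_s: float, threshold_m_per_s: float = 1.0) -> str:
    """Label a captured movement relative to the preset speed threshold.

    Below the threshold the action is treated as low-speed movement,
    at or above it as fast movement (the threshold itself is hypothetical).
    """
    return "low_speed_move" if speed_m_per_s < threshold_m_per_s else "fast_move"
```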
  • Step 502 The virtual reality device classifies the action according to the feature of the image change caused by the action, and obtains the category identifier of the action.
  • the feature of image change refers to the feature of data update caused by image change.
• for example, the feature of the data update may be that no data is updated, that local data is updated, or that overall data is updated, which is not limited in this embodiment.
• Before analyzing the features of the image changes caused by actions, this embodiment first analyzes the motion-to-photon (MTP) latency requirements of the six types of actions mentioned above.
• head movement is sensitive to latency, for example turning left, turning right, raising the head, lowering the head, and other everyday head movements.
• the MTP latency requirement of such actions is very strict and needs to be no more than 20 ms. If the picture lags behind the motion of the user's head, that is, the head turns but the expected picture is not seen in time, dizziness results.
• during low-speed movement, the main action is still the tracking of tiny rotations of the user's head, so the MTP latency requirement is also less than 20 ms; the user's moving speed relative to the picture ahead tends to be small, so the relative change of the picture in the field of view is small.
• compared with low-speed movement, fast movement requires a wide range of the picture in the field of view to change rapidly, and the MTP latency requirement is still less than 20 ms.
• when sitting still, the user's action is mainly looking ahead, occasionally turning the head to look around, and the MTP latency requirement is less than 20 ms.
• the latency tolerance of the gesture action and the handle action is larger, which resolves the problem in the related art that all actions are treated as requiring the same small latency, so the transmission requirement can be relaxed for these actions.
• the actions can be marked. For example, mark the head-turning action as 001, the handle action as 010, the gesture action as 011, the low-speed movement as 100, the fast movement as 101, and the sitting-still action as 110.
  • the category identifier may be marked by other methods. This embodiment does not limit the form of the category identifier.
• in addition, a category identifier 111 may be defined to instruct the content providing device to send the panoramic video stream.
  • the action is classified according to the above categories, and the category identifier corresponding to the category is obtained. For example, if the action captured by the virtual reality device is a turning action, the category identifier is 001.
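• The 3-bit marking above can be sketched as a small enumeration. The encoding follows the example marking in the text (001 turning, 010 handle, 011 gesture, 100 low-speed, 101 fast, 110 sitting still, 111 request for the panoramic stream); the enum and function names themselves are hypothetical:

```python
from enum import Enum

class ActionCategory(Enum):
    """Illustrative 3-bit category identifiers from the example marking."""
    TURN_HEAD = 0b001      # head-turning action
    HANDLE = 0b010         # handle (controller) action
    GESTURE = 0b011        # gesture action
    MOVE_SLOW = 0b100      # low-speed movement
    MOVE_FAST = 0b101      # fast movement
    SIT_STILL = 0b110      # sitting still, e.g. watching a movie
    NEED_PANORAMA = 0b111  # instruct the provider to send the panoramic stream

def category_id(action_name: str) -> int:
    """Return the category identifier carried in the content acquisition request."""
    return ActionCategory[action_name].value
```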
  • Step 503 The virtual reality device sends a content acquisition request including the category identifier to the content providing device.
• the virtual reality device can further group the action categories. For example, when the head turns, the surrounding picture needs to be shown in the field of view, and a field-of-view image frame suitable for the view coordinates after the turn can be cropped from the local panoramic image frame in response to the action, without acquiring update data of the panoramic image frame from the content providing device; that is, the turn does not cause a data update of the panoramic image frame. Similarly, the surrounding picture does not change during low-speed movement: when the user moves forward at a low speed, the region of the panorama within the field of view becomes smaller and richer in detail, and a field-of-view image frame suitable for the view coordinates after the movement can be cropped from the local panoramic image frame, without acquiring update data from the content providing device. Therefore, the head turn and the low-speed movement can be grouped into the action category that does not cause a data update of the panoramic image frame.
• when the action is performed via the handle or a gesture, a special effect needs to be displayed at the position indicated by the action execution position coordinates of the handle or gesture while the content at other positions in the field-of-view image frame does not change, so the locally updated data of the field-of-view image frame needs to be acquired from the content providing device. Therefore, the handle action and the gesture action can be grouped into the action category that causes a local data update of the field-of-view image frame.
• during fast movement, the overall data of the field-of-view image frame changes, and the overall update data of the frame needs to be acquired from the content providing device; similarly, when sitting still, the picture of the movie or TV show changes in real time, and the overall update data of the field-of-view image frame also needs to be acquired. Therefore, the fast movement and the sitting-still action can be grouped into the action category that causes an overall data update of the field-of-view image frame.
• when the category is the action category that does not cause a data update of the panoramic image frame, the method further includes: calculating the desired view coordinates according to the action, and detecting whether a panoramic video stream is stored locally; when the panoramic video stream is stored locally, cropping the local panoramic video according to the view coordinates and displaying the field-of-view image frame obtained after cropping; when the panoramic video stream is not stored locally, triggering the sending of the content acquisition request that includes the category identifier. In this case the content acquisition strategy is to acquire the panoramic video stream; that is, when no panoramic video stream is stored in the virtual reality device, the device sends the content acquisition request carrying the category identifier, so that the content providing device sends the panoramic video stream to the virtual reality device based on the request.
• the virtual reality device may carry the action sequence and the category identifier in the content acquisition request, or carry the action sequence in the content acquisition request with the category identifier carried inside the action sequence; this embodiment does not limit the content acquisition request.
• the virtual reality device can calculate the desired view coordinates, and, when the action is a head-turning action, crop from the panoramic image frame in the local panoramic video stream a field-of-view image frame suitable for the view coordinates after the turn, and display the cropped field-of-view image frame. For example, when the user turns the head 30 degrees to the left, the appropriate field-of-view image frame is cropped from the panoramic image frame in the panoramic video stream according to the new view coordinates.
• during low-speed movement, the region of the panorama within the field of view becomes smaller and richer in detail, and a field-of-view image frame suitable for the view coordinates after the movement can be cropped from the panoramic image frame in the local panoramic video stream.
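• The local cropping step above can be sketched as follows. The panoramic frame is modeled as a plain 2-D pixel grid and the view coordinates as a top-left offset plus a size; a real implementation would crop an equirectangular projection, which is omitted here, so this is only an illustrative approximation:

```python
def crop_view(panorama, top, left, height, width):
    """Cut the field-of-view image frame out of a locally stored panoramic frame.

    panorama: 2-D grid (list of rows) of pixels; (top, left, height, width)
    derive from the desired view coordinates computed from the captured action.
    """
    return [row[left:left + width] for row in panorama[top:top + height]]
```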
• when the category is the action category that causes a local data update of the field-of-view image frame, the method further includes: calculating the desired view coordinates and the action execution position coordinates according to the action.
• in this case the content acquisition strategy is to acquire local difference pixels; that is, the virtual reality device needs to acquire local difference pixels from the content providing device. Since the virtual reality device needs to determine at which position of the field-of-view image frame to apply the local difference pixels, it also needs to calculate the desired view coordinates and the action execution position coordinates according to the handle or gesture, where the action execution position coordinates are the position coordinates at which the local difference pixels are displayed.
• for example, when the user makes a gesture of picking up a cup, it is necessary to calculate the view coordinates and the position coordinates of the hand, and use the position coordinates of the hand as the action execution position coordinates, so that the local difference pixels of the cup are displayed at the position of the hand in the field-of-view image frame.
• for another example, when the user fires with the handle, it is necessary to determine the view coordinates and the handle position coordinates, calculate the coordinates of the contact position of the bullet with the display screen when the bullet is fired from the handle position, and use the contact position coordinates as the action execution position coordinates, so that the local difference pixels of the bullet effect are displayed at the contact position in the field-of-view image frame.
• the virtual reality device may carry the action sequence and the category identifier in the content acquisition request; or carry the view coordinates, the action execution position coordinates, the category identifier, and the action sequence in the request; or carry the action sequence in the request with the view coordinates, the action execution position coordinates, and the category identifier carried inside the action sequence. This embodiment does not limit the content acquisition request.
• when the category is the action category that causes an overall data update of the field-of-view image frame, the method further includes: calculating the desired view coordinates according to the action.
• in this case the content acquisition strategy is to acquire a field-of-view image frame; that is, the virtual reality device needs to acquire a field-of-view image frame from the content providing device. Since the device needs to determine which field-of-view image frame data to update, it also needs to calculate the desired view coordinates.
• the virtual reality device may carry the action sequence and the category identifier in the content acquisition request; or carry the view coordinates, the category identifier, and the action sequence in the request; or carry the action sequence in the request with the view coordinates and the category identifier carried inside the action sequence. This embodiment does not limit the content acquisition request.
• the virtual reality device may further carry the identifier of the virtual reality device or the identifier of the panoramic image frame in the content acquisition request, so that the content providing device determines the panoramic image frame displayed by the virtual reality device according to the identifier and then selects the local difference pixels or the field-of-view image frame from that panoramic image frame.
  • Step 504 The content providing device receives a content acquisition request sent by the virtual reality device.
  • Step 505 The content providing device determines the category of the action according to the category identifier.
• for example, when the category identifier received by the content providing device is 010, it determines that the action is a handle action; when the category identifier received is 111, it determines that the virtual reality device does not store the panoramic video stream locally and that the panoramic video stream needs to be sent to the virtual reality device.
  • Step 506 The content providing device searches for a content acquisition policy corresponding to the category in the differentiated policy, where the differentiated policy includes a content acquisition policy corresponding to each category.
  • the differentiation strategy includes the correspondence between categories and content acquisition strategies.
  • the content acquisition strategy corresponding to the action category that does not cause the data update of the panoramic image frame is to acquire the panoramic video stream;
  • the content acquisition strategy corresponding to the action category that causes the local data update of the view image frame is to acquire the local difference pixel;
• the content acquisition strategy corresponding to the action category that causes the overall data update of the view image frame is to acquire the view image frame.
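• The differentiated policy of step 506 can be sketched as a lookup table from category identifier to content acquisition strategy. The category codes reuse the example marking from the text; the strategy names and the function are illustrative assumptions:

```python
DIFFERENTIATED_POLICY = {
    0b001: "send_panoramic_stream",   # turning: no panoramic-frame data update
    0b100: "send_panoramic_stream",   # low-speed movement
    0b010: "send_local_diff_pixels",  # handle action: local data update
    0b011: "send_local_diff_pixels",  # gesture action
    0b101: "send_view_image_frame",   # fast movement: overall data update
    0b110: "send_view_image_frame",   # sitting still
    0b111: "send_panoramic_stream",   # device has no panorama cached locally
}

def content_acquisition_policy(category_id: int) -> str:
    """Step 506: look up the content acquisition strategy for an action category."""
    return DIFFERENTIATED_POLICY[category_id]
```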
  • Step 507 The content providing device determines the content that is sent to the virtual reality device according to the content acquisition policy.
• In conjunction with step 503, the flow of determining the content sent to the virtual reality device by the content providing device is explained below.
• when the category is the action category that does not cause a data update of the panoramic image frame, determining the content sent to the virtual reality device according to the content acquisition policy comprises: acquiring the panoramic video stream according to the content acquisition policy, and determining the panoramic video stream as the content sent to the virtual reality device.
• when the category is the action category that causes a local data update of the field-of-view image frame, the content is local difference pixels, and determining the content sent to the virtual reality device according to the content acquisition policy includes: when the content acquisition request carries an action sequence, calculating the action execution position coordinates and the desired view coordinates according to the action sequence; when the view coordinates are the same as the view coordinates corresponding to the previous image frame displayed by the virtual reality device, determining that the content acquisition strategy is to obtain local difference pixels, calculating the local difference pixels according to the action sequence, the action execution position, and the content description determined by scenario matching, and determining the local difference pixels as the content sent to the virtual reality device.
• alternatively, when the content acquisition request carries them, the action execution position coordinates and the view coordinates are read from the content acquisition request; when the view coordinates are the same as those corresponding to the previous image frame displayed by the virtual reality device, the content acquisition strategy is determined to be obtaining local difference pixels, the local difference pixels are calculated according to the action sequence, the action execution position, and the content description determined by scenario matching, and the local difference pixels are determined as the content sent to the virtual reality device.
  • the scenario matching is to determine a desired content description after the differentiation strategy is determined, and calculate a local difference pixel according to the action sequence, the content description, and the action execution position.
  • the content description is used to determine the content of the three-dimensional image.
• when the content acquisition request does not carry the view coordinates and the action execution position coordinates, the content providing device needs to calculate them according to the action sequence; when the content acquisition request carries the view coordinates and the action execution position coordinates, the content providing device reads them directly from the content acquisition request.
• the content providing device compares the view coordinates with the view coordinates corresponding to the previous field-of-view image frame. When the two are the same, the local difference pixels are calculated according to the action sequence, the action execution position, and the content description determined by scenario matching, and serve as the content sent to the virtual reality device; when the two are different, the field-of-view image frame is calculated according to the action sequence and the content description determined by scenario matching, and serves as the content sent to the virtual reality device.
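• The comparison just described can be sketched as a small decision function; the coordinate representation and return labels are illustrative assumptions:

```python
def decide_content(view_coord, prev_view_coord):
    """Step 507 for handle/gesture actions: choose what to send to the device."""
    if view_coord == prev_view_coord:
        # Field of view unchanged: only the pixels around the action execution
        # position differ, so local difference pixels suffice.
        return "local_diff_pixels"
    # Field of view moved: the whole view image frame must be recomputed.
    return "view_image_frame"
```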
• when the category is the action category that causes an overall data update of the field-of-view image frame, the content is a field-of-view image frame, and determining the content sent to the virtual reality device according to the content acquisition policy includes: when the content acquisition request carries the action sequence of the action, determining that the content acquisition strategy is to acquire a field-of-view image frame, calculating the desired view coordinates according to the action sequence, calculating the field-of-view image frame according to the view coordinates, the action sequence, and the content description determined by scenario matching, and determining the field-of-view image frame as the content sent to the virtual reality device.
• alternatively, when the content acquisition request carries the view coordinates, the content acquisition strategy is determined to be acquiring the field-of-view image frame, the view coordinates are read from the content acquisition request, the field-of-view image frame is calculated according to the view coordinates, the action sequence, and the content description determined by scenario matching, and the field-of-view image frame is determined as the content sent to the virtual reality device.
  • the scenario matching is to determine a desired content description after the differentiation strategy is determined, and calculate a visual field image frame according to the action sequence, the content description, and the visual field coordinates.
  • the content description is used to determine the content of the three-dimensional image.
• when the content acquisition request does not carry the view coordinates, the content providing device needs to calculate the view coordinates according to the action sequence; when the content acquisition request carries the view coordinates, the content providing device reads them directly from the content acquisition request.
• the content providing device then calculates the field-of-view image frame according to the view coordinates, the action sequence, and the content description determined by scenario matching, as the content transmitted to the virtual reality device.
• when the content acquisition request carries the identifier of the virtual reality device or of the panoramic image frame, the content providing device determines the panoramic image frame displayed by the virtual reality device according to the identifier, and selects the local difference pixels or the field-of-view image frame from that panoramic image frame.
  • Step 508 The content providing device sends the content to the virtual reality device for display.
  • the content providing device can compress the panoramic video stream and send the compressed panoramic video stream to the virtual reality device to save transmission bandwidth.
• when the content is local difference pixels, sending the content to the virtual reality device for display comprises: transmitting the unencoded, uncompressed local difference pixels to the virtual reality device for display.
• because the data amount of the local difference pixels is small, the transmission can be performed without encoding and without compression, thereby saving the time spent on encoding and compression and improving the response speed to the action.
• when the content is a field-of-view image frame, sending the content to the virtual reality device for display includes: performing intra-frame compression on the field-of-view image frame and transmitting the compressed frame to the virtual reality device for display.
• when intra-frame compression is performed on a field-of-view image frame, although the compression ratio is not as high as that of inter-frame compression algorithms such as AVC and HEVC, the time required for intra-frame compression is short, which improves the response speed to the action.
  • Step 509 The virtual reality device receives the content sent by the content providing device.
  • step 510 the virtual reality device displays the view image frame according to the content.
• when the content is the panoramic video stream, the virtual reality device may select a field-of-view image frame from a panoramic image frame of the panoramic video stream, render the frame, then apply vertical synchronization, and finally scan out the field-of-view image frame for display.
• when the category is the action category that causes a local data update of the field-of-view image frame and the content is local difference pixels, displaying the field-of-view image frame according to the content includes: replacing, with the local difference pixels, the pixels corresponding to the action execution position coordinates in the previous field-of-view image frame, and displaying the field-of-view image frame obtained after the replacement.
• that is, the virtual reality device replaces the pixels corresponding to the action execution position coordinates in the previous field-of-view image frame with the local difference pixels, performing frame reconstruction, then applies vertical synchronization, and finally scans out the field-of-view image frame.
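• The frame reconstruction just described can be sketched as an in-place patch of the previous field-of-view image frame. Frames are modeled here as mutable 2-D grids; an actual device would patch a GPU texture, so this is only an illustrative approximation:

```python
def reconstruct_frame(prev_frame, diff_pixels, top, left):
    """Replace the pixels at the action execution position with the received
    local difference pixels, yielding the new field-of-view image frame."""
    for r, row in enumerate(diff_pixels):
        for c, pixel in enumerate(row):
            prev_frame[top + r][left + c] = pixel
    return prev_frame
```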
• when the category is the action category that causes an overall data update of the field-of-view image frame and the content is a field-of-view image frame, displaying the field-of-view image frame according to the content includes: displaying the field-of-view image frame in the region indicated by the view coordinates.
  • the virtual reality device renders the view image frame, and then combines the vertical synchronization, and finally scans and outputs the view image frame.
  • FIG. 6 shows a process flow diagram after the virtual reality device captures the action.
  • FIG. 7 shows a processing flowchart after the content providing device receives the content acquisition request.
• in summary, for the head-turning action and the low-speed movement, the virtual reality device can load the full panoramic video stream once when the user first enters the virtual environment, and a compression method with a high compression ratio can be adopted, which hides the loading delay from the user's perception.
• subsequent actions only need to complete the cropping and display of the picture locally and require no transmission bandwidth, completely releasing the transmission resource.
• for the handle action and the gesture action, only the local difference pixels are needed; they are encapsulated based on the desired field of view, the data amount is small, and they can be processed without encoding and compression. Because the latency requirement is less than 150 ms rather than 20 ms, there is a much longer transmission window; reducing the transmitted data amount while lengthening the transmission time greatly reduces the transmission bandwidth requirement, by more than 80%.
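• The bandwidth relaxation argued above follows directly from the latency budgets: for a fixed payload, stretching the allowed transfer time from about 20 ms to about 150 ms lowers the required bit rate proportionally. A back-of-the-envelope sketch, where the 10-megabit payload is an illustrative assumption and not a figure from this document:

```python
def required_mbps(payload_megabits: float, budget_ms: float) -> float:
    """Bit rate (Mbps) needed to move a payload within a latency budget."""
    return payload_megabits / (budget_ms / 1000.0)

strict = required_mbps(10.0, 20.0)    # 20 ms MTP budget
relaxed = required_mbps(10.0, 150.0)  # 150 ms budget for handle/gesture actions
reduction = 1.0 - relaxed / strict    # ~0.867, i.e. more than 80% less bandwidth
```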
• for fast movement and sitting still, the locally buffered panoramic video stream cannot support fast picture changes, so the field-of-view image frame needs to be transmitted, with the MTP latency required to be less than 20 ms. Such data needs to adopt an intra-frame compression scheme with a compression ratio of about 10 to 20; transmitting only the field-of-view data saves transmission bandwidth, while intra-frame compression keeps the latency low.
• in summary, in the image display method applied to virtual reality provided by this embodiment, because actions are classified according to the features of the image changes they cause, a content acquisition policy can be set for each action category according to those features, so that the content providing device, according to the policy, sends only the changed image content to the virtual reality device.
• the data amount of the changed image content is usually small, which saves the transmission bit rate; in addition, when the data amount of the changed content is small, increasing the resolution of the image frame does not occupy too much bandwidth, so the resolution of the field-of-view image can be improved, improving the clarity the user experiences.
• alternatively, the area outside the user's field of view may display a static picture, so that only the data within the user's field of view needs to be updated, reducing the amount of updated data of the panoramic image frame.
• when the action does not cause a data update of the panoramic image frame, the content providing device only needs to send the panoramic video stream to the virtual reality device once; thereafter, the virtual reality device does not need to obtain update data from the content providing device, that is, it completes the response to the action by processing the locally stored panoramic video stream, avoiding the time consumed by data transmission and improving the response speed to the action.
  • When the action causes a local data update of the field-of-view image frame, the virtual reality device needs to obtain only the changed local data, i.e. the local difference pixels, from the content providing device. Because the data volume of the local difference pixels is small, transmitting them takes little time, which also improves the response speed of the action.
  • When the action causes an overall data update of the field-of-view image frame, the virtual reality device needs to obtain the field-of-view image frame from the content providing device. Since a field-of-view frame carries much less data than a panoramic frame, it can be transmitted in less time, improving the response speed of the action. In addition, because the data volume of the field-of-view frame is small, its resolution can be raised without consuming excessive bandwidth, improving the clarity of the user experience.
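The bandwidth argument above can be made concrete with a small calculation using figures from the description (a 3840x2160 panoramic frame and a 90-degree field of view); the arithmetic, not any patent-defined API, is what the sketch shows:

```python
# Rough data-volume comparison between a panoramic frame and a FOV frame.
PANO_W, PANO_H = 3840, 2160  # panoramic frame resolution from the description
FOV_DEG = 90                 # example field-of-view angle

# A 90-degree horizontal slice of the 360-degree panorama times a 90-degree
# vertical slice of the 180-degree span: about one eighth of the panorama.
fov_fraction = (FOV_DEG / 360) * (FOV_DEG / 180)
fov_pixels = PANO_W * PANO_H * fov_fraction

print(fov_fraction)     # 0.125
print(int(fov_pixels))  # 1036800, close to a 1280x720 (720p) frame's 921600 pixels
```

This is why transmitting only the field-of-view frame cuts the data volume to roughly an eighth, and why its resolution can be raised several-fold before reaching the bandwidth cost of a full panorama.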
  • FIG. 8 is a block diagram of an image display apparatus applied to virtual reality provided by an embodiment of the present application.
  • the image display device applied to the virtual reality can be implemented as all or part of the virtual reality device by software, hardware, or a combination of both.
  • the image display device applied to the virtual reality may include a capturing unit 810, a classifying unit 820, a transmitting unit 830, a receiving unit 840, and a display unit 850.
  • the capturing unit 810 is configured to implement the function of step 501 described above.
  • the classification unit 820 is configured to implement the functions of step 502 above.
  • the sending unit 830 is configured to implement the function of step 503 above.
  • the receiving unit 840 is configured to implement the function of the foregoing step 509.
  • the display unit 850 is configured to implement the functions of step 510 described above.
  • The foregoing capturing unit 810 and classification unit 820 may each be implemented by a processor in the virtual reality device. For the sending unit 830, the processor in the virtual reality device determines the transmission timing and the transceiver performs the transmission.
  • The receiving unit 840 may be implemented by a transceiver in the virtual reality device; the display unit 850 may be implemented by a processor in the virtual reality device.
  • FIG. 9 is a block diagram of an image display apparatus applied to virtual reality provided by an embodiment of the present application.
  • the image display device applied to the virtual reality may be implemented as all or a part of the content providing device by software, hardware, or a combination of both.
  • the image display device applied to the virtual reality may include a receiving unit 910, a determining unit 920, a searching unit 930, and a transmitting unit 940.
  • the receiving unit 910 is configured to implement the functions of the foregoing steps 504 and 507.
  • the determining unit 920 is configured to implement the function of step 505 described above.
  • the searching unit 930 is configured to implement the function of step 506 described above.
  • the sending unit 940 is configured to implement the function of the foregoing step 508.
  • The foregoing receiving unit 910 may be implemented by a transceiver in the content providing device; the determining unit 920 and the searching unit 930 may be implemented by a processor in the content providing device. For the sending unit 940, the processor in the content providing device determines the transmission timing and the transceiver performs the transmission.
  • The embodiment also discloses an image display system applied to virtual reality, the system comprising an image display device applied to virtual reality as shown in FIG. 8 and an image display device applied to virtual reality as shown in FIG. 9.
  • The image display device applied to virtual reality provided by the above embodiment is illustrated only by the division of functional modules described above. In practical applications, the functions may be assigned to different functional modules as needed, i.e., the internal structure of the image display device applied to virtual reality may be divided into different functional modules to complete all or part of the functions described above.
  • The image display device applied to virtual reality provided by the above embodiment belongs to the same concept as the embodiment of the image display method applied to virtual reality. The specific implementation process is described in detail in the method embodiment and is not repeated here.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • The device embodiments described above are merely illustrative. For example, the division into units may be only a logical function division; in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • The technical solution of the present application, or the part of it that is essential or contributes to the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present application.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present application discloses an image display method, apparatus, device, and system applied to virtual reality, relating to the field of communications. The method comprises: capturing an action currently performed by a user; classifying the action according to the features of the image change caused by the action, to obtain a category identifier of the action; sending a content acquisition request containing the category identifier to a content providing device; receiving content sent by the content providing device, the content being determined by the content providing device by determining the category of the action according to the category identifier, looking up, in a differentiation policy, the content acquisition policy corresponding to the category, and following that content acquisition policy, the differentiation policy comprising a content acquisition policy corresponding to each category; and displaying an image frame according to the content. The present application both saves transmission bit rate and facilitates raising the resolution of the field-of-view image, i.e., improving the clarity of the user experience.

Description

应用于虚拟现实的图像显示方法、装置、设备及系统 技术领域
本申请涉及通信领域,特别涉及一种应用于虚拟现实的图像显示方法、装置、设备及系统。
背景技术
虚拟现实技术是一种可以创建和体验虚拟世界的计算机技术,它利用计算机这种内容提供设备生成一种虚拟环境,并借助带有传感器的头戴式显示设备(可简称头盔)这种虚拟现实设备,可以让用户进入虚拟空间,实时感知和操作虚拟环境中的各种虚拟对象,从而获得视觉、触觉和听觉合一的沉浸式感受。
相关技术中,虚拟现实设备有两种显示图像的方式。在第一种显示方式中,虚拟现实设备通过用于传输图像的高清晰度多媒体接口(High Definition Multimedia Interface,HDMI)和用于传输动作序列的串行总线接口(Universal Serial Bus,USB)接入内容提供设备,这样,虚拟现实设备在捕捉到动作后,通过USB将动作序列发送给内容提供设备,内容提供设备依次进行动作解析、坐标计算、情景匹配和画面渲染,通过HDMI将对全景图像帧进行渲染后得到的YUV或RGB(色彩编码方式)序列发送给虚拟现实设备,虚拟现实设备对YUV或RGB序列进行显示。在第二种显示方式中,虚拟现实设备中嵌入有诸如手机之类的智能设备,通过智能设备接收云端传输的视频流,该视频流是内容提供设备对全景视频进行编码压缩后得到的;对视频流进行解码解压,在虚拟现实设备捕捉到动作后,虚拟现实设备将动作序列发送给智能设备,智能设备根据动作和解码解压后得到的全景视频裁剪视野(Field of view,FOV)图像帧,再依次进行渲染和显示,虚拟现实设备对智能设备显示的视野图像帧进行放大,使得放大后的画面充满人的视野,以实现沉浸感。
在第一种显示方式中,响应每个动作都需要传输完整的全景视频帧,消耗较大的传输码率;在第二种显示方式中,由于当前业界全景图像帧的分辨率普遍是3840×2160,以90度视场角为例来说,每个FOV图像帧近似是全景图像帧的八分之一,所以,FOV图像帧的分辨率接近于全景图像帧的分辨率的八分之一,只有1280*720左右,即720P,所以,响应每个动作时视野图像帧的分辨率较低,且分辨率的提升成为瓶颈。
发明内容
本申请实施例提供了一种应用于虚拟现实的图像显示方法、装置、设备及系统,用于解决响应每个动作都需要消耗较大的传输码率,以及,响应每个动作时视野图像帧的分辨率较低且提升成为瓶颈的问题。所述技术方案如下:
第一方面,提供了一种应用于虚拟现实的图像显示方法,该方法包括:捕捉用户当前执行的动作;按照该动作引起的图像变化的特征对动作进行分类,得到该动作的类别标识;将包含该类别标识的内容获取请求发送给内容提供设备;接收内容提供设备发送的内容;根据内容显示视野图像帧。其中,该内容是内容提供设备根据类别标识确定动作的类别,在差异 化策略中查找类别对应的内容获取策略,根据内容获取策略确定的,差异化策略包括各个类别对应的内容获取策略。
由于动作是根据其引起的图像变化的特征进行分类的,所以,可以根据图像变化的特征为该类动作设置一个内容获取策略,使得内容提供设备根据内容获取策略确定只向虚拟现实设备发送图像变化的内容。当图像部分变化时,图像变化的内容的数据量通常较小,可以节省传输码率;另外,在图像变化的内容的数据量较小时,即使提高图像帧的分辨率,该内容也不会占用太多的带宽,所以可以提高视野图像的分辨率,从而提高用户体验的清晰度。
在一种可能的实现方案中,当类别是不引起全景图像帧的数据更新的动作类别时,内容获取策略是获取全景视频流;当类别是引起视野图像帧的局部数据更新的动作类别时,内容获取策略是获取局部差值像素;当类别是引起视野图像帧的整体数据更新的动作类别时,内容获取策略是获取视野图像帧。
其中,不引起全景图像帧的数据更新的动作可以是转头动作或低速移动动作。引起视野图像帧的局部数据更新的动作可以是手柄动作或手势动作。引起视野图像帧的整体数据更新的动作可以是快速移动动作或静坐动作。
其中,当动作是静坐时,用户视野外的地方(比如左右两侧和背后)显示的可以是静态图片,这样,只需要更新用户视野内的数据,以减少更新全景图像帧的数据量。
在一种可能的实现方案中,当类别是不引起全景图像帧的数据更新的动作类别时,在将包含类别标识的内容获取请求发送给内容提供设备之前,还包括:根据动作计算期望的视野坐标,并检测本地是否存储有全景视频流;当本地存储有全景视频流时,从全景视频流中读取前一个全景图像帧,将渲染后的前一个全景图像帧按照视野坐标进行裁剪;显示裁剪后得到的视野图像帧;当本地未存储全景视频流时,触发执行将包含类别标识的内容获取请求发送给内容提供设备的步骤。
当动作不引起全景图像帧的数据更新时,说明内容提供设备只需要在首次时将全景视频流发送给虚拟现实设备,之后,虚拟现实设备就不需要从内容提供设备获取更新数据,即对本地存储的全景视频流中的全景图像帧进行裁剪即可完成对本次动作的响应,从而避免了数据传输的耗时,可以提高动作的响应速度。
在一种可能的实现方案中,当类别是引起视野图像帧的局部数据更新的动作类别时,内容是局部差值像素,则在将包含类别标识的内容获取请求发送给内容提供设备之前,还包括:根据动作计算期望的视野坐标和动作执行位置坐标;根据内容显示视野图像帧,包括:利用局部差值像素替换前一个视野图像帧中动作执行位置坐标所指示的位置处的像素,显示替换后得到的视野图像帧。其中,局部差值像素是内容提供设备在确定根据动作的动作序列计算出的期望的视野坐标和前一个视野图像帧对应的视野坐标相同时,根据动作序列、动作执行位置和情景匹配确定出的内容描述计算得到的像素;当内容获取请求携带有动作的动作序列时,动作执行位置坐标和视野坐标是内容提供设备根据动作序列计算得到的;当内容获取请求携带有动作序列、动作执行位置坐标和视野坐标时,动作执行位置坐标和视野坐标是内容提供设备从内容获取请求中读取得到的。
当动作引起视野图像帧的局部数据更新时,说明虚拟现实设备需要从内容提供设备获取视野图像帧中变化的局部数据,即局部差值像素,由于局部差值像素的数据量较小,传输局部差值像素所需的时长较短,也可以提高动作的响应速度。
在一种可能的实现方案中,当类别是引起视野图像帧的整体数据更新的动作类别时,内容是视野图像帧,则在将包含类别标识的内容获取请求发送给内容提供设备之前,还包括:根据动作计算期望的视野坐标;根据内容显示视野图像帧,包括:在视野坐标所指示的区域显示视野图像帧。其中,视野图像帧是内容提供设备根据视野坐标、动作的动作序列和情景匹配确定出的内容描述计算得到的图像帧;当内容获取请求携带有动作的动作序列时,视野坐标是内容提供设备根据动作序列计算得到的;当内容获取请求携带有动作序列和视野坐标时,视野坐标是内容提供设备从内容获取请求中读取得到的。
当动作引起视野图像帧的整体数据更新时,说明虚拟现实设备需要从内容提供设备获取视野图像帧。由于视野图像帧的数据量相比于全景图像帧的数据量来说较小,传输视野图像帧所需的时长较短,可以提高动作的响应速度;另外,由于视野图像帧的数据量较小,即使提高视野图像帧的分辨率,视野图像帧也不会占用太多的带宽,所以还可以提高视野图像帧的分辨率,以提高用户体验的清晰度。
在一种可能的实现方案中,局部差值像素是未经编码压缩的内容。
由于局部差值像素的数据量较小,所以可以采取不编码和不压缩的方式进行传输,从而节省编码和压缩的耗时,提高动作的响应速度。
在一种可能的实现方案中,视野图像帧是经帧内压缩得到的内容。
当对视野图像帧的整体数据进行帧内压缩时,虽然相比于高级视频编码(Advanced Video Coding,AVC)和高效视频压缩标准编码(High Efficiency Video Coding,HEVC)等帧间压缩算法压缩比例并不高,但是帧内压缩算法的耗时少,可以提高动作的响应速度。
第二方面,提供了一种应用于虚拟现实的图像显示方法,该方法包括:接收虚拟现实设备发送的内容获取请求,该内容获取请求包括动作的类别标识,该动作是虚拟现实设备捕捉到的用户当前执行的动作,该类别标识是虚拟现实设备按照动作引起的图像变化的特征对动作进行分类得到的;根据类别标识确定动作的类别;在差异化策略中查找类别对应的内容获取策略,差异化策略包括各个类别对应的内容获取策略;根据内容获取策略确定发送给虚拟现实设备的内容;将内容发送给虚拟现实设备进行显示。
由于动作是根据其引起的图像变化的特征进行分类的,所以,可以根据图像变化的特征为该类动作设置一个内容获取策略,使得内容提供设备根据内容获取策略确定只向虚拟现实设备发送图像变化的内容。当图像部分变化时,图像变化的内容的数据量通常较小,可以节省传输码率;另外,在图像变化的内容的数据量较小时,即使提高图像帧的分辨率,该内容也不会占用太多的带宽,所以可以提高视野图像的分辨率,从而提高用户体验的清晰度。
在一种可能的实现方案中,当类别是不引起全景图像帧的数据更新的动作类别时,内容获取策略是获取全景视频流;当类别是引起视野图像帧的局部数据更新的动作类别时,内容获取策略是获取局部差值像素;当类别是引起视野图像帧的整体数据更新的动作类别时,内容获取策略是获取视野图像帧。
其中,不引起全景图像帧的数据更新的动作可以是转头动作或低速移动动作。引起视野图像帧的局部数据更新的动作可以是手柄动作或手势动作。引起视野图像帧的整体数据更新的动作可以是快速移动动作或静坐动作。
其中,当动作是静坐时,用户视野外的地方(比如左右两侧和背后)显示的可以是静态 图片,这样,只需要更新用户视野内的数据,以减少更新全景图像帧的数据量。
在一种可能的实现方案中,当类别是不引起全景图像帧的数据更新的动作类别时,根据内容获取策略确定发送给虚拟现实设备的内容,包括:根据内容获取策略获取全景视频流,将全景视频流确定为发送给虚拟现实设备的内容。
当动作不引起全景图像帧的数据更新时,说明内容提供设备只需要在首次时将全景视频流发送给虚拟现实设备,之后,虚拟现实设备就不需要从内容提供设备获取更新数据,即对本地存储的全景视频流中的全景图像帧进行裁剪即可完成对本次动作的响应,从而避免了数据传输的耗时,可以提高动作的响应速度。
在一种可能的实现方案中,当类别是引起视野图像帧的局部数据更新的动作类别时,内容是局部差值像素,则根据内容获取策略确定发送给虚拟现实设备的内容,包括:当内容获取请求携带有动作的动作序列时,根据动作序列计算得到动作执行位置坐标和期望的视野坐标,在确定视野坐标和虚拟现实设备显示的前一个视野图像帧对应的视野坐标相同时确定内容获取策略是获取局部差值像素,根据动作序列、动作执行位置和情景匹配确定出的内容描述计算得到局部差值像素,将局部差值像素确定为发送给虚拟现实设备的内容;当内容获取请求携带有动作序列、动作的动作执行位置坐标和期望的视野坐标时,从内容获取请求中读取得到动作执行位置坐标和视野坐标,在确定视野坐标和虚拟现实设备显示的前一个视野图像帧对应的视野坐标相同时确定内容获取策略是获取局部差值像素,根据动作序列、动作执行位置和情景匹配确定出的内容描述计算得到局部差值像素,将局部差值像素确定为发送给虚拟现实设备的内容。
当动作引起视野图像帧的局部数据更新时,说明虚拟现实设备需要从内容提供设备获取图像帧中变化的局部数据,即局部差值像素,由于局部差值像素的数据量较小,传输局部差值像素所需的时长较短,也可以提高动作的响应速度。
在一种可能的实现方案中,将内容发送给虚拟现实设备进行显示,包括:将未经编码压缩的局部差值像素发送给虚拟现实设备进行显示。
由于局部差值像素的数据量较小,所以可以采取不编码和不压缩的方式进行传输,从而节省编码和压缩的耗时,提高动作的响应速度。
在一种可能的实现方案中,当类别是引起视野图像帧的整体数据更新的动作类别时,内容是视野图像帧,则根据内容获取策略确定发送给虚拟现实设备的内容,包括:当内容获取请求携带有动作的动作序列时确定内容获取策略是获取视野图像帧,根据动作序列计算得到期望的视野坐标,并根据视野坐标、动作序列和情景匹配确定出的内容描述计算得到视野图像帧,将视野图像帧确定为发送给虚拟现实设备的内容;当内容获取请求携带有动作的动作序列和期望的视野坐标时确定内容获取策略是获取视野图像帧,从内容获取请求中读取得到视野坐标,根据视野坐标、动作序列和情景匹配确定出的内容描述计算得到视野图像帧,将视野图像帧确定为发送给虚拟现实设备的内容。
当动作引起视野图像帧整体数据更新时,说明虚拟现实设备需要从内容提供设备获取视野图像帧。由于视野图像帧的数据量相比于全景图像帧的数据量来说较小,传输视野图像帧所需的时长较短,可以提高动作的响应速度;另外,由于视野图像帧的数据量较小,即使提高视野图像帧的分辨率,视野图像帧也不会占用太多的带宽,所以还可以提高视野图像帧的分辨率,以提高用户体验的清晰度。
在一种可能的实现方案中,将内容发送给虚拟现实设备进行显示,包括:对视野图像帧进行帧内压缩,将压缩后的视野图像帧发送给虚拟现实设备进行显示。
当对视野图像帧进行帧内压缩时,虽然相比于AVC和HEVC等帧间压缩算法压缩比例并不高,但是帧内压缩的耗时少,可以提高动作的响应速度。
第三方面,提供了一种应用于虚拟现实的图像显示装置,该装置具有实现上述第一方面及第一方面的可能的实现方案所提供的应用于虚拟现实的图像显示方法的功能。所述功能可以通过硬件实现,也可以通过硬件执行相应的软件实现。所述硬件或软件包括一个或多于一个与上述功能相对应的单元。
第四方面,提供了一种应用于虚拟现实的图像显示装置,该装置具有实现上述第二方面及第二方面的可能的实现方案所提供的应用于虚拟现实的图像显示方法的功能。所述功能可以通过硬件实现,也可以通过硬件执行相应的软件实现。所述硬件或软件包括一个或多于一个与上述功能相对应的单元。
第五方面,提供了一种应用于虚拟现实的图像显示设备,该设备包括:处理器、与处理器相连的存储器,该设备中的处理器,通过执行存储器中存储的程序或指令以实现上述第一方面及第一方面的可能的实现方案所提供的应用于虚拟现实的图像显示方法。
第六方面,提供了一种应用于虚拟现实的图像显示设备,该设备包括:处理器、与处理器相连的存储器,该设备中的处理器,通过执行存储器中存储的程序或指令以实现上述第二方面及第二方面的可能的实现方案所提供的应用于虚拟现实的图像显示方法。
第七方面,提供了一种计算机可读存储介质,存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,至少一条指令、至少一段程序、代码集或指令集由处理器加载并执行以实现第一方面及第一方面的可能的实现方案所提供的应用于虚拟现实的图像显示方法。
第八方面,提供了一种计算机可读存储介质,存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,至少一条指令、至少一段程序、代码集或指令集由处理器加载并执行以实现第二方面及第二方面的可能的实现方案所提供的应用于虚拟现实的图像显示方法。
第九方面,提供了一种应用于虚拟现实的图像显示系统,该系统包括如第三方面所述的应用于虚拟现实的图像显示装置和如第四方面所述的应用于虚拟现实的图像显示装置。
第十方面,提供了一种应用于虚拟现实的图像显示系统,该系统包括如第五方面所述的应用于虚拟现实的图像显示装置和如第六方面所述的应用于虚拟现实的图像显示装置。
附图说明
图1是本申请一示例性实施例提供的虚拟现实系统结构示意图;
图2是本申请一示例性实施例提供的虚拟现实设备和内容提供设备的框图;
图3是本申请一示例性实施例提供的虚拟现实设备或内容提供设备的结构示意图;
图4A是本申请一示例性实施例提供的虚拟现实设备的一种实现框图;
图4B是本申请一示例性实施例提供的虚拟现实设备的一种实现框图;
图5是本申请一示例性实施例提供的应用于虚拟现实的图像显示方法的流程图;
图6是本申请一示例性实施例提供的虚拟现实设备捕捉到动作后的处理流程图;
图7是本申请一示例性实施例提供的内容提供设备接收到内容获取请求后的处理流程图;
图8是本申请一示例性实施例提供的应用于虚拟现实的图像显示装置的结构图;
图9是本申请一示例性实施例提供的应用于虚拟现实的图像显示装置的结构图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
在本文中提及的“单元”是指按照逻辑划分的功能性结构,该“单元”可以由纯硬件实现,或者,软硬件的结合实现。
请参考图1,其示出了本申请一个示例性实施例提供的虚拟现实系统100的结构示意图。该虚拟现实系统包括虚拟现实设备110、手柄120和内容提供设备130。其中,虚拟现实设备110分别与手柄120和内容提供设备130相连。
以虚拟现实设备110是头戴式显示器为例进行说明。头戴式显示器是用于佩戴在用户头部进行图像显示的显示器。头戴式显示器通常包括佩戴部和显示部,佩戴部包括用于将头戴式显示器佩戴在用户头部的眼镜腿及弹性带,显示部包括左眼显示屏和右眼显示屏。头戴式显示器能够在左眼显示屏和右眼显示屏显示不同的图像,从而为用户模拟出三维虚拟环境。
可选地,头戴式显示器上设置有运动传感器,用于捕捉用户的头部动作,以使得诸如手机之类的智能设备改变头戴式显示器中的虚拟头部的显示画面。
头戴式显示器通过柔性电路板或硬件接口或数据线,与智能设备电性相连。
智能设备用于采集本地(虚拟现实设备和/或人体)的传感器上报的数据以确定用户执行的动作,接收内容提供设备130发送的视频流、对视频流进行解码、帧的渲染(重构)及显示。
可选地,智能设备可以集成在头戴式显示器的内部,也可以集成在与头戴式显示器不同的其它设备中,本实施例对此不作限定。本实施例中,以智能设备集成在头戴式显示器的内部为例进行说明。其中,其它设备可以为台式计算机或服务器等,本实施例对此不作限定。
智能设备接收手柄120的输入信号,并根据该输入信号生成头戴式显示器的显示画面。智能设备通常由设置在电路板上的处理器、存储器、图像智能设备等电子器件实现。可选地,智能设备还包括图像采集装置,用于捕捉用户的头部动作,并根据用户的头部动作改变头戴式显示器中的虚拟头部的显示画面。
内容提供设备130可以实现为服务器,此时该服务器是虚拟现实设备110的后台服务器。在实现时,内容提供设备130可以是一台服务器或多台服务器组成的服务器集群或云计算中心。
虚拟现实设备110可以通过柔性电路板或硬件接口或数据线或无线网络,与内容提供设备130电性相连。
请参考图2,其示出了虚拟现实设备和内容提供设备的框图。虚拟现实设备包括动作捕捉模块、动作分类模块、期望视野计算模块、裁剪和差值获取决策模块、帧的渲染(重建)模块、帧的扫描输出模块(含垂直同步),内容提供设备包括情景匹配模块和差异化策略执行模块。其中,虚拟现实设备中的动作分类模块、裁剪和差值获取决策模块、帧的渲染(重建)模块是新增的模块,内容提供设备中的差异化策略执行模块是新增的模块。
动作分类模块用于对捕捉到的动作进行分类,并输出动作所属分类的类别标识。
裁剪和差值获取决策模块用于对本地的图像帧进行裁剪以响应动作,或者,当差值是局部差值像素或视野图像帧时,用于向内容提供设备获取局部差值像素或获取整个视野图像帧。其中,此处及下文中出现的“本地”都是指代虚拟现实设备,下文不再赘述。
差异化策略执行模块用于根据动作的类别标识确定动作所属的分类,再根据分类确定内容获取策略,按照内容获取策略确定发送给虚拟现实设备的内容。
帧的渲染(重建)模块把通过差异化策略获取到的内容结合本地的内容生成新的视野图像帧。
在传输视野图像帧的场景下,虚拟现实设备不需要进行帧的渲染(重建),其渲染功能可以集成在内容提供设备上,利用云端强大的图形处理能力实现。
请参考图3,其示出了本申请另一个示例性实施例示出的虚拟现实设备或内容提供设备300的结构示意图。该虚拟现实设备300可以是图1中所示出的虚拟现实设备140,该虚拟现实设备包括:处理器320、与处理器320相连的收发器340。
该收发器340可由一个或多个天线组成,该天线使得虚拟现实设备300能够发送或接收无线电信号。
收发器340可连接至通信电路360,该通信电路360可对经由收发器340接收或经由收发器340发送的信号执行各种处理,如:调制经由收发器340发送的信号,解调经由收发器340接收的信号,在实际实现时,该通信电路360可由射频(英文:radio frequency,RF)芯片和基带芯片组成。
通信电路360可连接至处理器320。可替换地,该通信电路360也可集成在处理器320中。处理器320是虚拟现实设备的控制中心,该处理器320可以是中央处理器(英文:central processing unit,CPU),网络处理器(英文:network processor,NP)或者CPU和NP的组合。处理器320还可以进一步包括硬件芯片。上述硬件芯片可以是专用集成电路(英文:application-specific integrated circuit,ASIC),可编程逻辑器件(英文:programmable logic device,PLD)或其组合。上述PLD可以是复杂可编程逻辑器件(英文:complex programmable logic device,CPLD),现场可编程逻辑门阵列(英文:field-programmable gate array,FPGA),通用阵列逻辑(英文:generic array logic,GAL)或其任意组合。
存储器380通过总线或其它方式与处理器320相连,存储器380可以为易失性存储器(英文:volatile memory),非易失性存储器(英文:non-volatile memory)或者它们的组合。易失性存储器可以为随机存取存储器(英文:random-access memory,RAM),例如静态随机存取存储器(英文:static random access memory,SRAM),动态随机存取存储器(英文:dynamic random access memory,DRAM)。非易失性存储器可以为只读存储器(英文:read-only memory,ROM),例如可编程只读存储器(英文:programmable read only memory,PROM),可擦除可编程只读存储器(英文:erasable programmable read only memory,EPROM),电可擦除可编程只读存储器(英文:electrically erasable programmable read-only memory,EEPROM)。非易失性存储器也可以为快闪存储器(英文:flash memory),磁存储器,例如磁带(英文:magnetic tape),软盘(英文:floppy disk),硬盘。非易失性存储器也可以为光盘。
虚拟现实设备的存储器380中可以存储全景视频流、全景图像帧、视野图像帧、动作的类型标识、动作序列等。
内容提供设备的存储器380中可以存储差异化策略、虚拟现实内容源、动作序列和视野坐标等。其中,虚拟现实内容源可以是通过摄像头拍摄得到的全景视频,也可以是计算机动画(Computer Graphics,CG),本实施例不作限定。
请参考图4A和图4B,其示出了现有技术中虚拟现实设备的两种显示方式,其中,图4A中的视图是第一种显示方式的框图,图4B中的视图是第二种显示方式的框图。
对于第二种显示方式,虚拟现实设备通过网络来获取内容,这对带宽和时延是一个巨大的挑战,业界已经在研究相应的优化方案。
第一种优化方案是对大的全景视频流进行压缩,但是压缩后仍然有大约500Mbps(兆比特每秒)的稳定带宽需求,下表一以两种设备为例进行说明。
表一
型号 分辨率 帧率 视频码率
设备1 2160*1200 90 5.6Gbps
设备2 2560*1440 60 4.7Gbps
第二种优化方案是对全景视频流使用AVC或HEVC,虽然有很大的压缩比(比如120倍),4K、30FPS(每秒传输帧数)的全景视频流压缩后的码率仅有20Mbps,如果头盔视场角为96度,由于用户的视野仅看到全景视频流的近似1/8,那么视野内的图像帧的分辨率只有720P左右(分辨率为1280*720),清晰度很差。下表二以两种全景视频流分辨率为例进行说明。
表二
(表二原为图片,内容为两种全景视频流分辨率及其对应的视野内分辨率与码率的对比。)
可以看出,下载全景视频流到本地对传输带宽的要求极高,且提升分辨率要付出很大的带宽成本。
综上所述,这两种优化方案都不能从根本上解决虚拟现实图像显示需要消耗较大的传输码率和视野图像帧的分辨率较低且提升成为瓶颈的问题,本实施例提供了一种应用于虚拟现实的图像显示方法,用于解决上述问题。
请参考图5,其示出了本申请一示例性实施例提供的应用于虚拟现实的图像显示方法的流程图。本实施例以该方法用于如图1所示的虚拟现实系统中来举例说明,由虚拟现实设备执行下述步骤,该方法包括以下几个步骤:
步骤501,虚拟现实设备捕捉用户当前执行的动作。
当用户操作手柄时,虚拟现实设备通过该操作来捕捉用户对手柄执行的动作。比如,虚拟现实设备当前正在显示射击的画面,且用户操作手柄进行了射击,则虚拟现实设备捕捉到用户执行的是射击动作。
当用户未操作手柄时,虚拟现实设备通过传感器来捕捉用户执行的动作。比如,虚拟现实设备当前正在显示用户坐在桌子边的画面,且用户做出拿起桌子上的杯子的手势,则虚拟现实设备捕捉到用户执行的是拿杯子动作。又比如,用户转了下头,则虚拟现实设备捕捉到用户执行的是转头动作。又比如,用户正在缓慢步行,则虚拟现实设备捕捉到用户执行的是低速移动动作,低速移动的速度小于预设阈值。又比如,用户正在快跑,则虚拟现实设备捕捉到用户执行的是快速移动动作,快速移动的速度高于预设阈值。又比如,用户正在静坐观看电影或电视剧,则虚拟现实设备捕捉到用户执行的是静坐动作。
本实施例仅以上述动作进行举例,在实际实现时,虚拟现实设备还可以捕捉到用户执行的更多其他动作,比如,通过手柄和手势执行搬东西动作、挥动武器动作、投掷动作和敲击动作等,本实施例不对动作进行限定。
其中,物体在空间具有6个自由度,即沿x、y、z三个直角坐标轴方向的移动自由度和绕这三个坐标轴的转动自由度,当进行6自由度的动作捕捉时,虚拟现实的沉浸感体验会比较好。本实施例中,传感器可以基于3自由度或6自由度来捕捉用户执行的动作。
步骤502,虚拟现实设备按照动作引起的图像变化的特征对动作进行分类,得到动作的类别标识。
其中,图像变化的特征是指由于图像变化所引起的数据更新的特征。比如,数据更新的特征是不更新数据,或者,数据更新的特征是更新局部数据,或者,数据更新的特征是更新整体数据,本实施例不作限定。
在对动作引起的图像变化的特征进行分析之前,本实施例先对上述例举的六种类型的动作的延迟(Motion to Photons,MTP)时延进行分析。
对于转头动作,头部运动对时延很敏感,如向左转、向右转、仰头、低头等日常头部运动,此类动作的MTP时延需求是很严格的,需要不大于20ms,如果画面滞后于用户头部的动作,也就是头部在转动过程中没有及时看到预期的画面,便会产生眩晕感。
对于手势动作和手柄动作,与使用鼠标时类似的,从动作发出到画面显示有同样的MTP时延需求,基于Steve Swink提出的“在50毫秒以内,人感觉瞬间响应;超过100毫秒,人感觉到显而易见的滞后,但可忽略;在200毫秒,人感觉反应迟钝”的研究,此类动作的MTP延迟需要小于150ms。
对于低速移动动作,低速移动过程中主要动作仍然是跟踪用户的头部的微小转动,MTP 时延要求小于20ms,而且相对于眼前的画面,用户的移动速度往往较小,视野内的画面相对变化较小。
对于快速移动动作,相对于低速移动,快速移动过程要求视野内的画面大范围快速变化,MTP时延要求小于20ms。
对于静坐类型的动作,用户的动作主要是注视前方,偶尔会转头环顾四周,MTP时延要求小于20ms。
本实施例中对手势动作和手柄动作的时延进行了延长,解决了相关技术中所有动作都要求较小的时延的问题,可以放宽传输需求。
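The per-action MTP latency budgets analysed above (at most 20 ms for head turns, movement, and sitting still; up to 150 ms for handle and gesture actions) can be sketched as a simple budget check. The action names here are illustrative English labels for the six action types in the text:

```python
# MTP (motion-to-photons) latency budgets per action type, in milliseconds,
# following the analysis in the text: head-tracking-style actions need <= 20 ms,
# while handle/gesture actions tolerate up to 150 ms.
MTP_BUDGET_MS = {
    "head_turn": 20,
    "slow_move": 20,
    "fast_move": 20,
    "sit_still": 20,
    "handle": 150,
    "gesture": 150,
}

def within_budget(action: str, measured_ms: float) -> bool:
    """True if the measured motion-to-photons delay meets the action's budget."""
    return measured_ms <= MTP_BUDGET_MS[action]

print(within_budget("head_turn", 18))  # True
print(within_budget("gesture", 120))   # True
print(within_budget("fast_move", 35))  # False
```

The relaxed 150 ms budget is what allows handle and gesture responses to spend more time on transmission, lowering the overall bandwidth requirement.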
下面通过表三对这六种类型的动作的图像变化特征进行说明。
表三
(表三原为图片,内容为上述六种动作类型各自的MTP时延需求与图像变化特征。)
基于上述分析,可以对动作进行标记。比如,将转头动作标记为001,将手柄动作标记为010,将手势动作标记为011,将低速移动动作标记为100,将快速移动动作标记为101,将静坐动作标记为110。本实施例仅以三位二进制数据对类别标识进行举例说明,在实际实现时,还可以通过其他方式来标记类别标识,本实施例不对类别标识的形式进行限定。
可选的,当虚拟现实设备确定类别标识是001或100,且本地没有存储全景视频流,需要向内容提供设备获取全景视频流时,还可以将类型标识修改为111,以指示内容提供设备发送全景视频流。
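The three-bit category identifiers above (001 head turn, 010 handle, 011 gesture, 100 slow move, 101 fast move, 110 sit still, with 111 as the "send me the panoramic stream" override) can be sketched as a small encoding table. The English action names are illustrative labels, not terms from the patent:

```python
# Three-bit category identifiers for the six example actions, as listed above.
CATEGORY_ID = {
    "head_turn": 0b001,
    "handle":    0b010,
    "gesture":   0b011,
    "slow_move": 0b100,
    "fast_move": 0b101,
    "sit_still": 0b110,
}
# 111 is sent when the identifier would be 001 or 100 but no panoramic video
# stream is cached locally, asking the content provider for the full stream.
REQUEST_PANORAMA = 0b111

def category_id(action: str, panorama_cached: bool) -> int:
    cid = CATEGORY_ID[action]
    if cid in (0b001, 0b100) and not panorama_cached:
        return REQUEST_PANORAMA
    return cid

print(format(category_id("head_turn", panorama_cached=True), "03b"))   # 001
print(format(category_id("head_turn", panorama_cached=False), "03b"))  # 111
```

The virtual reality device would place this identifier in the content acquisition request, so the content providing device can pick the matching content acquisition policy without re-deriving the action type.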
当虚拟现实设备在捕捉到动作后,按照上述类别对动作进行分类,得到该分类对应的类别标识。比如,虚拟现实设备捕捉到的动作是转头动作,则类别标识是001。
步骤503,虚拟现实设备将包含类别标识的内容获取请求发送给内容提供设备。
虚拟现实设备还可以进一步对动作类型进行分类。比如,转头时需要将周围的画面显示在视野内,从本地的全景图像帧中裁剪出适合当前偏移后的视野坐标的视野图像帧即可响应动作,不需要向内容提供设备获取全景图像帧的更新数据,即,不引起全景图像帧的数据更新;低速移动时周围的画面不变,当用户向前低速移动时,视野内画面变小,视野细节丰富,此时需要从本地的全景图像帧中裁剪出适合当前运动后的视野坐标的视野图像帧即可 响应动作,向后移动时相反,不需要向内容提供设备获取全景图像帧的更新数据,所以,可以将转头和低速移动划分为不引起全景图像帧的数据更新的动作类型。通过手柄或手势做出动作时,由于需要在手柄或手势的动作执行位置坐标所指示的位置处显示特效,而视野图像帧中其他位置的内容不变,需要向内容提供设备获取视野图像帧的局部更新数据,所以,可以将手柄动作和手势动作划分为引起视野图像帧的局部数据更新的动作类别。快速移动时视野图像帧整体数据都在变化,需要向内容提供设备获取视野图像帧的整体更新数据;静坐时电影或电视的画面是实时变化的,需要向内容提供设备获取视野图像帧的整体更新数据,所以,可以将快速移动和静坐划分为引起视野图像帧的整体数据更新的动作类别。
基于上述分类,下面对发送内容获取请求的流程进行解释。
1)当类别是不引起全景图像帧的数据更新的动作类别时,在将包含类别标识的内容获取请求发送给内容提供设备之前,还包括:根据动作计算期望的视野坐标,并检测本地是否存储有全景视频流;当本地存储有全景视频流时,按照视野坐标对本地的全景视频进行裁剪;显示裁剪后得到的视野图像帧;当本地未存储全景视频流时,触发执行将包含类别标识的内容获取请求发送给内容提供设备的步骤。
当类别是不引起全景图像帧的数据更新的动作类别时,内容获取策略是获取全景视频流。即,当虚拟现实设备中没有存储全景视频流时,虚拟现实设备需要向内容提供设备获取全景视频流,此时可以向内容提供设备发送携带有类别标识的内容获取请求,使得内容提供设备基于内容获取请求向虚拟现实设备发送全景视频流。其中,虚拟现实设备可以在内容获取请求中携带动作序列和类别标识,或者,虚拟现实设备可以在内容获取请求中携带动作序列,并将类别标识携带在动作序列中,本实施例不对内容获取请求进行限定。
当虚拟现实设备中存储有全景视频流时,虚拟现实设备可以计算出期望的视野坐标,当动作是转头动作时,从本地全景视频流中的全景图像帧中裁剪出适合当前转头后的视野坐标的视野图像帧,显示裁剪后得到的视野图像帧。比如,当用户向左转头30度时,按照新的视野坐标从全景视频流中的全景图像帧中裁剪出适合的视野图像帧。比如,当用户向前低速移动时,视野内画面变小,视野细节丰富,可以从本地全景视频流中的全景图像帧中裁剪出适合当前运动后的视野坐标的视野图像帧。
2)当类别是引起视野图像帧的局部数据更新的动作类别时,内容是局部差值像素,则在将包含类别标识的内容获取请求发送给内容提供设备之前,还包括:根据动作计算期望的视野坐标和动作执行位置坐标。
当类别是引起视野图像帧的局部数据更新的动作类别时,内容获取策略是获取局部差值像素。即,虚拟现实设备需要向内容提供设备获取局部差值像素。由于虚拟现实设备需要确定更新哪个视野图像帧中哪个位置的局部差值像素,所以,虚拟现实设备还需要根据手柄或手势计算期望的视野坐标和动作执行位置坐标,该动作执行位置坐标即为显示局部差值像素的位置坐标。
比如,当通过手势拿杯子时,需要计算视野坐标和手的位置坐标,将手的位置坐标作为动作执行位置坐标,以便在视野图像帧中手的位置处显示杯子这个局部差值像素。当通过手柄进行射击时,需要确定视野坐标和手柄位置坐标,并计算从手柄位置射出子弹时子弹与显示屏的接触位置坐标,将该接触位置坐标作为动作执行位置坐标,以便在视野图像帧中该接触位置处显示子弹的特效这个局部差值像素。
其中,虚拟现实设备可以在内容获取请求中携带动作序列和类别标识,或者,虚拟现实设备可以在内容获取请求中携带视野坐标、动作执行位置坐标、类别标识和动作序列,或者,虚拟现实设备可以在内容获取请求中携带动作序列,并将视野坐标、动作执行位置坐标和类别标识携带在动作序列中,本实施例不对内容获取请求进行限定。
3)当类别是引起视野图像帧的整体数据更新的动作类别时,内容是视野图像帧,则在将包含类别标识的内容获取请求发送给内容提供设备之前,还包括:根据动作计算期望的视野坐标。
当类别是引起视野图像帧的整体数据更新的动作类别时,内容获取策略是获取视野图像帧。即,虚拟现实设备需要向内容获取设备获取视野图像帧。由于虚拟现实设备需要确定更新哪个视野图像帧的数据,所以,虚拟现实设备还需要计算期望的视野坐标。
其中,虚拟现实设备可以在内容获取请求中携带动作序列和类别标识,或者,虚拟现实设备可以在内容获取请求中携带视野坐标、类别标识和动作序列,或者,虚拟现实设备可以在内容获取请求中携带动作序列,并将视野坐标和类别坐标携带在动作序列中,本实施例不对内容获取请求进行限定。
在实现时,虚拟现实设备还可以在内容获取请求中携带虚拟现实设备的标识或全景图像帧的标识,以便内容提供设备根据该标识确定虚拟现实设备所显示的全景图像帧,再从该全景图像帧中选取局部差值像素或视野图像帧。
步骤504,内容提供设备接收虚拟现实设备发送的内容获取请求。
步骤505,内容提供设备根据类别标识确定动作的类别。
比如,当内容提供设备接收到的类别标识是010时,确定动作是手柄动作。
可选的,当内容提供设备接收到的类别标识是111,则确定实际的类别标识是001或100,且虚拟现实设备本地没有存储全景视频流,需要向虚拟现实设备发送全景视频流。
步骤506,内容提供设备在差异化策略中查找类别对应的内容获取策略,该差异化策略包括各个类别对应的内容获取策略。
差异化策略包括类别和内容获取策略的对应关系。比如,不引起全景图像帧的数据更新的动作类别对应的内容获取策略是获取全景视频流;引起视野图像帧的局部数据更新的动作类别对应的内容获取策略是获取局部差值像素;引起视野图像帧的整体数据更新的动作类别对应的内容获取策略是获取视野图像帧。
步骤507,内容提供设备根据内容获取策略确定发送给虚拟现实设备的内容。
对应于步骤503中依据三种不同类型的动作发送的三种内容获取请求,下面对内容提供设备确定发送给虚拟现实设备的内容的流程进行解释。
1)当类别是不引起全景图像帧的数据更新的动作类别时,根据内容获取策略确定发送给虚拟现实设备的内容,包括:根据内容获取策略获取全景视频流,将全景视频流确定为发送给虚拟现实设备的内容。
2)当类别是引起视野图像帧的局部数据更新的动作类别时,内容是局部差值像素,则根据内容获取策略确定发送给虚拟现实设备的内容,包括:当内容获取请求携带有动作的动作序列时,根据动作序列计算得到动作执行位置坐标和期望的视野坐标,在确定视野坐标和虚拟现实设备显示的前一个图像帧对应的视野坐标相同时确定内容获取策略是获取局部差值像素,根据动作序列、动作执行位置和情景匹配确定出的内容描述计算得到局部差值像素,将局部差值像素确定为发送给虚拟现实设备的内容。或者,当内容获取请求携带有动作的动作序列、动作执行位置坐标和期望的视野坐标时,从内容获取请求中读取得到动作执行位置坐标和视野坐标,在确定视野坐标和虚拟现实设备前一个图像帧对应的视野坐标相同时确定内容获取策略是获取局部差值像素,根据动作序列、动作执行位置和情景匹配确定出的内容描述计算得到局部差值像素,将局部差值像素确定为发送给虚拟现实设备的内容。
当内容获取请求中未携带视野坐标和动作执行位置坐标时,内容提供设备需要根据动作序列计算视野坐标和动作执行位置坐标;当内容获取请求中携带有视野坐标和动作执行位置坐标时,内容提供设备从内容获取请求中直接读取视野坐标和动作执行位置坐标。
内容提供设备对视野坐标与前一个视野图像帧对应的视野坐标进行比较,当两者相同时,说明需要根据动作序列、动作执行位置和情景匹配确定出的内容描述计算出局部差值像素作为发送给虚拟现实设备的内容;当两者不同时,说明需要根据动作序列和情景匹配确定出的内容描述计算出视野图像帧作为发送给虚拟现实设备的内容。
3)当类别是引起视野图像帧的整体数据更新的动作类别时,内容是视野图像帧,则根据内容获取策略确定发送给虚拟现实设备的内容,包括:当内容获取请求携带有动作的动作序列时确定内容获取策略是获取视野图像帧,根据动作序列计算得到期望的视野坐标,并根据视野坐标、动作序列和情景匹配确定出的内容描述计算得到视野图像帧,将视野图像帧确定为发送给虚拟现实设备的内容。或者,当内容获取请求携带有动作的动作序列和期望的视野坐标时确定内容获取策略是获取视野图像帧,从内容获取请求中读取得到视野坐标,根据视野坐标、动作序列和情景匹配确定出的内容描述计算得到视野图像帧,将视野图像帧确定为发送给虚拟现实设备的内容。
其中,情景匹配是在差异化策略确定后确定期望的内容描述,并根据动作序列、内容描述和视野坐标计算出视野图像帧。其中,内容描述用于确定三维图像的内容。
当内容获取请求中未携带视野坐标时,内容提供设备需要根据动作序列计算视野坐标;当内容获取请求中携带有视野坐标时,内容提供设备从内容获取请求中直接读取视野坐标。
内容提供设备根据视野坐标、动作序列和情景匹配确定出的内容描述计算出视野图像帧作为发送给虚拟现实设备的内容。
在实现时,当内容获取请求还包括虚拟现实设备的标识或全景图像帧的标识时,内容提供设备根据该标识确定虚拟现实设备所显示的全景图像帧,再从该全景图像帧中选取局部差值像素或视野图像帧。
步骤508,内容提供设备将内容发送给虚拟现实设备进行显示。
对应于步骤507中确定的三种内容,下面对内容提供设备发送内容的流程进行解释。
1)当内容是全景视频流时,由于数据量较大,所以,内容提供设备可以对全景视频流进行压缩,将压缩后的全景视频流发送给虚拟现实设备,以节省传输带宽。
2)当内容是局部差值像素时,将内容发送给虚拟现实设备进行显示,包括:将未经编码压缩的局部差值像素发送给虚拟现实设备进行显示。
由于局部差值像素的数据量较小,所以可以采取不编码和不压缩的方式进行传输,从而节省编码和压缩的耗时,提高动作的响应速度。
3)当内容是视野图像帧时,将内容发送给虚拟现实设备进行显示,包括:对视野图像帧进行帧内压缩,将压缩后的视野图像帧发送给虚拟现实设备进行显示。
当对视野图像帧进行帧内压缩时,虽然相比于AVC和HEVC等帧间压缩算法压缩比例并不高,但是帧内压缩的耗时少,可以提高动作的响应速度。
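A server-side dispatch combining the three send behaviours above (stream compression for panoramic video, no encoding for local difference pixels, intra-frame compression for field-of-view frames) might look like the sketch below. All function names are illustrative, and `zlib` stands in as a placeholder codec for what would really be video encoders:

```python
import zlib

def compress_stream(data: bytes) -> bytes:
    """Placeholder for a high-ratio stream codec (e.g. inter-frame video coding)."""
    return zlib.compress(data)

def intra_frame_compress(data: bytes) -> bytes:
    """Placeholder for a fast intra-frame codec: modest ratio, low latency."""
    return zlib.compress(data, level=1)

def package_content(kind: str, payload: bytes) -> tuple:
    """Package content per kind before sending, per steps 1)-3) above."""
    if kind == "panoramic_stream":
        # Large data: compress the whole stream to save transmission bandwidth.
        return compress_stream(payload), "compressed-stream"
    if kind == "local_diff_pixels":
        # Small data: send as-is, skipping encode/compress to cut latency.
        return payload, "raw"
    if kind == "fov_frame":
        # Intra-frame compression: lower ratio than AVC/HEVC, but much faster.
        return intra_frame_compress(payload), "intra-compressed"
    raise ValueError(f"unknown content kind: {kind}")

print(package_content("local_diff_pixels", b"px")[1])  # raw
```

The design choice mirrors the text: latency-critical small payloads skip coding entirely, while large payloads trade some CPU time for bandwidth.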
步骤509,虚拟现实设备接收内容提供设备发送的内容。
步骤510,虚拟现实设备根据内容显示视野图像帧。
对应于步骤508发送的三种内容,下面对虚拟现实设备显示视野图像帧的流程进行解释。
1)当内容是全景视频流时,虚拟现实设备可以从全景视频流的全景图像帧中选取视野图像帧,对该视野图像帧进行渲染,再结合垂直同步,最后扫描输出该视野图像帧。
2)当类别是引起视野图像帧的局部数据更新的动作类别时,内容是局部差值像素,则根据内容显示视野图像帧,包括:利用局部差值像素替换前一个视野图像帧中动作执行位置坐标对应的像素,显示替换后得到的视野图像帧。
虚拟现实设备利用局部差值像素替换前一个视野图像帧中动作执行位置坐标对应的像素,即,进行帧的重建,再结合垂直同步,最后扫描输出该视野图像帧。
3)当类别是引起视野图像帧的整体数据更新的动作类别时,内容是视野图像帧,则根据内容显示视野图像帧,包括:在视野坐标所指示的区域显示视野图像帧。
虚拟现实设备对视野图像帧进行渲染,再结合垂直同步,最后扫描输出该视野图像帧。
请参考图6,其示出了虚拟现实设备捕捉到动作后的处理流程图。
请参考图7,其示出了内容提供设备接收到内容获取请求后的处理流程图。
综上所述,虚拟现实设备可以在初始进入虚拟环境时的加载过程中一次性加载完全景视频流,可以采取大压缩比的编码压缩方法,可以规避用户对时延的感知。之后,对于转头动作和低速移动动作,由于已经在本地显示端缓冲区中缓存了全景视频流,后续的动作只需要在本地完成画面的裁剪和显示,不需要传输带宽,完全释放传输资源,极大地克服了时延需求,体会不到卡顿。对于手柄动作和手势动作,其只需要局部差值像素,基于期望的视野进行局部差值像素的封装,数据量较小,可以采取不进行编码和压缩的方法进行处理,因时延需求是小于150ms,相比20ms,有较长的传输时间,减少了传输数据量,且传输时间加大,总体上降低了传输带宽需求,最大降低带宽80%以上。对于快速移动动作和静坐动作,本地显示端缓冲全景视频流已无法支持快速的画面变化,需要传输视野图像帧,且要求MTP小于20ms,此类数据需要采取帧内压缩方案,可以10到20倍地压缩数据,既可以节省传输带宽,也可以通过帧内压缩来减少时延。
综上所述,本申请实施例提供的应用于虚拟现实的图像显示方法,由于动作是根据其引起的图像变化的特征进行分类的,所以,可以根据图像变化的特征为该类动作设置一个内容获取策略,使得内容提供设备根据内容获取策略确定只向虚拟现实设备发送图像变化的内容。当图像部分变化时,图像变化的内容的数据量通常较小,既可以节省传输码率;另外,在图像变化的内容的数据量较小时,即使提高图像帧的分辨率,该内容也不会占用太多的带宽,所以可以提高视野图像的分辨率,从而提高用户体验的清晰度。
其中,当动作是静坐时,用户视野外的地方(比如左右两侧和背后)显示的可以是静态图片,这样,只需要更新用户视野内的数据,以减少更新全景图像帧的数据量。
当动作不引起全景图像帧的数据更新时,说明内容提供设备只需要在首次时将全景视频流发送给虚拟现实设备,之后,虚拟现实设备就不需要从内容提供设备获取更新数据,即对本地存储的全景视频流进行处理即可完成对本次动作的响应,从而避免了数据传输的耗时,可以提高动作的响应速度。
当动作引起视野图像帧的局部数据更新时,说明虚拟现实设备需要从内容提供设备获取图像帧中变化的局部数据,即局部差值像素,由于局部差值像素的数据量较小,传输局部差值像素所需的时长较短,也可以提高动作的响应速度。
当动作引起视野图像帧整体数据更新时,说明虚拟现实设备需要从内容提供设备获取视野图像帧。由于视野图像帧的数据量相比于全景图像帧的数据量来说较小,传输视野图像帧所需的时长较短,可以提高动作的响应速度;另外,由于视野图像帧的数据量较小,即使提高视野图像帧的分辨率,视野图像帧也不会占用太多的带宽,所以还可以提高视野图像帧的分辨率,以提高用户体验的清晰度。
请参考图8,其示出了本申请一个实施例提供的应用于虚拟现实的图像显示装置的框图。该应用于虚拟现实的图像显示装置可以通过软件、硬件或者两者的结合实现成为虚拟现实设备的全部或者一部分。该应用于虚拟现实的图像显示装置可以包括:捕捉单元810、分类单元820、发送单元830、接收单元840和显示单元850。
捕捉单元810,用于实现上述步骤501的功能。
分类单元820,用于实现上述步骤502的功能。
发送单元830,用于实现上述步骤503的功能。
接收单元840,用于实现上述步骤509的功能。
显示单元850,用于实现上述步骤510的功能。
相关细节可结合参考图5所述的方法实施例。
需要说明的是,上述的捕捉单元810可以通过虚拟现实设备中的处理器来实现;上述的分类单元820可以通过虚拟现实设备中的处理器来实现;上述发送单元830可以通过虚拟现实设备中的处理器确定发送时机,由收发器发送来实现;上述的接收单元840可以通过虚拟现实设备中的收发器来实现;上述显示单元850可以通过虚拟现实设备中的处理器来实现。
请参考图9,其示出了本申请一个实施例提供的应用于虚拟现实的图像显示装置的框图。该应用于虚拟现实的图像显示装置可以通过软件、硬件或者两者的结合实现成为内容提供设备的全部或者一部分。该应用于虚拟现实的图像显示装置可以包括:接收单元910、确定单元920、查找单元930和发送单元940。
接收单元910,用于实现上述步骤504和507的功能。
确定单元920,用于实现上述步骤505的功能。
查找单元930,用于实现上述步骤506的功能。
发送单元940,用于实现上述步骤508的功能。
相关细节可结合参考图3所述的方法实施例。
需要说明的是,上述的接收单元910可以通过内容提供设备中的收发器来实现;上述的确定单元920可以通过内容提供设备中的处理器来实现;上述查找单元930可以通过内容 提供设备中的处理器来实现;上述发送单元940可以通过内容提供设备中的处理器确定发送时机,由收发器发送来实现。
本实施例还公开了一种应用于虚拟现实的图像显示系统,该系统包括如图8所示的应用于虚拟现实的图像显示装置和如图9所示应用于虚拟现实的图像显示装置。
需要说明的是:上述实施例提供的应用于虚拟现实的图像显示装置在进行应用于虚拟现实的图像显示时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将应用于虚拟现实的图像显示装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的应用于虚拟现实的图像显示装置与应用于虚拟现实的图像显示方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,可以仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (26)

  1. 一种应用于虚拟现实的图像显示方法,其特征在于,所述方法包括:
    捕捉用户当前执行的动作;
    按照所述动作引起的图像变化的特征对所述动作进行分类,得到所述动作的类别标识;
    将包含所述类别标识的内容获取请求发送给内容提供设备;
    接收所述内容提供设备发送的内容,所述内容是所述内容提供设备根据所述类别标识确定所述动作的类别,在差异化策略中查找所述类别对应的内容获取策略,根据所述内容获取策略确定的,所述差异化策略包括各个类别对应的内容获取策略;
    根据所述内容显示视野图像帧。
  2. 根据权利要求1所述的方法,其特征在于,
    当所述类别是不引起全景图像帧的数据更新的动作类别时,所述内容获取策略是获取全景视频流;
    当所述类别是引起视野图像帧的局部数据更新的动作类别时,所述内容获取策略是获取局部差值像素;
    当所述类别是引起视野图像帧的整体数据更新的动作类别时,所述内容获取策略是获取视野图像帧。
  3. 根据权利要求2所述的方法,其特征在于,当所述类别是不引起全景图像帧的数据更新的动作类别时,在所述将包含所述类别标识的内容获取请求发送给内容提供设备之前,还包括:
    根据所述动作计算期望的视野坐标,并检测本地是否存储有全景视频流;
    当本地存储有所述全景视频流时,从所述全景视频流中读取前一个全景图像帧,将渲染后的所述前一个全景图像帧按照所述视野坐标进行裁剪;显示裁剪后得到视野图像帧;
    当本地未存储所述全景视频流时,触发执行所述将包含所述类别标识的内容获取请求发送给内容提供设备的步骤。
  4. 根据权利要求2所述的方法,其特征在于,当所述类别是引起视野图像帧的局部数据更新的动作类别时,所述内容是局部差值像素,则
    在所述将包含所述类别标识的内容获取请求发送给内容提供设备之前,还包括:根据所述动作计算期望的视野坐标和动作执行位置坐标;
    所述根据所述内容显示视野图像帧,包括:利用所述局部差值像素替换前一个视野图像帧中所述动作执行位置坐标所指示的位置处的像素,显示替换后得到的视野图像帧,所述局部差值像素是所述内容提供设备在确定根据所述动作的动作序列计算出的期望的视野坐标和前一个视野图像帧对应的视野坐标相同时,根据所述动作序列、所述动作执行位置和情景匹配确定出的内容描述计算得到的像素;
    其中,当所述内容获取请求携带有所述动作的动作序列时,所述动作执行位置坐标和所述视野坐标是所述内容提供设备根据所述动作序列计算得到的;当所述内容获取请求携带有 所述动作序列、所述动作执行位置坐标和所述视野坐标时,所述动作执行位置坐标和所述视野坐标是所述内容提供设备从所述内容获取请求中读取得到的。
  5. 根据权利要求2所述的方法,其特征在于,当所述类别是引起视野图像帧的整体数据更新的动作类别时,所述内容是视野图像帧,则
    在所述将包含所述类别标识的内容获取请求发送给内容提供设备之前,还包括:根据所述动作计算期望的视野坐标;
    所述根据所述内容显示视野图像帧,包括:在所述视野坐标所指示的区域显示所述视野图像帧,所述视野图像帧是所述内容提供设备根据所述视野坐标、所述动作的动作序列和情景匹配确定出的内容描述计算得到的图像帧;
    其中,当所述内容获取请求携带有所述动作的动作序列时,所述视野坐标是所述内容提供设备根据所述动作序列计算得到的;当所述内容获取请求携带有所述动作序列和所述视野坐标时,所述视野坐标是所述内容提供设备从所述内容获取请求中读取得到的。
  6. 根据权利要求2或4所述的方法,其特征在于,所述局部差值像素是未经编码压缩的内容。
  7. 根据权利要求2或5所述的方法,其特征在于,所述视野图像帧是经帧内压缩得到的内容。
  8. 一种应用于虚拟现实的图像显示方法,其特征在于,所述方法包括:
    接收虚拟现实设备发送的内容获取请求,所述内容获取请求包括动作的类别标识,所述动作是所述虚拟现实设备捕捉到的用户当前执行的动作,所述类别标识是所述虚拟现实设备按照所述动作引起的图像变化的特征对所述动作进行分类得到的;
    根据所述类别标识确定所述动作的类别;
    在差异化策略中查找所述类别对应的内容获取策略,所述差异化策略包括各个类别对应的内容获取策略;
    根据所述内容获取策略确定发送给所述虚拟现实设备的内容;
    将所述内容发送给所述虚拟现实设备进行显示。
  9. 根据权利要求8所述的方法,其特征在于,
    当所述类别是不引起全景图像帧的数据更新的动作类别时,所述内容获取策略是获取全景视频流;
    当所述类别是引起视野内图像帧的局部数据更新的动作类别时,所述内容获取策略是获取局部差值像素;
    当所述类别是引起视野图像帧的整体数据更新的动作类别时,所述内容获取策略是获取视野图像帧。
  10. 根据权利要求9所述的方法,其特征在于,当所述类别是不引起全景图像帧的数据 更新的动作类别时,所述根据所述内容获取策略确定发送给所述虚拟现实设备的内容,包括:
    根据所述内容获取策略获取全景视频流,将所述全景视频流确定为发送给所述虚拟现实设备的内容。
  11. 根据权利要求9所述的方法,其特征在于,当所述类别是引起视野图像帧的局部数据更新的动作类别时,所述内容是局部差值像素,则所述根据所述内容获取策略确定发送给所述虚拟现实设备的内容,包括:
    当所述内容获取请求携带有所述动作的动作序列时,根据所述动作序列计算得到动作执行位置坐标和期望的视野坐标,在确定所述视野坐标和所述虚拟现实设备显示的前一个视野图像帧对应的视野坐标相同时确定所述内容获取策略是获取局部差值像素,根据所述动作序列、所述动作执行位置和情景匹配确定出的内容描述计算得到所述局部差值像素,将所述局部差值像素确定为发送给所述虚拟现实设备的内容;
    当所述内容获取请求携带有所述动作的动作序列、动作执行位置坐标和期望的视野坐标时,从所述内容获取请求中读取得到所述动作执行位置坐标和所述视野坐标,在确定所述视野坐标和所述虚拟现实设备显示前一个视野图像帧对应的视野坐标相同时确定所述内容获取策略是获取局部差值像素,根据所述动作序列、所述动作执行位置和情景匹配确定出的内容描述计算得到所述局部差值像素,将所述局部差值像素确定为发送给所述虚拟现实设备的内容。
  12. 根据权利要求11所述的方法,其特征在于,所述将所述内容发送给所述虚拟现实设备进行显示,包括:
    将未经编码压缩的所述局部差值像素发送给所述虚拟现实设备进行显示。
  13. 根据权利要求9所述的方法,其特征在于,当所述类别是引起视野图像帧的整体数据更新的动作类别时,所述内容是视野图像帧,则所述根据所述内容获取策略确定发送给所述虚拟现实设备的内容,包括:
    当所述内容获取请求携带有所述动作的动作序列时确定所述内容获取策略是获取视野图像帧,根据所述动作序列计算得到期望的视野坐标,并根据所述视野坐标、所述动作序列和情景匹配确定出的内容描述计算得到视野图像帧,将所述视野图像帧确定为发送给所述虚拟现实设备的内容;
    当所述内容获取请求携带有所述动作的动作序列和期望的视野坐标时确定所述内容获取策略是获取视野图像帧,从所述内容获取请求中读取得到所述视野坐标,根据所述视野坐标、所述动作序列和情景匹配确定出的内容描述计算得到视野图像帧,将所述视野图像帧确定为发送给所述虚拟现实设备的内容。
  14. 根据权利要求13所述的方法,其特征在于,所述将所述内容发送给所述虚拟现实设备进行显示,包括:
    对所述视野图像帧进行帧内压缩,将压缩后的所述视野图像帧发送给所述虚拟现实设备进行显示。
  15. 一种应用于虚拟现实的图像显示装置,其特征在于,所述装置包括:
    捕捉单元,用于捕捉用户当前执行的动作;
    分类单元,用于按照所述捕捉单元捉到的所述动作引起的图像变化的特征对所述动作进行分类,得到所述动作的类别标识;
    发送单元,用于将包含所述分类单元得到的所述类别标识的内容获取请求发送给内容提供设备;
    接收单元,用于接收所述内容提供设备发送的内容,所述内容是所述内容提供设备根据所述类别标识确定所述动作的类别,在差异化策略中查找所述类别对应的内容获取策略,根据所述内容获取策略确定的,所述差异化策略包括各个类别对应的内容获取策略;
    显示单元,用于根据所述接收单元接收到的所述内容显示图像帧。
  16. 根据权利要求15所述的装置,其特征在于,
    当所述类别是不引起全景图像帧的数据更新的动作类别时,所述内容获取策略是获取全景视频流;
    当所述类别是引起视野图像帧的局部数据更新的动作类别时,所述内容获取策略是获取局部差值像素;
    当所述类别是引起视野图像帧的整体数据更新的动作类别时,所述内容获取策略是获取视野图像帧。
  17. An image display apparatus applied to virtual reality, wherein the apparatus comprises:
    a receiving unit, configured to receive a content acquisition request sent by a virtual reality device, the content acquisition request including a category identifier of an action, wherein the action is an action currently performed by a user and captured by the virtual reality device, and the category identifier is obtained by the virtual reality device by classifying the action according to a characteristic of the image change caused by the action;
    a determining unit, configured to determine a category of the action according to the category identifier received by the receiving unit;
    a lookup unit, configured to look up, in a differentiated policy, a content acquisition policy corresponding to the category determined by the determining unit, wherein the differentiated policy comprises a content acquisition policy corresponding to each category;
    the determining unit being further configured to determine, according to the content acquisition policy found by the lookup unit, content to be sent to the virtual reality device; and
    a sending unit, configured to send the content obtained by the determining unit to the virtual reality device for display.
  18. The apparatus according to claim 17, wherein:
    when the category is an action category that causes no data update of a panoramic image frame, the content acquisition policy is to acquire a panoramic video stream;
    when the category is an action category that causes a partial data update of the field-of-view image frame, the content acquisition policy is to acquire local difference pixels; and
    when the category is an action category that causes a full data update of the field-of-view image frame, the content acquisition policy is to acquire a field-of-view image frame.
  19. An image display device applied to virtual reality, wherein the device comprises:
    a processor, configured to capture an action currently performed by a user;
    the processor being further configured to classify the action according to a characteristic of the image change caused by the action, to obtain a category identifier of the action;
    a transceiver, configured to send a content acquisition request including the category identifier obtained by the processor to a content providing device;
    the transceiver being further configured to receive content sent by the content providing device, wherein the content is determined by the content providing device by determining the category of the action according to the category identifier, looking up, in a differentiated policy, the content acquisition policy corresponding to the category, and determining the content according to the content acquisition policy, the differentiated policy comprising a content acquisition policy corresponding to each category; and
    the processor being further configured to display an image frame according to the content received by the transceiver.
  20. The device according to claim 19, wherein:
    when the category is an action category that causes no data update of a panoramic image frame, the content acquisition policy is to acquire a panoramic video stream;
    when the category is an action category that causes a partial data update of the field-of-view image frame, the content acquisition policy is to acquire local difference pixels; and
    when the category is an action category that causes a full data update of the field-of-view image frame, the content acquisition policy is to acquire a field-of-view image frame.
  21. An image display device applied to virtual reality, wherein the device comprises:
    a transceiver, configured to receive a content acquisition request sent by a virtual reality device, the content acquisition request including a category identifier of an action, wherein the action is an action currently performed by a user and captured by the virtual reality device, and the category identifier is obtained by the virtual reality device by classifying the action according to a characteristic of the image change caused by the action;
    a processor, configured to determine a category of the action according to the category identifier received by the transceiver;
    the processor being further configured to look up, in a differentiated policy, a content acquisition policy corresponding to the category, wherein the differentiated policy comprises a content acquisition policy corresponding to each category;
    the processor being further configured to determine, according to the content acquisition policy, content to be sent to the virtual reality device; and
    the transceiver being further configured to send the content to the virtual reality device for display.
  22. The device according to claim 21, wherein:
    when the category is an action category that causes no data update of a panoramic image frame, the content acquisition policy is to acquire a panoramic video stream;
    when the category is an action category that causes a partial data update of the field-of-view image frame, the content acquisition policy is to acquire local difference pixels; and
    when the category is an action category that causes a full data update of the field-of-view image frame, the content acquisition policy is to acquire a field-of-view image frame.
  23. A computer-readable storage medium, wherein the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the image display method applied to virtual reality according to any one of claims 1 to 7.
  24. A computer-readable storage medium, wherein the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the image display method applied to virtual reality according to any one of claims 8 to 14.
  25. An image display system applied to virtual reality, wherein the system comprises the image display apparatus applied to virtual reality according to claim 15 and the image display apparatus applied to virtual reality according to claim 17.
  26. An image display system applied to virtual reality, wherein the system comprises the image display device applied to virtual reality according to claim 19 and the image display device applied to virtual reality according to claim 21.
PCT/CN2017/112307 2017-11-22 2017-11-22 Image display method, apparatus, device, and system applied to virtual reality WO2019100247A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/112307 WO2019100247A1 (zh) 2017-11-22 2017-11-22 Image display method, apparatus, device, and system applied to virtual reality


Publications (1)

Publication Number Publication Date
WO2019100247A1 (zh) 2019-05-31

Family

ID=66630466


Country Status (1)

Country Link
WO (1) WO2019100247A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140347263A1 * 2013-05-23 2014-11-27 Fastvdo Llc Motion-Assisted Visual Language For Human Computer Interfaces
CN105487673A * 2016-01-04 2016-04-13 京东方科技集团股份有限公司 Human-computer interaction system, method, and apparatus
CN105765516A * 2013-09-30 2016-07-13 高通股份有限公司 Classification of gesture detection systems using known and yet-to-be-worn sensors
CN106249918A * 2016-08-18 2016-12-21 南京几墨网络科技有限公司 Virtual reality image display method and apparatus, and terminal device applying same
CN106445129A * 2016-09-14 2017-02-22 乐视控股(北京)有限公司 Panoramic image information display method, apparatus, and system
CN107197285A * 2017-06-06 2017-09-22 清华大学 Location-based virtual reality compression method
WO2017178862A1 * 2016-04-11 2017-10-19 Berkovs Boriss Method for modeling and simulating physical effect in interactive simulators and electronic games, system for implementing the same and method for calibrating the system



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17932597; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 17932597; Country of ref document: EP; Kind code of ref document: A1)