CN115509345A - Virtual reality scene display processing method and virtual reality equipment


Info

Publication number
CN115509345A
Authority
CN
China
Prior art keywords
eye movement
eye
user
virtual reality
head
Prior art date
Legal status
Granted
Application number
CN202210869352.5A
Other languages
Chinese (zh)
Other versions
CN115509345B
Inventor
张幸乾
张桐源
李芳慧
Current Assignee
Beijing Weishiwei Information Technology Co ltd
Original Assignee
Beijing Weishiwei Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Weishiwei Information Technology Co., Ltd.
Priority to CN202210869352.5A
Publication of CN115509345A
Application granted
Publication of CN115509345B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14Digital output to display device ; Cooperation and interconnection of the display device with other functional units
    • G06F3/147Digital output to display device ; Cooperation and interconnection of the display device with other functional units using display panels
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The disclosure relates to a display processing method for a virtual reality scene and a virtual reality device. The method includes: acquiring head movement information and eye movement information of a first user within a first time window of using a first virtual reality device; extracting the head movement features and the eye movement features of the first user from the head movement information and the eye movement information, respectively; obtaining a first eye movement type of the first user at the ending moment of the first time window according to the head movement features and the eye movement features; drawing a scene picture of the virtual reality scene at the ending moment according to the drawing precision corresponding to the first eye movement type; and displaying the scene picture through a screen of the first virtual reality device.

Description

Virtual reality scene display processing method and virtual reality equipment
Technical Field
The invention relates to the technical field of computer processing, in particular to a display processing method of a virtual reality scene and virtual reality equipment.
Background
Playing a virtual reality scene through a virtual reality device provides the user with a stronger sense of immersion than a conventional two-dimensional display and allows the user to freely explore a virtual three-dimensional world. To guarantee this immersion and user experience, the virtual reality device generally needs to provide a refresh rate of 60 Hz or more, a field of view of 100 degrees or more, and a resolution of 1920 × 1080 pixels or more, which places extremely high demands on the computing power of the virtual reality device and affects rendering efficiency.
Disclosure of Invention
It is an object of the embodiments of the present disclosure to provide a display processing scheme for a virtual reality scene of a virtual reality device to improve rendering efficiency.
According to a first aspect of the present disclosure, a method for processing display of a virtual reality scene is provided, the method including:
acquiring head movement information and eye movement information of a first user within a first time window of using a first virtual reality device; wherein the head movement information comprises head movement speeds of the first user at a plurality of sampling moments within the first time window, the eye movement information comprises gaze location information of the first user gazing at a screen of the first virtual reality device at the plurality of sampling moments;
extracting head motion features of the first user from the head motion information and extracting eye motion features of the first user from the eye motion information;
obtaining a first eye movement type of the first user at the end time of the first time window according to the head movement characteristics and the eye movement characteristics;
drawing a scene picture of the virtual reality scene at the ending moment according to the drawing precision corresponding to the first eye movement type;
and displaying the scene picture through a screen of the first virtual reality device.
Optionally, the first eye movement type is one eye movement type in a set of eye movement types, the set of eye movement types includes at least two eye movement types, and the at least two eye movement types include an eye jump type; the drawing the scene picture of the virtual reality scene at the ending time according to the drawing precision corresponding to the first eye movement type includes:
under the condition that the first eye movement type is the eye jump type, drawing a scene picture of the virtual reality scene at the ending moment according to drawing precision corresponding to the eye jump type; and different eye movement types in the eye movement type set correspond to different drawing accuracies, and the drawing accuracy corresponding to the eye jump type is lower than the drawing accuracy corresponding to other eye movement types in the eye movement type set.
Optionally, the at least two eye movement types further include a gaze type, where the gaze type corresponds to two rendering accuracies, namely a first rendering accuracy and a second rendering accuracy, and the first rendering accuracy is higher than the second rendering accuracy;
the method further includes, according to a rendering accuracy corresponding to the first eye movement type, rendering the virtual reality scene before the scene picture at the end time, where the method further includes:
acquiring an eye fixation position of the first user on the screen at the ending moment;
the drawing the scene picture of the virtual reality scene at the ending time according to the drawing precision corresponding to the first eye movement type further includes:
under the condition that the first eye movement type is the gaze type, drawing a first scene picture corresponding to a first screen area centered on the eye gaze position according to the first drawing precision, and drawing a second scene picture corresponding to a second screen area according to the second drawing precision; the first screen area and the second screen area constitute the display area of the screen, and the first scene picture and the second scene picture constitute the scene picture of the virtual reality scene at the ending moment.
Optionally, the extracting the head movement feature of the first user from the head movement information includes:
inputting the head movement information into a preset first feature extraction model to obtain the head movement feature of the first user; wherein the first feature extraction model comprises a first convolutional neural network and a first timing network connected in series, the first convolutional neural network receives the head motion information, and the first timing network outputs the head motion feature.
Optionally, the first convolutional neural network comprises three first network units connected in series, and each first network unit comprises a convolutional layer, a batch normalization layer, an activation function layer and a max-pooling layer which are connected in sequence.
Optionally, the first timing network is a bidirectional gated recurrent unit.
Optionally, the extracting the eye movement feature of the first user from the eye movement information includes:
inputting the eye movement information into a preset second feature extraction model to obtain the eye movement features; the second feature extraction model comprises a second convolutional neural network and a second time sequence network which are connected in series, the second convolutional neural network receives the eye movement information, and the second time sequence network outputs the eye movement features.
Optionally, the obtaining a first eye movement type of the first user at the end time of the first time window according to the head movement feature and the eye movement feature includes:
inputting the head movement characteristics and the eye movement characteristics into a preset classification model to obtain a first eye movement type of the user at the ending moment; wherein the classification model is configured to obtain an eye movement type classification result at the end time of the first time window according to motion features of the first time window, wherein the motion features comprise the head motion feature and the eye motion feature;
the classification model comprises two second network units and a Softmax network layer which are connected in sequence, wherein each second network unit comprises a full connection layer, a batch normalization layer, an activation function layer and a random inactivation layer which are connected in sequence.
Optionally, the extracting the head movement feature of the first user from the head movement information and the extracting the eye movement feature of the first user from the eye movement information includes:
inputting the head movement information into a preset first feature extraction model to obtain the head movement feature;
inputting the eye movement information into a preset second feature extraction model to obtain the eye movement features;
the obtaining a first eye movement type of the first user at an end time of the first time window according to the head movement feature and the eye movement feature includes:
inputting the head movement characteristics and the eye movement characteristics into a preset classification model to obtain a first eye movement type of the first user at the end time of the first time window;
the model parameters of the first feature extraction model, the model parameters of the second feature extraction model and the model parameters of the classification model are obtained by synchronous training of the same training sample set, each training sample in the training sample set comprises sample data and a sample label, the sample data comprises head motion information and eye motion information of a second user in a second time window using second virtual reality equipment, and the sample label is an eye movement type of the second user at the end time of the second time window.
According to a second aspect of the present disclosure, there is also provided a virtual reality device comprising a memory for storing a computer program and a processor for executing the display processing method according to the first aspect of the present disclosure under the control of the computer program.
One beneficial effect of the display processing method of the embodiments of the present disclosure is that the first eye movement type of the first user at the ending moment of the first time window is determined according to the head movement information and the eye movement information of the first user within the first time window of using the first virtual reality device, and the scene picture of the virtual reality scene at the ending moment is drawn and displayed according to the drawing precision corresponding to the first eye movement type, so that the drawing precision can be set reasonably according to the eye movement type and drawing efficiency is improved without affecting the user's viewing experience.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments of the invention, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic view of an application scenario of a display processing method according to an embodiment of the present disclosure;
FIG. 2 is a flow diagram of a display processing method according to some embodiments;
FIG. 3 is a schematic diagram of a model structure for feature extraction and eye movement classification according to some embodiments;
FIG. 4 is a model structure diagram of a feature extraction model according to some embodiments;
fig. 5 is a hardware architecture diagram of a virtual reality device according to some embodiments.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as exemplary only and not as limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The embodiment of the disclosure relates to a display processing method for a virtual reality scene of a virtual reality device. The virtual reality scene can be any scene, for example, the virtual reality scene can be a movie and television video in the aspect of movie and television entertainment; for another example, the virtual reality scene may be a game scene; for another example, the virtual reality scene may also be a virtual experiment scene used in scientific research and teaching in the fields of medical treatment, aerospace, and the like; for another example, the virtual reality scenario may also be various design scenarios used in product design, and the like, which are not limited herein.
Since the virtual reality scene played by the virtual reality device is a 360-degree panorama, in use the virtual reality device needs to draw, at each moment, the scene picture corresponding to the user's field of view at that moment, so that the user can freely explore the three-dimensional world through the virtual reality device. In order to ensure the user's immersion and experience in the virtual reality scene, the drawing of the scene picture generally needs to meet a refresh rate of 60 Hz or more, a field of view of 100 degrees or more, and a resolution of 1920 × 1080 pixels or more; such drawing precision places extremely high demands on the computing capacity of the virtual reality device and affects drawing efficiency.
In order to solve the problem of low drawing efficiency for virtual reality scenes, the inventors studied the user's ability to perceive the picture in different eye movement states and found that this ability differs greatly between states. As shown in fig. 1, when a user watches the scene picture of a virtual reality scene through a virtual reality device 1000 and gazes at the screen, that is, when the eye movement type is the gaze type, the user's perception of the scene picture is relatively strong and the user is sensitive to the sharpness of the picture; in this case, scene picture 1, drawn with high precision, can be output to guarantee the user's immersion and experience. When the user's eyes jump up and down and/or left and right, that is, when the eye movement type is the eye jump type, the user's perception of the scene picture is very weak and the visual scene basically cannot be seen clearly; even if scene picture 1 with high drawing precision were output, the user would not perceive it, so scene picture 2, drawn with low precision, can be output instead to improve drawing efficiency.
Based on this research, the inventors propose a technical solution in which, while the virtual reality scene is being played, the eye movement type of the user is identified and the scene picture is drawn for display according to the drawing precision corresponding to the identified eye movement type, so that the drawing efficiency of the virtual reality device for the virtual reality scene is improved without affecting the user's viewing.
Fig. 2 shows a hardware configuration diagram of a virtual reality device 1000 that can be used to implement the display processing method according to the embodiment of the present disclosure.
In some embodiments, the virtual reality device 1000 may be a virtual reality (VR) all-in-one machine; in that case, the virtual reality device 1000 is itself a head-mounted device that integrates display, processing, and other functions.
In other embodiments, the virtual reality device 1000 may include a head-mounted device and a host connected by wire or wirelessly. The virtual reality device 1000 may process the scene video either in the head-mounted device or in the host, with the host sending the processed scene picture to the head-mounted device for display and output; this is not limited herein. In these embodiments, the display processing method according to the embodiments of the present disclosure may be implemented by the host or by the head-mounted device; the head-mounted device may send the collected head motion information and eye motion information to the host for eye movement type recognition and scene picture drawing, and receive the scene picture drawn by the host for display.
As shown in fig. 2, the virtual reality apparatus 1000 may include a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an eye tracking device 1600, a head tracking device 1700, a speaker 1800, and the like.
The processor 1100 is used to execute computer programs, which may be written in instruction sets of architectures such as x86, ARM, RISC, MIPS, SSE, and the like. The memory 1200 includes, for example, ROM (read-only memory), RAM (random access memory), and nonvolatile memory such as a hard disk. The interface device 1300 includes, for example, a USB interface, a headphone interface, and a network cable interface. The communication device 1400 is capable of wired or wireless communication; for example, it may include at least one short-range communication module, such as any module performing short-range wireless communication based on protocols such as HiLink, WiFi (IEEE 802.11), Mesh, Bluetooth, ZigBee, Thread, Z-Wave, NFC, UWB, and LiFi, and it may also include a long-range communication module, such as any module performing WLAN, GPRS, or 2G/3G/4G/5G long-range communication. The display device 1500 is, for example, a liquid crystal display and is provided in the head-mounted device. The eye tracking device 1600 is used to track the user's gaze location on the screen of the display device 1500 and is located in the head-mounted device. The head motion tracking device 1700 employs, for example, a gyroscope or an inertial measurement unit (IMU) and is located in the head-mounted device. The speaker 1800 is used to output the audio of the played virtual reality scene.
In this embodiment, the memory 1200 of the virtual reality apparatus 1000 is used for storing a computer program, which is used for controlling the processor 1100 to operate, so as to control the virtual reality apparatus 1000 to implement the display processing method of the virtual reality scene according to the embodiment of the present disclosure, and the like. A skilled person can design a computer program according to the solution disclosed in the present invention. How computer programs control the operation of the processor is well known in the art and will not be described in detail herein.
FIG. 2 illustrates a flow diagram of a method of display processing of a virtual reality scene, in accordance with some embodiments. The method is implemented by a virtual reality device that plays a virtual reality scene; the display processing method of this embodiment is described below by taking the virtual reality device 1000 as the first virtual reality device.
As shown in fig. 2, the display processing method of the present embodiment includes steps S210 to S250:
step S210, head movement information and eye movement information of the first user within a first time window of using the first virtual reality device are acquired.
Studies show that the eye movement type is highly correlated with the user's eye movement speed and head movement speed. Therefore, in order to determine the user's eye movement type, head movement information and eye movement information of the first user during a period of time (i.e., within the first time window) of using the first virtual reality device 1000 may be acquired. The first virtual reality device 1000 acquires the head motion information and the eye motion information of the first user within the first time window (t − Δt1, t), so as to determine from this information the eye movement type of the first user at the ending moment of the first time window (i.e., time t).
In the first time window (t − Δt1, t), time t is a certain moment during playback and Δt1 is the length of the first time window, which can be set as required. For example, in this embodiment, time t is a picture refresh moment determined according to the refresh rate, and the first virtual reality device 1000 determines the eye movement type of the first user at time t, i.e., the first eye movement type, from the head movement information and the eye movement information within the window of length Δt1 preceding that refresh moment.
The length Δ t1 of the first time window may be set in a range greater than or equal to 0.5 second and less than or equal to 2 seconds, for example, Δ t1 is set to 1 second, so as to reduce the data processing amount on the premise of obtaining a satisfactory eye movement classification result.
In this embodiment, the head movement information comprises the head movement speed of the first user at a plurality of sampling instants within the first time window, i.e. the head movement information is a first time sequence related to the head movement speed. The head movement speed can be obtained according to the head posture information acquired by the head movement tracking device, therefore, the plurality of sampling moments in the first time window can be determined according to the sampling frequency of the virtual reality device for the head posture information and the initial sampling moment, for example, the sampling frequency is 100Hz, the length delta t1 of the first time window is 1 second, 100 sampling points can be generated in the first time window, each sampling point corresponds to one sampling moment, the ith sampling point in the first time window is set to correspond to the sampling moment ti, wherein ti is greater than or equal to (t-delta t 1) and is less than or equal to t.
The head movement speed of the first user at the sampling moment ti may be denoted as (HX_ti, HY_ti), where ti is greater than or equal to (t − Δt1) and less than or equal to t, HX_ti represents the speed at which the first user's head moves left and right at the sampling moment ti, and HY_ti represents the speed at which the first user's head moves up and down at the sampling moment ti. The head movement information of the first user within the first time window can then be expressed as the sequence H = {(HX_ti, HY_ti) | t − Δt1 ≤ ti ≤ t}.
In this embodiment, the eye movement information includes gaze position information that the first user gazes at the screen of the first virtual reality device at the above multiple sampling moments, that is, the eye movement information is a second time series related to the gaze position information, where, under a certain sampling rate, a difference value of adjacent gaze position information in the eye movement information reflects an eye movement speed, and therefore, the eye movement information includes not only the gaze position information but also the eye movement speed information. In the present embodiment, each sampling instant within the first time window has a corresponding head movement speed and gaze location information, which can be understood as the above first time series having the same time node as the second time series.
The gaze position information of the first user at the sampling moment ti may be denoted as (EX_ti, EY_ti), where EX_ti represents the gaze position coordinate of the first user in the width direction of the screen at the sampling moment ti, and EY_ti represents the gaze position coordinate of the first user in the height direction of the screen at the sampling moment ti. The eye movement information of the first user within the first time window can then be expressed as the sequence E = {(EX_ti, EY_ti) | t − Δt1 ≤ ti ≤ t}.
For example, if the coordinates of the lower left corner of the screen of the first virtual reality device are (0, 0) and the coordinates of the upper right corner are (1, 1), then EX_ti ∈ [0, 1] and EY_ti ∈ [0, 1]. The gaze position information may be obtained from data collected by any type of eye tracking device configured with the first virtual reality device.
In this embodiment, the first virtual reality device 1000 may synchronously acquire the head movement speed and the gaze position information according to the same sampling frequency, so that in step S210, the head movement information and the eye movement information of the first user in the first time window can be obtained according to the head movement speed and the gaze position information of each of the plurality of sampling moments of the first user in the first time window.
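For illustration only, the following is a minimal sketch of how the two per-window time series described above might be assembled, assuming the 100 Hz sampling rate and 1-second window used in the example; the function and variable names are hypothetical and not part of the disclosure.

```python
import numpy as np

SAMPLE_RATE_HZ = 100      # assumed sampling frequency for head pose and gaze
WINDOW_SECONDS = 1.0      # assumed first-time-window length (Δt1)
WINDOW_SAMPLES = int(SAMPLE_RATE_HZ * WINDOW_SECONDS)

def build_window_inputs(head_speeds, gaze_points):
    """Stack per-sample (HX, HY) head speeds and (EX, EY) gaze positions
    for the most recent WINDOW_SAMPLES samples into two (2, T) arrays.

    head_speeds: list of (hx, hy) head-speed pairs, newest last.
    gaze_points: list of (ex, ey) normalized screen coordinates in [0, 1].
    """
    head = np.asarray(head_speeds[-WINDOW_SAMPLES:], dtype=np.float32)  # (T, 2)
    eye = np.asarray(gaze_points[-WINDOW_SAMPLES:], dtype=np.float32)   # (T, 2)
    # Channels-first layout (2, T) so a 1-D CNN can consume the series directly.
    return head.T, eye.T
```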
Step S220, extracting the head movement feature of the first user from the head movement information, and extracting the eye movement feature of the first user from the eye movement information.
The head movement information reflects a head movement speed of the first user within the first time window, and therefore, the first virtual reality device may extract head movement characteristics of the first user from the head movement information. The first virtual reality device can extract head motion characteristics from the head motion information through network structures such as a convolutional neural network.
The eye movement information reflects the eye movement direction, the eye movement speed and other characteristics of the first user in the first time window, and therefore the first virtual reality device can extract the eye movement characteristics of the first user from the eye movement information. The first virtual reality device can extract the eye movement features from the eye movement information through a network structure such as a convolutional neural network.
In some embodiments, extracting the head movement characteristics of the first user from the head movement information in step S220 may include: and inputting the head movement information into a preset first feature extraction model to obtain the head movement feature.
In these embodiments, as shown in fig. 3, the first feature extraction model M1 may include a first convolutional neural network CNN1 and a first timing network TN1 connected in series, where CNN1 receives the head motion information H and TN1 outputs the head motion feature; that is, the output of CNN1 is the input of TN1. The first feature extraction model M1 extracts the posture features in the head motion information through CNN1 and the time-related features through TN1; this feature extraction mode is favorable for improving the accuracy of eye movement classification.
The first timing network TN1 may adopt a bidirectional gated recurrent unit (BiGRU), a gated recurrent unit (GRU), a long short-term memory network (LSTM), a bidirectional long short-term memory network (BiLSTM), or the like, which is not limited herein.
In some embodiments, referring to fig. 4, the first timing network TN1 employs BiGRU, and in the case of employing BiGRU, the BiGRU outputs hidden states corresponding to the first and last time steps, respectively, for eye movement classification.
As shown in fig. 4, the first convolutional neural network CNN1 may include three first network units connected in series, and each of the first network units may include a convolutional layer, a batch normalization layer, an activation function layer, and a max pooling layer, which are connected in sequence.
The size of each layer in the first network element may be set as desired. For example, the convolutional layer of the first network unit may use one-dimensional convolution, the convolutional kernel size is 3, and the convolutional layer has 16 output channels. As another example, the activation function layer of the first network element may employ a ReLU activation function. As another example, the largest pooling layer of the first network element may use, for example, a pooling layer with a size of 2, so as to perform a dimension reduction process on the upper layer output by half.
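By way of illustration, the following PyTorch sketch combines three first network units of the kind just described (Conv1d with kernel size 3 and 16 output channels, batch normalization, ReLU, max pooling of size 2) with a BiGRU whose first- and last-step hidden states are kept; the hidden size, padding, and class names are assumptions not specified in the text.

```python
import torch
import torch.nn as nn

class ConvUnit(nn.Module):
    """One 'first network unit': Conv1d(k=3, 16 ch) -> BatchNorm -> ReLU -> MaxPool(2)."""
    def __init__(self, in_ch, out_ch=16):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
    def forward(self, x):
        return self.block(x)

class FeatureExtractor(nn.Module):
    """Three ConvUnits followed by a BiGRU; returns the concatenated hidden
    states of the first and last time steps, as described for M1 (and M2)."""
    def __init__(self, in_ch=2, hidden=32):
        super().__init__()
        self.cnn = nn.Sequential(ConvUnit(in_ch), ConvUnit(16), ConvUnit(16))
        self.bigru = nn.GRU(16, hidden, batch_first=True, bidirectional=True)
    def forward(self, x):            # x: (batch, 2, T)
        h = self.cnn(x)              # (batch, 16, T/8)
        h = h.transpose(1, 2)        # (batch, T/8, 16) for the GRU
        out, _ = self.bigru(h)       # (batch, T/8, 2*hidden)
        return torch.cat([out[:, 0], out[:, -1]], dim=-1)  # first + last step
```

The second feature extraction model M2 described below could reuse the same class with its own parameters, consistent with the shared-structure design discussed later.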
In some embodiments, extracting the eye movement characteristics of the first user from the eye movement information in step S220 may include: and inputting the eye movement information into a preset second feature extraction model to obtain the eye movement features.
In these embodiments, as shown in fig. 3, the second feature extraction model M2 may include a second convolutional neural network CNN2 and a second timing network TN2 connected in series, where CNN2 receives the eye movement information E and TN2 outputs the eye movement features; that is, the output of CNN2 is the input of TN2. The second feature extraction model M2 extracts the spatial position features in the eye movement information through CNN2 and the time-related features through TN2; this feature extraction mode is favorable for improving the accuracy of eye movement classification.
The second timing network TN2 may adopt a bidirectional gated recurrent unit (BiGRU), a gated recurrent unit (GRU), a long short-term memory network (LSTM), a bidirectional long short-term memory network (BiLSTM), or the like, which is not limited herein. For example, the second timing network TN2 may employ a BiGRU that outputs the hidden states corresponding to the first and last time steps, respectively, for eye movement classification.
The second convolutional neural network CNN2 and the first convolutional neural network CNN1 may have the same network structure; therefore, for CNN2, reference may be made to the first convolutional neural network CNN1 shown in fig. 4, and details are not repeated here. It should be clear to those skilled in the art that CNN2 and CNN1 may have different model parameters while having the same network structure, so as to improve the effectiveness of extracting the required features from the corresponding information.
In some embodiments, the first virtual reality device may extract the head motion feature and the eye motion feature with models having the same network structure, so that the head motion information and the eye motion information are processed consistently and the accuracy of eye movement classification based on the extracted features is improved. That is, the first feature extraction model M1 shown in fig. 3 for extracting the head motion feature and the second feature extraction model M2 shown in fig. 3 for extracting the eye motion feature may have the same network structure; for example, the second convolutional neural network CNN2 of M2 and the first convolutional neural network CNN1 of M1 may have the same network structure. However, M1 and M2 may have different model parameters, and the specific model parameters may be determined by training on multiple samples in the same application scenario, so as to improve the effectiveness of the feature extraction performed by each model.
Step S230, obtaining a first eye movement type of the first user at the ending time of the first time window according to the head movement characteristic and the eye movement characteristic.
Since the head movement speed and the eye movement speed are highly related to the eye movement type, in step S230, the eye movement type of the first user at the ending time (i.e. time t) of the first time window may be determined and marked as the first eye movement type according to the head movement features extracted from the head movement information reflecting the head movement speed and the eye movement features extracted from the eye movement information reflecting the eye movement speed.
In some embodiments, obtaining the first eye movement type of the first user at the ending time of the first time window according to the head movement feature and the eye movement feature in step S230 may include: and inputting the head movement characteristics and the eye movement characteristics into a preset classification model to obtain a first eye movement type of the user at the end moment of the first time window.
In these embodiments, the classification model is arranged to derive the eye movement type classification result at the end of the first time window based on motion features of the first time window, the motion features including head motion features and eye motion features. The model parameters of the classification model can be obtained through sample training under the same application scene.
In some embodiments, as shown in fig. 3, the classification model M3 may include two second network elements and one Softmax network layer connected in sequence, each second network element including a fully-connected layer, a batch normalization layer, an activation function layer, and a random deactivation layer connected in sequence. The Softmax network layer is used for generating probabilities corresponding to different eye movement types to realize eye movement classification.
Each layer of the second network unit may be configured according to the classification requirement, which is not limited herein. For example, the number of neurons in the fully connected layer is 64, and the extracted features are integrated by the fully connected layer. As another example, the activation function layer employs a ReLU activation function. For another example, the deactivation rate (dropout rate) of the random deactivation layer, which is used to increase the generalization ability of the network, is set to 0.5.
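A hedged PyTorch sketch of such a classification model (two fully-connected/batch-norm/ReLU/dropout units followed by a softmax layer) is shown below; the input feature dimension, the number of eye movement types, and the class name are assumptions.

```python
import torch
import torch.nn as nn

class EyeMovementClassifier(nn.Module):
    """Classification model M3 sketch: two (Linear -> BatchNorm -> ReLU -> Dropout)
    units followed by a softmax over the eye movement types."""
    def __init__(self, feat_dim, num_types=2, hidden=64, p_drop=0.5):
        super().__init__()
        def unit(d_in, d_out):
            return nn.Sequential(
                nn.Linear(d_in, d_out),
                nn.BatchNorm1d(d_out),
                nn.ReLU(),
                nn.Dropout(p_drop),
            )
        self.net = nn.Sequential(unit(feat_dim, hidden), unit(hidden, hidden))
        self.head = nn.Linear(hidden, num_types)
    def forward(self, head_feat, eye_feat):
        # feat_dim must equal the head-feature and eye-feature dimensions combined.
        x = torch.cat([head_feat, eye_feat], dim=-1)
        return torch.softmax(self.head(self.net(x)), dim=-1)  # per-type probabilities
```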
In step S240, a scene image of the virtual reality scene at the end time of the first time window is drawn with a drawing accuracy corresponding to the first eye movement type.
The end time of the first time window, i.e. the time t, may be a picture refresh time determined according to the refresh frequency. In this embodiment, the first virtual reality device may determine the eye movement type of the first user at each frame refreshing time, and perform drawing of the corresponding scene frame, thereby improving accuracy of drawing the scene frame according to the eye movement type. In this embodiment, the first virtual reality device may also set, after determining the first eye movement type at the time t, that the determined first eye movement type is valid within a short set time length (for example, less than or equal to 1 s) from the time t, and then may perform drawing of a scene picture according to the first eye movement type within the set time length after the first time window, so as to reduce data processing amount.
In this embodiment, a plurality of eye movement types and rendering accuracies corresponding to each eye movement type may be preset in the first virtual reality device 1000, for example, mapping data reflecting correspondence between the plurality of eye movement types and the plurality of rendering accuracies is stored in the first virtual reality device 1000, so that, when the first eye movement type of the first user at time t is determined in step S230, the rendering accuracy corresponding to the determined first eye movement type may be obtained according to the mapping data.
In this embodiment, the rendering accuracy may include at least one of a refresh frequency, a field angle, and a resolution. The different rendering accuracies differ at least in resolution, the resolution of the higher rendering accuracy being higher than the resolution of the lower rendering accuracy.
In some embodiments, the first eye movement type is one of a set of eye movement types, the set of eye movement types including at least two eye movement types, the at least two eye movement types including an eye jump type. In these embodiments, the step S240 of rendering the scene picture of the virtual reality scene at the end time with the rendering accuracy corresponding to the first eye movement type may include: under the condition that the first eye movement type is the eye jump type, drawing a scene picture of the virtual reality scene at the end time according to drawing precision corresponding to the eye jump type; and different eye movement types in the eye movement type set correspond to different drawing precisions, and the drawing precision corresponding to the eye jump type is lower than the drawing precision corresponding to other eye movement types in the eye movement type set.
In the embodiments, since the perception capability of the user on the scene picture is very weak under the condition of the eye jump, and the user basically cannot see the visual scene clearly, the scene picture at the corresponding moment can be drawn according to the lowest drawing precision to be displayed under the condition that the first eye movement type is the eye jump type, so that the drawing efficiency is improved.
In some embodiments, at least two eye movement types of the set of eye movement types may further comprise a gaze type. In these embodiments, the gaze type may correspond to two rendering accuracies, a first rendering accuracy and a second rendering accuracy, respectively, wherein the first rendering accuracy is higher than the second rendering accuracy. Correspondingly, in step S240, the scene picture of the virtual reality scene at the end time is drawn with the drawing precision corresponding to the first eye movement type, and the method may further include: an eye gaze location of a first user on a screen at an end time of a first time window is obtained. In step S240, the rendering of the scene picture of the virtual reality scene at the ending time according to the rendering accuracy corresponding to the first eye movement type may further include: under the condition that the first eye movement type is a watching type, drawing a first scene picture corresponding to a first screen area taking the eye watching position as a center according to a first drawing precision, and drawing a second scene picture corresponding to a second screen area according to a second drawing precision; the first screen area and the second screen area form a display area of a screen, and the first scene picture and the second scene picture form a scene picture of the virtual reality scene at the end time of the first time window.
In these embodiments, when the first eye movement type is the gaze type, the user's perception of the first scene picture in the first screen area centered on the eye gaze position is strong while gazing at the screen, whereas the perception of the second scene picture in the second screen area outside the first screen area is weak. Drawing the scene picture with different precisions for the different screen areas in the gaze case can therefore likewise improve drawing efficiency.
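The following sketch illustrates, under assumed type names and precision values, how the mapping data and the two drawing cases described above might be expressed in code; the concrete precision settings, region shape, and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RenderPrecision:
    """Illustrative precision settings; actual values are device-dependent."""
    resolution_scale: float   # fraction of full panel resolution
    refresh_hz: int           # refresh rate used when drawing at this precision

# Hypothetical mapping data from eye movement type to drawing precision.
PRECISION_BY_TYPE = {
    "eye_jump": RenderPrecision(resolution_scale=0.25, refresh_hz=60),    # lowest precision
    "gaze_first": RenderPrecision(resolution_scale=1.0, refresh_hz=90),   # first drawing precision
    "gaze_second": RenderPrecision(resolution_scale=0.5, refresh_hz=90),  # second drawing precision
}

def plan_frame(eye_movement_type, gaze_xy=None, first_area_radius=0.15):
    """Return (screen_region, precision) draw commands for one refresh moment."""
    if eye_movement_type == "eye_jump":
        # Whole frame at the lowest precision while the eyes are jumping.
        return [("full_screen", PRECISION_BY_TYPE["eye_jump"])]
    # Gaze type: a first screen area centered on the gaze position drawn at the
    # first precision, the remaining (second) screen area at the second precision.
    first_area = ("circle", gaze_xy, first_area_radius)
    return [
        (first_area, PRECISION_BY_TYPE["gaze_first"]),
        ("remaining_screen", PRECISION_BY_TYPE["gaze_second"]),
    ]
```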
And step S250, displaying the scene picture through the screen of the first virtual reality device.
After the scene picture of the virtual reality scene at the time t is drawn in step S240, the scene picture may be displayed through the screen of the first virtual reality device for the user to view.
As can be seen from steps S210 to S250, in the display processing method of the virtual reality scene according to this embodiment, the first eye movement type of the first user at the ending time of the first time window may be determined according to the head movement information and the eye movement information of the first user in the first time window, and then the scene picture of the virtual reality scene at the time may be drawn for display according to the drawing accuracy corresponding to the determined first eye movement type, so that the reasonable setting of the drawing accuracy may be performed according to the eye movement type, so as to improve the drawing efficiency without affecting the immersion feeling and the experience feeling of the user.
In some embodiments, extracting the head movement feature of the first user from the head movement information and extracting the eye movement feature of the first user from the eye movement information in the above step S220 includes: inputting the head movement information into a preset first feature extraction model to obtain head movement features; and inputting the eye movement information into a preset second feature extraction model to obtain the eye movement features. In the above step S230, obtaining the first eye movement type of the first user at the ending time of the first time window according to the head movement feature and the eye movement feature includes: and inputting the head movement characteristics and the eye movement characteristics into a preset classification model to obtain a first eye movement type of the first user at the end moment of the first time window.
As shown in fig. 3, after the head movement information H and the eye movement information E are input to the first feature extraction model M1 and the second feature extraction model M2, respectively, the classification model M3 may output the first eye movement type. Here, the classification model M3 may output the probability of each eye movement type given the head movement information H and the eye movement information E, and the eye movement type with the highest probability is the first eye movement type.
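A minimal inference sketch wiring the two feature extractors and the classifier together (models assumed to already be trained and placed in eval() mode) might look as follows; all names are illustrative.

```python
import torch

@torch.no_grad()
def classify_eye_movement(head_window, eye_window, m1, m2, m3, type_names):
    """Run the two feature extractors and the classifier on one time window
    and return the eye movement type with the highest probability."""
    head_x = torch.as_tensor(head_window).unsqueeze(0)  # (1, 2, T)
    eye_x = torch.as_tensor(eye_window).unsqueeze(0)    # (1, 2, T)
    head_feat = m1(head_x)           # first feature extraction model M1
    eye_feat = m2(eye_x)             # second feature extraction model M2
    probs = m3(head_feat, eye_feat)  # per-type probabilities from the softmax layer
    return type_names[int(probs.argmax(dim=-1))]
```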
The model parameters of the first feature extraction model M1, the model parameters of the second feature extraction model M2, and the model parameters of the classification model M3 may be obtained by synchronous training using the same training sample set, that is, the first feature extraction model M1, the second feature extraction model M2, and the classification model M3 are used as an integral model, and the model parameters of the integral model are trained by the training sample set, where the model parameters of the integral model include the model parameters of the first feature extraction model M1, the model parameters of the second feature extraction model M2, and the model parameters of the classification model M3.
Each training sample in the training sample set comprises sample data and a sample label, the sample data comprises head movement information and eye movement information of the second user in a second time window using the second virtual reality device, and the sample label is an eye movement type of the second user at the end time of the second time window.
For the collection of sample data, the method of obtaining the head motion information and the eye motion information of the first user in the first time window in step S210 may be referred to, where the length of the second time window is Δ t1, and the sampling frequency and the like may also be set in the same way, so as to improve the classification accuracy of the whole model obtained by training, which is not described herein again.
For the collection of the training sample set, the second virtual reality device and the first virtual reality device may be the same device or different devices, which is not limited herein.
In this embodiment, a plurality of second users may participate in the sample collection, and the plurality of second users may include the first user or may not include the first user, which is not limited herein.
The loss function, hyperparameters, and other settings used to train the model parameters of the whole model may be chosen as needed and are not limited herein. For example, when training the whole model on the training sample set, a cross-entropy loss function can be adopted, with weight decay set to 1 × 10^-4 to minimize the training loss. For another example, the initial learning rate may be set to 0.01 and decayed by a factor of 0.95 every epoch using an exponential decay strategy. As another example, the whole model is trained for a total of 100 epochs with a batch size of 256, and so on.
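A training-loop sketch using the hyperparameters mentioned above (cross-entropy loss, weight decay 1 × 10^-4, initial learning rate 0.01 with 0.95 exponential decay per epoch, 100 epochs, batch size 256) is given below; the choice of SGD as the optimizer and the dataset layout are assumptions not stated in the text.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader

def train(m1, m2, m3, dataset, epochs=100, batch_size=256):
    """Jointly train M1, M2 and M3 on (head_window, eye_window, label) samples."""
    params = list(m1.parameters()) + list(m2.parameters()) + list(m3.parameters())
    opt = optim.SGD(params, lr=0.01, weight_decay=1e-4)        # optimizer choice assumed
    sched = optim.lr_scheduler.ExponentialLR(opt, gamma=0.95)  # decay 0.95 per epoch
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    for _ in range(epochs):
        for head_x, eye_x, label in loader:
            probs = m3(m1(head_x), m2(eye_x))
            # Cross-entropy on the softmax output; equivalently one would train
            # on pre-softmax logits with nn.CrossEntropyLoss.
            loss = nn.functional.nll_loss(torch.log(probs.clamp_min(1e-8)), label)
            opt.zero_grad()
            loss.backward()
            opt.step()
        sched.step()
```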
The embodiment of the disclosure also provides a virtual reality device for implementing the display processing method. As shown in fig. 5, the virtual reality apparatus 500 includes a memory 520 and a processor 510, the memory 520 is used for storing computer programs, and the processor 510 is used for executing a display processing method according to any embodiment of the present disclosure under the control of the computer programs.
The virtual reality device may be a VR all-in-one machine with only a head mount device, and may also include a head mount device and a host, which are not limited herein.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
Computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. Implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent as known to those skilled in the art.
While embodiments of the present invention have been described above, the above description is illustrative rather than exhaustive and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (10)

1. A display processing method for a virtual reality scene, characterized by comprising:
acquiring head movement information and eye movement information of a first user within a first time window of using a first virtual reality device; wherein the head movement information comprises head movement speeds of the first user at a plurality of sampling moments within the first time window, and the eye movement information comprises gaze position information of the first user gazing at a screen of the first virtual reality device at the plurality of sampling moments;
extracting head movement features of the first user from the head movement information and extracting eye movement features of the first user from the eye movement information;
obtaining a first eye movement type of the first user at the end time of the first time window according to the head movement features and the eye movement features;
rendering a scene picture of the virtual reality scene at the end time with a rendering precision corresponding to the first eye movement type; and
displaying the scene picture on the screen of the first virtual reality device.
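
For illustration only and not part of the claims: the sketch below outlines one way the method of claim 1 might be wired together, assuming PyTorch. All names (head_model, eye_model, classifier, renderer, screen), the label order, and the precision mapping are hypothetical assumptions, not elements defined by the patent.

import torch

EYE_MOVEMENT_TYPES = ["gaze", "eye_jump"]                 # assumed label order of the classifier
RENDER_PRECISION = {"gaze": "high", "eye_jump": "low"}    # eye jump type maps to the lowest precision

def process_time_window(head_speeds, gaze_positions, head_model, eye_model,
                        classifier, renderer, screen):
    """head_speeds: (T,) head movement speeds; gaze_positions: (T, 2) on-screen gaze points."""
    head_feat = head_model(head_speeds.unsqueeze(0))       # head movement features, shape (1, D1)
    eye_feat = eye_model(gaze_positions.unsqueeze(0))      # eye movement features, shape (1, D2)
    probs = classifier(head_feat, eye_feat)                # eye movement type probabilities
    eye_movement_type = EYE_MOVEMENT_TYPES[probs.argmax(dim=-1).item()]
    frame = renderer.draw(precision=RENDER_PRECISION[eye_movement_type])
    screen.display(frame)                                  # show the scene picture at the window's end time
    return eye_movement_type
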
2. The method of claim 1, wherein the first eye movement type is one of a set of eye movement types, the set comprising at least two eye movement types including an eye jump (saccade) type; and the rendering of the scene picture of the virtual reality scene at the end time with the rendering precision corresponding to the first eye movement type comprises:
in a case where the first eye movement type is the eye jump type, rendering the scene picture of the virtual reality scene at the end time with the rendering precision corresponding to the eye jump type; wherein different eye movement types in the set of eye movement types correspond to different rendering precisions, and the rendering precision corresponding to the eye jump type is lower than the rendering precisions corresponding to the other eye movement types in the set.
3. The method of claim 2, wherein the at least two eye movement types further comprise a gaze type, the gaze type corresponding to two rendering precisions, namely a first rendering precision and a second rendering precision, the first rendering precision being higher than the second rendering precision;
before the rendering of the scene picture of the virtual reality scene at the end time with the rendering precision corresponding to the first eye movement type, the method further comprises:
acquiring an eye gaze position of the first user on the screen at the end time;
and the rendering of the scene picture of the virtual reality scene at the end time with the rendering precision corresponding to the first eye movement type further comprises:
in a case where the first eye movement type is the gaze type, rendering a first scene picture corresponding to a first screen area centered on the eye gaze position with the first rendering precision, and rendering a second scene picture corresponding to a second screen area with the second rendering precision; wherein the first screen area and the second screen area together form the display area of the screen, and the first scene picture and the second scene picture together form the scene picture of the virtual reality scene at the end time.
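
For illustration only and not part of the claims: a minimal sketch of how the two-precision composition of claim 3 could be realized, assuming NumPy and assuming a circular first screen area with a 200-pixel radius (the claim does not specify the shape or size of the first screen area). Both input frames are assumed to be full-resolution (H, W, 3) arrays, the second rendered at lower precision and upsampled.

import numpy as np

def foveated_masks(screen_w, screen_h, gaze_xy, radius_px=200):
    """Split the display area into a first area centered on the eye gaze position
    and the remaining second area. Shape and radius are illustrative assumptions."""
    ys, xs = np.mgrid[0:screen_h, 0:screen_w]
    dist2 = (xs - gaze_xy[0]) ** 2 + (ys - gaze_xy[1]) ** 2
    first_area = dist2 <= radius_px ** 2          # rendered with the first (higher) precision
    second_area = ~first_area                     # rendered with the second (lower) precision
    return first_area, second_area

def compose_frame(first_scene_picture, second_scene_picture, first_area):
    """Combine the two renderings into the scene picture shown at the end time."""
    frame = second_scene_picture.copy()
    frame[first_area] = first_scene_picture[first_area]
    return frame
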
4. The method of claim 1, wherein the extracting the head movement features of the first user from the head movement information comprises:
inputting the head movement information into a preset first feature extraction model to obtain the head movement features of the first user; wherein the first feature extraction model comprises a first convolutional neural network and a first temporal network connected in series, the first convolutional neural network receiving the head movement information and the first temporal network outputting the head movement features.
5. The method of claim 4, wherein the first convolutional neural network comprises three first network units connected in series, each first network unit comprising a convolutional layer, a batch normalization layer, an activation function layer, and a max pooling layer connected in sequence.
6. The method of claim 4, wherein the first temporal network is a bidirectional gated recurrent unit (GRU).
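
For illustration only and not part of the claims: a sketch of the first feature extraction model of claims 4 to 6 in PyTorch. Channel widths, kernel sizes, the activation function, and the hidden size are assumptions; the second feature extraction model described below could follow the same pattern with two-dimensional gaze input.

import torch
import torch.nn as nn

class HeadFeatureExtractor(nn.Module):
    """Three convolutional units (Conv -> BatchNorm -> activation -> MaxPool) in series,
    followed by a bidirectional GRU whose final hidden states serve as the head movement features."""
    def __init__(self, in_channels=1, hidden=64):
        super().__init__()
        def unit(c_in, c_out):
            return nn.Sequential(
                nn.Conv1d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm1d(c_out),
                nn.ReLU(),
                nn.MaxPool1d(2),
            )
        self.cnn = nn.Sequential(unit(in_channels, 16), unit(16, 32), unit(32, 64))
        self.gru = nn.GRU(64, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):                  # x: (batch, time) head movement speeds
        x = x.unsqueeze(1)                 # (batch, 1, time)
        x = self.cnn(x)                    # (batch, 64, time / 8)
        x = x.transpose(1, 2)              # (batch, time / 8, 64)
        _, h = self.gru(x)                 # h: (2, batch, hidden), one state per direction
        return torch.cat([h[0], h[1]], dim=-1)   # (batch, 2 * hidden) head movement features
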
7. The method of claim 1, wherein the extracting the eye movement features of the first user from the eye movement information comprises:
inputting the eye movement information into a preset second feature extraction model to obtain the eye movement features; wherein the second feature extraction model comprises a second convolutional neural network and a second temporal network connected in series, the second convolutional neural network receiving the eye movement information and the second temporal network outputting the eye movement features.
8. The method of claim 1, wherein the obtaining the first eye movement type of the first user at the end time of the first time window according to the head movement features and the eye movement features comprises:
inputting the head movement features and the eye movement features into a preset classification model to obtain the first eye movement type of the first user at the end time; wherein the classification model is configured to obtain an eye movement type classification result at the end time of the first time window according to motion features of the first time window, the motion features comprising the head movement features and the eye movement features;
and the classification model comprises two second network units and a Softmax layer connected in sequence, each second network unit comprising a fully connected layer, a batch normalization layer, an activation function layer, and a dropout layer connected in sequence.
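
For illustration only and not part of the claims: a sketch of the classification model of claim 8 in PyTorch. Layer widths, the dropout rate, the number of eye movement classes, and the use of the second unit's output width as the class count are assumptions.

import torch
import torch.nn as nn

class EyeMovementClassifier(nn.Module):
    """Two second network units (fully connected -> batch normalization -> activation -> dropout)
    connected in sequence, followed by a Softmax layer over the eye movement types."""
    def __init__(self, in_dim=256, hidden=128, num_classes=3, p_drop=0.5):
        super().__init__()
        def unit(d_in, d_out):
            return nn.Sequential(
                nn.Linear(d_in, d_out),
                nn.BatchNorm1d(d_out),
                nn.ReLU(),
                nn.Dropout(p_drop),
            )
        # the second unit's output width is assumed to equal the number of eye movement classes
        self.net = nn.Sequential(unit(in_dim, hidden), unit(hidden, num_classes))
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, head_feat, eye_feat):
        x = torch.cat([head_feat, eye_feat], dim=-1)   # concatenated motion features of the time window
        return self.softmax(self.net(x))               # eye movement type probabilities
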
9. The method of claim 1, wherein the extracting the head movement features of the first user from the head movement information and the extracting the eye movement features of the first user from the eye movement information comprise:
inputting the head movement information into a preset first feature extraction model to obtain the head movement features;
inputting the eye movement information into a preset second feature extraction model to obtain the eye movement features;
the obtaining the first eye movement type of the first user at the end time of the first time window according to the head movement features and the eye movement features comprises:
inputting the head movement features and the eye movement features into a preset classification model to obtain the first eye movement type of the first user at the end time of the first time window;
and the model parameters of the first feature extraction model, the second feature extraction model, and the classification model are obtained by synchronous (joint) training on the same training sample set; each training sample in the training sample set comprises sample data and a sample label, the sample data comprising head movement information and eye movement information of a second user within a second time window of using a second virtual reality device, and the sample label being the eye movement type of the second user at the end time of the second time window.
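
For illustration only and not part of the claims: a sketch of the synchronous training of claim 9, assuming PyTorch. The optimizer, loss, hyperparameters, and data-loader format are assumptions; the only point taken from the claim is that both feature extractors and the classifier are optimized together on the same sample set, with the eye movement type at the window's end time as the label.

import torch
import torch.nn as nn

def train_jointly(head_model, eye_model, classifier, loader, epochs=10, lr=1e-3):
    """Jointly optimize the two feature extraction models and the classification model."""
    params = (list(head_model.parameters()) + list(eye_model.parameters())
              + list(classifier.parameters()))
    optimizer = torch.optim.Adam(params, lr=lr)
    criterion = nn.NLLLoss()   # the classifier sketch above already outputs probabilities
    for _ in range(epochs):
        for head_info, eye_info, label in loader:   # one (second user, second time window) sample per row
            optimizer.zero_grad()
            probs = classifier(head_model(head_info), eye_model(eye_info))
            loss = criterion(torch.log(probs + 1e-8), label)   # label: eye movement type at the end time
            loss.backward()
            optimizer.step()
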
10. A virtual reality device comprising a memory for storing a computer program and a processor for performing the display processing method according to any one of claims 1 to 9 under the control of the computer program.
CN202210869352.5A 2022-07-22 2022-07-22 Virtual reality scene display processing method and virtual reality device Active CN115509345B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210869352.5A CN115509345B (en) 2022-07-22 2022-07-22 Virtual reality scene display processing method and virtual reality device


Publications (2)

Publication Number Publication Date
CN115509345A true CN115509345A (en) 2022-12-23
CN115509345B CN115509345B (en) 2023-08-18

Family

ID=84501214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210869352.5A Active CN115509345B (en) 2022-07-22 2022-07-22 Virtual reality scene display processing method and virtual reality device

Country Status (1)

Country Link
CN (1) CN115509345B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190005735A1 (en) * 2017-06-30 2019-01-03 Tobii Ab Systems and methods for displaying images in a virtual world environment
US20200202561A1 (en) * 2018-12-24 2020-06-25 Samsung Electronics Co., Ltd. Method and apparatus with gaze estimation
CN111427150A (en) * 2020-03-12 2020-07-17 华南理工大学 Eye movement signal processing method used under virtual reality head-mounted display and wearable device
CN111949131A (en) * 2020-08-17 2020-11-17 陈涛 Eye movement interaction method, system and equipment based on eye movement tracking technology
TW202217661A (en) * 2020-10-21 2022-05-01 美商高通公司 Power control based at least in part on user eye movement
CN113269044A (en) * 2021-04-27 2021-08-17 青岛小鸟看看科技有限公司 Display control method and device of head-mounted display equipment and head-mounted display equipment
CN113419623A (en) * 2021-05-27 2021-09-21 中国人民解放军军事科学院国防科技创新研究院 Non-calibration eye movement interaction method and device
CN113467619A (en) * 2021-07-21 2021-10-01 腾讯科技(深圳)有限公司 Picture display method, picture display device, storage medium and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116385757A (en) * 2022-12-30 2023-07-04 天津大学 Visual language navigation system and method based on VR equipment
CN116385757B (en) * 2022-12-30 2023-10-31 天津大学 Visual language navigation system and method based on VR equipment

Also Published As

Publication number Publication date
CN115509345B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
Gao et al. Human action monitoring for healthcare based on deep learning
US20210295099A1 (en) Model training method and apparatus, storage medium, and device
Betancourt et al. The evolution of first person vision methods: A survey
US20220382386A1 (en) Gesture recognition method and device, gesture control method and device and virtual reality apparatus
CN113822977A (en) Image rendering method, device, equipment and storage medium
TW202138993A (en) Method and apparatus for driving interactive object, device and storage medium
KR20170048137A (en) Method for transmitting media contents, apparatus for transmitting media contents, method for receiving media contents, apparatus for receiving media contents
CN107924452A (en) Combined shaped for face's alignment in image returns
CN114972958B (en) Key point detection method, neural network training method, device and equipment
WO2023071801A1 (en) Animation generation method and apparatus, computer device, storage medium, computer program, and computer program product
CN109271929B (en) Detection method and device
WO2023151525A1 (en) Method and apparatus for generating special-effect video, and electronic device and storage medium
CN115509345B (en) Virtual reality scene display processing method and virtual reality device
CN116152416A (en) Picture rendering method and device based on augmented reality and storage medium
Zhou et al. Image2GIF: Generating cinemagraphs using recurrent deep q-networks
CN115079832B (en) Virtual reality scene display processing method and virtual reality equipment
CN112562045B (en) Method, apparatus, device and storage medium for generating model and generating 3D animation
CN115151944A (en) Full skeletal 3D pose recovery from monocular camera
CN110084306B (en) Method and apparatus for generating dynamic image
CN115661375B (en) Three-dimensional hair style generation method and device, electronic equipment and storage medium
CN110069126B (en) Virtual object control method and device
CN111258413A (en) Control method and device of virtual object
CN110942033B (en) Method, device, electronic equipment and computer medium for pushing information
CN115048954A (en) Retina-imitating target detection method and device, storage medium and terminal
CN115061576A (en) Method for predicting fixation position of virtual reality scene and virtual reality equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Mi Jie

Inventor after: Li Fanghui

Inventor after: Zhang Tongyuan

Inventor before: Zhang Xingqian

Inventor before: Zhang Tongyuan

Inventor before: Li Fanghui