CN116597288A - Gaze point rendering method, gaze point rendering system, computer and readable storage medium - Google Patents

Gaze point rendering method, gaze point rendering system, computer and readable storage medium

Info

Publication number
CN116597288A
CN116597288A CN202310879693.5A
Authority
CN
China
Prior art keywords
gaze point
rendering
data
gaze
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310879693.5A
Other languages
Chinese (zh)
Other versions
CN116597288B (en)
Inventor
王晓敏
张琨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Geruling Technology Co ltd
Original Assignee
Jiangxi Geruling Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Geruling Technology Co ltd filed Critical Jiangxi Geruling Technology Co ltd
Priority to CN202310879693.5A priority Critical patent/CN116597288B/en
Publication of CN116597288A publication Critical patent/CN116597288A/en
Application granted granted Critical
Publication of CN116597288B publication Critical patent/CN116597288B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a gaze point rendering method, a gaze point rendering system, a computer and a readable storage medium, wherein the method comprises the following steps: calibrating an eye tracking device by a user, and collecting, in real time through the calibrated eye tracking device, gaze point data generated by the eyeballs of the user while observing a scene image; associating the gaze point data with the scene image to generate a corresponding target data set, and training a corresponding gaze point prediction model according to the target data set and a preset neural network; and outputting a gaze point distribution map corresponding to the gaze point data through the gaze point prediction model, and fusing the gaze point distribution map with the scene image so as to complete the image rendering of each gaze point. The method and the device can greatly improve the rendering fidelity of each gaze point, improve the rendering effect of the scene image, and correspondingly improve the use experience of the user.

Description

Gaze point rendering method, gaze point rendering system, computer and readable storage medium
Technical Field
The present application relates to the field of virtual reality technology, and in particular, to a gaze point rendering method, a gaze point rendering system, a computer, and a readable storage medium.
Background
With the progress of science and technology and the rapid development of productivity, virtual reality technology has also developed rapidly. The basic implementation of existing virtual reality technology is mainly based on computer technology; it utilizes and integrates the latest achievements of various high technologies such as three-dimensional graphics, multimedia, simulation, display and servo technologies, so that people can experience a sense of immersion.
Gaze point rendering is an important technology in virtual reality. It is used to simulate in real time the visual effect produced when the eyes gaze at a scene, and correspondingly improves the effect of picture rendering.
Most of the prior art achieves the gaze point rendering effect by uniformly sampling all pixels in the scene. However, this rendering mode cannot accurately simulate how the gaze point of the human eyes changes while observing the scene, so the rendering fidelity is low and the use experience of the user is correspondingly reduced.
Disclosure of Invention
Based on this, the present application aims to provide a gaze point rendering method, a gaze point rendering system, a computer and a readable storage medium, so as to solve the problem in the prior art that rendering fidelity is low because the rendering mode cannot accurately simulate the change of the gaze point of the human eyes while observing a picture.
An embodiment of the present application provides a gaze point rendering method, where the method includes:
calibrating an eye tracking device by a user, and collecting gaze point data generated by eyeballs of the user when observing a scene image in real time by the calibrated eye tracking device;
correlating the gaze point data with the scene image to generate a corresponding target data set, and training a corresponding gaze point prediction model according to the target data set and a preset neural network;
and outputting a gaze point distribution map corresponding to the gaze point data through the gaze point prediction model, and carrying out fusion processing on the gaze point distribution map and the scene image so as to complete image rendering of each gaze point.
The beneficial effects of the application are as follows: the gaze point data generated by the eyeballs of the user while observing the scene image are collected in real time and correspondingly associated with the current scene image to generate a corresponding target data set; a corresponding gaze point prediction model is then trained on the target data set; finally, the positions of the gaze points in the scene image are predicted by the gaze point prediction model, so that rendering of the image around each gaze point can be completed. In this way, the positions of the gaze points generated by the eyes of the user while observing the picture can be accurately found and each gaze point can be rendered accordingly, which greatly improves the rendering fidelity of each gaze point, improves the rendering effect of the scene image, and correspondingly improves the use experience of the user.
Preferably, the step of associating the gaze point data with the scene image comprises:
when the gaze point data and the scene images are acquired, the gaze point coordinates marked in each scene image are acquired in real time based on the gaze point data, and a plurality of gaze points generated in real time by the eyeballs of the user in the scene images are found out according to the gaze point coordinates;
acquiring first timestamps respectively corresponding to each scene image, and acquiring second timestamps respectively corresponding to each gaze point;
comparing the first timestamps with the second timestamps one by one, and associating gaze points with the scene images having the same timestamps, each scene image containing one or more of the gaze points.
Preferably, the training the corresponding gaze point prediction model according to the target data set and the preset neural network includes:
performing interpolation processing on the target data set to generate corresponding continuous and consistent time series data, and performing image enhancement processing on the time series data;
and dividing a corresponding training set, a verification set and a test set according to the time series data after the image enhancement processing, and training an R-CNN neural network through the training set, the verification set and the test set to generate the gaze point prediction model.
Preferably, the step of training the R-CNN neural network by the training set, the verification set and the test set includes:
inputting the training set into a data processing layer of the R-CNN neural network to conduct predictive training on the data processing layer and generate a first predictive result;
inputting the verification set into the data processing layer to output a corresponding second prediction result, and simultaneously inputting the first prediction result and the second prediction result into a data optimization layer of the R-CNN neural network to enable the data optimization layer to output a corresponding optimization scheme;
inputting the optimization scheme into the data processing layer to perform optimization processing on initial parameters in the data processing layer, and inputting the test set into the optimized data processing layer to output a corresponding test result;
judging whether the training of the R-CNN neural network is completed or not according to the test result.
Preferably, the step of determining whether training of the R-CNN neural network is completed according to the test result includes:
calculating the prediction accuracy corresponding to the test result, and judging whether the prediction accuracy is higher than a preset threshold in real time;
and if the prediction accuracy rate is judged to be higher than the preset threshold value in real time, judging that the test result is qualified, and finishing training of the R-CNN neural network.
Preferably, the step of fusing the gaze point distribution map with the scene image to complete image rendering of each gaze point includes:
when the gaze point distribution map is acquired, finding out target positions corresponding to the gaze points in the scene image according to the gaze point distribution map, and detecting all pixel points contained in the scene image;
and adding corresponding rendering weights to pixel points around each gaze point through a gaze weight algorithm based on a preset rule, and completing image rendering of each gaze point based on the rendering weights.
Preferably, the preset rule is:
the size of the rendering weight is inversely proportional to the size of the distance between the pixel point and the nearest gaze point.
A second aspect of an embodiment of the present application proposes a gaze point rendering system, wherein the system includes:
the acquisition module is used for calibrating the eye movement tracking equipment through a user and acquiring fixation point data generated when the eyeball of the user observes a scene image in real time through the calibrated eye movement tracking equipment;
the training module is used for associating the gaze point data with the scene image to generate a corresponding target data set, and training a corresponding gaze point prediction model according to the target data set and a preset neural network;
and the rendering module is used for outputting a gaze point distribution diagram corresponding to the gaze point data through the gaze point prediction model, and carrying out fusion processing on the gaze point distribution diagram and the scene image so as to complete image rendering of each gaze point.
In the gaze point rendering system, the training module is specifically configured to:
when the gaze point data and the scene images are acquired, the gaze point coordinates marked in each scene image are acquired in real time based on the gaze point data, and a plurality of gaze points generated in real time by the eyeballs of the user in the scene images are found out according to the gaze point coordinates;
acquiring first timestamps respectively corresponding to each scene image, and acquiring second timestamps respectively corresponding to each gaze point;
comparing the first timestamps with the second timestamps one by one, and associating gaze points with the scene images having the same timestamps, each scene image containing one or more of the gaze points.
In the gaze point rendering system, the training module is further specifically configured to:
performing interpolation processing on the target data set to generate corresponding continuous and consistent time series data, and performing image enhancement processing on the time series data;
and dividing a corresponding training set, a verification set and a test set according to the time series data after the image enhancement processing, and training an R-CNN neural network through the training set, the verification set and the test set to generate the gaze point prediction model.
In the gaze point rendering system, the training module is further specifically configured to:
inputting the training set into a data processing layer of the R-CNN neural network to conduct predictive training on the data processing layer and generate a first predictive result;
inputting the verification set into the data processing layer to output a corresponding second prediction result, and simultaneously inputting the first prediction result and the second prediction result into a data optimization layer of the R-CNN neural network to enable the data optimization layer to output a corresponding optimization scheme;
inputting the optimization scheme into the data processing layer to perform optimization processing on initial parameters in the data processing layer, and inputting the test set into the optimized data processing layer to output a corresponding test result;
judging whether the training of the R-CNN neural network is completed or not according to the test result.
In the gaze point rendering system, the training module is further specifically configured to:
calculating the prediction accuracy corresponding to the test result, and judging whether the prediction accuracy is higher than a preset threshold in real time;
and if the prediction accuracy rate is judged to be higher than the preset threshold value in real time, judging that the test result is qualified, and finishing training of the R-CNN neural network.
In the gaze point rendering system, the rendering module is specifically configured to:
when the gaze point distribution map is acquired, finding out target positions corresponding to the gaze points in the scene image according to the gaze point distribution map, and detecting all pixel points contained in the scene image;
and adding corresponding rendering weights to pixel points around each gaze point through a gaze weight algorithm based on a preset rule, and completing image rendering of each gaze point based on the rendering weights.
In the gaze point rendering system, the preset rule is as follows:
the size of the rendering weight is inversely proportional to the size of the distance between the pixel point and the nearest gaze point.
A third aspect of an embodiment of the present application proposes a computer comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the gaze point rendering method as described above when executing the computer program.
A fourth aspect of the embodiments of the present application proposes a readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements a gaze point rendering method as described above.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
Fig. 1 is a flowchart of a gaze point rendering method according to a first embodiment of the present application;
fig. 2 is a block diagram of a gaze point rendering system according to a third embodiment of the present application.
The application will be further described in the following detailed description in conjunction with the above-described figures.
Detailed Description
In order that the application may be readily understood, a more complete description of the application will be rendered by reference to the appended drawings. Several embodiments of the application are presented in the figures. This application may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
It will be understood that when an element is referred to as being "mounted" on another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like are used herein for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, a gaze point rendering method according to a first embodiment of the present application is shown, where the gaze point rendering method according to the present embodiment can greatly improve the rendering fidelity of each gaze point, and simultaneously improve the rendering effect of a scene image, and correspondingly improve the use experience of a user.
Specifically, the gaze point rendering method provided by the embodiment specifically includes the following steps:
step S10, calibrating an eye tracking device by a user, and acquiring gaze point data generated by eyeballs of the user when observing a scene image in real time by the calibrated eye tracking device;
specifically, in this embodiment, it should be first described that the gaze point rendering method provided in this embodiment is specifically applied to the field of virtual reality technology, and is used to track each gaze point generated when an eyeball of a user observes a scene image in real time, and perform image rendering based on each gaze point, so as to correspondingly improve a display effect of the scene image, and correspondingly improve a use experience of the user.
It should be noted that, the scene image provided in this embodiment refers to an image or video that is watched by the user, so when the scene image is a picture, the current scene image is static, and correspondingly, when the scene image is a video or an application program, the current scene image is dynamic.
Based on this, in this step, it should be noted that, in this step, the gaze point rendering method is first started by the existing eye tracking device, and specifically, the eye tracking device can detect the dynamic change process of the eyeball of the user in real time, that is, can track the eyeball of the user in real time. Furthermore, before the formal rendering starts, the current eye tracking device needs to be calibrated first, specifically, the current eye tracking device needs to be calibrated by the current user, that is, when each new user uses the current eye tracking device, the corresponding calibration needs to be performed. Furthermore, the eye tracking device after calibration can acquire the gaze point data generated by the eyeballs of the user when observing the scene image in real time.
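By way of illustration only, the sketch below shows one common form such a per-user calibration could take: the user fixates several known on-screen targets, and a least-squares affine mapping from raw tracker output to screen coordinates is fitted. The function names, the affine model and the use of NumPy are assumptions of this sketch, not details specified by the present application.

    import numpy as np

    def fit_calibration(raw_points, screen_points):
        """Fit an affine map from raw tracker coordinates to screen coordinates by
        least squares, using samples recorded while the user fixates known targets."""
        raw = np.asarray(raw_points, dtype=float)        # shape (N, 2)
        screen = np.asarray(screen_points, dtype=float)  # shape (N, 2)
        design = np.hstack([raw, np.ones((raw.shape[0], 1))])     # rows [x, y, 1]
        coeffs, *_ = np.linalg.lstsq(design, screen, rcond=None)  # (3, 2) affine matrix
        return coeffs

    def apply_calibration(coeffs, raw_xy):
        """Map one raw tracker sample to calibrated screen coordinates."""
        x, y = raw_xy
        return np.array([x, y, 1.0]) @ coeffs

With at least three non-collinear calibration targets this yields a usable per-user mapping; practical eye trackers typically use more targets and a richer mapping model.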
Step S20, associating the gaze point data with the scene image to generate a corresponding target data set, and training a corresponding gaze point prediction model according to the target data set and a preset neural network;
further, in this step, it should be noted that, because the eyeballs of the user dynamically change when observing the scene image, a plurality of gaze points are generated in the process of observation, that is, the gaze point data is generated, and based on this, the current gaze point data and the scene image that is currently played in real time need to be associated together in real time in this step, so as to correspondingly generate a target data set that can facilitate subsequent processing.
On the basis, a required gaze point prediction model can be trained further according to the current target data set and a preset neural network, and particularly, the gaze point prediction model can predict the distribution position of each gaze point in the current scene image so as to facilitate subsequent processing.
And step S30, outputting a gaze point distribution diagram corresponding to the gaze point data through the gaze point prediction model, and carrying out fusion processing on the gaze point distribution diagram and the scene image so as to complete image rendering of each gaze point.
Further, in this step, after the gaze point prediction model is constructed in real time, the present step further outputs a gaze point distribution map corresponding to the current gaze point data through the current gaze point prediction model, and specifically, the distribution position of each gaze point can be obtained through the distribution map.
Based on the method, the current gaze point distribution map and the current scene image are fused, so that each gaze point can be correspondingly displayed in the current scene image, and on the basis, corresponding image rendering is carried out in the current scene image by taking each gaze point as the center, so that the area of image rendering can be correspondingly reduced, the image rendering efficiency is correspondingly improved, and meanwhile, the use experience of a user is improved.
When the method is used, the gaze point data generated by the eyeballs of the user while observing the scene image are collected in real time and correspondingly associated with the current scene image to generate a corresponding target data set. Based on this, a corresponding gaze point prediction model is trained on the target data set, and finally the positions of the gaze points in the scene image are predicted by the gaze point prediction model, so that rendering of the image around each gaze point can be completed. In this way, the positions of the gaze points generated by the eyes of the user while observing the picture can be accurately found and each gaze point can be rendered accordingly, which greatly improves the rendering fidelity of each gaze point, improves the rendering effect of the scene image, and correspondingly improves the use experience of the user.
It should be noted that the foregoing implementation procedure merely illustrates one feasible way of carrying out the present application; it does not mean that this is the only implementation of the gaze point rendering method of the present application, and any implementation that can realize the method falls within its feasible implementations.
In summary, the gaze point rendering method provided by the embodiment of the application can greatly improve the rendering fidelity of each gaze point, and simultaneously improve the rendering effect of the scene image, and correspondingly improve the use experience of the user.
The second embodiment of the present application also provides a gaze point rendering method, which is different from the gaze point rendering method provided in the first embodiment in that:
specifically, in this embodiment, it should be noted that the step of associating the gaze point data with the scene image includes:
when the gaze point data and the scene images are acquired, the gaze point coordinates marked in each scene image are acquired in real time based on the gaze point data, and a plurality of gaze points generated in real time by the eyeballs of the user in the scene images are found out according to the gaze point coordinates;
acquiring first timestamps respectively corresponding to each scene image, and acquiring second timestamps respectively corresponding to each gaze point;
comparing the first timestamps with the second timestamps one by one, and associating gaze points with the scene images having the same timestamps, each scene image containing one or more of the gaze points.
Specifically, in this embodiment it should be noted that the gaze point data provided in this embodiment include a plurality of gaze points acquired in real time. When each gaze point is acquired, a corresponding timestamp is added to it, and this timestamp corresponds to the acquisition time. Similarly, when each scene image is played, this embodiment also adds a corresponding timestamp to it, and this timestamp corresponds to the playing time of the scene image. Based on this, after a plurality of gaze points are acquired from the gaze point data, the first timestamp corresponding to the current scene image and the second timestamp corresponding to the current gaze point are acquired at the same time.
On this basis, each first timestamp and each second timestamp are compared one by one; during this process, if a gaze point and a scene image with the same timestamp are detected, they are associated together in real time. In addition, it should be noted that each scene image may contain one or more gaze points, which therefore requires differentiated association.
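As a concrete illustration of this timestamp matching, the sketch below pairs each gaze sample with the frame whose timestamp is nearest to it. The data layout (lists of dictionaries carrying a 'timestamp' key, in seconds) and the matching tolerance are assumptions of this sketch rather than details specified by the present application.

    from collections import defaultdict

    def associate_gaze_with_frames(gaze_points, frames, tolerance=1.0 / 60):
        """Map frame index -> gaze points whose second timestamp lies within
        `tolerance` seconds of that frame's first timestamp."""
        associations = defaultdict(list)
        for gaze in gaze_points:                     # each sample carries a 'timestamp'
            best = min(range(len(frames)),
                       key=lambda i: abs(frames[i]["timestamp"] - gaze["timestamp"]))
            if abs(frames[best]["timestamp"] - gaze["timestamp"]) <= tolerance:
                associations[best].append(gaze)      # one frame may hold several gaze points
        return dict(associations)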
Specifically, in this embodiment, it should be further noted that the step of training the corresponding gaze point prediction model according to the target data set and the preset neural network includes:
performing interpolation processing on the target data set to generate corresponding continuous and consistent time series data, and performing image enhancement processing on the time series data;
and dividing a corresponding training set, a verification set and a test set according to the time series data after the image enhancement processing, and training an R-CNN neural network through the training set, the verification set and the test set to generate the gaze point prediction model.
In particular, in this embodiment it should also be noted that, in order to effectively train the required gaze point prediction model, this embodiment needs to perform secondary processing on the target data set. Because data may be lost or recorded at inconsistent intervals during the collection of gaze point data, the obtained gaze point data may be incomplete. Therefore, this embodiment first performs interpolation processing on the current target data set to generate corresponding continuous time series data with a consistent data format; likewise, the time series data include the gaze point data and the scene images. On this basis, this embodiment further performs image enhancement processing, such as brightness adjustment, contrast adjustment, image flipping, rotation and cropping, on the current time series data, so as to correspondingly improve the robustness of the constructed gaze point prediction model.
After the image enhancement processing of the time series data is completed, the required training set, verification set and test set can be divided on the basis of the current time series data, so as to further train the R-CNN (Region-based Convolutional Neural Network, a deep-learning object detection algorithm) provided in this embodiment and generate the gaze point prediction model.
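To make the interpolation and the data split concrete, the following sketch resamples gaze coordinates onto a uniform time grid and then partitions the resulting sequence chronologically. The 90 Hz grid, the use of linear interpolation and the 70/15/15 split ratio are illustrative assumptions, not values given in the present application.

    import numpy as np

    def resample_gaze(timestamps, xs, ys, rate_hz=90.0):
        """Linearly interpolate gaze (x, y) samples onto a uniform time grid,
        filling gaps left by lost samples or uneven recording intervals."""
        grid = np.arange(timestamps[0], timestamps[-1], 1.0 / rate_hz)
        return grid, np.interp(grid, timestamps, xs), np.interp(grid, timestamps, ys)

    def split_time_series(samples, train_frac=0.7, val_frac=0.15):
        """Split chronologically ordered samples into training, verification and test sets."""
        n = len(samples)
        n_train, n_val = int(n * train_frac), int(n * val_frac)
        return (samples[:n_train],
                samples[n_train:n_train + n_val],
                samples[n_train + n_val:])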
In addition, in this embodiment, it should be noted that the step of training the R-CNN neural network through the training set, the verification set and the test set includes:
inputting the training set into a data processing layer of the R-CNN neural network to conduct predictive training on the data processing layer and generate a first predictive result;
inputting the verification set into the data processing layer to output a corresponding second prediction result, and simultaneously inputting the first prediction result and the second prediction result into a data optimization layer of the R-CNN neural network to enable the data optimization layer to output a corresponding optimization scheme;
inputting the optimization scheme into the data processing layer to perform optimization processing on initial parameters in the data processing layer, and inputting the test set into the optimized data processing layer to output a corresponding test result;
judging whether the training of the R-CNN neural network is completed or not according to the test result.
In addition, in this embodiment it should be noted that the R-CNN neural network provided in this embodiment includes a data input layer, a data processing layer, a data optimization layer and a data output layer. Specifically, this embodiment inputs the above training set into the data processing layer through the data input layer; at the start of training, initial parameters are set in the data processing layer. Based on this, the current initial parameters are trained through the above training set, so that the data output layer outputs a corresponding first prediction result. Further, the verification set is input into the current data processing layer through the data input layer, so that a second prediction result can be correspondingly output. On this basis, the current first prediction result and second prediction result are simultaneously input into the data optimization layer, so that the current data optimization layer can output a corresponding optimization scheme according to the deviation between the current first prediction result and the second prediction result. The optimization scheme is used to optimize the initial parameters.
Further, the current optimization scheme is input into the data processing layer, so that the initial parameters in the current data processing layer can be optimized. Based on this, the test set is input into the optimized data processing layer, so that a corresponding test result can be output, and whether the training of the current R-CNN neural network is completed can be judged in real time according to the test result.
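Purely as a hedged sketch, the loop below mirrors this train/validate/optimize/test flow in PyTorch, with a small convolutional regressor standing in for the R-CNN and mean-squared error on gaze coordinates standing in for its loss; here the learning-rate scheduler plays the role of the "data optimization layer". The architecture, loss, optimizer, learning rate and epoch count are assumptions for illustration, not details of the patented model.

    import torch
    import torch.nn as nn

    class GazeNet(nn.Module):
        """Toy convolutional regressor standing in for the R-CNN's data processing layer."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, 2)          # normalised gaze (x, y)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    def train_gaze_model(train_loader, val_loader, test_loader, epochs=20):
        model, criterion = GazeNet(), nn.MSELoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        # the scheduler plays the role of the "data optimization layer": it adjusts
        # the step size according to how the validation (second) result evolves
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
        for _ in range(epochs):
            model.train()
            for images, gaze in train_loader:              # first prediction result
                optimizer.zero_grad()
                loss = criterion(model(images), gaze)
                loss.backward()
                optimizer.step()
            model.eval()
            with torch.no_grad():                          # second prediction result
                val_loss = sum(criterion(model(x), y).item() for x, y in val_loader)
            scheduler.step(val_loss)                       # apply the "optimization scheme"
        with torch.no_grad():                              # test result for the acceptance check
            test_loss = sum(criterion(model(x), y).item() for x, y in test_loader)
        return model, test_loss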
In addition, in this embodiment, it should be further noted that the step of determining whether to complete training of the R-CNN neural network according to the test result includes:
calculating the prediction accuracy corresponding to the test result, and judging whether the prediction accuracy is higher than a preset threshold in real time;
and if the prediction accuracy rate is judged to be higher than the preset threshold value in real time, judging that the test result is qualified, and finishing training of the R-CNN neural network.
In addition, in this embodiment it should also be noted that, once the required prediction accuracy has been obtained through the above steps, this embodiment compares the current prediction accuracy with the preset threshold and judges in real time whether the current prediction accuracy is higher than the current preset threshold; preferably, the preset threshold provided in this embodiment is set to 95%.
Further, if the current prediction accuracy is judged in real time to be higher than the current preset threshold, the test result is judged to be qualified and the training of the R-CNN neural network is completed. Correspondingly, if the current prediction accuracy is judged in real time to be lower than the current preset threshold, the test result is judged to be unqualified, and the test process needs to be repeated until the output prediction accuracy is higher than the current preset threshold, at which point the training of the R-CNN neural network is completed.
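For illustration, one way such an acceptance check could be written is sketched below. The 95% threshold comes from the preceding paragraph, but the rule for counting an individual prediction as correct (here, the predicted gaze point lying within a fixed pixel radius of the true one) is an assumption of this sketch.

    import numpy as np

    def prediction_accuracy(predicted, actual, radius_px=30):
        """Fraction of predicted gaze points lying within `radius_px` of the ground truth."""
        dists = np.linalg.norm(np.asarray(predicted) - np.asarray(actual), axis=1)
        return float(np.mean(dists <= radius_px))

    def training_complete(predicted, actual, threshold=0.95):
        """Apply the acceptance threshold described above to a test result."""
        return prediction_accuracy(predicted, actual) >= threshold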
In this embodiment, it should be noted that, the step of fusing the gaze point distribution map and the scene image to complete image rendering of each gaze point includes:
when the gaze point distribution map is acquired, finding out target positions corresponding to the gaze points in the scene image according to the gaze point distribution map, and detecting all pixel points contained in the scene image;
and adding corresponding rendering weights to pixel points around each gaze point through a gaze weight algorithm based on a preset rule, and completing image rendering of each gaze point based on the rendering weights.
In this embodiment, it should be noted that, after the required gaze point distribution map is obtained through the above steps, the target positions corresponding to the current gaze points in the corresponding scene images can be quickly found according to the current gaze point distribution map, and at the same time, all the pixel points included in the current scene images are synchronously detected.
Furthermore, according to the preset rule, corresponding rendering weights are added, through a preset gaze weight algorithm, to the pixel points around each current gaze point, so that the image rendering of each gaze point can be completed based only on the rendering weights of the pixel points.
In this embodiment, it should be noted that, the preset rule is:
the size of the rendering weight is inversely proportional to the size of the distance between the pixel point and the nearest gaze point.
In this embodiment, it should be noted that, in the preset rule provided in this embodiment, the magnitude of the rendering weight is inversely proportional to the distance between each pixel point and the gaze point closest to it. That is, the greater the distance between a pixel point and its nearest gaze point, the smaller the rendering weight of that pixel point; conversely, the smaller the distance, the greater the rendering weight. In this way rendering resources can be effectively saved and rendering efficiency correspondingly improved.
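A minimal sketch of such a weight map is given below, assuming a simple inverse-distance falloff. The specific falloff function 1/(1+d), and the idea of using the weights to modulate per-pixel shading effort, are illustrative assumptions and not the gaze weight algorithm of the present application.

    import numpy as np

    def rendering_weights(height, width, gaze_points):
        """Return an (H, W) weight map; gaze_points is a list of (x, y) pixel coordinates."""
        ys, xs = np.mgrid[0:height, 0:width]
        # distance from every pixel to every gaze point, keeping only the nearest one
        nearest = np.stack(
            [np.hypot(xs - gx, ys - gy) for gx, gy in gaze_points], axis=0
        ).min(axis=0)
        return 1.0 / (1.0 + nearest)   # large weight near a gaze point, small far away

In practice the resulting map could drive, for example, variable-rate shading or per-region sampling density, with full-quality rendering reserved for high-weight pixels.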
It should be noted that, for the sake of brevity, the parts of the method according to the second embodiment of the present application that share the same principles and technical effects as the first embodiment are not repeated here; reference is made to the corresponding content provided in the first embodiment.
In summary, the gaze point rendering method provided by the embodiment of the application can greatly improve the rendering fidelity of each gaze point, and simultaneously improve the rendering effect of the scene image, and correspondingly improve the use experience of the user.
Referring to fig. 2, a gaze point rendering system according to a third embodiment of the present application is shown, wherein the system includes:
the acquisition module 12 is used for calibrating the eye movement tracking device by a user and acquiring fixation point data generated when the eyeball of the user observes a scene image in real time by the calibrated eye movement tracking device;
the training module 22 is configured to correlate the gaze point data with the scene image to generate a corresponding target data set, and train a corresponding gaze point prediction model according to the target data set and a preset neural network;
and the rendering module 32 is configured to output a gaze point distribution map corresponding to the gaze point data through the gaze point prediction model, and perform fusion processing on the gaze point distribution map and the scene image, so as to complete image rendering of each gaze point.
The training module 22 in the gaze point rendering system is specifically configured to:
when the gaze point data and the scene images are acquired, the gaze point coordinates marked in each scene image are acquired in real time based on the gaze point data, and a plurality of gaze points generated in real time by the eyeballs of the user in the scene images are found out according to the gaze point coordinates;
acquiring first timestamps respectively corresponding to each scene image, and acquiring second timestamps respectively corresponding to each gaze point;
comparing the first timestamps with the second timestamps one by one, and associating gaze points with the scene images having the same timestamps, each scene image containing one or more of the gaze points.
The training module 22 in the gaze point rendering system is specifically further configured to:
performing interpolation processing on the target data set to generate corresponding continuous and consistent time series data, and performing image enhancement processing on the time series data;
and dividing a corresponding training set, a verification set and a test set according to the time series data after the image enhancement processing, and training an R-CNN neural network through the training set, the verification set and the test set to generate the gaze point prediction model.
The training module 22 in the gaze point rendering system is specifically further configured to:
inputting the training set into a data processing layer of the R-CNN neural network to conduct predictive training on the data processing layer and generate a first predictive result;
inputting the verification set into the data processing layer to output a corresponding second prediction result, and simultaneously inputting the first prediction result and the second prediction result into a data optimization layer of the R-CNN neural network to enable the data optimization layer to output a corresponding optimization scheme;
inputting the optimization scheme into the data processing layer to perform optimization processing on initial parameters in the data processing layer, and inputting the test set into the optimized data processing layer to output a corresponding test result;
judging whether the training of the R-CNN neural network is completed or not according to the test result.
The training module 22 in the gaze point rendering system is specifically further configured to:
calculating the prediction accuracy corresponding to the test result, and judging whether the prediction accuracy is higher than a preset threshold in real time;
and if the prediction accuracy rate is judged to be higher than the preset threshold value in real time, judging that the test result is qualified, and finishing training of the R-CNN neural network.
The rendering module 32 in the gaze point rendering system is specifically configured to:
when the gaze point distribution map is acquired, finding out target positions corresponding to the gaze points in the scene image according to the gaze point distribution map, and detecting all pixel points contained in the scene image;
and adding corresponding rendering weights to pixel points around each gaze point through a gaze weight algorithm based on a preset rule, and completing image rendering of each gaze point based on the rendering weights.
The preset rule in the gaze point rendering system is as follows:
the size of the rendering weight is inversely proportional to the size of the distance between the pixel point and the nearest gaze point.
A fourth embodiment of the present application provides a computer, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the gaze point rendering method provided in the above embodiments when executing the computer program.
A fifth embodiment of the present application provides a readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the gaze point rendering method as provided by the above embodiments.
In summary, the gaze point rendering method, the gaze point rendering system, the gaze point rendering computer and the gaze point rendering medium provided by the embodiments of the present application can greatly improve the rendering fidelity of each gaze point, and simultaneously improve the rendering effect of the scene image, and correspondingly improve the use experience of the user.
The above-described respective modules may be functional modules or program modules, and may be implemented by software or hardware. For modules implemented in hardware, the various modules described above may be located in the same processor; or the above modules may be located in different processors in any combination.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing examples illustrate only a few embodiments of the application, which are described specifically and in detail, but they should not therefore be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, and these all fall within the scope of protection of the application. Accordingly, the scope of protection of the present application shall be determined by the appended claims.

Claims (10)

1. A gaze point rendering method, the method comprising:
calibrating an eye tracking device by a user, and collecting gaze point data generated by eyeballs of the user when observing a scene image in real time by the calibrated eye tracking device;
correlating the gaze point data with the scene image to generate a corresponding target data set, and training a corresponding gaze point prediction model according to the target data set and a preset neural network;
and outputting a gaze point distribution map corresponding to the gaze point data through the gaze point prediction model, and carrying out fusion processing on the gaze point distribution map and the scene image so as to complete image rendering of each gaze point.
2. A gaze point rendering method according to claim 1, characterized by: the step of associating the gaze point data with the scene image comprises:
when the gaze point data and the scene images are acquired, the gaze point coordinates marked in each scene image are acquired in real time based on the gaze point data, and a plurality of gaze points generated in real time by the eyeballs of the user in the scene images are found out according to the gaze point coordinates;
acquiring first timestamps respectively corresponding to each scene image, and acquiring second timestamps respectively corresponding to each gaze point;
comparing the first timestamps with the second timestamps one by one, and associating gaze points with the scene images having the same timestamps, each scene image containing one or more of the gaze points.
3. A gaze point rendering method according to claim 1, characterized by: the step of training a corresponding gaze point prediction model according to the target data set and a preset neural network comprises the following steps:
performing interpolation processing on the target data set to generate corresponding continuous and consistent time series data, and performing image enhancement processing on the time series data;
and dividing a corresponding training set, a verification set and a test set according to the time series data after the image enhancement processing, and training an R-CNN neural network through the training set, the verification set and the test set to generate the gaze point prediction model.
4. A gaze point rendering method according to claim 3, characterized in that: the step of training the R-CNN neural network by the training set, the validation set, and the test set includes:
inputting the training set into a data processing layer of the R-CNN neural network to conduct predictive training on the data processing layer and generate a first predictive result;
inputting the verification set into the data processing layer to output a corresponding second prediction result, and simultaneously inputting the first prediction result and the second prediction result into a data optimization layer of the R-CNN neural network to enable the data optimization layer to output a corresponding optimization scheme;
inputting the optimization scheme into the data processing layer to perform optimization processing on initial parameters in the data processing layer, and inputting the test set into the optimized data processing layer to output a corresponding test result;
judging whether the training of the R-CNN neural network is completed or not according to the test result.
5. The gaze point rendering method of claim 4, wherein: the step of judging whether the training of the R-CNN neural network is completed according to the test result comprises the following steps:
calculating the prediction accuracy corresponding to the test result, and judging whether the prediction accuracy is higher than a preset threshold in real time;
and if the prediction accuracy rate is judged to be higher than the preset threshold value in real time, judging that the test result is qualified, and finishing training of the R-CNN neural network.
6. A gaze point rendering method according to claim 1, characterized by: the step of fusing the gaze point distribution map and the scene image to complete image rendering of each gaze point comprises the following steps:
when the gaze point distribution map is acquired, finding out target positions corresponding to the gaze points in the scene image according to the gaze point distribution map, and detecting all pixel points contained in the scene image;
and adding corresponding rendering weights to pixel points around each gaze point through a gaze weight algorithm based on a preset rule, and completing image rendering of each gaze point based on the rendering weights.
7. The gaze point rendering method of claim 6, wherein: the preset rule is as follows:
the size of the rendering weight is inversely proportional to the size of the distance between the pixel point and the nearest gaze point.
8. A gaze point rendering system, the system comprising:
the acquisition module is used for calibrating the eye movement tracking equipment through a user and acquiring fixation point data generated when the eyeball of the user observes a scene image in real time through the calibrated eye movement tracking equipment;
the training module is used for associating the gaze point data with the scene image to generate a corresponding target data set, and training a corresponding gaze point prediction model according to the target data set and a preset neural network;
and the rendering module is used for outputting a gaze point distribution diagram corresponding to the gaze point data through the gaze point prediction model, and carrying out fusion processing on the gaze point distribution diagram and the scene image so as to complete image rendering of each gaze point.
9. A computer comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the gaze point rendering method of any of claims 1 to 7 when executing the computer program.
10. A readable storage medium having stored thereon a computer program, which when executed by a processor implements a gaze point rendering method as claimed in any one of claims 1 to 7.
CN202310879693.5A 2023-07-18 2023-07-18 Gaze point rendering method, gaze point rendering system, computer and readable storage medium Active CN116597288B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310879693.5A CN116597288B (en) 2023-07-18 2023-07-18 Gaze point rendering method, gaze point rendering system, computer and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310879693.5A CN116597288B (en) 2023-07-18 2023-07-18 Gaze point rendering method, gaze point rendering system, computer and readable storage medium

Publications (2)

Publication Number Publication Date
CN116597288A true CN116597288A (en) 2023-08-15
CN116597288B CN116597288B (en) 2023-09-12

Family

ID=87599566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310879693.5A Active CN116597288B (en) 2023-07-18 2023-07-18 Gaze point rendering method, gaze point rendering system, computer and readable storage medium

Country Status (1)

Country Link
CN (1) CN116597288B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378914A (en) * 2019-07-22 2019-10-25 北京七鑫易维信息技术有限公司 Rendering method and device, system, display equipment based on blinkpunkt information
CN112164016A (en) * 2020-09-23 2021-01-01 京东方科技集团股份有限公司 Image rendering method and system, VR (virtual reality) equipment, device and readable storage medium
WO2021135607A1 (en) * 2019-12-31 2021-07-08 歌尔股份有限公司 Method and device for optimizing neural network-based target classification model
CN113992907A (en) * 2021-10-29 2022-01-28 南昌虚拟现实研究院股份有限公司 Eyeball parameter checking method, system, computer and readable storage medium
CN114816060A (en) * 2022-04-23 2022-07-29 中国人民解放军军事科学院国防科技创新研究院 User fixation point estimation and precision evaluation method based on visual tracking
CN115147819A (en) * 2022-07-07 2022-10-04 西安电子科技大学 Driver fixation point prediction method based on fixation point prediction model
CN115457188A (en) * 2022-09-19 2022-12-09 遥在(山东)数字科技有限公司 3D rendering display method and system based on fixation point

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378914A (en) * 2019-07-22 2019-10-25 北京七鑫易维信息技术有限公司 Rendering method and device, system, display equipment based on blinkpunkt information
WO2021135607A1 (en) * 2019-12-31 2021-07-08 歌尔股份有限公司 Method and device for optimizing neural network-based target classification model
CN112164016A (en) * 2020-09-23 2021-01-01 京东方科技集团股份有限公司 Image rendering method and system, VR (virtual reality) equipment, device and readable storage medium
CN113992907A (en) * 2021-10-29 2022-01-28 南昌虚拟现实研究院股份有限公司 Eyeball parameter checking method, system, computer and readable storage medium
CN114816060A (en) * 2022-04-23 2022-07-29 中国人民解放军军事科学院国防科技创新研究院 User fixation point estimation and precision evaluation method based on visual tracking
CN115147819A (en) * 2022-07-07 2022-10-04 西安电子科技大学 Driver fixation point prediction method based on fixation point prediction model
CN115457188A (en) * 2022-09-19 2022-12-09 遥在(山东)数字科技有限公司 3D rendering display method and system based on fixation point

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
戴帅凡; 田丰: "Overview of the Current Status of Virtual Reality Film and Television Rendering Optimization Technology", Modern Film Technology (现代电影技术), no. 04, pages 28-31 *

Also Published As

Publication number Publication date
CN116597288B (en) 2023-09-12

Similar Documents

Publication Publication Date Title
CN108197618B (en) Method and device for generating human face detection model
CN108122234A (en) Convolutional neural networks training and method for processing video frequency, device and electronic equipment
CN106228293A (en) teaching evaluation method and system
US20190171280A1 (en) Apparatus and method of generating machine learning-based cyber sickness prediction model for virtual reality content
Conati et al. Further Results on Predicting Cognitive Abilities for Adaptive Visualizations.
KR20170104846A (en) Method and apparatus for analyzing virtual reality content
CN116843196A (en) Intelligent training method and system applied to military training
CN112101320A (en) Model training method, image generation method, device, equipment and storage medium
CN116597288B (en) Gaze point rendering method, gaze point rendering system, computer and readable storage medium
KR20170105905A (en) Method and apparatus for analyzing virtual reality content
CN109102486A (en) Detection method of surface flaw and device based on machine learning
CN116994084A (en) Regional intrusion detection model training method and regional intrusion detection method
CN116977209A (en) Image noise reduction method, electronic device and storage medium
Gomes et al. A QoE evaluation of an immersive virtual reality autonomous driving experience
KR101717399B1 (en) The automatic system and method for measuring subjective quality score, recording medium thereof
Zováthi et al. ST-DepthNet: A spatio-temporal deep network for depth completion using a single non-repetitive circular scanning Lidar
KR102256004B1 (en) Prediction apparatus of protein solvation free energy using deep learning and method thereof
CN109166182B (en) AR simulation processing method and device, electronic equipment and readable storage medium
CN112633065A (en) Face detection method, system, storage medium and terminal based on data enhancement
CN111275642A (en) Low-illumination image enhancement method based on significant foreground content
CN117237358B (en) Stereoscopic image quality evaluation method based on metric learning
KR102669002B1 (en) Apparatus and Method for Predicting Motion Sickness and Dizziness in Virtual Reality Content
CN115761871B (en) Detection image generation method, device, equipment and medium based on eye movement detection
CN113283402B (en) Differential two-dimensional fixation point detection method and device
CN114748872B (en) Game rendering updating method based on information fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant