CN116708737A - Stereoscopic image playing device and stereoscopic image generating method thereof - Google Patents


Info

Publication number: CN116708737A
Application number: CN202210177426.9A
Authority: CN (China)
Prior art keywords: image, stereoscopic, depth, output, grid
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 和佑, 徐文正, 林士豪, 谭驰澔
Current Assignee: Acer Inc (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Acer Inc
Application filed by Acer Inc
Priority to CN202210177426.9A
Publication of CN116708737A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106: Processing image signals
    • H04N13/30: Image reproducers
    • H04N13/363: Image reproducers using image projection screens
    • H04N13/366: Image reproducers using viewer tracking
    • H04N13/383: Image reproducers using viewer tracking for tracking with gaze detection, i.e. detecting the lines of sight of the viewer's eyes

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A stereoscopic image playing device includes a processor and a graphics processing unit (GPU). The GPU creates a stereoscopic grid and its texture to obtain a stereoscopic scene, and captures a planar image of the stereoscopic scene. When the planar image is not a side-by-side image, the processor performs image preprocessing on the planar image to obtain a first image. The graphics processing pipeline of the GPU performs depth estimation on the first image to obtain a depth image, updates the stereoscopic grid according to the depth setting of the depth image, maps the stereoscopic grid to a coordinate system according to an eye-tracking result, projects the first image onto the mapped stereoscopic grid to obtain an output stereoscopic grid, and captures an output side-by-side image from the output stereoscopic grid. The graphics processing pipeline weaves the left-eye image and the right-eye image of the output side-by-side image into an output image, and the output image is played on the stereoscopic image display device. The disclosure also relates to a stereoscopic image generation method.

Description

Stereoscopic image playing device and stereoscopic image generating method thereof
Technical Field
The present disclosure relates to stereoscopic image display technologies, and in particular, to a stereoscopic image playing device and a stereoscopic image generating method thereof.
Background
With the advancement of technology, virtual reality (VR) devices are becoming more popular. Because rendering a virtual reality scene requires a great amount of computation, when the user's position in the scene or the user's line of sight changes, a conventional stereoscopic image generating device often can only use its processor (i.e., software) to recompute the updated virtual reality scene and derive the left-eye image and the right-eye image that the user views in the updated scene. This approach places a heavy burden on the processor of the stereoscopic image playing device.
Therefore, a stereoscopic image playing device and a stereoscopic image generating method thereof are needed to solve the above-mentioned problems.
Disclosure of Invention
The present disclosure provides a stereoscopic image playing device, comprising: a processor; and a graphics processing unit, used for creating a stereoscopic grid and its texture to obtain a stereoscopic scene, and for capturing a planar image of the stereoscopic scene. In response to the planar image not being a side-by-side image, the processor performs image preprocessing on the planar image to obtain a first image. A graphics processing pipeline of the graphics processing unit performs depth estimation on the first image to obtain a depth image, updates the stereoscopic grid according to the depth setting of the depth image, and maps the stereoscopic grid to a corresponding coordinate system according to an eye-tracking result of a user of the stereoscopic image playing device. The graphics processing pipeline projects the first image onto the mapped stereoscopic grid to obtain an output stereoscopic grid, and captures an output side-by-side image from the output stereoscopic grid, wherein the output side-by-side image comprises a left-eye image and a right-eye image. The graphics processing pipeline weaves the left-eye image and the right-eye image into an output image, and plays the output image on a stereoscopic image display device.
In some embodiments, in response to the planar image being the side-by-side image, the graphics processing pipeline directly weaves the left-eye image and the right-eye image in the side-by-side image into an output image, and plays the output image on a stereoscopic image display device.
In some embodiments, the vertex shader of the graphics processing pipeline projects the first image to the mapped stereoscopic mesh to obtain the output stereoscopic mesh.
In some embodiments, the image preprocessing adjusts the size and format of the planar image to meet the requirements of an artificial intelligence model in the graphics processing pipeline, and the artificial intelligence model performs depth estimation on the first image to obtain the depth image.
In some embodiments, the processor subtracts the minimum depth from the maximum depth in the depth image to obtain the depth setting.
In some embodiments, the graphics processing pipeline updates the stereoscopic mesh by using the depth image as a material of the stereoscopic mesh.
The present disclosure further provides a stereoscopic image generating method for a stereoscopic image playing device, wherein the stereoscopic image playing device comprises a processor and a graphics processing unit. The method comprises the following steps: creating a stereoscopic grid and its texture by using the graphics processing unit to obtain a stereoscopic scene, and capturing a planar image of the stereoscopic scene; in response to the planar image not being a side-by-side image, performing image preprocessing on the planar image by using the processor to obtain a first image; performing depth estimation on the first image by using a graphics processing pipeline of the graphics processing unit to obtain a depth image, updating the stereoscopic grid according to the depth setting of the depth image, and mapping the stereoscopic grid to a corresponding coordinate system according to an eye-tracking result of a user of the stereoscopic image playing device; projecting the first image onto the mapped stereoscopic grid by using the graphics processing pipeline to obtain an output stereoscopic grid, and capturing an output side-by-side image from the output stereoscopic grid, wherein the output side-by-side image comprises a left-eye image and a right-eye image; and weaving the left-eye image and the right-eye image into an output image by using the graphics processing pipeline, and playing the output image on a stereoscopic image display device.
In some embodiments, the method further comprises: and in response to the planar image being the side-by-side image, directly weaving the left eye image and the right eye image in the side-by-side image into an output image by using the graphics processing pipeline, and playing the output image in a stereoscopic image display device.
In some embodiments, the method further comprises: the vertex shader of the graphics processing pipeline is used to project the first image to the mapped stereoscopic grid to obtain the output stereoscopic grid.
In some embodiments, the image preprocessing adjusts the size and format of the planar image to meet the requirements of an artificial intelligence model in the graphics processing pipeline, and the artificial intelligence model performs depth estimation on the first image to obtain the depth image.
In some embodiments, the method further comprises: subtracting the minimum depth from the maximum depth in the depth image to obtain the depth setting.
In some embodiments, the stereoscopic image playing device further comprises a camera for capturing a facial image of the user, and the method further comprises: detecting, by the processor, the gaze direction of the user's eyes from the facial image as the eye-tracking result.
Drawings
Fig. 1 is a block diagram of a stereoscopic image playing device according to an embodiment of the disclosure.
Fig. 2 is a schematic diagram of a planar image, a stereoscopic grid, and a planar grid according to an embodiment of the disclosure.
FIG. 3 is a flow chart of a stereoscopic image generation method using a graphics processing pipeline according to an embodiment of the disclosure.
Fig. 4 is a flowchart of a stereoscopic image generation method according to an embodiment of the disclosure.
Reference numerals:
10: Stereoscopic image playing device
100: Host
110: Processor
111: System bus
120: Graphics processing unit
121: Graphics processing pipeline
130: Memory unit
140: Storage device
141: Operating system
142: Graphics driver
143: Stereoscopic image playing program
144: Side-by-side image detection module
145: Image preprocessing module
146: Side-by-side image generating module
147: Image weaving module
160: Camera
170: Transmission interface
180: Stereoscopic image display device
210, 220, 230: Regions
310: Initialization stage
312-328: Blocks
330: Side-by-side image generation stage
332-340: Blocks
S410-S480: Steps
Detailed Description
The following description sets forth a preferred implementation of the disclosure and is intended to describe the basic spirit of the disclosure, but not to limit it. The actual scope of the disclosure is defined by the claims that follow.
It should be appreciated that terms such as "comprising" and "including," as used in this specification, specify the presence of the stated features, values, method steps, operations, and/or components, but do not preclude the addition of further features, values, method steps, operations, or components.
In the claims, terms such as "first," "second," and "third" are used to modify claim elements. They do not indicate a priority, a precedence relationship, or the temporal order in which method steps are performed, but merely distinguish elements that share a similar name.
Fig. 1 is a block diagram of a stereoscopic image playing device according to an embodiment of the disclosure.
The stereoscopic image playing device 10 may be, for example, a personal computer, a server, a portable electronic device, or another electronic device with similar computing capability. As shown in fig. 1, the stereoscopic image playing device 10 includes a host 100 and a stereoscopic image display device 180. The host 100 is connected to the stereoscopic image display device 180; the host 100 can, for example, generate an image signal and transmit it to the stereoscopic image display device 180 for playing. The stereoscopic image display device 180 may play the image signal from the host 100 in a stereoscopic image playing mode or a planar image playing mode according to the format of the image signal.
The host 100 includes a processor 110, a graphics processing unit 120, a memory unit 130, a storage device 140, a camera 160, and a transmission interface 170, which are coupled to each other through the system bus 111 of the host 100.
The processor 110 is, for example, a central processing unit (CPU) or a general-purpose processor, but the disclosure is not limited thereto. The graphics processing unit 120 may be, for example, a GPU on a discrete graphics card or a GPU integrated into the processor 110, but the disclosure is not limited thereto.
The memory unit 130 is a random access memory, such as a dynamic random access memory (DRAM) or a static random access memory (SRAM), but the disclosure is not limited thereto. The memory unit 130 may also be referred to as a system memory; in addition to buffering data for the processor 110, it may be used as an image buffer.
The storage device 140 is a non-volatile memory, such as a hard disk drive, a solid-state disk, a flash memory, or a read-only memory, but the disclosure is not limited thereto. For example, the storage device 140 may store an operating system 141 (e.g., Windows, Linux, macOS, etc.), a graphics driver 142, and a stereoscopic image playing program 143. The processor 110 may, for example, load the operating system 141, the graphics driver 142, and the stereoscopic image playing program 143 into the memory unit 130 and execute them. The graphics processing unit 120 may, for example, perform the graphics processing of the stereoscopic image playing program 143 executed by the processor 110 to generate an image signal including one or more images, and transmit the image signal to the stereoscopic image display device 180 through the transmission interface 170.
The transmission interface 170 may be a wired transmission interface and/or a wireless transmission interface. The wired transmission interface may include: a High-Definition Multimedia Interface (HDMI), a DisplayPort (DP) interface, an embedded DisplayPort (eDP) interface, a low-voltage differential signaling (LVDS) interface, a Universal Serial Bus (USB) interface, a USB Type-C interface, a Thunderbolt interface, a Digital Visual Interface (DVI), a Video Graphics Array (VGA) interface, a general-purpose input/output (GPIO) interface, a universal asynchronous receiver-transmitter (UART) interface, a serial peripheral interface (SPI), an inter-integrated circuit (I2C) interface, or a combination thereof. The wireless transmission interface may include: Bluetooth, Wi-Fi, a near-field communication (NFC) interface, etc., but the disclosure is not limited thereto.
The camera 160 may be disposed on the host 100 (or the stereoscopic image display device 180) for capturing, for example, a facial image of the user in front of the stereoscopic image playing device 10. In addition, the processor 110 may execute an eye-tracking program (not shown) that detects the gaze direction of the user's eyes from the facial image as the eye-tracking result, which is used in the side-by-side image generation stage of the stereoscopic image playing program 143; the details will be described later.
The stereoscopic image display device 180 may be, for example, a head-mounted display (HMD) or an autostereoscopic display device for playing virtual reality images or stereoscopic images. The stereoscopic image display device 180 may be implemented using different stereoscopic display technologies known in the art. For example, naked-eye stereoscopic display techniques include parallax barriers, lenticular lenses, directional backlights, and the like, which alternately or simultaneously display the left-eye and right-eye images of a stereoscopic image. A head-mounted display includes a left-eye display panel and a right-eye display panel that respectively play the left-eye image and the right-eye image of the stereoscopic image, which are imaged onto the user's left eye and right eye through the corresponding left-eye lens and right-eye lens to produce stereoscopic vision. Those skilled in the art are familiar with these display technologies, so they are not described in detail herein.
The stereoscopic image playing program 143 includes a side-by-side image detection module 144, an image preprocessing module 145, a side-by-side image generating module 146, and an image weaving module 147. The side-by-side image detection module 144 is configured to detect whether a received planar image is a side-by-side image, i.e., an image in which a left-eye image and a right-eye image are placed side by side. The image preprocessing module 145 is used to resize and/or format-convert the received planar image so that the processed planar image meets the requirements of the stereoscopic image playing program 143 and of the artificial intelligence (AI) model executed by the graphics processing unit 120 for depth estimation.
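As an illustration of what the side-by-side image detection module 144 might check, one simple heuristic is that the left and right halves of a side-by-side frame differ only by a small horizontal parallax, so their pixel-wise difference stays low. The patent does not specify the detection algorithm; the function below and its threshold are assumptions sketched for illustration only.

```python
def is_side_by_side(frame, threshold=10.0):
    """Heuristic side-by-side detector: the two halves of a side-by-side
    frame are nearly identical apart from a small horizontal parallax, so
    their mean absolute pixel difference stays low.  `frame` is a list of
    rows of grayscale pixel values; `threshold` is a tunable assumption,
    not a value taken from the patent."""
    half = len(frame[0]) // 2
    total, count = 0.0, 0
    for row in frame:
        for x in range(half):
            total += abs(row[x] - row[x + half])
            count += 1
    return total / count < threshold
```

A real implementation would work on full RGB frames and likely tolerate a few pixels of parallax shift, but the halves-comparison idea is the same.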
The side-by-side image generating module 146 uses the stereoscopic grid updated by the graphics processing unit 120 according to the depth setting, and performs coordinate mapping according to the eye-tracking result to generate side-by-side images; the details will be described later.
Fig. 2 is a schematic diagram of a planar image, a stereoscopic grid, and a planar grid according to an embodiment of the disclosure.
Fig. 2 includes regions 210, 220, and 230, wherein region 210 is a schematic diagram of a two-dimensional mesh, region 220 is a schematic diagram of a three-dimensional mesh, and region 230 is a schematic diagram of a three-dimensional scene. For example, the two-dimensional mesh in region 210 includes a plurality of triangles arranged over the X-Y plane, each triangle having a set of vertices. The graphics processing pipeline 121 in the graphics processing unit 120 generates the two-dimensional mesh corresponding to the three-dimensional scene, and assigns a height (i.e., a value in the Z-axis direction, representing the scene depth) to each vertex of each triangle in the two-dimensional mesh to obtain the three-dimensional mesh shown in region 220. Finally, the graphics processing pipeline 121 applies the corresponding texture to each triangle to obtain the three-dimensional scene shown in region 230.
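The grid-lifting step of Fig. 2 can be sketched as follows: lay a regular grid of vertices over the X-Y plane, lift each vertex along Z by the sampled scene depth, and split every grid cell into two triangles. This is a minimal illustrative sketch of the data structure, not the patented implementation.

```python
def build_stereoscopic_grid(depth):
    """Build a triangle mesh from a depth map: one vertex per depth
    sample, lifted along Z, and two triangles per grid cell.  Triangles
    are stored as index triples into the vertex list."""
    h, w = len(depth), len(depth[0])
    vertices = [(x, y, depth[y][x]) for y in range(h) for x in range(w)]
    triangles = []
    for y in range(h - 1):
        for x in range(w - 1):
            i = y * w + x  # top-left vertex index of this cell
            triangles.append((i, i + 1, i + w))          # upper triangle
            triangles.append((i + 1, i + w + 1, i + w))  # lower triangle
    return vertices, triangles
```

For an H-by-W depth map this produces H*W vertices and 2*(H-1)*(W-1) triangles, matching the triangle-pair cells shown in region 210.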
FIG. 3 is a flow chart of a stereoscopic image generation method using a graphics processing pipeline according to an embodiment of the disclosure. Please refer to fig. 1 and fig. 3 simultaneously.
The process 300 begins with an initialization stage 310, which includes blocks 312, 314, and 316. Block 312: create a stereoscopic grid. For example, the graphics processing pipeline 121 of the graphics processing unit 120 may first create the stereoscopic grid (3D mesh) corresponding to the three-dimensional scene during graphics processing, as shown in region 220 of fig. 2.
Block 314: create textures. For example, the stereoscopic grid includes a plurality of triangles, and each triangle has a corresponding texture. The graphics processing pipeline 121 creates the corresponding texture for each triangle, and during graphics processing a vertex shader of the graphics processing pipeline 121 applies the texture to each triangle to obtain the three-dimensional scene.
Block 316: start image capture. For example, although the graphics processing unit 120 has generated the three-dimensional scene, the host 100 must still generate a corresponding stereoscopic image for playing on the stereoscopic image display device 180. At this point, the processor 110 may capture an image of the three-dimensional scene to obtain a planar image of the three-dimensional scene.
Block 318: determine whether an image has arrived. For example, when the stereoscopic image is successfully generated at block 340 at the end of the flow of fig. 3, block 318 determines that the image has arrived. If the stereoscopic image was not successfully generated at block 340, block 318 determines that the image did not arrive, and block 320 is performed to stop image capture.
Block 322: detect side-by-side images. Block 324: determine whether the image is a side-by-side image. For example, the side-by-side image detection module 144 is configured to detect whether the planar image generated by the graphics processing unit 120 is a side-by-side image, i.e., a side-by-side arrangement of a left-eye image and a right-eye image. If a side-by-side image is detected at block 324, the flow proceeds to block 340. If the planar image is not a side-by-side image, the flow proceeds to block 326.
Block 326: image preprocessing. For example, the image preprocessing module 145 is used to resize and format-convert the planar image generated by the graphics processing unit 120 so that the size and/or format of the processed planar image meet the requirements of the stereoscopic image playing program 143 and of the artificial intelligence (AI) model executed by the graphics processing unit 120 for depth estimation. It should be noted that when the AI model performs depth estimation on an input image, the format and/or size (resolution) of the input image must meet the model's requirements.
Block 328: depth estimation. For example, the AI model executed by the graphics processing unit 120 is trained in advance to take a single input image (i.e., the planar image) and determine the depth of each object in the input image. The graphics processing unit 120 can thereby obtain the depth setting of the planar image, where the depth setting may be, for example, a depth effect strength parameter equal to the maximum depth minus the minimum depth of the planar image in the Z-axis direction. In one embodiment, the depth estimation of the present disclosure is performed for each frame, i.e., a depth map is generated for every frame of the input image.
The side-by-side image generation stage 330 includes blocks 332, 334, 336, and 338. Block 332: update the stereoscopic grid. For example, the graphics processing pipeline 121 of the graphics processing unit 120 may update the stereoscopic grid according to the depth setting of the planar image, wherein the vertex shader of the graphics processing pipeline 121 may use the depth map (depth image) obtained at block 328 as the texture of the corresponding triangles in the stereoscopic grid. In one embodiment, the user can dynamically adjust the depth setting. For example, if the depth of the display plane is 0 and the maximum depth into the screen is -10, the user can restrict the depth range to 0 to -6 according to their own needs, so that the maximum depth becomes -6; in another embodiment, the user can set the depth interval to -3 to -9. The graphics processing pipeline 121 then updates the stereoscopic grid according to this depth setting of the planar image.
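The dynamic depth adjustment above can be sketched as a rescaling of the raw depth map into the user-chosen interval. The patent only states that the user can set the range (e.g., 0 to -6, or -3 to -9); the linear rescaling below is an illustrative assumption.

```python
def remap_depth(depth_map, near=0.0, far=-6.0):
    """Rescale a raw depth map into a user-chosen interval: the shallowest
    sample maps to `near` (e.g. 0, the display plane) and the deepest to
    `far` (e.g. -6, into the screen).  Linear remapping is an assumption;
    the patent only says the user can set the depth range."""
    values = [d for row in depth_map for d in row]
    d_min, d_max = min(values), max(values)
    if d_max == d_min:
        return [[near for _ in row] for row in depth_map]
    span = d_max - d_min
    return [[near + (d_max - d) / span * (far - near) for d in row]
            for row in depth_map]
```

With the defaults, a raw range of 0 to -10 is compressed to 0 to -6, matching the first example in the paragraph above.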
Block 334: coordinate system mapping. For example, the graphics processing unit 120 maps the stereoscopic grid to the corresponding coordinate system according to the eye-tracking result. When a user wears a head-mounted display or views a naked-eye stereoscopic display device, the user may move their body or head, or change their gaze direction; all of these affect the stereoscopic image of the virtual reality scene (i.e., the three-dimensional scene) that the host 100 is computing. For example, when the user performs any of the above actions, the positions of the left camera and the right camera in the virtual reality scene (corresponding to the user's left eye and right eye) change, so the graphics processing unit 120 must recompute the left image and the right image of the virtual reality scene captured from the changed positions of the left camera and the right camera, which become the left-eye image and the right-eye image viewed by the user.
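The left/right camera placement described in block 334 can be sketched as offsetting two virtual cameras from the tracked head position, perpendicular to the gaze direction. Treating Y as "up" and the default 64 mm interpupillary distance are illustrative assumptions, not values from the patent.

```python
import math

def eye_camera_positions(head, gaze, ipd=0.064):
    """Place left/right virtual cameras around the tracked head position
    `head`, separated by the interpupillary distance `ipd` along the axis
    perpendicular to the gaze direction (Y treated as 'up')."""
    gx, gy, gz = gaze
    # right-vector = gaze x up, with up = (0, 1, 0) -> (-gz, 0, gx)
    rx, rz = -gz, gx
    norm = math.sqrt(rx * rx + rz * rz)
    rx, rz = rx / norm, rz / norm
    hx, hy, hz = head
    half = ipd / 2.0
    left = (hx - rx * half, hy, hz - rz * half)
    right = (hx + rx * half, hy, hz + rz * half)
    return left, right
```

When the eye-tracking result changes `head` or `gaze`, both camera positions move, which is exactly why the left and right images must be recomputed.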
Block 336: project the image onto the stereoscopic grid. For example, after blocks 332 and 334, the relative position and distance (or depth) of the updated and mapped stereoscopic grid with respect to the user's eyes are known, so the graphics processing pipeline 121 can project the planar image onto the stereoscopic grid to obtain the updated virtual reality scene.
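The projection step runs in the vertex shader of the graphics processing pipeline 121 in the actual device; as a stand-in, the scalar sketch below shows the underlying pinhole-projection math for a single grid vertex (our simplification, looking down the -Z axis).

```python
def project_vertex(vertex, camera, focal=1.0):
    """Minimal pinhole projection of one stereoscopic-grid vertex into a
    camera at `camera` looking down the -Z axis.  Returns normalized
    image-plane coordinates (x', y') = focal * (x, y) / z."""
    vx, vy, vz = (vertex[i] - camera[i] for i in range(3))
    z = -vz  # distance in front of the camera
    if z <= 0:
        raise ValueError("vertex is behind the camera")
    return (focal * vx / z, focal * vy / z)
```

Running this per vertex for the left camera and again for the right camera yields the two slightly different views from which the side-by-side image is composed.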
Block 338: generate a side-by-side image. Because the updated virtual reality scene was obtained at block 336, the processor 110 (or the graphics processing unit 120) can capture the updated virtual reality scene with the left camera and the right camera to obtain the left-eye image and the right-eye image, and place them side by side to obtain the side-by-side image.
Block 340: image weaving. For example, some stereoscopic image display devices (e.g., naked-eye stereoscopic display devices) must play the left-eye image and the right-eye image simultaneously so that the user perceives stereoscopic vision. The input image format of such a stereoscopic image display device must be a woven image, in which, for example, the odd lines carry the left-eye image and the even lines carry the right-eye image, or the odd lines carry the right-eye image and the even lines carry the left-eye image.
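The row-interleaved weaving described in block 340 can be sketched as follows (images represented as lists of rows; the function name is ours):

```python
def weave(left, right, left_on_odd_lines=True):
    """Row-interleaved weaving for autostereoscopic displays: take the
    odd display lines (1-based) from one eye's image and the even lines
    from the other.  Row i is taken from the matching source image so
    that vertical alignment is preserved."""
    assert len(left) == len(right)
    first, second = (left, right) if left_on_odd_lines else (right, left)
    # 1-based odd lines correspond to 0-based indices 0, 2, 4, ...
    return [first[i] if i % 2 == 0 else second[i] for i in range(len(left))]
```

Flipping `left_on_odd_lines` produces the alternative format mentioned above, with the right-eye image on the odd lines.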
In the present disclosure, because the graphics processing pipeline 121 of the graphics processing unit 120 is a dedicated hardware circuit, when the user's position or line of sight in the virtual reality scene changes, the graphics processing pipeline 121 can rapidly update the stereoscopic grid, compute the mapped coordinate system, and apply the planar image to the mapped stereoscopic grid to obtain the updated stereoscopic scene. Therefore, the present disclosure can use the graphics processing pipeline 121 to rapidly compute the left image and the right image of the virtual reality scene captured by the left camera and the right camera at their changed positions, and generate the corresponding side-by-side image and woven image for the stereoscopic image display device 180 to play. In one embodiment, the graphics processing pipeline 121 also generates the depth map of each frame, so the depth map and each captured frame are processed together in the same graphics processing pipeline 121, yielding a very high processing speed.
Fig. 4 is a flowchart of a stereoscopic image generation method according to an embodiment of the disclosure. Please refer to fig. 1, 3, and 4 simultaneously.
In step S410, a stereoscopic grid and its texture are created using the graphics processing unit 120. For example, the stereoscopic grid includes a plurality of triangles, and each triangle has a corresponding texture. The graphics processing pipeline 121 creates the corresponding texture for each triangle, and during graphics processing a vertex shader of the graphics processing pipeline 121 applies the texture to each triangle to obtain the three-dimensional scene.
In step S420, a planar image of the stereoscopic scene is captured. For example, although the graphics processing unit 120 has generated the three-dimensional scene, the host 100 must still generate a corresponding stereoscopic image for playing on the stereoscopic image display device 180. At this point, the processor 110 may capture an image of the three-dimensional scene to obtain a planar image of the three-dimensional scene.
In step S430, in response to the planar image not being a side-by-side image, image preprocessing is performed on the planar image to obtain a first image. For example, the image preprocessing module 145 is used to resize and/or format-convert the planar image generated by the graphics processing unit 120 so that the processed planar image meets the requirements of the stereoscopic image playing program 143 and of the artificial intelligence (AI) model executed by the graphics processing unit 120 for depth estimation. It should be noted that when the AI model performs depth estimation on an input image, the format and/or size (resolution) of the input image must meet the model's requirements.
In step S440, depth estimation is performed on the first image by the graphics processing pipeline of the graphics processing unit to obtain a depth image. For example, the AI model executed by the graphics processing unit 120 is trained in advance to take a single input image (i.e., the planar image) and determine the depth of each object in the input image. The graphics processing unit 120 can thereby obtain the depth setting of the planar image, where the depth setting may be, for example, a depth effect strength parameter equal to the maximum depth minus the minimum depth of the planar image in the Z-axis direction. In addition, the graphics processing unit 120 determines a corresponding depth map for each input image.
In step S450, the stereoscopic grid is updated using the depth image, and the stereoscopic grid is mapped to the corresponding coordinate system according to the eye-tracking result of the user. For example, the graphics processing unit 120 maps the stereoscopic grid according to the eye-tracking result. When the user wears a head-mounted display or views a naked-eye stereoscopic display device, the user may move their body or head, or change their gaze direction; all of these affect the stereoscopic image of the virtual reality scene (i.e., the three-dimensional scene) that the host 100 is computing. When the user performs any of these actions, the positions of the left camera and the right camera in the virtual reality scene (corresponding to the user's left eye and right eye) change, so the graphics processing unit 120 must recompute the left image and the right image of the virtual reality scene captured from the changed camera positions, which become the left-eye image and the right-eye image viewed by the user.
In step S460, the first image is projected onto the mapped stereoscopic grid to obtain an output stereoscopic grid. For example, through steps S440 and S450, the relative positions and distances (or depths) between the user's eyes and the updated, mapped stereoscopic grid can be determined, so the vertex shader in the graphics processing pipeline 121 can project the first image onto the stereoscopic grid to obtain an updated virtual reality scene (i.e., three-dimensional scene).
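Conceptually, the vertex shader samples the depth image at each grid vertex and displaces the vertex along Z. The following CPU-side sketch mimics that per-vertex work in numpy (the linear depth-to-Z mapping and nearest-neighbor sampling are illustrative assumptions; a real implementation runs per-vertex on the GPU):

```python
import numpy as np

def displace_grid(grid_xy: np.ndarray, depth_map: np.ndarray, strength: float) -> np.ndarray:
    """Sketch of the vertex-shader displacement in step S460: each grid
    vertex, given as (u, v) texture coordinates in [0, 1], samples the
    depth image and gains a Z component proportional to the sampled depth
    scaled by the depth-effect strength."""
    h, w = depth_map.shape
    u = np.clip((grid_xy[:, 0] * (w - 1)).astype(int), 0, w - 1)
    v = np.clip((grid_xy[:, 1] * (h - 1)).astype(int), 0, h - 1)
    z = depth_map[v, u] * strength          # nearest-neighbor depth sample
    return np.column_stack([grid_xy, z])    # (N, 3) displaced vertices
```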
In step S470, an output side-by-side image is captured from the output stereoscopic grid, wherein the output side-by-side image includes a left-eye image and a right-eye image. Because the updated virtual reality scene is obtained in step S460, the processor 110 (or the graphics processing unit 120) can capture images with the left camera and the right camera in the updated virtual reality scene to obtain the left-eye image and the right-eye image, and arrange the left-eye image and the right-eye image side by side to obtain the output side-by-side image.
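The side-by-side packing itself is a horizontal concatenation of the two rendered views, sketched as:

```python
import numpy as np

def side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Pack the left-eye and right-eye images into one side-by-side frame:
    the left half of the output holds the left-eye image, the right half
    holds the right-eye image (step S470)."""
    assert left.shape == right.shape, "both eye images must share a shape"
    return np.concatenate([left, right], axis=1)
```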
In step S480, the left-eye image and the right-eye image are woven into an output image, and the output image is played on the stereoscopic image display device. For example, some stereoscopic image display apparatuses (e.g., naked-eye stereoscopic display apparatuses) need to play the left-eye image and the right-eye image simultaneously so that the user perceives stereoscopic vision. The input image format of such a stereoscopic image display device needs to be a woven image (weave image); for example, odd lines carry the left-eye image and even lines carry the right-eye image, or odd lines carry the right-eye image and even lines carry the left-eye image.
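Line weaving interleaves the two eye images row by row. A minimal sketch (rows are 0-indexed here, whereas the description counts lines from 1; which parity carries which eye is device-specific):

```python
import numpy as np

def weave(left: np.ndarray, right: np.ndarray, left_on_odd: bool = False) -> np.ndarray:
    """Weave left- and right-eye images into one line-interleaved frame
    (step S480). By default rows 0, 2, ... come from the left eye and
    rows 1, 3, ... from the right eye; left_on_odd=True swaps the parity."""
    assert left.shape == right.shape, "both eye images must share a shape"
    out = np.empty_like(left)
    if left_on_odd:
        out[0::2] = right[0::2]
        out[1::2] = left[1::2]
    else:
        out[0::2] = left[0::2]
        out[1::2] = right[1::2]
    return out
```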
In summary, the disclosure provides a stereoscopic image playing device and a stereoscopic image generating method, which use an artificial intelligence model in the graphics processing pipeline of a graphics processing unit to rapidly determine the depth of each object in a planar image; the graphics processing pipeline can rapidly update the stereoscopic grid, calculate the coordinate system after mapping, and attach the planar image to the mapped stereoscopic grid to obtain an updated stereoscopic scene. Therefore, when the user's position in the stereoscopic scene changes or the user's line of sight changes, the present disclosure can utilize the graphics processing pipeline of the graphics processing unit to rapidly calculate the left image and the right image of the virtual reality scene captured by the left camera and the right camera at the changed positions, and generate the corresponding side-by-side image and woven image for the stereoscopic image display device to play, thereby improving both the image quality of the output stereoscopic image and the performance when playing the stereoscopic image. Compared with the traditional method of pixel-shifting the original image according to depth map information to obtain the other-eye image, the disclosed method can rapidly calculate the other-eye image according to the user's requirements and position movement through the stereoscopic grid, without the additional pixel-shifting calculation.
While the present disclosure has been described with reference to the preferred embodiments, it is not intended to be limited thereto; those skilled in the art may make various changes and modifications without departing from the spirit and scope of the present disclosure.

Claims (14)

1. A stereoscopic image playing device, comprising:
a processor; and
a graphic processing unit for creating a three-dimensional grid and its material to obtain a three-dimensional scene and capturing the plane image of the three-dimensional scene;
wherein, in response to the planar image not being a side-by-side image, the processor performs image preprocessing on the planar image to obtain a first image;
a graphics processing pipeline of the graphics processing unit performs depth estimation on the first image to obtain a depth image, updates the stereoscopic grid according to the depth setting of the depth image, and maps the stereoscopic grid to a corresponding coordinate system according to the eyeball tracking result of a user of the stereoscopic image playing device;
the graphics processing pipeline projects the first image onto the mapped stereoscopic grid to obtain an output stereoscopic grid, and captures output side-by-side images from the output stereoscopic grid, wherein the output side-by-side images comprise a left-eye image and a right-eye image;
the image processing pipeline weaves the left eye image and the right eye image into an output image, and plays the output image in a stereoscopic image display device.
2. The stereoscopic image playing device according to claim 1, wherein the graphics processing pipeline directly weaves the left-eye image and the right-eye image in the side-by-side image into an output image in response to the planar image being the side-by-side image, and plays the output image on a stereoscopic image display apparatus.
3. The stereoscopic image playing device according to claim 1, wherein the vertex shader of the graphics processing pipeline projects the first image to the mapped stereoscopic grid to obtain the output stereoscopic grid.
4. The stereoscopic image playing device according to claim 1, wherein the image preprocessing is to adjust the size and format of the planar image to meet the requirements of an artificial intelligence model in the graphics processing pipeline, and the artificial intelligence model performs depth estimation on the first image to obtain the depth image.
5. The stereoscopic image playing device according to claim 1, wherein the processor subtracts the minimum depth from the maximum depth in the depth image to obtain the depth setting.
6. The stereoscopic image playing device according to claim 1, further comprising: the camera is used for capturing the facial image of the user, and the processor detects the sight line direction of the two eyes of the user from the facial image as the eyeball tracking result.
7. The stereoscopic image playing device according to claim 1, wherein the graphics processing pipeline uses the depth image as a material of the stereoscopic grid to update the stereoscopic grid.
8. A stereoscopic image generating method is used for a stereoscopic image playing device, wherein the stereoscopic image playing device comprises a processor and a graphic processing unit, and the method comprises the following steps:
creating a three-dimensional grid and a material thereof by using the graphic processing unit to obtain a three-dimensional scene, and capturing a plane image of the three-dimensional scene;
in response to the planar image being not a side-by-side image, performing image preprocessing on the planar image by using the processor to obtain a first image;
performing depth estimation on the first image by using a graphics processing pipeline of the graphics processing unit to obtain a depth image, updating the stereoscopic grid according to the depth setting of the depth image, and mapping the stereoscopic grid to a corresponding coordinate system according to the eye tracking result of a user of the stereoscopic image playing device;
utilizing the graphics processing pipeline to project the first image onto the mapped stereoscopic grid to obtain an output stereoscopic grid, and capturing output side-by-side images from the output stereoscopic grid, wherein the output side-by-side images comprise a left-eye image and a right-eye image; and
the left eye image and the right eye image are woven into an output image by the graphics processing pipeline, and the output image is played on a stereoscopic image display device.
9. The stereoscopic image generation method of claim 8, further comprising: and in response to the planar image being the side-by-side image, directly weaving the left eye image and the right eye image in the side-by-side image into an output image by using the graphics processing pipeline, and playing the output image in a stereoscopic image display device.
10. The stereoscopic image generation method of claim 8, further comprising: the vertex shader of the graphics processing pipeline is used to project the first image to the mapped stereoscopic grid to obtain the output stereoscopic grid.
11. The method of claim 8, wherein the image preprocessing is to adjust the size and format of the planar image to meet the requirements of an artificial intelligence model in the graphics processing pipeline, and the artificial intelligence model performs depth estimation on the first image to obtain the depth image.
12. The stereoscopic image generation method of claim 8, further comprising: subtracting the minimum depth from the maximum depth in the depth image to obtain the depth setting.
13. The method of claim 8, wherein the stereoscopic image playing device further comprises a camera for capturing facial images of the user, and the method further comprises: the processor is used for detecting the sight line direction of the eyes of the user from the face image as the eyeball tracking result.
14. The stereoscopic image generation method of claim 8, further comprising: the graphics processing pipeline is utilized to update the stereoscopic grid by using the depth image as the material of the stereoscopic grid.
CN202210177426.9A 2022-02-25 2022-02-25 Stereoscopic image playing device and stereoscopic image generating method thereof Pending CN116708737A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210177426.9A CN116708737A (en) 2022-02-25 2022-02-25 Stereoscopic image playing device and stereoscopic image generating method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210177426.9A CN116708737A (en) 2022-02-25 2022-02-25 Stereoscopic image playing device and stereoscopic image generating method thereof

Publications (1)

Publication Number Publication Date
CN116708737A true CN116708737A (en) 2023-09-05

Family

ID=87826252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210177426.9A Pending CN116708737A (en) 2022-02-25 2022-02-25 Stereoscopic image playing device and stereoscopic image generating method thereof

Country Status (1)

Country Link
CN (1) CN116708737A (en)

Similar Documents

Publication Publication Date Title
US8750599B2 (en) Stereoscopic image processing method and apparatus
US20140176591A1 (en) Low-latency fusing of color image data
WO2019159617A1 (en) Image processing device, image processing method, and program
KR20210130773A (en) Image processing method and head mounted display device
JP2019125929A (en) Image processing apparatus, image processing method, and program
JP2005295004A (en) Stereoscopic image processing method and apparatus thereof
JP5628819B2 (en) Computer graphics video composition apparatus and method, and display apparatus
US9766458B2 (en) Image generating system, image generating method, and information storage medium
CN109510975B (en) Video image extraction method, device and system
CN112166397A (en) Apparatus, system, and method for accelerating position tracking of head mounted display
JP2020173529A (en) Information processing device, information processing method, and program
CN111095348A (en) Transparent display based on camera
CN110969706B (en) Augmented reality device, image processing method, system and storage medium thereof
CN110870304A (en) Method and apparatus for providing information to a user for viewing multi-view content
TWM630947U (en) Stereoscopic image playback apparatus
US10796485B2 (en) Rendering objects in virtual views
CN116708737A (en) Stereoscopic image playing device and stereoscopic image generating method thereof
TWI817335B (en) Stereoscopic image playback apparatus and method of generating stereoscopic images thereof
US11187914B2 (en) Mirror-based scene cameras
CN114513646A (en) Method and device for generating panoramic video in three-dimensional virtual scene
CN114020150A (en) Image display method, image display device, electronic apparatus, and medium
CN112308981A (en) Image processing method, image processing device, electronic equipment and storage medium
US20230245280A1 (en) Image processing apparatus, image processing method, and storage medium
EP4358509A1 (en) Image processing method and virtual reality display system
US20240137483A1 (en) Image processing method and virtual reality display system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination