CN117321987A - Immersive viewing experience - Google Patents

Immersive viewing experience

Info

Publication number
CN117321987A
CN117321987A (application CN202280030471.XA)
Authority
CN
China
Prior art keywords: user, image, specific display, shows, display image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280030471.XA
Other languages
Chinese (zh)
Inventor
Robert Edwin Douglas
David Byron Douglas
Katherine Mary Douglas
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
David Byron Douglas
Katherine Mary Douglas
Robert Edwin Douglas
Original Assignee
David Byron Douglas
Katherine Mary Douglas
Robert Edwin Douglas
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 17/237,152 (US 11,589,033 B1)
Application filed by David Byron Douglas, Katherine Mary Douglas, Robert Edwin Douglas
Publication of CN117321987A

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 - Image signal generators
    • H04N13/204 - Image signal generators using stereoscopic image cameras
    • H04N13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 - Processing image signals
    • H04N13/122 - Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 - Image reproducers
    • H04N13/332 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays

Abstract

The patent discloses a method for recording images over a field larger than a user's visual range. Head tracking and eye tracking then allow the user to look around naturally, so that the person sees the scene as if naturally present at the site in real time. Also taught herein is an intelligent system that analyzes a user's viewing parameters and streams customized images for display.

Description

Immersive viewing experience
Technical Field
Aspects of the present disclosure generally relate to the distribution of work (e.g., between a cloud and a user's display device).
Cross reference to related applications
The present application is a PCT application of US 17/237,152, filed on 22 April 2021; US 17/237,152 is a continuation-in-part of US patent application 17/225,610, filed on 7 April 2021; and US patent application 17/225,610 is a continuation-in-part of US patent application 17/187,828, filed on 28 February 2021.
Introduction to the invention
Movies are a form of entertainment.
Disclosure of Invention
All examples, aspects, and features mentioned herein may be combined in any technically conceivable manner. This patent teaches a method, software and apparatus for an immersive viewing experience.
In general, this patent improves upon the technology taught in U.S. patent application Ser. No. 17/225,610, filed on 7 April 2021, the entire contents of which are incorporated herein by reference. Some of the devices described in U.S. patent application 17/225,610 have the ability to generate ultra-large data sets. The present patent improves the display of such very large data sets.
The present patent discloses a system, method, apparatus, and software to achieve an improved immersive viewing experience. First, the user's viewing parameters are uploaded to the cloud, where the cloud stores an image (which in the preferred embodiment is an oversized dataset). Viewing parameters may include any action, pose, body position, eye gaze angle, eye convergence/vergence, or input (e.g., through a graphical user interface). Thus, the user's viewing parameters are characterized in near real time (e.g., by various devices, such as an eye-facing camera or a gesture-recording camera) and sent to the cloud. Second, a set of user-specific images is optimized from the stored image, wherein the user-specific images are based at least on the viewing parameters. In a preferred embodiment, the field of view of the user-specific image is smaller than that of the stored image. In a preferred embodiment, the locations the user is looking at have high resolution, while the locations the user is not looking at have low resolution. For example, if the user is looking at an object to the left, the left portion of the user-specific image will be high resolution. In some embodiments, the user-specific image will be streamed in near real time.
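As a rough, non-authoritative illustration of this upload step, the Python sketch below packages a set of viewing parameters for near-real-time transmission to the cloud. The field names, units, and JSON encoding are assumptions made for this example; the patent does not specify a data format.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ViewingParameters:
    """Assumed container for the viewing parameters described above."""
    head_yaw_deg: float            # head direction, left/right
    head_pitch_deg: float          # head direction, up/down
    gaze_yaw_deg: float            # eye gaze angle, left/right
    gaze_pitch_deg: float          # eye gaze angle, up/down
    convergence_distance_m: float  # how far away the eyes are converged
    timestamp_s: float             # when the measurement was taken

def serialize_for_upload(params: ViewingParameters) -> bytes:
    """Package the parameters for upload to the cloud over the internet."""
    return json.dumps(asdict(params)).encode("utf-8")

# Example: the user looks about 20 degrees to the left, converged at ~3 m.
payload = serialize_for_upload(
    ViewingParameters(-20.0, 0.0, -22.5, -5.0, 3.0, 12.04))
```

In practice such a record would be sent many times per second, and the cloud would respond with the corresponding user-specific image.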
In some embodiments, the user-specific image includes a first portion having a first spatial resolution and a second portion having a second spatial resolution, wherein the first spatial resolution is higher than the second spatial resolution. Some embodiments include, wherein the viewing parameter comprises a viewing position, wherein the viewing position corresponds to the first portion.
Some embodiments include wherein the user-specific image includes a first portion having a first zoom setting and a second portion having a second zoom setting, wherein the first zoom setting is higher than the second zoom setting. Some embodiments include, wherein the first portion is determined by the viewing parameter, wherein the viewing parameter comprises at least one of the group consisting of: the position of the user's body; the direction of the user's body; the pose of the user's hand; facial expressions of the user; the position of the user's head; and the direction of the user's head. Some embodiments include wherein the first portion is determined by a graphical user interface (e.g., a mouse or controller).
Some embodiments include wherein the image includes a first field of view (FOV), and the user-specific image includes a second FOV, wherein the first FOV is greater than the second FOV.
Some embodiments include, wherein the image comprises a stereoscopic image, wherein the stereoscopic image is obtained by a stereoscopic camera or a stereoscopic camera cluster.
Some embodiments include, wherein the image comprises a stitched image, wherein the stitched image is generated by at least two cameras.
Some embodiments include, wherein the image comprises a composite image, wherein the composite image is generated by: capturing a first image of a scene with a first set of camera settings, wherein the first set of camera settings cause a first object to be in focus and a second object to be out of focus; and capturing a second image of the scene with a second set of camera settings, wherein the second set of camera settings causes the second object to be in focus and the first object to be out of focus. Some embodiments include wherein the first image is presented to the user when the user looks at the first object and the second image is presented to the user when the user looks at the second object. Some embodiments include combining at least the first object from the first image and the second object from the second image into the composite image.
Some embodiments include performing image stabilization. Some embodiments include, wherein the viewing parameter comprises convergence. Some embodiments include wherein the user-specific image is a 3D image (three-dimensional image), wherein the 3D image is presented on an HDU, a pair of stereoscopic glasses, or a pair of polarized glasses.
Some embodiments include wherein the user-specific image is presented to the user on a display, wherein the user has a field of view of at least 0.5 pi steradians.
Some embodiments include wherein the user-specific image is presented on a display. In some embodiments, the display is a screen (e.g., a TV, a reflective screen coupled to a projector system, an augmented reality head display unit comprising an augmented reality display, a virtual reality display, or a mixed reality display).
Drawings
Fig. 1 shows a retrospective display of stereoscopic images.
Fig. 2 illustrates a method of determining which stereo pair to display to a user at a given point in time.
Fig. 3 shows the display of video recordings on an HDU.
Fig. 4 shows a pre-recorded stereoscopic viewing performed by user 1.
Fig. 5 illustrates remote stereoscopic imaging of a remote object using a stereoscopic camera cluster.
Fig. 6 illustrates the post-acquisition capability of adjusting an image based on user eye tracking to obtain the best possible picture by generating a stereoscopic composite image.
Fig. 7A shows a moving image and application of image stabilization processing.
Fig. 7B shows a moving image displayed in the HDU.
Fig. 7C illustrates the application of image stabilization to an image using stereoscopic images.
Fig. 8A shows left and right images with a first camera setting.
Fig. 8B shows left and right images with a second camera setting.
Fig. 9A shows a top view of all data of a scene collected at one point in time.
Fig. 9B shows a wide-angle display 2D image frame of a video recording.
Fig. 9C shows a top view of user A with a viewing angle of -70° and a 55° FOV.
Fig. 9D shows what user A would see given user A's viewing angle of -70° and 55° FOV.
Fig. 9E shows a top view of user B with a viewing angle of +50° and an 85° FOV.
Fig. 9F shows what user B would see given user B's viewing angle of +50° and 85° FOV.
Fig. 10A shows a field of view captured by a left camera at a first point in time.
Fig. 10B shows the field of view captured by the right camera at a first point in time.
Fig. 10C shows the personalized field of view (FOV) of the first user at a given point in time.
Fig. 10D shows the personalized field of view (FOV) of the second user at a given point in time.
Fig. 10E shows a personalized field of view (FOV) of a third user at a given point in time.
Fig. 10F shows the personalized field of view (FOV) of the fourth user at a given point in time.
Fig. 11A shows a top view of a left eye view of a first user.
Fig. 11B shows a top view of a left eye view of a first user with convergence points near the left and right eyes.
Fig. 11C shows a left eye view without convergence at time point 1.
Fig. 11D shows a left eye view with convergence at time point 2.
Fig. 12 illustrates reconstructing various stereoscopic images from previously acquired wide-angle stereoscopic images.
Fig. 13A shows a top view of a home theater.
Fig. 13B shows a side view of the home theater shown in fig. 13A.
Fig. 14A shows a top view of a home theater.
Fig. 14B shows a side view of the home theater as shown in fig. 14A.
Fig. 15A shows an approximately spherical TV in which the user looks straight ahead at time point #1.
Fig. 15B shows the television portion and field of view observed by the user at time point #1.
Fig. 15C shows an approximately spherical TV in which the user looks straight ahead at time point #2.
Fig. 15D shows the television portion and field of view observed by the user at time point #2.
Fig. 15E shows an approximately spherical TV in which the user looks directly ahead at time point #3.
Fig. 15F shows the television portion and field of view observed by the user at time point #3.
Fig. 16A shows an image without zooming.
Fig. 16B shows digital magnification of a portion of an image.
Fig. 17A shows an image without zooming.
Fig. 17B shows an optical-type magnification on a part of an image.
Fig. 18A shows a single resolution image.
Fig. 18B shows a multi-resolution image.
Fig. 19A shows a large field of view in which a first user is looking at a first portion of an image and a second user is looking at a second portion of the image.
Fig. 19B shows that only the first portion of the image in fig. 19A and the second portion of the image in fig. 19A are high resolution, while the rest of the image is lower resolution.
Fig. 20A shows a low resolution image.
Fig. 20B shows a high resolution image.
Fig. 20C shows a composite image.
Fig. 21 illustrates a method and process for performing near real-time streaming (streaming) of custom images.
Fig. 22A illustrates rear resection used in conjunction with a stereo camera, where one camera position is unknown.
Fig. 22B illustrates rear intersection used in conjunction with stereo cameras, where the object position is unknown.
Fig. 23A shows a top view of a person looking forward toward the center of a home theater screen.
Fig. 23B shows a top view of a person looking toward the right portion of the home theater screen.
Fig. 24 illustrates a method, system and apparatus for optimizing stereo camera settings during image acquisition in motion.
Detailed Description
The flow chart does not describe the syntax of any particular programming language. Rather, the flow diagrams illustrate the functional information one of ordinary skill in the art requires in making circuits or generating computer software to perform the processing required in accordance with the present invention. It should be noted that many routine elements, such as initialization of loops and variables and the use of temporary variables, are not shown. It will be appreciated by those of ordinary skill in the art that unless otherwise indicated herein, the particular sequence of steps described is illustrative only and can be varied without departing from the spirit of the invention. Thus, unless otherwise indicated, the steps described below are unordered meaning that, where possible, the steps can be performed in any convenient or desirable order.
Fig. 1 shows a retrospective display of stereoscopic images. 100 shows step A: determine the position at which the viewer is looking at time point n (e.g., coordinates (α_n, β_n, r_n)). Note #1: the location may be a near convergence point, an intermediate convergence point, or a far convergence point. Note #2: a series of stereoscopic images are acquired and recorded; step A follows the acquisition process and is performed during user viewing at some subsequent time. 101 shows step B: determine the field of view FOV_n corresponding to the position (e.g., the coordinates (α_n, β_n, r_n)) at time point n (note: the user may select the FOV). 102 shows step C: select the camera(s) corresponding to the left eye FOV, with the option of performing additional image processing (e.g., using a composite image, using a convergence zone), to generate a personalized left eye image PLEI_n at time point n. 103 shows step D: select the camera(s) corresponding to the right eye FOV, with the option of performing additional image processing (e.g., using a composite image, using a vergence zone), to generate a personalized right eye image PREI_n at time point n. 104 shows step E: display PLEI_n on the left eye display of the HDU. 105 shows step F: display PREI_n on the right eye display of the HDU. 106 shows step G: increment the time step to n+1 and go to step A above.
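A compact sketch of this loop (steps A through G) is shown below. The `recording`, `tracker`, and `hdu` objects are hypothetical stand-ins for the recorded image store, the eye/head tracking hardware, and the head display unit; their interfaces are assumptions, not part of the patent.

```python
def retrospective_stereo_playback(recording, tracker, hdu):
    """Sketch of the Fig. 1 loop over recorded stereoscopic imagery."""
    n = 0
    while n < recording.num_frames():
        # Step A: where is the viewer looking at time n (alpha, beta, r)?
        gaze = tracker.gaze_position(n)
        # Step B: the field of view corresponding to that position (the user may pick the FOV).
        fov = tracker.field_of_view(n)
        # Steps C and D: pick the recorded camera(s) covering each eye's FOV and
        # optionally post-process (composite image, convergence/vergence zone).
        plei = recording.render_left_eye(n, gaze, fov)    # personalized left eye image
        prei = recording.render_right_eye(n, gaze, fov)   # personalized right eye image
        # Steps E and F: show the personalized pair on the head display unit.
        hdu.show_left(plei)
        hdu.show_right(prei)
        # Step G: advance to the next time step.
        n += 1
```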
Fig. 2 illustrates a method of determining which stereo pair to display to a user at a given point in time. 200 shows a text box that analyzes user parameters to determine which stereoscopic image to display to the user. First, the viewing direction of the user's head is used. For example, a first stereo pair may be used if the user's head is in a forward direction, and a second stereo pair may be used if the user's head is in a left direction. Second, the viewing angle of the user's gaze is used. For example, if the user looks in a direction toward a distant object (e.g., a distant mountain), a distant (e.g., region 3) stereoscopic image pair would be selected for that point in time. Third, convergence of the user is used. For example, if the viewing direction of a near object (e.g., a leaf on a tree) is very similar to the viewing direction of a far object (e.g., a far mountain), then a combination of convergence and viewing angle is selected for use. Fourth, accommodation of the user's eyes is used. For example, the pupil size of the user is monitored and changes in size are used to indicate where the user is looking (near/far).
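The decision logic described above could be organized along the lines of the sketch below. All angle and distance thresholds, and the sector/zone labels, are invented for illustration; the patent does not give numeric values.

```python
def select_stereo_pair(head_direction_deg, gaze_depth_m, convergence_deg):
    """Illustrative mapping from user parameters to a pre-recorded stereo pair."""
    # 1. Head direction picks the recorded sector (left / forward / right pair).
    if head_direction_deg < -30:
        sector = "left"
    elif head_direction_deg > 30:
        sector = "right"
    else:
        sector = "forward"

    # 2-3. Gaze angle and convergence pick the depth zone; a strongly converged
    # gaze overrides a far gaze estimate (the "leaf vs. distant mountain" case).
    if convergence_deg > 5.0 or gaze_depth_m < 2.0:
        zone = "region 1 (near)"
    elif gaze_depth_m < 20.0:
        zone = "region 2 (intermediate)"
    else:
        zone = "region 3 (far)"

    # 4. Accommodation (monitored pupil-size changes) could further refine the zone.
    return sector, zone

print(select_stereo_pair(head_direction_deg=5.0, gaze_depth_m=500.0, convergence_deg=0.2))
# ('forward', 'region 3 (far)')
```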
Fig. 3 shows the display of video recordings on an HDU. 300 shows the establishment of a coordinate system, for example using a camera as the origin and the pointing direction of the camera as an axis. This is discussed in more detail in U.S. patent application Ser. No. 17/225,610, the entire contents of which are incorporated herein by reference. 301 illustrates performing wide-angle recording of a scene (for example, data is recorded with a FOV that is larger than the FOV displayed to the user). 302 illustrates performing an analysis of the user, as discussed in fig. 2, to determine where the user is looking in the scene. 303 shows optimizing the display based on the analysis in 302. In some embodiments, the characteristics (e.g., location, size, shape, orientation, color, brightness, texture, classification by an AI algorithm) of a physical object determine the characteristics (e.g., location, size, shape, orientation, color, brightness, texture) of a virtual object. For example, a user is using a mixed reality display in a room of a house, where some areas in the room are bright (e.g., a window during daytime) and some areas in the room are dark (e.g., a dark blue wall). In some embodiments, the placement location of the virtual object is based on the location of objects within the room. For example, if the background is a dark blue wall, the virtual object may be colored white for highlighting. For example, if the background is a white wall, the virtual object may be colored blue for highlighting. For example, the virtual object may be placed (or repositioned) such that its background enables the virtual object to be displayed clearly, thereby optimizing the viewing experience of the user.
Fig. 4 shows a pre-recorded stereoscopic viewing performed by user 1. 400 shows that user 1 performs stereoscopic recording using a stereoscopic camera system (e.g., a smartphone, etc.). This is discussed in more detail in U.S. patent application Ser. No. 17/225,610, the entire contents of which are incorporated herein by reference. 401 illustrates storing a stereoscopic recording on a storage device. 402 illustrates a user (e.g., user 1 or other user (s)) retrieving a stored stereo record. Note that the stereo record may be transmitted to the other user(s) and the other user(s) may receive the stored stereo record. 403 shows a user (e.g., user 1 or other user (s)) viewing the above stored stereoscopic recording on a stereoscopic display unit (e.g., augmented reality, mixed reality, virtual reality display).
Fig. 5 illustrates remote stereoscopic imaging of a remote object using stereoscopic camera clusters. 500 shows two camera clusters placed at a distance of at least 50 feet apart. 501 shows selecting a target that is at least 1 mile away. 502 shows precisely aligning each camera cluster such that the focal center lines intersect at the target. 503 shows acquiring a stereoscopic image of the target. 504 shows viewing and/or analyzing the acquired stereoscopic image. Some embodiments use cameras with telephoto lenses instead of camera clusters. In addition, some embodiments use a stereoscopic separation of less than or equal to 50 feet to optimize viewing at less than 1 mile away.
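The patent does not spell out the underlying geometry, but the range to a distant target from two widely separated, precisely aligned cameras follows from simple triangulation. The sketch below is a flat-geometry approximation; the baseline and angle values are assumed for illustration.

```python
import math

def triangulate_range_ft(baseline_ft, left_angle_deg, right_angle_deg):
    """Distance from the left camera to the target, given the angle each
    camera's line of sight makes with the baseline (law of sines)."""
    a = math.radians(left_angle_deg)
    b = math.radians(right_angle_deg)
    target_angle = math.pi - a - b          # angle of the triangle at the target
    return baseline_ft * math.sin(b) / math.sin(target_angle)

# A 50 ft baseline with both lines of sight at 89.7 degrees to the baseline:
print(round(triangulate_range_ft(50, 89.7, 89.7)))   # ~4775 ft, roughly 0.9 miles
```

The tiny angular difference at mile-scale ranges is why a wide stereo separation (or telephoto lenses) is needed for distant targets.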
Fig. 6 illustrates the post-acquisition capability of adjusting an image based on user eye tracking to obtain the best possible picture by generating a stereoscopic composite image. Several objects in the stereoscopic image displayed at a given point in time may be of interest to a person viewing the scene. Thus, at each point in time, a stereo composite image will be generated to match the input of at least one user. For example, if the user is looking at (eye tracking determines the viewing position) the mountain 600 or the cloud 601 at a first point in time, a stereo composite image pair will be generated and transmitted to the HDU such that the distant objects (the mountain 600 and the cloud 601) are in focus and the near objects, including the deer 603 and the flower 602, are out of focus. If the user is looking (eye tracking determines the viewing position) at the deer 603, the stereoscopic composite image presented for that frame will be optimized for medium distances. Finally, if the user is looking (eye tracking determines the viewing position) at the nearby flower 602, the stereo composite image will be optimized for near distances (e.g., achieving convergence and blurring more distant items such as the deer 603, the mountain 600, and the cloud 601). Various user inputs may be used to indicate to the software suite how to optimize the stereoscopic composite image. Gestures such as squinting can be used to optimize the stereoscopic composite image for more distant objects. Gestures such as leaning forward may be used to push objects farther away. The GUI may also be used to improve the immersive viewing experience.
Fig. 7A shows a moving image and application of image stabilization processing. 700A shows a left eye image of an object, where the edges of the object have motion blur. 701A shows a left eye image of an object to which image stabilization processing is applied.
Fig. 7B shows a moving image displayed in the HDU. 702 shows an HDU.700A shows a left eye image of an object, where the edges of the object have motion blur. 700B shows a right eye image of an object, where the edges of the object have motion blur. 701A shows a left eye display aligned with the left eye of a user. 701B shows a right eye display aligned with the right eye of the user.
Fig. 7C illustrates the application of image stabilization to an image using stereoscopic images. One key task of image processing is image stabilization using stereoscopic images. 700A shows a left eye image of an object to which image stabilization processing is applied. 700B shows a right eye image of an object to which image stabilization processing is applied. 701A shows a left eye display aligned with the left eye of a user. 701B shows a right eye display aligned with the right eye of the user. 702 shows an HDU.
Fig. 8A shows left and right images with a first camera setting. Note that the text on the display is in focus, while more distant objects, such as the knobs on the cabinet, are out of focus.
Fig. 8B shows left and right images with a second camera setting. Note that the text on the display is out of focus, while more distant objects, such as the knobs on the cabinet, are in focus. One innovation is the use of at least two cameras. A first image is obtained from a first camera. A second image is obtained from a second camera. The first camera and the second camera are at the same viewing angle. Furthermore, they are images of the same scene (e.g., a still scene, or the same point in time of a scene with motion/change). A composite image is generated, wherein a first portion of the composite image is obtained from the first image and a second portion of the composite image is obtained from the second image. Note that in some embodiments, objects within the first image may be segmented, and the same objects within the second image may also be segmented. The first image of an object and the second image of that object may be compared to see which has better quality. The image with better image quality may be added to the composite image. However, in some embodiments, intentional selection of some unclear portions may be performed.
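A simple focus-stacking sketch along these lines is shown below: two same-viewpoint images taken with different focus settings are compared tile by tile using a Laplacian-variance sharpness measure, and the sharper tile is copied into the composite. The tile size and the sharpness metric are assumptions for this example, not details from the patent.

```python
import numpy as np

def sharpness(gray_patch):
    """Variance of a 4-neighbour Laplacian, used here as a focus measure."""
    lap = (-4 * gray_patch[1:-1, 1:-1]
           + gray_patch[:-2, 1:-1] + gray_patch[2:, 1:-1]
           + gray_patch[1:-1, :-2] + gray_patch[1:-1, 2:])
    return float(lap.var())

def composite_from_focus_bracket(img_a, img_b, tile=64):
    """Merge two grayscale images of the same scene (same viewpoint, different
    focus settings), keeping whichever tile is sharper."""
    h, w = img_a.shape
    out = img_a.copy()
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            a = img_a[y:y + tile, x:x + tile]
            b = img_b[y:y + tile, x:x + tile]
            if sharpness(b) > sharpness(a):
                out[y:y + tile, x:x + tile] = b
    return out
```

The same idea extends to object-level selection: segment an object in both images and copy whichever segmentation is sharper into the composite.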
Fig. 9A shows a top view of all data of a scene collected at one point in time.
Fig. 9B shows a wide-angle 2D image frame of a video recording. Note that if the user's own FOV (the human eye FOV) does not match the FOV of the imaging system, the image displayed across the entire field of view may appear distorted.
Fig. 9C shows a top view of user A with a viewing angle of -70° and a 55° FOV. One key innovation is that the user can select a portion of the stereoscopic image according to the viewing angle. Note that the selected portion may actually reach 180°, but not larger.
Fig. 9D shows what user A would see given that user A's viewing angle is -70° with a 55° FOV. This is an improvement over the prior art in that it allows different viewers to see different portions of the field of view. Although humans have a horizontal field of view slightly greater than 180 degrees, humans can only read text over a field of view of about 10 degrees, can only evaluate shape over a field of view of about 30 degrees, and can only evaluate color over a field of view of about 60 degrees. In some embodiments, filtering (subtraction) is performed. A person has a vertical field of view of about 120 degrees, with an upward (above horizontal) field of view of 50 degrees and a downward (below horizontal) field of view of about 70 degrees. However, the maximum rotation of the eyeball is limited to about 25 degrees above the horizontal and about 30 degrees below the horizontal. Typically, the normal line of sight from a seated position is about 15 degrees below horizontal.
Fig. 9E shows a top view of user B with a viewing angle of +50° and an 85° FOV. One key innovation is that the user can select a portion of the stereoscopic image according to the viewing angle. Further, note that the FOV of user B is larger than that of user A. Note that the selected portion may actually reach 180°, but not larger, due to the limitations of the human eye.
Fig. 9F shows what user B would see given user B's viewing angle of +50° with an 85° FOV. This is an improvement over the prior art in that it allows different viewers to see different portions of the field of view. In some embodiments, multiple cameras record a 240° movie. In one embodiment, 4 cameras (each camera capturing a 60° sector) are used for simultaneous recording. In another embodiment, the sectors are photographed sequentially, one recording at a time. Some scenes in a movie may be shot sequentially, while other scenes may be shot simultaneously. In some embodiments, image stitching may be performed using overlapping camera settings. Some embodiments include the use of the photosphere system described in U.S. patent application 17/225,610, which is incorporated herein by reference in its entirety. After the images are recorded, the images from the cameras are edited to synchronize and splice the scenes together. A laser radar (LIDAR) device may be integrated into the camera system for accurate camera direction pointing.
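For the 240-degree, four-camera arrangement mentioned above, playback has to work out which recorded sectors cover a given viewer's gaze. The sketch below assumes sector 0 starts at -120 degrees so that four 60-degree sectors span the 240-degree recording; both the layout and the example angles are illustrative.

```python
def sectors_for_view(view_angle_deg, fov_deg, sector_width_deg=60, num_sectors=4):
    """Return the indices of the recorded sectors that overlap the viewer's FOV."""
    view_start = view_angle_deg - fov_deg / 2.0
    view_end = view_angle_deg + fov_deg / 2.0
    needed = []
    for s in range(num_sectors):
        s_start = -120 + s * sector_width_deg
        s_end = s_start + sector_width_deg
        if s_end > view_start and s_start < view_end:   # sector overlaps the viewer's FOV
            needed.append(s)
    return needed

# User B in Fig. 9E looks toward +50 degrees with an 85-degree FOV:
print(sectors_for_view(50, 85))   # [2, 3], i.e. the sectors covering 0..120 degrees
```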
Fig. 10A shows the field of view captured by the left camera at a first point in time. Left camera 1000 and right camera 1001 are shown. The left FOV 1002 is shown by a white area, approximately 215°, and would have an alpha range from +90° to -135° (sweeping counterclockwise from +90° to -135°). The region that is not imaged within the left FOV 1003 is approximately 135° and would have an alpha range from +90° to -135° (sweeping clockwise from +90° to -135°).
Fig. 10B shows the field of view captured by the right camera at a first point in time. Left camera 1000 and right camera 1001 are shown. The right FOV 1004 is shown by a white area, approximately 215°, and would have an alpha range from +135° to -90° (sweeping counterclockwise from +135° to -90°). The area not imaged within the right FOV 1005 is approximately 135° and would have an alpha range from +135° to -90° (sweeping from +135° to -90° in the counterclockwise direction).
Fig. 10C shows the personalized field of view (FOV) of the first user at a given point in time. 1000 shows a left camera. 1001 shows a right camera. 1006a shows the left boundary of the left eye FOV of the first user, which is displayed in light gray. 1007a shows the right boundary of the left eye FOV of the first user, which is displayed in light gray. 1008a shows the left boundary of the right eye FOV of the first user, which is displayed in light gray. 1009a shows the right boundary of the right eye FOV of the first user, which is displayed in light gray. 1010a shows the centerline of the left eye FOV of the first user. 1011a shows the centerline of the right eye FOV of the first user. Note that the centerline 1010a of the first user's left-eye FOV and the centerline 1011a of the first user's right-eye FOV are parallel, which corresponds to a convergence point at infinity. Note that the first user is looking forward. When filming movies, it is suggested that most of the action in the scene occur in this forward-looking direction.
Fig. 10D shows the personalized field of view (FOV) of the second user at a given point in time. 1000 shows a left camera. 1001 shows a right camera. 1006b shows the left boundary of the left eye FOV of the second user, which is displayed in light gray. 1007b shows the right boundary of the left eye FOV of the second user, which is displayed in light gray. 1008b shows the left boundary of the right eye FOV of the second user, which is displayed in light gray. 1009b shows the right boundary of the right eye FOV of the second user, which is displayed in light gray. 1010b shows the centerline of the left eye FOV of the second user. 1011b shows the centerline of the right eye FOV of the second user. Note that the centerline 1010b of the second user's left-eye FOV and the centerline 1011b of the second user's right-eye FOV meet at convergence point 1012. This allows the second user to view a small object in more detail. Note that the second user is looking forward. When filming movies, it is suggested that most of the action in the scene occur in this forward-looking direction.
Fig. 10E shows a personalized field of view (FOV) of a third user at a given point in time. 1000 shows a left camera. 1001 shows a right camera. 1006c shows the left boundary of the left eye FOV of the third user, which is displayed in light gray. 1007c shows the right boundary of the left eye FOV of the third user, which is displayed in light gray. 1008c shows the left boundary of the right eye FOV of the third user, which is displayed in light gray. 1009c shows the right boundary of the right eye FOV of the third user, which is displayed in light gray. 1010c shows the centerline of the left eye FOV of the third user. 1011c shows the centre line of the right eye FOV of the third user. Note that the center line 1010c of the third user's left-eye FOV and the center line 1011c of the third user's right-eye FOV are approximately parallel, which corresponds to looking at a great distance. Note that the third user is looking in a moderately left direction. Note that the overlapping of the left eye FOV and the right eye FOV provides stereoscopic viewing for the third viewer.
Fig. 10F shows the personalized field of view (FOV) of the fourth user at a given point in time. 1000 shows a left camera. 1001 shows a right camera. 1006d shows the left boundary of the left eye FOV of the fourth user, which is displayed in light gray. 1007d shows the right boundary of the left eye FOV of the fourth user, which is displayed in light gray. 1008d shows the left boundary of the right eye FOV of the fourth user, which is displayed in light gray. 1009d shows the right boundary of the right eye FOV of the fourth user, which is displayed in light gray. 1010d shows the centerline of the left eye FOV of the fourth user. 1011d shows the centerline of the right eye FOV of the fourth user. Note that the centerline 1010d of the fourth user's left-eye FOV and the centerline 1011d of the fourth user's right-eye FOV are approximately parallel, which corresponds to looking at a great distance. Note that the fourth user is looking in the far left direction. Note that the first user, the second user, the third user, and the fourth user all view different views of the movie at the same point in time. It should be noted that some of these designs, such as the camera clusters or spherical systems, are described in U.S. patent application 17/225,610.
Fig. 11A shows a top view of the left eye view of the first user at time point 1. 1100 shows the left eye viewpoint. 1101 shows the right eye viewpoint. 1102 shows the portion of the field of view (FOV) not covered by either camera. 1103 shows the portion of the FOV covered by at least one camera. 1104A shows the medial (inside) portion of the high resolution FOV used by the user, which corresponds to α = +25°. This is discussed in more detail in U.S. patent application Ser. No. 17/225,610, the entire contents of which are incorporated herein by reference.
1105A shows the lateral (outside) portion of the high resolution FOV used by the user, which corresponds to α = -25°.
Fig. 11B shows a top view of a left eye view of a first user with convergence points near the left and right eyes. 1100 shows the left eye viewpoint.
1101 shows the right eye viewpoint. 1102 shows the portion of the field of view (FOV) not covered by either camera. 1103 shows the portion of the FOV covered by at least one camera. 1104B shows the medial (inside) portion of the high resolution FOV used by the user, which corresponds to α = -5°. 1105B shows the lateral (outside) portion of the high resolution FOV used by the user, which corresponds to α = +45°.
Fig. 11C shows a left eye view without convergence at time point 1. Note that in the image, a flower 1106 is shown, which is positioned along the viewing angle α=0°.
Fig. 11D shows a left eye view with convergence at time point 2. Note that the flower 1106 is shown in the image, still positioned along the viewing angle α = 0°. However, the user has converged during this point in time. This convergence behavior causes the left eye field of view to change from a horizontal field of view between -25° and +25° (as shown in figs. 11A and 11C) to a field of view between -5° and +45° (as shown in figs. 11B and 11D). The system improves upon the prior art in that it provides stereoscopic convergence on a stereoscopic camera by shifting the image according to the left (or right) field of view. In some embodiments, a portion of the display is non-optimized, as described in U.S. patent 10,712,837, the entire contents of which are incorporated herein by reference.
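The shift described here can be captured in a small helper; the sketch below simply reproduces the numbers in the figure description (a 25-degree half-FOV and a 20-degree shift toward positive alpha) and is not a general vergence model.

```python
def left_eye_fov_bounds(base_half_fov_deg=25.0, convergence_shift_deg=0.0):
    """Horizontal FOV bounds (in alpha) for the left eye, rotated by convergence."""
    return (-base_half_fov_deg + convergence_shift_deg,
            base_half_fov_deg + convergence_shift_deg)

print(left_eye_fov_bounds())             # (-25.0, 25.0): no convergence, Figs. 11A/11C
print(left_eye_fov_bounds(25.0, 20.0))   # (-5.0, 45.0): converged, Figs. 11B/11D
```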
Fig. 12 illustrates reconstructing various stereoscopic images from previously acquired wide-angle stereoscopic images. 1200 illustrates acquiring an image from a stereoscopic camera system. The camera system is discussed in more detail in U.S. patent application 17/225,610, which is incorporated herein by reference in its entirety. 1201 illustrates the use of a first camera for a left eye viewing angle and a second camera for a right eye viewing angle. 1202 shows selecting a field of view of a first camera based on a left eye perspective and selecting a field of view of a second camera based on a right eye perspective. In a preferred embodiment, the selection will be performed by a computer (e.g., integrated into the head display unit) based on an eye-tracking system that tracks the eye movement of the user. It should also be noted that in the preferred embodiment there is also an inward image shift on the display closer to the nose during convergence, as taught in U.S. patent 10,712,837, and in particular fig. 15A, 15B, 16A and 16B, which is incorporated herein by reference in its entirety. 1203 shows presenting a left eye field of view to a left eye of a user and a right eye field of view to a right eye of the user. There are various options in this context. First, a composite stereo image pair is used, wherein a left eye image is generated by at least two lenses (e.g., a first optimized for close-up imaging and a second optimized for remote imaging), and wherein a right eye image is generated by at least two lenses (e.g., a first optimized for close-up imaging and a second optimized for remote imaging). When the user is looking at a near object, a stereoscopic image pair is presented in which the near object is in focus and the far object is out of focus. When the user is looking at a distant object, a stereoscopic image pair is presented in which the near object is out of focus and the distant object is in focus. Second, various display devices (e.g., augmented reality, virtual reality, mixed reality displays) are used.
Fig. 13A shows a top view of a home theater. 1300 shows a user. 1301 shows a projector. 1302 shows a screen. Note that this immersive home theater display has a field of view that is larger than the field of view of user 1300. For example, if user 1300 is looking straight ahead, the home theater will display a horizontal FOV greater than 180 degrees. Thus, the FOV of the home theater will completely cover the horizontal FOV of the user. Similarly, if the user is looking straight ahead, the home theater will display a vertical FOV greater than 120 degrees. Thus, the FOV of the home theater will completely cover the vertical FOV of the user. An AR/VR/MR headset may be used in conjunction with the system, but is not required. Inexpensive anaglyph or disposable color glasses may also be used. Conventional IMAX polarized projectors may be used with IMAX disposable polarized glasses. The size of the home theater may vary. Walls of home theatres can be constructed using white reflective panels and frames. The projector will have multiple heads to cover a larger field of view.
Fig. 13B shows a side view of the home theater shown in fig. 13A. 1300 shows a user. 1301 shows a projector. 1302 shows a screen. Note that this immersive home theater display has a field of view that is larger than the field of view of user 1300. For example, if user 1300 is looking forward on the couch, the home theater will display a vertical FOV greater than 120 degrees. Thus, the FOV of the home theater will completely cover the FOV of the user. Similarly, if the user is looking straight ahead, the home theater will display a horizontal FOV greater than 120 degrees. Thus, the FOV of the home theater will completely cover the FOV of the user.
Fig. 14A shows a top view of a home theater. 1400A shows a first user. 1400B shows a second user. 1401 shows a projector. 1402 shows a screen. Note that the immersive home theater display has a field of view that is greater than the FOV of the first user 1400A or the second user 1400B. For example, if the first user 1400A is looking straight ahead, the first user 1400A will see a horizontal FOV greater than 180 degrees. Thus, the FOV of the home theater will completely cover the horizontal FOV of the user. Similarly, if the first user 1400A is looking straight ahead, the home theater will display a vertical FOV greater than 120 degrees, as shown in fig. 14B. Thus, the FOV of the home theater will completely cover the vertical FOV of the user. An AR/VR/MR headset may be used in conjunction with the system, but is not required. Inexpensive anaglyph or polarized glasses may also be used. Conventional IMAX polarized projectors may be used with IMAX disposable polarized glasses. The size of the home theater may vary. Walls of home theaters can be constructed using white reflective panels and frames. The projector will have multiple heads to cover a larger field of view.
Fig. 14B shows a side view of the home theater as shown in fig. 14A. 1400A shows a first user. 1401 shows a projector. 1402 shows a screen. Note that this immersive home theater display has a field of view that is larger than the field of view of the first user 1400A. For example, if user 1400A is looking forward on the couch, the user will see a vertical FOV of greater than 120 degrees. Thus, the FOV of the home theater will completely cover the FOV of the first user 1400A. Similarly, if the first user 1400A is looking straight ahead, the home theater will display a horizontal FOV greater than 120 degrees. Thus, the FOV of the home theater will completely cover the FOV of the first user 1400A.
Typically, a high-resolution display has 4000 pixels over a distance of 1.37 meters, which corresponds to 10×10⁶ pixels per 1.87 square meters. Consider the data for a hemispherical theater. Assume the hemispherical cinema has a radius of 2 meters. The surface area of the hemisphere is 2 × π × r², equal to (4)(3.14)(2²), or 50.24 m². Assuming the desired spatial resolution equals that of a typical high-resolution display, this would be (50.24 m²)(10×10⁶ pixels per 1.87 m²), or 4.29 billion pixels. Assume a frame rate of 60 frames per second. This corresponds to 26 times the data size of a standard 4K monitor.
Some embodiments include constructing a home theater that matches the geometry of the projector. The preferred embodiment is sub-spherical (e.g., hemispherical). One low-cost construction is to use a reflective surface that is tiled with a multi-head projector. In some embodiments, the field of view includes a spherical coverage of 4π steradians. This may be achieved by an HDU. In some embodiments, the field of view includes a sub-spherical coverage of at least 3π steradians. In some embodiments, the field of view includes a sub-spherical coverage of at least 2π steradians. In some embodiments, the field of view includes a sub-spherical coverage of at least 1π steradians. In some embodiments, the field of view includes a sub-spherical coverage of at least 0.5π steradians. In some embodiments, the field of view includes a sub-spherical coverage of at least 0.25π steradians. In some embodiments, the field of view includes a sub-spherical coverage of at least 0.05π steradians. In some embodiments, a sub-spherical IMAX system is created to improve the theater experience for many viewers. The chairs would be placed in positions similar to a standard cinema, but the screen would be sub-spherical. In some embodiments, non-spherical shapes may also be used.
Fig. 15A shows time point #1, where the user looks straight ahead, seeing a field of view of about 60 degrees horizontally and 40 degrees vertically with fairly accurate perception (e.g., the user can see the shape and color of the peripheral FOV).
Fig. 15B shows the central portion of the TV and the field of view observed by the user at time point # 1. Note that in some embodiments, the data will be streamed (e.g., over the internet). Note that one innovative feature of this patent is referred to as "viewing parameter directed streaming". In this embodiment, the viewing parameters are used to direct the streamed data. For example, if user 1500 is looking straight ahead, the first set of data will be streamed to correspond with the direct view perspective of user 1500. However, if the user is looking to the side of the screen, the second set of data will be streamed to correspond to the side view perspective of user 1500. Other viewing parameters that can control viewing angle include, but are not limited to, the following: vergence of the user; the head position of the user; the head direction of the user. In a broad sense, any feature (age, gender, preference) or action (perspective, location, etc.) of the user may be used to guide streaming. Note that another innovative feature is streaming of at least two image qualities. For example, a first image quality (e.g., high quality) will be streamed according to a first parameter (e.g., within a 30 ° horizontal FOV and a 30 ° vertical FOV of a user). And, a second image quality (e.g., lower quality) that does not meet the criteria (e.g., is not within the user's 30 ° horizontal FOV and 30 ° vertical FOV) will also be streamed. Surround sound will be implemented in the system.
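A sketch of this two-quality streaming decision is shown below. The "tiles" are an assumed packaging of the stored frame, each tagged with its yaw/pitch direction; the 30-degree window follows the example in the text, while everything else is illustrative.

```python
def choose_stream_quality(tiles, gaze_yaw_deg, gaze_pitch_deg,
                          hi_half_h_deg=15.0, hi_half_v_deg=15.0):
    """Mark tiles inside ~30 x 30 degrees of the gaze as high quality, the rest low."""
    plan = {}
    for tile_id, yaw_deg, pitch_deg in tiles:
        inside = (abs(yaw_deg - gaze_yaw_deg) <= hi_half_h_deg
                  and abs(pitch_deg - gaze_pitch_deg) <= hi_half_v_deg)
        plan[tile_id] = "high" if inside else "low"
    return plan

# Three example tiles: straight ahead, 40 degrees left, 10 degrees right.
tiles = [("center", 0, 0), ("left", -40, 0), ("near_right", 10, 5)]
print(choose_stream_quality(tiles, gaze_yaw_deg=0, gaze_pitch_deg=0))
# {'center': 'high', 'left': 'low', 'near_right': 'high'}
```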
Fig. 15C shows time point #2, where the user is looking to the left of the screen, seeing a field of view of about 60 degrees horizontally and 40 degrees vertically with fairly accurate perception (e.g., the user can see the shape and color of the peripheral FOV).
Fig. 15D shows the portion of the TV and the field of view observed by the user at time point #2, which differs from fig. 15B; the region of interest has shifted relative to time point #1. In some embodiments, the user is provided with more detail and higher resolution for objects within a small FOV within the scene. Outside this high resolution field of view region, lower resolution image quality may be presented on the screen.
Fig. 15E shows a time point #3 in which the user is looking to the right of the user's screen.
Fig. 15F shows the portion of the TV and the field of view observed by the user at time point #3, which appears as a roughly circular high resolution FOV.
Fig. 16A shows an image without zooming. 1600 shows an image. 1601A shows a box representing an area within the image 1600 that is set to zoom in.
Fig. 16B shows digital magnification of a portion of an image. This may be accomplished by the method described in U.S. patent 8,384,771 (e.g., 1 pixel becomes 4 pixels), which is incorporated by reference in its entirety. Note that the region to be enlarged may be selected by various user inputs, including: a gesture tracking system; an eye movement tracking system; and a graphical user interface (GUI). Note that the region within the image 1601A shown in fig. 16A is now enlarged, as shown in 1601B. Note that the resolution of the 1601B region is equal to the resolution of the image 1600, but the region is displayed larger. Note that 1600B shows an un-enlarged portion of 1600A. Note that 1601A is now enlarged, and that some portions of 1600A are no longer visualized.
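A minimal sketch of this kind of digital (pixel-replication) zoom is shown below; the region coordinates and zoom factor are illustrative, and each source pixel simply becomes a factor x factor block (1 pixel to 4 pixels when the factor is 2), so no new detail is created.

```python
import numpy as np

def digital_zoom(image, x0, y0, x1, y1, factor=2):
    """Enlarge the selected region of an H x W (or H x W x 3) array by pixel replication."""
    region = image[y0:y1, x0:x1]
    return np.repeat(np.repeat(region, factor, axis=0), factor, axis=1)
```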
Fig. 17A shows an image without zooming. 1700 shows an image. 1701A shows a box representing an area within the image 1700 where magnification is set.
Fig. 17B shows an optical-type magnification of a part of an image. Note that the region to be magnified may be selected by various user inputs, including: a gesture tracking system; an eye movement tracking system; and a Graphical User Interface (GUI). Note that the region within the image 1701A shown in fig. 17A is now enlarged, as shown in 1701B, and also note that the image within 1701B exhibits higher image quality. This can be achieved by selectively displaying the highest quality image in the region 1701B and magnifying the region 1701B. Not only is the cloud in the image larger, its resolution is also better. Note that 1700B shows the portion of 1700A that is not magnified (note that some portions of 1700A that are not magnified are now covered by the magnified region).
Fig. 18A shows a single resolution image. 1800A shows an image. 1801A shows a box representing an area within the image 1800A that is set to be of increased resolution.
Fig. 18B shows a multi-resolution image. Note that the area of increased resolution may be selected by various user inputs, including: a gesture tracking system; an eye movement tracking system; and a Graphical User Interface (GUI), including a joystick or controller. Note that the region within the image 1801A represented in fig. 18A is now displayed at a higher resolution, as shown at 1801B. In some embodiments, the image within 1801B may also be changed in other ways (e.g., different color schemes, different brightness settings, etc.). This may be accomplished by selectively displaying higher (e.g., highest) quality imagery in region 1801B without zooming in on region 1801B.
Fig. 19A shows a large field of view in which a first user is looking at a first portion of an image and a second user is looking at a second portion of the image. 1900A is a large field of view with a first resolution. 1900B is the position the first user is looking at, which is set to become high resolution as shown in fig. 19B. 1900C is the position the second user is looking at, and is set to become high resolution as shown in fig. 19B.
Fig. 19B shows that only the first portion of the image in fig. 19A and the second portion of the image in fig. 19A are at high resolution, while the rest of the image is at low resolution. 1900A is a large field of view with a first resolution (low resolution). 1900B is the location of the high resolution area of the first user, which has a second resolution (in this case, high resolution). 1900C is the location of the high resolution area of the second user, which has the second resolution (high resolution in this example). Thus, the first high resolution area is for the first user. And, the second high resolution region may be for a second user. As shown in fig. 14A and 14B, the system may be useful for home theater displays.
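The sketch below illustrates the Fig. 19A/19B idea at the pixel level: the stored frame is downsampled everywhere except in a window around each viewer's gaze point, so only those windows carry full resolution. The patch size, downsampling factor, and gaze-point format are assumptions for this example.

```python
import numpy as np

def multi_user_composite(frame, gaze_points, half_size=200, downsample=8):
    """Keep full resolution only around each (x, y) gaze point; downsample the rest."""
    h, w = frame.shape[:2]
    low = frame[::downsample, ::downsample]
    out = np.repeat(np.repeat(low, downsample, axis=0), downsample, axis=1)[:h, :w]
    for x, y in gaze_points:
        x0, x1 = max(0, x - half_size), min(w, x + half_size)
        y0, y1 = max(0, y - half_size), min(h, y + half_size)
        out[y0:y1, x0:x1] = frame[y0:y1, x0:x1]   # paste the full-resolution window
    return out
```

With two gaze points (one per user), the result matches the layout of fig. 19B: two high-resolution regions within an otherwise lower-resolution field of view.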
Fig. 20A shows a low resolution image.
Fig. 20B shows a high resolution image.
Fig. 20C shows a composite image. Note that the composite image has a first portion 2000 at low resolution and a second portion 2001 at high resolution. This is described in U.S. patent application Ser. No. 16/893,291, the entire contents of which are incorporated herein by reference. The first portion is determined by a viewing parameter (e.g., viewing angle) of the user. One innovation is near real-time streaming of a first portion 2000 having a first image quality and a second portion having a second image quality. Note that the display of the first portion may differ from that of the second portion. For example, the first portion and the second portion may differ in visual presentation parameters, including brightness, color scheme, or other parameters. Thus, in some embodiments, a first portion of an image may be compressed while a second portion of the image is not compressed. In other embodiments, the composite image is generated for display to the user by stitching together some high resolution images and some low resolution images. In some embodiments, portions of the large image (e.g., 4.29 billion pixels) are high resolution and portions of the large image are low resolution. The high resolution portion of the large image will be streamed according to the user's viewing parameters (e.g., convergence point, viewing angle, head angle, etc.).
Fig. 21 illustrates a method and process for performing near real-time streaming of custom images.
With respect to display 2100, the display includes, but is not limited to, the following: a large TV; an extended reality display (e.g., an augmented reality, virtual reality, or mixed reality display); an on-screen projector system; a computer display; etc. One key component of the system is the ability to track where in the image the user is looking and what the user's viewing parameters are.
Regarding viewing parameters 2101, the viewing parameters include, but are not limited to, the following: viewing angle; vergence/convergence; and user preferences (e.g., objects of particular interest, or filtering: some objects rated "R" may be filtered out for a particular user, etc.).
With respect to cloud 2102, each frame in a movie or video may be very large data (particularly if the home theater shown in figs. 14A and 14B is used in conjunction with a camera cluster described in U.S. patent application 17/225,610, the entire contents of which are incorporated herein by reference). Note that the cloud refers to memory, a database, or the like. Note that the cloud is capable of cloud computing. One innovation of this patent is to send the viewing parameters of the user(s) to the cloud, process the viewing parameters in the cloud (e.g., select a field of view or a synthetic stereo image pair as discussed in fig. 12), and determine which portions of the oversized data to stream to optimize the experience of the individual user. For example, multiple users may view their movies simultaneously. Each user streams (2103) the personalized, optimized data for that particular point in time from the cloud onto their mobile device. Each user then views the respective optimized data on his or her own device. This results in an improved immersive viewing experience. For example, assume that at a single point in time there is a dinner scene with a ceiling light, a dog, an elderly person, a bookcase, a long table, carpeting, and wall decorations. The user named Dave may be watching the dog, and Dave's image will be optimized (e.g., the image of the dog with maximum resolution and optimized color is streamed to Dave's mobile device and displayed on Dave's HDU). The user named Kathy may be looking at the ceiling light, and Kathy's image will be optimized (e.g., the image of the ceiling light with maximum resolution and optimized color is streamed to Kathy's mobile device and displayed on Kathy's HDU). Finally, the user named Bob may be watching the elderly person, and Bob's image will be optimized (e.g., the image of the elderly person with maximum resolution and optimized color is streamed to Bob's mobile device and displayed on Bob's HDU). Note that the cloud stores a huge data set at each point in time, but only parts of the data set are streamed, and those parts are determined by the viewing parameters and/or preferences of the user. Thus, the bookcase, long table, carpet, and wall decorations may all be within the fields of view of Dave, Kathy, and Bob, but these objects are not optimized for display (e.g., the highest possible resolution of these images stored in the cloud is not streamed).
Finally, the concept of a look-ahead "pioneer" is introduced. If it is predicted that an upcoming scene may cause the viewing parameters of a particular user to change (e.g., the user turns around), additional image frames may be streamed in advance. For example, if the movie is at time 1:43:05 and a dinosaur will jump out from the left side of the screen at 1:43:30, the entire scene may be downloaded in a low resolution format, and additional data sets for selected portions of the FOV may be downloaded as needed (e.g., based on the user's viewing parameters, or based on the prediction that the user will look at the upcoming dinosaur). Thus, the jumping-out dinosaur will always be at its maximum resolution. This technique creates a more immersive and improved viewing experience.
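One way to express this look-ahead behaviour is sketched below, assuming the content carries metadata about upcoming events; the event list, lead time, and region identifiers are all hypothetical.

```python
def prefetch_plan(current_time_s, upcoming_events, lead_time_s=25.0):
    """Return the high-resolution regions to fetch early.

    `upcoming_events` is an assumed list of (event_time_s, region_id) pairs
    taken from the content's metadata (e.g., 'dinosaur enters from screen left')."""
    to_prefetch = []
    for event_time_s, region_id in upcoming_events:
        if 0.0 <= event_time_s - current_time_s <= lead_time_s:
            to_prefetch.append(region_id)
    return to_prefetch

# Movie time 1:43:05 (6185 s); a dinosaur enters from screen left at 1:43:30 (6210 s).
print(prefetch_plan(6185.0, [(6210.0, "screen_left_high_res")]))
# ['screen_left_high_res']
```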
Fig. 22A illustrates the use of rear resection in conjunction with a stereo camera. Camera #1 has a known location (e.g., latitude and longitude from GPS). The distance (2 miles) and direction (330°, toward the north-west) from camera #1 to object 2200 are known, so the position of object 2200 can be calculated. Camera #2 has an unknown position, but a known distance (1 mile) and direction (30°, toward the north-east) to object 2200. Since the position of object 2200 can be calculated, the geometry can be solved and the position of camera #2 can be determined.
Fig. 22B illustrates the use of rear intersection in conjunction with stereo cameras. Camera #1 and camera #2 have known positions (e.g., latitude and longitude from GPS). The direction from camera #1 to object 2200B (330°, toward the north-west) is known. The direction from camera #2 to object 2200B (30°, toward the north-east) is known. The position of object 2200B can be calculated.
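Neither figure spells out the coordinate math, but both cases can be handled with a simple bearing-and-distance projection on a locally flat Earth, as sketched below. The GPS coordinates are made up, and the 69-miles-per-degree conversion is an approximation good enough over a few miles.

```python
import math

def project(lat_lon, bearing_deg, distance_miles):
    """Move `distance_miles` from a known point along a compass bearing (0 = north, clockwise)."""
    lat, lon = lat_lon
    d_north = distance_miles * math.cos(math.radians(bearing_deg))
    d_east = distance_miles * math.sin(math.radians(bearing_deg))
    dlat = d_north / 69.0                                  # ~69 miles per degree of latitude
    dlon = d_east / (69.0 * math.cos(math.radians(lat)))   # longitude degrees shrink with latitude
    return lat + dlat, lon + dlon

# Fig. 22A: camera #1 has a known GPS fix; the target is 2 miles away on a bearing
# of 330 degrees, which fixes the target's position.
camera1 = (35.0000, -80.0000)            # hypothetical coordinates
target = project(camera1, 330.0, 2.0)
# Camera #2 sees the same target 1 mile away on a bearing of 30 degrees, so its own
# position is found by projecting back from the target along the reciprocal bearing (210 degrees).
camera2 = project(target, 210.0, 1.0)
print(target, camera2)
```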
Fig. 23A shows a top view of a person looking forward toward the center of a home theater screen. The person 2300 is looking toward the center portion 2302B of the home theater screen 2301. During this point in time, streaming is customized to have an optimized center portion 2302B (e.g., highest possible resolution), a non-optimized left portion 2302A (e.g., low resolution or black), and a non-optimized right portion 2302C (e.g., low resolution or black). Note that for proper streaming, inputs from a monitoring system (which detects the user's viewing direction and other viewing parameters, such as gestures or facial expressions) or from a controller (which receives the user's commands) must be in place.
Fig. 23B shows a top view of a person looking toward the right portion of the home theater screen. The person 2300 is looking toward the right portion 2302C of the home theater screen 2301. During this point in time, streaming is customized to have an optimized right portion 2302C (e.g., highest possible resolution), a non-optimized left portion 2302A (e.g., low resolution or black), and a non-optimized middle portion 2302B (e.g., low resolution or black). Note that for proper streaming, inputs from a monitoring system (which detects the user's viewing direction and other viewing parameters, such as gestures or facial expressions) or from a controller (which receives the user's commands) must be in place.
Fig. 24 illustrates a method, system and apparatus for optimizing stereo camera settings during image acquisition in motion. 2400 illustrates determining the distance to an object at a point in time (e.g., using a laser rangefinder). An object tracking/target tracking system may be implemented. 2401 shows adjusting the zoom setting of the stereoscopic camera system so that it is optimized for the distance determined in step 2400; in a preferred embodiment, this is performed with an optical zoom lens, as opposed to digital zooming. 2402 shows adjusting the separation distance (stereo distance) between the stereo cameras so that it is optimized for the distance determined in step 2400. Note that there is also an option to adjust the orientation of the cameras so that it is optimized for the distance determined in step 2400. 2403 shows acquiring a stereoscopic image of the target at the point in time of step 2400. 2404 shows recording, viewing, and/or analyzing the acquired stereoscopic images.
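A hedged sketch of the Fig. 24 flow follows: from a measured target range, derive a zoom setting and a stereo camera separation before triggering capture. The 1/30 separation rule, the eye-spacing floor, and the zoom heuristic are common stereo rules of thumb used here as assumptions; they are not values taken from the patent.

def stereo_settings_for_range(distance_m, base_focal_mm=50.0):
    # Steps 2400-2402: map a rangefinder distance to camera settings.
    zoom_focal_mm = base_focal_mm * max(1.0, distance_m / 10.0)   # optical zoom, not digital
    separation_m = max(0.065, distance_m / 30.0)                  # ~1/30 rule, floored at eye spacing
    convergence_distance_m = distance_m                           # optional camera orientation (toe-in) target
    return {"zoom_focal_mm": zoom_focal_mm,
            "separation_m": separation_m,
            "convergence_distance_m": convergence_distance_m}

# Example: a target tracked at 60 m (step 2400) before acquiring the stereo pair (step 2403).
print(stereo_settings_for_range(60.0))
# -> {'zoom_focal_mm': 300.0, 'separation_m': 2.0, 'convergence_distance_m': 60.0}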

Claims (20)

1. A method, comprising:
uploading viewing parameters of a user to the cloud through the Internet;
wherein the cloud stores an image of the object,
wherein the cloud is capable of cloud computing, and
wherein the viewing parameter of the user comprises a viewing angle;
optimizing a user-specific display image in the cloud according to the image;
wherein the user-specific display image is based at least on the viewing parameter;
wherein the user-specific display image includes a first portion and a second portion;
wherein the first portion of the user-specific display image is different from the second portion of the user-specific display image;
wherein the first portion of the user-specific display image includes a first image quality;
wherein the first portion of the user-specific display image corresponds to the viewing angle;
wherein the second portion of the user-specific display image comprises a second image quality; and
wherein the second image quality is lower than the first image quality;
downloading the user-specific display image through the internet; and
displaying the user-specific display image to the user.
2. The method of claim 1, further comprising:
wherein the user-specific display image includes a first portion having a first spatial resolution and a second portion having a second spatial resolution; and
wherein the first spatial resolution is higher than the second spatial resolution.
3. The method of claim 1, further comprising, wherein the image comprises a video image.
4. The method of claim 1, further comprising:
wherein, in the user-specific display image, the first portion comprises a first zoom setting and the second portion comprises a second zoom setting; and
wherein the first zoom setting is higher than the second zoom setting.
5. The method of claim 4, further comprising, wherein the first portion is determined by at least one of the group consisting of:
the location of the user's body;
the direction of the user's body;
a gesture of a hand of the user;
facial expressions of the user;
the position of the user's head; and
the direction of the user's head.
6. The method of claim 4, further comprising wherein the first portion is determined by a graphical user interface.
7. The method of claim 1, further comprising:
wherein the image comprises a first field of view (FOV);
wherein the user-specific display image includes a second FOV; and
wherein the first FOV is greater than the second FOV.
8. The method of claim 1, further comprising:
wherein the image comprises a stereoscopic image; and
the stereoscopic image is obtained through a stereoscopic camera or a stereoscopic camera cluster.
9. The method of claim 1, further comprising, wherein the image comprises a stitched image, wherein the stitched image is generated by at least two cameras.
10. The method of claim 1, further comprising:
wherein the image comprises a composite image;
wherein the composite image is generated by:
capturing a first image of a scene using a first set of camera settings, wherein the first set of camera settings causes a first object to be in focus and a second object to be out of focus; and
a second image of the scene is captured using a second set of camera settings, wherein the second set of camera settings causes the second object to be in focus and the first object to be out of focus.
11. The method of claim 10, further comprising, wherein:
when the user looks at the first object, the first image is presented to the user; and
when the user looks at the second object, the second image is presented to the user.
12. The method of claim 10, further comprising combining at least the first object from the first image and the second object from the second image into the composite image.
13. The method of claim 1, further comprising wherein the view angle is movable by the user.
14. The method of claim 1, further comprising, wherein the viewing parameter comprises convergence.
15. The method of claim 1, further comprising wherein the user-specific image is a 3D image, wherein the 3D image is presented on a head display unit (HDU).
16. The method of claim 15, further comprising wherein the viewing angle is determined by a direction of the HDU.
17. A method, comprising:
determining a viewing parameter of a user, wherein the viewing parameter of the user comprises a viewing angle;
transmitting the viewing parameters of the user to a cloud through the internet, wherein the cloud is capable of cloud computing;
wherein the cloud computing generates a user-specific display image from images stored on the cloud;
wherein the user-specific display image is based at least on the viewing parameters of the user;
wherein the user-specific display image includes a first portion and a second portion,
wherein the first portion of the user-specific display image is different from the second portion of the user-specific display image;
wherein the first portion of the user-specific display image includes a first image quality;
wherein the first portion of the user-specific display image corresponds to the viewing angle;
wherein the second portion of the user-specific display image comprises a second image quality; and
wherein the second image quality is lower than the first image quality;
receiving the user-specific display image through the internet; and
displaying the user-specific display image on a head display unit (HDU), wherein the HDU includes a left eye display and a right eye display.
18. The method of claim 1, further comprising wherein the user-specific display image is presented to the user on a display, wherein the user has a field of view of at least 0.5 pi steradians.
19. The method of claim 18, further comprising, wherein the display comprises at least one of the group consisting of:
a screen and a projector;
a TV; and
and a monitor.
20. A method, comprising:
receiving viewing parameters of a user at the cloud through the Internet;
wherein the viewing parameter of the user comprises a viewing angle;
wherein the cloud is capable of cloud computing;
generating a user-specific display image from an image stored on the cloud using cloud computing;
wherein the user-specific display image is based at least on the viewing parameters of the user;
wherein the user-specific display image includes a first portion and a second portion;
wherein the first portion of the user-specific display image is different from the second portion of the user-specific display image;
wherein the first portion of the user-specific display image includes a first image quality;
wherein the first portion of the user-specific display image corresponds to the viewing angle;
wherein the second portion of the user-specific display image comprises a second image quality; and
wherein the second image quality is lower than the first image quality;
sending the user-specific display image to a head display unit (HDU) through the internet,
wherein the HDU comprises a left eye display and a right eye display;
wherein the HDU displays the user-specific display image.
CN202280030471.XA 2021-04-22 2022-04-21 Immersive viewing experience Pending CN117321987A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/237,152 US11589033B1 (en) 2021-02-28 2021-04-22 Immersive viewing experience
US17/237,152 2021-04-22
PCT/US2022/025818 WO2022226224A1 (en) 2021-04-22 2022-04-21 Immersive viewing experience

Publications (1)

Publication Number Publication Date
CN117321987A true CN117321987A (en) 2023-12-29

Family

ID=83723167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280030471.XA Pending CN117321987A (en) 2021-04-22 2022-04-21 Immersive viewing experience

Country Status (4)

Country Link
EP (1) EP4327552A1 (en)
JP (1) JP2024518243A (en)
CN (1) CN117321987A (en)
WO (1) WO2022226224A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8228327B2 (en) * 2008-02-29 2012-07-24 Disney Enterprises, Inc. Non-linear depth rendering of stereoscopic animated images
US10551993B1 (en) * 2016-05-15 2020-02-04 Google Llc Virtual reality content development environment
EP3337154A1 (en) * 2016-12-14 2018-06-20 Thomson Licensing Method and device for determining points of interest in an immersive content
US20200371673A1 (en) * 2019-05-22 2020-11-26 Microsoft Technology Licensing, Llc Adaptive interaction models based on eye gaze gestures
US11206364B1 (en) * 2020-12-08 2021-12-21 Microsoft Technology Licensing, Llc System configuration for peripheral vision with reduced size, weight, and cost

Also Published As

Publication number Publication date
WO2022226224A1 (en) 2022-10-27
JP2024518243A (en) 2024-05-01
EP4327552A1 (en) 2024-02-28

Similar Documents

Publication Publication Date Title
US9842433B2 (en) Method, apparatus, and smart wearable device for fusing augmented reality and virtual reality
US20200288113A1 (en) System and method for creating a navigable, three-dimensional virtual reality environment having ultra-wide field of view
US9137524B2 (en) System and method for generating 3-D plenoptic video images
US9268406B2 (en) Virtual spectator experience with a personal audio/visual apparatus
US8743187B2 (en) Three-dimensional (3D) imaging based on MotionParallax
WO2016009864A1 (en) Information processing device, display device, information processing method, program, and information processing system
US20060114251A1 (en) Methods for simulating movement of a computer user through a remote environment
WO2012166593A2 (en) System and method for creating a navigable, panoramic three-dimensional virtual reality environment having ultra-wide field of view
US10665034B2 (en) Imaging system, display apparatus and method of producing mixed-reality images
CA2794949C (en) Viewing system and viewing method
JP6628343B2 (en) Apparatus and related methods
US20180124374A1 (en) System and Method for Reducing System Requirements for a Virtual Reality 360 Display
US20230215079A1 (en) Method and Device for Tailoring a Synthesized Reality Experience to a Physical Setting
WO2020206647A1 (en) Method and apparatus for controlling, by means of following motion of user, playing of video content
CN210605808U (en) Three-dimensional image reconstruction system
US11287881B2 (en) Presenting images on a display device
CN117321987A (en) Immersive viewing experience
US11589033B1 (en) Immersive viewing experience
US11366319B1 (en) Immersive viewing experience
DeHart Directing audience attention: cinematic composition in 360 natural history films
Nowatzyk et al. Omni-Directional Catadioptric Acquisition System
Lancelle Visual Computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination