Head tracked autostereoscopic display
The invention relates to a method for providing a three-dimensional (3-D) view of a scene, where said scene is captured by M-camera views, said scene is displayed by N-display views, and said M-camera views are assigned to said N-display views to provide a viewing zone displaying said scene. The invention further relates to a system for providing a 3-D autostereoscopic view of a scene with means for providing M-camera views, means for providing N-display views to a display device, and selection means for assigning said M-camera views to said N-display views to provide a viewing zone displaying said scene.
Current 3-D technology provides 3-D views by using head-tracked or parallax displays. Parallax displays offer viewing zones in which free 3-D viewing is possible. These displays provide stereo viewing as well as limited motion parallax within a limited viewing zone. Stereo viewing means presenting a different image to each eye, which causes a 3-D effect. Motion parallax provides different perspective views when the eyes move parallel to the display, e.g. from left to right and vice versa. Parallax displays usually provide several viewing zones located adjacent to each other. Free viewing is possible within each zone, but at zone transitions viewing is distorted. Motion parallax on parallax displays is instantaneous and exact within each zone, similar to real-world objects. The drawback is that moving from one zone to the next does not provide a new perspective onto the scene. Instead, there is only one set of perspectives. This set of perspectives is provided by different camera views, which are re-used within each zone; the view is not altered between the zones. The only effects occurring upon a zone transition are image distortion and image reset. Further known displays are tracked displays. For these displays, the position of a viewer's eyes is tracked, and the image to be displayed is directed to the viewer's eyes according to their tracked position. A viewing zone in the vicinity of the eyes is created to display the corresponding view of the scene. Tracked displays have extremely small viewing zones, on the order of the size of the eyes. The zones are positioned within the viewing space.
When the viewer moves his head, the position of his eyes is tracked. Via said eye tracking, the viewing zones are aimed at the viewer's eyes and moved accordingly. Basically, tracked displays only offer stereo viewing. This means that only two different camera views are used to create the image of the scene, and these two camera views are always directed to the viewer's eyes. No matter where the viewer is positioned, he always sees the same perspective of the scene. However, when the images for the right and left eye are rendered adaptively, based on the viewer's position, motion parallax may also be achieved. Motion tracking within current displays suffers from latency and noise in the viewer tracker, as tracking has to be very accurate due to the small viewing zone around the viewer's eyes. Subsequent adaptive rendering, as previously described, leads to further latency. Noise from the tracker gives the scene a trembling appearance. Latency gives the viewer an unnatural, elastic impression of the scene when he moves around to observe the motion parallax effect. Furthermore, the viewing zones of conventional motion parallax displays are in the range of about 20-30 cm. When the viewer moves to a new zone, the image is repeated within the new zone. The optics of tracked displays provide viewing zones of about 2-3 cm. With tracking, these zones can cover a range of 1-2 m, but then either no motion parallax is available or there is latency due to image rendering. The tracking accuracy has to be in the range of the viewing zone, i.e. 2-3 cm. This leads to challenges in optics calibration: tracking errors larger than 2-3 cm lead to severe image distortion, as the image is not focused within the viewer's eyes.
The invention tries to overcome these drawbacks. It is thus an object of the invention to provide a large viewing zone without tracking latency. It is a further object of the invention to allow lower tracking accuracy. Another object of the invention is to allow easy tracking of the viewer's position. Yet another object of the invention is to allow motion parallax within the full range of perspective views. Another object of the invention is to allow continuous motion parallax without rendering latency. These and other objects of the invention are solved by a method where a selection of N-camera views of said M-camera views is assigned to said N-display views to provide a viewing zone displaying said scene, a position of a viewer's head is tracked by tracking means, said selection of said N-camera views is assigned to said N-display views according to said tracked position of said viewer's head such that said tracked position of
said viewer's head is centered within said viewing zone. By providing said N-camera views to said N-display views, viewing zones may be established within viewing distance. Each viewing zone may then provide different view projections, according to the corresponding camera views. By combining parallax display technology with head tracking, it becomes possible to shift the viewing zone according to the viewer's head position. The content of the zone may be rendered viewpoint-adaptively. According to the viewer's head position, the N-display views are fed by appropriate views of an M-view camera array. The assignment of the camera views is done in a viewpoint-adaptive manner, such that the viewer is centered as much as possible in one zone. The view projections may thus be shifted according to the viewer's head position. A set of N view projections may constitute said viewing zones. The advantage of a method according to the invention is that the viewing zone is large. Further, only angular measurements are required for tracking. Angular measurements are considerably easier to acquire, e.g. via a single tracking camera, than the x, z measurements that require stereo or range cameras in tracked displays. To allow centering the viewer within the moving zones and to provide zone-specific content via viewpoint-adaptive provision of camera views, a method according to claim 2 is preferred. According to this embodiment, the selection of camera views and the assignment of these camera views to the display views to provide the image of the scene is viewpoint-adaptive. According to the viewer's head position, different camera views are selected. By selecting appropriate camera views out of an array of different camera views, different view projections are created within the viewing zones. Furthermore, the viewer's head may be centered within the zone.
By providing the moving zone together with centering the viewer's head within the zone, motion parallax is available in the full range of perspective views. Motion parallax is instantaneous, without elasticity, regardless of the latency of any rendering. Motion parallax is also free from tracker noise. Continuous motion parallax may be provided, while still enabling acquisition of image views with a discrete set of cameras. With this embodiment, there is still non-graceful degradation of the image quality whenever the tracker makes errors larger than the angular size of a viewing zone of the parallax display. This size corresponds to about 20-30 cm. Tracker noise can thus be kept far below this limit. Therefore, errors only appear when the viewer moves through the full viewing range at high speed.
As the system is resilient to any tracking error below the angular size of the viewing zone, latency errors may be traded off against noise errors and vice versa. This trade-off may assure that both errors remain below their limit. An implementation of this trade-off may be based on linear smoothing/prediction filters operating on the tracking coordinates. According to claim 3, the camera views are fixed. Even though the camera positions are fixed, the viewpoint-adaptive selection of camera views allows centering the viewer's head within a zone as much as possible. This allows acquisition with a real camera array, where the camera positions are fixed. The view projections within a viewing zone may then show the same camera views, but the sequence of view projections may be altered according to the viewer's position, allowing the viewer to be centered in the zone. Motion parallax is possible in the case of a method according to claim 4. The selection of N display views out of M camera views, when M is greater than N, enables displaying different perspectives according to the viewer's head position. In case M equals N, motion parallax is lost, but a stereo view without noise and latency may still be acquired. According to claim 5, the camera views are captured by real cameras or by rendering. Rendering allows creating views from a computer model of an object. Real cameras allow creating different views of a real-world object. The acquisition with real cameras according to claim 6 allows using conventional cameras, which reduces the cost of implementing such a method. The method according to claim 7 may enable motion parallax and stereo viewing even when the viewer moves horizontally and vertically parallel to the display.
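Purely by way of illustration, and not as part of the claimed method, such a linear smoothing/prediction filter on a one-dimensional tracking coordinate may be sketched as follows. The function name and the coefficient alpha are assumptions made for this sketch: a small alpha suppresses tracker noise at the cost of added latency, while the one-step velocity prediction compensates part of that latency.

```python
def make_tracker_filter(alpha=0.3):
    """Exponential smoothing with one-step linear prediction.

    alpha trades noise suppression (small alpha) against latency
    (large alpha); the goal is to keep both residual errors below
    the angular size of one viewing zone.
    """
    state = {"pos": None, "vel": 0.0}

    def update(measured):
        if state["pos"] is None:
            # first sample: no history, pass it through
            state["pos"] = measured
            return measured
        prev = state["pos"]
        # smooth the noisy measurement
        smoothed = alpha * measured + (1 - alpha) * prev
        # estimate velocity and predict one step ahead to offset latency
        state["vel"] = alpha * (smoothed - prev) + (1 - alpha) * state["vel"]
        state["pos"] = smoothed
        return smoothed + state["vel"]

    return update
```

For a stationary viewer the filter settles on the measured coordinate; for a moving viewer the velocity term pushes the estimate ahead of the smoothed (and therefore lagging) position.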
Another aspect of the invention is a system for providing a 3-D autostereoscopic view of a scene with means for providing M-camera views, means for providing N-display views to a display device, selection means for assigning a selection of N-camera views out of said M-camera views to said N-display views to provide a viewing zone displaying said scene, tracking means for tracking the viewer's head position, and said selection means enabling the assignment of said selection of N-camera views to said N-display views according to the tracked viewer's head position such that said tracked viewer's head position is centered within said viewing zone. The system of claim 9 reduces implementation costs, as conventional cameras may be used. The system of claim 10 allows displaying the views with conventional display devices.
The system of claim 11 allows displaying a geometrically correct image without geometric corrections. The geometry of the acquisition and display systems may differ from an exact match, e.g. in overall scale, or in further geometrical parameters, e.g. translation or skew. Each difference influences the geometrical correctness of the visualized 3-D scene, but this often hardly affects the perceptual 3-D effect. These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
Fig. 1 shows a 5-view display with 5-view viewing zones; Fig. 2 shows a 5-view display with a 5-view viewing area; Fig. 3 shows a 5-view display according to the invention;
Fig. 1 depicts a spatial light modulator 12 with a 5-view lenticular screen 14.
The display 12, 14 directs different views to the viewer in a cone-shaped manner. The 5-view lenticular 14 is capable of directing 5 different images into 5 different directions in space, thus creating viewing zones with 5 view projections each. Between the intersections of the different view projections, one perspective view of the scene is displayed. An eye located exactly within such an intersection observes this particular view. All 5 different perspective views from 5 different camera views create a 5-view viewing zone, in which 3-D viewing is ensured. The 5-view viewing zones are repeated, each zone showing the same content. When the viewer moves, he will move out of one viewing zone and later enter the next viewing zone, which displays a repetition of the previous viewing zone. Further depicted are the viewing distance z and the distance d between consecutive view projections. Fig. 2 again depicts the spatial light modulator 12 and the lenticular screen 14. Further depicted are the 5-view viewing zone 16 and a 5-view viewing area 18. Also depicted are the x-direction x, y-direction y, and z-direction z, as well as the viewing angle φ, representing the angular position of the viewer's head with respect to the display. The viewing area 18 induced by the 5-view zone 16 is depicted. Each 5-view zone 16 automatically results in a 5-view viewing area that extends in the z-direction z. The shape of the 5-view viewing area follows from the width and position of the display 12, 14. From any viewpoint within the 5-view zone 16, the viewer observes one view projection of the scene at the display. From any other
point within the viewing area 18, the display shows a different combination of view projections of the scene, which results in fully consistent 3-D viewing. Fig. 3 depicts a system according to the invention. The inputs of the display i_d0-i_d4, representing the display views, are fed by images from different camera views i_C. The selection of the camera views i_C and the assignment of these camera views i_C to the display views is done according to the viewer's head position, allowing the viewer's head to be centered as well as possible within the viewing zone 16. In case the camera array comprises 5 cameras and is fixed, the camera views may be selected out of an array i_C ∈ [-2,+2]. The display views may be represented by i_d ∈ [0,4]. The center of the viewing zone may be calculated based on the tracked angular position of the viewer's head as

    i_zone,centre = [Z·tan(φ)/d],
where Z is the distance between the display and the viewing zones, d is the distance between two consecutive view projections, and φ is the viewer's angular position. The square brackets denote rounding to the nearest integer. The assignment of the selected camera views i_C to the 5 display views i_d is done via

    i_d = (i_C + i_zone,centre) mod 5

to assure that each display view is assigned a different camera view. In this case, the camera array is fixed. Within each zone the same image is displayed, but within a zone a 3-D effect is realized by providing 5 different view projections. In case the camera views are selected out of a larger camera array, e.g. more than 5 cameras, the selection of cameras may be done more adaptively, thus creating different perspective views within the zones. As depicted in Fig. 3, a camera array of multiple camera views i_C-2 - i_C6 is provided. The selection of these camera views changes according to the angular position of the viewer's head. The center camera can in this case be calculated by

    i_C,centre = [Z·tan(φ)/d].

The other selected cameras then follow from i_C ∈ [i_C,centre - 2, i_C,centre + 2].
The assignment of the selected camera views i_C to the 5 display views i_d is done via i_d = i_C mod 5.
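By way of illustration only, the two assignment rules for a 5-view display may be sketched in Python as follows. The function name and arguments are assumptions made for this sketch; Z, d, and φ have the meanings given above.

```python
import math

def assign_views(phi_deg, Z, d, larger_array=False):
    """Assign camera views i_C to the 5 display views i_d for a
    viewer at angular head position phi_deg (degrees).

    Z : distance between the display and the viewing zones
    d : distance between two consecutive view projections
    Returns a dict mapping display view i_d -> camera view i_C.
    """
    # [Z*tan(phi)/d]; square brackets = round to the nearest integer
    centre = round(Z * math.tan(math.radians(phi_deg)) / d)

    if not larger_array:
        # fixed 5-camera array: i_C in [-2, +2],
        # i_d = (i_C + i_zone,centre) mod 5
        return {(i_c + centre) % 5: i_c for i_c in range(-2, 3)}
    # larger array: select the 5 cameras around the centre camera,
    # i_C in [i_C,centre - 2, i_C,centre + 2], then i_d = i_C mod 5
    return {i_c % 5: i_c for i_c in range(centre - 2, centre + 3)}
```

For a viewer straight in front of the display (φ = 0), the fixed-array case maps camera 0 to display view 0 and cameras -2 and -1 to display views 3 and 4, so each display view receives a different camera view; as φ grows, the mapping shifts and, in the larger-array case, new cameras enter the selected window.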
The same holds for displays with different numbers of display views and camera views. The equations only have to be adapted to the new number of display views. Possibly, one or more plus or minus signs have to be replaced within the above equations. This depends on the numbering/direction conventions used in the display views, the camera views, and the tracking coordinates. Fig. 3 depicts the assignment of camera view i_C1 to display view i_d1, i_C2 to display view i_d2, i_C3 to display view i_d3, i_C4 to display view i_d4, and i_C5 to display view i_d0. In this case, the center of the 5-view viewing zone 16 is shifted according to the above-mentioned equations. The display shows a perspective view of the scene from this shifted position. By providing such a method, motion parallax together with stereo viewing is possible without tracking noise, without rendering latency, and at low cost due to the possible use of conventional cameras.