WO1998029860A1 - System and method for synthesizing three-dimensional video from a two-dimensional video source - Google Patents


Info

Publication number
WO1998029860A1
WO1998029860A1 (PCT/US1997/023941)
Authority
WO
WIPO (PCT)
Prior art keywords
field
dimensional video
frame
video stream
display device
Prior art date
Application number
PCT/US1997/023941
Other languages
French (fr)
Inventor
Amber C. Davidson
Loran L. Swensen
Original Assignee
Chequemate International Inc.
Priority date
Filing date
Publication date
Application filed by Chequemate International Inc. filed Critical Chequemate International Inc.
Priority to CA002276190A priority Critical patent/CA2276190A1/en
Priority to JP53023898A priority patent/JP2001507890A/en
Priority to EP97953466A priority patent/EP1012822A1/en
Priority to BR9713629-8A priority patent/BR9713629A/en
Priority to AU57206/98A priority patent/AU5720698A/en
Publication of WO1998029860A1 publication Critical patent/WO1998029860A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/15 Processing image signals for colour aspects of image signals
    • H04N13/161 Encoding, multiplexing or demultiplexing different image signal components
    • H04N13/167 Synchronising or controlling image signals
    • H04N13/189 Recording image signals; Reproducing recorded image signals
    • H04N13/194 Transmission of image signals
    • H04N13/20 Image signal generators
    • H04N13/204 Image signal generators using stereoscopic image cameras
    • H04N13/207 Image signal generators using stereoscopic image cameras using a single 2D image sensor
    • H04N13/218 Image signal generators using stereoscopic image cameras using a single 2D image sensor using spatial multiplexing
    • H04N13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H04N13/246 Calibration of cameras
    • H04N13/261 Image signal generators with monoscopic-to-stereoscopic image conversion
    • H04N13/286 Image signal generators having separate monoscopic and stereoscopic modes
    • H04N13/296 Synchronisation thereof; Control thereof
    • H04N13/30 Image reproducers
    • H04N13/302 Image reproducers for viewing without the aid of special glasses, i.e. using autostereoscopic displays
    • H04N13/332 Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N13/334 Displays for viewing with the aid of special glasses or head-mounted displays [HMD] using spectral multiplexing
    • H04N13/337 Displays for viewing with the aid of special glasses or head-mounted displays [HMD] using polarisation multiplexing
    • H04N13/341 Displays for viewing with the aid of special glasses or head-mounted displays [HMD] using temporal multiplexing
    • H04N13/344 Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
    • H04N13/363 Image reproducers using image projection screens
    • H04N13/398 Synchronisation thereof; Control thereof
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • This invention relates to systems and methods for processing and displaying video imagery. More specifically, this invention relates to systems and methods that receive a two-dimensional video signal and synthesize a three-dimensional video signal which is displayed on a display device.
  • Realistic three-dimensional video is useful in entertainment, business, industry, and research. Each area has differing requirements and differing goals. Some systems that are suitable for use in one area are totally unsuitable for use in other areas due to the differing requirements. In general, however, three-dimensional video imagery must be comfortable to view for extended periods of time without the viewing system imparting stress and eye strain. In addition, the system should be of sufficient resolution and quality to allow for a pleasing experience. Prior art systems, however, have not always accomplished these goals.
  • Any approach designed to produce three-dimensional video images relies on the ability to project a different video stream to each eye of the viewer.
  • The video streams contain visual clues that are interpreted by the viewer as a three-dimensional image.
  • Many different systems have been developed to present these two video streams to different eyes of an individual. Some systems utilize twin screen displays using passive polarized or differently colored viewing lenses and glasses that are worn by the viewer in order to allow each eye to perceive a different video stream.
  • Other approaches use field or frame multiplexing which utilizes a single display screen that quickly switches between the two video streams. These systems typically have a pair of shuttered glasses that are worn by an individual and the shutters alternately cover one eye and then the other in order to allow each eye to perceive a different video stream.
  • Some systems, such as those commonly used in virtual reality, use dual liquid crystal or dual CRT displays built into an assembly worn on the viewer's head. Other technologies include projection systems and various autostereoscopic systems that do not require the wearing of glasses.
  • Prior art systems that generate and display three-dimensional video imagery have typically taken one of two approaches.
  • The first approach has been to employ a binocular system, e.g., two lenses or two cameras, to produce two channels of visual information.
  • The spatial offset of the two channels creates a parallax effect that mimics the effect created by an individual's eyes.
  • The key factor in producing high quality stereoscopic video using two cameras is maintaining proper alignment of the two channels of image data.
  • The alignment of the camera lenses must be maintained, and the video signals generated by the cameras must maintain proper temporal alignment as they are processed by system electronics or optics. Misalignment will be perceived as distortion by a viewer.
  • Twin screen viewing systems are known to be particularly prone to misalignment, tend to be bulky and cumbersome, and tend to be rather expensive due to the cost of multiple displays.
  • Single screen solutions which multiplex fields or frames tend to minimize the problems associated with dual display monitors, yet these systems also rely on the accuracy of alignment of the input video data.
  • The second approach taken by various systems has been an attempt to convert an input two-dimensional video signal into a form that is suitable for stereoscopic display.
  • The problems of the prior art have been successfully overcome by the present invention, which is directed to systems and methods for synthesizing a simulated three-dimensional video image from a two-dimensional input video signal.
  • The present invention is relatively inexpensive, produces high quality video, and has high user tolerance.
  • The systems of the present invention do not rely on temporal shifting in order to create a simulated three-dimensional scene.
  • Certain embodiments may, however, use temporal shifting in combination with other processing to produce simulated three-dimensional video from a two-dimensional video source.
  • A traditional video source, such as an NTSC-compatible video source, is composed of a sequence of frames that are displayed sequentially to a user in order to produce a moving video image.
  • The frame rate for NTSC video is thirty frames per second.
  • Frames are displayed on a display device, such as a monitor or television, by displaying the individual horizontal scan lines of the frame on the display device.
  • Televisions have been designed to display the frame by interlacing two different fields. In other words, the television first displays all the odd numbered scan lines and then interlaces the even numbered scan lines in order to display a complete frame.
  • A frame is typically broken down into an even field, which contains the even numbered scan lines, and an odd field, which contains the odd numbered scan lines.
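  • The even/odd field split described above can be sketched as follows. This is a minimal illustration only, assuming the frame is held as a 2-D array of scan lines; the function and variable names are not from the patent:

```python
import numpy as np

def split_fields(frame):
    """Split an interlaced frame into its even and odd fields.

    Illustrative sketch: `frame` is assumed to be a 2-D array in which
    row i holds scan line i (numbered from zero).
    """
    even_field = frame[0::2]  # even-numbered scan lines
    odd_field = frame[1::2]   # odd-numbered scan lines
    return even_field, odd_field
```

  • Each field therefore has half the rows of the full frame, which is why the NTSC field dimensions quoted later in the text are a few hundred rows rather than the full 525 lines.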
  • The present invention takes a two-dimensional video input signal and digitizes the signal so that it can be digitally processed. The digitized frame is separated into the even field and the odd field.
  • The even field and/or the odd field are then processed through one or more transformations in order to impart characteristics to the field that, when combined with the other field and properly displayed to a viewer, will result in a simulated three-dimensional video stream.
  • The fields are then placed in a digital memory until they are needed for display. When the fields are needed for display, they are extracted from the digital memory and sent to the display device for display to the user.
  • The fields are displayed to the user in such a manner that one field is viewed by one eye and the other field is viewed by the other eye.
  • Many mechanisms may be used to achieve this, including the various prior art mechanisms previously discussed.
  • The system utilizes a pair of shuttered glasses that are synchronized with the display of the different fields so that one eye is shuttered, or blocked, during the display of one field and the other eye is shuttered during the display of the other field.
  • Three-dimensional video may thus be viewed on a conventional display device, such as a conventional television.
  • The mind, when receiving signals from the eyes, will interpret the visual clues included in the video stream and will fuse the two fields into a single simulated three-dimensional image.
  • The processing used to impart various characteristics to a field that will be interpreted as three-dimensional visual clues may comprise one or more transformations that occur in the horizontal and/or vertical dimension of a field.
  • The fields are comprised of a matrix of sampled video data. This matrix of video data may be transformed through shifting, scaling, and other spatial transformations in order to impart appropriate visual clues that will be interpreted by the brain of a viewer to create the desired simulated three-dimensional images.
  • The skewing transformation begins with a particular row or column of information and then shifts each succeeding row or column by a specified amount relative to the row or column immediately preceding it. For example, each line may be shifted a certain number of data samples in a horizontal direction relative to the row above. Data samples that extend beyond the boundary of the matrix may be dropped or may be wrapped back to the front of the row.
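  • The skewing transformation just described can be sketched as below. This is a hypothetical reading of the text, not the patent's implementation; the parameter names and the fill value are assumptions:

```python
import numpy as np

def skew_field(field, shift_per_row=1, wrap=True, fill=0):
    """Skew transform: shift each row `shift_per_row` samples further
    than the row above it, in the horizontal direction.

    Samples pushed past the matrix boundary either wrap back to the
    front of the row (wrap=True) or are dropped, with the vacated
    positions filled by a fixed value (wrap=False).
    """
    out = np.empty_like(field)
    for i, row in enumerate(field):
        s = i * shift_per_row
        if wrap:
            # samples shifted off the end reappear at the row start
            out[i] = np.roll(row, s)
        else:
            # samples past the boundary are dropped; holes get `fill`
            shifted = np.full_like(row, fill)
            if s < len(row):
                shifted[s:] = row[:len(row) - s]
            out[i] = shifted
    return out
```

  • Applying the same idea down columns instead of across rows gives the vertical variant mentioned later in the text.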
  • Other transformations that have proven useful in imparting visual clues are shifting transformations, where all rows or columns are shifted by a designated amount, and scaling transformations, which scale rows or columns to increase or decrease the number of data samples in the rows or columns of the field.
  • Fill data samples may be inserted as needed through the use of interpolation or simply by picking a fixed value to insert.
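  • A scaling transformation along one row might look like the sketch below. Linear interpolation is assumed both for shrinking a row and for generating the fill samples when a row grows; the patent leaves the interpolation scheme open, so this is only one plausible choice:

```python
import numpy as np

def scale_row(row, new_len):
    """Resample one row of the field matrix to `new_len` data samples.

    Uses linear interpolation to compute the new sample values; this
    is an illustrative scheme, not the patent's stated one.
    """
    old_positions = np.arange(len(row), dtype=float)
    new_positions = np.linspace(0, len(row) - 1, new_len)
    return np.interp(new_positions, old_positions, row)
```

  • Scaling every row of a field by the same factor stretches or compresses the image horizontally, which is one way such a transform can impart a visual clue.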
  • The processing of the various fields through transformations occurs within a single frame.
  • No temporal transformation or delay is introduced into the system.
  • A frame is simply broken into its component fields, the fields are transformed appropriately, and then the frame is reassembled.
  • Alternatively, a field may be transformed and then held and recombined with other fields of a later frame.
  • Figure 1 is a diagram illustrating the conceptual processing that occurs in one embodiment of the present invention
  • FIG. 2 illustrates the conceptual processing that takes place in another embodiment of the present invention
  • Figures 3A through 3D illustrate various transformations that may be used to impart visual clues to the synthesized three-dimensional scene
  • Figures 4A through 4D illustrate a specific example using a scaling transformation
  • Figure 5 illustrates temporal transformation
  • Figures 6A through 8B illustrate the various circuitry of one embodiment of the present invention.
  • The present invention is directed to systems and methods for synthesizing a three-dimensional video stream from a two-dimensional video source.
  • the video source may be any source of video such as a television signal, the signal from a VCR, DVD, video camera, cable television, satellite TV, or any other source of video. Since the present invention synthesizes a three-dimensional video stream from a two-dimensional video stream no special video input source is required. However, if a video source produces two video channels, each adapted to be viewed by an eye of a user, then the present invention may also be used with appropriate modification. From the discussion below, those skilled in the art will quickly recognize the modifications that should be made.
  • A video signal is comprised of a plurality of frames that are intended to be displayed in a sequential fashion to the user or viewer of a display device in order to provide a moving scene for the viewer.
  • Each frame is analogous to the frame on a movie film in that it is intended to be displayed in its entirety before the next frame is displayed.
  • Traditional display devices, such as television sets or monitors, may display these video frames in a variety of ways. Due to limitations imposed by early hardware, televisions display a frame in an interlaced manner. This means that first one sequence of lines is scanned along the monitor and then another sequence of lines is scanned along the monitor. In this case, a television will scan the odd numbered lines first and then return and scan the even numbered lines.
  • The persistence of the phosphor on the television screen allows the entire frame to be displayed in such a manner that the human eye perceives the entire frame at once, even though all lines are not displayed at once.
  • The two different portions of the frame that are displayed in this interlaced manner are generally referred to as fields.
  • The even field contains the even numbered scan lines, and the odd field contains the odd numbered scan lines.
  • Referring to FIG. 1, a general diagram of the processing of one embodiment of the present invention is illustrated.
  • An input video stream, shown generally as 20, is comprised of a plurality of frames 22 labeled F1 through F8.
  • Frame 24 is extracted for processing.
  • Frame 24 is comprised of a plurality of scan lines.
  • The even scan lines of frame 24 are labeled 26 and the odd scan lines of frame 24 are labeled 28. This is done simply for notational purposes and to illustrate that a frame, such as frame 24, may be divided into a plurality of fields.
  • The frame is digitized by encoder 30.
  • Encoder 30 samples the video data of frame 24 and converts it from analog format to a digital format.
  • Encoder 30 may also perform other processing functions relating to color correction/translation, gain adjustments, and so forth. It is necessary that encoder 30 digitize frame 24 with a sufficient number of bits per sample in order to avoid introducing unacceptable distortion into the video signal. In addition, it may be desirable to sample various aspects of the video signal separately. In NTSC video, for example, it may be desirable to sample the luminance and chrominance of the signal separately. Finally, the sample rate of encoder 30 must be sufficient to avoid introducing aliasing artifacts into the signal. In one embodiment, a 13.5 MHz sample rate using sixteen bits to represent the signal has been found to be sufficient for standard NTSC video. Other video sources may require different sample rates and sample sizes. In Figure 1, the digitized frame is illustrated as 32.
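  • A back-of-envelope check of those figures, assuming standard NTSC timing (which the patent does not spell out):

```python
# NTSC scans 525 lines per frame at ~29.97 frames per second,
# giving a line rate of about 15,734 scan lines per second.
line_rate = 525 * 30000 / 1001            # ~15,734 lines/s
samples_per_line = 13.5e6 / line_rate     # ~858 samples per scan line
raw_bit_rate = 13.5e6 * 16                # 216 Mbit/s of raw sample data
```

  • The roughly 858 samples per scan line is consistent with the field matrix dimensions of "between eight and nine hundred columns" quoted later in the text.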
  • Digitized frame 32 is processed by modification processing component 34.
  • Modification processing component 34 performs various transformations and other processing on digitized frame 32 in order to introduce visual clues into the frame that, when displayed to a viewer, will cause the frame to be interpreted as a three-dimensional image.
  • A wide variety of processing may be utilized in modification processing component 34 to introduce appropriate visual clues.
  • Various transformations and other processing are discussed below. In general, however, modification processing component 34 will prepare the frame to be displayed to a user so that the frame is interpreted as a three-dimensional object.
  • The transformations and other processing performed by modification processing component 34 often entail separating frame 32 into two or more components and transforming one component relative to the other.
  • The resultant modified frame is illustrated in Figure 1 as 36.
  • Controller 38 stores modified frame 36 in memory 40 until it is needed.
  • Modified frame 36 is then extracted and sent to the appropriate display device to be displayed. This may require controller 38, or another component, to control the display device or other systems so that the information is displayed appropriately to the viewer.
  • The brain will take the visual clues introduced by modification processing component 34 and fuse the two fields into a single image that is interpreted in a three-dimensional manner.
  • Other mechanisms may also be utilized. These mechanisms include multidisplay systems where one eye views one display and the other eye views the other display.
  • The traditional polarized or colored approach, which utilizes a pair of passive glasses, may also be used, as previously described.
  • Controller 38 is illustrated as controlling a shuttering device 42 in order to allow images multiplexed on monitor 44 to be viewed appropriately.
  • Decoder 46 converts modified frame 36 from a digital form to an analog form appropriate for display on monitor 44. Decoder 46 may also generate various control signals necessary to control monitor 44 in conjunction with shuttering device 42 so that the appropriate eye views the appropriate portion of frame 36. Decoder 46 may also perform any other functions necessary to ensure proper display of frame 36, such as retrieving the data to be displayed in the appropriate order.
  • Referring to FIG. 2, a more detailed explanation of one embodiment of the present invention is presented.
  • the embodiment of Figure 2 has many elements in common with the embodiment illustrated in Figure 1. However, a more detailed explanation of certain processing that is performed to modify the frame from two-dimensional to three-dimensional is illustrated.
  • A video frame, such as frame 48, is input to encoder 50.
  • Encoder 50 represents an example of means for receiving a frame from a two-dimensional video stream and for digitizing the frame so that the frame can be processed. Encoder 50, therefore, digitizes frame 48, among other things.
  • The digitized frame is illustrated in Figure 2 as digitized frame 52. Encoder 50 may also perform other functions as previously described in conjunction with the encoder of Figure 1.
  • Digitized frame 52 is split by splitter 54 into odd field 56 and even field 58.
  • Splitter 54 represents an example of means for separating a frame into a plurality of fields.
  • Odd field 56 and even field 58 are simply representative of the ability to split a frame, such as digitized frame 52, into multiple fields. When interlaced display devices are utilized, it makes sense to split a frame into the even and odd fields that will be displayed on the device. In progressively scanned display devices, even and odd fields may be used, or other criteria may be used to split a frame into multiple fields. For example, at one time it was proposed that an advanced TV standard might use vertical scanning rather than the traditional horizontal scanning. In such a display device, the criterion may be based on a vertical separation rather than the horizontal separation illustrated in Figure 2. All that need happen is that splitter 54 separate frame 52 into at least two fields that will be processed separately.
  • Odd field 56 and even field 58 are processed by modification processing components 60 and 62, respectively.
  • Modification processing components 60 and 62 represent the conceptual processing that occurs to each of the fields separately. In actuality, the fields may be processed by the same component.
  • Modification processing components 60 and 62 represent but one example of means for transforming at least one field using a selected transform. Such a means may be implemented using various types of technologies, such as a processor which digitally processes the information or discrete hardware which transforms the information in the field. Examples of one implementation are presented below.
  • Modified odd field 64 and modified even field 66 represent the fields that are transformed by modification processing components 60 and 62, respectively. Note that although Figure 2 illustrates modified fields 64 and 66, in various embodiments one, the other, or both fields may be modified.
  • The fields may be transformed in any manner that is desirable to introduce appropriate visual clues into the field, as previously explained. Examples of some transforms that have been found useful for introducing visual clues in order to convert a two-dimensional video stream into a three-dimensional video stream are presented and discussed below. In general, such transforms involve shifting, scaling, or otherwise modifying the information contained in one or both fields. Note that the transforms performed by modification processing components 60 and 62 may be performed in the horizontal direction, the vertical direction, or both. Modified fields 64 and 66 are then stored by controller 68 in memory 70 until they are needed for display. Once they are needed for display, controller 68 will extract the information in the desired order and transfer the information to decoder 72.
  • For an interlaced display, controller 68 will transfer one field and then the other field for appropriate display. If, however, the display is progressively scanned, then controller 68 may supply the information in a different order.
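  • The store-then-extract behaviour of the controller and memory stage can be sketched as a simple queue. This is a toy model; the class, its method names, and the odd-first ordering are illustrative assumptions, not details from the patent:

```python
from collections import deque

class FieldController:
    """Toy model of the controller/memory stage: modified field pairs
    are queued, then handed out in the order the display needs."""

    def __init__(self):
        self.memory = deque()  # stands in for the digital field memory

    def store(self, odd_field, even_field):
        """Queue a pair of modified fields until needed for display."""
        self.memory.append((odd_field, even_field))

    def fields_for_display(self, interlaced=True):
        """Extract the next pair in display order: one field, then the
        other for an interlaced display; a progressive display might
        want a different order."""
        odd, even = self.memory.popleft()
        return [odd, even] if interlaced else [even, odd]
```

  • The queue preserves frame order, so each displayed pair of fields still corresponds to a single input frame, consistent with the no-temporal-delay processing described earlier.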
  • Controller 68 represents an example of means for recombining fields and for transferring the recombined fields to a display device.
  • Certain of this functionality may be included in decoder 72. Decoder 72 is responsible for taking the information and converting it from a digital form to an analog form in order to allow display of the information. Decoder 72 may also be responsible for generating appropriate control signals that control the display. In the alternative, controller 68 may also supply certain control signals in order to allow proper display and interpretation of the information.
  • A separate device, such as a processor or other device, may be responsible for generating control signals that control the display device so that the information is properly displayed. From the standpoint of the invention, all that is required is that the information be converted from a digital format to a format suitable for use with the display device. Currently, in most cases this will be an analog format, although other display devices may prefer to receive information in a digital format. The display device is then properly controlled so that the information is presented to the viewer in an appropriate fashion so that the scene is interpreted as three-dimensional. This may include, for example, multiplexing one field and then the other on the display device while, simultaneously, operating a shuttering device which allows one eye to view one field and the other eye to view the other field.
  • Any of the display devices previously discussed may also be used with appropriate control circuitry in order to allow presentation to an individual.
  • All of these display systems are premised on the fact that one eye views a certain portion of the information and the other eye views a different portion of the information. How this is accomplished is simply a matter of choice, given the particular implementation and use of the present invention.
  • Referring to Figures 3A through 3D, some of the transforms that have been found useful for providing visual clues, which are included in the data and interpreted by a viewer as three-dimensional, are illustrated.
  • The examples illustrated in Figures 3A through 3D present transformations in the horizontal direction. Furthermore, the examples illustrate transformation in a single horizontal direction. Such should be taken as exemplary only.
  • In Figure 3A, a skew transform is presented. This transform skews the data in the horizontal or vertical direction.
  • A field that is to be transformed is illustrated generally as 74. This field has already been digitized and may be represented by a matrix of data points. In Figure 3A this matrix is five columns across by three rows down.
  • The transformations used in the present invention shift or otherwise modify the data of the field matrix.
  • Typical field matrices are hundreds of columns by hundreds of rows. For example, in NTSC video an even or odd field may contain between eight and nine hundred columns and two to three hundred rows.
  • the skew transform picks a starting row or column and then shifts each succeeding row or column by an amount relative to the column or row that precedes it.
  • each row is shifted by one data point relative to the row above it.
  • the transformed field illustrated generally as 76
  • the transformed field has row 78 being unshifted, row 80 being shifted by one data point, and row 82 being shifted by two data points.
  • the data points of the original matrix are thus bounded by dashed lines 84 and take on a skewed shape.
  • the total shift from the beginning row to the ending row is a measure of the amount of skew added to the frame.
  • the data points begin to move outside the original matrix boundaries, illustrated in Figure 3A by solid lines 86.
  • holes begin to develop in the field matrix as illustrated by data points 88.
  • Several options may be utilized.
  • the data points are shifted they are wrapped around and placed in the holes created at the beginning of the row or column.
  • in row 80, when the last data point is shifted outside the field matrix boundary, it is wrapped and placed at the beginning of the row.
  • the process would be similar for any other rows.
  • if the holes opened in the field matrix lie outside the normal visual range presented on the display, then they may simply be ignored or filled with a fixed value, such as black.
  • various interpolation schemes may be used to calculate a value to place in the holes. As previously mentioned, this transformation may be performed in the horizontal direction, the vertical direction, or a combination of both.
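The skew transform described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the patent's hardware implementation; the function name, the linear shift schedule, and the wrap/fill options are our own assumptions based on the description:

```python
def skew_field(field, total_shift, fill=0, wrap=False):
    """Skew a field matrix horizontally.

    Each row is shifted right by progressively more samples than the row
    above it, reaching `total_shift` on the last row. With wrap=True,
    samples pushed past the right boundary re-enter at the left; otherwise
    the opened holes are filled with `fill` (e.g. black) and the
    out-of-bounds samples are discarded.
    """
    rows = len(field)
    out = []
    for r, row in enumerate(field):
        # Shift grows linearly from 0 (first row) to total_shift (last row).
        shift = round(total_shift * r / (rows - 1)) if rows > 1 else 0
        if wrap:
            out.append(row[-shift:] + row[:-shift] if shift else row[:])
        else:
            out.append([fill] * shift + row[: len(row) - shift])
    return out

# A 5-column by 3-row matrix like that of Figure 3A, skewed by two samples total.
field = [[1, 2, 3, 4, 5]] * 3
skewed = skew_field(field, total_shift=2)
```

A full NTSC field would of course be hundreds of rows and columns; the tiny matrix simply mirrors the figure.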
  • each row or column in the field matrix is shifted by a set amount.
  • the unshifted field matrix is illustrated as 90, while the shifted field matrix is illustrated as 92.
  • this again places certain data points outside the boundaries of the field matrix.
  • the data points may be wrapped to the beginning of the row and placed in the holes opened up, or the holes that opened up may be filled with a different value and the data points that fall beyond the boundaries of the field matrix may simply be ignored.
  • various schemes may be used to fill the holes, such as filling with a fixed data point or using a myriad of interpolation schemes.
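A uniform shift differs from the skew only in that every row moves by the same amount. A minimal sketch (again our own hypothetical illustration, with both the wrap and fill strategies described above):

```python
def shift_field(field, shift, fill=0, wrap=False):
    """Shift every row of a field matrix horizontally by `shift` samples.

    With wrap=True, samples pushed past the right boundary re-enter at the
    left; otherwise the opened holes are filled with `fill` (e.g. black)
    and the out-of-bounds samples are discarded.
    """
    if wrap:
        return [row[-shift:] + row[:-shift] if shift else row[:] for row in field]
    return [[fill] * shift + row[: len(row) - shift] for row in field]

shifted = shift_field([[1, 2, 3], [4, 5, 6]], 1)
```

The same function applied to the columns instead of the rows would give the vertical variant the text mentions.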
  • Figures 3C and 3D illustrate various scaling transformations.
  • Figure 3C illustrates a scaling transformation that shrinks the number of data points in the field matrix, while Figure 3D illustrates a scaling transformation that increases the number of data points. These correspond to making something smaller and larger, respectively.
  • the unscaled matrix is illustrated as 96 while the scaled field matrix is illustrated by 98.
  • appropriate data points are simply dropped and the remainder of the data points are shifted to eliminate any open space for data points that were dropped.
  • values must be placed in the holes that are opened by the reduced number of data points. Again, such values may be a fixed value or may be derived through some interpolation or other calculation. In one embodiment, the holes are simply filled with black data points.
  • Figure 3D represents a scaling that increases the number of data points in a field matrix.
  • the unscaled field matrix is illustrated by 100 and the scaled field matrix is illustrated by 102.
  • holes open up in the middle of the data points.
  • a decision must be made as to what values to fill in the holes.
  • it is typically adequate to interpolate between surrounding data values to arrive at a particular value to put in a particular place.
  • any data points that fall outside the size of the field matrix are simply ignored. This means that the only values that must be interpolated and filled are those that lie within the boundaries of the field matrix.
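Both scalings can be modeled per row. The sketch below is our own illustration, using linear interpolation for the enlargement case and black fill for the shrink case; it keeps every row at its original length, so out-of-bounds samples are cropped and holes are filled, as described above:

```python
def scale_row(row, factor, fill=0):
    """Scale one row of a field matrix by `factor`, keeping its length.

    factor < 1 shrinks: surviving samples pack to the left and the holes
    opened on the right are filled with `fill` (e.g. black).
    factor > 1 enlarges: in-between values are linearly interpolated and
    samples that would fall past the original boundary are discarded.
    """
    n = len(row)
    scaled = []
    for i in range(n):
        src = i / factor              # corresponding position in the source row
        if src > n - 1:
            break                     # past the end of the source data
        lo = int(src)
        frac = src - lo
        if frac and lo + 1 < n:
            # Interpolate between the two surrounding source samples.
            scaled.append(round(row[lo] * (1 - frac) + row[lo + 1] * frac))
        else:
            scaled.append(row[lo])
    return scaled + [fill] * (n - len(scaled))

shrunk = scale_row([10, 20, 30, 40, 50], 0.5)   # fewer samples, black fill
grown = scale_row([10, 20, 30, 40, 50], 2)      # interpolated, cropped
```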
  • an untransformed frame 104 is illustrated.
  • This frame comprises six rows, numbered 105 through 110, and seven columns.
  • the rows of the frame are first separated into an even field and an odd field.
  • Odd field 112 contains rows 105, 107, and 109 while even field 114 contains rows 106, 108 and 110.
  • Such a function may be performed, for example, by a splitter or other means for separating a frame into a plurality of fields.
  • Splitter 54 of Figure 2 is but one example.
  • FIG. 4B illustrates the process of transforming one or both fields. In the example illustrated in Figure 4B, odd field 112 will be transformed while even field 114 remains untransformed.
  • FIG. 4C the alignment issues that can be created when a transform is applied are illustrated.
  • a transform is applied that changes the number of data points in a field.
  • transformed odd field 116 has ten columns instead of the normal seven.
  • the fields would be appropriately shifted as shown on the right-hand side of Figure 4C.
  • the edge of the field matrix is then indicated by dashed lines 120 and any data points that fall outside those lines can simply be discarded.
  • Figure 4D the process of recombining the fields to create a simulated three-dimensional frame is illustrated.
  • the left-hand side of Figure 4D illustrates transformed odd field 116 that has been cropped to the appropriate size.
  • Figure 4D also illustrates even field 114.
  • the frame is reconstructed by interleaving the appropriate rows as indicated on the right-hand side of Figure 4D.
  • the reconstructed frame is illustrated generally as 122.
  • Such a reconstruction may take place, for example, when the fields are displayed on a display device. If the display device is an interlaced display, as for example a conventional television set, then the odd field may be displayed after which the even field is displayed in order to create the synthesized three-dimensional frame.
  • the synthesized three-dimensional frame is referred to as being constructed from a recombining of the various fields of the frame.
  • the reconstructed frame is then illustrated as being displayed on a display device.
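The split-transform-recombine cycle of Figures 4A through 4D can be sketched as follows. This is a hypothetical model only: the one-sample shift stands in for whatever transform is chosen, and rows are plain Python lists rather than scan-line hardware:

```python
def split_fields(frame):
    """Separate an interlaced frame into its odd and even fields (rows are
    numbered 1-based, as the scan lines 105-110 are in Figure 4A)."""
    return frame[0::2], frame[1::2]

def interleave(odd, even):
    """Recombine two fields into a full frame by interleaving their rows."""
    frame = []
    for o, e in zip(odd, even):
        frame.extend([o, e])
    return frame

# A 6-row by 7-column frame analogous to rows 105-110 of Figure 4A.
frame = [[r] * 7 for r in range(105, 111)]
odd, even = split_fields(frame)
# Transform the odd field only (a hypothetical one-sample shift), cropping
# back to the original width before recombining, as in Figure 4C.
transformed = [[0] + row[:-1] for row in odd]
rebuilt = interleave(transformed, even)
```

On an interlaced display the interleave step happens implicitly when the odd field is scanned and then the even field.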
  • the embodiments presented above have processed a frame and then displayed the same frame.
  • the frame rate of the output video stream is equal to the frame rate of the input video stream. Technologies exist, however, that either increase or decrease the output frame rate relative to the input frame rate. It may be desirable to employ such technologies with the present invention.
  • the first approach is simply to send the data of a frame more often. For example, if the output frame rate is doubled, the information of a frame may simply be sent twice.
  • an input video stream comprising a plurality of frames is illustrated generally as 124.
  • a single frame is extracted for processing.
  • This frame is illustrated in Figure 5 as 126.
  • the frame is broken down into a plurality of fields, as for example fields 128 and 130. As previously discussed, although two fields are illustrated, the frame may be broken into more than two fields if desired.
  • Modified field 130 is illustrated as field 136.
  • the embodiment illustrated in Figure 5 introduces a temporal shift as illustrated by delay 138.
  • Delay 138 simply holds the transformed field for a length of time and substitutes a transformed field from a previous frame.
  • a field from frame 1 may not be displayed until frame 2 or 3.
  • a delayed field, illustrated in Figure 5 as 140, is combined with field 136 to create frame 142.
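The delay of Figure 5 can be modeled as a short queue of transformed fields. The sketch below is illustrative only, not the patent's circuitry; `split` and `transform` are placeholders for the field separation and transformation steps already described:

```python
from collections import deque

def delayed_stream(frames, split, transform, delay_frames=1):
    """Model of the temporal shift of Figure 5: each frame's transformed
    field is held in a delay line and recombined with the untransformed
    field of a later frame, so output frame N pairs its second field with
    the transformed first field of frame N - delay_frames.
    """
    held = deque()
    out = []
    for frame in frames:
        first, second = split(frame)
        held.append(transform(first))
        if len(held) > delay_frames:
            # Delayed field from an earlier frame + current untransformed field.
            out.append((held.popleft(), second))
    return out

# Toy two-field "frames"; identity stand-ins for the split and transform steps.
frames = [[[n], [n + 1]] for n in (1, 3, 5)]
paired = delayed_stream(frames, split=lambda f: (f[0], f[1]),
                        transform=lambda fld: fld, delay_frames=1)
```

With delay_frames=1, the first output frame pairs the held field of frame 1 with the current field of frame 2, as the text describes.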
  • Frame 142 is then placed in the output video stream 144 for proper display. Referring next to Figures 6A through 8B, one embodiment of the present invention is presented.
  • the embodiment of FIGS. 6A through 8B is designed to operate with a conventional display, such as a television, and shuttered glasses which operate to alternately block one eye and then the other so that one field of the frame is seen by one eye and another field of the frame is seen by the other eye.
  • processor 144 is responsible for overall control of the system.
  • processor 144 is responsible for receiving various user input commands, as from a remote control or other input devices in order to allow user input for various parameters of the system.
  • Such inputs may, for example, adjust various parameters in the transforms that are used to produce the synthesized three-dimensional images.
  • Such an ability allows a user to adjust the synthesized three- dimensional scene to suit his or her own personal tastes.
  • Processor 144 will then provide this information to the appropriate components.
  • processor 144 may help perform various transformations that are used in producing the synthesized three-dimensional scenes.
  • Figure 6A also illustrates a schematic representation of shuttered glasses 150, which is discussed in greater detail below.
  • FIG. 6B illustrates a block level connection diagram of video board 146.
  • Video board 146 will be more particularly described in conjunction with Figures 7A through 7I below.
  • Video board 146 contains all necessary video circuitry to receive a video signal, digitize the video signal, store and receive transformed fields in memory, reconvert transformed fields back to analog signals, and provide the analog signals to the display device.
  • video board 146 may contain logic to generate control signals that are used to drive the shuttered glasses used by this embodiment to produce a synthesized three-dimensional effect when worn by a viewer.
  • Block 148 of Figure 6C contains a schematic representation of the drivers which are used to drive the shuttered glasses.
  • the shuttered glasses are illustrated schematically in Figure 6A by block 150.
  • Figures 6D - 6F contain various types of support circuitry and connectors as for example, power generation and filtering, various ground connectors, voltage converters, and so forth.
  • the support circuitry is labeled generally as 152.
  • in FIGS. 7A through 7I, a more detailed schematic diagram of video board 146 of Figure 6B is presented.
  • Video board 146 comprises decoder 154 (Figure 7A), controller 156 (Figure 7B), memory 158 (Figures 7C and 7D), and encoder 162 (Figure 7E).
  • Figure 7F an alternate memory configuration is illustrated as block 160.
  • Various support circuitry is illustrated in Figures 7G through 7I.
  • Block 164 of Figure 7G contains various input circuitry that receives video and other data from a variety of sources.
  • Block 165 of Figure 7G illustrates how the pinouts of video board 146 of Figure 6B translate into signals of Figures 7A through 7I.
  • Block 166 of Figures 7H and 7I contains output and other support circuitry.
  • Decoder 154 (Figure 7A) is responsible for receiving the video signal and for digitizing the video signal. The digitized video signal is stored in memory 158 (Figures 7C and 7D) under the control of controller 156 (Figure 7B).
  • Controller 156 is a highly sophisticated controller that allows information to be written into memory 158 while information is being retrieved from memory 158 by encoder 162 (Figure 7E) for display.
  • the various frames and fields of an input video received by decoder 154 may be identified from the control signals in the video data. The fields may then be separated out for processing and transformation, as previously described.
  • if transformations occur in the horizontal direction, then the transformation may be applied line by line as the field is received. If, on the other hand, a transformation occurs in the vertical direction, it may be necessary to receive the entire field before transformation can occur. The exact implementation of the transformations will depend upon various design choices that are made for the embodiment.
  • referring to controller 156 of Figure 7B, in addition to storing and retrieving information from memory 158, controller 156 also generates the control signals which drive the shuttered glasses. This allows controller 156 to synchronize the shuttering action of the glasses with the display of information that is retrieved from memory 158 and passed to encoder 162 for display on the display device. Encoder 162 (Figure 7E) takes information retrieved from memory 158 and creates the appropriate analog signals that are then sent to the display device.
  • Alternate memory 160 (Figure 7F), which is more fully illustrated in Figures 8A and 8B, is an alternate memory configuration using different component parts that may be used in place of memory 158.
  • Figure 8A illustrates the various memory chips used by alternate memory 160.
  • Figure 8B illustrates how the pinouts of Figure 7F translate into the signals of Figures 8A and 8B in pinout block 161.
  • Figure 8B also illustrates filtering circuitry 163.
  • the present invention produces high-quality, synthesized, three-dimensional video. Because the present invention converts a two-dimensional video source into a synthesized three-dimensional video source, the present invention may be used with any video source.
  • the system will work, for example, with television signals, cable television signals, satellite television signals, video signals produced by laser disks,
  • the present invention retrieves the video source, digitizes it, splits the video frame into a plurality of fields, transforms one or more of the fields, and then reassembles the transformed fields into a synthesized, three-dimensional video stream.
  • the synthesized three-dimensional video stream may be displayed on any appropriate display device.
  • Such display devices include, but are not limited to, multiplexed systems that use a single display to multiplex two video streams and coordinate the multiplexing with a shuttering device such as a pair of shutter glasses worn by a viewer. Additional display options may be multiple display devices which allow each eye to independently view a separate display. Other single or multidisplay devices are also suitable for use with the present invention and have been previously discussed.

Abstract

The present invention is directed to systems and methods for synthesizing a three-dimensional video stream from a two-dimensional video source. A frame (48) from the two-dimensional video source is digitized (50) and split (54) into a plurality of fields (56, 58). Each field contains a portion of the information in the frame. The fields are then separately processed and transformed (60, 62) to introduce visual clues that, when assembled with the other fields, will be interpreted by a viewer as a three-dimensional image. Such transformations can include, but are not limited to, skewing transformations, shifting transformations, and scaling transformations. The transformations may be performed in the horizontal dimension, the vertical dimension, or a combination of both. In many embodiments the transformation and reassembly of the transformed fields is performed within a single frame so that no temporal shifting is introduced or utilized to create the synthesized three-dimensional video stream. After the three-dimensional video stream has been synthesized, it is displayed on an appropriate display device. Appropriate display devices include a multiplexed display device which alternates the viewing of different fields in conjunction with a pair of shuttered glasses which allows one field to be displayed to one eye of a viewer and another field to be displayed to another eye of a viewer. Other types of single display devices and multidisplay devices may also be used.

Description

SYSTEM AND METHOD FOR SYNTHESIZING
THREE-DIMENSIONAL VIDEO FROM A
TWO-DIMENSIONAL VIDEO SOURCE
BACKGROUND OF THE INVENTION
This application claims the benefit of United States Provisional Patent Application No. 60/034,149 entitled "TWO-DIMENSIONAL TO THREE-DIMENSIONAL STEREOSCOPIC TELEVISION CONVERTER," in the name of Amber C. Davidson and Loran L. Swensen, filed on December 27, 1996, and incorporated herein by reference.
1. The Field of the Invention
This invention relates to systems and methods for processing and displaying video imagery. More specifically, this invention relates to systems and methods that receive a two-dimensional video signal and synthesize a three-dimensional video signal which is displayed on a display device.
2. The Prior State of the Art
Realistic three-dimensional video is useful in entertainment, business, industry, and research. Each area has differing requirements and differing goals. Some systems that are suitable for use in one area are totally unsuitable for use in other areas due to the differing requirements. In general, however, three-dimensional video imagery must be comfortable to view for extended periods of time without having the viewing system impart stress and eye strain. In addition, the system should be of sufficient resolution and quality to allow for a pleasing experience. However, prior art systems have not always accomplished these goals in a sufficient manner.
Any approach designed to produce three-dimensional video images relies on the ability to project a different video stream to each eye of a viewer. The video streams contain visual clues that are interpreted by the viewer as a three-dimensional image. Many different systems have been developed to present these two video streams to different eyes of an individual. Some systems utilize twin screen displays using passive polarized or differently colored viewing lenses and glasses that are worn by the viewer in order to allow each eye to perceive a different video stream. Other approaches use field or frame multiplexing which utilizes a single display screen that quickly switches between the two video streams. These systems typically have a pair of shuttered glasses that are worn by an individual and the shutters alternately cover one eye and then the other in order to allow each eye to perceive a different video stream. Finally, some systems, such as those commonly used in virtual reality systems, use dual liquid crystal or dual CRT displays that are built into an assembly worn on the viewer's head. Other technologies include projection systems and various autostereoscopic systems that do not require the wearing of glasses.
Prior art systems that generate and display three-dimensional video imagery have typically taken one of two approaches. The first approach has been to employ a binocular system, e.g., two lenses or two cameras to produce two channels of visual information.
The spatial offset of the two channels creates a parallax effect that mimics the effect created by an individual's eyes.
The key factor in producing high quality stereoscopic video that uses two cameras is the maintenance of proper alignment of the two channels of image data. The alignment of the camera lenses must be maintained and the video signals generated by the cameras must maintain a proper temporal alignment as they are processed by system electronics or optics. Misalignment will be perceived as distortion to a viewer. Twin screen viewing systems are known to be particularly prone to misalignment, tend to be bulky and cumbersome, and tend to be rather expensive due to the cost of multiple displays. Single screen solutions which multiplex fields or frames tend to minimize the problems associated with dual display monitors, yet these systems also rely on the accuracy of alignment of the input video data.
The second approach taken by various systems has been an attempt to convert an input two-dimensional video signal into a form that is suitable for stereoscopic display.
These systems traditionally have split the two-dimensional video signal into two separate channels of visual information and have delayed one channel of video information with respect to the other channel of video information. Systems which synthesize a simulated three-dimensional scene from two-dimensional input data tend to be somewhat less expensive due to the reduced hardware requirements necessary to receive and process two separate channels of information. In addition, such systems may utilize any conventional video source rather than requiring generation of special video produced by a stereoscopic camera system. The reliance on temporal shifting of portions of the data in order to create a simulated three-dimensional scene, however, does not work well for objects that are not moving in the scene. Thus, there presently does not exist a system that can produce high quality simulated three-dimensional video from a two-dimensional input signal.
Another factor limiting the commercial success of traditional three-dimensional video has been adverse physical reactions including eye strain, headaches, and nausea experienced by a significant number of viewers of these systems. This is illustrated, for example, by the 3-D movies that were popular in the 1950s and 1960s. Today, however, outside of theme parks and similar venues, these movies are typically limited to less than about thirty minutes in length, because the average viewer tolerance for this media is limited. Viewer tolerance problems seem to be intrinsic to the methodology of traditional stereoscopy, and result from the inability of these systems to realistically emulate the operation of the human visual system. Such systems also seem to suffer from the inability to account for the central role of the human brain and the neuro-cooperation between the brain and eyes for effective visual processing.
In summary, prior art systems have suffered from poor image quality, low user tolerance, and high cost. It would be an advancement in the art to produce a three-dimensional video system that did not suffer from these problems.
SUMMARY OF THE INVENTION
The problems of the prior art have been successfully overcome by the present invention which is directed to systems and methods for synthesizing a simulated three-dimensional video image from a two-dimensional input video signal. The present invention is relatively inexpensive, produces high quality video, and has high user tolerance. The systems of the present invention do not rely on temporal shifting in order to create a simulated three-dimensional scene. However, certain embodiments may use temporal shifting in combination with other processing to produce simulated three-dimensional video from a two-dimensional video source. A traditional video source, such as an NTSC-compatible video source, is composed of a sequence of frames that are displayed sequentially to a user in order to produce a moving video image. The frame rate for NTSC video is thirty frames per second. Frames are displayed on a display device, such as a monitor or television, by displaying the individual horizontal scan lines of the frame on the display device. Traditionally, televisions have been designed to display the frame by interlacing two different fields. In other words, the television first displays all the odd numbered scan lines and then interlaces the even numbered scan lines in order to display a complete frame. Thus, a frame is typically broken down into an even field which contains the even numbered scan lines and an odd field which contains the odd numbered scan lines. The present invention takes a two-dimensional video input signal and digitizes the signal so that it can be digitally processed. The digitized frame is separated into the even field and the odd field. The even field and/or the odd field are then processed through one or more transformations in order to impart characteristics to the field that, when combined with the other field and properly displayed to a viewer, will result in a simulated three-dimensional video stream.
The fields are then placed in a digital memory until they are needed for display. When the fields are needed for display, they are extracted from the digital memory and sent to the display device for display to the user.
The fields are displayed to the user in such a manner that one field is viewed by one eye and the other field is viewed by the other eye. Many mechanisms may be used to achieve this, including the various prior art mechanisms previously discussed. In one embodiment, the system utilizes a pair of shuttered glasses that are synchronized with the display of the different fields so that one eye is shuttered or blocked during the display of one field and then the other eye is shuttered or blocked during the display of the other field. By alternating the fields in this manner, three-dimensional video may be viewed on a conventional display device, such as a conventional television. The mind, when receiving signals from the eyes, will interpret the visual clues included in the video stream and will fuse the two fields into a single simulated three-dimensional image.
The processing used to impart various characteristics to a field that will be interpreted as three-dimensional visual clues may comprise one or more transformations that occur in the horizontal and/or vertical dimension of a field. When a frame is digitized and separated into two fields, the fields are comprised of a matrix of sampled video data. This matrix of video data may be transformed through shifting, scaling, and other spatial transformations in order to impart appropriate visual clues that will be interpreted by the brain of a viewer in order to create the simulated three-dimensional images that are desired.
One transformation useful in imparting these visual clues is a skewing transformation. The skewing transformation begins with a particular row or column of information and then shifts each succeeding row or column by a specified amount relative to the row or column immediately preceding it. For example, each line may be shifted a certain number of data samples in a horizontal direction relative to the row above. Data samples that extend beyond the boundary of the matrix may be dropped or may be wrapped back to the front of the row.
Other transformations that have proven useful in imparting visual clues are shifting transformations where all rows or columns are shifted by a designated amount, and scaling transformations which scale rows or columns to increase or decrease the number of data samples in the rows or columns of the field. When fields are scaled, fill data samples may be inserted as needed through the use of interpolation or simply by picking a fixed value to insert.
In many embodiments of the present invention, the processing of various fields through transformations, as previously described, occurs within a single frame. In other words, no temporal transformation or delay is introduced into the system. A frame is simply broken into its component fields, the fields are transformed appropriately, and then the frame is reassembled. In other embodiments, however, it may be desirable to introduce a temporal displacement in one or the other of the fields. In other words, a field may be transformed and then held and recombined with other fields of a later frame. In particular, it may be desirable to impart a vertical transformation and a temporal transformation in combination to introduce various visual clues into the scene that will be interpreted as a three-dimensional image.
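The default single-frame path summarized above can be sketched in a few lines. This is an illustrative model only; `transform` stands for any of the skew, shift, or scale operations, and the frame is a simple row matrix rather than a video signal:

```python
def synthesize_frame(frame, transform):
    """Split an interlaced frame into its odd and even fields, transform the
    odd field, and reinterleave the rows -- all within the same frame, so
    no temporal delay is introduced."""
    odd, even = frame[0::2], frame[1::2]
    odd = transform(odd)
    return [row for pair in zip(odd, even) for row in pair]

# Identity transform leaves the frame unchanged; a real transform would be
# one of the skew, shift, or scale operations described above.
out = synthesize_frame([[1], [2], [3], [4]], transform=lambda f: f)
```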
BRIEF DESCRIPTION OF THE DRAWINGS
In order that the manner in which the above-recited and other advantages and objects of the invention are obtained, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
Figure 1 is a diagram illustrating the conceptual processing that occurs in one embodiment of the present invention;
Figure 2 illustrates the conceptual processing that takes place in another embodiment of the present invention;
Figures 3A through 3D illustrate various transformations that may be used to impart visual clues to the synthesized three-dimensional scene;
Figures 4A through 4D illustrate a specific example using a scaling transformation;
Figure 5 illustrates temporal transformation; and
Figures 6A through 8B illustrate the various circuitry of one embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention is directed to systems and methods for synthesizing a three-dimensional video stream from a two-dimensional video source. The video source may be any source of video such as a television signal, the signal from a VCR, DVD, video camera, cable television, satellite TV, or any other source of video. Since the present invention synthesizes a three-dimensional video stream from a two-dimensional video stream, no special video input source is required. However, if a video source produces two video channels, each adapted to be viewed by an eye of a user, then the present invention may also be used with appropriate modification. From the discussion below, those skilled in the art will quickly recognize the modifications that should be made.
The following discussion presents the basics of a video signal and is intended to provide a context for the remainder of the discussion of the invention. Although specific examples and values may be used in this discussion, such should be construed as exemplary only and not as limiting the present invention. As previously explained, the present invention may be adapted to be used with any video source.
In general, a video signal is comprised of a plurality of frames that are intended to be displayed in a sequential fashion to the user or viewer of a display device in order to provide a moving scene for the viewer. Each frame is analogous to the frame on a movie film in that it is intended to be displayed in its entirety before the next frame is displayed. Traditional display devices, such as television sets or monitors, may display these video frames in a variety of ways. Due to limitations imposed by early hardware, televisions display a frame in an interlaced manner. This means that first one sequence of lines is scanned along the monitor and then another sequence of lines is scanned along the monitor. In this case, a television will scan the odd numbered lines first and then return and scan the even numbered lines. The persistence of the phosphor on the television screen allows the entire frame to be displayed in such a manner that the human eye perceives the entire frame displayed at once even though all lines are not displayed at once. The two different portions of the frame that are displayed in this interlaced manner are generally referred to as fields. The even field contains the even numbered scan lines, and the odd field contains the odd numbered scan lines.
Due to hardware advances, many computer monitors and some television sets are capable of displaying images in a non-interlaced manner where the lines are scanned in order. Conceptually, the even field and odd field are still displayed, only in a progressive manner. In addition, it is anticipated that, with the introduction of advanced TV standards, there may be a move away from interlaced scanning to progressive scanning. The present invention is applicable to either an interlaced scanning or a progressive scanning display. The only difference is the order in which information is displayed. As an example of the particular scan rates, consider standard NTSC video.
Standard NTSC video has a frame rate of thirty frames per second. The field rate is thus sixty fields per second since each frame has two fields. Other video sources use different frame rates. This, however, is not critical to the invention and the general principles presented herein will work with any video source. Referring now to Figure 1, a general diagram of the processing of one embodiment of the present invention is illustrated. In Figure 1, an input video stream, shown generally as 20, is comprised of a plurality of frames 22 labeled F1 through F8. In Figure 1, frame 24 is extracted for processing. As illustrated in Figure 1, frame 24 is comprised of a plurality of scan lines. The even scan lines of frame 24 are labeled 26 and the odd scan lines of frame 24 are labeled 28. This is done simply for notational purposes and to illustrate that a frame, such as frame 24, may be divided into a plurality of fields.
Although two fields are illustrated in Figure 1, comprising even scan lines 26 and odd scan lines 28, other delineations may be made. For example, it may be possible to divide the frame into more than two fields.
The frame is digitized by encoder 30. Encoder 30, among other things, samples the video data of frame 24 and converts it from analog format to a digital format.
Encoder 30 may also perform other processing functions relating to color correction/translation, gain adjustments, and so forth. It is necessary that encoder 30 digitize frame 24 with a sufficient number of bits per sample in order to avoid introducing unacceptable distortion into the video signal. In addition, it may be desirable to sample various aspects of the video signal separately. In NTSC video, it may be desirable to sample the luminance and chrominance of the signal separately. Finally, the sample rate of encoder 30 must be sufficient to avoid introducing aliasing artifacts into the signal. In one embodiment, a 13.5 MHz sample rate using sixteen bits to represent the signal has been found to be sufficient for standard NTSC video. Other video sources may require different sample rates and sample sizes. In Figure 1, the digitized frame is illustrated as 32.
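As a rough arithmetic check of that choice of sample rate, the 13.5 MHz rate can be related to the number of samples taken across each scan line. The horizontal line frequency used below is the standard NTSC value and is an assumption not drawn from the description above:

```python
# Standard NTSC horizontal line frequency in Hz (a well-known assumed
# value -- it does not appear in the description above).
LINE_RATE_HZ = 15_734.26
SAMPLE_RATE_HZ = 13_500_000  # the 13.5 MHz rate of the embodiment

samples_per_line = SAMPLE_RATE_HZ / LINE_RATE_HZ
print(round(samples_per_line))  # 858 samples across each full scan line
```

At sixteen bits per sample, this sketch suggests why the chosen rate comfortably captures a standard-definition line.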
Digitized frame 32 is processed by modification processing component 34.
Modification processing component 34 performs various transformations and other processing on digitized frame 32 in order to introduce visual clues into the frame that, when displayed to a viewer, will cause the frame to be interpreted as a three-dimensional image. A wide variety of processing may be utilized in modification processing component 34 to introduce appropriate visual clues. Various transformations and other processing are discussed below. In general, however, modification processing component 34 will prepare the frame to be displayed to a user so that the frame is interpreted as a three-dimensional object. The transformations and other processing performed by modification processing component 34 often entail separating frame 32 into two or more components and transforming one component relative to the other. The resultant modified frame is illustrated in Figure 1 as 36.
After the frame has been modified, the next step is to save the modified frame and display it on a display device at the appropriate time and in the appropriate fashion. Depending on the processing speed of encoder 30 and modification processor 34, it may be necessary to hold modified frame 36 for a short period of time. In the embodiment illustrated in Figure 1, controller 38 stores modified frame 36 in memory 40 until it is needed. When it is needed, modified frame 36 is extracted and sent to the appropriate display device to be displayed. This may require controller 38, or another component, to control the display device or other systems so that the information is displayed appropriately to the viewer.
The exact process of extracting the modified frame and displaying it on a display device will be wholly dependent upon the type of display device used. In general, it will be necessary to use a display device that allows one eye of a viewer to view a portion of the frame and another eye of the viewer to view the other portion of the frame. For example, one display system previously described separates the frame into two fields that are multiplexed on a single display device. A pair of shuttered glasses, or other shuttering device, is then used so that one field is viewed by one eye while the other eye is covered, and then the other field is viewed by the other eye after the shutter switches. In this manner, one eye is used to view one field and the other eye is used to view the other field. The brain will take the visual clues introduced by modification processing component 34 and fuse the two fields into a single image that is interpreted in a three-dimensional manner. Other mechanisms may also be utilized. These mechanisms include multidisplay systems where one eye views one display and the other eye views the other display. The traditional polarized or colored approach which utilizes a pair of passive glasses may also be used, as previously described.
In the embodiment illustrated in Figure 1, controller 38 is illustrated as controlling a shuttering device 42 in order to allow images multiplexed on monitor 44 to be viewed appropriately. In addition, decoder 46 converts modified frame 36 from a digital form to an analog form appropriate for display on monitor 44. Decoder 46 may also generate various control signals necessary to control monitor 44 in conjunction with shuttering device 42 so that the appropriate eye views the appropriate portion of frame 36. Decoder 46 may also perform any other functions necessary to ensure proper display of frame 36 such as retrieving the data to be displayed in the appropriate order.
Referring now to Figure 2, a more detailed explanation of one embodiment of the present invention is presented. The embodiment of Figure 2 has many elements in common with the embodiment illustrated in Figure 1. However, a more detailed explanation of certain processing that is performed to modify the frame from two-dimensional to three-dimensional is illustrated.
In Figure 2 a video frame, such as frame 48, is received and encoded by encoder 50. Encoder 50 represents an example of means for receiving a frame from a two-dimensional video stream and for digitizing the frame so that the frame can be processed. Encoder 50, therefore, digitizes frame 48 among other things. The digitized frame is illustrated in Figure 2 as digitized frame 52. Encoder 50 may also perform other functions as previously described in conjunction with the encoder of Figure 1.
Digitized frame 52 is split by splitter 54 into odd field 56 and even field 58. Splitter 54 represents an example of means for separating a frame into a plurality of fields.
Odd field 56 and even field 58 are simply representative of the ability to split a frame, such as digitized frame 52, into multiple fields. When interlaced display devices are utilized, it makes sense to split a frame into the even and odd fields that will be displayed on the device. In progressively scanned display devices, even and odd fields may be used, or other criteria may be used to split a frame into multiple fields. For example, at one time it was proposed that an advanced TV standard may use vertical scanning rather than the traditional horizontal scanning. In such a display device, the criteria may be based on a vertical separation rather than the horizontal separation as illustrated in Figure 2. All that need happen is that splitter 54 separate frame 52 into at least two fields that will be processed separately.
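A minimal software sketch of the splitting step may clarify the concept (plain Python over a frame represented as a list of scan lines; the actual splitter 54 is circuitry, so this is illustrative only):

```python
def split_fields(frame):
    """Separate a frame (a list of scan lines) into odd and even fields.

    Scan lines are conventionally numbered from 1, so index 0 holds
    line 1 (odd) and index 1 holds line 2 (even).
    """
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field

frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
odd, even = split_fields(frame)
print(odd)   # ['line1', 'line3', 'line5']
print(even)  # ['line2', 'line4', 'line6']
```

A vertical-scanning variant would slice columns instead of rows; the principle is the same.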
Odd field 56 and even field 58 are processed by modification processing components 60 and 62, respectively. Modification processing components 60 and 62 represent the conceptual processing that occurs to each of the fields separately. In actuality, the fields may be processed by the same component. Modification processing components 60 and 62 represent but one example of means for transforming at least one field using a selected transform. Such a means may be implemented using various types of technologies such as a processor which digitally processes the information or discrete hardware which transforms the information in the field. Examples of one implementation are presented below. In Figure 2, modified odd field 64 and modified even field 66 represent the fields that are transformed by modification processing components 60 and 62, respectively. Note that although Figure 2 illustrates modified fields 64 and 66, in various embodiments one, the other, or both fields may be modified. The fields may be transformed in any manner that is desirable to introduce appropriate visual clues into the field, as previously explained. Examples of some transforms that have been found useful to introduce visual clues in order to convert a two-dimensional video stream into a three-dimensional video stream are presented and discussed below. In general, such transforms involve shifting, scaling, or otherwise modifying the information contained in one or both fields. Note that the transforms performed by modification processing components 60 and 62 may be performed either in the horizontal direction, the vertical direction, or both. Modified fields 64 and 66 are then stored by controller 68 in memory 70 until they are needed for display. Once they are needed for display, controller 68 will extract the information in the desired order and transfer the information to decoder 72.
If the display requires an interlaced display of one field and then the other, controller 68 will transfer one field and then the other field for appropriate display. If, however, the display is progressively scanned, then controller 68 may supply the information in a different order. Thus, controller 68 represents an example of means for recombining fields and for transferring the recombined fields to a display device. In the alternative, certain of this functionality may be included in decoder 72. Decoder 72 is responsible for taking the information and converting it from a digital form to an analog form in order to allow display of the information. Decoder 72 may also be responsible for generating appropriate control signals that control the display. In the alternative, controller 68 may also supply certain control signals in order to allow proper display and interpretation of the information. As yet another example, a separate device, such as a processor or other device, may be responsible for generating control signals that control the display device so that the information is properly displayed. From the standpoint of the invention, all that is required is that the information be converted from a digital format to a format suitable for use with the display device. Currently, in most cases this will be an analog format, although other display devices may prefer to receive information in a digital format. The display device is then properly controlled so that the information is presented to the viewer in an appropriate fashion so that the scene is interpreted as three-dimensional. This may include, for example, multiplexing one field and then the other on the display device while, simultaneously, operating a shuttering device which allows one eye to view one field and the other eye to view the other field.
In the alternative, any of the display devices previously discussed may also be used with appropriate control circuitry in order to allow presentation to an individual. In general, however, all these display systems are premised on the fact that one eye views a certain portion of the information and another eye views a different portion of the information. How this is accomplished is simply a matter of choice, given the particular implementation and use of the present invention.
Referring next to Figures 3A through 3D, some of the transforms that have been found useful for providing visual clues that are included in the data and interpreted by a viewer as three-dimensional are presented. The examples illustrated in Figures 3A through 3D present transformations in the horizontal direction. Furthermore, the examples illustrate transformation in a single horizontal direction. Such should be taken as exemplary only.
These transformations may also be used in the opposite horizontal direction or in a vertical direction. Finally, combinations of any of the above may also be used. Those of skill in the art will recognize how to modify the transformations presented in Figures 3A through 3D as appropriate.
Referring first to Figure 3A, a skew transform is presented. This transform skews the data in the horizontal or vertical direction. In Figure 3A, a field that is to be transformed is illustrated generally as 74. This field has already been digitized and may be represented by a matrix of data points. In Figure 3A, this matrix is five columns across by three rows down. The transformations used in the present invention will shift or otherwise modify the data of the field matrix. Typical field matrices are hundreds of columns by hundreds of rows. For example, in NTSC video an even or odd field may contain between eight and nine hundred columns and two to three hundred rows. The skew transform picks a starting row or column and then shifts each succeeding row or column by an amount relative to the column or row that precedes it. In the example in Figure 3A, each row is shifted by one data point relative to the row above it. Thus, the transformed field, illustrated generally as 76, has row 78 being unshifted, row 80 being shifted by one data point, and row 82 being shifted by two data points. As illustrated in Figure 3A, the data points of the original matrix are thus bounded by dashed lines 84 and take on a skewed shape. The total shift from the beginning row to the ending row is a measure of the amount of skew added to the frame. When each row is shifted, the data points begin to move outside the original matrix boundaries, illustrated in Figure 3A by solid lines 86. As the data points are shifted, "holes" begin to develop in the field matrix as illustrated by data points 88. Thus, the question becomes what to place in data points 88. Several options may be utilized. In one embodiment, as the data points are shifted they are wrapped around and placed in the holes created at the beginning of the row or column. Thus, in row 80, when the last data point was shifted outside the field matrix boundary it would be wrapped and placed at the beginning of the row.
The process would be similar for any other rows. In the alternative, if the holes opened in the field matrix lie outside the normal visual range presented on the display, then they may simply be ignored or filled with a fixed value, such as black. In the alternative, various interpolation schemes may be used to calculate a value to place in the holes. As previously mentioned, this transformation may be performed in the horizontal direction, the vertical direction, or a combination of both.
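The skew transform and its hole-filling options can be sketched in software (an illustrative model of Figure 3A operating on a field matrix of plain Python lists, not the circuit implementation; the `wrap` and `fill` parameters correspond to the wrap-around and fixed-value options just described):

```python
def skew_field(field, step=1, wrap=True, fill=0):
    """Skew a field matrix: shift each row `step` more points than the row above.

    With wrap=True, points shifted past the edge wrap back to the start
    of the row, as in the embodiment described above; otherwise the
    holes are filled with `fill` (e.g. black) and the shifted-out
    points are discarded.
    """
    out = []
    for r, row in enumerate(field):
        shift = r * step
        if wrap:
            shift %= len(row)
            out.append(row[-shift:] + row[:-shift] if shift else row[:])
        else:
            out.append([fill] * shift + row[:len(row) - shift])
    return out

field = [[1, 2, 3, 4, 5],
         [1, 2, 3, 4, 5],
         [1, 2, 3, 4, 5]]
print(skew_field(field))
# [[1, 2, 3, 4, 5], [5, 1, 2, 3, 4], [4, 5, 1, 2, 3]]
```

A vertical skew would apply the same shifting to columns rather than rows.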
Referring next to Figure 3B, a shifting transform is presented. In the shifting transform, each row or column in the field matrix is shifted by a set amount. In Figure 3B, the unshifted field matrix is illustrated as 90, while the shifted field matrix is illustrated as 92. As indicated in Figure 3B, this again places certain data points outside the boundaries of the field matrix. The data points may be wrapped to the beginning of the row and placed in the holes opened up, or the holes that opened up may be filled with a different value and the data points that fall beyond the boundaries of the field matrix may simply be ignored. Again, various schemes may be used to fill the holes, such as filling with a fixed data point or using a myriad of interpolation schemes.
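A software sketch of the shifting transform of Figure 3B follows (illustrative only; the `wrap` flag selects between wrapping shifted-out points and filling the opened holes with a fixed value):

```python
def shift_field(field, amount=1, wrap=True, fill=0):
    """Shift every row of a field matrix right by the same `amount`.

    Points shifted past the edge either wrap to the start of the row
    or are discarded, with the opened holes filled by `fill`.
    """
    out = []
    for row in field:
        if wrap:
            k = amount % len(row)
            out.append(row[-k:] + row[:-k] if k else row[:])
        else:
            out.append([fill] * amount + row[:len(row) - amount])
    return out

row = [[10, 20, 30, 40, 50]]
print(shift_field(row, amount=2))              # [[40, 50, 10, 20, 30]]
print(shift_field(row, amount=2, wrap=False))  # [[0, 0, 10, 20, 30]]
```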
Figures 3C and 3D illustrate various scaling transformations. Figure 3C illustrates a scaling transformation that shrinks the number of data points in the field matrix while Figure 3D illustrates a scaling transformation that increases the number of data points. This would correspond to making something smaller or larger, respectively. In Figure 3C, the unscaled matrix is illustrated as 96 while the scaled field matrix is illustrated by 98. When a scaling is applied that reduces the number of data points, such as the scaling illustrated in Figure 3C, appropriate data points are simply dropped and the remainder of the data points are shifted to eliminate any open space for data points that were dropped. Because the number of data points is reduced by the scaling, values must be placed in the holes that are opened by the reduced number of data points. Again, such values may be from a fixed value or may be derived through some interpolation or other calculation. In one embodiment, the holes are simply filled with black data points.
Figure 3D represents a scaling that increases the number of data points in a field matrix. In Figure 3D the unscaled field matrix is illustrated by 100 and the scaled field matrix is illustrated by 102. Generally, when a matrix of data points is scaled up, the
"holes" open up in the middle of the data points. Thus, again a decision must be made as to what values to fill in the holes. In this situation, it is typically adequate to interpolate between surrounding data values to arrive at a particular value to put in a particular place. In addition, since the data points grow, any data points that fall outside the size of the field matrix are simply ignored. This means that the only values that must be interpolated and filled are those that lie within the boundaries of the field matrix.
Although the transformations illustrated in Figures 3A through 3D have been applied separately, it is also possible to apply them in combination with each other. Thus, a field may be scaled and then skewed, or shifted and then skewed, or scaled and then shifted. Furthermore, other transformations may also be utilized. For example, transformations that skew a field matrix from the center outward in two directions may be useful. In addition, it may also be possible to transform the values of the data points during the transformation process. In other words, it may be possible to adjust the brightness or other characteristic of a data point during the transformation. Referring next to Figures 4A through 4D, a specific example is presented in order to illustrate another aspect of the various transformations. It is important to note that when a field is shifted or otherwise transformed, it is possible to pick an alignment point between the transformed field and the other field. For example, it may be desirable to align the fields at the center and then allow the skewing, shifting, scaling, or other transforms to grow outward from the alignment point. In other words, when fields are transformed it is generally necessary to pick an alignment point and then shift the two fields in order to align them to the alignment point. This will determine how the values are then used to fill in the holes that are opened up. As a simple example, consider a skew transform which begins not at the first row as illustrated in Figure 3A but at the center row. The rows above the center row may then be shifted one direction and the rows below the center row may then be shifted the other direction. Obviously such a skew transform would be different from a skew transform which began at the top row and then proceeded downward or a skew transform that began at the bottom row and then proceeded upward.
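The center-aligned skew just described may be sketched as follows (illustrative only; in this sketch shifted-out points are discarded and the holes receive a fixed `fill` value, though wrap-around or interpolation could equally be used):

```python
def center_skew(field, step=1, fill=0):
    """Skew outward from the center row of a field matrix.

    Rows above the center shift left and rows below shift right,
    by `step` data points per row of distance from the center.
    """
    center = len(field) // 2
    out = []
    for r, row in enumerate(field):
        d = (r - center) * step
        if d > 0:    # below center: shift right
            out.append([fill] * d + row[:len(row) - d])
        elif d < 0:  # above center: shift left
            out.append(row[-d:] + [fill] * (-d))
        else:        # center row: unshifted
            out.append(row[:])
    return out

field = [[1, 2, 3]] * 3
print(center_skew(field))
# [[2, 3, 0], [1, 2, 3], [0, 1, 2]]
```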
Referring first to Figure 4A, an untransformed frame 104 is illustrated. This frame comprises six rows, numbered 105 through 110, and seven columns. The rows of the frame are first separated into an even field and an odd field. Odd field 112 contains rows 105, 107, and 109 while even field 114 contains rows 106, 108, and 110. Such a function may be performed, for example, by a splitter or other means for separating a frame into a plurality of fields. Splitter 54 of Figure 2 is but one example. Referring next to Figure 4B, the process of transforming one or both fields is illustrated. In the example illustrated in Figure 4B, odd field 112 will be transformed while even field 114 remains untransformed. The untransformed fields are illustrated on the left-hand side of Figure 4B while the transformed fields are illustrated on the right-hand side of Figure 4B. In this case, a scaling transform which increases the number of data points in the horizontal direction is applied to odd field 112. This results in transformed odd field 116. As previously explained in conjunction with Figure 3D, when a transform that expands the number of data points is applied, "holes" will open up between various data points in the field matrix. In Figure 4B, these holes are illustrated by the grey data points indicated by 118. These "holes" may be filled in any desired manner. As previously explained, a good way to fill these holes is to interpolate among the surrounding data points in order to arrive at a value that should be placed therein.
Referring next to Figure 4C, the alignment issues that can be created when a transform is applied are illustrated. Such a situation is particularly apparent when a transform is applied that changes the number of data points in a field. For example, transformed odd field 116 has ten columns instead of the normal seven. In such a situation, as previously explained, it is desirable to pick an alignment point and shift the data points until the field matrices are aligned. For example, suppose it is desired to align the second column of transformed odd field 116 with the first column of even field 114. In such a situation, the fields would be appropriately shifted as shown on the right-hand side of Figure 4C. The edge of the field matrix is then indicated by dashed lines 120 and any data points that fall outside those lines can simply be discarded.
Picking an alignment point and performing the shifting in order to properly align the fields is an important step. Depending on the alignment point selected and the shifting that is performed, very different results may be achieved when the reconstructed simulated three-dimensional frame is displayed. Shifting tends to create visual clues that begin to indicate depth. In general, shifting one direction will cause something to appear to move out of the screen while shifting the other direction will cause something to appear to move into the background of the screen. Thus, depending on the alignment point and the direction of shift, various features can be brought in or out of the display. Furthermore, these effects may be applied to one edge of the screen or the other edge of the screen depending on the alignment point selected. Since most action in traditional programs takes place near the center of the screen, it may be desirable to apply transformations that enhance the three-dimensional effect at the center of the screen.
Referring next to Figure 4D, the process of recombining the fields to create a simulated three-dimensional frame is illustrated. The left-hand side of Figure 4D illustrates transformed odd field 116 that has been cropped to the appropriate size.
Figure 4D also illustrates even field 114. The frame is reconstructed by interleaving the appropriate rows as indicated on the right-hand side of Figure 4D. The reconstructed frame is illustrated generally as 122. Such a reconstruction may take place, for example, when the fields are displayed on a display device. If the display device is an interlaced display, as for example a conventional television set, then the odd field may be displayed after which the even field is displayed in order to create the synthesized three-dimensional frame.
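The recombination by interleaving can be sketched in software (illustrative only; on an interlaced display the "interleaving" happens in scan order on the screen rather than in memory):

```python
def interleave_fields(odd_field, even_field):
    """Recombine an odd and an even field into a full frame.

    Rows alternate: odd rows come from the (possibly transformed)
    odd field, even rows from the even field -- the same interleaving
    an interlaced display performs when it scans the two fields in turn.
    """
    frame = []
    for odd_row, even_row in zip(odd_field, even_field):
        frame.append(odd_row)
        frame.append(even_row)
    return frame

odd = ["row105", "row107", "row109"]   # transformed, cropped field
even = ["row106", "row108", "row110"]  # untransformed field
print(interleave_fields(odd, even))
# ['row105', 'row106', 'row107', 'row108', 'row109', 'row110']
```

The example rows mirror rows 105 through 110 of Figure 4A.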
In various embodiments of the present invention, the synthesized three-dimensional frame is referred to as being constructed from a recombining of the various fields of the frame. The reconstructed frame is then illustrated as being displayed on a display device.
In actuality, these two steps may take place virtually simultaneously. In other words, in the case of an interlaced monitor or display device, one field is displayed after which the other field is displayed. The total display of the two fields, however, represents the reconstructed frame. Similarly, if a two-display system is utilized, then the total frame is never physically reconstructed except in the mind of the viewer. However, conceptually the step of creating the synthesized three-dimensional frame by recombining the fields is performed. Thus, the examples presented herein should not be construed as limiting the scope of the invention, but the steps should be interpreted broadly.
The embodiments presented above have processed a frame and then displayed the same frame. In other words, the frame rate of the output video stream is equal to the frame rate of the input video stream. Technologies exist, however, that either increase or decrease the output frame rate relative to the input frame rate. It may be desirable to employ such technologies with the present invention.
In employing technologies that increase the output frame rate relative to the input frame rate, decisions must be made as to what data will be used to supply the increased frame rate. One of two approaches may be used. The first approach is simply to send the data of a frame more often. For example, if the output frame rate is doubled, the information of a frame may simply be sent twice. In the alternative, it may be desirable to create additional data to send to the display through further transformations. For example, two different transformations may be used to create two different frames which are then displayed at twice the normal frame rate.
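The two approaches to doubling the output frame rate can be sketched together (illustrative only; `alt_transform` stands in for a hypothetical second transformation and is not named in the description above):

```python
def double_frame_rate(frames, alt_transform=None):
    """Produce an output stream at twice the input frame rate.

    With alt_transform=None each frame is simply repeated; otherwise
    each frame is followed by a differently transformed copy, as in
    the second approach discussed above.
    """
    out = []
    for f in frames:
        out.append(f)
        out.append(f if alt_transform is None else alt_transform(f))
    return out

print(double_frame_rate(["F1", "F2"]))  # ['F1', 'F1', 'F2', 'F2']
print(double_frame_rate(["F1", "F2"], alt_transform=str.lower))
# ['F1', 'f1', 'F2', 'f2']
```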
The embodiments and the discussions presented above have illustrated how a single frame is broken down into two or more fields and those fields are then processed and then recombined to create a synthesized three-dimensional frame. An important aspect of the embodiments presented above is that they do not temporally shift either of the fields when performing the synthesis of a three-dimensional frame. In other words, both fields are extracted from a frame, the fields are processed, and then the fields are displayed within the exact same frame. In the alternative, however, with certain transformations it may be desirable to introduce a temporal transformation or a temporal shift into the processing that creates the synthesized three-dimensional frame. Referring next to Figure 5, the concept of temporal shifting is presented.
In Figure 5 an input video stream comprising a plurality of frames is illustrated generally as 124. In accordance with the present invention, a single frame is extracted for processing. This frame is illustrated in Figure 5 as 126. The frame is broken down into a plurality of fields, as for example fields 128 and 130. As previously discussed, although two fields are illustrated, the frame may be broken into more than two fields if desired.
The individual fields are then processed by applying one or more transformations as illustrated in Figure 5 by modification processing components 132 and 134. Modified field 130 is illustrated as field 136. In the case of field 128, however, the embodiment illustrated in Figure 5 introduces a temporal shift as illustrated by delay 138. Delay 138 simply holds the transformed field for a length of time and substitutes a transformed field from a previous frame. Thus, a field from frame 1 may not be displayed until frame 2 or 3. A delayed field, illustrated in Figure 5 as 140, is combined with field 136 to create frame 142. Frame 142 is then placed in the output video stream 144 for proper display. Referring next to Figures 6A through 8B, one embodiment of the present invention is presented. These figures represent circuit diagrams with which one of skill in the art is readily familiar. The discussion which follows, therefore, will be limited to a very high level which discusses the functionality incorporated into some of the more important functional blocks. The embodiment illustrated in Figures 6A through 8B is designed to operate with a conventional display, such as a television, and shuttered glasses which operate to alternately block one eye and then the other so that one field of the frame is seen by one eye and another field of the frame is seen by the other eye.
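Referring again to delay 138 of Figure 5, its buffering behavior may be sketched as a short queue (a conceptual model only, not the actual circuitry; the `filler` value emitted before any earlier frame exists is an assumption of this sketch):

```python
from collections import deque

def temporal_shift(transformed_fields, delay=1, filler=None):
    """Delay a stream of transformed fields by `delay` frames.

    Each output is the transformed field from `delay` frames ago,
    modeling delay 138 of Figure 5; the first `delay` outputs use
    `filler` since no earlier frame exists yet.
    """
    buffer = deque([filler] * delay)
    out = []
    for field in transformed_fields:
        buffer.append(field)
        out.append(buffer.popleft())
    return out

print(temporal_shift(["A1", "A2", "A3", "A4"], delay=2))
# [None, None, 'A1', 'A2']
```

Each delayed output would then be combined with the undelayed field of the current frame, as field 140 is combined with field 136.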
Referring first to Figure 6A, a first part of the circuitry of the embodiment is presented. In Figure 6A, processor 144 is illustrated. Processor 144 is responsible for overall control of the system. For example, processor 144 is responsible for receiving various user input commands, as from a remote control or other input devices, in order to allow user input for various parameters of the system. Such inputs may, for example, adjust various parameters in the transforms that are used to produce the synthesized three-dimensional images. Such an ability allows a user to adjust the synthesized three-dimensional scene to suit his or her own personal tastes. Processor 144 will then provide this information to the appropriate components. In addition, processor 144 may help perform various transformations that are used in producing the synthesized three-dimensional scenes. Figure 6A also illustrates a schematic representation of shuttered glasses 150, which is discussed in greater detail below.
Figure 6B illustrates a block level connection diagram of video board 146. Video board 146 will be more particularly described in conjunction with Figures 7A through 7I below. Video board 146 contains all necessary video circuitry to receive a video signal, digitize the video signal, store and retrieve transformed fields in memory, reconvert transformed fields back to analog signals, and provide the analog signals to the display device. In addition, video board 146 may contain logic to generate control signals that are used to drive the shuttered glasses used by this embodiment to produce a synthesized three-dimensional effect when worn by a viewer.
Block 148 of Figure 6C contains a schematic representation of the drivers which are used to drive the shuttered glasses. The shuttered glasses are illustrated schematically in Figure 6A by block 150. Figures 6D - 6F contain various types of support circuitry and connectors, as for example, power generation and filtering, various ground connectors, voltage converters, and so forth. The support circuitry is labeled generally as 152. Referring next to Figures 7A through 7I, a more detailed schematic diagram of video board 146 of Figure 6B is presented. Video board 146 comprises decoder 154 (Figure 7A), controller 156 (Figure 7B), memory 158 (Figures 7C and 7D), and encoder 162 (Figure 7E). In addition, in Figure 7F an alternate memory configuration is illustrated as block 160. Various support circuitry is illustrated in Figures 7G through 7I.
Block 164 of Figure 7G contains various input circuitry that receives video and other data from a variety of sources. Block 165 of Figure 7G illustrates how the pinouts of video board 146 of Figure 6B translate into the signals of Figures 7A through 7I. Block 166 of Figures 7H and 7I contains output and other support circuitry. Decoder 154 (Figure 7A) is responsible for receiving the video signal and for digitizing the video signal. The digitized video signal is stored in memory 158 (Figures 7C and 7D) under the control of controller 156 (Figure 7B). Controller 156 is a highly sophisticated controller that basically allows information to be written into memory 158 while information is being retrieved from memory 158 by encoder 162 (Figure 7E) for display. The various frames and fields of an input video received by decoder 154 may be identified from the control signals in the video data. The fields may then be separated out for processing and transformation, as previously described.
It should be noted that if transformations occur in the horizontal direction, then the transformation may be applied line by line as the field is received. If, on the other hand, a transformation occurs in the vertical direction, it may be necessary to receive the entire field before transformation can occur. The exact implementation of the transformations will be dependent upon various design choices that are made for the embodiment.
Turning now to controller 156 of Figure 7B, it should be noted that in addition to storing and retrieving information from memory 158, controller 156 also generates the control signals which drive the shuttered glasses. This allows controller 156 to synchronize the shuttering action of the glasses with the display of information that is retrieved from memory 158 and passed to encoder 162 for display on the display device. Encoder 162 (Figure 7E) takes information retrieved from memory 158 and creates the appropriate analog signals that are then sent to the display device.
Alternate memory 160 (Figure 7F), which is more fully illustrated in Figures 8A and 8B, is an alternate memory configuration using different component parts that may be used in place of memory 158. Figure 8A illustrates the various memory chips used by alternate memory 160. Figure 8B illustrates how the pinouts of Figure 7F translate into the signals of Figures 8A and 8B in pinout block 161. Figure 8B also illustrates filtering circuitry 163. In summary, the present invention produces high-quality, synthesized, three-dimensional video. Because the present invention converts a two-dimensional video source into a synthesized three-dimensional video source, it may be used with any video source. The system will work, for example, with television signals, cable television signals, satellite television signals, video signals produced by laser disks,
DVD devices, VCRs, video cameras, and so forth. The use of two-dimensional video as an input source substantially reduces the overall cost of creating three-dimensional video since no specialized equipment must be used to generate an input video source.
The present invention receives the video source, digitizes it, splits each video frame into a plurality of fields, transforms one or more of the fields, and then reassembles the transformed fields into a synthesized, three-dimensional video stream. The synthesized three-dimensional video stream may be displayed on any appropriate display device. Such display devices include, but are not limited to, multiplexed systems that use a single display to multiplex two video streams and coordinate the multiplexing with a shuttering device, such as a pair of shutter glasses worn by a viewer. Additional display options include multiple display devices that allow each eye to independently view a separate display. Other single or multidisplay devices are also suitable for use with the present invention and have been previously discussed.
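Assuming the two fields are the even and odd scan lines of an interlaced frame (a common convention, though the patent requires only "a plurality of fields"), the split/transform/reassemble pipeline might be sketched as:

```python
def split_fields(frame):
    """Split an interlaced frame (a list of scan lines) into two fields."""
    return frame[0::2], frame[1::2]

def interleave(field_a, field_b):
    """Recombine two fields into a single frame, alternating lines."""
    frame = []
    for a, b in zip(field_a, field_b):
        frame.extend([a, b])
    return frame

def synthesize_frame(frame, transform):
    """Digitized 2-D frame in, simulated 3-D frame out: split the frame
    into fields, spatially transform one of them, and recombine.
    `transform` stands in for any of the skew, shift, or scale
    transforms discussed earlier."""
    first, second = split_fields(frame)
    return interleave(first, transform(second))
```

Here only the second field is transformed; transforming the first, or both, would be an equally valid reading of "one or more of the fields".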
The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope. What is claimed is:

Claims

1. A method for creating and displaying a three-dimensional video stream that is synthesized from a two-dimensional video stream, comprising the steps of: receiving a two-dimensional digitized video stream comprising a plurality of video frames that are intended to be displayed sequentially on a display device, each frame comprising a plurality of fields which together contain all digital video information to be displayed for a frame; extracting from said video stream a single two-dimensional digital video frame for processing; separating said plurality of fields of said single two-dimensional digital video frame into at least a first field and a second field; spatially transforming at least one of said first field or said second field in order to produce a simulated three-dimensional video frame when said first field and said second field are recombined and viewed on a display device; and displaying said first field and said second field without temporally shifting either said first field or said second field in order to create said simulated three-dimensional video frame by displaying said first field and said second field on a display device within a single frame such that said first field is viewed by one eye of an individual viewing the display device and said second field is viewed by the other eye of the individual.
2. A method for creating and displaying a three-dimensional video stream as recited in claim 1 wherein said first field and said second field each comprise a plurality of pixels arranged in a matrix having a plurality of rows and columns, and wherein said spatial transformation step skews one field in the horizontal direction relative to the other field by performing at least the steps of: selecting a total skew value; selecting a starting row of pixels; and for each row after said selected starting row, shifting the row relative to the preceding row in a chosen horizontal direction by a predetermined value derived from the total skew value.
3. A method for creating and displaying a three-dimensional video stream as recited in claim 1 wherein said first field and said second field each comprise a plurality of pixels arranged in a matrix having a plurality of rows and columns, and wherein said spatial transformation step skews one field in the vertical direction relative to the other field by performing at least the steps of: selecting a total skew value; selecting a starting column of pixels; and for each column after said selected starting column, shifting the column relative to the preceding column in a chosen vertical direction by a predetermined value derived from the total skew value.
4. A method for creating and displaying a three-dimensional video stream as recited in claim 1 wherein said spatial transformation step shifts one field in the horizontal direction relative to the other field.
5. A method for creating and displaying a three-dimensional video stream as recited in claim 1 wherein said spatial transformation step shifts one field in the vertical direction relative to the other field.
6. A method for creating and displaying a three-dimensional video stream as recited in claim 1 wherein said spatial transformation step scales one field in the horizontal direction relative to the other field.
7. A method for creating and displaying a three-dimensional video stream as recited in claim 1 wherein said spatial transformation step scales one field in the vertical direction relative to the other field.
8. A method for creating and displaying a three-dimensional video stream that is synthesized from a two-dimensional video stream, comprising the steps of: receiving a two-dimensional digitized video stream comprising a plurality of video frames that are intended to be displayed sequentially on a display device, each frame comprising a plurality of fields which together contain all digital video information to be displayed for a frame; extracting from said video stream a single two-dimensional video frame for processing; separating said plurality of fields of said single two-dimensional video frame into at least a first field and a second field; spatially transforming at least one of said first field or said second field using at least one vertical transform that alters the information of a transformed field in the vertical dimension; and displaying a simulated three-dimensional video frame on a display device by alternating said first field and said second field such that said first field is viewed by one eye of an individual viewing the display device and said second field is viewed by the other eye of the individual.
9. A method for creating and displaying a three-dimensional video stream as recited in claim 8 wherein said first field and said second field each comprise a plurality of pixels arranged in a matrix having a plurality of rows and columns, and wherein said spatial transformation step skews one field in the vertical direction relative to the other field by performing at least the steps of: selecting a total skew value; selecting a starting column of pixels; and for each column after said selected starting column, shifting the column relative to the preceding column in a chosen vertical direction by a predetermined value derived from the total skew value.
10. A method for creating and displaying a three-dimensional video stream as recited in claim 8 wherein said spatial transformation step shifts one field in the vertical direction relative to the other field.
11. A method for creating and displaying a three-dimensional video stream as recited in claim 8 wherein said spatial transformation step scales one field in the vertical direction relative to the other field.
12. A method for creating and displaying a three-dimensional video stream as recited in claim 8 further comprising the step of temporally shifting at least one of said first field or said second field in order to introduce a time delay relative to its original location in said two dimensional video stream.
13. A method for creating a three-dimensional video stream from a two-dimensional video stream comprising the steps of: receiving a two-dimensional digitized video stream comprising a plurality of video frames that are intended to be displayed sequentially on a display device, each frame comprising a plurality of fields which together contain all digital video information to be displayed for a frame; extracting from said video stream a single two-dimensional video frame for processing; separating said plurality of fields of said single two-dimensional video frame into at least a first field and a second field; spatially transforming at least one of said first field or said second field using a transform comprising at least one of (a) a skewing transform that skews one field relative to the other, (b) a scaling transform that scales one field relative to the other, and (c) a shifting transform that shifts one field relative to the other; recombining said first field and said second field without temporally shifting either said first field or said second field in order to produce a simulated three-dimensional video frame; and displaying said simulated three-dimensional video frame on a display device by alternating said first field and said second field such that said first field is viewed by one eye of an individual viewing the display device and said second field is viewed by the other eye of the individual.
14. A method for creating a three dimensional video stream as recited in claim 13 wherein said spatial transformation step transforms at least one of said first field or said second field in the horizontal direction.
15. A method for creating a three dimensional video stream as recited in claim 13 wherein said spatial transformation step transforms at least one of said first field or said second field in the vertical direction.
16. A system for creating a three dimensional video stream from a two dimensional video stream, said two dimensional video stream comprising a plurality of video frames intended to be sequentially displayed on a display device, each of said video frames comprising at least a first field and a second field, said system comprising: means for receiving a frame of said two dimensional video stream and for digitizing said frame so that said frame can be further processed by the system; means for separating said frame into at least a first field and a second field, each of said fields containing a portion of the video data in said frame; means for transforming at least one of said first field or said second field using a selected transform that will produce a simulated three-dimensional video frame when said first field and said second field are recombined and displayed on a display device; means for recombining said first field and said second field without temporally shifting either said first field or said second field and for transferring said recombined first field and second field to a display device in order to create said simulated three-dimensional video frame; and means for controlling said display device so that said first field is viewed by one eye of an individual viewing said display device and said second field is viewed by the other eye of the individual.
17. A system for creating and displaying a three-dimensional video stream as recited in claim 16 wherein said selected transform comprises a skew transform that skews one field in the horizontal direction relative to the other field.
18. A system for creating and displaying a three-dimensional video stream as recited in claim 16 wherein said selected transform comprises a skew transform that skews one field in the vertical direction relative to the other field.
19. A system for creating and displaying a three-dimensional video stream as recited in claim 16 wherein said selected transform comprises a shift transform that shifts one field in the horizontal direction relative to the other field.
20. A system for creating and displaying a three-dimensional video stream as recited in claim 16 wherein said selected transform comprises a shift transform that shifts one field in the vertical direction relative to the other field.
21. A system for creating and displaying a three-dimensional video stream as recited in claim 16 wherein said selected transform comprises a scaling transform that scales one field in the horizontal direction relative to the other field.
22. A system for creating and displaying a three-dimensional video stream as recited in claim 16 wherein said selected transform comprises a scaling transform that scales one field in the vertical direction relative to the other field.
23. A system for creating a three dimensional video stream from a two dimensional video stream, said two dimensional video stream comprising a plurality of video frames intended to be sequentially displayed on a display device, each of said video frames comprising at least a first field and a second field, said system comprising: means for receiving a frame of said two dimensional video stream and for digitizing said frame so that said frame can be further processed by the system; means for separating said frame into at least a first field and a second field, each of said fields containing a portion of the video data in said frame; means for transforming at least one of said first field or said second field using a selected vertical transform that operates to transform the video data in the vertical direction so that said first field and said second field will produce a simulated three-dimensional video frame when said first field and said second field are recombined and displayed on a display device; means for recombining said first field and said second field and for transferring said recombined first field and second field to a display device in order to create said simulated three-dimensional video frame; and means for controlling said display device so that said first field is viewed by one eye of an individual viewing said display device and said second field is viewed by the other eye of the individual.
24. A system for creating and displaying a three-dimensional video stream as recited in claim 23 wherein said selected transform comprises a skew transform that skews one field in the vertical direction relative to the other field.
25. A system for creating and displaying a three-dimensional video stream as recited in claim 23 wherein said selected transform comprises a shift transform that shifts one field in the vertical direction relative to the other field.
26. A system for creating and displaying a three-dimensional video stream as recited in claim 23 wherein said selected transform comprises a scaling transform that scales one field in the vertical direction relative to the other field.
27. A system for creating and displaying a three-dimensional video stream as recited in claim 23 further comprising means for temporally shifting at least one of either said first field or said second field.
28. A system for creating a three dimensional video stream from a two dimensional video stream, said two dimensional video stream comprising a plurality of video frames intended to be sequentially displayed on a display device, each of said video frames comprising at least a first field and a second field, said system comprising: means for receiving a frame of said two dimensional video stream and for digitizing said frame so that said frame can be further processed by the system; means for separating said frame into at least a first field and a second field, each of said fields containing a portion of the video data in said frame; means for transforming at least one of said first field or said second field using at least one of (a) a skewing transform that skews one field relative to the other, (b) a scaling transform that scales one field relative to the other, and (c) a shifting transform that shifts one field relative to the other in order to produce a simulated three-dimensional video frame when said first field and said second field are recombined and displayed on a display device; means for recombining said first field and said second field without temporally shifting either said first field or said second field and for transferring said recombined first field and second field to a display device in order to create said simulated three-dimensional video frame; and means for controlling said display device so that said first field is viewed by one eye of an individual viewing said display device and said second field is viewed by the other eye of the individual.
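By way of illustration, the horizontal-skew steps recited in claim 2 (select a total skew value, select a starting row, then shift each subsequent row relative to the preceding one by a value derived from the total skew) might be realized as follows. The even division of the total skew across the remaining rows, the rounding, and the zero padding are all assumptions; the claim says only that the per-row value is "derived from the total skew value".

```python
def skew_horizontal(field, total_skew, start_row=0, fill=0):
    """Skew a field horizontally: rows at or before `start_row` are
    left alone, and each later row is shifted right relative to the
    previous one so that the shift accumulates to `total_skew` by the
    last row."""
    rows, cols = len(field), len(field[0])
    steps = rows - 1 - start_row
    per_row = total_skew / steps if steps > 0 else 0
    out = []
    for i, row in enumerate(field):
        shift = int(round(max(0, i - start_row) * per_row))
        out.append(([fill] * shift + row)[:cols])  # vacated pixels zero-padded
    return out
```

The vertical skew of claims 3 and 9 is the same idea applied column-wise, with a starting column in place of a starting row.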
PCT/US1997/023941 1996-12-27 1997-12-24 System and method for synthesizing three-dimensional video from a two-dimensional video source WO1998029860A1 (en)


Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US3414996P 1996-12-27 1996-12-27
US60/034,149 1996-12-27
US99706897A 1997-12-23 1997-12-23
US08/997,068 1997-12-23

Publications (1)

Publication Number Publication Date
WO1998029860A1 true WO1998029860A1 (en) 1998-07-09

Family

ID=26710614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/023941 WO1998029860A1 (en) 1996-12-27 1997-12-24 System and method for synthesizing three-dimensional video from a two-dimensional video source

Country Status (7)

Country Link
EP (1) EP1012822A1 (en)
JP (1) JP2001507890A (en)
CN (1) CN1244278A (en)
AU (1) AU5720698A (en)
BR (1) BR9713629A (en)
CA (1) CA2276190A1 (en)
WO (1) WO1998029860A1 (en)


Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10742965B2 (en) 2001-01-23 2020-08-11 Visual Effect Innovations, Llc Faster state transitioning for continuous adjustable 3Deeps filter spectacles using multi-layered variable tint materials
US8750382B2 (en) 2001-01-23 2014-06-10 Kenneth Martin Jacobs System and method for calculating 3Deeps action specs motion estimation from the motion vectors in an MPEG file
US9781408B1 (en) 2001-01-23 2017-10-03 Visual Effect Innovations, Llc Faster state transitioning for continuous adjustable 3Deeps filter spectacles using multi-layered variable tint materials
BRPI0823512A2 (en) * 2007-04-12 2013-11-26 Thomson Licensing VIDEO ENCODING AND DECODING
CN102238313A (en) * 2010-04-22 2011-11-09 扬智科技股份有限公司 Method for generating image transformation matrix as well as image transformation method and device
US8421847B2 (en) * 2010-05-21 2013-04-16 Mediatek Inc. Apparatus and method for converting two-dimensional video frames to stereoscopic video frames
KR101323772B1 (en) * 2010-08-25 2013-11-04 (주)네오위즈게임즈 Method and Apparatus for adapting of 3D Application of Portable Device
CN102421003A (en) * 2011-11-21 2012-04-18 宝利微电子系统控股公司 Image processing method and device
CN102427550A (en) * 2011-12-09 2012-04-25 彩虹集团公司 Method for converting 2D into 3D

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5510832A (en) * 1993-12-01 1996-04-23 Medi-Vision Technologies, Inc. Synthesized stereoscopic imaging system and method


Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10131401A1 (en) * 2001-06-27 2003-01-09 4D Vision Gmbh Transformation of multiple views of a composite picture
EP1459569A1 (en) * 2001-12-28 2004-09-22 Electronics and Telecommunications Research Institute Stereoscopic video encoding/decoding apparatuses supporting multi-display modes and methods thereof
EP1459569A4 (en) * 2001-12-28 2010-11-17 Korea Electronics Telecomm Stereoscopic video encoding/decoding apparatuses supporting multi-display modes and methods thereof
US7126990B2 (en) 2002-03-25 2006-10-24 Silicon Integrated Systems Corp. Method and apparatus for controlling a stereo video display with non-stereo video source
GB2400259B (en) * 2003-03-29 2006-03-08 Atelier Vision Ltd Image processing
GB2400259A (en) * 2003-03-29 2004-10-06 Atelier Vision Ltd Improving depth perception in 2-D representations of 3-D scenes
DE102007009022B3 (en) * 2007-02-23 2008-05-29 Siemens Ag Image object generating method, involves generating image objects by using filters on scanning points, and determining filter lengths of respective filters in horizontal image axis such that lengths are smaller than distance between points
US8629899B2 (en) 2009-08-06 2014-01-14 Qualcomm Incorporated Transforming video data in accordance with human visual system feedback metrics
US8878912B2 (en) 2009-08-06 2014-11-04 Qualcomm Incorporated Encapsulating three-dimensional video data in accordance with transport protocols
US9083958B2 (en) 2009-08-06 2015-07-14 Qualcomm Incorporated Transforming video data in accordance with three dimensional input formats
US9131279B2 (en) 2009-08-06 2015-09-08 Qualcomm Incorporated Preparing video data in accordance with a wireless display protocol
CN101930626A (en) * 2010-08-04 2010-12-29 北京大学 Method and system for computing three-dimensional space layout based on scattered perspective image
EP2509328A2 (en) 2011-04-08 2012-10-10 Vestel Elektronik Sanayi ve Ticaret A.S. Method and apparatus for generating a 3d image from a 2d image

Also Published As

Publication number Publication date
CA2276190A1 (en) 1998-07-09
EP1012822A1 (en) 2000-06-28
BR9713629A (en) 2001-07-24
JP2001507890A (en) 2001-06-12
CN1244278A (en) 2000-02-09
AU5720698A (en) 1998-07-31


Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 97181060.5

Country of ref document: CN

AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT

121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref document number: 2276190

Country of ref document: CA

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: PA/a/1999/006050

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 1998 530238

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 57206/98

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 1997953466

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1997953466

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1997953466

Country of ref document: EP