METHODS AND APPARATUS FOR IMAGE PROCESSING
The present invention is directed to methods and apparatus for image processing, and more particularly in its preferred aspects is directed to methods and apparatus for detection and enhancement of image edges and optical wavelet transform processing.
It is an object of the present invention to provide improved methods and apparatus for electronic image edge enhancement, spatial frequency enhancement, or wavelet processing. It is a further object to provide methods and apparatus for obtaining electronic images of high definition. These and other objects will become more apparent from the following detailed description and the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic illustration of an embodiment of an edge enhancing electronic camera system in accordance with the present invention which utilizes sequential images of different spatial resolution using the same optical path;
Figure 1a is a schematic illustration of another embodiment of an edge enhancing electronic camera system in accordance with the present invention which utilizes two simultaneously-imaging coincidentally positioned imager arrays having optical paths of different spatial resolution;
Figure 1b is a schematic illustration of another embodiment of an edge enhancing electronic camera system in accordance with the present invention, which is particularly adapted for edge enhancement and identification for stereoscopic image analysis;
Figure 2 is an illustration of the rectangular imager pixel array of the electronic camera system of Figure 1;
Figure 2a is a schematic illustration of an alternative imager array for the electronic camera system of Figure 1, which is particularly adapted for charge-mode, on-chip output differencing;
Figure 3 is a graphical cross-sectional representation of the point spread functions of the camera system of Figure 1 at a first resolution and at a second resolution, with respect to an individual pixel of the imager array of Figure 2 and an adjacent pixel;
Figure 4 is a graphical representation of respective blur circles at the individual pixels of Figure 3, at a first resolution, and at a second resolution, suitable for differencing methods and apparatus for providing edge enhancement in accordance with the present invention;
Figure 5 is a graphical representation of the edge enhancement image output function of an individual pixel of Figure 3, produced by the differencing of image responses of the pixel at the first resolution and the second resolution of Figure 3;
Figure 6 is a schematic illustration of an embodiment of an electronic focus control system which may be utilized with an edge enhancing electronic camera system like that of Figure 1 or Figure 1b;
Figure 7 is a schematic illustration of another embodiment of an electronic focus/defocus system which may be utilized with an edge enhancing electronic camera system like that of Figure 1 or Figure 1b;
Figure 8 is a schematic illustration of an embodiment of an anamorphic focus control system which may be utilized with an edge enhancing electronic camera system like that of Figure 1, Figure 1a, or Figure 1b to provide edge gradient vector information at each pixel;
Figure 9 is an illustration of a pixel-nulling electronic stage and camera system for television and motion picture special effects;
Figure 9a is a partial perspective view, partially broken away, of a submicrometer reflective grating polarizing beamsplitter for use in the cameras of Figure 9 or 9b;
Figure 9b is an illustration of a polarization resolving electronic camera for capturing simultaneous images of different polarization resolution, for producing television and motion picture special effects such as realistic superposition of images;
Figure 10 is a schematic illustration of a high resolution step-and-repeat camera which utilizes multiple, linearly-scanned image frames to fill an image pixel lattice;
Figure 10a is an illustration of a positionally displaced, pixel lattice-pattern-filling pixel imager array which may be linearly scanned at the focal plane of the camera system of Figure 10 to provide a high-definition image and/or a high-definition edge-enhanced image the approximate size of the imager array;
Figure 11 is an illustration of a linear-scanning lattice-filling imager which is particularly adapted for linear lattice-filling, multiple-frame scanning of color images;
Figure 12 is an illustration of a fast-skip-scanning, lattice-pattern-filling imager which may be used to provide fine-resolution electronic images and/or fine resolution edge-enhanced images of relatively large size, substantially exceeding the size of the imager;
Figure 13 is an illustration of an X-ray imaging system which utilizes a fast-skip-scanning imager like that of Figure 12 to provide X-ray images and edge-enhanced X-ray images of extraordinary resolution;
Figure 13a is a cross-sectional illustration of a fast-skip, linear-scanning, image-lattice-filling imager like that of Figure 12, having on-chip image resolution modification capability for X-ray imaging
edge enhancement in an X-ray system like that of Figure 13;
Figure 14 is a schematic illustration of an embodiment of an imaging system for converting photographic images such as still photographs or motion picture film, to electronic images at high resolution, which may include edge enhancement;
Figure 15 is an illustration of the scanning pattern of an adjustable-resolution, linear-scanning, image-lattice-filling imager for use in a scanning camera like that of Figure 10;
Figures 16 and 17 are schematic illustrations of birefringent high resolution image steering systems;
Figure 18 is a graphical illustration of a spatial bandpass wavelet, together with positive and negative components of such wavelet;
Figures 19a and 19b are, respectively, normal X-rays of a broken hand, and a "Mexican-Hat" enhanced image thereof; and
Figures 20a and 20b are, respectively, a mammogram illustrating microcalcification incident to carcinoma, and a "Mexican-Hat" enhanced image thereof.
DESCRIPTION OF THE INVENTION
Various aspects of the present invention are directed to methods and apparatus for image edge enhancement and related image processing operations. In this regard, edge detection is regarded as the extreme or maximum degree of edge enhancement, and may be referred to herein as a form of edge enhancement. Generally in accordance with such methods, an object scene is imaged on a photosensor imager, which is desirably a discrete sensor element array comprising a plurality of discrete photosensor image zones or pixels of an image lattice responsive to local image intensity, to provide an image at a first optical resolution. When utilizing a photosensor comprising discrete pixel elements, the output of the photosensor pixels of an
imager may form an image frame. The pixel locations of an imager typically form a regular image lattice, which conventionally may be a regular rectangular (e.g., square) or hexagonal array. An image lattice may consist of a single image frame taken at one time, or may comprise an integrated or interdigitated array of image frames obtained at different times, which together form a complete image in which all of the lattice pixel positions are filled with image data, as will be discussed hereinafter. Further in accordance with the present methods, the object scene is also imaged on a photosensor imager such as a discrete sensor array to provide an image of the object scene at a second optical resolution, and the images of different optical resolution are differenced to provide an edge-enhanced image of the object scene. The images are differenced in registration, which may be accomplished by imaging at substantially the same image zones or pixel locations, so that the first image is differenced from the second image at corresponding image zones or pixel positions (e.g., on an image frame pixel-by-pixel basis), to provide an edge-enhanced image of the optical scene. The differencing of the images at image zones or individual pixel locations of the image lattice at different resolutions is an important feature of the present disclosure. By "pixel" is meant a discrete photosensor aperture. By "differencing" is meant that one image at one optical resolution is subtracted on a corresponding image zone or pixel-by-pixel basis from an image of the same object at a second optical resolution different from the first optical resolution. Preferably an image at lower spatial resolution is subtracted from the corresponding image at a higher spatial resolution, but the subtraction may be carried out in either order to provide image enhancement with proper attention to the sign of the resulting data. The differencing may be carried out at different weighting factors for each
image, depending on the degree of enhancement or other processing desired. The subtraction may be carried out digitally after converting the respective images at the first resolution and at the second resolution to digital values representing the image intensity at each respectively corresponding pixel location of the image. In converting the photosensor output to digital values, the response of each pixel, and the overall optical camera response, may be corrected for accuracy in accordance with conventional practice. In this regard, for example, a "two point" correction may be made for both the dark current and the sensitivity of each individual pixel of the imager (the dark current correction being subtracted from the pixel measurement, and the sensitivity being a multiplication factor for the pixel measurement data); the correction factors may be empirically determined for each pixel under uniform illumination and under dark conditions, in accordance with conventional practice. The differencing may also be carried out on an analog basis (off-chip or on-chip), by subtracting charge packets, capacitance, current or voltage representations of image intensity, as may be desired in respect of particular system design considerations. The pixel-wise differencing of images of different resolution can quickly and simply perform a difference of Gaussian-type approximation of a Laplacian of Gaussian edge enhancement operation. Depending on the point-spread functions selected for the different images, other operations may also be performed. Because the differencing may be carried out by subtracting image intensity values measured by the same pixels of an imager, the difference output can be normalized for the dark current and sensitivity of each individual pixel. When performed digitally in the conventional manner, such difference of Gaussian calculations are very computationally intensive.
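Expressed digitally, the two-point correction and pixel-wise differencing described above reduce to a few array operations. The following Python sketch is illustrative only; the array shapes, calibration values, and function names are assumptions for the example, not specifics of the disclosure:

```python
import numpy as np

def two_point_correct(raw, dark, sensitivity):
    # "Two point" correction: subtract the per-pixel dark current,
    # then apply the per-pixel sensitivity multiplication factor,
    # both empirically determined under dark and uniform-illumination
    # conditions as described above.
    return (raw - dark) * sensitivity

def difference_images(high_res, low_res, w_high=1.0, w_low=1.0):
    # Subtract the lower-resolution frame from the higher-resolution
    # frame at corresponding pixel positions; the weights support the
    # different weighting factors mentioned in the text.
    return w_high * high_res - w_low * low_res

# Illustrative use with small synthetic frames (values hypothetical):
rng = np.random.default_rng(0)
dark = rng.uniform(0.0, 2.0, (4, 4))        # per-pixel dark current
sens = rng.uniform(0.95, 1.05, (4, 4))      # per-pixel sensitivity factor
sharp = two_point_correct(rng.uniform(10, 200, (4, 4)), dark, sens)
blurred = two_point_correct(rng.uniform(10, 200, (4, 4)), dark, sens)
edge_frame = difference_images(sharp, blurred)
```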
The methods and apparatus of the present invention are useful, inter alia, in analysis/enhancement of images for bandwidth compression, machine vision, medical X-rays, microscopy, robotics and artificial sight. For example, edge detection is used as a first step in constructing primal sketches for machine vision [D. Marr and E. Hildreth, "Theory of edge detection," Proc. R. Soc. London Ser. B 207, 187-217 (1980); D. Marr, Vision (Freeman, San Francisco, Calif., 1982); M. Brady, "Computational approaches to image understanding," Comput. Surv. 14, 3-71 (1982); E. C. Hildreth, "The detection of intensity changes by computer and biological vision systems," Comput. Vision Graphics Image Process. 22, 1-27 (1983); D. H. Ballard and C. M. Brown, Computer Vision (Prentice-Hall, Englewood Cliffs, N.J., 1982)]. The methods and apparatus of the present invention are also useful for image processing over the wide spectral range for which lens elements and imagers are available, from the ultraviolet to the far infrared. The methods and apparatus find particular practical utility in the visible light range (e.g., 400-700 nanometer wavelengths), although there are significant image processing benefits provided for lens and imager systems which operate in the near ultraviolet (e.g., 150-400 nanometer wavelengths), near infrared (e.g., 1 to 15 micron wavelength), and in the far infrared (e.g., 15 to 100 micron wavelength) spectral regions. As will be discussed in more detail, such imaging methods and apparatus may operate at a selected wavelength, such as through the use of filters, or over a range of wavelengths, depending on the spectral range of the desired image, as well as the information and imaging desired. The methods and apparatus of the present invention are also particularly valuable for use in portable or miniaturized equipment which cannot for reasons of cost, weight or size limitations accommodate adequate digital computational
systems to provide real-time edge enhancement performance at high resolution.
As indicated, positionally corresponding image zones or discrete pixels of respective object images at different resolutions are subtracted with respective weighting functions depending on the degree of image enhancement or other processing desired. For simple edge detection, substantially equal weighting functions may be used, and the image pixels obtained at higher resolution may be directly subtracted from the positionally corresponding image pixels imaged at respectively lower resolution, to produce an edge-enhanced image function for the object scene. The zero crossings of the edge-enhanced image function may subsequently be identified to detect the edges of the image in accordance with conventional practice. For example, a relatively simple method is to identify as image edge points those pixels of the image after differencing which are within a predetermined (relatively small) range of effectively zero enhanced image intensity (accounting for "fat" zero accommodation of negative numbers or common baselines), and which are also adjacent at least one pixel of higher (positive) value and at least one pixel of lower (negative) value. More accurate methods, as described in various of the references cited herein, may be performed which utilize gradients or slopes of the enhanced image data to more accurately locate the true image edges, to fractions of a pixel. Where desired, equal weighting factors for the images of different resolution which are differenced may be achieved as a practical matter (with optimization of the correction of pixel sensitivity and dark current variance) by utilizing substantially similar exposures for both the higher resolution image and the lower resolution image, preferably using the same exposure time and lens f-stop, for reasons which will be discussed hereinafter. Substantially equal weightings
of differenced images of different resolution also provide an enhanced image which has high data content with relatively low signal bandwidth, which is useful for transmitting image data (such as for deep space applications like a planetary probe or lander) where bandwidth and transmission time are restricted.
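The simple zero-crossing rule described above (a near-zero pixel adjacent at least one positive and one negative pixel) may be sketched as follows; the tolerance `eps` is a hypothetical threshold standing in for the "predetermined (relatively small) range":

```python
import numpy as np

def zero_crossing_edges(dog, eps=1.0):
    # Mark pixels of the differenced ("DoG") frame which are within a
    # small range of zero and adjacent to at least one clearly positive
    # and one clearly negative neighbor.
    h, w = dog.shape
    edges = np.zeros((h, w), dtype=bool)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if abs(dog[y, x]) <= eps:
                nbrs = dog[y - 1:y + 2, x - 1:x + 2]
                edges[y, x] = nbrs.max() > eps and nbrs.min() < -eps
    return edges
```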
For lesser degrees of edge enhancement, the relatively defocused image may be subtracted from the relatively focused image at the respective pixel locations, with a weighting factor for the defocused image of less than 1 (such as in the range of from about 0.1 to about 0.9) in comparison to a weighting factor of 1 for the relatively focused image. Such edge enhancement is particularly useful for enhancing the "sharpness" of edges of images which may be degraded or inherently lacking in "sharpness". For example, infrared images (e.g., "night vision" systems for surveillance or automated highway vehicle control systems) and X-ray images may have relatively poor edge contrast and localization, which may be improved for visualization or machine vision processing by subtracting a lower resolution image from the highest resolution image at a fractional weighting factor in accordance with the present invention. Fractional weighting factors are also useful when a predetermined degree of lateral inhibition is desired to correct for optical system diffraction effects or other limitations. For example, methods and apparatus in accordance with the present invention may be used to provide optical storage devices having very high-speed read-out systems in which individual binary optical data "bits" of the optical storage medium are detected by imaging in parallel using an array of imager pixels (one pixel to detect each optical "bit" site), with correction for diffraction effects from adjacent optical data "bits", as will be described in more detail hereinafter. Partial enhancement, or selected amounts of lateral
inhibition, may be readily accomplished by methods such as using a lower total exposure for the relatively unfocused image, or by digital calculation if the respective images are digitized. In-registration differencing of images of different resolution provides significant image-processing capability in the present methods and apparatus. In-registration images of different resolution may be provided in a wide variety of ways. For example, an image frame obtained at a first resolution which has maximum resolution for the lens and electronic imager system being used, may be differenced from a defocused image of the same scene made without substantially moving the camera, the degree of defocusing being determined at least in part by the desired scale of edge enhancement. The degree of defocusing of a lens/imager system may be characterized by a defocus parameter U, which is approximated as
U = Δ/(2λF²)
where: Δ is the absolute value of the difference between the image plane and photosensor plane distances from the lens, 1/(2λF) is the lens coherent cut-off frequency at wavelength λ, F = f/D is the f-number of the lens system, and f and D are the lens system focal length and aperture diameter respectively [e.g., see C. Fales, F. Huck and R. Samms, "Imaging System Design for Improved Information Capacity", Applied Optics Vol. 23, pp. 872 (1984)]. In various embodiments, the defocus parameter of the relatively defocused image will be in the range of from about 1 to about 10, such as in the range of from about 2 to about 7. Registration of the differenced images is preferably provided by maintaining the same relative camera position, as indicated, but may also be provided in other ways, such as by scaling and positioning of images into pixel-matching configurations, and/or by using multiple cameras which are preferably aligned in object registration at the image plane.
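Using the approximation above (with Δ the plane displacement and λ the wavelength), the defocus parameter is straightforward to evaluate. The numerical values in this minimal sketch are illustrative assumptions only:

```python
def defocus_parameter(delta_mm, wavelength_mm, f_number):
    # U = delta / (2 * lambda * F^2), per the approximation above.
    return delta_mm / (2.0 * wavelength_mm * f_number ** 2)

# A 0.05 mm image/sensor plane displacement at 550 nm (5.5e-4 mm)
# with an f/4 lens gives U of roughly 2.8, inside the range of
# about 2 to about 7 cited above.
U = defocus_parameter(delta_mm=0.05, wavelength_mm=5.5e-4, f_number=4.0)
```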
While resolution may be varied by defocusing, images of relatively higher and lower resolution may also be obtained by other methods, such as by varying the depth of field of an image scene, by introducing astigmatism or selected distortion such as spherical distortion, by reducing the lens optical transfer function, and/or by blurring the image in other ways such as selective blurring at the focal plane.
Conventional focus control systems may be used to vary image resolution, such as mechanical focus adjustment systems (e.g., axial movement of one or more lens components, and/or the imager). The defocusing may also, for certain edge enhancement and edge detection operations, be carried out by decreasing the f-number of the optical system to decrease the depth of field (and depth of focus), by temporarily or selectively interposing a resolution-degrading optical component in the lens system, by activating an electronically variable optical component in the optical system, or by differencing images from separate aligned-in-registration imager systems having different resolutions. Typically the difference in resolution between the subtracted images, which may be variously defined in terms of point spread function, blur circle, spatial frequency response, optical transfer function (or its real part, the modulation transfer function), or defocus parameter, etc., will be selected within a specific range to provide the desired image processing function.
In addition, while radially symmetrical optical resolution functions are readily implemented and are particularly useful for conventional edge enhancement operations, asymmetrical (e.g., astigmatic or anamorphic) variation of optical resolution parameters is also particularly useful for performing specific edge enhancement tasks, such as pointwise edge gradient or vector determination. Different depth-of-field resolutions may be applied to three-dimensional edge
enhancement and detection, as will be described in more detail hereinafter.
In accordance with apparatus aspects of the invention, edge enhancement apparatus is provided comprising electronic imaging means for providing an electronic image defined by optical image intensity at discrete pixel locations, lens means for forming an image of an object scene at the electronic imaging means at a first optical resolution and at a second optical resolution different from the first optical resolution, and means for differencing the first electronic image from the second electronic image at respectively corresponding image pixel locations. The edges of an electronic image, such as a high-speed and/or high definition CCD image, are enhanced/detected as the second derivative of intensity variation of the image by a difference of Gaussian (circularly symmetrical, or unidirectional) operation performed by subtracting a first relatively focused image on a pixel-by-pixel basis, from a second, relatively unfocused image.
Preferably, the corresponding image pixel intensity measurements which are differenced are performed by the same imager photosensor sites, although corresponding image lattice pixel measurements which are differenced may be made by different imager photosensor sites when the imager is scanned, as will be discussed.
The electronic imaging means may be any suitable photosensor system such as a surface channel or buried channel charge coupled device ("CCD") or other charge transfer device ("CTD"), a charge injection device ("CID"), or other photosensor array. Although imagers designed for the visible spectrum (e.g., 400-700 nanometers in wavelength), such as silicon imagers, are commonly available, the present methods and apparatus are useful for edge enhancement and detection in the infrared (e.g., 0.7 to 15 microns), far infrared (e.g., 15-25 microns, or 15-100 micron wavelength) and
ultraviolet (e.g., 100-400 nanometer wavelength) spectral ranges. Various types of imagers are conventionally known and used over various such light wavelength ranges, and such imagers may be used to provide edge enhancement and detection at such wavelengths in accordance with the present invention. Charge coupled devices having buried channel, and preferably peristaltic buried channel, charge transfer channels are particularly preferred because of their high speed operation. Such CCDs may be provided which are capable of very high frame rates, good pixel uniformity, and low noise. Examples of suitable CCD imagers for the visible spectrum are described in copending U.S. application Serial No. 07/698,315, incorporated herein by reference.
High speed Charge Injection Devices (CIDs) may also be used as electronic imagers for large-area arrays or rectangular scanning arrays. Such devices may comprise an X-Y addressable array of photosensitive capacitor elements and may be fabricated in large device formats similar to large CCD arrays. Image readout may be performed by storing pixel image charge on one (X or Y column or row) electrode and subsequently measuring the potential produced at the other electrode upon transfer of the charge packet to that electrode [see
Zarnowski et al., SPIE Vol. 1447 (1991), Charge Coupled Devices and Solid State Optical Sensors II, pp. 191-201; Carbone et al., same SPIE Vol. 1447, pp. 229-237 for CID imager description]. CTDs and CIDs may also be designed to perform image subtraction at the local pixel level on a charge mode basis at each pixel. Such imagers may desirably have an effective fill factor of at least 75 percent, particularly when provided with a lenticular lens array or a lenslet array (diffractive optic or graded index), or may have a small fill factor of less than 25 percent, particularly when fine-scanning systems are used in which an image lattice is obtained
by assembling successive or concurrent (multiple imager) non-coincident image frames.
Because edge enhancement may be carried out at each individual photosensor site independently of the image lattice configuration, such imagers may have a variety of pixel array configurations, including square or rectangular lattice arrays and hexagonal lattice arrays, as well as line-scan and fine-scan arrays, as will be described in more detail hereinafter. The pixel sizes are preferably as uniform as possible in the imager. Nominally square, hexagonal, rectangular or circular pixels may desirably have a width, depending on the imager resolution desired, in the range of from about 2 microns to about 50 microns for imagers in the visible range. UV and IR imager pixels may also be sized as appropriate. Square pixels in a rectangular imager array which are from 5 to 25 microns wide are optimal for many uses. Rectangular pixels preferably have a length-to-width ratio not exceeding 2.0. It is desirable that the design of the optical lens system and the electronic imager system be optimized to provide an appropriate balancing of resolution and aliasing [C. S. Bell, Lens Evaluation for Electronic Photography, SPIE Vol. 1448 (1991), pp. 59-67]. In this regard, the limiting resolution of the image sensor and its inherent aliasing characteristics should be designed to correspond with the characteristics of the lens system. A large amount of image information is lost in conventional imaging systems as a result of aliasing and/or blurring, even under optimal conditions. Much more information can be lost under non-optimal conditions. A minimum of two pixels is needed to detect modulated contrast in an image (although it is noted that each individual lattice pixel can produce an image edge enhancement value of the image at the image lattice position in accordance with the present invention), so the Nyquist frequency limit of the sensor array is approximately the inverse of twice the pixel repeat distance. Aliasing, an interference effect, may occur in imagers with image information at frequencies (spatial resolutions) higher than the Nyquist limit. Optical blurring is conventionally used with single frame imagers to reduce (or match) the lens spatial frequency resolution to that of the imaging system, and for systems with color sampling, optical prefilters which provide different amounts of blur or smear for different colors may be utilized to optimize the lens system to the image at each wavelength sensitivity.
A diffraction-limited lens system distributes the light from a single object source point in accordance with a point spread function at the image plane, which comprises a central region containing most of the energy, sometimes referred to as an Airy disk, surrounded by increasingly fainter rings. A commonly used criterion defines a lens to be at its resolution limit when the peak of one image point spread function falls on the first zero of another point spread function of two respective points to be resolved. When matched to an image array, the lens should be able to resolve two real object points separated by the imager pixel pitch, which is met as a practical matter when the lens provides approximately a 50 percent modulation transfer function (MTF) value at the Nyquist frequency limit of the imager as determined by the pixel size and spacing.
Blur circle diameter, which is defined herein as the width of the projection at the image sensor plane (in the direction under consideration, in the absence of rotational symmetry) where the intensity of the point spread function has fallen to 50 percent of its maximum value, is a practical measure of lens system resolution. More rigorous definition of lens resolution is provided by the optical transfer function (OTF). The MTF is the real component of the OTF, and is a sufficient description of most conventional photographic lenses, except those which have significant aberrations such as coma and astigmatism. The blur circle diameter, Db, may be approximated for a variety of conventional lens systems by the empirical relationship:
Db (in millimeters) = X/F where X is in the range of about 0.55 to about 0.60 (e.g., 0.57) and where F, in cycles per millimeter, is the frequency corresponding to the 50 percent modulation point of the lens system modulation transfer function curve. Accordingly, an effective correspondence of lens and imager resolution characteristics for system design purposes may, for example, be achieved when
Pixel width (microns) = W/F where W is in the range of from about 400 to about 450 (e.g., 430), and F is as previously defined. For electronic camera systems, a wide variety of lens designs may be used, including Petzval, Cooke, Split-Crown, Triplet, Tessar, Rogmar/Aviar, Double Gauss, telephoto and retrofocus lens designs, as well as more modern lens systems. The lens systems may also have either a clear or shaded aperture. The spatial response of the imaging system is the convolution of the spatial response of the lens system and the spatial response of the imager pixel lattice. Transmittance shading reduces the lens transmittance from the center to the edge of the lens aperture, but may improve image fidelity in various lens-imager systems. Binary optic and color corrected binary optic/single lens element combinations may also be used, particularly for cameras in which cost is a primary factor.
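The empirical matching relationships above can be combined into a small design check. The sketch below assumes the constants X ≈ 0.57 and W ≈ 430 from the ranges given, together with a hypothetical 50 percent MTF frequency:

```python
def nyquist_cycles_per_mm(pixel_pitch_um):
    # Sensor Nyquist limit: the inverse of twice the pixel repeat distance.
    return 1000.0 / (2.0 * pixel_pitch_um)

def blur_circle_mm(f50_cycles_per_mm, x=0.57):
    # Empirical blur circle diameter: Db = X / F, with X ~ 0.55-0.60.
    return x / f50_cycles_per_mm

def matched_pixel_width_um(f50_cycles_per_mm, w=430.0):
    # Lens/imager correspondence: pixel width (microns) = W / F.
    return w / f50_cycles_per_mm

f50 = 43.0                              # hypothetical 50% MTF frequency
pitch = matched_pixel_width_um(f50)     # ~10 micron pixels
print(pitch, blur_circle_mm(f50), nyquist_cycles_per_mm(pitch))
# ~10 um pixels, ~0.013 mm blur circle, ~50 cycles/mm Nyquist limit
```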
For edge-detection, desirably the images of different resolution which are differenced will have substantially equal weighting factors. In practical systems equal weighting may be provided by using substantially the same exposure on a normalized basis. By "substantially the same exposure" is meant that the
exposure ratio of the first image to the second image (for the sum of those pixels in the pixel image frame or zone or field of view which is being differenced) is in the range of from about 0.75:1 to about 1:0.75, and preferably in the range of from about 0.9:1 to about 1:0.9. By "normalized" is meant that the scale of sensitivity of the image response to object scene light is effectively the same for the images which are differenced. An image may be normalized in any suitable manner. For example, because an image made by the camera system of Figure 1 at a lens opening f value of 4 and an exposure time of 0.01 second will have a total light response approximately 16 times that of an image obtained at an f value of 16, the images may be normalized a) by increasing the total exposure time of the image obtained at f/16 to 0.16 seconds, b) by appropriately weighting the analog or digitized pixel values of the first and second images to effectively multiply the digitized value of the second image by 16 before digitally differencing the images, or c) by effectively amplifying the analog pixel outputs in an analog system.
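The normalization arithmetic may be expressed compactly. Since the illuminance at the sensor scales as 1/F², the example reduces to the following minimal sketch (the f-numbers and exposure time follow the example above):

```python
def exposure_ratio(f_low, f_high):
    # Light ratio between two f-numbers: (f_high / f_low) ** 2.
    return (f_high / f_low) ** 2

ratio = exposure_ratio(4.0, 16.0)   # 16x more light at f/4 than at f/16
t_norm = 0.01 * ratio               # option a): extend the f/16 exposure
                                    # from 0.01 s to 0.16 s
# option b): multiply the digitized f/16 pixel values by `ratio`
# before digitally differencing the frames.
```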
Having generally described various aspects of resolution-differencing image lattice edge enhancement systems, the present invention will now be more particularly described with respect to the specific embodiment of Figure 1, which is a schematic illustration of an edge enhancing electronic camera system 100. The camera system 100 is particularly adapted for edge enhancement/detection of two-dimensional and similar images (such as photographs or X-ray images) which are located in a specific focal plane, or which otherwise are constrained within a relatively narrow depth of field. As shown in Figure 1, the edge enhancing camera system 100 comprises a lens system 102, a CCD imager 104, a lens and CCD controller 106, a CCD output analog-to-digital (A/D) converter 108, a digital dual
frame store memory 110, an image processing computer 112, an image storage memory 114, and a communications backplane 116.
The lens 102 may be a conventional autofocus CCD camera compound lens which may be varied in its focus by controller 106. The lens 102 may comprise a compound birefringent phase grating or other blur filter to match the optical response of the lens system 102 to that of the imager array 104 in accordance with conventional practice.
The CCD imager 104 is positioned within the focal range of the lens 102, such that light from an object 118 within the field of view of the lens 102 may be adjusted to maximum focus at the plane of the photosensitive pixels of the CCD imager by the controller
106. The illustrated camera lens system has a maximum "wide open" aperture of f/4, and an adjustable iris stop which is capable of adjusting the f-number of the lens in the range of from 4 to 16. The camera lens and imager have a focal range of from 2 meters to infinity at the "wide-open" lens iris position of f/4. As will be discussed, the depth of focus (and the depth of field) of the lens 102 is substantially less when adjusted to its "wide open" f-stop of 4, as compared to its "stopped-down" position of f/16. The CCD imager 104 may be a conventional CID or CCD imager, but is preferably a high speed imager such as described in above referred to application Serial No. 07/698,315. In the illustrated embodiment 100 the CCD imager is a "black & white" imager sensitive to the visible spectrum, although the imager may also be a color imager such as an imager having pixels rendered sensitive to red, green or blue portions of the spectrum, respectively, in accordance with conventional "color" imager design. It is also noted that the camera 100 may optionally include a color filter such as a mechanical or electrooptic filter 117 under direct operator selective or programmed
control through controller 106. It may be desired to enhance or detect edges at selected wavelengths (e.g., blue object edges, etc.). As indicated, the imager embodiment 104 has wide spectral sensitivity in the visible spectrum, and accordingly the image it produces is a "black and white" image without color selectivity. When wavelength selectivity is desired for edge enhancement in image systems in accordance with the present invention, an optical filter may be provided in the optical path (e.g., as part of the lens 102) to select the desired wavelength(s). When multispectral imaging is desired, mechanical filter wheels of discrete monochromatic filter elements (which typically have a relatively low switching speed of 0.1 to 1 second) may be used, as well as diffraction grating monochromators which provide a tunable wavelength over a relatively wide range with variable bandwidth. However, it is particularly preferred to use fast-tuning, electronically controlled filters such as acoustooptic tunable filters and liquid crystal tunable filters. Acoustooptic tunable filters generate a variable acoustic diffraction grating under electronic control within a birefringent crystal, and can be tuned in microseconds. Liquid crystal tunable filters may also be used to convert a "black-and-white" wide spectrum edge-enhancing camera system into a multispectral edge-enhancing camera system [for tunable filter systems, see Chang, Acousto-optic Tunable Filters, Optical Engineering, 20:6:824-829 (1981); Katzka, AOTF Overview, Proc. SPIE Acousto-Optic, Electro-optic and Magneto-optic Devices and Applications Vol. 753, pp. 22-28 (1987); C. Hoyt et al., Photonics Spectra p. 92 (November, 1992)].
By imaging an object scene at a selective wavelength and bandwidth at a first preferably well-focused resolution appropriately matched with the resolution of the imager, and differencing an image of the object scene at a lower resolution at the selected
wavelength and bandwidth, object edges at such wavelength and bandwidth may be rapidly and efficiently enhanced in accordance with the present invention. The drive and control signals 120 for the operation of the CCD, including all necessary drive electrode clock signals and read-out control signals, are provided by controller 106 in accordance with conventional electronic camera design. The analog output 122 of the CCD imager is directed to A/D converter 108, where the analog signal value for each respective pixel of the image is converted at 12 bit accuracy to a digital value and stored in dual frame store 110. The frame store memory 110 is an image "frame grabber" and image processor which is designed to store at least two complete output image frames from the CCD 104, and is capable of subtracting the stored digital values, on a corresponding pixel-by-pixel basis, of one stored image frame from another with programmably-selected weighting. For example, the image intensity value measured by each respective pixel in the image frame of lower resolution will be subtracted by the frame store 110 from the positionally-corresponding image intensity value measured by the same pixel in the higher resolution image frame. The output frame produced by such frame store subtraction (which represents a Difference of Gaussian-like ("DoG") edge enhancement function of the image when the two differenced frames are of the same image at appropriately different resolutions) is stored in the digital image memory 114.
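A software analogue of the frame store subtraction might look as follows. The 12-bit scaling and the mid-scale "fat zero" baseline are assumptions for illustration, not specifics of the frame store 110:

```python
import numpy as np

def frame_store_difference(frame_high, frame_low, weight=1.0, fat_zero=2048):
    # Pixel-by-pixel subtraction of the lower-resolution frame from the
    # higher-resolution frame with a programmably-selected weighting.
    # The signed result is offset by a "fat zero" baseline so that
    # negative values can be held in an unsigned 12-bit frame store.
    diff = frame_high.astype(np.int32) - np.rint(weight * frame_low).astype(np.int32)
    return np.clip(diff + fat_zero, 0, 4095).astype(np.uint16)
```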
The microprocessor computer system 112 processes this "DoG" image frame data from the frame store 110 in a variety of ways in accordance with conventional image analysis practice. In this regard, the image edges may be identified by identifying the "zero crossings" of the "DoG" image frame, and various line-finder, primal sketch, texture identification and
object recognition image analysis routines or procedures may be applied as desired. The communications backplane provides common timing signals and data communication between the various components 106, 108, 110, 112, 114, and also provides input and output data channels to and from the camera system 100, for operator control and for accessing the image and processed image data produced by the camera system.
In operation of the camera system 100, the lens 102 may be focused on an object 118 located at a distance X from lens 102, so that a first image of the object at a first resolution is projected on the photosensitive pixel array of CCD imager 104. The first resolution may be at substantially the maximum resolution of the lens-imager system. The camera is operated for a suitable exposure length under the predetermined light conditions and lens f-stop to provide a charge-mode analog image which is clocked off chip, and digitized by analog-to-digital converter 108 to provide a first high-quality digitized electronic image stored in the frame store 110. After the first higher resolution image has been stored in the frame store memory 110, a second relatively lower-resolution image may be obtained by defocusing the lens 102 to a predetermined defocus condition under programmed control of controller 106. The degree of defocus (or relative decrease of resolution or optical transfer function) is important to the edge enhancement processing function.
In the illustrated embodiment 104, the CCD has a rectangular interline transfer pixel array which is substantially square, as shown in Figure 2, which is a top, schematic view of a portion of the imager chip. Although the interline architecture is used in the preferred embodiment, other architectures such as frame transfer and/or horizontal/vertical scan generator architectures may also be used. As shown in Figure 2, the individual, photosensitive pixels 201, 202 are each
connected through individual transfer gates 204 to the interline transfer registers 206, which in turn are connected through parallel-to-serial transfer gates 208 to buried channel serial output register(s) 210 (which are clocked off the imager chip through floating gate output and amplifier circuitry 212 to the converter 108). After analog-to-digital (A/D) conversion, the first, higher resolution image frame is stored in the frame store 110. As indicated, the first electronic image frame stored in the dual frame store 110 is of relatively high resolution, which desirably is substantially optimized for the lens and imager array system 102, 104. Figure 3 is a graphical representation of the point spread functions of the camera system of Figure 1 at such a first resolution and at such a second resolution, with respect to an individual pixel 201 of the imager array of Figure 2. Further in this regard, Figure 3 is a cross-sectional view of two adjacent pixels 201, 202 taken through line 3-3 of Figure 2, over which are superimposed the higher-resolution and the lower-resolution point spread functions 301, 303 of object point images centered at the center of each pixel. The optimization of the lens and imaging systems represents a balancing of lens resolution and imager sampling resolution, against sampling aliasing, which is simplified when the imager has a high fill factor (a high percentage of the imager surface occupied by photosensitive pixels). In the illustrated embodiment, the fill factor may be increased to at least about 80 percent, and for certain applications to at least about 90 percent, by providing a lenticular array of individual lenslets 302 on the surface of the imager disposed adjacent each respective photosite [K. Parulski, A High Performance Digital Color Video Camera, SPIE Vol. 1448, pp. 45-57 (1991)] to direct light from adjacent non-active areas to the photosite, or by aligning a
lenslet array such as a binary optic lenslet array having a high fill factor (e.g., over 95 percent) atop the CCD, with the individual binary optic lenslet elements 306 focused on respective corresponding active pixels 202 of the CCD. However, it should be noted that the edge-enhancement operation is substantially independent of the fill factor, and aliasing considerations are reduced in scale for the edge enhancement processing, because this function may be carried out independently at each pixel.
The high-resolution point spread function of an object point centered by the lens system 102 at the centerpoint of the pixel 201 is shown as a monochromatic intensity curve 301 on Figure 3 above the pixel 201. Because the lens system 102 is a conventional lens system which is relatively free of astigmatism, the point spread function 301 is substantially radially symmetrical about its axis 308, and substantially approximates a Gaussian distribution shape consistent with the resolution of the imager pixel size, as previously discussed:
I(x,y) = [e^(−R²/2S²)]/2πS²
where R² = x² + y², S is a shape factor, and I(x,y) is the radially symmetrical intensity function at Cartesian coordinates x, y.
Similarly, the point spread function of the lens system 102 for an object point imaged at the center of adjacent pixel 202 of the high-resolution image, graphically shown on Figure 3 as a monochromatic intensity curve 303, is substantially identical to the point spread function 301.
As discussed, in the operation of the edge enhancement camera 100, a second electronic image is made and stored in frame store memory 110 at a lower resolution than the first image. The point spread
function, for the same object point which has the point spread function 301 in the first, higher resolution image and is similarly centered on the pixel 201, is the more widely distributed point spread function 305, which is similarly Gaussian-shaped and may be approximated by the following Gaussian curve:
I(x,y) = [e^(−R²/2(aS)²)]/2π(aS)²
where R² = x² + y², a is a Gaussian spreading factor, S is a shape factor, and I(x,y) is the radially symmetrical intensity function.
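The two point spread functions above may be sampled numerically to check the equal-energy property discussed below and to form the Difference-of-Gaussian kernel. Grid size and parameter values in this sketch are illustrative assumptions:

```python
import numpy as np

def gaussian_psf(s, size=65):
    # I(x, y) = exp(-R^2 / (2 s^2)) / (2 pi s^2), sampled on a square grid.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    return np.exp(-(x**2 + y**2) / (2.0 * s**2)) / (2.0 * np.pi * s**2)

S, a = 3.0, 2.0                  # shape factor and Gaussian spreading factor
narrow = gaussian_psf(S)         # point spread function 301 (first resolution)
broad = gaussian_psf(a * S)      # point spread function 305 (second resolution)
print(narrow.sum(), broad.sum()) # both ~1.0: equal total energy per object point
dog = narrow - broad             # Difference-of-Gaussian-like function
```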
There is a substantially identical point spread function 307 for the image of light from an object point centered upon the adjacent pixel 202. It should be noted that the integration of the respective point spread functions 301, 305 over their effective surface area corresponds to the total light intensity derived from the object point, the image of which is centered over the pixel 201. It should also be noted that for electronic images which have substantially equivalent exposures, the total energy from this object point distributed according to the narrow point spread function 301 in the high resolution image is substantially the same as the total energy from this same object point distributed according to the broader point spread function 305 in the lower resolution image. Although the total energy of the entire image is substantially the same for the higher-resolution and lower-resolution images (at the same total exposure), it will be apparent from inspection of Figure 3 that while a relatively small amount of the total energy of function 301 is directed outside the pixel 201, a relatively much greater amount of the energy distribution of the broader point spread function 305 of the relatively lower resolution image is directed outside of the pixel
201. Conversely, it will also be apparent that while a relatively large amount of the energy of the narrow point spread function 301 is directed into the pixel 201, only a relatively smaller amount of the energy distributed by the broader point spread function 305 is directed into the pixel 201 upon which it is centered, the rest being distributed outside the pixel to adjacent areas (including other pixels) within its distribution zone. The different resolutions of the images may also be represented by the different blur circles as previously defined. Figure 4 is a graphical representation of respective blur circles for image points of the point spread functions 301, 305 at the individual pixel 201 of Figure 3, at a first resolution, and at a second resolution, suitable for differencing to provide edge enhancement in accordance with the present invention.
The previous discussion has concerned only light from an object point imaged at the center of a pixel of the imager 104. However, light is received by each pixel from all object points which are imaged by the lens 102 to have a point distribution function which overlaps such pixel. Generally for each pixel, these object points are those imaged within a zone centered on each pixel having a width of approximately the width of the point spread function plus the width of the pixel. Figure 5 is a graphical representation of the image response of the pixel 201 at the first resolution and at the second resolution respectively, and the edge enhancement image output function of the pixel produced by the differencing of image response of the pixel at the first resolution and second resolution of Figure 3. In this regard, with reference to Figures 3 and 5, for the relatively focused image having the relatively narrow point spread function 301, light is received by the pixel 201 from object points imaged along radius Rf,
which is half the effective width of point spread function 301 of the focused image, plus half of the width of the pixel 201. Because the proportion of light energy from each object-image point decreases from the center toward the periphery of the zone surrounding each pixel, the spatial origin of the light directed to each pixel is governed by a Gaussian-like spatial origin function 501 which represents the summation (by integration along radius Rf) of the point spread functions which overlap the pixel 201. The Gaussian-like function 501 is similar to the point spread function 301, but is broader by the width of the pixel 201.
Similarly, for the relatively defocused image having the lower-resolution point spread function 305, light is received by the pixel 201 from object points imaged along radius Ru, which is half the effective width of point spread function 305 of the defocused image, plus half of the width of the pixel 201. Summation of all of the point spread functions of the lower resolution image which overlap the pixel 201 (e.g., by integration along radius Ru) similarly provides a Gaussian-like (image response) spatial origin function 503 which is similar to the point spread function 305, but broader by the width of the pixel 201. The total volume under function 501 is substantially equal to the total volume under function 503 for images of equivalent exposure. By subtracting the broader response function 503 of the lower-resolution image from the narrower response function 501 of the higher-resolution image, a laterally-inhibited Difference-of-Gaussian-like edge enhancement function 505 is provided as shown on Figure 5. Because the Gaussian-like object spatial origin functions 501, 503 for the images of different resolution are effectively generated at each pixel, the difference of these functions can be generated at each pixel in the image lattice. Accordingly, by subtracting the total light exposure of the relatively unfocused image registered at each pixel from the total light exposure of the relatively focused image registered at each corresponding pixel, a smooth, laterally-inhibited edge enhancement function is readily generated for the image at each pixel in the lattice array of the image.
The DoG-like edge enhancement operation performed at each pixel location can be highly accurate, even without digital correction. Although each pixel of the imager 104 may have its own dark current, which depends (inter alia) on exposure time and temperature, and each pixel has its own sensitivity because of slightly different size and other fabrication variables, by differencing pixel image values from the same pixel (preferably at the same image total exposure and total exposure time), inaccuracies caused by such pixel-by-pixel variances are to a significant degree inherently minimized. In addition, because for radially symmetrical lens systems of good quality the spatial light distribution functions of the differenced images are more radially symmetrical than the pixel array (even a hexagonal array), the difference edge enhancement function provided by subtracting images of different resolution may be substantially nondirectional in comparison to an edge enhancement function provided by digital filtering of adjacent pixels.
As discussed hereinabove, image edge enhancement in machine and biological vision systems may be carried out at several scales of resolution, which may desirably be in optical image octaves. The presence of coincident edges at different scales can confirm prominent edges and object location, particularly for stereopsis. Moreover, imprecision of edge location by edge detection at one scale may be corrected by edge detection at different scales. In accordance with the present invention, it will be appreciated that edge enhancement of an image may be readily carried out at
different scales, by varying the resolution of the images which are differenced on a pixel-by-pixel basis. For example, the focus (and corresponding point spread function) of the higher resolution image may be kept the same, but the resolution of the lower resolution image may be increased or decreased to broaden or narrow the point spread function of the subtractive image, thereby varying the shape function of the effective DoG differencing function described hereinabove. Similarly, the resolution of the relatively higher resolution image may be varied with, or independently of, the resolution of the lower resolution image which is differenced therefrom. In this way, edge enhancement/detection may be carried out at different scales in a very efficient manner for precise edge location, and other image analysis purposes.
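Varying the point spread functions of the differenced pair as described is easily emulated digitally. The sketch below uses Gaussian blurs as stand-ins for the optically produced resolutions; the sigma value and ratios are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_edge_functions(image, base_sigma=1.0, ratios=(1.5, 2.0, 4.0)):
    # Hold the higher-resolution image fixed and broaden the subtractive
    # image by each point spread ratio, yielding one DoG-like edge
    # enhancement function per scale.
    high = gaussian_filter(image, base_sigma)
    return [high - gaussian_filter(image, r * base_sigma) for r in ratios]
```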
Desirably, the point spread ratio of the higher and lower resolution images which are differenced may vary from about 10:1 to about 1.1:1, and more preferably from about 4:1 to about 1.5:1. By "point spread ratio" is meant the ratio of the half-intensity width of the point spread function of the differenced image having the wider point spread function, to the half-intensity width of the point spread function of the differenced image having the narrower point spread function. For point spread functions which are not radially symmetrical, the point spread function is measured at its widest half-intensity width in the direction of measurement. High quality image edge enhancement may be performed by the present methods and apparatus in an efficient manner, even when using relatively inexpensive equipment. For example, highly effective edge enhancement of a printed photographic image has been performed using only a scientific grade "black and white" electronic imaging video camera (Hitachi KP-M1), the analog signal output of which was directed to a video
frame grabber board (Willow PV 1006-002) installed in a standard Compaq 386/20 microcomputer. A natural "woodland duck" scene of a calendar print having a wide range of textures and spatial frequencies was selected as a test object. The photographic scene was illuminated, and the image of the duck calendar made by the Hitachi scientific camera was brought into sharp focus by manual adjustment of the camera lens, by visual inspection. An image frame of the sharp-focus Hitachi camera video output was digitized and stored by the Willow frame grabber board and subsequently stored in digital image format on hard storage disk of the Compaq microcomputer. The Hitachi camera lens was subsequently manually adjusted without moving the camera or the calendar photographic scene being imaged, so that the image appeared to be slightly out-of-focus by visual inspection. An image frame of the defocused Hitachi camera video output at substantially the same total exposure conditions as the sharp-focused image was also digitized and stored in the Willow frame grabber board, and subsequently stored in digital image format on hard storage disk by the Compaq computer. The defocused image frame was then subtracted from the sharp-focused image frame on a corresponding pixel-by-pixel basis by means of a simple Turbo C program running on the Compaq computer to produce an excellent edge enhancement frame of the image. The process was repeated using a range of focused and defocused conditions, with differencing of more defocused images producing edge enhancement of broad features of the original scene, and differencing of images of sharper (but still different) focus producing edge enhancement of more detailed features of the original scene.
The camera system of Figure 1 uses sequential images of different resolution. Figure 1a is a schematic illustration of an alternative embodiment of an edge enhancing electronic camera system 150 in
accordance with the present invention which utilizes two imager arrays of different spatial resolution. In the camera 150, which is otherwise similar to camera 100 of Figure 1, a beam-splitter 152 divides the light 154 from lens 102 into two approximately equal parts 156, 158. One part 158 is directed to CCD imager 104, which is relatively focused, and the other part 156 is directed to CCD imager 105. The optical system comprising the lens 102, beam-splitter 152 and imager 104 has a different resolution than the optical system formed by the lens 102, beam-splitter 152 and the imager 105. For example, the optical path to the imager 105 may have a permanent, relative defocus condition with respect to the image produced at imager 104. Alternatively, different blur filters, apertures, or other optical elements may be used to produce a resolution difference in the two optical systems. However, despite the resolution difference, the imagers 104, 105 should be in accurate, pixel-by-pixel registration with each other. Desirably, the registration will be within 0.5 of the width of the pixels of the array, and preferably within 0.15 of the pixel width, across the entire image frame of the imagers 104, 105.
Figure 1b is a schematic illustration of another embodiment of an edge-enhancing electronic camera system 170 in accordance with the present invention, which is particularly adapted for stereoscopic imaging, for purposes such as robotics and machine vision, as well as automated stereophotogrammetry. As shown in Figure 1b, the stereoscopic camera system 170 is a dual camera system with each channel being similar to the camera system 100 of Figure 1 (or Figure 1a). In this regard, the camera 170 comprises dual imaging camera lenses 102, 102' and respective imagers 104, 104' like those of Figure 1 (or Figure 1a), as well as respective lens and CCD controllers 106, 106', CCD output A/D converters 108, 108',
digital dual frame store memories 110, 110', image processing computers 112, 112', an image storage memory 114, and a communications backplane 116. The dual stereoscopic camera system 170 provides substantially simultaneous stereoscopic electronic images of an object scene from two imaging plane locations separated by a stereoscopic imaging distance "Y". By obtaining a first stereoscopic image pair at a first resolution, and a second stereoscopic image pair at a second resolution, the respective images of different resolution at each image location may be differenced to provide edge-enhanced image functions and "zero-crossing" functions for each stereoscopic image location as previously described, in a very computationally efficient manner. The zero-crossing detected edge data for each respective stereo image may be further utilized in automated stereo photogrammetry image analysis systems to identify an array of matching image points, typically using a coarse-to-fine control strategy using multiple edge detection resolutions to match feature points, edges, corners, and/or nodes [see Brookshire et al., Automated Stereophotogrammetry, Computer Vision, Graphics and Image Processing 52, 276-296 (1990) and references cited therein for stereophotogrammetry processing methods and apparatus]. The high resolution edges produced by systems in accordance with the present invention reduce ambiguity and enhance stereo photogrammetry analysis. In addition, image edge depth information provided by differencing of stereo images of differing depth-of-field resolution (as described hereinafter) may be used to further reduce ambiguity and simplify the search and identification processing for matching features in the two stereo images. The intensive computational aspects of stereoscopic image analysis may be simplified by differencing images of selectively different resolution in accordance with the present invention. By identifying edges at different
depths-of-field as described hereinafter for astigmatic/anamorphic image differencing, the stereoscopic analysis can be further enhanced.
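As an illustration of the pixel-wise differencing and zero-crossing detection relied upon above, the following sketch (Python with NumPy/SciPy) shows the basic computation; the Gaussian blur merely simulates the optically acquired lower-resolution frame, and all parameter values are illustrative assumptions rather than parameters of the invention:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def edge_enhance(sharp, sigma_low=2.0):
        # Difference a registered lower-resolution frame from the sharp
        # frame. In the camera the second frame is acquired optically;
        # here it is simulated by Gaussian blurring of the sharp frame.
        sharp = sharp.astype(float)
        return sharp - gaussian_filter(sharp, sigma_low)

    def zero_crossings(diff, threshold=1.0):
        # Mark pixels where the difference image changes sign between
        # horizontal or vertical neighbors with sufficient local contrast.
        sign = np.sign(diff)
        zc = np.zeros(diff.shape, dtype=bool)
        zc[:, 1:] |= sign[:, 1:] * sign[:, :-1] < 0
        zc[1:, :] |= sign[1:, :] * sign[:-1, :] < 0
        return zc & (np.abs(diff) > threshold)

Applied independently to the left and right stereo frames, the resulting zero-crossing maps provide the edge features to be matched by the stereophotogrammetry processing discussed above.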
As discussed, the camera 100 is particularly adapted for edge enhancement of an image of an object located in a limited focal plane zone (or depth-of-field) such as object "A" of Figure 1. For example, the object "A" may be a two-dimensional photograph in a plane located a distance "X" from the lens as shown in Figure 1, which is focused by the lens 102 at a focal plane which is the plane of the imager 104 to obtain a first image of high resolution. When the image of object "A" is defocused by the conventional CCD camera lens 102 to obtain an image of lower resolution, the focal plane is adjusted to a different plane, so that the image of object "A" is relatively defocused at the imager 104. Such simple defocusing is effective and efficient for edge enhancement of a plane photograph or limited depth-of-field object scene, but when the lens 102 images a three-dimensional scene of substantial depth, different objects can come into sharp focus in the "unfocused" image of another object, when the defocusing is carried out merely by adjusting the focal plane of a conventional lens. For edge enhancement processing of the objects in a focused image plane zone of a three-dimensional scene, it is desirable that the spatial resolution of the lens system be varied. In this way, a first image may be recorded at a first, relatively high spatial resolution at the optimal focal plane of the lens system. The spatial resolution of the lens system may then be reduced, and a second image of the same scene recorded which is still at the optimal focal plane of the lens system. However, because the spatial resolution of the lens system is reduced, the point-spread function and blur circles are broader in the second image, which is relatively unfocused with respect to the first
image, despite the imager being at the optimal focal plane, without bringing objects at a different depth-of-field into sharper focus.
The spatial resolution of the lens system 102 may be varied in a number of ways. For example, a plane lens element composed of a "soft focus" glass may be mechanically interposed in the light path of the lens, or the imager and/or at least one component of the lens system (such as a lightweight diffractive optical component) may be (e.g., randomly) vibrated during the low resolution image exposure time (e.g., by a piezo crystal with a travel distance generally corresponding to the difference in width between the point spread functions of the higher and lower resolution images) to broaden the effective point spread function. A more suitable method is to use a "soft focus" lens system which is adapted to vary the spherical aberration of the lens system in a controlled manner. Images of different spatial resolution may be provided by "soft focus" methods and apparatus, such as by varying, or introducing, controlled amounts of spherical aberration in the optical system. In this way, by electronically recording a first image which is "in focus" in a desired object plane, and subsequently electronically recording a second image of the same object plane under a "soft-focus" adjustment made by increasing the spherical aberration of the system, but still at the "best" focus, the image may be edge enhanced as previously described by subtracting the "soft-focused" image on a pixel-by-pixel basis from the relatively "sharp-focused" image. For example, as described in US 5,122,826, a focusing control apparatus for the lens system may include a focusing information detector for detecting focusing control information regarding a focusing lens and an auxiliary lens set position detector for detecting a set position of an auxiliary focus/defocus control lens such as a soft focus lens. A stored
correction coefficient for the set position of the auxiliary lens may be used, together with the focusing control information, to determine a drive distance of the focusing lens to achieve proper focusing. If the auxiliary lens is manipulated and the set position is changed after an in-focus state is attained in a single mode, the above process is repeated. If the auxiliary lens is manipulated after the in-focus state is attained in the single mode, the mode is switched to a continuous mode to prevent reduction in focusing precision. The lenses may be moved for focus/defocus control by screw mechanisms, which are relatively slow, by solenoid mechanisms (e.g., objective lens double solenoid focus actuators such as paired juxtaposed coils concentric about a low-coercivity ferromagnetic cylinder, itself concentric with respect to the objective lens, as described in US patent 5,107,485), or by electrooptic lens operation. Soft-focusing lens assemblies having a movable front lens group and a stationary rear lens group, as described in GB 2238399, may also be used; in that assembly the soft focusing objective lens consists of a front lens group with a positive resultant refractive power and a rear lens group with a negative resultant refractive power. To focus the system, the front group of lenses may be moved under control of electromechanical circuitry which computes the movement required for the focusing lens to cancel the shift of the focus point due to the soft focus lens. From the object side, the front lens group consists of a convergent first lens, a divergent second lens, a divergent third lens and a convergent fourth lens.
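The drive-distance computation described above may be sketched as a simple lookup and scaling; in this hypothetical fragment (Python) the coefficient values and function name are illustrative assumptions, not the control law of the cited patent:

    # A stored correction coefficient, indexed by the set position of the
    # auxiliary (soft-focus) lens, scales the detected focusing information
    # into a drive distance for the focusing lens. Values are placeholders.
    CORRECTION_COEFF = {0: 1.00, 1: 1.08, 2: 1.21, 3: 1.40}

    def focusing_drive_distance(defocus_signal_mm, set_position):
        # Return the focusing-lens drive distance for the detected defocus.
        return defocus_signal_mm * CORRECTION_COEFF[set_position]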
A soft focus lens system including a softening lens arranged to move axially to selectively change the soft focus effect, and a lens for focus adjustment arranged to move axially to adjust the focus, such as described in U.S. 4,957,354, may also be used. A driver moves the lens for focus adjustment under direction of
controller 106, and a selector device moves the lens for softening to select axial positions. The controller 106 computes the amount of movement of the lens for focus adjustment to cancel the shift of the focal point resulting from the movement of the lens for softening, and produces an output to the driver. Such a system permits relatively fast mechanical changes between normal and soft focus modes.
Another soft-focus lens system having a front group having a one-group-two-element or two-group-two-element composition of positive and negative lens elements is described in U.S. 4,948,236. Such lenses can be composed of, in order from the object side, a front group having a one-group-two-element or two-group-two-element composition of a positive and a negative lens element with a positive overall power, and a rear group having a one-group-two-element or two-group-two-element composition of a negative and a positive lens element. If conditions of the system are such that the system has a brightness in the range of 1:2.8 to 1:3.5 in terms of F number and a viewing angle of about 40 degrees, a desired soft-focus effect over the entire image is produced. Alternatively, lenses of inherent "soft focus" characteristics may be inserted mechanically into the optical path. A photographic objective suitable as a lens 102 having a soft focus function is also described in U.S. 4,826,301, consisting of, from front to rear, a lens component of positive power, another lens component having a meniscus-shaped lens of forward convexity, an aperture stop, and a third lens component. The second lens component is axially movable for the purpose of producing a soft focus effect. The image shift resulting from the movement of the soft-focus-introducing lens component is compensated by automatically readjusting the position of the focusing lens component.
Because of the inherently high speed of the optical edge enhancement processing of the present invention, which is effectively carried out in parallel at each image pixel, and the speed at which image frame subtraction may be carried out (which may also be carried out in parallel if desired), a mechanical lens focus adjustment may be the performance-limiting factor of an image-processing system such as camera 100 of Figure 1. In this regard, for example, a high speed imager having a 512 x 512 rectangular array of pixels as described in copending Application Serial No. 07/698,315 may operate at a speed of from about 500 to 2,000 or even 10,000 frames per second, with the data for edge enhancement being acquired in two frames. However, if the lens system requires 0.1 second for a predetermined minor adjustment of resolution between frames, the lens adjustment time becomes the limiting performance factor by up to several orders of magnitude.
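The timing disparity may be made concrete with a brief calculation (Python); the frame rates and adjustment time are the figures quoted above, and the comparison itself is merely illustrative:

    # Two frames are needed per edge-enhanced output. At 2,000 frames per
    # second, acquisition of the two frames takes 2/2000 = 1 ms, while a
    # 0.1 s mechanical lens adjustment between the frames is 100 times
    # slower; at 10,000 frames per second the ratio grows to 500 times.
    for frame_rate in (2_000, 10_000):
        acquisition = 2 / frame_rate          # seconds for the two frames
        print(frame_rate, 0.1 / acquisition)  # -> 100.0 and 500.0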
Accordingly, it will be appreciated that rapid optical resolution adjustment systems are preferred for high performance systems. For a system which uses a simple focal plane adjustment, an electrooptic lens element may be used to rapidly change the focal length of the lens system 102 by a predetermined amount corresponding to the desired defocus (or focus) amount for edge enhancement image differencing. Because edge enhancement processing may be performed with only relatively small focus/resolution adjustments, a variety of electrooptic lens systems may be utilized to provide resolution adjustment under electronic control without mechanical movement of the focusing lenses or the imager. For example, a twisted nematic electrooptic kinoform or Fresnel lens may be used to vary the focal length of the lens system 102 under program control [see E.C. Tam et al., "Spatial-Light-Modulator Based
Electrooptical Imaging System", Applied Optics Vol. 31, p. 578 (1992)]. Varifocal electrooptic lenses such as
PLZT lenses may also be of the E-O zoom or E-O Fresnel-zone-plate type [Sato et al., Proc. SPIE 1319, 493 (1990); Shibaguchi et al., Jpn. J. Appl. Phys. 31 (1992) 3196-3200]. Such electrooptic lens components may be relatively thin, because only a limited focusing/defocusing range is adequate to provide edge-enhancement operation when used with a camera lens system such as lens 102. Supertwisted nematic liquid crystal devices may also be used to minimize the thickness and response time of the liquid crystal layer [H. Katsuhiko et al., Jpn. J. Appl. Phys. 31 (1992) 2743-2747]. In addition to electrooptic lens systems for focusing/defocusing, electrooptic components may also be used to vary the optical resolution of the lens system for large depth-of-field edge enhancement processing as previously described. In this regard, Figure 6 is a cross-sectional illustration of an embodiment of an electrooptic blur filter 600 which may be utilized with a lens system like lens 102 to vary the optical resolution of an edge-enhancing electronic camera system like that of Figure 1, together with a perspective view of the blur filter 600. The electronic blur filter 600 may be positioned in the lens 102 adjacent to or in substitution for the monochromatic filter 117, under electronic control of the controller 106, to decrease (and restore) the spatial resolution of the lens system by a predetermined amount. As shown in Figure 6, the blur filter 600 comprises an optically flat optical glass plate 602 having an isotropic index of refraction, having etched on one surface a plurality of regularly spaced projections 604 having a width selected in the range, for example, from about 0.5 to 10 microns, and a height of from about 0.5 to about 5 microns. The width and height of the projections may vary (e.g., have rounded surfaces), or be uniform, and the projections may be regularly spaced, or randomly spaced, depending upon the optical diffraction or
refraction effect desired to broaden the point spread function of lens-imaged light transmitted therethrough. The projections may be formed by conventional photolithographic and etching techniques. A perspective view of the optical glass flat 602 with its projections 604 is also shown in Figure 6 in registration with the cross-sectional view of the blur filter 600 of which it is a component. An optically flat glass cover plate 606 is sealed to the optical flat 602 against the projections 604, and the interval space 608 between the projections 604 is filled with a liquid crystal material such as a nematic liquid crystal fluid having an index of refraction, when aligned with the surface and grooves, which matches that of the glass plate 602. The surfaces of the glass flats 602,606 are treated in accordance with conventional methods to assure the desired alignment. Transparent optically flat electrodes such as tin oxide or indium tin oxide layers 610 and 612 are applied to the top and bottom surfaces of the optical blur filter 600, across which may be applied an activating voltage by the controller 106 (Figure 1). In its inactivated state, accordingly, the blur filter is a homogeneous optical flat which does not substantially affect the performance of the lens 102. However, when an electrical potential is applied across the electrodes 610, 612, the liquid crystal material is aligned by the resulting electric field, which alters the index of refraction of the volume 608 occupied by the liquid crystal for the light propagating through blur filter 600, typically by about 0.2 index of refraction units (e.g., n=1.5 vs. n=1.7). Depending on the characteristics of the liquid crystal material selected and the initial molecular orientation selected by appropriate surface treatment in the design of the blur filter, the refractive index change may be positive or negative upon electrical activation, as desired. In the activated state, the projections 604 act as a plurality of
apertures or lenses, effectively decreasing the resolution of the lens system 102, and increasing the point spread function of the lens system. A wide variety of such electronic blur filter configurations may be provided.
Although conventional liquid crystal devices may be slow in operation, because of the high surface area to volume ratio of the device 600, its speed of operation is enhanced. It is also noted that the device 600 does not operate by birefringence, and accordingly does not require a polarizing filter. In addition, high-speed electrooptic materials, such as ferroelectric and antiferroelectric materials, may be used to further increase the operating speed of the blur filter device. Ferroelectric liquid crystal fluids and antiferroelectric liquid crystal fluids have fast switching speeds (see Ozaki et al., Jpn. J. Appl. Phys. 31 (1992) pp. 3189-3192 and references there cited, for ferroelectric device properties; see Moritake et al., Jpn. J. Appl. Phys. 31 (1992) pp. 3193-95; Yamamoto et al., Jpn. J. Appl. Phys. 31 (1992) pp. 3186-88; Hayashi et al., Jpn. J. Appl. Phys. 31 (1992) pp. 3182-85 and references there cited for antiferroelectric device properties), and accordingly are desirable as active electrooptic materials for soft-focus and variable lens systems to be used with very fast frame imagers, such as those having frame speeds of at least 500 frames per second. Ferroelectric and antiferroelectric liquids may also be used in electrooptic polarizers for plane screen imaging systems as will be described hereinafter.
Electronic or electrooptic lens systems may also include diffractive optics and spatial light modulators using such electrooptic structures. Diffractive optics including submicron linear gratings, "moth's eye" gratings, microlenslet arrays, optical transform diffractive gratings and kinoform diffractive lenses immersed in an appropriately oriented liquid crystal
material may be made variable as a function of applied voltage, to provide electrooptic devices which are electronically adjustable on a pixel-by-pixel basis, such as lenses and lenslet arrays of electronically adjustable focal length, and electronically adjustable diffractive optic transform systems. Optical elements made from liquid-crystal-impregnated silica or organopolymer aerogels may also be provided with electrooptic orienting properties. By using submicron optical features or optical wavelength grating surfaces in which to embed the normally slow-orienting smectic, cholesteric or nematic liquid crystal materials, high speed electronic lenses and "soft focus" elements may be provided using electrooptic materials and devices. Liquid ferroelectric or antiferroelectric materials may be used instead of nematic, cholesteric or smectic liquid crystals when even higher speed operation is desired.
Polymer-dispersed and glass-dispersed liquid crystal devices may also be used to provide electrooptic defocusing. Figure 7 is a schematic illustration of an embodiment of an electronic focus/defocus system which may be utilized with an edge enhancing electronic camera system like that of Figure 1. In this regard, as shown in Figure 7, a dispersed liquid crystal optical element 700, which is preferably an optically flat layer, is provided in the optical path, either as part of the lens system 102 or at a location immediately adjacent the imager 104 as shown in Figure 7. The liquid crystal optical element 700 has from about 1 to about 15 volume percent of a finely dispersed liquid crystal component (e.g., 1-3 micron diameter nematic or ferroelectric liquid crystal droplets 702) homogeneously dispersed in a transparent polymer or gel glass matrix 704 having a refractive index which substantially matches the refractive index of the liquid crystal droplets. The matrix is positioned between optical glass flats 710,712
having respective outer transparent electrodes 706,708. When an electric field is applied to the droplets 702 and matrix 704 by applying an electric potential to transparent outer electrodes 706,708, the liquid crystal material in the droplets is aligned by the electric field, thereby changing the refractive index match between the matrix 704 and the dispersed droplets, to produce a dispersive optical element. Relatively small droplets and thin device layers provide effective defocusing through light dispersion, rather than opacity [see Levy et al., J. Non-Crystalline Solids, 147 & 148 (1992) p. 647].
In the illustrated embodiment 100 of Figure 1, the differencing of the relatively higher- and lower-resolution images is performed off-chip after digitizing of the pixel image intensity values. However, subtraction of analog charge packets from each pixel may also be carried out on-chip in a variety of ways. For example, signal charge packets to be differenced may be transferred directly onto separate transfer gates for differencing at each pixel site. In this way, analog pixel charge packets from a first image may be transferred to a storage gate at each respective pixel, and then subtracted in charge mode from a second image at each respective pixel, before being clocked off-chip by charge-mode transfer along the register gates. Such neighborhood subtraction architectures tend to limit the pixel fill-factor, but charge-mode subtraction can also be performed at the imager chip edge before being clocked off-chip. In this regard,
Figure 2a is a schematic illustration of an imager 250 like that of Figure 2 having a rectangular interline transfer pixel array which is substantially square. As shown in Figure 2a, the individual, photosensitive pixels 201,202 are each connected through two sets of individual transfer gates 203,204 to respective interline transfer registers 205,206, which in turn are input
to charge-mode differencing circuits 207, which in turn output a difference charge through parallel-to-serial transfer gates 208 to buried channel serial output register(s) 210 (which are clocked off the imager chip through floating gate output and amplifier circuitry 212 to the converter 108). In operation, a first, higher resolution image frame is imaged at photosites 201,202, and the resulting photosite charge packets are clocked through gates 203 to interline transfer registers 205 for temporary storage. A second, lower resolution image of the same scene is then imaged at the same photosites, and clocked through gates 204 to transfer registers 206. The charge packets from the respective images at the same photosites are subsequently clocked to the charge-mode differencing circuits 207 for on-chip differencing. The differencing circuits 207 have a first gate which is initially reset to a low reset voltage, and a second gate which is reset to a high reset voltage. The reset switches are subsequently turned off, and signal charge packets Q1 and Q2 from the transfer registers 205,206, representing higher-resolution and lower-resolution image data from the same pixel, respectively, are transferred directly onto the first gate and the second gate, respectively, such as through reverse-biased p-n junctions. To accommodate a negative result of the subtraction, an offset charge Qoffset is also provided for a "fill and spill" register, which defines "zero" charge. Total node capacitances C1(V) of the first gate and C2(V) of the second gate are matched, so that the resulting signal charge Qsub under the second gate after a "fill-and-spill" operation is proportional to (Q1 - Q2 + Qoffset). When subtracting two analog charge packets, it is possible to have negative values, which are represented by a charge of less than the offset charge Qoffset. The charge-mode output Qsub representing the difference of images of different resolution
at each pixel is clocked through parallel-to-serial transfer gates 208 to buried channel serial output register(s) 210, which are clocked off the imager chip through floating gate output and amplifier circuitry 212 to the A/D converter 108. The differencing process is repeated for each charge-packet pixel pair. It is noted that CCD charge subtraction circuitry which utilizes charge-to-voltage and voltage-to-charge conversion may have reduced accuracy and robustness to process variations, so that high quality fabrication techniques are desirable.
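The charge-domain arithmetic described above may be summarized numerically; in this sketch (Python) the offset and well-capacity values are illustrative assumptions, not device parameters:

    def charge_mode_subtract(q1, q2, q_offset=100.0, q_max=1000.0):
        # q1: higher-resolution-image packet; q2: lower-resolution packet.
        # The offset charge defines "zero", so a negative difference is
        # represented by a result below q_offset; q_max clips the packet
        # as a physical well would.
        q_sub = q1 - q2 + q_offset   # proportional to (Q1 - Q2 + Qoffset)
        return min(max(q_sub, 0.0), q_max)

    # Example: a lower-resolution packet larger than the sharp-image packet
    # (the dark side of an edge) yields a value below the offset, i.e., a
    # negative difference: 300 - 380 + 100 = 20.
    print(charge_mode_subtract(q1=300.0, q2=380.0))  # -> 20.0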
In addition to steps of simple defocusing, or otherwise degrading the lens system optical transfer function (OTF blurring), relative differences in spatial resolution may be provided between two corresponding images by varying the depth of field of the respective image frames. "Depth of field" lies in the object field and is the difference between the upper and lower planes of clear, sharp focus. "Depth of focus", on the other hand, lies in the plane of the CCD imager and is the depth within which the CCD can be moved with the image remaining in sharp focus. The "depth of field" determines the volume of the object field of view which is in focus at one lens setting. By balancing aliasing against system resolution as previously discussed, an in-focus zone may be provided in which the imager is the primary resolution-limiting factor, until exceeded by lower resolution of the lens system as its depth of field limitations become predominant. By pixel-wise image frame differencing of images of the same scene having a difference of resolution corresponding to different depth-of-field resolutions of the optical system, more complex optical processing corresponding to edge enhancement of a depth zone of the scene may be carried out. For example, with reference to Figure 1, when the lens 102 is focused "wide-open" at an f-stop setting of 4 on the object "A"
at a focal distance "X" from the lens 102, a relatively shallow depth of field (marked as "f=4" on Figure 1) is in focus at the plane of the imager 104. The object "B", which is at a distance of "2X" from the lens, is not in the depth of field zone, and is recorded at relatively lower resolution by the imager 104. When the camera 100 is operated with the lens 102 "stopped down" to a smaller aperture, for example to an f-stop of 16, the depth of field is much wider, as shown in Figure 1 (marked as f=16), so that the zone of high focal resolution includes object "B" at distance 2X, as well as object "A" which is at the object focal point of the lens 102. By subtracting on a pixel-by-pixel basis a first image made at a relatively shallow depth-of-field (e.g., with the lens 102 "wide open" at f=4), from an equally-exposed or weighted image of the same object scene taken at a relatively longer depth of field (e.g., with the lens 102 "stopped down" at f=16) in accordance with the present invention, an edge enhancement will be performed with respect to objects which are not in the focal zone of the relatively shallow field image. Moreover, substantially no edge enhancement will be performed with respect to objects (e.g., object "A") which are at the focal point in both the shallow (f=4) and the wide (f=16) depth of field images. When a subsequent processing step of identifying the zero crossing points is performed, the edges of objects outside of the shallow focal zone are detected. In this regard, the two images are substantially the same at the focal point of the lens, so differencing of the images of (substantially) the same overall weighted intensity values on a pixel-by-pixel basis produces a substantially null or vanishingly weak output which can be thresholded out of a zero-crossing edge detection processing step. Moreover, by successively varying the focal point of the lens 102 to known object field depths (electrooptically or mechanically controlled through the
lens 102 by controller 106), the space volume in the lens field of view may be parsed at different depths to locate edges at different depths. For intermediate positions between objects "A" and "B", a generally smoothly varying edge enhancement function of increasing strength and scale away from the object focal plane is provided, from substantially no edge enhancement processing at the object focal plane at object "A", to the scale of image processing provided by the respective defocus parameters of the wide-open lens (f=4) and the stopped-down lens (f=16). Another way of characterizing the image processing of differencing two images of different depth-of-object-field resolution is to note that the effective resolution functions of the shallow-field and deep-field images of the camera system which are differenced each vary as a function of the distance from the object focal plane along the lens axis. The effective defocus parameter, which modulates the breadth of the point spread function, increases more rapidly with distance from the object focal plane for the shallow-field image than for the deep-field image, providing different resolutions outside the shallow focal zone for differencing at each pixel. The effective sensitivity of the differencing function also initially increases with distance outside the shallow-focus zone as the respective point-spread functions become different. However, because the point-spread functions of both the deep-field and the shallow-field images increase with distance from the object focal plane, the scale of edge detection provided by differencing also increases as a function of distance from the shallow zone of highest resolution. Thus, edge enhancement of the higher spatial frequency edges is performed nearest the shallow focal zone, and the spatial frequency of the edge enhancement operator decreases with distance from the shallow focal zone in object space along the lens axis away from the shallow
depth-of-field zone. This distance-varying edge enhancement function is useful in image-depth analysis, particularly in stereoscopic camera systems such as camera 170 of Figure lb, and when utilized with similar edge enhancement analysis of images of the same object scene with different object-field focal plane depths. In this regard, it is further noted that the edge enhancement function variation (and hence edge enhancement scale) with such focal plane distance is known for a given lens system, as is the distance of the object focal plane from the lens, from the mechanical or electronic lens focus adjustment control system of the lens. By comparing (or even differencing) the edge-enhanced images made with respect to different focal planes by means of lens focal adjustment and/or lens f-number adjustment (or edge-detected primal sketches derived therefrom), distance and edge scales may be determined for "true" multiscale coincident edge detection, and stereopsis, as generally described hereinabove.
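The depth-of-field asymmetry underlying this processing may be estimated with the standard thin-lens hyperfocal-distance formulas; in the sketch below (Python) the focal length, circle of confusion, and focus distance are illustrative assumptions chosen only to show how stopping down from f/4 to f/16 widens the in-focus zone:

    def depth_of_field(f_mm, N, s_mm, c_mm=0.01):
        # Near/far limits of the in-focus zone (thin-lens approximation).
        # f_mm: focal length, N: f-number, s_mm: focus distance,
        # c_mm: acceptable circle of confusion (about one pixel width).
        H = f_mm**2 / (N * c_mm) + f_mm                 # hyperfocal distance
        near = s_mm * (H - f_mm) / (H + s_mm - 2 * f_mm)
        far = s_mm * (H - f_mm) / (H - s_mm) if s_mm < H else float("inf")
        return near, far

    # A hypothetical 50 mm lens focused at X = 2 m:
    for N in (4, 16):
        print(N, depth_of_field(50.0, N, 2000.0))
    # f/4 gives a shallow zone of roughly 1.94-2.06 m; f/16 widens it to
    # roughly 1.78-2.29 m. Whether the deep zone reaches object "B" at 2X
    # depends on the actual focal length and circle of confusion.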
Images having asymmetric resolution differences may also be subtracted in accordance with the present invention to provide useful and powerful image processing functions. In this regard, illustrated in Figure 8 is a perspective view of an embodiment of an astigmatic/anamorphic, non-rotationally symmetrical lens and imaging system 800 which may be utilized with an edge enhancing electronic camera system like that of Figure 1 to provide edge gradient vector information. The distance between the lens 802 and the object plane
807 is foreshortened for purposes of illustration in the Figure. Although the previously described lens 102 has a radially symmetrical point-spread function for the images of different resolution, useful edge enhancement and image processing procedures may be readily carried out in accordance with various aspects of the present invention by differencing images of asymmetrically
different resolution. For example, the lens 802 of Figure 8 is otherwise like the lens 102 of Figure 1, but has a mechanically or electronically rotatable and removable cylindrical lens element 804. The lens 802 may be used with the camera system of Figure 1 to provide directional edge vector enhancement and detection. As shown in Figure 8, the lens 802 without the cylindrical lens element 804 focuses object points in an object plane 807 at an image focal plane 818, at which the image sensor 104 is positioned for electronic imaging. The object rays from an object point 806 in the object plane 807 which are parallel to the longitudinal axis 808 of the cylindrical lens element 804, as shown by dotted plane 810, are substantially unaffected by the cylindrical lens element 804. The object rays in plane 810 from point 806 in the object plane 807 are focused at point 812 in the image focal plane IFP. In the absence of the cylindrical lens element 804, all of the light rays from object point 806 would similarly be focused at point 812 and have an effective point-spread function 301 as shown in Figure 3. However, the cylindrical lens element 804, which in the illustrated embodiment is a converging lens element with respect to that component of light rays from the object point 806 which is perpendicular to the plane 810 and lens axis 808, effectively focuses that component of the light rays from the object 806 with greater convergence in the plane 814 perpendicular to plane 810, to a different focal point 816, to produce a broader point spread function at plane 818 like that of point spread function 305 of Figure 3. The object light rays in the plane 814 perpendicular to the plane 810, as shown in Figure 8, converge more rapidly than the rays unaffected by lens 804, to a focal point 816 in an image plane 820 which is spaced apart from the focal point 812 by a distance which is a function of the refractive power of the cylindrical lens 804. With the
cylindrical lens element 804 in place, the point spread function of the image of the object point 806 at point 812 in the image focal plane 818 is elongated in a direction parallel to the axis 808 of the cylindrical lens element 804. In this regard, the point spread function of the image is effectively defocused in a direction parallel to the lens axis 808, having a broad point spread function like function 305 of Figure 3 in that direction. However, the point spread function is not defocused in a direction perpendicular to the axis 808, and accordingly has a relatively narrow point spread function in a direction perpendicular to the axis 808 which is substantially the same as the point spread function of the lens 802 (and 102) without the cylindrical lens element 804. An enlarged representation of the blur circle
822 of the relatively focused image of the lens system 802 without the lens 804, and the corresponding "blur ellipse" 824 elongated along an axis 826 parallel to lens axis 808, is shown (as an insert) in Figure 8. By subtracting a radially unsymmetrically defocused image produced by lens 802 with the cylindrical lens element 804, on a pixel-by-pixel basis as previously described, from a corresponding radially symmetrical relatively focused image of the same object scene, an edge enhancement operation is performed which enhances only the edge gradient parallel to the axis 808. It should be noted that the refractive power of the lens 804 is relatively small to accomplish only the limited degree of defocusing, perpendicular to its axis, as previously described for the radially symmetrical lens system 102. By rotating the asymmetrical lens element 804 about the longitudinal optical axis of the lens system 802, and repeating the image differencing process, the axis of the measured edge gradient may be similarly rotated. By effectively rotating the lens 804 by 90 degrees between successive images, and separately subtracting the two successive images from the same
relatively focused (radially symmetrical) image made by the lens 802 without the lens 804, two image edge gradient functions are generated which provide the edge gradient values Vx,Vy in orthogonal directions x,y respectively corresponding to the respective longitudinal axes of the asymmetrical lens elements utilized in the processing. These orthogonal values may be combined geometrically to produce the gradient magnitude V, via V² = (Vx)² + (Vy)², and to determine the direction of the gradient vector. Thus, while various aspects of the present invention may be used to emulate the traditional nondirectional Difference of Gaussian edge operator, other types of processing may also be used to provide directional edge enhancement operators which measure edge vector gradients.
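The two-pass gradient measurement may be sketched compactly (Python with NumPy/SciPy); the one-dimensional Gaussian blurs below stand in for the two rotations of the cylindrical element, and the axis assignments and parameter values are illustrative assumptions:

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    def directional_gradients(sharp):
        # Difference the sharp frame against frames blurred along one image
        # axis at a time, emulating the two orientations of the rotatable
        # cylindrical element, then combine the results geometrically.
        sharp = sharp.astype(float)
        vx = sharp - gaussian_filter1d(sharp, sigma=2.0, axis=1)
        vy = sharp - gaussian_filter1d(sharp, sigma=2.0, axis=0)
        magnitude = np.sqrt(vx**2 + vy**2)   # V^2 = (Vx)^2 + (Vy)^2
        direction = np.arctan2(vy, vx)       # angle of the gradient vector
        return magnitude, direction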
A lens system like that of lens 802 may be operated by mechanically removing the cylindrical lens element 804 to produce a relatively high-resolution image having a radially symmetrical point spread function, and replacing the lens 804 to produce a relatively lower resolution image having an elliptical point spread function. The lens 804 may similarly be mechanically rotated about the lens axis by a suitable mechanical system, preferably under programmed control of controller 106. However, such mechanical operation is relatively slow, and limits the output performance of the camera system. Electronically controllable lens systems are preferred because of their high speed operation, capacity for variation of refraction to produce a range of image resolutions, elimination of moving parts, ease of programmability, light weight and compactness.
An electrooptic imaging lens, such as a liquid crystal Fresnel cylindrical lens of the type described by Tam et al., supra, may be used to permit asymmetrical defocusing under electronic control. It should be noted that by fabricating the transparent,
linear phase-level electrodes in one direction on one face of the liquid crystal lens element, and a second set of transparent linear electrodes orthogonal to the first set on the other face of the lens, a cylindrical lens may be activated in one direction by applying a uniform voltage to one set of electrodes and applying a kinoform phase-varying electrode voltage pattern to the electrodes on the other side while obtaining an electronic image using the lens system. The electrooptic cylindrical lens may subsequently be rotated 90 degrees by reversing the electrode voltage pattern so that the phase varying voltage pattern is applied to the electrodes on the previously uniform-potential side, and the uniform voltage is applied to the previously voltage-fringe-patterned electrodes on the other side of the electrooptic element. The lens is "turned off" or "removed" by removing the electrode voltage from both electrodes. It should be noted that some electrooptic lenses utilize a polarization analyzer when the liquid crystal devices are constructed to affect only the extraordinary light component.
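The electrode-switching logic just described may be summarized in a short control sketch (Python); the driver functions are hypothetical stubs standing in for the controller 106 electronics, and the whole fragment is an illustrative assumption rather than the patented control scheme:

    def apply_pattern(face, volts=None):
        # Hypothetical driver stub: write the kinoform phase-level voltage
        # pattern to the electrode set on the given face.
        pass

    def apply_uniform(face, volts=5.0):
        # Hypothetical driver stub: hold every electrode on the given face
        # at one potential (volts=0 removes the field).
        pass

    def set_cylindrical_lens(axis):
        # axis='x' or 'y' selects the cylinder orientation; None turns the
        # lens "off" (no field on either face: optically homogeneous flat).
        if axis == 'x':
            apply_pattern(face='front')   # phase-varying voltage pattern
            apply_uniform(face='back')    # uniform potential
        elif axis == 'y':                 # lens "rotated" 90 degrees
            apply_uniform(face='front')
            apply_pattern(face='back')
        else:
            apply_uniform(face='front', volts=0)
            apply_uniform(face='back', volts=0)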
Electrooptic lenses of negative refractive power, as well as converging electrooptic lenses, may also be fabricated for use as elements such as cylindrical lens 804. In this regard, an important embodiment of electrooptic lens systems for image processing systems in accordance with the present invention comprises a lens system such as that of lens 102 further comprising an electrooptic cylindrical converging lens 804, and an electrooptic cylindrical diverging lens 805 with its refractive axis rotated 90 degrees with respect to the converging lens. By simultaneously activating both lenses to approximately equal but opposite refractive power, the lens system is rapidly defocused in an effectively radially symmetrical manner for an image located at the original focal plane without the lens activation, and may also be defocused
in an elliptical manner by activating only one of the electrooptic lenses. This produces an effective "blur circle of least confusion" like that shown at numeral 820 of Figure 8, which may be at the original focal plane if the negative and positive anamorphic lenses are of equal (but opposite) refractive power. While the described CCD camera embodiments utilize photosensors having discrete pixels which convert light input into charge packets which are digitized, it should also be noted that images of different resolution may be differenced using various spatial light modulator (SLM) materials and systems, which may output the resulting edge enhanced image in a variety of ways, such as coherent or incoherent optical readout. For example, a first image of an object or scene at a first resolution (e.g., at sharp focus) may be exposed on a continuous spatial light modulator such as a photo-DKDP (deuterated potassium dihydrogen phosphate) SLM at a first voltage bias polarity. A second image of the object at a second resolution (e.g., slightly defocused) may then be subtracted from the first image by exposing the second image on the DKDP SLM at the opposite polarity [see D. Casasent, "Spatial Light Modulators", Proceedings of the IEEE, pp. 143-157 (1977) for DKDP image subtraction and other SLM systems which can accomplish image subtraction, particularly through bias reversal]. By exposing the images so that the darkest portions of the images receive an appropriate gray scale, the subtraction using an SLM may accommodate a zero baseline for the enhancement function. It should also be noted that other discrete pixel photosensor systems may be employed to carry out the acquisition and differencing of in-registration images of different resolution. For example, discrete-pixel imagers may also be designed and operated in reverse-bias mode for one exposure to subtract sequential images or exposure times of different resolution. Similarly, partially
transparent imager designs, in which two different photosensors are separated vertically so that there is an inherent focus difference between the photosensor layers, may be used to produce fixed-difference edge enhancement. For example, an image focused on one partially transparent imager of a pair of CCD imagers fabricated on opposite sides of a thinned silicon or silicon-on-sapphire (both sides) wafer will be defocused at the imager on the opposite side of the wafer.
As indicated hereinabove, differencing of differing-resolution images using fractional weighting factors is also desirable for producing a predetermined degree of lateral inhibition to correct for optical system diffraction effects or other limitations. For example, methods and apparatus in accordance with the present invention may be used to provide optical storage devices having very high-speed read-out systems in which individual binary optical data "bits" of the optical storage medium in square or rectangular array are detected by imaging in parallel using laser light projection imaging of a selected area of the optical disk onto an array of imager pixels (one pixel to detect each optical "bit" site), with correction for diffraction effects from adjacent optical data "bits". Optical memory media are capable of enormous storage capacity, at a "bit" size of approximately 1 micron in diameter, and efforts have been made to provide high-speed parallel readout optical disks (PROD), but the rate at which conventional parallel readout systems can operate is a significant limitation [Yamamura, A., et al., "Application of optical disk technology to optical information processing", SPIE Critical Reviews Series Vol. 1150, Spatial Light Modulators and Applications, pp. 104-112; see also, W. Imaino et al., "Actuation Mechanisms in Optical Storage", Adv. Info. Storage Syst., Vol. 1, pp. 375-401 (1991)]. The small size of
the optical "bit" storage sites and their proximity to each other in comparison with the read light wavelength also may produce anomalous read errors due to diffractive light scattering from adjacent optical "bit" sites. In this regard, for example, if the 8 "bits" surrounding a selected zero (dark) bit in a rectangular array are "ones" (bright light sources), even a relatively small amount of diffractive scatter from each surrounding site into the pixel on which the "zero" site is projected will produce a relatively bright, "false one" readout. By utilizing a high-speed CCD imager as a detector array for a 2-dimensional block of optical storage sites which are laser-pulse-illuminated, with each site being imaged on a pixel of the imager array, very high-speed readout operation may be accomplished. For example, at 2,000 frames per second of a high-speed 512x512 imager as described herein, which is substituted for the standard detector array of a SONY PROD optical disk (e.g., see Figure 1 of Yamamura et al., supra), the readout data rate exceeds 500 million bits per second. Moreover, by partial enhancement of the bit-site image, by subtracting a slightly lower resolution image (e.g., a blur circle for the lower resolution correction image which is 1.15 to 1.25 times the blur circle of the higher resolution read image, provided by activation of an electrooptic element in the illuminating beam path) from a maximum resolution image in accordance with the present disclosure, selected amounts of lateral inhibition to correct for such scattering may be readily accomplished. An appropriate fractional weighting factor (e.g., 15-30%) for the lower resolution image as compared to the maximum resolution image is readily provided by methods such as using a lower total exposure (time) for the relatively unfocused image. Subtraction is preferably carried out on-chip, to maximize the output data rate.
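The weighted-subtraction readout correction may be sketched as follows (Python with NumPy); the weight follows the 15-30% range quoted above, while the threshold, array contents, and function name are illustrative assumptions:

    import numpy as np

    def corrected_readout(sharp, blurred, weight=0.2, threshold=0.5):
        # Subtract a fractionally weighted lower-resolution frame from the
        # full-resolution read frame to laterally inhibit diffractive
        # scatter from adjacent bit sites, then threshold to recover the
        # bit block. sharp, blurred: registered pixel frames of one
        # laser-illuminated block of bit sites (one pixel per site).
        enhanced = sharp - weight * blurred
        return (enhanced > threshold * enhanced.max()).astype(np.uint8)

    # Data-rate check for the figures quoted above: a 512 x 512 block read
    # at 2,000 frames per second yields 512 * 512 * 2000 = 524,288,000
    # bits per second, just over 500 Mbit/s.
    print(512 * 512 * 2000)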
The present disclosure is also directed to image processing methods and apparatus for special effects such as combining electronic images. In this regard, in both television and motion picture production, there is a need to realistically superimpose images of people or objects onto a background scene. For example, television news broadcasts routinely use a method called "blue screen" to combine one image (typically of a moving person, such as a weatherman) with a background scene such as a weather map which also has moving elements. In the blue screen technique, the first image such as the weatherman is photographed in front of a solid blue background, while a separate camera takes a picture of the evening weather map. The camera photographing the weatherman is programmed to eliminate (set to zero) any pixels in the image which are of sufficiently intense blue corresponding to the blue screen intensity and spectral range. The images are then combined with the non-zero foreground image pixels being substituted for the corresponding background pixels. Thus, the background (blue screen) is eliminated, and the first image (e.g., the weatherman) is superimposed on the weather map picture. A primary problem with the blue screen technique can be seen if a weatherman wears an article of clothing which contains some blue coloring - the blue screen technique may be non-selective and may turn the blue-clothed portion of the weatherman into the invisible man. Similar problems occur a) at image edges which may reflect some of the bluescreen light, b) at pixels which include part of the bluescreen background and the first image, and c) with motion of the first image, which also blends image and bluescreen light at affected pixels. Blue screen effects for motion pictures are achieved in much the same way as in television, and suffer from similar limitations. To obtain some degree of blue selectivity in motion pictures, the background
blue screen is actively back-illuminated. The camera or post-processing system is then programmed to eliminate only blue pixels which exceed a preselected threshold intensity. Thus, a person wearing a blue suit could be photographed with the blue screen technique if the image intensity of the actively lighted blue background is sufficiently brighter than the image of the blue suit, and the threshold is appropriately set. While the thresholding technique is useful, it is not without limitations, particularly in certain lighting conditions, and when the blue colors of the background image and object image are similar. Use of other background colors, such as a "green screen", may be useful for certain situations of background and superimposed image colors. However, "blue screen" is typically chosen for special effects involving images of people because human flesh tones tend to contain very little blue. Removing green coloring from human flesh tones tends to produce unrealistic skin coloring unless further computer image processing is applied to produce realistic images. An additional (expensive) image processing production step may therefore be required to correct flesh tones.
There is a significant need in both TV and motion pictures to achieve "blue screen" effects without being sensitive to any one particular color. In accordance with the present invention, methods and apparatus are provided for electronic image special effects such as superimposing a foreground image with a selected or synthetic background image. Generally in accordance with such method aspects of the present invention, a background scene is provided which radiates polarized light of uniform polarization orientation. The polarized light may be linearly or circularly polarized light, and the background scene may radiate the polarized light as a result of emission, transmission or reflection, but typically will be provided by
transmission of an artificial lighting source through a uniform polarizing filter. Preferably, the light radiated by the background object will be substantially uniform in intensity over its entire imaging zone. Advantageously, the light radiated from the background object may be wide-spectrum, or "white" light, as will be described in more detail hereinafter. Further in accordance with the present invention, a foreground scene is provided which radiates light which is substantially unpolarized. By "foreground scene" is meant one or more objects which are desired to be isolated from the background scene for superposition with another scene. The light radiated from the objects of the foreground scene may be emitted, transmitted or reflected, but typically will be reflected by the foreground objects from natural or artificial lighting sources. The foreground scene and background scene are electronically imaged on a discrete photosensor array comprising a plurality of pixels of an image lattice (e.g., as previously described) to provide an image at a first polarization resolution. Further in accordance with the present methods, the background scene and foreground scene are also electronically imaged on a photosensor array at substantially the same discrete pixel locations to provide an image at a second polarization resolution, and the pixels of the first image are compared to the corresponding pixels of the second image, with those exceeding a predetermined threshold of intensity difference being nulled in at least one of the images. Typically, the first image will be differenced from the second image at each corresponding image pixel position (e.g., on an image frame pixel-by-pixel basis) to provide an image polarization difference frame of the foreground and background scenes. The differencing of the images at individual pixel locations of the image lattice at different polarization resolutions is an important
feature of the present disclosure. Preferably an image in which the polarized background scene is relatively dark is subtracted from the corresponding image at a polarization resolution which provides a relatively lighter background scene, but the subtraction may be carried out in either order to provide an image polarization difference frame, with appropriate attention to the sign of the resulting data. The subtraction may be carried out digitally after converting the respective images at the first polarization resolution and at the second polarization resolution to digital values representing the image intensity at each respectively corresponding pixel location of the image. In converting the photosensor output to digital values, the response of each pixel, and the overall optical camera response, may be corrected for accuracy in accordance with conventional practice. The differencing may also be carried out on an analog basis (off-chip or on-chip), by subtracting charge packets, capacitance, current or voltage representations of image intensity, as may be desired in respect to particular system design considerations. Those pixel locations which exceed a predetermined threshold may be nulled in one or both of the images. By using the effectively crossed polarizers to distinguish the foreground from the background, under optimal conditions, relatively large differences are produced by the present methods between the respective intensities of the background pixels in the different polarization images. Preferably, those pixels for which the comparison shows that the darker pixel is less than 75 percent, and more preferably less than 35 percent, of the corresponding lighter polarization pixel, will be nulled. If only one of the images is processed to identify and null background pixels in this manner, it is usually preferable that the image be selected in which the background image is of lower intensity. In this manner, the polarized background light is minimized
throughout the entire image, including foreground object edge pixels, and object pixels which have a reflected light component originating from the background scene. This method is also particularly useful when a "colored" (e.g., blue) polarized light background scene is provided. However, both images may be processed to null those pixels which exceed a selected difference threshold in comparison between the two images, and performance benefits may be provided by combining (adding together) the corresponding pixels of both of the nulled images. In this way, the combined image includes all polarization components of the foreground object image.
The in-registration, pixel-wise differencing of images of different polarization resolution can quickly and simply identify those pixels of an image which are in the background scene, in order to remove them from the scene and isolate the objects in the foreground scene. The image frame may subsequently be "or"ed with a destination scene, to introduce the isolated foreground scene as a foreground scene in the destination scene, in accordance with conventional practice.
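A minimal sketch of this pixel-wise polarization keying and compositing follows (Python with NumPy); the 35 percent ratio reflects the preferred threshold stated above, while the function names and the simple "or" compositing rule are illustrative assumptions:

    import numpy as np

    def polarization_key(img_dark, img_light, ratio=0.35):
        # img_dark: frame in which the polarized background is extinguished
        # (crossed analyzer); img_light: registered frame in which it is
        # bright. Pixels whose darker value falls below ratio times the
        # lighter value are treated as polarized background and nulled.
        background = img_dark < ratio * img_light
        keyed = img_dark.copy()
        keyed[background] = 0
        return keyed

    def composite(foreground, destination):
        # "Or" the keyed foreground into a destination scene: nulled pixels
        # pass the destination through; non-null pixels replace it.
        return np.where(foreground > 0, foreground, destination)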
Having generally described various aspects of the special effects imaging methods and apparatus, the invention will now be further described with reference to the specific embodiment of special effects imaging apparatus 900 illustrated in Figure 9. As shown in Figure 9, the special effects apparatus comprises an electronic motion picture camera system similar to the camera system 100 of Figure 1, a back-illuminated background screen 902 which emits polarized white light, and photographic lighting units 904 for illuminating the foreground object "A" located between the back-illuminated background screen and the camera 100, with unpolarized white light.
A polarized light radiating background screen is an important part of the special effects imaging system 900. The illustrated embodiment 902 of a polarized-light emitting background screen comprises a lamp enclosure 906 which encloses a plurality of lamps and reflectors for uniformly illuminating the front surface 908 of the background screen. The front surface 908 of the background screen 902 comprises a transparent layer of linear light polarizing material, such as polarized glass, or polarized plastic sheets such as oriented iodine-containing polyvinyl alcohol [for example, see W. Gunning, "Improvement in the transmission of iodine-polyvinyl alcohol polarizers", Applied Optics, Vol. 22, pp. 3229-31 (1983)], which transmits substantially only that component of the light from the enclosed lamps which has an electric vector oriented in one (here, vertical) direction. Accordingly, the light emanating from the panel is linearly polarized white (broad-spectrum) light with its electric vector in a vertical direction, as shown by arrow 910. While the illustrated embodiment utilizes active back-lighting to generate a uniformly-lighted polarized light background zone, reflected light may also be polarized. In this regard, a polarizing layer or sheet backed by a reflective surface (such as an aluminized, polyvinyl alcohol polarizer sheet) is also an effective means for providing a polarized light background. When illuminated with a nonpolarized light source such as studio lights 904, the light transmitted through the polarizing sheet, reflected from the aluminized back surface, and again transmitted through the polarizing sheet toward the camera lens, is an effective polarized light background scene. Advantages of such a reflective-polarizing background are that it is passive, can be made relatively inexpensively in large surface area forms, and is light and portable.
The electronic studio camera 100 comprises a lens 102 like that of Figure 1, further including an electrooptic light polarization rotator 912 and a linear polarization light analyzer 914. The illustrated electrooptic light polarization rotator 912 is a conventional nematic liquid crystal, optical quality flat half-wave plate positioned in front of the polarization analyzer 914 with respect to light from the object "A" and the background scene 902, which functions to rotate the electric vector of light passing through the rotator 912 by 90 degrees [see, e.g., Jacobs, S., "Liquid Crystals as Large Aperture Waveplates and Circular Polarizers", pp. 98-105, SPIE Vol. 307 Polarizers and Applications, (1981)]. As shown in Figure 9a, the 90 degree rotator 912 comprises a birefringent nematic liquid crystal layer of uniform thickness between optical glass orienting sheets provided with outer transparent electrodes. The orientation vector of the liquid crystal is substantially at a 45 degree angle with respect to the vertical polarization vector 910 of the light emanating from the background screen 902, so that the orientation of this vertically polarized background light may be selectively rotated, permitting the camera to record respective parallel and crossed polarization images. It is noted that when movement is present between different polarization image frames, several procedures may be used.
As indicated, there may also be differences between the vertically polarized image of the object scene and the horizontally polarized image. In this regard, reflected light from an unpolarized source may be partially polarized, and light from the background screen which is reflected from or transmitted through the object scene may retain some of its polarized character in original or rotated orientation. Accordingly, a preferred method for providing an object scene for combining with a different background scene comprises the steps of comparing a first image and a
second image of different polarization filtering on a pixel-by-pixel basis, identifying and nulling the background pixels in the first image and the second image, and combining the first image and the second image to provide a composite image. The composite image will be substantially free of polarization defects.
Another method for providing a first image having a first polarization resolution between its object scene and its background, and a second image having a second polarization resolution between its object scene and its background scene which is different from the first polarization resolution, is to illuminate the object scene with polarized light. With reference to Figure 9, for example, the lamps 904 may project light which is linearly or circularly (right- or left-handed) polarized. Depending on the object characteristics, the reflected light may retain at least a portion of its polarization characteristics, which may be resolved as generally described to provide images of different polarization resolution of an image having predetermined differences in object light polarization.
While the camera of Figure 9 is a "black-and-white" camera, the imager may be a conventional single-chip color imager which is used to provide color images of different polarization for comparison and identification of background or foreground objects; or three (or another multiple number of) separate color-selective imager systems, preferably RGB imagers in precise image registration, may be used to provide color images. The comparison of relative intensity between pixels of different polarization resolution may be carried out at each of several wavelengths or wavelength ranges (e.g., RGB) to provide increased statistical accuracy in the identification of pixels to be nulled. It should also be noted that while white polarized light is preferred for the background scene, polarized light of selected wavelength(s) may also be used.
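Where color imagery is available, the per-channel differences may be pooled before classification for the increased statistical accuracy noted above. The following sketch is illustrative only; the voting rule and parameter values are assumptions, not requirements of the present methods:

    import numpy as np

    def background_mask_rgb(parallel_rgb, crossed_rgb, threshold=60, min_votes=2):
        # Classify a pixel as background only if at least `min_votes` of its
        # R, G, B channels show a polarization difference exceeding the
        # threshold, reducing misclassification from single-channel noise.
        diff = np.abs(parallel_rgb.astype(np.int16) - crossed_rgb.astype(np.int16))
        votes = (diff >= threshold).sum(axis=-1)   # per-pixel count over R, G, B
        return votes >= min_votes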
Illustrated in Figure 9b is a dual-imaging electronic camera system 950 like that of the camera of Figure 1a, which acquires simultaneous images of different polarization resolution to avoid image differences caused by object movement. The polarization-resolving electronic camera system 950 comprises two imager arrays which simultaneously acquire images of the same scene at different polarization resolution. In the camera 950, which may be otherwise similar to camera 150 of Figure 1a, a conventional polarizing beam-splitter 952 divides the light 954 from lens 102 into two parts 956, 958 of different polarization. The beam-splitter 952 may be any suitable polarization-resolving beam-splitting optical element. For example, the polarization beam-splitter 952 may be a dielectric mirror oriented at the Brewster angle with respect to the incident light from the lens 102, so that the reflected light 956 contains one direction of linear polarization having its electric vector normal to the plane of incidence. The polarization beam-splitter may also be a grid polarizer comprising a grid of parallel, reflecting conductors (in which the period of the grid is smaller than the wavelength of the incident light), which reflects one polarization of incident light while transmitting the other [see Yeh, P., "Generalized model for Wire Grid Polarizers", SPIE Vol. 307 Polarizers and Applications, pp. 13-24 (1981); Slocum, R., "Evaporative Thin Metal Films as Polarizers", SPIE Vol. 307, pp.
25-30 (1981)]. The polarization beam-splitter may also be, for example, a submicron-etched, metallized mirror which selectively reflects light components having polarization with electric vector parallel to the longitudinal axes of the conducting strips, and transmits light components of orthogonal polarization. Such a beam-splitter 952 is illustrated in Figure 9a,
and comprises a transparent dielectric substrate 980 such as quartz on which is deposited a thin reflective metal coating 982 such as a 500 Angstrom aluminum coating. The reflective metal coating is masked and etched to form a submicrometer grating of reflective metal strips using conventional holographic and plasma or chemical etching techniques [e.g., see D. Flanders, "Submicrometer periodicity gratings as artificial anisotropic dielectrics", Appl. Phys. Lett. Vol. 42, pp. 492-494 (1983); Y. Ono et al., "Antireflection effect in ultrahigh spatial-frequency holographic relief gratings", Applied Optics Vol. 26, pp. 1142-46 (1987); L. Cescato, "Holographic Quarterwave Plates", Applied Optics Vol. 29, pp. 3286-90 (1990)]. Because the period and reflective strip width of the grating are small with respect to the light wavelength (e.g., for a period of 100-250 nanometers, a width of 35-55 percent of the period), the submicrometer mirror grating device reflects the light component with electric polarization vector parallel to the axes of the etched reflective metal strips, and transmits the light component with electric vector perpendicular to the reflective strips. It should also be noted that by providing a dielectric submicrometer periodicity grating between transparent electrodes, with the grating slots filled with an oriented liquid crystal such as illustrated in Figure 6, a variable electrooptic birefringent plate is provided, which may serve as a polarization rotator.
Accordingly, using any of a variety of polarizing beam-splitters, the beam 958 which is transmitted through the polarizing beam-splitter is predominantly polarized in a direction orthogonal to the beam 956.
Real-time comparison of the two images of different light polarization intensity may be performed using suitable differencing apparatus or circuitry as previously described, such as conventional frame grabber boards or on-chip storage and subtraction circuitry. The effectiveness of such polarization-differencing image combination may be increased for various situations (such as when there is rapid image movement between differenced images) by image edge enhancement as previously described. The detected edges may be used to verify pixel null zones, which are desired to be replaced by another image, or to select pixels which are to be "blended" with the new background scene. Further in this regard, a camera such as that of Figure 1b may be used to simultaneously image parallel and cross-polarized images so that pixel contrasts due to object motion are minimized. The polarization axis of the beam-splitter 952 is desirably oriented so that it is either substantially parallel to, or substantially orthogonal to, the polarization vector of the polarized light incident thereon from the background screen 902. In this way, the polarization-difference effects are maximized. In the illustrated embodiment 950, the polarization axis of the beam-splitter 952 is perpendicular to the polarization vector of the vertically-polarized light from the background screen 902, so that this light 958 is selectively transmitted to CCD imager 104. The orthogonally polarized light 956 (with respect to the light 958) is directed to CCD imager 105. The optical system comprising the lens 102, polarization beam-splitter 952 and imager 104 accordingly records light having a different polarization than the optical system formed by the lens 102, polarization beam-splitter 952 and the imager 105. Because the imagers 104, 105 are in
accurate, pixel-by-pixel registration with each other, two simultaneous electronic images of the background scene and the object scene are acquired by the camera 950, in which the individual pixels spatially correspond. Desirably, the registration will be within 0.5 of the width of the pixels of the array, and preferably within 0.15 of the pixel width, across the entire image frame of the imagers 104, 105. Because the images of different polarization resolution are made simultaneously, the differential effects of object motion (e.g., between successive frames) are minimized. The images may be compared on a pixel-by-pixel basis; those pixels receiving light from the polarized light-emitting background will have a relatively large intensity difference between the two images of different polarization resolution because of the effects of the polarization selectivity. These pixels may be identified for nulling or replacement by the replacement special effects background. Those pixels having an intermediate difference of intensity between the two images of different polarization resolution may be regarded as having a mixture of background and desired object light, either because the pixels are at an image edge, or because of object motion during the exposure time which blends background and foreground light in the pixel. Such pixels may be identified for blending with the special effects background, preferably in substantially the same or similar proportion to the proportion of background and foreground light determined by the comparison step. The intensity value of such intermediate pixels may desirably be the value of the "darker" pixel from the image frame in which the polarized background light is more selectively excluded. Those pixels which are substantially the same or similar in intensity in the two images of different polarization resolution are selected as foreground objects which will
be superimposed on the special effects background scene. These pixels may be combined and effectively averaged if it is desired to produce a scene effectively derived from unpolarized light. High quality special effects may be produced by the present methods and apparatus in an efficient manner, even when using relatively simple or inexpensive equipment. For example, highly effective special effects image processing has been performed using only a polarizing light sheet, a polarizing filter and a scientific grade "black and white" electronic imaging video camera (Hitachi KP-M1), the analog signal output of which was directed to a video frame grabber board (Willow PV1006-200) installed in a standard Compaq 386/20 microcomputer. To create "special effects" imaging with this equipment, a "close up" image (from above) of several paper clips illuminated from above in unpolarized light against a linearly polarized light background provided by a backlit Polaroid HN32 linear polarizer sheet on a Bishop Graphics Light Table was selected as a test scene. A first electronic image was made of this scene using a sheet of the Polaroid HN32 linear polarizer directly in front of the camera lens in polarization alignment with the background light polarization direction, so that the backlit polarized background was relatively bright. An image frame of the resulting "bright background" Hitachi camera video output was digitized at a full scale digitizer range of 8 bits (256 scale units), stored by the Willow frame grabber board and subsequently stored in digital image format on the hard storage disk of the Compaq microcomputer. The polarizing filter in front of the Hitachi camera lens was subsequently manually rotated about 90 degrees, so that the backlit polarized background appeared to be relatively dark by visual inspection, although the paper clips were relatively unchanged in appearance. An image frame of the "dark background" Hitachi camera video
output at substantially the same exposure time as the "bright background" image was also digitized over the 8-bit full scale digitization range of 256 units, stored in the Willow frame grabber board, and subsequently stored in digital image format on hard storage disk by the Compaq computer. The "dark background" image frame was then compared to the "bright background" image frame to identify those corresponding pixels which were significantly different in intensity in the two electronic images. The comparison was carried out by subtracting the digitized values of the "dark background" image pixels from the digitized values of the corresponding image pixels of the "bright background" image frame, on a corresponding pixel-by-pixel basis, by means of a Turbo C program running on the Compaq computer [program attached as Appendix A]. Those pixel-subtraction image locations exhibiting a difference in intensity of at least 60 out of the full-scale 256 A/D units were "nulled" (set to zero) in the image memory of the "bright background" image, while those image-pixel locations which had an intensity difference of less than 60 eight-bit units were left unchanged, to produce an "insert" image suitable for combining with a different background image. An image frame of a test background image of a person in the test area was also captured using the Hitachi camera, digitized at a full scale digitizer range of 8 bits (256 scale units) and stored by the Willow frame grabber board, and subsequently stored in digital image format on the hard storage disk of the Compaq microcomputer. The "insert" image having the nulled pixels was then combined with the test background image by substituting only the non-zero image pixels of the paper clip special effects "insert" image in the person-test image. The resulting image was an excellent "special effects" image having the paper clips at a relatively large magnification clearly and cleanly
superimposed on the person-test image at a relatively smaller image scale.
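The three-way classification and proportional blending described above may be sketched as follows, using the 60-unit background threshold of the working example; the second (foreground) threshold and the blending rule are illustrative assumptions for grayscale frames:

    import numpy as np

    def classify_and_blend(bright, dark, new_bg, t_back=60, t_fore=20):
        # Large bright-minus-dark difference: background (replaced);
        # small difference: foreground (kept); intermediate difference:
        # blended in proportion to the measured difference.
        diff = bright.astype(np.float32) - dark.astype(np.float32)
        out = np.empty_like(bright)
        back = diff >= t_back
        fore = diff <= t_fore
        mixed = ~(back | fore)
        out[back] = new_bg[back]
        out[fore] = bright[fore]
        # Blend fraction: estimated share of the pixel's light that came
        # from the polarized background.
        alpha = (diff[mixed] - t_fore) / float(t_back - t_fore)
        # Use the "darker" frame for the foreground share, as described above.
        out[mixed] = ((1 - alpha) * dark[mixed].astype(np.float32) +
                      alpha * new_bg[mixed].astype(np.float32)).astype(bright.dtype)
        return out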
Although, for purposes of discussion, the use of fixed-location imagers has been described which produce a final image having approximately the same number and arrangement of image-pixels as the number and arrangement of discrete photosites on the imaging plane of the imager, imaging methods and apparatus employing imager-scanning may be used which produce more detailed and/or larger images having more image-pixels than the imager. Moreover, because edge-enhancement image processing methods (or polarization-resolution "special effects" image processing) as previously described are carried out by differencing (or comparing) light from image sites obtained at different imaging (or polarization) resolutions, such higher-resolution and/or larger-format methods may be carried out relatively independently at each image pixel site, and may be relatively independent of the fill factor of the imager (the percent of the image sensor area occupied by active photosensor area). Such methods and apparatus in accordance with the present invention, particularly including high resolution systems, are particularly beneficial for use with low fill-factor electronic imagers. By "low fill-factor electronic imagers" is meant an imager array of discrete photosites (or pixels) in which the effective active photosensor area is less than 50 percent of the active imaging zone of the imager. The effective active photosensor area may be increased (or decreased) by lenticular microlenses, digital lenslet arrays and masks, such as described herein. Imagers which have pixel neighborhood image processing or other specialized electronic integrated circuitry surrounding or associated with each pixel may have low fill factors, particularly if not effectively increased by light-gathering structures such
as lenticular layers adjacent to photosites or binary optic lenslet arrays spaced slightly apart from the photosites. To provide increased resolution, imagers are conventionally used in an image-scanning mode in which successive images are made after moving the imager by a pixel-width in the image field ("step-and-repeat" or "fine-scan" operation). Imagers used to increase image resolution in this manner are also intentionally designed to have low fill factors [J. Milch, "High Resolution Digitization of Photographic Images with an Area Charge-Coupled Device (CCD) Imager", Applications of Digital Image Processing, Proc. SPIE 697, pp. 96-104 (1986); Kontron commercial camera]. Accordingly, it will be appreciated that the methods and apparatus of the present invention may use systems in which the output of the photosensor pixels of an imager forms an image frame in which the pixel locations form a regular image lattice, which conventionally may be a regular rectangular (e.g., square) or hexagonal array, consisting of a single image frame taken by the imager. The methods and apparatus may also use systems in which the completed image may comprise an integrated or interdigitated array of image frames obtained at different times or at different imagers, which together form a complete image in which all of the lattice pixel positions are filled with image data, as will be discussed hereinafter.
The methods and apparatus of the present invention may be used to readily carry out edge enhancement/image processing or polarization resolution comparison of such step-and-repeat/fine scan (or other interdigitated) images, by obtaining a relatively high resolution image and a relatively low resolution image at each respective pixel photosite fine scan position, and differencing or comparing the images on a same pixel image-site by pixel image-site basis.
However, conventional step-and-repeat fine scan systems typically require sequential "x" and "y" positioning in which each pixel moves only one step in either direction between fields to form a rectangular scan-path of imaging sites. Such systems are complicated in requiring coordination and settling of dual electronic and mechanical driver circuits and apparatus. Such systems may also be limited in travel distance and speed, and to images of the approximate size of the imager array. Improved methods and apparatus for image scanning which permit linear scanning would be desirable for simplifying the control activity and cost, and for increasing the speed of operation. In addition, methods and apparatus which provide for larger scanning distances than single-pixel-width displacement would be desirable for real-time, large-area scanning of images, such as X-ray images and motion picture film, at very high resolution. It would be further desirable to apply the edge enhancement and polarization-resolution methods of the present invention to such high-resolution linear-scanning and wide-area scanning imaging systems to provide very high resolution images, edge enhanced images or image edges, or polarization resolved images for special effects. In this regard, various aspects of the present invention are also directed to methods and apparatus for fine-scanning an electronic image. Imager arrays and linear scanning patterns as shown in Figure 10, for CCD and other 2-D pixel-array electronic imagers, can be used to greatly increase image resolution. Linear scanning methods are simpler and cheaper to implement than existing serial x-y positioning systems. The scanning methods may also be used for medical X-ray imaging, high resolution microscopy, motion pictures, and high resolution electronic photography. The linear scanning methods and apparatus are compatible with the previously described edge enhancement methods and
apparatus and polarization resolution methods and apparatus for very high resolution edge enhancement and/or special effects electronic photography as previously described. As shown substantially to scale in Figure 10a, a lattice-filling pixel imager is used which is linearly scanned at the focal plane of a camera lens system to provide a high-definition image the approximate size of the imager chip. Because successive pixel rows are displaced by one pixel width in a regular manner, by successively advancing the imager by increments of one pixel width along the pixel column direction, an imager pixel is positioned at each of the image lattice locations over one pattern-repeat distance (shown by arrow 1052). By obtaining an electronic image at each imager location in the linear scanning pattern, and storing the respective pixel image data in the respective image memory lattice position corresponding to its true position in the image, a relatively high resolution image may be provided. Figure 10 is a schematic illustration of a high resolution, step-and-repeat camera 1002 which utilizes multiple, linearly-scanned image frames to fill an image pixel lattice, and Figure 10a is a schematic top illustration of a positionally-displaced-pixel imager array 1050 of the camera 1002, which may be linearly scanned at its focal plane to provide a high-definition image and/or a high-definition edge-enhanced image the approximate size of the imager, but at substantially higher resolution than that provided by one frame of the imager. As shown in Figure 10, the camera system 1002 may be similar to the camera system of Figure 1, but further includes means for scanning the imager 104 along a straight line in the focal plane, in a direction perpendicular to the rows of pixels of the imager 104. In the illustrated embodiment, the imager is mounted on each side along the axis of travel to suitably mounted "long-scan" electromechanical positioners 1004 which operate under control
of controller 106. The electromechanical positioners may be appropriate devices such as precision "loudspeaker" type driver coils, servosolenoids, etc., for precisely controlling the position of the imager, preferably to a positional precision of less than half of a pixel-width of the imager 104 pixels 202, and more desirably to less than 10 percent of the pixel width, over a distance of at least 6 imager pixel widths. The linear scanning means of the camera 1002 also includes "short-scan" piezoelements 1006 under operational control of the controller 106, which have a total positional-displacement scanning distance of about one pixel width. In operation, a series of six image frames is made and stored by the camera for each complete image; after each image is made, the imager is advanced along the scan line by the "long-scan" drivers 1004, or the long-scan drivers together with the "short-scan" drivers 1006, until the six-image scan-repeat pattern has been completed, after which the imager is returned to its original position. The composite image obtained in this manner has approximately six times the "sharpness" of a single frame. In order to provide an edge-enhanced image, relatively focused and relatively defocused image frames are made at each imager position, and the image-points are processed as previously described by differencing the corresponding image points. Similarly, the camera 1002 may be used for polarization-resolution special effects, by obtaining image frames of different polarization resolution at each imager position in the scan pattern, and comparing the positionally-corresponding image lattice points to identify those which differ by at least a predetermined amount, as previously discussed. The linear scanning pattern of the imager 104 is shown in detail in Figure 10a, which is a top view, partially broken away, of a small portion 1050 of only 18 pixel photosites 202 of the imager 104 (which may, for example, have a 512x512 array of
photosites). The pixels of the imager 104 are specially positioned to permit linear scanning in the direction indicated by scan direction arrow 1052, and are sized to have a fill-factor of only 1/6 of the imager surface. The positions of the imager pixels during the first image of the scan sequence are shown by the numeral 1, the positions of the same pixels during the second image are shown by the numeral 2, the positions of the imager pixels during the third image of the scan sequence are shown by the numeral 3, the positions of the imager pixels during the fourth image are shown by the numeral 4, the positions of the imager pixels during the fifth image of the scan sequence are shown by the numeral 5, and the positions of the imager pixels during the sixth image of the scan sequence are shown by the numeral 6. By interdigitating these images (or their edge-enhanced or polarization-compared counterparts), a composite image, or edge-enhanced composite image, or special effects image may be produced at high resolution. The linear scanning may be carried out using a suitable electromechanical displacing component, such as a piezoelectric element, a servomicromanipulator, a coil driver, or a servo driver. Optionally, a "long" driver having a constant rate of displacement (such as a coil or servo driver) may be combined with a short-range driver such as a piezoelement to produce a "start-stop" motion for imaging, with the image exposure being carried out as the long-range driver continues to move in one direction while the short-range driver oscillates in the opposite direction to temporarily substantially "stop" the motion of the imager. Alternatively, the "long range" (e.g., over 30 microns) driver may be driven in "start-stop" mode. By extending the frame scan distance at least one additional pattern repeat distance, and processing redundant active pixel output, pixel response non-uniformity, as well as "dead" pixels in the imager, can be eliminated or
alleviated. The scanning pattern of the imager is also shown by line 1054, which is a plot of scan distance versus time to the scale of the imager 1050, when the imager is operated in the "start-stop" mode produced by a steadily advancing "long" driver 1004 with an oscillating "short" driver 1006. The images are exposed when the imager is relatively stationary, shown by sections 1056 of the graph 1054, and the imager read-out occurs while the imager is moving, shown by sections 1058 of the graph 1054. Different-focus or different-polarization images may be made while the imager is in each position, or the images may be made during a repeat scan. The imager may be quickly returned in one cycle, as shown by graph segment 1060, or may also scan the image during its return. Dual imaging cameras such as those of Figures 1a and 9b with fine-scanning imagers may be used to simultaneously capture images of different focus or polarization resolution, as previously described. It should be noted that while the embodiment of Figure 10 linearly scans the imager chip, other scanning-position elements, such as a pixel-like mask with a lenslet array, or the object itself, may similarly be scanned to provide a high-resolution, image lattice-filling effect by combining sequential images, and that other linear scanning patterns and pixel arrangements, which provide various resolutions, may similarly be used.
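The interdigitation of the six scan frames into the image lattice may be sketched as follows, assuming one frame per scan position and pixel rows offset by one pixel width; the names and dimensions are exemplary:

    import numpy as np

    def interleave_scan_frames(frames):
        # frames: list of N equal-size arrays; frame k is assumed to sample
        # every Nth lattice row, starting at row offset k.
        n = len(frames)
        rows, cols = frames[0].shape
        lattice = np.zeros((rows * n, cols), dtype=frames[0].dtype)
        for k, frame in enumerate(frames):
            lattice[k::n, :] = frame   # place frame k at rows k, k+n, k+2n, ...
        return lattice

    # Example: six 85x512 frames yield one 510x512 composite.
    frames = [np.random.randint(0, 256, (85, 512), dtype=np.uint8) for _ in range(6)]
    composite = interleave_scan_frames(frames)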
While the imager of Figure 10a is a "black and white" imager, Figure 11 is an illustration of a linear-scanning, lattice-filling imager array 1102 which is particularly adapted for color images, comprising rows of red (1104), blue (1106), and green-sensitive (1108) pixels. The imager of Figure 11 is linearly scanned, for example in one pixel-length steps, a total pattern-repeat distance three times as long as that of the monochromatic or "black-and-white" imager 104 of Figures
10 and 10a, to produce three high resolution images at each of the three spectral sensitivities. High-speed imagers having a frame rate of at least 500 frames per second are particularly preferred for such scanning systems, for real-time HDTV or motion picture utilization. While the illustrated color imager has three equal color stripes, it will be appreciated that other designs to avoid aliasing may be provided, and that in other designs, such as diagonal color stripe placement, the imager may travel the pattern-repeat distance along the pixel columns like that of Figure 10, be displaced one pixel width along the row, repeat this pattern-repeat scan, be displaced another pixel width along the row direction, and repeat the pattern scan, to fill one color image lattice. Moreover, one sensitivity, such as green, may have twice the pixel density for limited-bandwidth applications such as HDTV. Such systems have a variety of uses, including electronic photography and very high resolution archival imaging. Such systems can also be used to convert steadily advancing motion picture film to high resolution electronic images, using a micro-oscillating imager strip, as shown in Figure 14. While the camera 1002 of Figure 10 is designed to scan the imager in single-pixel-width steps to produce a composite interdigitated image the approximate size of the imager array, Figure 12 is an illustration like that of Figure 10a, of a portion of a fast-skip-scanning, lattice-filling imager 1202 which may be used to provide a fine-resolution electronic image and/or a fine-resolution edge-enhanced image, and/or a fine-resolution polarization-resolved special-effects image of relatively large size which substantially exceeds the size of the imager array. The linear scanning pattern of the imager 1202 is shown in detail in Figure 12, which, like Figure 10a, is a top view, partially broken away, of a portion 1202 of 18 pixel photosites 202 of a wide, short imager (which may,
for example, have a 512x6 or 512x12 array of photosites). The width of the imager determines the width of the image, but the length of the image, which is scanned in the "short" direction of the imager, may be indefinitely long in the direction of scan travel. As in the case of the imager 104 of Figure 10a, the pixels of the imager 1202 are specially positioned to permit linear scanning in the direction indicated by scan direction arrow 1204, and are sized to have a fill-factor of only 1/6 of the imager surface. The pixels are rectangular, and narrower in the direction of scanning travel, to accommodate motion during exposure time. To illustrate the skip-scanning pattern, the pixels are grouped as A, B, C, D, E, F, and G. The imager, or the image, is moved a multiple-pixel-width distance shown by arrow 1204 between each exposure, which creates a pattern, with a plurality of adjacent groups (or rows) of other pixels, to fill all of the lattice positions of the image. The relative position of one of the imager pixels with respect to the image during the first image of the scan sequence is shown by the numeral A1, the position of the same pixel during the second image is shown by the numeral A2, the position of the imager pixel during the third image of the scan sequence is shown by the numeral A3, the position of this imager pixel during the fourth image is shown by the numeral A4, the position of this imager pixel during the fifth image of the scan sequence is shown by the numeral A5, and the position of this imager pixel during the sixth image of the scan sequence is shown by the numeral A6. Similar positions of the B, C, D, and E imager-pixels are shown in Figure 12 along the same scan column. By interdigitating these images (or their edge-enhanced or polarization-compared counterparts), a composite image, or edge-enhanced composite image, or special effects image may be produced at high resolution, at indefinite length
depending on the scan time to produce the image. By extending the number of columns of pixels at least one additional pattern repeat distance (here, 6 pixels) and processing redundant active pixel output, pixel response non-uniformity, as well as "dead" pixels in the imager, can be eliminated or alleviated. In a manner similar to Figure 10a, the scanning pattern of the imager 1202 is also shown to scale by line 1254, which is a plot of scan distance versus time to the scale of the imager 1202, when the imager is operated in a "start-stop" mode produced by steadily advancing the image or the imager while operating an oscillating "short" driver. The images are exposed when the imager is relatively stationary with respect to the image projected thereon by the lens system of the camera, shown by sections 1256 of the graph 1254, and the imager read-out occurs while the imager is moving with respect to the image, as shown by sections 1258 of the graph 1254. Different-focus or different-polarization images may be made while the imager is in each position, or a second set of pixel groups may be provided which has defocusing means adjacent the pixels, or a polarization-resolving layer adjacent the pixels, to provide the data for differencing and/or comparing, as previously described. While one fast-skip pattern is shown in Figure 12, other fast-skip, image-lattice-filling patterns will be apparent from the present description.
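One possible indexing of such a fast-skip pattern is sketched below, assuming six pixel groups staggered one pixel apart and an advance of six pixel widths per exposure; these specific numbers are illustrative assumptions rather than the geometry of Figure 12:

    def skip_scan_covers_lattice(n_groups=6, advance=6, n_frames=20):
        # Map (group, frame) to the image-lattice row each pixel group samples,
        # and confirm that the skip-scan pattern fills every row exactly once.
        hits = {}
        for frame in range(n_frames):
            for group in range(n_groups):      # groups staggered one pixel apart
                row = group + frame * advance  # multiple-pixel advance per frame
                hits[row] = hits.get(row, 0) + 1
        # Rows fully inside the scanned span should each be sampled once.
        span = range(n_groups - 1, (n_frames - 1) * advance)
        return all(hits.get(r, 0) == 1 for r in span)

    print(skip_scan_covers_lattice())   # True: each lattice row covered exactly once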
A long-image scanning system like that of Figure 12 is particularly desirable for medical imaging at very high resolution. Figure 13 is an illustration of an X-ray imaging system 1300 which utilizes a long, narrow, end-abutted array of fast-skip-scanning imager chips of Figure 12 to provide electronic X-ray images and edge-enhanced X-ray images of extraordinary resolution. The imager chips are mounted in the lower zone 1304 of a scanning assembly 1306 of the X-ray imaging machine 1300. A very narrow X-ray source beam,
which is only slightly wider than the narrow width of the photosensor array in its direction of scanning travel, is provided by X-ray generator 1308 at the upper zone of the scanning assembly 1306. To provide an X-ray image of a person or other object in the scanning zone 1310, the scanning assembly is scanned along the object at a scanning rate and imaging rate which provide a fast-skip-scan imaging pattern of the type described in Figure 12. The scanning X-ray imager may be optically coupled to an X-ray sensitive phosphor by a fiber optic array, or may utilize a CIS (Contact Image Sensor) staggered double-row array of graded-index linear microlenses (so-called "rod lenses", which each have parabolic refractive indices and are so arranged and spaced that they create a continuous image by overlapping fields of view) of a coverglass coated with a suitable X-ray sensitive material [Photonics Spectra (November 1992), pp. 84-88]. Such systems may be optically defocused to provide a relatively defocused image from which to subtract a focused image. However, to provide the highest resolution, the X-ray sensitive material may be deposited directly on the imaging chip by an appropriate technique such as laser or electron beam deposition or sol-gel deposition, or by mechanically mounting or bonding an X-ray sensitive layer immediately adjacent the imager. In the case of direct deposition on the imager of an X-ray fluorescent material (depending on the constituent elements and purity of the X-ray fluorescent material), the surface of the imager is desirably protected by a dense transparent layer of a protective material such as silicon dioxide or silicon nitride. When obtaining a high-resolution image, it is preferred that the X-ray sensitive layer be at least partially etched around each pixel, to prevent light generated by X-ray fluorescence around the periphery of the pixel or at adjacent pixels from entering the pixel. However, in order to provide a lower-resolution
image for differencing with such a high resolution X-ray image, the X-ray sensitive layer may be spaced apart from the pixels by a transparent or translucent dielectric layer to permit the fluorescent light generated around each pixel to diffuse in a relatively Gaussian profile before entering the respective pixels. As described with respect to Figure 12, a skip-scanning pixel array may include "focused" X-ray pixels and "defocused" X-ray pixels on the same chip. Skip-scan patterns may also be used to provide very high resolution electronic still cameras. Figure 14 is a schematic illustration of an embodiment 1400 of an imaging system for converting photographic images such as still photographs and motion picture film to electronic images at high resolution, which may include edge enhancement processing if desired. The electronic "still" camera 1400, which provides extremely high resolution, has a narrow electronic image shutter array, which may comprise one, or a plurality of, end-abutted, long narrow imager chips bonded to a suitable substrate. The electronic shutter is physically scanned at a fast rate (e.g., 1 second to 1/200 second or less) along the focal plane of a conventional camera (such as by replacing the focal plane shutter of a conventional 35mm camera with the imager chip array), or the image focal plane may be scanned linearly across the linear imager array. The physical scanning rate and the total scanning distance are multiplied by extending the distance of the scan pattern, while sampling each image pixel zone over the scan time, by using succeeding rows and storing the sampled pixel locations in a suitable image processor memory system. A camera in which the relatively simple "skip-stop" image pattern of Figure 12 advances 25 microns with each frame, at a frame rate of 4,000 frames per second, produces an image-plane image almost 5 centimeters long at extremely
high definition in a 1/2 second exposure, or "shutter time". Longer "skip-stop" patterns increase the complexity of pixel alignment, the number of rows necessary for whole-image reconstitution, and pixel reconstitution image memory requirements, but increase the "shutter speed" capability for a given image length. As described hereinabove, redundancy of image point collection can be used to increase image accuracy and eliminate "dead" pixel image problems. Such linear scanning systems may also be utilized as the imaging system for electronic imaging microscopes which are adapted to provide high-resolution images and/or high resolution edge-enhanced electronic images. For example, by using only an objective lens and mounting a CCD at the "intermediate" image formed by a microscope objective, the magnification can, for example, be increased by a factor of up to 100. By utilizing a linear-scan CCD system with a pixel size of 25 microns down to several microns, lens-diffraction-limited resolution can readily be provided, and the light loss and distortion from eyepiece lenses and camera optics are greatly reduced (this is particularly desirable for low-light-level fluorescent microscopy). The use of only an objective lens greatly reduces the cost of high-resolution ultraviolet microscopy (e.g., using the 245 nm wavelength from high pressure mercury lamps) because quartz optics are very expensive. The microscope slide may be "scanned", with the condenser-objective-CCD assembly remaining stationary. A continuous linear scan of the slide could be used with a variable linear-scanning angle theta (as described hereinafter), or a fixed pixel geometry may be used for scanning an entire image or for skip-scanning as described hereinabove, preferably with a vibratory in-axis motion of the CCD for "stop action" of each frame. Successive scans or neighboring images (e.g.,
displaced orthogonal to the scan direction) may be blended using data-blending software.
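Returning to the "skip-stop" camera example above, the quoted image length follows directly from the stated scan parameters; a short verification:

    advance_per_frame_m = 25e-6      # 25 micron frame advance
    frame_rate_hz = 4000             # 4,000 frames per second
    exposure_s = 0.5                 # half-second "shutter time"

    image_length_m = advance_per_frame_m * frame_rate_hz * exposure_s
    print(image_length_m)            # 0.05 m, i.e. 5 centimeters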
Figure 15 is an illustration of the pixel scanning pattern and direction of an adjustable-resolution scanning imager 1502 of a variable resolution electronic camera system like that of Figure 10, which permits variation of the angle (theta) of the scan direction with respect to the pixel column (or row) axes. Wide variation of resolution and operation can be achieved by varying the linear scan direction (angle theta) and the distance 1504 the imager 1502 is displaced relative to the image of the rectangular pixel lattice, as shown in Figure 15. By providing a camera having means for varying the angle theta, and for varying the image scan travel distance along the scan line between image frames, an adjustable resolution camera can be provided under operator control. Frame images may be taken at a desired resolution along the scan line, and the resolution orthogonal to the scan line may be varied by varying the angle theta of the scan line. Such cameras may be used to provide electronic images, as well as edge-enhanced and/or special effects images.
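A simple geometric model of the effect of the scan angle theta is sketched below: the component of each scan step taken across the pixel columns sets the cross-scan sampling pitch. The model and values are illustrative assumptions only:

    import math

    def cross_scan_pitch(step_um, theta_deg):
        # Effective sampling pitch orthogonal to the pixel columns when the
        # imager is displaced step_um along a scan line at angle theta to the
        # column axis; a smaller cross component gives finer resolution.
        return step_um * math.sin(math.radians(theta_deg))

    for theta in (90, 30, 10, 5):
        print(theta, round(cross_scan_pitch(24.0, theta), 2))
    # e.g. a 24 micron step at theta = 5 degrees yields a ~2.1 micron cross pitch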
While the scanning of the image with respect to the imager may be performed mechanically by relative movement of the imager and the lens system, or by electromechanical mirrors, it may also be accomplished by electrooptic steering of the imaging light which is directed to the imager by the lens system. Birefringent electrooptic beam steering systems are particularly desirable because of their light weight, simplicity, and controllable precision. In a very simple system for demonstration purposes, the horizontal resolution of a test pattern of vertical lines was substantially doubled, using a CCD camera with noncontiguous pixels, by placing a sheet of linear polarizer material (Polaroid 30 mil thick HN30 polarizer material) in front of the camera
lens (between the lens and the object), and a 2 millimeter thick birefringent quartz plate between the lens and the CCD detector array. The quartz plate produced a birefringent image displacement of about 12-13 microns, and was oriented such that the axis of the shift between the ordinary ray and the extraordinary ray was along the horizontal axis of the CCD imager array. In unpolarized light, the quartz plate produced two images which were horizontally separated by 12-13 microns, which was approximately half of the center-to-center spacing of the low fill-factor imager pixels. By rotating the polarizer to first select the horizontally polarized light, and then select the vertically polarized light, each of the images produced by the quartz plate was isolated and separately imaged by the detector array to produce separate 256x256 images. The two 256x256 images were interleaved on a horizontally alternating pixel-by-pixel basis to form a 512x256 (horizontal:vertical) image having substantially doubled horizontal resolution. Illustrated in Figure 16 is an electrooptic beam steering system 1600 which may be used to substantially quadruple the resolution of a camera system, for example like that of Figure 1, having an imager with square pixels arranged in an array having approximately a 25% fill factor, with inactive spaces of approximately the same dimension as those of the active pixels on both the vertical and horizontal sides thereof. The electrooptic steering system functions by rotating the polarization of light to select or deselect the birefringent displacement of one or more birefringent plates having a displacement distance which is oriented with, and corresponds to, the desired stepping distance(s) of the image at the imager to achieve the increased resolution. As illustrated in Figure 16, each stage of the electrooptic scanning system 1600 increases the resolution along one axis, either the horizontal or the vertical axis of the camera imager. The system 1600 illustrates a very
simple system which doubles camera resolution in both the horizontal and vertical directions. It comprises a polarizer 1602, an electrooptic polarization rotator 1604, a horizontally-oriented quartz birefringent plate 1606, an electrooptic polarization rotator 1608, and a vertically oriented birefringent plate 1610, all positioned between the lens 102 and the CCD imager array 1612. The birefringent plates 1606 and 1610 each have a fixed displacement of the extraordinary ray from the ordinary ray which corresponds to half the distance between pixels 1614 of the camera imager 1612, and are oriented such that this displacement is in the horizontal and vertical directions, respectively. In the system 1600, the light from the lens 102 is directed to the linear polarizer 1602, which removes light components having a polarization in the vertical direction (perpendicular to the Figure), to produce an image beam having substantially only light polarized in the horizontal direction (shown by the arrow in Figure 16). The horizontally polarized light is directed to an electrooptical polarization rotation means 1604, which may be a conventional twisted nematic liquid crystal device. The rotation means 1604, upon application of an appropriate control voltage, rotates the polarization of the light beam passing therethrough by 90 degrees, from horizontal to vertical, and vice versa, under control of a suitable control means (not shown), which may be a conventional LCD control system. Adjacent the rotator means 1604 is the birefringent plate, such as a 45 degree cut crystalline quartz birefringent plate 1606, having a thickness and alignment direction such that the displacement of the extraordinary ray corresponds to half the distance between imager pixels 1614. The illustrated imager has square pixels 1614 with a width of approximately 12 microns, separated by an approximately 24 micron center-to-center spacing. The quartz birefringent plate 1606 has a thickness of about 2 millimeters, which produces a
lateral displacement of the extraordinary ray of about 12 microns along the horizontal direction of the imager 1612. By activating and deactivating the rotator 1604, the ordinary or the extraordinary image of the quartz plate 1606 may be selected, so that the image received by the imager may be "steered" or "scanned" horizontally a distance of 12 microns. The light passing through the quartz plate 1606 is directed to the rotator 1608, which is like the rotator 1604 in providing for selection of the ordinary or extraordinary images of the vertically oriented quartz birefringent plate 1610. In operation, a first image is made with the rotators 1604 and 1608 in a rotationally inactive state, a second image is captured with the rotator 1604 in an active state and the rotator 1608 in an inactive state (scanning the image 12 microns horizontally with respect to the first image), a third image is captured with the rotator 1604 in an inactive state and the rotator 1608 in an active state (scanning the image 12 microns vertically with respect to the first image), and a fourth image is captured with both the rotators 1604 and 1608 in an active state (scanning the image 12 microns horizontally and 12 microns vertically with respect to the first image). By forming a composite image in which the four separate images are interleaved to their respective image locations, a composite image is produced having four times the resolution of the individual images. By capturing focused and slightly defocused or other point-spread-controlled images at each of the four image positions, high resolution edge enhancement and/or wavelet processing may also be obtained.
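The four-state interleaving of the system 1600 may be sketched as follows; the state encoding and array names are exemplary assumptions:

    import numpy as np

    # Rotator states (horizontal_active, vertical_active) mapped to image
    # displacement in units of half the pixel pitch (12 microns per step).
    STATE_OFFSETS = {
        (0, 0): (0, 0),   # both rotators inactive
        (1, 0): (1, 0),   # 12 microns horizontally
        (0, 1): (0, 1),   # 12 microns vertically
        (1, 1): (1, 1),   # both displacements
    }

    def interleave_quad(frames_by_state):
        # frames_by_state: dict mapping each rotator state to an equal-size
        # frame captured at the corresponding sub-pixel displacement.
        rows, cols = frames_by_state[(0, 0)].shape
        out = np.zeros((rows * 2, cols * 2), dtype=frames_by_state[(0, 0)].dtype)
        for state, (dx, dy) in STATE_OFFSETS.items():
            out[dy::2, dx::2] = frames_by_state[state]
        return out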
While the system 1600 substantially doubles the resolution in both horizontal and vertical directions by employing a one-step displacement in each direction, systems capable of x,y-addressable, multi-step capability in both directions may also be provided, as shown by a system 1700 in Figure 17. The system 1700 is like the system 1600 of Figure 16, except that the imager pixels 1614 are 6 microns wide (but still with a center-to-center spacing of 24 microns), and the system further includes rotator components 1702 and 1706 like components 1604 and 1608, and quartz birefringent plates 1704 and 1708 which are half the thickness, and produce half the displacement, of the plates 1606 and 1610. Birefringent plate 1704 is oriented horizontally like plate 1606. By appropriately selecting (binary) combinations of displacement by plates 1606 and 1704, by activating and deactivating the respective rotators 1604, 1702, the image may be scanned along the horizontal direction in four 6 micron steps, corresponding to relative center locations of 0, 6, 12 and 18 microns. Similarly, the quartz birefringent plate 1708 is oriented in the vertical direction like plate 1610, and the image may be scanned in the vertical direction by appropriately activating and deactivating the rotators 1608, 1706 to address positions having relative displacements of 0, 6, 12, and 18 microns. The horizontal and vertical image positions may be independently addressed, thereby providing for 16 separate images which may be interleaved to produce a composite image having quadrupled resolution in each of the horizontal and vertical directions. It will be appreciated that the rotation and selection of images should be relatively precise, or there will be crosstalk. In order to reduce crosstalk, one or more rotator-polarizer combination elements may be incorporated in the component stack to filter "stray" polarization components and reestablish polarization linearity. Such systems are valuable for all-electronic, high resolution cameras, and for high-speed analog memory systems in which, for example, 16 separate memory-images on a virtual memory system memory card adjacent an imager may be read out by electrooptically scanning the memory card.
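The binary addressing of the system 1700 may be summarized as follows; the tuple encoding is an illustrative assumption:

    # Two plate/rotator pairs per axis: a 12 micron plate and a 6 micron
    # plate, each selected by its own rotator, give four addressable
    # positions per axis (0, 6, 12, 18 microns).
    PLATE_STEPS_UM = (12, 6)

    def displacement_um(bits):
        # bits: tuple of 0/1 rotator states, one per plate, coarse plate first.
        return sum(step * bit for step, bit in zip(PLATE_STEPS_UM, bits))

    positions = sorted({(displacement_um((a, b)), displacement_um((c, d)))
                        for a in (0, 1) for b in (0, 1)
                        for c in (0, 1) for d in (0, 1)})
    print(len(positions))   # 16 independently addressable x,y positions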
It is an important aspect of the present invention that by appropriately selecting the point spread functions of images which are differenced as described herein, and by differencing images at image lattice zones using a selected plurality of point spread functions which may be of varying breadth, a lattice of wavelet coefficients at image pixel locations (or a subset of image pixel locations) may be provided as a discrete wavelet transform (or a subset of a complete wavelet transform). In such an optical wavelet transform, dilations and translations of one or more wavelet kernels are utilized to provide a localized frequency decomposition of a function or signal [see I. Daubechies, "The Wavelet Transform, Time-Frequency Localization and Signal Analysis", IEEE Transactions on Information Theory, Vol. 36, pp. 961-1005 (1990)]. A wavelet transform may be an inner Hilbert-space product of the signal and the wavelet kernel. In this regard, an array of "daughter" wavelet kernels may be constructed from a "mother" wavelet kernel by both (a) effective spatial or temporal shifts (e.g., corresponding to the image lattice positions or a subset of the image lattice positions of a 2-dimensional image), and (b) scale changes (dilations) of the wavelet kernel. A set of shifted and dilated wavelet functions, when correlated with spatially or temporally localized signal values, can form a complete set of basis functions useful as a linear wavelet transform. Wavelet kernel functions which are also substantially orthogonal provide highly efficient transform encoding. However, such functions may be used, even though they are not orthogonal, as basis functions to represent one- or two-dimensional signals, such as sound signals which are amplitude signals in time, or a two-dimensional light intensity signal such as an image, with some degree of signal redundancy. Although one wavelet kernel function (synthesized by differencing point spread functions as described herein) may be
dilated and/or contracted to effectively vary the frequency content of the transform set, different wavelet kernels formed by point spread differencing may be used to provide image transforms, to process images, and to carry out a variety of optical applications, as will be more fully described hereinafter.
Typically, for a one- or two-dimensional signal which undergoes a digital wavelet transform, there are two transform-space coordinates (shift and scale) for each input signal coordinate (space, time, etc.). Thus, a 1-D signal produces a 2-D digital wavelet transform, and a 2-D signal (with certain exceptions) produces a 4-D digital wavelet transform. The computational demands of even "fast" digital wavelet transforms are substantial, so that digital calculation of a wavelet transform (or its inverse) from digitized signal or image values (or wavelet transform coefficients) can be relatively slow; and even slight errors in the digital computation can produce relatively large errors in wavelet coefficients. It would be desirable to perform wavelet transforms optically in parallel, and substantial effort has been devoted to wavelet optics research in order to accomplish direct optical wavelet transforms. Wavelet transform kernels may desirably be formed by combining point spread functions to produce composite effective wavelet functions, which may be substantially square-integrable functions having substantially constant energy and a substantially zero total area of the wavelet amplitude function (the
"admissible condition") . Wavelet transform function amplitude values may be combined linearly to reconstruct the original image or signal.
While substantially linear wavelet transforms have some characteristics in common with frequency transforms such as Fourier transforms, they differ in providing spatial or time signal localization, as well
as resolution in the frequency domain, so that they are particularly useful for processing effects and events which are localized in time or space. In addition, because wavelet transforms may be substantially of first order in the signal, they may utilize linear superposition for both the signal transform, and the signal reconstruction from the transformed signal. This is an important feature which may be utilized by methods and apparatus in accordance with the present invention to rapidly provide analyses and signal processing of enormous computational power.
Wavelet functions conventionally used in digitally computed transforms include Morlet's Wavelet, the Mexican-Hat wavelet, Meyer's wavelet, Lemarie-Battle's Wavelet, and the Daubechies wavelet. Morlet's Wavelet is a complex wavelet function originally used for seismic analysis, which has, as its real part, an even cos-Gaussian function. The Morlet wavelet kernel only substantially satisfies the wavelet admissible condition, because its zero frequency transform value is not zero, but can be approximately considered as zero in a numerical computation. The Laplacian edge detector operator, or the second derivative of the Gaussian function, which is even and real valued and satisfies the admissible condition, is commonly referred to as the "Mexican-Hat Wavelet" for purposes of digital wavelet transform computation. Higher order derivatives of the Gaussian function may also be used as wavelet kernels. While wavelet kernels such as the Morlet and Mexican-Hat wavelets do not form fully orthonormal basis functions by dilation and displacement, orthonormal wavelet kernels such as Meyer's Wavelet and Lemarie-Battle's Wavelet are suitable for nonredundant digital signal transform computations. Such digital wavelet transforms are useful for a wide variety of signal analysis, signal recognition and signal processing functions. For example, wavelet
transforms are particularly useful for fractal physics, turbulence and transient signal and image processing [H. Szu et al., "Causal Analytical Wavelet Transform", Optical Engineering, Vol. 31, p. 1825 (1992)]. In computer vision and optical processing systems, it is frequently desirable to change the size of the image-analysis operators to correspond to the size of the image features of interest, which is readily accomplished by wavelet processing. Multiresolution signal decompositions, which permit scale-invariant processing of an image and/or selection and enhancement of specific features of an image, may be carried out by digital wavelet transform operations [S. Mallat, "A Theory for Multiresolution Signal Decomposition: The Wavelet Representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, pp. 674-693 (1989)]. In addition, digital wavelet transforms are capable of performing high compression ratio image compression, while retaining high image quality and smoothness consistent with human vision characteristics
[R. DeVore, "Image Compression Through Wavelet Transform Coding", IEEE Transactions on Information Theory, 38, 719 (1992)] [A Brief Introduction to Wavelet Trans¬ forms] . Because regions of slow change may be sampled at a low rate, and regions of faster change may be sampled at progressively higher rates so that those terms dominate the wavelet transforms, wavelet transforms are particularly suitable for bandwidth reduction [Szu, Harold H. and Caulfield, H. John, Wavelet Transforms, Optical Engineering, Vol. 31, pp.
1823-1824 (1992)]. Wavelet normalizations for the daughter wavelets in which each frequency is represented equally may also be utilized for efficient representa¬ tion and processing of signals [Teller, Brian, and Szu, Harold H. , "New Wavelet Transform Normalization to
Remove Frequency Bias," Optical Engineering, Vol. 31, pp. 1830-1831 (1992)] Wavelet transforms may also be
utilized for both representation and classification of images and other signals [Szu, Harold H. , Telfer, Brian and Kadambe, Shubba, "Neutral Network Adaptive Wavelets for Signal Representation and Classification," Optical Engineering, Vol. 31, No. 9, pp. 1907-1915(1992)].
However, because of the enormous computational requirements of implementing a wavelet transform, which are particularly acute for high resolution images, significant effort has been applied to developing optical implementations of wavelet transforms
[H.J. Caulfield and H. Szu, "Parallel Discrete and Continuous Wavelet Transforms", Optical Engineering, Vol. 31, No. 9, p. 1835 (September 1992)]. Such optical wavelet transform systems typically take the form of complicated holographic and/or coherent imaging systems, such as those utilizing a bank of VanderLugt filters or holographic correlation filters in a coherent optical system [Y. Li and Y. Zhang, "Coherent Optical Processing of Gabor and Wavelet Expansions of One- and Two-Dimensional Signals", Optical Engineering, Vol. 31, No. 9, p. 1865 (1992)].
In accordance with the present invention, bipolar wavelet kernels of substantially any desired shape and spatial frequency response may be generated by recording and differencing appropriately shaped point spread functions in a (preferably regular) lattice array (e.g., square or hexagonal) of image pixels. Such kernels may be shaped and dilated or contracted optically to form orthogonal wavelet functions or partially redundant nonorthogonal function sets, which may be used to encode, recognize, select, delete, emphasize or deemphasize predetermined or selected spatial frequency zones. By processing an image utilizing these point spread generated wavelet kernel operators, wavelet transforms may be readily generated, and the signals they represent may be processed (and reconstructed) in parallel with high accuracy, without requiring coherent
optical systems. For example, "Mexican-Hat" wavelets, substantially approximating a one-dimensional Gaussian second derivative function, or a two-dimensional Laplacian of Gaussian function, may be synthesized by appropriately generating at an imager two different Gaussian point spread functions of respectively appropriate breadth for differencing in accordance with the present invention. In this regard, for example, utilizing a lens or holographic light diffuser (such as those commercially available from the Physical Optics Corporation of Torrance, California) 102 in the camera system 100 of Figure 1 which produces a first Gaussian point spread function of equation (1) at the imager for acquiring a first image at relatively sharp focus, and subtracting therefrom a second image acquired at a point spread function substantially approximating equation (2), where the spreading factor is, for example, in the range of from about 1.5 to about 2.5, produces a synthetic Difference of Gaussian wavelet closely approximating the one-dimensional or two-dimensional Gaussian second derivative Mexican-Hat wavelet. The narrowest wavelet produces a wavelet set with the highest spatial frequency components of an image. This process may be repeated at effectively wider spreading factors to produce wavelet sets at lower spatial frequency ranges. However, at larger "out of focus" amounts using conventional camera lens systems, the point spread function deviates substantially from a Gaussian distribution (becoming more "disk-shaped", or "flat-topped", in radial distribution). Depending on the optical system utilized, both the fully-focused and the defocused point spread (image plane light distribution) functions may deviate from a theoretical Gaussian diffraction-limited distribution. Particularly as more widely-dilated point spreads are produced at higher defocus levels in conventional photographic lens systems, a more evenly distributed, disk-like light
distribution may be produced at the imager. Differencing of such functions produces a convolution operator, or wavelet kernel, which tends away from a Difference of Gaussian transform kernel, toward a radially symmetrical Haar transform kernel (difference of axially symmetrical disks). The transformed information may be processed, and/or recovered, using an inverse transform operation, either digitally or using a lens system having focus-defocus characteristics like that of the lens system or other defocus system used to generate the original edge-enhancement or other transform. When it is desired to produce a Gaussian-like or other point spread light distribution at the imager, resolution degradation of the optical system may be used rather than defocusing (preferably employing fast electrooptical elements where high speed operation is desired), or the point spread function may be precisely shaped as desired, by liquid crystal or other electrooptic elements, and/or binary optic elements (preferably electrooptic binary optic elements) as described herein.
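A short numerical sketch of this distinction, assuming idealized radial profiles (the Gaussian widths and disk radii are arbitrary choices, not taken from the disclosure):

    import numpy as np

    # Sketch: differencing two Gaussian point spreads yields a DoG
    # ("Mexican-Hat"-like) kernel; differencing two flat-topped disk
    # point spreads (heavy defocus of a conventional lens) yields a
    # radially symmetrical difference-of-disks (Haar-like) kernel.
    r = np.linspace(0.0, 8.0, 801)           # radial coordinate

    def unit_energy(profile):
        # normalize so the 2-D radially symmetric function has unit volume
        return profile / np.trapz(2 * np.pi * r * profile, r)

    def gaussian_psf(sigma):
        return unit_energy(np.exp(-r**2 / (2 * sigma**2)))

    def disk_psf(radius):
        return unit_energy((r <= radius).astype(float))

    dog_kernel = gaussian_psf(1.0) - gaussian_psf(2.0)       # Mexican-Hat-like
    haar_like_kernel = disk_psf(1.0) - disk_psf(2.0)         # difference of disks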
The wavelet kernel shape produced by differencing such "out-of-focus" images may be used to analyze, reduce or enhance selected ranges of spatial frequency components of the image, but upon reconstruction, the actual wavelet kernel components which were used to obtain a particular wavelet set should be used to reconstruct the wavelet coefficients for such kernel. This may be accomplished on an empirical basis by using a substantially identical lens system to invert the transform in an optical retransform system such as described hereinafter, as was used to form the initial transform. For digital inversions, the actual point spread functions of a camera system may be empirically determined from intensity values of the pixels of the image produced by the camera system, using an effectively single point source of light. Where it is desired to produce a more Gaussian-like light distribution at larger out-of-focus
conditions, holographic light distribution elements such as those manufactured by the Physical Optics Corporation may be used to control the point spread. Holographic light diffusing elements may be combined as one element of a liquid crystal device, so that the point spread of the light diffusing element may be varied by application of an electric field, which varies the refractive index adjacent the holographic diffusing element.
As shown in Figure 5, the first, relatively focused Gaussian point spread 303 represents a one-dimensional Gaussian point spread function along the axis, or a two-dimensional radially symmetrical point spread function in which light intensity originating from a point object is integrated along the radial distance from the centerline. The second Gaussian point spread function 307 is selected to have a spreading factor of, e.g., about 2. By subtracting, on a corresponding pixel-by-pixel basis, an image obtained with point spread function 307 from an image of the same scene obtained with the point spread function 303, a synthetic Difference of Gaussian wavelet function 1606 is provided which closely approximates the Gaussian second derivative "Mexican-Hat" at each pixel of the image. Because the pixels 202 of imager 104 are regularly spaced in the image plane, the wavelet translation operator at a specific dilation is applied to the image at each displacement corresponding to the location of each pixel. By subtracting equal image exposure (energy or information content) of the second image from the first image, the total wavelet energy is substantially zero, so that the wavelet admissibility condition is preserved. The resulting set of pixel values produced by the subtraction corresponding to each pixel are a set of wavelet transform coefficients for the selected wavelet function dilation. By subsequently obtaining images of broader Gaussian point spread function, wavelet kernels of different dilation may be
provided. Typically, the dilation of the synthetic wavelet kernel will increase by a factor of 2 for a one-dimensional wavelet for each dilation set, and by a factor of 2 (or for more accuracy, the square root of 2) for two-dimensional ring wavelet kernels. Because the information is of lower frequency with increasing wavelet kernel dilation, the number of image pixels used may decrease with increasing wavelet dilation. The number of coefficients saved at each scale may decrease in proportion to the spreading factor (or the area of support for the wavelet kernel). However, for image processing purposes, all of the pixel coefficients may be retained to increase accuracy and reduce noise where low contrast and noise are limiting factors, such as in IR and X-ray imaging. It is important to note that a wavelet transform may be rapidly provided in this manner using incoherent light, and that complicated laser light systems are not required. It is noted that the resolution of the lens system 102 will preferably be greater than the resolution of the imager caused by its pixel size and space-filling limitations. As will be discussed hereinafter, blur filters such as are utilized in conventional electronic camera systems to limit aliasing at the expense of resolution need not be used, because wavelet transform processing may be used to limit or overcome aliasing without substantial resolution loss. It should also be noted that the fill factor of the imager (the percent of the imager surface area effectively occupied by active image pixels) may be relatively low, such as in the range of from about 5 percent to about 25 percent. With a lens system of resolution at least matching the pixel size, lower fill factors may produce image transforms of higher resolution than high fill factor imagers, although higher fill factors may also be used. The values obtained at each pixel constitute a first set of wavelet transform coefficients at each respective pixel image
lattice position. For a square 512 x 512 pixel imager 104 array, there will thus be 262,144 coefficients C(x,y) for x = 1 to 512 and y = 1 to 512. The wavelet kernels formed by image point spread differencing may be dilated independently in the x (row) and y (column) directions to generate independent sets of wavelet transform coefficients, or the point spread may be dilated in a radially symmetrical manner to produce a radially symmetrical ring-wavelet kernel function. Multiple sets of wavelet transform coefficients may be produced in the same manner, but using wavelet kernels which are dilated with respect to the first kernel. Desirably, the dilation ratio of the width of the second wavelet kernel to the first wavelet kernel will be in the range of from about 1.2 to about 4 (most preferably a dyadic ratio of 2), depending on the use desired for the transform. Where high accuracy is desired in the correspondence of the wavelet transform or its reconstruction to the image data, the dilation ratio between the width of the first and second DoG wavelet kernel functions may be relatively small, such as in the range from about 1.25 to about 2.25. For a ring DoG wavelet kernel, a ratio of the square root of two is a desirable ratio, because the area of the wavelet support is accordingly doubled with each succeeding wavelet kernel, producing an efficient transform with minimal redundancy. For a unidirectional wavelet such as produced by dilating the point spread function in only the x or y direction of the imager, the dilation ratio may desirably be about 2, which similarly corresponds to a doubling of the wavelet support area (as defined herein, that central, contiguous zone of the wavelet which constitutes at least 99 percent of its total integrated value). Such wavelet transforms may be utilized for data compression for storage and transmission, and may also be utilized for image processing, recognition or enhancement of selected features of the image. The wavelet transform
of an image thus obtained may also be reconstructed, with or without enhancement or other signal processing operations, by digital methods in accordance with conventional practice. The image may be reconstructed at a different set of lattice points than the original set of pixels used to obtain the transform. In reconstituting the image, each wavelet coefficient represents an effective signal strength at which the wavelet kernel associated therewith, centered at its wavelet transform lattice location of origin, is combined in the transform image. For example, a wavelet transform obtained from an imager 104 comprising pixels 202 which effectively occupy 25 percent of the imager may be reconstructed from the wavelet coefficients to form an image having 4 times as many pixels, by digitally summing the contribution of each wavelet kernel over its breadth (multiplied by its normalized transform coefficient) over each desired image location, including image locations intermediate the image locations corresponding to the original pixel locations.
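The pixel-lattice coefficient sets and their linear reconstruction may be sketched digitally, emulating the progressively dilated optical point spreads with Gaussian blurs (the sigma values, dyadic ratio, and scale count below are illustrative assumptions):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    # Sketch: a DoG coefficient set at every pixel for each dilation,
    # formed by differencing successively broader "acquisitions".
    def dog_coefficient_sets(image, sigma0=1.0, ratio=2.0, n_scales=4):
        blurred = [image] + [gaussian_filter(image, sigma0 * ratio**k)
                             for k in range(n_scales)]
        coeffs = [blurred[k] - blurred[k + 1] for k in range(n_scales)]
        return coeffs, blurred[-1]           # coefficient sets + lowpass residual

    # Reconstruction is linear: the differences telescope, so the sets
    # plus the broadest (residual) image sum back to the original.
    img = np.random.rand(512, 512)
    coeffs, residual = dog_coefficient_sets(img)
    assert np.allclose(img, residual + sum(coeffs))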
The same effective wavelet kernel shapes/distributions used to make the transform are used to reconstruct the image. While there will not necessarily be 4 times as much definition to accompany the quadrupled number of image lattice locations, the high frequency information contained in the narrowest wavelet kernel inhibits aliasing and permits recovery of increased definition in the zones between the original image pixel locations. By optically generating a wavelet kernel by decomposing the kernel into appropriate point spread functions for image differencing, in which the wavelet has at least five maxima and minima, of a scale such that the distance between maxima and minima preferably is at least 0.75 times the pixel width of the imager, increased image definition may be generated upon reconstruction of the image in accordance with conventional wavelet transform techniques. Such a digital computation is an
intensive computation, but may be effectively simplified by adding the higher frequency wavelet values at each target image pixel to the conventional image, preferably obtained at an effective fill-factor approximating 100%. An image which is blurred
(preferably by a "flat topped" point spread function) so that the effective fill factor approximates 100%, may be regarded as approximating the inverse transform sum of all the lower order wavelet sets to the scale of the effective pixel size of the image. By adding a higher order inverse wavelet set formed by operating a low fill-factor imager at its highest resolution to form a wavelet set having relatively high frequency information the fill factor of a low-fill-factor imager of the type utilized to obtain higher resolution wavelet transform information, may be effectively increased by blurring or defocusing the image. It is also an important aspect of the present invention that image or signal reconstruction, with or without intermediate enhancement or other signal processing, may be carried out optically, in a manner similar to that which was used to initially produce the transform. In this regard, a wavelet reconstruction or inversion system may comprise a light means comprising a coherent or incoherent light source and a collimating lens directing collimated light to a spatial light modulator (SLM) , for generating a two-dimensional spatially modulated light output corresponding to the wavelet coefficient values of the wavelet transform, and a camera system like that of Figure 1 to sum the transformed wavelet functions. The SLM may have a modulating pixel array which corresponds to the number and relative location of the corresponding wavelet transform lattice positions. In operation in incoherent light, the positive wavelet coefficients for the first wavelet kernel operator are introduced into the spatial light modulator such that the SLM modulates the transmitted light from the light source at each SLM
pixel in proportion to a signal coefficient for the corresponding positive or negative wavelet kernel component at a lattice position. The SLM may be a photographic transparency, or an electrically, optically or acoustically addressed spatial light modulator, of which there exist a wide variety of types, such as liquid crystal, photorefractive, SEED, deformable or cantilevered mirror, and quantum effect devices. Because the SLM is the limiting component in terms of operating speed, high speed SLM devices are preferred. In addition, because the wavelets are bipolar functions, the individual wavelet coefficients at each lattice (pixel) location may be either positive (in which case the wavelet kernel distribution value integrated over each respective target pixel, times the respective coefficient, is added to the respective target pixels) or negative (in which case this wavelet volume integrated at each target pixel is subtracted from the target pixels). Moreover, although the total wavelet volumes (the volume bounded by the wavelet kernel shape multiplied by the respective coefficient) associated with each wavelet coefficient are substantially zero because of the admissibility condition, they will typically be positive or negative at respective individual target pixels within the "support range" of the wavelet, so that reconstruction requires the analog capability of adding and subtracting light. This may be accomplished using coherent light sources, but because of the difficulty of subtracting incoherent light, the reconstruction may be divided into separate steps for positive and negative coefficients in the same manner described herein for wavelet formation. In this regard, the image may be reconstructed using the same image coefficients that would be used to form the wavelet. The wavelet operator kernels associated with each set of Mexican-Hat wavelet coefficients, for example, are effectively produced by focusing and defocusing to
produce appropriate point spread functions in the same manner in which they were originally generated. Thus, in order to produce an image with increased resolution and edge enhancement as compared to an original 100% fill factor image, a Gaussian-shaped sharp-focused image modulated by an SLM may be imaged on an imager having more pixels than the SLM, and a slightly defocused image may subsequently be imaged through the SLM on the large pixel array (the sharp-focused image and the slightly defocused image being selected such that upon subtraction of the defocused image from the sharp-focused image, a Mexican-Hat wavelet kernel is formed which has support over the area surrounding each pixel of the low-fill-factor imager, but not fully encompassing adjacent pixels). The reconstruction system may ideally utilize the same optical lens system, at the same point spread modification settings, as was used to create the initial wavelet component images. By adding the sharp-focused reconstructed wavelet component, and subtracting the slightly defocused wavelet component from a lower resolution image imposed on the larger pixel image, an image of increased resolution may be obtained. In addition, spatial frequency ranges may be enhanced by increasing the relative contribution of various wavelet sets in the reconstructed image. It is again noted that the number of pixels of the imager need not correspond with the location or number of pixels of the original imager or spatial light modulator. For example, the number of image pixels on reconstruction may be quadrupled with respect to the original imager.
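A digital emulation of this two-pass incoherent reconstruction may clarify the bookkeeping (the sigmas, the base image, and the clipping convention are illustrative assumptions; a real system would perform the projection optically through the SLM):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    # Sketch: incoherent light cannot be subtracted directly, so positive
    # and negative coefficients are projected in separate passes, each pass
    # imaged at the sharp and slightly defocused point spreads, and the
    # results differenced electronically at the imager.
    def project(coeff_plane, sigma_sharp=1.0, sigma_defocus=2.0):
        # applies the Mexican-Hat (DoG) kernel to a nonnegative coefficient plane
        return (gaussian_filter(coeff_plane, sigma_sharp) -
                gaussian_filter(coeff_plane, sigma_defocus))

    def incoherent_reconstruct(coeffs, base_image):
        pos = np.clip(coeffs, 0.0, None)     # pass 1: positive coefficients
        neg = np.clip(-coeffs, 0.0, None)    # pass 2: negative coefficients
        return base_image + project(pos) - project(neg)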
As earlier pointed out, when using coherent (laser) light in the reconstruction, the negative wavelet coefficients may be processed in the same steps as the positive coefficients, because the negative coefficients may be modulated with light which is 180 degrees out of phase with the coherent light of the positive coefficients, so that the requisite subtraction is
accomplished at the imager in a single step, and the actual wavelet coefficients and wavelet kernel shapes may be projected directly on an imager to sum the wavelet contributions. This reduces the number of operations by a factor of 2. It is also noted that, depending on the purpose for transforming and reconstituting the image or signal, it may not be necessary to carry out a full wavelet transform. For example, in order to enhance feature definition, to reduce aliasing, and to increase spatial resolution, the conventional image captured by the imager (at an effective fill factor of at least 75%, and more preferably at least 90%, such as may be provided by a suitable blur filter when using an imager having a relatively small fill factor) may be regarded as the sum of all of the broader wavelet kernels, and the image may be processed or enhanced by adding or subtracting specific wavelet kernel transform components, thereby limiting the number of processing steps to produce the enhanced or processed image. For example, in processing x-ray images to enhance spatial frequency range features of diagnostic importance, the conventional image may be enhanced by adding selected wavelet transform components, and subtracting wavelet transform components of little or no diagnostic value.
As discussed, although the DoG Mexican-Hat wavelet kernel is relatively easy to generate and reconstruct at each pixel by differencing Gaussian or Gaussian-like effective point spread functions produced by an optical imaging system, light diffusion systems or the like, it would be desirable to not only improve the efficiency of such Gaussian differencing, but also to facilitate the generation of other wavelet kernel functions by point spread manipulation. In this regard, it is noted that the differencing of Gaussian-like point spread functions of equivalent energy but different scale involves subtracting the respective function
maxima, thereby reducing the output sensitivity to a value which may be substantially less than the sensitivity of the more sensitive Gaussian function. A mildly toroidal lens element, such as a holographic or binary optic lens element used in conjunction with a conventional imaging system, may be used to generate a coma-like distortion for modifying the point spread function of the lens system. When an image which is defocused with the coma-like distortion is subtracted from a relatively sharp image with a Gaussian-like point spread, the difference function is more sensitive than the simple DoG function, but retains a Fourier spatial frequency distribution similar to the DoG Mexican-Hat function. By appropriately shaping the point spread functions of images which are differenced at image pixels, wavelet coefficients at image pixel locations (or a subset of image pixel locations) may be rapidly provided in parallel as a discrete wavelet transform (or subset of a complete wavelet transform).
The discrete wavelet transform may be produced by differencing a series of image pairs of progressively broader (dilated) point spread functions, to produce a series of image transforms at different wavelet kernel dilation as a series of integral transforms of the input function with appropriately scaled "wavelets" produced by image point spread differencing and shaping, which are effectively shifted at discrete image positions corresponding to image lattice (pixel) positions. Scale changes are preferably powers of 2, such
that the dilation ratios of successive wavelet sets are in these ratios. The Laplacian, or Mexican Hat wavelet, has only three localized minima and maxima, and is relatively inefficient at spatial frequency encoding. Preferably, in accordance with the present invention, wavelet kernels having at least five localized maxima and minima may be generated by an optical system and synthesized at the imager by differencing images of appropriate optical point spreads. Such kernels may be generated in dilated or contracted form in a series of differenced images created by varying the width of the point spread functions of the optical system. Appropriate point spread functions may be generated by appropriate figuring of conventional lens elements, and by appropriate design of diffractive lens elements. Relatively planar lens elements which are figured or designed to produce maxima and minima may also be used
with a standard lens system to produce the desired point spreads.
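For illustration, a one-dimensional profile of such a kernel with at least five localized extrema, here a real cos-Gaussian (Morlet-like) shape with a small DC correction so the total area is substantially zero (the frequency and width are arbitrary assumptions, not parameters of the disclosure):

    import numpy as np

    # Sketch of a five-plus-extremum wavelet kernel profile, which encodes
    # spatial frequency more efficiently than the three-extremum Mexican Hat.
    x = np.linspace(-6.0, 6.0, 1201)
    envelope = np.exp(-x**2 / 2)
    kernel = np.cos(2.5 * x) * envelope
    # remove the small residual DC term so the admissible condition holds
    kernel -= (np.trapz(kernel, x) / np.trapz(envelope, x)) * envelope
    assert abs(np.trapz(kernel, x)) < 1e-10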
Wavelets can be made orthogonal to their own dilations and translations with discrete dilation and translation parameters to give a nonredundant signal representation. Anamorphic lens elements may be used to produce point spreads of desired directional orientation for generating directional wavelet transforms by image differencing as described herein. However, the wavelet transform need not be orthogonal. With continuous dilation and translation variables, the wavelet transform may be highly redundant, which can be a desirable feature for some purposes, such as noise reduction.
The decomposition of a signal into its wavelet coefficients and the reconstruction of the signal from the wavelet transform coefficients are conventionally performed by computer using multiresolution analysis algorithms, such as for medical diagnosis.
Richardson, "Nonlinear filtering and multiscale texture discrimination for mammograms", SPIE 1768 (1992) and Laine et al. , "Multiscale Wavelet Representations for Mammographic Feature Analysis", SPIE 1768 (1992) both describe the feature extraction capabilities of digital wavelet transforms for medical diagnosis. By subtracting x-ray images at different wavelet scales, smooth-edged benign lesions may be distinguished from the indistinct or spiculated border of a malignancy. At a very detailed wavelet level, microcalcification may be detected, localized, and enhanced. In accordance with the present invention, wavelet transforms may be generated optically by differ¬ encing images of appropriate point spread functions.
The shape of the point spread functions which are subtracted may be controlled by high speed electrooptic lenses of radially symmetrical or anamorphic design, to produce desired wavelet shapes.
Wavelet and edge-enhancement systems in accordance with the present invention may also be utilized for stereoscopic analysis by comparing "wavelet" images of different scales, and measuring the "distance" between edges determined at different wavelet scales in each image of the stereo pair. In this manner, depth differences should be easily detected and analyzed (a simple subtraction of the two wavelet images will be a distance measure). Wavelet transform processing systems in accordance with the present invention have particular utility for processing and analysis of medical images such as X-ray images, particularly for diagnostic techniques which extend the capabilities of the human eye, and for screening for particular attention by a medical expert. For example, various types of cancer, such as breast cancer, can be controlled by early diagnosis and effective treatment. Screening X-ray mammography may permit detection at a lower lesion size threshold than physical examination by either the physician or patient. Wavelet transform processing also has utility for effectively increasing the resolution of imagers. For example, an imager of relatively low fill factor (e.g., less than 25 percent), as described herein, may be used with an appropriate optical system to generate a compact wavelet by image differencing (preferably using a wavelet kernel having at least five maxima and minima), which encodes high spatial frequency image information in the zones between imager pixels. A higher-resolution image than the conventional image produced by the imager may be generated by re-transforming the wavelet transform thus produced. In addition, rapid wavelet encoding for image compression and transmission may be inexpensively carried out in a very lightweight electrooptic system. In addition, spatial frequency ranges of images may be enhanced or
subtracted from an image by adding in the appropriate transform components.
Nonorthogonal wavelet kernels such as the rotationally symmetrical "Mexican Hat" wavelet having a high degree of redundancy are particularly useful for image enhancement and reconstruction, but are not efficient for data transfer or storage purposes. Orthogonal wavelet kernels of defined frequency band may also be generated optically by adding and subtracting images made using optical systems having point spread functions which resolve the positive and negative components of the desired wavelet kernel. For example, holographic, binary optic, and/or liquid crystal anamorphic lens elements can be used to effectively produce at each imager pixel an orthogonal wavelet kernel such as the frequency band kernel defined by Mallat, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, pp. 674-718 (July 1989), or the orthogonal smooth Walsh-function wavelet packets described by Coifman, SPIE Volume 1826, Intelligent
Robots and Computer Vision, pp. 14-26 (1992).
Electrooptic lens elements of relatively small optical effect may be used to modify the point spread function to produce extremely rapid parallel wavelet kernel generation for telemetry, image encoding, and other data processing purposes in an extraordinarily lightweight and simple camera system.
Orthogonal wavelet-based interframe image compression efficiently exploits first order image correlations (locally similar luminance), second order correlations (oriented edges), and higher order textural correlations, so that most of the image information can be represented by a relatively small number of coefficients. Computationally intensive digital wavelet image compressions of 20:1 or more can be achieved with excellent fidelity for medical and other images [A. Manduca, "Interactive Wavelet-Based 2-D and 3-D Image
Compression", SPIE Volume 1897, Image Capture, Formatting and Display, pp. 307-318 (1993)].
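A minimal digital sketch of why such ratios are attainable, assuming coefficient sets like those produced earlier (the 5% keep-fraction is an arbitrary illustration, not the cited figure):

    import numpy as np

    # Sketch: most wavelet coefficients of a typical image are near zero,
    # so keeping only the largest few percent (stored sparsely as
    # (index, value) pairs) yields high compression with good fidelity.
    def threshold_coefficients(coeff_sets, keep_fraction=0.05):
        compressed = []
        for c in coeff_sets:
            cutoff = np.quantile(np.abs(c), 1.0 - keep_fraction)
            compressed.append(np.where(np.abs(c) >= cutoff, c, 0.0))
        return compressed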
For moving objects, real-time frame-to-frame image compressions of similar magnitude may be achieved, resulting in high total compression with good fidelity.
In accordance with the present invention, such wavelet image compression/processing may be generated by means of incoherent light parallel processing techniques which can substantially eliminate major digital computation requirements. The wavelet processor itself may be a very lightweight camera with a thin, programmable electrooptic lens element to modify the point spread function of the optical system, in order to produce the preselected wavelet kernel by image differencing.
Although the Mexican-Hat wavelet is a very useful wavelet kernel, its dilated daughter functions are not orthogonal functions, and the spatial frequency ranges of the dilated daughter wavelets overlap and have relatively non-uniform sensitivity as a function of frequency in Fourier Transform space. Further in accordance with the present invention, as previously indicated, a wide variety of radially symmetrical (ring) wavelet kernels, and nonradially symmetrical wavelet kernels such as wavelets oriented along (orthogonal) X and Y axes, may be generated by optical point spread function manipulation (e.g., differencing). Binary optic and holographic elements are particularly useful for generating point spread functions of arbitrary, desired shape. Such binary optic and holographic elements may be made variable under electronic control by incorporating them in liquid crystal devices in which the effective refractive index of the liquid crystal is changed by electric field control. For example, substantially orthogonal wavelet kernels of defined frequency bands may be generated to emulate, for example, a bandpass function in frequency space, by adding and
subtracting images made using optical systems having point spread functions which resolve the positive and negative components of the desired wavelet kernel. In this regard, for example, illustrated in Figure 18 is a frequency band wavelet kernel 1802 as defined by Mallat, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, pp. 674-718 (July 1989). The wavelet kernel may be resolved into a positive point spread wavelet component 1804, and a negative point spread wavelet component 1806. It should be noted that the "low" points of the wavelet 1802 correspond to "high" points of the negative component distribution 1806, and "high" points of wavelet 1802 correspond to "high" points of the positive component distribution 1804. The positive and negative distribution components may be determined empirically or by conventional curve-fitting techniques, such as Gaussian or Fourier series synthesis. The distributions may readily be produced by binary optics or holographic optics, which may be incorporated into conventional camera lens systems. The wavelet and its positive and negative components may be scaled and oriented along one axis, or may be radially symmetrical, etc. By utilizing a suitable binary optic or holographic optical element which produces such point spreads, wavelet transform sets may be readily produced by differencing images made with such point spreads. In this regard, by obtaining a first image having a point spread distribution of the positive components of a desired wavelet (such as distribution 1804), and subtracting therefrom on a pixel-by-pixel basis an image made using an optical/imaging system having a point spread distribution corresponding to the negative components of the wavelet (such as distribution 1806), the wavelet may be generated as described herein. They may further be processed, and reconstructed as previously described, by employing digital calculation, direct use of the wavelet coefficients with coherent
light SLM imaging, or by differencing the positive and negative constituent images produced by incoherent SLM imaging. These wavelets may be dilated, as appropriate to the desired processing, and other wavelets, such as those previously described, and the orthogonal wave packets described by Coifman at pp. 14-25 of SPIE Volume 1826, Intelligent Robots and Computer Vision XI (1992), may be similarly produced. In addition, wavelets of arbitrary shape in frequency space may also be generated (e.g., by resolving them into two different point spread functions for image differencing), which have utility for frequency enhancement and removal of undesired frequency components.
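Numerically, the resolution of a bipolar kernel into two nonnegative point spread components, and the recovery of the wavelet response as a difference of the two incoherent images, may be sketched as follows (the kernel values and the use of digital convolution are illustrative assumptions):

    import numpy as np
    from scipy.ndimage import convolve

    # Sketch: any bipolar wavelet kernel splits into nonnegative positive
    # and negative components, each physically realizable as a point spread;
    # the wavelet response is the pixel-by-pixel difference of the two images.
    def split_kernel(kernel):
        positive = np.clip(kernel, 0.0, None)    # cf. component 1804
        negative = np.clip(-kernel, 0.0, None)   # cf. component 1806
        return positive, negative

    kernel = np.array([[0., -1., 0.], [-1., 4., -1.], [0., -1., 0.]])
    pos, neg = split_kernel(kernel)
    image = np.random.rand(64, 64)
    response = convolve(image, pos) - convolve(image, neg)
    assert np.allclose(response, convolve(image, kernel))   # linearity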
Thus, by appropriately dilating the function using appropriate lens elements, bandpass wavelet functions which are substantially orthogonal may be effectively generated at each pixel of the imager. Wavelet kernels may also be formed which follow an arbitrary spatial frequency curve of user-selected shape (perhaps emphasizing some spatial frequencies, and deemphasizing others). Such wavelet transform processing has particular utility for a variety of otherwise computationally intensive image processing applications. For example, in x-ray evaluation and processing, it is desirable to emphasize or recognize certain features in an x-ray image which has inherently poor contrast and resolution, such as enhancing microcalcification in mammograms, performing edge detection in selected frequency ranges for measuring artery volume in radiographic angiograms, and detecting bone features such as subtle fractures and structural density variations which are difficult to detect using the unaided eye. In this regard, Figure 19a is an electronic image of an X-ray of a broken hand made with a high resolution Kontron stepping camera. Figure 19b is an electronic image of the same X-ray in which the Mexican-Hat wavelet formed by subtracting a slightly
defocused image from the sharply focused image has been added to the original image to enhance the high frequency spatial detail. The degree of enhancement may be varied by varying the relative proportions of the original image and the wavelet component.
Imaging systems which perform a variety of image-differencing optical point spread functions to carry out wavelet transform processing in a very robust manner, which is not subject to sequential image time or motion differences, may be provided by utilizing polarization differencing using a single imager. The system cost may be minimized for a wide range of image processing operations by using a limited set of (programmable) components. For example, a relatively simplified system tailored to supplement different human vision defects can be manufactured at relatively low cost using such a system. Relatively crude edge enhancements of an image may be performed for a "seeing aid" system by subtracting a defocused image from a focused image at each pixel, and projecting an image containing, for example, 10% to 90% of its image information content derived from the resulting Mexican-Hat wavelet(s) by means of a viewer for the seeing-impaired person. A real-time DoG camera such as illustrated in Figure 1b may be used to generate the spatial frequency enhanced image by adding the DoG image to the conventional image, and the enhanced image thereby produced may be displayed on an imaging headset such as LCD "virtual reality" glasses or a "heads-up" military display unit. The resulting Difference of Gaussian "DoG" function approximates a Laplacian of Gaussian edge operator at each pixel location, or a specific frequency-enhancing wavelet may be produced by selected point spread differencing as previously described. By varying the focus and/or the focus difference between the two images which are differenced (to vary the size of the
operators), the width of the DoG or other operator can be controlled to provide an optimum enhancement for the deficit of the individual who will be using the "seeing aid" system. The Fourier transform in spatial frequency space similarly dilates to higher frequencies, and contracts to lower frequencies, with respective contraction and dilation of the kernel width. By introducing spherical aberration into the lens system, the size of the DoG operator can also be made to vary like that of the human visual system, being sharper in the center and broader at the periphery.
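A minimal sketch of the enhancement blend, assuming Gaussian point spreads and a digital stand-in for the real-time optical differencing (the sigmas and the 30% blend are illustrative; the text suggests a 10% to 90% range):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    # Sketch of the "seeing aid" display image: a DoG (approximate Laplacian
    # of Gaussian) enhancement is blended with the focused image in a
    # user-adjustable proportion tuned to the individual's deficit.
    def seeing_aid_image(image, sigma=1.0, spread=2.0, blend=0.3):
        dog = gaussian_filter(image, sigma) - gaussian_filter(image, sigma * spread)
        return (1.0 - blend) * image + blend * dog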
Differencing of sequential images involves rapid focusing and defocusing in real time, and is sensitive to object motion. The focusing and defocusing can be done mechanically or electrooptically, such as by a liquid crystal kinoform lens using only polarized light. Motion artifacts can be minimized using a high speed imager, such as 512 x 512 imagers and cameras operating at 2500 frames per second, which is more than adequate to overcome motion artifacts.
However, differencing of Gaussian-shaped point spreads is a relatively inefficient use of light, because the peak intensity values are subtracted from each other. This inefficiency could limit effectiveness in low-light (indoor and evening usage) situations, particularly when using high frame rates to control motion artifacts. By differencing images having point spread functions which have offset peak intensity values, a much more efficient edge enhancement operator is produced, for example as previously described with respect to "donut" toroidal point spreads. The images may be enhanced by adding the enhancement image function to the focused image in desired proportions. However, by treating the enhancement operator as a wavelet kernel upon image reconstruction, much more sophisticated image processing can be performed. For example, by differencing appropriately shaped point spread functions
(generated by LC/BOEs) at each of the imager pixels, relatively orthogonal spatial frequency ranges may be enhanced or deleted. The disadvantages of capturing two sequential images for differencing can be overcome by simultaneously using differently polarized light for the different point spread functions to be differenced. By using an appropriate design, a flat liquid crystal lens can be electrically energized to control the focus of polarized light of one linear polarity. This produces one image of one polarization vector (approximately half the light) at one focal plane, and another image of the other polarization vector (the other half of the light) at another focal plane. By proper design of a birefringent plate, light of one of the images can be displaced laterally on the imager by one pixel position. By making the imager pixels sensitive to only the desired image of appropriate linear polarity (which can be carried out by applying or fabricating polarization filters to individual pixels), two different images of the same scene at different focus or other point spread difference can be obtained simultaneously. Image shadowing or mixing due to crosstalk can be minimized if necessary by partial subtraction of adjacent pixels. This reduces the frame rate requirements by eliminating motion artifacts and simplifies the memory requirements, making the entire unit simple, small, very light, and inexpensive, which is ideal for a high resolution "seeing aid" system. Moreover, the resolution of such a system may be enhanced by electrooptic scanning, as previously described, with the composite image from sequential images being stored in electronic memory for the display unit. A liquid crystal lens can also be used in the other polarization image channel so that both channels can be shaped independently and electrooptically as desired. Complex point spread functions can be generated by designing the shape of both polarization channels to produce any desired spatial
frequency operator kernel to compensate for more complex deficits, as previously described. Such imagers may be square, or may be rectangular, to produce the desired image format. As indicated, the wavelet kernel transforms produced in accordance with the present invention may be utilized in the enhancement, reconstruction or other processing of images at a spatial frequency resolution exceeding the Nyquist frequency limitation of the imager array. In this regard, the wavelet kernel, in the more compressed forms which can be accommodated by the pixel size of the imager, contains localized spatial frequency information at a higher frequency than the imager Nyquist limit. Because the transform set(s) of the high(er) frequency wavelet transform kernels may be combined linearly with the lower frequency wavelet kernels, an entire image may be reconstructed from the wavelet transform produced by point spread kernel generation of the entire image by digital or optical reconstruction processes, as previously discussed.
However, because the conventional image obtained by the imager may be regarded as the sum of the lower frequency wavelet kernel transform sets, this image may be enhanced by adding the high(er) frequency wavelet transform sets to produce a high resolution image exceeding the resolution of the conventional image obtained using the imager. For example, for a 512 x 512 pixel imager in which the image pixels are in a square array occupying 25% of the imager area, the image captured by the imager (in the absence of fine scanning) is limited to a resolution of 512 x 512 in detail, which moreover is subject to incomplete sampling/aliasing error. In order to generate a higher resolution image, the digitized pixel data of a conventional 512 x 512 image captured by the imager may be mapped (without increased resolution) onto a "larger" image, such as a 1024 x 1024 image, by introducing alternating, additional
rows and columns of pixels into the image lattice. The image pixel values of pixels directly between measured pixels may be determined by linear averaging. Diagonal pixel values may be readily calculated by linear averaging of the four diagonal pixels (or by averaging the four adjacent pixels previously calculated by averaging the intermediate pixels of the expanded array). Nonlinear or predictive averaging may also be carried out if desired, but is more computationally intensive and not necessary.
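The expansion just described may be sketched as follows (border handling wraps around for brevity; a real implementation would clamp at the edges):

    import numpy as np

    # Sketch: measured pixels go to the even lattice positions of the
    # expanded image; in-between positions are filled by linear averaging,
    # and diagonal positions by the average of their four diagonal neighbors.
    def expand_lattice(img):
        h, w = img.shape
        big = np.zeros((2 * h, 2 * w))
        big[0::2, 0::2] = img                                      # measured pixels
        big[0::2, 1::2] = 0.5 * (img + np.roll(img, -1, axis=1))   # between columns
        big[1::2, 0::2] = 0.5 * (img + np.roll(img, -1, axis=0))   # between rows
        big[1::2, 1::2] = 0.25 * (img + np.roll(img, -1, axis=0) +
                                  np.roll(img, -1, axis=1) +
                                  np.roll(img, -1, axis=(0, 1)))   # diagonals
        return big

    expanded = expand_lattice(np.random.rand(512, 512))
    assert expanded.shape == (1024, 1024)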
The resulting 1024 x 1024 image, which has four times as many pixels as the original image but substantially the same amount of detail and information content, may be regarded as the reconstruction of all of the low frequency (high dilation) wavelet transform sets of the image, to a spatial detail approximating the Nyquist limit of the original 512 x 512 imager. By adding to the 1024 x 1024 expanded image the inverse transform of the wavelet set(s) having image spatial frequency components higher than those in the original 512 x 512 image, the image detail of the 1024 x 1024 expanded image may substantially exceed that of the original image, and surpass the Nyquist sampling limit of the imager. In addition, the sampling/aliasing error of the imager is at least partially corrected. Because the low fill-factor pixel configuration of the imager can accommodate a more compact, and therefore higher-frequency, wavelet kernel than an imager with a relatively larger pixel size and therefore higher fill-factor, it will be appreciated that a lower fill factor can produce a higher resolution image.
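The final combining step reduces to a linear addition. In the following sketch, hf_detail stands for the inverse transform of the above-Nyquist wavelet set(s), assumed here to have been obtained separately (for example, from the compact point spread kernel measurements); the gain parameter is likewise an assumption of the example:

    import numpy as np

    def add_high_frequency_detail(expanded, hf_detail, gain=1.0):
        # 'expanded' is the interpolated 1024 x 1024 image, i.e. the
        # reconstruction of the low frequency wavelet transform sets.
        # 'hf_detail' is the inverse transform of the wavelet set(s) with
        # spatial frequencies above the original imager's Nyquist limit.
        assert expanded.shape == hf_detail.shape
        return expanded + gain * hf_detail

    # Usage with the expansion sketched above:
    #   enhanced = add_high_frequency_detail(expand_2x(img512), hf_detail)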
Wavelet transform processing systems in accordance with the present invention have particular utility for the processing and analysis of medical images such as X-ray images, particularly for diagnostic techniques which extend the capabilities of the human eye, and for screening images for particular attention by a medical expert. For example, various types of cancer, such as breast cancer, can be controlled by early diagnosis and effective treatment. Screening X-ray mammography may permit detection at a smaller lesion size threshold than physical examination by either the physician or the patient. Digital image processing techniques have been used to assist the radiologist, but such digital systems are relatively expensive and computationally time-consuming. Figure 20a is an electronic image of a mammogram illustrating carcinoma associated with microcalcification, which was made with a high resolution Kontron camera on a light table. Figure 20b is an electronic image of the same X-ray in which the Mexican-Hat wavelet, formed by subtracting a slightly defocused image from the sharply focused image, has been added to the original image to enhance the high frequency spatial detail of the microcalcification. The degree of enhancement may be varied by varying the relative proportions of the original image and the wavelet component.
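The enhancement of Figure 20b may be sketched as follows; the blur width sigma, which models the slight defocus, and the mixing gain are assumptions of the example:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def mexican_hat_enhance(image, sigma=1.5, gain=1.0):
        # The Mexican-Hat wavelet component is formed by subtracting a
        # slightly defocused copy from the sharply focused image.
        wavelet = image - gaussian_filter(image, sigma)
        # Adding the wavelet component back in variable proportion enhances
        # the high frequency spatial detail (e.g. microcalcifications).
        return image + gain * wavelet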
Other uses for wavelet image processing include confocal-like sectioning of an image through enhancement and subtraction of spatial frequency image components. Image slices may be provided in accordance with the present invention which are relatively free of interference from other layers of the specimen, by using wavelets such as those obtained by subtracting focused and defocused microscope images, or the positive and negative components of frequency-selective wavelets. Further in this regard, depth-slice processing systems may also be provided which are useful for 3-D microscopy. By focusing a microscope image at a specimen layer, the shortest (most compact) wavelet will contain information predominantly from the focal plane, because the image is not focused elsewhere. The shortest wavelet information can be obtained at a series of focal planes through the specimen. The next longest wavelet, for example at 2x the width of the first, recovers lower frequency data from the focal plane, plus the defocused, highest-frequency data from the adjacent layers. Because this adjacent-layer data is known, as described, it can be "defocused" according to the depth-of-field function and subtracted from the next-longest wavelet response at each layer. In this manner, an image of each in-focus layer of a specimen can be reconstructed, from which relatively clearer 3-D images can be constructed. This approach applies to conventional photography as well.
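One layer of this depth-slice computation may be sketched as follows. This is a hedged illustration only: the depth-of-field function is modeled as a Gaussian blur of assumed width dof_sigma, and the per-layer wavelet responses are assumed to have been measured as described above:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def in_focus_layer_band(short_band, long_band, adjacent_short_bands,
                            dof_sigma=2.0):
        # 'short_band': shortest-wavelet response at this focal plane,
        # containing predominantly in-focus information.
        # 'long_band': next-longest (2x) wavelet response at the same plane,
        # mixing lower frequency focal-plane data with defocused high
        # frequency data leaking in from adjacent layers.
        # The adjacent layers' shortest-wavelet data is already known, so it
        # is "defocused" by the depth-of-field function and subtracted out.
        leakage = sum(gaussian_filter(b, dof_sigma)
                      for b in adjacent_short_bands)
        return short_band + (long_band - leakage)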
It will be appreciated that although various aspects of the invention have been described with respect to specific embodiments, alternatives and modifications which are within the spirit and scope of the present invention, as set forth in the following claims, will be apparent from the present disclosure.