WIDE DYNAMIC RANGE CAMERA
FIELD OF THE INVENTION
The present invention relates to video imagery and more particularly to apparatus and techniques for providing enhanced video images.
BACKGROUND OF THE INVENTION
Various types of video image enhancement apparatus and techniques have been proposed. Automatic Gain Control (AGC) techniques have been employed, inter alia, in video signals for reducing the dynamic range of the video signal by subtraction of its DC level. Such a technique is described in U.S. Patent 4,719,35~, wherein linear AGC is applied to the video image on a pixel by pixel basis. This technique is applicable only to a solid area which is delimited by a line parallel to the scan direction.
Generally speaking, AGC is employed in video processing only on a frame by frame basis.
SUMMARY OF THE INVENTION
The present invention seeks to provide improved video image enhancement apparatus which overcomes limitations of the prior art apparatus and techniques discussed above.
There is thus provided in accordance with a preferred embodiment of the present invention, video imaging apparatus including apparatus for providing a plurality of video images of a scene at different exposure levels and apparatus for processing the plurality of video images to produce a combined video image including image information from the plurality of video images and including enhanced detail at local areas therein.
Further in accordance with a preferred embodiment of the invention, the apparatus for processing the plurality of video images comprises apparatus for locally enhancing the dynamic range of portions of the combined video image.
Additionally in accordance with a preferred embodiment of the invention, the apparatus for processing the plurality of video images comprises apparatus for preserving edge indicating information in the combined video image.
Further in accordance with an embodiment of the invention, the apparatus for processing the plurality of video images comprises apparatus for applying neighborhood transforms to the plurality of video images.
Additionally in accordance with a preferred embodiment of the present invention, there is provided video image enhancement apparatus comprising apparatus for providing a plurality of video images of a scene at different exposure levels and apparatus for processing the plurality of video images to produce a combined video image including image information from the plurality of video images and including enhanced detail at local areas therein.
Additionally, in accordance with a preferred embodiment of the invention, apparatus for processing may also include image enhancement apparatus such as histogram equalization apparatus.
According to a broad aspect of the invention there is provided wide dynamic range video imaging apparatus comprising:
sensor means for providing a plurality of video images of a scene at different exposure levels; and means for processing the plurality of video images to produce a combined video image including image information from the plurality of video images by applying neighborhood transforms to the plurality of video images.
According to another broad aspect of the invention there is provided wide dynamic range video imaging apparatus comprising:
means for receiving a plurality of video images of a scene at different exposure levels; and means for processing the plurality of video images to produce a combined video image including image information from the plurality of video images by applying neighborhood transforms to the plurality of video images.
According to another broad aspect of the invention there is provided wide dynamic range video imaging apparatus comprising:
sensor means for providing a plurality of video images of a scene at different exposure levels; and means for processing said plurality of video images to produce a combined video image including image information from the plurality of video images by applying neighborhood transforms to the plurality of video images, said means comprising means for locally selecting the operating levels of the dynamic range of the combined video image within the dynamic range of the sensor means, whereby the resulting video image includes image information from the plurality of video images with enhanced information content at local areas therein.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the drawings in which:
Fig. 1 is a block diagram illustration of a video imaging system constructed and operative in accordance with a preferred embodiment of the present invention;
Fig. 2 is a simplified block diagram of neighborhood transform apparatus which provides a non-linear neighborhood transform;
Fig. 3 is a block diagram illustration of an alternative video imaging system constructed and operative in accordance with an alternative preferred embodiment of the present invention;
Fig. 4 is a block diagram illustration of a further video imaging system constructed and operative in accordance with an alternative preferred embodiment of the present invention;
Fig. 5 is a graphical illustration of the operation of the video imaging system of Fig. 4;
Fig. 6 is a block diagram illustration of a pixel selector useful in the embodiment of Fig. 5;
Fig. 7 is a block diagram illustration of yet a further alternative embodiment of a video imaging system constructed and operative in accordance with a preferred embodiment of the present invention; and
Fig. 8 is a block diagram illustration of an exposure selector forming part of the circuitry of Fig. 1.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
Reference is now made to Fig. 1, which illustrates one preferred embodiment of a video imaging system constructed and operative in accordance with a preferred embodiment of the present invention and which comprises a camera 10, such as a CCD, CID, photodiode array or any other visible or non visible light sensor array which permits the exposure time to be varied by externally supplied control pulses or which permits any other form of external exposure control. One commercially available example of camera 10 is a model RVS-09 available from Xybion Electronic Systems Corp. of San Diego, California, U.S.A.
Camera timing circuitry 12 supplies timing pulses to camera 10. The timing circuitry 12 may comprise conventional clocks, counters and frequency dividers. The timing pulses supplied to camera 10 are operative to actuate the photoelectric accumulation of charge in the sensor arrays for varying periods of selectable duration and are also operative to govern the read-out of the signal currents produced by sensing through pre-amplifier circuits preferably incorporated within camera 10 to an A/D converter 14. A typical suitable A/D converter is a TRW TDC-1048 which operates at 10 MHz at 8 bit resolution. Control of the photoelectric accumulation of charge may be accomplished generally in two ways: by operating a shutter, such as an electronic shutter, to control the light input, or by controlling the integration time of the sensor array.
The digitized video data from A/D converter 14 is supplied in parallel to two subsystems, a Look Up Table (LUT) 16 and an exposure selector 17. The exposure selector is illustrated in Fig. 8 and comprises first and second comparators 200 and 202 arranged in parallel and outputting to an AND gate 204. Comparator 200 compares the signal from A/D converter 14 with a low threshold level I(L), such as a signal level of 20, in an 8-bit range of 0 - 255. Comparator 202 compares the signal from A/D converter 14 with a high threshold value I(H), such as 235. If the signal is above 20 and below 235, then the two comparators both generate logic "true" signals which are ANDed by AND gate 204. The output of AND gate 204 is supplied to a counter 206, for incrementing thereof when two "true" signals are received at AND gate 204.
Counter 206 is reset to zero at the beginning of each frame. If the image of the current frame is mostly saturated, i.e. many pixels are white (having, for example, a digital value of 255 or close thereto), then at the end of the frame the counter will contain a very low number. Such will also be the case if the image is mostly cut-off, i.e. many pixels are black (i.e. having a digital value of 20 or less).
Conversely, for "normal" images, which possess a certain spread of values, a large number of the pixels will have values between 20 and 235. For frames of such images, the counter 206 will contain a large number at the end of each frame.
The output of counter 206 is supplied to a comparator 208. At the end of each frame, the output of counter 206 is compared by comparator 208 with a threshold value N(th). This threshold is selected to determine whether the image of the frame was a "normal" one, as opposed to an image that was mainly saturated or cut-off. If the value of the counter 206 output is higher than N(th), then the output of the comparator 208 is a logic "true". That output is supplied to both the timing control circuitry 22 and the host CPU 18 (Fig. 1).
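The exposure selector logic described above can be sketched in software as follows. This is only an illustrative model of the comparator, AND gate and counter chain, not the circuit itself; the function name `frame_is_normal` and the default for N(th) (half the pixel count) are assumptions made for the example.

```python
def frame_is_normal(pixels, i_low=20, i_high=235, n_th=None):
    """Model of the exposure selector of Fig. 8 for one frame.

    pixels is a flat sequence of 8-bit values. The default n_th
    (half the number of pixels) is an assumption for illustration;
    in the text N(th) is set by the host CPU.
    """
    if n_th is None:
        n_th = len(pixels) // 2
    count = 0  # counter 206, reset at the beginning of each frame
    for value in pixels:
        # comparators 200 and 202, ANDed by gate 204
        if i_low < value < i_high:
            count += 1
    # comparator 208: logic "true" when the count exceeds N(th)
    return count > n_th
```

A mostly saturated frame such as `[255] * 90 + [100] * 10` yields a low count and is rejected, whereas a frame with most values between the thresholds is accepted.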
The measurement and determination of whether a certain frame at a given exposure level should or should not be combined with a plurality of frames at different exposure levels may be carried out in at least two ways.
According to one embodiment of the invention, the measurement can be done on a relatively infrequent basis, as determined by the host CPU 18. In such a case, a complete series of exposures, covering the full range of exposures of which the system is capable, is carried out. At the end of each exposure, the output of comparator 208 is received by the host CPU 18.
The host CPU 18 controls the parameters I(L), I(H) and N(th), and can modify them at will. The information gathered assists the host CPU 18 to determine which exposures will be taken or used until the next measurement.
According to an alternative embodiment of the present invention, the measurement can be carried out on a continuous basis. The series of exposures begins from an intermediate exposure setting, such as that determined by a conventional exposure meter, and proceeds towards the two extremes.
For example, it may be assumed that there exist 15 possible exposure levels, where number 1 is the shortest (having the least sensitivity) and number 15 is the longest (having the greatest sensitivity). The system typically proceeds by starting a new sequence with exposure number 7, and then proceeds with numbers 6, 5, ... 1. When a cut-off frame is encountered, for which comparator 208 produces a logic "false" output, the progression of the sequence in that direction is terminated and the sequence proceeds with exposure numbers 8, 9, ... 15. When a saturated frame is encountered, the sequence is complete. In such a case, the system may receive exposure numbers 3 through 13, of which exposure numbers 3 and 13 are cut-off and saturated, respectively, and exposure numbers 4 - 12 are "normal".
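The continuous-measurement sequence of the preceding example may be sketched as below. The callback `is_normal` is a hypothetical stand-in for the output of comparator 208 at a given exposure level, and the function name and defaults are assumptions for illustration. Note that the comparator output alone does not distinguish cut-off from saturation; the sketch infers which one occurred from the direction of travel, as the text does.

```python
def select_exposures(is_normal, start=7, lowest=1, highest=15):
    """Return the exposure numbers visited in one adaptive sequence.

    is_normal(level) -> bool is a hypothetical callback standing in
    for the logic output of comparator 208 at that exposure level.
    """
    taken = []
    # Step towards shorter exposures until a cut-off frame is met.
    for level in range(start, lowest - 1, -1):
        taken.append(level)
        if not is_normal(level):
            break
    # Then step towards longer exposures until a saturated frame is met.
    for level in range(start + 1, highest + 1):
        taken.append(level)
        if not is_normal(level):
            break
    return sorted(taken)
```

With a scene for which levels 4 to 12 are "normal", the sequence visits exposure numbers 3 through 13, matching the example in the text.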
It is appreciated that not all of the exposures may be necessary. For example, two consecutive images taken at exposure levels separated by one may be very similar and the amount of information provided by one as compared with the other may be marginal. Such an occurrence may be detected by comparing the two consecutive images, pixel by pixel. If the corresponding pixels differ by more than a given threshold then a counter is incremented. If the count over the entire frame exceeds a given threshold, the entire exposure is retained. Otherwise, it may be discarded.
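The pixel-by-pixel comparison of two consecutive exposures can be illustrated with the following minimal sketch. The function and parameter names are hypothetical, and both thresholds are left to the caller because the text does not give values for them.

```python
def frame_adds_information(frame_a, frame_b, pixel_threshold, count_threshold):
    """Decide whether frame_b should be retained alongside frame_a.

    Both frames are flat sequences of pixel values of equal length.
    A counter is incremented for every pair of corresponding pixels
    that differ by more than pixel_threshold; the exposure is retained
    when the count over the whole frame exceeds count_threshold.
    """
    differing = sum(
        1 for a, b in zip(frame_a, frame_b) if abs(a - b) > pixel_threshold
    )
    return differing > count_threshold
```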
Alternatively, after taking a picture at the i'th level of exposure, a picture at the k'th level is taken and compared with the picture taken at the i'th level, pixel by pixel. If they are different at more than a given number of pixels, then a picture is also taken at the intermediate exposure level j, which is between i and k. While taking the picture at the j'th exposure level the output can be compared with the outputs at both the i'th and k'th levels in order to determine whether any exposure between the j'th level and either the i'th level or the k'th level need be taken.
The foregoing techniques are intended to enable less than the full series of exposures to be employed. Normally one to three exposures are sufficient to provide a desired level of information.
Look Up Table (LUT) 16 pre-processes the digital video signal in accordance with program instructions received from a Host CPU 18. The Host CPU 18 is typically embodied in a personal computer communicating with the system hardware via a conventional serial or parallel link. Alternatively, the Host CPU 18 may be embodied in a relatively simple microcontroller, such as an Intel 8051 or a suitable custom-designed controller. As will be described more fully hereinbelow, the Host CPU defines input and output LUTs, camera exposures and transforms applied to the video signals.
The signal output from input LUT 16 is provided to a neighborhood transform processor (NTP) 20 which is operative to reduce the low frequency portions of the video data and to perform an edge enhancement transform on the video data received from input LUT 16. Timing control circuitry 22 supplies timing control signals to camera timing circuitry 12, input LUT 16, NTP 20 and to additional circuitry which will be described hereinbelow. The timing control circuitry 22 is operative to step the camera through various exposure levels, i.e. sensitivities. Preferably it may operate adaptively, by sensing the saturation and cut-off limits and beginning at an exposure level intermediate therebetween, as described hereinabove.
Considering now the operation and structure of the neighborhood transform processor (NTP), it may be appreciated that according to one preferred embodiment of the invention, the NTP carries out a linear convolution. In such an embodiment, the NTP may comprise an off-the-shelf, special purpose VLSI chip, such as Zoran ZR33481 or ZR33881 Digital Filter Chips and some additional components, for example, delay lines and delay buffers, such as described in Zoran Technical Note No. Z.T.N. 03 entitled "Real Time Spatial Filtering with a Zoran DFP".
A typical function of the neighborhood transform processor in the above-described embodiment is to compute a new output value for each input pixel on the basis of the input values of that pixel and its immediate neighbors. A typical kernel for this convolution may be as follows:
$$\begin{pmatrix} -1 & -1 & -1 \\ -1 & 8+e & -1 \\ -1 & -1 & -1 \end{pmatrix}$$
where e is <1 and is determined heuristically. For example e may be set to be equal to 1/number of frames combined in a picture. Typically e = 0.1 if the number of frames combined or the number of different exposure settings combined is equal to 10. The kernel and the value of e may be varied by the Host CPU 18.
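As an illustration of the linear neighborhood transform, the sketch below applies a 3 x 3 kernel with centre weight 8 + e and a weight of -1 for each of the eight neighbors. The full 3 x 3 layout is an assumption made here, as are the function name `edge_enhance` and the choice to leave border pixels at zero; the dedicated filter chips named in the text are not modelled.

```python
def edge_enhance(image, e):
    """Convolve a 2-D list of pixel values with the assumed 3x3 kernel.

    Centre weight is 8 + e, every neighbor weight is -1. Border pixels
    are left at 0.0, which is a simplification for this sketch.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Sum of the eight neighbors (3x3 window minus the centre).
            neighbors = sum(
                image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            ) - image[r][c]
            out[r][c] = (8 + e) * image[r][c] - neighbors
    return out
```

Because the kernel weights sum to e, a uniform area of value v is reduced to e times v, while a pixel that differs from its neighbors is strongly amplified. This is consistent with the stated aim of reducing low frequency content while enhancing edges.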
The output of the NTP 20 is supplied to a combiner 24, which also receives an input from timing control circuitry 22. Combiner 24 combines the pixels from the neighborhood transform with the accumulated value stored for that pixel from previous frames in a frame buffer 26. The combiner may be embodied in a conventional adder, such as a 74F283 of Texas Instruments, which produces the sum of all successive frames, receives an input from the frame buffer 26 and stores the summed result in frame buffer 26.
Frame buffer 26 typically comprises a matrix of static or dynamic RAM chips having associated circuits for reading in the data for each pixel from combiner 24 to the appropriate address, for reading out the next pixel into the combiner, and for outputting the entire frame after the video data of all the required exposures has been combined. It may thus be appreciated that the frame buffer is zeroed before each new sequence begins and contains at any point during the process of collecting images at different exposures, a combination of pixel data collected up to that point through the various exposures already made.
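The accumulation performed by the combiner and frame buffer may be pictured as a running per-pixel sum. The following is a minimal sketch under that reading; the function name `combine_frames` is hypothetical, and word width and overflow handling in the adder are not modelled.

```python
def combine_frames(transformed_frames):
    """Sum a sequence of equally sized 2-D frames pixel by pixel.

    The frame buffer starts at zero before each new sequence and, after
    each frame, holds the combination of all exposures processed so far.
    """
    rows = len(transformed_frames[0])
    cols = len(transformed_frames[0][0])
    frame_buffer = [[0] * cols for _ in range(rows)]  # zeroed buffer
    for frame in transformed_frames:
        for r in range(rows):
            for c in range(cols):
                # adder: new pixel plus the accumulated value for that pixel
                frame_buffer[r][c] += frame[r][c]
    return frame_buffer
```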
When the process of collecting pixel data for all of the desired exposures is completed, the pixel data for the entire frame is output from the frame buffer 26 sequentially through an output Look Up Table (LUT) 28, which receives a control input from Host CPU 18 and may provide digital signal post processing, if required. Alternatively, the output LUT 28 may be eliminated.
The output of output LUT 28 is preferably supplied to a real time image enhancement processor (IEP) 30, such as a histogram equalizer, which is operative to smooth the distribution of gray levels or colors over the area of the image. A typical IEP is described in U.S. Patent 4,450,482. Alternatively, the IEP may be eliminated.
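For readers unfamiliar with histogram equalization, a minimal sketch of the standard global form for 8-bit gray levels follows. It is offered only as an illustration of the general technique and is not claimed to be the method of the cited patent; the function name `equalize` is hypothetical.

```python
def equalize(pixels, levels=256):
    """Global histogram equalization of a flat sequence of gray levels."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf = []
    total = 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        return list(pixels)  # a single gray level: nothing to spread
    # Map each level so that the output levels are spread over the range.
    return [
        round((cdf[p] - cdf_min) * (levels - 1) / (n - cdf_min)) for p in pixels
    ]
```

For example, four pixels with closely spaced levels 10, 20, 30 and 40 are spread over the full 0 to 255 range.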
The output of IEP 30 is supplied to a frame memory 32, typically comprising a matrix of RAMs which stores the enhanced, processed video image until a new image has been collected and processed. The frame memory 32 typically outputs via frame memory Look Up Table (LUT) 34 which permits further post processing to be carried out. The output of LUT 34 is supplied to a video generator 36, such as an off-the-shelf circuit including a D/A converter and timing circuits, such as an Analog Devices HDG-0807. The video generator receives synchronization pulses from a conventional sync generator 37, such as a ZNA134J, to enable the current video signal to be viewed continuously, independent of the subsequent frame processing. The video signal thus generated may be viewed on a standard video monitor 38. Additionally or alternatively, the digital output of LUT 34 may be supplied directly to a computer/image processor.
Reference is now made to Fig. 2, which illustrates apparatus for providing a non-linear neighborhood transform which may be employed instead of NTP 20 (Fig. 1) and which includes a multiplier 50 which receives an output signal from the LUT 16 representing a pixel $I_{ij}$ and a neighborhood signal. The output of multiplier 50 is supplied to an output 52. A neighborhood averaging circuit 56, which may be based on the Zoran ZR33481 or ZR33881 Digital Filter Chips as described above, provides an output indication of the average intensity of the neighborhood around pixel ij. This average intensity may be computed by means of a convolution of, for example, a 3 x 3 neighborhood around pixel ij with the 3 x 3 kernel of all 1/9.
The output of circuit 56 is amplified by an amplifier 58, having an amplification G, and is supplied to a subtractor 60 which subtracts it from 1. The difference is supplied to multiplier 50 as the neighborhood signal.
The principle of operation of the circuitry of Fig. 2 is described in an article entitled VLSI Architecture for the Automatic Gain Control Image Processing Algorithm by Ron Riesenbach, Ran Ginosar and Alfred Bruckstein, 1987 (Proceedings of the 15th IEEE Convention in Israel, March, 1987), and may be summarized by the following expression:
$$I_{ij}\,\bigl[\,1 - G \cdot (\text{Neighborhood Transform of } I_{ij})\,\bigr]$$
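The non-linear transform of Fig. 2 can be sketched as follows, taking the neighborhood transform to be the 3 x 3 average described above. The function name `agc_transform`, the assumption that intensities are normalized to the range 0 to 1, and the copying of border pixels unchanged are all choices made for this illustration only.

```python
def agc_transform(image, gain):
    """Multiply each interior pixel by (1 - gain * local 3x3 average).

    image is a 2-D list of intensities, assumed here to be normalized
    to [0, 1]; gain plays the role of the amplification G.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # border pixels copied unchanged (assumption)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # neighborhood averaging circuit 56: 3x3 kernel of all 1/9
            avg = sum(
                image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            ) / 9.0
            # amplifier 58, subtractor 60 and multiplier 50
            out[r][c] = image[r][c] * (1.0 - gain * avg)
    return out
```

Pixels in bright neighborhoods are attenuated more than pixels in dark ones, which is what gives the transform its locally adaptive gain.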
Reference is now made to Fig. 3 which illustrates an alternative embodiment of the present invention wherein multiple cameras 70 are employed, each acquiring the same image at a different exposure level. Each camera generates a video signal which is supplied to a neighborhood transform processor (NTP) 72, which may be identical to NTP 20 described above or to that described in Fig. 2.
Preferably, all of the cameras 70 and NTPs 72 are synchronized so that they simultaneously produce outputs relating to the same pixel at different light sensitivities.
According to an alternative embodiment of the present invention, instead of having a plurality of cameras (all boresighted and synchronized to shoot simultaneously), a plurality of cameras may be arranged in a row, along the direction of motion of a moving object to be viewed. The cameras fire in sequence, timed to shoot such that each one captures the object in the same position in the frame. If necessary, the timing of the images from the sequence of cameras can be adjusted after the images are captured, in order to achieve proper registration of all of the frames. An embodiment of the type described above may be employed with a row of line array cameras to observe moving objects, such as assemblies on a production line or moving sheets or a moving web, thereby achieving more highly enhanced spatial resolution and speed than would be attainable from matrix array sensors.
The outputs of NTPs 72 are supplied to a plurality of first stage parallel combiner circuits 74, such as circuits employed in combiner 24 (Fig. 1), as described hereinabove. The outputs of combiner circuits 74 are combined in second and higher stage combiner circuits as necessary to produce a single combined video output, which is supplied to image enhancement circuitry 76 such as a histogram equalizer 30 (Fig. 1).
Reference is now made to Fig. 4, which illustrates an alternative embodiment of video imaging system. Here, a plurality of cameras 80, each operative at a different exposure level or sensitivity, simultaneously acquires different versions of the same image at differing sensitivities. The video outputs of the cameras 80, which are all synchronized, are supplied to a pixel selector 82, which will be described hereinbelow.
The pixel selector is operative to provide a single output video signal having an extended dynamic range. For example, if each signal supplied to the pixel selector spans a 1:256 range (8 bits per pixel), the output of the pixel selector may span a range of 1:1,048,576 (20 bits per pixel). The output signal from the pixel selector 82 may be supplied to a neighborhood transform processor (NTP) 84, which may be identical to NTP 20, which reduces the dynamic range down to typically 1:256 (8 bits), while preserving substantial image contents and edge information. The output of NTP 84 may be supplied to image enhancement apparatus of the type described above, such as shown at reference numerals 28, 30 and 32 or employed without image enhancement as appropriate. Alternatively, the output of pixel selector 82 may be used in its full 20 bit format as an output 86.
The pixel selector employs a measurement correction algorithm based on the following:
A certain pixel $P_{ij}$ is sensed by a plurality of n cameras, which produce n different outputs for the light intensity of $P_{ij}$. If the exposure settings of the plurality of cameras are set such that each camera k receives exactly one half of the light entering camera k + 1 for all k, and assuming that the camera response to light is linear within a certain range, beyond which it is either saturated at high intensity or cut-off at low intensity, then
$$I_{ij}(k) + C = \tfrac{1}{2}\, I_{ij}(k+1)$$
where C is a constant offset term and $I_{ij}(k)$ is the intensity at pixel ij for camera k, provided that both values of I are within the linear range of responsivity of the cameras.
If some light intensity value $I_{ij}(m)$ is at or above the saturation value SAT, then
$$I_{ij}(k) \approx \mathrm{SAT} \quad \text{for all } k = m, m+1, \ldots, n.$$
If some light intensity value $I_{ij}(m)$ is at or below the cut-off value CUT, then
$$I_{ij}(k) \approx \mathrm{CUT} \quad \text{for all } k = 1, \ldots, m-1, m.$$
Stated differently, the responsivity of the set of cameras to the light intensity of a given pixel is as shown in the graph of Fig. 5, which illustrates the numerical aperture versus the intensity $I_{ij}(k)$ for a given single pixel $P_{ij}$.
Pixel selection circuitry which will be described hereinbelow is operative to search along the graph of Fig. 5 for each set of received pixel intensities $I_{ij}(1)$, $I_{ij}(2)$, ..., $I_{ij}(n)$ and to look for any one point on the linear segment, for example, $I_{ij}(k)$ which comes from a camera k. It then transforms the intensity value $I_{ij}(k)$ to the equivalent intensity which would have been measured by camera n, if pixel $P_{ij}$ were not causing saturation at camera n.
The transformation is as follows:
$$I_{ij}(n) = I_{ij}(k) \times 2^{(n-k)}$$
The foregoing may be generalized for a light intensity relationship 1/R instead of 1/2 by replacing 2 by R throughout the above discussion.
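The selection and scaling step may be illustrated in software as below. This is a sketch under stated assumptions rather than the circuit of the invention: the function name `extended_intensity`, the example CUT and SAT values, the choice of the most sensitive camera still within the linear range, and the return of `None` when no reading is linear are all hypothetical. The offset term C is ignored, as it is in the transformation above.

```python
def extended_intensity(readings, cut=20, sat=235, ratio=2):
    """Scale one linear-range reading to the sensitivity of camera n.

    readings[k - 1] holds the intensity from camera k for one pixel,
    camera 1 being the least sensitive and camera n the most sensitive.
    cut and sat are assumed example thresholds; ratio is 2 in the
    specific example and R in the generalized case.
    """
    n = len(readings)
    # Any point on the linear segment would do; this sketch takes the
    # most sensitive camera whose reading is still in the linear range.
    for k in range(n, 0, -1):
        value = readings[k - 1]
        if cut < value < sat:
            return value * ratio ** (n - k)
    return None  # no reading in the linear range (assumed convention)
```

For the readings `[50, 100, 200, 255, 255]`, camera 3 is the last one in the linear range, and its value of 200 is scaled by 2 to the power 2 to give 800, beyond the 8-bit range of any single camera.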
Reference is now made to Fig. 6 which illustrates a preferred embodiment of pixel selector 82 arranged for operation with a plurality of boresighted and/or mutually synchronized cameras. Each of a plurality of digital comparators 100, typically of conventional construction, compares a signal from a camera with a signal from another camera whose value is multiplied by 1/R in the general case or 1/2 in the specific example employed above. The outputs of comparators 100 pass through threshold circuits 102 to a priority encoder circuit 104. The output of the priority encoder circuit 104 controls a selector 106 which selects one of the inputs I(1) ... I(n) and supplies it to a barrel shifter circuit 108. The barrel shifter circuit 108 receives a control input from a subtractor circuit 110.
The operation of the circuitry of Fig. 6 will now be explained briefly. The threshold circuits 102 which receive the outputs from comparators 100 provide outputs to circuitry 104 only from those comparators receiving signals within the linear range. Thus signals from outside the linear range are eliminated. This occurs because the comparator 100 output is relatively high when the two inputs thereto are nearly the same, and is much lower when there is a significant difference therebetween.
The priority encoder circuit selects one of the signals from the linear range and employs that signal for scaling the corresponding intensity $I_{ij}(k)$ by the factor $2^{(n-k)}$ as described above. Accordingly, the intensity output is unaffected by which camera records it. The priority encoder may be embodied in a TI SN 74148 chip.
Reference is now made to Fig. 7 which illustrates an alternative embodiment of image processor constructed and operative in accordance with an embodiment of the present invention. A single camera 120 provides an output to a 2 to 1 selector 122 and to a comparator 124. The output of the comparator passes through a threshold circuit 126, which outputs to combinational logic (CL) 128. The CL 128 outputs to a single bit per pixel frame buffer 130, which in turn outputs to the CL 128. The output of CL 128 is supplied as a control input to selector 122. The signal output from selector 122 is supplied via a frame buffer 132. The output from frame buffer 132 may be fed back via a multiply by two (or R) circuit 134 to the selector 122 and to comparator 124 and is supplied as an output signal to a neighborhood transform processor (NTP) such as NTP 84 in the embodiment of Fig. 4. The output from frame buffer 132 may also serve as the system output, thus providing a full dynamic range output.
The apparatus of Fig. 7 corresponds to that shown in Fig. 6 for the use of a single camera. The apparatus of Fig. 7 is operative to sweep the sensitivity range over multiple successive acquisitions of the same image. The sensitivity sweep typically starts at the lowest sensitivity, i.e. the shortest integration time, for k=1 and increases the sensitivity twofold in an exemplary case or R-fold in a generalized case, from one frame to the next until n frames are taken.
Accordingly, any given pixel may be in the cut-off region during the first few frames, then in the linear region, and then in the saturation region. Some pixels may be in only one or two of the three possible regions throughout the process (e.g. always saturated, or never saturated).
At the outset frame buffer 130 is set to all zeros. When a pixel intensity $I_{ij}(k)$ of the k'th image is output by the camera, it is compared with twice the value already stored in the frame buffer 132 of the same pixel ij by reading that value out and shifting it to the left by one bit (i.e. thus multiplying it by 2). If the values are different by more than a given predetermined tolerance level, indicating that the given pixel is still in the cut-off region, then the old value, already multiplied by two, is stored in the frame buffer 132. Otherwise, for example, if the pixel is in the linear region, the new value is stored and the i,j bit of frame buffer 130 is set to one.
Once the i,j bit of frame buffer 130 has been set to one, the decision is reversed. If the values are within a given predetermined tolerance, indicating that the pixel is in the linear region, then the new value is stored in the frame buffer. Otherwise, for example, if the pixel has reached saturation, the old value, already multiplied by two, is stored.
After all n frames are examined, the value stored for each pixel is the last one of the linear region, shifted left (i.e. multiplied by 2) the appropriate number of times. Thus all pixel intensities correspond to the computed value taken by a non-existing large dynamic range camera, operating at the sensitivity level of camera n.
It will be appreciated by persons skilled in the art that the present invention is not limited by what has been particularly shown and described hereinabove. Rather the scope of the invention is defined only by the claims which follow: