AUTOMATIC BLACK LEVEL, LUMINOSITY AND COLOR COMPENSATION FOR DIGITAL
STILL IMAGES AND DIGITAL VIDEO
BACKGROUND OF THE INVENTION
1. Field of Invention
The invention relates generally to image processing systems. More particularly, methods and apparatus for efficiently providing more natural looking video in real time are disclosed. The same methods and apparatus can also be used for still-image processing and for non-real-time image processing.
2. Description of Relevant Art
Conventional digital video recording and playback systems are limited in their ability to perform real-time image processing due to the high speed requirements of processing algorithms and the related high cost of hardware that can run those algorithms. For these reasons, real-time digital video processing systems are either not available or are attainable only at a high cost (and with limited capability). Therefore, such real-time systems are available only to high end users such as scientific institutions, movie studios and the like. Furthermore, most video recording and playback systems use most of their computing power to execute basic recording or playback functions; additional image processing would require additional computing power or a diminished frame rate in the recording or playback.
Fig. 1 illustrates a conventional NTSC standard TV picture 1. The TV picture 1 is formed of an active picture 10 that is the area of the TV picture 1 that carries picture information. Outside of the active picture area 10 is a blanking region 11 suitable for line and field blanking. The active picture area 10 uses frames 12, pixels 14 and scan lines 16 to form the actual TV image. The frame 12 represents a still
image or one of many frames produced by any of a variety of sources such as an analog video camera, an analog television receiver, as well as digital sources such as a computer, digital television receiver (DTV), digital still-image camera etc. In systems where interlaced scan is used, each frame 12 represents a field of information. Frame 12 may also represent other breakdowns of a still image depending upon the type of scanning being used.
Information in frame 12 is represented by any number of pixels 14. A pixel (a contraction of "picture element") is the smallest distinguishable and resolvable area in an image as well as the discrete location of at least one individual photo-sensor in a solid state camera. Each pixel in turn represents digitized information and is often represented by 4 to 36 bits. Each scan line 16 includes any number of pixels 14, thereby representing a horizontal line of information within frame 12. In NTSC video (a television standard using interlaced scan), for example, a field of information appears every 60th of a second, a frame (including the two alternating interlaced fields) appears every 30th of a second, and the continuous presentation of frames of information produces a moving picture. On a computer monitor using progressive scan, a frame of information is refreshed on the screen at least every 60th of a second to produce the display seen by a user.
One attempt at improving the appearance of a video display is described in U.S. Patent 5,406,336, "Contrast and Brightness Control Whereby Both Are Based on the Detected Difference Between a Fixed Black Level in the Video Signal and the Black Peak Value", issued to Hartmut, et al. The method of Hartmut requires that the black level of the digital image be set according to the difference between a pre-set black level and the black level of the received video signal. Using this approach, the image
rendered is darker than would be desirable and Hartmut attempts to overcome this deficiency by increasing luminance for the video signal based upon the difference between the pre-set black level and the video signal black level and the maximum white peak level and the white peak level of a signal. This approach gives increased contrast but an overall darker picture as a result.
In the digital format, each pixel is represented by a brightness, or luminance component (also referred to as luma, "Y") and optionally color, or chrominance, components. Since the human visual system has much less acuity for spatial variation of color than for brightness, it is advantageous to convey the brightness component, or luma, in one channel, and color information that has had luma removed in two other channels. In a digital system each of the two color channels can have considerably lower data rate (or data capacity) than the luma channel and, in some cases, not produce a noticeable degradation in the eyes of most viewers. Since green dominates the luma channel (typically, about 59% of the luma signal comprises green information), it is sensible, and advantageous for signal-to-noise reasons, to base the two color channels on blue and red. In the digital domain, these two color channels are referred to as chroma blue, Cb and chroma red Cr.
Since chrominance is the color part of a signal relating to the hue and saturation but not to the brightness or luminance of the signal, black and white images do not have chrominance, but any colored signal has both chrominance and luminance.
In composite video, luminance and chrominance are combined along with the timing reference 'sync' information using a coding standard such as NTSC.
Since the human eye has far more luminance resolving power than color resolving power, for digital images the color bandwidth of a coded signal is reduced to far below that of the luminance, typically one half to one quarter of the bandwidth. By way of example, digital image processing of a standard 525/60 NTSC (interlaced) TV picture requires a data bandwidth of at least 25.75 MBytes/sec (note that 1 MByte is equal to 1024 x 1024 = 1,048,576 Bytes). This bandwidth is required since there are 525 interlaced lines over two fields in a 525/60 NTSC TV picture, each having 858 luminance (Y) samples and 429 each of two chrominance samples Cr and Cb (making a total of 1716 samples per line), for a total of 1716 x 525 = 900,900 samples per frame of two interlaced fields. At about 29.97 frames a second and one byte per sample, this requires a data bandwidth of about 900,900 x 29.97 = 26,999,973 Bytes per second, or about 25.75 MBytes/sec. The processing clock for a 525/60 interlaced digital application is 27 MHz.
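By way of illustration only, the bandwidth arithmetic above can be checked with a short Python sketch. The constant and variable names are hypothetical, and one byte per sample is assumed, as the quoted figures imply.

```python
# Sketch of the 525/60 bandwidth arithmetic described above.
# Assumption: each sample occupies one byte (8-bit sampling).
LINES_PER_FRAME = 525
LUMA_SAMPLES_PER_LINE = 858
CHROMA_SAMPLES_PER_LINE = 429      # each of Cr and Cb
FRAMES_PER_SECOND = 29.97
BYTES_PER_MBYTE = 1024 * 1024      # 1,048,576 bytes

# Y plus the two chroma channels on every line.
samples_per_line = LUMA_SAMPLES_PER_LINE + 2 * CHROMA_SAMPLES_PER_LINE
samples_per_frame = samples_per_line * LINES_PER_FRAME
bytes_per_second = samples_per_frame * FRAMES_PER_SECOND
mbytes_per_second = bytes_per_second / BYTES_PER_MBYTE

print(samples_per_line)             # samples per line
print(samples_per_frame)            # samples per frame of two fields
print(round(bytes_per_second))      # bytes per second
print(round(mbytes_per_second, 2))  # MBytes per second
```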
With the advent of High Definition TV (HDTV) the data bandwidth required to transmit a video stream is very substantial. By way of example, the HDTV broadcast standard called 1080i (1080 interlaced lines) increases the current NTSC 525 lines per frame to 1,125 lines while at the same time increasing the horizontal sampling from 720 to 1920 active pixels with a change of aspect ratio from 12:9 to 16:9. This increase in size and resolution requires an HDTV image to have over 2 million pixels with the concomitant increase in required data transmission and processing bandwidth.
Image processing is often needed to improve the quality of a digital image. Most consumer-level video and still-image products do not convey to the viewer a natural balance of black level, contrast and color. The improvement described in this application relates to restoring that natural balance. The prohibitive processing power required to perform real-time digital image processing has contributed to the inability of the current state of the art to produce, in real time, naturally appearing images for use in, for example, video displays such as TVs (whether analog or digital), computers, still cameras and the like. Several conventional approaches to producing appealing images use what can be referred to as a brute force approach. Such approaches include the use of manually, and in some cases semi-automatically, activated color, brightness, contrast and tint controls included on most TV systems today. These conventional approaches can only provide a rough approximation to what would be considered a natural appearance. In addition, these settings cannot typically be changed except in very limited circumstances (such as changing brightness level to partially compensate for ambient light changes), and real-time image processing is seldom attainable due at least in part to the large data transmission and processing bandwidth requirements of digital images.
One commonly used technique that is intended to improve the picture inside televisions, Scanning Velocity Modulation, is an analog technique that accelerates and decelerates the electron scanning in a Cathode Ray Tube. The resulting picture can appear to have more detail, but most experts prefer to defeat it since it also creates unnatural, noticeable artifacts such as lines that are thinner or thicker than they should be.
Therefore, what is desired is an efficient method and apparatus for producing a naturally appearing image in real time that can be used in any video-type display device such as, but not exclusive to, TV displays, computer displays, still cameras (both digital and analog), and the like.
SUMMARY OF THE INVENTION
The invention relates to an improved method, apparatus and system for real-time enhancement of black level, contrast and color of digital images. In one implementation, the invention recovers shades of gray that appear to be lost in the digital image, providing more detail that in turn makes the image appear more natural and realistic. In the described embodiment, a digital image is analyzed and enhanced by first creating a histogram of the luma components of the pixels that form the digital image. In this case the luma histogram is a graphical representation of the frequency of occurrence of each intensity or range of intensities (gray levels) of pixels in the digital image such that the height of the histogram represents the number of observations occurring in each interval. The image is then modified by, in a preferred embodiment, first setting a new black level by subtracting a constant value from all luma components (while preventing them from going below the minimum black level of the application), effectively shifting the histogram. A luma multiplication factor based upon the modified luma histogram is then created and used to further process the luma component corresponding to each of the pixels so as to distribute a narrow range of image gray scale values across the entire available range. This redistribution improves contrast and represents more closely what the human eye would perceive at the real-life scene. The processed luma component for each pixel is then combined with the chroma component, if existent, corresponding to that pixel to form an enhanced black and white or color pixel. The process is repeated until all pixels forming the standard image are enhanced, at which time the enhanced pixels are arranged to form the enhanced digital image.
In another embodiment of the invention, a method of enhancing a digital image is disclosed. The digital image is formed of pixels, each pixel being associated with a luminance component and a chrominance component. As a method, a luminance histogram is created and its statistical mode is then determined. Based upon the mode of the luminance histogram, a new black level for the digital image is set by
subtracting a constant value from all luminance components, effectively shifting the histogram. A luma multiplication factor is then generated based upon the mode of the original histogram wherein each pixel is multiplied by the luma multiplication factor to form a digital image having enhanced luma characteristics. The enhanced luma components are then combined with the unprocessed chroma components.
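By way of illustration only, the two luma operations named above (a constant subtraction that sets the black level, followed by multiplication by a luma multiplication factor) might be sketched in Python as follows. How the constant and the factor are derived from the histogram mode is not specified in this passage, so both are simply passed in by the caller. The function name, the parameter names, the 8-bit full-range limits and the rounding are assumptions of this sketch, not part of the disclosure.

```python
def enhance_luma(luma, black_shift, lmf, y_min=0, y_max=255):
    """Shift the black level, then scale, for a list of luma values.

    black_shift and lmf are supplied by the caller (hypothetical inputs);
    y_min and y_max are the assumed limits of the application's luma range.
    """
    enhanced = []
    for y in luma:
        # Set the new black level; never drop below the minimum black level.
        shifted = max(y - black_shift, y_min)
        # Spread the shifted values by the luma multiplication factor.
        scaled = round(shifted * lmf)
        # Keep the result inside the available range (an assumption here).
        enhanced.append(min(max(scaled, y_min), y_max))
    return enhanced


row = [16, 40, 60, 100, 180]
result = enhance_luma(row, black_shift=16, lmf=1.5)
print(result)
```

With the illustrative values shown, the darkest sample lands on the minimum level and the remaining samples are spread over a wider portion of the range, which is the qualitative effect described above.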
In yet another embodiment of the invention, an apparatus for enhancing a digital image is disclosed. The digital image includes a plurality of pixels, and each of the pixels includes a luma component and a chroma component. The apparatus includes a luma component histogram generator arranged to generate a luma histogram based upon those of the plurality of pixels located within a specified portion of the digital image. A luma processor coupled to the luma component histogram generator determines the mode of the luma histogram. Based upon the luminance histogram, a new black level for the digital image is set by subtracting a constant value from every luminance component and effectively shifting the histogram. A luma multiplication factor is then generated based upon the mode of the original histogram wherein each pixel is multiplied by the luma multiplication factor to form a digital image having enhanced luma characteristics. For color images, a chroma multiplication factor is also generated based on the mode of the luma histogram and the chroma components are multiplied too. In another embodiment there is a chroma processor that combines input from the luma processor and a chroma histogram to create new chroma multiplication factors to enhance the chroma components. A multiplexer is used to combine the respective multiplied luma components and the respective chroma components into a plurality of enhanced pixels to form the enhanced digital image.
In this way, black and white images are made to appear more natural since the black level and contrast are balanced. In addition, by enhancing both the luminance components and the chrominance components, color images are made to appear more natural. By capturing historical luminance data over a series of interrelated frames of digital image data, as in digital video, real-time enhancement of digital video is practically implemented.
Embodiments of the present invention are applicable to real-time processing, to still image processing, or to non-real-time processing such as "overnight" processing. In addition, although the 4:2:2 is the most commonly used color sampling standard for professional video and represents chroma components at one-half the bandwidth of the luma component (which has been described in the Background), the present invention is also applicable to the 4:2:0, 4:1:1 (quarter chroma bandwidth) and
4:4:4 (full chroma bandwidth) color sampling standards.
These and other advantages of the present invention will become apparent upon reading the following detailed descriptions and studying the various figures of the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention, together with further advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings in which:
Fig. 1 illustrates a prior art image representation scheme that uses pixels and scan lines.
Fig. 2 is a block diagram of a real-time image processor system in accordance with an embodiment of the invention.
Fig. 3 is a schematic representation of a data word typical of the prior art.
Fig. 4A is a schematic representation of an implementation of the digital image processing engine shown in Fig. 2.
Fig. 4B is a representative luma histogram in accordance with an embodiment of the invention.
Fig. 4C illustrates setting the black level by shifting the luma histogram of Fig. 4B an appropriate amount.
Fig. 4D illustrates an input output luma relationship in accordance with an embodiment of the invention.
Fig. 5 is a flowchart detailing a process for enhancing a digital image in accordance with an embodiment of the invention.
Fig. 6 is a timing diagram illustrating the temporal relationship in the case where a single scan line is enhanced in accordance with an embodiment of the
invention.
Fig. 7 illustrates a timing diagram for enhancing a series of interrelated scan lines, as would be found in a digital video, in accordance with an embodiment of the invention.
Fig. 8 illustrates a hardware implementation of a real-time image processor system in accordance with an embodiment of the invention.
Fig. 9A is an unprocessed digital image.
Fig. 9B is a luma histogram corresponding to the digital image of Fig. 9A.
Fig. 10A illustrates the effect of setting a new black level on the digital image shown in Fig. 9A according to one implementation of the invention.
Fig. 10B is a luma histogram corresponding to the digital image of Fig. 10A.
Fig. 11A illustrates the effect of multiplying the pixels that form the digital image shown in Fig. 9A by a luma multiplication factor in accordance with one implementation of the invention.
Fig. 11B is a luma histogram corresponding to the digital image of Fig. 11A.
Fig. 12A illustrates the effect of multiplying the pixels that form the digital image shown in Fig. 10A by a luma multiplication factor in one implementation of the invention.
Fig. 12B is a luma histogram corresponding to the digital image of Fig. 12A.
Fig. 13 illustrates the effect of multiplying the pixels that form the digital image shown in Fig. 9A by a chroma multiplication factor in one implementation of the invention.
Fig. 14 illustrates a final processed digital image incorporating the processing illustrated in Figs. 10A, 11A, and 13A in accordance with an embodiment of the invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The human eye has two types of vision: scotopic, otherwise referred to as night vision, and photopic, or day vision. The total range of illumination that the eye can see is from near absolute darkness to what is referred to as the glare limit. However, the human eye can only distinguish at any particular time a maximum of 32 shades of gray between what is perceived as total black (i.e., absence of light), the darkest area of a scene, and total white (i.e., controlled by the iris), the brightest source of light. The human iris controls at all times the top of the range of 32 shades the eye can see.
In many image capture devices, charge-coupled devices, also referred to as CCDs (a photo-sensitive image sensor implemented with large scale integration
technology), are used to capture images. In one implementation, in what is referred to as a Frame Transfer CCD, the entire image is transferred from the sensing area to a storage area on chip. Data (i.e., the charge generated) is then read out from the storage area in a full frame mode. In another implementation, in what is referred to as an Interline Transfer CCD, data (i.e., the charge generated) is transferred simultaneously out by odd and even lines, or fields, directly from the image sensors to their corresponding sensor registers. In this case, the output from the digital camera is always one field (or frame) behind the image being captured.
In any case, these, and other image capture devices are generally capable of capturing more than the 32 shades of gray perceivable by the human eye. Depending on the optics and electronics involved, this superset of captured images, even when sampled to, for example, 256 shades of gray, could represent a compressed image dynamic range. The compressed dynamic range can be perceived, even if more than 32 shades of gray are placed between perceived black and white (i.e., the captured 256 shades can only be perceived as 32 shades of gray), if the contrast and color balance are not correct. Without the correct balance the image will not appear natural. The net effect of this apparent dynamic range compression is that the image appears grayish and details are diminished and in some cases lost when viewed by the human eye. In one implementation, the invention recovers the shades of gray apparently lost in the digital image thus providing more detail that makes the image appear more natural. In the described embodiment, a digital image is enhanced by creating a luma histogram of the luma components of the pixels that form the digital image. In this case the luma histogram is a graphical representation of the frequency of occurrence of each intensity or range of intensities (gray levels) of pixels in the digital image
such that the height of the histogram represents the number of observations occurring in each interval.
In a particular embodiment, the histogram is collected from a certain region of the digital image to be enhanced since that region is determined to most likely contain the most important visual information (i.e., between lines 70 and 410 of the total 525 lines). In a more particular embodiment, the histogram includes 64 bins, each bin representing a particular shade of gray. Based on the histogram, a new black level is set and the histogram is effectively shifted. A luma multiplication factor based upon the mode of the shifted histogram is then used to modify the luma component. In a preferred embodiment, the color components, Cr and Cb, are also modified and recombined with the modified luma components to form the enhanced digital image.
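By way of illustration only, collecting a 64-bin luma histogram from the region between lines 70 and 410 and finding its mode might be sketched in Python as follows. The function names, the representation of a frame as a list of scan lines, the treatment of the line numbers as zero-based inclusive indices, and the assumption of 8-bit luma values (so that each bin spans four gray levels) are all assumptions of this sketch.

```python
def luma_histogram(frame, first_line=70, last_line=410, bins=64, levels=256):
    """Count luma values per bin over the selected scan lines.

    frame is a list of scan lines, each a list of integer luma values
    in the range 0..levels-1 (8-bit values assumed by default).
    """
    hist = [0] * bins
    bin_width = levels // bins  # 4 gray levels per bin for 256 levels
    # Line numbers are treated as inclusive, zero-based list indices.
    for line in frame[first_line:last_line + 1]:
        for y in line:
            hist[y // bin_width] += 1
    return hist


def histogram_mode(hist):
    """Return the index of the fullest bin (the lowest index wins ties)."""
    return max(range(len(hist)), key=lambda i: hist[i])


# Synthetic 525-line frame: the region of interest holds luma 100,
# every other line holds luma 200 and is ignored by the histogram.
frame = [[100] * 8 if 70 <= n <= 410 else [200] * 8 for n in range(525)]
hist = luma_histogram(frame)
mode_bin = histogram_mode(hist)
print(mode_bin, hist[mode_bin])
```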
This histogram shifting and modification of components makes the image appear more natural by revealing the details in the image that were otherwise obscured by the grayscale dynamic range compression. In addition, contrast enhancement also provides improved edge detail that makes the image appear to be sharper and more focused.
Referring now to Fig. 2, a block diagram of a real-time processor system 200 in accordance with an embodiment of the invention is shown. Real-time processor system 200 includes an image source 202 arranged to provide any number of video input signals for processing. These video signals can have any number and type of well-known formats, such as Composite, S-Video, Component, serial digital, parallel digital, or RGB video. The signal can be analog provided the image source 202 includes an analog image source 204 such as, for example, an analog television, still camera, analog VCR, DVD player, camcorder, laserdisc player, TV tuner, set-top box (with satellite DSS or cable signal) and the like. The image source 202 can also include a digital image source 206 such as, for example, a digital television (DTV) set-top box receiver, digital still camera, and the like. The digital video signal can be in any number and type of well-known digital formats such as SMPTE 274M-1995 (1920 x 1080 resolution, progressive or interlaced scan), SMPTE 296M-1997 (1280 x 720 resolution, progressive scan), as well as standard 480 interlaced scan or progressive scan video.
In the case where the image source 202 provides an analog image signal, an analog-to-digital converter (A/D) 208 is connected to the analog image source 204. In the described embodiment, the A/D converter 208 converts an analog voltage or current signal into a discrete series of digitally encoded numbers (signal), forming in the process an appropriate digital image data stream suitable for digital processing. Any of a wide variety of A/D converters can be used. By way of example, suitable A/D converters include those manufactured by Philips, Texas Instruments, Analog Devices, Brooktree, and others. In one embodiment of the invention, the A/D converter 208 uses what is referred to as 4:2:2 sampling to generate a scan line data word 300 as shown in Fig. 3. It should be noted that 4:2:2 sampling is a sampling technique applied to the color difference component video signals (Y, Cr, Cb) where the color difference signals, Cr and Cb, are sampled at a sub-multiple of the luminance Y frequency. If 4:2:2 sampling is applied, the two color difference signals Cr and Cb are sampled at the same instant as the even luminance Y samples. The use of 4:2:2 sampling is the 'norm' for professional video as it ensures that the luminance and the chrominance digital information are coincident, thereby minimizing chroma luma delay. It also provides very good picture quality and reduces sample size by 1/3 relative to sampling all three components at the full luminance rate. With ITU-R 601 coding, the Y luminance signal is sampled at 13.5 MHz and the two color difference signals Cb and Cr are sampled at 6.75 MHz. It should be noted that it is possible to also process digital signals with different sampling characteristics such as 4:4:4, 4:1:1 and 4:2:0 sampling.
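By way of illustration only, the sampling figures above can be checked with a short Python sketch. The constant names are hypothetical, and the full-rate comparison case assumes that 4:4:4 sampling runs both color difference channels at the 13.5 MHz luminance rate.

```python
# Sketch of the ITU-R 601 4:2:2 sampling-rate arithmetic.
LUMA_RATE_MHZ = 13.5     # Y sampling frequency
CHROMA_RATE_MHZ = 6.75   # each of Cb and Cr under 4:2:2

# 4:2:2: one luma channel plus two half-rate chroma channels.
total_422_mhz = LUMA_RATE_MHZ + 2 * CHROMA_RATE_MHZ

# Assumed 4:4:4 reference: all three channels at the luma rate.
total_444_mhz = 3 * LUMA_RATE_MHZ

# Fraction of samples saved by 4:2:2 relative to 4:4:4.
reduction = 1 - total_422_mhz / total_444_mhz

print(total_422_mhz)
print(total_444_mhz)
print(round(reduction, 4))
```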
In the described embodiment, the analog to digital conversion process carried out by the A/D converter 208 occurs in three parts: signal preparation, sampling and digitization. Either the whole composite signal (PAL or NTSC for example) or the components (meaning separate signals that together make-up the full color signal) can be converted.
It should be noted that the A/D converter 208 operates most correctly if the signals applied to it are correctly conditioned to ensure the correct voltage range and amplitude of the signal is given to the A/D converter 208. For example, luminance amplitude between black and white must be set so that it does not exceed the range that the A/D converter 208 will accept since it has only a finite set of numbers with which to describe the signal. The importance of this is such that the ITU-R 601 standard specifies that black should correspond to level 16 and white to level 235, leaving space for errors, noise and spikes to avoid overflow or underflow on the A/D converter 208. Similarly for the color difference signals, a zero signal corresponds to level 128 and full amplitude covers 225 levels.
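By way of illustration only, the level assignments above might be sketched in Python as follows. The function names and the normalized input conventions (luma from 0.0 for black to 1.0 for white, color difference from -0.5 to +0.5) are assumptions of this sketch; the code values 16, 235 and 128 and the 225-level color difference span are taken from the text.

```python
def quantize_luma(e_y):
    """Map normalized luma (0.0 = black, 1.0 = white) to an 8-bit code.

    Black maps to 16 and white to 235; out-of-range inputs are
    clamped to the 8-bit limits rather than allowed to wrap.
    """
    code = round(16 + 219 * e_y)
    return min(max(code, 0), 255)


def quantize_chroma(e_c):
    """Map a normalized color difference (-0.5..+0.5) to an 8-bit code.

    Zero signal maps to 128; full amplitude spans codes 16..240,
    which is 225 levels.
    """
    code = round(128 + 224 * e_c)
    return min(max(code, 0), 255)


print(quantize_luma(0.0), quantize_luma(1.0))
print(quantize_chroma(-0.5), quantize_chroma(0.0), quantize_chroma(0.5))
```

The codes below 16 and above 235 remain available as headroom, so a modest overshoot such as `quantize_luma(1.05)` still yields a valid code instead of overflowing.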
Under the ITU-R 601 standard, analog signals must also be low-pass filtered to prevent information beyond the 5.75 MHz luminance band limit and the 2.75 MHz color difference band limit from reaching the A/D converter 208. If it did, aliasing artifacts would result and be visible in the picture. For this reason, low pass (antialiasing) filters are used that sharply cut off frequencies beyond the band limit. The low-pass filtered signals of the correct amplitudes are then passed to the A/D converter 208 where they are sampled and digitized. In some embodiments, two A/D
converters are used, one for the luminance Y, and the other for both color difference signals, Cb and Cr. Within the active picture, the A/D converter 208 takes a sample of the analog signals (to create pixels) each time it receives a clock pulse (generated from the sync signal). For luminance Y, the clock frequency is 13.5 MHz, and for each color difference channel Cr and Cb, the frequency is half that, 6.75 MHz, making a total sampling rate of 27 MHz. It is important that the pattern of sampling is adhered to; otherwise a digital-to-analog converter performing the eventual conversion back to analog will not know where each sample fits into the picture. The amplitude of each sample is held and precisely measured in the A/D converter 208. Its value is then expressed and output as a binary number and the analog to digital conversion is complete.
Fig. 3 illustrates a scan line data word 300 output by the A/D converter 208 in accordance with an embodiment of the invention. As shown in Fig. 3, the scan line data word 300 represents a digitized version of the scan line 16 of the 525 line NTSC TV picture shown in Figure 1 (the scan line 16 is part of the active picture 10). A/D converter 208 converts the analog signal corresponding to the scan line 16 and generates 720 luminance components Y and 360 each of the corresponding chrominance components Cr and Cb. Each of luminance components Y and corresponding chrominance components Cr and Cb can be sampled at a resolution of either 8 bits or 10 bits. For this example, pixel 14 can be represented by a data word 302 that includes luminance component Y 302, chroma components Cb 151 and Cr 151.
Referring back to Fig. 2, an inboard video signal selector 210 connected to the digital image source 206 and the A/D converter 208 is arranged to select which of the two image sources (analog image source 204 or digital image source 206) will
provide the digital image to be enhanced by a digital image processing engine 212 connected thereto. After appropriately processing the digital image received from the video signal selector 210, the digital image processing engine 212 outputs an enhanced version of the received digital image. The digital signal can then be sent to an image scaling or display unit 216. The image scaling or display unit 216 can include a standard analog TV, a digital TV, computer monitor, a deinterlacing unit, a scaling unit, etc.
In the case where the image display unit 216 includes an analog display device 218, such as a standard analog TV, a digital-to-analog (D/A) converter 220 connected to the outboard video signal selector 214 converts the enhanced digital signal to an appropriate analog format. In the process of converting the digitized video signal to a corresponding analog signal, in one embodiment, the digital information is fed to three digital-to-analog converters, one each for Y, Cr and Cb, which are clocked in the same way and with the same frequencies as was the case with the A/D converter 208. The output is a continuous analog voltage representation similar to the original analog signal.
By way of example, if the analog device 218 is a standard NTSC format TV (a television standard using interlaced scan), the enhanced analog signal provided by the D/A converter 220 represents a field of information that appears every 60th of a second, wherein a frame (that includes the two interlaced fields) appears every 30th of a second such that the continuous presentation of frames of information produces a picture. On the other hand, in the case where the image display unit 216 includes a digital display device 222, such as, for example, a digital input High Definition TV (HDTV) or digital interface LCD computer monitor, the digital display device 222 is connected directly to the video signal selector 214.
When a deinterlacing unit 224 or scaling unit 225 is used, the enhanced digital output of the digital image processing engine 212 can be converted to progressive scan with a deinterlacer or scaled to a smaller or larger image with a different horizontal or vertical size than the original signal. For example, the digital output from a deinterlacer may be passed through a D/A converter before being fed to a progressive display, or may be fed directly to a digital display. In one embodiment, deinterlacing unit 224 may include any of the DV101, DV102 or DV103 deinterlacer chips available from DVDO, Inc. of Campbell, California. Scaling unit 225 may include the gmVLX1A-X chip available from Genesis Microchip, Inc., or other similar scaling chips. The gmVLX1A-X chip incorporates both a deinterlacer and a scaler.
In an alternative embodiment, video signal selector 214 is optional and the output of engine 212 may be fed directly to any or all of the devices in block 216. Selector 214 is preferred if engine 212 has limited fan out capability.
Fig. 4A is a schematic representation of a digital image processing engine 400 in accordance with an embodiment of the invention. It should be noted that the digital image processing engine 400 is one implementation of the digital image processing engine 212 shown in Fig. 2. In the described embodiment, the digital image processing engine 400 is capable of receiving a digital video data stream in the form of data stream 300. In another embodiment, the digital image processing engine 400 receives a digital still image in the form of a single frame, for example. In this way, the digital image processing engine is capable of providing real-time digital enhancement for digital video as well as individual digital frames provided by, for example, a digital still camera and the like.
In the described embodiment, the digital image processing engine 400 includes a data word separator, or demultiplexer 402. Implementation of a demultiplexer will be familiar to those of skill in the art. In one implementation, the demultiplexer 402 receives a sequence of data words 302 and separates them into their respective luma (Y) components and chroma components (Cb and Cr). Once separated the luma component Y is passed substantially simultaneously to a luma controller 404, a luma histogram processor 406, and a luma processor 408. In the described embodiment, one of the functions of the luma controller 404 is to control the storage and retrieval of the luma component Y and its processing by the luma processor 408. The luma histogram processor 406 collects a histogram and generates statistics. The luma controller 404 controls access to historical luma component data stored in a memory 410 connected thereto. In this way, current luma data can be stored and historical data can be retrieved in accordance with the contemporaneous digital image processing. In the described embodiment, the memory 410 takes the form of a static random access memory (SRAM) 410 capable of at least storing the contents of a particular frame buffer. In other embodiments, other types of memory such as Dynamic Random Access Memory, also referred to as DRAM, may be used.
The luma histogram processor 406 is arranged to generate a luma histogram based upon luma components received directly from the demultiplexer 402. Once the luma histogram has been generated by the luma histogram processor 406, the luma processor 408 is arranged to determine the mode of the luma histogram using techniques well known to those skilled in the art. It should be noted that Fig. 4B, described below, shows an exemplary luma histogram in accordance with the described embodiment of the invention.
Once the luma histogram has been created by the luma histogram processor 406, the luma processor 408 uses the luma histogram to generate a luma multiplication factor LMF. The luma multiplication factor LMF is then used by the luma processor 408 to generate the enhanced digital image. In the described embodiment, the demultiplexer 402 is also connected to the chroma processor 412 as well as a chroma controller 416 which, in turn, is connected to a second memory device 418 that can also be implemented as SRAM 418. A multiplexer 420 is arranged to re-combine enhanced luma components Y' and enhanced chroma components Cr' and Cb' to form an enhanced pixel data word. It should be noted that the chroma processor 412, chroma controller 416, and SRAM 418 perform substantially the same functions on the chroma components Cr and Cb as their respective luma counterparts, the luma processor 408, luma controller 404, and SRAM 410, perform on the luma component Y.
Although not shown in Fig. 4A or Fig. 2, the digital image processing engine 400 is contemplated to include a chroma histogram processor capable of generating a chroma Cb histogram, a chroma Cr histogram, or some other appropriate combination thereof to generate statistics, such as the mode, to aid in the picture processing.
Although operation of the digital image processing engine 400 will now be described in terms of the processing of the pixel data word 302, it should be noted that the digital image processing engine 400 is capable of processing a continuous stream of data words. This stream of data words can take the form of, for example, a digital video as well as a single image typical of a digital or analog still-image camera.
After the demultiplexer 402 receives the pixel data word 302 from the A/D converter 208, it separates the pixel data word 302 into its constituent luma component Y302 and chroma components Cr 151 and Cb 151. Once the pixel data word 302 has been separated, the luma component Y302 is output substantially simultaneously to the luma controller 404, the luma histogram processor 406, and the luma processor 408. The luma histogram processor 406 then uses the received luma component Y302 in combination with historical luma component data stored in and retrieved from the SRAM 410 to form a luma histogram. The luma histogram is ready to be processed by the luma processor 408 when most or all of the luminance (Y) components that constitute the picture have been added to the histogram. Once ready for processing, the luma histogram is passed to the luma processor 408 where, under the control of the luma controller 404, a luma multiplication factor LMF is generated. In a preferred embodiment, the luma multiplication factor LMF is used in a multiplication operation on substantially all the pixel luma components that ultimately go into the formation of the enhanced pixel data word. In a most preferred embodiment, the luma multiplication factor LMF is based upon the luma processor 408 first shifting the luma histogram and setting a new black level. Based upon the shifted version of the histogram, the luma processor 408 then uses the mode of the shifted histogram to derive the appropriate luma multiplication factor LMF. In this way, contrast enhancement provided by the new black level as well as the luma multiplication factor LMF brings out edges and details otherwise obscured by the image capturing process.
By way of example, Figures 4B - 4D illustrate an approach to deriving a luma multiplication factor LMF in accordance with one embodiment of the invention. Figure 4B is a representative luma histogram 450 in accordance with an embodiment of the invention. The luma histogram 450 has a horizontal axis representing the luma bins between black (luma bin = 0) and white, the upper limit (luma bin = 255). The vertical axis represents the number of sampled pixels having an associated luma value corresponding to a particular luma bin. For example, in this case, the luma histogram processor 406 has determined that, based upon the received luma component data, 172 pixels of the digital image being processed have a luma component value corresponding to luma bin = 67. It is important to note that, since the lowest order populated luma bin is displaced from luma bin = 0 (i.e., absolute dark) by "X" luma bins, any information included within this region is not captured. Therefore, in order to increase the dynamic range of the image (the difference between the darkest part and the brightest part), the histogram 450 is shifted left by the number of luma bins represented by "X" to form a histogram 460 as shown in Fig. 4C. It should also be noted that the mode N' of the histogram 460 is shifted relative to the mode N of the non-shifted histogram 450. The shifting produces a darker picture.
Once the luma processor 408 has shifted the histogram generated by the luma histogram processor 406, the luma processor derives, in one embodiment, a linear luma multiplication factor LMF that is based upon the mode of the shifted luma histogram 460. The purpose of the LMF is to improve the contrast balance in the picture.
Fig. 4D illustrates an input/output luma relationship 470 in accordance with an embodiment of the invention. The input/output relationship 470 is best understood with reference to the ideal linear transfer function represented by an ideal response 472. The ideal response 472 shows a "pass through" situation where no information is lost in processing a particular pixel. In this case, input luma values equal output luma values. However, in reality, curve 474 represents the situation of Fig. 4B where some information is lost due to the displacement of the luma histogram from absolute dark (i.e., luma bin = 0). In the case of curve 474, the image appears dark and detail is lost due to the compression of the image from 255 luma bins to (255-X) luma bins.
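The histogram shift described for Figs. 4B and 4C can be sketched in a few lines of Python. This is an illustrative sketch only, not the engine's implementation: the function names and the sample pixel values are hypothetical, 8-bit luma with 256 bins is assumed, and the offset "X" is taken here to be the lowest populated bin.

```python
def luma_histogram(luma_values, bins=256):
    """Count how many pixels fall into each luma bin (0 = black, bins - 1 = white)."""
    hist = [0] * bins
    for y in luma_values:
        hist[y] += 1
    return hist


def lowest_populated_bin(hist):
    """Return X, the displacement of the darkest populated bin from bin 0."""
    for bin_index, count in enumerate(hist):
        if count > 0:
            return bin_index
    return 0  # empty histogram: nothing to shift


def histogram_mode(hist):
    """Return the most heavily populated bin (the lowest bin wins ties)."""
    return max(range(len(hist)), key=lambda b: hist[b])


def shift_black_level(luma_values, offset):
    """Subtract the offset from every luma value, clamping at black (0)."""
    return [max(y - offset, 0) for y in luma_values]


# Hypothetical sample data: the darkest pixel sits at bin 40, the mode at bin 67.
pixels = [40, 67, 67, 67, 90, 200]
hist = luma_histogram(pixels)
x = lowest_populated_bin(hist)
shifted = shift_black_level(pixels, x)
shifted_mode = histogram_mode(luma_histogram(shifted))
```

In this sketch the mode moves left by the same X bins as the rest of the histogram, which mirrors the relationship between N and N' described above.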
By applying, in one case, an empirically derived luma multiplication factor LMF, a corrected curve 476 can be obtained. The corrected curve 476 approaches the ideal curve 472 in a region 478 that depends upon the mode N' of the shifted histogram 460. The region 478 is typically arranged to coincide with the luma bin values to which the human eye is most responsive. In this way, the image appears to have greater detail, edge features are enhanced, and contrast is improved such that the image appears to be more natural than would otherwise be possible.
It should be noted, however, that in some embodiments the luma multiplication factor LMF need not be a single scalar multiplier; it can instead be a non-linear function relating luma input to luma output. The LMF can take the form of a look-up table where each input value can be mapped non-linearly to another output value. More particularly, in a current embodiment, the luma multiplication factor LMF is empirically determined and stored in a luma multiplication factor look-up table.
The luma multiplication factor look-up table is based upon the mode of the shifted luma histogram. By way of example, first shifting the gray level values of the histogram between dark and light portions of an image improves both visibility and feature detection in a predetermined manner. Once the histogram is shifted, the mode, or most frequent value, of the shifted luma histogram is determined. The luma histogram mode is then used to generate the luma multiplication factor LMF. In the described embodiment, the luma multiplication factor LMF takes the form of an entry in a luma multiplication factor look-up table. The entries in the LMF look-up table are then used by the luma filter 414 to multiply each luma component in order to form the luma enhanced portion of the digital image.
In one embodiment, the luma look-up table takes the form of a luma multiplication factor associated with a particular luma bin extending from pure black to pure white as represented by the shifted luma histogram. By way of example, assume, for simplicity, that the luma histogram has only 4 luma bins, namely, luma0, luma1, luma2, and luma3, representing all available shades of gray from all black at luma0 to all white at luma3. In this example, each luma bin has associated with it a luma multiplication factor, the factors being 1.25, 1.5, 1.75, and 2.0, respectively. This luma look-up table is presented as Table 1.
Table 1
Luma bin    LMF
0           1.25
1           1.50
2           1.75
3           2.00
Using Table 1 as an example, if the shifted luma histogram mode N' is determined to correspond to luma bin = 1, then the luma multiplication factor LMF is set to 1.5. In this way, the luma component value for every pixel processed by the image processing engine is multiplied by the factor 1.5.
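The Table 1 look-up can be illustrated with a short Python sketch. The names `LMF_TABLE_1` and `apply_lmf` are hypothetical, and clipping the product at white (255) is an assumption made here for 8-bit output rather than something the example above specifies.

```python
# Luma multiplication factors per mode bin, copied from Table 1.
LMF_TABLE_1 = {0: 1.25, 1: 1.50, 2: 1.75, 3: 2.00}


def apply_lmf(luma_values, mode_bin, table=LMF_TABLE_1, white=255):
    """Multiply every luma value by the LMF selected by the shifted-histogram mode.

    Results are rounded and clipped at `white` (an assumption of this sketch).
    """
    lmf = table[mode_bin]
    return [min(round(y * lmf), white) for y in luma_values]


# Mode N' falls in luma bin 1, so every luma value is multiplied by 1.5.
enhanced = apply_lmf([10, 100, 200], mode_bin=1)
```

Note that a single factor is chosen per picture from the mode, and that same factor is applied to every pixel; the table is indexed by the mode bin, not by each pixel's own value.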
In one embodiment, the chroma components Cr 151 and Cb 151 are processed at substantially the same time as the luma component Y302. The chroma components Cr 151 and Cb 151 are multiplied by a chroma multiplication factor CMF that is, in the described embodiment, derived from the luma multiplication factor LMF such that the chroma multiplication factor CMF is linearly related to the luma multiplication factor LMF. To form an enhanced data word, the multiplexer 420 appropriately combines substantially all the enhanced pixel data words. The enhanced data word is then combined with other enhanced data words that, taken together, are then provided to the image de-interlacing, scaling and display device 216.
In an alternative embodiment, the values shown in Table 2 have been found to yield good results. In this example, for many video applications it has been empirically determined that, for a four-bin histogram implementation, the values in Table 2 yield a more natural-looking picture. This example assumes 8-bit luma values ranging from 0 (black) to 255 (white), and 8-bit values for chroma around 128, where the CMF multiplies above or below 128 to increase the value towards 0 or 255.
Table 2
Mode in Bin    Constant Value for Shifting    LMF     CMF
0              0                              1.25    0.90
1              8                              1.15    0.95
2              16                             1.10    1.10
3              24                             0.95    1.20
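One way to read Table 2 is sketched below in Python. Several details are assumptions of this sketch and are not stated in the text: the four bins are taken to be equal-width ranges of 64 luma levels, the chroma factor is applied to the offset from the neutral value 128, and results are rounded and clamped to 8 bits. The function names are hypothetical.

```python
# (shift constant, LMF, CMF) per mode bin, copied from Table 2.
TABLE_2 = {
    0: (0, 1.25, 0.90),
    1: (8, 1.15, 0.95),
    2: (16, 1.10, 1.10),
    3: (24, 0.95, 1.20),
}


def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def mode_bin_4(luma_values):
    """Return the most populated of four equal-width bins (assumed binning)."""
    counts = [0, 0, 0, 0]
    for y in luma_values:
        counts[y // 64] += 1
    return counts.index(max(counts))


def enhance_pixel(y, cb, cr, mode_bin):
    """Shift the black level, scale luma by the LMF, and scale chroma about 128."""
    shift, lmf, cmf = TABLE_2[mode_bin]
    y_out = clamp8((y - shift) * lmf)
    cb_out = clamp8(128 + (cb - 128) * cmf)
    cr_out = clamp8(128 + (cr - 128) * cmf)
    return y_out, cb_out, cr_out


picture_luma = [70, 80, 90, 200]       # hypothetical luma samples
bin_index = mode_bin_4(picture_luma)   # most samples fall in bin 1 (64..127)
pixel = enhance_pixel(108, 148, 108, bin_index)
```

Scaling chroma about 128 means a CMF below 1 pulls colors toward neutral gray while a CMF above 1 pushes them toward 0 or 255, which is how the sentence preceding Table 2 is interpreted here.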
Fig. 5 is a flowchart detailing a process 500 for enhancing a digital image in accordance with an embodiment of the invention. The process 500 begins at 502 by separating the received digital signal, in the form of a data word, into its constituent luminance Y, and chrominance Cr and Cb components. The luma component Y is then used to create a luma histogram at 504. In the described embodiment, the luma histogram includes a plurality of luma bins with each bin corresponding to a particular shade of gray such that the shades of gray extend from total black to total white. Moreover, each bin has an associated bin count representative of the number of pixels having a luma component value corresponding to that particular bin.
In one embodiment of the invention, the luma histogram is based upon a portion of the digital image being processed. By way of example, in a particular embodiment, only pixels that form a central portion of the digital image are used to generate the luma histogram since the central portion of the digital image contains most of the information in the picture. In other embodiments, other portions or the entire digital image can be used as deemed appropriate by the application at hand. By way of example, with medical imaging, portions of the image to be enhanced can be selected based upon diagnostic imperatives. In other cases, the entire image to be enhanced can form the input to the luma histogram as deemed appropriate.
At 506, the mode of the histogram is determined, after which the black level of the histogram is set at 508. In one embodiment, the luma histogram is effectively shifted by subtracting a specified scalar value from all luma components. The specified scalar value corresponds to a luma bin located approximately in the range bounded by the luma bin at least three luma bins lower than the luma mode and eight luma bins above the zero luminance bin (i.e., black in the original picture).
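Restricting the histogram to a central portion of the image can be sketched as follows. This is a hypothetical illustration: the frame is modeled as a list of rows of 8-bit luma values, and the window size (half of each dimension by default) is an assumed parameter, since the text does not specify how large the central portion is.

```python
def central_region(frame, fraction=0.5):
    """Return the luma samples inside a centered window.

    The window covers `fraction` of the frame's height and width; the value
    0.5 is an assumption of this sketch.
    """
    height = len(frame)
    width = len(frame[0])
    win_h = max(1, int(height * fraction))
    win_w = max(1, int(width * fraction))
    top = (height - win_h) // 2
    left = (width - win_w) // 2
    return [y for row in frame[top:top + win_h] for y in row[left:left + win_w]]


def luma_mode(samples, bins=256):
    """Build a histogram of the samples and return its most populated bin."""
    hist = [0] * bins
    for y in samples:
        hist[y] += 1
    return hist.index(max(hist))


# Hypothetical 4x4 frame: a dark border surrounding a brighter center.
frame = [
    [10, 10, 10, 10],
    [10, 90, 90, 10],
    [10, 90, 120, 10],
    [10, 10, 10, 10],
]
center_mode = luma_mode(central_region(frame))
full_mode = luma_mode([y for row in frame for y in row])
```

In this example the dark border dominates the full-frame histogram, while the central window reports the mode of the subject in the middle of the picture, so the two choices would select different enhancement parameters.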
At 510, the luma multiplication factor (LMF) is set based upon the shifted luma histogram. In one embodiment, the LMF takes the form of a look-up table (LUT). Once the luma multiplication factor LMF has been set, the luma components are multiplied by the LMF at 512. Returning to 508, the chroma multiplication factor CMF is set at 514 using the mode of the luma histogram and consulting a chroma look-up table, and at 516 the CMF is used to multiply the chroma components Cr and Cb. Once both the luma components Y and the chroma components Cr and Cb have been processed for a particular data word, the enhanced luma components Y and the enhanced chroma components Cr and Cb are combined at 518 to form an enhanced data word. The enhanced data word is then combined with other enhanced data words, which are then presented to an appropriate display at 520.
Fig. 6 illustrates a timing diagram 600 showing the temporal relationship in the case where the scan line data word 300 is enhanced in accordance with an embodiment of the invention. It should be noted that the frame rates shown are exemplary only and should not be considered as limiting the scope of the invention. By way of example, the frame rates for NTSC and PAL systems are 29.97 Hz and 25 Hz, respectively, whereas for Advanced Television Systems Committee (ATSC) digital video, also referred to as HDTV, the following frame rates apply: 23.976 Hz, 24 Hz, 29.97 Hz, 30 Hz, 59.94 Hz, and 60 Hz.
When the data word 300 is processed, the setting of the black level (508), the setting of the LMF (510) and the setting of the CMF (514) are performed in the blanking interval. In the described embodiment, the operations performed in the blanking interval are based upon statistics generated by the luma histogram of the previous frame. In some embodiments, however, the SRAM 410 and the SRAM 418 are arranged to store historical luma and chroma components for a pre-determined number of previous data words or frames. The statistics of the historical luma and chroma components are then used to calculate a historical luma histogram which is, in turn, used to calculate a historical luma and chroma multiplication factor. In this way, the processing is both based on the statistics of the stored frame and applied to the stored frame.
While the blanking interval operations are performed based upon data from a previous frame, the creation of the luma histogram (504), the determination of the mode of the luma histogram (506), and the filtering of the luma and chroma components (512, 516) are done in real time during the active video window.
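The division of work between the active video window and the blanking interval can be modeled in software roughly as follows. This is a simplified sketch under stated assumptions: each frame is a flat list of 8-bit luma values, chroma is omitted, the first frame passes through unchanged, and `example_rule` is a hypothetical stand-in for the look-up that sets the black level and LMF; none of these details come from the timing diagram itself.

```python
def frame_mode(luma_values, bins=256):
    """Histogram the frame and return its mode (done during active video)."""
    hist = [0] * bins
    for y in luma_values:
        hist[y] += 1
    return hist.index(max(hist))


def example_rule(mode):
    """Hypothetical parameter choice: (black-level shift, LMF) from the mode."""
    return (8, 1.15) if mode < 128 else (0, 1.0)


def process_stream(frames, choose_parameters, initial=(0, 1.0)):
    """Enhance each frame using parameters derived from the previous frame."""
    params = initial  # pass-through until statistics exist
    enhanced = []
    for frame in frames:
        shift, lmf = params  # fixed in the blanking interval before this frame
        enhanced.append(
            [max(0, min(255, round((y - shift) * lmf))) for y in frame]
        )
        # Statistics gathered while this frame is active feed the next frame.
        params = choose_parameters(frame_mode(frame))
    return enhanced


frames = [[40, 40, 50], [40, 40, 50]]
result = process_stream(frames, example_rule)
```

Because the parameters lag by one frame, the filtering of a frame never has to wait for that frame's own histogram to complete, which is what allows the per-pixel multiplication to run in real time.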
Along those lines, Fig. 7 illustrates a timing diagram 700 for enhancing a series of interrelated frames, as would be found in a digital video, in accordance with an embodiment of the invention. It should be noted that the frame rates shown are exemplary only and should not be considered as limiting the scope of the invention. By way of example, the frame rates for NTSC and PAL systems are 29.97 Hz and 25 Hz, respectively, whereas for Advanced Television Systems Committee (ATSC) digital video, also referred to as HDTV, the following frame rates apply: 23.976 Hz, 24 Hz, 29.97 Hz, 30 Hz, 59.94 Hz, and 60 Hz. As with the timing diagram 600 presented in Fig. 6, the setting of the black level (508), the setting of the LMF (510) and the setting of the CMF (514) for the pixels in frame 1 are performed in the blanking interval for the scan line 16. In the described embodiment, these operations performed in the blanking interval associated with the scan line 16 are based upon luma components stored in the SRAM 410 that are derived from pixels from a previous scan line. In some embodiments, however, the SRAM 410 and the SRAM 418 are arranged to store historical luma and chroma components for a predetermined number of scan lines.
As with the timing diagram 600, the creation of the luma histogram (504), the determination of the mode of the luma histogram (506), and the filtering of the luma and chroma components (512, 516) associated with the pixels of the scan line 16 are performed in real time during the active video window associated with the scan line 16.
Fig. 8 illustrates a hardware implementation of a real-time image processor system 800 in accordance with an embodiment of the invention. It should be noted that the real-time image processor system 800 is one embodiment of the real-time image processor system 200 shown in Fig. 2. The real-time image processor system 800 includes an A/D converter 802 that, in the described embodiment, takes the form of a SAA7111A A/D converter unit manufactured by the Philips Corporation connected to the analog video source 204. The SAA7111A A/D converter 802 is in turn connected to a field programmable gate array (FPGA) configured to operate as an image processing engine 804. In the described embodiment, the FPGA is product type XC4044XL manufactured by the Xilinx Corporation of San Jose, CA. As is well known in the art, the XC4044XL can be configured to function as the image processing engine 804 based upon user supplied configuration data. For example, based upon such configuration data, certain of the programmable gates included in the XC4044XL are arranged to operate as the video signal selectors 210 and 214, thereby reducing the number of individual components. Other portions of the configuration data cause certain other of the programmable gates to operate as the luma controller 404, the luma histogram processor 406, the luma processor 408, the chroma processor 412, as well as the chroma controller 416. It should be noted that, since the XC4044XL (as well as any FPGA and other programmable type devices) is capable of being re-configured as needed, the real-time image processor system 800 can be coupled to an input device, such as a data bus or an EEPROM, arranged to provide updated configuration data. In this way, the image processing engine 804 can be re-configured, or updated, as may be required by the contemplated image processing task such that the amount of time and expense in adding new or additional features is substantially reduced.
Figures 9A and 9B show an unprocessed color digital image 900 and a corresponding luma histogram 950 in accordance with an embodiment of the invention. As shown, the horizontal axis of the histogram 950 represents luminance component (luma) values ranging from absolute black (luma bin = 0) to absolute white (the glare limit) at luma bin = 255. The mode of the histogram 950 is determined to be approximately equivalent to a luma bin equal to 40. It should be noted that the large spike in the lower luma bin region corresponds to an overall dark appearance of the unprocessed digital image 900.
Figure 10 illustrates an intermediately processed digital image 1000 and a corresponding luma histogram 1050 in accordance with an embodiment of the invention. As shown, the processed image 1000 has had the corresponding luma histogram 1050 shifted approximately 32 luma bins to the left. In this way, a new black level is set depending upon the value of the mode of the unprocessed image 900. In this case, all information from the digital image 900 below luma bin 32 is lost in the shifting process that forms the digital image 1000. By shifting the luma histogram 950 to form the luma histogram 1050, the large luma bin spike evident in the luma histogram 950 is effectively eliminated. By shifting the luma histogram to the left, the intermediately processed image 1000 appears darker than the unprocessed image 900.
Figure 11 is an exemplary further processed digital image 1100 and a corresponding luma histogram 1150 in accordance with an embodiment of the invention. The digital image 1100 represents the process of multiplying each luma component of the unprocessed digital image 900 by a luma multiplication factor LMF that is based upon the mode of the histogram 950. In this example, the LMF has been determined to be approximately a value of 1.25. As shown, the digital image 1100 appears to be considerably brighter than the unprocessed digital image 900 since the histogram 950 is now "stretched out" to include luma bin values heretofore not included in the histogram 950. As seen in the histogram 1150, the upper luma bin values now extend to the glare limit corresponding to the luma bin = 255. It should be noted, however, that by stretching out the luma histogram 950 by multiplying each luma bin by the LMF = 1.25, all information heretofore in the range extending approximately from luma bin 205 to luma bin 255 is lost since, in the luma histogram 1150, this range of luma bins has been stretched beyond the glare limit.
Figure 12 illustrates an exemplary processed digital image 1200 based upon the image 1000 shown in Figure 10. The image 1200 illustrates the effect of multiplying the shifted luma histogram 1050 corresponding to the digital image 1000 by the luma multiplication factor LMF equal to 1.25. As noted, the multiplication by the LMF stretches the luma histogram 1050 by the value corresponding to the luma multiplication factor, thereby making the digital image 1200 appear to be more natural than the digital image 1000. By way of example, the digital image 1200 appears brighter and more natural than does the digital image 1000 and even more so than the original unprocessed digital image 900. This corresponds to the upper portion of the corrected curve 476 shown in Figure 4D and more particularly to the region 478 where the human eye is most responsive.
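The clipping figures quoted for Figures 11 and 12 can be checked with a small calculation. The sketch below is illustrative only; the function name is hypothetical, and it assumes the output is simply clipped at the glare limit of 255.

```python
def highest_unclipped_input(lmf, shift=0, white=255):
    """Largest input luma whose output (y - shift) * lmf stays at or below white."""
    return min(white, int(white / lmf) + shift)


# Multiplication alone (Figure 11): inputs above this value exceed the glare limit.
limit_without_shift = highest_unclipped_input(1.25)

# Shift by 32 bins first, then multiply (Figure 12).
limit_with_shift = highest_unclipped_input(1.25, shift=32)
```

With an LMF of 1.25 and no shift, the limit works out to 255 / 1.25 = 204, consistent with the statement that information from roughly luma bin 205 upward is lost. Subtracting 32 before multiplying raises the limit to 236, so setting the black level first preserves more of the highlight range at the cost of the values below bin 32.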
Figure 13 is an exemplary digital image 1300 in accordance with an embodiment of the invention. The digital image 1300 illustrates the effect of utilizing a chroma multiplication factor CMF empirically derived, in this example, to be approximately the same as the luma multiplication factor LMF. In this example, both chroma components Cr and Cb associated with each of the pixels that form the digital image 900 are multiplied by the CMF of 1.25. In this way, the color balance of the digital image 900 is improved as shown by the digital image 1300.
Figure 14 is an exemplary final processed digital image 1400 in accordance with an embodiment of the invention. The digital image 1400 includes all the processing discussed above. In other words, the black level is set by shifting the luma histogram left an amount determined by the mode of the luma histogram (in this case 32 luma bins), the luma components of the shifted luma histogram are then multiplied by the LMF, and the chroma components are processed by the CMF. As can be seen, the final processed digital image 1400 is more natural appearing (i.e., brighter and more detailed in the region that the human eye is most responsive), has greater contrast enhancement, and better color balance than the original unprocessed image 900.
While the present invention has been described as being used with a digital video system, it should be appreciated that the present invention may generally be implemented on any suitable digital image system. Specifically, the methods of using a contrast enhanced histogram based filtering scheme may generally be implemented in any imaging system without departing from the spirit or the scope of the present invention. By way of example, the invention can be used to provide enhanced MRI images, X-ray images and the like for those applications in which enhanced features are advantageous. Therefore, the present examples are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.