CROSS REFERENCE TO RELATED APPLICATION
This application claims the benefit of provisional application Ser. No. 60/129,041 filed Apr. 13, 1999.
FIELD OF THE INVENTION
The present invention relates to methods for enhancing digital color images. More particularly, the method applies a scaling function to vary the strength of RGB dots within an image without exceeding the inherent dynamic range for the image.
BACKGROUND OF THE INVENTION
If one were to photograph the stained glass windows of a cathedral, the resulting image would typically be too dark for modern tastes. The beauty of the actual stained glass is lost. Further, a photographer can increase the depth of field of an image by reducing the device's aperture, but light collection is sacrificed, which can also result in a dark image. Further, as a still image device is capable of recording the image using only one predetermined aperture and exposure time, images having a large disparity between light and dark force a compromise: the exposure is set for either the dark areas or the light areas in an attempt to properly reveal the detail therein. Images resulting from any of the above can benefit from image enhancement.
The usual response by the viewer is to wish to brighten darkened areas and thus reveal the details otherwise obscured to the naked eye. Current techniques produce one of two undesirable effects when brightening: either the colors are washed out into near grays (imagine a nearly black and white and badly faded stained glass window); or the brighter areas become very distorted in terms of color and some are completely washed out. The distorted areas tend to be the ones of the most interest, such as the details of a face.
Conventional digital image processing has these same problems when a substantial level of brightening is required. The ‘standard’ enhancement is to brighten the image (fade it out) and then afterwards, amplify the colors. This solution is only an approximate solution and only works on images that aren't too bad in the first place. The image becomes badly color-distorted if a substantial amount of brightening is used.
In U.S. Pat. No. 5,982,926 to Kuo et al. (“Kuo”), Kuo suggests that a color image, particularly one originating from video, can be enhanced much more effectively by first transforming the RGB color space to HSV color space. All operations thereafter are performed on the HSV transformed color space. Once in HSV color space, Kuo then isolates and removes the color information (Hue) from the remaining image components (saturation and intensity). Kuo suggests that the components of saturation and intensity can be enhanced without introducing distortion into the color or Hue component. Kuo's color image is represented by a plurality of pixels in HSV color space. Once transformed, Kuo inverse transforms HSV back to RGB color space, all the while claiming this to be efficient. In the preferred embodiment, Kuo adjusts intensity (V) and saturation (S). In summary, Kuo first transforms RGB to HSV color space, applies two sequential transformation functions to V and S respectively, and finally inverse transforms the altered HSV back to RGB for display.
Respectfully, Applicant asserts that manipulation of the saturation does affect the color and thus the Kuo technique does not result in true color enhancement. Color photos have three degrees of freedom, being R, G and B, whereas black and white photos have only one. Any transformation of the three values results in three more values, each of which contains a color component and not merely a single color value (i.e. hue) and two other independent structural components (i.e. saturation and intensity). Hue may be a more extreme sense of color than saturation, but saturation is still a color component. Adjusting a dot's saturation results in a change in the proportions of RGB in the dot and hence its color. If the saturation is changed, say as part of brightening the image, the result is not the same as if the recording device or camera had obtained the image directly from the real world subject under brighter conditions, or more exposure.
Further, note that Kuo emphasizes and attempts to minimize the computational overhead or expense. Unfortunately, Kuo introduces two RGB-HSV and HSV-RGB transformations in addition to whatever adjustments (preferably two) Kuo makes to the HSV pixel. A transformation from RGB to HSV color space, and back again, involves the use of computation-intensive mathematical functions.
Accordingly, there is demonstrated a need for a computationally efficient process which is capable of maintaining true color for each dot during enhancement of an image.
The present invention addresses these problems by providing a technique where, no matter how much image-brightening is needed or what the nature of that brightening is, the color of every dot in the image is preserved in all circumstances.
SUMMARY OF THE INVENTION
The effect of the present invention is to virtually amplify the light captured by the digital image recorder. This means that once the captured image is processed using the present invention, each dot within the processed image is modified to be the same as if the digital image recorder had used a different light gathering power or procedure for that dot, including simulating the use of a larger aperture or a longer light gathering duration. This modified light gathering procedure may be applied uniformly across the entire processed image or may vary from dot to dot. By ensuring that the dynamic range of the digital recorder is never exceeded, by operating in the primary RGB color space, and by identically treating each of R, G, and B in a color dot, the virtual light amplification process preserves the true color of the original image.
In a preferred embodiment, and remaining in RGB space, a triplet of RGB values is extracted for each dot of the digital image. The maximum of the RGB triplet is determined for each dot. The maximum of all of the dot maximums, or an image maximum, is determined and a scaling function is defined which provides scaling factors for each dot maximum. The scaling factors, or a function defining same, are determined and applied to the values of the dot maximums and most particularly, the image maximum, so that no resulting value exceeds the known dynamic range of the system. The very same scaling factor, applied to a dot maximum, is also applied to each of the remaining R, G or B of that dot so as to maintain the original proportions or ratios between R,G and B, and thereby maintain the true color.
The enhancement is very efficient, only requiring three simple multiplications, by the same scaling factor, to enhance each color dot. Further, and more preferably, by forming a look-up table of scaling factors for the determined dot maximums, the calculation of each scaling factor is performed only once. Typically the dynamic range is 0 to 255 (a maximum of 256 different light strengths) and therefore, for a usual image having in the order of 250,000 dots, each strength is shared by about 1000 dots on average. The same scaling factor can thus be applied to an average of about 1000 dots, saving that many calculations. Larger images, which are becoming common, will save even more calculations.
Accordingly in a broad aspect of the invention a method is provided for adjusting a digital image without introducing color distortion. The image is formed of a plurality of color dots, each dot having at least three independent values representing the strength of the three primary colors R, G, and B. Each of the RGB values lies between a minimum and a maximum of the dynamic range for the system.
The method comprises the steps of:
determining the dot maximum of the three RGB values in RGB color space;
applying a predetermined scaling factor to each dot maximum, the scaling factor being such that the scaled dot maximum is always less than or equal to the maximum of the dynamic range of the system; and
applying the same predetermined scaling factor used for a dot maximum to each of that dot's two remaining R,G or B values.
When adjusted in the above method, the adjusted image comprises a plurality of new scaled RGB values for each dot wherein the ratios between R,G and B for a dot remain the same after scaling as they were before scaling—thereby maintaining true color.
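The steps above can be sketched for a single dot as follows. This is only an illustrative sketch in Python, not the code of the Figures: the function name `scale_dot`, the parameter `range_max`, and the choice to cap an over-large factor at `range_max / dot_max` are assumptions made for the example. Values are kept as floating point so that the ratios stay exact; rounding to integers for storage would perturb them slightly.

```python
def scale_dot(rgb, factor, range_max=255):
    """Scale one (R, G, B) dot by a single factor, keeping its maximum in range."""
    dot_max = max(rgb)                      # step 1: the dot maximum
    if dot_max == 0:
        return (0.0, 0.0, 0.0)              # a black dot stays black
    # Assumption: cap the factor so the scaled dot maximum never exceeds range_max.
    effective = min(factor, range_max / dot_max)
    # Steps 2 and 3: the same factor multiplies the dot maximum and the other two values.
    return tuple(channel * effective for channel in rgb)


doubled = scale_dot((50, 30, 20), 2.0)      # well inside the dynamic range
capped = scale_dot((50, 30, 20), 6.0)       # requested factor is too large and is capped
```

In both calls the green/red and blue/red ratios of the result remain 0.6 and 0.4, which is the property the method relies on.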
Preferably the scaling factors are obtained from a continuous scaling function. The scaling function normalizes at least a portion of the range of the image to a portion of the dynamic range without ever exceeding the maximum of the dynamic range. Because the dynamic range contains far fewer distinct values than there are dots in an image, computational efficiency is improved by first establishing a look-up table of scaling factors.
The preferred scaling function is a lazy “S” curve form which produces an aesthetically pleasing enhancement of most digital images.
In another preferred embodiment, one can select only a portion of the image, usually an underexposed area, and normalize the subset image range to substantially the entire dynamic range of the system for enhancing the detail therein, all without ever exceeding the system's dynamic range for any dots and while maintaining true color by being mindful of the ratios of R, G, and B for each dot.
BRIEF DESCRIPTION OF THE DRAWINGS
In the Figures, several digital images are presented. As this application is directed to the enhancement of color digital images without suffering a distortion in color, the results cannot be properly reproduced herein in the gray scale printing medium. Original color images have been provided to the respective Patent Offices.
FIG. 1 is a graph representing a linear function as the basis for correcting each R,G or B dot maximum as input to an adjusted output dot maximum, both of which are constrained to the system's dynamic range. This particular function would be a unity function, and would not perform any correction unless the input is normalized to the dynamic range by pre-scaling the maximum of the dot maximums to 1.0;
FIG. 2 is a brief coding example in Visual Basic for reading a digital screen image, extracting color dots, finding a dot maximum, applying a correction factor for the dot maximum, applying the correction to the entire RGB values for a dot and writing the corrected color dot back to the screen;
FIG. 3 is a graph illustrating a scaling function designed to modify an image which was intentionally underexposed, such as by using a small aperture so as to obtain improved depth of field;
FIG. 4 is a graph illustrating a scaling function which enhances the contrast within a specific area of the dot maximums, falling between 0.3 and 0.5 of the image's range, by scaling 20% of the range to nearly 100% or substantially the entire dynamic range, and wherein the contrasts of the darker and lighter areas are diminished;
FIG. 5 is a graph according to FIG. 4 which enhances the dot maximums in the dark area falling between 0.1 and 0.2 of the image's range;
FIG. 6 is a graph according to FIG. 4 which enhances the dot maximums in the bright area falling between 0.9 and 1.0 of the image's range;
FIG. 7a is a brief coding example in Visual Basic for using a GUI interface to select an x1,y1 and x2,y2 window area, reading the digital screen image in the window, extracting color dots, finding a dot maximum, applying a correction factor for the dot maximum, and building a histogram of dot maximum occurrences;
FIG. 7b is a brief coding example in Visual Basic for generating the histogram according to the third embodiment.
FIG. 8 is a graph illustrating a variable scaling function superimposed over a unity diagonal, the variable function producing an aesthetically pleasing enhancement through the brightening of the image. The scaling function is a smooth curve, such as a third order curve, which de-emphasizes the darker areas and brightens the lighter areas;
FIGS. 9a to 9f are photographs of an Abbey which are respectively, the original, brightened under the prior art, brightened and contrast adjusted under the prior art, brightened and saturation adjusted under the prior art, enhanced according to the first embodiment of the present invention to the full dynamic range and enhanced according to the second embodiment of the present invention;
FIGS. 10a to 10f are photographs of Stonehenge which are respectively, the original, brightened under the prior art, brightened and contrast adjusted under the prior art, brightened and saturation adjusted under the prior art, enhanced according to the first embodiment of the present invention to the full dynamic range and enhanced according to the second embodiment of the present invention;
FIGS. 11a and 11b are respectively an original photo of a satellite and an enhanced photo according to the third embodiment of the present invention;
FIGS. 12a and 12b are respectively an original photo of a blimp and an enhanced photo according to the third embodiment of the present invention;
FIGS. 13a and 13b are respectively an original photo of a car license plate and an enhanced photo according to the third embodiment of the present invention; and
FIGS. 14a, 14b and 14c are respectively an original photo of skiers and ski tracks in the snow and two enhanced photos according to the third embodiment of the present invention, each using a different portion of the photo to build the enhancement.
FIG. 15a is a flow chart of one embodiment of the invention illustrating determination of scaling factors;
FIG. 15b is a flow chart of another embodiment of the invention illustrating determination of scaling factors and storing them in a lookup table;
FIG. 15c is a flow chart of another embodiment of the invention illustrating determination of scaling factors and application of a function to adjust the color dots;
FIG. 16 is a flow chart of another embodiment of the invention illustrating a normalization process for the dots to at least a portion of the dynamic range such as that set forth in the third embodiment; and
FIG. 17 is a flow chart of another embodiment of the invention illustrating a selection of a portion of the image for adjustment and two implementations of the invention to adjust the image.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
First, an image is captured using some form of digital image recorder. Digital image recorders fall into two categories: physical and virtual. Physical digital image recorders are devices that record a digital image by the measurement of light energy, such as digital cameras. Like a traditional film camera, a digital camera has a ‘lens complex’ that provides light gathering, and the image is recorded by an array of digital sensors so that the value of each dot represents an actual measurement of the light. A digital image can also be obtained as a digital scan of a traditional photograph. In a photograph, light gathering was provided by a traditional camera and the image was recorded on film. Accordingly, a physical digital image recorder can be a combination of the camera that produced the film/print image and a scanner that digitized it. Other examples include digital movies, digitized movies, digital x-rays, and the like.
Virtual digital image recorders are computer renderings that imitate reality. The programs create a ‘virtual image’ (as in virtual reality) by a logical imitation of the photographic process completely internal to the computer itself. These digital images are what a photograph would have looked like had a ‘computer model’ actually existed. An example is those movies with stunning dinosaur simulations.
A lens complex is the apparatus that gathers light in various forms of photography. There is at least one lens and usually a system of such lenses. The lens complex also includes an aperture stop and a shutter which are both controls on the light gathering.
The light gathering power of a lens is often measured in terms of the surface area of the objective lens itself. A lens with twice the area of another can gather twice the light. In practice, a lens complex has two controls on the amount of light actually gathered. The first control is the adjustable aperture which varies the amount of light that is collected per unit time. Twice the area means twice the light per unit time. The second control is called the shutter and it varies the amount of time that light enters the body of the camera. Holding the shutter open twice as long means that twice the light energy enters the camera.
A true color digital image comprises a grid of dots wherein each dot has three independent measured values representing the strengths of the red, green and blue (RGB) components of the light. This is known as RGB color space. There are various computer file formats used for these images, using various ‘compression schemes’ to save computer disk storage space. Regardless of compression scheme, all such digital file formats store a grid of dots with RGB values.
Currently, the common standard maximum value stored by such files or a system, for each of the R,G, or B value, is 255. Accordingly, each of the RGB components can range in strength from 0 to 255. Some file formats now store values in the range of 0 to 1023, and higher formats are likely.
Dynamic Range
Every device, including our digital image recorder, has a ‘dynamic range’ which is a measure of its ability to record relative energy fluctuations. As stated above, this dynamic range is usually set by the storage means, file or system and is typically 0 to 255. In photography, the trick to success is to use all the dynamic range without exceeding it. In a photo, the trick is to capture both the details in bright areas and details in dark areas without a loss of details anywhere.
In film photography, when a significant area of the negative has turned completely opaque it means that the light was so intense in that area that no film crystals were left unchanged. The variations within the washed out area of the photograph are lost and can be said to ‘have exceeded the dynamic range of the system’.
In digital photography, the strength of the light energy is measured by the photo-electric sensors. These values are stored as a true color format computer file. The dynamic range of the system is exceeded when areas of the grid have been set to the maximum (say 255). Accordingly, variations within the bright areas, hypothetically 256 to 300, can only be recorded as 255 and thus details within such areas are lost.
With a limited dynamic range, as is the case with digital images, the amount of light which is captured can significantly affect the image. Consider if one image is obtained containing one particular dot of light which is measured within the dynamic range of the recording device. For example, the light may come from a brown surface. The light dot is measured in terms of three color strengths, the red (R), green (G) and blue (B). If a second image is obtained having had double the exposure time, then twice as much light will go into the recording device for each and every dot, including the dot we are considering. If the range of light remains within the dynamic range of the system, then all the three values of R, G, and B will be doubled—yet the color of the original brown dot remains as brown. Doubling of the incoming light means its strength measurement will be doubled—R is doubled, G is doubled and B is doubled.
In the case of digital information, for example, upon a doubling of the light (through a larger aperture or longer exposure), relative RGB values are also doubled. RGB values of 50,30,20 are doubled to 100,60,40, because 100=50*2, 60=30*2, and 40=20*2.
When twice as much energy hits each sensor, then each sensor has twice the stimulation. Twice as much red energy hits the red sensor and the other energy doesn't matter to it. Twice as much green energy hits the green sensor. And twice as much blue energy hits the blue sensor.
As the light gathering power increases, the three measurements of the primary colors of the same dot increase proportionally as shown in the following Table 1—hence the color remains the same, only brighter.
TABLE 1
Effect of Increasing Light Gathering Power on Measurements of RGB

| Light Gathering Power | Description | Red | Green | Blue | Green/Red | Blue/Red |
|---|---|---|---|---|---|---|
| 1 | Reference, one unit | 50 | 30 | 20 | 0.600 | 0.400 |
| 1.1 | Up 10% | 55 | 33 | 22 | 0.600 | 0.400 |
| 1.2 | Up 20% | 60 | 36 | 24 | 0.600 | 0.400 |
| 1.3 | Up 30% | 65 | 39 | 26 | 0.600 | 0.400 |
| 1.4 | Up 40% | 70 | 42 | 28 | 0.600 | 0.400 |
| 1.5 | Up 50% | 75 | 45 | 30 | 0.600 | 0.400 |
| 2 | Double | 100 | 60 | 40 | 0.600 | 0.400 |
| 2.5 | Two and a half times | 125 | 75 | 50 | 0.600 | 0.400 |
| 3 | Triple | 150 | 90 | 60 | 0.600 | 0.400 |
| 4 | Four times | 200 | 120 | 80 | 0.600 | 0.400 |
| 5 | Five times | 250 | 150 | 100 | 0.600 | 0.400 |

In Table 1, note that the ratio of Green/Red and the ratio of Blue/Red remain constant, regardless of the light gathering power.
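The rows of Table 1 can be regenerated with a few lines of code. The sketch below is illustrative only; the names `reference`, `powers` and `expose` are hypothetical, and rounding to whole numbers is an assumption that mimics integer storage.

```python
reference = (50, 30, 20)   # R, G, B of the example dot at one unit of light gathering power
powers = [1, 1.1, 1.2, 1.3, 1.4, 1.5, 2, 2.5, 3, 4, 5]


def expose(rgb, power):
    """Scale every channel by the same light gathering power (no clipping applied)."""
    return tuple(round(channel * power) for channel in rgb)


rows = []
for power in powers:
    r, g, b = expose(reference, power)
    # Each row: power, R, G, B, Green/Red, Blue/Red
    rows.append((power, r, g, b, round(g / r, 3), round(b / r, 3)))

for row in rows:
    print(row)
```

Every row carries the same two ratios, because each of the three values was multiplied by the same number.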
The reference level for light gathering power is artificial, a matter of convenience. What is the ‘correct’ measurement of the color of the dot of Table 1? An image collected from an overcast outdoors environment may measure the color dot at 50,30,20 and the color dot measured in a bright indoor setting could be 200,120,80 utilizing four times the light gathering power.
While it is possible to calibrate light gathering power in terms of the energy that is collected, it is rarely done in practice. The light gathering power of our own eyes naturally varies. City lights that seem so very bright at night appear dull in daylight because the eye's iris automatically opens at night and then closes during the day, as is required to deal with the natural extreme variations in light level. Applicant is not aware of a recording device that has a wide enough dynamic range to handle the range from natural daylight to artificial city night lighting with the same light gathering settings. What makes a good image is, in part, that the light gathering ability is varied with the light—so that the recording is not pushed outside of the dynamic range. Both measurements of 50,30,20 and 200,120,80 for the same color or dot are valid.
For any set of three numbers, such as the intensity values for each of Red, Green, and Blue, there are a total of six possible ratios and they are G/R, B/R, G/B, R/G, R/B, and B/G. Only two of the ratios are unique; the other four ratios are redundant as they can be derived from the first two.
For example, based upon G/R and B/R, the others are:
green/blue = (green/red)/(blue/red)
red/green = 1/(green/red)
red/blue = 1/(blue/red)
blue/green = (blue/red)/(green/red)
In principle, any two of the six can be chosen. For the purposes of this description, the ratios of green/red and blue/red are chosen.
An efficient choice for the value of the strength of the RGB triplet would be to simply take the maximum value from amongst the three RGB values. In the case of 50,30,20 for RGB respectively, red happens to be the maximum value and the calculation of strength and the two ratios becomes:
Strength = red = 50
Ratio1 = green/red = 30/50 = 0.600
Ratio2 = blue/red = 20/50 = 0.400
In reverse, one can back-calculate and recover the red, green and blue values as follows:
red = Strength = 50
green = Strength * Ratio1 = red * green/red = 50 * 0.600 = 30
blue = Strength * Ratio2 = red * blue/red = 50 * 0.400 = 20
This artificial representation of the RGB triplet is useful because, for any dot, the two ratios are independent of the light gathering power. The amount of light gathered only affects the strength component.
Too much light gathering will cause the measurement of the color to become distorted because at least one of the three primary colors RGB will exceed the dynamic range and thus will not be accurately represented. Consider the previous example dot of 50, 30, 20 at some arbitrary reference level of light gathering power. If the image is re-recorded, but at a much higher light gathering power, then at least one color will be pushed beyond the dynamic range.
TABLE 2
Increasing Light Gathering Power beyond the Dynamic Range of System

| Light Gathering Power | Description | Red | Green | Blue | Green/Red | Blue/Red |
|---|---|---|---|---|---|---|
| 1 | Reference | 50 | 30 | 20 | 0.600 | 0.400 |
| 3 | Triple | 150 | 90 | 60 | 0.600 | 0.400 |
| 5 | Five times | 250 | 150 | 100 | 0.600 | 0.400 |
| 5.1 | 5.1 times | 255 | 153 | 102 | 0.600 | 0.400 |
| 5.2 | 5.2 times | 255 | 156 | 104 | 0.612 | 0.408 |
| 6 | Six times | 255 | 180 | 120 | 0.706 | 0.471 |
| 7 | Seven times | 255 | 210 | 140 | 0.824 | 0.549 |
| 8 | Eight times | 255 | 240 | 160 | 0.941 | 0.627 |
| 9 | Nine times | 255 | 255 | 180 | 1.000 | 0.706 |
| 10 | Ten times | 255 | 255 | 200 | 1.000 | 0.784 |
| 11 | Eleven times | 255 | 255 | 220 | 1.000 | 0.863 |
| 12 | Twelve times | 255 | 255 | 240 | 1.000 | 0.941 |
| 13 | Thirteen times | 255 | 255 | 255 | 1.000 | 1.000 |
| 1 | Reference, one unit | 50 | 30 | 20 | 0.600 | 0.400 |

Notice that neither the green/red ratio nor the blue/red ratio remains constant past 5.1 times the light gathering power, because the measurements have saturated. Because there is a maximum amount of energy that can be measured by the recording device (either digital sensors or film) there is a practical limit to the useful light gathering power for any subject. In Table 2, when the light gathering power reaches five times the reference power, the dynamic range is almost fully used. The strength of the dot is 250 and the maximum that can be stored is 255. That maximum is reached, exactly, at 5.1 times the light gathering power. At 5.1 the strength of the dot is 255 yet the two ratios are still 0.600 and 0.400 respectively. At 5.2 times the light gathering power, even more light enters the recording device. The red sensor would have stored the number 260 but it cannot because the dynamic range is surpassed and the value is clipped to 255 instead. The correct numbers are stored for both the green and blue sensors; however, the incorrect total color is recorded and this is revealed by noticing that the two ratios now vary from the proper ratios of 0.600 and 0.400. At 6 times the light gathering power, even more light enters and the distortions are correspondingly greater. The ratios are now significantly different and incorrect at 0.706 and 0.471 and these correspond to a significantly different color. At 8.5 times the light gathering power (30 × 8.5 = 255), the green sensor also measures clipped values. The light is recorded as having the same strength in both the red and green primary colors. The green to red ratio is now 1.000 and the previous distorted color (reddish brown) now further distorts to an orange. Finally, at 12.75 times the light gathering power (20 × 12.75 = 255), all three sensors clip and the color is recorded as three full strength primary colors, meaning white.
It is important to avoid collecting light beyond the ability of the recording system (its dynamic range) because it causes distortion in the color.
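The clipping behavior tabulated in Table 2 can be simulated directly. The following is a sketch under stated assumptions: the names `record` and `clip_thresholds` are hypothetical, the sensor is modeled as an ideal linear device that rounds to whole numbers, and 255 is taken as the top of the dynamic range.

```python
RANGE_MAX = 255


def record(rgb, power, range_max=RANGE_MAX):
    """Simulate a sensor: scale each channel, then clip at the top of the dynamic range."""
    return tuple(min(round(channel * power), range_max) for channel in rgb)


def clip_thresholds(rgb, range_max=RANGE_MAX):
    """Light gathering power at which each channel first reaches range_max."""
    return tuple(range_max / channel for channel in rgb)


dot = (50, 30, 20)
for power in (5, 5.1, 5.2, 6, 9, 13):
    r, g, b = record(dot, power)
    # Once red clips, green/red and blue/red drift away from 0.600 and 0.400.
    print(power, (r, g, b), round(g / r, 3), round(b / r, 3))

thresholds = clip_thresholds(dot)   # one threshold each for red, green and blue
```

The three thresholds show the order in which the channels saturate: the strongest channel clips first, and the color is reported as white once the weakest channel has clipped as well.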
Any one of the RGB triplet can be chosen as the reference color. If green was the strongest color, and assuming the color was 30,50,20 (having the same reference light gathering power as the previous red example), then having reference to Table 3, the same behavior is exhibited.
TABLE 3
Effect of Increasing Light Gathering Power with Green as strongest color

| Light Gathering Power | Description | Red | Green | Blue | Green/Red | Blue/Red |
|---|---|---|---|---|---|---|
| 1 | Reference | 30 | 50 | 20 | 1.667 | 0.667 |
| 3 | Triple | 90 | 150 | 60 | 1.667 | 0.667 |
| 5 | Five times | 150 | 250 | 100 | 1.667 | 0.667 |
| 5.1 | 5.1 times | 153 | 255 | 102 | 1.667 | 0.667 |
| 5.2 | 5.2 times | 156 | 255 | 104 | 1.635 | 0.667 |
| 6 | Six times | 180 | 255 | 120 | 1.417 | 0.667 |
| 7 | Seven times | 210 | 255 | 140 | 1.214 | 0.667 |
| 8 | Eight times | 240 | 255 | 160 | 1.063 | 0.667 |
| 9 | Nine times | 255 | 255 | 180 | 1.000 | 0.706 |
| 10 | Ten times | 255 | 255 | 200 | 1.000 | 0.784 |
| 11 | Eleven times | 255 | 255 | 220 | 1.000 | 0.863 |
| 12 | Twelve times | 255 | 255 | 240 | 1.000 | 0.941 |
| 13 | Thirteen times | 255 | 255 | 255 | 1.000 | 1.000 |

As shown, the same situation occurs wherein the green/red and blue/red ratios become distorted in a very similar manner. In this case, the ratios red/green and blue/green produce the same numbers (0.600 and 0.400) as in the previous 50,30,20 illustration for the case of red.
Accordingly, regardless of which of the RGB triplet is the maximum, there is a symmetry and strength is defined as the maximum strength value of (Red, Green, and Blue) and color ratios result (red/Strength, green/Strength, blue/Strength). One of the three ratios will be exactly equal to 1 because strength is always equal to one of the numerators. In the case of Red from Table 1, RGB=50,30,20 so that strength=max of (50,30,20) which is 50 and the color ratios are (50/50, 30/50, 20/50)=(1, 0.600, 0.400).
Thus, for Red, Green and Blue respectively:
| Color | Strength | R/S | G/S | B/S |
|---|---|---|---|---|
| 50, 30, 20 | Red | 1 | 0.6 | 0.4 |
| 30, 50, 20 | Green | 0.6 | 1 | 0.4 |
| 30, 20, 50 | Blue | 0.6 | 0.4 | 1 |

As considered before, when a camera gathers more light compared to another setting for the same image, a given dot has the qualities that the ratios of the measured primary colors remain the same—if we are within the dynamic range of the recording system. Instead of thinking of the dot as an RGB triplet we can think of the dot as a strength and ratios. Again, for our dot, 50,30,20 at its reference strength of 50, the color ratios are 1, 0.600 and 0.400.
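The strength-and-ratios view of a dot can be written out as a pair of helper functions. This is a minimal sketch; `decompose` and `recompose` are hypothetical names, and the handling of an all-zero dot is an assumption added to avoid division by zero.

```python
def decompose(rgb):
    """Split a dot into its strength (the dot maximum) and its three color ratios."""
    strength = max(rgb)
    if strength == 0:
        return 0, (0.0, 0.0, 0.0)          # assumption: a black dot has no defined ratios
    return strength, tuple(channel / strength for channel in rgb)


def recompose(strength, ratios):
    """Rebuild R, G, B from a strength and the color ratios."""
    return tuple(strength * ratio for ratio in ratios)


strength, ratios = decompose((50, 30, 20))   # strength 50, ratios 1, 0.6, 0.4
original = recompose(strength, ratios)       # back to the original dot
brighter = recompose(2 * strength, ratios)   # same color at twice the strength
```

Changing only the strength and leaving the ratios alone is what keeps the color unchanged; the three orderings in the table above all decompose to ratios of 1, 0.6 and 0.4 in some order.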
Enhancing the Image
As long as the color ratios are unaltered, the image can be adjusted without adversely affecting the colors. For instance, should insufficient light have been gathered by the recording device, we can virtually amplify the light power or strength while maintaining the color.
To virtually amplify or scale the light by a correction factor of 2, the strength is doubled without varying the color ratios. For example, doubling the strength of our dot of (50,30,20) to 100 results in a color dot of 50 * 2 * (1, 0.6, 0.4) = (100, 60, 40). The same result can be achieved by simply multiplying each of R, G, and B by 2.
This virtual true color light amplification is accomplished by multiplying or scaling all three primary colors by the same number. No colors become distorted, as long as the output values stay within the dynamic range. The dynamic range is exceeded if a R, G, or B value is calculated that is greater than the range used by the file format for the image (such as 255).
One can ensure that no value, resulting from the triple multiplication scaling, can exceed the dynamic range, by deriving the correction from an algebraic scaling expression or function.
Having reference to FIG. 1, the X-axis represents the strength of an image dot (maximum of red, green, and blue). The scales of 0 to 1 represent the limits of the dynamic range (such as 0 to 255). For the linear diagonal shown, a strength of 50 (50/255≈0.2) for a dot is scaled by unity for an output value of 0.2. Accordingly, the unity diagonal corresponds to a ‘no operation’ situation where the output is identical to the input. However, once the scaling function deviates from unity, the output values will be different from the input values, resulting in a change to the image.
To scale the dynamic range from 0 to 1.0 is to simply divide a current strength (maximum of the RGB triplet) value by the maximum number that can be stored with this dynamic range. Suppose that number is 255. An arbitrary dot will have a dot maximum strength having a number between 0 and 255. To scale the dot maximum to between 0 and 1.0 one divides the strength by 255.
Both the input and the output axis represent the values of the maximum of the RGB triplet; the dot maximum. The input is the dot maximum under consideration. The output corresponds to the adjusted dot maximum for the RGB triplet that will be calculated as a result of the method.
Dot Maximum
One method to find the maximum of an input RGB triplet is to choose one as maximum, testing each of the others, and resetting the maximum to that other if higher.
The following 3 lines of pseudo code illustrate the selection of a dot maximum, each line being followed by its explanation:
strength=red
strength is set to red value
IF strength<green THEN strength=green
if strength is less than green value then reset it to the green value
IF strength<blue THEN strength=blue
if strength is less than blue value then reset it to the blue value
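The pseudo code above translates directly into the following Python sketch, offered only as an illustration. The function names dot_maximum and normalized_strength, and the default of 255 for dynamic_max, are assumptions of this sketch.

```python
def dot_maximum(red, green, blue):
    """Return the strength of a dot: the largest of its three primaries."""
    strength = red              # start by assuming red is the maximum
    if strength < green:        # reset to green if green is higher
        strength = green
    if strength < blue:         # reset to blue if blue is higher
        strength = blue
    return strength


def normalized_strength(red, green, blue, dynamic_max=255):
    """Scale the dot maximum to the 0 to 1.0 axis used by the scaling graph."""
    return dot_maximum(red, green, blue) / dynamic_max


print(dot_maximum(50, 30, 20))  # 50
```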
Having reference to FIG. 1, any scaling or correction is constrained to the 1×1 graph shown. The input axis is constrained to the domain from 0 to 1 and the output axis is constrained to the range of 0 to 1. This means that the strength of an adjusted or corrected dot will not exceed the dynamic range.
Any scaling function that can be plotted within the constrained graph can be used for virtual true color light amplification. The properties of a particular graph will affect the final aesthetics and application. A particular function is chosen as appropriate for the application; whether it be to adjust the brightness of an entire image, or a portion of the image, or other adjustment.
Two implementations of the scaling function correction are possible: forming a lookup table of corrections (a finite number dictated by the dynamic range); or, less efficiently, calculating the correction for each dot independently in turn. One can understand that in an image, the value of the strength of a particular dot may be repeated many times for many other dots. Thus, for efficient calculation, in the case of a look-up table, a correction can be calculated only once but applied many times.
For the look-up table approach, once the parameters needed to specify a particular scaling function are known, all the corrections (usually 256 of them, depending on the limit of the dynamic range) are calculated by means of a subroutine and the results stored in a look-up table so that each is only calculated once. The correction is then simply looked up, as in this pseudo-coded example:
corr=corra(strength)
where corr is the particular correction for the current dot;
corra( ) is the lookup table or array that stores the corrections; and
strength is the dot maximum of the RGB for the current dot.
The other (less efficient) way is to have a function calculate the correction for a particular dot, each one in turn. An example would be
corr=correct(strength)
where corr is the particular correction for the current dot;
correct( ) is the correction function, which is executed by simply ‘calling’ its name in this way; and
strength is the dot maximum of the RGB for the current dot.
A given point along the graphed scaling function, which provides an input and output value, is to be used to derive the correction multiplier or factor. A correction factor is equal to output value/input value.
So as to maintain the respective color ratios, all three of the RGB triplet values are multiplied by this same correction corr as determined for and specified by the dot maximum.
Accordingly, red=red*corr; green=green*corr; blue=blue*corr where red, green, blue end up holding the values for the current dot before and after correction. The effect of coupling these four considerations is that of Virtual True Color Light Amplification. The Dynamic Range is never exceeded and the color is always preserved.
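The look-up table approach and the per-dot multiplication can be sketched together as follows. This is a minimal Python illustration and not the code of FIG. 2; the names build_correction_table and correct_dot, and the rounding to whole numbers, are assumptions of the sketch. The scaling function is taken to map a normalized input strength (0 to 1) to a normalized output strength (0 to 1).

```python
def build_correction_table(scaling_function, dynamic_max=255):
    """Pre-compute corr = output / input once for every possible strength."""
    # Strength 0 is pure black; any factor leaves it unchanged, so 1.0 is stored.
    table = [1.0] * (dynamic_max + 1)
    for strength in range(1, dynamic_max + 1):
        x = strength / dynamic_max          # normalized input strength
        y = scaling_function(x)             # normalized output strength
        table[strength] = y / x             # correction factor for this strength
    return table


def correct_dot(red, green, blue, table):
    """Multiply all three primaries by the correction looked up for the dot maximum."""
    dynamic_max = len(table) - 1
    corr = table[max(red, green, blue)]
    # Rounding to integers is an assumption; min() only guards against rounding overshoot.
    return tuple(min(dynamic_max, round(c * corr)) for c in (red, green, blue))


doubling = build_correction_table(lambda x: min(1.0, 2 * x))
print(correct_dot(50, 30, 20, doubling))  # (100, 60, 40)
```

Because the scaling function never returns more than 1.0, the corrected dot maximum cannot exceed the dynamic range, whichever function is supplied.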
Practical Implementation
An image can be read in various ways. Applicant has avoided the need to review the various graphical computer file formats by illustrating the method on a displayed image. Applicant is aware that, currently, Visual Basic (a programming language operable under the Windows operating system—all trademarks of Microsoft Corporation) and most other modern programming languages have simple commands that allow for image reads. In Visual Basic, one command is pbox.Picture=LoadPicture(file_in) where pbox is a ‘picture box object’, used for displaying pictures, Picture is a ‘method’ which assigns a picture to the object, LoadPicture( ) is the function that reads Picture Files, and file_in is the name of the file that is to be read. Once this command is issued by Visual Basic, the image file is read in and displayed on the screen in the programming ‘tool’ called the ‘picture box object’. There is a similar method to save a picture.
Having reference to FIG. 2, the simplified code illustrates a Visual Basic implementation of the virtual true color light amplification method applied to an image. This simplified technique requires, at a minimum, a 16 bit video card and a 24 bit card is preferable. The code of FIG. 2 is directed to extracting color values from the video card itself. This is not the most efficient technique and could be improved significantly by storing the image in the main RAM memory. This would eliminate accessing the video card at all and eliminate the extraction steps of stripping R, G and B values from a combined color variable such as that returned by Visual Basic function pbox.Point(icol,irow). Accessing memory could result in about a 7 times efficiency gain.
Simply, the process permits a rectangular displayed image of dots to be adjusted. For example, the image may be dark, the largest of its dot maximums being only about 128, or half the dynamic range for the system. In the simplest case, the image range is scaled to the dynamic range as a linear function. Accordingly, by normalizing the maximum of 128 to 255, the strength of all dot maximums will be approximately doubled. The scaling function is then merely a constant of about 2 and the look-up result, for any dot maximum, is about 2. For each column of the image, the values of the color are extracted for blue, green and red. The dot maximum is set as red and the green and blue are tested to reset the strength to the maximum amongst the three. The correction is looked up in the table, in this case being a constant of about 2. Each of the values for RGB is scaled by this factor, the maximum scaled value being about 128*2, or approximately 255, which is the maximum of the dynamic range. The modified dot is written back to the display, all colors having been preserved and without exceeding the dynamic range.
Applications—Effect of the Scaling Function
The virtually infinite selection of scaling functions, as applied to the image, results in different effects to suit differing objectives. Sometimes the quality of the image dictates which function is used (such as a function designed simply for brightening a dark image); at other times a more particular function is chosen, one which enhances only specified strengths within an image (such as extracting detail from a narrow portion with minimal concern regarding the effect on the other portions of the image).
Applications—Virtual Lens
Both the aperture and the shutter cause problems in and of themselves. If the subject of the photograph is moving, the shutter can only be open for a short period of time or the image will be blurred by the motion itself.
When the aperture is opened up, the depth of field (which means the range of distance that is in focus) decreases. Even when focusing correctly, a wide open aperture means that only a small range of distance will be in focus. This ‘distortion’ is due to the spherical shape of the lens itself. When the aperture is small, the depth of field is better because only the nearly flat center of the lens is used. The smallest aperture setting provides the largest depth of field.
In practice, photography (and light gathering in general) is a trade-off between these two effects. A motionless scene can have a large depth of field, by choosing a small aperture and a slow shutter speed. A racing car can only be photographed at the cost of the depth of field: the shutter cannot be open for long and so the aperture must be opened to gather more light.
Suppose that a ‘normal’ quality lens is used to photograph something that is moving quickly but the photographer does not want to lose the depth of field. Without the process of the present invention there is no way to do this. Accordingly, by setting the controls to gather too little light, in terms of normal photographic thinking, the uncorrected image will be too dark but the depth of field is preserved. Suppose the photographer had collected a quarter (25%) of the amount of light that would make full use of the dynamic range. The image will be nearly black, with the highest value recorded being 63 whereas the dynamic range limit is 255. This image can be corrected to bring out the detail in the dark areas.
Accordingly, in a first embodiment, and having reference to FIG. 3, the graph is a straight line which terminates at the point (x, 1.0) where x is the maximum of the entire measured image. This maximum value can be found using a modified histogram approach. In this example, x would be 0.25 but it could be any value between 0 and 1. In the preferred embodiment of the invention, virtual light is added in the same way that opening the aperture more would have except that it will have a depth of field associated with a superior lens. This process can also be used to make up for ‘blunders’ where inappropriate lens settings resulted in a too dark photograph.
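The straight-line graph of this first embodiment amounts to one constant correction, 1/x, applied to every dot. The following Python sketch illustrates this under stated assumptions: the name virtual_lens is hypothetical, the image is represented as a list of (red, green, blue) tuples, the image maximum is found with a plain maximum rather than the modified histogram approach, and results are rounded to whole numbers.

```python
def virtual_lens(dots, dynamic_max=255):
    """Stretch the image so its brightest dot maximum reaches the dynamic range limit."""
    # Strength of the brightest dot in the whole image (x in the text, before normalizing).
    image_max = max(max(dot) for dot in dots)
    if image_max == 0:
        return list(dots)                   # an all-black image cannot be amplified
    corr = dynamic_max / image_max          # equals 1/x for the line ending at (x, 1.0)
    # Every primary of every dot is multiplied by the same factor.
    return [tuple(round(c * corr) for c in dot) for dot in dots]


dark_image = [(63, 30, 10), (20, 20, 20)]
print(virtual_lens(dark_image))  # [(255, 121, 40), (81, 81, 81)]
```

Rounding to integers perturbs the color ratios very slightly; working in floating point until the final write-back would reduce this.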
Virtual Iris
In a second embodiment, and having reference to FIG. 8, a graph can be chosen so that the darkest parts of the image will remain dark, the duller parts will brighten but also so that the bright parts of the image remain nearly unaffected. FIG. 8, and similarly shaped smooth non-linear graphs, have the effect of imitating the iris when used with the virtual true color light amplification of the present invention. The output image happens to more closely resemble one's visual memory of the experience. Applicant refers to this enhancement as “virtual iris”.
The gentle nature of this non-linear graph ensures that a quality input image will result in an attractive processed image. The important aspects for maintaining this aesthetic result are that the graph remains smooth, that the slope of the graph is never zero and changes smoothly, and that there is a net brightening effect in total.
In FIG. 8, the scaling function approaches the asymptote of the minimum and maximum of the system's dynamic range. The more the function approaches a tangent to the minimum and maximum of the dynamic range, the more severe the correction.
Virtual Detail Enhancement
Simply, any one frame of a photo is the result of only one aperture and shutter setting. In investigative work, this has the annoying limitation that details in certain areas of the photograph will be subtle. In a third embodiment, a process is provided for bringing out detail in that certain subtle area.
Having reference to FIG. 4, an area of interest is chosen and a histogram approach is applied to find that the minimum strength dot within the area is 0.30 and the maximum strength dot is 0.50. This area utilizes only 20% of the dynamic range, which means that the contrasts will be subtle—or virtually indistinguishable to the eye.
Applying a three part linear scaling function as shown in FIG. 4 turns these small contrasts into large ones as the output from that specified area now varies over 80% of the dynamic range.
The small sloped lines below 0.30 and above 0.50 on the input have the effect of washing out the detail in darker and brighter parts of the image. But the resulting color, at each dot, will never be corrupted and the brighter parts of the image provide a good reference.
Any area of the photograph can be chosen. Having reference to FIG. 5, details in a very dark area are revealed, such as writing obscured in shadow. FIG. 6 illustrates how to bring out the details in a bright area, such as tracks in the snow. Any number of areas can have their detail enhanced by simply choosing the area of interest and applying the correction.
More particularly, any areas or portions of the image can be optimized. First, the area needs to be identified. In a Graphics User Interface, this is easily done using the mouse in a ‘click and drag’ operation. This can be done in Object Oriented Programming by using the Operating System (Windows) to identify when the mouse button has been clicked. In Visual Basic there are built in subroutines (for every program) that are executed as soon as the mouse button is depressed or released.
A user can select any rectangular area within the image. The ‘coordinates’ are stored in common memory as xdwn, ydwn, xup and yup. See the photographic examples #1 and #2 for the superimposed rectangle on the image.
In both the Virtual Lens and the Virtual Detail Enhancement, reference was made to a ‘modified histogram approach’. A histogram is simply a count of the number of occurrences of each value.
For example, as shown in Table 4, in terms of a set of RGB triplets:
TABLE 4
Standard Histogram Approach
“Red, Green and Blue values taken as independent”

Dot # | Red | Green | Blue
1 | 1 | 1 | 1
2 | 0 | 0 | 1
3 | 2 | 2 | 1
4 | 4 | 3 | 1
5 | 4 | 4 | 4
6 | 3 | 2 | 1
7 | 5 | 1 | 1
8 | 3 | 3 | 1
9 | 4 | 3 | 1
10 | 3 | 3 | 5

Value | # Events
0 | 2
1 | 11
2 | 3
3 | 7
4 | 5
5 | 2
Total | 30

In this example, there are 10 dots each having 3 values (red, green and blue) and each of these 30 values range in value from 0 to 5. The normal histogram is calculated by adding up the number of times each value (0, 1, 2, 3, 4, and 5) occurs in total.
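The counts of Table 4 can be reproduced with a short Python sketch. The dot values are those of the table; the names DOTS and standard_histogram are assumptions introduced for illustration.

```python
# The ten dots of Table 4 as (red, green, blue) triplets.
DOTS = [(1, 1, 1), (0, 0, 1), (2, 2, 1), (4, 3, 1), (4, 4, 4),
        (3, 2, 1), (5, 1, 1), (3, 3, 1), (4, 3, 1), (3, 3, 5)]


def standard_histogram(dots, max_value=5):
    """Count every red, green and blue value independently."""
    counts = [0] * (max_value + 1)
    for dot in dots:
        for value in dot:       # all three primaries contribute one event each
            counts[value] += 1
    return counts


print(standard_histogram(DOTS))       # [2, 11, 3, 7, 5, 2]
print(sum(standard_histogram(DOTS)))  # 30
```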
The modified histogram approach, where strength=max(red, green, blue), is described by Table 5 as follows:
TABLE 5
Modified Histogram Approach
“Red, Green, and Blue values taken as a unit, Histogram on maximum”

Dot # | Red | Green | Blue | Strength
1 | 1 | 1 | 1 | 1
2 | 0 | 0 | 1 | 1
3 | 2 | 2 | 1 | 2
4 | 4 | 3 | 1 | 4
5 | 4 | 4 | 4 | 4
6 | 3 | 2 | 1 | 3
7 | 5 | 1 | 1 | 5
8 | 3 | 3 | 1 | 3
9 | 4 | 3 | 1 | 4
10 | 3 | 3 | 5 | 5

Value | # Events
0 | 0
1 | 2
2 | 1
3 | 2
4 | 3
5 | 2
Total | 10

Simply, it is the occurrence of numbers as found in the maximum strength that is used to build the histogram and not the values of red, green and blue separately.
This is in keeping with the nature of this patent application which is that red, green, and blue values are to be treated as a unit having a strength and ratios and not as three independent values.
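For comparison, the modified histogram of Table 5 can be sketched in Python as follows. The dot values are those of the table; the names DOTS and modified_histogram are assumptions of the sketch.

```python
# The ten dots of Table 5 as (red, green, blue) triplets.
DOTS = [(1, 1, 1), (0, 0, 1), (2, 2, 1), (4, 3, 1), (4, 4, 4),
        (3, 2, 1), (5, 1, 1), (3, 3, 1), (4, 3, 1), (3, 3, 5)]


def modified_histogram(dots, max_value=5):
    """Count one event per dot, indexed by the dot maximum (strength)."""
    counts = [0] * (max_value + 1)
    for red, green, blue in dots:
        strength = max(red, green, blue)    # the triplet is treated as a unit
        counts[strength] += 1
    return counts


print(modified_histogram(DOTS))       # [0, 2, 1, 2, 3, 2]
print(sum(modified_histogram(DOTS)))  # 10
```

Each dot contributes exactly one event, so the total equals the number of dots rather than three times that number.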
The histogram is built by considering only those dots within the range of columns=xdwn to xup and rows=ydwn to yup.
Having reference to FIG. 7a, example code is provided by which to apply the modified histogram.
At this point, the histogram is formed and its running total is known with respect to the strength of the RGB triplets of the marked area. The beginning and the ending significant strengths are determined, as reflected by the histogram data.
To avoid errors such as dead or saturated recording elements and other ‘stray values’ that are not representative of the area, one can take the strengths at which the running total reaches 2% and 98% of the number of occurrences to represent the smallest and largest relevant RGB strengths.
The number of strengths counted by the histogram, in total, is equal to the last running total value, and the 2% and 98% values are, therefore, easily found. Code is shown in FIG. 7b which determines the range of strength index (hmin and hmax) corresponding to the range of strengths within the box selected by the user.
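The 2% and 98% cut-offs can be sketched in Python as follows. This is not the code of FIG. 7b, which is not reproduced here; the function name significant_range and the exact comparison rules at the cut-offs are assumptions. The names hmin and hmax follow the text.

```python
def significant_range(histogram, low=0.02, high=0.98):
    """Return (hmin, hmax): strengths where the running total passes the cut-offs."""
    running = []
    total = 0
    for count in histogram:                 # build the running total
        total += count
        running.append(total)
    if total == 0:
        raise ValueError("histogram is empty")
    # First strength whose running total exceeds the low cut-off.
    hmin = next(i for i, r in enumerate(running) if r > low * total)
    # First strength whose running total reaches the high cut-off.
    hmax = next(i for i, r in enumerate(running) if r >= high * total)
    return hmin, hmax


# One dead element at 0 and one saturated element at 255 are ignored.
hist = [0] * 256
hist[0], hist[77], hist[128], hist[255] = 1, 49, 49, 1
print(significant_range(hist))  # (77, 128)
```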
Having reference also to FIG. 4, the modified histogram approach found 0.30 (of the dynamic range maximum) and 0.50 (of the dynamic range maximum) to be the minimum and maximum strength values of the portion of the image selected by the user. (See photographic examples #1, #2, for the boxes).
It was also assumed that the range calculated from the modified histogram approach should be modified so that it varied over the majority of the dynamic range.
So, what should this input range of hmin and hmax be turned into? We want it to occupy most of the dynamic range. A good guess is 80% of the dynamic range with a little left over for the darker and lighter areas so they can still be used as a reference.
In application code, the output strength range of the selected area was originally set to 0.1 to 0.9 of the dynamic range maximum. It was later reset to 0.2 to 0.9 as these numbers simply seemed better after observing many images.
The code used to calculate the look up table based on the modified histogram approach for determining the input strength range and the (nearly) arbitrary output range of 0.2 to 0.9 follows.
The graph of FIG. 4 can be thought of, in the general sense, as having 3 line segments each with two end points.
Line Segment | Start | Stop
#1 | (0, 0) | (xmin, 0.20)
#2 | (xmin, 0.20) | (xmax, 0.90)
#3 | (xmax, 0.90) | (1.0, 1.0)
where xmin and xmax have been calculated by the modified histogram approach. That is to say:
xmin=hmin/drmx
xmax=hmax/drmx
Both axes are measured in terms of the dynamic range. The input values are in terms of the strength (max of RGB) of the dot. The graph never leaves the 0 to 1 ‘box’. These constraints must always be met by any scaling function in any specific process.
All that remains, here, is that the output value be calculated by a program equivalent to that described for FIG. 2 above and that the lookup table does not hold the graph, exactly, but the ratio of the output to the input.
Each line segment can be expressed with algebra in the ‘slope intercept’ form, of which the general form is: y=m*x+b. For each of the three line segments, linear equations and scaling factors are determined. An array of corrections or scaling factors can be formed from the three equations. Dividing each output value by its corresponding input value, both expressed as fractions of the system dynamic range, produces the ratio of output to input.
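A look-up table for the three-segment graph of FIG. 4 might be computed as in the following Python sketch. It is an illustration under stated assumptions and not the application code: the function names three_segment_output and detail_correction_table are hypothetical, while hmin, hmax, drmx and the output range of 0.20 to 0.90 follow the text.

```python
def three_segment_output(x, xmin, xmax, ymin=0.20, ymax=0.90):
    """Evaluate the three-part linear graph at a normalized input strength x."""
    if not 0.0 < xmin < xmax < 1.0:
        raise ValueError("need 0 < xmin < xmax < 1")
    if x <= xmin:                           # segment #1: (0, 0) to (xmin, ymin)
        return x * ymin / xmin
    if x <= xmax:                           # segment #2: (xmin, ymin) to (xmax, ymax)
        m = (ymax - ymin) / (xmax - xmin)
        return ymin + m * (x - xmin)
    m = (1.0 - ymax) / (1.0 - xmax)         # segment #3: (xmax, ymax) to (1.0, 1.0)
    return ymax + m * (x - xmax)


def detail_correction_table(hmin, hmax, drmx=255):
    """Store the ratio of output to input for every strength from 0 to drmx."""
    xmin = hmin / drmx
    xmax = hmax / drmx
    table = [1.0]                           # strength 0 stays black under any factor
    for strength in range(1, drmx + 1):
        x = strength / drmx
        table.append(three_segment_output(x, xmin, xmax) / x)
    return table


table = detail_correction_table(hmin=77, hmax=128)
print(round(three_segment_output(0.40, 0.30, 0.50), 2))  # 0.55
```

The table holds ratios rather than the graph itself, so each entry can be applied directly as the common multiplier for red, green and blue.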
This virtual detail enhancement technique, or forensic flash due to its ability to delve into the normally obscured areas, maximizes the dynamic range of any target so that the details are enhanced. This is not restricted to the target area but any part of the photograph that has similar strengths to the user's choice will also be so enhanced. Any target area can be selected and so there can be many valuable corrections performed on the same photograph. Those areas which were stronger than the strength range picked by the user remain as useful references due to the ‘true color’ nature of the correction.
Examples
Virtual Flash—Virtual Iris
The examples illustrated in FIGS. 9 and 10 illustrate corrections to the limitations that occur from the physical light gathering devices or image recorders. The one aperture and shutter setting per frame means that a photograph is likely to vary from one's memory of the experience of being there. The eye's iris adjusts itself when experiencing contrasts. In a park on a sunny day, the iris opens up when moving from sunlight to the shade so that you remember all the grass as being green whereas photos often show shaded grass as black.
Photo #1 (FIGS. 9a-9f) Tourist Photo of an Abbey
This example illustrates how various prior art processes and the present invention enhance the image. The approach with the prior art processes is to “play” with the image until the brightness looks about right for what you want. It is a subjective thing and the expert user is someone who is good at making the necessary compromises. In FIG. 9a, the original image, scanned from a photograph, is very dark but still uses all/most of dynamic range (of the dot maximums (hits) being outside the strengths of 5 and 254) of a system of 256.
FIGS. 9b-9d illustrate prior art image brightening techniques. FIG. 9b does so by increasing the image brightness by 80%. While the image of FIG. 9b is brighter, the colors are badly faded and the sky has also experienced a change in color. FIG. 9c illustrates the prior art brightened image of FIG. 9b with contrast set to 50%. Contrast is increased in an attempt to restore the colors lost in brightening. Notice how much of the detail of the image is lost. The process has pushed many of the dots past the edge, outside of the dynamic range. FIG. 9d illustrates the brightened image of FIG. 9b with the saturation set to 50%. Increase in saturation is another technique for restoring the colors. As a result, the sky is almost returned to what it was but the rest of the image has significant and unsightly color distortions.
Applying the techniques of the present invention, the photo fares much better, particularly in FIG. 9f. In FIG. 9e, the image range of 5 to 254 is linearly mapped to 0 to 255. The effect is small because the original image was nearly full range already. It does, however, ensure that we have the full dynamic range in the output image. In FIG. 9f, the scaling function of FIG. 8 was applied for obtaining a superior image. All the colors are true to the image, as it was scanned, and are vibrant, just as they would have been to the eye with no loss of detail.
Photo #2 (FIGS. 10a-10f) Tourist Photo of Stonehenge
In the original frame of FIG. 10a, the stones are in the shadows due to the extreme lighting conditions. In FIG. 10f, the virtual iris process of the present invention compensates in a similar way that the iris does automatically, turning the poor photo into a good one.
More particularly, in FIG. 10a, the subject is very dark and again uses all/most of the dynamic range (only 1% of the hits being outside of a strength of 6 and 253). This image had been “pre-processed” by others to bring out detail. Notice how the sky is nearly white but clouds are still available (reproduction of the Figures in this application does not necessarily preserve the actual presence of the clouds). The prior art had taken it “as far” as it could but the subject was still too dark. FIG. 10b illustrates prior art brightening of the image by 60%. While the image is brighter, the colors are badly faded and the stones have lost all their color. Some of the detail is also lost by this process alone. Note, even in the gray-scale rendering, the whitening of the “red” rock left of the stones and at the left extreme of the FIG. 10b image. FIG. 10c is the brightened image of FIG. 10b with contrast set to 40%. Notice that the stones have, in areas, regained some color but not in other areas. Also notice how much more of the detail of the image is lost. FIG. 10d is the brightened image of FIG. 10b with saturation set to 15%. There is improved color that is “sort of right in a way” but it also adds artificial colors, such as some reds and yellows. Even in photos where not much correction is needed, manipulation of saturation for each dot ends up with different RGB ratios than were recorded. At best, one ends up with a compromise solution.
Applying the techniques of the present invention, in FIG. 10e the image range of 6 to 253 is linearly mapped to 0 to 255 to ensure the full dynamic range in the output image. In FIG. 10f, the scaling function of FIG. 8 was again applied for obtaining the superior, true color image with no loss of detail.
Forensic Flash—Virtual Detail Enhancement
Photo #3 (FIGS. 11a, 11b) Satellite
In FIG. 11a, the satellite is in the shadows and the surface is very dark. This often happens in space because of the extreme contrast in lighting. Important ‘docking’ holes cannot be seen. Using the forensic flash correction technique of the third embodiment, a window or box was selected within the dark area and the histogram approach used to build a correction graph suited for that area.
In FIG. 11b, the processed image now shows the detail in the dark area. The docking holes, marked with white circles, are now revealed.
Photo #4 (FIGS. 12a, 12b) Blimp
In the original image of FIG. 12a, the underside of the blimp is in the shadows. This often happens in photography when the lens is directed into the sky. Specifically, the identification markings cannot be seen as the tail is too dark.
Under the principles of the third embodiment, the tail area is selected and the histogram approach used to build a correction graph suited for that area. As a result, as shown in FIG. 12b, the processed image shows the lettering in the previously obscured, dark area. The blimp is now identified as COLUMBIA N3A.
Photo #5 (FIGS. 13a, 13b) Car Plate
In the original frame of FIG. 13a, the car's license plate is mostly shrouded in shadows. The car cannot be identified because the plate cannot be read. Using the third embodiment, the plate is selected and the histogram approach used to build a correction graph suited for that area.
As a result, and referring to FIG. 13b, the processed image shows that the car does not have a normal plate at all but, instead, the words: Classic Mustang.
Photo #6 (FIGS. 14a, 14b, 14c) Tracks in the Snow
In FIG. 14a, there are two people skiing. Their tracks are identifiable, but subtle. Using the forensic flash correction technique of the third embodiment, a window or box was selected within the overexposed area of the tracks in the snow and the histogram approach used to build a correction graph suited for that area.
In FIG. 14b, the processed image now shows the detail in the overexposed area. The tracks in the snow are distinctly visible.
Similarly, a window or box was selected in the dark area of the face of the skier in the photograph. FIG. 14c is the result of the histogram approach used to build a correction graph uniquely suited to this area of the photo. The features of the skier are more clearly visible than in the original photo shown in FIG. 14a.
SUMMARY
The key concepts are here expressed as the combination of the following six factors: correcting in RGB color space; the correction graph; the definition of the correction axes; constraint of domain and range to the system's dynamic range; properties of the graph; and application of the same correction factor to each of R, G and B in the triplet.
The correction must be applied to the RGB color space to maintain true color. Any correction graph can be used that embodies the above characteristics. Both the input and the output axes represent the maximum of the RGB triplet. The input is the maximum of the RGB triplet under consideration and the output corresponds to the maximum of the RGB triplet that is calculated as a result of virtual true color light amplification. The correction is constrained to the dynamic range. This means that the strength of the calculated dot is constrained to the dynamic range. Any graph that can be plotted within the constraints can be used for the process. The properties of a particular graph will affect the emphasis of the correction. Again, all three of the RGB values must be multiplied by the scaling factor derived from the graph. A given point on the graph has an input and output value. The correction equals the ratio (division) of these two and all three of the RGB triplet values are multiplied by this ratio of output to input.
The result of these considerations is a suite of processes all of which preserve the essential color of each and every dot in the input digital image while varying the effective light gathering power which can easily be on a dot to dot basis.
All that differs between the above embodiments is to vary the graph within the imposed constraints. With virtual true color light amplification, a new and useful result is dependent only upon identifying an image enhancing need and identifying a reasonable graph to fit that need.