CA2515253A1

CA2515253A1 - Method and system for analyzing images

Info

Publication number: CA2515253A1
Application number: CA002515253A
Authority: CA
Inventors: Edythe P. Lefeuvre; Rodney D. Hale; Douglas J. Pittman
Original assignee: Individual
Current assignee: Individual
Priority date: 2005-08-12
Filing date: 2005-08-12
Publication date: 2007-02-12

Description

aappiicstion numb: vum~:o d4 d~:r_w~d2: O a,~l~".,'~ x.5.3 Fi~,rures: 1,~ a .~,~~~~...- a~
P a?es:
IDIS - '~..P
Unscanriable items received wirh this application (Request ori~~.a1 documents in File Prep. Section on the 10th Floor) Doc:umenzs recus aver cede demande ne pouvant ~tre balayes ( Cmnmunder les documents ori~n_naux Bans la sec~on de prepararion des dossiers au Lc)ie:ne erase) Method and System for Analyzing Images 1. Introduction Approximately 23% of consumer images taken with cameras are captured in portrait mode. These images are rotated either approximately 90 degrees or approximately -90 degrees from horizontal, with the horizon oriented along a short edge of the image. If these images are in digital form, and have not been rotated appropriately, they will be oriented at approximately +/- 90 degrees from the preferred viewing orientation when displayed on a PC monitor, television or other display. The preferred viewing orientation is with the edge containing the lowest elevation objects in the image aligned parallel to and closest to the base of the display. In the preferred viewing orientation, the sky is at the top of the display, the ground is at the bottom, and people and buildings are upright. See Figure 1 B for examples of the preferred viewing orientation. Portrait mode digital images may be captured with a digital camera, scanned from film or prints or produced by other means. Images that are oriented at approximately 180 degrees from the preferred orientation can be produced as a result of scanning prints that are fed into the print scanner upside down or as a result of scanning film from single-lens reflex (SLR) cameras. SLR cameras produce images that are upside down on film relative to non-SLR type cameras and thus come out upside down when stored on digital media. SLR cameras capture roughly 20 percent of images taken with film cameras, and are of particular concern to film processors that scan rolls of film to produce digital images for storage and display. In addition to film scanners, there are also automated print scanners that scan existing prints to produce digital images for storage and display. Automated print scanners accept a stack of prints as input, and the preferred viewing orientation of these may be at approximately 0, 180, 90 or -90 degrees relative to the orientation in which they are scanned. Some images may be captured at orientations significantly different from 0, 180, 90 or -90 degrees relative to the preferred viewing orientation, but the majority of photo images are within +/- 10 degrees of the 0, 180, 90 or -90 degree orientations.
It is in the general interest of the photo imaging industry, whenever digital images are provided to the consumer, to ensure that the images are provided in the preferred viewing orientation on whatever image storage medium is used, so that the images may be viewed in the preferred viewing orientation when displayed.
The image storage medium used could be any medium that is capable of storing an image in digital form, including but not limited to CDs, DVDs, floppy disks, hard drives, flash memories or servers. Such storage means may be based on magnetic, optical, biological, molecular, atomic or other storage principles.

Humans can very quickly identify the correct orientation of an image through the use of contextual information and object recognition (see Figure 1A). This is not a simple task to reproduce in software.
The approach to implement this in software has been to recognize global image features and reference objects that help with orientation detection such as: sky, foliage, faces, eyes, walls, and straight lines. For some reference objects like eyes and faces, orientation information is used to make the decision about whether or not to rotate the image. For other objects like sky, foliage and straight lines, absolute and relative locations are used. Different algorithms are required and useful depending on the subject in the image. For example eye detection is not useful in an image without a human face.
A number of different sub-algorithms are used to differing degrees to help make a decision about the rotation of images depending on subject matter. The sub-algorithms are:
Figure 1A: Rotated Images: Left : 90 degrees; Right :- 90 degrees (same as 270 degrees) Figure 1 B: Images in Preferred Viewing Orientation 1 ) Eye Detection Sub-Algorithm

2) Upper Face Detection Sub-Algorithm

3) Straight Line Detection Sub-Algorithm

4) Global Image Parameters Sub-Algorithm ~ Sky Detection Sub-Algorithm ~ Foliage Detection Sub-Algorithm ~ Wall Detection Sub-Algorithm ~ Flesh Tone Detection Sub-Algorithm ~ Flash (Bright Foreground & Dark Background) Detection Sub-Algorithm The information generated by the sub-algorithms is input to an Overall Orientation Decision Maker Algorithm to determine the final image orientation decision. In the preferred embodiment the decision will be either 0, 180, 90 or -90 degrees. The Overall Orientation Decision Maker could be implemented to output an image orientation decision for any angular interval rather than the 90 degree interval of the preferred embodiment. For example, the decision could be made for a 45 degree interval, i.e. the decision would be 0, 45, 90, 135, 180, -135, -90, or -45 degrees.

2.0 Image Orientation Detection Sub-Algorithms 2.1 Eye Detection Sub-Algorithm In images where eyes are larger and where details like the whites of the eyes are visible, the approach has been to recognize single eyes (See Figure 2). The Eye Detection Sub-Algorithm can be summarized as follows:
1 ) Sub-sample the image.
2) Segmentation of relatively dark objects on either a flesh toned background or a white background (See Figure 2).
3) Calculation of features for segmented objects.
4) Use feature data to classify each object as a human eye at 0 degree rotation, a human eye at 90 degree rotation, a human eye at 180 degree rotation, a human eye at -90 degree rotation, or not a human eye.

5) Repeat steps 1 to 4 at different resolutions to find eyes of different sizes.

6) The number and location of the objects classified as human eyes is then used as follows:
a) If no objects are classified as human eyes then the eye detection algorithm provides no useful information to the overall decision about the rotation of the image.
b) If only one object is classified as an eye, then the orientation of the eye (i.e. 0 degrees rotation, 90 degrees, or -90 degrees) is used to help make the overall decision about the rotation of the image.
c) If multiple objects are classified as a human eye then location and orientation information will be used to help make the overall decision about the rotation of the image.
Figure 2. Left- Original; Right - Segmented objects Eye Features include but are not limited to the following:
~ the number of eyes found at each orientation, i.e. 0, 180, 90 and -90 degrees ~ the sum of classifier confidence levels (i.e. the classifier distances) for detected eyes at each orientation, i.e. 0, 180, 90 and -90 degrees the sum of coordinate locations for the centroid of detected eyes at each orientation, i.e. 0, 180, 90 and -90 degrees ~ the maximum and minimum coordinate values of the group of eyes detected at each orientation, i.e. 0, 180, 90 and -90 degrees 2.2 Upper Face Detection Sub-Algorithm The easiest part of the human face to automatically detect is the "upper face"
region from the nose to the forehead including the eyes. The mouth region is more difficult to recognize because of facial hair and different expressions which put the mouth in different configurations or positions. In low resolution images or in images where the human subject is small the approach has been to look for a pattern that approximates the average human upper face region (see Figure 5).
Even at varying scale, the eye-nose pattern is a distinctive facial feature.
The relationship between the eyes and nose gives an indication of facial orientation in the image. From this pattern, a viewer gets a sense of a typical image's horizon since the triangle drawn between the eyes and nose points toward its bottom. It is this pattern that the Upper Face Detection Sub-Algorithm uses to make its decision on image orientation.
Figure 3: Objects Classified as Eyes Figure 4: Image Orientations - Left Image: 90 degrees, Centre Image: 0 degrees, Right Image: - 90 degrees (same as 270 degrees) The pattern in Figure 5 below approximates the average human upper face region. Despite the blurred appearance, the pattern is still recognizable as the upper face. This blurring is intentional to generalize the pattern and tailor it to images where the human subject is further from the lens.
In the pattern search, search regions are restricted to flesh tone areas. A
normalized greyscale correlation determines how closely the pattern resembles a suspect image region. A pattern search is conducted to detect the pattern in each of four orientations (see Figure 4 for orientation definitions):
~ 0 degrees ~ 90 degrees ~ 180 degrees ~ 270 degrees Figure 5: Average Upper Face Pattern For each of the four orientations, the search is performed through +/-15° of image vertical in 1 ° steps, to account for some head tilt, and across a number of image resolutions to accommodate varying subject sizes. The search could be expanded to include other orientations as well, for example every 45 degrees or 30 degrees, or even smaller increments.
One known limitation of using this method occurs if the subject's head is not forward facing. If the second eye is not visible, the pattern is no longer visible in the image. A second limitation lies in flesh region segmentation. Wood tones are sometimes confused with flesh. As a result, wood grain can be confused with the light to dark transitions in the upper face pattern. The use of object texture evaluation provides a means of eliminating objects that are similar in color to flesh such as wood.
The Upper Face Detection sub-algorithm can be summarized as follows:
1 ) Sub-sample the image.
2) Segmentation of Upper face like regions using standard pattern matching techniques.
3) Calculation of features for segmented objects.
4) Use feature data to classify each object as an upper face at 0 degree rotation, an upper face at 90 degree rotation, an upper face at 180 degree rotation, an upper face at -90 degree rotation, or not an upper face.
5) Repeat steps 1 to 4 at different resolutions to find upper faces of different sizes.
6) The number and location of the objects classified as upper faces are then used as follows:
a) If no objects are classified as upper faces then the upper face detection sub-algorithm provides no useful information to the overall decision about the rotation of the image.
b) If only one object is classified as an upper face, then the orientation of the upper face (i.e. 0 degrees rotation, 90 degrees, or -90 degrees) is used to help make the overall decision about the rotation of the image.
c) If multiple objects are classified as upper faces then location and orientation information will be used to help make the overall decision about the rotation of the image.
Upper Face Features include but are not limited to the following:
~ the number of upper faces found at each orientation, i.e. 0, 180, 90 and -90 degrees ~ the sum of classifier confidence levels (i.e. the classifier distances) for detected upper faces at each orientation, i.e. 0, 180, 90 and -90 degrees ~ the sum of coordinate locations for the centroid of detected upper faces at each orientation, i.e. 0, 180, 90 and -90 degrees ~ the maximum and minimum coordinate values of the group of upper faces detected at each orientation, i.e. 0, 180, 90 and -90 degrees 2.3 Straight Line Detection Sub-Algorithm Straight lines in any orientation may be a useful parameter for detecting image orientation. This will be verified through statistical analysis of a database of images. Straight lines oriented roughly parallel (e.g.
plus or minus 10 degrees) with the side of an image are predominantly vertical. In addition, straight lines that indicate a perspective view of parallel lines converging in the distance would be interpreted as predominantly horizontal. Straight lines may also be an indication of walls, as discussed in Section 2.4 bel ow.
To extract straight lines in an image, preliminary edge detection is applied to the image. A Hough transform is applied to the resulting binarized image. Predominant lines are extracted from the Hough image and binned by angle in the image.
Straight Line Features include but are not limited to:
~ slope of the line ~ length of the line ~ intersection points ~ convergence points 2.4Global Image Parameter Calculation Sub-Algorithm Features in the foreground and background lend information to determining image orientation. In a typical image, the subject occupies the center 25% of the scene. To approximate foreground and background feature extraction, the image regions are defined using two types of masks - central and border masks. The Global Image Parameter Calculation (GIPC) sub-algorithm extracts color and texture parameters from these mask regions to help make a decision on image orientation.
The central masks focus mainly on the subject while extracting some background information (see Figure 6). The border masks are located on the perimeter of the image to focus mainly on background information with some subject inclusion (see Figure 7). Both central and border masks are divided into four regions to separate image features by location in the image. Other mask formats may be used, such as arranging the image into a series of concentric shapes including but not limited to rectangles or circles and a series of strips roughly parallel to each other.

To calculate global image features, several simple variations of the input image are used. These image variations include the red component, green component, blue component, intensity edge image, bright region image, and dark region image. Additional more complicated image variations include but are not limited to:
~ Sky Image ~ Foliage Image (e.g. trees, grass, plants, shrubs, and the like) ~ Wall Image (characterized by regions in the image that are low variance in a similar color bounded by straight lines at corners where walls meet other walls, floors or ceilings) ~ Flesh Image (typically human flesh tones, but could be animals) Explanations of the algorithms used to generate these more complicated image variations follow in Sections 2.4.3 to 2.4.6.
Global Image Parameter features include but are not limited to ~ For each color component (i.e. red, green and blue), the intensity image, and the intensity edge image and for each region/quadrant of each mask:
o Mean intensity o Standard deviation of intensity o Coordinates of the centre of gravity of the intensity ~ For each region/quadrant of each mask:
o Area that is flesh tone o Area that is sky tone o Area that is foliage o Area that is bright o Area that is dark o Area that is wall o The area of the edges detected that exceed a threshold (areas where there is a large change in colour intensity) ~ For adjacent pairs of regions of the border mask o The area of the edges detected that exceed a threshold (areas where there is a large change in colour intensity) These features and their relationships with each other provide an indication of the image orientation.

2.4.1 Central Quadrant Masks The central quadrant mask format separates the image into two or more regions.
In one embodiment the image is ranged into four equal square quadrants as per Figure 6. The regions could also be of unequal sizes, and of trapezoidal or any other shapes. For each of the at least five generated images mentioned above, the occurrence of a feature in a region and the size or amount of the feature (relative to the overall image size) in that region are noted. The following discussion of generated images is with reference to the central quadrant embodiment, but the use of the parameters would be similar for regions of any number, size or shape.
Segmented Sky ImacLe If sky is found in much of the area of two adjacent quadrants, and no sky is found in the other two quadrants, the indication would be that the correct orientation of the image is with the two "sky" quadrants at the top of the image. If sky is found in one quadrant, the top of the image will be one of the two image edges that contact the quadrant. This will be verified through statistical analysis of a database of images.
Segmented Foliage Imacre If foliage is found in much of the area of two adjacent quadrants, and no foliage is found in the other two quadrants, the indication would be that the correct orientation of the image is with the two "foliage"
quadrants at the bottom of the image. If foliage is detected in one quadrant only, the bottom of the image will be one of the two image edges that contact the quadrant. This will be verified through statistical analysis of a database of images.
Segmented Wall Image Walls would predominantly be found either at the upper two quadrants or at either side of an image, but not at the bottom. Therefore the detection of walls can be used to determine where the bottom of an image is not located. This will be verified through statistical analysis of a database of images.
Figure 6: Central Quadrant Masks Segmented Flesh Image It is anticipated that humans or animals will be predominantly located centrally and possibly in the upper or lower portion of images. This will be verified through statistical analysis of a database of images.
Therefore the location of flesh in specific quadrants may be indicative of the orientation of the image.
2.4.2 Border Mask The Border mask is located around the perimeter of the image, along the four edges of the image. The border may be of any width, and may be organized into any number of sectors of any size or shape. In one embodiment, the border mask is formulated as shown in Figure 7 with four sectors at the top, bottom, left, and right of the image. The shape of the four border sectors shown is trapezoidal, but other shapes of unequal size may also be used. A hypothesis with the border mask is that the bottom of an image will contain more objects than the upper portion. This will be verified through statistical analysis of a database of images. For each of the at least five parameters mentioned above, the occurrence of a feature in a particular border sector and the size or amount of the feature (relative to the overall image size) in that sector are noted. The following discussion of parameters is with reference to the border embodiment shown in Figure 7, but the use of the parameters would be similar for regions of any number, size or shape.
Figure 7: Border Mask Segmented Skylmaqe If sky is found in much of the area one or more border sectors along a particular edge of the image, and no sky is found in the border sectors along the other edges of the image, the indication would be that the correct orientation of the image is with the "sky" border sector or sectors at the top of the image. This will be verified through statistical analysis of a database of images.

Segmented Foliage Image If foliage is found in much of the area of one or more border sectors along a particular edge of the image and no foliage is in the border sectors along the other edges of the image, the indication would be that the correct orientation of the image is with the "foliage" border sector or sectors at the bottom of the image.
This will be verified through statistical analysis of a database of images.
Segmented Wall Image Walls would predominantly be found either at the border sector or sectors at the top edge of the image or at the border sector or sectors at the side edges of the image, but NOT at the border sector or sectors at the bottom edge. Therefore the detection of walls can be used to determine where the bottom of an image is NOT located. This will be verified through statistical analysis of a database of images.
Segmented Flesh Image It is anticipated that humans or animals will be predominantly located in a particular portion of images.
This will be verified through statistical analysis of a database of images.
Therefore the location of flesh in specific border sectors may be indicative of the orientation of the image.
2.4.3 Sky Detection Sub-Algorithm The colors of sky for this algorithm include but are not limited to shades of blue, light grey and white (see example in Figure 8). For example, sunset colors such as red and orange are less common but may be included in the algorithm. Night sky colors such as black and dark grey may also be included.
Development of the sky detection algorithm started with the collection of color data (i.e. red, green and blue image plane values) from examples of sky pixels in many different images.
Plots of the sky color data showed that relationships between green and blue and between green and red for sky pixels are fairly linear. The first part of sky segmentation uses these linear relationships to find pixels whose red, green, and blue values are similar to the sky examples within some error bounds (see Figure 9). The next part of the segmentation removes small objects and objects not touching an edge of the image (see Figure 10). Obviously this algorithm will occasionally segment other large blue objects in an image that touch the boundary of the image (like the hood of the truck in Figure 10), but statistically this is not very problematic.

Fig. 8: Original Image Fig. 9: Pixels Meet Color Criterion Fig. 10: Segmented Sky Features describing location, size, and color are collected from the sky image and used by the global image classifier to help make the decision about the image orientation.
2.4.4 Foliage Detection Sub-Algorithm The colors of foliage for this algorithm include but are not limited to shades of green, yellow and brown (see example in Figure 11 ). For example other colors such as gray or black may be included.
Development of the foliage detection algorithm started with the collection of color data (i.e. red, green and blue image plane values) from examples of foliage pixels in many different images. Plots of the foliage color data showed relationships between the i) green and blue, ii) green and red and iii) blue and red image planes. The first part of foliage segmentation uses these relationships to find pixels whose red, green, and blue values are similar to the foliage examples within some error bounds. The final part of the segmentation removes very small objects (see Figure 12).
Fig. 11: Original Image Fig. 12: Segmented Foliage 2.4.5 Wall Detection Sub-Algorithm For this algorithm walls are characterized by smooth regions where neighboring pixels are similar in color.
Walls covered in wallpaper and texture will not be segmented by this algorithm.
The wall detection algorithm can be summarized as follows:
1 ) Find smooth areas in the image by convolving the intensity image with an edge filter and thresholding to keep low edge regions (see Figures 13 and 14).
Fig. 13: Original Image Fig. 14: Low Variance Regions 2) Keep at most the three largest smooth regions (see Figure 15) and calculate each region's mean color and standard deviation.
3) Segment all areas of the image with color close to the dominant color previously segmented (see Figure 16).
Fig. 15: Three Largest Low Variance Regions Fig. 16: All Similar Colors 4) Segment low variance regions of the image as in Step 1 but use a higher threshold so that more regions are kept as low variance and AND this image with the "All Similar Colors" image (see result of AND operation in Figure 17).
5) Remove small objects to generate the final "Wall" image (see Figure 18).
Fig. 17: Similar Color and Low Variance Fig. 18: Final Image 2.4.6 Flesh Tone Detection Sub-Algorithm The colors of flesh for this algorithm include but are not limited to flesh colored shades of beige, pink, yellow and brown. Development of the flesh detection algorithm started with the collection of color data (i.e. red, green and blue image plane values) from examples of human flesh pixels in many different images. Plots of the flesh color data showed relationships between the i) green and blue, ii) green and red and iii) blue and red image planes. Flesh segmentation uses these relationships to find pixels whose red, green, and blue values are similar to the flesh examples within some error bounds. The final part of the segmentation removes objects with shapes that are not characteristic of humans or animals such as very small objects and elongated objects. The use of object texture evaluation provides a means of eliminating objects that are similar in color to flesh such as wood.
2.5 "Flash" Image Detection Sub-Algorithm "Flash" images for this application are characterized by the presence of a bright subject and a relatively darker background (see Figure 19). These images are usually taken in a dark environment using the camera's flash. The bright, colorful subject (commonly a person or people) usually extends from the central region of the image to the bottom. The detected eye information and Global Image Parameters are used to first detect "Flash" images then to detect the orientation of the "Flash" images.

2.6 Image Resizing At small sizes, images contain general cues to indicate image orientation. In the photos below, resizing has tittle effect on our perception of image orientation (see Figure 20).
Typically, therefore, it is possible to resize images to a smaller size (i.e. decrease the image resolution) to speed up the image analysis process.
Figure 19: "Flash" Image Figure 20: Resizing Illustration 3. Overall Image Orientation Detection Algorithm The Overall Image Orientation Detection Algorithm is shown in the flowchart of Figure 21. The algorithm can be summarized as follows, where each point number refers to the box in the flowchart:
Box #2: Resize the input image using sub-sampling if necessary. Full image resolution is not always necessary for orientation detection and can also cause the algorithm to run too slowly therefore sub-sampling to a certain size provides a compromise between accuracy and execution speed.
Box #3: From the resized image extract the Red, Green and Blue component images.
Box #4: From the resized image extract the Intensity image and from that extract the Intensity Edge image using an edge detection kernel.
Box #5: From the Red, Green, Blue, Intensity, and Intensity Edge images generate the binary segmented images containing Sky (see Section 2.4.3), Foliage (see Section 2.4.4), Walls (see Section 2.4.5) and Flesh tones (see Section 2.4.6). Also generate images in which Dark pixels are segmented and in which Bright pixels are segmented. These images are generated using simple thresholding of the component images.
Box #6: Extract the Global Image Features using the Central Quadrant Masks (see Section 2.4.1 ) and the Border Masks (see Section 2.4.2) combined with the Red, Green, Blue, Intensity, Intensity Edge, Sky, Foliage, Walls, and Flesh tone images. These features will be used by the Orientation Decision Maker Algorithm.
Box #7: Perform eye detection using segmentation of eye-like blobs, blob feature extraction and blob classification (see Section 2.1 ) where the final classification for each blob is either "not an eye", "eye at 0° ", "eye at 90° ", "eye at90° ", or "eye at 180°': Extract Eye Features describing the locations of the detected eyes, their orientations and sizes. These features will be combined with the Global Image Features to be used by the Orientation Decision Maker Algorithm.
Box #8: Perform straight line detection using edge detection and a Hough transform (see Section 2.3).
Extract Straight Line Features from the Horizontal, almost Horizontal, Vertical, and almost Vertical Lines, describing the locations of the lines, their slopes, their lengths, intersection points, and off image convergence points. These features will be combined with the Global Image Features and the Eye Features to be used by the Orientation Decision Maker Algorithm.
Box #9: Perform upper face detection using segmentation of upper face-like blobs, blob feature extraction and blob classification (see Section 2.2) where the final classification for each blob is either "not an upper face", "upper face at 0° ", 'tipper face at 90° ", 'tipper face at -90° ", or upper face at 180° "
Extract Upper Face Features describing the locations of the detected upper faces, their orientations and sizes. These features will be combined with the Global Image Features, the Eye Features, and the Straight Line Features to be used by the Orientation Decision Maker Algorithm.

Box #10: Run the Orientation Decision Maker Algorithm using the Global Image Features, the Eye Features, the Straight Line Features, and the Upper Face Features to classify the orientation of the image as either 0° , 90° ; 90° or 180° . See Section 4 for details.
4. Orientation Decision Maker Algorithm Information from each of the sub-algorithms described is used by a final classification algorithm (the Orientation Decision Maker Algorithm) to make a decision regarding the orientation of the image. A
flowchart for the Orientation Decision Maker Algorithm is shown in Figures 22 and 23. The rules used by the algorithm are generated by analyzing a large database of random consumer images. The algorithms are used to develop a decision tree consisting of a set of classifiers. As the analysis of an image progresses through the decision tree of classifiers, decisions are made to determine the orientation of the image or to perform further analysis to improve the probability that a correct decision will be made. The sequence of application of the classifiers was optimized by testing all possible combinations of sequential application of the classifiers. Testing is conducted by analyzing a database of images (some rotated and some not rotated) and noting the number of images that are correctly and incorrectly diagnosed with respect the proper image orientation. The optimal sequence was the one that achieved the highest value of the Performance Metric as defined herein. This is the sequence shown in the flowchart of Figures 22 and 23. It is possible that further development will result in the addition of new sub-algorithms and that this will result in a new optimal sequence being chosen.
The Orientation Decision Maker Algorithm can be summarized as follows, where each box number refers to the box in the flowchart of Figures 22 and 23. "Features" in the follanring description refers collectively to the Global Image Features, the Eye Features, the Straight Line Features, and the Upper Face Features.
SKY Branch Box #1: Run the Sky Detection Classifier using the Features to classify the image as containing sky or not containing sky.
Box #2: If the image contains sky proceed to Box #3, if it does not contain sky proceed to Box #8.
Box #3: Run the SKY classifier using the Features to classify the image as more likely to be 90° or-90°
rotated. If the image is more likely to be 90° proceed to Box #4, otherwise proceed to Box #5.
Box #4: Run the 90° SKY classifierusing the Features to classify the image as 90° or 0° . If the image is more likely to be 0° proceed to Box #6, otherwisethe image orientation is classified as 90° and processing is complete.

Box #5: Run the -90° SKY classifierusing the Features to classify the image as -90° or 0° . If the image is more likely to be 0° proceed to Box #7, otherwisethe image orientation is classified as -90° and processing is complete.
Box #6: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Box #7: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Flash Branch Box #8: Run the Flash Detection Classifier using the Features to classify the image as having been taken with a flash or not.
Box #9: If a flash was used proceed to Box #10, if it was not proceed to Box #15.
Box #10: Run the Flash classifier using the Features to classify the image as more likely to be 90° or-90°
rotated. If the image is more likely to be 90° proceed to Box #11, otherwise proceed to Box #12.
Box #11: Run the 90° Flash classifier using the Features to classify the image as 90° or 0° . fl the image is more likely to be 0° proceed to Box #13, otherwise the image orientation is classified as 90° and processing is complete.
Box #12: Run the -90° Flash classifier using the Features to classify the image as -90° or 0° . If the image is more likely to be 0° proceed to Box #14, otherwise the image orientation is classified as -90° and processing is complete.
Box #13: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Box #14: Run a 0° classifierusing the Features to classify an image as 0° or 180° . Processing is complete.
Eye Branch Box #15: If the image contains eyes as determined by the eye detection algorithm (Box #7 in Figure 21) proceed to Box #16, if it does not contain eyes proceed to Box #21.
Box #16: Run the EYE classifier using the Features to classify the image as more likely to be 90° or-90°
rotated. If the image is more likely to be 90° proceed to Box #17, otherwise proceed to Box #18.
Box #17: Run the 90° EYE classifier usingthe Features to classify the image as 90° or 0° . If the image is more likely to be 0° proceed to Box #19, otherwise the image orientation is classified as 90° and processing is complete.
Box #18: Run the -90° EYE classifier using the Features to classier the image as -90° or 0° . If the image is more likely to be 0° proceed to Box #20, otherwise the image orientation is classified as-90° and processing is complete.

Box #19: Run a 0° classifier using the Features to classify an image as 0° or 180° . Proessing is complete.
Box #20: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Upper Face Branch Box #21: If the image contains upper faces as determined by the upper face detection algorithm (Box #9 in Figure 21) proceed to Box #22, if it does not contain eyes proceed to Box #27.
Box #22: Run the Upper Face classifier using the Features to classify the image as more likely to be 90°
or -90° rotated. If the image is more likely to be 90° proceed to Box#23, otherwise proceed to Box #24.
Box #23: Run the 90° Upper Face classifier using the Features to classify the image as 90° or 0° . If the image is more likely to be 0° proceed to Box #25, otherwise the image orientation is classified as 90° and processing is complete.
Box #24: Run the -90° Upper Face classifier using the Features to classify the image as-90° or 0° . If the image is more likely to be 0° proceed to Box #26, otherwise the image orientation is classified as-90° and processing is complete.
Box #25: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Box #26: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Line Branch Box #27: If the image contains horizontal, near horizontal, vertical and or near vertical straight lines as determined by the straight line detection algorithm (Box #8 in Figure 21) proceed to Box #28, if it does not proceed to Box #33.
Box #28: Run the Line classifier using the Features to classify the image as more likely to be 90° or-90°
rotated. If the image is more likely to be 90° proceed to Box ~9, otherwise proceed to Box #30.
Box #29: Run the 90° Line classifier using the Features to classify the image as 90° ob° . If the image is more likely to be 0° proceed to Box #31, otherwise the image orientation is classified as 90° and processing is complete.
Box #30: Run the -90° Line classifier using the Features to classify the image as -90° or 0° . If the image is more likely to be 0° proceed to Box #32, otherwise the image orientation is classified as -90° and processing is complete.
Box #31: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.

Box #32: Run a 0° cla$ifier using the Features to classify an image as 0° or 180° . Processing is complete.
Last Chance Branch Box #33: Run the Last Chance classifier using the Features to classify the image as more likely to be 90°
or -90° rotated. If the image is more lilely to be 90° proceed to Box #34, otherwise proceed to Box #35.
Box #34: Run the 90° Last Chance classifier using the Features to classify the image as 90° or 0° . If the image is more likely to be 0° proceed to Box #36, otherwise the image orientation isclassified as 90° and processing is complete.
Box #35: Run the -90° Last Chance classifier using the Features to classify the image as-90° or 0° . If the image is more likely to be 0° proceed to Box #37, otherwise the image orientation is classified as-90°
and processing is complete.
Box #36: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Box #37: Run a 0° classifier using the Features to classify an image as 0° or 180° . Processing is complete.
Performance The performance statistics of the Overall Image Orientation Detection Algorithm on a database of random consumer images were as follows:
Image OrientationActual Image Orientation Estimation 0 90 -90 0 96.8% 48.3% 48.3%

90 1.7% 46.3% 6.5%

-90 1.5% 5.4% 45.2%

The software has been designed to be most accurate at detecting 0°
images since these are more common than 90°- and-90° images. It has been estimated based on data collection and consultation with industry that roughly 77% of images are captured at 0° and the remaining 23% are rotated at either 90° or -90° , with an equal probability of 90° or-90°- .
Using these estimates we can evaluate the performance of the Overall Image Orientation Detection Algorithm as follows:

Performance Metric = 77% x (0° detection) + 11.5 % x (90°
detection) + 11.5% x (-90° detection) = 85.0%
Using the values in the Table above, the Overall Image Orientation Detection Algorithm is calculated to correctly detect the orientation of 85.0% of images.
5. Image Orientation Detection Algorithm for Scanned Film Rolls The detection of the orientation of images from scanned film rolls can be improved by applying a process to classify the film roll as a "0 degree roll" or "180 degree roll." Images that are oriented at 180 degrees from the preferred orientation can be produced as a result of scanning film from single-lens reflex (SLR) cameras. SLR cameras produce images that are upside down on film relative to non-SLR type cameras and thus come out upside down when stored on digital media.
The process of the Image Orientation Detection Algorithm for Scanned Film Rolls is shown in the flowchart in Figure 24, and is described as follows:
Box #1: In practice each scanned film roll is intended to be analyzed as a directory or group of image files. All images from a roll are input to the Overall Image Orientation Detection Algorithm.
Box #2: Each image in the roll is analyzed by the Overall Image Orientation Detection Algorithm and classified as either 0°, 90°, -90° or 180°.
Box #3: Once all images from the roll have been classified, the proportion of images from the roll classified as 180° by the Overall Image Orientation Detection Algorithm is calculated as the "180 Degree Proportion".
Box #4: The 180 Degree Proportion is compared to a threshold value. The threshold value is determined by selecting a value that achieves the highest rate of correct classification of 0 degree and 180 degree rolls based on an iterative process of assuming different threshold values and applying the Image Orientation Detection Algorithm for Scanned Film Rolls to a database of scanned images from film rolls.
Box #5: If the proportion of images classified as 180°- by the Overall Image Orientation Detection Algorithm does not exceed the threshold value, then that roll is classified as a 0 degree roll.

Box #6: All images from the roll that were originally classified as 180 degrees are reclassified as 0°-.
Box #7: If the 180 degree proportion exceeds the threshold value, then that roll is classified as a 180 degree roll.
Box #8: All images from the roll that were originally classified as 0 degrees are reclassified as 180°- .
This completes the orientation classification for that roll.
Performance SLR cameras capture roughly 20 percent of images taken with film cameras, and are of particular concern to film processors that scan rolls of film to produce digital images for storage and display. The SLR images are rotated 180 degrees compared to non-SLR images and thus come out upside down when stored on digital media, unless the camera is rotated at 90° when the photo is taken. It is assumed that 23 percent of photos are in portrait mode (11.5% at 90Q and 11.5% d -90°). Since a single roll will only contain 0°- images or 180° images (rever both) the detection of 180Q images will be treated somewhat differently and two metrics are used to gauge performance:
Detection of SLR Rolls Performance Metric = 80.0% x (0° roll detection) + 20.0 x (180° roll detection) Based upon a database of 100 image rolls, made up of 20 SLR rolls and 80 conventional rolls, and using a 180° classfication minimum threshold of 12.0%, the Overall Image Orientation Defection Algorithm correctly detected 83.3% of SLR rolls and 97.8% of conventional rolls. Based on the assumption that 20 percent of all rolls are SLR, the above detection rates result in (0.833)*20 +
(0.978)*80 = 95% of all rolls being correctly detected. The performance statistics of the Overall Image Orientation Detection Algorithm on a database of random consumer images were as follows:
Image Actual Ima a Orientation Orientation0 90 -90 180 Estimation 0 93.2% 42.7% 42.1 % 44.2%

90 1.7% 46.3% 6.5% 4.8%

-90 1.5% 5.4% 45.2% 4.7%

180 3.6% 5.6% 6.2% 46.3%

Using the values in the Table above, the overall rate of correct detection of images is calculated as follows:
Pertormance Metric = (77%) x(0° roll detectionk(80%)x (0°
detection + 180° detection) +
(77°/) x(180° roll detectionk(20%)x (0° detection +
180° detection) + 11.5 x (90~ietection) + 11.5 x (-90° detection) =80.4%
The percentage of correctly oriented images prior to using the Image Orientation Detection Algorithm for Scanned Film Rolls would be (80%)x (77%) = 61.6%.
Therefore the improvement due to using the Image Orientation Detection Algorithm for Scanned Film Rolls is 80.4%-61.6% = 16.8%.
To achieve a greater improvement in performance, when rolls are detected to be 180 degrees, all images within the roll can be rotated by 180 degrees and reanalyzed by the AIO
algorithm. When the roll of images is reanalyzed by the AIO, it only attempts to detect 0, -90 or 90 degree orientations. This has been found to produce less false positives than when the AIO attempts to detect 0, -90, 90 or 180 degree orientations. As a result, the performance is improved by the reduction in false positives.
6. Draft Claims Wh~ 's claimed is:
1. A metho determining the orientation of digital images scanned from film, the method comprising the steps of:
(a) Inputting a series of i es scanned from a film roll (b) Applying the Overall Image O tation Detection Algorithm to each image, whereby the Overall Image Orientation Detection Algorithm inclu an Orientation Decision Maker that consists of a series of classifiers, where each classifier is based an image characteristic and the series of classifiers is applied sequentially (c) Classifying the orientation of the each image in the as 0, 90, -90 or 180 degrees (d) Counting the number of images from the film roll that were c ified as 180 degrees and dividing by the total number of images from the film roll, with this quotient name a "180 degree proportion."
(e) Comparing the "180 degree proportion" to a set threshold value (f) Classifying the roll of images as a 180 degree roll if the threshold is exceeded (g) Reclassifying all 0 degree images from the film roll as 180 degree images if the threshold i xceeded

Claims

Claims What is claimed is:

1. A method of determining the orientation of digital images scanned from film, the method comprising the steps of:
(a) Inputting a series of images scanned from a film roll (b) Applying the Overall Image Orientation Detection Algorithm to each image, whereby the Overall Image Orientation Detection Algorithm includes an Orientation Decision Maker that consists of a series of classifiers, where each classifier is based on an image characteristic and the series of classifiers is applied sequentially (c) Classifying the orientation of the each image in the roll as 0, 90, -90 or 180 degrees (d) Counting the number of images from the film roll that were classified as 180 degrees and dividing by the total number of images from the film roll, with this quotient named the "180 degree proportion."
(e) Comparing the "180 degree proportion" to a set threshold value (f) Classifying the roll of images as a 180 degree roll if the threshold is exceeded (g) Reclassifying all 0 degree images from the film roll as 180 degree images if the threshold is exceeded (h) Classifying the roll of images as a 0 degree roll if the threshold is not exceeded (i) Reclassifying all 180 degree images from the film roll as 0 degree images if the threshold is not exceeded

2. The method of Claim 1 where the Overall Image Orientation Detection Algorithm consists of the following steps (a) the determination of image characteristics through a process of segmentation, detection and feature extraction whereby such characteristics include but are not limited to a. the presence of sky b. the use of a flash c. the presence of foliage d. the presence of walls e. the presence of flesh tone f. the presence of eyes g. the presence of lines h. the presence of upper face regions (b) the use of a central mask and border mask to determine the location of image characteristics (c) the use of a sky detection classifier to determine if sky can be used to classify the image orientation, if so then a sky classifier is used to determine image orientation (d) the use of a flash detection classifier to determine if the image characteristics associated with the flash can be used to classify the image orientation, if so then a flash classifier is used to determine image orientation (e) the use of an eye detection classifier to determine if image characteristics associated with eyes can be used to classify the image orientation, if so then an eye classifier is used to determine image orientation (f) the use of an eye detection classifier to determine if image characteristics associated with eyes can be used to classify the image orientation, if so then an eye classifier is used to determine image orientation (g) the use of an upper face detection classifier to determine if image characteristics associated with upper face regions can be used to classify the image orientation, if so then an upper face classifier is used to determine image orientation (h) the use of a straight line detection classifier to determine if image characteristics associated with straight lines can be used to classify the image orientation, if so then a straight line classifier is used to determine image orientation (i) the use of a last chance classifier to determine image orientation if no other classifiers are determined to be suitable (j) the use of a decision tree for each classifier comprising a. a 90 degree branch further producing a 90 decision or input to a 0 degree classifier, the 0 degree classifier producing a 0 degree decision or 180 degree decision b. a -90 degree branch further producing a -90 decision or input to a 0 degree classifier, the 0 degree classifier producing a 0 degree decision or 180 degree decision

3. The method of Claim 1 where the image is reduced in size prior to being input to the Overall Image Orientation Detection Algorithm to reduce computational time.

4. The method of Claim 1 where the threshold value is determined by selecting a value that achieves the highest rate of correct classification of 0 degree and 180 degree rolls based on the statistical analysis of the results of applying the Overall Image Orientation Detection Algorithm to a database of scanned images from film rolls