CA2498484C

CA2498484C - Automatic perspective detection and correction for document imaging

Info

Publication number: CA2498484C
Application number: CA2498484A
Authority: CA
Inventors: Ufuk Orhun
Original assignee: Psion Systems Inc
Current assignee: Psion Systems Inc
Priority date: 2005-02-25
Filing date: 2005-02-25
Publication date: 2013-05-28
Anticipated expiration: 2025-02-25
Also published as: CA2498484A1

Abstract

A method and apparatus for detecting and correcting perspective distortion for document imaging is described. The document template of the present invention contains special markers that define the corners of the document. One of these markers, different from the others, uniquely identifies a particular corner. When an image of a document is captured and it is found to contain perspective distortion, the smallest rectangle that encloses the special markers in the captured image is calculated and geometric transforms are used to map the special markers in the captured image to the corners of the smallest rectangle. To correct for orientation errors during image capture, the captured image is rotated based on the location of the unique marker. The present invention can also provide feedback to the operator as the image is being captured. This feedback guides the operator to properly align the image reader for distortion-free imaging.

Description

AUTOMATIC PERSPECTIVE DETECTION AND CORRECTION FOR DOCUMENT
IMAGING
FIELD OF INVENTION
[0001 ] The present invention relates generally to the field of image readers, and more particularly to a method and apparatus for correcting perspective distortions and orientation errors.
BACKGROUND OF THE INVENTION

[0002] The use of portable image readers over fixed-mount image readers is increasing and these portable image readers are seeing applications in many industries.
One of the main challenges with portable image readers however, is the perspective distortion caused by inconsistent image reading positions. With fixed-mount systems, the image reader is placed in such a manner that the optical path of the image reader is perpendicular to the image plane. With portable systems, however, the position of the image reader is dependent on a human operator. It is difficult for an operator to know the ideal point from where to capture an image. More often than not, the user captures the image at an oblique angle (i.e. the image reader is not in a plane parallel to the plane of the document) and the captured image is skewed.

[0003] Accordingly, the image data may be uploaded to a personal computer for processing by various correction algorithms. The algorithms are employed to correct the distortion effects associated with off angle images of documents. The correction algorithms require a user to manually identify the corners of a region of a captured image.
Many image readers use geometric transforms such as affine transformations during post-processing of the image to correct for perspective distortions. In order to apply these transforms, the edges or corners of the image need to be defined. By measuring the spatial displacement of the identified corners from desired positions associated with a rectangular arrangement, an estimation of the amount of distortion is calculated. The correction algorithm then processes the imaged document to possess the desired perspective and size as necessary.

[0004] U.S. Patent Applications 2003/0156201 - Zang published August 21, 2003;
2004/0012679 - Fan published January 22, 2004 and 2004/0022451- Fugimoto published February 5, 2004 discuss automatic methods for identifying the corner or edges of the document based on statistical models. While these methods do not require user input to manually identify the document corners, additional complexity is added to the image reader. Also, the degree of accuracy is not the same when the locations of the corners are estimated positions. A document can also contain many different types of objects such as 1 or 2-dimensional codes, text, written signatures, etc. As a result it may be difficult to define the boundaries of the document by statistical methods.

[0005] Further, the prior art accounts for correction of perspective distortion, but cannot correct for orientation. The operator may not always align the image reader in the same orientation as the document so the captured image may require rotation. Many image readers have rectangular aspect ratios so it is necessary at times to rotate the image reader by 90 degrees with respect to the document in order to "fill" the field of view (FOV) of the image reader with the document.

[0006] Therefore there is a need for an image reader that can automatically correct for both perspective distortion and orientation.
SUMMARY OF THE INVENTION

[0007] The present invention is directed to a method and apparatus for correcting perspective distortion in an image captured by an image reader wherein the captured image has a number of special markers located on the boundary of the image having a predetermined shape. Distortion is corrected by calculating the smallest predetermined shape that encloses all of the special boundary markers, building a geometric transform to map the location of the special markers in the captured image to corresponding locations on the predetermined shape and applying the geometric transform to the captured image.
Further, the special boundary markers may include a unique identifier marker different from the other special boundary markers, which is used to correct orientation errors in the captured image.

[0008] In accordance with a specific aspect of the invention, the predetermined shape of the image is a rectangle and the special boundary markers are corner markers.
Further, the geometric transform comprises affine transformations.

[0009] The present invention is further directed to a method and apparatus for positioning an image reader having a rectangular field of view to avoid perspective distortion in a captured image wherein the captured image has special boundary markers located at the corners of the image having a rectangular shape. The image reader is positioned by capturing an image, calculating the distance between the special boundary markers and the field of view corners, determining if the distances are all the same. If the distances are not the same the image reader repositioned and the image recaptured until the distances are all the same. Further, the special boundary markers may include a unique identifier marker different from the other special boundary markers, which is used to correct orientation errors in the captured image.

[0010] The present invention is also directed to a method and apparatus for producing an image of a substantially rectangular target having special boundary markers at the corners with one of the markers being a unique corner marker. The image is produced by capturing an image using an image reader, correcting perspective distortion on the captured image using the special boundary markers, correcting orientation errors of the image using the unique corner marker and processing the image. The perspective distortion may be corrected by calculating the smallest predetermined shape that encloses all of the special boundary markers, building a geometric transform to map the location of the special markers in the captured image to corresponding locations of the predetermined shape and applying the geometric transform to the captured image.
[0011 ] In accordance with another aspect of this invention, orientation errors in the image may be corrected by rotating the captured image.
[0012] In accordance with a specific aspect of this invention, the special boundary markers are polygon shapes.

[0013] Other aspects and advantages of the invention, as well as the structure and operation of various embodiments of the invention, will become apparent to those ordinarily skilled in the art upon review of the following description of the invention in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The invention will be described with reference to the accompanying drawings, wherein:
Figure 1 is a simplified diagram of an image reader;
Figure 2 shows how perspective distortion is caused;
Figure 3 shows the results of applying the present invention to an image with perspective distortion;
Figure 4 is a flowchart outlining the process steps of a first embodiment of the present invention;
Figure 5 shows an example of unique document markers;
Figure 6 shows how the smallest rectangle is determined as part of the perspective distortion correction algorithm;
Figure 7 is a flowchart outlining the process steps of a second embodiment of the present invention; and Figure 8 is a simplified diagram on an image reader employing the algorithms of the present invention.
DETAILED DESCRIPTION
[0015] A conventional image reader, such as a portable image reader 1 is shown in the simplified diagram of Figure 1. It comprises an image capture device 2, such as a CCD or CMOS image sensor, an optical system 3 mounted over the image sensor, an analog-to-digital (A/D) conversion unit 4, memory 5, processor 6, user interface 7 and output port 8.

[0016] The analog information produced by image capture device 2 is converted to digital information by A/D conversion unit 3. A/D conversion unit 3 may convert the analog information received from image capture device 2 in either a serial or parallel manner.
The converted digital information may be stored in memory 5 (e.g., random access memory or flash memory). The digital information is then processed by processor 6.
Additionally or alternatively, other circuitry (not shown) may be utilized to process the captured image such as an application specific integrated circuit (ASIC). User interface 7 (e.g., a touch screen, keys, and/or the like) may be utilized to edit the captured and processed image. The image may then be provided to output port 8. For example, the user may cause the image to be downloaded to a personal computer (not shown) via output port 8.
[0017] Figure 2 shows diagrammatically how perspective distortion is caused.
Image reader 1' shown in dotted lines shows the correct position of image reader 1' over a document 10 to ensure distortion-free imaging. In practice, the position of image reader 1 is shown in solid line. Image reader 1 is shown at an oblique angle with respect to document 10. Since the optical path of image reader 1 is not directly perpendicular with surface of document 10, perspective distortion will result.
[0018] Figure 3 shows the results of applying the method of the present invention to an image suffering from perspective distortion. Captured image 15 is a skewed image of a document. A marker 16 on the document indicates the location of the upper left hand corner of the document. In applying the present invention to captured image 15, the automatic perspective detection and correction method of the present invention produces a processed image 17. The distortion is removed from processed image 17, but marker 16, which indicates the upper left hand corner of the document, shows that processed image 17 is not oriented correctly. If perspective distortion correction and orientation correction are applied together, the result is processed image 18. Marker 16 of processed image 18 correctly indicates the upper left hand corner of the document, thus confirming correct orientation.

[0019] Figure 4 shows a flowchart outlining the preferred embodiment of the present invention. The first step of the process is to capture 25 an image of the target such as a document 35 including special markers 36, 37, 38 and 39 as shown on figure 5.
The special markers 36, 37, 38 and 39 are included on the document 35 to identify the four corners of the document boundary. Three markers 37, 38 and 39 out of the four markers 36, 37, 38 and 39 are identical, while a fourth marker 36 indicates a particular corner for example, the upper left hand corner. This is used as an orientation reference marker. The markers 36, 37, 38 and 39 in the present invention are polygon forms such as squares, circles or triangles. The markers 36, 37, 38 and 39 should be unique enough so that they are not confused with other objects on the document. A document template would include these special markers 36, 37, 38 and 39 and as a result all documents will have the special markers. It should be understood by those skilled in the art that any number or shape of markers falls within the present invention.
[0020] Figure 5 shows a specific example of a document template 35 having four special markers 36, 37, 38 and 39. These four markers all include a square, but whereas markers 37, 38 and 39 all contain dots, marker 36 contains a three-line segment. This marker 36 uniquely identifies the upper left hand corner of document 35. If this document is captured by an image reader and marker 36 appears on the bottom left hand corner, it will be evident that a rotation is required to correct the orientation.
[0021 ] As the operator attempts to read an image, the image reader projects a targeting pattern onto the target image. This targeting pattern indicates to the operator either the center of, or the boundary of the image reader's FOV. The operator may need to move the image reader back and forth in front of the image so that the image reader can detect all of the special markers 36, 37, 38 and 39. Detection is done through pattern recognition software. The image reader will read all objects within its field of view until it identifies the special markers 36, 37, 38 and 39. Since these markers 36, 37, 38 and 39 are located along the periphery of the document, any object that appears similar to the markers, but is located in the center of the document, will be discarded. As soon as the image reader detects the four special markers 36, 37, 38 and 39, it will give feedback to the operator in the form of a visual indicator such as a light-emitting diode (LED) or an audible signal.
Upon receiving the feedback, the operator can capture 25 the image. Since these markers 36, 37, 38 and 39 are necessary for the perspective distortion correction, the process cannot continue if they are not all detected. After image capture and special marker detection, the image and marker locations are transferred 26 to the host such as a personal computer for image processing. The image reader can also do the processing, if this capability is present.
[0022] Once it is established that all markers are present on the captured image, correction of the captured image begins. The first step of the perspective correction algorithm is to calculate 27 the smallest rectangle that encloses all the markers of the captured image. Figure 6 shows a diagram of determining the smallest rectangle.
Boundary 45 defines the FOV of the image reader as well as the boundary of the captured image. Document 46 located within boundary 45 suffers from perspective distortion.
Markers 36, 37, 38 and 39 define the corners or boundaries of document 46.
Based on the locations of these markers 36, 37, 38 and 39, the smallest rectangle that encloses them is defined by rectangle 47. The corrected image will have an area defined by rectangle 47.
[0023) The second step of the perspective correction algorithm is to build 28 a perspective transformation matrix that will map the markers of the captured image to the corresponding corners of the smallest rectangle. This requires the use of geometric transforms such as affine transformations. This technique is known to those skilled in the art and will not be discussed further here.
[0024] The third step of the perspective correction algorithm is to apply 29 the transformation, which will move the markers of the captured image to the corners of the smallest rectangle that encloses the captured image. The last step of the correction algorithm is to cut 30 the rectangular part of the image, the part of the image defined by the smallest rectangle, from the rest of the captured image. This rectangular image is then made the principal image. In reference to Figure 6, the image defined by rectangle 47 is cut away from the image area defined by boundary 45. The image area defined by rectangle 47 becomes the principal image. This reduces the image size thus taking up less space in memory and making transmission of the image, such as to a host, much easier.
[0025] The final step in the process outlined in Figure 4 is to determine 31 if rotation is required 31. This determination is based on the location of the upper left-hand corner marker 36, the orientation reference marker. The location of this marker 36 in any other corner other than the predetermined orientation reference corner marker, the upper left one, for example, indicates that rotation is required. The orientation reference marker is not limited to upper-left hand corner. Other corners can be envisioned while still falling within the scope of the present invention.
[0026] A further embodiment of the present invention incorporates perspective distortion detection that will reduce or may even eliminate the need for perspective distortion correction. Figure 7 outlines the process for this embodiment of the present invention.
The first step of capturing the image including special markers 36, 37, 38 and 39 is similar to the first step of Figure 4. Once all the special markers 36, 37, 38 and 39 are detected, feedback is given to the operator to capture 51 the image.
[0027) The next step of the process is to calculate 52 the distance between the corners of the FOV and the markers i.e. the distance between the upper left hand corner of the FOV
and the upper left hand marker. Once the distances are measured for each of the four corners, the distances are compared 53 with each other. If they are all the same, or within a predetermined tolerance to each other, the image is considered to be distortion free. In this case, the image reader will provide "positive" feedback to the operator such as a LED
indicator or an audible signal. If the distances are not all the same, the image reader will provide "negative" feedback, to indicate to the operator that distortion exists in the captured image and to re-capture the image. This feedback is meant to guide the operator to manually correct the image reader alignment. This can be done through a number of ways such as le$/right and/or top/bottom LED indicators. If the image reader needs to be moved in a particular direction, the appropriate LED will illuminate. Another option is a range of audible tones. As the operator moves the image reader, the tones can indicate if the operator is approaching proper alignment or increasing the amount of distortion.
[0028] The next step of the process is to transfer 54 the image to a host processor such as a personal computer for image processing. Step 54 is optional if the capability is present for the image reader itself to perform any post-processing.
(0029] Whether the image is transferred to a host or the processor is part of the image reader, the processor will re-check 55 the distances between the corners of the FOV and the markers. If it appears that the distances are not the same or are not within a predetermined tolerance, the perspective distortion correction algorithm outlined in figure 4 is applied 57 to the image. If the distances appear to be similar, then the perspective distortion correction algorithm is bypassed. The last step of this process is orientation determination and correction 58. Upon examination of the location of the orientation reference corner marker, the image may require rotation.
[0030] Figure 8 shows the simplified diagram of an image reader of Figure 1, but further includes the algorithms of the present invention. Assuming that the captured image is not transferred to a host and the image reader itself does the post-processing, the processor 6 of Figure 8 shows the algorithms of the present invention. These include the optimal alignment algorithm 65 outlined in Figure 7 and the perspective distortion correction algorithm 66 outlined in Figure 4. If the optimal alignment condition is enabled, algorithm 65 is applied. If it is disabled, the perspective distortion correction algorithm 66 is applied.
[0031 ] From the embodiments described above, the present invention has the advantage of being simpler than the prior art by avoiding complex corner/edge detecting algorithms.
The accuracy is also higher since the corners of the document are identifiable by the special markers, whereas the prior art uses statistical methods to provide an estimate of the document corners.

[0032] A further advantage of the present invention is the detection of perspective distortion, which gives feedback to the operator for correct positioning of the image reader. Perspective distortion correction rnay not be necessary if the operator can be guided into capturing a distortion-free image.
[0033] While the invention has been described according to what is presently considered to be the most practical and preferred embodiments, it must be understood that the invention is not limited to the disclosed embodiments. Those ordinarily skilled in the art will understand that various modifications and equivalent structures and functions may be made without departing from the spirit and scope of the invention as defined in the claims. Therefore, the invention as defined in the claims must be accorded the broadest possible interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims

What is claimed is:

1. A method of producing a principal image of a target, comprising the steps of:
capturing an image of a target by an image reader, the target having a plurality of special boundary markers located on the boundary of the target as part of the target, the plurality of special boundary markers forming a predetermined shape for defining the boundary of the target;
based on the plurality of special boundary markers present on the captured image of the target, calculating a smallest predetermined shape that encloses the images of the plurality of special boundary markers and is proportional to the predetermined shape;
building a geometric transform to map the images of the plurality of special boundary markers to corresponding locations of the smallest predetermined shape; and transforming the captured image of the target to a principal image defined by the smallest predetermined shape, including processing the captured image of the target based on the geometric transform.

2. The method as claimed in claim 1, wherein the plurality of special boundary markers include a unique identifier marker having a shape different from the shapes of the other special boundary markers, and wherein the method comprises:
correcting for orientation errors in the captured image of the target based on the unique identifier marker.

3. The method as claimed in claim 2, wherein the step of correcting comprises:

rotating the captured image of the target.

4. The method as claimed in any one of claims 1 to 3, wherein the predetermined shape is a rectangle and the special boundary markers are corner markers.

5. The method as claimed in any one of claims 1 to 4, wherein at least one of the special boundary marker is a polygon shape.

6. The method as claimed in any one of claims 1 to 5, wherein the geometric transform comprises affine transformations.

7. The method as claimed in any one of claims 1 to 5, wherein the method comprises:
cutting the principal image within the smallest predetermined shape in the transformed captured image.

8. The method as claimed in any one of claims 1 to 7, comprising:
using a document template having the plurality of special boundary markers so that the image of the target is captured with the plurality of special boundary markers.

9. The method as claimed in any one of claims 1 to 8, comprising:
prior to the step of calculating, detecting the plurality of boundary markers by the image reader; and based on the result of the step of detecting, generating a feedback signal for indicating if the images of all of the plurality of special boundary markers are capturable by the image reader.

10. An apparatus for producing a principal image of a target based on a target's image captured by an image reader, the target having a plurality of special boundary markers located on the boundary of the target as part of the target, the plurality of special boundary markers forming a predetermined shape for defining the boundary of the target, comprising:
means for calculating, based on the plurality of special boundary markers present on the captured image of the target, a smallest predetermined shape that encloses the images of the plurality of special boundary markers and is proportional to the predetermined shape;
means for building a geometric transform to map the images of the plurality of special boundary markers to corresponding locations of the smallest predetermined shape; and means for transforming the captured image of the target to a principal image defined by the smallest predetermined shape, including means for processing the captured image of the target based on the geometric transform.

11. The apparatus as claimed in claim 10, wherein the plurality of special boundary markers include a unique identifier marker having a shape different from the shapes of the other special boundary markers, and wherein the apparatus comprises:
means for correcting for orientation errors in the captured image of the target based on the unique identifier marker.

12. The apparatus as claimed in claim 11, wherein the means for correcting comprises:
means for rotating the captured image of the target.

13. The apparatus as claimed in any one of claims 10 to 12, wherein the predetermined shape is a rectangle and the special boundary markers are corner markers.

14. The apparatus as claimed in any one of claims 10 to 13, wherein at lease one of the special boundary markers is a polygon shape.

15. The apparatus as claimed in any one of claims 10 to 14, wherein the geometric transform comprises affine transformations.

16. The apparatus as claimed in any one of claims 10 to 15, wherein the apparatus comprises:
means for cutting the principal image within the smallest predetermined shape in the transformed captured image.

17. The apparatus as claimed in any one of claims 10 to 16, comprising:
using a document template having the plurality of special boundary markers so that the image of the target is captured together with the plurality of special boundary markers.

18. The apparatus as claimed in any one of claims 10 to 17, comprising:
means for detecting the plurality of boundary markers, prior to the calculation, and means for generating a feedback signal for indicating if the images of all of the plurality of special boundary markers are capturable by the image reader, based on the result of the detection.