US20090016606A1 - Method, system, digital camera and asic for geometric image transformation based on text line searching - Google Patents


Info

Publication number
US20090016606A1
Authority
US
United States
Prior art keywords
text
connected pixels
image
lines
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US11/915,948
Other versions
US9036912B2 (en
Inventor
Hans Christian Meyer
Mats Stefan Carlin
Knut Tharald Fosseide
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lumex AS
Original Assignee
Lumex AS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to NO20052656
Priority to NO20052656A (NO20052656D0)
Application filed by Lumex AS
Priority to PCT/NO2006/000189 (WO2006130012A1)
Assigned to LUMEX AS. Assignment of assignors interest (see document for details). Assignors: CARLIN, MATS STEFAN; FOSSEIDE, KNUT THARALD; MEYER, HANS CHRISTIAN
Publication of US20090016606A1
Publication of US9036912B2
Application granted
Application status is Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/006 Geometric correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/18 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints using printed characters having additional code marks or containing code marks, e.g. the character being composed of individual strokes of different shape, each representing a different code value
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20 Image acquisition
    • G06K9/32 Aligning or centering of the image pick-up or image-field
    • G06K9/3275 Inclination (skew) detection or correction of characters or of image to be recognised
    • G06K9/3283 Inclination (skew) detection or correction of characters or of image to be recognised of characters or characters lines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/42 Normalisation of the pattern dimensions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/0031 Geometric image transformation in the plane of the image for topological mapping of a higher dimensional structure on a lower dimensional surface
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K2209/00 Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K2209/01 Character recognition

Abstract

The present invention provides a method, system and/or a digital camera providing a geometrical transformation of deformed images of documents comprising text, by text line tracking, resulting in an image comprising parallel text lines. The transformed image is provided as an input to an OCR program either running in a computer system or in a processing element comprised in said digital camera.

Description

    FIELD OF INVENTION
  • The present invention is related to Optical Character Recognition (OCR) systems, and especially to a method, system or digital camera comprising said method for geometric image transformation of deformed images of text based on text line searching.
  • BACKGROUND
  • In prior art the flatbed scanner has become standard equipment in almost every office, providing scanned input of typed text, book pages and different types of documents, such as handwritten applications or partially handwritten schemes, to computers for further word processing, electronic storage, electronic distribution etc. However, whenever a document or page is not properly aligned on the flatbed scanner, or the thickness of a book renders the page adjacent to the back of the book curved above the flatbed scanner, the scanned images transferred to the computer provide a deformed image of the text that is difficult to recognize in an OCR program as known in prior art.
  • In recent years, digital cameras have become an alternative to flatbed scanners due to the flexibility when using the camera. However, the problem with deformed text images for OCR processing in digital cameras is further enhanced since the misalignment of a camera image may occur in three dimensions (perspective distortion), even for pictures of flat pages. Lens faults like lens aberration and distortion may also influence the OCR efficiency.
  • A geometrical transformation of the deformed document image providing corrected images suitable for the OCR processing may solve the problem. U.S. Pat. No. 6,304,313 discloses a digital camera with an OCR function based on dividing a document page into blocks, where each block is photographed before each block is processed by the OCR function. When all the blocks have been processed by the OCR function, the recognized blocks with text corresponding to the plurality of images are combined together to form one text data set corresponding to the whole document. However, the geometrical transformation according to this disclosure is merely to divide the page into such small blocks that the deformation in each small block is negligible. Therefore, this solution may require extensive processing to accomplish the task when the deformation exceeds a specific level. Further, the division of text may render the text in each block unrecognizable because the blocks become too small to contain recognizable text.
  • The US patent application US 2003/0026482 from Feb. 6, 2003 discloses a method for correcting perspective distortion in a digital document image, for example from a digital camera, wherein a mathematical model of how parallel lines pass through a single point when viewed under some perspective view is used to identify the perspective of the image. According to a preferred embodiment of this invention, horizontal and vertical border lines of an image comprising text are used to identify the perspective of the image. Based on this mathematical model of the distortion due to the perspective, text lines are corrected. As easily understood, this perspective based method does not cope with the other types of distortion readily encountered when for example a page in a book is photographed, and then passed through an OCR function. Besides perspective distortions, structural distortions due for example to bending or curving of book pages add significantly to the problem of correcting such images from cameras. It is also clear from practical experience that when using a camera for capturing images of text, the camera usually is oriented straight ahead above the page to be photographed. Therefore, the perspective distortion will usually contribute less to the total distortions encountered in the image compared to for example structural distortions of the object (text page, bending of book pages, curving pages etc.) itself.
  • The paper “Correcting Document Warping based on regression of curved text lines” by Zhang and Tan, International Conference on Document Analysis and Recognition, ICDAR-2003, discloses a method based on models of the text line deformations as quadratic polynomial curves instead of using a more common cylinder model for the book deformation near the back of the book as described above. The lines are tracked using a connected element clustering algorithm within bounding boxes defined by the orientation of an already identified segment of the text lines.
  • The paper “Document image de-warping for text/graphics recognition” by Wu and Agam, International Symposium on Statistical Pattern Recognition, SSPR-2002, discloses a method based on lines that are tracked using a local adaptive cumulative projection at different angles. The tracked lines may cross each other due to the local nature of the algorithm when two starting points result in two different search directions. A second step of removing lines that are crossing based on the average orientation of the lines is included, limiting the method to images with fairly regular lines and a small perspective distortion. A rectangular mesh is fitted to the remaining lines for dewarping.
  • The paper “Rectifying the bound document image captured by a camera: A model based approach” by Cao et al., International Conference on Document Analysis and Recognition, ICDAR-2003, discloses a method based on applying a cylinder model to the book deformation near the back of the book and a perspective model to compensate for the depth difference. A best match between the cylinder model and a set of threshold skeletons of the lines is used to rectify the images.
  • All of the above referenced papers disclose methods having clear limitations with respect to the type of geometric deformations that can be dewarped by these methods. The cylinder model and the quadratic polynomials will only fit the type of geometric deformations that are found in books with a stiff book cover. The average orientation filtering requires that the text lines are fairly regular, which is found in the case of open books, and which also limits the methods to only small perspective deformations.
  • Therefore there is a need for a method and system providing better geometrical transformation of distorted images comprising text before processing images with an OCR function to achieve more reliable and more complete text recognition of documents in a computer system or digital camera system.
  • SUMMARY
  • According to an aspect of the present invention, text-like information in an image may be identified and evaluated on the basis of connected pixels that probably comprise text, and on the basis of the direction of connected pixels constituting text, a text line direction may be identified without any introduction of an a priori assumption or model of present document deformations. Based on the property that most text lines are parallel and relatively homogenous in size in actual documents, text lines may be geometrically transformed to provide aligned and parallel text lines that are much more easily handled by OCR programs, thereby providing more reliable and more complete recognition of images comprising text by said OCR program.
  • According to an example of embodiment of the present invention an image is reviewed to identify text like structures, and to make an assessment if the total text like structure comprised in the image is enough to extract text lines as basis for a geometrical transformation of the whole document, wherein potentially connected pixels that may form characters are identified and traced to form text lines providing points for a definition of transformation points on said text lines, wherein said transformation points are used for a geometrical transformation of said text lines or parts of said text lines, providing images comprising parallel and homogenous text lines as input for an OCR program.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 depicts an example of connected pixels forming a probable word.
  • FIG. 2 depicts examples of connected pixels forming probable characters that are forming words.
  • FIG. 3 illustrates an image of a document page comprising slight deformation.
  • FIG. 4 illustrates text line tracking according to an example of embodiment of present invention.
  • FIG. 5 illustrates an example of meshes for forward and inverse geometric morphing transformation according to an example of embodiment of present invention.
  • FIG. 6 depicts a flow diagram of an example of embodiment of present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • According to an aspect of the present invention, instead of establishing an a priori mathematical model of distortions in an image comprising text, the main issue is that pixels representing characters, character fragments (which may be natural due to typographical aspects of the text or artificial due to deformations), words or parts of words constitute connected pixels, wherein connected pixels related to characters are spaced apart by a distance defined by the typeface used (Times New Roman etc.), and wherein groups of connected pixels forming words are spaced by another distance defining the distance between words on a text line, wherein it is possible to search an image to identify even deformed text lines based on said searching using for example said distances.
  • According to an aspect of the present invention, the objective of the present invention may be solved in a process comprising three basic steps:
    • a) performing an initial check to evaluate if there is enough text like structures in the images to perform a transformation according to the present invention,
    • b) identifying connected pixels probably forming characters, words and searching said probable characters, words to identify transformation points,
    • c) based on said transformation points, transforming the image or a part of an image comprising text to an image where the text lines are straight and parallel, and homogenous in size.
  • According to an aspect of the present invention, a reference system providing a coordinate system for locating graphical elements, objects, characters etc. in an image may be defined by the plane provided by the flatbed scanner platen or the surface of the image capturing device, such as a digital camera, for example. However, any definition of a coordinate system may be used according to the present invention.
  • Any pixel in an image is therefore referable by an ordered set of coordinate values. Pixels related to an image of characters constituting text provide several attributes that may be used in OCR functions as known to a person skilled in the art. For example, the shape of characters may be identified providing means for recognizing characters, and then whole words, for example. Whenever there is a deformation of the image, the recognition is difficult as described above.
  • According to an example of embodiment of the present invention, the geometrical image transformation may be executed whenever there is enough information in the image to provide a transformation grid. According to the present example, an initial text check is performed analysing connected pixels to verify that they are consistent in size, shape and relative position with text. For example, if an image has insufficient resolution (character height below five to ten pixels, for example) or insufficient text line structure (a single line or sparse words cannot be used to define a transformation grid), the image is rejected. In some instances, the text lines may extend outside the edges of the image.
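  • The initial text check described above can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the function name `enough_text`, the use of the median height, and both default thresholds are assumptions made for the example (the text only mentions a character height below five to ten pixels as insufficient).

```python
from statistics import median


def enough_text(heights, min_height=8, min_components=20):
    """Decide whether an image carries enough text-like structure.

    heights: bounding-box heights (in pixels) of the connected pixel sets
    that look like characters. Both thresholds are hypothetical defaults.
    """
    if len(heights) < min_components:
        # A single line or a few sparse words cannot define a grid.
        return False, "too few text-like components"
    if median(heights) < min_height:
        # Characters are too small to be tracked reliably.
        return False, "resolution too low"
    return True, "ok"
```

  • The returned reason string could drive the feedback to the user, for example the message on a computer display mentioned in the description.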
  • Whenever such situations occur, they may be signaled back to users providing a possibility to correct the reason why the image does not provide sufficient information for the geometrical transformation according to the present invention. The signaling may be displaying a message to the user on a computer display. According to an example of embodiment of the present invention comprising a digital camera, the initial check provides a feedback signal as a green light whenever the initial analysis concludes with sufficient information. When the green light is not present the user may perform adjustments, for example changing camera position relative to the document the user at present is investigating, zooming into the image or moving the camera closer to the paper, book etc., or just by turning the viewing angle of the camera. When the green light comes on, the image may be captured to be processed by an OCR program running in the camera itself, or in an attached computer receiving the image from the camera.
  • According to the present example of embodiment, connected pixels are also measured to provide a ratio between area/height/width which must be higher than a lower threshold value, and below a higher threshold value. If said ratio is below said lower threshold, the pixels are regarded as noise or artifacts. If said ratio is above said higher threshold value, the pixels are regarded as being non-text elements or artifacts. Any set of connected pixels not falling between said lower and higher threshold values is rejected from being part of the geometric transformation. According to an example of embodiment of the present invention, a table is created identifying locations of such rejected sets of connected pixels.
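  • The threshold test above can be illustrated with a short sketch. The description does not define the ratio precisely, so this example assumes it is the fill ratio area / (height × width) of the bounding box; the class `ConnectedSet`, the function names and the threshold values 0.15 and 0.9 are likewise hypothetical.

```python
from typing import NamedTuple


class ConnectedSet(NamedTuple):
    x: int        # left edge of the bounding box
    y: int        # top edge of the bounding box
    width: int    # bounding-box width in pixels
    height: int   # bounding-box height in pixels
    area: int     # number of foreground pixels in the set


def fill_ratio(c):
    # Assumed interpretation of the "area/height/width" ratio.
    return c.area / (c.width * c.height)


def split_by_ratio(sets, low=0.15, high=0.9):
    """Return (kept sets, table of locations of rejected sets)."""
    kept, rejected_locations = [], []
    for c in sets:
        if low < fill_ratio(c) < high:
            kept.append(c)
        else:
            rejected_locations.append((c.x, c.y))
    return kept, rejected_locations
```

  • Under this assumption a thin scratch spanning a large box falls below the lower threshold, while a solid block such as a picture region exceeds the higher one.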
  • FIG. 1 illustrates an example of an image comprising an inclined word “city”. Pixels constituting a character are a connected set of pixels as outlined above, wherein the connectivity is provided for example in a four-way direction as a north-south-west-east connected set of pixels (or up, down, left, right), or in an eight-way manner that in addition to the north-south-west-east directions also comprises the diagonals. The denominations of directions are merely a convenient way of assigning different directions rather than being actual directions. However, directions related to a reference system based on the flatbed scanner platen, for example, provide an orientation as north, south etc. or up, down etc. that is easily comprehended and easy to implement as a computer program routine. Based on these rules for connectivity, an analysing routine may be programmed, as known to a person skilled in the art, to provide an identification of connected pixels. In FIG. 1, the letter ‘c’ is a connected set of pixels using any rule of connectivity. The letter ‘i’ comprises two sets of connected pixels, the stem and the dot, using any of the connectivity rules. The letters ‘t’ and ‘y’ each comprise two sets of connected pixels when using the four-way rule, while they are one set of connected pixels when applying the eight-way rule. In FIG. 2 another example of connected pixels forming words is illustrated. FIG. 2 also illustrates the problem sometimes encountered in OCR when deformations change the size of characters. According to an aspect of the present invention such changed character heights may be rejected by said test of said ratio of area/height/width as described above. FIG. 3 depicts an example of an image of text with a background making it difficult to distinguish pixels from the background. Whenever such situations arise, it may be signaled back to users, providing adjustments for example of scanner parameters etc. that may improve the quality of the images to be processed.
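  • The difference between the four-way and eight-way connectivity rules can be made concrete with a small labelling routine. This is a generic breadth-first sketch, not code from the patent; the names `N4`, `N8` and `connected_sets` are chosen for the example, and the bitmap is a list of rows with 1 for a foreground pixel.

```python
from collections import deque

# Neighbour offsets as (row, column): north, south, west, east.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
# Eight-way rule: the four directions above plus the diagonals.
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def connected_sets(bitmap, neighbours):
    """Group foreground pixels into connected sets under a neighbour rule."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen = [[False] * cols for _ in range(rows)]
    sets = []
    for r in range(rows):
        for c in range(cols):
            if not bitmap[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for dr, dc in neighbours:
                    nr, nc = pr + dr, pc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and bitmap[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            sets.append(pixels)
    return sets


# Two strokes that touch only diagonally.
bitmap = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
four_way = connected_sets(bitmap, N4)   # the strokes stay separate
eight_way = connected_sets(bitmap, N8)  # the diagonal contact joins them
```

  • In this toy bitmap the four-way rule yields two sets and the eight-way rule yields one, which mirrors the behaviour described for the letters ‘t’ and ‘y’.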
  • When the initial analysis concludes that an image provides sufficient information to be used in a geometrical transformation, according to the present invention, the image under investigation is searched to identify possible text lines. According to an example of embodiment the image is first searched to measure distances between connected pixels. The distance is measured as a count of pixels or space between the connected pixels. The search is performed in a plurality of directions. In the present example of embodiment, the measured distances are assembled in a histogram where the x-axis represents distance measured in pixels and the y-axis is the count of each measured distance. Since any document comprising text provides a first distinct distance between characters, and another second distinct distance between words, the histogram provides two distinct columns representing the count of each of said distinct distances. In this manner identified connected pixels may be interlinked or clustered in any direction searched in the image to identify any identifiable collection or clustering of connected pixels as being probable words based on said distinct distances. In an example of embodiment of the present invention, locations of connected pixels being probable words are listed in a table, one entry for each probable word. The locations listed in said table may be compared with the list of rejected pixels identified in the initial analysis as described above. Any pixels listed as rejected are removed from said table. In this manner the relationship between locations of characters listed in an entry in said table representing a probable word provides a direction of the text line at the location on the text line where the present probable word is located, wherein said direction is relative to the coordinate system used. As known to a person skilled in the art, the locations listed in said table may be relative coordinates.
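  • The distance histogram and the clustering into probable words can be sketched in one dimension. This is a simplification under stated assumptions: the components are taken to lie on a single horizontal line and are given as hypothetical (x_start, x_end) extents, whereas the description searches in a plurality of directions. The rule for separating the two distinct distances (the widest jump between observed gap values) is also an assumption, since the text does not specify one.

```python
from collections import Counter


def gap_histogram(boxes):
    """Gaps between neighbouring components and their histogram."""
    boxes = sorted(boxes)
    gaps = [b[0] - a[1] for a, b in zip(boxes, boxes[1:])]
    return gaps, Counter(gaps)


def split_threshold(gaps):
    """Midpoint of the widest jump between distinct gap values (assumed rule)."""
    values = sorted(set(gaps))
    if len(values) < 2:
        return None  # only one distinct distance was observed
    lo, hi = max(zip(values, values[1:]), key=lambda p: p[1] - p[0])
    return (lo + hi) / 2


def cluster_words(boxes):
    """Cluster components into probable words using the two gap classes."""
    boxes = sorted(boxes)
    if not boxes:
        return []
    gaps, _ = gap_histogram(boxes)
    threshold = split_threshold(gaps)
    words = [[boxes[0]]]
    for gap, box in zip(gaps, boxes[1:]):
        if threshold is not None and gap > threshold:
            words.append([box])      # word spacing: start a new word
        else:
            words[-1].append(box)    # character spacing: same word
    return words


boxes = [(0, 5), (7, 12), (14, 19), (27, 32), (34, 39), (47, 52)]
gaps, histogram = gap_histogram(boxes)
words = cluster_words(boxes)
```

  • Here the histogram has one column at a gap of 2 pixels (character spacing) and one at 8 pixels (word spacing), and the six components fall into three probable words.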
  • In yet another example of embodiment of the present invention, other parameters in addition to distance characterising characters and words are used. For example, size of connected pixels should also reflect that same characters in the same font set actually have similar size. In this manner size and/or directions (i.e. height-width relationship of characters) may be used to form homogeneity criteria providing a tool to further increase the probable detection of words. In an example of embodiment of the present invention, any known method of geometric sorting may be used to cluster connected pixels, for example range searching algorithms. In yet another example of embodiment, the relationship between height-width of clustered connected pixels is used to identify the local text line direction for the probable word.
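  • One possible way to derive a local text line direction from a probable word is to fit the principal axis through the centres of its connected pixel sets. The description does not prescribe this particular technique, so the sketch below is an assumption; `local_direction` is a hypothetical name, and the angle is measured in the image coordinate system, where y typically grows downwards.

```python
import math


def local_direction(centres):
    """Angle (radians) of the principal axis through (x, y) centres."""
    n = len(centres)
    mean_x = sum(x for x, _ in centres) / n
    mean_y = sum(y for _, y in centres) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in centres)
    syy = sum((y - mean_y) ** 2 for _, y in centres)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in centres)
    # Orientation of the dominant eigenvector of the 2x2 scatter matrix.
    return 0.5 * math.atan2(2 * sxy, sxx - syy)


# Character centres of a word that drifts half a pixel per pixel.
angle = math.degrees(local_direction([(0, 0), (10, 5), (20, 10), (30, 15)]))
```

  • Longer words give more centres and therefore a more stable angle, which is consistent with the later remark that probable long words provide a more probable identification of the text line direction.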
  • According to an aspect of the present invention, a word on a text line is probably followed by a next probable word spaced with said second distinct distance. By searching the image in a plurality of directions, following words apart from a selected word with said second distinct distance, an example of embodiment of the present invention provides a listing of candidates that may be linked together to identify a text line.
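  • The linking of words into a candidate text line can be sketched as a greedy search: from the right end of the current word, step the word spacing along the word's own direction and accept the nearest word start within a search radius. The `Word` type, the function names and the `radius` parameter are hypothetical, and the greedy strategy is only one of many ways the candidate listing described above might be realised.

```python
import math
from typing import NamedTuple


class Word(NamedTuple):
    left: tuple   # (x, y) of the left end of the probable word
    right: tuple  # (x, y) of the right end of the probable word


def direction(word):
    """Local text line direction of a word, in radians."""
    return math.atan2(word.right[1] - word.left[1],
                      word.right[0] - word.left[0])


def follow_line(start, words, word_gap, radius):
    """Chain words to the right of `start` into one candidate text line."""
    line = [start]
    remaining = [w for w in words if w is not start]
    while remaining:
        current = line[-1]
        angle = direction(current)
        # Predicted start of the next word on the same text line.
        predicted = (current.right[0] + word_gap * math.cos(angle),
                     current.right[1] + word_gap * math.sin(angle))
        best = min(remaining, key=lambda w: math.dist(w.left, predicted))
        if math.dist(best.left, predicted) > radius:
            break  # no candidate close enough: the line ends here
        line.append(best)
        remaining.remove(best)
    return line


# Two slightly curved text lines, 30 pixels apart.
a = Word((0, 0), (40, 2))
b = Word((48, 3), (90, 7))
c = Word((98, 8), (140, 14))
d = Word((0, 30), (40, 32))
e = Word((48, 33), (90, 37))
first_line = follow_line(a, [a, b, c, d, e], word_gap=8, radius=5)
second_line = follow_line(d, [a, b, c, d, e], word_gap=8, radius=5)
```

  • Because each step uses the direction of the latest word, the chain can follow a curved line without any global model of the deformation.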
  • In yet another embodiment of the present invention, other parameters in addition to distance characterising words are used to identify said candidates. For example, the height-width relationship between clustered connected pixels may be sorted to identify probable long words providing a more probable identification of direction of said text lines.
  • According to an example of embodiment of the present invention, candidates linked to probable text lines are sorted and grouped to form text blocks based on their mutual distance and position in the image. According to the present example of embodiment, the consistency of the identified text lines is then investigated, for example by investigating if any text lines are crossing each other, or if they are discontinuous.
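  • The crossing test mentioned above can be sketched by treating each candidate text line as a polyline and checking its segments pairwise with the standard orientation test. The representation of a text line as a list of (x, y) points and the function names are assumptions for this example; only proper crossings are reported, so lines that merely touch at an end point are not flagged.

```python
def _orient(p, q, r):
    """Signed area test: positive, negative or zero for the turn p -> q -> r."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 properly crosses segment p3-p4."""
    d1 = _orient(p3, p4, p1)
    d2 = _orient(p3, p4, p2)
    d3 = _orient(p1, p2, p3)
    d4 = _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def lines_cross(line_a, line_b):
    """True if any segment of one polyline crosses a segment of the other."""
    return any(
        segments_cross(line_a[i], line_a[i + 1], line_b[j], line_b[j + 1])
        for i in range(len(line_a) - 1)
        for j in range(len(line_b) - 1)
    )


upper = [(0, 0), (50, 5), (100, 20)]
lower = [(0, 10), (50, 15), (100, 30)]   # same curve, shifted down
stray = [(0, 10), (100, 5)]              # cuts through the upper line
```

  • A pair of candidates for which `lines_cross` is true would be inconsistent with the assumption that text lines in a document do not intersect.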
  • FIG. 4 illustrates text line tracking according to an example of embodiment of present invention. Based on the requirement that a font set provides similar height for connected pixels constituting characters, staff-lines interconnecting characters on a text line may be introduced. Intersections between staff-lines and connected pixels may provide points in the image that may be used for the geometrical transformation. Such selected points are marked with crosses in FIG. 4.
  • FIG. 5 illustrates an example of meshes for forward and inverse geometric morphing transformation according to an example of embodiment of present invention. The morphing may be performed as known to a person skilled in the art. According to an aspect of the present invention, any type of transformation may be used.
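  • As the description leaves the type of transformation open, the following is only one illustrative choice: an inverse mapping of a single mesh cell using bilinear interpolation between four transformation points, with nearest-pixel sampling. The names `bilinear_point` and `dewarp_cell`, the corner ordering and the toy image are assumptions for the example.

```python
def bilinear_point(quad, u, v):
    """Map (u, v) in the unit square into a source quadrilateral.

    quad: corner points (top-left, top-right, bottom-left, bottom-right).
    """
    tl, tr, bl, br = quad
    top = (tl[0] + u * (tr[0] - tl[0]), tl[1] + u * (tr[1] - tl[1]))
    bottom = (bl[0] + u * (br[0] - bl[0]), bl[1] + u * (br[1] - bl[1]))
    return (top[0] + v * (bottom[0] - top[0]),
            top[1] + v * (bottom[1] - top[1]))


def dewarp_cell(image, quad, out_w, out_h, background=0):
    """Resample one deformed mesh cell into an upright out_w x out_h cell."""
    rows, cols = len(image), len(image[0])
    out = [[background] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        for ox in range(out_w):
            u = ox / (out_w - 1) if out_w > 1 else 0.0
            v = oy / (out_h - 1) if out_h > 1 else 0.0
            sx, sy = bilinear_point(quad, u, v)
            ix, iy = round(sx), round(sy)   # nearest source pixel
            if 0 <= iy < rows and 0 <= ix < cols:
                out[oy][ox] = image[iy][ix]
    return out


# A text line that drops by one pixel across the width of the cell.
image = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
quad = ((0, 0), (3, 1), (0, 1), (3, 2))
straight = dewarp_cell(image, quad, out_w=4, out_h=2)
```

  • Iterating over output pixels and looking up the source (the inverse direction) avoids holes in the transformed image; a full implementation would apply this per cell of the mesh and might use a smoother interpolation.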
  • FIG. 6 depicts a flow diagram illustrating a preferred example of embodiment of the present invention in a computer program running in a computer system or in a processor element comprised in a digital camera. An image 10 is communicated to a quality measuring module 14. The purpose of the quality measuring module is to provide an initial analysis of the image to assess if there is sufficient information in the image 10 to perform the geometrical transformation according to the present invention, as described above. Parameters are stored in a storage location 15 providing a possibility to adjust parameters to improve the quality of the image 10, if necessary. The threshold module 11 receives parameters from the storage location 15 providing a bitmap 12 of the image 10 with reduced noise, as known to a person skilled in the art. The bitmap 12 is communicated to the module 13 providing an identification of connected pixels, as described above. Identified connected pixels are analyzed and sorted in module 17. Module 19 identifies single connected pixels as words while module 16 clusters connected pixels into probable words as described above. Outputs from modules 16 and 19 are communicated to the module 18 providing a consistency check of words. Results from the consistency check are communicated back to the quality measuring module 14 resulting in a possible adjustment of parameters stored in the storage location 15. Words are communicated from module 18 to module 21 providing linking of words to candidate text lines as described above. A consistency check may be performed in module 20 which communicates text line candidates to document analysis module 23. On the basis of the image 10, document analysis module 23 provides for example staff-lines as described above in the image 10. An example of alternative embodiment provides a module 24 comprising a model description of text, for example based on knowledge about fonts that are used in the present image. The document analysis 23 provides text blocks 26 that are combined with document layout 22, where module 25 extracts the geometry in the image 10 that is used to identify transformation points 27, as described above. The actual transformation is performed in module 28 outputting the transformed image 29, which is communicated to an OCR program as known to a person skilled in the art.
  • According to the present invention, deformed text lines may be corrected to provide straight and parallel text lines providing more reliable recognition of text in OCR programs with no a priori knowledge of images and no a priori geometrical modeling of deformations. A computer program executing the steps of a method according to the present invention may be incorporated in a standard OCR program running in a computer system or programmable device receiving images of documents from an attached scanner, or from a digital camera transferring said images to said computer system, for example via wire transfer, or via wireless communication such as Bluetooth, for example. According to another embodiment of the present invention, said method may be executed in a program or programmable device running in a processor element comprised in said camera. According to yet another embodiment of the present invention, said method may be implemented as an ASIC (Application Specific Integrated Circuit), as known to a person skilled in the art, in a digital camera or in any other type of equipment. Said digital camera may be implemented in a mobile telephone, or any other type of mobile wireless user equipment.

Claims (47)

1. Method for geometrical transformation of a deformed image comprising text by searching text-lines in the image, comprising the steps of:
a) performing an initial analysis to evaluate if there is enough text-like structures in the image to perform the transformation,
b) identifying connected pixels probably forming characters, words and searching the probable characters, words to identify a direction of each probable character, word reflecting the direction of text-lines, staff-lines or similar elements constituting a direction of text-lines at each of the respective positions in the image that comprise each of the identified connected pixels forming characters, words,
c) linking together identified directions of adjacent identified connected pixels thereby identifying text lines, staff-lines or similar elements constituting a direction of text lines across the complete or part of the complete image area, identifying transformation points related to the linked direction of text lines across the image area,
d) based on the transformation points, transforming the image or a part of the image comprising the text to an image where the text lines are straight and parallel.
2. Method according to claim 1, wherein the step a) further comprises an analysis of connected pixels to verify that the connected pixels are consistent in size, shape and relative position with text.
3. Method according to claim 1, wherein the step a) further comprises comparing heights of connected pixels with a predefined threshold level identifying a minimum pixel resolution of the image.
4. Method according to claim 1, wherein the step a) further comprises measuring connected pixels providing a ratio between area/height/width of each of the connected pixels, and providing a comparison of the measured ratio with a predefined lower threshold level and a higher threshold level, rejecting connected pixels having the ratio outside the range defined by the lower and higher threshold levels.
5. Method according to claim 1, wherein the step b) further comprises measuring distances between connected pixels providing a first distinct distance related to spacing between characters, and a second distinct distance related to spacing between words.
6. Method according to claim 5, further comprising clustering connected pixels forming probable words based on the first and second distinct distances.
7. Method according to claim 6, further comprising clustering the connected pixels by geometric sorting.
8. Method according to claim 5, further comprising identifying a direction of the text-lines based on the clustering of connected pixels reflecting the direction of a local text-line on a location on the text-line wherein the clustering of the connected pixels is located.
9. Method according to claim 8, further comprising sorting the local text line directions into text lines based on the second distinct distance spacing the words.
10. Method according to claim 9, further comprising identifying long words by sorting height-width relationship between clustered connected pixels, and using the long words as basis for sorting the local text line directions.
11. System implemented in a programmable computer system or device providing geometrical transformation of a deformed image comprising text, by searching text-lines in the image, comprising:
e) a program module performing an initial analysis to evaluate if there is enough text-like structures in the image to perform the transformation,
f) a program module identifying connected pixels probably forming characters, words and searching the probable characters, words to identify a direction of each probable character, word reflecting the direction of text-lines, staff-lines or similar elements constituting a direction of text-lines at each of the respective positions in the image that comprise each of the identified connected pixels forming characters, words,
g) a program module linking together identified directions of adjacent identified connected pixels thereby identifying text lines, staff-lines or similar elements constituting a direction of text lines across the complete or part of the complete image area, identifying transformation points related to the linked direction of text lines across the image area,
h) a program module that based on the transformation points is transforming the image or a part of the image comprising the text to an image where the text lines are straight and parallel.
12. System according to claim 11, wherein the module e) further comprises an analysis of connected pixels to verify that the connected pixels are consistent in size, shape and relative position with text.
13. System according to claim 11, wherein the module e) further comprises comparing heights of connected pixels with a predefined threshold level identifying a minimum pixel resolution of the image.
14. System according to claim 11, wherein the module e) further comprises measuring connected pixels providing a ratio between area/height/width of each of the connected pixels, and providing a comparison of the measured ratio with a predefined lower threshold level and a higher threshold level, rejecting connected pixels having the ratio outside the range defined by the lower and higher threshold levels.
15. System according to claim 11, wherein the module f) further comprises measuring distances between connected pixels providing a first distinct distance related to spacing between characters, and a second distinct distance related to spacing between words.
16. System according to claim 15, wherein the module further comprises clustering connected pixels forming probable words based on the first and second distinct distances.
17. System according to claim 16, wherein the module further comprises clustering the connected pixels by geometric sorting.
18. System according to claim 15, further comprising identifying a local text line direction based on the clustering of connected pixels reflecting the direction of the text line on a location on the text line wherein the clustering of the connected pixels is located.
19. System according to claim 18, wherein the module further comprises sorting the local text line directions into text lines based on the second distinct distance spacing the words.
20. System according to claim 19, wherein the module further comprises identifying long words by sorting height-width relationships between clustered connected pixels, and using the long words as basis for sorting the local text line directions.
21. System according to claim 11, wherein the module e) further comprises signaling users if the initial analysis concludes that the image comprises insufficient information.
22. Digital Camera comprising programmable device executing a program for geometrical transformation of deformed image comprising text, by searching text-lines in the image, comprising:
i) a program module identifying connected pixels probably forming characters, words and searching the probable characters, words to identify a direction of each probable character, word reflecting the direction of text-lines, staff-lines or similar elements constituting a direction of text-lines at each of the respective positions in the image that comprise each of the identified connected pixels forming characters, words,
j) a program module linking together identified directions of adjacent identified connected pixels thereby identifying text lines, staff-lines or similar elements constituting a direction of text lines across the complete or part of the complete image area, identifying transformation points related to the linked direction of text lines across the image area,
k) a program module that based on the transformation points is transforming the image or a part of the image comprising the text to an image where the text lines are straight and parallel.
23. Digital Camera according to claim 22, wherein the module i) further comprises an analysis of connected pixels to verify that the connected pixels are consistent in size, shape and relative position with text.
24. Digital Camera according to claim 22, wherein the module i) further comprises comparing heights of connected pixels with a predefined threshold level identifying a minimum pixel resolution of the image.
25. Digital Camera according to claim 22, wherein the module i) further comprises measuring connected pixels providing a ratio between area/height/width of each of the connected pixels, and providing a comparison of the measured ratio with a predefined lower threshold level and a higher threshold level, rejecting connected pixels having the ratio outside the range defined by the lower and higher threshold levels.
26. Digital Camera according to claim 22, wherein the module j) further comprises measuring distances between connected pixels providing a first distinct distance related to spacing between characters, and a second distinct distance related to spacing between words.
27. Digital Camera according to claim 26, wherein the module further comprises clustering connected pixels forming probable words based on the first and second distinct distances.
28. Digital Camera according to claim 27, wherein the module further comprises clustering the connected pixels by geometric sorting.
29. Digital Camera according to claim 26, further comprising identifying a local text line direction based on the clustering of connected pixels reflecting the direction of the text line on a location on the text line wherein the clustering of the connected pixels is located.
30. Digital Camera according to claim 29, wherein the module further comprises sorting the local text line directions into text lines based on the second distinct distance spacing the words.
31. Digital Camera according to claim 30, wherein the module further comprises identifying long words by sorting height-width relationships between clustered connected pixels, and using the long words as basis for sorting the local text line directions.
32. Digital Camera according to claim 11, wherein the module i) further comprises signaling users if the initial analysis concludes that the image comprises insufficient information.
33. ASIC (Application Specific Integrated Circuit) comprising electronic circuitry for geometrical transformation of deformed image comprising text, by searching text-lines in the image, comprising:
l) electronic circuitry identifying connected pixels probably forming characters, words and searching the probable characters, words to identify a direction of each probable character, word reflecting the direction of text-lines, staff-lines or similar elements constituting a direction of text-lines at each of the respective positions in the image that comprise each of the identified connected pixels forming characters, words,
m) electronic circuitry linking together identified directions of adjacent identified connected pixels thereby identifying text lines, staff-lines or similar elements constituting a direction of text lines across the complete or part of the complete image area, identifying transformation points related to the linked direction of text lines across the image area,
n) electronic circuitry that based on the transformation points is transforming the image or a part of the image comprising the text to an image where the text lines are straight and parallel.
34. ASIC according to claim 33, wherein the electronic circuitry l) further comprises an analysis of connected pixels to verify that the connected pixels are consistent in size, shape and relative position with text.
35. ASIC according to claim 33, wherein the electronic circuitry l) further comprises comparing heights of connected pixels with a predefined threshold level identifying a minimum pixel resolution of the image.
36. ASIC according to claim 33, wherein the electronic circuitry l) further comprises measuring connected pixels providing a ratio between area/height/width of each of the connected pixels, and providing a comparison of the measured ratio with a predefined lower threshold level and a higher threshold level, rejecting connected pixels having the ratio outside the range defined by the lower and higher threshold levels.
37. ASIC according to claim 33, wherein the electronic circuitry m) further comprises measuring distances between connected pixels providing a first distinct distance related to spacing between characters, and a second distinct distance related to spacing between words.
38. ASIC according to claim 37, wherein the electronic circuitry further comprises clustering connected pixels forming probable words based on the first and second distinct distances.
39. ASIC according to claim 38, wherein the electronic circuitry further comprises clustering the connected pixels by geometric sorting.
40. ASIC according to claim 37, wherein the electronic circuitry further comprises identifying a local text line direction based on the clustering of connected pixels reflecting the direction of the text line on a location on the text line wherein the clustering of the connected pixels is located.
41. ASIC according to claim 40, wherein the electronic circuitry further comprises sorting the local text line directions into text lines based on the second distinct distance spacing the words.
42. ASIC according to claim 41, wherein the electronic circuitry further comprises identifying long words by sorting height-width relationships between clustered connected pixels, and using the long words as basis for sorting the local text line directions.
43. ASIC according to claim 33, wherein the electronic circuitry l) further comprises signaling users if the initial analysis concludes that the image comprises insufficient information.
44. Mobile wireless user equipment comprising a Digital Camera according to claim 22.
45. Optical Character Recognition System comprising a method according to claim 1.
46. Optical Character Recognition System comprising a system according to claim 11.
47. Optical Character Recognition System comprising an ASIC according to claim 33.
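
The size and ratio checks of claims 3 and 4, the two spacing distances of claims 5 and 6, and the local direction of claim 8 can be illustrated with a minimal Python sketch. It is not the patented implementation. The Box type, the function names and all threshold values are hypothetical, the ratio is assumed to be width divided by height (the claims do not fix which ratio is meant), and the largest-jump rule for separating character gaps from word gaps is one simple choice among many. The sketch also assumes that the boxes belong to a single, roughly horizontal text line.

```python
from dataclasses import dataclass
from math import atan2, degrees, inf


@dataclass
class Box:
    """Bounding box of one group of connected pixels (hypothetical type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def filter_text_like(boxes, min_height=8.0, min_ratio=0.1, max_ratio=5.0):
    """Keep boxes that are tall enough and whose width/height ratio lies
    between the lower and higher threshold (all values are assumptions)."""
    return [b for b in boxes
            if b.h >= min_height and min_ratio <= b.w / b.h <= max_ratio]


def split_gap_threshold(gaps):
    """Return a value between the small (character) gaps and the large
    (word) gaps, placed in the middle of the largest jump in the sorted gaps."""
    s = sorted(gaps)
    if len(s) < 2:
        return inf  # too few gaps to tell two distances apart
    i = max(range(len(s) - 1), key=lambda k: s[k + 1] - s[k])
    return (s[i] + s[i + 1]) / 2


def cluster_words(boxes):
    """Group boxes of one text line into probable words, left to right."""
    row = sorted(boxes, key=lambda b: b.x)
    if not row:
        return []
    gaps = [nxt.x - (cur.x + cur.w) for cur, nxt in zip(row, row[1:])]
    threshold = split_gap_threshold(gaps)
    words = [[row[0]]]
    for gap, box in zip(gaps, row[1:]):
        if gap > threshold:
            words.append([box])    # word spacing: start a new word
        else:
            words[-1].append(box)  # character spacing: same word
    return words


def word_direction_deg(word):
    """Local text-line direction in degrees, taken from the bottom-centre
    of the first box to the bottom-centre of the last box of a word."""
    first, last = word[0], word[-1]
    dx = (last.x + last.w / 2) - (first.x + first.w / 2)
    dy = (last.y + last.h) - (first.y + first.h)
    return degrees(atan2(dy, dx))


if __name__ == "__main__":
    line = [Box(0, 0, 10, 20), Box(12, 0, 10, 20), Box(24, 0, 10, 20),
            Box(46, 0, 10, 20), Box(58, 0, 10, 20)]
    for word in cluster_words(filter_text_like(line)):
        print(len(word), "characters, direction", word_direction_deg(word))
```

In this sketch longer words give a more stable direction estimate because the two reference points lie further apart, which matches the preference for long words expressed in claim 10.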
US11/915,948 2005-06-02 2006-05-19 Method, system, digital camera and asic for geometric image transformation based on text line searching Active 2030-09-17 US9036912B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
NO20052656 2005-06-02
NO20052656A NO20052656D0 (en) 2005-06-02 Geometric image transformation based on text line searching
PCT/NO2006/000189 WO2006130012A1 (en) 2005-06-02 2006-05-19 Method, system, digital camera and asic for geometric image transformation based on text line searching

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/NO2006/000189 A-371-Of-International WO2006130012A1 (en) 2005-06-02 2006-05-19 Method, system, digital camera and asic for geometric image transformation based on text line searching

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/706,751 Continuation US9600870B2 (en) 2005-06-02 2015-05-07 Method, system, digital camera and asic for geometric image transformation based on text line searching

Publications (2)

Publication Number Publication Date
US20090016606A1 (en) 2009-01-15
US9036912B2 (en) 2015-05-19

Family

ID=35295256

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/915,948 Active 2030-09-17 US9036912B2 (en) 2005-06-02 2006-05-19 Method, system, digital camera and asic for geometric image transformation based on text line searching
US14/706,751 Active US9600870B2 (en) 2005-06-02 2015-05-07 Method, system, digital camera and asic for geometric image transformation based on text line searching

Country Status (8)

Country Link
US (2) US9036912B2 (en)
EP (1) EP1949303B1 (en)
AU (1) AU2006253155B2 (en)
CA (1) CA2610214A1 (en)
IL (1) IL187777D0 (en)
NO (2) NO20052656D0 (en)
RU (1) RU2412482C2 (en)
WO (1) WO2006130012A1 (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070253031A1 (en) * 2006-04-28 2007-11-01 Jian Fan Image processing methods, image processing systems, and articles of manufacture
US20090297062A1 (en) * 2005-03-04 2009-12-03 Molne Anders L Mobile device with wide-angle optics and a radiation sensor
US20090305727A1 (en) * 2005-03-04 2009-12-10 Heikki Pylkko Mobile device with wide range-angle optics and a radiation sensor
US20100014782A1 (en) * 2008-07-15 2010-01-21 Nuance Communications, Inc. Automatic Correction of Digital Image Distortion
US20100020102A1 (en) * 2001-05-16 2010-01-28 Motionip, Llc Method and device for browsing information on a display
US20100171691A1 (en) * 2007-01-26 2010-07-08 Ralph Cook Viewing images with tilt control on a hand-held device
US20110299775A1 (en) * 2010-06-08 2011-12-08 International Business Machines Corporation Correcting page curl in scanned books
US20120134588A1 (en) * 2010-11-29 2012-05-31 Microsoft Corporation Rectification of characters and text as transform invariant low-rank textures
WO2014018867A1 (en) * 2012-07-27 2014-01-30 The Neat Company, Inc. Portable document scanner having user interface and integrated communications means
US8818132B2 (en) 2010-11-29 2014-08-26 Microsoft Corporation Camera calibration with lens distortion from low-rank textures
US20150093033A1 (en) * 2013-09-30 2015-04-02 Samsung Electronics Co., Ltd. Method, apparatus, and computer-readable recording medium for converting document image captured by using camera to dewarped document image
US20160104039A1 (en) * 2014-10-10 2016-04-14 Morpho Method for identifying a sign on a deformed document
US10185885B2 (en) 2014-10-31 2019-01-22 Hewlett-Packard Development Company, L.P. Tex line detection
US10265728B2 (en) 2013-12-12 2019-04-23 Koninklijke Philips N.V. Monolithically integrated three electrode CMUT device

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102007021185B4 (en) 2007-05-05 2012-09-20 Ziehm Imaging Gmbh X-ray diagnostic device having a plurality of coded tags and a method for determining the location of device parts of the X-ray diagnostic device
JP6542230B2 (en) * 2013-12-20 2019-07-10 イ.エル.イ.エス. Method and system for correcting projected distortion
RU2628266C1 (en) * 2016-07-15 2017-08-15 Общество с ограниченной ответственностью "Аби Девелопмент" Method and system of preparing text-containing images to optical recognition of symbols
RU2636097C1 (en) * 2016-12-06 2017-11-20 Общество с ограниченной ответственностью "Аби Девелопмент" Method and system of preparing text-containing images to optical recognition of symbols
RU2632427C1 (en) 2016-12-09 2017-10-04 Общество с ограниченной ответственностью "Аби Девелопмент" Optimization of data exchange between client device and server

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5245676A (en) * 1989-12-21 1993-09-14 Xerox Corporation Determination of image skew angle from data including data in compressed form
US5491759A (en) * 1992-11-25 1996-02-13 Eastman Kodak Company Document edge detection apparatus
US5513277A (en) * 1991-07-30 1996-04-30 Xerox Corporation Measuring character and stroke sizes and spacings for an image
US5539841A (en) * 1993-12-17 1996-07-23 Xerox Corporation Method for comparing image sections to determine similarity therebetween
US5563403A (en) * 1993-12-27 1996-10-08 Ricoh Co., Ltd. Method and apparatus for detection of a skew angle of a document image using a regression coefficient
US5689585A (en) * 1995-04-28 1997-11-18 Xerox Corporation Method for aligning a text image to a transcription of the image
US5708717A (en) * 1995-11-29 1998-01-13 Alasia; Alfred Digital anti-counterfeiting software method and apparatus
US5818978A (en) * 1994-04-15 1998-10-06 Canon Kabushiki Kaisha Image pre-processor for character image recognition system
US6304313B1 (en) * 1997-12-09 2001-10-16 Canon Kabushiki Kaisha Digital camera and document processing system using the digital camera
US20030026482A1 (en) * 2001-07-09 2003-02-06 Xerox Corporation Method and apparatus for resolving perspective distortion in a document image and for calculating line sums in images
US20030086615A1 (en) * 2001-11-02 2003-05-08 Xerox Corporation Method and apparatus for capturing text images
US20030202696A1 (en) * 2002-04-25 2003-10-30 Simard Patrice Y. Activity detector
US20050216564A1 (en) * 2004-03-11 2005-09-29 Myers Gregory K Method and apparatus for analysis of electronic communications containing imagery
US20050225808A1 (en) * 2000-11-09 2005-10-13 Braudaway Gordon W Method and apparatus to correct distortion of document copies
US20060291727A1 (en) * 2005-06-23 2006-12-28 Microsoft Corporation Lifting ink annotations from paper

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6538691B1 (en) 1999-01-21 2003-03-25 Intel Corporation Software correction of image distortion in digital cameras
SE517445C2 (en) 1999-10-01 2002-06-04 Anoto Ab Positioning on a surface provided with a position-coding pattern

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9727095B2 (en) 2001-05-16 2017-08-08 Apple Inc. Method, device and program for browsing information on a display
US20100020102A1 (en) * 2001-05-16 2010-01-28 Motionip, Llc Method and device for browsing information on a display
US20100125818A1 (en) * 2001-05-16 2010-05-20 Motionip, Llc Method, device and program for browsing information on a display
US20090297062A1 (en) * 2005-03-04 2009-12-03 Molne Anders L Mobile device with wide-angle optics and a radiation sensor
US20090305727A1 (en) * 2005-03-04 2009-12-10 Heikki Pylkko Mobile device with wide range-angle optics and a radiation sensor
US20070253031A1 (en) * 2006-04-28 2007-11-01 Jian Fan Image processing methods, image processing systems, and articles of manufacture
US8213687B2 (en) * 2006-04-28 2012-07-03 Hewlett-Packard Development Company, L.P. Image processing methods, image processing systems, and articles of manufacture
US20100171691A1 (en) * 2007-01-26 2010-07-08 Ralph Cook Viewing images with tilt control on a hand-held device
US9507431B2 (en) 2007-01-26 2016-11-29 Apple Inc. Viewing images with tilt-control on a hand-held device
US8994644B2 (en) * 2007-01-26 2015-03-31 Apple Inc. Viewing images with tilt control on a hand-held device
US10318017B2 (en) 2007-01-26 2019-06-11 Apple Inc. Viewing images with tilt control on a hand-held device
US20100014782A1 (en) * 2008-07-15 2010-01-21 Nuance Communications, Inc. Automatic Correction of Digital Image Distortion
US8285077B2 (en) * 2008-07-15 2012-10-09 Nuance Communications, Inc. Automatic correction of digital image distortion
TWI483198B (en) * 2010-06-08 2015-05-01 Ibm Correcting for page curl in scanned books
US20130039598A1 (en) * 2010-06-08 2013-02-14 International Business Machines Corporation Correcting page curl in scanned books
US20110299775A1 (en) * 2010-06-08 2011-12-08 International Business Machines Corporation Correcting page curl in scanned books
CN102918548A (en) * 2010-06-08 2013-02-06 国际商业机器公司 Correcting page curl in scanned books
US8687916B2 (en) * 2010-06-08 2014-04-01 International Business Machines Corporation Correcting page curl in scanned books
US20120134588A1 (en) * 2010-11-29 2012-05-31 Microsoft Corporation Rectification of characters and text as transform invariant low-rank textures
US8818132B2 (en) 2010-11-29 2014-08-26 Microsoft Corporation Camera calibration with lens distortion from low-rank textures
US8774558B2 (en) * 2010-11-29 2014-07-08 Microsoft Corporation Rectification of characters and text as transform invariant low-rank textures
WO2014018867A1 (en) * 2012-07-27 2014-01-30 The Neat Company, Inc. Portable document scanner having user interface and integrated communications means
US9167124B2 (en) 2012-07-27 2015-10-20 The Neat Company, Inc. Portable document scanner having user interface and integrated communication means
US9305211B2 (en) * 2013-09-30 2016-04-05 Samsung Electronics Co., Ltd. Method, apparatus, and computer-readable recording medium for converting document image captured by using camera to dewarped document image
US20150093033A1 (en) * 2013-09-30 2015-04-02 Samsung Electronics Co., Ltd. Method, apparatus, and computer-readable recording medium for converting document image captured by using camera to dewarped document image
US10265728B2 (en) 2013-12-12 2019-04-23 Koninklijke Philips N.V. Monolithically integrated three electrode CMUT device
US10025977B2 (en) * 2014-10-10 2018-07-17 Morpho Method for identifying a sign on a deformed document
US20160104039A1 (en) * 2014-10-10 2016-04-14 Morpho Method for identifying a sign on a deformed document
US10185885B2 (en) 2014-10-31 2019-01-22 Hewlett-Packard Development Company, L.P. Tex line detection

Also Published As

Publication number Publication date
EP1949303B1 (en) 2012-08-01
CA2610214A1 (en) 2006-12-07
EP1949303A1 (en) 2008-07-30
US20150243005A1 (en) 2015-08-27
IL187777D0 (en) 2008-08-07
NO20080016L (en) 2008-01-02
US9600870B2 (en) 2017-03-21
WO2006130012A1 (en) 2006-12-07
RU2412482C2 (en) 2011-02-20
US9036912B2 (en) 2015-05-19
AU2006253155B2 (en) 2012-04-12
RU2007149518A (en) 2009-07-20
AU2006253155A1 (en) 2006-12-07
NO20052656D0 (en) 2005-06-02

Similar Documents

Publication Publication Date Title
US7505178B2 (en) Semantic classification and enhancement processing of images for printing applications
AU2004271639B2 (en) Systems and methods for biometric identification using handwriting recognition
DE4311172C2 (en) Method and device for identifying a skew angle of a document image
US20080069476A1 (en) System and method of determining image skew using connected components
EP2178028A2 (en) Representing documents with runlength histograms
US20180232572A1 (en) Systems and methods for mobile image capture and content processing of driver's licenses
US9292737B2 (en) Systems and methods for classifying payment documents during mobile image processing
US5852676A (en) Method and apparatus for locating and identifying fields within a document
US20060045379A1 (en) Photographic document imaging system
EP1999688B1 (en) Converting digital images containing text to token-based files for rendering
JP4101290B2 (en) System and method for automatic page registration and automatic region detection during formatting process
US6151423A (en) Character recognition with document orientation determination
Liang et al. Geometric rectification of camera-captured document images
US20030198386A1 (en) System and method for identifying and extracting character strings from captured image data
EP0483391A1 (en) Automatic signature verification
US8112706B2 (en) Information processing apparatus and method
EP1052593A2 (en) Form search apparatus and method
US7437001B2 (en) Method and device for recognition of a handwritten pattern
US8406476B2 (en) Model-based dewarping method and apparatus
WO2007028166A2 (en) A system and method for detecting text in real-world color images
US7836390B2 (en) Strategies for processing annotations
US9760788B2 (en) Mobile document detection and orientation based on reference object characteristics
US20070253040A1 (en) Color scanning to enhance bitonal image
US8515208B2 (en) Method for document to template alignment
CN101625760A (en) Method for correcting certificate image inclination

Legal Events

Date Code Title Description
AS Assignment

Owner name: LUMEX AS, NORWAY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MEYER, HANS CHRISTIAN;CARLIN, MATS STEFAN;FOSSEIDE, KNUT THARALD;REEL/FRAME:020608/0001

Effective date: 20080214

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

FEPP Fee payment procedure

Free format text: SURCHARGE FOR LATE PAYMENT, SMALL ENTITY (ORIGINAL EVENT CODE: M2554); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

Year of fee payment: 4