JP6542230B2 - Method and system for correcting projected distortion - Google Patents

Method and system for correcting projected distortion

Info

Publication number
JP6542230B2
JP6542230B2 (application JP2016541592A)
Authority
JP
Japan
Prior art keywords
text
point
unique
line
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2016541592A
Other languages
Japanese (ja)
Other versions
JP2017500662A (en)
Inventor
マー、ジャングリン
ダウ、ミシェル
ミューレネール、ピエール ドゥ
デュボン、オリヴィエ
Original Assignee
イ.エル.イ.エス.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US14/136,585 priority patent/US8897600B1/en
Priority to US14/136,695 priority patent/US8811751B1/en
Priority to US14/136,501 priority patent/US8913836B1/en
Priority to PCT/EP2014/078930 priority patent/WO2015092059A1/en
Application filed by イ.エル.イ.エス. filed Critical イ.エル.イ.エス.
Publication of JP2017500662A publication Critical patent/JP2017500662A/en
Application granted granted Critical
Publication of JP6542230B2 publication Critical patent/JP6542230B2/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20Image acquisition
    • G06K9/32Aligning or centering of the image pick-up or image-field
    • G06K9/3275Inclination (skew) detection or correction of characters or of image to be recognised
    • G06K9/3283Inclination (skew) detection or correction of characters or of image to be recognised of characters or characters lines
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20Image acquisition
    • G06K9/32Aligning or centering of the image pick-up or image-field
    • G06K9/3275Inclination (skew) detection or correction of characters or of image to be recognised
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/387Composing, repositioning or otherwise geometrically modifying originals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/387Composing, repositioning or otherwise geometrically modifying originals
    • H04N1/3872Repositioning or masking
    • H04N1/3873Repositioning or masking defined only by a limited number of coordinate points or parameters, e.g. corners, centre; for trimming

Description

  The present invention relates to methods, systems, devices and computer program products for correcting projection distortion.
  Digital cameras (hereinafter referred to as cameras) may be used to capture images. With advances in technology, cameras are implemented in almost all types of digital devices, including, but not limited to, mobile communication devices, tablets, laptops, and personal digital assistants (PDAs). In many instances, the camera can serve as an alternative to a document scanner, as it can be used to capture an image of a document. Images of the document may need to be processed prior to text recognition and/or text extraction. Processing the image of a document poses two major challenges: poor image quality of the captured image due to unfavourable imaging conditions, and distortion in the captured image. The distortion may be due to the camera itself and/or to the angle and position of the camera relative to the plane of the document while capturing the image. The distortion caused by the latter is known as projection distortion. Under projection distortion, text symbols or characters appear larger the closer they are to the camera plane and smaller the further away they are. There are known techniques for improving the quality of images. However, improving the quality of the image may not help in the recognition and/or extraction of text, especially when the image of the document is subject to projection distortion. Projection distortion not only disturbs the visual interpretation of the text, but can also affect the accuracy of text recognition algorithms.
  Techniques exist to correct projection distortion. One currently known technique for performing projection distortion correction uses auxiliary data. The auxiliary data may include a combination of direction measurement data, accelerometer data, and distance measurement data. However, such auxiliary data may not be available in all electronic devices due to the lack of various sensors and/or processing capabilities. Several other techniques contemplate manual correction of projection distortion. One such technique requires the user to manually identify and mark the four corners of a quadrilateral that, before distortion, was a rectangle formed by two horizontal line segments and two vertical line segments. Another technique requires the user to identify and mark parallel lines corresponding to horizontal or vertical lines prior to distortion. Based on those corners or parallel lines, correction of the projection distortion is performed. However, manual correction of projection distortion is time consuming, inefficient, and prone to errors.
  Techniques also exist for automatic correction of projection distortion. These techniques focus on identifying horizontal and vertical vanishing points: the points at which the outlines of the document in the image (e.g. horizontal outlines or vertical outlines) converge. These techniques use the horizontal and vertical vanishing points to perform projection distortion correction. However, most of them require complex manual parameter settings for the correction. If the content of the image changes, the parameters need to be changed manually, which limits the applicability of those techniques. Furthermore, existing techniques are computationally expensive and difficult to implement on small devices, such as mobile communication devices. Moreover, most techniques work on the assumption that the document image contains only text. For document images that contain a combination of text and pictures, those techniques may not produce useful results at all. Many of these techniques also assume that the text in the image of the document is formed and/or positioned in a particular manner, and they fail when the text is not formed and/or positioned in that manner.
Martin A. Fischler and Robert C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography", Comm. of the ACM 24(6): 381-395, June 1981
  It is an object of the present invention to provide a method, system, device and/or computer program product for performing projection correction of distorted images that does not exhibit at least one of the drawbacks mentioned above.
  This object is achieved according to the invention as defined in the independent claims.
  According to a first aspect of the invention, which may be combined with other aspects described herein, a method is disclosed for projection correction of an image comprising at least one text portion that is subject to perspective distortion. The method comprises a step of image binarization, wherein the image is binarized. Thereafter, the method includes a step of performing connected component analysis. Connected component analysis involves detecting pixel blobs in the at least one text portion of the binarized image. Thereafter, the method includes a step of horizontal vanishing point determination. Horizontal vanishing point determination comprises the steps of estimating text baselines using the unique points of the pixel blobs, and determining a horizontal vanishing point of the at least one text portion using the text baselines. The method further includes a step of determining a vertical vanishing point for the at least one text portion based on vertical features. The method further comprises a step of projection correction, which involves correcting the perspective distortion in the image based on the horizontal and vertical vanishing points.
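The binarization and connected-component steps above can be sketched in plain Python. This is an illustrative toy implementation, not the patented method itself: the global threshold value, the 4-connectivity, and the function names are all assumptions made for the example.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Simple global thresholding: dark pixels (text) become 1."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """4-connected flood fill; returns one list of (row, col) pixels per blob."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                blob, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    blob.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                blobs.append(blob)
    return blobs

def unique_point(blob):
    """Unique point: centred at the bottom of the blob's bounding box."""
    ys = [p[0] for p in blob]
    xs = [p[1] for p in blob]
    return (max(ys), (min(xs) + max(xs)) / 2.0)
```

On a small grayscale grid containing two dark strokes, `connected_components` returns two blobs, and `unique_point` gives each blob's bottom-center, the point later used for baseline estimation.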
  In an embodiment according to the first aspect, the step of text and picture separation is performed after the image binarization and before the connected component analysis, and only text information is retained.
  In an embodiment according to the first aspect, each unique point may be centered at the bottom of the bounding box of the respective pixel blob. Estimating the text baselines can include a step of removing outlier unique points. In the vicinity of the unique point under consideration, outlier unique points that are out of line with the neighbouring unique points may be detected. Such outlier unique points may be ignored for the text baseline estimation.
  In an embodiment according to the first aspect, the step of removing outlier unique points comprises the steps of determining the width and height of the pixel blobs, calculating average values for the width and height of the pixel blobs, and detecting a unique point as an outlier when it belongs to a pixel blob of which at least one of the width and the height differs from the calculated average value by more than a predetermined range.
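A minimal sketch of this removal rule, assuming the "predetermined range" is expressed as a fraction of the average blob size (the `deviation` parameter and the blob representation are illustrative, not taken from the patent):

```python
def remove_outlier_points(blobs, deviation=0.5):
    """Keep only blobs whose bounding-box width AND height stay within
    `deviation` (a fraction of the average) of the average blob size.
    Blobs are lists of (row, col) pixels."""
    sizes = []
    for blob in blobs:
        ys = [p[0] for p in blob]
        xs = [p[1] for p in blob]
        sizes.append((max(xs) - min(xs) + 1, max(ys) - min(ys) + 1))
    avg_w = sum(w for w, _ in sizes) / len(sizes)
    avg_h = sum(h for _, h in sizes) / len(sizes)
    kept = []
    for blob, (w, h) in zip(blobs, sizes):
        if (abs(w - avg_w) <= deviation * avg_w
                and abs(h - avg_h) <= deviation * avg_h):
            kept.append(blob)
    return kept
```

Given three character-sized blobs and one much larger blob (e.g. a picture fragment), the larger blob's unique point would be excluded from baseline estimation.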
In an embodiment according to the first aspect, the step of estimating text baselines may include a step of clustering unique points into unique point groups. A unique point group is subject to the following conditions:
- the condition that the point-to-point distance between the unique points of the group is below a first distance threshold;
- the condition that the point-to-line distance between each unique point of the group and the line formed by the unique points of the group is below a second distance threshold;
- the condition that the off-horizontal angle of the line formed by the unique points of the group is below a maximum angle;
- the condition that the group includes at least a minimum number of unique points. The unique point group can satisfy at least one of these conditions. The text baselines may be estimated based on the unique point groups.
  In an embodiment according to the first aspect, the first distance threshold, the second distance threshold, the maximum angle, and the minimum number of unique points may be set adaptively based on the content of the image. The step of estimating the text baselines may further include a step of unique point group merging, in which the unique point groups on either side of an ignored unique point are merged into a larger unique point group.
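The clustering conditions above can be illustrated with a greedy scan over the unique points sorted in reading order. The fixed thresholds below are placeholders; as stated above, the patent sets them adaptively from the image content, and the greedy strategy itself is an assumption made for the example:

```python
import math

def point_line_dist(p, a, b):
    """Distance from point p to the line through a and b (all (x, y))."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)

def cluster_unique_points(points, max_pt_dist=30.0, max_line_dist=3.0,
                          max_angle_deg=30.0, min_points=3):
    """Greedily cluster unique points (x, y) into baseline groups using the
    point-to-point, point-to-line, and off-horizontal-angle conditions."""
    points = sorted(points)                 # scan left to right
    groups, current = [], [points[0]]
    for p in points[1:]:
        close = math.dist(p, current[-1]) <= max_pt_dist
        if close and len(current) >= 2:
            a, b = current[0], current[-1]
            angle = math.degrees(math.atan2(b[1] - a[1], b[0] - a[0]))
            close = (abs(angle) <= max_angle_deg
                     and point_line_dist(p, a, b) <= max_line_dist)
        if close:
            current.append(p)
        else:
            groups.append(current)
            current = [p]
    groups.append(current)
    # a group only yields a baseline if it has enough points
    return [g for g in groups if len(g) >= min_points]
```

Two runs of near-horizontal points separated by a large gap end up in two separate groups, each a candidate text baseline.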
  In an embodiment according to the first aspect, the step of determining the horizontal vanishing point may include defining each of the estimated text baselines as a line in the Cartesian coordinate system, converting each of the text baselines into a data point in the homogeneous coordinate system, and assigning a confidence level to each of the data points. The confidence level can be based on at least the length of the respective text baseline and the proximity of the group of unique points used to estimate the text baseline to the resulting text baseline.
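Converting a Cartesian line to homogeneous coordinates, and recovering a vanishing point as the intersection of two such lines, can be sketched as follows. This is a standard projective-geometry identity shown for illustration, not the patent's exact computation:

```python
def line_to_homogeneous(m, c):
    """A Cartesian line y = m*x + c becomes the homogeneous triple
    (m, -1, c), since m*x - y + c = 0."""
    return (m, -1.0, c)

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vanishing_point(line1, line2):
    """The intersection of two homogeneous lines is their cross product;
    dividing by the last coordinate recovers Cartesian (x, y)."""
    x, y, w = cross(line1, line2)
    return (x / w, y / w)
```

Two slightly converging baselines, y = 0.1x and y = 0.2x - 10, meet at (100, 10), which would serve as a horizontal vanishing point candidate.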
  In an embodiment according to the first aspect, the step of determining the horizontal vanishing point may include grouping a number of data points having a confidence level above a predetermined threshold into a priority sample array, clustering the data points in the priority sample array into a number of sample groups, assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group, and iteratively selecting sample groups of data points from the priority sample array for line fitting. Each sample group can include more than one data point. The iteration may start with the sample group having the highest group confidence value in the priority sample array.
  In an embodiment according to the first aspect, the step of determining the horizontal vanishing point may include performing line fitting for a first sample group, resulting in a first fitted line; thereafter performing line fitting for each further sample group, resulting in further fitted lines; determining, based on the first fitted line and the further fitted lines, a set of data points located below a predetermined distance threshold from the first fitted line; and estimating at least first and second horizontal vanishing point candidates from the horizontal text baselines corresponding to the determined set of data points.
  In an embodiment according to the first aspect, the step of determining the horizontal vanishing point may include performing a projection correction based on each estimated horizontal vanishing point candidate, comparing for each horizontal vanishing point candidate the proximity of the resulting text direction after projection correction to the horizontal direction, and selecting the horizontal vanishing point candidate for which the text direction of the image document after projection correction is closest to horizontal.
  In an embodiment according to the first aspect, the step of determining the vertical vanishing point may include estimating a number of vertical text lines, each corresponding to one of the pixel blobs selected by a blob filtering algorithm for the text portion of the image; defining each of the estimated vertical text lines as a line in the Cartesian coordinate system; converting each of the vertical text lines in the Cartesian coordinate system into a data point in the homogeneous coordinate system; and assigning a confidence level to each of the data points. The confidence level may be based on at least the eccentricity of the shape of the pixel blob used to estimate the respective vertical text line.
  In an embodiment according to the first aspect, the step of determining the vertical vanishing point may include grouping a number of data points having a confidence level above a predetermined threshold into a priority sample array, and clustering the data points in the priority sample array into a number of sample groups. Each sample group can include at least two data points. The step of determining the vertical vanishing point may further include assigning a group confidence value to each sample group based on the confidence level assigned to each data point in the sample group, and iteratively selecting sample groups of data points from the priority sample array. The iteration may start from the sample group with the highest group confidence value in the priority sample array.
  In an embodiment according to the first aspect, the step of determining the vertical vanishing point may include performing line fitting for a first sample group, resulting in a first fitted line; thereafter performing line fitting for each further sample group, resulting in further fitted lines; determining, based on the first fitted line and the further fitted lines, a set of data points positioned below a predetermined distance threshold from the first fitted line; and estimating at least first and second vertical vanishing point candidates from the vertical text lines corresponding to the determined set of data points.
  In an embodiment according to the first aspect, the step of determining the vertical vanishing point may include performing a projection correction based on each estimated vertical vanishing point candidate, comparing for each estimated vertical vanishing point candidate the proximity of the resulting text direction after projection correction to the vertical direction, and selecting the vertical vanishing point candidate for which the text direction of the image document is closest to vertical.
  In an embodiment according to the first aspect, the blob filtering algorithm can select pixel blobs based on one or more of the following conditions: the condition that the eccentricity of the shape of the considered pixel blob, which describes how stretched it is, is above a predetermined threshold (the eccentricity takes values between 0 and 1, where 0 and 1 are the extremes: a blob with eccentricity 0 is a perfectly circular object, while a blob with eccentricity 1 is a line segment); the condition that the distance of the pixel blob to the image boundary is above a predetermined distance threshold; the condition that the angle of the resulting vertical line with respect to the vertical direction is below a maximum angle threshold; and the condition that the area of the pixel blob, defined by its number of pixels, is above a minimum area threshold.
  In an embodiment according to the first aspect, the first and second vanishing point candidates may be estimated using different approximation methods selected from the group consisting of least squares, weighted least squares, and adaptive least squares.
  In a first alternative aspect of the invention, which may be combined with the other aspects described herein, a method is disclosed for projection correction of an image comprising at least one text portion that is subject to perspective distortion. The method comprises an image binarization step, wherein the image is binarized, and a connected component analysis step. Connected component analysis detects pixel blobs in the at least one text portion of the binarized image. For each of the pixel blobs, a positioning pixel may be selected on the pixel blob baseline of the pixel blob. The positioning pixels define the positions of the pixel blobs in the binarized image. The method further comprises a step of horizontal vanishing point determination, which includes estimating text baselines using the positioning pixels and determining a horizontal vanishing point of the at least one text portion using the text baselines. The method further includes vertical vanishing point determination, wherein a vertical vanishing point is determined for the at least one text portion based on vertical features. The method further comprises a step of projection correction, wherein the perspective distortion in the image is corrected based on the horizontal and vertical vanishing points.
  In an embodiment according to the first alternative aspect, the step of text and picture separation is performed after the image binarization and before the connected component analysis, and only text information is retained.
  In an embodiment of the first alternative aspect, the positioning pixel may, as described, be centered at the bottom of the bounding box of the pixel blob. In alternative embodiments, the positioning pixel can be a pixel at a bottom corner of the bounding box of the pixel blob (i.e. the bottom left corner or the bottom right corner), or a pixel on the pixel blob or on its bounding box.
  In an embodiment of the first aspect or the first alternative aspect, a system or device may be provided comprising one or more processors, and compatible software code portions, configured to perform the method or steps described above.
  In an embodiment of the first aspect or the first alternative aspect, a non-transitory storage medium may also be provided, on which a computer program product is stored that comprises software code portions in a format executable on a computing device and that is configured to perform the method or steps described above when executed on said computing device. The computing device can be any of the following: a personal computer, a portable computer, a laptop computer, a netbook computer, a tablet computer, a smartphone, a digital still camera, a video camera, a mobile communication device, a personal digital assistant, a scanner, a multifunction device, or any other computer.
  In a second aspect according to the invention, which can be combined with the other aspects described herein, a method is described for determining vanishing point candidates of a text portion in an image document that is distorted by perspective. The method includes a step of image binarization, wherein the image is binarized. Thereafter, the method comprises performing connected component analysis, wherein pixel blobs are detected in the at least one text portion of the binarized image. A positioning pixel is selected for each of the pixel blobs on the pixel blob baseline of the pixel blob, the positioning pixels defining the positions of the pixel blobs in the binarized image. The method also includes estimating a number of text lines in the Cartesian coordinate system based on the positioning pixels, each text line representing an approximation of the horizontal or vertical text direction of the text portion. The method also includes a step of converting each of the text lines into a data point in the homogeneous coordinate system. The method further includes a step of assigning a confidence level to each of the data points. The method includes a step of grouping a number of data points with confidence levels above a predetermined threshold into a priority sample array, and clustering the data points in the priority sample array into a number of sample groups. Each sample group contains two or more data points. The method further includes assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group. Additionally, the method includes applying a random sample consensus (RANSAC) algorithm to determine an inlier set of the data points for a first fitted line. The RANSAC algorithm starts with the sample group having the highest group confidence value in the priority sample array.
The method further comprises the step of estimating at least one vanishing point candidate from the text lines corresponding to the set of inliers.
  In an embodiment according to the second aspect, the step of separating text and picture is performed after the image binarization and before the connected component analysis, and only text information is retained.
  In an embodiment according to the second aspect, the confidence level assigned to the data points may be based on at least the length of the respective text line and the proximity of the positioning pixels to the respective text line.
  In an embodiment according to the second aspect, the RANSAC algorithm may comprise the following steps. The first step is to iteratively select sample groups of data points from the priority sample array for line fitting; the iteration starts with the sample group having the highest group confidence value in the priority sample array. The next step is to perform line fitting for the first sample group, resulting in a first fitted line, and thereafter to perform line fitting for each further sample group, resulting in further fitted lines. Then, based on the first fitted line and the further fitted lines, a set of data points positioned below a predetermined distance threshold from the first fitted line is determined; this set of data points forms the set of inliers.
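These steps can be sketched as follows, with the random sampling of classic RANSAC replaced, as described above, by iterating over sample groups in decreasing order of group confidence value. The least-squares line fit and the fixed distance threshold are illustrative choices for the example:

```python
import math

def fit_line(points):
    """Least-squares fit y = m*x + c through (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - m * sx) / n
    return m, c

def ransac_inliers(data_points, sample_groups, dist_threshold=1.0):
    """Confidence-ordered RANSAC variant: sample groups are tried in
    decreasing order of group confidence value instead of at random.
    `sample_groups` is a list of (confidence, [points]) tuples; returns
    the largest inlier set found."""
    best = []
    for _, group in sorted(sample_groups, key=lambda g: -g[0]):
        m, c = fit_line(group)
        inliers = [(x, y) for x, y in data_points
                   if abs(m * x + c - y) / math.hypot(m, 1) <= dist_threshold]
        if len(inliers) > len(best):
            best = inliers
    return best
```

With most data points near y = 0.5x and two outliers, the high-confidence sample group drawn from the true line recovers the five inliers, while the low-confidence outlier group cannot beat it.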
  In an embodiment according to the second aspect, the predetermined distance threshold from the first fitted line may be a fixed parameter. Alternatively, the predetermined distance threshold from the first fitted line may be adaptable based on the content of the image document.
  In an embodiment according to the second aspect, at least first and second vanishing point candidates may be deduced from the text lines corresponding to said set of inliers. The first and second vanishing point candidates may be estimated using different approximation methods selected from the group consisting of least squares, weighted least squares, and adaptive least squares. The method may then further include a step of selecting a vanishing point from the estimated vanishing point candidates. The selection may include performing a projection correction on the image document based on each estimated vanishing point candidate, comparing for each vanishing point candidate the proximity of the resulting text direction after projection correction to the horizontal or vertical direction, and selecting the vanishing point candidate for which the text direction of the image document is closest to horizontal or vertical.
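Of the named approximation methods, weighted least squares can be sketched as below; the weights play the role of the confidence levels, so that low-confidence data points barely influence the fitted line. The formulation is a standard one, not taken from the patent:

```python
def weighted_least_squares(points, weights):
    """Weighted least-squares fit y = m*x + c through (x, y) points.
    Each point contributes in proportion to its weight (confidence)."""
    sw = sum(weights)
    mx = sum(w * x for (x, _), w in zip(points, weights)) / sw
    my = sum(w * y for (_, y), w in zip(points, weights)) / sw
    num = sum(w * (x - mx) * (y - my) for (x, y), w in zip(points, weights))
    den = sum(w * (x - mx) ** 2 for (x, _), w in zip(points, weights))
    m = num / den
    return m, my - m * mx
```

Three points on y = 2x + 1 plus a far outlier with weight 0 still recover the true slope and intercept, illustrating how a confidence-weighted estimator suppresses unreliable text lines.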
  In an embodiment according to the second aspect, the group confidence value of each sample group may further be based on the distance between the estimated text lines corresponding to the data points in the sample group. The confidence level of each of the data points may be based on the principal direction of the pixel blob used to estimate the respective text line. The principal direction may be defined by the eccentricity of the shape of the pixel blob. The maximum number of data points grouped into a priority sample array may be between 2 and 20, and more preferably between 5 and 10.
  In an embodiment according to the second aspect, the estimated text lines can be vertical blob lines, each corresponding to the direction of one of the pixel blobs selected by a blob filtering algorithm on the text part of the image.
  In an embodiment of the second aspect, a system or device may also be provided comprising one or more processors, and compatible software code portions, configured to perform the method or steps described above.
  In an embodiment of the second aspect, a non-transitory storage medium may also be provided, on which a computer program product is stored that comprises software code portions in a format executable on a computing device and that is configured to perform the method or steps described above when executed on said computing device. The computing device can be any of the following: a personal computer, a portable computer, a laptop computer, a netbook computer, a tablet computer, a smartphone, a digital still camera, a video camera, a mobile communication device, a personal digital assistant, a scanner, a multifunction device, or any other computer.
  In a third aspect of the invention, which may be combined with the other aspects described herein, a method is disclosed for projection correction of an image comprising at least one text portion that is subject to perspective distortion. The method includes a step of image binarization, wherein the image is binarized. Thereafter, the method includes a step of performing connected component analysis. Connected component analysis involves detecting pixel blobs in the at least one text portion of the binarized image. A positioning pixel is selected for each of the pixel blobs on the pixel blob baseline of the pixel blob. The positioning pixel defines the position of the pixel blob in the binarized image. The method includes a step of horizontal vanishing point determination, which comprises the steps of estimating text baselines using the positioning pixels of the pixel blobs, identifying horizontal vanishing point candidates from the estimated text baselines, and determining a horizontal vanishing point of the at least one text portion using the horizontal vanishing point candidates. The method also includes a step of determining a vertical vanishing point for the at least one text portion based on vertical features. The method further comprises a step of projection correction, which involves correcting the perspective distortion in the image based on the horizontal and vertical vanishing points. The horizontal vanishing point determination may include a first removing step at the level of the unique points, a second removing step at the level of the text baselines, and a third removing step at the level of the horizontal vanishing point candidates.
  In an embodiment according to the third aspect, the step of text and picture separation is performed after the image binarization and before the connected component analysis, and only text information is retained.
  In an embodiment according to the third aspect, the first removing step comprises a step of detecting outlier unique points that are out of line with the unique points near the unique point under consideration. The outlier unique points may be ignored for the text baseline estimation.
  In an embodiment according to the third aspect, the outlier unique point removing step may comprise the steps of determining the width and height of the pixel blobs, calculating average values for the width and height of the pixel blobs, and detecting a unique point as an outlier when it belongs to a pixel blob of which at least one of the width and the height differs from the calculated average value by more than a predetermined range.
In an embodiment according to the third aspect, the step of estimating text baselines comprises a step of clustering unique points into unique point groups. A unique point group is subject to the following conditions:
- the condition that the point-to-point distance between the unique points of the group is below a first distance threshold;
- the condition that the point-to-line distance between each unique point of the group and the line formed by the unique points of the group is below a second distance threshold;
- the condition that the off-horizontal angle of the line formed by the unique points of the group is below a maximum angle;
- the condition that the group includes at least a minimum number of unique points. The unique point group can satisfy at least one of these conditions. The text baselines may then be estimated based on the unique point groups.
  In an embodiment according to the third aspect, the first distance threshold, the second distance threshold, the maximum angle, and the minimum number of unique points may be set adaptively based on the content of the image. The step of estimating the text baselines may further include a step of unique point group merging, in which the unique point groups on either side of an ignored unique point are merged into a larger unique point group.
  In an embodiment according to the third aspect, the second removing step comprises the steps of assigning a confidence level to each text baseline and removing text baselines based on the confidence level. The confidence level may be determined based on at least the length of the respective text baseline and the proximity of the group of unique points used to estimate the text baseline to the resulting text baseline. The text baseline removal may be performed using the RANSAC algorithm, in which the confidence levels are taken into account.
  In an embodiment according to the third aspect, the third removing step comprises the steps of performing a projection correction based on each of the identified horizontal vanishing point candidates, comparing the proximity of each horizontal vanishing point candidate to the resulting horizontal text direction after the projection correction, and selecting the horizontal vanishing point candidate closest to the horizontal text direction of the image document after projection correction.
  In an embodiment according to the third aspect, first and second candidate horizontal vanishing points may be estimated from the text baselines after the second removing step. A different approximation method, selected from the group consisting of least squares, weighted least squares, and adaptive least squares, may be used for the estimation of each of the first and second candidate horizontal vanishing points.
  In an embodiment of the third aspect, there may also be provided a system or device comprising one or more processors, and compatible software code portions, configured to perform the above-described methods or steps.
  In an embodiment of the third aspect, a non-transitory storage medium may also be provided on which a computer program product is stored, the computer program product comprising software code portions in a format executable on a computing device and configured to perform the above-described methods or steps when executed on said computing device. The computing device can be any device, such as a personal computer, a portable computer, a laptop computer, a netbook computer, a tablet computer, a smartphone, a digital still camera, a video camera, a mobile communication device, a personal digital assistant, a scanner, or a multifunction device.
  The invention will be further elucidated using the following description and the attached drawings.
FIG. 5 illustrates a process flow for describing projection correction of a distorted image, according to one embodiment of the present disclosure.
FIG. 7 illustrates a process flow for identifying horizontal vanishing points, according to one embodiment of the present disclosure.
FIG. 4 is a diagram illustrating a unique point clustering algorithm, also sometimes referred to in the text as FIG. 3, according to one embodiment of the present disclosure.
FIG. 4 is a diagram illustrating a unique point clustering algorithm, also sometimes referred to in the text as FIG. 3, according to one embodiment of the present disclosure.
FIG. 7 illustrates a process flow for identifying vertical vanishing points using positioning pixels, according to one embodiment of the present disclosure.
FIG. 7 illustrates a process flow for identifying vertical vanishing points using text stroke features, according to one embodiment of the present disclosure.
FIG. 6 illustrates an example binarized image having a picture with text, according to an embodiment of the present disclosure.
FIG. 7 illustrates the resulting image after filtering out a picture from text, according to an embodiment of the present disclosure.
FIG. 7 illustrates an example pixel blob, according to one embodiment of the present disclosure.
FIG. 7 illustrates a presentation grid for a user to adjust the corners of an image, according to one embodiment of the present disclosure.
FIG. 7 shows a captured image according to an embodiment of the present disclosure.
FIG. 7 shows an improved image as a result of projection correction, according to an embodiment of the present disclosure.
FIG. 7 illustrates an example image in which unique points for text are identified, according to one embodiment of the present disclosure.
FIG. 7 shows an example image having over-classified unique point groups, according to an embodiment of the present disclosure.
FIG. 7 shows an example image having a merged unique point group, according to an embodiment of the present disclosure.
FIG. 7 is a diagram showing portions of an example of text for which a baseline is estimated, according to an embodiment of the present disclosure.
FIG. 7 illustrates an example image in which margin feature points are identified in the margin, according to one embodiment of the present disclosure.
FIG. 7 illustrates an example image having two estimated vertical lines along the same margin, according to one embodiment of the present disclosure.
FIG. 7 shows an example image showing the merging of estimated vertical lines, according to an embodiment of the present disclosure.
FIG. 5 is an illustration of an example showing the text stroke characteristics of a character, according to an embodiment of the present disclosure.
FIG. 6 is an illustration of an example showing selectively extracted blobs after text stroke feature identification, according to one embodiment of the present disclosure.
FIG. 7 is an illustration of an example showing an estimated vertical text blob line for a selected pixel blob, according to one embodiment of the present disclosure.
FIG. 7 is an illustration of an example showing vertical text blob lines selected for vertical vanishing points, according to one embodiment of the present disclosure.
  The present invention will be described with respect to particular embodiments and with reference to certain drawings, but the invention is not limited thereto and is limited only by the claims. The drawings described are merely schematic and are non-limiting. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale for illustrative purposes. The dimensions and relative dimensions do not necessarily correspond to actual reductions to practice of the invention.
  Furthermore, the terms first, second, third and the like in the description and in the claims are used to distinguish between similar elements and not necessarily to describe a sequential or chronological order. The terms are interchangeable under appropriate circumstances, and the embodiments of the invention can operate in sequences other than those described or illustrated herein.
  Furthermore, the terms upper, lower, top, bottom and the like in the description and the claims are used for descriptive purposes and not necessarily for describing relative positions. The terms so used are interchangeable under appropriate circumstances, and the embodiments of the invention described herein can operate in orientations other than those described or illustrated herein.
  The term "comprising", as used in the claims, should not be interpreted as being restricted to the means listed thereafter; it does not exclude other elements or steps. It is to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more other features, integers, steps or components, or groups thereof. Thus, the scope of the expression "a device comprising means A and B" should not be limited to devices consisting only of components A and B. It means that, with respect to the present invention, the only relevant components of the device are A and B.
  Referring to FIG. 1, a process flow 100 for projection correction of a distorted image is described. An image may be received for projection correction. The image can optionally be examined to determine its quality. Examining the image can include checking for the presence of noise, lighting conditions, character clarity, resolution, and the like. The image may be processed at step 102 if the quality of the image is above a predetermined threshold. If the quality of the image is below the predetermined threshold, the image can be pre-processed to improve its quality. Pre-processing may include modifying the hue, correcting luminance imbalance, adjusting sharpness, removing noise, removing motion blur, restoring and improving the resolution of the image, compensating for camera misfocus, and the like. In one example implementation, pre-processing may be performed automatically. In another example implementation, toolbox options can be provided to the user to select the type of pre-processing for the image. In one embodiment, pre-processing may be implemented using known techniques, including various image filtering methods such as, but not limited to, Gaussian filtering, median filtering, Wiener filtering, bilateral filtering, Wiener deconvolution, total variation deconvolution, and contrast-limited adaptive histogram equalization.
In step 102, image binarization is performed. Image binarization can include converting pixel values of the received image to either logical ones (1) or logical zeros (0). These values may be represented by a single bit or by multiple bits, such as, for example, an 8-bit unsigned integer. The pixels of the received image may be grayscale pixels, color pixels, or pixels represented in any other format. The values may also be represented by the corresponding black or white color. In one embodiment, binarization may be performed using one of the known techniques, which can be broadly classified into global approaches, region-based approaches, local approaches, hybrid approaches, or any variation thereof. In one example implementation, image binarization is performed using Sauvola binarization. In this technique, binarization is performed based on small image patches. After analyzing the statistical data of a local image patch, the binarization threshold is computed as:
T = m × (1 + k × (s / R − 1))
where m and s are the local mean and standard deviation, respectively, R is the maximum value of the standard deviation, and k is a parameter that controls the threshold value. The parameter k may be selected according to the document image. In one embodiment, k may be set manually. In another embodiment, the parameter k may be set automatically depending on the text characteristics of the document image.
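The Sauvola threshold described above might be computed per local patch as follows; this is a minimal sketch, and the parameter choices k = 0.34 and R = 128 are common illustrative values rather than ones the patent prescribes.

```python
# Hedged sketch of the Sauvola threshold T = m * (1 + k * (s / R - 1))
# for one flattened grayscale patch; patch extraction and the values
# k = 0.34, R = 128 are illustrative assumptions.
def sauvola_threshold(patch, k=0.34, R=128.0):
    """Binarization threshold for one local grayscale patch."""
    n = len(patch)
    m = sum(patch) / n                       # local mean
    var = sum((p - m) ** 2 for p in patch) / n
    s = var ** 0.5                           # local standard deviation
    return m * (1 + k * (s / R - 1))

# Dark ink on a light patch: the threshold sits between the two populations.
patch = [30] * 10 + [220] * 40
t = sauvola_threshold(patch)
print(30 < t < 220)  # True
```

Pixels above t would then be mapped to logical one (white) and the rest to logical zero (black).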
  In step 104, it is determined whether the binarized image (hereinafter referred to as the image) includes any picture. If the image does not contain any pictures, the process proceeds to step 108. If the image includes one or more pictures, the one or more pictures are separated from the text at step 106. Any of the known techniques, such as page analysis methods, text location methods, and/or machine learning methods, may be used to separate the one or more pictures from the text. Techniques based on page analysis methods may be used for images that appear substantially similar to scanned document images. Techniques based on text location methods may be used for images with complex backgrounds, such as those having pictures in the background. Techniques based on machine learning methods may be used for any type of image, although they may require training samples for learning. In an example implementation for separating one or more pictures from text, the background of the document image is extracted. Using the background, the document image is normalized to compensate for the effects of non-uniform illumination. Subsequently, non-text objects are removed from the binary image using heuristic filtering, in which the heuristic rules consider features such as area, relative size, proximity to the image frame, density, average contrast, and edge contrast. FIG. 6A shows an example binarized image including a picture with text. FIG. 6B shows the resulting image after the picture has been removed.
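The heuristic filtering step might look like the following sketch; the specific rules and cut-off values (area fraction, aspect ratio, fill density) are illustrative stand-ins for the feature list the text names, not the patent's actual rules.

```python
# Hedged sketch of heuristic filtering of non-text components; the rules
# and thresholds below (area, aspect ratio, fill density) are illustrative
# stand-ins for the patent's list (area, relative size, frame proximity,
# density, average contrast, edge contrast).
def is_text_like(blob, page_area):
    w, h, filled = blob["w"], blob["h"], blob["filled"]
    area = w * h
    if area > 0.2 * page_area:               # too big: likely a picture
        return False
    if max(w, h) / max(1, min(w, h)) > 20:   # extreme aspect: rule / line art
        return False
    density = filled / area
    return 0.05 <= density <= 0.95           # near-empty or near-solid: not text

page_area = 1000 * 1400
blobs = [
    {"w": 12, "h": 18, "filled": 90},        # a letter
    {"w": 600, "h": 500, "filled": 280000},  # a photo region
    {"w": 800, "h": 2, "filled": 1600},      # a horizontal rule
]
print([is_text_like(b, page_area) for b in blobs])  # [True, False, False]
```

Only the letter-sized, moderately filled blob survives; the photo region and the page rule are discarded as non-text.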
  At step 108, connected component analysis is performed on the binarized image having only textual information. Connected component analysis may involve identifying and labeling connected pixel components in the binary image. Pixel blobs may also be identified during connected component analysis. A pixel blob can be an area with a set of connected components in which certain characteristics, such as color, are constant or vary within a predetermined range. For example, the word "Hello" has five different sets of connected components, i.e., each character of the word is a connected component, or a pixel blob. Position determining pixels are identified for each of the pixel blobs. Position determining pixels define the position of a pixel blob in the binary image. In one embodiment, the position determining pixels can be unique points. A unique point may be the pixel at the center of the pixel blob baseline inside the pixel blob. In another embodiment, the position determining pixel may be the pixel at the left or right edge of the pixel blob baseline inside the pixel blob. Other embodiments having position determining pixels at different locations within the pixel blob, or on a bounding box drawn around the pixel blob, are contemplated within the scope of the present disclosure. FIG. 7A illustrates an example pixel blob 702. A bounding box 704 is formed around the connected component or pixel blob 702. In FIG. 7A, the identified connected component is the character "A" 702. Bounding box 704 has a unique point 706, which may be defined as the center of the bottom of bounding box 704. The unique point 706 may be one of the position determining pixels used herein. Other position determining pixels may also be used in projection correction. For example, position determining pixels 708 and 710 represent the position determining pixel at the bottom left corner and the position determining pixel at the top left corner, respectively.
Positioning pixels may be used to estimate one or more horizontal and / or vertical text lines in the binarized image. Each text line represents an approximation of the horizontal or vertical text direction of the associated text portion.
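Connected component analysis and unique point extraction can be sketched as follows on a toy binary raster; a production system would use a library labeling routine, and the bottom-center unique point mirrors point 706 of FIG. 7A. All names and the tiny raster are illustrative.

```python
from collections import deque

# Hedged sketch: 4-connected component labeling on a small binary raster,
# then one "unique point" per blob at the bottom-center of its bounding box.
def label_blobs(img):
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                q, pix = deque([(x, y)]), []
                seen[y][x] = True
                while q:
                    cx, cy = q.popleft()
                    pix.append((cx, cy))
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < w and 0 <= ny < h and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((nx, ny))
                blobs.append(pix)
    return blobs

def unique_point(pix):
    """Bottom-center of the blob's bounding box, as for point 706."""
    xs = [p[0] for p in pix]; ys = [p[1] for p in pix]
    return ((min(xs) + max(xs)) // 2, max(ys))

# Two separate strokes on one row of "text".
img = [[0] * 10 for _ in range(5)]
for y in range(1, 4):
    img[y][1] = 1                 # a 1-wide vertical bar
    img[y][5] = img[y][6] = 1     # a 2-wide bar
blobs = label_blobs(img)
print(len(blobs), [unique_point(b) for b in blobs])  # 2 [(1, 3), (5, 3)]
```

The two strokes become two pixel blobs, each contributing one unique point on its baseline.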
  At step 110, a horizontal vanishing point is determined. In one embodiment, the horizontal vanishing point may be determined using a text baseline determined using positioning pixels. Various embodiments for determining horizontal vanishing points are described in connection with FIG.
  At step 112, vertical vanishing points are determined. In one embodiment, the vertical vanishing point is determined using margin lines identified using position determining pixels. In another embodiment, the vertical vanishing point may be determined using the vertical stroke features of the connected components. In yet another embodiment, vertical vanishing points are identified using both margin lines and vertical stroke features. Various embodiments for determining vertical vanishing points are described in connection with FIGS. 3 and 4.
  At step 114, projection correction of the image is performed using the horizontal vanishing point and the vertical vanishing point. Projection correction is performed based on the estimation of eight unknown parameters of the projection transformation model. An exemplary projection transformation model is provided below.
  In one embodiment, a horizontal projection transformation matrix and a vertical projection transformation matrix are constructed to estimate the parameters of the projection transformation model. The horizontal projection transformation matrix and the vertical projection transformation matrix are constructed using the equations provided below.

Where (v_x, v_y) is the vanishing point, (w, h) are the width and height of the document image, t_x = w/2, t_y = h/2,

Projection correction of the image is performed using the projection matrices.
In another embodiment, using the vertical vanishing point and the horizontal vanishing point, the corners (x_i, y_i) (1 <= i <= 4) of the original distorted image and their corresponding locations (X_i, Y_i) (1 <= i <= 4) in the distortion-free document image can be identified. A projection transformation model may be estimated based on the four pairs of corresponding corners. The projection transformation model may be estimated using the equations

X_i = (a·x_i + b·y_i + c) / (g·x_i + h·y_i + 1)
Y_i = (d·x_i + e·y_i + f) / (g·x_i + h·y_i + 1) ...... [4]
  The eight parameters may be obtained by using equation (4) following identification of the four corners in the projectively corrected image. Following construction of the projection transformation model, an overall preview of the projection correction is generated and displayed for user review, as shown in FIG. 8. The user may be provided with an option to accept the preview or with a tool to adjust the four corners. For example, as shown in FIG. 8, a graphical user interface element 804 may be provided to allow the user to adjust the corners. Upon a user input changing the corners, the projection transformation model may be modified and a corresponding projection correction performed. Upon acceptance without changes, the projection correction may be performed directly. The resulting image may be presented as shown in element 806 of FIG. 8. One skilled in the art will appreciate that appropriate additional options may also be provided to the user. An example of the result of projection correction is illustrated in FIGS. 9A and 9B. FIG. 9A shows a captured image. FIG. 9B shows the image after projection correction.
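Estimating the eight parameters from four corner pairs reduces to solving an 8x8 linear system; the sketch below does this with plain Gaussian elimination, assuming four corners in general position (a library such as OpenCV's getPerspectiveTransform would normally provide this). The corner coordinates are illustrative.

```python
# Hedged sketch: solving the eight homography parameters a..h from four
# corner correspondences (x_i, y_i) -> (X_i, Y_i) by Gaussian elimination.
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def homography(src, dst):
    """Return [a, b, c, d, e, f, g, h] mapping each src corner to dst."""
    A, rhs = [], []
    for (x, y), (X, Y) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -X * x, -X * y]); rhs.append(X)
        A.append([0, 0, 0, x, y, 1, -Y * x, -Y * y]); rhs.append(Y)
    return solve(A, rhs)

def apply_h(h, x, y):
    d = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / d,
            (h[3] * x + h[4] * y + h[5]) / d)

# A keystoned quad (a tilted page) mapped back to an upright rectangle.
src = [(0, 0), (100, 10), (95, 110), (5, 100)]
dst = [(0, 0), (100, 0), (100, 100), (0, 100)]
H = homography(src, dst)
Xc, Yc = apply_h(H, 100, 10)
print(round(Xc), round(Yc))  # 100 0
```

Applying the recovered transform to each distorted corner reproduces the rectified corner, which is exactly what the four-pair estimation guarantees.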
FIG. 2 illustrates an example method 200 for identifying horizontal vanishing points, according to one embodiment. In step 202, unique points may be identified. Unique points may be identified through connected component analysis of the image. Unique points are defined for all pixel blobs. In step 204, unique points are clustered into groups. In one embodiment, unique points may be processed prior to being clustered. Unique point processing can include removing confusing unique points. Confusing unique points are unique points that lie above or below the text baseline. The confusing unique points mainly arise from three sets of characters: the first set includes characters that consist of two blobs, where the smaller blob lies above the text baseline, such as 'j' and 'i'; the second set includes characters that extend below the text baseline when printed, such as 'p', 'q', and 'g'; and the third set includes characters such as the comma (,) and the hyphen (-). Confusing unique points associated with the first and third sets of characters may be identified based on the size of the pixel blob. The size of the pixel blobs associated with the first and third sets of characters can be quite small, either horizontally or vertically, compared to the other characters. Thus, confusing unique points may be identified by comparing the pixel blob size with the average value over all pixel blobs. In one example implementation, the widths and heights of all pixel blobs are calculated. In addition, the average values of the pixel blob width (m_w) and height (m_h) are calculated. Unique points belonging to pixel blobs whose width and/or height deviates from the calculated average values by a predetermined range are marked as confusing unique points. In one example, unique points of blobs having a width outside the range [0.3, 5] * m_w and/or a height outside the range [0.3, 5] * m_h are identified as confusing unique points. Such confusing unique points may be dropped from further processing.
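The blob-size test for confusing unique points might be sketched as follows, using the [0.3, 5] * m_w and [0.3, 5] * m_h ranges from the example; the toy blob sizes are illustrative.

```python
# Hedged sketch of the confusing unique point filter: blobs whose width
# or height falls outside [0.3, 5] times the page averages m_w, m_h are
# dropped (dots of 'i'/'j', commas, hyphens, page rules).
def filter_confusing(blobs):
    m_w = sum(b[0] for b in blobs) / len(blobs)
    m_h = sum(b[1] for b in blobs) / len(blobs)
    keep = []
    for w, h in blobs:
        if 0.3 * m_w <= w <= 5 * m_w and 0.3 * m_h <= h <= 5 * m_h:
            keep.append((w, h))
    return keep

# Mostly letter-sized blobs plus a comma-sized speck and a page rule.
blobs = [(12, 16)] * 20 + [(3, 3), (300, 2)]
print(len(filter_confusing(blobs)))  # 20
```

The speck falls below 0.3 * m_w and the rule exceeds 5 * m_w, so only the letter-sized blobs keep their unique points.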
The remaining unique points are classified and clustered into different unique point groups such that each unique point group contains unique points from the same text line. An example unique point clustering algorithm is illustrated in FIG. 3. The unique point clustering algorithm is generally based on the assumption that the unique points of a text line satisfy one or more of the following conditions: (1) the unique points are close to each other; (2) the unique points form a substantially straight line; and (3) the direction of the constructed line is close to the horizontal direction. In one embodiment, these conditions are converted into respective constraints in the unique point clustering algorithm, such that a unique point is assigned to a particular unique point group if at least one of the following holds: the point-to-point distance between this unique point and the other unique points of the group is below a first distance threshold T_d; the point-to-line distance between this unique point and the line formed by the unique points of the group is below a second distance threshold T_i; and the off-horizontal angle of the line formed by the unique points of the group is below a maximum angle T_a. Furthermore, to make the unique point clustering algorithm more robust, an additional constraint may be added that a unique point group must include at least a minimum number T_m of unique points.
In one embodiment, the constraints of the unique point clustering algorithm, i.e., the point-to-point distance threshold T_d, the point-to-line distance threshold T_i, the maximum off-horizontal angle threshold T_a, and the minimum number T_m of unique points in a unique point group, may be set adaptively based on an analysis of the image, e.g., analysis of a camera document image. In an alternative embodiment, the parameters may be set manually. T_a may be set to about 20 degrees off the horizontal direction, and T_m may be set to about 10, assuming that a text line has at least two or three words. It should be understood that other values can be selected for T_a and T_m. The values of T_d and T_i may depend on the content of the text in the document image. For example, if the character size is large, T_d and T_i may be set higher, and vice versa. In one embodiment, T_d and T_i may be adaptively calculated as follows. A median distance D_c is calculated based on all shortest distances between adjacent characters in words. T_i may be set to D_c and T_d may be set to 3 * D_c. These values are selected so that T_d is large enough to find adjacent characters and words in the same paragraph, while preventing words belonging to horizontally adjacent paragraphs from being considered to be in the same unique point group. Setting T_d large enough to search for adjacent characters and words in the same paragraph also allows for the identification of the margin line between a paragraph and a horizontally adjacent paragraph. In some illustrative examples, spaces between words in a single line can cause over-classification of the unique points in the line into multiple unique point groups. The over-classification may be due to large gaps between words, or to small or large connected components that may have been removed during the confusing unique point removal procedure.
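The adaptive computation of T_i and T_d from the median adjacent-character distance D_c can be sketched as follows; the input coordinates are a toy stand-in for unique points along one text line.

```python
# Hedged sketch of the adaptive thresholds: D_c is the median of the
# shortest distances between adjacent characters, then T_i = D_c and
# T_d = 3 * D_c, as the text describes. The coordinates are illustrative.
def adaptive_thresholds(xs):
    """xs: sorted x-coordinates of unique points along one text line."""
    gaps = sorted(b - a for a, b in zip(xs, xs[1:]))
    d_c = gaps[len(gaps) // 2]       # median adjacent-character gap
    return d_c, 3 * d_c              # (T_i, T_d)

# Characters roughly 10 px apart, with one wider word space.
xs = [0, 10, 20, 30, 40, 70, 80, 90, 100]
t_i, t_d = adaptive_thresholds(xs)
print(t_i, t_d)  # 10 30
```

The median is insensitive to the occasional word space, so T_d tracks character spacing rather than the largest gaps.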
At step 206, over-classified unique point groups are merged into corresponding larger groups. An exemplary unique point group merging algorithm may be described as follows. In each unique point group {C_i} (1 <= i <= n), the left end unique point l_i and the right end unique point r_i (1 <= i <= n) may be respectively identified. The pixel blob corresponding to the rightmost unique point of each unique point group is identified. The right adjacent pixel blob of the rightmost unique point is searched for among the discarded pixel blobs (e.g., pixel blobs corresponding to the confusing unique points). In response to identifying a right adjacent blob, the right adjacent blob may be set as the new right end point r_i. The step of searching for a further right adjacent pixel blob of the new right end point, as described in the previous step, may be repeated until no further right adjacent blob is found. When no further right adjacent blob exists, the unique point coordinates of that blob, denoted r_new_i, are recorded. Using the new array of right end points r_new_i (1 <= i <= n), a search index k is initialized to zero (0). The search index may be increased by one, i.e., k = k + 1, and the distances between l_k and r_new_i (1 <= i <= n) may be calculated. The unique point groups {C_k} and {C_i} corresponding to the pair of points l_k and r_new_i may be merged if at least one of the following conditions is satisfied: the distance between the unique point groups is within a predetermined distance (in one example implementation, the distance may be less than 0.5 * T_d), and the lines corresponding to the unique point groups are close to each other (e.g., the line distance is less than T_i). If unique point groups are merged, the number of unique point groups may be reduced by one, i.e., n = n - 1.
A check can be performed to determine whether the search index is equal to the number of unique point groups (k == n). If the search index is not equal, the search index is increased and the previous steps of calculating distances and merging unique point groups are performed again if the conditions described above are met. FIG. 10A shows an example image before unique point classification, with the unique points for the pixel blobs on the text baselines. FIG. 10B shows an example image after classification of the unique points into groups; the figure shows an image having a group in each of the text lines. For example, the first text line shows unique point group 1002. The second text line shown in the image shows over-classified unique point groups 1004 and 1006. The over-classified groups 1004 and 1006 (two groups) may be seen in the second line of text in FIG. 10B (indicated by the square symbols and the circular symbols for the corresponding unique point groups). FIG. 10C shows an example image with merged unique point groups. The over-classified groups 1004 and 1006 of the second line, as shown in FIG. 10B, are combined into one unique point group 1008 (indicated by plus marks).
At step 208, the text baselines are estimated using the grouped unique points provided after the clustering and merging steps. In one embodiment, the text baseline is estimated using a method based on adaptive weighted line estimation (hereinafter referred to as a priori line estimation). A priori line estimation can assign a weighting factor to each unique point used in the line estimation. Consider the scenario where n unique points, i.e., p1, p2, ..., pn, are used for the estimation of the line ax + by + c = 0 (or y = kx + t). To each of the unique points, a weighting factor w1, w2, ..., wn may be assigned. In this case, the line estimation
min_{k,t} Σ_{i=1..n} w_i · (y_i − k·x_i − t)² ...... [5]
may be considered as equivalent to the minimization problem defined by equation [5].
The minimum of the weighted sum of squares in equation [5] may be found by setting the gradient to zero. Since the model contains two (2) parameters, there are two (2) gradient equations. The minimization of the above equation leads to the following example pseudocode:
S_w = Σ w_i;  S_x = Σ w_i·x_i;  S_y = Σ w_i·y_i;  S_xx = Σ w_i·x_i²;  S_xy = Σ w_i·x_i·y_i
k = (S_w·S_xy − S_x·S_y) / (S_w·S_xx − S_x²)
t = (S_y − k·S_x) / S_w
The weighting factor for each unique point is determined by a weighting function, i.e.,
w_i = exp(−dis_i) ...... [6]
where dis_i is defined as the distance between the unique point and the expected text baseline. Thus, if a unique point is closer to the expected text baseline, the unique point is assigned a higher weighting factor, and vice versa. An iterative procedure can be used to converge toward the expected text baseline. In one example implementation, the iteration may be executed for a predetermined number of rounds (e.g., about 10 to 70 rounds), or until the difference between two successive line angles is lower than a threshold (e.g., about 0.01).
  The estimated lines may be further refined by removing outliers from the unique point groups. Outliers may be identified, for example, by using a Gaussian model. According to the Gaussian model, most of the unique points (e.g., about 99.7%) should be located within three standard deviations. Therefore, if a unique point is located beyond three standard deviations, it may be considered an outlier. The remaining unique points in the group may then be used for line estimation using conventional least squares. The a priori line estimation may be performed for all unique point groups. FIG. 11 shows a portion of an example of text for which baselines are estimated. The unique point groups are shown connected by lines. Example lines are highlighted within 1102.
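The iterative a priori line estimation of equations [5] and [6] might be sketched as follows: each round refits a weighted line and recomputes w_i = exp(−dis_i), stopping when the line angle stabilizes. The round count and tolerance follow the example values in the text; the outlier scenario is illustrative.

```python
import math

# Hedged sketch of adaptive weighted line estimation: weights are
# recomputed from the previous fit, so points near the evolving baseline
# dominate. Helper names and the toy data are assumptions.
def weighted_fit(pts, weights):
    """Closed-form weighted least squares for y = k*x + t."""
    sw = sum(weights)
    sx = sum(w * x for w, (x, _) in zip(weights, pts))
    sy = sum(w * y for w, (_, y) in zip(weights, pts))
    sxx = sum(w * x * x for w, (x, _) in zip(weights, pts))
    sxy = sum(w * x * y for w, (x, y) in zip(weights, pts))
    k = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    t = (sy - k * sx) / sw
    return k, t

def estimate_baseline(pts, rounds=20, tol=0.01):
    w = [1.0] * len(pts)
    k, t = weighted_fit(pts, w)
    for _ in range(rounds):
        # w_i = exp(-dis_i), dis_i = point-to-line distance  ... eq. [6]
        w = [math.exp(-abs(k * x - y + t) / math.hypot(k, 1)) for x, y in pts]
        k2, t2 = weighted_fit(pts, w)
        if abs(math.degrees(math.atan(k2)) - math.degrees(math.atan(k))) < tol:
            return k2, t2
        k, t = k2, t2
    return k, t

# A flat baseline with one high outlier (e.g. the dot of an 'i').
pts = [(x, 50.0) for x in range(0, 100, 10)] + [(45, 20.0)]
k, t = estimate_baseline(pts)
print(round(k, 3), round(t, 1))
```

The outlier receives an exponentially small weight after the first round, so the fit converges to the true baseline y = 50 rather than being dragged toward the stray point.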
  At step 210, horizontal vanishing points may be identified using the estimated text baselines. According to homogeneous coordinate theory, each horizontal line in the Cartesian coordinate system may be regarded as a data point in homogeneous space, and the line passing through these data points corresponds to the vanishing point. Therefore, horizontal vanishing point identification may be viewed as a line fitting problem in the homogeneous coordinate system.
  Although the text baselines are carefully estimated, some text baselines can contribute outliers in terms of vanishing point estimation. Such outlier data points can be removed to improve the estimate of the horizontal vanishing point. Outliers may result from inaccurate line estimates, non-text components (e.g., in case of failure of the separation of text and picture), distortion, and the like. To overcome this problem, according to one embodiment, a method based on the conventional Random Sample Consensus (RANSAC) algorithm, as described in Martin A. Fischler and Robert C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography", Comm. of the ACM 24(6): pages 381-395, June 1981, is used for identifying the horizontal vanishing point. A RANSAC-based algorithm is chosen due to its robustness in removing outliers when estimating model parameters. The proposed RANSAC-based algorithm differs from the conventional RANSAC algorithm in the way the initial data points are selected for model parameter estimation, in that their confidence levels are taken into account. Unlike the random selection of initial data points in the conventional RANSAC algorithm, the proposed RANSAC-based algorithm selects the initial samples with the highest confidence levels.
  An example implementation of the proposed RANSAC-based algorithm is described below.
  In one embodiment, each of the estimated text baselines may be defined in a Cartesian coordinate system. Each of the text baselines defined in the Cartesian coordinate system may be converted to data points in the homogeneous coordinate system.
A confidence level may be assigned to each of the data points. The confidence level for a data point may be determined based on the proximity of the unique points used to estimate the text baseline to the resulting text baseline, and on the length of each text baseline. The confidence level for each horizontal text baseline is
c_i = (l_i / l_max) · (s_max − s_i) / (s_max − s_min) ...... [7]
where s_max and s_min denote the maximum standard deviation and the minimum standard deviation over all n line segments, and l_max represents the length of the longest of the n line segments. Thus, longer horizontal text baselines are assigned higher confidence levels. This is based on the assumption that the longer a horizontal text baseline is, the better its estimation. Similarly, the lower the standard deviation (indicating the proximity of the unique points to the corresponding estimated text baseline), the better the text baseline estimation; as a result, such text baselines are assigned higher confidence levels. Data points having confidence levels above a predetermined threshold may be grouped into a priority sample array. Data points in the priority sample array may be clustered into several sample groups. In one embodiment, each sample group can include more than one data point. In line estimation, the accuracy may also be determined by the distance between the data points used to estimate the line: if the two data points are far apart from each other, there is a higher degree of confidence that the line estimate will be accurate. Hence, a second confidence level indicator may be assigned to each point pair in a sample group, i.e.,

where Dis_{j,k} is the distance between line j and line k in the vertical direction, and Dis_max is the maximum distance over the m * (m - 1) pairs of lines. A selection of m (m << n) lines may be used to formulate a priority sample group by selecting the first m lines with the best confidence levels. Each sample group may also be assigned a group confidence value based on at least the confidence level assigned to each data point in the sample group.
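The two confidence equations referenced above are images that are not reproduced in this text. The sketch below shows plausible confidence indicators consistent with the surrounding description; the linear blend of the length term and the standard-deviation term, and the weight `w`, are assumptions, not the patent's exact formulas.

```python
def baseline_confidence(length, std, l_max, s_max, s_min, w=0.5):
    """Confidence for one text baseline: longer baselines and lower
    point-to-line standard deviation get higher confidence. The exact
    formula in the patent is not reproduced here; this linear blend
    of the two normalized terms is an assumption."""
    len_term = length / l_max                        # 1.0 for the longest baseline
    if s_max > s_min:
        std_term = (s_max - std) / (s_max - s_min)   # 1.0 for the tightest fit
    else:
        std_term = 1.0                               # all baselines equally tight
    return w * len_term + (1.0 - w) * std_term

def pair_confidence(dist_jk, dis_max):
    """Second indicator: point pairs that are far apart in the vertical
    direction give more reliable line fits (assumed normalization by the
    maximum pairwise distance Dis_max)."""
    return dist_jk / dis_max
```

The longest, tightest-fit baseline then receives the maximum confidence of 1.0, matching the qualitative behaviour described in the text.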
  At step A, sample groups of data points may be iteratively selected from the priority sample array for line fitting. The iteration may start from the sample group with the highest confidence value in the priority sample array. (If the number of iterations exceeds a certain threshold, the iteration may be stopped and the algorithm proceeds to step F.) At step B, line fitting may be performed for the first sample group, resulting in a first fitted line; line fitting is then performed for each further sample group, resulting in further fitted lines.
  At step C, the set of data points located within a predetermined distance threshold from the first fitted line may be determined based on the first fitted line and the further fitted lines. These data points are called inliers. The predetermined distance threshold from the first fitted line may be a fixed parameter or may be set adaptively based on the content of the document image. At step D, a count of the data points located within the predetermined distance threshold from the first fitted line is calculated, and the maximum number of inliers determined so far is recorded. At step E, a check may be performed to determine whether the maximum number of inliers is equal to the number of data points. If it is not, the iteration count is updated and step A is started again. Step F is initiated if the maximum number of inliers is equal to the number of data points.
At step F, vanishing points can be estimated using the largest set of inliers. In one embodiment, first and second candidate horizontal vanishing points may be estimated using different approximation methods selected from the group consisting of least squares, weighted least squares, and/or adaptive least squares. The use of other approximation methods is also contemplated herein. At step G, the candidate horizontal vanishing point closest to the horizontal text direction of the image document after projection correction may be selected. The proximity to the horizontal text direction is:

where n is the number of horizontal lines in the document image, α_i is the angle of the ith line with respect to the horizontal after projection correction has been performed (180° ≥ α_i ≥ 0°), and p is the index of the pth candidate horizontal vanishing point selected from the m candidate vanishing points.
  The conventional RANSAC algorithm uses randomly selected points for the initial line estimation. As a result, a different result may be obtained each time the conventional RANSAC algorithm is run, and the result of the conventional RANSAC algorithm can be difficult to predict. The proposed RANSAC-based algorithm addresses this problem by incorporating a priori knowledge of the points. In the proposed RANSAC-based algorithm, points with good confidence levels are selected first to estimate the inliers. As a result, the proposed RANSAC-based algorithm provides more consistent results.
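Steps A through F above can be sketched as follows. This is an illustrative simplification, not the patent's implementation: the sample groups are assumed to be pre-sorted by group confidence, the data points are treated as 2-D points in the homogeneous representation, and the helper names are hypothetical.

```python
import math
import numpy as np

def fit_line(points):
    """Total-least-squares line a*x + b*y + c = 0 through 2-D points."""
    pts = np.asarray(points, float)
    centroid = pts.mean(axis=0)
    # principal direction of the centered points gives the line direction
    _, _, vt = np.linalg.svd(pts - centroid)
    d = vt[0]
    n = np.array([-d[1], d[0]])          # unit normal to the line
    return n[0], n[1], -n.dot(centroid)  # coefficients (a, b, c)

def point_line_dist(p, line):
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / math.hypot(a, b)

def confidence_ordered_ransac(data_points, sample_groups, dist_thresh, max_iter=100):
    """Modified RANSAC sketch (steps A-F): sample groups are tried in
    decreasing group-confidence order instead of at random. Returns the
    largest inlier set found."""
    best_inliers = []
    for it, group in enumerate(sample_groups):       # step A: highest confidence first
        if it >= max_iter:
            break                                    # iteration threshold exceeded
        line = fit_line(group)                       # step B: line fitting
        inliers = [p for p in data_points
                   if point_line_dist(p, line) < dist_thresh]  # step C: inliers
        if len(inliers) > len(best_inliers):         # step D: record the maximum
            best_inliers = inliers
        if len(best_inliers) == len(data_points):    # step E: every point is an inlier
            break
    return best_inliers                              # step F estimates the vanishing point from these
```

Because the first group tried is the most trusted one, repeated runs on the same input yield the same inlier set, which is the consistency property claimed for the proposed algorithm.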
  Although this disclosure describes using unique points for horizontal vanishing point determination, it should be understood that other position-determining pixels of a pixel blob may also be used for horizontal vanishing point determination.
FIG. 3 describes an example unique point clustering algorithm 300 according to one embodiment. At step 302, a set of unique points "I" may be identified. At step 304, the unique points can be counted to determine whether their number is sufficient to generate unique point groups. If the number is sufficient (at least above a threshold number T_m), the unique point set "I" may be processed. The threshold number may be set as a constraint on the generation of unique point groups. If the number of unique points is less than the threshold, step 324 may be performed. In one example implementation, the threshold number of unique points may be ten, suggesting the presence of at least two or three words in a single line. The threshold may be set to prevent unrelated unique points from being assigned to a unique point group.
At step 306, a unique point (e.g., p_0) is randomly selected from the set I of unique points. The unique point p_0 may be entered as the first unique point in a candidate line group "C". In one embodiment, candidate line group C may be a bi-directional queue. Furthermore, the unique point p_0 is removed from the set I of unique points. The unique points on one side of p_0 are then input to candidate line group C.
At step 308, the most recently added unique point p_i of candidate line group C is selected from one side of the bi-directional queue (e.g., the non-negative direction, i >= 0). The unique point p* closest to p_i is identified from the set I of unique points.
At step 310, the distance between unique points p_i and p* is calculated. If the distance is below the threshold distance T_d, step 312 is performed. If the distance is above the threshold distance T_d, step 314 is performed. The threshold distance represents the maximum allowed distance between unique points inside a group. In one example implementation, the threshold distance between the unique points of a group is below a first distance threshold, which may be three times the median distance between closest adjacent unique points.
At step 312, it is determined whether the selected unique point p* satisfies the constraints imposed by the point-to-line distance threshold T_l and the horizontal proximity threshold T_a. The point-to-line distance threshold T_l may specify the maximum distance from the text baseline to a point for the unique point to be selected for the unique point group; it is used to select unique points that contribute to forming a straight line. The horizontal proximity threshold T_a may define the maximum angle of the unique point from the line with respect to the horizontal for the unique point to be selected for the unique point group; it is used to select unique points that contribute to forming a line whose direction is close to horizontal. In one example implementation, T_a can be twenty (20) degrees. In response to determining that the selected unique point p* satisfies the constraints, the unique point p* may be selected for candidate line group C as point p_{i+1} in the bi-directional queue (in the non-negative direction), with i = i + 1. In response to determining that the selected unique point p* does not satisfy the constraints, the unique point p* may be placed in a special line group "L".
Process steps 308 to 312 are repeated until all unique points on one side (the non-negative direction of the bi-directional queue) have been evaluated. Upon completion of the evaluation of one side, the remaining unique points on the other side of p_0 are taken into account (the non-positive direction of the bi-directional queue). The remaining unique points on the other side of p_0 are input to candidate line group C.
At step 314, the most recently added unique point p_j of candidate line group C is selected from the other side (the non-positive direction of the bi-directional queue, j <= 0). The unique point p* from the set I closest to the unique point p_j on the other side of candidate line group C is identified. At step 316, the distance between the unique points p_j and p* is calculated. If the distance is below T_d, step 318 is performed. If the distance is above T_d, step 320 is performed.
At step 318, it is checked whether the selected unique point p* satisfies the constraints with respect to T_l and T_a. In response to determining that the unique point p* satisfies the constraints, the unique point p* may be selected for candidate line group C as point p_{j-1} in the bi-directional queue (in the non-positive direction), with j = j - 1. In response to determining that the unique point does not satisfy the constraints, the unique point p* may be placed in the special line group "L".
  The process steps 316 to 318 are performed until all unique points from the other side have been evaluated.
At step 320, the unique points in candidate line group C may be counted to determine whether their number is above the threshold number T_m. If the number is above T_m, step 322 is performed. If the number is below T_m, the process returns to step 304 to determine whether any other unique points remain for processing. At step 322, candidate line group C is assigned an index number, such that candidate line group C becomes the unique point array for the line indexed by that number.
At step 324, each unique point in the special line group L is checked to determine whether it is within the constraints T_m, T_l and T_a for any of the line groups. In response to determining that a unique point is within the constraints T_m, T_l and T_a, the unique point is merged into the corresponding line group.
  The process is repeated for every text baseline until all lines in the document image have been processed.
  One advantage of the unique point clustering algorithm described herein is that it provides consistent clustering results regardless of the initial point used to seed a cluster. The use of a bi-directional queue allows the line to grow from two end points rather than from one end point in a single direction, thereby reducing the algorithm's dependence on the seed point used to form the point groups. The unique point clustering algorithm is flexible in the sense that it does not require each unique point to belong to one of the point groups: unique points not included in any of the groups are truncated or ignored. This results in simpler, faster convergence for the proposed unique point clustering algorithm than for conventional clustering algorithms. Nevertheless, the use of conventional or any other clustering algorithms to cluster unique points into different line groups is also contemplated herein.
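The bi-directional growth of one candidate line group (steps 306-318) can be sketched as below. This is a simplification under stated assumptions: the point-to-line distance test (T_l) of the full algorithm is omitted, only the distance threshold and the angle-to-horizontal test are kept, and `grow_candidate_line` is a hypothetical name.

```python
from collections import deque
import math

def grow_candidate_line(points, seed_idx, t_d, t_a_deg=20.0):
    """Grow a bi-directional queue of unique points from a seed point:
    repeatedly take the remaining point nearest to the current end of the
    queue, accept it while it is within distance t_d and keeps the step
    within t_a_deg of horizontal, and stop a side when the nearest point
    is too far away. Simplified sketch of the FIG. 3 procedure."""
    remaining = set(range(len(points))) - {seed_idx}
    queue = deque([seed_idx])
    for side in ("right", "left"):          # the two directions of the queue
        while remaining:
            end = queue[-1] if side == "right" else queue[0]
            ex, ey = points[end]
            near = min(remaining, key=lambda i: math.dist(points[i], (ex, ey)))
            nx, ny = points[near]
            if math.dist((nx, ny), (ex, ey)) > t_d:
                break                        # nearest point too far: stop this side
            angle = abs(math.degrees(math.atan2(ny - ey, nx - ex))) % 180
            angle = min(angle, 180 - angle)  # angle of the step from horizontal
            remaining.discard(near)
            if angle > t_a_deg:
                continue                     # would go to special group L in the full algorithm
            if side == "right":
                queue.append(near)
            else:
                queue.appendleft(near)
    return [points[i] for i in queue]
```

Because the queue grows from both ends, the same cluster is recovered whichever point on the line is used as the seed, which is the consistency property noted above.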
  FIG. 4 illustrates an example process flow 400 for identifying vertical vanishing points using margin feature points according to one embodiment. At step 402, margin feature points may be identified. The margin feature points may be position-determining pixels according to one embodiment. Margin feature points may be identified as described below. In one embodiment, the margin feature point may be the lower-left end pixel of a pixel blob for the left margin, and the lower-right end pixel of a pixel blob for the right margin. The lower-left end point may be identified by finding the blob associated with the leftmost unique point in a unique point group (e.g., identified during horizontal line estimation). The unique point groups determined after the unique point merging step, and prior to the use of unique point groups for horizontal line formation, may be used for margin point determination. The reason for using the unique points after merging is that the leftmost or rightmost unique points may correspond to merged blobs; such unique points may not yet have been removed just prior to line formation. The leftmost unique point may be found by comparing the x-coordinates of the unique points in the group. The corresponding blob of the leftmost unique point may then be found, and the lower-left end point of that blob may be used as a left margin feature point. Similarly, the lower-right end point may be identified by finding the blob associated with the rightmost unique point in the unique point group. After identifying the blob at the right end of the unique point group, it may be determined whether there is an adjacent blob near the identified right blob. A blob search is then performed using a process similar to that used in the adjacent blob search algorithm of the unique point merging procedure.
The lower-right end point corresponding to the blob found is then used to form feature points for right margin line estimation. In alternative embodiments, other variations of margin feature points may be used. FIG. 12 shows an example image in which margin feature points are identified in the margin; the margin feature points are marked by dots in the margin, as shown inside 1202. Paragraph margins are usually vertical and parallel when no projection distortion occurs.
At step 404, the margin feature points are clustered into different margin groups. Margin feature points along the margin lines of the document in the image may be used to estimate the margins. In one embodiment, the margin feature points may be clustered based on the proximity of pixel blobs within the corresponding margin. In one illustrative example, a clustering algorithm similar to the unique point clustering algorithm described in connection with FIG. 3 may be used to cluster the margin feature points. In an alternative embodiment, a different point clustering algorithm may be used, such as the one described below.
Step 1: Set the margin point feature distance threshold TEnd_th and denote all left margin points identified at step 402 as {P_i}.
Step 2: Initialize the left margin point group {C_1} with one point selected randomly from {P_i}, remove that point from {P_i}, and set group_index = 1.
Step 3: For each point in {P_i}, calculate the minimum distance between this point and the points in {C_i} (group_index ≥ i ≥ 1). If the distance is lower than TEnd_th, this point is assigned to the point group reaching the minimum distance; otherwise the group index is increased by 1, i.e. group_index = group_index + 1, and this point is assigned to the newest left margin point group {C_group_index}.
TEnd_th is set equal to 6 * T_d (where T_d is the median distance between the unique points, as discussed above in connection with FIG. 2); this value may be selected so that it is sufficient for finding adjacent margin point features that are expected to lie on the same margin line. The end point clustering method may differ from the unique point clustering method for horizontal line estimation in that the end point clustering algorithm can use all margin points, whereas in the unique point clustering algorithm some unique points may be removed during the clustering process.
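Steps 1-3 of the margin point clustering above can be sketched as follows; `cluster_margin_points` is a hypothetical name, and points are taken in list order rather than from a random initialization.

```python
import math

def cluster_margin_points(points, tend_th):
    """Assign each margin point to the existing group reaching the minimum
    distance to it, if that distance is below tend_th; otherwise start a
    new group. Unlike the unique point clustering, every margin point is
    kept (steps 1-3 of the left margin point clustering)."""
    groups = []
    for p in points:
        best, best_d = None, float("inf")
        for g in groups:
            d = min(math.dist(p, q) for q in g)  # minimum distance to this group
            if d < best_d:
                best, best_d = g, d
        if best is not None and best_d < tend_th:
            best.append(p)        # assign to the group reaching the minimum distance
        else:
            groups.append([p])    # start a new left margin point group
    return groups
```

With tend_th = 6 * T_d, margin points stacked along the same margin line fall into one group while points of a different column start a new group.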
  Other clustering algorithms may also be used in alternative embodiments. The clustered position-determining pixels identified in the margins may be processed into different margin point groups. For example, if there are two columns in the document image, position-determining pixels for the left and right margins of both columns are identified and grouped accordingly. At step 406, over-segmented margin lines may be merged with corresponding margin lines. For example, two or more lines along the same margin may be merged into a single margin.
At step 408, vertical line estimation may be performed using the margin point groups. As in the unique point clustering algorithm, not all margin point groups may be used for vertical line estimation. To be suitable for margin line estimation, the margin feature pixels of a group must meet one or more of the following conditions: a minimum number of points on the margin line P_th (e.g., the threshold for P_th may be 3 points), a minimum percentage of points on the margin line P_l (e.g., about 50%), a maximum angle α_v of the line from the vertical direction (e.g., the maximum angle may be about 20°), and a minimum non-boundary point confidence level P_b (e.g., the minimum may be about 50%).
A margin point feature (which contributes to P_th) is counted as being on the margin line if the distance between the pixel decision point and the margin line is within a threshold T_l; this threshold T_l may be regarded as equal to the median unique point distance T_d in one example implementation. The percentage of points on the margin line P_l may be defined as the ratio between the number of points within the margin line in the clustered margin point group and the total number of margin point features. In some embodiments, there may be pixel decision points that are out of range. For example, when document content is partially captured, the boundaries of the image can contain content that is only partially captured. Pixel decision points associated with such blobs at the boundaries may be defined as boundary points. Boundary points may not be used in margin line estimation, and the percentage of non-boundary points may be defined as the ratio between the number of non-boundary points in the clustered margin point feature group and the total number of margin point features. The minimum non-boundary point confidence level P_b may be defined as the percentage of points on the margin line multiplied by the percentage of non-boundary points.
In one embodiment, vertical line estimation may be performed using vertical offset least squares, although alternative methods are also contemplated herein. Suppose that a possible near-vertical line is represented as y = kx + t. Using the vertical offset least squares method, the optimal line coefficients are given by the following minimization objective:

It corresponds to
  Based on the vertical offset least squares method, an iterative robust method for near-vertical line estimation, as described below, may be used according to one embodiment.
In step 1, the line is initialized using the vertical offset line estimation method. In step 2, the distance of each sample point from the line is calculated. In step 3, the line function is recalculated based on the weighted vertical offset method. In step 4, the angular difference between successive estimated lines may be calculated. The method proceeds to step 5 if the angular difference is below a predetermined threshold or if the iteration count exceeds the maximum allowable number of iterations; otherwise, the next iteration is performed by proceeding to step 2. At step 5, the final line function is calculated. The predetermined threshold and the maximum allowable number of iterations may take the same values as the respective parameters in the horizontal line estimation method according to one embodiment; alternatively, values different from those used for horizontal line estimation may be used. The weighted vertical offset method may be implemented using the following example pseudo code:

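The pseudo code referenced above is an image not reproduced in this text. The sketch below shows one plausible realization of steps 1-5, under stated assumptions: the closed-form weighted vertical-offset fit is standard weighted least squares for y = kx + t, and the down-weighting function `1 / (1 + r)` is an assumption, not the patent's weighting.

```python
import numpy as np

def vertical_offset_fit(x, y, w=None):
    """Weighted least squares for y = k*x + t, minimizing the weighted sum
    of squared vertical offsets sum_i w_i * (y_i - k*x_i - t)^2."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    w = np.ones_like(x) if w is None else np.asarray(w, float)
    A = np.stack([x, np.ones_like(x)], axis=1)
    W = np.diag(w)
    k, t = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return k, t

def robust_vertical_line(x, y, angle_tol_deg=0.1, max_iter=20):
    """Iterative robust estimation (steps 1-5): refit with weights that
    shrink with each point's vertical offset until the line angle change
    between iterations falls below the threshold."""
    k, t = vertical_offset_fit(x, y)                         # step 1: initialize
    for _ in range(max_iter):                                # bounded iterations
        r = np.abs(np.asarray(y, float) - (k * np.asarray(x, float) + t))  # step 2
        w = 1.0 / (1.0 + r)                                  # assumed down-weighting
        k_new, t_new = vertical_offset_fit(x, y, w)          # step 3: weighted refit
        diff = abs(np.degrees(np.arctan(k_new) - np.arctan(k)))  # step 4: angle change
        k, t = k_new, t_new
        if diff < angle_tol_deg:                             # step 5: converged
            break
    return k, t
```

Points far from the current line receive small weights on the next pass, so a few stray margin points perturb the estimated line less than in a single unweighted fit.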
  In another embodiment, vertical line estimation may be performed using x-y exchangeable weighted least squares. In the x-y exchangeable weighted least squares method, the x and y coordinates may be exchanged prior to the estimation of the vertical line, so that the offsets calculated during vertical line estimation remain well constrained.
Once the vertical lines have been estimated, they may be merged. For example, multiple broken margin lines along the same margin can be merged to form a single margin. Vertical lines may be merged using the following steps. In step 1, an x-coordinate may be calculated for each margin line, keeping the vertical coordinate (y-coordinate) fixed. In step 2, the x-coordinate distance between margin lines may be calculated. The margin lines may be merged if the x-coordinate distance is below a threshold T_vth. T_vth may be chosen to be 2 * T_d, where T_d may be the median distance between the margin feature points. When there are multiple vertical lines, the closest vertical lines may be merged before they are used for vertical vanishing point identification. FIG. 13 shows an example image showing two estimated vertical lines 1302A and 1302B along the same margin. FIG. 14 shows an example image showing the merging of the estimated vertical lines of FIG. 13 into a single margin 1402.
  At step 410, vertical vanishing points may be identified using the estimated vertical lines. The determined vertical lines may be processed using a modified RANSAC algorithm, as described below, which is very similar to the method used for horizontal vanishing point identification. The estimated vertical margin lines resulting from the merging step may be defined in a Cartesian coordinate system. Each of the estimated vertical margin lines is then transformed from the Cartesian coordinate system to a data point in a homogeneous coordinate system. As was done for horizontal vanishing point identification, a confidence level may be assigned to each of the data points based on the proximity of the margin points used to estimate the resulting margin line, as well as on the length of each margin line. The set of data points having confidence levels above a predetermined threshold is grouped into a priority sample array. In addition, data points in the priority sample array are clustered into several sample groups. In one embodiment, each of the sample groups includes two or more data points. Furthermore, a group confidence value may be assigned to each sample group based on the confidence levels assigned to the data points in the sample group. Sample groups of data points may be iteratively selected from the priority sample array for line fitting. In one embodiment, the iteration may start from the sample group having the highest confidence value in the priority sample array. Line fitting for the first sample group may be performed, resulting in a first fitted line. Line fitting for each additional sample group may be performed subsequently, resulting in further fitted lines. The set of data points positioned within a predetermined distance threshold from the first fitted line is determined based on the first fitted line and the further fitted lines.
First and second candidate vertical vanishing points may be estimated from the vertical lines corresponding to the determined set of data points. In one embodiment, the first and second candidate vertical vanishing points may be estimated using different approximation methods, such as least squares, weighted least squares, and/or adaptive least squares. Other approximation methods may also be used. The proximity of each vertical vanishing point candidate to the resulting vertical text direction after projection correction may be compared, and the vertical vanishing point candidate closest to the vertical text direction of the image document after projection correction may be selected.
  If the number of detected margin lines is relatively small (e.g., less than 5), it is also possible to calculate the vanishing point directly using a weighted vertical vanishing point identification method. Using this method, each of the estimated vertical margin lines is transformed from the Cartesian coordinate system to a data point in a homogeneous coordinate system. Confidence levels for each of the data points may be assigned as described above. Thereafter, weighted least squares can be used to fit the line corresponding to the vertical vanishing point.
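The direct weighted fit above can be sketched in homogeneous coordinates as follows. This is an illustrative formulation, not necessarily the patent's exact weighted least squares: each margin line is a homogeneous 3-vector l = (a, b, c) with a*x + b*y + c = 0, and the vanishing point v minimizes the weighted residual sum_i w_i * (l_i . v)^2 subject to ||v|| = 1, i.e. the eigenvector for the smallest eigenvalue of sum_i w_i * l_i l_i^T.

```python
import numpy as np

def weighted_vanishing_point(lines, weights):
    """Fit the common intersection (vanishing point) of a small set of
    homogeneous lines by confidence-weighted least squares. Returns the
    point normalized to (x, y, 1) when it is finite."""
    L = np.asarray(lines, float)              # one homogeneous line per row
    w = np.asarray(weights, float)            # confidence level per line
    M = (L * w[:, None]).T @ L                # sum_i w_i * l_i l_i^T
    eigvals, eigvecs = np.linalg.eigh(M)
    v = eigvecs[:, 0]                         # eigenvector of smallest eigenvalue
    if abs(v[2]) > 1e-12:
        v = v / v[2]                          # back to Cartesian (x, y, 1)
    return v                                  # v[2] == 0 means a point at infinity
```

When the margin lines are exactly concurrent the smallest eigenvalue is zero and the fit is exact; noisy or low-confidence lines are simply down-weighted.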
  FIG. 5 illustrates an example process 500 for identifying vertical vanishing points using connected component analysis, according to one embodiment. Process 500 may be employed when vertical margin lines are not available due to the lack of margins. The vertical vanishing point may be identified using the text stroke features of pixel blobs, which are the constituent units of text characters. At step 502, the text stroke features of the pixel blobs may be identified. FIG. 15 shows an example image illustrating the identification of the text stroke features of characters. A portion of the text identified by circle 1502 is shown on the right of the figure. Vertical text stroke features 1504 of the characters "dans la" are identified and shown.
At step 504, a set of pixel blobs may be identified using text stroke features in accordance with one or more defined criteria. In one embodiment, a pixel blob may be selected if one or more of the following are satisfied: the pixel blob is not close to the image boundary, the eccentricity of the pixel blob is greater than 0.97, the angle of the text stroke is between 70° and 110°, and the area of the pixel blob lies within [0.3, 5] * area_m. The eccentricity can be used to indicate how close the pixel blob is to a circular shape. Because the eccentricity of a circular shape is zero, the smaller the eccentricity value, the more circular the pixel blob. If the eccentricity of the pixel blob is greater than 0.97, the pixel blob looks like a line segment and can therefore be a blob that exhibits the vertical distortion. In one embodiment, the eccentricity of a pixel blob may be found by identifying an ellipse surrounding the pixel blob and then calculating:

where a and b represent the major and minor axes of the ellipse. In languages such as Chinese and Russian, optional pre-processing procedures such as edge detection and mathematical morphological filtering can be used to enhance the pixel blob eccentricity feature. Pixel blobs with an eccentricity not exceeding 0.97 may be filtered out using an appropriate filter. Pixel blobs close to the image boundaries may not be used for estimation; in one embodiment, proximity filtering can be used to remove pixel blobs that intersect the image boundaries. Similarly, in one embodiment, angular filtering may be performed to filter out pixel blobs having text strokes whose angles are not between 70° and 110°. Pixel blobs with areas within [0.3, 5] * area_m may be selected. To identify blobs within such a range, a robust method can be used to estimate the median area area_m of the pixel blobs selected after applying the filtering criteria described above. Pixel blobs whose area values are in the range [0.3, 5] * area_m are used for vertical vanishing point estimation. FIG. 16 shows an example image showing the blobs selectively extracted after identification of text stroke features.
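The eccentricity equation itself is an image not reproduced in this text; the sketch below assumes the standard ellipse eccentricity e = sqrt(1 - (b/a)^2), which matches the stated behaviour (0 for a circle, approaching 1 for a line-segment-like blob). The helper names are illustrative.

```python
import math

def ellipse_eccentricity(a, b):
    """Eccentricity of the ellipse enclosing a pixel blob, with a and b the
    major and minor axes: 0 for a circle, tending to 1 as the blob becomes
    line-segment-like. Assumed form e = sqrt(1 - (b/a)^2)."""
    a, b = max(a, b), min(a, b)              # tolerate swapped axes
    return math.sqrt(1.0 - (b / a) ** 2)

def is_stroke_like(a, b, threshold=0.97):
    """Keep blobs whose eccentricity exceeds the 0.97 threshold: these look
    like line segments and can carry the vertical distortion direction."""
    return ellipse_eccentricity(a, b) > threshold
```

A 10:1 blob (e ≈ 0.995) passes the filter, while a 3:2 blob (e ≈ 0.745) is rejected as too circular.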
  The selected pixel blobs are used to estimate vertical text blob lines. The vertical lines are estimated at step 506, using a line function that corresponds to the direction of each pixel blob. FIG. 17 shows an example image showing estimated vertical text blob lines for the selected pixel blobs.
  At step 508, vertical vanishing points may be determined using the vertical lines. In one embodiment, the vertical vanishing points may be determined using the modified RANSAC algorithm described previously. FIG. 18 shows an example image showing the vertical text blob lines selected as a result of applying the modified RANSAC algorithm. For the sake of brevity, a brief description summarizing the application of the modified RANSAC algorithm to vertical lines is provided below. Each of the estimated vertical text blob lines is defined as a line in a Cartesian coordinate system. Each of the estimated vertical text blob lines is further transformed from Cartesian coordinates to a data point in homogeneous coordinates. Confidence levels for each of the data points may be assigned; the confidence level may be based on at least the eccentricity of the shape of the pixel blob used to estimate each vertical text blob line. Furthermore, the modified RANSAC method is applied as described above in connection with the preceding figures to determine the vertical vanishing point.
  The projection correction algorithm may be implemented as a set of computer instructions that, when loaded onto a computing device, produce a machine for performing the functions described herein. These computer program instructions may also be stored in a non-transitory computer-readable memory that can direct a computer or other programmable data processing device to function in the manner described. The projection correction algorithm may also be implemented in computer-based systems, or as hardware or a combination of hardware and software that may be implemented in connection with computer-based systems. One skilled in the art will appreciate that a computer-based system includes an operating system associated with a server/computer and various supporting software. Projection correction algorithms as described herein may also be deployed by an organization and/or a third-party vendor associated with the organization.
  The projection correction algorithm may be a stand-alone application, or a modular application (e.g., a plug-in) residing on a user device that may be integrated with other applications, such as image processing applications and OCR applications. For example, a stand-alone application can reside on a personal computer, portable computer, laptop computer, netbook computer, tablet computer, smartphone, digital still camera, video camera, mobile communication device, portable personal digital assistant, scanner, multifunction device, or any device having a processor capable of performing the operations described herein, such as obtaining a document image. In another contemplated implementation, a portion of the projection correction algorithm may be performed by a user device (e.g., a user's camera), and other portions of the projection correction algorithm may be performed by a processing device coupled to the user device (e.g., a user's personal computer). In this case, the processing device can perform the more computationally expensive tasks. The projection correction algorithm may also be implemented as a server-based application residing on a server (e.g., an OCR server) accessible to user devices through a network, or as a network-based application having modules implemented across multiple networked devices.
In summary, the present disclosure provides various embodiments of methods for projection correction of perspective-distorted images, such as camera-based document images, which methods comprise at least one of the following techniques:
- Use of unique points to estimate horizontal vanishing points. Generally, it is preferred to use one of the pixels on the baseline of a blob's bounding box as the position-determining pixel, since these baselines are mostly aligned for multiple sequential characters in a text portion. Among these, the unique points are preferred because they are a by-product of standard connected component analysis, so that no additional processing step is required to obtain them for each pixel blob.
- A unique point selection procedure is proposed to select unique points that can be used for text line estimation. An example is disclosed that removes confusing unique points and clusters or merges the remaining unique points into groups. Furthermore, the unique point clustering directly yields the estimated baselines.
- The left and right end points of the baselines of the text portions are used as margin feature points for margin line estimation. Left and right end point clustering algorithms are proposed to estimate the margin lines.
- To identify inliers in vanishing point estimation, an adaptation of the conventional RANSAC algorithm, which may be referred to as modified RANSAC, is proposed, wherein the conventional algorithm is improved by taking into account a priori knowledge, e.g., confidence values or confidence levels.
-A vanishing point selection procedure is employed to select among a number of candidate vanishing points that may be determined in different ways.
-Weighted line estimation using confidence levels is proposed for horizontal vanishing point estimation, and adaptive weighted line estimation is proposed for vertical vanishing point estimation.
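Weighted line estimation of this kind can be sketched with a closed-form weighted least squares fit, using the confidence levels directly as weights (an illustrative assumption, not necessarily the patent's weighting scheme):

```python
def weighted_line_fit(xs, ys, weights):
    """Weighted least squares fit of y = m*x + b, where each point's
    confidence level acts as its weight."""
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    var = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    m = cov / var
    b = my - m * mx
    return m, b
```

Low-confidence points then pull the fitted line less strongly than high-confidence ones, which is the desired behavior for noisy baseline estimates.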
-Vertical-offset least squares and x-y swapped weighted least squares are proposed to calculate vertical margin lines.
-Vertical vanishing point estimation based on blob analysis is proposed, in particular by considering the features of the vertical strokes of the pixel blobs.
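The eccentricity of a blob's shape, which indicates whether it has a dominant (e.g. vertical) stroke direction, can be derived from second-order moments; the following sketch (an illustration, not the patent's formula) returns the ratio of the eigenvalues of the blob's covariance matrix:

```python
def blob_eccentricity(pixels):
    """Elongation measure of a pixel blob from its second-order central
    moments; large values indicate a dominant stroke direction."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    mxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    myy = sum((y - my) ** 2 for _, y in pixels) / n
    mxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    # Eigenvalues of the 2x2 covariance matrix [[mxx, mxy], [mxy, myy]].
    tr, det = mxx + myy, mxx * myy - mxy ** 2
    disc = max(tr * tr / 4 - det, 0.0) ** 0.5
    l1, l2 = tr / 2 + disc, tr / 2 - disc
    return l1 / l2 if l2 > 0 else float("inf")
```

A tall thin stroke yields a very large (here infinite) ratio, while a roughly square blob yields a ratio near 1, so thresholding this value selects stroke-like blobs.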
-Page analysis is incorporated into the processing chain so that only text information is used for projection correction. An embodiment is proposed in which pictures are removed or separated before projection correction is performed.
-A complete processing chain is proposed to solve the projection correction problem, where the need for user intervention may be avoided.
-A projection correction method is proposed that collectively improves the result of the projection correction, comprising removal steps at different levels, i.e. for unique points, baselines, and vanishing point candidates.
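To make the vanishing point machinery underlying these techniques concrete: in homogeneous coordinates a line a·x + b·y + c = 0 is represented by the triple (a, b, c), and the intersection of two lines, e.g. two converging text baselines, is their cross product. A minimal sketch (illustrative, not the patent's code):

```python
def line_through(p, q):
    """Homogeneous line (a, b, c) through two 2D points."""
    (x0, y0), (x1, y1) = p, q
    return (y0 - y1, x1 - x0, x0 * y1 - x1 * y0)

def cross(u, v):
    """Cross product of two homogeneous triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vanishing_point(line1, line2):
    """Intersection of two homogeneous lines; converging text baselines
    meet at the vanishing point. Returns None for parallel lines."""
    x, y, w = cross(line1, line2)
    return (x / w, y / w) if w != 0 else None

# Two baselines converging toward (10, 0).
l1 = line_through((0, 1), (10, 0))
l2 = line_through((0, -1), (10, 0))
```

Real images contribute many baselines, so in practice the vanishing point is estimated from the inlier set of a robust fit rather than from a single pair of lines.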

Claims (48)

  1. A method for projection correction of an image comprising at least one text portion that is subject to distortion by perspective projection, comprising:
    An image binarization step, wherein the image is binarized;
    Connected component analysis, wherein pixel blobs are detected in the at least one text portion of the binarized image;
    Determining a horizontal vanishing point, comprising estimating text baselines using unique points of the pixel blobs, and determining a horizontal vanishing point of the at least one text portion using the text baselines;
    Determining a vertical vanishing point, wherein a vertical vanishing point is determined for the at least one text portion based on vertical features of the at least one text portion;
    A projection correction step, wherein the perspective in the image is corrected based on the horizontal and vertical vanishing points;
    The step of estimating the text baseline comprises the step of clustering the unique points into unique point groups, the unique point groups satisfying the following conditions:
    -The condition that the point-to-point distance between the unique points of the unique point group is below a first distance threshold;
    -The condition that the point-to-line distance between each unique point of the unique point group and the line formed by the unique points of the unique point group is below a second distance threshold;
    -The condition that the off-horizontal angle of the line formed by the unique points of the unique point group is below a maximum angle; and
    -The condition that the unique point group contains a minimum number of unique points;
    The text baseline is estimated based on the unique point group .
  2.   The method of claim 1, wherein each unique point is the center of the bottom side of the bounding box of the respective pixel blob.
  3.   The method of claim 1, wherein the step of estimating the text baselines includes a step of removing confusing unique points, in which confusing unique points that are out of line with respect to unique points near the unique point under consideration are detected, and wherein the confusing unique points are ignored for the text baseline estimation.
  4. The method according to claim 3, wherein the confusing unique point removal step comprises:
    Determining the width and height of each pixel blob;
    Determining average values for the width and height of the pixel blobs;
    Detecting a unique point as confusing when at least one of the width and the height of the pixel blob under consideration deviates from the determined average value by more than a predetermined range.
  5. The first distance threshold, the second distance threshold, the maximum angle, and the minimum number of unique points are adaptively set based on the content of the image. The method of claim 1 .
  6. The step of estimating text baselines further includes the step of unique point group merging, where unique point groups on either side of the ignored unique points are merged into a larger unique point group The method according to claim 1 .
  7. The step of determining the horizontal vanishing point is:
    Defining each of the estimated text baselines as a line in a Cartesian coordinate system;
    Transforming each of the text baselines defined in the Cartesian coordinate system into data points in a homogeneous coordinate system;
    Assigning a confidence level to each of the data points, wherein the confidence level is assigned based on at least the length of each text baseline and the proximity of the unique point group used to estimate the text baseline to the estimated text baseline;
    Grouping several data points having a confidence level above a predetermined threshold into a priority sample array;
    Clustering the data points in the priority sample array into several sample groups, each sample group comprising at least two data points;
    Assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group;
    Iteratively selecting a sample group of data points from the priority sample array for line fitting, the iteration starting from the first sample group having the highest group confidence value in the priority sample array;
    Performing a line fitting on said first sample group leading to a first adapted line, and then performing a line fitting on each further sample group leading to a further adapted line ,
    Determining, based on the first adapted line and the further adapted lines, a set of data points located below a predetermined distance threshold from the first adapted line;
    Estimating at least first and second candidate horizontal vanishing points from a horizontal text baseline corresponding to the determined set of data points;
    Performing a projection correction based on each estimated horizontal vanishing point candidate;
    After projection correction, comparing the proximity of each horizontal vanishing point candidate to the horizontal text direction of the image ;
    After projection correction, selecting the candidate horizontal vanishing point closest to the horizontal text direction of the image .
  8. The first and second candidate horizontal vanishing points are estimated using different approximation methods selected from the group consisting of least squares, weighted least squares and adaptive least squares. The method of claim 7 .
  9. The step of determining the vertical vanishing point is:
    Estimating a plurality of vertical text blob lines, each corresponding to a selected one of the pixel blobs selected by the blob filtering algorithm for the text portion of the image;
    Defining each of said estimated vertical text blob lines as a line in a Cartesian coordinate system;
    Converting each of the vertical text blob lines estimated in the Cartesian coordinate system to data points in a homogeneous coordinate system;
    Assigning a confidence level to each of the data points, wherein the confidence level is based on at least the eccentricity of the shape of the pixel blob used to estimate the respective vertical text blob line;
    Grouping several data points having a confidence level above a predetermined threshold into a priority sample array;
    Clustering the data points in the priority sample array into several sample groups, each sample group comprising at least two data points;
    Assigning a group confidence value to each sample group based on the confidence level assigned to each data point in the sample group;
    Iteratively selecting a sample group of data points from the priority sample array for line fitting, the iteration starting from the first sample group having the highest group confidence value in the priority sample array;
    Performing a line fitting on said first sample group leading to a first adapted line, and then performing a line fitting on each further sample group leading to a further adapted line ,
    Determining, based on the first adapted line and the further adapted lines, a set of data points located below a predetermined distance threshold from the first adapted line;
    Estimating at least first and second candidate vertical vanishing points from the vertical text blob lines corresponding to the determined set of data points;
    Performing a projection correction based on each estimated vertical vanishing point candidate;
    After projection correction, comparing the proximity of each estimated vertical vanishing point candidate to the vertical text direction of the image ;
    Selecting , after projection correction, the candidate vertical vanishing point closest to the vertical text direction of the image .
  10. The first and second candidate vertical vanishing points are estimated using different approximation methods selected from the group consisting of least squares, weighted least squares and adaptive least squares. 10. The method of claim 9.
  11. The blob filtering algorithm has the following conditions:
    The condition that the eccentricity of the shape of each pixel blob, which represents the main direction of the pixel blob, is above a predetermined threshold,
    The condition that the proximity of each pixel blob to the image border is above a predetermined distance threshold;
    The condition that the angle of the estimated vertical text blob line is below a maximum angle threshold;
    -The condition that the area of each pixel blob, defined by the number of pixels, is below a maximum area threshold; wherein the blob filtering algorithm selects the pixel blobs based on at least one of the above conditions. The method according to claim 9.
  12.   The step of separating text and pictures is performed after the image binarization and before the connected component analysis, wherein only text information is retained in the binarized image. The method of claim 1.
  13. A system for projection correction of an image comprising at least one text portion that is subject to distortion by perspective projection, said system comprising at least one processor and storage for a program executable using said at least one processor, the program comprising:
    A first software code portion configured for image binarization that, when executed, binarizes the image;
    A second software code portion configured for connected component analysis to detect pixel blobs in the at least one text portion of the binarized image when executed;
    A third software code portion configured for horizontal vanishing point determination that, when executed, estimates text baselines using unique points of the pixel blobs and determines a horizontal vanishing point of the at least one text portion using the text baselines;
    A fourth software code portion configured for vertical vanishing point determination that, when executed, determines a vertical vanishing point for the at least one text portion based on vertical features of the at least one text portion;
    A fifth software code portion configured for projection correction that, when executed, corrects the perspective in the image based on the horizontal and vertical vanishing points;
    Estimating the text baseline comprises clustering the unique points into unique point groups, the unique point groups satisfying the following conditions:
    -The condition that the point-to-point distance between the unique points of the unique point group is below a first distance threshold;
    -The condition that the point-to-line distance between each unique point of the unique point group and the line formed by the unique points of the unique point group is below a second distance threshold;
    -The condition that the off-horizontal angle of the line formed by the unique points of the unique point group is below a maximum angle; and
    -The condition that the unique point group contains a minimum number of unique points;
    A system in which the text baseline is estimated based on the unique point group .
  14. The system of claim 13, comprising one of the following: a personal computer, a portable computer, a laptop computer, a netbook computer, a tablet computer, a smart phone, a digital still camera, a video camera, a mobile communication device, a portable personal digital assistant, a scanner, a multifunction device.
  15. A non-transitory storage medium storing a computer program product for projection correction of an image comprising at least one text portion that is subject to distortion by perspective projection,
    the computer program product being executable on a computing device and, when executed on said computing device, performing the following steps:
    An image binarization step, wherein the image is binarized;
    Connected component analysis, wherein pixel blobs are detected in the at least one text portion of the binarized image;
    A horizontal vanishing point determination comprising: estimating a text baseline using the pixel blob unique points; and determining a horizontal vanishing point of the at least one text portion using the text baseline Step and
    Determining a vertical vanishing point, wherein a vertical vanishing point is determined for the at least one text portion based on vertical features of the at least one text portion;
    Performing the step of projection correction, wherein the perspective in the image is corrected based on the horizontal and vertical vanishing points ;
    The step of estimating the text baseline comprises the step of clustering the unique points into unique point groups, the unique point groups satisfying the following conditions:
    -The condition that the point-to-point distance between the unique points of the unique point group is below a first distance threshold;
    -The condition that the point-to-line distance between each unique point of the unique point group and the line formed by the unique points of the unique point group is below a second distance threshold;
    -The condition that the off-horizontal angle of the line formed by the unique points of the unique point group is below a maximum angle; and
    -The condition that the unique point group contains a minimum number of unique points;
    The text baseline being estimated based on the unique point group, the computer program product comprising software code portions configured to perform these steps. A non-transitory storage medium.
  16. A method for determining vanishing point candidates of at least one text portion in an image subject to distortion by perspective projection, comprising:
    An image binarization step, wherein the image is binarized;
    Performing connected component analysis, wherein pixel blobs are detected in the at least one text portion of the binarized image, and for each of the pixel blobs a positioning pixel is selected on the pixel blob baseline of that pixel blob, wherein the positioning pixels define the position of the pixel blobs in the binarized image;
    Estimating, in a Cartesian coordinate system, a number of text lines, each text line representing an approximation of the horizontal or vertical text orientation of the text portion based on the positioning pixel;
    Converting each of the text lines into data points in a homogeneous coordinate system;
    Assigning a confidence level to each of the data points;
    Grouping several data points having a confidence level above a predetermined threshold into a priority sample array;
    Clustering the data points in the priority sample array into a number of sample groups, each sample group comprising at least two data points;
    Assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group;
    Applying a RANSAC algorithm to determine, among the data points, a set of inliers for a first adapted line, the RANSAC algorithm starting from the sample group having the highest group confidence value in the priority sample array;
    Estimating at least one candidate vanishing point from the text lines corresponding to the set of inliers.
  17. The method of claim 16, wherein the confidence level assigned to each data point is based on at least the length of the respective text line and the proximity of the positioning pixels to the respective text line.
  18. The RANSAC algorithm comprises the following steps:
    Iteratively selecting a sample group of data points from said priority sample array for line fitting, the iteration starting from a first sample group having the highest group confidence value in said priority sample array;
    Performing a line fitting on the first group of samples leading to a first fitted line, and then performing a line fitting on each further sample group leading to a further fitted line Step and
    Determining, based on the first adapted line and the further adapted lines, a set of data points located below a predetermined distance threshold from the first adapted line, wherein the determined set of data points forms the set of inliers. The method of claim 16.
  19. 19. The method of claim 18 , wherein the predetermined distance threshold from the first adapted line is a fixed parameter.
  20. The method according to claim 18 , wherein the predetermined distance threshold from the first adapted line is an adaptation parameter adapted based on the content of the image .
  21. The method of claim 16, wherein at least first and second candidate vanishing points are estimated from the text lines corresponding to the set of inliers.
  22. The method according to claim 21, wherein the first and second candidate vanishing points are estimated using different approximation methods selected from the group consisting of least squares, weighted least squares and adaptive least squares.
  23. The method further includes the step of selecting a vanishing point from the estimated vanishing point candidates, the selection being
    Performing a projection correction on the image based on each estimated vanishing point candidate;
    After projection correction, comparing the proximity of each vanishing point candidate to the horizontal or vertical text direction of the image ;
    After projection correction, selecting the vanishing point candidate closest to the horizontal or vertical text direction of the image. The method of claim 16.
  24. The method of claim 16, wherein the group confidence value of each sample group is further based on the distances between the estimated text lines corresponding to the data points in the sample group.
  25. The method according to claim 16, wherein the confidence level of each of the data points is further based on the principal direction of the pixel blob used to estimate each respective text line, the principal direction being defined by the eccentricity of the shape of the pixel blob.
  26. The method according to claim 16 , wherein the maximum number of data points grouped into said priority sample array is between 2 and 20, more preferably between 5 and 10. .
  27. 17. The method of claim 16 , wherein each of the at least one vanishing point candidate is a horizontal vanishing point candidate and the positioning pixel is a unique point of the pixel blob.
  28. The method according to claim 16, wherein each of the at least one vanishing point candidate is a vertical vanishing point candidate, and the estimated text lines are vertical text blob lines, each corresponding to a selected one of the pixel blobs selected by a blob filtering algorithm on the at least one text portion of the image.
  29. The method according to claim 16, wherein a step of separating text and pictures is performed after the image binarization and before the connected component analysis, such that only text information is retained in the binarized image.
  30. A method for projection correction of an image comprising at least one text portion that is subject to distortion by perspective projection, comprising:
    An image binarization step, wherein the image is binarized;
    Performing connected component analysis, wherein pixel blobs are detected in the at least one text portion of the binarized image, and for each of the pixel blobs a positioning pixel is selected on the pixel blob baseline of that pixel blob, wherein the positioning pixels define the position of the pixel blobs in the binarized image;
    Determining a horizontal vanishing point, comprising estimating text baselines using the positioning pixels of the pixel blobs, and determining at least one candidate horizontal vanishing point of the at least one text portion using the text baselines;
    Determining a vertical vanishing point, comprising estimating vertical text blob lines, each corresponding to a selected one of the pixel blobs selected by a blob filtering algorithm on the text portion of the image, and determining at least one vertical vanishing point candidate of the at least one text portion using the vertical text blob lines;
    Wherein at least one of the horizontal and vertical vanishing point determinations comprises:
    Transforming each of the estimated text lines into data points in a homogeneous coordinate system;
    Assigning a confidence level to each of the data points;
    Grouping several data points having a confidence level above a predetermined threshold into a priority sample array;
    Clustering the data points in the priority sample array into a number of sample groups, each sample group comprising at least two data points;
    Assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group;
    Applying, among the data points, a RANSAC algorithm to determine a set of inliers for a first adapted line, the RANSAC algorithm starting from the sample group having the highest group confidence value in the priority sample array;
    Estimating the at least one vanishing point candidate from the text lines corresponding to the set of inliers;
    A projection correction step, wherein the perspective in the image is corrected based on a horizontal vanishing point selected from the at least one horizontal vanishing point candidate and a vertical vanishing point selected from the at least one vertical vanishing point candidate.
  31. A system for projection correction of an image comprising at least one text portion that is subject to distortion by perspective projection, said system comprising at least one processor and storage for a program executable using said at least one processor, the program comprising:
    A first software code portion configured for image binarization that, when executed, binarizes the image;
    A second software code portion configured for connected component analysis that, when executed, detects pixel blobs in the at least one text portion of the binarized image and, for each of the pixel blobs, selects a positioning pixel on the pixel blob baseline of that pixel blob, the positioning pixels defining the position of the pixel blobs in the binarized image;
    A third software code portion configured for horizontal vanishing point determination that, when executed, estimates text baselines using the positioning pixels of the pixel blobs and determines at least one candidate horizontal vanishing point of the at least one text portion using the text baselines;
    A fourth software code portion configured for vertical vanishing point determination that, when executed, estimates vertical text blob lines, each corresponding to a selected one of the pixel blobs selected by a blob filtering algorithm on the at least one text portion of the image, and determines at least one candidate vertical vanishing point for the at least one text portion using the vertical text blob lines;
    Wherein at least one of the third and fourth software code portions is configured to perform the steps of:
    Transforming each of the estimated text lines into data points in a homogeneous coordinate system;
    Assigning a confidence level to each of the data points;
    Grouping several data points having a confidence level above a predetermined threshold into a priority sample array;
    Clustering the data points in the priority sample array into a number of sample groups, each sample group comprising at least two data points;
    Assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group;
    Applying a RANSAC algorithm to determine, among said data points, a set of inliers for a first adapted line, said RANSAC algorithm starting from the sample group having the highest group confidence value in said priority sample array;
    Estimating the at least one vanishing point candidate from the text lines corresponding to the set of inliers; and
    A fifth software code portion configured to perform projection correction that, when executed, corrects the perspective in the image based on a horizontal vanishing point selected from the at least one horizontal vanishing point candidate and a vertical vanishing point selected from the at least one vertical vanishing point candidate.
  32. The system of claim 31, comprising one of the following: a personal computer, a portable computer, a laptop computer, a netbook computer, a tablet computer, a smart phone, a digital still camera, a video camera, a mobile communication device, a portable personal digital assistant, a scanner, a multifunction device.
  33. A non-transitory storage medium storing a computer program product for determining vanishing point candidates of at least one text portion in an image subject to distortion by perspective projection,
    the computer program product being executable on a computing device and, when executed, performing the following steps:
    An image binarization step, wherein the image is binarized;
    Performing connected component analysis, wherein pixel blobs are detected in the at least one text portion of the binarized image, and for each of the pixel blobs a positioning pixel is selected on the pixel blob baseline of that pixel blob, wherein the positioning pixels define the position of the pixel blobs in the binarized image;
    Estimating in the Cartesian coordinate system a number of text lines, each text line representing an approximation of the horizontal or vertical text orientation of the at least one text portion based on the positioning pixel;
    Converting each of the text lines into data points in a homogeneous coordinate system;
    Assigning a confidence level to each of the data points;
    Grouping several data points having a confidence level above a predetermined threshold into a priority sample array;
    Clustering the data points in the priority sample array into a number of sample groups, each sample group comprising at least two data points;
    Assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group;
    Applying a RANSAC algorithm to determine, among the data points, a set of inliers for a first adapted line, the RANSAC algorithm starting from the sample group having the highest group confidence value in the priority sample array;
    Estimating at least one vanishing point candidate from the text lines corresponding to the set of inliers; the computer program product comprising software code portions configured to perform these steps. A non-transitory storage medium.
  34. A method for projection correction of an image comprising at least one text portion that is subject to distortion by perspective projection, comprising:
    An image binarization step, wherein the image is binarized;
    A connected component analysis step, wherein pixel blobs are detected in the at least one text portion of the binarized image, and for each of the pixel blobs a positioning pixel is selected on the pixel blob baseline of that pixel blob, wherein the positioning pixels define the position of the pixel blobs in the binarized image;
    Determining a horizontal vanishing point, comprising estimating text baselines using the positioning pixels of the pixel blobs, identifying horizontal vanishing point candidates from the estimated text baselines, and determining a horizontal vanishing point of the at least one text portion using the horizontal vanishing point candidates;
    A step of determining a vertical vanishing point, wherein a vertical vanishing point is determined for the at least one text portion based on vertical features of the at least one text portion;
    The step of projection correction, wherein the perspective of the image is corrected based on the horizontal and vertical vanishing points;
    The horizontal vanishing point determination includes a first removal step at the level of the positioning pixels, a second removal step at the level of the text baselines, and a third removal step at the level of the horizontal vanishing point candidates;
    The step of estimating the text baseline comprises the step of clustering unique points into unique point groups, said unique point groups satisfying the following conditions:
    -The condition that the point-to-point distance between the unique points of the unique point group is below a first distance threshold;
    -The condition that the point-to-line distance between each unique point of the unique point group and the line formed by the unique points of the unique point group is below a second distance threshold;
    -The condition that the off-horizontal angle of the line formed by the unique points of the unique point group is below a maximum angle; and
    -The condition that the unique point group contains a minimum number of unique points;
    The text baseline is estimated based on the unique point group .
  35. 35. The method of claim 34 , wherein the position determining pixel is a unique point of the pixel blob.
  36. The method of claim 35, wherein the first removal step comprises detecting confusing unique points that are out of line with respect to unique points near the unique point under consideration, the confusing unique points being ignored for the text baseline estimation.
  37. The method of claim 36, wherein the confusing unique points are detected using the following steps:
    Determining the width and height of each pixel blob;
    Determining average values for the width and height of the pixel blobs;
    Detecting a unique point as confusing when at least one of the width and the height of the pixel blob under consideration deviates from the determined average value by more than a predetermined range.
  38. The method of claim 34, wherein the first distance threshold, the second distance threshold, the maximum angle, and the minimum number of unique points are adaptively set based on the content of the image.
  39. The method of claim 34, wherein the step of estimating the text baselines further includes a step of unique point group merging, in which unique point groups on either side of the ignored unique points are merged into a larger unique point group.
  40. The second removal step
    Assigning a confidence level to the text baseline;
    Removing text baselines based on the confidence level. The method of claim 34.
  41. The method of claim 40, wherein the confidence level is determined based on at least the length of the respective text baseline and the proximity of the unique point group used to estimate the text baseline to the estimated text baseline.
  42. The method of claim 40, wherein the step of removing text baselines is performed using a RANSAC algorithm in which the confidence levels are taken into account.
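A generic confidence-weighted RANSAC in the spirit of claim 42 is sketched below, shown here fitting a single line to weighted 2D points. The actual removing step operates on text baselines; the sampling scheme, iteration count, and inlier tolerance are assumptions for illustration.

```python
import math
import random

def ransac_line(points, confidences, n_iter=200, tol=2.0, seed=0):
    """Confidence-weighted RANSAC: repeatedly sample two points with
    probability proportional to their confidence, fit the line through
    them, and keep the model whose inlier set has the largest total
    confidence. Returns the indices of the best inlier set."""
    rng = random.Random(seed)
    best_inliers, best_score = [], -1.0
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.choices(points, weights=confidences, k=2)
        if (x1, y1) == (x2, y2):
            continue  # degenerate sample
        # line through the sample: a*x + b*y + c = 0, normalized
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        inliers = [i for i, (x, y) in enumerate(points)
                   if abs(a * x + b * y + c) <= tol]
        score = sum(confidences[i] for i in inliers)
        if score > best_score:
            best_score, best_inliers = score, inliers
    return best_inliers
```

Low-confidence outliers are both less likely to be sampled and contribute little to the inlier score, so they are effectively removed.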
  43. The method of claim 34, wherein the third removing step comprises:
    - performing a projection correction based on each of the identified horizontal vanishing point candidates;
    - comparing, after the projection correction, the proximity of each horizontal vanishing point candidate's text direction to the horizontal direction of the image; and
    - selecting the horizontal vanishing point candidate that is, after the projection correction, closest to the horizontal text direction of the image.
  44. The method of claim 34, wherein first and second horizontal vanishing point candidates are estimated from the text baselines after the second removing step, and wherein different approximation methods, selected from the group consisting of least squares, weighted least squares, and adaptive least squares, are used for the estimation of the first and second horizontal vanishing point candidates.
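The least-squares family mentioned in claim 44 can be illustrated by the standard normal-equations solution for the point minimizing the (optionally weighted) sum of squared distances to a set of lines. This is textbook math, not the patent's specific estimator, and the (a, b, c) line representation is an assumption.

```python
def vanishing_point(lines, weights=None):
    """Least-squares intersection of lines given as (a, b, c) with
    a*x + b*y + c = 0 and a^2 + b^2 = 1: minimizes the weighted sum of
    squared point-to-line distances by solving the 2x2 normal equations
        [saa sab] [x]   [-sac]
        [sab sbb] [y] = [-sbc]."""
    if weights is None:
        weights = [1.0] * len(lines)
    saa = sab = sbb = sac = sbc = 0.0
    for (a, b, c), w in zip(lines, weights):
        saa += w * a * a
        sab += w * a * b
        sbb += w * b * b
        sac += w * a * c
        sbc += w * b * c
    det = saa * sbb - sab * sab
    x = (sab * sbc - sbb * sac) / det
    y = (sab * sac - saa * sbc) / det
    return x, y
```

With uniform weights this is plain least squares; supplying the baseline confidence levels as `weights` gives the weighted variant. Two lines x = 2 and y = 3 intersect exactly at (2, 3).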
  45. The method of claim 34, wherein a step of separating text and pictures is performed after the image binarization and before the connected component analysis, such that only text information is retained in the binarized image.
  46. A system for projection correction of an image comprising at least one text portion subject to distortion by perspective, the system comprising at least one processor and storage for a program executable by the at least one processor, the program comprising:
    - a first software code portion configured for image binarization, which, when executed, binarizes the image;
    - a second software code portion configured for connected component analysis, which, when executed, detects pixel blobs in the at least one text portion of the binarized image, wherein for each of the pixel blobs a position-determining pixel is selected on a pixel blob baseline of the pixel blob, the position-determining pixels defining the positions of the pixel blobs in the binarized image;
    - a third software code portion configured for horizontal vanishing point determination, which, when executed, performs the steps of estimating text baselines using the position-determining pixels of the pixel blobs, identifying horizontal vanishing point candidates from the estimated text baselines, and determining a horizontal vanishing point of the at least one text portion using the horizontal vanishing point candidates;
    - a fourth software code portion configured for vertical vanishing point determination, which, when executed, determines a vertical vanishing point for the at least one text portion based on vertical features in the at least one text portion; and
    - a fifth software code portion for projection correction, which, when executed, corrects the perspective in the image based on the horizontal and vertical vanishing points;
    wherein the third software code portion, when executed, performs a first removing step at the level of the position-determining pixels, a second removing step at the level of the text baselines, and a third removing step at the level of the horizontal vanishing point candidates;
    wherein the step of estimating the text baselines comprises the step of clustering unique points into unique point groups, the unique point groups satisfying the following conditions:
    - the condition that the point-to-point distance between the unique points of the unique point group is below a first distance threshold;
    - the condition that the point-to-line distance between each unique point of the unique point group and the line formed by the unique points of the unique point group is below a second distance threshold;
    - the condition that the off-horizontal angle of the line formed by the unique points of the unique point group is below a maximum angle; and
    - the condition that the unique point group contains a minimum number of unique points;
    and wherein the text baselines are estimated based on the unique point groups.
  47. The system of claim 46, comprised in one of the following: a personal computer, a portable computer, a laptop computer, a netbook computer, a tablet computer, a smart phone, a digital still camera, a video camera, a mobile communication device, a portable personal digital assistant, a scanner, a multifunctional device.
  48. A non-transitory storage medium storing a computer program product for projection correction of an image comprising at least one text portion subject to distortion by perspective, the computer program product comprising software code portions in a format executable on a computing device and configured to perform the following steps when executed on said computing device:
    - an image binarization step, in which the image is binarized;
    - a connected component analysis step, in which pixel blobs are detected in the at least one text portion of the binarized image, wherein for each of the pixel blobs a position-determining pixel is selected on a pixel blob baseline of the pixel blob, the position-determining pixels defining the positions of the pixel blobs in the binarized image;
    - a horizontal vanishing point determination step, comprising estimating text baselines using the position-determining pixels of the pixel blobs, identifying horizontal vanishing point candidates from the estimated text baselines, and determining a horizontal vanishing point of the at least one text portion using the horizontal vanishing point candidates;
    - a vertical vanishing point determination step, in which a vertical vanishing point is determined for the at least one text portion based on vertical features of the at least one text portion; and
    - a projection correction step, in which the perspective in the image is corrected based on the horizontal and vertical vanishing points;
    wherein the horizontal vanishing point determination includes a first removing step at the level of the position-determining pixels, a second removing step at the level of the text baselines, and a third removing step at the level of the horizontal vanishing point candidates;
    wherein the step of estimating the text baselines comprises the step of clustering unique points into unique point groups, the unique point groups satisfying at least one of the following conditions:
    - the condition that the point-to-point distance between the unique points of the unique point group is below a first distance threshold;
    - the condition that the point-to-line distance between each unique point of the unique point group and the line formed by the unique points of the unique point group is below a second distance threshold;
    - the condition that the off-horizontal angle of the line formed by the unique points of the unique point group is below a maximum angle; and
    - the condition that the unique point group contains a minimum number of unique points;
    and wherein the text baselines are estimated based on the unique point groups.
JP2016541592A 2013-12-20 2014-12-19 Method and system for correcting projected distortion Active JP6542230B2 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
US14/136,585 US8897600B1 (en) 2013-12-20 2013-12-20 Method and system for determining vanishing point candidates for projective correction
US14/136,585 2013-12-20
US14/136,695 US8811751B1 (en) 2013-12-20 2013-12-20 Method and system for correcting projective distortions with elimination steps on multiple levels
US14/136,501 US8913836B1 (en) 2013-12-20 2013-12-20 Method and system for correcting projective distortions using eigenpoints
US14/136,501 2013-12-20
US14/136,695 2013-12-20
PCT/EP2014/078930 WO2015092059A1 (en) 2013-12-20 2014-12-19 Method and system for correcting projective distortions.

Publications (2)

Publication Number Publication Date
JP2017500662A JP2017500662A (en) 2017-01-05
JP6542230B2 true JP6542230B2 (en) 2019-07-10

Family

ID=52292917

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2016541592A Active JP6542230B2 (en) 2013-12-20 2014-12-19 Method and system for correcting projected distortion

Country Status (2)

Country Link
JP (1) JP6542230B2 (en)
WO (1) WO2015092059A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2631765C1 (en) 2016-04-26 2017-09-26 Общество с ограниченной ответственностью "Аби Девелопмент" Method and system of correcting perspective distortions in images occupying double-page spread
CN108323389B (en) * 2018-01-18 2019-11-08 华南农业大学 The rice transplanting rice shoot spacing in the rows of rice transplanter and the detection method and device of cave rice shoot number
CN110084236B (en) * 2019-04-29 2021-05-28 北京朗镜科技有限责任公司 Image correction method and device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02116987A (en) * 1988-10-27 1990-05-01 Toshiba Corp Character recognizing device
JPH04127288A (en) * 1990-05-21 1992-04-28 Fuji Facom Corp Character discriminating method by base line
JPH04271488A (en) * 1991-02-27 1992-09-28 Nec Corp System for detecting noise
JPH07121658A (en) * 1993-10-20 1995-05-12 Nippon Digital Kenkyusho:Kk Character string detection system
US6873732B2 (en) * 2001-07-09 2005-03-29 Xerox Corporation Method and apparatus for resolving perspective distortion in a document image and for calculating line sums in images
NO20052656D0 (en) * 2005-06-02 2005-06-02 Lumex As Geometric image transformation based on text line searching
CN101192269B (en) * 2006-11-29 2012-05-02 佳能株式会社 Method and device for estimating vanishing point from image, computer program and its storage medium
CN101267493B (en) * 2007-03-16 2011-01-19 富士通株式会社 Correction device and method for perspective distortion document image
CN101520852B (en) * 2008-02-29 2011-09-07 富士通株式会社 Vanishing point detecting device and detecting method

Also Published As

Publication number Publication date
WO2015092059A1 (en) 2015-06-25
JP2017500662A (en) 2017-01-05

Similar Documents

Publication Publication Date Title
US8811751B1 (en) Method and system for correcting projective distortions with elimination steps on multiple levels
US8897600B1 (en) Method and system for determining vanishing point candidates for projective correction
US9363499B2 (en) Method, electronic device and medium for adjusting depth values
US8712188B2 (en) System and method for document orientation detection
KR101399709B1 (en) Model-based dewarping method and apparatus
US8457403B2 (en) Method of detecting and correcting digital images of books in the book spine area
WO2014160433A2 (en) Systems and methods for classifying objects in digital images captured using mobile devices
JP4955096B2 (en) DETECTING DEVICE, DETECTING METHOD, DETECTING PROGRAM, AND RECORDING MEDIUM
US10289924B2 (en) System and method for scanned document correction
US9179035B2 (en) Method of editing static digital combined images comprising images of multiple objects
US10455163B2 (en) Image processing apparatus that generates a combined image, control method, and storage medium
RU2631765C1 (en) Method and system of correcting perspective distortions in images occupying double-page spread
US10169673B2 (en) Region-of-interest detection apparatus, region-of-interest detection method, and recording medium
JP6542230B2 (en) Method and system for correcting projected distortion
US8913836B1 (en) Method and system for correcting projective distortions using eigenpoints
CN109697414B (en) Text positioning method and device
US9131193B2 (en) Image-processing device removing encircling lines for identifying sub-regions of image
KR101377910B1 (en) Image processing method and image processing apparatus
US9094617B2 (en) Methods and systems for real-time image-capture feedback
US10423851B2 (en) Method, apparatus, and computer-readable medium for processing an image with horizontal and vertical text
US9008444B2 (en) Image rectification using sparsely-distributed local features
CN109948521B (en) Image deviation rectifying method and device, equipment and storage medium
US9225876B2 (en) Method and apparatus for using an enlargement operation to reduce visually detected defects in an image
WO2019062426A1 (en) Border detection method, server and storage medium
JP6669390B2 (en) Information processing apparatus, information processing method, and program

Legal Events

Date Code Title Description
A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20161107

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20170828

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20180920

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20181101

A601 Written request for extension of time

Free format text: JAPANESE INTERMEDIATE CODE: A601

Effective date: 20190131

A601 Written request for extension of time

Free format text: JAPANESE INTERMEDIATE CODE: A601

Effective date: 20190401

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20190426

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20190516

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20190612

R150 Certificate of patent or registration of utility model

Ref document number: 6542230

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150