CN113449729A - Image processing apparatus, image processing method, and storage medium for eliminating lines - Google Patents

Image processing apparatus, image processing method, and storage medium for eliminating lines

Info

Publication number
CN113449729A
CN113449729A
Authority
CN
China
Prior art keywords
directed graph
image
image processing
target
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010224697.6A
Other languages
Chinese (zh)
Inventor
汪留安
孙俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to CN202010224697.6A priority Critical patent/CN113449729A/en
Priority to JP2021044842A priority patent/JP2021157793A/en
Publication of CN113449729A publication Critical patent/CN113449729A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)

Abstract

The present disclosure relates to an image processing apparatus, an image processing method, and a storage medium. According to an embodiment of the present disclosure, the image processing apparatus includes: a binarization unit configured to binarize a grayscale document image to obtain a target image; a dividing unit configured to obtain a plurality of bar-shaped regions arranged along a first direction by dividing the target image; a directed graph determining unit configured to determine a directed graph for the entire target image based on the intra-region connected domains of the plurality of bar-shaped regions; a target path determining unit configured to determine, based on the directed graph, a target path related to a single-source shortest path of the directed graph; and a line eliminating unit configured to eliminate a line in the grayscale document image corresponding to the target path. The method, apparatus, and storage medium of the present disclosure can help achieve at least one of the following: eliminating noise lines in the document image, fast processing speed, low consumption of computing resources, and improved recognition performance of the character recognition engine.

Description

Image processing apparatus, image processing method, and storage medium for eliminating lines
Technical Field
The present disclosure relates generally to image processing, and more particularly, to an image processing apparatus, an image processing method, and a storage medium for eliminating lines.
Background
Text in an image can be converted into machine-readable text using optical character recognition (OCR) techniques. In OCR applications, the input image needs to be preprocessed in order to improve the recognition performance of the OCR recognition engine. In general, the more noise in the input image, the lower the recognition performance of the recognition engine. Preprocessing of the input image includes removing such noise and can thus boost the performance of the recognition engine.
Therefore, there are many conventional image processing methods for removing noise such as salt-and-pepper noise in an image.
Disclosure of Invention
A brief summary of the disclosure is provided below in order to provide a basic understanding of some aspects of the disclosure. It should be understood that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical elements of the disclosure or to delineate the scope of the disclosure. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is discussed later.
The inventors have found that in OCR recognition the input document image may contain non-character identification lines that mark rows and/or columns near the text. For example, text line images from bank forms, insurance documents, and courier slips often carry such upper and lower lines. These lines generally degrade the performance of the recognition engine; to the recognition engine they are noise and need to be removed in preprocessing. In some cases, a recognition engine may misjudge foreground and background pixels in the input image, which degrades its character recognition performance. For example, when the recognition engine takes the above-described identification line as foreground pixels and attempts character recognition, its character recognition performance deteriorates. In view of this, the inventors propose in the present disclosure a technique for eliminating lines in a document image based on a directed graph.
According to an aspect of the present disclosure, there is provided an image processing method including: binarizing a grayscale document image to obtain a target image; obtaining a plurality of bar-shaped regions arranged along a first direction by dividing the target image; determining a directed graph for the entire target image based on the intra-region connected domains of the plurality of bar-shaped regions; determining, based on the directed graph, a target path related to a single-source shortest path of the directed graph; and eliminating a line in the grayscale document image corresponding to the target path.
According to an aspect of the present disclosure, there is provided an image processing apparatus including: a binarization unit configured to binarize a grayscale document image to obtain a target image; a dividing unit configured to obtain a plurality of bar-shaped regions arranged along a first direction by dividing the target image; a directed graph determining unit configured to determine a directed graph for the entire target image based on the intra-region connected domains of the plurality of bar-shaped regions; a target path determining unit configured to determine, based on the directed graph, a target path related to a single-source shortest path of the directed graph; and a line eliminating unit configured to eliminate a line in the grayscale document image corresponding to the target path.
According to another aspect of the present disclosure, there is provided a computer-readable storage medium having a program stored thereon, wherein the program, when executed by a processor, implements an image processing method including: binarizing a grayscale document image to obtain a target image; obtaining a plurality of bar-shaped regions arranged along a first direction by dividing the target image; determining a directed graph for the entire target image based on the intra-region connected domains of the plurality of bar-shaped regions; determining, based on the directed graph, a target path related to a single-source shortest path of the directed graph; and eliminating a line in the grayscale document image corresponding to the target path.
The method, apparatus, and storage medium of the present disclosure can help achieve at least one of the following effects: eliminating row lines and column lines in the document image, eliminating noise lines in the document image, fast processing speed, low consumption of computing resources, and improved recognition performance of the character recognition engine.
Drawings
The above and other objects, features and advantages of the present disclosure will be more readily understood from the following description of embodiments thereof with reference to the accompanying drawings. The drawings are only for the purpose of illustrating the principles of the disclosure. The dimensions and relative positioning of the elements in the figures are not necessarily drawn to scale. Like reference numerals may denote like features. In the drawings:
FIG. 1 shows a flow diagram of an image processing method according to one embodiment of the present disclosure;
FIG. 2 shows an example of a target image;
FIG. 3 shows examples of the target image after hiding the odd-numbered bar-shaped regions and after hiding the even-numbered bar-shaped regions;
FIG. 4 illustrates a flow diagram of a method of determining a directed graph according to one embodiment of the present disclosure;
FIG. 5 illustrates a flow diagram of a method of determining a cost of an edge according to one embodiment of the present disclosure;
FIG. 6 illustrates example intra-region connected domains corresponding to a single-source shortest path according to one embodiment of the present disclosure;
FIG. 7 illustrates target intra-region connected domains determined with a removal process according to one embodiment of the present disclosure;
FIG. 8 illustrates a grayscale document image after elimination of a line in the grayscale document image corresponding to a target path, according to one embodiment of the present disclosure;
FIG. 9 illustrates target intra-region connected domains determined with a removal process according to another embodiment of the present disclosure;
FIG. 10 shows an exemplary block diagram of an image processing apparatus according to one embodiment of the present disclosure;
FIG. 11 shows a flow diagram of an optical character recognition method according to one embodiment of the present disclosure;
FIG. 12 illustrates an exemplary block diagram of an optical character recognition device in accordance with one embodiment of the present disclosure; and
FIG. 13 shows an exemplary block diagram of an information processing apparatus according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual embodiment are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions may be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another.
Here, it should be further noted that, in order to avoid obscuring the present disclosure with unnecessary details, only the device structure closely related to the scheme according to the present disclosure is shown in the drawings, and other details not so related to the present disclosure are omitted.
It is to be understood that the disclosure, described below with reference to the drawings, is not limited to the described embodiments. In this context, embodiments may be combined with each other, features may be replaced or borrowed between different embodiments, and one or more features may be omitted in an embodiment, where feasible.
As will be appreciated by one skilled in the art, aspects of the exemplary embodiments may be embodied as a system, method or computer program product. Thus, aspects of the exemplary embodiments can be embodied in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects. These may be referred to herein generally as "circuits," "modules," or "systems." Furthermore, aspects of the illustrative embodiments may take the form of a computer program product embodied on one or more computer-readable media having computer-readable program code embodied thereon. The computer program may be distributed, for example, over a computer network, or it may be located on one or more remote servers or embedded in the memory of the device.
Any combination of one or more computer-readable media may be used. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any suitable form, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof.
A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied in a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, radio frequency, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the exemplary embodiments disclosed herein may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
Various aspects of the exemplary embodiments disclosed herein are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to exemplary embodiments. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
One aspect of the present disclosure provides an image processing method for processing a document image. The processed document image may be used for text recognition. The method may be used as part of the preprocessing of an input image in an optical character recognition application. The method may be used to eliminate noisy lines in an image. The line may be: a row line identifying a row, a column line identifying a column, or another non-character identification line. The lines may be solid lines, discontinuous lines, or other lines having a definite overall orientation. Discontinuous lines include dashed and dotted lines. The document image may be a color image. Considering that a color document image can be converted into a grayscale image by conventional methods at the time of character recognition, the image processing method will be described below taking a grayscale document image as the example input. The method is described below by way of example with reference to fig. 1.
FIG. 1 shows a flow diagram of an image processing method 10 according to one embodiment of the present disclosure. The input to the method includes a grayscale document image. The output of the method includes a grayscale document image with at least some of the non-character identification lines eliminated.
At step S101, the grayscale document image Img is binarized to obtain the target image Imo. This step can be implemented, for example, using the Sauvola method. Binarization may remove the background of the image. Fig. 2 is an example of a target image Imo. The target image Imo in fig. 2 includes a line L, a character Ch1, and a character Ch2. The corresponding grayscale document image Img includes a line l corresponding to the line L. The goal of the image processing method 10 includes eliminating the line l in the grayscale document image Img. Note that the reference numerals (e.g., "L") and the S-shaped leader lines in fig. 2 are shown only to explain the elements included in the example target image; the reference numerals and leader lines are not part of the example target image Imo itself. Although the characters in the example target image Imo are handwritten, the characters in the target image are not so limited: they may be printed, and may even include both printed and handwritten characters. It should be noted that the method 10 may be performed after a document image of a character row is divided into two document images along its central line, so as to remove the bottom row line and the top row line of the document image separately. The binarization step can be omitted if the input image to be processed is already binarized. Although fig. 2 shows one line L oriented in the horizontal direction, and the image processing method is described by the example of eliminating the corresponding line l, the method for eliminating a plurality of lines and/or lines in other orientations shares the same concept as eliminating one horizontally oriented line.
Thus, one skilled in the art will appreciate from the method of eliminating one horizontally oriented line that the method of the present disclosure may also be used to eliminate multiple lines and/or lines having other orientations.
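The binarization of step S101 can be sketched as follows. The patent names the Sauvola method but gives no implementation; this is an illustrative sketch of Sauvola's local threshold t = m * (1 + k * (s / r - 1)), computed efficiently with integral images. The window size and the k and r constants are typical defaults, not values taken from the disclosure.

```python
import numpy as np

def sauvola_binarize(gray, window=25, k=0.2, r=128.0):
    """Binarize a grayscale image with Sauvola's local threshold.

    Foreground (dark ink) pixels become 1, background pixels 0. The
    threshold at each pixel is t = m * (1 + k * (s / r - 1)), where m and
    s are the mean and standard deviation inside the local window.
    """
    gray = gray.astype(np.float64)
    pad = window // 2
    padded = np.pad(gray, pad, mode="reflect")
    # Integral images of values and squared values give O(1) window sums.
    ii = np.cumsum(np.cumsum(np.pad(padded, ((1, 0), (1, 0))), axis=0), axis=1)
    ii2 = np.cumsum(np.cumsum(np.pad(padded ** 2, ((1, 0), (1, 0))), axis=0), axis=1)
    h, w = gray.shape
    y, x = np.mgrid[0:h, 0:w]
    y1, x1 = y + window, x + window
    area = float(window * window)
    s1 = ii[y1, x1] - ii[y, x1] - ii[y1, x] + ii[y, x]
    s2 = ii2[y1, x1] - ii2[y, x1] - ii2[y1, x] + ii2[y, x]
    mean = s1 / area
    std = np.sqrt(np.maximum(s2 / area - mean ** 2, 0.0))
    thresh = mean * (1.0 + k * (std / r - 1.0))
    return (gray < thresh).astype(np.uint8)
```

The local threshold is what lets dark lines and characters survive as foreground even when the page background is uneven.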
At step S103, a plurality of bar-shaped regions R_1, R_2, ..., R_n, ..., R_N arranged in the first direction are obtained by dividing the target image Imo. The first direction may be the horizontal direction (the lateral direction of the target image), the vertical direction (the vertical direction of the target image), or another direction. The length of each bar-shaped region may be equal to the length or the width of the target image. For example, in one example division, the target image Imo having a length Len is divided into N rectangular regions along the horizontal direction, each rectangular region having a length equal to the width w of the target image Imo and a width of Len/N. In order to ensure fast and accurate detection of the lines to be eliminated, the width of each bar-shaped region may, for example, be a predetermined value of two to ten pixel widths. Under this division, if the odd-numbered or even-numbered bar-shaped regions are hidden, the target image Imo becomes an image with dashed lines; sequentially stitching the bar-shaped regions back together restores the target image Imo. FIG. 3 shows example target images after hiding the odd-numbered bar-shaped regions and after hiding the even-numbered bar-shaped regions, where the images are scaled appropriately relative to the image in FIG. 2 and the first direction is the horizontal direction; (a) is the target image Imo_o after hiding the even-numbered bar-shaped regions; and (b) is the target image Imo_e after hiding the odd-numbered bar-shaped regions. It should be noted that the widths of the plurality of bar-shaped regions in the present disclosure may be equal or different.
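The division of step S103 is a simple slicing operation. A minimal sketch, assuming the first direction is horizontal and using a strip width of a few pixels as the text suggests (the last strip may be narrower when the image length is not a multiple of the strip width):

```python
import numpy as np

def split_into_strips(target, strip_width=5):
    """Split a binarized target image into vertical bar-shaped regions.

    The regions are arranged along the horizontal (first) direction and
    each spans the full image height. Stitching the strips back together
    in order reproduces the original target image.
    """
    h, length = target.shape
    return [target[:, x:x + strip_width] for x in range(0, length, strip_width)]
```

Because the strips are views into the target image, the division itself costs almost nothing.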
At step S105, a directed graph for the entire target image is determined based on the intra-region connected domains CC_ni' of the plurality of bar-shaped regions R_n, where i' is an index distinguishing different connected domains, i' = 1, 2, ..., I', and I' is the total number of connected domains within the bar-shaped region R_n. In the present disclosure, a connected domain within a bar-shaped region is also referred to as an intra-region connected domain. When there is no need to distinguish which bar-shaped region each intra-region connected domain belongs to, "CC" or "CC_i" denotes an intra-region connected domain, with i = 1, 2, ..., I, where I is the total number of intra-region connected domains within the target image Imo. The number of intra-region connected domains within each bar-shaped region may be zero, one, two, or more. The intra-region connected domains of all bar-shaped regions of the target image constitute the intra-region connected domain set {CC_i}. Each intra-region connected domain CC_ni' is composed of foreground pixels of the target image and lies within a single bar-shaped region R_n. Although a foreground line of the target image before division forms an initial connected domain crossing a plurality of bar-shaped regions, in the divided target image the boundaries of the different bar-shaped regions are distinguished, so the line is divided into a plurality of segments corresponding to connected domains in the plurality of regions; accordingly, the initial connected domain is considered to consist of a plurality of intra-region connected domains. In other words, each bar-shaped region can be regarded as a sub-image, and the connected domains in each sub-image are its intra-region connected domains.
For example, when one connected domain in the target image spans two adjacent bar-shaped regions, the two connected regions within the two adjacent bar-shaped regions are identified as two intra-region connected domains. The connected domains shown in fig. 3(a) and 3(b) are intra-region connected domains, and all the intra-region connected domains in the two figures constitute the intra-region connected domain set {CC_i} for the entire target image.
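Labeling connected domains independently within each strip can be sketched as below. This is a generic breadth-first labeling with 8-connectivity (the disclosure specifies neither the connectivity nor the labeling algorithm); the `strip_index` and `x_offset` parameters are helpers introduced here so each domain keeps its strip membership and global pixel coordinates.

```python
from collections import deque

import numpy as np

def intra_region_connected_domains(strip, strip_index, x_offset):
    """Label 8-connected foreground components inside one bar-shaped region.

    Each intra-region connected domain is returned as a dict holding the
    index of the strip it belongs to and the global (y, x) coordinates of
    its pixels. Because strips are labeled independently, a line crossing
    several strips yields one intra-region connected domain per strip.
    """
    h, w = strip.shape
    seen = np.zeros((h, w), dtype=bool)
    domains = []
    for y0 in range(h):
        for x0 in range(w):
            if strip[y0, x0] and not seen[y0, x0]:
                pixels, queue = [], deque([(y0, x0)])
                seen[y0, x0] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x + x_offset))
                    for dy in (-1, 0, 1):       # visit the 8 neighbors
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if 0 <= ny < h and 0 <= nx < w \
                                    and strip[ny, nx] and not seen[ny, nx]:
                                seen[ny, nx] = True
                                queue.append((ny, nx))
                domains.append({"strip": strip_index, "pixels": pixels})
    return domains
```

Running this over every strip and concatenating the results yields the intra-region connected domain set {CC_i} for the whole target image.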
The directed graph includes a plurality of nodes n_j and a plurality of edges e_jj', where j is an index distinguishing different nodes and e_jj' denotes the edge connecting node n_j and node n_j', directed from node n_j (the head node) to node n_j' (the tail node). Each edge has a cost cn_jj'. The nodes, edges, and costs are related to the intra-region connected domain set {CC_i} and carry information of the intra-region connected domains. Since the edges are edges of a directed graph, they are directed edges.
For the case of a grayscale document image having both a bottom row line and a top row line, in one example, connected regions whose centers lie above the horizontal median line of the target image may be selected to provide the nodes of the directed graph, in order to eliminate the top row line; or connected regions whose centers lie below the horizontal median line of the target image may be selected to provide the nodes of the directed graph, in order to eliminate the bottom row line.
At step S107, a target path Po related to a single-source shortest path P (also simply called the "shortest path") of the directed graph is determined based on the directed graph. In one example, this step may include: adding a virtual node to the directed graph; constructing virtual edges from the virtual node to the other nodes; and setting the cost of each virtual edge to zero. Having determined the single-source shortest path P of the directed graph, the target path Po may be determined based on it. In one example, the single-source shortest path may be taken directly as the target path.
In one example, the single-source shortest path of the directed graph is determined, for example, by the Bellman-Ford method/algorithm, with the virtual node as the starting point. The single-source shortest path includes a plurality of nodes n_k, where k is an index distinguishing the nodes, k = 1, 2, ..., K. These nodes constitute the node set {n_k}, and each node n_k corresponds to an intra-region connected domain CC_nk. Accordingly, the single-source shortest path corresponds to a plurality of intra-region connected domains. These corresponding intra-region connected domains may be called shortest-path connected domains, and they constitute the shortest-path connected domain set {CC_nk}. Similar to the shortest path P, the target path Po also includes a plurality of corresponding nodes n_k' (hereinafter referred to as target nodes) and intra-region connected domains CC_nk' (hereinafter referred to as target intra-region connected domains); the intra-region connected domain corresponding to a target node n_k' is a target intra-region connected domain CC_nk'. The corresponding nodes constitute the target node set {n_k'}, and the corresponding intra-region connected domains constitute the target intra-region connected domain set {CC_nk'}. More specifically, the target nodes n_k' are selected from the shortest-path node set {n_k}. In one example, the target node set {n_k'} may be the same as the shortest-path node set {n_k}.
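The virtual-node construction and Bellman-Ford search of step S107 can be sketched as follows. Note that with zero-cost virtual edges to every node, the search is only informative when edge costs can be negative (e.g., costs that reward long chains of collinear connected domains); the disclosure does not state the sign convention for the costs, so that is an assumption of this sketch, as is the `extract_path` helper for walking the predecessor chain.

```python
def bellman_ford_with_virtual_source(num_nodes, edges):
    """Single-source shortest paths from a zero-cost virtual source.

    A virtual node is connected to every real node 0..num_nodes-1 by a
    zero-cost virtual edge, so every real node's distance starts at 0 and
    Bellman-Ford then relaxes the real directed edges (head, tail, cost).
    Returns (dist, pred) over the real nodes; Bellman-Ford is used because
    it tolerates negative edge costs.
    """
    dist = [0.0] * num_nodes          # virtual edges make every start cost 0
    pred = [None] * num_nodes
    for _ in range(num_nodes - 1):    # standard |V|-1 relaxation rounds
        changed = False
        for head, tail, cost in edges:
            if dist[head] + cost < dist[tail]:
                dist[tail] = dist[head] + cost
                pred[tail] = head
                changed = True
        if not changed:
            break
    return dist, pred

def extract_path(pred, end):
    """Walk predecessors back from `end` to recover the node sequence."""
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = pred[node]
    return path[::-1]
```

The recovered node sequence plays the role of the shortest-path node set {n_k} from which the target path is taken.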
At step S109, the line in the grayscale document image corresponding to the target path is eliminated. Specifically, according to the positions of the target intra-region connected domains CC_nk' in the target intra-region connected domain set {CC_nk'} corresponding to the target path Po, the pixels of the corresponding regions A_k' in the grayscale document image Img may be set to a pixel value associated with the background pixels of the grayscale document image, for example, the mean (e.g., arithmetic mean, geometric mean, weighted mean), the median, etc., of the background pixels of the grayscale document image.
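Step S109 can be sketched as below, using the arithmetic mean of the background pixels as the fill value. As a simplification, "background" is approximated here as all pixels outside the target-path connected domains; the disclosure leaves the exact background estimate open (arithmetic, geometric, or weighted mean, median, etc.).

```python
import numpy as np

def erase_target_domains(gray, target_pixels):
    """Erase a line by overwriting target-path pixels with a background value.

    `target_pixels` is an iterable of global (y, x) coordinates collected
    from the target intra-region connected domains; each is set to the
    arithmetic mean of the remaining (non-target) pixels, one of the fill
    values the method allows.
    """
    out = gray.astype(np.float64).copy()
    mask = np.zeros(gray.shape, dtype=bool)
    ys, xs = zip(*target_pixels)
    mask[list(ys), list(xs)] = True
    background_mean = out[~mask].mean()   # fill value from background pixels
    out[mask] = background_mean
    return out.astype(gray.dtype)
```

Filling with a background statistic, rather than a constant such as white, keeps the erased region consistent with the surrounding paper tone.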
Regarding the determination of the directed graph in step S105, reference may be made to fig. 4. Fig. 4 shows a flow diagram of a method 40 of determining a directed graph according to one embodiment of the present disclosure. The method 40 determines a directed graph for the entire target image Imo based on the intra-region connected domains CC_ni' of each bar-shaped region R_n; in other words, method 40 determines the directed graph based on the intra-region connected domain set {CC_i}.
At step S401, the nodes of the directed graph are determined based on the intra-region connected domains. In one example, the nodes are in one-to-one correspondence with the intra-region connected domains; e.g., the node corresponding to intra-region connected domain CC_i is n_j, where i and j may be set according to the following rule.
At step S403, the edges of the directed graph are constructed based on the intra-region connected domains. Edges can be constructed based on the centers of the intra-region connected domains. The center of each intra-region connected domain CC_i has coordinates (x_i, y_i). For example, the coordinates (x_i, y_i) of the center of CC_i can be ((xmax + xmin)/2, (ymax + ymin)/2) or (xav, yav), where xmax is the maximum of the abscissas of all pixels within CC_i, xmin is the minimum of the abscissas of all pixels within CC_i, ymax is the maximum of the ordinates of all pixels within CC_i, ymin is the minimum of the ordinates of all pixels within CC_i, xav is the average (e.g., arithmetic mean) of the abscissas of all pixels within CC_i, and yav is the average (e.g., arithmetic mean) of the ordinates of all pixels within CC_i. The manner of determining the coordinates of the center of an intra-region connected domain is not limited to the foregoing examples. Since the nodes correspond one-to-one to the intra-region connected domains, the coordinates of the center of an intra-region connected domain may also be called the coordinates of the corresponding node. The coordinates of the center of an intra-region connected domain are part of the information of that connected domain; that is, the directed graph is related to the coordinates of the centers of the intra-region connected domains. More specifically, the nodes, edges, and edge costs of the directed graph are related to the coordinates of the centers of the intra-region connected domains. When the number of intra-region connected domains within a bar-shaped region R_n is zero, the constructed directed graph contains no node of the bar-shaped region R_n.
The edges of the directed graph may be constructed according to predetermined conditions. For example, an edge of the directed graph may be constructed as follows: if nodes n_i and n_i' satisfy the condition of formula (1), a directed edge e_ii' is constructed from node n_i to node n_i':

x_i < x_i' AND x_i' < x_i + H AND |y_i' - y_i| < min(h_i, h_i')     (1)

where H is the height of the target image, h_i is the height of intra-region connected domain CC_i, and h_i' is the height of intra-region connected domain CC_i'. That is: the abscissas of the head node and the tail node of the edge differ; the difference between the abscissas of the tail node and the head node (i.e., the lateral distance between them) is less than the height of the target image; and the absolute value of the difference between the ordinates of the tail node and the head node (i.e., the longitudinal distance between them) is smaller than each of the heights of the intra-region connected domains corresponding to the tail node and the head node. The above conditions are only an example; more generally, the edges of the directed graph may be constructed according to the orientation of the lines to be eliminated. The above conditions are preferably used when the orientation of the lines to be eliminated is substantially horizontal.
If it is known that the lines to be eliminated run substantially in a first direction, the edges can be constructed according to the following conditions: the first-direction coordinates of the centers of the intra-region connected domains corresponding to the two nodes connected by the edge differ; the first-direction distance between those centers is smaller than the size of the target image in a second direction; and the absolute value of the difference between the second-direction coordinates of those centers is smaller than each of the second-direction sizes of the corresponding connected domains; wherein the first direction is perpendicular to the second direction.
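The horizontal-orientation condition of formula (1) can be sketched as a predicate; representing each node as an (x, y, h) triple of center coordinates and connected-domain height is an assumption for illustration:

```python
def has_edge(node_i, node_j, img_height):
    """Formula (1): whether a directed edge e_ij runs from node i to node j.

    Each node is (x, y, h): the center coordinates and the height of its
    intra-region connected domain.  Suited to roughly horizontal lines.
    """
    xi, yi, hi = node_i
    xj, yj, hj = node_j
    return (xi < xj                      # abscissas differ, j to the right
            and xj < xi + img_height     # lateral distance < image height
            and abs(yj - yi) < min(hi, hj))  # small vertical offset
```

Note that the vertical-offset bound min(h_i, h_j) makes the condition symmetric in the two heights, matching "smaller than each of the heights" in the text.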
In addition, when the lines to be eliminated run in the vertical direction, the grayscale document image may be rotated by 90°, then binarized, the directed graph constructed, and the lines eliminated, and finally the document image with the lines eliminated may be rotated by 90° in the reverse direction. In this manner, the conditions shown above can be reused for constructing the edges of the directed graph.
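Assuming a horizontal-line remover is available as a function, the rotate-process-rotate approach for vertical lines might look like this sketch (`numpy.rot90` performs the 90° rotations; the function names are assumptions):

```python
import numpy as np

def remove_vertical_lines(gray, remove_horizontal_lines):
    """Handle vertical lines by reusing a horizontal-line remover.

    `remove_horizontal_lines` is assumed to map a grayscale image to a
    cleaned grayscale image of the same shape.
    """
    rotated = np.rot90(gray, k=1)      # rotate 90 deg counter-clockwise
    cleaned = remove_horizontal_lines(rotated)
    return np.rot90(cleaned, k=-1)     # rotate back to the original frame
```

The round trip rot90(k=1) followed by rot90(k=-1) restores the original orientation exactly, so character regions are unaffected by the rotation itself.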
In step S405, the cost of the constructed edge of the directed graph is determined based on the intra-region connected domain for determining the single-source shortest path of the directed graph.
The cost determination in step S405 is described with reference to Fig. 5. FIG. 5 shows a flow diagram of a method 50 of determining the cost of an edge according to one embodiment of the present disclosure.
At step S501, a fitted straight line through the centers of the intra-region connected domains is determined. The straight line closest to the centers is obtained by linearly fitting the coordinates of the centers of the intra-region connected domains in the set {CC_i}. In one example, the fitted line is determined by the RANdom SAmple Consensus (RANSAC) method/algorithm. Linear fitting of a plurality of discrete points is a conventional technique and will not be described in detail here. In one modification, whether there is a line to be eliminated in the target image may be determined based on the fitted straight line. For example, if the slope of the fitted straight line is less than a predetermined slope threshold, it is determined that there is a line to be eliminated; otherwise, a prompt is given and the image processing method exits.
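A minimal RANSAC sketch for fitting a line to the connected-domain centers might look as follows; the iteration count, inlier tolerance, and seed are illustrative assumptions, not values from the disclosure:

```python
import random

def ransac_line(points, n_iter=200, tol=2.0, seed=0):
    """Minimal RANSAC sketch: fit y = a*x + b to the CC centers.

    `points` is a list of (x, y) centers; returns (a, b) of the model
    with the most inliers (residual |y - (a*x + b)| <= tol).
    """
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical sample pair; skip for y = a*x + b form
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = sum(abs(y - (a * x + b)) <= tol for x, y in points)
        if inliers > best_inliers:
            best, best_inliers = (a, b), inliers
    return best
```

Because line fragments dominate the centers while characters contribute scattered outliers, a robust estimator such as RANSAC is a natural fit here.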
At step S503, the cost of each constructed edge is determined based on the fitted straight line. More specifically, the cost may be determined based on the fitted straight line L_f and the coordinates of the centers of the intra-region connected domains corresponding to the constructed edge. In one example of determining the cost, an initial cost is first determined based on the positions of the corresponding nodes and the fitted straight line, e.g., by equation (2):

c_ii′ = |x_i′ − x_i| + |y_i′ − y_i| + dis(CC_i′, L_f)    (2)

where c_ii′ is the initial cost of the constructed edge e_ii′ from node n_i to node n_i′, (x_i, y_i) are the coordinates of the center of the intra-region connected domain CC_i corresponding to node n_i, (x_i′, y_i′) are the coordinates of the center of the intra-region connected domain CC_i′ corresponding to node n_i′, and dis(CC_i′, L_f) is the distance from the center of CC_i′ to the fitted straight line L_f.
The initial cost is then normalized to obtain the cost of the constructed edge. An example of the normalization is the logarithmic function of equation (3):

cn_ii′ = log[2*(c_ii′ − c_min)/(c_max − c_min)]    (3)

where cn_ii′ is the cost of the constructed edge e_ii′ from node n_i to node n_i′ (a negative number), c_min is the minimum initial cost among the initial costs of all constructed edges, and c_max is the maximum initial cost among the initial costs of all constructed edges.
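Equations (2) and (3) can be sketched together as follows; the small epsilon guarding the logarithm of zero for the minimum-cost edge is an implementation choice of this sketch, not specified in the disclosure:

```python
import math

def edge_costs(edges, centers, line):
    """Initial cost (2) and log-normalized cost (3) for each edge.

    `edges` is a list of (i, j) node-index pairs, `centers` maps an
    index to the (x, y) center of its connected domain, and `line` is
    (a, b) of the fitted straight line y = a*x + b.
    """
    a, b = line
    norm = math.hypot(a, -1.0)  # length of the line normal (a, -1)

    def initial(i, j):
        (xi, yi), (xj, yj) = centers[i], centers[j]
        dist = abs(a * xj - yj + b) / norm  # dis(CC_j, L_f)
        return abs(xj - xi) + abs(yj - yi) + dist

    c = {e: initial(*e) for e in edges}
    cmin, cmax = min(c.values()), max(c.values())
    span = (cmax - cmin) or 1.0  # avoid division by zero if all equal
    eps = 1e-9                   # guard log(0) for the minimum-cost edge
    return {e: math.log(2 * (v - cmin) / span + eps) for e, v in c.items()}
```

Edges that hop a short distance and land near the fitted line receive a lower (more negative) cost, which is what steers the shortest path along the line to be eliminated.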
A variation of method 10 is described below.
There is a possibility that lines to be eliminated in the document image, which hinder character recognition, adhere to characters in the document image. An example of this phenomenon can be seen in Fig. 2: the character Ch2 is stuck to the line L. Such sticking may result in the intra-region connected domains corresponding to the determined single-source shortest path including a partial character region. Fig. 6 illustrates the shortest path connected domain set of an example single-source shortest path P according to one embodiment of this disclosure (the region colored in the darkest color in the figure; the boundary of each individual connected domain is not shown), wherein an adjusted example target image is shown with three colorations to illustrate three different kinds of intra-region connected domains. In addition to the shortest path connected domains (the darkest colored regions) in the example target image, the other intra-region connected domains, i.e., the remaining connected domains, are shown, including the odd-numbered intra-region connected domains (colored with the middle-depth color in the figure and identified as "odd-numbered CCs") and the even-numbered intra-region connected domains (colored with the lightest color in the figure and identified as "even-numbered CCs") among the remaining connected domains, where the numbers refer to the indices of the bar-shaped regions to which the intra-region connected domains belong. As shown in Fig. 6, the shortest path connected domain set of the single-source shortest path includes a partial character region of the character Ch2 (see the bottom of the character Ch2). If the pixels within the regions corresponding to such a shortest path connected domain set in the document image are directly eliminated (e.g., by replacing the pixel values of the corresponding regions with the mean of the background pixels), the character to be recognized may be corrupted, thereby degrading the performance of the text recognition engine. Therefore, it is preferable to remove, from the shortest path connected domain set of the single-source shortest path, the adhesion-region connected domains that are stuck to characters in the target image.
Step S107 in Fig. 1 may therefore preferably comprise: determining a single-source shortest path of the directed graph based on the directed graph; and removing connected domains from the shortest path connected domain set of the single-source shortest path P according to a predetermined adhesion removal condition to determine a target-region connected domain set. The process of removing the adhesion-region connected domains may be simply referred to as the removal process. In one example, the predetermined adhesion removal condition may be: the height h_k of a shortest path connected domain of the single-source shortest path is greater than Tht times the mean h_av of the heights of the intra-region connected domains in the target image, where Tht is a predetermined value. In another example, the predetermined adhesion removal condition may be: the height h_k of a shortest path connected domain of the single-source shortest path is greater than Tht times the median h_mi of the heights of the intra-region connected domains in the target image. Tht is a predetermined threshold greater than zero and may be determined empirically; illustratively, Tht may be 0.2. If h_k satisfies the predetermined adhesion removal condition, the corresponding connected domain is removed from the shortest path connected domain set; otherwise, the connected domain is retained. Fig. 7 illustrates the target-region connected domain set determined by the removal process according to one embodiment of the present disclosure, where the boundaries between the connected domains in the target region are not shown. It can be seen that the previously stuck partial character region (see the bottom of the character Ch2) is no longer in the target-region connected domain set.
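The mean-based variant of the removal process above can be sketched as follows; the dictionary representation of the shortest path connected domains is an assumption for illustration:

```python
def remove_adhesions(path_ccs, all_heights, tht=0.2):
    """Removal process sketch: drop shortest-path CCs stuck to characters.

    `path_ccs` maps a connected-domain id to its height h_k; a CC is
    removed when h_k > tht * mean height of all intra-region CCs in the
    target image (tht = 0.2 as in the illustrative value of the text).
    """
    h_av = sum(all_heights) / len(all_heights)
    return {cc: h for cc, h in path_ccs.items() if h <= tht * h_av}
```

The intuition: pure line fragments are only a few pixels tall, while a fragment fused with a character inherits part of the character's height and so exceeds the threshold.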
FIG. 8 illustrates the grayscale document image Img′ after elimination of the line corresponding to the target path in the grayscale document image Img according to one embodiment of the disclosure. The elimination process may be, for example, setting the pixels of each area A_k′ of the grayscale document image Img that corresponds to a connected domain in the target-region connected domain set to the average of the background pixels, where the area A_k′ is an image area in the grayscale document image Img and corresponds, according to position, to an intra-region connected domain in the target-region connected domain set.
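The elimination process can be sketched with boolean masks; estimating the background mean from all pixels outside the masks is an assumption of this sketch, as the disclosure does not fix how the background mean is computed:

```python
import numpy as np

def erase_regions(gray, masks):
    """Elimination sketch: set the pixels of each area A_k' to the mean
    of the background pixels.

    `masks` is a list of boolean arrays, one per target-region connected
    domain, each the same shape as `gray`.
    """
    out = gray.astype(float).copy()
    union = np.zeros_like(gray, dtype=bool)
    for m in masks:
        union |= m
    bg_mean = out[~union].mean()  # mean over pixels outside all masks
    out[union] = bg_mean
    return out
```

Filling with the background mean rather than pure white keeps the erased areas visually consistent with scans whose paper is not perfectly white.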
In addition, for the target image, a foreground composed of characters may also be extracted as a line. FIG. 9 illustrates the target-region connected domain set determined by the removal process according to another embodiment of the present disclosure. The line formed by the deepest-colored target-region connected domain set in Fig. 9 obviously need not be eliminated; such a line may be referred to as a "false positive line". Therefore, before the elimination process is performed, it is preferable to determine whether the line corresponding to the target-region connected domain set is a false positive line; if the determination result is yes, the elimination process is not performed, and if the determination result is no, the elimination process is performed. Whether the corresponding line is a false positive line can be determined based on the number of segments and the number of adhesions of the line corresponding to the target-region connected domain set. In one example, whether the corresponding line is a false positive line may be determined based on the following condition:

Lenmax < H AND n_touching/n_segment > Th,

where Lenmax is the length of the longest segment among the segments of the corresponding line, H is the height of the target image, n_touching is the total number of adhesions of the corresponding line, n_segment is the total number of segments of the corresponding line, and Th is a predetermined threshold. If the condition is met, the corresponding line is a false positive line; otherwise, it is not. Th may be 1.6 in one example. In Fig. 9, n_segment is 9 and n_touching is 15, the determination result is yes, and therefore the elimination process is not executed. For each segment, whose number of adhesions is at most 2, the number of adhesions is determined based on whether the left nearest neighbor of the leftmost pixel of the segment and the right nearest neighbor of the rightmost pixel of the segment are foreground pixels; when both are foreground pixels, the number of adhesions of the segment is 2. The sum of the numbers of adhesions of all segments is n_touching. Although the false-positive-line determination is performed after the removal process in the present embodiment, in a modification it may be performed before the removal process; that is, whether the line corresponding to the shortest path connected domain set is a false positive line is determined, and when the determination result is yes, the elimination process is not performed, and when the determination result is no, the removal process and the elimination process are performed.
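The false-positive-line condition can be sketched as a predicate; the list-of-segment-lengths representation is an assumption for illustration:

```python
def is_false_positive(seg_lengths, n_touching, img_height, th=1.6):
    """False-positive-line condition sketch:
    Lenmax < H AND n_touching / n_segment > Th.

    `seg_lengths` are the lengths of the line's segments, `n_touching`
    is the total number of adhesions, `img_height` is H; Th = 1.6 as in
    the illustrative value of the text.
    """
    n_segment = len(seg_lengths)
    return max(seg_lengths) < img_height and n_touching / n_segment > th
```

A genuine line tends to have at least one long segment or few adhesions per segment, so it fails the condition; a "line" assembled from character strokes is short and heavily stuck, so it passes and is spared from elimination.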
One aspect of the present disclosure provides an image processing apparatus. The apparatus is described below by way of example with reference to fig. 10.
Fig. 10 shows an exemplary block diagram of an image processing apparatus 100 according to one embodiment of the present disclosure. The image processing apparatus 100 includes a binarization unit 101, a dividing unit 103, a directed graph determination unit 105, a target path determination unit 107, and a line elimination unit 109. The binarization unit 101 is used to binarize the grayscale document image as a target image. The dividing unit 103 is configured to obtain a plurality of bar-shaped areas arranged in the first direction by dividing the target image. The directed graph determining unit 105 is configured to determine a directed graph for the entire target image based on the intra-region connected components of the plurality of bar-shaped regions. The target path determination unit 107 is configured to determine a target path related to the single-source shortest path of the directed graph based on the directed graph. The line elimination unit 109 is for eliminating a line corresponding to the target path in the gradation document image. Further description of the units of the image processing apparatus 100 may refer to the description of the image processing method of the present disclosure.
One aspect of the present disclosure provides an optical character recognition method. FIG. 11 shows a flow diagram of an optical character recognition method 11 according to one embodiment of the present disclosure. The optical character recognition method 11 includes: step S1101, preprocessing an image to be recognized; and step S1103, recognizing characters in the preprocessed image; wherein the preprocessing comprises an image processing method according to the present disclosure.
One aspect of the present disclosure provides an optical character recognition apparatus. FIG. 12 illustrates an exemplary block diagram of the optical character recognition device 12 according to one embodiment of the present disclosure. The optical character recognition device 12 includes: a preprocessing unit 1201 configured to preprocess an image to be recognized; and a recognition unit 1203 configured to recognize characters in the preprocessed image; wherein the preprocessing comprises an image processing method according to the present disclosure.
One aspect of the present disclosure provides a computer-readable storage medium having a program stored thereon. The program causes, when executed by a processor, an image processing method to be implemented, the image processing method including: binarizing the gray level document image to be used as a target image; obtaining a plurality of bar-shaped areas arranged along a first direction by dividing a target image; determining a directed graph for the whole target image based on connected domains in the regions of the plurality of bar-shaped regions; determining a target path related to a single-source shortest path of the directed graph based on the directed graph; and eliminating a line in the grayscale document image corresponding to the target path.
One aspect of the present disclosure provides a computer-readable storage medium having a program stored thereon. The program is such that the optical character recognition method according to the present disclosure is implemented when the program is executed by a processor.
According to an aspect of the present disclosure, there is also provided an information processing apparatus.
Fig. 13 is an exemplary block diagram of the information processing apparatus 13 according to one embodiment of the present disclosure. In fig. 13, a Central Processing Unit (CPU)1301 performs various processes according to a program stored in a Read Only Memory (ROM)1302 or a program loaded from a storage portion 1308 to a Random Access Memory (RAM) 1303. In the RAM 1303, data and the like necessary when the CPU 1301 executes various processes are also stored as necessary.
The CPU 1301, the ROM 1302, and the RAM 1303 are connected to each other via a bus 1304. An input/output interface 1305 is also connected to bus 1304.
The following components are connected to the input/output interface 1305: an input portion 1306 including a soft keyboard and the like; an output portion 1307 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage portion 1308 such as a hard disk; and a communication portion 1309 including a network interface card such as a LAN card, a modem, or the like. The communication portion 1309 performs communication processing via a network such as the Internet, a local area network, a mobile network, or a combination thereof.
A driver 1310 is also connected to the input/output interface 1305 as necessary. A removable medium 1311 such as a semiconductor memory or the like is mounted on the drive 1310 as needed, so that the program read therefrom is mounted to the storage portion 1308 as needed.
The CPU 1301 may run a program for implementing an image processing method or an optical character recognition method according to the present disclosure.
The method, the device, the information processing equipment and the storage medium of the disclosure can at least help to realize one of the following effects: eliminating the row lines and the column lines in the document image, eliminating the noise lines in the document image, having high processing speed, occupying less computing resources and improving the recognition performance of the character recognition engine.
While the invention has been described in terms of specific embodiments thereof, it will be appreciated that those skilled in the art will be able to devise various modifications (including combinations and substitutions of features between the embodiments, where appropriate), improvements and equivalents of the invention within the spirit and scope of the appended claims. Such modifications, improvements and equivalents are also intended to be included within the scope of this disclosure.
It should be emphasized that the term "comprises/comprising" when used herein, is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
Furthermore, the methods of the embodiments of the present invention are not limited to being performed in the time sequence described in the specification or shown in the drawings, and may be performed in other time sequences, in parallel, or independently. Therefore, the order of execution of the methods described in this specification does not limit the technical scope of the present invention.
Supplementary note
1. An image processing apparatus characterized by comprising:
a binarization unit configured to binarize the grayscale document image as a target image;
a dividing unit configured to obtain a plurality of bar-shaped regions arranged in a first direction by dividing the target image;
a directed graph determination unit configured to determine a directed graph for the entire target image based on intra-region connected regions of the plurality of bar-shaped regions;
a target path determination unit for determining a target path related to a single-source shortest path of the directed graph based on the directed graph; and
a line eliminating unit for eliminating a line corresponding to the target path in the gradation document image.
2. The image processing apparatus according to supplementary note 1, wherein the first direction is one of a horizontal direction or a vertical direction of the target image.
3. The image processing apparatus according to supplementary note 2, wherein the length of each strip-shaped area is equal to one of the length or the width of the target image.
4. The image processing apparatus according to supplementary note 1, wherein each of the strip regions is rectangular.
5. The image processing apparatus according to supplementary note 4, wherein the width of each of the stripe regions is equal to a predetermined value of two to ten pixel widths.
6. The image processing apparatus according to supplementary note 1, wherein a connected domain whose center is at or above the horizontal median line of the target image is selected to set the nodes of the directed graph; or
a connected domain whose center is below the horizontal median line of the target image is selected to set the nodes of the directed graph.
7. The image processing apparatus according to supplementary note 1, wherein the directed graph determining unit is configured to determine a plurality of nodes of the directed graph;
the plurality of nodes correspond to the intra-region communication regions of the plurality of strip-shaped regions one to one.
8. The image processing apparatus according to supplementary note 7, wherein the directed graph determining unit is configured to construct edges of a directed graph.
9. The image processing apparatus according to supplementary note 8, wherein the constructed edge satisfies the following condition:
the head node and the tail node of the edge have different abscissas;
the difference between the abscissa of the tail node and the abscissa of the head node of the edge is smaller than the height of the target image; and is
The absolute value of the difference between the vertical coordinates of the end node and the head node of the edge is smaller than each of the heights of the connected regions in the corresponding regions corresponding to the end node and the head node;
and the coordinates of each node of the directed graph are the coordinates of the center of the connected domain in the corresponding area.
10. The image processing apparatus according to supplementary note 8, wherein the directed graph determining unit is configured to determine the cost of an edge of the directed graph based on coordinates of two nodes of the edge.
11. The image processing apparatus according to supplementary note 10, wherein determining the cost of an edge of the directed graph based on coordinates of two nodes of the edge includes:
and determining a fitted straight line of the center of the connected domain in the corresponding area of the plurality of nodes of the directed graph.
12. The image processing apparatus according to supplementary note 11, wherein the fitted straight line is determined by a random sampling consensus algorithm.
13. The image processing apparatus according to supplementary note 11, wherein the directed graph determining unit is configured to determine whether or not there is a line to be eliminated in the target image based on a slope of a fitted straight line.
14. The image processing apparatus according to supplementary note 11, wherein the directed graph determining unit is configured to determine the cost of an edge of the directed graph based on coordinates of two nodes of the edge and the fitted straight line.
15. The image processing apparatus according to supplementary note 14, wherein the cost of the edge is determined based on an absolute value of a difference in abscissa of two nodes of the edge, an absolute value of a difference in ordinate of two nodes of the edge, and a distance between a tail node of the two nodes and the fitted straight line.
16. The image processing apparatus according to supplementary note 15, wherein the cost is a negative value normalized using a logarithmic function.
17. The image processing apparatus according to supplementary note 1, wherein determining a target path related to a single-source shortest path of the directed graph based on the directed graph comprises:
adding virtual nodes to the directed graph;
constructing virtual edges from the virtual nodes to other nodes;
setting the cost of the virtual edge to zero.
18. The image processing apparatus according to supplementary note 17, wherein determining a target path related to a single-source shortest path of the directed graph based on the directed graph comprises:
and determining the single-source shortest path of the directed graph by a Bellman-Ford algorithm with the virtual node as a starting point.
19. An image processing method, characterized in that the image processing method comprises:
binarizing the gray level document image to be used as a target image;
obtaining a plurality of bar-shaped areas arranged along a first direction by dividing the target image;
determining a directed graph for the entire target image based on intra-region communication fields of the plurality of bar-shaped regions;
determining a destination path related to a single-source shortest path of the directed graph based on the directed graph; and
eliminating lines in the grayscale document image corresponding to the target path.
20. A computer-readable storage medium on which a program is stored, the program causing an image processing method to be implemented when the program is executed by a processor, the image processing method comprising:
binarizing the gray level document image to be used as a target image;
obtaining a plurality of bar-shaped areas arranged along a first direction by dividing the target image;
determining a directed graph for the entire target image based on intra-region communication fields of the plurality of bar-shaped regions;
determining a destination path related to a single-source shortest path of the directed graph based on the directed graph; and
eliminating lines in the grayscale document image corresponding to the target path.

Claims (10)

1. An image processing apparatus characterized by comprising:
a binarization unit configured to binarize the grayscale document image as a target image;
a dividing unit configured to obtain a plurality of bar-shaped regions arranged in a first direction by dividing the target image;
a directed graph determination unit configured to determine a directed graph for the entire target image based on intra-region connected regions of the plurality of bar-shaped regions;
a target path determination unit for determining a target path related to a single-source shortest path of the directed graph based on the directed graph; and
a line eliminating unit for eliminating a line corresponding to the target path in the gradation document image.
2. The image processing apparatus according to claim 1, wherein a connected domain whose center is above the horizontal median line of the target image is selected to set the nodes of the directed graph; or
a connected domain whose center is below the horizontal median line of the target image is selected to set the nodes of the directed graph.
3. The image processing apparatus according to claim 1, wherein the directed graph determining unit is configured to determine a plurality of nodes of the directed graph;
the plurality of nodes correspond to the intra-region communication regions of the plurality of strip-shaped regions one to one.
4. The image processing apparatus according to claim 3, wherein the directed graph determining unit is configured to construct edges of the directed graph;
wherein the constructed edge satisfies the following conditions:
the head node and the tail node of the edge have different abscissas;
the difference between the abscissa of the tail node and the abscissa of the head node of the edge is smaller than the height of the target image; and
the absolute value of the difference between the ordinates of the tail node and the head node of the edge is smaller than each of the heights of the intra-region connected domains corresponding to the tail node and the head node;
and the coordinates of each node of the directed graph are the coordinates of the center of the connected domain in the corresponding area.
5. The image processing apparatus according to claim 4, wherein the directed graph determining unit is configured to determine the cost of an edge of the directed graph based on coordinates of two nodes of the edge.
6. The image processing apparatus of claim 5, wherein determining the cost of an edge of the directed graph based on coordinates of two nodes of the edge comprises:
and determining a fitted straight line of the center of the connected domain in the corresponding area of the plurality of nodes of the directed graph.
7. The image processing apparatus according to claim 6, wherein the directed graph determining unit is configured to determine the cost of an edge of the directed graph based on coordinates of two nodes of the edge and the fitted straight line.
8. The image processing apparatus of claim 1, wherein determining a target path based on the directed graph that is related to a single source shortest path of the directed graph comprises:
adding virtual nodes to the directed graph;
constructing virtual edges from the virtual nodes to other nodes;
setting the cost of the virtual edge to zero.
9. An image processing method, characterized in that the image processing method comprises:
binarizing the gray level document image to be used as a target image;
obtaining a plurality of bar-shaped areas arranged along a first direction by dividing the target image;
determining a directed graph for the entire target image based on intra-region communication fields of the plurality of bar-shaped regions;
determining a destination path related to a single-source shortest path of the directed graph based on the directed graph; and
eliminating lines in the grayscale document image corresponding to the target path.
10. A computer-readable storage medium on which a program is stored, the program causing an image processing method to be implemented when the program is executed by a processor, the image processing method comprising:
binarizing the gray level document image to be used as a target image;
obtaining a plurality of bar-shaped areas arranged along a first direction by dividing the target image;
determining a directed graph for the entire target image based on intra-region communication fields of the plurality of bar-shaped regions;
determining a destination path related to a single-source shortest path of the directed graph based on the directed graph; and
eliminating lines in the grayscale document image corresponding to the target path.
CN202010224697.6A 2020-03-26 2020-03-26 Image processing apparatus, image processing method, and storage medium for eliminating lines Pending CN113449729A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010224697.6A CN113449729A (en) 2020-03-26 2020-03-26 Image processing apparatus, image processing method, and storage medium for eliminating lines
JP2021044842A JP2021157793A (en) 2020-03-26 2021-03-18 Image processing device for removing line, image processing method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010224697.6A CN113449729A (en) 2020-03-26 2020-03-26 Image processing apparatus, image processing method, and storage medium for eliminating lines

Publications (1)

Publication Number Publication Date
CN113449729A true CN113449729A (en) 2021-09-28

Family

ID=77807194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010224697.6A Pending CN113449729A (en) 2020-03-26 2020-03-26 Image processing apparatus, image processing method, and storage medium for eliminating lines

Country Status (2)

Country Link
JP (1) JP2021157793A (en)
CN (1) CN113449729A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001229342A (en) * 2000-02-18 2001-08-24 Ricoh Co Ltd Method and device for extracting character, and storage medium
US20100259558A1 (en) * 2009-04-13 2010-10-14 Hitachi Software Engineering Co., Ltd. Underline removal apparatus
CN102446274A (en) * 2010-09-30 2012-05-09 Hanwang Technology Co., Ltd. Underlined text image preprocessing method and device
CN105868759A (en) * 2015-01-22 2016-08-17 Alibaba Group Holding Ltd. Method and apparatus for segmenting image characters
CN106156773A (en) * 2016-06-27 2016-11-23 Hunan University Text image segmentation method and device
US20190095743A1 (en) * 2017-09-28 2019-03-28 Konica Minolta Laboratory U.S.A., Inc. Line removal method, apparatus, and computer-readable medium
CN110321887A (en) * 2018-03-30 2019-10-11 Canon Inc. Document image processing method, document image processing apparatus and storage medium

Also Published As

Publication number Publication date
JP2021157793A (en) 2021-10-07

Similar Documents

Publication Publication Date Title
US10817741B2 (en) Word segmentation system, method and device
CN113486828B (en) Image processing method, device, equipment and storage medium
CN108491845B (en) Character segmentation position determination method, character segmentation method, device and equipment
CN110781885A (en) Text detection method, device, medium and electronic equipment based on image processing
US20120134591A1 (en) Image processing apparatus, image processing method and computer-readable medium
CN110838126A (en) Cell image segmentation method, cell image segmentation device, computer equipment and storage medium
CN111630522A (en) Entry region extraction device and entry region extraction program
CN103870823B (en) Character recognition device and method
CN110598566A (en) Image processing method, device, terminal and computer readable storage medium
US20200211182A1 (en) Image analysis device
CN109389110B (en) Region determination method and device
CN112818952A (en) Coal rock boundary recognition method and device and electronic equipment
CN111444807A (en) Target detection method, device, electronic equipment and computer readable medium
CN111626145A (en) Simple and effective incomplete form identification and page-crossing splicing method
US9076225B2 (en) Image processing device, an image processing method and a program to be used to implement the image processing
CN113557520A (en) Character processing and character recognition method, storage medium and terminal device
CN111814778B (en) Text line region positioning method, layout analysis method and character recognition method
CN110321887A (en) Document image processing method, document image processing apparatus and storage medium
CN113449729A (en) Image processing apparatus, image processing method, and storage medium for eliminating lines
CN111814780A (en) Bill image processing method, device and equipment and storage medium
CN111753812A (en) Text recognition method and equipment
CN103034854A (en) Image processing device and image processing method
CN112784737B (en) Text detection method, system and device combining pixel segmentation and line segment anchor
CN112215783B (en) Image noise point identification method, device, storage medium and equipment
CN114494052A (en) Book counting method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination