CN113095327B

CN113095327B - Method and system for positioning optical character recognition area and storage medium thereof

Info

Publication number: CN113095327B
Application number: CN202110280359.9A
Authority: CN
Inventors: 丁凡
Original assignee: Shenzhen Emperor Technology Co Ltd
Current assignee: Shenzhen Emperor Technology Co Ltd
Priority date: 2021-03-16
Filing date: 2021-03-16
Publication date: 2022-10-14
Anticipated expiration: 2041-03-16
Also published as: CN113095327A

Abstract

The invention relates to the technical field of character recognition, in particular to a method, a system and a storage medium for positioning an optical character recognition area, wherein the method comprises the following steps: acquiring an original image, converting the original image into a gray image, and performing binarization processing to obtain an original binary image; marking connected domains in the image; acquiring height and width parameters of a connected domain; performing self-adaptive threshold value calculation on a current pixel point in the gray level image to obtain a first binary image; then removing a noise area in the image to obtain a second binary image; respectively carrying out integral and haar characteristic diagram calculation on the first binary image to obtain a gradient diagram; obtaining a y-axis positioning interval according to the gradient frequency of the gradient map; determining a target area according to the y-axis positioning interval, intercepting the second binary image, counting the distance between connected domains in the image, and determining an x-axis positioning interval according to the distance between the connected domains; the invention can realize the quick positioning of the positioning area and improve the stability and the accuracy of character recognition.

Description

Method and system for positioning optical character recognition area and storage medium thereof

Technical Field

The invention relates to the technical field of character recognition, in particular to a method for positioning an optical character recognition area, a system for positioning the optical character recognition area and a readable storage medium storing the method.

Background

With the development of information technology, Optical Character Recognition (OCR) has come into common use in everyday life. OCR is a technology in which electronic equipment optically converts the characters in a paper document into a black-and-white dot-matrix image file, and recognition software then converts the characters in the image into a text format for further editing and processing by word processing software. OCR is therefore one of the key technologies for paperless, automated computer processing, and the main indexes for measuring the performance of an OCR system include the rejection rate, the false recognition rate, the recognition speed and the like.

In the OCR technology, one of the key steps is to locate the character area; in the prior art, when complex background interference exists in a paper material, the conventional OCR recognition method cannot meet the requirement of quickly positioning and segmenting an OCR character area, so that the stability of OCR recognition is poor and the accuracy of character recognition is low.

Disclosure of Invention

In order to overcome the above-mentioned drawbacks, the present invention provides a positioning method, a positioning system and a readable storage medium storing the positioning method, which can quickly position and segment an OCR character area.

The purpose of the invention is realized by the following technical scheme:

the invention relates to a method for positioning an optical character recognition area, which comprises the following steps:

acquiring an original image to be identified and width parameters thereof, carrying out gray level conversion on the original image to obtain a gray level image, and then carrying out binarization processing on the gray level image to obtain an original binary image;

marking a plurality of connected domains in the original binary image by a connected domain marking algorithm; acquiring the height and width attribute parameters of each connected domain; the attribute parameters of the height and the width of all connected domains are counted, and the attribute parameters of the width and the height with the highest occurrence frequency are defined as the actual attribute parameters of the width, the height and the area;

performing threshold calculation on the gray values of the current pixel point and the neighborhood pixel block in the gray image by using an adaptive threshold binarization algorithm to obtain a first binary image;

removing a noise area in the first binary image according to the height and width attribute parameters of the connected domain to obtain a second binary image;

integrating the first binary image to obtain an integral image, determining a haar characteristic value according to actual width and height attribute parameters, and calculating the haar characteristic value of the integral image according to the haar characteristic value to obtain a gradient image; after binarization processing is carried out on the gradient map, traversing the gradient frequency of the gradient map along the direction from small to large of the y-axis coordinate, and determining a y-axis starting point and a y-axis end point by judging whether the gradient frequency is greater than a character number threshold value or not to obtain a y-axis positioning interval;

determining a target area according to the y-axis positioning interval and the width parameter of the original image, and intercepting the second binary image according to the target area to obtain a third binary image; counting the distance between adjacent connected domains in the third binary image, and calculating the average distance range; and determining an x-axis starting point and an x-axis end point according to the distance between the connected domains and the actual width attribute parameters of the connected domains to obtain an x-axis positioning interval.

In the present invention, traversing the gradient frequency of the y-axis coordinate from small to large and determining the y-axis starting point and the y-axis end point by judging whether the gradient frequency is greater than the character number threshold, and obtaining the y-axis positioning interval includes:

traversing the gradient frequency of the y-axis coordinate from small to large and judging whether the gradient frequency is greater than a character number threshold, and if so, determining the current y-direction position as a y-axis starting point; and continuously judging whether the gradient frequency of the positioning area is greater than the character number threshold value along the y-axis direction, and if the gradient frequency of the positioning area is less than the character number threshold value, determining the current y-direction position as a y-axis terminal point to obtain a y-axis positioning interval.

In the present invention, the determining an x-axis starting point and an x-axis ending point according to the distance between the connected domains and the actual width attribute parameters of the connected domains to obtain an x-axis positioning interval includes:

traversing the distance between the connected domains from small to large along the x-axis coordinate, determining two adjacent connected domains which meet the average distance range, taking the x-axis coordinate of the connected domain with the smaller x-axis coordinate as the x-axis starting point, continuously judging along the x-axis direction whether the distance between the subsequent connected domains and the actual width attribute parameter are within the preset range, and if not, taking the x-axis coordinate of the current connected domain as the x-axis end point to obtain an x-axis positioning interval.

In the present invention, the removing the noise region in the first binary image according to the height and width attribute parameters of the connected domain to obtain the second binary image includes:

in the first binary image, marking a plurality of second connected regions by a connected domain marking algorithm of eight adjacent points, defining the second connected regions as connected domains, and acquiring the height and width attribute parameters of each connected domain;

and removing the noise region in the first binary image according to the height and width attribute parameters of the connected domain to obtain a second binary image.

In the present invention, the removing the noise region in the first binary image according to the height and width attribute parameters of the connected domain to obtain a second binary image includes:

acquiring maximum area parameters and minimum area parameters in all connected domains, determining a maximum area threshold value through the maximum area parameters, determining a minimum area threshold value through the minimum area parameters, and calculating an area threshold value range;

and respectively judging whether the area of each connected domain is within the area threshold range, if not, defining the connected domain as a noise region, and removing the noise region from the first binary image.

In the present invention, said marking a plurality of connected regions in the original binary image by a connected component marking algorithm includes:

in the original binary image, a plurality of first connected regions are marked through a connected domain marking algorithm of four adjacent points, and the first connected regions are defined as connected domains.

In the present invention, before performing threshold calculation on the gray values of the current pixel point and the neighboring pixel block in the gray image by using an adaptive threshold binarization algorithm, the method further includes:

calculating the size ratio value between the actual width and height attribute parameters and the preset standard width and height attribute parameters, and zooming the gray-scale image according to the size ratio value.

In the present invention, the performing threshold calculation on the gray values of the current pixel point and the neighboring pixel block in the gray image by using the adaptive threshold binarization algorithm to obtain a first binary image comprises:

acquiring the gray value of a pixel point adjacent to the current pixel point in the gray image, and setting a gray threshold value according to the gray value of the adjacent pixel point; and then carrying out self-adaptive threshold value binarization calculation according to the set gray threshold value to obtain a first binary image.

Based on the same concept, the present invention also provides a positioning system of an optical character recognition area, comprising:

the system comprises an original image acquisition module, a width parameter acquisition module and a recognition module, wherein the original image acquisition module is used for acquiring an original image to be recognized and a width parameter thereof;

the gray level image conversion module is connected with the original image acquisition module and is used for carrying out gray level conversion on the original image to obtain a gray level image;

the original binary image generation module is connected with the gray level image conversion module and is used for carrying out binarization processing on the gray level image to obtain an original binary image;

the connected domain generating module is connected with the original binary image generating module and used for marking a plurality of connected domains in the original binary image through a connected domain marking algorithm and acquiring the height and width attribute parameters of each connected domain;

the actual parameter acquisition module is connected with the connected domain generation module and is used for defining the attribute parameters with the highest frequency of occurrence, such as the width and the height, as the actual attribute parameters of the width, the height and the area by counting the attribute parameters of the height and the width of all the connected domains;

the first binary image generation module is connected with the gray level image conversion module and is used for performing threshold calculation on gray levels of a current pixel point and a neighborhood pixel block in the gray level image by using a self-adaptive threshold binary algorithm to obtain a first binary image;

the second binary image generation module is connected with the connected domain generation module and the first binary image generation module and is used for removing a noise area in the first binary image according to the height and width attribute parameters of the connected domain to obtain a second binary image;

the y-axis interval positioning module is connected with the first binary image generating module and the actual parameter acquiring module and used for integrating the first binary image to obtain an integral map, determining a haar characteristic value according to actual width and height attribute parameters, and calculating the integral map according to the haar characteristic value to obtain a gradient map; after binarization processing is carried out on the gradient map, traversing the gradient frequency of the gradient map along the direction from small to large of the y-axis coordinate, and determining a y-axis starting point and a y-axis end point by judging whether the gradient frequency is greater than a character number threshold value or not to obtain a y-axis positioning interval;

the x-axis interval positioning module is respectively connected with the y-axis interval positioning module, the second binary image generating module and the original image acquiring module and is used for determining a target area according to the width parameters of the y-axis positioning interval and the original image, and intercepting the second binary image according to the target area to obtain a third binary image; counting the distance between adjacent connected domains in the third binary image, and calculating the average distance range; and determining an x-axis starting point and an x-axis end point according to the distance between the connected domains and the actual width attribute parameters of the connected domains to obtain an x-axis positioning interval.

Based on the same concept, the present invention also provides a computer-readable program storage medium storing computer program instructions which, when executed by a computer, cause the computer to perform the method as described above.

According to the method, the position of the y-axis direction in the positioning area is quickly obtained by carrying out normalization processing on the image and according to the gradient frequency distribution in the image; the position in the x-axis direction in the positioning area is quickly obtained through the character width and character detection characteristics in the x-axis direction; therefore, the method and the device can realize quick positioning of the positioning area, further greatly improve the stability of character recognition and improve the accuracy of character recognition.

Drawings

For ease of illustration, the invention is described in detail in the following description of the preferred embodiments and in the accompanying drawings.

FIG. 1 is a flowchart illustrating an embodiment of a method for locating an OCR region according to the present invention;

FIG. 2 is a schematic diagram illustrating a haar feature calculation in the method for locating an optical character recognition area according to the present invention;

FIG. 3 is a schematic diagram illustrating the calculation of haar features in the method for locating an optical character recognition area according to the present invention;

FIG. 4 is a schematic diagram illustrating the principle of integral calculation for each point in an integral graph in the method for locating an optical character recognition area according to the present invention;

FIG. 5 is a flowchart illustrating a method for locating an OCR region according to another embodiment of the present invention;

FIG. 6 is a schematic logical structure diagram of an embodiment of a system for locating an OCR region according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, are used only for convenience and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be considered as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" and "second" may explicitly or implicitly include one or more of the described features. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.

In the description of the present invention, it should be noted that, unless otherwise explicitly stated or limited, the terms "mounted" and "connected" are to be construed broadly: a connection may be, for example, fixed, detachable, or integral; mechanical or electrical; direct or indirect through intervening media; or internal or in any other relationship. The specific meanings of the above terms in the present invention can be understood according to specific situations by those of ordinary skill in the art.

In the following, an embodiment of the method for locating an optical character recognition area of the present invention is described in detail, referring to fig. 1, which includes:

S101, acquiring an original image to be identified

An image containing characters is obtained by image scanning or imported from a database; an original image containing the character area is located and roughly cropped, and its width parameter X_width is obtained by means of empirical parameter configuration.

S102, obtaining a gray level image through gray level conversion

Gray-scale conversion is performed on the original image to obtain a gray-scale image I, calculated by the formula I(x, y) = 1/3·R(x, y) + 1/3·G(x, y) + 1/3·B(x, y), where R, G and B are the red, green and blue channels of the image, respectively.
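
By way of illustration only, the equal-weight conversion above might be sketched as follows. The function name `to_gray` and the representation of the image as rows of (R, G, B) tuples are assumptions made for this sketch and are not part of the described method.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to gray values.

    Implements I(x, y) = (R + G + B) / 3, the equal-weight average
    stated in S102. The list-of-rows layout is an assumption.
    """
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in rgb_image]


if __name__ == "__main__":
    sample = [[(30, 60, 90), (255, 255, 255)]]
    print(to_gray(sample))
```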

S103, carrying out binarization processing to obtain an original binary image

Carrying out binarization processing on the gray level image I by utilizing an Otsu (otsu) algorithm to obtain an original binary image Bo; wherein, otsu algorithm is also called as: the maximum inter-class variance method.
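
As a non-limiting sketch, the maximum inter-class variance selection could be written as below. The names `otsu_threshold` and `binarize`, the 0..255 integer gray range, and the choice to map pixels above the threshold to 255 are assumptions of this example; an image with dark characters on a light background would need the opposite polarity.

```python
def otsu_threshold(gray):
    """Return the threshold t in 0..255 that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[int(v)] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor 1 / total**2.
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, t):
    """Map pixels above t to 255 and all others to 0 (assumed polarity)."""
    return [[255 if v > t else 0 for v in row] for row in gray]
```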

S104, marking a plurality of connected domains through a connected domain marking algorithm

A plurality of connected domains are marked in the original binary image Bo by a connected domain marking algorithm. A connected domain is an image region formed by foreground pixel points that have the same pixel value and are adjacent in position, and each connected domain is marked as a blob. Connected domain analysis means finding and marking each connected domain in the image. The smallest unit of an image is the pixel point; each pixel point has 8 neighboring pixels, and there are 2 common adjacency relations: 4-adjacency and 8-adjacency. 4-adjacency covers 4 points in total, namely the points above, below, to the left and to the right of the pixel point; 8-adjacency covers 8 points in total, namely the 4 points of 4-adjacency plus the upper-left, lower-left, upper-right and lower-right points. In this embodiment, connected domains are marked using 8-adjacency. Connected domain analysis is performed on the pixel points with Bo(x, y) = 255 in the original binary image Bo, and the height and width attribute parameters of each connected domain are calculated.
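
For illustration, a breadth-first labelling pass that supports both adjacency relations might look like the following sketch. The function name `label_blobs`, the dictionary layout of each blob, and the use of the bounding box for width and height are assumptions of this example rather than details given in the description.

```python
from collections import deque


def label_blobs(binary, connectivity=8):
    """Find connected domains of 255-valued pixels.

    Returns one dict per blob with bounding-box x, y, width, height
    and the pixel count (area).
    """
    h, w = len(binary), len(binary[0])
    if connectivity == 8:
        steps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dy, dx) != (0, 0)]
    else:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 255 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            y0 = y1 = y
            x0 = x1 = x
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                y0, y1 = min(y0, cy), max(y1, cy)
                x0, x1 = min(x0, cx), max(x1, cx)
                for dy, dx in steps:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 255 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append({"x": x0, "y": y0, "width": x1 - x0 + 1,
                          "height": y1 - y0 + 1, "area": area})
    return blobs
```

With two diagonally touching foreground pixels, 8-adjacency yields a single blob while 4-adjacency yields two, which is the practical difference between the two relations.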

S105, acquiring actual width and height attribute parameters

The height and width attribute parameters of all connected domains are counted, and the width and height attribute parameters with the highest occurrence frequency are defined as the actual width, height and area attribute parameters; since the original image to be recognized is known to contain many characters whose aspect ratios are generally consistent, the actual width and height of most characters in the image can be obtained through the statistics in this step.
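
A minimal sketch of this statistic, assuming each connected domain is available as a dictionary with `width` and `height` keys (a layout chosen only for this example), could be:

```python
from collections import Counter


def most_common_size(blobs):
    """Return the most frequent width and the most frequent height."""
    width = Counter(b["width"] for b in blobs).most_common(1)[0][0]
    height = Counter(b["height"] for b in blobs).most_common(1)[0][0]
    return width, height


if __name__ == "__main__":
    blobs = [{"width": 5, "height": 9}, {"width": 5, "height": 9},
             {"width": 2, "height": 9}, {"width": 5, "height": 3}]
    print(most_common_size(blobs))
```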

S106, obtaining a first binary image by using a self-adaptive threshold value binarization algorithm

Threshold calculation is performed on the gray values of the current pixel point and the neighborhood pixel block in the gray image I by using an adaptive threshold binarization algorithm to obtain a first binary image B1. The neighborhood pixel block is square, and its size is set to one quarter of the actual height of the connected domain. Specifically, the gray value of the current pixel point is compared with the gray value of the neighborhood pixel block: if the gray value of the current pixel point is smaller than the gray value of the neighborhood pixel block, the current pixel point is set to 0; otherwise it is set to 255, thereby generating the first binary image B1.
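
The comparison above might be sketched as follows. The description does not state how the gray value of the neighborhood pixel block is computed; this example assumes it is the mean of a square window centred on the current pixel, and the name `adaptive_binarize` and the parameter `block` (the window side length, for example one quarter of the actual height) are likewise assumptions.

```python
def adaptive_binarize(gray, block):
    """Binarize each pixel against the mean of its local square window.

    A pixel darker than its neighborhood mean becomes 0, otherwise 255,
    following the polarity described in S106. Using the mean as the
    neighborhood gray value is an assumption of this sketch.
    """
    h, w = len(gray), len(gray[0])
    r = max(block // 2, 1)  # window half-size, at least one pixel
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            vals = [gray[j][i] for j in ys for i in xs]
            mean = sum(vals) / len(vals)
            out[y][x] = 0 if gray[y][x] < mean else 255
    return out
```

This direct form recomputes every window and is meant only to show the comparison; a practical implementation would normally reuse window sums.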

S107, removing the noise area to obtain a second binary image

Removing a noise area in the first binary image B1 according to the height and width attribute parameters of the connected domain to obtain a second binary image B2;

S108, calculating and obtaining a y-axis positioning interval by combining a haar characteristic diagram

And integrating the first binary image B1 to obtain an integral map Sat, wherein the conversion formula of the integral map Sat is as follows:

determining a haar-like characteristic value according to the actual width and height attribute parameters, wherein the haar characteristic is a digital image characteristic used for object identification; then carrying out haar characteristic diagram calculation on the integral diagram according to the haar characteristic value to obtain a gradient diagram T; wherein, haar edge characteristics are as shown in fig. 2 to 4, and the characteristics are described as follows: the width of the black part is one third of the actual width of the connected domain, the height is the actual width of the connected domain, the width and height of the white part are the standard width and height, and the specific formula is as follows:

Haar_{A-B} = Sum(A) - Sum(B)

= [SAT_6 + SAT_2 - SAT_2 - SAT_5] - [SAT_5 + SAT_1 - SAT_2 - SAT_4]

wherein, the integral calculation of each point in the integral graph is specifically as follows:

SAT_1 = Sum(Ra)

SAT_2 = Sum(Ra) + Sum(Rb)

SAT_3 = Sum(Ra) + Sum(Rc)

Sum(Rd) = SAT_1 + SAT_4 - SAT_2 - SAT_3
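
As an illustrative sketch only, the integral map and the four-corner rectangle sum Sum(Rd) = SAT_1 + SAT_4 - SAT_2 - SAT_3 can be computed as below. The function names, the zero-padded first row and column, and the simplification of the edge feature to two equal-sized side-by-side rectangles are assumptions of this example; the black and white part sizes given in the description are not modelled here.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column.

    sat[y][x] holds the sum of img over rows < y and columns < x.
    """
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def rect_sum(sat, x, y, width, height):
    """Sum of the rectangle with top-left (x, y), from four table lookups."""
    return (sat[y + height][x + width] + sat[y][x]
            - sat[y][x + width] - sat[y + height][x])


def haar_edge(sat, x, y, width, height):
    """Left rectangle minus the equal-sized rectangle to its right."""
    return (rect_sum(sat, x, y, width, height)
            - rect_sum(sat, x + width, y, width, height))
```

Because every rectangle sum costs four lookups regardless of its size, the feature map can be evaluated at every position in time proportional to the number of pixels.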

After the gradient map T is binarized, the gradient frequency F_ocr of the gradient map T is traversed in the direction of increasing y-axis coordinate, and a y-axis starting point and a y-axis end point are determined by judging whether the gradient frequency is greater than the character number threshold Tresh, to obtain a y-axis positioning interval;

S109, obtaining an x-axis positioning interval according to the distance between the connected domains and the actual width attribute parameter

A target area is determined according to the y-axis positioning interval and the width parameter X_width of the original image, and the second binary image B2 is cropped according to the target area to obtain a third binary image B3; the distances between adjacent connected domain blobs in the third binary image B3 are counted, and the average distance range is calculated; and an x-axis starting point and an x-axis end point are determined according to the distance between the connected domain blobs and the actual width attribute parameter of the connected domain blobs, to obtain an x-axis positioning interval.
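
One possible reading of the x-axis step is sketched below. The description does not give the tolerance that defines the average distance range or the preset width range, so `gap_tol` and `width_tol` are hypothetical parameters, as are the function name `locate_x_interval`, the blob dictionary layout, and the choice to report the right edge of the last accepted blob as the end point.

```python
def locate_x_interval(blobs, actual_width, gap_tol=0.5, width_tol=0.5):
    """Return (x_start, x_end) of the run of evenly spaced, character-wide blobs."""
    blobs = sorted(blobs, key=lambda b: b["x"])
    # Horizontal gap between each blob and its right-hand neighbour.
    gaps = [b2["x"] - (b1["x"] + b1["width"])
            for b1, b2 in zip(blobs, blobs[1:])]
    if not gaps:
        return None
    mean_gap = sum(gaps) / len(gaps)
    lo, hi = mean_gap * (1 - gap_tol), mean_gap * (1 + gap_tol)

    def width_ok(b):
        return abs(b["width"] - actual_width) <= width_tol * actual_width

    start = end = None
    for i, gap in enumerate(gaps):
        ok = lo <= gap <= hi and width_ok(blobs[i]) and width_ok(blobs[i + 1])
        if ok:
            if start is None:
                start = blobs[i]["x"]
            end = blobs[i + 1]["x"] + blobs[i + 1]["width"]
        elif start is not None:
            break  # spacing or width left the expected range: interval ends
    return None if start is None else (start, end)
```

Note that a plain mean is sensitive to a single very large gap; this sketch keeps the mean because the description refers to an average distance range.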

In the following, a method for positioning an optical character recognition area according to another embodiment of the present invention is described in detail, referring to fig. 5, which includes:

S201, acquiring an original image to be identified

An image containing characters is obtained by image scanning or imported from a database; an original image containing the character area is located and roughly cropped, and its width parameter X_width is obtained by means of empirical parameter configuration.

S202, obtaining a gray level image through gray level conversion

S203, carrying out binarization processing to obtain an original binary image

S204, marking a plurality of first connected regions through a connected domain marking algorithm

In the original binary image Bo, connected domain analysis is performed on the pixel points with Bo(x, y) = 255 using a connected domain marking algorithm of four adjacent points to mark a plurality of first connected regions, and the first connected regions are defined as connected domains. A connected domain is an image region formed by foreground pixel points that have the same pixel value and are adjacent in position, and each connected domain is marked as a blob. Connected domain analysis means finding and marking each connected domain in the image. The smallest unit of an image is the pixel point; each pixel point has 8 neighboring pixels, and there are 2 common adjacency relations: 4-adjacency and 8-adjacency. 4-adjacency covers 4 points in total, namely the points above, below, to the left and to the right of the pixel point; 8-adjacency covers 8 points in total, namely the 4 points of 4-adjacency plus the upper-left, lower-left, upper-right and lower-right points; in this embodiment, it marks the connected domain in an 8-adjacency manner. Connected domain analysis is performed on the pixel points with Bo(x, y) = 255 in the original binary image Bo, and the height and width attribute parameters of each connected domain are calculated.

S205, obtaining actual width and height attribute parameters

The aspect ratio of each connected domain is calculated from its height and width attribute parameters, and each connected domain is placed into the aspect ratio interval corresponding to its aspect ratio; the aspect ratio interval into which the most connected domains fall is found, and the width and height attribute parameters corresponding to that interval are defined as the actual width and height attribute parameters. Since the original image to be recognized is known to contain many characters whose aspect ratios are generally consistent, the actual width and height of most characters in the image can be obtained through the statistics in this step. In this embodiment, the first connected regions are obtained by connected domain marking of four adjacent points, and the actual width and height are obtained from statistics over the first connected regions, so that the corresponding actual width and height can be obtained quickly.
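
The interval statistic might be sketched as follows. The interval size `bin_width`, the function name `actual_size_by_aspect`, and the use of the median width and height of the winning interval as its "corresponding" parameters are assumptions of this example; the description does not specify them.

```python
from collections import defaultdict
from statistics import median


def actual_size_by_aspect(blobs, bin_width=0.25):
    """Pick the most populated width/height ratio interval and summarize it."""
    bins = defaultdict(list)
    for b in blobs:
        ratio = b["width"] / b["height"]
        bins[int(ratio / bin_width)].append(b)
    best = max(bins.values(), key=len)  # interval holding the most blobs
    return (median(b["width"] for b in best),
            median(b["height"] for b in best))


if __name__ == "__main__":
    blobs = [{"width": 6, "height": 10}, {"width": 6, "height": 10},
             {"width": 7, "height": 10}, {"width": 30, "height": 3}]
    print(actual_size_by_aspect(blobs))
```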

S206, zooming the gray level image

The size ratio between the actual width and height attribute parameters and the preset standard width and height attribute parameters is calculated, and the gray-scale image is scaled according to the size ratio. Specifically, from the actual width and height of the connected domain in the image and the width and height of the target character, the ratio is calculated as: ratio = actual height (width) / target height (width). The gray-scale image is scaled according to the resulting height and width ratios to obtain a gray-scale image Id, which ensures that the characters being processed always keep a constant size and facilitates subsequent processing. The width and height of the target character are set empirically.
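
A minimal sketch of this normalization, assuming nearest-neighbour sampling (the description does not name an interpolation method) and a hypothetical function name `scale_gray`, could be:

```python
def scale_gray(gray, actual_h, actual_w, target_h, target_w):
    """Rescale so that characters of the actual size land on the target size.

    ratio = actual / target, as in S206; the output dimensions are the
    source dimensions divided by that ratio.
    """
    ratio_h = actual_h / target_h
    ratio_w = actual_w / target_w
    src_h, src_w = len(gray), len(gray[0])
    dst_h = max(1, round(src_h / ratio_h))
    dst_w = max(1, round(src_w / ratio_w))
    return [[gray[min(src_h - 1, int(y * ratio_h))]
                 [min(src_w - 1, int(x * ratio_w))]
             for x in range(dst_w)]
            for y in range(dst_h)]
```

For example, if the characters measure 20 by 20 pixels and the target character is 10 by 10, the ratio is 2 in both directions and the image is halved.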

S207, obtaining a first binary image by using a self-adaptive threshold binarization algorithm

The gray values of the pixel points adjacent to the current pixel point in the gray image Id are acquired, and a gray threshold is set according to the gray values of the adjacent pixel points; adaptive threshold binarization is then performed according to the set gray threshold to obtain a first binary image B1. The neighborhood pixel block is square, and its size is set to one quarter of the actual height of the connected domain. Specifically, the gray value of the current pixel point is compared with the gray value of the neighborhood pixel block: if the gray value of the current pixel point is smaller than the gray value of the neighborhood pixel block, the current pixel point is set to 0; otherwise it is set to 255, thereby generating the first binary image B1.

S208, marking a plurality of second connected regions through a connected region marking algorithm

In the first binary image B1, performing connected domain analysis on the pixel points of B1 (x, y) =255 through a connected domain marking algorithm with eight adjacent points to mark a plurality of second connected regions, defining the second connected regions as connected domains, and obtaining attribute parameters of height, width and area of each connected domain.

S209, removing the noise area to obtain a second binary image

Whether the area of each connected domain is within the area threshold range is judged respectively; if not, the connected domain is defined as a noise region and removed from the first binary image. Specifically: the area distribution of the class of connected domains that meet the character width and height requirements is counted to obtain the maximum area Smax and the minimum area Smin; taking one quarter of Smin as a threshold Tmin, connected regions smaller than Tmin in the first binary image B1 are cleared; taking 2 times Smax as a threshold Tmax, connected regions larger than Tmax in the first binary image B1 are cleared, to obtain a second binary image B2.
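
By way of example, the thresholds Tmin = Smin / 4 and Tmax = 2 × Smax could be applied as in the sketch below. The tolerance `tol` used to decide which connected domains "meet the character width and height requirements", the function name `remove_noise`, and the blob dictionary layout with a `pixels` list are assumptions of this example.

```python
def remove_noise(binary, blobs, actual_w, actual_h, tol=0.3):
    """Return a copy of binary with too-small and too-large blobs cleared."""
    def char_like(b):
        return (abs(b["width"] - actual_w) <= tol * actual_w
                and abs(b["height"] - actual_h) <= tol * actual_h)

    out = [row[:] for row in binary]  # leave the input image untouched
    areas = [b["area"] for b in blobs if char_like(b)]
    if not areas:
        return out
    t_min = min(areas) / 4  # Tmin: one quarter of Smin
    t_max = 2 * max(areas)  # Tmax: twice Smax
    for b in blobs:
        if b["area"] < t_min or b["area"] > t_max:
            for y, x in b["pixels"]:
                out[y][x] = 0
    return out
```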

S210, calculating and obtaining a y-axis positioning interval by combining a haar characteristic diagram

determining a haar-like characteristic value according to the actual width and height attribute parameters, wherein the haar characteristic is a digital image characteristic used for object identification; then according to the haar characteristic value, carrying out haar characteristic diagram calculation on the integral diagram to obtain a gradient diagram T; wherein, haar edge characteristics are as shown in fig. 2 to 4, and the characteristics are described as follows: the width of the black part is one third of the actual width of the connected domain, the height is the actual width of the connected domain, the width and height of the white part are the standard width and height, and the specific formula is as follows:

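$$\mathrm{Haar}_{A-B} = \mathrm{Sum}(A) - \mathrm{Sum}(B) = [\mathrm{SAT}_6 + \mathrm{SAT}_2 - \mathrm{SAT}_2 - \mathrm{SAT}_5] - [\mathrm{SAT}_5 + \mathrm{SAT}_1 - \mathrm{SAT}_2 - \mathrm{SAT}_4]$$

$$\mathrm{SAT}_1 = \mathrm{Sum}(R_a)$$

$$\mathrm{SAT}_2 = \mathrm{Sum}(R_a) + \mathrm{Sum}(R_b)$$

$$\mathrm{SAT}_3 = \mathrm{Sum}(R_a) + \mathrm{Sum}(R_c)$$

$$\mathrm{Sum}(R_d) = \mathrm{SAT}_1 + \mathrm{SAT}_4 - \mathrm{SAT}_2 - \mathrm{SAT}_3$$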
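The rectangle-sum identity above can be illustrated with a short Python sketch of a summed-area table (integral map). Note that the last formula implies that the fourth table value covers all four regions, i.e. Sum(Ra) + Sum(Rb) + Sum(Rc) + Sum(Rd). The function names and the side-by-side layout of the two rectangles in `haar_edge` are assumptions made for illustration; the actual feature geometry is the one shown in figs. 2 to 4.

```python
def integral_image(img):
    """Summed-area table with one extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def rect_sum(sat, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h.

    Mirrors Sum(Rd) = SAT1 + SAT4 - SAT2 - SAT3 using four table lookups.
    """
    return sat[y + h][x + w] + sat[y][x] - sat[y][x + w] - sat[y + h][x]


def haar_edge(sat, x, y, w, h):
    """Two-rectangle edge response Sum(A) - Sum(B).

    A is the w-by-h block at (x, y) and B is the block immediately to its
    right (assumed layout, for illustration only).
    """
    return rect_sum(sat, x, y, w, h) - rect_sum(sat, x + w, y, w, h)
```

Once the table is built, every rectangle sum costs four lookups regardless of the rectangle size, which is why the Haar feature map is computed on the integral map rather than on the binary image directly.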

The gradient map T is binarized with a threshold of standard width × standard width / 4. The gradient frequency F_ocr is traversed along the y-axis coordinate from small to large, and it is judged whether F_ocr is larger than a character number threshold Tresh. Given the width W_ocr of the connected domain and the width W_T of the image, the character number threshold is Tresh = α·W_T / W_ocr, where α takes a value in the interval (1/2, 2/3). If F_ocr is larger than the threshold Tresh, the current y-direction position is determined as the y-axis starting point; the judgment of whether F_ocr is larger than the character number threshold then continues along the y-axis direction, and when F_ocr is smaller than the threshold Tresh, the current y-direction position is determined as the y-axis end point, giving the y-axis positioning interval. Specifically: if the gradient frequency F_ocr of the current line is larger than the threshold Tresh, the current y-direction position is determined as the y-axis starting point Y_start, and the OCR flag flag_ocr is incremented by 1; if the gradient frequency F_ocr of the current line is smaller than the threshold Tresh, the y-axis starting point Y_start is set to zero and the OCR flag flag_ocr is set to zero; if the gradient frequency F_ocr of the current line is smaller than the threshold Tresh and the OCR flag flag_ocr has been incremented by 1, the current y-direction position is determined as the y-axis end point Y_end. The coordinates Y_start and Y_end determine the positioning interval of the OCR in the y-axis direction.
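The row traversal described above can be sketched in Python as follows. This is a simplified illustration, not the exact flag logic of the embodiment: it assumes that the gradient frequency of a row is the number of non-zero pixels in that row of the binarized gradient map, and it replaces the flag flag_ocr with a simple "start already found" state. The function names and the default α = 0.6 (a value inside the stated interval) are hypothetical.

```python
def char_count_threshold(image_width, char_width, alpha=0.6):
    """Tresh = alpha * W_T / W_ocr, with alpha chosen inside (1/2, 2/3)."""
    return alpha * image_width / char_width


def locate_y_interval(binary_gradient, thresh):
    """Return (y_start, y_end) for the first band of rows above the threshold.

    binary_gradient is a list of rows; a pixel counts towards the row's
    gradient frequency when it is non-zero (assumed definition).
    Returns None when no row exceeds the threshold.
    """
    y_start = None
    for y, row in enumerate(binary_gradient):
        freq = sum(1 for v in row if v)
        if y_start is None:
            if freq > thresh:
                y_start = y          # first row above the threshold
        elif freq < thresh:
            return y_start, y        # first row below the threshold after the start
    if y_start is None:
        return None
    # The band runs to the bottom of the image (assumed fallback).
    return y_start, len(binary_gradient)
```

For example, with an image width of 100 and a character width of 10, the threshold is about 6, so a row needs more than 6 gradient responses to open the interval.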

S211, obtaining an x-axis positioning interval according to the spacing of the connected domains and the actual width attribute parameter

A target area is determined according to the y-axis positioning interval and the width parameter X_width of the original image; specifically, the target area is (0, Y_start, X_width, Y_end - Y_start). The second binary image B2 is cropped according to the target area to obtain a third binary image B3. The spacing between adjacent connected domains (blobs) in the third binary image B3 is counted, and the average spacing range is calculated. The spacing between connected domains is traversed along the x-axis coordinate from small to large to find two adjacent connected domains that satisfy the average spacing range, and the x-axis coordinate of the one with the smaller x-axis coordinate is taken as the x-axis starting point. Whether the spacing of the subsequent connected domains and the actual width attribute parameter are within a preset range is then judged continuously along the x-axis direction; if not, the x-axis coordinate of the current connected domain is taken as the x-axis end point, giving the x-axis positioning interval.
Specifically, the distribution interval of each connected domain blob in the x-axis direction is calculated, and, with the blobs sorted from small to large by the x coordinate of their centers, the distance between each pair of adjacent blobs is counted; this distance is the character spacing. Normal OCR character spacings should differ only slightly. Based on this property, the mean gap_avg of the blob spacing distribution is calculated first. Then the smallest starting point in the x-axis direction with uniform spacing is evaluated: the blob with the smallest center x coordinate is taken as the starting block blob_0 of this line of OCR, and blob_0 determines the starting position X_start of the OCR area in the x-axis direction. Starting from X_start, the blobs are traversed from left to right using the actual width attribute parameter of the connected domain and the character spacing gap_avg until the character size requirement is no longer met, which gives the OCR end position X_end in the x-axis direction. The positioning area (X_start, Y_start, X_end - X_start, Y_end - Y_start) is the crop box of the OCR area to be located.
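A minimal Python sketch of the x-axis search is given below. It rests on several assumptions that the text does not fix: blobs are represented as (left edge, width) tuples, the "preset range" is modeled as a relative tolerance `tol` around gap_avg and around the actual character width, and X_end is taken as the right edge of the last accepted blob. The function name `locate_x_interval` is hypothetical.

```python
def locate_x_interval(blobs, char_width, tol=0.5):
    """Return (x_start, x_end) of a run of evenly spaced, character-sized blobs.

    blobs: list of (x_left, width) tuples (hypothetical representation).
    char_width: actual width attribute parameter of the characters.
    tol: relative tolerance standing in for the preset range (assumption).
    """
    blobs = sorted(blobs, key=lambda b: b[0] + b[1] / 2.0)
    if len(blobs) < 2:
        return None
    centers = [x + w / 2.0 for x, w in blobs]
    gaps = [b - a for a, b in zip(centers, centers[1:])]
    gap_avg = sum(gaps) / len(gaps)

    def gap_ok(g):
        return abs(g - gap_avg) <= tol * gap_avg

    def width_ok(w):
        return abs(w - char_width) <= tol * char_width

    # First pair of adjacent blobs whose spacing matches the average spacing.
    start = next((i for i, g in enumerate(gaps) if gap_ok(g)), None)
    if start is None:
        return None
    end = start
    for i in range(start, len(gaps)):
        if gap_ok(gaps[i]) and width_ok(blobs[i + 1][1]):
            end = i + 1              # extend the run to the next blob
        else:
            break                    # spacing or size no longer fits
    x_start = blobs[start][0]
    x_end = blobs[end][0] + blobs[end][1]
    return x_start, x_end
```

Because gap_avg is a plain mean, a few very large gaps can pull it away from the true character spacing; a practical implementation might restrict the statistics to blobs that already pass the width check.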

The present invention includes a computer readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above section "exemplary methods" of the present description, when said program product is run on the terminal device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on the above readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

An embodiment of a positioning system for an optical character recognition area according to the present invention is described in detail below, with reference to fig. 6, which includes:

An original image acquisition module 101, where the original image acquisition module 101 is configured to acquire an original image to be identified and its width parameter; specifically, an image containing characters is acquired by image scanning or by importing it from a database, a rough original image containing the character area is located and cropped by means of empirically configured parameters, and the width parameter of the image is obtained.

A grayscale image conversion module 102, where the grayscale image conversion module 102 is connected to the original image acquisition module 101 and is configured to perform grayscale conversion on the original image to obtain a grayscale image, using the formula I(x, y) = 1/3·I_R(x, y) + 1/3·I_G(x, y) + 1/3·I_B(x, y), where R, G and B are the red, green and blue channels of the image.
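As a small illustration of the equal-weight conversion, the following Python sketch averages the three channels of each pixel. The image layout (a list of rows of (R, G, B) tuples) and the function name `to_gray` are assumptions.

```python
def to_gray(rgb):
    """Equal-weight grayscale conversion: I = (R + G + B) / 3 for every pixel."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb]
```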

An original binary image generation module 103, where the original binary image generation module 103 is connected to the grayscale image conversion module 102 and is configured to perform binarization processing on the grayscale image to obtain an original binary image; in the present embodiment, the grayscale image is binarized using the Otsu algorithm.
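For reference, a compact pure-Python sketch of Otsu thresholding is shown below. It selects the threshold that maximizes the between-class variance of the gray-level histogram. The function names are hypothetical, gray values are assumed to be integers in 0..255, and the polarity (pixels above the threshold become 255) is an assumption; the embodiment may invert it so that characters are the 255-valued foreground.

```python
def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[int(v)] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]                # pixels with value <= t
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0              # pixels with value > t
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, t):
    """Set pixels above t to 255 and all others to 0 (assumed polarity)."""
    return [[255 if v > t else 0 for v in row] for row in gray]
```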

A connected domain generating module 104, where the connected domain generating module 104 is connected to the original binary image generation module 103 and is configured to mark a plurality of connected domains in the original binary image through a connected domain labeling algorithm and to obtain the height and width attribute parameters of each connected domain. A connected domain is an image region formed by adjacent foreground pixels with the same pixel value, and each connected domain is denoted as a blob. Connected domain analysis means finding and labeling each connected domain in an image. The smallest unit of an image is a pixel, each pixel has 8 neighboring pixels around it, and there are 2 common adjacency relations: 4-adjacency and 8-adjacency. 4-adjacency covers 4 points, namely the pixels above, below, to the left of and to the right of the pixel; 8-adjacency covers 8 points, adding the upper-left, lower-left, upper-right and lower-right pixels to the 4-adjacent points. In this embodiment, connected domains are marked using 8-adjacency: connected domain analysis is performed on the pixel points with gray value Bo(x, y) = 255 in the original binary image Bo, and the height and width attribute parameters of each connected domain are calculated.
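The 8-adjacency labeling can be illustrated by the following Python sketch, which uses a breadth-first flood fill over the 255-valued pixels. The flood-fill strategy, the function name `label_blobs` and the dict layout of each blob are assumptions; the specification only requires a connected domain labeling algorithm with 8-adjacency.

```python
from collections import deque


def label_blobs(binary):
    """Find 8-connected regions of 255-valued pixels.

    Returns one dict per blob with its bounding box (x, y, width, height)
    and its pixel count (area), in raster-scan order of discovery.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0][x0] != 255 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            area = 0
            while queue:
                x, y = queue.popleft()
                area += 1
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                # Visit all 8 neighbours (the centre is already marked as seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] == 255 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            blobs.append({"x": min_x, "y": min_y,
                          "width": max_x - min_x + 1,
                          "height": max_y - min_y + 1,
                          "area": area})
    return blobs
```

With 4-adjacency, two pixels touching only at a corner would form two separate blobs; with 8-adjacency they are merged, which helps keep thin diagonal strokes of a character in one connected domain.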

An actual parameter obtaining module 105, where the actual parameter obtaining module 105 is connected to the connected domain generating module 104, and is configured to define, by performing statistics on the height and width attribute parameters of all connected domains, the width and height attribute parameters with the highest occurrence frequency as actual width, height, and area attribute parameters; specifically, a plurality of aspect ratio sections are preset, the aspect ratios of all connected domains are calculated according to the height attribute parameters and the width attribute parameters of all connected domains, the connected domains are placed into the corresponding aspect ratio sections according to the respective aspect ratios, the aspect ratio section where the most connected domains fall is counted, and the width attribute parameters and the height attribute parameters corresponding to the aspect ratio sections are defined as the actual width attribute parameters and the actual height attribute parameters.
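The aspect ratio voting can be sketched as follows. The bin edges, the use of width divided by height as the aspect ratio, and the choice of the median width and height of the winning bin as the "corresponding" parameters are all assumptions made for illustration; the specification does not state how the representative values are derived from the winning section.

```python
def dominant_char_size(blobs, bin_edges):
    """Return (width, height) representing the most populated aspect-ratio bin.

    blobs: list of dicts with "width" and "height" (hypothetical layout).
    bin_edges: ascending edges of the preset aspect-ratio sections.
    """
    bins = [[] for _ in range(len(bin_edges) - 1)]
    for b in blobs:
        ratio = b["width"] / b["height"]
        for i in range(len(bins)):
            if bin_edges[i] <= ratio < bin_edges[i + 1]:
                bins[i].append(b)
                break
    best = max(bins, key=len)        # section containing the most blobs
    if not best:
        return None
    widths = sorted(b["width"] for b in best)
    heights = sorted(b["height"] for b in best)
    mid = len(best) // 2
    return widths[mid], heights[mid]  # median as representative (assumption)
```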

A first binary image generation module 106, where the first binary image generation module 106 is connected to the grayscale image conversion module 102 and is configured to perform threshold calculation on the gray values of a current pixel point and its neighborhood pixel block in the grayscale image by using an adaptive threshold binarization algorithm to obtain a first binary image; specifically, the gray value of the current pixel point is compared with the gray value of the neighborhood pixel block, and if the gray value of the current pixel point is smaller than that of the neighborhood pixel block, the current pixel point is set to 0, otherwise it is set to 255, thereby generating the first binary image.
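A minimal Python sketch of such a local comparison is given below. It assumes that the gray value of the neighborhood pixel block is the mean of a square window centered on the pixel and clipped at the image border; the window radius and the function name `adaptive_threshold` are hypothetical.

```python
def adaptive_threshold(gray, radius=1):
    """Set a pixel to 0 if it is darker than its local mean, otherwise to 255.

    The local mean is taken over a (2 * radius + 1) square window that is
    clipped at the image border (assumed definition of the neighborhood).
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            block = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            mean = sum(block) / len(block)
            out[y][x] = 0 if gray[y][x] < mean else 255
    return out
```

For large windows, the local sums would normally be taken from an integral map instead of being recomputed for every pixel.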

A second binary image generating module 107, where the second binary image generating module 107 is connected to the connected domain generating module 104 and the first binary image generation module 106 and is configured to remove a noise region in the first binary image according to the height and width attribute parameters of the connected domain to obtain a second binary image; specifically: the maximum area parameter and the minimum area parameter among all connected domains are acquired, a maximum area threshold is determined from the maximum area parameter and a minimum area threshold from the minimum area parameter, and the area threshold range is calculated; whether the area of each connected domain is within the area threshold range is then judged respectively, and if not, the connected domain is defined as a noise region and removed from the first binary image.

A y-axis interval positioning module 108, wherein the y-axis interval positioning module 108 is connected to the first binary image generating module 106 and the actual parameter acquiring module 105, and is configured to integrate the first binary image to obtain an integral map, determine a haar feature value according to actual width and height attribute parameters, and calculate a haar feature map of the integral map according to the haar feature value to obtain a gradient map; after binarization processing is carried out on the gradient map, traversing the gradient frequency of the gradient map along the direction from small to large of the y-axis coordinate, and determining a y-axis starting point and a y-axis end point by judging whether the gradient frequency is greater than a character number threshold value or not to obtain a y-axis positioning interval; specifically, after the gradient map is subjected to binarization processing, traversing the gradient frequency of the gradient map along the y-axis coordinate from small to large and judging whether the gradient frequency is greater than a character number threshold value, and if so, determining the current y-direction position as a y-axis starting point; and continuously judging whether the gradient frequency of the positioning area is greater than the character number threshold value along the y-axis direction, and if the gradient frequency of the positioning area is less than the character number threshold value, determining the current y-direction position as a y-axis terminal point to obtain a y-axis positioning interval.

An x-axis interval positioning module 109, where the x-axis interval positioning module 109 is connected to the y-axis interval positioning module 108, the second binary image generating module 107 and the original image acquisition module 101 respectively, and is configured to determine a target area according to the y-axis positioning interval and the width parameter of the original image, and to crop the second binary image according to the target area to obtain a third binary image; to count the spacing between adjacent connected domains in the third binary image and calculate the average spacing range; and to determine an x-axis starting point and an x-axis end point according to the spacing between the connected domains and the actual width attribute parameter of the connected domains to obtain an x-axis positioning interval; specifically, after the average spacing range is calculated, the spacing between connected domains is traversed along the x-axis coordinate from small to large to find two adjacent connected domains that satisfy the average spacing range, the x-axis coordinate of the one with the smaller x-axis coordinate is taken as the x-axis starting point, whether the spacing of the subsequent connected domains and the actual width attribute parameter are within a preset range is judged continuously along the x-axis direction, and if not, the x-axis coordinate of the current connected domain is taken as the x-axis end point to obtain the x-axis positioning interval.

In the description of the present specification, reference to the description of "one embodiment", "some embodiments", "illustrative embodiments", "examples", "specific examples", or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. A method for locating an optical character recognition area, comprising:

marking a plurality of connected domains in the original binary image by a connected domain marking algorithm; acquiring the height and width attribute parameters of each connected domain; the attribute parameters of the height and the width of all connected domains are counted, and the attribute parameters of the width and the height with the highest frequency of occurrence are defined as the actual attribute parameters of the width, the height and the area;

integrating the first binary image to obtain an integral image, determining a haar characteristic value according to actual width and height attribute parameters, and calculating the integral image according to the haar characteristic value to obtain a gradient image; after binarization processing is carried out on the gradient map, traversing the gradient frequency of the gradient map along the direction from small to large of the y-axis coordinate, and determining a y-axis starting point and a y-axis end point by judging whether the gradient frequency is greater than a character number threshold value or not to obtain a y-axis positioning interval;

2. The method of claim 1, wherein traversing the gradient frequency of the y-axis coordinate from small to large and determining a y-axis start point and a y-axis end point by determining whether the gradient frequency is greater than a threshold of the number of characters to obtain a y-axis location section comprises:

traversing the gradient frequency of the y-axis coordinate from small to large and judging whether the gradient frequency is greater than a character number threshold, and if so, determining the current y-direction position as a y-axis starting point; and continuously judging whether the gradient frequency is greater than the character number threshold value or not along the y-axis direction, and if the gradient frequency is less than the threshold value, determining the current y-direction position as a y-axis terminal point to obtain a y-axis positioning interval.

3. The method of claim 2, wherein determining an x-axis start point and an x-axis end point according to the distance between connected domains and the actual width attribute parameter of the connected domains to obtain an x-axis positioning interval comprises:

traversing the distance between the connected domains along the direction from small to large of the x-axis coordinate, determining two adjacent connected domains which meet the average distance range, taking the x-axis coordinate of the connected domain with the smaller x-axis coordinate as the x-axis starting point, continuously judging whether the distance between the subsequent connected domains and the actual width attribute parameter are within a preset range along the x-axis direction, and if not, taking the x-axis coordinate of the current connected domain as the x-axis end point to obtain an x-axis positioning interval.

4. The method of claim 3, wherein the removing the noise region in the first binary image according to the property parameters of the height and width of the connected component to obtain the second binary image comprises:

in the first binary image, marking a plurality of second connected regions through a connected domain marking algorithm of eight adjacent points, defining the second connected regions as connected domains, and acquiring the height and width attribute parameters of each connected domain;

and removing a noise region in the first binary image according to the height and width attribute parameters of the connected domain to obtain a second binary image.

5. The method of claim 4, wherein the removing the noise region in the first binary image according to the property parameters of the height and width of the connected component to obtain the second binary image comprises:

acquiring maximum area parameters and minimum area parameters in all connected domains, determining a maximum area threshold value through the maximum area parameters, determining a minimum area threshold value through the minimum area parameters, and then calculating an area threshold value range;

6. The method as claimed in claim 5, wherein the step of marking a plurality of connected components in the original binary image by a connected component marking algorithm comprises:

7. The method as claimed in claim 6, wherein the step of performing threshold calculation on the gray values of the current pixel point and the neighborhood pixel block in the gray image by using an adaptive threshold binarization algorithm further comprises:

8. The method of claim 7, wherein the performing a threshold calculation on the gray values of the current pixel point and the neighborhood pixel block in the gray image by using an adaptive threshold binarization algorithm to obtain a first binary image comprises:

9. A system for locating an optical character recognition area, comprising:

an original image acquisition module, configured to acquire an original image to be identified and the width parameter thereof;

the actual parameter acquisition module is connected with the connected domain generation module and is used for defining the attribute parameters of the width and the height with the highest occurrence frequency as the actual attribute parameters of the width, the height and the area by counting the attribute parameters of the height and the width of all connected domains;

the second binary image generation module is connected with the connected domain generation module and the first binary image generation module and is used for removing a noise area in the first binary image according to the attribute parameters of the height and the width of the connected domain to obtain a second binary image;

the x-axis interval positioning module is respectively connected with the y-axis interval positioning module, the second binary image generating module and the original image acquiring module and is used for determining a target area according to a y-axis positioning interval and width parameters of the original image, and intercepting the second binary image according to the target area to obtain a third binary image; counting the distance between adjacent connected domains in the third binary image, and calculating the average distance range; and determining an x-axis starting point and an x-axis end point according to the distance between the connected domains and the actual width attribute parameters of the connected domains to obtain an x-axis positioning interval.

10. A computer-readable program storage medium storing computer program instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 8.