CN111414919A - Method, apparatus, device, and storage medium for extracting text from printed pictures containing tables

Info

Publication number: CN111414919A
Application number: CN202010225345.2A
Authority: CN
Inventors: 李佳; 杨阳; 刘旭东
Original assignee: Telephase Technology Development Beijing Co ltd
Current assignee: Guangzhou Juying Information Technology Co ltd
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2020-07-14
Anticipated expiration: 2040-03-26
Also published as: CN111414919B

Abstract

The application discloses a method, an apparatus, a device, and a storage medium for extracting text from a printed picture containing a table. The method comprises the following steps: removing, from a plurality of horizontal lines and/or vertical lines, the horizontal lines whose projection integrals are smaller than a first preset distance threshold and/or the vertical lines whose projection integrals are smaller than a second preset distance threshold, to obtain the table of the binarized picture; and deleting the table from the binarized picture while retaining the text content of the binarized picture. With the method, interference lines can be removed from the table extracted from the picture, so that the text of the picture is accurately extracted based on the table from which the interference lines have been removed.

Description

Method, apparatus, device, and storage medium for extracting text from printed pictures containing tables

Technical Field

The present application relates to the field of text recognition, and in particular to a method, an apparatus, a device, and a storage medium for extracting text from a printed picture containing a table.

Background

In printed-text recognition, the text is generally extracted first and then recognized. Some text pictures contain a table, so the table must be removed before the printed text can be extracted and passed on to recognition. An accurate and efficient technique for removing the table from a picture is therefore needed to make text extraction possible.

In current approaches to removing the table from a text picture, the basic process is to convert the original picture to grayscale, binarize it, and then apply erosion and dilation algorithms to extract horizontal and vertical lines, thereby extracting the table. The prior art has two defects, however. First, the length of the structuring element must be fixed for the erosion and dilation steps; it usually has a default initial value and must be tuned for each picture to achieve the best effect. Second, the strokes of characters sometimes join up into what look like horizontal or vertical lines, producing short interference lines rather than actual table lines. Because of these two defects, the recognition accuracy of existing text recognition techniques is low.

Disclosure of Invention

The application aims to disclose a method, an apparatus, a device, and a storage medium for extracting text from a printed picture containing a table, which remove interference lines from the table extracted from the picture, so that the text of the picture is accurately extracted based on the table from which the interference lines have been removed.

A first aspect of the application discloses a method for extracting text from a printed picture containing a table, which comprises the following steps:

acquiring a picture to be processed, wherein the picture to be processed comprises a table;

calculating the gray value of each pixel point according to the RGB value of each pixel point of the picture to be processed;

sequentially comparing the gray value of each pixel point with a preset threshold and converting the gray value of each pixel point into 0 or 255 according to the comparison result, so as to convert the picture to be processed into a binarized picture;

identifying a plurality of horizontal lines and/or a plurality of vertical lines in the binarized picture according to a structuring element, an erosion algorithm, and a dilation algorithm;

calculating the horizontal projection integral of each horizontal line and/or the horizontal projection integral of each vertical line;

removing, from the plurality of horizontal lines and/or the plurality of vertical lines, the horizontal lines whose projection integrals are smaller than a first preset distance threshold and/or the vertical lines whose projection integrals are smaller than a second preset distance threshold, to obtain the table of the binarized picture;

and deleting the table from the binarized picture while retaining the text content of the binarized picture.

With the method of the first aspect, interference lines can be removed from the table extracted from the picture, so that the text of the picture can be accurately extracted based on the table from which the interference lines have been removed.

As an optional implementation manner, after calculating the horizontal projection integral of each horizontal line and/or the horizontal projection integral of each vertical line, and before removing the horizontal lines whose projection integrals are smaller than the first preset distance threshold and/or the vertical lines whose projection integrals are smaller than the second preset distance threshold to obtain the table of the binarized picture, the method further includes:

acquiring the horizontal projection integrals of the text content of the binarized picture in the row direction and in the column direction;

taking the magnitude of the horizontal projection integral of the text content in the row direction as the first preset distance threshold;

and taking the magnitude of the horizontal projection integral of the text content in the column direction as the second preset distance threshold.
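As a non-limiting illustration, the screening of candidate lines against a threshold derived from the text content may be sketched as follows. The sketch rests on two assumptions that the application does not spell out: the projection integral of a candidate line is read as the number of consecutive foreground pixels it covers along its own direction, and the threshold is read as the longest such run found in a sample of text-only rows. The function names `runs`, `longest_run`, and `drop_short_lines` are hypothetical.

```python
def runs(row):
    """Yield (start, length) for each run of consecutive 1s in a row."""
    start = None
    # A trailing 0 is appended as a sentinel so the last run is always closed.
    for x, value in enumerate(list(row) + [0]):
        if value and start is None:
            start = x
        elif not value and start is not None:
            yield start, x - start
            start = None


def longest_run(rows):
    """Assumed threshold: the longest horizontal run found in text-only rows."""
    return max((length for row in rows for _, length in runs(row)), default=0)


def drop_short_lines(mask, threshold):
    """Keep only runs whose length is not smaller than the threshold."""
    out = [[0] * len(row) for row in mask]
    for y, row in enumerate(mask):
        for start, length in runs(row):
            if length >= threshold:
                out[y][start:start + length] = [1] * length
    return out


# Hypothetical data: 1 marks a foreground (ink) pixel.
text_rows = [[0, 1, 1, 1, 0, 1, 0, 0],
             [1, 1, 0, 0, 1, 1, 1, 0]]
candidates = [[1, 1, 1, 1, 1, 1, 1, 1],   # genuine table line
              [0, 0, 1, 1, 0, 0, 0, 0]]   # short interference line

threshold = longest_run(text_rows)
filtered = drop_short_lines(candidates, threshold)
```

Vertical lines may be screened in the same way by applying the functions to the transposed mask with the second threshold.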

As an optional implementation manner, identifying a plurality of horizontal lines and/or a plurality of vertical lines in the binarized picture according to the structuring element, the erosion algorithm, and the dilation algorithm includes:

determining the structuring element according to the number of rows and the number of columns of the binarized picture;

performing an erosion operation on each pixel of the binarized picture with the structuring element to obtain an erosion result;

performing a dilation operation on each pixel of the binarized picture with the structuring element to obtain a dilation result;

and identifying the plurality of horizontal lines and/or the plurality of vertical lines in the binarized picture according to the erosion result and the dilation result.
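As a non-limiting illustration, horizontal line identification by erosion followed by dilation may be sketched in plain Python as below. The sketch assumes a 1 x s horizontal structuring element anchored at its left end, a picture stored as rows of 0/1 values with 1 marking foreground, and dilation applied to the erosion result (a morphological opening); the application does not fix these details, and the function names are hypothetical.

```python
def erode_h(img, s):
    """A pixel survives only if the s pixels starting at it are all foreground."""
    width = len(img[0])
    return [
        [1 if x + s <= width and all(row[x:x + s]) else 0 for x in range(width)]
        for row in img
    ]


def dilate_h(img, s):
    """A pixel is set if any of the s pixels ending at it is foreground."""
    width = len(img[0])
    return [
        [1 if any(row[max(0, x - s + 1):x + 1]) else 0 for x in range(width)]
        for row in img
    ]


def horizontal_lines(img, s):
    """Erosion removes runs shorter than s; dilation restores the survivors."""
    return dilate_h(erode_h(img, s), s)


binary = [
    [1, 1, 1, 1, 1, 1],   # table ruling line
    [0, 1, 1, 0, 0, 0],   # short text stroke
    [0, 0, 0, 0, 0, 0],
]
lines = horizontal_lines(binary, 4)
```

In this sketch the ruling line in the first row is kept in full, while the two-pixel stroke in the second row is removed because it is shorter than the structuring element. Vertical lines may be obtained by running the same functions on the transposed picture.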

As an optional implementation manner, the structuring element is determined from the number of rows and columns of the binarized picture by the following formula:

S = COLS // SCALE or S = ROWS // SCALE;

where COLS represents the number of columns of the binarized picture, ROWS represents the number of rows of the binarized picture, and the symbol // denotes integer division, discarding the remainder;

and SCALE = COLS // D_COL or SCALE = ROWS // D_ROW, where D_COL represents the column pitch of the binarized picture and D_ROW represents the row pitch of the binarized picture.
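As a non-limiting illustration, the two integer divisions may be worked through as follows. The picture size and the pitch values are hypothetical, and the helper name `structuring_length` is an assumption introduced only for this sketch.

```python
def structuring_length(size, pitch):
    """Return S = size // SCALE, where SCALE = size // pitch."""
    scale = size // pitch            # integer division, remainder discarded
    if scale == 0:
        raise ValueError("pitch must not exceed the picture size")
    return size // scale


# Hypothetical picture of 1000 columns by 600 rows.
COLS, ROWS = 1000, 600
D_COL, D_ROW = 48, 30                # hypothetical column and row pitches

S_horizontal = structuring_length(COLS, D_COL)   # 1000 // (1000 // 48)
S_vertical = structuring_length(ROWS, D_ROW)     # 600 // (600 // 30)
```

With these values SCALE is 20 in both directions, so the horizontal element is 50 pixels long and the vertical element is 30 pixels long. The resulting length therefore tracks the pitch of the picture rather than a fixed default.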

As an optional implementation manner, the calculation formula for calculating the gray value of each pixel point according to the RGB values of each pixel point of the to-be-processed picture is as follows:

H = 0.3*R + 0.59*G + 0.11*B;

where H represents the gray value of the pixel point, and R, G, and B are respectively the R value, G value, and B value of the RGB value of that pixel point.
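As a non-limiting illustration, the weighted gray conversion may be sketched as follows. The picture is assumed to be stored as rows of (R, G, B) tuples, and rounding the result to the nearest integer is an assumption, since the application does not state how H is quantized.

```python
def to_gray(pixels):
    """Convert rows of (R, G, B) tuples to rows of integer gray values."""
    return [
        [int(round(0.3 * r + 0.59 * g + 0.11 * b)) for (r, g, b) in row]
        for row in pixels
    ]


# Hypothetical 2 x 2 picture: white, black, pure red, pure green.
image = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 255, 0)]]
gray = to_gray(image)
```

Because the three weights sum to 1, a white pixel keeps the value 255 and a black pixel keeps the value 0, so the gray picture stays within the 0 to 255 range expected by the binarization step.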

As an optional implementation manner, after calculating the gray value of each pixel point according to the RGB value of each pixel point of the picture to be processed, and before sequentially comparing the gray value of each pixel point with the preset threshold and converting it into 0 or 255 according to the comparison result so as to convert the picture to be processed into a binarized picture, the method further includes:

calculating the preset threshold according to the formula T = B - C;

where T represents the preset threshold, B represents the weighted average of the pixels in a region R around the pixel point, and C represents the difference of the pixels in the region R around the pixel point.
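As a non-limiting illustration, a per-pixel threshold of the form T = B - C may be sketched as follows. The sketch assumes uniform weights for the average B, a square region of hypothetical radius 1 clipped at the picture border, a constant C of 2, and that pixels brighter than T are mapped to 255; none of these choices is fixed by the application.

```python
def local_threshold(gray, y, x, radius=1, c=2):
    """Return T = B - C for the pixel at row y, column x."""
    height, width = len(gray), len(gray[0])
    window = [
        gray[j][i]
        for j in range(max(0, y - radius), min(height, y + radius + 1))
        for i in range(max(0, x - radius), min(width, x + radius + 1))
    ]
    b = sum(window) / len(window)    # uniform weights: an assumption
    return b - c


def binarize(gray, radius=1, c=2):
    """Map each pixel to 255 if it exceeds its local threshold, else to 0."""
    return [
        [255 if gray[y][x] > local_threshold(gray, y, x, radius, c) else 0
         for x in range(len(gray[0]))]
        for y in range(len(gray))
    ]


# Hypothetical gray picture with one dark pixel on a light background.
gray = [[200, 200, 200],
        [200, 50, 200],
        [200, 200, 200]]
binary = binarize(gray)
```

Because the threshold follows the local brightness, the dark center pixel falls below its threshold while the surrounding light pixels stay above theirs, even though the dark pixel lowers the averages of its neighbors.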

As an optional implementation manner, deleting the table from the binarized picture while retaining the text content of the binarized picture includes:

subtracting the table of the binarized picture from the binarized picture to obtain the text content of the binarized picture, where the binarized picture is the minuend and the table of the binarized picture is the subtrahend.
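As a non-limiting illustration, the subtraction may be sketched as a pixelwise operation. The sketch assumes that foreground pixels (table lines and text) carry the value 255 and that negative differences are clamped to 0; the function name `subtract` is hypothetical.

```python
def subtract(binary, table):
    """Pixelwise difference: binary is the minuend, table is the subtrahend."""
    return [
        [max(0, b - t) for b, t in zip(binary_row, table_row)]
        for binary_row, table_row in zip(binary, table)
    ]


# Hypothetical binarized picture: a ruling line above a short text stroke.
binary = [[255, 255, 255, 255],
          [0, 255, 255, 0]]
table = [[255, 255, 255, 255],
         [0, 0, 0, 0]]
text = subtract(binary, table)
```

Only the pixels that belong to the table are cleared, so the text stroke in the second row is left unchanged.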

A second aspect of the application discloses an apparatus for extracting text from a printed picture containing a table. The apparatus includes:

an acquisition module, configured to acquire a picture to be processed, where the picture to be processed comprises a table;

a gray processing module, configured to calculate the gray value of each pixel point according to the RGB value of each pixel point of the picture to be processed;

a binarization processing module, configured to sequentially compare the gray value of each pixel point with a preset threshold and convert the gray value of each pixel point into 0 or 255 according to the comparison result, so as to convert the picture to be processed into a binarized picture;

an identification module, configured to identify a plurality of horizontal lines and/or a plurality of vertical lines in the binarized picture according to a structuring element, an erosion algorithm, and a dilation algorithm;

a calculation module, configured to calculate the horizontal projection integral of each horizontal line and/or the horizontal projection integral of each vertical line;

a screening module, configured to remove, from the plurality of horizontal lines and/or the plurality of vertical lines, the horizontal lines whose projection integrals are smaller than a first preset distance threshold and/or the vertical lines whose projection integrals are smaller than a second preset distance threshold, to obtain the table of the binarized picture;

and a deleting module, configured to delete the table from the binarized picture while retaining the text content of the binarized picture.

By executing the method of the first aspect of the present application, the apparatus of the second aspect can remove interference lines from the table extracted from the picture, so that the text of the picture can be accurately extracted based on the table from which the interference lines have been removed. Compared with the prior art, the apparatus of the second aspect therefore achieves better recognition accuracy.

A third aspect of the present application discloses a device for extracting text from a printed picture containing a table. The device includes:

a processor; and

a memory configured to store machine-readable instructions which, when executed by the processor, perform the method for extracting text from a printed picture containing a table disclosed in the first aspect of the application.

By executing the method of the first aspect of the present application, the device of the third aspect can remove interference lines from the table extracted from the picture, so that the text of the picture can be accurately extracted based on the table from which the interference lines have been removed. Compared with the prior art, the device of the third aspect therefore achieves better recognition accuracy.

A fourth aspect of the present application discloses a storage medium storing a computer program which, when executed by a processor, performs the method for extracting text from a printed picture containing a table disclosed in the first aspect of the present application.

By executing the method of the first aspect of the present application, the storage medium of the fourth aspect can remove interference lines from the table extracted from the picture, so that the text of the picture can be accurately extracted based on the table from which the interference lines have been removed.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be regarded as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.

Fig. 1 is a schematic flow chart of a method for extracting text from a printed picture with a table according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of an apparatus for extracting text from a printed picture containing a table according to the second embodiment of the present application;

Fig. 3 is a schematic structural diagram of a device for extracting text from a printed picture containing a table according to the third embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.

Example one

Referring to fig. 1, fig. 1 is a schematic flow chart illustrating a method for extracting text from a printed picture with a table according to an embodiment of the present application. As shown in fig. 1, the method comprises the steps of:

101. acquiring a picture to be processed, wherein the picture to be processed comprises a table;

102. calculating the gray value of each pixel point according to the RGB value of each pixel point of the picture to be processed;

103. sequentially comparing the gray value of each pixel point with a preset threshold and converting the gray value of each pixel point into 0 or 255 according to the comparison result, so as to convert the picture to be processed into a binarized picture;

104. identifying a plurality of horizontal lines and/or a plurality of vertical lines in the binarized picture according to a structuring element, an erosion algorithm, and a dilation algorithm;

105. calculating the horizontal projection integral of each horizontal line and/or the horizontal projection integral of each vertical line;

106. removing, from the plurality of horizontal lines and/or the plurality of vertical lines, the horizontal lines whose projection integrals are smaller than a first preset distance threshold and/or the vertical lines whose projection integrals are smaller than a second preset distance threshold, to obtain the table of the binarized picture;

107. deleting the table from the binarized picture while retaining the text content of the binarized picture.
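As a non-limiting illustration, steps 102 to 107 may be chained for horizontal lines only in a compact sketch. Several simplifications are assumed and are not taken from the application: a fixed global threshold of 128 replaces the computed threshold, dark pixels are mapped to 255 as foreground, and a run-length test replaces the erosion and dilation (within a single row, an opening with a 1 x s element keeps exactly the runs of at least s pixels). The function name `extract_text` and its parameters are hypothetical.

```python
def extract_text(rgb, threshold=128, s=4, min_len=4):
    """Return the binarized picture with long horizontal runs removed."""
    # Steps 102-103: weighted gray value, then binarization.
    # Dark pixels (ink) become 255: an assumption about polarity.
    binary = [
        [255 if 0.3 * r + 0.59 * g + 0.11 * b < threshold else 0
         for (r, g, b) in row]
        for row in rgb
    ]
    table = [[0] * len(row) for row in binary]
    for y, row in enumerate(binary):
        x = 0
        while x < len(row):
            if row[x] == 0:
                x += 1
                continue
            end = x
            while end < len(row) and row[end] == 255:
                end += 1
            # Steps 104-106 (simplified): the run length stands in for the
            # projection integral; short runs are treated as interference.
            if end - x >= max(s, min_len):
                table[y][x:end] = [255] * (end - x)
            x = end
    # Step 107: subtract the table from the binarized picture.
    return [
        [b - t for b, t in zip(binary_row, table_row)]
        for binary_row, table_row in zip(binary, table)
    ]


K, W = (0, 0, 0), (255, 255, 255)    # black ink, white paper
rgb = [
    [K] * 6,                         # upper ruling line
    [W, K, K, W, K, W],              # text strokes
    [K] * 6,                         # lower ruling line
]
text = extract_text(rgb)
```

In this sketch the two ruling lines are removed and the text strokes between them are retained.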

As an optional implementation manner, after calculating the horizontal projection integral of each horizontal line and/or the horizontal projection integral of each vertical line, and before removing the horizontal lines whose projection integrals are smaller than the first preset distance threshold and/or the vertical lines whose projection integrals are smaller than the second preset distance threshold to obtain the table of the binarized picture, the method further includes:

acquiring the horizontal projection integrals of the text content of the binarized picture in the row direction and in the column direction;

taking the magnitude of the horizontal projection integral of the text content in the row direction as the first preset distance threshold;

and taking the magnitude of the horizontal projection integral of the text content in the column direction as the second preset distance threshold.

As an optional implementation manner, identifying a plurality of horizontal lines and/or a plurality of vertical lines in the binarized picture according to the structuring element, the erosion algorithm, and the dilation algorithm includes:

determining the structuring element according to the number of rows and the number of columns of the binarized picture;

performing an erosion operation on each pixel of the binarized picture with the structuring element to obtain an erosion result;

performing a dilation operation on each pixel of the binarized picture with the structuring element to obtain a dilation result;

and identifying the plurality of horizontal lines and/or the plurality of vertical lines in the binarized picture according to the erosion result and the dilation result.

As an optional implementation manner, the structuring element is determined from the number of rows and columns of the binarized picture by the following formula:

S = COLS // SCALE or S = ROWS // SCALE;

and SCALE = COLS // D_COL or SCALE = ROWS // D_ROW, where D_COL represents the column pitch of the binarized picture and D_ROW represents the row pitch of the binarized picture.

As an optional implementation manner, the calculation formula for calculating the gray value of each pixel point according to the RGB value of each pixel point of the to-be-processed picture is as follows:

H = 0.3*R + 0.59*G + 0.11*B;

As an optional implementation manner, after calculating the gray value of each pixel point according to the RGB value of each pixel point of the picture to be processed, and before sequentially comparing the gray value of each pixel point with the preset threshold and converting it into 0 or 255 according to the comparison result so as to convert the picture to be processed into the binarized picture, the method further includes:

calculating the preset threshold according to the formula T = B - C;

where T represents the preset threshold, B represents the weighted average of the pixels in a region R around the pixel point, and C represents the difference of the pixels in the region R around the pixel point.

As an optional implementation manner, deleting the table from the binarized picture while retaining the text content of the binarized picture includes:

subtracting the table of the binarized picture from the binarized picture to obtain the text content of the binarized picture, where the binarized picture is the minuend and the table of the binarized picture is the subtrahend.

Example two

Referring to fig. 2, fig. 2 is a schematic structural diagram of an apparatus for extracting text from a printed picture containing a table according to an embodiment of the present application. As shown in fig. 2, the apparatus includes:

an obtaining module 201, configured to obtain a to-be-processed picture, where the to-be-processed picture includes a table;

the gray processing module 202 is configured to calculate a gray value of each pixel point according to the RGB value of each pixel point of the picture to be processed;

a binarization processing module 203, configured to sequentially compare the gray value of each pixel point with a preset threshold and convert the gray value of each pixel point into 0 or 255 according to the comparison result, so as to convert the picture to be processed into a binarized picture;

an identification module 204, configured to identify a plurality of horizontal lines and/or a plurality of vertical lines in the binarized picture according to a structuring element, an erosion algorithm, and a dilation algorithm;

a calculating module 205, configured to calculate the horizontal projection integral of each horizontal line and/or the horizontal projection integral of each vertical line;

a screening module 206, configured to remove, from the plurality of horizontal lines and/or the plurality of vertical lines, the horizontal lines whose projection integrals are smaller than a first preset distance threshold and/or the vertical lines whose projection integrals are smaller than a second preset distance threshold, to obtain the table of the binarized picture;

and a deleting module 207, configured to delete the table from the binarized picture while retaining the text content of the binarized picture.

By executing the method disclosed in the embodiments of the present application, the apparatus of this embodiment can remove interference lines from the table extracted from the picture, so that the text of the picture can be accurately extracted based on the table from which the interference lines have been removed.

Example three

Referring to fig. 3, fig. 3 is a schematic structural diagram of a device for extracting text from a printed picture containing a table according to an embodiment of the present application. As shown in fig. 3, the device includes:

a processor 302; and

a memory 301 configured to store machine-readable instructions which, when executed by the processor 302, perform the method for extracting text from a printed picture containing a table disclosed in the embodiments of the present application.

By executing the method disclosed in Example one of the present application, the device of this embodiment can remove interference lines from the table extracted from the picture, so that the text of the picture can be accurately extracted based on the table from which the interference lines have been removed.

Example four

This embodiment of the application discloses a storage medium storing a computer program which, when executed by a processor, performs the method for extracting text from a printed picture containing a table disclosed in the first aspect of the application.

By executing the method disclosed in the embodiments of the present application, the storage medium can remove interference lines from the table extracted from the picture, so that the text of the picture can be accurately extracted based on the table from which the interference lines have been removed.

In the embodiments disclosed in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.

The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a positioning base station, or a network device) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

The above embodiments are merely examples of the present application and are not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims

1. A method for extracting text from a printed picture containing a table, characterized by comprising the following steps:

2. The method as claimed in claim 1, wherein after calculating the horizontal projection integral of each of the horizontal lines and/or the horizontal projection integral of each of the vertical lines, and before removing the horizontal lines whose projection integrals are smaller than the first preset distance threshold and/or the vertical lines whose projection integrals are smaller than the second preset distance threshold from the plurality of horizontal lines and/or the plurality of vertical lines to obtain the table of the binarized picture, the method further comprises:

3. The method as claimed in claim 1, wherein identifying a plurality of horizontal lines and/or a plurality of vertical lines in the binarized picture according to the structuring element, the erosion algorithm, and the dilation algorithm comprises:

4. The method as claimed in claim 3, wherein the structuring element is determined from the number of rows and columns of the binarized picture by the following formula:

S = COLS // SCALE or S = ROWS // SCALE;

5. The method according to claim 1, wherein the calculation formula for calculating the gray value of each pixel point according to the RGB values of each pixel point of the to-be-processed picture is:

H = 0.3*R + 0.59*G + 0.11*B;

6. The method as claimed in claim 1, wherein after calculating the gray value of each pixel point according to the RGB value of each pixel point of the picture to be processed, and before sequentially comparing the gray value of each pixel point with the preset threshold and converting it into 0 or 255 according to the comparison result so as to convert the picture to be processed into the binarized picture, the method further comprises:

calculating the preset threshold according to the formula T = B - C;

7. The method as claimed in claim 1, wherein deleting the table from the binarized picture while retaining the text content of the binarized picture comprises:

8. An apparatus for extracting text from a printed picture containing a table, characterized in that the apparatus comprises:

9. A device for extracting text from a printed picture containing a table, the device comprising:

a processor; and

a memory configured to store machine-readable instructions which, when executed by the processor, perform the method for extracting text from a printed picture containing a table according to any one of claims 1 to 7.

10. A storage medium storing a computer program which, when executed by a processor, performs the method for extracting text from a printed picture containing a table according to any one of claims 1 to 7.