CN111753830A - Job image correction method and computing device - Google Patents
- Publication number: CN111753830A
- Application number: CN202010573519.4A
- Authority
- CN
- China
- Prior art keywords
- image
- column
- area
- character
- rectangular
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Input (AREA)
Abstract
The invention discloses a job image correction method suitable for execution in a computing device, comprising the following steps: acquiring a job image to be processed, and identifying each text connected domain in the image; dividing one or more text connected domains whose leftmost abscissas are close to one another into the same group; determining, according to the resulting group or groups, whether the job image has a single-column or multi-column layout, together with the corresponding column areas; and, for each column area, determining one or more standard connected domains whose width meets a predetermined condition, and performing image correction on the column area based on the text lines of the standard connected domains. The invention also discloses a computing device for executing the method.
Description
Technical Field
The present invention relates to the field of image processing, and in particular, to a method and a computing device for correcting a job image.
Background
Job (homework) pictures have certain particularities. They are generally laid out in one column, two columns, or even three or four columns, and each column area has a title. There are also special cases, such as a job picture that is mainly single-column but contains a local two-column section. Moreover, a job picture is generally obtained by the user from a workbook, and the workbook may have a certain thickness, so that the photographed image is distorted to some degree: for example, the content in the upper part of the page is distorted downward and the content in the lower part of the page is distorted upward. In addition, some regions of a job picture contain little text, for example where geometric figures, cartoon characters, answer areas, and similar content appear in a question. It is therefore necessary to provide a method for analyzing and correcting a job picture and recognizing its layout, so that each question area can be determined.
Disclosure of Invention
In view of the above, the present invention proposes a job image correction method and a computing device that seek to solve, or at least alleviate, the problems described above.
According to an aspect of the present invention, there is provided a job image correction method adapted to be executed in a computing device, the method comprising the steps of: acquiring a job image to be processed, and identifying each text connected domain in the image; dividing one or more text connected domains whose leftmost abscissas are close to one another into the same group; determining, according to the resulting group or groups, whether the job image has a single-column or multi-column layout, together with the corresponding column areas; and, for each column area, determining one or more standard connected domains whose width meets a predetermined condition, and performing image correction on the column area based on the text lines of the standard connected domains.
Optionally, the job image correction method according to the present invention further comprises the step of: for a multi-column layout, performing image correction on the transition area between adjacent column areas according to the image correction results of the two adjacent column areas.
Optionally, in the job image correction method according to the present invention, the standard connected domains are two in number: the uppermost and the lowermost of the text connected domains whose widths meet the predetermined condition.
Optionally, in the job image correction method according to the present invention, the step of performing image correction on the column area based on the text lines of the standard connected domains includes: calculating a transformation formula for each of the two standard connected domains from its text line and the image horizontal line, and correcting the column area according to the two transformation formulas thus obtained.
Optionally, in the job image correction method according to the present invention, the step of determining, according to the resulting group or groups, whether the job image has a single-column or multi-column layout and the corresponding column areas includes: if the number of groups is 1, determining that the job image has a single-column layout, the corresponding column area being the area between the minimum abscissa and the maximum abscissa of all text connected domains in the job image; if the number of groups is not 1, generating a rectangular area for each group and performing de-duplication on the rectangular areas, the rectangular areas remaining after de-duplication being the column areas.
Optionally, in the job image correction method according to the present invention, the left and right boundaries of each rectangular area are the leftmost and rightmost boundaries of all text connected domains in the corresponding group, and its upper and lower boundaries are the upper and lower boundaries of the job image.
Optionally, in the job image correction method according to the present invention, the step of performing de-duplication on the rectangular areas includes: deleting any rectangular area that overlaps more than one other rectangular area; then, among the remaining rectangular areas, deleting any rectangular area contained in another rectangular area, and, of any two intersecting rectangular areas, deleting the one with the smaller width.
Optionally, in the job image correction method according to the present invention, before identifying each text connected domain in the image, the method further includes: converting the job image into a binary image, and performing morphological dilation on the binary image so that adjacent characters in the same line are connected.
Optionally, in the job image correction method according to the present invention, the text connected domain is a connected domain of printed text.
Optionally, in the job image correction method according to the present invention, the leftmost abscissas being close means that the difference between the leftmost abscissas of the text connected domains is less than a first percentage of the page width.
Optionally, the job image correction method according to the present invention further comprises the steps of: detecting a longitudinal straight line in the job image, and longitudinally correcting the job image according to the included angle between the longitudinal straight line and the image vertical line; and/or detecting a transverse straight line in the job image, and transversely correcting the job image according to the included angle between the transverse straight line and the image horizontal line.
Optionally, in the job image correction method according to the present invention, the longitudinal straight line is a straight line whose included angle with the image vertical line is smaller than a predetermined angle and whose height is greater than a second percentage of the image height, or a longitudinal frame line of the job image; the transverse straight line is a straight line whose included angle with the image horizontal line is smaller than the predetermined angle and whose height is greater than the second percentage of the image height, or a transverse frame line of the job image.
Optionally, in the job image correction method according to the present invention, the width meeting the predetermined condition means that the width is 75% or more of the column width, the first percentage is 8%, the second percentage is 50%, and the predetermined angle is 15 degrees.
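The longitudinal-line criterion above can be illustrated with a minimal sketch. It assumes that line segments have already been detected by some other means and are given as hypothetical `(x1, y1, x2, y2)` tuples in image coordinates; the function names are illustrative only and do not come from the invention. The defaults use the example values of 15 degrees and 50%.

```python
import math

def angle_to_vertical(x1, y1, x2, y2):
    """Unsigned angle, in degrees, between a segment and the image vertical line."""
    return math.degrees(math.atan2(abs(x2 - x1), abs(y2 - y1)))

def is_longitudinal_line(segment, image_height, max_angle=15.0, min_height_ratio=0.5):
    """Check a segment against the two example thresholds for a longitudinal line.

    segment: hypothetical (x1, y1, x2, y2) tuple from an upstream line detector.
    """
    x1, y1, x2, y2 = segment
    tall_enough = abs(y2 - y1) > min_height_ratio * image_height
    steep_enough = angle_to_vertical(x1, y1, x2, y2) < max_angle
    return tall_enough and steep_enough
```

A segment that passes this check yields, through `angle_to_vertical`, the included angle that the longitudinal correction would be based on. The transverse case can be sketched analogously against the image horizontal line.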
Optionally, in the job image correction method according to the present invention, the method further includes a step of segmenting the question areas within each column area: for each column area, determining the first text connected domain of each line of text in the area; determining question-number lines in sequence from top to bottom based on the format, content, and abscissa position of the first character of each determined text connected domain; and determining the area of each question according to the ordinate positions of two adjacent question-number lines, and segmenting out each question area for storage.
Optionally, in the job image correction method according to the present invention, the rules for determining a question-number line include: the question number is a Chinese numeral or an Arabic numeral; question numbers at the same level have the same character format, consecutive numbering, the same or similar abscissas, and the same punctuation after the number.
Optionally, the job image correction method according to the present invention further comprises the step of: assigning a line in which a picture is located to the question area of the nearest question-number line above the picture.
According to yet another aspect of the present invention, there is provided a computing device comprising: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs when executed by the processors implement the steps of the job image correction method as described above.
According to still another aspect of the present invention, there is provided a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, implement the steps of the job image correction method as described above.
According to the technical solution of the invention, one or more groups are determined from the leftmost coordinates of the text connected domains in the job image, and whether the job image is a single-column or multi-column image is determined by analyzing the groups. The column area of a single-column image is the whole job area, and the column areas of a multi-column image are the areas of the individual columns. Standard connected domains of sufficient width are then selected from each column area, and the image is corrected based on the text lines of the standard connected domains and the image horizontal line. The invention can accurately identify the layout of a complex job image and correct a distorted job image, which facilitates subsequent processing such as question content recognition and question segmentation.
In addition, when image correction is carried out, from the text connected domains of sufficient width in each column area (for example, a width greater than or equal to 75% of the column width), the two with the largest and the smallest ordinate are selected as standard connected domains. A transformation formula is calculated from the text line of each of the two connected domains and the horizontal line; the upper part of the picture is corrected layer by layer downward according to the upper transformation formula, the lower part is corrected layer by layer upward according to the lower transformation formula, and the middle line is assumed by default to have the least distortion. By correcting layer by layer from the two ends (top and bottom) toward the middle, the invention improves image correction accuracy and brings the corrected picture closer to a flat printed layout.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which are indicative of various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to be within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout this disclosure, like reference numerals generally refer to like parts or elements.
FIG. 1 shows a block diagram of a computing device 100, according to one embodiment of the invention;
FIG. 2 shows a flow diagram of a job image correction method 200 according to one embodiment of the present invention;
FIG. 3 shows a schematic diagram of a plurality of text connected domains in a job image according to another embodiment of the present invention;
FIG. 4 illustrates a schematic diagram of a plurality of grouped rectangles in a job image, according to one embodiment of the invention;
FIGS. 5 and 6 each show a schematic diagram of the interrelationship between grouping rectangles in a job image, according to one embodiment of the present invention;
FIG. 7 shows a schematic diagram of the finally demarcated column areas of a job image according to one embodiment of the present invention;
FIG. 8 illustrates a flow diagram of a job title segmentation method 800, according to one embodiment of the present invention; and
FIG. 9 shows a flow diagram of a job title segmentation method according to another embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 is a block diagram of a computing device 100 according to one embodiment of the invention. In a basic configuration 102, computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level one cache 110 and a level two cache 112, a processor core 114, and registers 116. The example processor core 114 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 106 may include an operating system 120, one or more applications 122, and program data 124. In some embodiments, application 122 may be arranged to operate with program data 124 on an operating system. In the computing device 100 according to the present invention, the program data 124 includes instructions for performing the job image correction method 200 and/or the job title segmentation method 800.
Computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. The example output device 142 includes a graphics processing unit 148 and an audio processing unit 150. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 158. An example communication device 146 may include a network controller 160, which may be arranged to facilitate communications with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically embody computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or dedicated-line network, and various wireless media such as acoustic, Radio Frequency (RF), microwave, Infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
Computing device 100 may also include a storage interface bus 134. The storage interface bus 134 enables communication from the storage devices 132 (e.g., removable storage 136 and non-removable storage 138) to the basic configuration 102 via the bus/interface controller 130. At least a portion of the operating system 120, applications 122, and data 124 may be stored on removable storage 136 and/or non-removable storage 138, and loaded into system memory 106 via storage interface bus 134 and executed by the one or more processors 104 when the computing device 100 is powered on or the applications 122 are to be executed.
Applications 122 execute on operating system 120; that is, operating system 120 provides various interfaces for operating hardware devices (e.g., storage device 132, output device 142, peripheral interface 144, and communication device 146) and also provides an environment for application context management (e.g., memory space management and allocation, interrupt handling, process management, etc.). The application 122 utilizes the interfaces and environment provided by the operating system 120 to control the computing device 100 to perform the corresponding functions. In some implementations, some applications 122 also provide interfaces, so that other applications 122 may invoke these interfaces to implement their functionality.
Computing device 100 may be implemented as a server, such as a file server, a database server, an application server, a web server, etc., or as part of a small-form-factor portable (or mobile) electronic device, such as a cellular telephone, a Personal Digital Assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application-specific device, or a hybrid device that includes any of the above functions. Computing device 100 may also be implemented as a personal computer, including both desktop and notebook computer configurations. In some embodiments, the computing device 100 is configured to perform the job image correction method 200 and/or the job title segmentation method 800.
FIG. 2 shows a flow diagram of a job image correction method 200 according to one embodiment of the present invention. The method 200 is executed in a computing device, such as the computing device 100, to perform distortion correction on a job image.
As shown in FIG. 2, the method begins at step S210. In step S210, a job image to be processed is acquired, and each text connected domain in the image is identified.
The job image may be obtained in any manner, such as by photographing or scanning, which is not limited in the present invention. According to one embodiment, before identifying each text connected domain in the image, the method may further include the steps of: converting the job image into a binary image, and performing morphological dilation on the binary image so that adjacent characters in the same line are connected. The extent of the dilation is chosen so that adjacent characters in the same line become connected. Two characters separated by several blanks or spaces generally belong to two different connected domains.
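The dilation and connected-domain identification described above can be illustrated with a minimal pure-Python sketch. It assumes the binary image is a list of rows of 0/1 values, uses a purely horizontal dilation window, and labels regions by 4-connectivity; the window shape, the `radius` parameter, and both function names are assumptions made for illustration and are not specified by the invention.

```python
from collections import deque

def dilate_horizontal(binary, radius):
    """Set a pixel if any pixel within `radius` columns on the same row is set."""
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            lo, hi = max(0, x - radius), min(width, x + radius + 1)
            if any(binary[y][lo:hi]):
                out[y][x] = 1
    return out

def connected_boxes(binary):
    """Return (left, top, right, bottom) boxes of 4-connected foreground regions."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search over one region, tracking its bounding box.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            left, top, right, bottom = x0, y0, x0, y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

With a small radius, characters separated by a one-pixel gap merge into one connected domain, while characters separated by a wider gap (several spaces) remain in different connected domains.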
Furthermore, the invention may also recognize the printed text areas in the job image and identify the text connected domains within them, so that the identified text connected domains are connected domains of printed text. If a continuous piece of text contains only printed text (e.g., the question part) or both printed and handwritten text (e.g., a fill-in-the-blank question in which the user has filled in answers), the piece of text is a text connected domain. If a continuous piece of text contains only handwritten text (e.g., the user's answer in a blank area), it is not identified as a text connected domain.
Here, the printed text area may be recognized before morphological dilation, in which case only the printed text area is dilated. Alternatively, the recognition may be performed after morphological dilation, in which case both the printed text area and the handwritten text area are dilated. The invention does not limit the order of these steps.
It should be understood that the present invention may train a classification model capable of distinguishing printed text areas from handwritten text areas, using as the training set a plurality of images in which the printed and handwritten text areas have been labeled. The structure and parameters of the classification model can be set by those skilled in the art as needed, and the present invention is not limited in this respect.
FIG. 3 is a schematic diagram of the text connected domains in a job image. The job image includes a lesson heading at the top and two columns of questions below it; the questions include printed parts, handwritten parts answered by the user, and pictures in various formats in the question stems. The lesson heading and some of the question stems are purely printed and are identified as text connected domains; some fill-in-the-blank questions contain both printed and handwritten text and are also identified as text connected domains; the user's answers in blank areas and the pictures in the questions are identified as non-text connected domains.
Subsequently, in step S220, one or more text connected domains whose leftmost abscissas are close to one another are divided into the same group.
Here, the leftmost abscissas being close means that the difference between the leftmost abscissas of the text connected domains is less than a first percentage of the page width. The first percentage is, for example, 8%, chosen mainly to tolerate errors and indentation, although the invention is not limited thereto. A coordinate system may be established on the job image, for example with the upper left corner of the picture as the origin, rightward as the X axis, and downward as the Y axis, although the invention is not limited thereto. FIG. 4 is a schematic diagram of a plurality of groups divided according to the text connected domains of FIG. 3, the groups being divided according to the starting position of each text box in the image. Three marker lines running through the job image indicate the starting positions of three groups; text connected domains whose leftmost abscissas lie on or near the same line are divided into the same group.
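The grouping rule can be sketched as follows. This is a minimal illustration that assumes each text connected domain is represented by a hypothetical bounding box `(left, top, right, bottom)`, and that a box joins a group when its left edge is within the tolerance of that group's smallest left edge. The source only states the closeness criterion, so this particular grouping strategy and the function name are assumptions.

```python
def group_by_left_edge(boxes, page_width, first_percentage=0.08):
    """Group bounding boxes whose leftmost abscissas are close to one another.

    boxes: list of (left, top, right, bottom) tuples (assumed representation).
    A box joins the current group if its left edge differs from the group's
    smallest left edge by less than first_percentage * page_width.
    """
    tolerance = first_percentage * page_width
    groups = []
    for box in sorted(boxes, key=lambda b: b[0]):
        if groups and box[0] - groups[-1][0][0] < tolerance:
            groups[-1].append(box)
        else:
            groups.append([box])
    return groups
```

For a page 1000 pixels wide, the tolerance is 80 pixels, so two left-column lines starting at x = 50 and x = 60 fall into one group, while a centered heading starting at x = 300 forms its own group.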
Subsequently, in step S230, whether the job image has a single-column or multi-column layout, and the corresponding column areas, are determined according to the resulting group or groups.
In one implementation, if the number of groups is 1, the job image is determined to have a single-column layout, and the corresponding column area is the area between the minimum abscissa and the maximum abscissa of all text connected domains in the job image. The upper and lower boundaries of the column area are the upper and lower boundaries of the job image.
In another implementation, if the number of groups is not 1, a rectangular area is generated for each group and de-duplication is performed on the rectangular areas; the rectangular areas remaining after de-duplication are the column areas.
Here, the left and right boundaries of each rectangular area are the leftmost and rightmost boundaries of all text connected domains in the corresponding group, and its upper and lower boundaries are the upper and lower boundaries of the job image. That is, all groups are traversed and a rectangle is calculated for each: its top coincides with the top of the picture, its bottom coincides with the bottom of the picture, its left side is the minimum leftmost abscissa of all text connected domains in the group, and its right side is the maximum rightmost abscissa of all text connected domains in the group. FIG. 5 is a schematic diagram of the rectangular areas corresponding to the groups in a job image, in which three grouping rectangles are shown by way of example.
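The rectangle computed for each group can be written down directly. The sketch below assumes bounding boxes of the form `(left, top, right, bottom)` and an image whose top edge is y = 0; the function name is illustrative.

```python
def group_rectangle(group, image_height):
    """Full-height rectangle spanning the horizontal extent of one group.

    group: non-empty list of (left, top, right, bottom) boxes (assumed representation).
    Returns (left, top, right, bottom), with top and bottom on the image borders.
    """
    left = min(box[0] for box in group)
    right = max(box[2] for box in group)
    return (left, 0, right, image_height)
```

Applied to a single group that contains every text connected domain, the same function gives the column area of a single-column layout.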
According to one embodiment, the step of performing de-duplication on the rectangular areas comprises: deleting any rectangular area that overlaps more than one other rectangular area; then, among the remaining rectangular areas, deleting any rectangular area contained in another rectangular area, and, of any two intersecting rectangular areas, deleting the one with the smaller width.
The invention retains every group whose rectangle does not overlap any other grouping rectangle. Where grouping rectangles do overlap, any group whose rectangle overlaps more than one other grouping rectangle is deleted first. As shown in FIG. 5, the grouping rectangle formed by the heading "how much money is saved in the third session" intersects both of the other grouping rectangles and is therefore deleted. In this way, the influence of a roughly centered heading on the layout can be removed. After that, if two intersecting groups remain, the grouping rectangle with the larger width is kept and the one with the smaller width is deleted; if a narrower grouping rectangle is a subset of a wider grouping rectangle, the narrower one is deleted. For example, among the three grouping rectangles in FIG. 6, those with the narrower range are deleted.
After de-duplication operations such as merging and deletion of the rectangular areas, the number of remaining rectangular areas is the number of columns of the job image. For example, if two rectangular areas remain after de-duplication, as shown in FIG. 7, the job image has a two-column layout, and the two corresponding column areas are these two rectangular areas.
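The de-duplication rules can be sketched as below, reading the text literally. The sketch assumes full-height rectangles `(left, top, right, bottom)`, so only their horizontal extents matter, and treats containment as a special case of intersection in which the narrower rectangle is the one deleted. The helper names are illustrative and not taken from the invention.

```python
def overlaps(a, b):
    """True if the horizontal extents of two full-height rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2]

def width(rect):
    return rect[2] - rect[0]

def deduplicate(rects):
    """Apply the two de-duplication steps and return the surviving column rectangles."""
    # Step 1: delete rectangles that overlap more than one other rectangle
    # (for example, a centered heading that spans two columns).
    kept = [
        r for i, r in enumerate(rects)
        if sum(overlaps(r, o) for j, o in enumerate(rects) if j != i) <= 1
    ]
    # Step 2: visit from widest to narrowest; a rectangle that intersects
    # (or is contained in) an already accepted wider rectangle is deleted.
    result = []
    for r in sorted(kept, key=width, reverse=True):
        if not any(overlaps(r, accepted) for accepted in result):
            result.append(r)
    return sorted(result)
```

The number of rectangles returned corresponds to the number of columns; in the two-column example with a centered heading, the heading rectangle is removed in the first step and the two column rectangles survive.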
Subsequently, in step S240, for each column area, one or more standard connected domains whose width satisfies a predetermined condition are determined, and image correction is performed on the column area based on the text lines of the standard connected domains. Here, a text line is the line connecting the character centers of a connected domain.
Generally, the standard connected domains are one or more character connected domains whose width satisfies the predetermined condition. The width satisfies the predetermined condition when, for example, it is 75% or more of the column width, although the condition is not limited to this. A transformation formula is then calculated from the text line of each standard connected domain and the image horizontal line, and the column area is corrected according to the transformation formula together with prior knowledge of the degree of distortion of each text line in a distorted image.
For example, for a two-column image, one or more character connected domains can be selected as standard connected domains from those connected domains in the left column area whose width exceeds 75% of the column width, and the left column area is then corrected according to the text lines of the standard connected domains and the image horizontal line. The right column area can be corrected in the same way.
Further, there may be two standard connected domains in total: the uppermost and the lowermost of the character connected domains whose width satisfies the predetermined condition. In this case, a transformation formula is calculated for each of the two standard connected domains from its text line and the image horizontal line, and the column area is corrected according to the two transformation formulas.
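As a rough illustration of how the standard connected domains might be chosen, the following Python sketch filters the connected domains of one column by the 75% width condition and keeps the uppermost and lowermost of them. The `Box` type, the function name, and the sample coordinates are assumptions made for this example only.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Bounding box of one character connected domain (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int


def select_standard_domains(domains: List[Box], col_left: int, col_right: int,
                            min_ratio: float = 0.75) -> List[Box]:
    """Keep the uppermost and lowermost domains that are wide enough."""
    column_width = col_right - col_left
    wide = [d for d in domains if (d.right - d.left) >= min_ratio * column_width]
    wide.sort(key=lambda d: d.top)
    if len(wide) <= 2:
        return wide
    return [wide[0], wide[-1]]


boxes = [
    Box(100, 10, 900, 40),    # width 800, near the top
    Box(100, 50, 400, 80),    # width 300, too narrow
    Box(50, 400, 950, 430),   # width 900, in the middle
    Box(120, 900, 880, 930),  # width 760, near the bottom
]
chosen = select_standard_domains(boxes, col_left=0, col_right=1000)
```

Here `chosen` holds the domains whose tops are at 10 and 900; the narrow domain is ignored and the middle one is skipped because only the two extremes are kept.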
As described above, the degree of distortion of the job image gradually decreases from the upper and lower ends toward the middle: the upper part of the image is distorted downward and the lower part is distorted upward. The invention therefore selects, from the upper part and the lower part respectively, a character connected domain whose width meets the standard and which lies as close as possible to the corresponding end of the image as a standard connected domain. The job image is then corrected layer by layer from top to bottom, taking the transformation between the horizontal line and the text line of each of the two connected domains as a reference. Here, too, some prior knowledge of how image distortion varies is used.
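The description does not give the transformation formula itself, so the following Python sketch only illustrates one possible reading. It assumes that each text line is given as a list of character centers, that its deviation from the horizontal is obtained by piecewise-linear interpolation between those centers, and that the correction for a point between the two reference lines is a linear blend of the two deviations. The blend stands in for the unspecified prior knowledge, and all names are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) center of one character


def reference_y(centres: List[Point]) -> float:
    """Ordinate of the horizontal line the text line should lie on (mean y)."""
    return sum(y for _, y in centres) / len(centres)


def line_offset(centres: List[Point], x: float) -> float:
    """Vertical deviation of the text line from its horizontal line at abscissa x."""
    pts = sorted(centres)
    ref = reference_y(pts)
    xs = [p[0] for p in pts]
    if x <= xs[0]:
        return pts[0][1] - ref
    if x >= xs[-1]:
        return pts[-1][1] - ref
    k = bisect_right(xs, x)  # xs[k - 1] <= x < xs[k]
    (x0, y0), (x1, y1) = pts[k - 1], pts[k]
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0) - ref


def corrected_y(x: float, y: float, top: List[Point], bottom: List[Point]) -> float:
    """Blend the deviations of the top and bottom reference lines by height."""
    y_top, y_bottom = reference_y(top), reference_y(bottom)
    w = min(1.0, max(0.0, (y - y_top) / (y_bottom - y_top)))
    dy = (1.0 - w) * line_offset(top, x) + w * line_offset(bottom, x)
    return y - dy


top_line = [(0, 98), (50, 104), (100, 98)]        # bulges downward
bottom_line = [(0, 902), (50, 896), (100, 902)]   # bulges upward
middle = corrected_y(50, 500, top_line, bottom_line)
```

With these sample lines the two deviations cancel halfway between the reference lines, so a point in the middle of the column is left where it is, which matches the statement that distortion decreases toward the middle.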
Furthermore, the invention can use three standard connected domains: among the character connected domains whose width satisfies the predetermined condition, the uppermost one, the lowermost one, and one in the middle. The job image is then corrected based on the transformation formulas of the three character connected domains.
According to an embodiment of the present invention, after step S240, the method 200 further comprises the following step: for a multi-column layout, performing image correction on the transition area between adjacent column areas according to the image correction results of the two adjacent column areas. A correction formula is calculated by linear fitting from the transformed point coordinates on the adjacent boundary lines of the adjacent column areas, which completes the transverse correction of the whole job image. For example, the area between the two column areas in fig. 7 is a transition area. Given the coordinates, before and after correction, of each point on the two middle boundary lines, a transformation formula for the points lying between two boundary points on the same horizontal line of the original job image can be obtained by linear fitting from the way those two points are transformed, thereby completing the image correction of the whole transition area.
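A minimal Python sketch of the transition-area step follows. It assumes that, for one horizontal line of the original image, the displacement of the point on each of the two boundary lines is already known from the column corrections, and it interpolates linearly between them. The function name and the sample values are hypothetical.

```python
from typing import Tuple

Shift = Tuple[float, float]  # (dx, dy) displacement produced by the correction


def transition_shift(x: float, left_x: float, left_shift: Shift,
                     right_x: float, right_shift: Shift) -> Shift:
    """Displacement of a point at abscissa x between the two boundary points."""
    t = (x - left_x) / (right_x - left_x)
    dx = left_shift[0] + t * (right_shift[0] - left_shift[0])
    dy = left_shift[1] + t * (right_shift[1] - left_shift[1])
    return (dx, dy)


# Left column ends at x = 480 and was shifted by 4 px; right column starts at
# x = 520 and was shifted by -4 px on the same horizontal line.
shift = transition_shift(490, 480, (0.0, 4.0), 520, (0.0, -4.0))
```

Applying this per horizontal line gives a displacement for every pixel of the transition area that agrees with both column corrections at the boundaries.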
According to an embodiment of the invention, the method 200 further comprises the step of correcting the job image according to the longitudinal straight lines and/or the transverse straight lines.
The step of correcting according to a longitudinal straight line comprises: detecting a longitudinal straight line in the job image, and longitudinally correcting the job image according to the included angle between the longitudinal straight line and the image vertical line. The longitudinal straight line is a straight line (usually a frame line of the image) whose included angle with the image vertical line is smaller than a predetermined angle and whose height is greater than a second percentage of the image height, or it is a longitudinal frame line of the job image. The central column-dividing line of the image is preferred; if there is no column-dividing line, the frame line is used.
The step of correcting according to a transverse straight line comprises: detecting a transverse straight line in the job image, and transversely correcting the job image according to the included angle between the transverse straight line and the image horizontal line. The transverse straight line is a straight line whose included angle with the image horizontal line is smaller than the predetermined angle and whose straight line height is greater than the second percentage of the image height, or it is a transverse frame line of the job image.
In short, longitudinal inclination of the job image is corrected according to a vertical dividing line or image frame line, and transverse inclination is corrected according to a horizontal dividing line or image frame line. Optionally, the predetermined angle is 15 degrees and the second percentage is 50%, although they are not limited thereto.
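The selection of a longitudinal straight line can be sketched in Python as follows. The sketch assumes that line segments have already been detected by some other means and are given as end-point coordinates. Choosing the candidate closest to the image center is only one way to express the stated preference for the central column-dividing line, and the function names are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def angle_to_vertical(seg: Segment) -> float:
    """Signed angle in degrees between the segment and the image vertical line."""
    x1, y1, x2, y2 = seg
    dx, dy = x2 - x1, y2 - y1
    if dy < 0:  # orient the segment from top to bottom
        dx, dy = -dx, -dy
    return math.degrees(math.atan2(dx, dy))


def pick_longitudinal_line(segments: List[Segment], image_width: float,
                           image_height: float, max_angle: float = 15.0,
                           min_height_ratio: float = 0.5) -> Optional[Segment]:
    """Return the qualifying segment nearest the image center, or None."""
    candidates = [
        s for s in segments
        if abs(angle_to_vertical(s)) < max_angle
        and abs(s[3] - s[1]) > min_height_ratio * image_height
    ]
    centre_x = image_width / 2
    return min(candidates,
               key=lambda s: abs((s[0] + s[2]) / 2 - centre_x),
               default=None)


segments = [(500, 0, 520, 900), (10, 0, 10, 950), (100, 100, 600, 400)]
line = pick_longitudinal_line(segments, image_width=1000, image_height=1000)
skew = angle_to_vertical(line)  # angle by which the image would be rotated back
```

The transverse case is symmetric: the angle is measured against the image horizontal line instead.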
With this, the layout determination and the distortion and inclination correction of the job image are complete. On this basis, individual question areas can be further segmented from the job image so that each question can be stored separately.
FIG. 8 shows a flow diagram of a job question segmentation method 800 according to one embodiment of the present invention. Method 800 is performed in a computing device, such as computing device 100. As shown in fig. 8, the method begins at step S810.
In step S810, for each column area, the first character connected domain of each line of text in the area is determined, forming a vertically arranged set of connected domains.
Subsequently, in step S820, the question-number lines are determined in order from top to bottom, based on the format, content, and abscissa position of the first several characters of each determined character connected domain.
The rules for identifying a question-number line include: the question number is a Chinese numeral or an Arabic numeral; question numbers of the same level have the same character format, consecutive numbering, the same or similar abscissa, and the same punctuation after the number. In addition, question numbers form a tree structure: a specific question carries a question number of the lowest level, and a question number above the lowest level denotes a question type. A line containing a picture (such as a geometric figure or a cartoon character) is not necessarily a question-number line, and is generally marked as belonging to the question area of the nearest question-number line above the picture. The question number is usually located at the first character or, for a number in parentheses, the second character; the first several characters may therefore be the first character, the first two characters, or the first three characters.
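The same-level rules can be illustrated with a short Python sketch. It assumes that the text of each line has already been recognized, handles only the Chinese numerals one to ten, and borrows an 8% page-width tolerance for "similar abscissa" purely as an example value. The pattern, the type, and the function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Only the numerals 1-10 are covered in this sketch.
CHINESE_DIGITS = {c: i for i, c in enumerate("一二三四五六七八九十", start=1)}
PATTERN = re.compile(r"^\s*([(（]?)(\d+|[一二三四五六七八九十])([)）]?)([.、．:：]?)")


class QuestionNumber(NamedTuple):
    style: str       # "chinese" or "arabic"
    bracketed: bool  # number written in parentheses
    value: int
    suffix: str      # closing bracket and punctuation after the number
    x: int           # leftmost abscissa of the line


def parse_question_number(text: str, x: int) -> Optional[QuestionNumber]:
    """Read a candidate question number from the first characters of a line."""
    m = PATTERN.match(text)
    if not m:
        return None
    opening, digits, closing, punct = m.groups()
    if digits in CHINESE_DIGITS:
        style, value = "chinese", CHINESE_DIGITS[digits]
    else:
        style, value = "arabic", int(digits)
    return QuestionNumber(style, bool(opening and closing), value, closing + punct, x)


def is_next_same_level(prev: QuestionNumber, cur: QuestionNumber,
                       page_width: int, tolerance: float = 0.08) -> bool:
    """Same format, consecutive value, similar abscissa, same punctuation."""
    return (prev.style == cur.style
            and prev.bracketed == cur.bracketed
            and prev.suffix == cur.suffix
            and cur.value == prev.value + 1
            and abs(cur.x - prev.x) < tolerance * page_width)


a = parse_question_number("1. Calculate", 40)
b = parse_question_number("2. Fill in the blanks", 42)
c = parse_question_number("(1) First part", 60)
```

A candidate that parses but fails `is_next_same_level` against every known level would be treated as an ordinary number, in line with the fault-tolerance idea in the description.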
The segmentation of individual questions in the job image can be understood with reference to fig. 9. Each single-column area is processed separately: it is cut transversely along the character connected domains into a group of vertically arranged blocks, each of which may be a text block or a picture block. The blocks are then identified and processed one by one from top to bottom. Whether a block starts a new question area is decided mainly from the question number, and the analysis of question numbers takes into account logic such as the level hierarchy and the continuity of the numbering.
Specifically, a block to be processed is obtained first. If the block is not a text block (for example, it is a picture block), it is added to the block set of the question currently being identified. If the block is a text block, it is determined whether the first several characters of the text contain a number; if not, the text block is likewise added to the block set of the question currently being identified.
If the text block contains a number, it is determined whether the format of the number is consistent with that of the question number in the first line of the question currently being identified. If so, identification of the current question area ends, and the text block is determined to be the question-number line of the next question. If not, there are generally two possibilities: either the current question has ended and a new upper-level question type begins, or the block still belongs to the current question.
It is therefore determined whether an upper-level question number exists. If one exists and the format of the number is consistent with it, identification of the current question area ends, the current level is closed, and the text block is determined to be the question-number line of a new upper-level question type, that is, the starting block of the next item at that level. If an upper-level question number exists but the format of the number is not consistent with it, the text block is added to the block set of the question currently being identified. If no upper-level question number exists, it is determined whether question numbers of other levels exist and whether the format of the number is consistent with any of them. If it is, identification of the current question still ends, and the block is the starting block for that level of question number. If no question number of another level exists, the current block is added to the block set of the question being identified.
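The block-by-block decision can be sketched in Python as below. This is a simplified reading rather than the disclosed procedure: it assumes that each block already carries a format key for any leading number, and that the set of formats used by known question-number levels (`level_formats`) is supplied from the earlier question-number analysis. All names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Set


class Block(NamedTuple):
    kind: str                            # "text" or "picture"
    number_format: Optional[str] = None  # e.g. "arabic-dot"; None if no leading number


def group_blocks(blocks: List[Block], level_formats: Set[str]) -> List[List[int]]:
    """Group block indices into questions, scanning from top to bottom."""
    groups: List[List[int]] = []
    current: List[int] = []
    current_format: Optional[str] = None
    for index, block in enumerate(blocks):
        # A text block starts a new question when its leading number matches
        # the current question's format or the format of another known level.
        starts_new = (
            block.kind == "text"
            and block.number_format is not None
            and (block.number_format == current_format
                 or block.number_format in level_formats)
        )
        if starts_new:
            if current:
                groups.append(current)
                current = []
            current_format = block.number_format
        # Pictures, unnumbered text, and unrecognized numbers stay with the
        # question currently being identified.
        current.append(index)
    if current:
        groups.append(current)
    return groups


blocks = [
    Block("text", "chinese-comma"),  # question type heading
    Block("text", "arabic-dot"),     # question 1
    Block("text"),                   # continuation of question 1
    Block("picture"),                # figure belonging to question 1
    Block("text", "arabic-dot"),     # question 2
    Block("text", "roman"),          # unknown format: treated as ordinary text
    Block("text", "chinese-comma"),  # next question type heading
]
groups = group_blocks(blocks, {"chinese-comma", "arabic-dot"})
```

The block with the unknown `"roman"` format stays inside question 2, which mirrors the rule that a number matching no known level is added to the question currently being identified.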
The invention does not rely solely on recognizing individual numbers; it also provides checking and fault tolerance for question-number information through preset logical judgments. For example, if a candidate question number is found but does not conform to the rules for same-level question numbers in its context, it is treated as an ordinary number rather than as question-number information.
Subsequently, in step S830, the area of each question is determined according to the ordinate positions of two adjacent question-number lines, and each question area is segmented and stored.
After the question areas are confirmed, each question in the single-column image is segmented to obtain an individual question image, which is then stored and distributed.
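Step S830 can be illustrated with a few lines of Python. The sketch assumes that the top ordinate of every question-number line in a column is known and that the last question extends to the bottom of the column; the function name and the sample values are hypothetical.

```python
from typing import List, Tuple


def question_areas(number_line_tops: List[int], column_bottom: int) -> List[Tuple[int, int]]:
    """Return (top, bottom) ordinates of each question area in one column."""
    tops = sorted(number_line_tops)
    bottoms = tops[1:] + [column_bottom]
    return list(zip(tops, bottoms))


areas = question_areas([120, 480, 900], column_bottom=1400)
```

Each `(top, bottom)` pair, combined with the left and right boundaries of the column area, gives the rectangle to crop and store as one question image.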
According to the technical solution of the invention, the layout of the job image can be determined accurately and distortion correction, inclination correction, and the like can be completed, so that the job image is restored as closely as possible to the original layout. At the same time, each individual question image is extracted accurately, which facilitates subsequent work such as image recognition, distribution of individual questions, grading of individual questions, and score summarization. The invention offers high detection accuracy, a small amount of calculation, and high computing performance, and supports fast, real-time batch processing.
A9. The method of any one of A1-A8, wherein the character connected domain is a connected domain of printed characters.
A10. The method of any one of A1-A9, wherein the leftmost abscissas being similar means that the difference between the leftmost abscissas of the character connected domains is less than a first percentage of the page width.
A11. The method of any one of A1-A10, further comprising the steps of: detecting a longitudinal straight line in the job image, and longitudinally correcting the job image according to the included angle between the longitudinal straight line and the image vertical line; and/or detecting a transverse straight line in the job image, and transversely correcting the job image according to the included angle between the transverse straight line and the image horizontal line.
A12. The method of A11, wherein the longitudinal straight line is a straight line whose included angle with the image vertical line is smaller than a predetermined angle and whose straight line height is greater than a second percentage of the image height, or a longitudinal frame line of the job image; and the transverse straight line is a straight line whose included angle with the image horizontal line is smaller than the predetermined angle and whose straight line height is greater than the second percentage of the image height, or a transverse frame line of the job image.
A13. The method of A12, wherein the width satisfying the predetermined condition is a width of 75% or more of the column width, the first percentage is 8%, the second percentage is 50%, and the predetermined angle is 15 degrees.
A14. The method of any one of A1-A13, further comprising the step of segmenting each question area in each column area: for each column area, determining the first character connected domain of each line of text in the area; determining the question-number lines in order from top to bottom based on the format, content, and abscissa position of the first several characters of each determined character connected domain; and determining the area of each question according to the ordinate positions of two adjacent question-number lines, and segmenting and storing each question area.
A15. The method of A14, wherein the rules for identifying a question-number line include: the question number is a Chinese numeral or an Arabic numeral; and question numbers of the same level have the same character format, consecutive numbering, the same or similar abscissa, and the same punctuation after the number.
A16. The method of A14, further comprising the step of: marking the line in which a picture is located as belonging to the question area of the nearest question-number line above the picture.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, USB flash drives, floppy disks, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to perform the method of the invention according to instructions in said program code stored in the memory.
By way of example, and not limitation, readable media may comprise readable storage media and communication media. Readable storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with examples of this invention. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. Any of the claimed embodiments may be used in any combination, for example, in the following claims.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense with respect to the scope of the invention, as defined in the appended claims.
Claims (10)
1. A method of job image correction, adapted to be executed in a computing device, the method comprising the steps of:
acquiring a job image to be processed, and identifying each character connected domain in the image;
dividing one or more character connected domains with similar leftmost abscissas into the same group;
determining, according to the one or more divided groups, whether the job image has a single-column layout or a multi-column layout, together with the corresponding column areas; and
for each column area, determining one or more standard connected domains whose width satisfies a predetermined condition, and performing image correction on the column area based on the text lines of the standard connected domains.
2. The method of claim 1, further comprising the steps of:
for a multi-column layout, performing image correction on the transition area between adjacent column areas according to the image correction results of the two adjacent column areas.
3. The method of claim 1 or 2, wherein the standard connected domains are two domains, namely the uppermost and the lowermost of the character connected domains whose width satisfies the predetermined condition.
4. The method of claim 3, wherein the step of performing image correction on the corresponding column area based on the text lines of the standard connected domains comprises:
calculating a corresponding transformation formula for each of the two standard connected domains according to its text line and the image horizontal line, and correcting the column area according to the two calculated transformation formulas.
5. The method of any of claims 1-4, wherein the step of determining, according to the one or more divided groups, whether the job image has a single-column layout or a multi-column layout, together with the corresponding column areas, comprises:
if the number of groups is 1, determining that the job image has a single-column layout, the corresponding column area being the area between the minimum abscissa and the maximum abscissa of all character connected domains in the job image;
if the number of groups is not 1, generating a rectangular area for each group and de-duplicating the plurality of rectangular areas, the rectangular areas remaining after de-duplication being the column areas.
6. The method of claim 5, wherein the left and right boundaries of the rectangular area are the leftmost and rightmost boundaries of all character connected domains within the corresponding group, and the upper and lower boundaries are the upper and lower boundaries of the job image.
7. The method of claim 5, wherein the step of de-duplicating the plurality of rectangular regions comprises:
deleting any rectangular area that overlaps more than one other rectangular area;
for the remaining rectangular areas, deleting any rectangular area contained in another rectangular area, and deleting the narrower of two intersecting rectangular areas.
8. The method of any one of claims 1-7, further comprising, before identifying each character connected domain in the image, the step of:
converting the job image into a binary image, and performing morphological dilation on the binary image so that adjacent characters in the same line are connected.
9. A computing device, comprising:
at least one processor; and
a memory storing program instructions;
wherein the processor is configured to perform the method of any one of claims 1-8 according to program instructions stored in the memory.
10. A computer readable storage medium having program instructions stored thereon that are readable by a computing device to cause the computing device to perform the method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010573519.4A CN111753830A (en) | 2020-06-22 | 2020-06-22 | Job image correction method and computing device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010573519.4A CN111753830A (en) | 2020-06-22 | 2020-06-22 | Job image correction method and computing device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111753830A (en) | 2020-10-09 |
Family
ID=72674851
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010573519.4A Pending CN111753830A (en) | 2020-06-22 | 2020-06-22 | Job image correction method and computing device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111753830A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022089196A1 (en) * | 2020-10-27 | 2022-05-05 | 北京字节跳动网络技术有限公司 | Image processing method and apparatus, and electronic device and storage medium |
CN114663902A (en) * | 2022-04-02 | 2022-06-24 | 北京百度网讯科技有限公司 | Document image processing method, device, equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000187705A (en) * | 1998-12-22 | 2000-07-04 | Toshiba Corp | Document reader, document reading method and storage medium |
CN107798321A (en) * | 2017-12-04 | 2018-03-13 | 海南云江科技有限公司 | A kind of examination paper analysis method and computing device |
CN110414529A (en) * | 2019-06-26 | 2019-11-05 | 深圳中兴网信科技有限公司 | Paper information extracting method, system and computer readable storage medium |
WO2019227615A1 (en) * | 2018-06-01 | 2019-12-05 | 平安科技(深圳)有限公司 | Method for correcting invoice image, apparatus, computer device, and storage medium |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111325110B (en) | OCR-based table format recovery method, device and storage medium | |
CN107798321B (en) | Test paper analysis method and computing device | |
CN108304814B (en) | Method for constructing character type detection model and computing equipment | |
CN109829453B (en) | Method and device for recognizing characters in card and computing equipment | |
CN110069767B (en) | Typesetting method based on electronic book, electronic equipment and computer storage medium | |
CN110390269A (en) | PDF document table extracting method, device, equipment and computer readable storage medium | |
US9330331B2 (en) | Systems and methods for offline character recognition | |
US7492366B2 (en) | Method and system of character placement in opentype fonts | |
CN107729865A (en) | A kind of handwritten form mathematical formulae identified off-line method and system | |
WO2021237909A1 (en) | Table restoration method and apparatus, device, and storage medium | |
CN111310426B (en) | OCR-based table format recovery method, device and storage medium | |
JP2004139484A (en) | Form processing device, program for implementing it, and program for creating form format | |
JP4443576B2 (en) | Pattern separation / extraction program, pattern separation / extraction apparatus, and pattern separation / extraction method | |
US6614929B1 (en) | Apparatus and method of detecting character writing area in document, and document format generating apparatus | |
CN111753830A (en) | Job image correction method and computing device | |
CN111340020A (en) | Formula identification method, device, equipment and storage medium | |
CN111582267A (en) | Text detection method, computing device and readable storage medium | |
CN110598196B (en) | Table data extraction method and device without outer frame and storage medium | |
CN114463767A (en) | Credit card identification method, device, computer equipment and storage medium | |
JP5950700B2 (en) | Image processing apparatus, image processing method, and program | |
CN113095320A (en) | License plate recognition method and system and computing device | |
CN117496521A (en) | Method, system and device for extracting key information of table and readable storage medium | |
CN110941972B (en) | Segmentation method and device for characters in PDF document and electronic equipment | |
CN113011131B (en) | Typesetting method based on picture electronic book, electronic equipment and storage medium | |
CN112100978B (en) | Typesetting processing method based on electronic book, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||