CN113887401A - Form identification method and device - Google Patents

Form identification method and device

Info

Publication number
CN113887401A
Authority
CN
China
Prior art keywords
image
line
fitted
text
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111155487.7A
Other languages
Chinese (zh)
Inventor
赵志勇
苏雪峰
李轩
任辉
黄恺
曹润东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN202111155487.7A priority Critical patent/CN113887401A/en
Publication of CN113887401A publication Critical patent/CN113887401A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/20 Image enhancement or restoration by the use of local operators
    • G06T 5/30 Erosion or dilatation, e.g. thinning

Abstract

The application discloses a table recognition method. A first image including a table is first acquired, and the table lines in the first image are determined by an erosion and dilation technique, which prevents noise in the first image from being recognized as table lines. Setting the average height and/or the maximum height of the text included in the first image as parameters of the erosion and dilation technique prevents horizontal and vertical strokes of that text from being misrecognized as table lines. In addition, because a table line obtained by erosion and dilation may be a curve, straight-line fitting is further performed on the table line to obtain a fitted table line; a target table is then drawn according to the fitted table lines and output. In this scheme, the fitted table lines are accurate, and so is the target table drawn from them.

Description

Form identification method and device
Technical Field
The present application relates to the field of image processing, and in particular, to a method and an apparatus for table recognition.
Background
In some scenarios, tables in images need to be identified. Currently, this can be done with a machine learning model. However, such a model must be trained in advance, training it requires a large number of training samples, and when the training samples are insufficient, the trained model cannot accurately recognize the table in an image.
Therefore, a solution to the above problems is urgently needed.
Disclosure of Invention
The technical problem addressed by the present application is how to accurately identify a table in an image; to that end, the application provides a table identification method and apparatus.
In a first aspect, an embodiment of the present application provides a table identification method, where the method includes:
acquiring a first image including a table;
determining the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
performing straight-line fitting on the table lines to obtain fitted table lines;
drawing a target table according to the fitted table lines, and outputting the target table.
Optionally, the method further includes:
determining the text included in the first image using an Optical Character Recognition (OCR) technique;
determining an average height and/or a maximum height of the text.
Optionally, the drawing a target table according to the fitted table lines includes:
identifying cells according to the fitted table lines and an image edge contour processing technique;
drawing the identified cells to obtain the target table.
Optionally, the method further includes:
processing the first image using an OCR technique, and determining a missing boundary line of a cell;
completing the missing boundary line of the cell to obtain a processed cell;
the drawing the identified cells includes:
drawing the processed cells.
Optionally, the acquiring a first image including a table includes:
acquiring a second image including a table, and identifying an oblique line in the second image;
calculating the angle between the oblique line and the horizontal axis;
rotating the second image according to the angle to obtain the first image.
In a second aspect, an embodiment of the present application provides a table identification apparatus, where the apparatus includes:
an acquisition unit, configured to acquire a first image including a table;
a first determination unit, configured to determine the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
a fitting unit, configured to perform straight-line fitting on the table lines to obtain fitted table lines;
a drawing unit, configured to draw a target table according to the fitted table lines;
an output unit, configured to output the target table.
Optionally, the apparatus further includes:
a second determination unit, configured to determine the text included in the first image using an Optical Character Recognition (OCR) technique;
a third determination unit, configured to determine an average height and/or a maximum height of the text.
Optionally, the drawing unit is configured to:
identify cells according to the fitted table lines and an image edge contour processing technique;
draw the identified cells to obtain the target table.
Optionally, the apparatus further includes:
a fourth determination unit, configured to process the first image using an OCR technique and determine a missing boundary line of a cell;
a processing unit, configured to complete the missing boundary line of the cell to obtain a processed cell;
the drawing unit is configured to:
draw the processed cells.
Optionally, the acquisition unit is configured to:
acquire a second image including a table, and identify an oblique line in the second image;
calculate the angle between the oblique line and the horizontal axis;
rotate the second image according to the angle to obtain the first image.
In a third aspect, an embodiment of the present application provides a table identification apparatus, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for:
acquiring a first image including a table;
determining the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
performing straight-line fitting on the table lines to obtain fitted table lines;
drawing a target table according to the fitted table lines, and outputting the target table.
Optionally, the operations further include:
determining the text included in the first image using an Optical Character Recognition (OCR) technique;
determining an average height and/or a maximum height of the text.
Optionally, the drawing a target table according to the fitted table lines includes:
identifying cells according to the fitted table lines and an image edge contour processing technique;
drawing the identified cells to obtain the target table.
Optionally, the operations further include:
processing the first image using an OCR technique, and determining a missing boundary line of a cell;
completing the missing boundary line of the cell to obtain a processed cell;
the drawing the identified cells includes:
drawing the processed cells.
Optionally, the acquiring a first image including a table includes:
acquiring a second image including a table, and identifying an oblique line in the second image;
calculating the angle between the oblique line and the horizontal axis;
rotating the second image according to the angle to obtain the first image.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the table identification method of any implementation of the first aspect.
Compared with the prior art, the embodiment of the application has the following advantages:
An embodiment of the present application provides a table identification method in which a first image including a table is first acquired, and the table lines in the first image are determined by an erosion and dilation technique; it can be understood that erosion and dilation prevent noise in the first image from being recognized as table lines. Setting the average height and/or the maximum height of the text included in the first image as parameters of the erosion and dilation technique prevents horizontal and vertical strokes of that text from being misrecognized as table lines. In addition, because a table line obtained by erosion and dilation may be a curve, straight-line fitting is further performed on the table line to obtain a fitted table line; a target table is then drawn according to the fitted table lines and output. In this scheme, the fitted table lines are accurate, and so is the target table drawn from them.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings can be derived from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flowchart of a table identification method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a table identification apparatus according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a client according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
The inventor of the present application has found through research that a table in an image can currently be recognized with a machine learning model. However, such a model must be trained in advance, training it requires a large number of training samples, and when the training samples are insufficient, the trained model cannot accurately recognize the table in an image.
By way of example: if training samples containing noise were not used when training the machine learning model, the model cannot accurately identify the table when the image to be recognized contains noise. Likewise, if training samples whose table lines are curves were not used, the trained model cannot accurately recognize a table whose lines appear as curves in the image to be recognized.
In order to solve the above problem, embodiments of the present application provide a table identification method and apparatus, which can accurately identify a table in an image.
Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.
Exemplary method
Referring to fig. 1, the figure is a schematic flowchart of a table identification method provided in the embodiment of the present application.
The method provided by the embodiment of the present application may be performed by a first device, where the first device includes, but is not limited to, a terminal device or a server. The terminal device may be a mobile terminal such as a smartphone or a tablet computer, or a device such as a desktop computer.
The method shown in fig. 1 can be implemented, for example, by the following S101 to S104.
S101: a first image including a form is acquired.
The first image in the embodiment of the present application may be any of a screenshot, a scanned image of a printed document, or an image of a Word, PDF, or LaTeX document.
In one implementation of the embodiment of the present application, considering that the table in an image may be tilted, and in order to avoid recognizing a tilted table, the first image may be obtained by processing a second image that includes the table, such that the table in the first image contains no oblique lines. Specifically:
a second image including a table is first acquired, an oblique line in the second image is identified, the angle between the oblique line and the horizontal axis is calculated, and the second image is then rotated according to that angle to obtain the first image. It can be understood that the oblique line in the second image corresponds to a line parallel to the horizontal axis in the first image. The horizontal axis mentioned here may be the horizontal axis of a pre-established rectangular coordinate system.
Considering that the Hough transform can identify lines in an image, in the embodiment of the present application the Hough transform may be used to identify the oblique line in the second image.
In a specific implementation, the second image may be rotated either clockwise or counterclockwise according to the angle. For example, in a rectangular coordinate system, if the angle between the oblique line and the positive direction of the horizontal axis is α, the second image may be rotated clockwise by α or counterclockwise by (180° − α); the specific direction may be determined from the orientation of the characters in the table. If the text in the image obtained after rotating the second image clockwise by α is displayed normally, the clockwise rotation by α is used; if that text is displayed upside down, the second image is instead rotated counterclockwise by (180° − α).
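By way of a non-limiting illustration (the embodiment does not mandate any particular library, and all names and thresholds below are assumptions), the deskew step of S101 could be sketched in Python with OpenCV as follows: the Hough transform finds line segments, the median angle of the near-horizontal segments against the horizontal axis serves as the included angle, and the second image is rotated accordingly.

```python
import cv2
import numpy as np

def deskew(second_image: np.ndarray) -> np.ndarray:
    """Estimate the skew of the table lines and rotate the image upright."""
    gray = cv2.cvtColor(second_image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    # The probabilistic Hough transform returns segments as (x1, y1, x2, y2).
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                            minLineLength=100, maxLineGap=10)
    if lines is None:
        return second_image  # no lines found; assume the image is not skewed
    angles = []
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
        if abs(angle) < 45:  # keep candidates for near-horizontal rules
            angles.append(angle)
    if not angles:
        return second_image
    alpha = float(np.median(angles))
    h, w = gray.shape
    # Rotate about the image center so the table rules become horizontal;
    # whether a further 180-degree flip is needed would be decided from the
    # text orientation, as described above.
    m = cv2.getRotationMatrix2D((w / 2, h / 2), alpha, 1.0)
    return cv2.warpAffine(second_image, m, (w, h),
                          borderValue=(255, 255, 255))
```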
S102: determining the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of the text included in the first image.
In the embodiment of the present application, it is considered that noise may exist in the first image and may be erroneously recognized as a table line, making the table line recognition inaccurate. Erosion and dilation can eliminate noise while still identifying lines in an image. Therefore, in the embodiment of the present application, the table lines in the first image may be determined using an erosion and dilation technique. When identifying the table lines in this way, the parameters of the erosion and dilation technique may be set; the embodiment of the present application does not specifically limit these parameters.
In one example, considering that text may exist in the table in the first image and that the text may contain horizontal or vertical strokes, such strokes could be misrecognized as table lines, making the recognition inaccurate. To address this problem, in one implementation of the embodiment of the present application, the average height and/or the maximum height of the text included in the first image may be used as a parameter of the erosion and dilation technique. Because the height of the text is then taken into account when identifying the table lines, horizontal and vertical strokes of the text are effectively prevented from being misrecognized as table lines, which effectively improves the accuracy of the identified table lines.
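As a non-limiting sketch of this idea in Python with OpenCV, the following code extracts horizontal and vertical table lines by erosion followed by dilation, sizing the structuring elements from the maximum text height (measured beforehand; one way to measure it is sketched after the next paragraph). The scale factor of 3 is an illustrative assumption, not a value prescribed by the embodiment.

```python
import cv2
import numpy as np

def extract_table_lines(first_image: np.ndarray, max_text_height: int):
    gray = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)
    # Invert and binarize so lines and strokes become white on black.
    binary = cv2.adaptiveThreshold(~gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 15, -2)
    # A horizontal kernel longer than any character: erosion removes strokes
    # shorter than the kernel (i.e. text), dilation restores what survives.
    kernel_len = max(3 * max_text_height, 10)
    horiz_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_len, 1))
    horizontal = cv2.dilate(cv2.erode(binary, horiz_kernel), horiz_kernel)
    # Likewise, a vertical kernel taller than the text suppresses vertical strokes.
    vert_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_len))
    vertical = cv2.dilate(cv2.erode(binary, vert_kernel), vert_kernel)
    return horizontal, vertical
```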
In this case, before performing S102, the average height and/or the maximum height of the text included in the first image also needs to be determined. Optical Character Recognition (OCR) can determine the shape of a character in an image by detecting dark and light patterns and then translate that shape into a computer character; in other words, OCR can recognize the text included in the first image. In view of this, in the embodiment of the present application, the text included in the first image may be determined by OCR, and the heights of the recognized text may then be aggregated to obtain the average height and/or the maximum height.
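For illustration only, one possible way to obtain these heights uses the pytesseract OCR wrapper as a stand-in; the embodiment does not name a specific OCR engine, and the function name is an assumption.

```python
import pytesseract
from pytesseract import Output

def text_heights(first_image):
    """Return (average height, maximum height) of the recognized text."""
    data = pytesseract.image_to_data(first_image, output_type=Output.DICT)
    heights = [h for h, txt in zip(data["height"], data["text"]) if txt.strip()]
    if not heights:
        return 0.0, 0
    return sum(heights) / len(heights), max(heights)
```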
S103: and performing straight line fitting on the table line to obtain the fitted table line.
In the embodiment of the present application, it is considered that a table line in the first image may include a curve, so the table line identified in S102 may include a curve as well. For example, in an image taken with a mobile phone, a table line may be curved because of the shooting angle. In view of this, a straight line may be fitted to the table line to obtain a fitted table line. Straight-line fitting here means processing a curve to obtain a straight line; it can be understood that a fitted table line is a line segment.
The embodiment of the present application does not specifically limit how the straight-line fitting is implemented: the least squares method, gradient descent, or the Gauss-Newton and Levenberg-Marquardt algorithms may all be used.
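As a non-limiting example of the least-squares option, cv2.fitLine with the DIST_L2 criterion performs an ordinary least-squares fit over the pixels of one detected line; the fitted infinite line is then clipped to the extent of those pixels so that the result is a segment, as noted above. The function name is an assumption for the sketch.

```python
import cv2
import numpy as np

def fit_table_line(line_mask: np.ndarray):
    """line_mask: binary image containing the pixels of a single table line."""
    ys, xs = np.nonzero(line_mask)
    pts = np.column_stack((xs, ys)).astype(np.float32)
    # DIST_L2 minimizes the sum of squared distances: a least-squares fit.
    vx, vy, x0, y0 = cv2.fitLine(pts, cv2.DIST_L2, 0, 0.01, 0.01).ravel()
    # Project the pixels onto the fitted direction to find the two endpoints.
    t = (xs - x0) * vx + (ys - y0) * vy
    p1 = (int(x0 + t.min() * vx), int(y0 + t.min() * vy))
    p2 = (int(x0 + t.max() * vx), int(y0 + t.max() * vy))
    return p1, p2  # the fitted table line as a segment
```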
S104: and drawing according to the fitted table line to obtain a target table, and outputting the target table.
After the fitted table lines are obtained, the target table can be drawn from them. It can be understood that, because the fitted table lines are accurate, the table drawn according to them is also accurate.
It should be noted that the fitted table lines include horizontal lines and vertical lines, and these are lines with width: a horizontal line with width and a vertical line with width intersect in many points, whereas in an accurate table a horizontal line and a vertical line intersect in exactly one point, namely a vertex of a rectangular cell. Therefore, the horizontal and vertical fitted lines are merged to obtain a rough frame of the table, for example a binary image containing the frame. To draw an accurate target table, the cells in the table are further identified, and the identified cells are then drawn to obtain the target table.
In the embodiment of the present application, the cells can be identified from the fitted table lines using an image edge contour processing technique. It can be understood that the fitted table lines indicate the approximate location of each cell, and at that approximate location the cell can be identified accurately by edge contour processing; for example, the binary image containing the table frame is processed with the edge contour technique to obtain the cells.
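A minimal sketch of this cell-identification step, again assuming OpenCV: the horizontal and vertical line masks are merged into the table frame, and the contours of the enclosed regions yield one bounding box per cell. The size bounds are illustrative assumptions.

```python
import cv2
import numpy as np

def find_cells(horizontal: np.ndarray, vertical: np.ndarray):
    grid = cv2.bitwise_or(horizontal, vertical)  # binary image of the frame
    # Invert so each enclosed cell becomes a white blob with its own contour.
    contours, _ = cv2.findContours(cv2.bitwise_not(grid),
                                   cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    cells = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        # Drop specks, and drop the large background blob outside the table
        # (which spans the full image width or height).
        if 10 < w < grid.shape[1] and 10 < h < grid.shape[0]:
            cells.append((x, y, w, h))
    return cells
```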
Consider also that the table in the first image may be missing boundary lines: for example, a left boundary line, a right boundary line, or an upper or lower boundary line may be absent. Therefore, in one implementation of this embodiment, after the cells are identified with the image edge contour technique, they may be further optimized to complete the missing boundary lines.
As described above, OCR can identify the text included in the first image, and whether a cell lacks a boundary can be determined from the text it contains. For example, if a text has no corresponding left boundary, it can be determined that the cell containing the text lacks its left boundary; the same reasoning applies to a missing right, upper, or lower boundary line.
Therefore, in the embodiment of the present application, the first image may be processed with OCR to determine the missing boundary lines of the cells, and the missing boundary lines may then be completed to obtain the processed cells. Specifically: the text in the first image is recognized by OCR, it is determined whether the cell containing each text lacks a boundary line, and if it does, the missing boundary line is completed to obtain a complete cell.
By way of example:
the method comprises the steps of recognizing text A by using an OCR technology, wherein the text A can comprise one or more characters, the text A has no corresponding left boundary line, so that the cell in which the text A is positioned can be determined to have no left boundary line, and therefore, the left boundary line of the cell in which the text A is positioned can be completely supplemented in combination with other boundary lines of the cell in which the text A is positioned.
After the missing boundary lines are completed to obtain the processed cells, the processed cells can be drawn to obtain the target table. It can be understood that a target table drawn in this way contains no cells with missing boundaries.
In a specific implementation, when the target table is output, a table in a specific format may be output. The embodiment of the present application does not limit the format, which may be, for example, the Excel format.
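For example, once the identified cells have been mapped to row and column indices (the mapping itself, e.g. sorting the cell boxes by y and then x, is outside this sketch), the target table could be written out in Excel format with the openpyxl library, used here purely as an illustration:

```python
from openpyxl import Workbook

def export_table(cells_with_text, path):
    """cells_with_text maps (row, column) -> recognized text of that cell."""
    wb = Workbook()
    ws = wb.active
    for (row, col), text in cells_with_text.items():
        ws.cell(row=row + 1, column=col + 1, value=text)  # openpyxl is 1-based
    wb.save(path)

export_table({(0, 0): "Name", (0, 1): "Score"}, "target_table.xlsx")
```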
In addition, in the embodiment of the present application, the text in the target table may be drawn as well, so as to obtain a target table that includes the text; this table may then be output. The embodiment of the present application does not limit how the text is drawn into the target table.
Exemplary device
Based on the method provided by the above embodiment, the embodiment of the present application further provides an apparatus, which is described below with reference to the accompanying drawings.
Referring to fig. 2, the figure is a schematic structural diagram of a table identification apparatus according to an embodiment of the present application. The apparatus 200 may specifically include, for example: an acquisition unit 201, a first determination unit 202, a fitting unit 203, a rendering unit 204, and an output unit 205.
An acquisition unit 201, configured to acquire a first image including a table;
a first determination unit 202, configured to determine the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
a fitting unit 203, configured to perform straight-line fitting on the table lines to obtain fitted table lines;
a drawing unit 204, configured to draw a target table according to the fitted table lines;
an output unit 205, configured to output the target table.
Optionally, the apparatus further includes:
a second determination unit, configured to determine the text included in the first image using an Optical Character Recognition (OCR) technique;
a third determination unit, configured to determine an average height and/or a maximum height of the text.
Optionally, the drawing unit 204 is configured to:
identify cells according to the fitted table lines and an image edge contour processing technique;
draw the identified cells to obtain the target table.
Optionally, the apparatus further includes:
a fourth determination unit, configured to process the first image using an OCR technique and determine a missing boundary line of a cell;
a processing unit, configured to complete the missing boundary line of the cell to obtain a processed cell;
the drawing unit 204 is configured to:
draw the processed cells.
Optionally, the acquisition unit 201 is configured to:
acquire a second image including a table, and identify an oblique line in the second image;
calculate the angle between the oblique line and the horizontal axis;
rotate the second image according to the angle to obtain the first image.
Since the apparatus 200 corresponds to the method provided in the above method embodiment, and each of its units is implemented in the same way as in that embodiment, reference may be made to the description of the method embodiment for details, which are not repeated here.
The method provided by the embodiment of the present application may be executed by a client or a server; the client and the server that execute the method are each described below.
Fig. 3 shows a block diagram of a client 300. For example, the client 300 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to fig. 3, client 300 may include one or more of the following components: a processing component 302, a memory 304, a power component 306, a multimedia component 308, an audio component 310, an input/output (I/O) interface 312, a sensor component 314, and a communication component 316.
The processing component 302 generally controls the overall operation of the client 300, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 302 may include one or more processors 320 to execute instructions so as to perform all or part of the steps of the methods described above. Further, the processing component 302 may include one or more modules that facilitate interaction between the processing component 302 and the other components; for example, it may include a multimedia module to facilitate interaction between the multimedia component 308 and the processing component 302.
The memory 304 is configured to store various types of data to support operations at the client 300. Examples of such data include instructions for any application or method operating on the client 300, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 304 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 306 provides power to the various components of the client 300. The power components 306 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the client 300.
The multimedia component 308 comprises a screen providing an output interface between the client 300 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 308 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the client 300 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 310 is configured to output and/or input audio signals. For example, the audio component 310 includes a Microphone (MIC) configured to receive external audio signals when the client 300 is in an operating mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 304 or transmitted via the communication component 316. In some embodiments, audio component 310 also includes a speaker for outputting audio signals.
The I/O interface 312 provides an interface between the processing component 302 and peripheral interface modules such as a keyboard, a click wheel, or buttons. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 314 includes one or more sensors for providing status assessments of various aspects of the client 300. For example, the sensor component 314 may detect the open/closed state of the client 300 and the relative positioning of components such as its display and keypad; it may also detect a change in the position of the client 300 or of one of its components, the presence or absence of user contact with the client 300, the orientation or acceleration/deceleration of the client 300, and a change in its temperature. The sensor component 314 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact, and may also include a light sensor, such as a CMOS or CCD image sensor, for imaging applications. In some embodiments, the sensor component 314 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 316 is configured to facilitate wired or wireless communication between the client 300 and other devices. The client 300 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 316 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 316 further includes a Near Field Communication (NFC) module to facilitate short-range communication; for example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the client 300 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the following methods:
acquiring a first image including a table;
determining the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
performing straight-line fitting on the table lines to obtain fitted table lines;
drawing a target table according to the fitted table lines, and outputting the target table.
Optionally, the method further includes:
determining the text included in the first image using an Optical Character Recognition (OCR) technique;
determining an average height and/or a maximum height of the text.
Optionally, the drawing a target table according to the fitted table lines includes:
identifying cells according to the fitted table lines and an image edge contour processing technique;
drawing the identified cells to obtain the target table.
Optionally, the method further includes:
processing the first image using an OCR technique, and determining a missing boundary line of a cell;
completing the missing boundary line of the cell to obtain a processed cell;
the drawing the identified cells includes:
drawing the processed cells.
Optionally, the acquiring a first image including a table includes:
acquiring a second image including a table, and identifying an oblique line in the second image;
calculating the angle between the oblique line and the horizontal axis;
rotating the second image according to the angle to obtain the first image.
Fig. 4 is a schematic structural diagram of a server in an embodiment of the present application. The server 400 may vary significantly with configuration or performance, and may include one or more central processing units (CPUs) 422 (e.g., one or more processors), memory 432, and one or more storage media 430 (e.g., one or more mass storage devices) storing application programs 442 or data 444. The memory 432 and the storage media 430 may provide transient or persistent storage. The programs stored on a storage medium 430 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 422 may be configured to communicate with the storage medium 430 and to execute, on the server 400, the series of instruction operations in the storage medium 430.
Still further, the central processor 422 may perform the following method:
acquiring a first image including a table;
determining the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
performing straight-line fitting on the table lines to obtain fitted table lines;
drawing a target table according to the fitted table lines, and outputting the target table.
Optionally, the method further includes:
determining the text included in the first image using an Optical Character Recognition (OCR) technique;
determining an average height and/or a maximum height of the text.
Optionally, the drawing a target table according to the fitted table lines includes:
identifying cells according to the fitted table lines and an image edge contour processing technique;
drawing the identified cells to obtain the target table.
Optionally, the method further includes:
processing the first image using an OCR technique, and determining a missing boundary line of a cell;
completing the missing boundary line of the cell to obtain a processed cell;
the drawing the identified cells includes:
drawing the processed cells.
Optionally, the acquiring a first image including a table includes:
acquiring a second image including a table, and identifying an oblique line in the second image;
calculating the angle between the oblique line and the horizontal axis;
rotating the second image according to the angle to obtain the first image.
The server 400 may also include one or more power supplies 426, one or more wired or wireless network interfaces 450, one or more input/output interfaces 456, one or more keyboards 456, and/or one or more operating systems 441, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
An embodiment of the present application also provides a computer-readable medium having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the following method:
acquiring a first image including a table;
determining the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
performing straight-line fitting on the table lines to obtain fitted table lines;
drawing a target table according to the fitted table lines, and outputting the target table.
Optionally, the method further includes:
determining the text included in the first image using an Optical Character Recognition (OCR) technique;
determining an average height and/or a maximum height of the text.
Optionally, the drawing a target table according to the fitted table lines includes:
identifying cells according to the fitted table lines and an image edge contour processing technique;
drawing the identified cells to obtain the target table.
Optionally, the method further includes:
processing the first image using an OCR technique, and determining a missing boundary line of a cell;
completing the missing boundary line of the cell to obtain a processed cell;
the drawing the identified cells includes:
drawing the processed cells.
Optionally, the acquiring a first image including a table includes:
acquiring a second image including a table, and identifying an oblique line in the second image;
calculating the angle between the oblique line and the horizontal axis;
rotating the second image according to the angle to obtain the first image.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains. The specification and examples are to be considered exemplary only, with the true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (12)

1. A method of form recognition, the method comprising:
acquiring a first image including a table;
determining the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
performing straight-line fitting on the table lines to obtain fitted table lines;
drawing a target table according to the fitted table lines, and outputting the target table.
2. The method of claim 1, further comprising:
determining the text included in the first image using an Optical Character Recognition (OCR) technique;
determining an average height and/or a maximum height of the text.
3. The method of claim 1, wherein the drawing a target table according to the fitted table lines comprises:
identifying cells according to the fitted table lines and an image edge contour processing technique;
drawing the identified cells to obtain the target table.
4. The method of claim 3, further comprising:
processing the first image using an OCR technique, and determining a missing boundary line of a cell;
completing the missing boundary line of the cell to obtain a processed cell;
wherein the drawing the identified cells comprises: drawing the processed cells.
5. The method of claim 1, wherein the acquiring a first image including a table comprises:
acquiring a second image including a table, and identifying an oblique line in the second image;
calculating the angle between the oblique line and the horizontal axis;
rotating the second image according to the angle to obtain the first image.
6. A form recognition apparatus, the apparatus comprising:
an acquisition unit, configured to acquire a first image including a table;
a first determination unit, configured to determine the table lines in the first image by an erosion and dilation technique, the parameters of which include: an average height and/or a maximum height of text included in the first image;
a fitting unit, configured to perform straight-line fitting on the table lines to obtain fitted table lines;
a drawing unit, configured to draw a target table according to the fitted table lines;
an output unit, configured to output the target table.
7. The apparatus of claim 6, further comprising:
a second determination unit, configured to determine the text included in the first image using an Optical Character Recognition (OCR) technique;
a third determination unit, configured to determine an average height and/or a maximum height of the text.
8. The apparatus of claim 6, wherein the drawing unit is configured to:
identify cells according to the fitted table lines and an image edge contour processing technique;
draw the identified cells to obtain the target table.
9. The apparatus of claim 8, further comprising:
a fourth determination unit, configured to process the first image using an OCR technique and determine a missing boundary line of a cell;
a processing unit, configured to complete the missing boundary line of the cell to obtain a processed cell;
wherein the drawing unit is configured to: draw the processed cells.
10. The apparatus of claim 6, wherein the acquisition unit is configured to:
acquire a second image including a table, and identify an oblique line in the second image;
calculate the angle between the oblique line and the horizontal axis;
rotate the second image according to the angle to obtain the first image.
11. A form recognition apparatus, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the form recognition method of any one of claims 1-5.
12. A computer-readable medium having stored thereon instructions, which when executed by one or more processors, cause an apparatus to perform the table identification method of any of claims 1-5.
CN202111155487.7A 2021-09-29 2021-09-29 Form identification method and device Pending CN113887401A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111155487.7A CN113887401A (en) 2021-09-29 2021-09-29 Form identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111155487.7A CN113887401A (en) 2021-09-29 2021-09-29 Form identification method and device

Publications (1)

Publication Number Publication Date
CN113887401A (en) 2022-01-04

Family

ID=79004502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111155487.7A Pending CN113887401A (en) 2021-09-29 2021-09-29 Form identification method and device

Country Status (1)

Country Link
CN (1) CN113887401A (en)

Similar Documents

Publication Publication Date Title
US10127471B2 (en) Method, device, and computer-readable storage medium for area extraction
US10157326B2 (en) Method and device for character area identification
US20170124386A1 (en) Method, device and computer-readable medium for region recognition
US10095949B2 (en) Method, apparatus, and computer-readable storage medium for area identification
US20170124412A1 (en) Method, apparatus, and computer-readable medium for area recognition
US20150332439A1 (en) Methods and devices for hiding privacy information
CN107480665B (en) Character detection method and device and computer readable storage medium
CN110619350B (en) Image detection method, device and storage medium
CN108062547B (en) Character detection method and device
CN110569835B (en) Image recognition method and device and electronic equipment
EP3040884A1 (en) Method and device for classifying pictures
US20230252778A1 (en) Formula recognition method and apparatus
CN105678296B (en) Method and device for determining character inclination angle
CN111754414B (en) Image processing method and device for image processing
CN113887401A (en) Form identification method and device
US11417028B2 (en) Image processing method and apparatus, and storage medium
CN113920083A (en) Image-based size measurement method and device, electronic equipment and storage medium
CN111723627A (en) Image processing method and device and electronic equipment
CN113869306A (en) Text positioning method and device and electronic equipment
CN113885713A (en) Method and device for generating handwriting formula
CN116110062A (en) Text recognition method, device and medium
CN114004245A (en) Bar code information identification method, device, equipment and storage medium
CN114185478A (en) Application program display method and device and storage medium
CN114155320A (en) Method and device for identifying content of structure diagram, electronic equipment and storage medium
CN114155160A (en) Connector restoring method and device of structure diagram, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination