CN111651971A - Form information transcription method, system, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111651971A
Authority
CN
China
Prior art keywords
image
cell
type
information
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010462194.2A
Other languages
Chinese (zh)
Inventor
张天澄
徐立凡
马业恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202010462194.2A
Publication of CN111651971A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/183 Tabulation, i.e. one-dimensional positioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a method, a system, electronic equipment and a storage medium for transcribing table information, wherein the method comprises the following steps: acquiring a target image with table information, and extracting a first type and a second type of table lines from the target image to obtain a first image comprising the first type of table lines and a second image comprising the second type of table lines; performing operations on the first image and the second image to obtain the edge line of each cell and the vertex position of each cell; determining a cell area based on the edge line of the cell and the vertex position of the cell, and recognizing text information in the cell area; and writing the recognized text information into a pre-constructed target file to complete the transcription of the table information in the image. The embodiment of the invention enables the table information contained in a picture to be transcribed quickly into a designated file, improves the efficiency of data collection, comparison and analysis, makes fuller use of computer algorithms, and saves labor cost.

Description

Form information transcription method, system, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of image recognition, in particular to a table information transcription method, a table information transcription system, electronic equipment and a storage medium.
Background
At present, forms are still filled in by hand in some countries and regions, both at home and abroad, so large batches of form information require manual recognition and processing. Although the development of computer technology has produced more and more image recognition technologies for various kinds of images, tables are complex and variable, and there is currently no corresponding image processing technology that can effectively process large batches of pictures containing complex table text information.
Therefore, a method for identifying the content of a tabular image, realizing the transcription of the tabular information in the image and improving the identification accuracy is needed.
Disclosure of Invention
The embodiment of the invention provides a table information transcription method, a table information transcription system, electronic equipment and a storage medium, which aim to accurately realize the transcription of table information in an image.
In a first aspect, an embodiment of the present invention provides a table information transcription method, where the method includes:
acquiring a target image with table information, and extracting a first type and a second type of table lines from the target image to obtain a first image comprising the first type of table lines and a second image comprising the second type of table lines;
calculating the first image and the second image to obtain the edge line of each cell and the vertex position of each cell;
determining a cell area based on the edge of the cell and the vertex position of the cell, and identifying text information of the cell area;
and writing the recognized text information into a pre-constructed target file to finish the transcription of the table information in the image.
In a second aspect, an embodiment of the present invention further provides a table information transcription system, where the system includes:
the extraction module is used for acquiring a target image with table information, extracting a first type of table lines and a second type of table lines from the target image, and obtaining a first image comprising the first type of table lines and a second image comprising the second type of table lines;
the operation module is used for performing operation on the first image and the second image to obtain the edge line of each cell and the vertex position of each cell;
the identification module is used for determining a cell area based on the edge of the cell and the vertex position of the cell and identifying text information of the cell area;
and the writing module is used for writing the recognized text information into a pre-constructed target file so as to complete the transcription of the table information in the image.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the table information transcription method according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a table information transcription method according to any embodiment of the present invention.
In the embodiment of the invention, operations are performed on the obtained first image comprising the first type of table lines and the obtained second image comprising the second type of table lines to obtain the edge line of each cell and the vertex position of each cell; the cell areas are then determined, the text information in each cell area is recognized, and finally the recognized text information is written into a pre-constructed target file to complete the transcription of the table information in the image. As a result, the table information contained in a picture is transcribed quickly into a designated file, the efficiency of data collection, comparison and analysis is improved, computer algorithms are put to fuller use, and labor cost is saved.
Drawings
FIG. 1a is a schematic flowchart of a table information transcription method according to a first embodiment of the present invention;
FIG. 1b is a diagram of a target image with table information according to a first embodiment of the present invention;
FIG. 1c is a schematic diagram illustrating the transcription result of table information according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a table information transcription system according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device in a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1a is a flowchart of a table information transcription method according to an embodiment of the present invention, where the present embodiment is applicable to a case where identification and transcription of table information in an image are required, and the method may be executed by a table information transcription system, where the system may be implemented in a software and/or hardware manner, and may be integrated on an electronic device, such as a computer device or a mobile terminal.
As shown in fig. 1a, the table information transcription method specifically includes the following steps:
S101, acquiring a target image with table information, and extracting a first type and a second type of table lines from the target image to obtain a first image comprising the first type of table lines and a second image comprising the second type of table lines.
In the embodiment of the invention, the initial interface of the table information transcription system provides the user with a file selection control and an upload control. The user selects the target image to be uploaded from local storage by clicking the selection control, and then uploads the target image by clicking the upload control. After the user uploads the target image, the table information transcription system acquires the target image with table information in the background, wherein the table information at least comprises a table frame and the text information in the cells of the table. In an alternative embodiment, the Python-based Flask framework is used to build the web side of the application and implement front-end and back-end interaction.
After the target image is acquired, the first type and the second type of table lines are extracted from the target image to obtain a first image including the first type of table lines and a second image including the second type of table lines, wherein the first type of table lines are horizontal lines and the second type of table lines are vertical lines. In an alternative embodiment, the operation of extracting the first type and the second type of table lines from the target image includes S1011:
S1011, performing erosion and dilation processing on the target image, and extracting the first type and the second type of table lines from the eroded and dilated image based on a preset morphological factor.
The erosion processing and the dilation processing are morphological operations that change the shape of objects in the image. They are generally applied to a binarized image in order to connect adjacent elements or to separate elements into independent ones, and they generally act on the white portion of the image. Therefore, before the erosion processing and the dilation processing are performed, the target image is binarized so that the pixel values of the processed image are only 0 and 255. Specifically, the erosion processing erodes away the pixels left by characters far from the table lines, and the dilation processing fuses the pixels left by characters near the table lines into the table lines. The first type of table lines (horizontal lines) and the second type of table lines (vertical lines) are then extracted from the eroded and dilated image based on the preset morphological factor. After extraction, the first type of table lines are saved as a first image (horizontal line image), and the second type of table lines are saved as a second image (vertical line image).
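The extraction step can be illustrated with a minimal sketch. It assumes the image is a NumPy array, treats dark ink as foreground, and uses a one-dimensional erosion followed by a dilation (a morphological opening) as one common way to isolate long straight strokes. The helper names `binarize`, `_window_reduce` and `extract_lines`, the threshold of 128, and the `length` parameter (standing in for the preset morphological factor) are assumptions made for illustration, not details taken from the embodiment.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Map a grayscale array to 0/255 so that dark ink becomes white (255)."""
    return np.where(gray < threshold, 255, 0).astype(np.uint8)

def _window_reduce(img, length, axis, reduce, reflect):
    """Combine `length` shifted copies of `img` along `axis` with `reduce`."""
    before, after = length // 2, length - 1 - length // 2
    if reflect:  # dilation uses the mirrored window so even lengths stay aligned
        before, after = after, before
    pad = [(0, 0), (0, 0)]
    pad[axis] = (before, after)
    padded = np.pad(img, pad, constant_values=0)  # outside the image is black
    n = img.shape[axis]
    out = None
    for k in range(length):
        index = [slice(None), slice(None)]
        index[axis] = slice(k, k + n)
        window = padded[tuple(index)]
        out = window if out is None else reduce(out, window)
    return out

def extract_lines(binary, length, axis):
    """Keep only white runs of at least `length` pixels along `axis`.

    axis=1 keeps horizontal lines (first image); axis=0 keeps vertical
    lines (second image).
    """
    eroded = _window_reduce(binary, length, axis, np.minimum, reflect=False)
    return _window_reduce(eroded, length, axis, np.maximum, reflect=True)
```

Under these assumptions, `extract_lines(binary, length, axis=1)` would play the role of the first image and `extract_lines(binary, length, axis=0)` the role of the second image; short strokes such as isolated character pixels do not survive the erosion.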
S102, performing operations on the first image and the second image to obtain the edge line of each cell and the vertex position of each cell.
In an alternative embodiment, performing operations on the first image and the second image to obtain the edge line of each cell and the vertex position of each cell includes S1021 to S1022:
S1021, performing a union operation on the first image and the second image to obtain a table line image, and determining the edge line of each cell from the table line image.
In the embodiment of the invention, the first image and the second image can be optionally stored in an array form, so that union operation is performed on the first image and the second image, namely, the operation is performed on the array corresponding to each of the two images, and basic array operation can be optionally performed by calling a numpy library based on python. And obtaining a table line image through operation, and determining the edge line of each cell based on the table line image.
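As a small illustration of the union operation on arrays, the sketch below assumes both images are NumPy arrays of the same shape holding only 0 and 255; the function name `table_line_image` is hypothetical.

```python
import numpy as np

def table_line_image(first_image, second_image):
    """Pixel-wise union: a pixel is white (255) if it is white in either image."""
    return np.bitwise_or(first_image, second_image)
```

For 0/255 images a bitwise OR and an element-wise maximum give the same result, so either could serve as the union here.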
Further, since other interfering elements may be misjudged as table edge lines, after the edge line of each cell is obtained, the method further includes: determining table areas based on the cell edge lines, and discarding any table area whose area is smaller than a preset threshold or whose shape is irregular. Whether an area is regular is judged by comparing the area of the region with the area of the rectangle formed by the four grid points at the top, bottom, left and right of the region boundary; if the difference between the two areas is larger than a preset value, the region is judged to be irregular.
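The discarding rule can be sketched as follows, assuming a candidate region is given as a collection of (row, column) pixel coordinates and that the enclosing rectangle is the region's bounding box. The names `is_regular` and `keep_region` and both threshold parameters are hypothetical; the embodiment does not specify their values.

```python
def is_regular(region_pixels, max_area_difference):
    """Compare the region's pixel count with the area of its bounding rectangle."""
    rows = [r for r, _ in region_pixels]
    cols = [c for _, c in region_pixels]
    rectangle_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return rectangle_area - len(region_pixels) <= max_area_difference

def keep_region(region_pixels, min_area, max_area_difference):
    """Discard regions that are too small or whose shape is irregular."""
    if len(region_pixels) < min_area:
        return False
    return is_regular(region_pixels, max_area_difference)
```

A filled rectangle has zero area difference and is kept, while an L-shaped blob leaves part of its bounding rectangle empty and is rejected once that gap exceeds the chosen tolerance.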
S1022, performing an intersection operation on the first image and the second image to obtain a lattice point image, and determining the vertex position of each cell from the lattice point image.
Because a vertex of a cell is the intersection point of two cell edge lines, determining the vertices of each cell only requires an intersection operation on the first image and the second image, namely, an intersection operation on the arrays corresponding to the two images; optionally, the Python-based numpy library is called to perform the basic array operations. The operation yields a lattice point image, and the vertices of each cell are determined based on the lattice point image.
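A minimal sketch of the intersection step, assuming 0/255 NumPy arrays and table lines one pixel thick so that each crossing is a single white pixel (thicker lines would yield small blobs that need to be reduced to one point each). The function name `lattice_points` is hypothetical.

```python
import numpy as np

def lattice_points(first_image, second_image):
    """Pixel-wise intersection plus the (row, column) position of each white pixel."""
    lattice = np.bitwise_and(first_image, second_image)
    vertices = [tuple(p) for p in np.argwhere(lattice == 255).tolist()]
    return lattice, vertices
```

`np.argwhere` returns the positions in row-major order, which matches a scan that proceeds row by row from the upper-left corner.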
Further, after obtaining the edge line of each cell and the vertex position of each cell, the method further includes S01-S02:
S01, performing pixel scanning in a specified order, and determining a target grid point.
Optionally, performing the pixel scanning in the specified order includes: starting from the grid point at the upper-left corner and scanning the pixels from left to right in sequence to determine the target grid points, wherein a target grid point is a vertex of a cell. Since the lattice point image is a binary image whose pixel values are only 0 and 255, each white point (a point whose pixel value is 255) encountered during scanning is taken as a target grid point.
S02, setting the target grid point as the upper-left vertex of a cell, and searching for the other vertices of the cell according to the position of the target grid point; if the other vertices are found, the position information of each vertex of the cell is retained.
It should be noted that if no other vertex of the cell can be found, the cell is considered to be absent.
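Steps S01 and S02 can be sketched as below. The sketch assumes the lattice point image is a 2-D list of 0/255 values, that scanning proceeds row by row, and that the other vertices are the nearest grid point to the right, the nearest grid point below, and the point diagonal to both; this search rule and the name `find_cells` are assumptions, and merged cells would need a more careful search.

```python
def find_cells(lattice):
    """Return cells as (top, left, bottom, right) tuples found from grid points."""
    points = {
        (r, c)
        for r, row in enumerate(lattice)
        for c, value in enumerate(row)
        if value == 255
    }
    cells = []
    for r, c in sorted(points):  # row by row, left to right within each row
        right = min((c2 for r2, c2 in points if r2 == r and c2 > c), default=None)
        below = min((r2 for r2, c2 in points if c2 == c and r2 > r), default=None)
        if right is None or below is None or (below, right) not in points:
            continue  # the other vertices were not found: the cell is absent
        cells.append((r, c, below, right))
    return cells
```

Grid points on the last row or in the last column never become an upper-left vertex, because no vertex exists to their right or below them.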
S103, determining a cell area based on the edge of the cell and the vertex position of the cell, and identifying text information of the cell area.
After the edge of the cell and the vertex position of the cell are determined, each cell area existing in the target image can be determined, and then text information recognition is carried out on the image in the cell area to obtain a recognition result. In an alternative embodiment, the text information within the cell is identified and extracted by calling a python-based cnocr library.
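The recognition step can be pictured as cropping each cell and handing the crop to a recognizer. In this sketch the image is a 2-D list, a cell is a (top, left, bottom, right) tuple, the one-pixel `margin` that trims the border lines is an assumption, and `recognize` is a hypothetical callable standing in for whatever OCR routine is used (for example one built on the cnocr library mentioned above).

```python
def crop_cell(image, cell, margin=1):
    """Cut the interior of a cell out of the image, skipping its border lines."""
    top, left, bottom, right = cell
    rows = image[top + margin:bottom - margin + 1]
    return [row[left + margin:right - margin + 1] for row in rows]

def recognize_cells(image, cells, recognize, margin=1):
    """Map each cell to the text returned by the supplied recognizer."""
    return {cell: recognize(crop_cell(image, cell, margin)) for cell in cells}
```

Keeping the recognizer as a parameter lets the cropping logic be exercised without an OCR model installed.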
S104, writing the recognized text information into a pre-constructed target file to complete the transcription of the table information in the image.
The target file is optionally an Excel workbook. When the target file is constructed in advance, the xlwt library is optionally called to create a new Excel workbook on the local computer, wherein the row heights and the column widths can be set according to the coordinate information and the pixel information of the grid points determined in the above steps. Further, the text information recognized in step S103 is written into the pre-constructed target file; optionally, the xlwt library is called to write the recognized text information into the newly created Excel workbook.
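How grid point coordinates might translate into spreadsheet positions and sizes can be sketched as follows. The sketch assumes cells are (top, left, bottom, right) pixel tuples and that there are no merged cells; the name `table_layout` is hypothetical, and converting pixel sizes into the units expected by a spreadsheet library is left out.

```python
def table_layout(cells):
    """Derive (row, column) indices plus row heights and column widths in pixels."""
    tops = sorted({top for top, _, _, _ in cells})
    lefts = sorted({left for _, left, _, _ in cells})
    positions = {cell: (tops.index(cell[0]), lefts.index(cell[1])) for cell in cells}
    row_heights = [0] * len(tops)
    col_widths = [0] * len(lefts)
    for (top, left, bottom, right), (row, col) in positions.items():
        row_heights[row] = max(row_heights[row], bottom - top)
        col_widths[col] = max(col_widths[col], right - left)
    return positions, row_heights, col_widths
```

Each (row, column) pair could then be used as the write position for the recognized text of that cell in the workbook.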
Illustratively, referring to fig. 1b and fig. 1c, fig. 1b shows a schematic diagram of the target image with the table information, and fig. 1c shows a schematic diagram of the transcription result of the table information after S101-S104.
In the embodiment of the invention, operations are performed on the obtained first image comprising the first type of table lines and the obtained second image comprising the second type of table lines to obtain the edge line of each cell and the vertex position of each cell; the cell areas are then determined, the text information in each cell area is recognized, and finally the recognized text information is written into a pre-constructed target file to complete the transcription of the table information in the image. As a result, the table information contained in a picture is transcribed quickly into a designated file, the efficiency of data collection, comparison and analysis is improved, computer algorithms are put to fuller use, and labor cost is saved.
Further, after writing the recognized text information into a pre-constructed table file with a specified format, the method further comprises: and storing the table file to an appointed path, wherein the appointed path is a downloading path corresponding to the pre-established hyperlink. Therefore, after the form information is transcribed, the form information transcription system only needs to provide a hyperlink at the front end, and a user only needs to click the hyperlink to store the transcription result file to a desired address.
Example two
Fig. 2 is a schematic structural diagram of a table information transcription system according to a second embodiment of the present invention, where the system is used in a case where table information in an image needs to be recognized and transcribed, and the system includes:
the extraction module 201 is configured to obtain a target image with table information, and extract a first type and a second type of table lines from the target image to obtain a first image including the first type of table lines and a second image including the second type of table lines;
an operation module 202, configured to perform an operation on the first image and the second image to obtain an edge of each cell and a vertex position of each cell;
the identification module 203 is configured to determine a cell area based on an edge of a cell and a vertex position of the cell, and perform text information identification on the cell area;
a writing module 204, configured to write the recognized text information into a pre-constructed target file, so as to complete the transcription of the table information in the image.
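The four modules listed above can be pictured as a simple pipeline in which each module is a pluggable callable. The class name `TableTranscriptionSystem`, the attribute names, and the call signatures are assumptions chosen for illustration; the embodiment only specifies what each module is configured to do.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class TableTranscriptionSystem:
    extract: Callable[[Any], tuple]           # extraction module 201
    operate: Callable[[Any, Any], tuple]      # operation module 202
    identify: Callable[[Any, Any, Any], Any]  # identification module 203
    write: Callable[[Any], Any]               # writing module 204

    def transcribe(self, target_image):
        """Run the modules in order and return whatever the writer returns."""
        first_image, second_image = self.extract(target_image)
        edges, vertices = self.operate(first_image, second_image)
        texts = self.identify(target_image, edges, vertices)
        return self.write(texts)
```

Passing the modules in as callables keeps each stage replaceable, which mirrors the statement that the system may be implemented in software and/or hardware.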
In the embodiment of the invention, operations are performed on the obtained first image comprising the first type of table lines and the obtained second image comprising the second type of table lines to obtain the edge line of each cell and the vertex position of each cell; the cell areas are then determined, the text information in each cell area is recognized, and finally the recognized text information is written into a pre-constructed target file to complete the transcription of the table information in the image. As a result, the table information contained in a picture is transcribed quickly into a designated file, the efficiency of data collection, comparison and analysis is improved, computer algorithms are put to fuller use, and labor cost is saved.
On the basis of the foregoing embodiment, optionally, the extraction module includes:
an extraction unit, configured to perform erosion and dilation processing on the target image, and to extract the first type and the second type of table lines from the eroded and dilated image based on a preset morphological factor.
On the basis of the foregoing embodiment, optionally, the operation module is specifically configured to:
performing union operation on the first image and the second image to obtain a table line image, and determining the edge line of each cell from the table line image;
and performing intersection operation on the first image and the second image to obtain a lattice point image, and determining the vertex position of each cell from the lattice point image.
On the basis of the foregoing embodiment, optionally, the system further includes:
and the discarding module is used for determining the table area based on each cell edge line and discarding the table area with the area smaller than a preset threshold or irregular area shape.
On the basis of the foregoing embodiment, optionally, the system further includes a pixel scanning module, configured to:
performing pixel scanning according to a specified sequence, and determining a target grid point, wherein the target grid point is a vertex of a cell;
and setting the target lattice point as the top left corner vertex of the cell, searching other vertexes of the cell according to the position of the target lattice point, and if the other vertexes of the cell are found, keeping the position information of each vertex of the cell.
On the basis of the foregoing embodiment, optionally, the system further includes:
and the storage module is used for storing the table file to an appointed path, wherein the appointed path is a downloading path corresponding to the pre-established hyperlink.
On the basis of the above embodiment, optionally, the first type of table line is a horizontal line, and the second type of table line is a vertical line.
The table information transcription system provided by the embodiment of the invention can execute the table information transcription method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
EXAMPLE III
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention. FIG. 3 illustrates a block diagram of an exemplary electronic device 12 suitable for use in implementing embodiments of the present invention. The electronic device 12 shown in fig. 3 is only an example and should not bring any limitations to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 3, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with electronic device 12, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with other modules of the electronic device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, implementing a table information transcription method provided by an embodiment of the present invention, the method including:
acquiring a target image with table information, and extracting a first type and a second type of table lines from the target image to obtain a first image comprising the first type of table lines and a second image comprising the second type of table lines;
calculating the first image and the second image to obtain the edge line of each cell and the vertex position of each cell;
determining a cell area based on the edge of the cell and the vertex position of the cell, and identifying text information of the cell area;
and writing the recognized text information into a pre-constructed target file to finish the transcription of the table information in the image.
Example four
The fourth embodiment of the present invention further provides a storage medium, in particular, a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the table information transcription method provided in the embodiments of the present invention, where the method includes:
acquiring a target image with table information, and extracting a first type and a second type of table lines from the target image to obtain a first image comprising the first type of table lines and a second image comprising the second type of table lines;
calculating the first image and the second image to obtain the edge line of each cell and the vertex position of each cell;
determining a cell area based on the edge of the cell and the vertex position of the cell, and identifying text information of the cell area;
and writing the recognized text information into a pre-constructed target file to finish the transcription of the table information in the image.
Storage media for embodiments of the present invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that the foregoing describes only the preferred embodiments of the present invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made without departing from the scope of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the spirit of the present invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for transcribing tabular information, the method comprising:
acquiring a target image containing table information, and extracting a first type of table lines and a second type of table lines from the target image to obtain a first image comprising the first type of table lines and a second image comprising the second type of table lines;
performing an operation on the first image and the second image to obtain the edge line of each cell and the vertex position of each cell;
determining a cell area based on the edge line of the cell and the vertex position of the cell, and recognizing text information in the cell area;
and writing the recognized text information into a pre-constructed target file to finish the transcription of the table information in the image.
2. The method of claim 1, wherein extracting the first type of table lines and the second type of table lines from the target image comprises:
performing erosion and dilation processing on the target image, and extracting the first type of table lines and the second type of table lines from the eroded and dilated image based on a preset morphological factor.
3. The method of claim 1, wherein performing an operation on the first image and the second image to obtain the edge line of each cell and the vertex position of each cell comprises:
performing a union operation on the first image and the second image to obtain a table line image, and determining the edge line of each cell from the table line image;
and performing an intersection operation on the first image and the second image to obtain a grid point image, and determining the vertex position of each cell from the grid point image.
4. The method of claim 1, wherein after obtaining the edge line of each cell, the method further comprises:
determining table areas based on the edge lines of the cells, and discarding any table area whose area is smaller than a preset threshold or whose shape is irregular.
5. The method of claim 1, wherein after obtaining the edge line of each cell and the vertex position of each cell, the method further comprises:
performing pixel scanning in a specified order, and determining a target grid point, wherein the target grid point is a vertex of a cell;
and setting the target grid point as the top-left vertex of the cell, searching for the other vertices of the cell according to the position of the target grid point, and, if the other vertices of the cell are found, retaining the position information of each vertex of the cell.
6. The method of claim 1, wherein after writing the recognized text information into a pre-constructed table file of a specified format, the method further comprises:
storing the table file to a specified path, wherein the specified path is a download path corresponding to a pre-established hyperlink.
7. The method of claim 1, wherein the first type of table lines are horizontal lines and the second type of table lines are vertical lines.
8. A form information transcription system, the system comprising:
an extraction module, configured to acquire a target image containing table information, and to extract a first type of table lines and a second type of table lines from the target image to obtain a first image comprising the first type of table lines and a second image comprising the second type of table lines;
an operation module, configured to perform an operation on the first image and the second image to obtain the edge line of each cell and the vertex position of each cell;
a recognition module, configured to determine a cell area based on the edge line of the cell and the vertex position of the cell, and to recognize text information in the cell area;
and a writing module, configured to write the recognized text information into a pre-constructed target file to complete the transcription of the table information in the image.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the table information transcription method as recited in any one of claims 1-7.
10. A storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the table information transcription method as recited in any one of claims 1-7.
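For illustration only, the line extraction of claims 2 and 7, the union and intersection operations of claim 3, and the vertex search of claim 5 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the image is assumed to be already binarized (1 = ink), table lines are assumed to be axis-aligned and one pixel thick, and the "preset morphological factor" is interpreted here as a minimum run length `min_len`. Keeping only runs of at least `min_len` pixels is equivalent to an erosion followed by a dilation with a 1 x `min_len` line-shaped structuring element. All function names (`keep_runs`, `horizontal_lines`, `find_cells`, `demo_image`, and so on) are hypothetical and do not appear in the original text.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def keep_runs(row: List[int], min_len: int) -> List[int]:
    """Keep only runs of 1s that are at least min_len long (1-D opening)."""
    out = [0] * len(row)
    start = None
    for i, v in enumerate(row + [0]):  # the sentinel 0 closes a trailing run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start >= min_len:
                out[start:i] = [1] * (i - start)
            start = None
    return out


def transpose(img: Image) -> Image:
    return [list(col) for col in zip(*img)]


def horizontal_lines(img: Image, min_len: int) -> Image:
    """First image: horizontal table lines only."""
    return [keep_runs(row, min_len) for row in img]


def vertical_lines(img: Image, min_len: int) -> Image:
    """Second image: vertical table lines only."""
    return transpose(horizontal_lines(transpose(img), min_len))


def union(a: Image, b: Image) -> Image:
    """Table line image: pixels on either kind of line."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def intersection(a: Image, b: Image) -> Image:
    """Grid point image: pixels where both kinds of line cross."""
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def grid_points(points: Image) -> List[Tuple[int, int]]:
    """(row, col) of every set pixel, in row-major scan order."""
    return [(r, c) for r, row in enumerate(points)
            for c, v in enumerate(row) if v]


def find_cells(points: Image) -> List[Tuple[int, int, int, int]]:
    """Treat each grid point as a top-left vertex and look for the other three.

    Returns (top, left, bottom, right) for every cell whose vertices all exist.
    """
    pts = set(grid_points(points))
    cells = []
    for r, c in sorted(pts):  # row-major scan order
        right = min((cc for rr, cc in pts if rr == r and cc > c), default=None)
        below = min((rr for rr, cc in pts if cc == c and rr > r), default=None)
        if right is not None and below is not None and (below, right) in pts:
            cells.append((r, c, below, right))
    return cells


def demo_image() -> Image:
    """A 5 x 7 table with two cells and one stray 'text' pixel."""
    h, w = 5, 7
    img = [[0] * w for _ in range(h)]
    for c in range(w):
        img[0][c] = img[4][c] = 1
    for r in range(h):
        img[r][0] = img[r][3] = img[r][6] = 1
    img[2][1] = 1  # text pixel inside the first cell
    return img
```

Calling `find_cells(intersection(horizontal_lines(img, 4), vertical_lines(img, 4)))` on `demo_image()` yields one bounding box per cell; each box could then be cropped and passed to a text recognizer, a step this sketch leaves out. Real scans have thick, slightly skewed lines, so a practical version would need to merge clusters of neighbouring grid point pixels into single vertices first.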
CN202010462194.2A 2020-05-27 2020-05-27 Form information transcription method, system, electronic equipment and storage medium Pending CN111651971A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010462194.2A CN111651971A (en) 2020-05-27 2020-05-27 Form information transcription method, system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111651971A (en) 2020-09-11

Family

ID=72348377

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010462194.2A Pending CN111651971A (en) 2020-05-27 2020-05-27 Form information transcription method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111651971A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726643A (en) * 2018-12-13 2019-05-07 北京金山数字娱乐科技有限公司 The recognition methods of form data, device, electronic equipment and storage medium in image
CN110532968A (en) * 2019-09-02 2019-12-03 苏州美能华智能科技有限公司 Table recognition method, apparatus and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112036365A (en) * 2020-09-15 2020-12-04 中国工商银行股份有限公司 Information importing method and device, and image processing method and device
CN112036365B (en) * 2020-09-15 2024-05-07 中国工商银行股份有限公司 Information importing method and device and image processing method and device
CN112668298A (en) * 2021-01-15 2021-04-16 上海杉互健康科技有限公司 Questionnaire recording method, system, equipment and storage medium based on mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination