CN114943978A - Table reconstruction method and electronic equipment - Google Patents


Info

Publication number
CN114943978A
CN114943978A
Authority
CN
China
Prior art keywords
text, region, cell, abscissa, coordinates
Prior art date
Legal status
Granted
Application number
CN202210523453.7A
Other languages
Chinese (zh)
Other versions
CN114943978B (en)
Inventor
王伟印
张晓程
Current Assignee
Shanghai Hongji Information Technology Co Ltd
Original Assignee
Shanghai Hongji Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Hongji Information Technology Co Ltd filed Critical Shanghai Hongji Information Technology Co Ltd
Priority to CN202210523453.7A priority Critical patent/CN114943978B/en
Publication of CN114943978A publication Critical patent/CN114943978A/en
Priority to PCT/CN2023/084482 priority patent/WO2023216745A1/en
Application granted granted Critical
Publication of CN114943978B publication Critical patent/CN114943978B/en
Legal status: Active


Classifications

    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition (G — PHYSICS; G06 — COMPUTING; CALCULATING OR COUNTING; G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING)
        • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
        • G06V30/10 Character recognition
        • G06V30/147 Determination of region of interest
        • G06V30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Character Input (AREA)

Abstract

The application provides a table reconstruction method and an electronic device. The method performs text recognition on an image to be processed to obtain a text region recognition result of the image, the result comprising the region text content and the region position information of each text region; determines the table row coordinates and the table column coordinates of the target table according to the region position information; generates a blank table from the table row coordinates and the table column coordinates; and adds the region text content into the blank table according to the region position information to obtain the target table. In this way, both framed tables and frameless tables in the image to be processed can be reconstructed, improving the accuracy and the range of application of table reconstruction.

Description

Table reconstruction method and electronic equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method for reconstructing a table and an electronic device.
Background
With the development of information technology and the spread of the paperless office, demands on the convenience of data processing keep rising. In some office scenarios, table recognition and table reconstruction are usually performed on a table image to obtain a reconstructed table.
In the prior art, image processing operations such as dilation and erosion are generally adopted to determine lines in a table image, and a table is reconstructed according to the lines and intersection point coordinates of the lines.
However, if the table in the table image includes cells without borders or cells with inconspicuous borders, the table reconstructed in this manner exhibits a certain deviation.
Disclosure of Invention
An object of the present application is to provide a table reconstruction method and an electronic device that reduce the deviation of the reconstructed table when reconstructing a table in a table image.
In one aspect, a table reconstruction method is provided, including:
performing text recognition on the image to be processed to obtain a text region recognition result of the image to be processed, wherein the text region recognition result comprises region text content and region position information of a text region;
determining table row coordinates and table column coordinates of the target table according to the region position information;
generating a blank table according to the table row coordinates and the table column coordinates;
and adding the regional text content into the blank table according to the regional position information to obtain the target table.
In the implementation process, the table is reconstructed from the region text content and the region position information of each text region recognized in the image to be processed; both framed tables and frameless tables in the image to be processed can therefore be reconstructed, improving the accuracy of table reconstruction.
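As a rough, self-contained sketch of the overall idea (not the patent's implementation), the last two steps can be illustrated in Python: given precomputed table row and column boundary coordinates, a blank grid is created and each recognized text region is placed into the cell containing its center point. The function name, the `(text, box)` region tuples, and the list-of-lists table representation are all illustrative assumptions.

```python
from bisect import bisect_right

def reconstruct_table(regions, row_bounds, col_bounds):
    """Place text regions into a grid given row/column boundary coordinates.

    regions: list of (text, (x1, y1, x2, y2)) axis-aligned text boxes.
    row_bounds / col_bounds: ascending boundary coordinates, including the
    table's minimum and maximum ordinate/abscissa.
    """
    n_rows, n_cols = len(row_bounds) - 1, len(col_bounds) - 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for text, (x1, y1, x2, y2) in regions:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2  # region center point
        # bisect finds which row/column band the center point falls into
        r = min(bisect_right(row_bounds, cy) - 1, n_rows - 1)
        c = min(bisect_right(col_bounds, cx) - 1, n_cols - 1)
        grid[r][c] = (grid[r][c] + " " + text).strip()
    return grid
```

A 2x2 example: four text boxes separated by the bands `[0, 5.5, 11]` (rows) and `[0, 11, 20]` (columns) land in four distinct cells.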
In one embodiment, the region position information includes the region vertex coordinates of a text region, and performing text recognition on the image to be processed to obtain the text region recognition result of the image to be processed includes: performing text detection on the image to be processed to obtain a plurality of region vertex coordinates of the text region, where the region vertex coordinates are the coordinates of the vertices of the text region; and performing text recognition on the text region to obtain the region text content.
In the implementation process, text recognition is performed on the image to be processed, and the region vertex coordinates and the region text content of each text region are determined, so that the position and the content of each text region can be accurately recognized.
In one embodiment, determining the table row coordinates and the table column coordinates of the target table according to the region position information includes: determining the maximum ordinate and the minimum ordinate among the region vertex ordinates; determining the maximum abscissa and the minimum abscissa among the region vertex abscissas; determining, according to the region position information, the first region number of each ordinate and the second region number of each abscissa, where the first region number is the number of text regions containing a given ordinate and the second region number is the number of text regions containing a given abscissa; determining each table row coordinate according to the maximum ordinate, the minimum ordinate and the first region numbers; and determining each table column coordinate according to the maximum abscissa, the minimum abscissa and the second region numbers.
In the implementation process, the table row coordinates and table column coordinates are determined from the position of each text region, so the rows and columns of a frameless table can be identified, improving the accuracy of table reconstruction.
In one embodiment, determining each table row coordinate based on the maximum ordinate, the minimum ordinate and the first region numbers comprises: determining trough ordinates according to each ordinate and its corresponding first region number, where the first region number of a trough ordinate is not higher than the first region numbers of its adjacent ordinates, the adjacent ordinates being the ordinate immediately before and the ordinate immediately after the trough ordinate; and obtaining each table row coordinate from the maximum ordinate, the minimum ordinate and the trough ordinates.
In the implementation process, because a horizontal line placed at a table row coordinate crosses relatively few text regions, the table row coordinates are determined from the first region number of each ordinate, i.e., the number of text regions crossed by the horizontal line at that ordinate.
In one embodiment, determining each table column coordinate based on the maximum abscissa, the minimum abscissa and the second region numbers comprises: determining trough abscissas according to each abscissa and its corresponding second region number, where the second region number of a trough abscissa is not higher than the second region numbers of its adjacent abscissas, the adjacent abscissas being the abscissa immediately before and the abscissa immediately after the trough abscissa; and obtaining each table column coordinate from the maximum abscissa, the minimum abscissa and the trough abscissas.
In the implementation process, because a vertical line placed at a table column coordinate crosses relatively few text regions, the table column coordinates are determined from the second region number of each abscissa, i.e., the number of text regions crossed by the vertical line at that abscissa.
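The trough search described above can be sketched as follows: given candidate coordinates in ascending order and, for each, the number of text regions it crosses, a trough is a coordinate whose count is not higher than that of both neighbors. Runs of equal-count troughs are collapsed to their midpoint here; that collapsing rule is one plausible choice, not something the patent specifies.

```python
def trough_coordinates(coords, counts):
    """Return trough coordinates: positions whose region count is a local
    minimum. A flat run of equal-count troughs is collapsed to its midpoint.

    coords: candidate coordinates in ascending order.
    counts: number of text regions crossed at each coordinate.
    """
    troughs = []
    i = 1
    while i < len(coords) - 1:
        # extend over a run of equal counts
        j = i
        while j + 1 < len(coords) - 1 and counts[j + 1] == counts[i]:
            j += 1
        # trough if not higher than the neighbors on both sides of the run
        if counts[i] <= counts[i - 1] and counts[j] <= counts[j + 1]:
            troughs.append((coords[i] + coords[j]) / 2)
        i = j + 1
    return troughs
```

With counts `[2,2,2,0,0,2,2,0,2,2,2]` over coordinates 0..10 (two gaps between three bands of text), the troughs fall at 3.5 and 7.0; prepending the minimum and appending the maximum coordinate then yields the table row (or column) coordinates.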
In one embodiment, after generating the blank table according to the table row coordinates and the table column coordinates, the method further comprises: determining cell position information of the cells in the blank table according to the table row and column coordinates; determining the target cells covered by a text region according to the region position information and the cell position information, where a text region covers a target cell if the text region contains a coordinate point that lies within the target cell and does not coincide with the boundary of the target cell; and if a text region is determined to cover a plurality of target cells, merging those target cells.
In the implementation process, the multiple target cells covered by a text region are merged according to the area occupied by each cell and the area occupied by the text region, which solves the problem of representing merged cells in the reconstructed table.
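One way to realize the coverage test above is positive-area rectangle overlap: a text region covers a cell exactly when their intersection has nonzero area, so a region that merely touches a cell boundary does not cover it. A hypothetical sketch (all names illustrative):

```python
def covered_cells(region_box, cell_boxes):
    """Indices of cells that the text region overlaps with positive area,
    i.e. the region has a point strictly inside the cell (not only on its
    boundary). Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    rx1, ry1, rx2, ry2 = region_box
    covered = []
    for idx, (cx1, cy1, cx2, cy2) in enumerate(cell_boxes):
        # strict inequalities reject boundary-only contact
        if min(rx2, cx2) > max(rx1, cx1) and min(ry2, cy2) > max(ry1, cy1):
            covered.append(idx)
    return covered

def merge_span(cell_boxes, covered):
    """Bounding box of the merged cell when a region covers several cells."""
    xs1, ys1, xs2, ys2 = zip(*(cell_boxes[i] for i in covered))
    return (min(xs1), min(ys1), max(xs2), max(ys2))
```

For a 2x2 grid of cells, a region spanning the top two cells covers both and merges into their combined bounding box, while a region that exactly fills one cell covers only that cell.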
In one embodiment, adding the region text content to the blank table according to the region position information to obtain the target table includes performing the following for each cell in the blank table: if it is determined from the region position information that the cell contains only one text region, adding the region text content of that text region into the cell; and if it is determined from the region position information that the cell contains at least two text regions, sorting the region text contents of the text regions contained in the cell according to the region position information, and adding the sorted region text contents into the cell.
In the implementation process, the region text contents of the text regions located in the same cell are sorted and then added into the cell, which solves the problem of adding the contents of multiple text regions to the same cell.
In one embodiment, determining that a cell contains only one text region according to the region position information includes: determining the center point coordinates of each text region according to the region position information, the center point coordinates being the coordinates of the center point of the text region; and if, according to the cell position information of the cell, the center point coordinates of only one text region lie within the cell, determining that the cell contains only one text region.
In the implementation process, the text area contained in the cell can be quickly determined according to the center point coordinates of the text area and the area where the cell is located.
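The center-point containment test can be sketched directly: compute each region's center from its bounding box and keep the regions whose center falls inside the cell. Names are illustrative, not from the patent.

```python
def region_center(box):
    """Center point of an axis-aligned box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def regions_in_cell(cell_box, region_boxes):
    """Indices of text regions whose center point lies inside the cell."""
    cx1, cy1, cx2, cy2 = cell_box
    hits = []
    for i, box in enumerate(region_boxes):
        mx, my = region_center(box)
        if cx1 <= mx <= cx2 and cy1 <= my <= cy2:
            hits.append(i)
    return hits
```

A region that merely pokes into the cell (center outside) is not counted, which is exactly why the center point, rather than any overlap, is used here.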
In one embodiment, adding the region text content of the text region contained in a cell into the cell comprises: setting the attribute value of the text attribute of the cell to the region text content of the text region contained in the cell.
In the implementation process, the text content of the region can be added in the cell by setting the text attribute.
In one embodiment, determining that a cell contains at least two text regions according to the region position information comprises: if there are a plurality of text regions, determining the center point coordinates of each text region according to its region position information, the center point coordinates being the coordinates of the center point of the text region; and if, according to the cell position information of the cell, the center point coordinates of at least two text regions lie within the cell, determining that the cell contains at least two text regions.
In the implementation process, according to the coordinates of the center point of the text area and the area where the cell is located, a plurality of text areas contained in the cell can be quickly determined.
In one embodiment, the center point coordinates include a center point abscissa and a center point ordinate, and sorting the region text contents of the text regions contained in one cell includes: sorting the region text contents of the text regions in order of the center point ordinate from top to bottom (i.e., in ascending order in image coordinates); and, for text regions with the same center point ordinate, sorting their region text contents again in order of the center point abscissa from small to large.
In the implementation process, the text regions in the same cell are sorted according to the horizontal coordinate of the center point and the vertical coordinate of the center point of each text region.
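This reading order falls out of an ordinary lexicographic sort on (ordinate, abscissa) of the center points; in image coordinates the ordinate grows downward, so ascending order is top to bottom. A minimal sketch with illustrative names:

```python
def ordered_cell_text(regions):
    """Join region texts in reading order.

    regions: list of (text, (cx, cy)) center points. Sort top-to-bottom
    (ascending ordinate in image coordinates), then left-to-right for
    regions on the same line.
    """
    ordered = sorted(regions, key=lambda r: (r[1][1], r[1][0]))
    return " ".join(text for text, _ in ordered)
```

Two regions on one line followed by a region on the next line come out in the expected "hello world line2" order.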
In one embodiment, adding the sorted regional text content to a cell comprises: and setting the attribute value of the text attribute of one cell as the text content of the sorted region.
In the implementation process, the text content of the region can be added in the cell by setting the text attribute.
In one aspect, an apparatus for table reconstruction is provided, including:
the identification unit is used for carrying out text identification on the image to be processed to obtain a text region identification result of the image to be processed, wherein the text region identification result comprises region text content and region position information of a text region; the determining unit is used for determining table row coordinates and table column coordinates of the target table according to the region position information; the generating unit is used for generating a blank table according to each table row coordinate and each table column coordinate; and the obtaining unit is used for adding the regional text content into the blank table according to the regional position information to obtain the target table.
In one embodiment, the region location information includes region vertex coordinates of the text region, and the identifying unit is configured to: performing text detection on the image to be processed to obtain a plurality of region vertex coordinates of the text region, wherein the region vertex coordinates are the coordinates of the vertex of the text region; and performing text recognition on the text region to obtain the text content of the region.
In one embodiment, the area vertex coordinates comprise an area vertex abscissa and an area vertex ordinate, and the determination unit is configured to: determining the maximum ordinate and the minimum ordinate in the vertex ordinates of each region; determining the maximum abscissa and the minimum abscissa in the abscissas of the vertexes of the regions; determining the number of first areas of each ordinate and the number of second areas of each abscissa according to the area position information, wherein the number of the first areas is the number of text areas containing a certain ordinate, and the number of the second areas is the number of text areas containing a certain abscissa; determining the row coordinates of each table according to the maximum ordinate, the minimum ordinate and the number of the first areas; and determining the coordinates of each table column according to the maximum abscissa, the minimum abscissa and the number of the second areas.
In one embodiment, the determining unit is configured to: determine trough ordinates according to each ordinate and its corresponding first region number, where the first region number of a trough ordinate is not higher than the first region numbers of its adjacent ordinates, the adjacent ordinates being the ordinate immediately before and the ordinate immediately after the trough ordinate; and obtain each table row coordinate from the maximum ordinate, the minimum ordinate and the trough ordinates.
In one embodiment, the determining unit is configured to: determining a trough abscissa according to each abscissa and the number of corresponding second regions thereof, wherein the number of the second regions of the trough abscissa is not higher than the number of the second regions of adjacent abscissas of the trough abscissa, and the adjacent abscissas of the trough abscissa are a former abscissa and a latter abscissa of the trough abscissa; and obtaining the coordinates of each table column according to the maximum abscissa, the minimum abscissa and the trough abscissa.
In one embodiment, the generating unit is further configured to: determining cell position information of cells in the blank table according to the row coordinates and the column coordinates of each table; determining target cells covered by the text region according to the region position information and the cell position information; wherein a coordinate point which is located in the target cell and is not coincident with the boundary of the target cell exists in the text region; and if the number of the target cells covered by the text area is determined to be multiple, merging the target cells.
In one embodiment, the obtaining unit is configured to perform the following for each cell in the blank table: if it is determined from the region position information that the cell contains only one text region, add the region text content of that text region into the cell; and if it is determined from the region position information that the cell contains at least two text regions, sort the region text contents of the text regions contained in the cell according to the region position information and add the sorted region text contents into the cell.
In one embodiment, the obtaining unit is configured to: determining the coordinates of the central point of the text area according to the area position information, wherein the coordinates of the central point are the coordinates of the central point of the text area; and if the coordinates of the center point of only one text area are determined to be positioned in the cell according to the cell position information of one cell, determining that one cell only contains one text area.
In one embodiment, the obtaining unit is configured to: and setting the attribute value of the text attribute of one cell as the text content of the region in the text region contained in one cell.
In one embodiment, the obtaining unit is configured to: if the number of the text areas is determined to be multiple, respectively determining the center point coordinate of each text area according to the area position information of each text area, wherein the center point coordinate is the coordinate of the center point of the text area; and if the coordinates of the center points of at least two text areas are determined to be positioned in the cell according to the cell position information of the cell, determining that the cell comprises at least two text areas.
In one embodiment, the center point coordinates include a center point abscissa and a center point ordinate, and the obtaining unit is configured to: sequencing the text contents of the regions in each text region contained in one cell according to the sequence of the vertical coordinate of the central point from top to bottom; and sequencing the text contents of the areas in the text areas with the same central point ordinate again according to the sequence of the central point abscissa from small to large.
In one embodiment, the obtaining unit is configured to: and setting the attribute value of the text attribute of one cell as the text content of the sorted region.
In one aspect, an electronic device is provided, comprising a processor and a memory, the memory storing computer-readable instructions which, when executed by the processor, perform the steps of the method provided in any of the various alternative implementations of table reconstruction described above.
In one aspect, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method as provided in any of the various alternative implementations of table reconstruction described above.
In one aspect, a computer program product is provided, which when run on a computer causes the computer to perform the steps of the method as provided in any of the various alternative implementations of table reconstruction described above.
Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a flowchart of a table rebuilding method provided in an embodiment of the present application;
FIG. 2 is an exemplary diagram of a user attribute table image according to an embodiment of the present application;
fig. 3 is a specific flowchart of a method for reconstructing a user attribute table according to an embodiment of the present application;
FIG. 4 is a first exemplary graph of a first curve provided by an embodiment of the present application;
FIG. 5 is a first illustration of a second curve provided by an embodiment of the present application;
FIG. 6 is an exemplary diagram of a merged form image provided by an embodiment of the present application;
fig. 7 is a specific flowchart of a method for rebuilding a merge table according to an embodiment of the present application;
FIG. 8 is a second exemplary graph of a first curve provided by an embodiment of the present application;
FIG. 9 is a second exemplary graph of a second curve provided by an embodiment of the present application;
fig. 10 is a block diagram illustrating a structure of an apparatus for table reconstruction according to an embodiment of the present disclosure;
fig. 11 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not construed as indicating or implying relative importance.
First, some terms referred to in the embodiments of the present application will be described to facilitate understanding by those skilled in the art.
The terminal equipment: may be a mobile terminal, a fixed terminal, or a portable terminal such as a mobile handset, station, unit, device, multimedia computer, multimedia tablet, internet node, communicator, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, personal communication system device, personal navigation device, personal digital assistant, audio/video player, digital camera/camcorder, positioning device, television receiver, radio broadcast receiver, electronic book device, gaming device, or any combination thereof, including the accessories and peripherals of these devices, or any combination thereof. It is also contemplated that the terminal device can support any type of interface to the user (e.g., wearable device), and the like.
A server: may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, big data, and artificial intelligence platforms.
Optical Character Recognition (OCR): it refers to a process in which an electronic device checks characters printed on paper, determines their shapes by detecting dark and light patterns, and then translates the shapes into computer characters by a character recognition method.
In order to reduce the deviation of the reconstructed table when reconstructing a table in a table image, the embodiments of the present application provide a table reconstruction method and an electronic device.
In the embodiment of the present application, the execution subject is an electronic device, and optionally, the electronic device may be a server or a terminal device.
Referring to fig. 1, a flowchart of a table rebuilding method provided in the embodiment of the present application is shown, and a specific implementation flow of the method is as follows:
step 100: and performing text recognition on the image to be processed to obtain a text region recognition result of the image to be processed.
Specifically, when step 100 is executed, the following steps may be adopted:
s1001, text detection is carried out on the image to be processed, and a plurality of region vertex coordinates of the text region are obtained.
In one embodiment, if the text regions are rectangular, performing text detection on the image to be processed to obtain region vertex coordinates of one or more rectangular text regions.
The region vertex coordinates are coordinates of the vertices of the text region. The region vertex coordinates include a region vertex abscissa and a region vertex ordinate. Since the text regions are rectangular, the number of region vertex coordinates of each text region is 4.
In practical applications, the text region may also have other shapes, such as an arbitrary (non-rectangular) quadrilateral, and this is not limited herein.
S1002: perform text recognition on the text region to obtain the region text content.
In one embodiment, OCR technology is adopted to perform text recognition on the text area, and the text content of the area in the text area is obtained.
The text content of the region is the information identified in the text region. The regional text content can include at least one of: text, formula, and date.
In practical applications, the regional text content may also be other types of information, and is not limited herein.
S1003: obtain a text region recognition result of the image to be processed based on the region vertex coordinates and the region text content of each text region.
The text area recognition result includes area text content and area position information of the text area. The region position information includes region vertex coordinates of the text region.
In one embodiment, the region vertex coordinates are used as region position information in the text region identification result.
Step 101: and determining the coordinates of each table row and the coordinates of each table column of the target table according to the area position information in the text area identification result.
Specifically, when step 101 is executed, the following steps may be adopted:
and S1011, determining the maximum ordinate and the minimum ordinate in the vertex ordinates of each area.
In one embodiment, if there are a plurality of text regions, the maximum ordinate and the minimum ordinate are determined among the region vertex ordinates of all the text regions.
And S1012, determining the maximum abscissa and the minimum abscissa of the abscissas of the vertexes of the regions.
In one embodiment, if there are a plurality of text regions, the maximum abscissa and the minimum abscissa are determined among the region vertex abscissas of all the text regions.
Thus, the boundaries of the target table can be determined by the maximum ordinate, the minimum ordinate, the maximum abscissa, and the minimum abscissa.
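A brief sketch of S1011/S1012, assuming the {'content', 'location'} format quoted later in this document: the table boundary is simply the bounding box of all region vertex coordinates (only two of the fig.-2 regions are used here for brevity).

```python
regions = [
    {'content': 'Salary', 'location': [245, 40, 245, 21, 288, 22, 287, 41]},
    {'content': 'Number', 'location': [37, 39, 37, 23, 90, 23, 90, 39]},
]
# even indices of 'location' are region vertex abscissas, odd indices ordinates
all_x = [v for r in regions for v in r['location'][0::2]]
all_y = [v for r in regions for v in r['location'][1::2]]
table_left, table_right = min(all_x), max(all_x)   # minimum / maximum abscissa
table_bottom, table_top = min(all_y), max(all_y)   # minimum / maximum ordinate
print(table_left, table_right, table_bottom, table_top)  # 37 288 21 41
```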
And S1013, determining the first area number of each ordinate and the second area number of each abscissa according to the area position information.
The first region number is the number of text regions containing a certain ordinate, and the second region number is the number of text regions containing a certain abscissa. Note that a text region contains a certain ordinate if there is a coordinate point in the text region whose ordinate equals that value, and contains a certain abscissa if there is a coordinate point in the text region whose abscissa equals that value.
In one embodiment, an abscissa section and an ordinate section of a text region are determined based on region position information of the text region, the text region is determined to include a certain ordinate if it is determined that the certain ordinate is located within the ordinate section of the text region, and the text region is determined to include a certain abscissa if it is determined that the certain abscissa is located within the abscissa section of the text region.
As an example, taking any one of the text regions as an example, the abscissa section of a certain text region is determined according to the maximum value and the minimum value of the abscissas in the vertex coordinates of the 4 regions of the text region, and the ordinate section of a certain text region is determined according to the maximum value and the minimum value of the ordinates in the vertex coordinates of the 4 regions of the text region.
In this way, a first region number of text regions through which each ordinate line (the ordinate of each coordinate point in the ordinate line is the same) passes and a second region number of text regions through which each abscissa line (the abscissa of each coordinate point in the abscissa line is the same) passes can be determined.
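The counts of S1013 form a projection profile. A hedged sketch (the helper `region_counts` is not from the patent): for every integer coordinate between the boundaries, count the text regions whose section contains it.

```python
def region_counts(sections, lo, hi):
    """sections: list of (min, max) coordinate sections of the text regions.
    Returns, for each integer coordinate c in lo..hi, the number of sections
    containing c (the first or second region number of c)."""
    return [sum(a <= c <= b for a, b in sections) for c in range(lo, hi + 1)]

# ordinate sections of two text bands, e.g. [21, 41] and [66, 84] from fig. 2
counts = region_counts([(21, 41), (66, 84)], 21, 84)
print(counts[21 - 21], counts[41 - 21], counts[50 - 21])  # 1 1 0 (y = 50 lies in the gap)
```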
And S1014, determining each table row coordinate of the target table according to the maximum ordinate, the minimum ordinate and the first area number.
Specifically, according to each vertical coordinate and the number of the corresponding first areas, a trough vertical coordinate meeting the trough vertical coordinate condition is determined, and according to the maximum vertical coordinate, the minimum vertical coordinate and the trough vertical coordinate, the row coordinate of each table is obtained.
Wherein the trough ordinate condition is: the first region number of the trough ordinate is not higher than the first region numbers of its adjacent ordinates, the adjacent ordinates being the ordinate immediately before and the ordinate immediately after the trough ordinate.
In one embodiment, obtaining the table row coordinates based on the maximum ordinate, the minimum ordinate, and the trough ordinates comprises: taking the maximum ordinate and the minimum ordinate as table row coordinates; generating trough ordinate intervals from the trough ordinates (every ordinate within a trough ordinate interval is a trough ordinate); retaining the trough ordinate intervals that contain neither the maximum ordinate nor the minimum ordinate; and, for each retained trough ordinate interval, taking its trough ordinate as a table row coordinate if the interval contains only one trough ordinate, or selecting one trough ordinate from the interval as a table row coordinate if it contains several. Optionally, a trough ordinate may be selected at random from the interval. In practical applications, the specific selection may be set according to the actual application scenario, and is not limited herein.
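The trough search described above can be sketched as follows (the same routine serves S1015 with abscissas). Choosing the midpoint of a multi-trough interval is one of the selection strategies the text leaves open; the function name is hypothetical.

```python
def grid_coords(counts, lo, hi):
    """counts[i] is the region number of coordinate lo + i; returns table
    row (or column) coordinates: the boundaries plus one pick per trough interval."""
    n = len(counts)
    # trough condition: count not higher than both adjacent coordinates' counts
    troughs = [lo + i for i in range(1, n - 1)
               if counts[i] <= counts[i - 1] and counts[i] <= counts[i + 1]]
    intervals, cur = [], []
    for t in troughs:                 # group consecutive troughs into intervals
        if cur and t != cur[-1] + 1:
            intervals.append(cur)
            cur = []
        cur.append(t)
    if cur:
        intervals.append(cur)
    # keep intervals containing neither boundary; pick one trough per interval
    picks = [iv[len(iv) // 2] for iv in intervals if lo not in iv and hi not in iv]
    return [lo] + picks + [hi]

# two text bands separated by a gap at coordinates 2-3
print(grid_coords([2, 1, 0, 0, 1, 2], 0, 5))  # [0, 3, 5]
```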
And S1015, determining the coordinates of each table column according to the maximum abscissa, the minimum abscissa and the number of the second areas.
Specifically, according to each abscissa and the number of second regions corresponding to the abscissa, a trough abscissa meeting a trough abscissa condition is determined, and according to the maximum abscissa, the minimum abscissa, and the trough abscissa, the table column coordinates are obtained.
Wherein the trough abscissa condition is: the second region number of the trough abscissa is not higher than the second region numbers of its adjacent abscissas, the adjacent abscissas being the abscissa immediately before and the abscissa immediately after the trough abscissa.
In one embodiment, obtaining respective table column coordinates based on the maximum abscissa, the minimum abscissa, and the trough abscissa comprises:
taking the maximum horizontal coordinate and the minimum horizontal coordinate as table column coordinates; generating a trough abscissa interval according to each trough abscissa (each abscissa in the trough abscissa interval is a trough abscissa); screening a trough abscissa interval not containing the maximum abscissa and a trough abscissa interval not containing the minimum abscissa; and in the screened wave trough abscissa intervals, if the wave trough abscissa intervals are determined to only contain one wave trough abscissa, taking the wave trough abscissa as a table column coordinate, and if the wave trough abscissa intervals are determined to contain a plurality of wave trough abscissas, selecting one wave trough abscissa from the wave trough abscissa intervals as a table column coordinate. Optionally, one trough abscissa may be selected from the trough abscissa intervals as a table column coordinate. In practical application, the specific mode may be set according to a practical application scenario, and is not limited herein.
Step 102: and generating a blank table according to the table row coordinates and the table column coordinates.
Furthermore, the target cells covered by the text area in the blank table can be merged. The target cells covered by the text area meet the following conditions: there are coordinate points in the text region that are located within the target cell and that do not coincide with the boundary of the target cell.
For example, if the coordinate point b in the text area a is located within the cell C and does not coincide with the boundary of the cell C, the cell C is determined to be the target cell covered by the text area a.
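For axis-aligned rectangles, the coverage condition above (a point of the region lies inside the cell without coinciding with its boundary) reduces to an overlap test on open intervals. A sketch under that assumption:

```python
def covers(region_box, cell_box):
    """Boxes are (left, bottom, right, top); True if the open interiors overlap,
    i.e. some point of the region lies strictly inside the cell."""
    rl, rb, rr, rt = region_box
    cl, cb, cr, ct = cell_box
    return max(rl, cl) < min(rr, cr) and max(rb, cb) < min(rt, ct)

print(covers((10, 10, 50, 30), (40, 0, 80, 40)))  # True: interiors overlap
print(covers((10, 10, 50, 30), (50, 0, 80, 40)))  # False: touches only the boundary
```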
When merging the target cells covered by the text area in the blank table, the method may include:
s1021: and determining the cell position information of the cells in the blank table according to the row coordinates and the column coordinates of each table.
In one embodiment, the cells in the blank table are rectangles, and the cell position information of the cells includes four cell vertex coordinates of the cells. The cell vertex coordinates are coordinates of the vertices of the cells. The cell vertex coordinates include a cell vertex ordinate and a cell vertex abscissa.
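A sketch of how the blank-table cells and their position information can be derived from the table row and column coordinates; the (row, col)-indexed dictionary and the left/bottom/right/top bound tuple are illustrative simplifications of the four cell vertex coordinates.

```python
def blank_table_cells(row_coords, col_coords):
    """Adjacent row and column coordinates bound the rectangular cells."""
    cells = {}
    for i in range(len(row_coords) - 1):
        for j in range(len(col_coords) - 1):
            cells[(i, j)] = (col_coords[j], row_coords[i],
                             col_coords[j + 1], row_coords[i + 1])
    return cells

cells = blank_table_cells([0, 10, 20], [0, 5, 15])
print(len(cells))      # 4 cells in a 2x2 blank table
print(cells[(0, 1)])   # (5, 0, 15, 10)
```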
S1022: and determining the target cell covered by the text area according to the area position information and the cell position information.
S1023: and if the number of the target cells covered by the text area is determined to be multiple, merging the target cells.
In one embodiment, if a plurality of text regions are provided, for a target text region in each text region (the target text region is any one of the text regions), if a plurality of target cells covered by the target text region are determined, the target cells covered by the target text region are merged.
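When a target text region covers several cells, the merge of S1023 can be sketched as taking the bounding box of the covered cell indices, as in a spreadsheet merge (the helper name is hypothetical):

```python
def merged_span(covered_indices):
    """covered_indices: (row, col) indices of the cells covered by one text
    region; returns the top-left and bottom-right indices of the merged cell."""
    rows = [r for r, _ in covered_indices]
    cols = [c for _, c in covered_indices]
    return (min(rows), min(cols)), (max(rows), max(cols))

# a region covering cells (1, 0) and (1, 1) merges row 1, columns 0-1
print(merged_span([(1, 0), (1, 1)]))  # ((1, 0), (1, 1))
```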
Step 103: and adding the regional text content in the text region identification result into the blank table according to the regional position information to obtain the target table.
Specifically, the following steps are executed for each cell in the blank table respectively:
s1031: and if the unit cell only contains one text area according to the area position information, adding the area text content in the text area contained in the unit cell into one unit cell.
One cell contains one text region, which means that all coordinate points in the text region are located in the cell, i.e., the coverage area of the cell is not smaller than the text region.
In one embodiment, determining that a cell contains only one text region according to the region position information may include: determining the center point coordinates of each text region according to the region position information, and, if it is determined from the cell position information of the cell that the center point coordinates of exactly one text region are located within the cell, determining that the cell contains only one text region. The center point coordinates are the coordinates of the center point of the text region, and include a center point abscissa and a center point ordinate. This works because, in step 102, the cells covered by a text region spanning multiple cells have been merged, so each text region can be located in only one cell, and the cell containing a text region can therefore be determined from the coordinates of its center point alone.
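The center-point test can be sketched as follows; taking the mean of the four region vertex coordinates is one natural reading of "the coordinates of the center point of the text region", and both helper names are hypothetical.

```python
def center_point(location):
    """location: 8 numbers, four (x, y) region vertex coordinates."""
    xs, ys = location[0::2], location[1::2]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def containing_cell(center, cells):
    """cells: {(row, col): (left, bottom, right, top)}; returns the index of
    the cell containing the center point, or None."""
    cx, cy = center
    for idx, (l, b, r, t) in cells.items():
        if l <= cx <= r and b <= cy <= t:
            return idx
    return None

c = center_point([245, 40, 245, 21, 288, 22, 287, 41])   # the 'Salary' region
print(c)                                                 # (266.25, 31.0)
print(containing_cell(c, {(0, 2): (200, 0, 300, 60)}))   # (0, 2)
```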
In one embodiment, adding the region text content of the text region contained in a cell to the cell may include: setting the attribute value of the text attribute of the cell to the region text content of the text region contained in the cell.
S1032: and if the unit cell is determined to contain at least two text regions according to the region position information, sequencing the region text contents in each text region contained in the unit cell according to the region position information, and adding the sequenced region text contents into one unit cell.
In one embodiment, determining that a cell contains at least two text regions according to the region location information may include: and if the number of the text areas is determined to be multiple, respectively determining the center point coordinates of each text area according to the area position information of each text area. And if the coordinates of the center points of at least two text areas are determined to be positioned in the cell according to the cell position information of the cell, determining that the cell contains at least two text areas.
In one embodiment, sorting the region text contents of the text regions contained in a cell may include: sorting the region text contents in ascending order of the center point ordinate (the ordinate grows downward in image coordinates, so upper regions come first); and sorting the region text contents of the text regions whose center point ordinates are equal again in ascending order of the center point abscissa.
In one embodiment, adding the ordered text content of the region into one cell may include: and setting the attribute value of the text attribute of one cell as the text content of the sorted region.
This is because, if a cell holds several lines of text, OCR technology recognizes the content as multiple text regions; for example, text spanning a line break is recognized as two text regions. The region text contents of the regions within the same cell therefore need to be sorted.
In practical application, the text contents of each region may be sorted according to a practical application scenario, which is not limited herein.
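A sketch of the ordering of S1032, assuming image coordinates in which the ordinate grows downward (consistent with the "Han"/"Meimei" example later in this document) and assuming, for illustration only, that the sorted contents are joined with a space:

```python
def ordered_content(regions_in_cell):
    """regions_in_cell: list of (content, (center_x, center_y)) pairs.
    Sort by center ordinate (top first), then by center abscissa (left first)."""
    return ' '.join(content for content, _ in
                    sorted(regions_in_cell, key=lambda r: (r[1][1], r[1][0])))

# center points computed from the fig.-2 'Han' and 'Meimei' vertex coordinates
cell_regions = [('Meimei', (165.5, 127.0)), ('Han', (165.0, 110.0))]
print(ordered_content(cell_regions))  # Han Meimei
```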
When a cell is empty, since the cell does not contain a text region, it is not necessary to perform other processing on the empty cell.
Thus, the target table corresponding to the image to be processed can be generated.
In the embodiment of the present application, text recognition technology is used to obtain the position and content of each text region in the image to be processed, which reduces the amount of development. Furthermore, both framed and frameless tables in the image to be processed can be reconstructed, which improves the accuracy of table reconstruction; merged cells and cells containing the text content of several regions can also be recognized accurately, further improving the accuracy of table reconstruction.
The above embodiments are illustrated below using a specific application scenario. Fig. 2 is a diagram showing an example of a user attribute form image. Fig. 2 shows a user attribute table image containing a plurality of pieces of user attribute information, which is an image to be processed requiring table reconstruction. Referring to fig. 3, a specific flowchart of a method for reconstructing a user attribute table is shown, where the method shown in fig. 3 is used to reconstruct the user attribute table in the user attribute table image shown in fig. 2, and a specific implementation flow of the method is as follows:
step 300: and performing text recognition on the user attribute form image to obtain a text region recognition result of the user attribute form image.
Specifically, OCR technology is used to perform text detection and text recognition on the user attribute table image shown in fig. 2, and obtain the regional text content and the regional position information of 34 text regions. The region position information includes region vertex coordinates of the text region, i.e., first region vertex coordinates, second region vertex coordinates, third region vertex coordinates, and fourth region vertex coordinates.
In one embodiment, the output format of each identified region is {'content': <region text content>, 'location': [<first region vertex coordinates>, <second region vertex coordinates>, <third region vertex coordinates>, <fourth region vertex coordinates>]}. The text region recognition result of the user attribute form image in fig. 2 is:
{'content':'Salary','location':[245,40,245,21,288,22,287,41]};
{'content':'WokrAge','location':[438,40,439,22,499,23,499,40]};
{'content':'Name','location':[144,39,144,22,185,23,185,40]};
{'content':'Number','location':[37,39,37,23,90,23,90,39]};
{'content':'Bonus','location':[347,39,347,23,389,23,389,39]};
{'content':'1000','location':[249,84,249,66,284,66,284,84]};
{'content':'LiLei','location':[149,83,149,66,182,66,182,83]};
{'content':'1','location':[56,84,56,66,71,66,71,84]};
{'content':'10','location':[358,83,358,67,378,67,378,83]};
{'content':'10','location':[459,83,459,67,479,67,479,83]};
{'content':'Han','location':[150,118,150,102,180,102,180,118]};
{'content':'20','location':[459,127,459,110,479,110,479,127]};
{'content':'2','location':[56,127,56,110,71,110,71,127]};
{'content':'3000','location':[249,127,249,111,284,111,284,127]};
{'content':'90','location':[358,127,358,111,378,111,378,127]};
{'content':'Meimei','location':[141,135,141,119,190,119,190,135]};
{'content':'Weiyin','location':[142,163,142,146,188,146,188,163]};
{'content':'5000','location':[249,171,249,154,284,154,284,171]};
{'content':'888','location':[354,171,354,154,381,154,381,171]};
{'content':'30','location':[459,171,459,154,479,154,479,171]};
{'content':'3','location':[56,171,56,154,71,154,71,171]};
{'content':'Wang','location':[143,179,143,160,187,162,186,181]};
{'content':'Fangzheng','location':[129,207,129,189,200,190,200,207]};
{'content':'8000','location':[249,215,249,198,284,198,284,215]};
{'content':'40','location':[458,215,458,198,480,198,480,215]};
{'content':'4','location':[56,216,56,198,71,198,71,216]};
{'content':'303','location':[354,215,354,199,381,199,381,215]};
{'content':'Dashi','location':[145,224,145,206,184,205,184,223]};
{'content':'Chongxu','location':[135,251,136,234,195,235,194,252]};
{'content':'10000','location':[246,259,246,242,287,242,287,259]};
{'content':'400','location':[354,258,354,242,382,242,382,258]};
{'content':'50','location':[459,259,459,242,479,242,479,259]};
{'content':'5','location':[56,259,56,242,71,242,71,259]};
{'content':'Daozhang','location':[132,267,133,249,198,250,197,268]}.
step 301: and determining the maximum ordinate and the minimum ordinate of the ordinate of each region vertex in the text region identification result.
Specifically, the maximum ordinate table_top = 268 and the minimum ordinate table_bottom = 21 are determined from the region vertex ordinates of the text regions in fig. 2.
Step 302: and determining the maximum abscissa and the minimum abscissa in the abscissas of the vertexes of the regions in the text region identification result.
Specifically, the maximum abscissa table_right = 499 and the minimum abscissa table_left = 37 are determined from the region vertex abscissas of the text regions in fig. 2.
Step 303: and determining the first area number of each ordinate and the second area number of each abscissa according to the area position information in the text area identification result.
Specifically, for a target ordinate in [table_bottom, table_top] (i.e., [21, 268]; the target ordinate is any ordinate in this interval), the number of text regions through which the target ordinate line passes (the ordinate of each coordinate point on the target ordinate line is the target ordinate) is determined, giving the first region number of the target ordinate. For a target abscissa in [table_left, table_right] (i.e., [37, 499]; the target abscissa is any abscissa in this interval), the number of text regions through which the target abscissa line passes (the abscissa of each coordinate point on the target abscissa line is the target abscissa) is determined, giving the second region number of the target abscissa.
Step 304: and determining each table row coordinate of the target table according to the maximum ordinate, the minimum ordinate and the first area number.
Referring to fig. 4, a first example graph of the first curve is shown. In one embodiment, 21 and 268 are determined as table row coordinates, a first curve as shown in fig. 4 is generated based on the first region number of each ordinate, and a plurality of table row coordinates, in order [67, 119, 172, 223, 259], are determined from the first curve in fig. 4.
It should be noted that the points on the first curve shown in fig. 4 are continuous, and are only used for illustrating the corresponding relationship between each ordinate and the number of the first areas. In practical applications, the respective ordinate may also be discontinuous (the ordinate is usually obtained by sampling), and is not limited herein.
Step 305: and determining the coordinates of each table column according to the maximum abscissa, the minimum abscissa and the number of the second areas.
Referring to fig. 5, a first example graph of the second curve is shown. In one embodiment, 37 and 499 are determined as table column coordinates, a second curve as shown in fig. 5 is generated based on the second region number of each abscissa, and a plurality of table column coordinates, in order [139, 325, 456, 621], are determined from the second curve in fig. 5.
It should be noted that the points on the second curve shown in fig. 5 are continuous, and are only used for illustrating the correspondence relationship between each abscissa and the number of second areas. In practical applications, the abscissa may also be discontinuous (the abscissa is usually obtained by sampling), and is not limited herein.
Step 306: and generating a blank table according to the table row coordinates and the table column coordinates.
Step 307: and adding the regional text content in the text region identification result into the blank table according to the regional position information to obtain the target table.
For example, in fig. 2, the second column lies between the trough coordinates [139, 325] and the second row lies between the trough coordinates [67, 119]; based on the region position information of "Han" and "Meimei", it is determined that both "Han" and "Meimei" are located in the second row of the second column and that "Han" is located above "Meimei".
In this way, the user attribute table in the user attribute table image shown in fig. 2 can be reconstructed.
The above embodiment is illustrated below using another specific application scenario. Referring to FIG. 6, an exemplary diagram of a merged form image is shown. The merged table image is an image to be processed requiring table reconstruction. Referring to fig. 7, a specific flowchart of a method for reconstructing a merged table is shown, in which the method shown in fig. 7 is used to reconstruct the merged table in the merged table image shown in fig. 6, and the specific implementation flow of the method is as follows:
step 700: and performing text recognition on the combined form image to obtain a text region recognition result of the user attribute form image.
Specifically, OCR technology is used to perform text detection and text recognition on the merged form image shown in fig. 6, and obtain the regional text content and the regional position information of 8 text regions. The region position information includes region vertex coordinates of the text region, i.e., first region vertex coordinates, second region vertex coordinates, third region vertex coordinates, and fourth region vertex coordinates.
In one embodiment, the output format of each identified region is {'content': <region text content>, 'location': [<first region vertex coordinates>, <second region vertex coordinates>, <third region vertex coordinates>, <fourth region vertex coordinates>]}. The text region recognition result of the merged form image in fig. 6 is:
{'content':'500','location':[642,137,642,84,728,84,728,137]};
{'content':'1000','location':[481,136,480,94,583,92,584,133]};
{'content':'Lilei','location':[286,131,286,102,366,102,366,131]};
{'content':'500','location':[641,193,642,140,728,142,727,195]};
{'content':'2000','location':[477,193,477,142,586,141,586,192]};
{'content':'HanMeimel','location':[213,190,213,157,437,157,437,190]};
{'content':'3000','location':[555,250,553,202,661,198,663,246]};
{'content':'ChongXu','location':[246,247,245,217,413,216,413,246]}.
step 701: and determining the maximum ordinate and the minimum ordinate of the ordinate of each region vertex in the text region identification result.
Specifically, the maximum ordinate table_top = 250 and the minimum ordinate table_bottom = 84 are determined from the region vertex ordinates of the text regions in fig. 6.
Step 702: and determining the maximum abscissa and the minimum abscissa in the abscissas of the vertexes of the regions in the text region identification result.
Specifically, the maximum abscissa table_right = 728 and the minimum abscissa table_left = 213 are determined from the region vertex abscissas of the text regions in fig. 6.
Step 703: and determining the first area number of each ordinate and the second area number of each abscissa according to the area position information in the text area identification result.
Step 704: and determining each table row coordinate of the target table according to the maximum ordinate, the minimum ordinate and the first area number.
Fig. 8 shows a second example graph of the first curve. Fig. 8 is merely used to exemplify the correspondence between the respective ordinates and the first region numbers. In one embodiment, 84 and 250 are determined as table row coordinates, a first curve as shown in fig. 8 is generated based on the first region number of each ordinate, and a plurality of table row coordinates, in order [137, 195], are determined from the first curve in fig. 8.
Step 705: and determining the coordinates of each table column according to the maximum abscissa, the minimum abscissa and the number of the second areas.
Fig. 9 shows a second example graph of the second curve. Fig. 9 is merely used to exemplify the correspondence between the respective abscissas and the second region numbers. In one embodiment, 213 and 728 are determined as table column coordinates, a second curve as shown in fig. 9 is generated based on the second region number of each abscissa, and a plurality of table column coordinates, in order [437, 586], are determined from the second curve in fig. 9.
Step 706: and generating a blank table according to the table row coordinates and the table column coordinates.
Step 707: and adding the regional text content in the text region identification result into the blank table according to the regional position information to obtain the target table.
Based on the same inventive concept, the embodiment of the present application further provides a table reconstruction apparatus, and since the principle of the apparatus and the device for solving the problem is similar to that of a table reconstruction method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 10, which is a schematic structural diagram of an apparatus for reconstructing a table according to an embodiment of the present application, including:
the identification unit 1001 is configured to perform text identification on an image to be processed to obtain a text region identification result of the image to be processed, where the text region identification result includes region text content of a text region and region position information;
a determining unit 1002, configured to determine, according to the area location information, table row coordinates and table column coordinates of the target table;
a generating unit 1003, configured to generate a blank table according to each table row coordinate and each table column coordinate;
an obtaining unit 1004, configured to add the regional text content to the blank table according to the regional position information, and obtain the target table.
In one embodiment, the region location information includes region vertex coordinates of the text region, and the identifying unit 1001 is configured to: performing text detection on the image to be processed to obtain a plurality of region vertex coordinates of the text region, wherein the region vertex coordinates are the coordinates of the vertex of the text region; and performing text recognition on the text region to obtain the text content of the region.
In one embodiment, the area vertex coordinates include an area vertex abscissa and an area vertex ordinate, and the determining unit 1002 is configured to: determining the maximum ordinate and the minimum ordinate in the vertex ordinates of each region; determining the maximum abscissa and the minimum abscissa in the abscissas of the vertexes of the regions; determining the number of first areas of each ordinate and the number of second areas of each abscissa according to the area position information, wherein the number of the first areas is the number of text areas containing a certain ordinate, and the number of the second areas is the number of text areas containing a certain abscissa; determining the row coordinates of each table according to the maximum ordinate, the minimum ordinate and the number of the first areas; and determining the coordinates of each table column according to the maximum abscissa, the minimum abscissa and the number of the second areas.
In one embodiment, the determining unit 1002 is configured to: determining a longitudinal coordinate of the wave trough according to each longitudinal coordinate and the corresponding first region number of the longitudinal coordinate of the wave trough, wherein the first region number of the longitudinal coordinate of the wave trough is not higher than the first region number of the adjacent longitudinal coordinate of the wave trough, and the adjacent longitudinal coordinates of the longitudinal coordinate of the wave trough are the former longitudinal coordinate and the latter longitudinal coordinate of the wave trough; and obtaining the row coordinate of each table according to the maximum ordinate, the minimum ordinate and the trough ordinate.
In one embodiment, the determining unit 1002 is configured to: determining a trough abscissa according to each abscissa and the number of corresponding second regions thereof, wherein the number of the second regions of the trough abscissa is not higher than the number of the second regions of adjacent abscissas of the trough abscissa, and the adjacent abscissas of the trough abscissa are a former abscissa and a latter abscissa of the trough abscissa; and obtaining the coordinates of each table column according to the maximum abscissa, the minimum abscissa and the trough abscissa.
In one embodiment, the generating unit 1003 is further configured to: determining cell position information of cells in the blank table according to the row coordinates and the column coordinates of each table; determining target cells covered by the text region according to the region position information and the cell position information; wherein a coordinate point which is located in the target cell and is not coincident with the boundary of the target cell exists in the text region; if a plurality of target cells covered by the text area are determined, merging the target cells.
In one embodiment, the obtaining unit 1004 is configured to: respectively aiming at each cell in the blank table, executing the following steps: if the unit cell only contains one text area according to the area position information, adding the area text content in the text area contained in the unit cell into one unit cell; and if the unit cell is determined to contain at least two text regions according to the region position information, sequencing the region text contents in each text region contained in the unit cell according to the region position information, and adding the sequenced region text contents into one unit cell.
In one embodiment, the obtaining unit 1004 is configured to: determining the coordinates of the center point of the text region according to the region position information, wherein the coordinates of the center point are the coordinates of the center point of the text region; and if the coordinates of the center point of only one text area are determined to be positioned in the cell according to the cell position information of one cell, determining that one cell only contains one text area.
In one embodiment, the obtaining unit 1004 is configured to: the attribute value of the text attribute of one cell is set as the regional text content in the text region contained in one cell.
In one embodiment, the obtaining unit 1004 is configured to: if a plurality of text regions is determined, determine the center point coordinates of each text region according to its region position information, the center point coordinates being the coordinates of the center point of the text region; and, if the center point coordinates of at least two text regions are determined to lie within a cell according to the cell position information of that cell, determine that the cell contains at least two text regions.
In one embodiment, the center point coordinates include a center point abscissa and a center point ordinate, and the obtaining unit 1004 is configured to: sort the region text contents of the text regions contained in a cell in descending order of the center point ordinate; and re-sort, in ascending order of the center point abscissa, the region text contents of text regions having the same center point ordinate.
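The two-pass ordering above, descending center-point ordinate first and then ascending abscissa among equal ordinates, collapses into a single stable sort with a composite key. A hedged sketch, with a hypothetical tuple layout `(center_x, center_y, text)` and helper names not taken from the patent:

```python
def reading_order(items):
    """Sort (center_x, center_y, text) tuples by ordinate descending,
    then by abscissa ascending among equal ordinates."""
    return sorted(items, key=lambda t: (-t[1], t[0]))

def cell_text(items, sep=" "):
    # Concatenate the sorted region text contents for a single cell.
    return sep.join(t[2] for t in reading_order(items))
```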
In one embodiment, the obtaining unit 1004 is configured to set the attribute value of the text attribute of a cell to the sorted region text contents.
Fig. 11 shows a schematic structural diagram of an electronic device 1100. Referring to fig. 11, the electronic device 1100 includes a processor 1110 and a memory 1120, and may further include a power source 1130, a display unit 1140, and an input unit 1150.
The processor 1110 is the control center of the electronic device 1100; it connects the various components using various interfaces and lines, and performs the various functions of the electronic device 1100 by running or executing software programs and/or data stored in the memory 1120, thereby monitoring the electronic device 1100 as a whole.
In the present embodiment, the processor 1110 executes the steps in the above embodiments when calling the computer program stored in the memory 1120.
Optionally, the processor 1110 may include one or more processing units. Preferably, the processor 1110 may integrate an application processor, which mainly handles the operating system, user interfaces, applications, and the like, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 1110. In some embodiments, the processor and the memory may be implemented on a single chip; in other embodiments, they may be implemented on separate chips.
The memory 1120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, various applications, and the like, and the data storage area may store data created according to the use of the electronic device 1100, and the like. Further, the memory 1120 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The electronic device 1100 further includes a power supply 1130 (e.g., a battery) that provides power to the various components. The power supply 1130 may be logically coupled to the processor 1110 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
The display unit 1140 may be used to display information input by a user or provided to the user, the various menus of the electronic device 1100, and the like. In the embodiment of the present invention, the display unit is mainly used to display the display interface of each application in the electronic device 1100 and objects such as text and pictures shown in the display interface. The display unit 1140 may include a display panel 1141. The display panel 1141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The input unit 1150 may be used to receive information such as numbers or characters input by a user. The input unit 1150 may include a touch panel 1151 and other input devices 1152. The touch panel 1151, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations performed using any suitable object or accessory such as a finger or a stylus).
Specifically, the touch panel 1151 may detect a touch operation of a user, detect signals generated by the touch operation, convert the signals into touch point coordinates, transmit the touch point coordinates to the processor 1110, receive a command transmitted from the processor 1110, and execute the command. In addition, the touch panel 1151 can be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. Other input devices 1152 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, power on and off keys, etc.), a trackball, a mouse, a joystick, and the like.
Of course, the touch panel 1151 can cover the display panel 1141; when the touch panel 1151 detects a touch operation on or near it, the touch operation is transmitted to the processor 1110 to determine the type of the touch event, and the processor 1110 then provides a corresponding visual output on the display panel 1141 according to the type of the touch event. Although in fig. 11 the touch panel 1151 and the display panel 1141 are two separate components implementing the input and output functions of the electronic device 1100, in some embodiments the touch panel 1151 and the display panel 1141 may be integrated to implement the input and output functions of the electronic device 1100.
The electronic device 1100 may also include one or more sensors, such as pressure sensors, gravitational acceleration sensors, proximity light sensors, and the like. Of course, the electronic device 1100 may also include other components such as a camera, which are not shown in fig. 11 and will not be described in detail since they are not components that are used in the embodiments of the present application.
Those skilled in the art will appreciate that fig. 11 is merely an example of an electronic device and is not intended to limit the electronic device and may include more or fewer components than those shown, or some components may be combined, or different components.
In an embodiment of the present application, a computer-readable storage medium stores a computer program which, when executed by a processor, causes the electronic device to perform the steps of the above embodiments.
For convenience of description, the above parts are separately described as modules (or units) according to functional division. Of course, the functionality of the various modules (or units) may be implemented in the same one or more pieces of software or hardware when the application is implemented.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (13)

1. A method of table reconstruction, comprising:
performing text recognition on an image to be processed to obtain a text region recognition result of the image to be processed, wherein the text region recognition result comprises region text content and region position information of a text region;
determining table row coordinates and table column coordinates of the target table according to the region position information;
generating a blank table according to the table row coordinates and the table column coordinates;
and adding the regional text content into the blank table according to the regional position information to obtain the target table.
2. The method according to claim 1, wherein the region position information includes region vertex coordinates of the text region, and the performing text recognition on the image to be processed to obtain a text region recognition result of the image to be processed includes:
performing text detection on the image to be processed to obtain a plurality of region vertex coordinates of the text region, wherein the region vertex coordinates are coordinates of a vertex of the text region;
and performing text recognition on the text region to obtain text content of the region.
3. The method of claim 2, wherein the region vertex coordinates include a region vertex abscissa and a region vertex ordinate, and wherein determining respective table row coordinates and respective table column coordinates of the target table from the region location information comprises:
determining the maximum ordinate and the minimum ordinate in the vertex ordinates of each region;
determining the maximum abscissa and the minimum abscissa of the abscissas of the vertexes of the regions;
determining a first region number of each ordinate and a second region number of each abscissa according to the region position information, wherein the first region number is the number of text regions containing a given ordinate, and the second region number is the number of text regions containing a given abscissa;
determining each table row coordinate according to the maximum ordinate, the minimum ordinate and the first region number;
and determining each table column coordinate according to the maximum abscissa, the minimum abscissa and the second region number.
4. The method of claim 3, wherein said determining respective table row coordinates according to said maximum ordinate, said minimum ordinate, and said first region number comprises:
determining a trough ordinate according to each ordinate and its corresponding first region number, wherein the first region number of the trough ordinate is not higher than the first region number of either adjacent ordinate, the adjacent ordinates of the trough ordinate being the ordinate immediately before and the ordinate immediately after the trough ordinate;
and obtaining each table row coordinate according to the maximum ordinate, the minimum ordinate and the trough ordinate.
5. The method of claim 3, wherein said determining respective table column coordinates according to said maximum abscissa, said minimum abscissa, and said second region number comprises:
determining a trough abscissa according to each abscissa and its corresponding second region number, wherein the second region number of the trough abscissa is not higher than the second region number of either adjacent abscissa, the adjacent abscissas of the trough abscissa being the abscissa immediately before and the abscissa immediately after the trough abscissa;
and obtaining each table column coordinate according to the maximum abscissa, the minimum abscissa and the trough abscissa.
6. The method of any of claims 1-5, wherein after the generating a blank table from the respective table row coordinates and the respective table column coordinates, the method further comprises:
determining cell position information of cells in the blank table according to the table row coordinates and the table column coordinates;
determining a target cell covered by the text region according to the region position information and the cell position information; wherein there is a coordinate point in the text region that is within the target cell and that does not coincide with a boundary of the target cell;
and if the target cells covered by the text area are determined to be a plurality of cells, merging the target cells.
7. The method according to any one of claims 1-5, wherein the adding the regional text content to the blank table according to the regional location information to obtain the target table comprises:
respectively aiming at each cell in the blank table, executing the following steps:
if it is determined that one cell only contains one text region according to the region position information, adding region text content in the text region contained in the cell into the cell;
if it is determined that one cell contains at least two text regions according to the region position information, sorting the region text contents in each text region contained in the cell according to the region position information, and adding the sorted region text contents into the cell.
8. The method of claim 7, wherein determining that a cell contains only one text region based on the region location information comprises:
determining the coordinates of the center point of the text area according to the area position information, wherein the coordinates of the center point are the coordinates of the center point of the text area;
and if the coordinates of the center point of only one text region are determined to be positioned in the cell according to the cell position information of the cell, determining that the cell only contains one text region.
9. The method of claim 7, wherein adding regional text content within the text region contained by the one cell to the one cell comprises:
and setting the attribute value of the text attribute of the cell as the regional text content in the text region contained in the cell.
10. The method of claim 7, wherein determining that a cell contains at least two text regions based on the region location information comprises:
if the number of the text areas is determined to be multiple, respectively determining the center point coordinate of each text area according to the area position information of each text area, wherein the center point coordinate is the coordinate of the center point of the text area;
and if the coordinates of the central point of at least two text regions are determined to be positioned in the cell according to the cell position information of the cell, determining that the cell contains at least two text regions.
11. The method of claim 10, wherein the center point coordinates comprise a center point abscissa and a center point ordinate, and wherein sorting the region text contents within each text region contained by the one cell comprises:
sorting the region text contents in the text regions contained in the cell in descending order of the center point ordinate;
and re-sorting, in ascending order of the center point abscissa, the region text contents in text regions having the same center point ordinate.
12. The method of claim 7, wherein said adding the ordered regional text content to the one cell comprises:
and setting the attribute value of the text attribute of the cell to the sorted region text contents.
13. An electronic device comprising a processor and a memory, the memory storing computer readable instructions that, when executed by the processor, perform the method of any one of claims 1-12.
CN202210523453.7A 2022-05-13 2022-05-13 Table reconstruction method and electronic equipment Active CN114943978B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210523453.7A CN114943978B (en) 2022-05-13 2022-05-13 Table reconstruction method and electronic equipment
PCT/CN2023/084482 WO2023216745A1 (en) 2022-05-13 2023-03-28 Table reconstruction method and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210523453.7A CN114943978B (en) 2022-05-13 2022-05-13 Table reconstruction method and electronic equipment

Publications (2)

Publication Number Publication Date
CN114943978A true CN114943978A (en) 2022-08-26
CN114943978B CN114943978B (en) 2023-10-03

Family

ID=82906729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210523453.7A Active CN114943978B (en) 2022-05-13 2022-05-13 Table reconstruction method and electronic equipment

Country Status (2)

Country Link
CN (1) CN114943978B (en)
WO (1) WO2023216745A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023216745A1 (en) * 2022-05-13 2023-11-16 上海弘玑信息技术有限公司 Table reconstruction method and electronic device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09330377A (en) * 1996-06-10 1997-12-22 Hitachi Ltd Device and method for recognizing handwritten character
CN111985465A (en) * 2020-08-17 2020-11-24 中移(杭州)信息技术有限公司 Text recognition method, device, equipment and storage medium
WO2020232872A1 (en) * 2019-05-22 2020-11-26 平安科技(深圳)有限公司 Table recognition method and apparatus, computer device, and storage medium
CN112396048A (en) * 2020-11-17 2021-02-23 中国平安人寿保险股份有限公司 Picture information extraction method and device, computer equipment and storage medium
CN113239227A (en) * 2021-06-02 2021-08-10 泰康保险集团股份有限公司 Image data structuring method and device, electronic equipment and computer readable medium
CN114463765A (en) * 2022-02-10 2022-05-10 微民保险代理有限公司 Table information extraction method and device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446264B (en) * 2018-03-26 2022-02-15 阿博茨德(北京)科技有限公司 Method and device for analyzing table vector in PDF document
CN114943978B (en) * 2022-05-13 2023-10-03 上海弘玑信息技术有限公司 Table reconstruction method and electronic equipment



Also Published As

Publication number Publication date
CN114943978B (en) 2023-10-03
WO2023216745A1 (en) 2023-11-16

Similar Documents

Publication Publication Date Title
Böhmer et al. A study on icon arrangement by smartphone users
CN107209905A (en) For personalized and task completion service, correspondence spends theme and sorted out
WO2016082598A1 (en) Method, apparatus, and device for rapidly searching for application program
US9588678B2 (en) Method of operating electronic handwriting and electronic device for supporting the same
CN102609083B (en) Realize the overall situation setting of posture based on culture
US20150134641A1 (en) Electronic device and method for processing clip of electronic document
WO2023216745A1 (en) Table reconstruction method and electronic device
CN116168038A (en) Image reproduction detection method and device, electronic equipment and storage medium
CN114428842A (en) Method and device for expanding question-answer library, electronic equipment and readable storage medium
CN114067797A (en) Voice control method, device, equipment and computer storage medium
TW201428515A (en) Content and object metadata based search in e-reader environment
CN105550183A (en) Identifying method of identifying information in webpage and electronic device
KR102315068B1 (en) Method and system for determining document consistence to improve document search quality
WO2023138475A1 (en) Icon management method and apparatus, and device and storage medium
CN114637866B (en) Information management method and device for digitalized new media
CN115600199A (en) Security assessment method and device, electronic equipment and computer readable storage medium
CN114547242A (en) Questionnaire investigation method and device, electronic equipment and readable storage medium
CN114416664A (en) Information display method, information display device, electronic apparatus, and readable storage medium
CN112417197A (en) Sorting method, sorting device, machine readable medium and equipment
WO2019103380A1 (en) Apparatus for generating news-article-based story, method therefor, and computer-readable recording medium on which program for performing same method is recorded
WO2009021563A1 (en) A data processing method, computer program product and data processing system
CN115661843A (en) Table reconstruction method and electronic equipment
CN115454989B (en) Data processing method and device for application program data
CN114286164B (en) Video synthesis method and device, electronic equipment and storage medium
CN116542680A (en) Abnormal visitor detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant