CN114417792A - Processing method and device of form image, electronic equipment and medium - Google Patents

Processing method and device of form image, electronic equipment and medium

Info

Publication number
CN114417792A
Authority
CN
China
Prior art keywords
target
image
processing
frame
form image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111668079.1A
Other languages
Chinese (zh)
Other versions
CN114417792B (en)
Inventor
段纪伟
黄旭进
张治强
侯冰基
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Wuhan Kingsoft Office Software Co Ltd
Original Assignee
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Wuhan Kingsoft Office Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Office Software Inc, Zhuhai Kingsoft Office Software Co Ltd, Wuhan Kingsoft Office Software Co Ltd
Priority to CN202111668079.1A
Publication of CN114417792A
Application granted
Publication of CN114417792B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Character Input (AREA)

Abstract

The present invention provides a table image processing method and apparatus, an electronic device, and a medium. The method includes: determining that a target table image is a borderless table; processing the target table image with a table recognition structure to obtain a first border frame; and adding the content of the target table image into the first border frame. The method obtains the first border frame quickly and effectively and converts the table image into an editable table accurately. Because the table information is derived from the table image itself, it is more complete and clearer, which speeds up table image processing and improves the user experience.

Figure 202111668079

Description

Table image processing method and apparatus, electronic device, and medium

Technical Field

The present invention relates to the technical field of data processing, and in particular to a table image processing method and apparatus, an electronic device, and a medium.

Background

As office software continues to develop, users expect more of it: beyond meeting ordinary office needs, it should also serve a variety of application scenarios.

In practice, a user may have only an image of a table, for example one found through a web search, and no editable version. The table image must then be converted into an editable table before the table can be processed.

In the prior art, a table image is converted into an editable table document by using image segmentation to extract the table lines, obtaining the horizontal and vertical lines of the table, and generating an editable table from them. This approach segments table images poorly and yields incomplete lines, so the results are unsatisfactory and the user experience suffers.

Summary of the Invention

In view of the problems in the prior art, the present invention proposes a table line generation method and apparatus for a borderless table, an electronic device, and a medium, which can handle table images of any form and which improve both the table image processing result and the user experience.

In a first aspect, the present invention provides a table image processing method, comprising:

determining that a target table image is a borderless table;

processing the target table image with a table recognition structure to obtain a first border frame; and

adding the content of the target table image into the first border frame.

Further, in the table image processing method provided by the present invention:

the table recognition structure is a line, a sliding window, or a graph convolutional neural network model.

Further, in the table image processing method provided by the present invention:

where the table recognition structure is the line or the sliding window, processing the target table image with the table recognition structure to obtain the first border frame comprises:

moving the table recognition structure along a first direction of the target table image and recording a first region in which it does not cover the content;

and/or moving the table recognition structure along a second direction of the target table image and recording a second region in which it does not cover the content; and

generating table lines in at least one of the first region and the second region to obtain the first border frame.

Further, in the table image processing method provided by the present invention:

where the table recognition structure is the line, the table recognition structure comprises a first-direction line and a second-direction line;

processing the target table image with the table recognition structure to obtain the first border frame comprises:

translating the first-direction line and the second-direction line within the target table image according to the distribution of the content in the target table image, and determining the non-intersection regions in the target table image; and

determining target points in the non-intersection regions and connecting the target points to obtain the first border frame.

Further, in the table image processing method provided by the present invention, translating the first-direction line and the second-direction line within the target table image according to the distribution of the content, and determining the non-intersection regions in the target table image, comprises:

binarizing the target table image to obtain a first binary map, wherein pixels having the first value in the first binary map correspond to regions of the target table image where content is distributed, and pixels having the second value correspond to regions where no content is distributed;

processing the pixels of the first binary map along the first direction to obtain first-direction lines;

processing the pixels of the first binary map along the second direction to obtain second-direction lines; and

intersecting the first-direction lines with the second-direction lines, and taking the regions that, after the intersection, are covered by neither the first-direction lines nor the second-direction lines as the non-intersection regions in the target table image.
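The steps above can be illustrated with a minimal Python sketch. It rests on one reading of the text that the source does not spell out: a first-direction line is assumed to cover every pixel row that contains content, and a second-direction line every pixel column that contains content, so that the regions covered by neither are the blank areas where a row gap meets a column gap. The function name `uncovered_mask`, the encoding of content as `1`, and the sample map are hypothetical.

```python
def uncovered_mask(binary):
    """Return a mask that is 1 where a pixel is covered by neither a row band
    nor a column band, and 0 elsewhere.

    Assumption: `binary` is a list of rows in which 1 marks a content pixel
    and 0 marks a non-content pixel.
    """
    # A row (or column) is covered by a line if any of its pixels is content.
    row_has_content = [any(row) for row in binary]
    col_has_content = [any(col) for col in zip(*binary)]
    return [
        [0 if (row_covered or col_covered) else 1 for col_covered in col_has_content]
        for row_covered in row_has_content
    ]


# Hypothetical first binary map: two text rows, two text columns.
first_binary_map = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
mask = uncovered_mask(first_binary_map)
```

In this sample the mask keeps nine isolated pixels, one at each crossing of a blank row with a blank column, which is where the corners of the restored table grid would lie under this reading.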

Further, in the table image processing method provided by the present invention, determining the target points in the non-intersection regions comprises:

determining the contour of each non-intersection region by contour finding; and

taking, according to the contour of the non-intersection region, one coordinate point of the non-intersection region as the target point.
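As a rough illustration of picking one point per region, the sketch below labels each connected region of a mask and returns the center of its bounding box. It swaps in a plain breadth-first flood fill for the contour-finding step, which the source does not specify further, and it picks the center only as one possible choice of coordinate point. The function `region_centers` and the sample mask are hypothetical.

```python
from collections import deque


def region_centers(mask):
    """Label 4-connected regions of 1-pixels and return the center of each
    region's bounding box as (row, col), in scan order.

    Flood fill is used here as a stand-in for contour finding.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    centers = []
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Grow the region from its first pixel, tracking its bounding box.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            top, bottom, left, right = r0, r0, c0, c0
            while queue:
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= nr < height and 0 <= nc < width
                    if inside and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            centers.append(((top + bottom) / 2, (left + right) / 2))
    return centers


# Hypothetical mask of four non-intersection regions.
region_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1],
]
centers = region_centers(region_mask)
```

A library contour routine would serve the same purpose on real images; the flood fill only keeps the sketch free of dependencies.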

Further, in the table image processing method provided by the present invention:

where the table recognition structure is the sliding window, the table recognition structure comprises a first-direction sliding window and a second-direction sliding window;

processing the target table image with the table recognition structure to obtain the first border frame comprises:

translating the first-direction sliding window within the target table image according to the distribution of the content in the target table image to obtain first-direction boxes, and translating the second-direction sliding window within the target table image to obtain second-direction boxes; and

setting first-direction table lines based on the first-direction boxes and second-direction table lines based on the second-direction boxes to obtain the first border frame.
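One way to picture the sliding-window variant is sketched below. The details are assumptions, since the source does not fix them: the window spans the full image in one dimension, is `size` pixels thick in the other, advances one pixel at a time, and consecutive content-free positions are merged into one box whose center line becomes the table line. The names `window_boxes` and `line_positions` are hypothetical.

```python
def window_boxes(binary, size, axis):
    """Slide a window `size` pixels thick across `binary` and merge the
    positions where it covers no content into (start, end) boxes, end exclusive.

    axis=0 slides the window from top to bottom; axis=1 from left to right.
    Assumption: 1 marks a content pixel.
    """
    lines = binary if axis == 0 else list(zip(*binary))
    blank = [not any(line) for line in lines]
    boxes = []
    for start in range(len(blank) - size + 1):
        if all(blank[start:start + size]):
            end = start + size
            if boxes and start <= boxes[-1][1]:
                boxes[-1] = (boxes[-1][0], end)  # extend the previous box
            else:
                boxes.append((start, end))
    return boxes


def line_positions(boxes):
    """Place one table line through the middle of each box."""
    return [(start + end - 1) / 2 for start, end in boxes]


# Hypothetical binary map with a wide gap and a one-pixel gap between columns.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 0],
]
row_boxes = window_boxes(image, size=2, axis=0)
col_boxes = window_boxes(image, size=2, axis=1)
row_lines = line_positions(row_boxes)
col_lines = line_positions(col_boxes)
```

In this sketch the window thickness acts as a minimum gap width: the one-pixel blank column at the right edge never fits a two-pixel window, so it produces no box. Whether the patented method uses the window for that purpose is not stated in the source.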

Further, in the table image processing method provided by the present invention:

where the table recognition structure is the graph convolutional neural network model, the first border frame is determined according to the positional relationship between a first text box and a second text box in the target table image.

Further, in the table image processing method provided by the present invention, before processing the target table image with the table recognition structure to obtain the first border frame, the method further comprises:

binarizing the target table image to obtain a second binary map, wherein pixels having the second value in the second binary map correspond to text characters in the target table image, and pixels having the first value correspond to everything in the target table image other than text characters; and

setting the table recognition structure on the second binary map.
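A minimal sketch of the binarization step follows. The source names a first value and a second value without giving them, so the constants `FIRST_VALUE = 0` and `SECOND_VALUE = 1`, the fixed threshold of 128, and the assumption of dark text on a light background are all illustrative choices.

```python
FIRST_VALUE = 0   # assumed value for pixels that are not text characters
SECOND_VALUE = 1  # assumed value for pixels that belong to text characters


def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 intensities) to a binary map.

    Assumption: text is darker than the background, so intensities below
    `threshold` are treated as text pixels.
    """
    return [
        [SECOND_VALUE if pixel < threshold else FIRST_VALUE for pixel in row]
        for row in gray
    ]


# Hypothetical 2 x 3 grayscale patch.
gray_patch = [
    [255, 30, 200],
    [10, 250, 127],
]
second_binary_map = binarize(gray_patch)
```

A fixed threshold keeps the sketch short; on scanned or photographed tables an adaptive threshold would likely be a better fit.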

Further, in the table image processing method provided by the present invention, adding the content of the target table image into the first border frame comprises:

performing text detection on the target table image and obtaining the text boxes in the target table image from the text detection result;

setting text boxes in the first border frame; and

filling the content of the text boxes in the target table image into the text boxes in the first border frame.
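The filling step can be sketched as follows, assuming that text detection and recognition have already produced boxes of the form `(x0, y0, x1, y1, text)` and that the first border frame is described by sorted lists of row and column line positions. Assigning each box to the cell that contains its center is an illustrative rule, not one stated in the source, and `fill_cells` is a hypothetical name.

```python
from bisect import bisect_right


def fill_cells(row_lines, col_lines, text_boxes):
    """Place each recognized text into the grid cell containing its box center.

    row_lines and col_lines are sorted line coordinates (y and x respectively)
    that bound the cells; text_boxes holds (x0, y0, x1, y1, text) tuples.
    """
    n_rows, n_cols = len(row_lines) - 1, len(col_lines) - 1
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for x0, y0, x1, y1, text in text_boxes:
        center_x, center_y = (x0 + x1) / 2, (y0 + y1) / 2
        row = bisect_right(row_lines, center_y) - 1
        col = bisect_right(col_lines, center_x) - 1
        if 0 <= row < n_rows and 0 <= col < n_cols:
            # Join multiple boxes that land in the same cell with a space.
            table[row][col] = (table[row][col] + " " + text).strip()
    return table


# Hypothetical frame and detection results.
table = fill_cells(
    row_lines=[0, 20, 40],
    col_lines=[0, 50, 100],
    text_boxes=[
        (5, 4, 30, 16, "Name"),
        (60, 4, 90, 16, "Score"),
        (5, 24, 40, 36, "Alice"),
        (60, 24, 80, 36, "90"),
    ],
)
```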

In a second aspect, the present invention further provides a table image processing apparatus, comprising:

a determining module, configured to determine that a target table image is a borderless table;

a processing module, configured to process the target table image with a table recognition structure to obtain a first border frame; and

an adding module, configured to add the content of the target table image into the first border frame.

In a third aspect, the present invention further provides an electronic device, comprising a processor, a memory, and a bus, wherein

the processor and the memory communicate with each other through the bus; and

the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the steps of the table image processing method according to any one of the above.

In a fourth aspect, the present invention further provides a computer-readable storage medium storing computer instructions that cause a computer to perform the steps of the table image processing method according to any one of the above.

The present invention provides a table image processing method and apparatus, an electronic device, and a medium. The method includes: determining that a target table image is a borderless table; processing the target table image with a table recognition structure to obtain a first border frame; and adding the content of the target table image into the first border frame. The method obtains the first border frame quickly and effectively and converts the table image into an editable table accurately. Because the table information is derived from the table image itself, it is more complete and clearer, which speeds up table image processing and improves the user experience.

Brief Description of the Drawings

To explain the technical solutions of the present invention or of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of the table line generation method for a borderless table provided by the present invention;

FIG. 2 is an example of a borderless table provided by the present invention;

FIG. 3 is an example of the first binary map provided by the present invention;

FIG. 4 is an example of the vertical table line regions provided by the present invention;

FIG. 5 is an example of the horizontal table line regions provided by the present invention;

FIG. 6 is an example of the horizontal table line regions superimposed on the vertical table line regions;

FIG. 7 is an example of determining the center points of the non-intersection regions;

FIG. 8 is an example of the result of connecting the center points in the vertical direction;

FIG. 9 is an example of the result of connecting the center points in the horizontal direction;

FIG. 10 is an example in which the center points are connected in both the horizontal and vertical directions;

FIG. 11 is an example of a text detection result provided by the present invention;

FIG. 12 is an example of the second binary map provided by the present invention;

FIG. 13 is an example of the result after morphological processing;

FIG. 14 is an example of the result after the text boxes are split;

FIG. 15 is a schematic structural diagram of the table line generation apparatus for a borderless table provided by the present invention;

FIG. 16 is a schematic structural diagram of the electronic device provided by the present invention.

Detailed Description

To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

FIG. 1 is a schematic flowchart of the table image processing method provided by an embodiment of the present invention. As shown in FIG. 1, the method comprises the following steps:

Step 101: Determine that the target table image is a borderless table.

In this embodiment, the target table image is first examined. When it is determined to be a borderless table, the table image processing method provided by the present invention is applied to generate the table lines of the borderless table.

Note that the target table image is the table image to be processed by the method of the present invention. As the name suggests, a table image is a table stored in an image format such as jpg or png. A borderless table is a table in which no cell has any table line. In the table shown in FIG. 2, for example, neither the table lines nor the border lines of any cell are present, so it is a borderless table. The borderless table may be a standalone table image or a table image produced by converting a table in a PDF document; no specific limitation is imposed here.

Note that every cell of the borderless table shown in FIG. 2 contains text, but in other embodiments some cells of a borderless table may be blank. The method of the present invention can also generate table lines for such tables.

Step 102: Process the target table image with the table recognition structure to obtain a first border frame.

In this embodiment, the first border frame is the frame of table lines obtained by restoring the borderless table. The target table image is processed with the table recognition structure to obtain it. Specifically, the table recognition structure is applied to the target table image, and the table lines corresponding to the borderless table in the target table image are then determined from the processing result, which yields the first border frame. The detailed procedure is given in the embodiments below.

Note that the first border frame contains no text data; it consists of table lines only. Because it is identified on the target table image itself, it accurately restores the table information in that image.

Step 103: Add the content of the target table image into the first border frame.

In this embodiment, the content of the target table image is added into the first border frame to convert the target table image into an editable table. The first border frame is a table frame with table lines. Text recognition is performed on each cell of the target table image, and the recognized text is filled into the corresponding cell of the first border frame to obtain the editable table.

Note that once the target table image has been converted into a table frame with table lines, the text recognized in each cell of the target table image must be filled into the corresponding cell of the first border frame so that the data in the target table image is restored.

According to the table image processing method provided by the present invention, the target table image is determined to be a borderless table, the target table image is processed with the table recognition structure to obtain the first border frame, and the content of the target table image is then added into the first border frame. The method obtains the first border frame quickly and effectively and converts the table image into an editable table accurately. Because the table information is derived from the table image itself, it is more complete and clearer, which speeds up table image processing and improves the user experience.

Based on any of the above embodiments, in this embodiment the table recognition structure is a line, a sliding window, or a graph convolutional neural network model.

A line means that the target table image is processed with horizontal or vertical lines. A sliding window means that it is processed with closed shapes of various forms. A graph convolutional neural network model is a processing model, obtained by training on sample data, that performs recognition on the target table image.

Note that the inputs of the graph convolutional neural network model are the position information of each text box (for example, the coordinates of its four corner points) and the target table image. Its output indicates whether a given text box and each of several neighboring text boxes are in the same row or the same column. Details are given in the embodiments below.
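To make the shape of that input and output concrete, the sketch below takes text boxes given as four corner points and reports, for each pair, whether they share a row or a column. It is not a graph convolutional network: a plain geometric overlap test stands in for the trained model, the image input is ignored, and all pairs are compared rather than only neighboring boxes. The names `extent`, `overlaps`, and `relations` are hypothetical.

```python
def extent(box, axis):
    """Return (min, max) of a four-corner box along axis 0 (x) or axis 1 (y)."""
    values = [point[axis] for point in box]
    return min(values), max(values)


def overlaps(a, b):
    """True if the open intervals a and b overlap."""
    return a[0] < b[1] and b[0] < a[1]


def relations(boxes):
    """For every pair (i, j) with i < j, report same-row and same-column flags.

    Geometric stand-in for the model output: boxes whose vertical extents
    overlap count as the same row, and boxes whose horizontal extents overlap
    count as the same column.
    """
    result = {}
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            result[(i, j)] = {
                "same_row": overlaps(extent(boxes[i], 1), extent(boxes[j], 1)),
                "same_col": overlaps(extent(boxes[i], 0), extent(boxes[j], 0)),
            }
    return result


# Hypothetical text boxes, each as four (x, y) corner points.
text_boxes = [
    [(0, 0), (40, 0), (40, 10), (0, 10)],       # top-left cell
    [(60, 1), (100, 1), (100, 11), (60, 11)],   # top-right cell
    [(2, 30), (38, 30), (38, 40), (2, 40)],     # bottom-left cell
]
pairwise = relations(text_boxes)
```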

According to the table image processing method provided by the present invention, the table recognition structure can be a line, a sliding window, or a graph convolutional neural network model. This allows the target table image to be processed in several ways, safeguards the processing result, and improves the user experience.

Based on any of the above embodiments, in this embodiment, where the table recognition structure is a line or a sliding window, processing the target table image with the table recognition structure to obtain the first border frame comprises:

moving the table recognition structure along a first direction of the target table image and recording a first region in which it does not cover the content;

and/or moving the table recognition structure along a second direction of the target table image and recording a second region in which it does not cover the content; and

generating table lines in at least one of the first region and the second region to obtain the first border frame.

In this embodiment, the table recognition structure is moved along the first direction of the target table image and the first region, in which it covers no content, is recorded. The table recognition structure is also moved along the second direction of the target table image and the second region, in which it covers no content, is recorded. Table lines are then generated in at least one of the first and second regions to obtain the first border frame. Here the first direction is the vertical direction of the target table image, that is, movement from top to bottom or from bottom to top. The second direction is the horizontal direction, that is, movement from left to right or from right to left. Accordingly, the first region is the region obtained by moving vertically, and the second region is the region obtained by moving horizontally.

Note that in other embodiments the first and second directions may instead be the arc direction or the diameter direction of a sector, a direction at a 45-degree angle, and so on. They are not limited to horizontal and vertical; movement in any direction can be used according to the user's actual needs, and no specific limitation is imposed here.

Note that in this embodiment the table recognition structure moved vertically or horizontally across the target table image may be a line or a sliding window. The regions to be recorded are those that cover no content, meaning blank regions in which no text data is present.

According to the table image processing method provided by the present invention, the table recognition structure is moved along the first direction of the target table image to record the first region that covers no content, and/or along the second direction to record the second region that covers no content. Table lines are then generated in at least one of the two regions to obtain the first border frame. Table lines can thus be generated in regions along different directions of the target table image, which speeds up table image processing.
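The sweep described in this embodiment can be sketched in a few lines of Python. The sketch assumes a binary map in which `1` marks a content pixel and uses a one-pixel line as the table recognition structure; the function name `empty_runs` and the sample map are hypothetical.

```python
def empty_runs(binary, axis):
    """Sweep a one-pixel line across `binary` and return the (start, end)
    index runs, end exclusive, where the line touches no content pixel.

    axis=0 moves a horizontal line from top to bottom (the first, vertical
    direction); axis=1 moves a vertical line from left to right (the second,
    horizontal direction).
    """
    lines = binary if axis == 0 else list(zip(*binary))
    runs, start = [], None
    for index, line in enumerate(lines):
        blank = not any(line)
        if blank and start is None:
            start = index                 # a blank run begins
        elif not blank and start is not None:
            runs.append((start, index))   # the blank run ends at content
            start = None
    if start is not None:
        runs.append((start, len(lines)))  # the run reaches the image edge
    return runs


# Hypothetical binary map of a 2 x 2 borderless table.
binary_map = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
first_regions = empty_runs(binary_map, axis=0)
second_regions = empty_runs(binary_map, axis=1)
```

Each recorded run is a candidate location for a table line; placing the line at the middle of the run is one natural choice, though the source leaves the exact placement open.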

基于上述任一实施例,在本实施例中,方法还包括:在表格识别结构为线条的情况下,表格识别结构包括第一方向线条和第二方向线条;Based on any of the foregoing embodiments, in this embodiment, the method further includes: in the case that the table identification structure is a line, the table identification structure includes a first direction line and a second direction line;

通过表格识别结构对目标表格图像进行处理,得到第一边框框架,包括:The target table image is processed through the table recognition structure to obtain the first border frame, including:

Translating the first-direction line and the second-direction line in the target table image based on the content distribution in the target table image, and determining the non-intersecting areas in the target table image;

Determining a target point in each non-intersecting area, and connecting the target points to obtain the first border frame.

In this embodiment, the first-direction lines are horizontal lines and the second-direction lines are vertical lines. Based on the content distribution in the target table image, the obtained horizontal and vertical lines are translated to determine the non-intersecting areas in the target table image; a target point is then determined in each non-intersecting area, and connecting all of the target points yields the first border frame. The target point may be the center point or another point, such as a two-thirds point; the center point is preferred in this embodiment. It should be noted that a non-intersecting area is a blank area, that is, an area that contains no text content and is not covered by a horizontal or vertical line.

It should be noted that, when the table identification structure is a line, text detection must be performed on the text content of the target table image to obtain multiple text boxes. The text boxes are then processed to obtain the target points, from which the first border frame is determined.

It should be noted that, in this embodiment, text detection is performed on the target table image to obtain multiple text boxes. The vertical table line area and the horizontal table line area in the target table image are then determined from these text boxes and translated to obtain the translated table line area. Multiple non-intersecting areas are determined from the translated table line area, their center points are determined, and the first border frame corresponding to the target table image is generated from these center points.

It should be noted that, in this embodiment, the vertical table line area is the area formed by moving the table identification structure vertically along the target table image, that is, the second area in the above embodiment; the horizontal table line area is the area formed by moving the table identification structure horizontally along the target table image, that is, the first area in the above embodiment. The non-intersecting areas are the areas that are covered by neither the horizontal table line area nor the vertical table line area and on which nothing is superimposed.

According to the table image processing method provided by the present invention, the first-direction lines and second-direction lines are translated in the target table image based on the text distribution in the target table image, and the non-intersecting areas in the target table image are determined. A target point is then determined in each non-intersecting area, and the target points are connected to obtain the first border frame. The operation is simple and helps ensure the accuracy of table image restoration.

Based on any of the foregoing embodiments, in this embodiment, translating the first-direction line and the second-direction line in the target table image based on the content distribution in the target table image, and determining the non-intersecting areas in the target table image, includes:

Binarizing the target table image to obtain a first binary image, wherein the first value in the first binary image marks the pixels corresponding to content distribution areas in the target table image, and the second value in the first binary image marks the pixels corresponding to non-content areas in the target table image;

Processing the pixels in the first binary image along the first direction to obtain the first-direction lines;

Processing the pixels in the first binary image along the second direction to obtain the second-direction lines;

Crossing the first-direction lines with the second-direction lines, and taking the areas that, after crossing, are covered by neither the first-direction lines nor the second-direction lines as the non-intersecting areas in the target table image.

In this embodiment, the target table image, after its text boxes have been determined, is binarized to obtain a first binary image in which the pixels corresponding to the text boxes take the first value and all other pixels take the second value, as shown in FIG. 3. The first value is a pixel value of 0, the second value is a pixel value of 255, and the first binary image is the binary image generated by binarizing the text box areas.
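The binarization described above can be sketched as follows. This is a minimal illustration that assumes each text box is an axis-aligned rectangle given as `(x0, y0, x1, y1)` with exclusive right and bottom edges; the function name `binarize` and this box format are hypothetical and are not specified by the patent.

```python
from typing import List, Tuple

FIRST_VALUE = 0      # pixels inside text boxes (content)
SECOND_VALUE = 255   # all other pixels (no content)

def binarize(width: int, height: int,
             text_boxes: List[Tuple[int, int, int, int]]) -> List[List[int]]:
    """Build a binary image that is 0 inside every text box and 255 elsewhere."""
    image = [[SECOND_VALUE] * width for _ in range(height)]
    for x0, y0, x1, y1 in text_boxes:
        # Clip each box to the image so out-of-range coordinates are ignored.
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                image[y][x] = FIRST_VALUE
    return image

binary = binarize(6, 4, [(1, 1, 3, 2)])
```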

In this embodiment, the first direction may refer to the vertical direction of the target table image. To determine the vertical lines, a pixel column of a certain width is selected and a closing operation is performed between this pixel column and the text boxes, so that the pixels of the text boxes that are aligned vertically are joined into vertical lines. This yields the vertical table line area shown in FIG. 4, in which the pixel value of the vertical table lines is 0. It should be noted that the closing operation is a morphological operation consisting of dilation followed by erosion. Dilation merges all background points in contact with an object into that object, expanding its boundary outward, and is used to fill holes in the object. Erosion removes boundary points, shrinking the boundary inward, and can be used to eliminate small, meaningless objects.
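To make the closing operation concrete, here is a minimal one-dimensional sketch applied to a single pixel column. For readability it marks content as 1 and background as 0, which is the inverse of the 0/255 convention used in the binary image above; the function names and the odd-sized flat structuring element are assumptions for illustration only.

```python
from typing import List

def dilate_1d(signal: List[int], size: int) -> List[int]:
    """Dilation: a pixel becomes 1 if any pixel within the window is 1."""
    r, n = size // 2, len(signal)
    return [max(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

def erode_1d(signal: List[int], size: int) -> List[int]:
    """Erosion: a pixel stays 1 only if every pixel within the window is 1."""
    r, n = size // 2, len(signal)
    return [min(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

def close_1d(signal: List[int], size: int) -> List[int]:
    """Closing = dilation followed by erosion; bridges gaps narrower than the window."""
    return erode_1d(dilate_1d(signal, size), size)

# One pixel column: two text fragments separated by a 1-pixel gap, then a 4-pixel gap.
column = [0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
closed = close_1d(column, 3)
```

With a window of 3 pixels the 1-pixel gap is bridged while the 4-pixel gap remains, which mirrors how a taller structuring element would join vertically aligned text boxes into a single vertical band.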

In this embodiment, the second direction may refer to the horizontal direction of the target table image. To determine the horizontal lines, pixel rows of a certain width are selected and a closing operation is performed between them and the text boxes to obtain the horizontal lines. All of the horizontal lines together form the horizontal table line area shown in FIG. 5.

It should be noted that, in this embodiment, the obtained horizontal table line area and vertical table line area are translated to obtain the translated table line area, which may be as shown in FIG. 6. From FIG. 6, the areas on which nothing is superimposed and which are covered by neither the vertical table line area nor the horizontal table line area are determined to be the non-intersecting areas. These are the blank areas shown in FIG. 6.
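The selection of non-intersecting areas can be sketched as follows, assuming the horizontal and vertical table line areas are available as two boolean masks of equal size (1 where the area covers a pixel). The sketch uses a breadth-first flood fill to collect each connected uncovered region and report its bounding box; this is a stand-in chosen for illustration, and the function name `non_intersecting_regions` is hypothetical.

```python
from collections import deque
from typing import List, Tuple

Mask = List[List[int]]

def non_intersecting_regions(h_mask: Mask, v_mask: Mask) -> List[Tuple[int, int, int, int]]:
    """Return inclusive bounding boxes (x0, y0, x1, y1) of regions covered by neither mask."""
    height, width = len(h_mask), len(h_mask[0])
    free = [[not (h_mask[y][x] or v_mask[y][x]) for x in range(width)]
            for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not free[y][x] or seen[y][x]:
                continue
            # Flood-fill one connected uncovered region (4-connectivity).
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height and free[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

# 7x7 toy image: text rows occupy y in {1,2,4,5}, text columns occupy x in {1,2,4,5}.
h_mask = [[1 if y in (1, 2, 4, 5) else 0 for x in range(7)] for y in range(7)]
v_mask = [[1 if x in (1, 2, 4, 5) else 0 for x in range(7)] for y in range(7)]
regions = non_intersecting_regions(h_mask, v_mask)
```

In this toy layout the uncovered regions fall where a gap between text rows meets a gap between text columns, which is where the corners of the reconstructed cells would lie.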

According to the table image processing method provided by the present invention, the target table image is binarized to obtain the first binary image; the pixels in the first binary image are processed along the first direction to obtain the first-direction lines and along the second direction to obtain the second-direction lines; the first-direction lines and second-direction lines are translated; and the areas that are not intersected and are covered by neither the first-direction lines nor the second-direction lines are taken as the non-intersecting areas in the target table image. This improves the processing speed of the table image.

Based on any of the foregoing embodiments, in this embodiment, determining a target point in each non-intersecting area includes:

Determining the contour of each non-intersecting area by contour search;

Determining the target point in each non-intersecting area according to the contour of that non-intersecting area.

In this embodiment, the coordinates of the target point of each non-intersecting area are determined by contour search. The contours of the non-intersecting areas are determined first; the coordinates of the target point of each area are then determined from its contour, and the first border frame is obtained from the coordinate information of the target points. It should be noted that the target point in this embodiment is the center point. Since a text box is generally a quadrilateral, the center point of the text box can be determined from the coordinates of its four vertices, as shown in FIG. 7, in which each small circle is the center point determined for a text box.

For example, suppose a non-intersecting area is obtained and contour search finds the coordinates of its four vertices to be (2, 6), (2, 9), (8, 6), and (8, 9). The center point is then calculated to be (5, 7.5). Connecting the multiple center points determined in this way yields the table lines corresponding to the target table image.
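The arithmetic in this example can be checked with a short sketch. It assumes the center point is the mean of the four vertex coordinates, which holds for a rectangle; the function name `center_of` is hypothetical.

```python
from typing import List, Tuple

def center_of(vertices: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return the mean of the vertex coordinates, i.e. the center of a rectangle."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

center = center_of([(2, 6), (2, 9), (8, 6), (8, 9)])
print(center)  # (5.0, 7.5)
```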

It should be noted that connecting the center points in the vertical direction yields the area formed by the vertical lines shown in FIG. 8, and connecting the center points in the horizontal direction yields the area formed by the horizontal lines shown in FIG. 9. Translating the resulting vertical and horizontal connecting-line areas yields the table lines shown in FIG. 10, that is, the first border frame of the borderless table.

It should be noted that, in this embodiment, the first border frame is generated on the original target table image. In other embodiments, the center points may instead be mapped into a preset area according to their determined coordinates, and the mapped points may be connected vertically and/or horizontally; connecting all of the mapped points generates the table lines corresponding to the target table image.

According to the table image processing method provided by the present invention, the contour of each non-intersecting area is determined by contour search, and the target point in each non-intersecting area is then determined from its contour. The first border frame of the target table image can thus be generated from center points, which improves the accuracy with which the first border frame is generated.

Based on any of the foregoing embodiments, in this embodiment, in the case where the table identification structure is a sliding window, the table identification structure includes a first-direction sliding window and a second-direction sliding window;

Processing the target table image by means of the table identification structure to obtain the first border frame includes:

Translating the first-direction sliding window in the target table image based on the content distribution in the target table image to obtain a first-direction box, and translating the second-direction sliding window in the target table image to obtain a second-direction box;

Setting first-direction table lines based on the first-direction box and setting second-direction table lines based on the second-direction box to obtain the first border frame.

In this embodiment, the first-direction sliding window and the second-direction sliding window are translated based on the distribution of text content in the target table image to obtain the first-direction box and the second-direction box. First-direction table lines are then set based on the first-direction box and second-direction table lines are set based on the second-direction box, and the first border frame is determined from the table lines obtained after translation.

It should be noted that, in this embodiment, the first-direction box may be a box along the vertical direction of the target table image and may have any shape, such as a rectangle or a square; the second-direction box is a box along the horizontal direction of the target table image and may likewise have any shape. In other embodiments, the first-direction box and the second-direction box may be boxes in any direction on the target table image, such as a box in the 45-degree direction. This may be set according to the user's actual needs and is not specifically limited here.

It should be noted that the first-direction box is the box obtained by sliding the first-direction sliding window along the first direction of the target table image, and the second-direction box is the box obtained by sliding the second-direction sliding window along the second direction of the target table image. In this embodiment, the first-direction box and the second-direction box are blank areas containing no text data; in other words, the first border frame obtained after translating the first-direction box and the second-direction box lies in areas that contain no text data.

It should be noted that table lines can be generated from the first-direction box and the second-direction box obtained in this embodiment: first-direction table lines are set based on the first-direction box, and second-direction table lines are set based on the second-direction box. Once the table lines for both boxes have been generated, the first border frame can be obtained accurately.

According to the table image processing method provided by the present invention, based on the content distribution in the target table image, the first-direction sliding window is translated in the target table image to obtain the first-direction box, and the second-direction sliding window is translated in the target table image to obtain the second-direction box. First-direction table lines are set based on the first-direction box and second-direction table lines are set based on the second-direction box to obtain the first border frame. The first border frame can thus be obtained accurately, and the processing speed of the table image is improved.

Based on any of the foregoing embodiments, in this embodiment, translating the first-direction sliding window in the target table image based on the content distribution in the target table image to obtain the first-direction box, and translating the second-direction sliding window in the target table image to obtain the second-direction box, includes:

While the first-direction sliding window slides along the direction perpendicular to the first direction, determining, according to the content distribution in the target table image, the sliding range over which the first-direction sliding window does not intersect the content, and determining the first-direction box according to the shape of the first-direction sliding window and that sliding range;

While the second-direction sliding window slides along the direction perpendicular to the second direction, determining, according to the text distribution in the target table image, the sliding range over which the second-direction sliding window does not intersect the content, and determining the second-direction box according to the shape of the second-direction sliding window and that sliding range.

In this embodiment, a first-direction sliding window is set on the target table image. As the first-direction sliding window slides along the direction perpendicular to the first direction, the sliding range over which it does not intersect the text content is determined according to the distribution of text content in the target table image, and the first-direction box is then determined from the shape of the first-direction sliding window and that sliding range. The first-direction sliding window is a sliding window set along the vertical direction of the target table image; it may be rectangular or square, which is not specifically limited here.

It should be noted that, in this embodiment, the first direction is the vertical direction of the target table image. When the first-direction sliding window is set to slide along the direction perpendicular to the first direction, the sliding range determined not to intersect the text is a range in the horizontal direction, so the resulting first-direction box is a box in the horizontal direction. In other embodiments, the first-direction sliding window may instead slide along the first direction, in which case the resulting first-direction box is a box in the vertical direction. This may be set according to the user's actual needs and is not specifically limited here.

In this embodiment, a second-direction sliding window is also set on the target table image. As the second-direction sliding window slides along the direction perpendicular to the second direction, the sliding range over which it does not intersect the text content is determined according to the distribution of text content in the target table image, and the second-direction box is determined from the shape of the second-direction sliding window and that sliding range. The second-direction sliding window is a sliding window set along the horizontal direction of the target table image; it may be rectangular or square, which is not specifically limited here.

It should be noted that, in this embodiment, the second direction is the horizontal direction of the target table image. When the second-direction sliding window is set to slide along the direction perpendicular to the second direction, the sliding range determined not to intersect the text content is a range in the vertical direction, so the resulting second-direction box is a box in the vertical direction. In other embodiments, the second-direction sliding window may instead slide along the second direction, in which case the resulting second-direction box is a box in the horizontal direction. This may be set according to the user's actual needs and is not specifically limited here.
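The sliding procedure described above can be sketched for the first-direction (vertical) window as follows. The sketch assumes the window spans the full image height, so only the horizontal extents of the text boxes matter, and that text boxes are `(x0, y0, x1, y1)` rectangles with exclusive right edges. The function name `free_ranges` and these conventions are assumptions for illustration and do not come from the patent.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive

def free_ranges(image_width: int, window_width: int,
                text_boxes: List[Box]) -> List[Tuple[int, int]]:
    """Slide a full-height window left to right and return the (start, end) pixel-column
    ranges, end exclusive, swept while the window intersects no text box."""
    ranges = []
    start = None
    for x in range(image_width - window_width + 1):
        # The window covers columns [x, x + window_width).
        hits = any(x < bx1 and bx0 < x + window_width for bx0, _, bx1, _ in text_boxes)
        if not hits:
            if start is None:
                start = x
        elif start is not None:
            # The last free window position was x - 1; its right edge closes the range.
            ranges.append((start, x - 1 + window_width))
            start = None
    if start is not None:
        ranges.append((start, image_width))
    return ranges

boxes = [(2, 0, 8, 5), (12, 0, 18, 5)]
first_direction_boxes = free_ranges(20, 2, boxes)
```

Each returned range, combined with the window height, describes one blank box in which a table line could be placed; the same routine applied to the vertical extents of the text boxes would handle the second-direction window.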

It should be noted that the first-direction sliding window and the second-direction sliding window may be set up using a mature sliding window algorithm from the prior art; the specific setup method is not described in detail here.

According to the table image processing method provided by the present invention, a sliding window is set on the target table image, the sliding range over which the window does not intersect the text content is determined as the window slides, and the first-direction box and the second-direction box are determined from the shape of the sliding window and that sliding range. The operation is simple and improves the processing speed of the table image.

Based on any of the foregoing embodiments, in this embodiment, in the case where the table identification structure is a graph convolutional neural network model, the first border frame is determined according to the text boxes in the target table image and the positional relationships between the text boxes.

In this embodiment, when the table identification structure is a graph convolutional neural network model, the first border frame is determined according to the positional relationships between the text boxes in the target table image. It should be noted that the positional relationship between text boxes can be established from the coordinate information of their center points.

It should be noted that the output of the graph convolutional neural network model provided in this embodiment makes it possible to determine which text boxes are in the same row and which are in the same column. The height of a cell can then be determined from the heights of the text boxes in the same row, and the width of a cell from the widths of the text boxes in the same column, so that the table lines of the borderless table can be drawn. Details are given in the following embodiments.

According to the table image processing method provided by the present invention, in the case where the table identification structure is a graph convolutional neural network model, the first border frame is determined according to the text boxes in the target table image and the positional relationships between them. The trained graph convolutional neural network model can determine the positional relationships of the text boxes in the target table image accurately and quickly, yielding the first border frame.

Based on any of the foregoing embodiments, in this embodiment, in the case where the table identification structure is a graph convolutional neural network model, determining the first border frame according to the text boxes in the target table image and the positional relationships between the text boxes includes:

Inputting the previously acquired position information of multiple text boxes in the target table image, together with the target table image, into a pre-trained graph convolutional neural network model to obtain the positional relationship between any one of the text boxes and its adjacent text boxes, wherein the graph convolutional neural network model is trained on the position information of multiple text boxes in sample tables, sample table images, and the label information of the sample tables;

Determining the text boxes of each row and the text boxes of each column in the target table image according to the positional relationship between any one of the text boxes and its adjacent text boxes;

Generating the first border frame of the target table image according to the text boxes of each row and the text boxes of each column in the target table image.

In this embodiment, the position information of the multiple text boxes in the target table image, together with the target table image, is input into the pre-trained graph convolutional neural network model to obtain the positional relationship between any text box and its adjacent text boxes. The text boxes of each row and of each column in the target table image are then determined from these positional relationships, and the first border frame corresponding to the target table image is generated. The graph convolutional neural network model is an artificial intelligence model trained on the position information of multiple text boxes in sample tables, sample table images, and the label information of the sample tables.
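One possible way to turn pairwise "same row" and "same column" relationships into complete rows and columns is to merge related text boxes with a union-find structure. The patent does not state how this grouping is performed, so the approach, the function name `group_boxes`, and the use of integer box indices below are all assumptions for illustration.

```python
from typing import List, Tuple

def group_boxes(count: int, pairs: List[Tuple[int, int]]) -> List[List[int]]:
    """Group box indices 0..count-1 so that every related pair ends up in the same group."""
    parent = list(range(count))

    def find(i: int) -> int:
        # Follow parent links to the group representative, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for i in range(count):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Four text boxes in a 2x2 layout: 0 and 1 on the top row, 2 and 3 on the bottom row.
same_row_pairs = [(0, 1), (2, 3)]
same_col_pairs = [(0, 2), (1, 3)]
rows = group_boxes(4, same_row_pairs)
cols = group_boxes(4, same_col_pairs)
```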

In this embodiment, the input of the graph convolutional neural network model is the position information of each text box (for example, the coordinates of its four corner points) and the target table image; the output indicates whether a given text box and each of its adjacent text boxes are in the same row or in the same column.

For example, suppose the position information determined for text box a is (1, 1), and the position information computed for its adjacent text boxes is a1(0, 1), a2(0, 1), a3(1, 0), and a4(1, 0). The first number indicates whether the box is in the same row as text box a (1 for the same row, 0 for a different row), and the second number indicates whether it is in the same column as text box a (1 for the same column, 0 for a different column). Accordingly, adjacent text boxes a1 and a2 are in the same column as text box a, while adjacent text boxes a3 and a4 are in the same row as text box a. The encoding may be set according to the user's actual needs and is not specifically limited here.
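The example above can be decoded mechanically. The sketch below assumes the model output for each neighbor is available as a `(same_row, same_col)` pair keyed by the neighbor's name; the dictionary layout and the function name `split_neighbors` are hypothetical.

```python
from typing import Dict, List, Tuple

def split_neighbors(relations: Dict[str, Tuple[int, int]]) -> Tuple[List[str], List[str]]:
    """Return (neighbors in the same row, neighbors in the same column) of the reference box."""
    same_row = [name for name, (row_flag, _) in relations.items() if row_flag == 1]
    same_col = [name for name, (_, col_flag) in relations.items() if col_flag == 1]
    return same_row, same_col

# Flags relative to text box a, as in the example: (same row?, same column?).
relations_of_a = {"a1": (0, 1), "a2": (0, 1), "a3": (1, 0), "a4": (1, 0)}
same_row, same_col = split_neighbors(relations_of_a)
print(same_row, same_col)  # ['a3', 'a4'] ['a1', 'a2']
```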

It should be noted that the output of the graph convolutional neural network model provided in this embodiment makes it possible to determine which text boxes are in the same row and which are in the same column. The height of a cell can then be determined from the heights of the text boxes in the same row, and the width of a cell from the widths of the text boxes in the same column, so that the table lines of the borderless table can be drawn.

According to the table image processing method provided by the present invention, the position information of multiple text boxes and the target table image are input into the pre-trained graph convolutional neural network model to obtain the positional relationship between any text box and its adjacent text boxes. The text boxes of each row and of each column are then determined from these positional relationships, and the first border frame corresponding to the target table image is generated from the text boxes so determined. Generating the first border frame from the positional relationships of multiple text boxes can improve the accuracy of table line generation.

Based on any of the above embodiments, in this embodiment, generating the first border frame of the target table image according to the text boxes of each row and the text boxes of each column in the target table image includes:

setting the height value of a target row according to the height target value of the text boxes in that target row of the target table image, where the target row is any row of the target table image;

setting the width value of a target column according to the width target value of the text boxes in that target column of the target table image, where the target column is any column of the target table image;

drawing the table lines of the target row according to the height value of the target row and the position of the target row in the target table image; and/or drawing the table lines of the target column according to the width value of the target column and the position of the target column in the target table image, to obtain the first border frame of the target table image.

In this embodiment, the height value of the target row is set according to the height target value of the text boxes in the target row, and the width value of the target column is set according to the width target value of the text boxes in the target column. The table lines of the target row are then drawn according to the height value of the target row and its position information in the target table image, and the table lines of the target column are drawn according to the width value of the target column and its position information in the target table image. The first border frame is obtained from the table lines generated for all rows and columns.

It should be noted that, in this embodiment, the maximum height among the text boxes in the target row is taken as the height value of the target row, and the maximum width among the text boxes in the target column is taken as the width value of the target column. In other embodiments, the average height of the text boxes in the target row may be used as the height value of the target row, and the average width of the text boxes in the target column as the width value of the target column. The user may also choose a setting according to actual needs, which is not specifically limited here.

It should be noted that the position information of the target row or target column may be its coordinate information, or may be its positional relationship to any adjacent row or column. It can be set according to actual needs and is not specifically limited here.

For example, if the heights of the text boxes in the target row are 4 cm, 3 cm, and 5 cm, the maximum of these heights, 5 cm, is taken as the height value of the target row. Similarly, the maximum width in the target column is taken as the width value of the target column. The height or width values of the other rows and columns are determined in the same way, which is not described in detail here.
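The selection rule in this example can be sketched as follows. The function name `target_value` and its `mode` parameter are hypothetical; the "mean" mode corresponds to the averaging alternative mentioned for other embodiments.

```python
from statistics import mean
from typing import Sequence


def target_value(sizes: Sequence[float], mode: str = "max") -> float:
    """Pick a row height (or column width) from the sizes of its text boxes.

    mode="max" follows this embodiment; mode="mean" follows the
    alternative described for other embodiments.
    """
    if not sizes:
        raise ValueError("at least one text box size is required")
    if mode == "max":
        return max(sizes)
    if mode == "mean":
        return mean(sizes)
    raise ValueError(f"unknown mode: {mode}")


# Heights from the example, in centimetres.
row_height = target_value([4, 3, 5])
```

With the heights from the example, `row_height` is 5, and the averaging alternative would give 4.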

It should be noted that pixels may also be used as the unit of measurement when determining the height value of the target row or the width value of the target column. For example, if the largest text box in a target row measures 1024 pixels and this is calculated to correspond to 8.67 cm, then 8.67 cm is taken as the height value of the target row. Other ways of determining the value may be used in other embodiments and can be chosen according to the user's specific needs, which is not specifically limited here.
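The pixel-to-centimetre conversion is not spelled out in this embodiment. A minimal sketch, assuming a scan resolution of 300 DPI (an assumption chosen because it reproduces the 8.67 cm figure, not a value stated in the text), could look like this:

```python
CM_PER_INCH = 2.54


def pixels_to_cm(pixels: float, dpi: float = 300.0) -> float:
    """Convert a length in pixels to centimetres.

    The 300 DPI default is an assumption; the embodiment only states
    that 1024 pixels corresponds to 8.67 cm.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels / dpi * CM_PER_INCH


height_cm = round(pixels_to_cm(1024), 2)
```

Under that assumption, 1024 pixels rounds to 8.67 cm. A different image resolution would give a different physical size for the same pixel count.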

According to the method for processing a table image provided by the present invention, the height target value of the text boxes in the target row is taken as the height value of the target row, and the width target value of the text boxes in the target column is taken as the width value of the target column. The table lines corresponding to the target row or target column are then drawn according to the height value and position of the target row and the width value and position of the target column in the target table image, and the first border frame is obtained. This ensures the accuracy of table line generation, so that the generated table lines are complete and clear.

Based on any of the above embodiments, in this embodiment, before the target table image is processed through the table recognition structure to obtain the first border frame, the method further includes:

binarizing the target table image to obtain a second binary image, where the second value in the second binary image marks the pixels corresponding to text characters in the target table image, and the first value in the second binary image marks the pixels other than text characters;

setting the table recognition structure for the second binary image.

In this embodiment, the target table image containing the text detection result is binarized to obtain a second binary image in which the pixels corresponding to text characters take the second value and all other pixels take the first value. The first value is a pixel value of 0, and the second value is a pixel value of 255. Morphological processing is then applied to the second binary image. Any text box in the processing result that contains an internal gap is split, yielding the multiple text boxes of the target table image.
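The value convention described here can be sketched in a few lines. The sketch assumes dark text on a light background and a fixed threshold of 128, neither of which is specified in this embodiment; the names `FIRST_VALUE`, `SECOND_VALUE`, and `binarize` are hypothetical.

```python
from typing import List

FIRST_VALUE = 0     # pixels that are not text characters
SECOND_VALUE = 255  # pixels that belong to text characters


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map a grayscale image (0..255) to the two values used above.

    Assumption: text is darker than the background, so pixels below
    the threshold are treated as text.
    """
    return [
        [SECOND_VALUE if pixel < threshold else FIRST_VALUE for pixel in row]
        for row in gray
    ]


gray = [
    [250, 30, 240],
    [245, 20, 25],
]
binary = binarize(gray)
```

In practice an adaptive threshold may suit scanned tables better than a fixed one; the fixed threshold is used here only to keep the sketch short.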

It should be noted that, in this embodiment, a table recognition structure also needs to be set for the obtained second binary image. The table recognition structure may be a line, a box, or a graph convolutional neural network model. It can be set according to the user's needs and is not specifically limited here.

According to the method for processing a table image provided by the present invention, the target table image containing the text detection result is binarized to obtain a second binary image in which the pixels corresponding to text characters take the second value and the pixels other than text characters take the first value. A table recognition structure is then set for the second binary image, which supports the subsequent determination of the first border frame according to the table recognition structure.

Based on any of the above embodiments, in this embodiment, adding the content in the target table image to the first border frame includes:

performing text detection on the target table image, and obtaining the text boxes in the target table image according to the text detection result;

setting text boxes in the first border frame;

filling the content of the text boxes in the target table image into the text boxes of the first border frame.

In this embodiment, text detection is performed on the target table image, and the text boxes in the target table image are obtained according to the text detection result. Multiple text boxes are set in the first border frame, and the data content of the text boxes in the target table image is filled into the corresponding text boxes in the first border frame, thereby restoring the data of the target table image.

In this embodiment, as shown in FIG. 11, a text detection unit performs text detection on the borderless table and identifies multiple pieces of text content. The multiple text boxes in the target table image are determined according to the text detection result. Each rectangular box represents one text box, and FIG. 11 contains multiple text boxes.

Specifically, after the text in the target table image is detected, binarization and morphological processing are applied to the text detection result to obtain the text boxes in the target table image. Image binarization sets the gray value of each pixel in the image to 0 or 255, so that the whole image shows a clear black-and-white effect. FIG. 12 is a schematic diagram of the text boxes obtained by the above text detection after binarization. Mathematical morphology is one of the most widely used techniques in image processing. It is mainly used to extract image components that are meaningful for expressing and describing the shape of a region, so that subsequent recognition can capture the most essential, that is, the most discriminative, shape features of the target object, such as boundaries and connected regions. FIG. 13 is a schematic diagram obtained by applying morphological processing to the binarized table image. The specific processing is described in the following embodiments and is not detailed here.

It should be noted that, in this embodiment, after the multiple text boxes in the target table image are determined, any text box that contains an internal gap is split. FIG. 14 is a schematic diagram obtained after splitting such text boxes. For example, the single text box "Zhang San General Manager 123456937" is split at its gaps into three text boxes: "Zhang San", "General Manager", and "123456937".
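One possible way to perform such a split is a column projection over the binarized text box: runs of empty columns that are at least a given width are treated as gaps. This is a sketch under that assumption, not necessarily the splitting rule used in this embodiment; `split_box_by_gaps` and `min_gap` are hypothetical names.

```python
from typing import List, Tuple


def split_box_by_gaps(binary: List[List[int]], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Return [start, end) column ranges of the text segments in one box.

    binary uses 0 for background and a non-zero value for text pixels.
    A run of at least min_gap empty columns separates two segments;
    narrower runs (for example the space between characters) do not.
    """
    if not binary or not binary[0]:
        return []
    width = len(binary[0])
    has_ink = [any(row[x] != 0 for row in binary) for x in range(width)]

    segments: List[Tuple[int, int]] = []
    start = None  # first column of the current segment
    gap = 0       # number of consecutive empty columns seen so far
    for x, ink in enumerate(has_ink):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The segment ended just before the gap began.
                segments.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        segments.append((start, width - gap))
    return segments


# One-row toy image: two segments separated by a three-column gap.
row = [255, 255, 0, 0, 0, 255, 0, 255, 255, 0]
segments = split_box_by_gaps([row], min_gap=2)
```

The single empty column inside the second segment is narrower than `min_gap`, so that segment stays in one piece.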

In this embodiment, the multiple text boxes obtained after splitting are binarized, and the horizontal table line regions and vertical table line regions are identified. The center point of each non-intersecting region is then determined from the identified table line regions, and the center points are connected to obtain the first border frame corresponding to the target table image. The details are given in the following embodiments and are not repeated here.

It should be noted that, in this embodiment, the table lines corresponding to the target table image are generated by determining the horizontal and vertical table regions, determining the center points of the translated table line regions, and connecting those center points. Other embodiments may use other generation methods, such as generating table lines from the positional relationships of text boxes, which is not specifically limited here.

It should be noted that, in this embodiment, multiple text boxes are set in the obtained first border frame, and the text content of the text boxes in the target table image is added to the corresponding text boxes in the first border frame. An established text recognition method from the prior art is used to recognize the text content of the text boxes in the target table image, and the recognition results are filled into the corresponding text boxes in the first border frame. The specific text recognition process is not described in detail here.

According to the method for processing a table image provided by the present invention, text detection is performed on the target table image, the text boxes in the target table image are obtained according to the text detection result, text boxes are set in the first border frame, and the content of the text boxes in the target table image is filled into the text boxes of the first border frame. This allows the text data in the table image to be restored accurately and improves the processing speed of the table image.
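The filling step can be sketched as a lookup from each detected text box to the cell that contains its center. The sketch assumes the table lines are already known as sorted coordinates and that the text of each box has already been recognized; the function `fill_cells` and its parameters are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def fill_cells(
    row_lines: Sequence[float],
    col_lines: Sequence[float],
    boxes: Sequence[Tuple[Box, str]],
) -> List[List[str]]:
    """Place recognized text into the grid formed by the table lines.

    row_lines and col_lines are the sorted y and x coordinates of the
    horizontal and vertical table lines. A box is assigned to the cell
    that contains its center point.
    """
    n_rows, n_cols = len(row_lines) - 1, len(col_lines) - 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for (x0, y0, x1, y1), text in boxes:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        r = bisect_right(row_lines, cy) - 1
        c = bisect_right(col_lines, cx) - 1
        if 0 <= r < n_rows and 0 <= c < n_cols:
            grid[r][c] = text
    return grid


grid = fill_cells(
    row_lines=[0, 10, 20],
    col_lines=[0, 50, 100, 150],
    boxes=[
        ((5, 2, 30, 8), "Zhang San"),
        ((60, 2, 90, 8), "General Manager"),
        ((110, 2, 140, 8), "123456937"),
    ],
)
```

Boxes whose center falls outside the frame are skipped in this sketch; a real implementation would need a policy for such cases and for several boxes landing in one cell.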

Based on any of the above embodiments, in this embodiment, determining that the target table image is a borderless table includes:

inputting the target table image into a pre-trained table classification model;

determining that the target table image is a borderless table according to the output of the table classification model;

where the table classification model is trained on sample tables and the category labels of the sample tables.

In this embodiment, the target table image is input into a pre-trained table classification model, and the target table image is determined to be a borderless table according to the output of the model. The table classification model is trained on sample tables and their category labels, and it has two possible outputs: borderless table and bordered table. The target table image is determined to be a borderless table only when none of its cells has table lines.

It should be noted that the table classification model may be a VGG model or a ResNet model. The selected model is trained to obtain the table classification model, which is used to identify the type of the target table image. The VGG model performs well across many transfer learning tasks and is a common first choice for extracting CNN features from images. The ResNet model is a deep residual network, which achieves highly accurate image recognition, speech recognition, and similar capabilities through very deep networks.

For example, target table image 1 and target table image 2 are input into the pre-trained table classification model. The model output indicates that no cell in target table image 1 has border lines, while table border lines are partially present in target table image 2. Target table image 1 is therefore determined to be a borderless table, and target table image 2 a partially bordered table.

According to the method for processing a table image provided by the present invention, the type of the target table image can be determined by a pre-trained table classification model, and the target table image is determined to be a borderless table according to the output of the model. This allows the type of the target table image to be identified accurately and improves the speed of subsequent borderless table processing.

Based on any of the above embodiments, in this embodiment, the content in the first border frame is in an editable state.

In this embodiment, the content in the first border frame obtained by converting the target table image is in an editable state. A target table image of the borderless type is converted according to the table image processing method provided above to obtain a first border frame with table lines. Text recognition is then performed on each cell of the target table image, and the recognition results are filled into the corresponding cells of the first border frame to obtain an editable table.

According to the method for processing a table image provided by the present invention, the content in the obtained first border frame is in an editable state, which meets the user's editing needs and improves the user experience.

Based on any of the above embodiments, in this embodiment, the table classification model is first used to confirm that the target table image is a borderless table. The text detection unit then performs text detection on the borderless table to obtain multiple text boxes. Binarization and morphological processing are applied to the target table image with these text boxes, and, according to the processing result, any text box containing an internal gap is split, yielding the target table image with the split text boxes.

In this embodiment, table lines are then generated from the multiple text boxes of the target table image. Binarization is performed first to obtain the first binary image. The vertical table line region is extracted from the first binary image along the vertical direction, and the horizontal table line region is extracted along the horizontal direction, where the vertical and horizontal table lines are given templates. The two resulting regions are then translated, and the white areas of the translated regions are set as non-intersecting regions.

In this embodiment, the contour of each non-intersecting region is determined from the translated regions by contour finding. The coordinate information of the non-intersecting region is determined, and the coordinates of its center point are computed as the average of the vertex coordinates in each direction. The center points are connected horizontally and vertically to generate the table lines of the target table image, and an editable first border frame is obtained.
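The center-point computation and the connecting step can be sketched as follows. The sketch assumes the contour vertices of each non-intersecting region have already been extracted by a contour-finding routine, and that centers belonging to one line share exactly the same coordinate; both are simplifying assumptions, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def contour_center(vertices: Sequence[Point]) -> Point:
    """Center of a region: the mean of its vertex coordinates per axis."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def connect_centers(centers: Sequence[Point]) -> Tuple[List[Segment], List[Segment]]:
    """Join centers that share a y value (horizontal) or an x value (vertical)."""
    by_y: Dict[float, List[float]] = defaultdict(list)
    by_x: Dict[float, List[float]] = defaultdict(list)
    for x, y in centers:
        by_y[y].append(x)
        by_x[x].append(y)
    horizontal = [((min(xs), y), (max(xs), y)) for y, xs in sorted(by_y.items())]
    vertical = [((x, min(ys)), (x, max(ys))) for x, ys in sorted(by_x.items())]
    return horizontal, vertical


# Four square regions given by their corner vertices.
regions = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(4, 0), (6, 0), (6, 2), (4, 2)],
    [(0, 4), (2, 4), (2, 6), (0, 6)],
    [(4, 4), (6, 4), (6, 6), (4, 6)],
]
centers = [contour_center(r) for r in regions]
horizontal, vertical = connect_centers(centers)
```

On real images the centers rarely align exactly, so a tolerance-based grouping would likely be needed before connecting them.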

It should be noted that, in this embodiment, the table lines may also be determined from the text boxes and the positional relationships between them. For example, a graph convolutional network model is built from the positional relationships between text boxes, and the trained model is used to determine which text boxes are in the same column and which are in the same row. Based on the positional relationships of the rows and columns, the table lines that completely enclose all text boxes in the target table image are then determined.
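Once pairwise "same row" and "same column" relations are available, the boxes can be grouped into rows and columns. The sketch below uses a union-find structure for the grouping; this is one possible approach, not necessarily the one used with the graph convolutional network model here, and `group_boxes` is a hypothetical name.

```python
from typing import Iterable, List, Tuple


def group_boxes(n: int, same_pairs: Iterable[Tuple[int, int]]) -> List[List[int]]:
    """Group box indices 0..n-1 so that related boxes share a group.

    same_pairs lists index pairs predicted to be in the same row
    (or, in a second call, in the same column).
    """
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in same_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: dict = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four boxes laid out as a 2 x 2 table, indexed row by row.
rows = group_boxes(4, [(0, 1), (2, 3)])
cols = group_boxes(4, [(0, 2), (1, 3)])
```

The resulting row and column groups can then feed the height and width selection described in the earlier embodiments.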

FIG. 15 shows an apparatus for processing a table image provided by the present invention. As shown in FIG. 15, the apparatus includes:

a determination module 1501, configured to determine that the target table image is a borderless table;

a processing module 1502, configured to process the target table image through the table recognition structure to obtain a first border frame;

an adding module 1503, configured to add the content in the target table image to the first border frame.

According to the apparatus for processing a table image provided by the present invention, the target table image is determined to be a borderless table, the target table image is processed through the table recognition structure to obtain a first border frame, and the content in the target table image is then added to the first border frame. The table image processing method provided by the present invention can obtain the first border frame quickly and effectively and convert the table image into an editable table accurately. Because the table information is obtained from the table image itself, it is more complete and clearer, which improves the processing speed of the table image and the user experience.

Further, the table recognition structure is a line, a sliding window, or a graph convolutional neural network model.

According to the apparatus for processing a table image provided by the present invention, the table recognition structure can be set as a line, a sliding window, or a graph convolutional neural network model. This allows the target table image to be processed in several ways, ensures the processing quality of the table image, and improves the user experience.

Further, the processing module 1502 is also configured to:

move the table recognition structure along a first direction of the target table image, and record a first region that covers no content;

and/or move the table recognition structure along a second direction of the target table image, and record a second region that covers no content;

generate table lines in at least one of the first region and the second region to obtain the first border frame.

According to the apparatus for processing a table image provided by the present invention, the table recognition structure is moved along the first direction of the target table image and the first region that covers no content is recorded, and/or the table recognition structure is moved along the second direction of the target table image and the second region that covers no content is recorded. Table lines are then generated in at least one of the first region and the second region to obtain the first border frame. Table lines can thus be generated in regions along different directions of the target table image, which improves the speed of table image processing.

Further, when the table recognition structure is a line, the table recognition structure includes horizontal lines and vertical lines, and the processing module 1502 is also configured to:

translate the first-direction lines and the second-direction lines in the target table image based on the content distribution in the target table image, and determine the non-intersecting regions in the target table image;

determine target points in the non-intersecting regions, and connect the target points to obtain the first border frame.

According to the apparatus for processing a table image provided by the present invention, the first-direction lines and second-direction lines are superimposed in the target table image based on the text distribution in the target table image, and the non-intersecting regions in the target table image are determined. Target points are then determined in each non-intersecting region and connected to obtain the first border frame. The operation is simple and ensures the accuracy of table image restoration.

Further, the processing module 1502 is also configured to:

binarize the target table image to obtain a first binary image, where the first value in the first binary image marks the pixels corresponding to content distribution regions in the target table image, and the second value in the first binary image marks the pixels corresponding to non-content regions in the target table image;

process the pixels in the first binary image along the first direction to obtain first-direction lines;

process the pixels in the first binary image along the second direction to obtain second-direction lines;

cross the first-direction lines with the second-direction lines, and take the regions not covered by the first-direction lines or the second-direction lines after crossing as the non-intersecting regions in the target table image.

According to the apparatus for processing a table image provided by the present invention, the target table image is binarized to obtain a first binary image. The pixels in the first binary image are processed along the first direction to obtain first-direction lines and along the second direction to obtain second-direction lines. The first-direction lines are crossed with the second-direction lines, and the regions not covered by either set of lines after crossing are taken as the non-intersecting regions in the target table image, which improves the processing speed of the table image.

Further, the processing module 1502 is also configured to:

determine the contour of the non-intersecting region by contour finding;

take one coordinate point of the non-intersecting region as the target point according to the contour of the non-intersecting region.

According to the apparatus for processing a table image provided by the present invention, the contour of each non-intersecting region is determined by contour finding, and the target point in each non-intersecting region is then determined according to its contour. The first border frame of the target table image can thus be generated from center points, which improves the accuracy of generating the first border frame.

Further, when the table recognition structure is a box, the table recognition structure includes a first-direction box and a second-direction box;

correspondingly, the processing module 1502 is also configured to:

translate a first-direction sliding window in the target table image based on the content distribution in the target table image to obtain the first-direction box, and translate a second-direction sliding window in the target table image to obtain the second-direction box;

set first-direction table lines based on the first-direction box, and set second-direction table lines based on the second-direction box, to obtain the first border frame.

According to the table image processing apparatus provided by the present invention, a first-direction sliding window is translated across the target table image, based on the content distribution in the image, to obtain a first-direction frame body, and a second-direction sliding window is translated likewise to obtain a second-direction frame body. First-direction table lines are set based on the first-direction frame body and second-direction table lines based on the second-direction frame body, yielding the first border frame. The first border frame can thus be obtained accurately, and the processing speed of the table image is improved.

Further, the processing module 1502 is also configured to:

while the first-direction sliding window slides along the direction perpendicular to the first direction, determine, according to the content distribution in the target table image, the sliding range over which the first-direction sliding window does not intersect text, and determine the first-direction frame body from the shape of the first-direction sliding window and that sliding range;

while the second-direction sliding window slides along the direction perpendicular to the second direction, determine, according to the text distribution in the target table image, the sliding range over which the second-direction sliding window does not intersect text, and determine the second-direction frame body from the shape of the second-direction sliding window and that sliding range.

According to the table image processing apparatus provided by the present invention, sliding windows are set on the target table image, the sliding ranges that do not intersect text content are determined as the windows slide, and the first-direction frame body and the second-direction frame body are determined from the shape of each sliding window and its sliding range. The operation is simple and improves the processing speed of the table image.
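The sliding-window step can be illustrated with the sketch below. It assumes that the first-direction window is a full-width horizontal strip sliding downward one pixel at a time, that text boxes are axis-aligned `(x0, y0, x1, y1)` tuples with exclusive upper bounds, and that each frame body is the band swept by the window over one uninterrupted text-free sliding range. The names `free_bands` and `free_columns` are hypothetical.

```python
def free_bands(text_boxes, image_height, window_height=1):
    """Return (top, bottom) bands swept by a full-width window that never touches text."""
    bands = []
    start = None
    for top in range(image_height - window_height + 1):
        bottom = top + window_height
        # The window [top, bottom) hits a box when their vertical extents overlap.
        hits = any(y0 < bottom and top < y1 for (_, y0, _, y1) in text_boxes)
        if not hits:
            if start is None:
                start = top
        elif start is not None:
            # The last free window started at top - 1.
            bands.append((start, top - 1 + window_height))
            start = None
    if start is not None:
        bands.append((start, image_height))
    return bands


def free_columns(text_boxes, image_width, window_width=1):
    """Same search for a full-height window sliding left to right."""
    swapped = [(y0, x0, y1, x1) for (x0, y0, x1, y1) in text_boxes]
    return free_bands(swapped, image_width, window_width)


# Hypothetical text boxes on a 10 x 10 image.
boxes = [(1, 2, 4, 4), (6, 2, 9, 4), (1, 6, 4, 8)]
row_bands = free_bands(boxes, image_height=10)
col_bands = free_columns(boxes, image_width=10)
```

A table line could then be placed inside each band, for example along its midline; where exactly to draw it is a design choice the text leaves open.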

Further, in the case where the table recognition structure is a graph convolutional neural network model, the first border frame is determined according to the positional relationships between text boxes in the target table image.

According to the table image processing apparatus provided by the present invention, in the case where the table recognition structure is a graph convolutional neural network model, the first border frame is determined according to the positional relationships between text boxes in the target table image. The trained graph convolutional neural network model can accurately and quickly determine the positional relationships of the text boxes in the target table image, from which the first border frame is obtained.

Further, the processing module 1502 is also configured to:

input pre-acquired position information of a plurality of text boxes in the target table image, together with the target table image, into a pre-trained graph convolutional neural network model, to obtain the positional relationship between any one of the text boxes and its adjacent text boxes; wherein the graph convolutional neural network model is trained on position information of a plurality of text boxes in sample tables, sample table images, and label information of the sample tables;

determine the text boxes of each row and the text boxes of each column in the target table image according to the positional relationship between any one of the text boxes and its adjacent text boxes;

generate the first border frame of the target table image according to the text boxes of each row and the text boxes of each column in the target table image.

According to the table image processing apparatus provided by the present invention, the position information of a plurality of text boxes and the target table image are input into the pre-trained graph convolutional neural network model to obtain the positional relationship between any text box and its adjacent text boxes. The text boxes of each row and of each column are then determined from these positional relationships, and the first border frame corresponding to the target table image is generated from the text boxes of each row and column. Generating the first border frame from the positional relationships of the text boxes improves the accuracy of table line generation.
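The step from pairwise relationships to rows and columns can be sketched as below. The graph convolutional model itself is not reproduced here; the sketch only assumes, hypothetically, that its output can be reduced to triples `(i, j, relation)` with `relation` equal to `"same_row"` or `"same_col"` for adjacent text boxes `i` and `j`. The grouping uses a plain union-find, which is a choice made for this illustration rather than something stated in the source.

```python
def group_boxes(box_count, relations, kind):
    """Merge text-box indices linked by relations of the given kind into groups."""
    parent = list(range(box_count))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, relation in relations:
        if relation == kind:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(box_count):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical model output for four text boxes laid out as a 2 x 2 table.
relations = [
    (0, 1, "same_row"),
    (2, 3, "same_row"),
    (0, 2, "same_col"),
    (1, 3, "same_col"),
]
rows = group_boxes(4, relations, "same_row")
cols = group_boxes(4, relations, "same_col")
```

Because only adjacent pairs need to be predicted, transitive merging recovers whole rows and columns even when distant boxes are never compared directly.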

Further, the processing module 1502 is also configured to:

set the height value of a target row of the target table image according to the height target value of the text boxes in that target row; wherein the target row is any row of the target table image;

set the width value of a target column of the target table image according to the width target value of the text boxes in that target column; wherein the target column is any column of the target table image;

draw the table lines of the target row according to the height value of the target row and the position of the target row in the target table image; and/or draw the table lines of the target column according to the width value of the target column and the position of the target column in the target table image, to obtain the first border frame of the target table image.

According to the table image processing apparatus provided by the present invention, the height target value of the text boxes in a target row is taken as the height value of that row, and the width target value of the text boxes in a target column is taken as the width value of that column. The table lines of the target row or target column are then drawn according to the row's height value and position in the target table image and the column's width value and position in the target table image, yielding the first border frame. This ensures that the table lines are generated accurately and that the generated table lines are complete and clear.
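A minimal sketch of deriving table line positions from row heights and column widths follows. The source does not say how the "target value" is chosen; the sketch assumes it is the maximum over the text boxes of the row or column, and it places each interior line midway between neighboring bands. Boxes are `(x0, y0, x1, y1)` tuples, and the names `band` and `table_lines` are hypothetical.

```python
def band(boxes, lo, hi):
    """Extent of one row or column: earliest start plus the largest box size."""
    start = min(box[lo] for box in boxes)
    size = max(box[hi] - box[lo] for box in boxes)
    return start, start + size


def table_lines(groups, lo, hi):
    """Line coordinates for rows (lo=1, hi=3) or columns (lo=0, hi=2)."""
    bands = sorted(band(group, lo, hi) for group in groups)
    lines = [bands[0][0]]                      # outer line before the first band
    for (_, end), (next_start, _) in zip(bands, bands[1:]):
        lines.append((end + next_start) / 2)   # interior line between two bands
    lines.append(bands[-1][1])                 # outer line after the last band
    return lines


# Hypothetical 2 x 2 table: each group lists the text boxes of one row or column.
row_groups = [[(1, 2, 4, 4), (6, 2, 9, 5)], [(1, 7, 4, 9), (6, 7, 9, 9)]]
col_groups = [[(1, 2, 4, 4), (1, 7, 4, 9)], [(6, 2, 9, 5), (6, 7, 9, 9)]]
horizontal = table_lines(row_groups, 1, 3)
vertical = table_lines(col_groups, 0, 2)
```

Here the first row takes height 3 from its tallest box, so the line separating the two rows falls at y = 6.0, halfway between the bottom of the first row (5) and the top of the second (7).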

Further, the table image processing apparatus is also configured to:

binarize the target table image to obtain a second binary image; wherein the second value in the second binary image marks the pixels corresponding to text characters in the target table image, and the first value in the second binary image marks the pixels of the target table image other than text characters;

set the table recognition structure on the second binary image.

According to the table image processing apparatus provided by the present invention, the target table image containing the text detection result is binarized to obtain a second binary image in which the pixels corresponding to text characters take the second value and all other pixels of the target table image take the first value. The table recognition structure is then set on the second binary image, which supports the subsequent determination of the first border frame according to the table recognition structure.
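One way to picture the second binary image is to rasterize the detected text boxes into a two-valued grid, as in the sketch below. It assumes the first value is 0 and the second value is 1, and it marks whole text boxes rather than individual character strokes; both choices, and the name `text_mask`, are assumptions made for illustration.

```python
def text_mask(width, height, text_boxes, first_value=0, second_value=1):
    """Build a two-valued image where text-box pixels carry second_value."""
    image = [[first_value] * width for _ in range(height)]
    for x0, y0, x1, y1 in text_boxes:
        # Clip each box to the image and fill it; upper bounds are exclusive.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                image[y][x] = second_value
    return image


second_binary = text_mask(4, 3, [(1, 1, 3, 2)])
```

Lines or sliding windows can then be tested against this grid directly, since any pixel holding the second value belongs to text.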

Further, the adding module 1503 is also configured to:

perform text detection on the target table image, and obtain the text boxes in the target table image according to the text detection result;

set text boxes in the first border frame;

fill the content of the text boxes in the target table image into the text boxes of the first border frame.

According to the table image processing apparatus provided by the present invention, text detection is performed on the target table image, a plurality of text boxes in the target table image are obtained from the text detection result, a plurality of text boxes are set in the first border frame, and the content of the text boxes in the target table image is filled into the text boxes of the first border frame. The text data in the table image can thus be restored precisely, and the processing speed of the table image is improved.
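Filling detected text into the cells of the first border frame can be sketched as follows. The sketch assumes the border frame is described by sorted vertical line positions `xs` and horizontal line positions `ys`, that the recognized text of each box is already available as a string, and that a box belongs to the cell containing its center. The name `fill_cells` is hypothetical.

```python
from bisect import bisect_right


def fill_cells(xs, ys, texts):
    """Place each (box, string) pair into the grid cell containing the box center."""
    grid = [["" for _ in range(len(xs) - 1)] for _ in range(len(ys) - 1)]
    for (x0, y0, x1, y1), content in texts:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        col = bisect_right(xs, cx) - 1
        row = bisect_right(ys, cy) - 1
        # Ignore boxes whose center falls outside the border frame.
        if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
            grid[row][col] = (grid[row][col] + " " + content).strip()
    return grid


# Hypothetical frame lines and detected text for a 2 x 2 table.
xs = [1, 5.0, 9]
ys = [2, 6.0, 9]
texts = [
    ((1, 2, 4, 4), "Name"),
    ((6, 2, 9, 5), "Qty"),
    ((1, 7, 4, 9), "Pen"),
    ((6, 7, 9, 9), "3"),
]
grid = fill_cells(xs, ys, texts)
```

The resulting grid of strings is the kind of structure that an editable table in a document or spreadsheet can be built from.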

Further, the determining module 1501 is also configured to:

input the target table image into a pre-trained table classification model;

determine, according to the output of the table classification model, that the target table image is a borderless table;

wherein the table classification model is trained on sample tables and the category labels of the sample tables.

According to the table image processing apparatus provided by the present invention, the pre-trained table classification model can determine the type of the target table image, and the target table image is determined to be a borderless table according to the output of the table classification model. The type of the target table image can thus be identified accurately, which improves the processing speed of the subsequent borderless-table handling.

Further, the content in the first border frame is in an editable state.

According to the table image processing apparatus provided by the present invention, the content in the resulting first border frame is in an editable state, which meets the user's editing needs and improves the user experience.

Since the apparatus described in this embodiment of the present invention works on the same principle as the method described in the foregoing embodiments, a more detailed explanation is not repeated here.

FIG. 16 is a schematic diagram of the physical structure of an electronic device provided in an embodiment of the present invention. As shown in FIG. 16, the present invention provides an electronic device including: a processor 1601, a memory 1602, and a bus 1603;

wherein the processor 1601 and the memory 1602 communicate with each other through the bus 1603;

the processor 1601 is configured to call program instructions in the memory 1602 to execute the methods provided in the foregoing method embodiments, for example including: determining that the target table image is a borderless table; processing the target table image through the table recognition structure to obtain a first border frame; and adding the content in the target table image into the first border frame.

An embodiment of the present invention provides a non-transitory computer-readable storage medium that stores computer instructions, the computer instructions causing a computer to execute the methods provided in the foregoing method embodiments, for example including: determining that the target table image is a borderless table; processing the target table image through the table recognition structure to obtain a first border frame; and adding the content in the target table image into the first border frame.

The present invention also provides a computer program product including a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions which, when executed by a computer, enable the computer to execute the methods provided in the foregoing method embodiments, the method including: determining that the target table image is a borderless table; processing the target table image through the table recognition structure to obtain a first border frame; and adding the content in the target table image into the first border frame.

Those of ordinary skill in the art will understand that all or part of the steps of the foregoing method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (13)

1. A method of processing a form image, comprising:
determining the target table image as a borderless table;
processing the target form image through a form identification structure to obtain a first frame;
and adding the content in the target form image into the first frame.
2. A method of processing form images according to claim 1, the method further comprising:
the table identification structure is a line, a sliding window or a graph convolution neural network model.
3. A method of processing form images according to claim 2, the method further comprising:
in a case that the table identification structure is the line or the sliding window, the processing the target table image through the table identification structure to obtain a first frame includes:
moving the table recognition structure along a first direction of the target table image, and recording a first area which does not cover the content;
and/or moving the table identification structure along a second direction of the target table image, and recording a second area which does not cover the content;
and generating a table line in at least one of the first area and the second area to obtain a first frame.
4. A method of processing form images according to claim 2, the method further comprising:
in the case that the table identification structure is the line, the table identification structure comprises a first direction line and a second direction line;
the processing the target form image through the form recognition structure to obtain a first frame, including:
translating the first direction line and the second direction line in the target form image based on the content distribution condition in the target form image, and determining a non-intersection area in the target form image;
and determining a target point in the non-intersection area, and connecting the target point to obtain a first frame.
5. The method of processing a form image according to claim 4, wherein the determining a non-intersection region in the target form image by translating the first direction line and the second direction line in the target form image based on the content distribution in the target form image comprises:
performing binarization processing on the target table image to obtain a first binary image; wherein, a first value in the first binary image is a pixel corresponding to a content distribution area in the target form image, and a second value in the first binary image is a pixel corresponding to a non-content distribution area in the target form image;
processing the pixels in the first binary image according to a first direction to obtain a first direction line;
processing the pixels in the first binary image according to a second direction to obtain a second direction line;
and intersecting the first direction lines and the second direction lines, and taking the intersected area which is not covered by the first direction lines or the second direction lines as a non-intersecting area in the target form image.
6. The method of processing a form image of claim 4, wherein the determining target points in non-intersecting regions comprises:
determining the contour of the non-intersection region through contour searching;
and taking a coordinate point of the non-intersection area as the target point according to the contour of the non-intersection area.
7. A method of processing form images according to claim 2, the method further comprising:
in the case that the table identification structure is the sliding window, the table identification structure comprises a first direction sliding window and a second direction sliding window;
the processing the target form image through the form recognition structure to obtain a first frame, including:
translating the first-direction sliding window in the target form image based on the content distribution condition in the target form image to obtain a first-direction frame body; translating the second-direction sliding window in the target form image to obtain a second-direction frame body;
and setting a first direction table line based on the first direction frame body, and setting a second direction table line based on the second direction frame body to obtain a first frame.
8. A method of processing form images according to claim 2, the method further comprising:
and under the condition that the table identification structure is the graph convolution neural network model, determining a first frame according to the position relation between a first text box and a second text box in the target table image.
9. The method of processing a form image of claim 1, wherein prior to said processing the target form image by the form recognition structure resulting in a first frame, the method further comprises:
carrying out binarization processing on the target table image to obtain a second binary image; wherein, the second value in the second binary image is the pixel corresponding to the text character in the target form image, and the first value in the second binary image is the pixel except the text character in the target form image;
and setting a table identification structure for the second binary image.
10. The method of processing a form image of claim 1, wherein the adding content in the target form image into the first frame comprises:
performing text detection on the target form image, and obtaining a text box in the target form image according to a text detection result;
setting a text box in the first frame;
filling the content in the text box in the target form image into the text box of the first frame.
11. A form image processing apparatus, comprising:
the determining module is used for determining the target form image as a borderless form;
the processing module is used for processing the target form image through the form identification structure to obtain a first frame;
and the adding module is used for adding the content in the target form image into the first frame.
12. An electronic device, comprising: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the steps of the method of processing a form image according to any one of claims 1 to 10.
13. A computer-readable storage medium, characterized in that it stores computer instructions that cause a computer to perform the steps of the method of processing a form image according to any one of claims 1 to 10.
CN202111668079.1A 2021-12-31 2021-12-31 A table image processing method, device, electronic equipment and medium Active CN114417792B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111668079.1A CN114417792B (en) 2021-12-31 2021-12-31 A table image processing method, device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111668079.1A CN114417792B (en) 2021-12-31 2021-12-31 A table image processing method, device, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN114417792A true CN114417792A (en) 2022-04-29
CN114417792B CN114417792B (en) 2025-03-07

Family

ID=81272225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111668079.1A Active CN114417792B (en) 2021-12-31 2021-12-31 A table image processing method, device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN114417792B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104094282A (en) * 2012-01-23 2014-10-08 微软公司 Borderless table detection engine
US20190266394A1 (en) * 2018-02-26 2019-08-29 Abc Fintech Co., Ltd. Method and device for parsing table in document image
US20200089946A1 (en) * 2018-06-11 2020-03-19 Innoplexus Ag System and method for extracting tabular data from electronic document
CN113239818A (en) * 2021-05-18 2021-08-10 上海交通大学 Cross-modal information extraction method of tabular image based on segmentation and graph convolution neural network
CN113408323A (en) * 2020-03-17 2021-09-17 华为技术有限公司 Extraction method, device and equipment of table information and storage medium

Also Published As

Publication number Publication date
CN114417792B (en) 2025-03-07

Similar Documents

Publication Publication Date Title
CN113221743B (en) Table analysis method, apparatus, electronic device and storage medium
WO2019192397A1 (en) End-to-end recognition method for scene text in any shape
CN110796031A (en) Table identification method and device based on artificial intelligence and electronic equipment
CN111488826A (en) Text recognition method and device, electronic equipment and storage medium
Arai et al. Method for real time text extraction of digital manga comic
CN108171104A (en) A kind of character detecting method and device
CN112085024A (en) A method for character recognition on the surface of a tank
CN113627439A (en) Text structuring method, processing device, electronic device and storage medium
WO2021190155A1 (en) Method and apparatus for identifying spaces in text lines, electronic device and storage medium
CN113569608A (en) Text recognition method, device and equipment based on deep learning and storage medium
CN113239818A (en) Cross-modal information extraction method of tabular image based on segmentation and graph convolution neural network
CN111832551B (en) Text image processing method, device, electronic scanning equipment and storage medium
WO2024041032A1 (en) Method and device for generating editable document based on non-editable graphics-text image
CN114663904A (en) PDF document layout detection method, device, equipment and medium
CN112364834A (en) Form identification restoration method based on deep learning and image processing
CN111666937A (en) Method and system for recognizing text in image
CN111310757B (en) Video bullet screen detection and identification method and device
CN115761773A (en) In-image table recognition method and system based on deep learning
CN115114229A (en) Document format conversion method, device, storage medium, equipment and program product
CN113468979A (en) Text line language identification method and device and electronic equipment
CN110991440B (en) Pixel-driven mobile phone operation interface text detection method
CN111563505A (en) A method and device for character detection based on pixel segmentation and merging
CN114399782B (en) Text image processing method, apparatus, device, storage medium, and program product
CN112132916A (en) Seal cutting work customized design generation device utilizing generation countermeasure network
CN115620322A (en) A method for identifying the table structure of full-line tables based on key point detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant