WO2021068682A1

WO2021068682A1 - Method and apparatus for intelligently filtering table text, and computer-readable storage medium

Info

Publication number: WO2021068682A1
Application number: PCT/CN2020/112334
Authority: WO
Inventors: 石明川; 李路路
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-10-11
Filing date: 2020-08-30
Publication date: 2021-04-15
Also published as: CN110929561A; CN110929561B

Abstract

The present application relates to artificial intelligence technology, and disclosed therein is a method for intelligently filtering table text, comprising: acquiring a document-based table image set, and performing a preprocessing operation on the table image set to obtain a standard table image set; enhancing the standard table image set by using an image enhancement algorithm so as to obtain a table key image region set; performing feature image extraction on the table key image region set so as to obtain a feature table image set; performing text position detection on the feature table image set by using a pre-constructed table text filtering model; if the position of the text is detected, filtering the text and then storing a corresponding feature table image; and if the position of the text is not detected, directly saving a corresponding feature table image to thereby complete the text filtering of the table image set. Further proposed in the present application are an apparatus for intelligently filtering table text and a computer-readable storage medium. The present application achieves the intelligent filtering of table text.

Description

Table text intelligent filtering method, device and computer readable storage medium

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on October 11, 2019, the application number is 201910965807.1, and the invention title is "Intelligent Form Text Filtering Method, Device, and Computer-readable Storage Medium". The entire content of the Chinese patent application is approved. The reference is incorporated in this application.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a method, device and computer-readable storage medium for intelligently filtering table text.

Background technique

There are various classifiers on the market, but most companies use traditional classification algorithms such as KNN, SVM, and BP neural network. The inventor realized that these traditional classifiers are usually not effective in table text filtering tasks, and the classification accuracy rate has not reached a very high level, especially for the bill table text filtering processing in the insurance industry. problem.

Summary of the invention

An intelligent filtering method for table text provided in this application includes:

Acquiring a document-based form image set, and performing a preprocessing operation on the form image set to obtain a standard form image set;

Performing enhancement processing on the standard table image set by using an image enhancement algorithm to obtain a table key image area set;

Performing feature image extraction on the table key image area set to obtain a feature table image set;

Use the pre-built table text filtering model to detect the text position of the feature table image set. If the position of the text in the feature table image of the feature table image set is detected, filter the text and save the feature For the table image, if the position of the text in the characteristic table image of the characteristic table image set is not detected, the characteristic table image is directly saved, thereby completing the text filtering of the table image set.

The present application also provides an electronic device that includes a memory and a processor, the memory stores a table text filtering program that can be run on the processor, and the table text filtering program is executed by the processor When implementing the following steps:

The present application also provides a computer-readable storage medium having a table text filtering program stored on the computer-readable storage medium, and the table text filtering program can be executed by one or more processors to realize the following table The steps of the text intelligent filtering method: obtaining a document-based table image set, and preprocessing the table image set to obtain a standard table image set;

This application also provides an intelligent filtering device for form text, which includes:

The image preprocessing module is used to obtain a document-based table image set, and perform a preprocessing operation on the table image set to obtain a standard table image set;

An enhancement processing module, configured to perform enhancement processing on the standard table image set by using an image enhancement algorithm to obtain a table key image area set;

The feature extraction module is used to perform feature image extraction on the table key image area set to obtain a feature table image set;

The filtering module is configured to use a pre-built table text filtering model to perform text position detection on the feature table image set, and filter the text if the position of the text in the feature table image of the feature table image set is detected After that, the characteristic table image is saved, and if the position of the text in the characteristic table image of the characteristic table image set is not detected, the characteristic table image is directly saved, thereby completing the text filtering of the table image set.

Description of the drawings

FIG. 1 is a schematic flowchart of a method for intelligently filtering table text provided by an embodiment of this application;

2 is a schematic diagram of the internal structure of an electronic device provided by an embodiment of the application;

FIG. 3 is a schematic diagram of modules of a table text intelligent filtering device provided by an embodiment of the application.

The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed ways

It should be understood that the specific embodiments described here are only used to explain the application, and not used to limit the application.

This application provides an intelligent filtering method for table text. Referring to FIG. 1, it is a schematic flowchart of a method for intelligently filtering table text provided by an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.

In this embodiment, the intelligent filtering method for table text includes:

S1. Obtain a document-based table image set, and perform a preprocessing operation on the table image set to obtain a standard table image set.

In a preferred embodiment of the present application, the document includes a word document. Wherein, the word document will contain a large amount of text content in the form of a table. Preferably, in this application, the text content in the form of a table is scanned to obtain a table image, according to the table The image combination forms a table image set.

Further, this application obtains the word document in the following two ways: method one, downloading it from major search engines using keyword terms; method two, downloading it from major professional academic websites, for example, China HowNet.

Preferably, in a preferred embodiment of the present application, the preprocessing operation includes: performing image gray-scale processing on the table image set according to various scale methods to obtain a gray-scale table image set, and use the contrast stretching method to perform the gray-scale table image set. Contrast enhancement is performed on the gray-scale table image set, and the standard table image set is obtained by performing an image thresholding operation on the gray-scale table image set after the contrast enhancement. In detail, the preprocessing operation is as follows:

Image grayscale processing:

The image gray-scale processing is to convert a color image into a gray-scale image. The brightness information of the grayscale image can fully express the overall and local characteristics of the image, and the grayscale processing of the image can greatly reduce the amount of calculation for subsequent work.

In a preferred embodiment of the present application, the table image set in each scale method is converted into a gray scale table image set, and the steps of implementing each scale method are: converting the R, G, and B components of the pixels in the table image set into YUV The Y component of the color space is the brightness value, and the calculation method of the Y component is shown in the following formula:

Y=0.3R+0.59G+0.11B

Among them, R, G, and B are the R, G, and B values of the image pixel in the RGB color mode, respectively.

Contrast enhancement:

The contrast refers to the contrast between the maximum value and the minimum value of the brightness in the imaging system, where low contrast makes image processing more difficult. In the preferred embodiment of the present application, a contrast stretching method is adopted, which uses a method of increasing the dynamic range of gray levels to achieve the purpose of image contrast enhancement. The contrast stretching is also called gray-scale stretching, which is a commonly used gray-scale transformation method at present.

Further, the present application performs gray scale stretching on a specific area according to the piecewise linear transformation function in the contrast stretching method, so as to further improve the contrast of the output image. When performing contrast stretching, it essentially realizes gray value conversion. This application implements the gray value transformation through linear stretching. The linear stretching refers to a pixel-level operation with a linear relationship between the input and output gray values. The gray conversion formula is as follows:

D _b =f(D _a )=a*D _a +b

Where a is the linear slope and b is the intercept on the Y axis. When a>1, the contrast of the output image at this time is enhanced compared to the original image. When a<1, the contrast of the output image is weaker than the original image at this time, where D _a represents the gray value of the input image, and D _b represents the gray value of the output image.

c. Image thresholding operation:

The image thresholding process is an efficient algorithm for binarizing the contrast-enhanced gray-scale table image set through the OTSU algorithm. The preferred embodiment of the present application presets the gray level t to be the segmentation threshold of the foreground and background of the gray image, and assumes that the proportion of the number of front spots in the image is w ₀ , the average gray level is u ₀ , and the proportion of background points in the image is w ₁ , The average gray level is u ₁ , then the total average gray level of the gray image is:

u=w ₀ ＊u ₀ +w ₁ ＊u ₁ ,

The variance of the foreground and background image of the grayscale image is:

g=w ₀ ＊(u ₀ －u)＊(u ₀ －u)+w ₁ ＊(u ₁ －u)*(u ₁ －u)=w ₀ ＊w ₁ ＊(u ₀ －u ₁ )* (u ₀ －u ₁ ),

Among them, when the variance g is the largest, the difference between the foreground and the background is the largest at this time, the gray level t at this time is the optimal threshold, and the gray level value of the gray level image after the contrast enhancement is greater than the gray level t It is set to 255, and the gray value smaller than the gray t is set to 0, thereby obtaining the standard table image set.

S2. Use an image enhancement algorithm to perform enhancement processing on the standard table image set to obtain a table key image area set.

In a preferred embodiment of the present application, the image enhancement algorithm includes a threshold segmentation method and a Retinex algorithm. Preferably, this application uses a threshold segmentation method to segment the foreground text and background pattern in the standard table image set. The core idea of the threshold segmentation method is to traverse each pixel in the image by setting a threshold T. When the gray value of the pixel is greater than T, it is considered as foreground text, otherwise it is considered as background pattern. Further, for the special characters in the standard table image set after segmentation, this application adopts the region growth method to perform segmentation processing. Wherein, the special text includes characters, symbols, etc. The core idea of the area growth method is to aggregate pixels or sub-regions into a larger area according to pre-defined criteria, starting from a set of growth points (the growth point can be a single pixel or a small area), and will be related to the nature of the growth point. Similar adjacent pixels or regions are merged with the growth point to form a new growth point, and this process is repeated until no growth is possible.

Preferably, in this application, the Retinex algorithm is used to calculate the key information image regions in the standard table image set after segmentation, to obtain the table key image regions, so as to combine to form the table key image region set, wherein the Retinex algorithm include:

S(x,y)=R(x,y)×L(x,y)

Among them, S(x,y) represents the table key image area, R(x,y) represents the reflected light image, L(x,y) represents the brightness image, x represents the abscissa of the table key image area, and y represents the table key The ordinate of the image area. The core idea of the Retinex algorithm is: an image is composed of a brightness image and a reflection image, expressed as the product of a pixel and a corresponding pixel between the brightness image and the reflection image of the image, and image enhancement can be achieved by reducing the influence of the brightness image on the reflection image purpose.

S3. Perform feature image extraction on the table key image area set to obtain a feature table image set.

In a preferred embodiment of the present application, feature image extraction is performed on the key image region set of the table through a residual block neural network. Wherein, the residual block neural network includes an input layer, a hidden layer and an output layer. Preferably, the present application inputs the table key image area set into the residual block neural network input layer, and uses the hidden layer of the residual block neural network to perform a convolution operation on the table key image area set, Obtain the feature map set of the table key image area set, and output the feature map set through the output layer of the residual block neural network, thereby obtaining the feature table image set.

Further, the embodiment of the present application also includes adding a shortcut connection to the residual block neural network, the shortcut connection is a direct connection or a shortcut connection, that is, the F(x)+x function of the residual block neural network is substituted The original H(x) function to achieve fast connection.

S4. Use the pre-built table text filtering model to perform text position detection on the feature table image set, if the position of the text in the feature table image is detected, filter the text and save the feature table image, if not detected The position of the text in the characteristic table image is extracted, and the characteristic table image is directly saved, thereby completing the text filtering of the table image set.

In a preferred embodiment of the present application, the table text filtering model includes a text detection network. The text position detection includes: generating a geometric diagram in the feature table image set, scaling the geometric diagram according to a preset ratio, and inputting the zoomed geometric diagram into the table text filtering model After training, the zoomed geometric figure loss L _{g is obtained} ; the class-balanced cross-entropy is used to calculate the text loss L _s in the zoomed geometric figure; the zoomed geometric figure loss and text loss are input to the preview The loss function value is obtained from the loss function, and the text position detection is performed on the feature table image set according to the loss function value. If the loss function value is less than the preset threshold, the position of the text in the feature table image is detected, and the text is filtered and the feature table image is saved. If the loss function value is greater than or equal to the preset When the threshold is used, the position of the text in the characteristic table image is not detected, and the characteristic table image is directly saved, thereby completing the text filtering of the table image set.

Preferably, the preset threshold in this application is 0.01. Wherein, the loss function includes:

L=L _s +λ _g L _g

Among them, L represents the loss function value, L _s and L _g represent text loss and geometric graph loss, respectively, and λ _g represents the importance level value between the two losses.

Further, in this application, inputting the zoomed geometric figure into the table text filtering model for training to obtain the zoomed geometric figure loss L _g includes: inputting the zoomed geometric figure into In the input layer of the tabular text filtering model, feature merging is performed on the zoomed geometric map through the hidden layer of the tabular text filtering model to obtain a feature map, and the output layer of the tabular text filtering model compares all features. The feature map performs frame regression, thereby outputting the loss L _{g of} the geometric map. Wherein, the hidden layer includes a convolutional layer and a pooling layer.

The invention also provides an intelligent filtering device for table text. Referring to FIG. 2, it is a schematic diagram of the internal structure of an electronic device provided by an embodiment of this application.

In this embodiment, the electronic device 1 may be a PC (Personal Computer, personal computer), or a terminal device such as a smart phone, a tablet computer, or a portable computer, or a server. The electronic device 1 at least includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.

The memory 11 includes at least one type of readable storage medium. The readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, and the like. The memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, such as a hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in hard disk equipped on the electronic device 1, a smart memory card (SmartMediaCard, SMC), a Secure Digital (SD) card, and a flash memory. Card (FlashCard) etc. Further, the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 can be used not only to store application software and various data installed in the electronic device 1, such as the code of the table text filtering program 01, etc., but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, for running program codes or processing data stored in the memory 11, For example, execute the form text filtering program 01 and so on.

The communication bus 13 is used to realize the connection and communication between these components.

The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is usually used to establish a communication connection between the device 1 and other electronic devices.

Optionally, the device 1 may also include a user interface. The user interface may include a display (Display) and an input unit such as a keyboard (Keyboard). The optional user interface may also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light emitting diode) touch device, etc. Among them, the display can also be appropriately called a display screen or a display unit, which is used to display the information processed in the electronic device 1 and to display a visualized user interface.

Figure 2 only shows the electronic device 1 with the components 11-14 and the form text filtering program 01. Those skilled in the art can understand that the structure shown in Figure 1 does not constitute a limitation on the electronic device 1, and may include comparisons. Fewer or more parts are shown, or some parts are combined, or different parts are arranged.

In the embodiment of the device 1 shown in FIG. 2, the table text filtering program 01 is stored in the memory 11; when the processor 12 executes the table text filtering program 01 stored in the memory 11, the following steps are implemented:

Step 1: Obtain a document-based table image set, and perform a preprocessing operation on the table image set to obtain a standard table image set.

Image grayscale processing:

Y=0.3R+0.59G+0.11B

Contrast enhancement:

D _b =f(D _a )=a*D _a +b

c. Image thresholding operation:

u=w ₀ ＊u ₀ +w ₁ ＊u ₁ ,

The variance of the foreground and background image of the grayscale image is:

Step 2: Using an image enhancement algorithm to perform enhancement processing on the standard table image set to obtain a table key image area set.

S(x,y)=R(x,y)×L(x,y)

Step 3: Perform feature image extraction on the key image region set of the table to obtain a feature table image set.

In a preferred embodiment of the present application, feature image extraction is performed on the key image region set of the table through a residual block neural network. Wherein, the residual block neural network includes an input layer, a hidden layer and an output layer. Preferably, the present application inputs the table key image area set into the residual block neural network input layer, and uses the hidden layer of the residual block neural network to perform the convolution operation on the table key image area set, Obtain the feature map set of the table key image area set, and output the feature map set through the output layer of the residual block neural network, thereby obtaining the feature table image set.

Step 4. Use the pre-built table text filtering model to perform text position detection on the feature table image set. If the position of the text in the feature table image is detected, filter the text and save the feature table image. If there is no The position of the text in the characteristic table image is detected, and the characteristic table image is directly saved, thereby completing the text filtering of the table image set.

L=L _s +λ _g L _g

Optionally, in other embodiments, the table text filtering program may also be divided into one or more modules, and the one or more modules are stored in the memory 11 and run by one or more processors (in this embodiment, The processor 12) is executed to complete the application.

3, which is a schematic diagram of program modules in an embodiment of the form text intelligent filtering device of this application. In this embodiment, the form text intelligent filtering device can be divided into an image preprocessing module 10, an enhancement processing module 20, The feature extraction module 30 and the filtering module 40 exemplarily:

The image preprocessing module 10 is configured to obtain a document-based form image set, and perform a preprocessing operation on the form image set to obtain a standard form image set.

The enhancement processing module 20 is configured to perform enhancement processing on the standard table image set by using an image enhancement algorithm to obtain a table key image area set.

The feature extraction module 30 is configured to perform feature image extraction on the table key image area set to obtain a feature table image set.

The filtering module 40 is configured to: use a pre-built table text filtering model to perform text position detection on the characteristic table image set, and if the position of the text in the characteristic table image of the characteristic table image set is detected, the After the text is filtered, the characteristic table image is saved. If the position of the text in the characteristic table image of the characteristic table image set is not detected, the characteristic table image is directly saved, thereby completing the text filtering of the table image set.

The functions or operation steps implemented by the program modules such as the image preprocessing module 10, the enhancement processing module 20, the feature extraction module 30, and the filtering module 40 when executed are substantially the same as those in the foregoing embodiment, and will not be repeated here.

In addition, an embodiment of the present application also proposes a computer-readable storage medium having a table text filtering program stored on the computer-readable storage medium, and the table text filtering program can be executed by one or more processors to achieve the following operating:

The computer-readable storage medium may be non-volatile or volatile.

The specific implementation of the computer-readable storage medium of the present application is basically the same as the foregoing embodiments of the table text intelligent filtering device and method, and will not be repeated here.

It should be noted that the serial numbers of the above-mentioned embodiments of the present application are only for description, and do not represent the superiority or inferiority of the embodiments. And the terms "include", "include" or any other variants thereof in this article are intended to cover non-exclusive inclusion, so that a process, device, article or method including a series of elements not only includes those elements, but also includes those elements that are not explicitly included. The other elements listed may also include elements inherent to the process, device, article, or method. If there are no more restrictions, the element defined by the sentence "including a..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.

Through the description of the above implementation manners, those skilled in the art can clearly understand that the above-mentioned embodiment method can be implemented by means of software plus the necessary general hardware platform, of course, it can also be implemented by hardware, but in many cases the former is better.的实施方式。 Based on this understanding, the technical solution of this application essentially or the part that contributes to the existing technology can be embodied in the form of a software product, and the computer software product is stored in a storage medium (such as ROM/RAM) as described above. , Magnetic disk, optical disk), including a number of instructions to make a terminal device (which can be a mobile phone, a computer, a server, or a network device, etc.) execute the method described in each embodiment of the present application.

The above are only the preferred embodiments of the application, and do not limit the scope of the patent for this application. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of the application, or directly or indirectly applied to other related technical fields , The same reason is included in the scope of patent protection of this application.

Claims

An intelligent filtering method for table text, wherein the method includes:

Acquiring a document-based form image set, and performing a preprocessing operation on the form image set to obtain a standard form image set;

Performing enhancement processing on the standard table image set by using an image enhancement algorithm to obtain a table key image area set;

Performing feature image extraction on the table key image area set to obtain a feature table image set;

Use the pre-built table text filtering model to detect the text position of the feature table image set. If the position of the text in the feature table image of the feature table image set is detected, filter the text and save the feature For the table image, if the position of the text in the characteristic table image of the characteristic table image set is not detected, the characteristic table image is directly saved, thereby completing the text filtering of the table image set.
8. The method for intelligently filtering table text according to claim 1, wherein the preprocessing operation of the table image set to obtain a standard table image set comprises:

Perform image gray-scale processing on the table image set according to each ratio method to obtain a gray-scale table image set, and use a contrast stretching method to perform contrast enhancement on the gray-scale table image set, and the grayscale after the contrast enhancement The standard table image set is obtained after the image thresholding operation is performed on the table image set.
8. The intelligent filtering method for form text according to claim 1, wherein said using an image enhancement algorithm to perform enhancement processing on said standard form image set to obtain a form key image area set comprises:

Segmenting the image foreground text and image background pattern in the standard table image set by a threshold segmentation method;

The Retinex algorithm is used to calculate the key information image area in the standard table image set after segmentation to obtain the table key image area, thereby combining to form the table key image area set, wherein the Retinex algorithm includes:

S(x,y)=R(x,y)×L(x,y)

Among them, S(x,y) represents the table key image area, R(x,y) represents the reflected light image, L(x,y) represents the brightness image, x represents the abscissa of the table key image area, and y represents the table key The ordinate of the image area.
The intelligent filtering method for form text according to claim 1, wherein said extracting the feature image from the key image region set of the form to obtain the feature form image set comprises:

The table key image area set is input into the residual block neural network input layer, and the hidden layer of the residual block neural network is used to perform a convolution operation on the table key image area set to obtain the table key image area The feature atlas set of the set is output through the output layer of the residual block neural network to obtain the feature table image set.
The intelligent filtering method for form text according to any one of claims 1 to 4, wherein said using a pre-built form text filtering model to perform text position detection on said feature form image set comprises:

Generate a geometric diagram in the feature table image set, and scale the geometric diagram according to a preset ratio, and input the scaled geometric diagram into the table text filtering model for training to obtain the scaled The geometric figure loss L g ;

Calculate the text loss L s in the zoomed geometric graph by using class balance cross entropy;

Inputting the scaled geometric graph loss and text loss into a preset loss function to obtain a loss function value, and performing text position detection on the feature table image set according to the loss function value.
The intelligent filtering method for table text according to claim 5, wherein said inputting the scaled geometric figure into the table text filtering model for training to obtain the scaled geometric figure loss L g includes :

Input the zoomed geometric figure into the input layer of the table text filtering model;

Performing feature merging on the zoomed geometric map through the hidden layer of the table text filtering model to obtain a feature map, wherein the hidden layer includes a convolutional layer and a pooling layer;

Perform frame regression on the feature map through the output layer of the table text filtering model, thereby outputting the loss L g of the geometric map.
An electronic device, wherein the electronic device includes a memory and a processor, the memory stores a table text filtering program that can be run on the processor, and when the table text filtering program is executed by the processor To achieve the following steps:

Acquiring a document-based form image set, and performing a preprocessing operation on the form image set to obtain a standard form image set;

Performing enhancement processing on the standard table image set by using an image enhancement algorithm to obtain a table key image area set;

Performing feature image extraction on the table key image area set to obtain a feature table image set;

Use the pre-built table text filtering model to detect the text position of the feature table image set. If the position of the text in the feature table image of the feature table image set is detected, filter the text and save the feature For the table image, if the position of the text in the characteristic table image of the characteristic table image set is not detected, the characteristic table image is directly saved, thereby completing the text filtering of the table image set.
8. The electronic device according to claim 7, wherein said performing a preprocessing operation on said form image set to obtain a standard form image set comprises:

Perform image gray-scale processing on the table image set according to each ratio method to obtain a gray-scale table image set, and use a contrast stretching method to perform contrast enhancement on the gray-scale table image set, and the grayscale after the contrast enhancement The standard table image set is obtained after the image thresholding operation is performed on the table image set.
8. The electronic device according to claim 7, wherein said using an image enhancement algorithm to perform enhancement processing on said standard table image set to obtain a table key image area set comprises:

Segmenting the image foreground text and image background pattern in the standard table image set by a threshold segmentation method;

The Retinex algorithm is used to calculate the key information image area in the standard table image set after segmentation to obtain the table key image area, thereby combining to form the table key image area set, wherein the Retinex algorithm includes:

S(x,y)=R(x,y)×L(x,y)

Among them, S(x,y) represents the table key image area, R(x,y) represents the reflected light image, L(x,y) represents the brightness image, x represents the abscissa of the table key image area, and y represents the table key The ordinate of the image area.
8. The electronic device according to claim 7, wherein said performing feature image extraction on said table key image area set to obtain a feature table image set comprises:

The table key image area set is input into the residual block neural network input layer, and the hidden layer of the residual block neural network is used to perform a convolution operation on the table key image area set to obtain the table key image area The feature atlas set of the set is output through the output layer of the residual block neural network to obtain the feature table image set.
8. The electronic device according to any one of claims 7 to 10, wherein said using a pre-built form text filtering model to perform text position detection on said feature form image set comprises:

Generate a geometric diagram in the feature table image set, and scale the geometric diagram according to a preset ratio, and input the scaled geometric diagram into the table text filtering model for training to obtain the scaled The geometric figure loss L g ;

Calculate the text loss L s in the zoomed geometric graph by using class balance cross entropy;

Inputting the scaled geometric graph loss and text loss into a preset loss function to obtain a loss function value, and performing text position detection on the feature table image set according to the loss function value.
The electronic device as claimed in claim 11, wherein the geometry of the scaled text input to the filter table of the geometric loss L g obtained after the scaled training model, comprising:

Input the zoomed geometric figure into the input layer of the table text filtering model;

Performing feature merging on the zoomed geometric map through the hidden layer of the table text filtering model to obtain a feature map, wherein the hidden layer includes a convolutional layer and a pooling layer;

Perform frame regression on the feature map through the output layer of the table text filtering model, thereby outputting the loss L g of the geometric map.
A computer-readable storage medium, wherein a table text filtering program is stored on the computer-readable storage medium, and the table text filtering program can be executed by one or more processors to realize the table text intelligence as described below Steps of the filtering method:

Acquiring a document-based form image set, and performing a preprocessing operation on the form image set to obtain a standard form image set;

Performing enhancement processing on the standard table image set by using an image enhancement algorithm to obtain a table key image area set;

Performing feature image extraction on the table key image area set to obtain a feature table image set;

Use the pre-built table text filtering model to detect the text position of the feature table image set. If the position of the text in the feature table image of the feature table image set is detected, filter the text and save the feature For the table image, if the position of the text in the characteristic table image of the characteristic table image set is not detected, the characteristic table image is directly saved, thereby completing the text filtering of the table image set.
15. The computer-readable storage medium of claim 13, wherein the preprocessing operation on the form image set to obtain a standard form image set comprises:

Perform image gray-scale processing on the table image set according to each ratio method to obtain a gray-scale table image set, and use a contrast stretching method to perform contrast enhancement on the gray-scale table image set, and the grayscale after the contrast enhancement The standard table image set is obtained after the image thresholding operation is performed on the table image set.
15. The computer-readable storage medium according to claim 13, wherein said using an image enhancement algorithm to perform enhancement processing on said standard table image set to obtain a table key image area set comprises:

Segmenting the image foreground text and image background pattern in the standard table image set by a threshold segmentation method;

The Retinex algorithm is used to calculate the key information image area in the standard table image set after segmentation to obtain the table key image area, thereby combining to form the table key image area set, wherein the Retinex algorithm includes:

S(x,y)=R(x,y)×L(x,y)

Among them, S(x,y) represents the table key image area, R(x,y) represents the reflected light image, L(x,y) represents the brightness image, x represents the abscissa of the table key image area, and y represents the table key The ordinate of the image area.
15. The computer-readable storage medium according to claim 13, wherein said performing feature image extraction on said table key image area set to obtain a feature table image set comprises:

The table key image area set is input into the residual block neural network input layer, and the hidden layer of the residual block neural network is used to perform a convolution operation on the table key image area set to obtain the table key image area The feature atlas set of the set is output through the output layer of the residual block neural network to obtain the feature table image set.
15. The computer-readable storage medium according to any one of claims 13 to 16, wherein said using a pre-built table text filtering model to perform text position detection on said feature table image set comprises:

Generate a geometric diagram in the feature table image set, and scale the geometric diagram according to a preset ratio, and input the scaled geometric diagram into the table text filtering model for training to obtain the scaled The geometric figure loss L g ;

Calculate the text loss L s in the zoomed geometric graph by using class balance cross entropy;

Inputting the scaled geometric graph loss and text loss into a preset loss function to obtain a loss function value, and performing text position detection on the feature table image set according to the loss function value.
The computer-readable storage medium according to claim 17, wherein the input of the zoomed geometric figure into the table text filtering model for training to obtain the zoomed geometric figure loss L g comprises :

Input the zoomed geometric figure into the input layer of the table text filtering model;

Performing feature merging on the zoomed geometric map through the hidden layer of the table text filtering model to obtain a feature map, wherein the hidden layer includes a convolutional layer and a pooling layer;

Perform frame regression on the feature map through the output layer of the table text filtering model, thereby outputting the loss L g of the geometric map.
A table text intelligent filtering device, wherein the device includes:

The image preprocessing module is used to obtain a document-based table image set, and perform a preprocessing operation on the table image set to obtain a standard table image set;

An enhancement processing module, configured to perform enhancement processing on the standard table image set by using an image enhancement algorithm to obtain a table key image area set;

The feature extraction module is used to perform feature image extraction on the table key image area set to obtain a feature table image set;

The filtering module is configured to use a pre-built table text filtering model to perform text position detection on the feature table image set, and filter the text if the position of the text in the feature table image of the feature table image set is detected After that, the characteristic table image is saved, and if the position of the text in the characteristic table image of the characteristic table image set is not detected, the characteristic table image is directly saved, thereby completing the text filtering of the table image set.
The table text intelligent filtering device according to claim 19, wherein the enhanced processing module comprises:

A segmentation module, configured to segment the image foreground text and image background pattern in the standard table image set by a threshold segmentation method;

The calculation module is configured to use the Retinex algorithm to calculate the key information image area in the standard table image set after segmentation to obtain the table key image area, thereby combining to form the table key image area set, wherein the Retinex algorithm includes:

S(x,y)=R(x,y)×L(x,y)

Among them, S(x,y) represents the table key image area, R(x,y) represents the reflected light image, L(x,y) represents the brightness image, x represents the abscissa of the table key image area, and y represents the table key The ordinate of the image area.