CN116721434A - Method, device, computer equipment and storage medium for detecting wired tables in images - Google Patents

Method, device, computer equipment and storage medium for detecting wired tables in images

Info

Publication number
CN116721434A
CN116721434A (application CN202310440295.3A)
Authority
CN
China
Prior art keywords
straight line
candidate straight
line segments
image
line segment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310440295.3A
Other languages
Chinese (zh)
Inventor
段炼
黄九鸣
张圣栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Xinghan Shuzhi Technology Co ltd
Original Assignee
Hunan Xinghan Shuzhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Xinghan Shuzhi Technology Co ltd filed Critical Hunan Xinghan Shuzhi Technology Co ltd
Priority to CN202310440295.3A priority Critical patent/CN116721434A/en
Publication of CN116721434A publication Critical patent/CN116721434A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/412Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/20Drawing from basic elements, e.g. lines or circles
    • G06T11/203Drawing of straight lines or curves
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration using local operators
    • G06T5/30Erosion or dilatation, e.g. thinning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/187Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19107Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/414Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of computer vision and provides a method, an apparatus, computer equipment and a storage medium for detecting wired tables in images, wherein the method comprises the following steps: acquiring a binary image of a wired-table image and detecting candidate straight line segments based on the binary image; determining the external connected contour of the pixels connected with each candidate straight line segment in the binary image, and filtering out the candidate straight line segments that do not meet a preset thickness requirement according to the external connected contour; widening and normalizing each candidate straight line segment, inputting the result into a trained model to predict its category, and filtering non-table line segments out of the candidate straight line segments according to the category; filtering out, according to the slope, the candidate straight line segments that do not meet the angle requirement; and generating the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table. With this method, wired tables can be detected quickly and accurately.

Description

Method, device, computer equipment and storage medium for detecting wired tables in images
Technical Field
The application belongs to the technical field of computer vision, and in particular relates to a method, an apparatus, computer equipment and a storage medium for detecting wired tables in images.
Background
Tables are an important form of information presentation: by organizing complex data into a standard structure they facilitate information retrieval, comparison and analysis, and they are an indispensable element of daily office work. In practice, however, tables are usually transmitted in PDF or image format for ease of transfer and reading, which means a computer cannot directly understand the table information and it must be extracted and processed manually. The number of tables in the information age is huge and their layouts are complex and varied, so manual processing is tedious and time-consuming; understanding tables carried in PDF or image form is therefore an urgent problem to be solved.
At present, with the rapid development of technology, table understanding has flourished and become a research hotspot in both academia and industry. Table understanding can be divided into table detection and table recognition. Table detection refers to locating the table region in an image and is a precondition for table recognition. Common table detection methods include those based on object detection and those based on semantic segmentation. However, table sizes vary widely, and for wired tables in particular the number of lines ranges from a few to well over a hundred, so object detection models and semantic segmentation models easily locate the table region inaccurately, which reduces the accuracy of wired-table detection.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, an apparatus, computer equipment and a storage medium that can detect wired tables in images quickly and accurately.
The application provides a method for detecting wired tables in images, which comprises the following steps:
acquiring a binary image of a wired-table image, and detecting candidate straight line segments based on the binary image;
determining the external connected contour of the pixels connected with each candidate straight line segment in the binary image, and filtering out the candidate straight line segments that do not meet a preset thickness requirement according to the external connected contour;
widening and normalizing each candidate straight line segment, inputting the result into a trained model to predict its category, and filtering non-table line segments out of the candidate straight line segments according to the category;
filtering out, according to the slope, the candidate straight line segments that do not meet the angle requirement;
and generating the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table.
In one embodiment, the acquiring a binary image of the wired-table image and detecting candidate straight line segments based on the binary image includes:
performing adaptive threshold segmentation on the wired-table image using Otsu's method to obtain a binary image;
inverting the binary image to obtain an inverted binary image;
and performing straight line detection on the binary image and the inverted binary image to obtain the candidate straight line segments.
In one embodiment, the determining the external connected contour of the pixels connected with each candidate straight line segment in the binary image and filtering out the candidate straight line segments that do not meet the preset thickness requirement according to the external connected contour includes:
finding, from the binary image, the set of pixels connected with the candidate straight line segment;
performing contour fitting on the pixel set to determine the external connected contour, obtaining the set of external connected contour pixels;
calculating the line distance between each pixel in the external connected contour pixel set and the candidate straight line segment, and calculating the standard deviation of these line distances;
and discarding the candidate straight line segments for which the difference between the line distance and the standard deviation is larger than a thickness threshold.
In one embodiment, the widening and normalizing each candidate straight line segment, inputting the result into a trained model to predict its category, and filtering non-table line segments out of the candidate straight line segments according to the category includes:
extending the line segment from the start point and end point of the candidate straight line segment to obtain extension coordinates;
constructing coordinates from the coordinate origin, the extension length and the distance between the start point and end point coordinates as standard points, taking the extension coordinates as base points, and calculating a transformation matrix based on the standard points and the base points;
normalizing the wired-table image using the transformation matrix and then cropping the corresponding sub-image region to obtain a line-segment sub-image of the candidate straight line segment;
and inputting the line-segment sub-image into the trained model to predict its category, and discarding the candidate straight line segments whose category is non-table line segment.
In one embodiment, the extending the line segment from the start point and end point of the candidate straight line segment to obtain extension coordinates includes:
determining the extension length as the product of the length of the candidate straight line segment and a length threshold;
and, with the start point and end point as centers, extending to both sides along the direction perpendicular to the candidate straight line segment by the extension length to obtain the extension coordinates.
In one embodiment, the filtering out, according to the slope, the candidate straight line segments that do not meet the angle requirement includes:
clustering the candidate straight line segments based on their slopes to determine horizontal line segments and vertical line segments;
calculating the standard slope of the horizontal line segments and of the vertical line segments, respectively, to obtain a horizontal standard slope and a vertical standard slope;
calculating the corresponding angle difference from the slope of each horizontal line segment and the horizontal standard slope, and the corresponding angle difference from the slope of each vertical line segment and the vertical standard slope;
and filtering out the candidate straight line segments whose angle difference is larger than an angle threshold.
In one embodiment, the generating the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table includes:
constructing a blank binary image of the same size as the wired-table image;
drawing straight lines in the blank binary image based on the candidate straight line segments remaining after filtering to obtain a table-line image;
and, after dilating and eroding the table-line image, searching for connected regions and drawing the border of the table according to the minimum bounding rectangle of each connected region to obtain the wired table.
An apparatus for detecting wired tables in images, comprising:
a line detection module, configured to acquire a binary image of a wired-table image and detect candidate straight line segments based on the binary image;
a thickness filtering module, configured to determine the external connected contour of the pixels connected with each candidate straight line segment in the binary image, and filter out the candidate straight line segments that do not meet the preset thickness requirement according to the external connected contour;
a model filtering module, configured to widen and normalize each candidate straight line segment, input the result into a trained model to predict its category, and filter non-table line segments out of the candidate straight line segments according to the category;
an angle filtering module, configured to filter out, according to the slope, the candidate straight line segments that do not meet the angle requirement;
and a wire frame generation module, configured to generate the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table.
The application also provides a computer device comprising a processor and a memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of any of the methods for detecting wired tables in images described above.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the methods for detecting wired tables in images described above.
In the method, apparatus, computer device and storage medium for detecting wired tables in images described above, a binary image of the wired-table image is acquired and candidate straight line segments are detected from it; the external connected contour of the pixels connected with each candidate straight line segment is then determined in the binary image, and the candidate straight line segments that do not meet the preset thickness requirement are filtered out. Each candidate straight line segment is also widened and normalized and input into a trained model, the candidate straight line segments predicted to be non-table line segments are filtered out, and the segments that do not meet the angle requirement are further filtered out according to the slope, so that the wire frame of the table is generated from the remaining candidate straight line segments to obtain the wired table. The method analyzes and filters lines based on characteristics of wired-table lines such as thickness and angle to finally obtain the table region; compared with object detection and semantic segmentation methods, it is less affected by variations in table size and line count, so wired-table detection is realized quickly and accurately.
Drawings
FIG. 1 is a diagram of an application environment of the method for detecting wired tables in images in one embodiment.
FIG. 2 is a flowchart of the method for detecting wired tables in images in one embodiment.
FIG. 3 is a block diagram of the apparatus for detecting wired tables in images in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The method for detecting wired tables in images provided by the application can be applied to the application environment shown in FIG. 1, which involves a terminal 102 and a server 104. The terminal 102 communicates with the server 104 via a network. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices, and the server 104 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
When the terminal 102 receives an image wired-table detection instruction, the method may be implemented by the terminal 102 alone. Alternatively, the terminal 102 may transmit the detection instruction to the server 104 with which it communicates, and the server 104 implements the method. Taking execution by the server 104 as an example: the server 104 acquires a binary image of the wired-table image and detects candidate straight line segments based on the binary image; the server 104 determines the external connected contour of the pixels connected with each candidate straight line segment in the binary image and filters out the candidate straight line segments that do not meet the preset thickness requirement according to the external connected contour; the server 104 widens and normalizes each candidate straight line segment, inputs the result into a trained model to predict its category, and filters non-table line segments out of the candidate straight line segments according to the category; the server 104 filters out, according to the slope, the candidate straight line segments that do not meet the angle requirement; and the server 104 generates the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table.
In one embodiment, as shown in FIG. 2, a method for detecting wired tables in images is provided. The method is described here as applied to a server by way of example and includes the following steps:
Step S201: acquiring a binary image of the wired-table image and detecting candidate straight line segments based on the binary image.
The candidate straight line segments are the straight line segments screened out of the wired-table image.
Specifically, after the image wired-table detection instruction is received, the corresponding wired-table image is first acquired. The wired-table image is then processed to obtain a binary image, which may be obtained by adaptive threshold segmentation of the wired-table image. Finally, straight line detection is performed on the binary image to obtain the candidate straight line segments.
In one embodiment, step S201 includes: performing adaptive threshold segmentation on the wired-table image using Otsu's method to obtain a binary image; inverting the binary image to obtain an inverted binary image; and performing straight line detection on the binary image and the inverted binary image to obtain the candidate straight line segments.
Specifically, for the wired-table image I, adaptive threshold segmentation may be performed using Otsu's method to obtain the binary image Ib1. Meanwhile, since table line colors vary and the line color may differ from the text color, the binary image Ib1 may be inverted to obtain an inverted binary image Ib2. Straight line detection is then performed on the two binary images Ib1 and Ib2 respectively to obtain candidate straight line segments. Any conventional method may be used for line detection; Hough line detection is preferred in this embodiment. The candidate straight line segments obtained in this embodiment may be recorded as Lc = {l1, l2, ..., ln}, where each candidate straight line segment li = {(x1, y1), (x2, y2), θ, Ib}, in which (x1, y1) and (x2, y2) are the coordinates of the start and end points of the segment, θ is the slope of the segment, and Ib is the binary image from which the segment was detected.
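For illustration only, a minimal sketch of this binarization, inversion and Hough detection step in Python with OpenCV is given below; it is not part of the application, and the Hough parameter values are assumptions rather than values specified by the embodiment.

```python
import cv2
import numpy as np

def detect_candidate_segments(image_path):
    """Otsu binarization, inversion, and Hough line detection on both binaries."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Adaptive (Otsu) threshold segmentation -> binary image Ib1
    _, ib1 = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Inverted binary image Ib2, to catch lines whose color differs from the text
    ib2 = cv2.bitwise_not(ib1)

    candidates = []  # each entry: ((x1, y1), (x2, y2), theta, source_binary)
    for ib in (ib1, ib2):
        # Probabilistic Hough transform; these thresholds are assumed values
        lines = cv2.HoughLinesP(ib, 1, np.pi / 180, threshold=80,
                                minLineLength=30, maxLineGap=5)
        if lines is None:
            continue
        for x1, y1, x2, y2 in lines[:, 0]:
            theta = np.arctan2(y2 - y1, x2 - x1)  # slope angle of the segment
            candidates.append(((x1, y1), (x2, y2), theta, ib))
    return candidates
```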
Step S202: determining the external connected contour of the pixels connected with each candidate straight line segment in the binary image, and filtering out the candidate straight line segments that do not meet the preset thickness requirement according to the external connected contour.
Here, the connected pixels are the pixels in the binary image that are connected with the candidate straight line segment, and the external connected contour is the outer contour obtained from all of these connected pixels. The preset thickness requirement is a width requirement set for the lines according to the actual situation. Filtering refers to removing line segments.
Specifically, after the candidate straight line segments have been obtained by straight line detection, some of them may not be table line segments, so the segments need to be filtered. This embodiment filters the segments based on the consistency of the thickness of table lines: the set of pixels connected with the candidate straight line segment is first obtained from the binary image, and the external connected contour is determined from this connected pixel set; the thickness of the candidate straight line segment is then measured based on the external connected contour, and the candidate straight line segments that do not meet the preset thickness requirement are screened out and discarded.
In one embodiment, step S202 includes: finding, from the binary image, the set of pixels connected with the candidate straight line segment; performing contour fitting on the pixel set to determine the external connected contour, obtaining the set of external connected contour pixels; calculating the line distance between each pixel in the external connected contour pixel set and the candidate straight line segment, and calculating the standard deviation of these line distances; and discarding the candidate straight line segments for which the difference between the line distance and the standard deviation is larger than the thickness threshold.
Specifically, for any candidate straight line segment li = {(x1, y1), (x2, y2), θ, Ib} in Lc, all pixels connected with the candidate straight line segment li are first found from the corresponding Ib, giving the connected pixel set. Next, a contour fitting method is used to find the external connected contour of this connected pixel set, which yields the corresponding set of external connected contour pixels, denoted C = {(xc1, yc1), (xc2, yc2), ..., (xck, yck)}. Then each external connected contour pixel (xck, yck) is taken from C in turn and its line distance dis_ck to the candidate straight line segment is calculated, the line distance being the standard point-to-line distance:
dis_ck = |(y2 − y1)·xck − (x2 − x1)·yck + x2·y1 − y2·x1| / √((y2 − y1)² + (x2 − x1)²).
After the line distances between all pixels in C and the candidate straight line segment have been obtained, the standard deviation of these line distances is calculated. Finally, the result is compared with a set thickness threshold: when the difference between the line distance of the candidate straight line segment li and the standard deviation is larger than the thickness threshold, the segment li is regarded as an invalid segment that does not meet the preset thickness requirement and is removed from Lc. In this embodiment, the thickness threshold is preferably set to 5 based on the actual situation.
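A sketch of this thickness check follows, assuming OpenCV connected-component labelling and contour extraction; because the rejection criterion above is stated loosely, the rule used here (spread of the contour distances compared against the threshold) is only one plausible reading.

```python
import cv2
import numpy as np

def point_line_distance(px, py, x1, y1, x2, y2):
    """Distance from point (px, py) to the line through (x1, y1) and (x2, y2)."""
    num = abs((y2 - y1) * px - (x2 - x1) * py + x2 * y1 - y2 * x1)
    den = np.hypot(y2 - y1, x2 - x1)
    return num / den if den > 0 else 0.0

def passes_thickness_check(segment, thickness_threshold=5.0):
    (x1, y1), (x2, y2), _theta, ib = segment   # ib: the segment's source binary image
    # Label connected regions of Ib and take the component containing the start point
    _, labels = cv2.connectedComponents(ib)
    label = labels[int(y1), int(x1)]
    if label == 0:                             # start point lies on the background
        return False
    component = np.uint8(labels == label) * 255

    # External contour of the connected component (the external connected contour)
    contours, _ = cv2.findContours(component, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    pts = max(contours, key=cv2.contourArea).reshape(-1, 2)

    # Line distances from the contour pixels to the candidate segment, and their std
    dists = np.array([point_line_distance(px, py, x1, y1, x2, y2) for px, py in pts])
    # Assumed reading of the criterion: keep the segment when the spread of the
    # contour distances stays within the thickness threshold
    return dists.max() - dists.std() <= thickness_threshold
```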
Step S203: widening and normalizing each candidate straight line segment, inputting the result into a trained model to predict its category, and filtering non-table line segments out of the candidate straight line segments according to the category.
Here, widening refers to expanding the image region around the line segment so that it also contains the surrounding pixel information, and normalization refers to a normalization operation on the image. Non-table line segments are line segments that do not belong to the original wired table. The model is a neural network model trained in advance to predict the category of a line segment; the predicted category can be represented by a label, for example label 0 for a non-table line segment and label 1 for a table line segment.
Specifically, in addition to filtering the candidate straight line segments based on thickness and direction angle, this embodiment also filters out non-table line segments based on a neural network. The input to the model for category prediction is obtained by cropping the corresponding line-segment sub-image from the original wired-table image. It should be noted that, before being input into the model, the candidate straight line segment needs to be widened and the image normalized. In a real scene a line is usually a very narrow straight stripe whose direction is arbitrary: it may be vertical, horizontal or inclined at some angle, so directly normalizing the size of the raw line image for the neural network would deform the segment severely. This embodiment therefore widens the line segment so that the widened image contains not only the pixel information of the segment itself but also that of its surroundings, enriching the semantic features of the line-segment image. At the same time, normalization is performed through a perspective transform, which avoids line deformation and retains the original characteristics of the image, thereby reducing the difficulty of model training and improving classification accuracy.
In one embodiment, step S203 includes: extending the line segment from the start point and end point of the candidate straight line segment to obtain extension coordinates; constructing coordinates from the coordinate origin, the extension length and the distance between the start point and end point coordinates as standard points, taking the extension coordinates as base points, and calculating a transformation matrix based on the standard points and the base points; normalizing the wired-table image using the transformation matrix and then cropping the corresponding sub-image region to obtain a line-segment sub-image of the candidate straight line segment; and inputting the line-segment sub-image into the trained model to predict its category, and discarding the candidate straight line segments whose category is non-table line segment.
Specifically, extension is first performed from the start point (x1, y1) and the end point (x2, y2) to obtain the extension coordinates (x1l, y1l), (x1r, y1r), (x2l, y2l), (x2r, y2r). Next, from the coordinate origin 0, the extension length B and the distance D between the start and end point coordinates, the standard points used to calculate the transformation matrix are constructed as: (0, 0), (0, 2×B), (D, 0) and (D, 2×B). The distance D between the start point and end point coordinates is calculated as:
meanwhile, the transformation matrix M is calculated with the extension coordinates (x 1l, y1 l), (x 1r, y1 r) (x 2l, y2 l), (x 2r, y2 r) as the base points. And then, carrying out standardization operation on the wired table image I by adopting a transformation matrix M to obtain a standardized image, and intercepting a corresponding sub-image area from the standardized image as a line segment sub-image of the candidate straight line segment. I.e. the frame coordinates of the truncated sub-picture region are (0, 0), (0, 2 xb), (D, 0) and (D, 2 xb). And finally, inputting the intercepted line segment subgraphs into a trained neural network model for category prediction, and if the model predicts that the model is a non-table line segment, removing the corresponding candidate straight line segment from the Lc. In addition, the image size may be scaled based on actual requirements before inputting the neural network model, and the embodiment preferably performs the scaling of the line segment subgraph to (56,28) and then inputs the line segment subgraph into the neural network. The neural network may be any existing neural network for class prediction, and the VGG neural network is preferred in this embodiment.
In one embodiment, the extending the line segment from the start point and end point of the candidate straight line segment to obtain the extension coordinates includes: determining the extension length as the product of the length of the candidate straight line segment and a length threshold; and, with the start point and end point as centers, extending to both sides along the direction perpendicular to the candidate straight line segment by the extension length to obtain the extension coordinates.
Specifically, for any candidate straight line segment li = {(x1, y1), (x2, y2), θ, Ib} in Lc, B pixels are extended to both sides along the direction perpendicular to the segment, with the start point (x1, y1) and the end point (x2, y2) as centers, giving the extension coordinates (x1l, y1l), (x1r, y1r) corresponding to the start point and (x2l, y2l), (x2r, y2r) corresponding to the end point. Here B is 0.25 times the length of the candidate straight line segment; that is, the product of the segment length and the length threshold 0.25 is taken as the extension length B.
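The endpoint extension can be sketched as follows, with the extension length B taken as 0.25 times the segment length as stated above:

```python
import numpy as np

def extension_coordinates(x1, y1, x2, y2, length_threshold=0.25):
    """Shift both endpoints by B pixels to each side, perpendicular to the segment."""
    length = np.hypot(x2 - x1, y2 - y1)
    b = length_threshold * length                      # extension length B

    # Unit vector perpendicular to the segment direction
    nx, ny = -(y2 - y1) / length, (x2 - x1) / length

    p1l = (x1 + b * nx, y1 + b * ny)                   # start point, one side
    p1r = (x1 - b * nx, y1 - b * ny)                   # start point, other side
    p2l = (x2 + b * nx, y2 + b * ny)                   # end point, one side
    p2r = (x2 - b * nx, y2 - b * ny)                   # end point, other side
    return (p1l, p1r, p2l, p2r), b
```

Together with the previous sketch, one would compute the base points and B here, set d = √((x2 − x1)² + (y2 − y1)²), and pass them to classify_segment.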
Step S204: filtering out, according to the slope, the candidate straight line segments that do not meet the angle requirement.
Specifically, owing to the nature of tables, the inclination angle of table lines is usually within a fixed range, and a line whose angle falls outside a certain range is very likely not a table line. Therefore, this embodiment further filters the candidate straight line segments in Lc based on a preset angle requirement. The angle of a segment can be obtained from its slope, so the candidate straight line segments that do not meet the angle requirement can be accurately screened out by their slope and filtered away.
In one embodiment, step S204 includes: clustering the candidate straight line segments based on their slopes to determine horizontal line segments and vertical line segments; calculating the standard slope of the horizontal line segments and of the vertical line segments, respectively, to obtain a horizontal standard slope and a vertical standard slope; calculating the corresponding angle difference from the slope of each horizontal line segment and the horizontal standard slope, and the corresponding angle difference from the slope of each vertical line segment and the vertical standard slope; and filtering out the candidate straight line segments whose angle difference is larger than the angle threshold.
Specifically, for all candidate straight line segments in Lc, the slope θ of each segment is taken and a clustering function is used to cluster the segments based on θ. The lines in a table are generally either horizontal or vertical, so the number of clusters in this embodiment is set to 2, dividing the candidate straight line segments into horizontal segments and vertical segments. Then the median of the slopes θ of all horizontal segments is taken as the standard slope of the horizontal segments, giving the horizontal standard slope. The angle difference between the slope angle of each horizontal segment and the horizontal standard slope is calculated in turn; if the angle difference is larger than the angle threshold, the segment does not meet the preset angle requirement and is an invalid segment, so the invalid horizontal segment is removed from Lc. Similarly, invalid vertical segments are removed from Lc in the same way: the median of the slopes of all vertical segments is taken as the vertical standard slope, the angle difference between the slope angle of each vertical segment and the vertical standard slope is calculated in turn, and the segment is filtered out if the difference is larger than the angle threshold. In this embodiment the angle threshold is preferably 15 degrees, i.e. segments whose angle difference exceeds 15 degrees are filtered out.
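A sketch of the slope clustering and angle filtering, assuming scikit-learn's KMeans for the two-class clustering; folding the angles into a single 180-degree range is an implementation detail added here for robustness, not something stated in the embodiment.

```python
import numpy as np
from sklearn.cluster import KMeans

def filter_by_angle(segments, angle_threshold_deg=15.0):
    """Cluster segments into horizontal/vertical groups by slope angle and drop
    segments deviating from the group's median angle by more than the threshold."""
    thetas = np.array([seg[2] for seg in segments])    # slope angles in radians
    # Fold angles into [-45, 135) degrees so horizontal lines sit near 0 and
    # vertical lines near 90, avoiding wrap-around inside either group
    angles = (np.degrees(thetas) + 45.0) % 180.0 - 45.0

    labels = KMeans(n_clusters=2, n_init=10).fit_predict(angles.reshape(-1, 1))

    kept = []
    for cluster in (0, 1):
        idx = np.where(labels == cluster)[0]
        standard = np.median(angles[idx])              # standard slope angle (median)
        kept.extend(segments[i] for i in idx
                    if abs(angles[i] - standard) <= angle_threshold_deg)
    return kept
```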
Step S205: generating the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table.
Specifically, after the segments in Lc that do not meet the requirements have been filtered by thickness, angle, the neural network and so on, the remaining candidate straight line segments are all table lines. The wire frame of the table can therefore be generated based on the candidate straight line segments remaining after filtering, thereby obtaining the wired table in the wired-table image.
In one embodiment, step S205 includes: constructing a blank binary image of the same size as the wired-table image; drawing straight lines in the blank binary image based on the candidate straight line segments remaining after filtering to obtain a table-line image; and, after dilating and eroding the table-line image, searching for connected regions and drawing the border of the table according to the minimum bounding rectangle of each connected region to obtain the wired table.
Specifically, a blank binary image of the same size as the wired-table image I is first constructed, i.e. all pixels of the constructed binary image have value 0. The remaining candidate straight line segments are then taken out one by one, and a drawing function is used to draw the corresponding straight line on the blank binary image with the pixel value set to 255, giving the table-line image. Next, morphological dilation and erosion are applied to the table-line image; in this embodiment a 7×7 Gaussian-shaped kernel is preferably used for the dilation and erosion, producing a new image. Finally, a connected-region search is applied to the new image to obtain a series of connected regions; the minimum bounding rectangle of each connected region is calculated and taken as the border of a table, giving a series of table coordinates. At this point both the table lines and the table borders have been determined, so the wired table in the wired-table image is accurately obtained.
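A sketch of this wire-frame generation step; the 7×7 elliptical structuring element and cv2.boundingRect stand in here, as assumptions, for the 7×7 Gaussian-shaped kernel and the minimum bounding rectangle mentioned above.

```python
import cv2
import numpy as np

def generate_table_borders(image_shape, segments):
    """Draw the remaining segments on a blank image, dilate and erode it, then take
    the bounding rectangle of each connected region as a table border."""
    h, w = image_shape[:2]
    canvas = np.zeros((h, w), dtype=np.uint8)          # blank binary image, all zeros

    for (x1, y1), (x2, y2), *_ in segments:
        cv2.line(canvas, (int(x1), int(y1)), (int(x2), int(y2)), 255, 1)

    # Dilation followed by erosion closes small gaps between the drawn lines
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    closed = cv2.erode(cv2.dilate(canvas, kernel), kernel)

    # Connected regions and their bounding rectangles are taken as table borders
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours]     # each: (x, y, width, height)
```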
The above method for detecting wired tables in images addresses, for the wired-table type, the problem that the widely varying table sizes encountered in real scenes easily make object-detection-based and semantic-segmentation-based methods locate table regions inaccurately. Conventional image processing techniques are combined with deep learning to analyse and filter the table lines directly: the lines are analysed and filtered based on characteristics of wired-table lines such as thickness and angle to finally obtain the table region. Compared with object detection and semantic segmentation methods, the approach is less affected by variations in table size and line count, so wired-table detection is realized quickly and accurately.
It should be understood that, although the steps in the flowchart of FIG. 2 are shown in sequence as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include a plurality of sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 3, an apparatus for detecting wired tables in images is provided, which includes:
a line detection module 301, configured to acquire a binary image of the wired-table image and detect candidate straight line segments based on the binary image;
a thickness filtering module 302, configured to determine the external connected contour of the pixels connected with each candidate straight line segment in the binary image, and filter out the candidate straight line segments that do not meet the preset thickness requirement according to the external connected contour;
a model filtering module 303, configured to widen and normalize each candidate straight line segment, input the result into a trained model to predict its category, and filter non-table line segments out of the candidate straight line segments according to the category;
an angle filtering module 304, configured to filter out, according to the slope, the candidate straight line segments that do not meet the angle requirement;
and a wire frame generation module 305, configured to generate the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table.
In one embodiment, the line detection module 301 is further configured to perform adaptive threshold segmentation on the wired-table image using Otsu's method to obtain a binary image; invert the binary image to obtain an inverted binary image; and perform straight line detection on the binary image and the inverted binary image to obtain the candidate straight line segments.
In one embodiment, the thickness filtering module 302 is further configured to find, from the binary image, the set of pixels connected with the candidate straight line segment; perform contour fitting on the pixel set to determine the external connected contour, obtaining the set of external connected contour pixels; calculate the line distance between each pixel in the external connected contour pixel set and the candidate straight line segment, and calculate the standard deviation of these line distances; and discard the candidate straight line segments for which the difference between the line distance and the standard deviation is larger than the thickness threshold.
In one embodiment, the model filtering module 303 is further configured to extend the line segment from the start point and end point of the candidate straight line segment to obtain extension coordinates; construct coordinates from the coordinate origin, the extension length and the distance between the start point and end point coordinates as standard points, take the extension coordinates as base points, and calculate a transformation matrix based on the standard points and the base points; normalize the wired-table image using the transformation matrix and then crop the corresponding sub-image region to obtain a line-segment sub-image of the candidate straight line segment; and input the line-segment sub-image into the trained model to predict its category, and discard the candidate straight line segments whose category is non-table line segment.
In one embodiment, the model filtering module 303 is further configured to determine the extension length as the product of the length of the candidate straight line segment and the length threshold; and, with the start point and end point as centers, extend to both sides along the direction perpendicular to the candidate straight line segment by the extension length to obtain the extension coordinates.
In one embodiment, the angle filtering module 304 is further configured to cluster the candidate straight line segments based on their slopes to determine horizontal line segments and vertical line segments; calculate the standard slope of the horizontal line segments and of the vertical line segments, respectively, to obtain a horizontal standard slope and a vertical standard slope; calculate the corresponding angle difference from the slope of each horizontal line segment and the horizontal standard slope, and the corresponding angle difference from the slope of each vertical line segment and the vertical standard slope; and filter out the candidate straight line segments whose angle difference is larger than the angle threshold.
In one embodiment, the wire frame generation module 305 is further configured to construct a blank binary image of the same size as the wired-table image; draw straight lines in the blank binary image based on the candidate straight line segments remaining after filtering to obtain a table-line image; and, after dilating and eroding the table-line image, search for connected regions and draw the border of the table according to the minimum bounding rectangle of each connected region to obtain the wired table.
For the specific limitations of the apparatus for detecting wired tables in images, reference may be made to the limitations of the method described above, which are not repeated here. Each module in the above apparatus may be implemented in whole or in part by software, hardware or a combination thereof. The modules may be embedded in hardware, be independent of the processor of the computer device, or be stored as software in the memory of the computer device so that the processor can call and execute the operations corresponding to the modules. Based on this understanding, all or part of the procedures of the above method embodiments may be implemented by a computer program instructing the related hardware; the computer program may be stored in a computer-readable storage medium, and when executed by a processor it can implement the steps of the embodiments of the method for detecting wired tables in images. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like.
In one embodiment, a computer device is provided, which may be a server, comprising a processor, a memory and a network interface. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing data, and the network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the method for detecting wired tables in images. For example, the computer program may be divided into one or more modules, which are stored in the memory and executed by the processor to carry out the present application; the one or more modules may be a series of computer program instruction segments capable of performing particular functions, and these segments describe the execution of the computer program in the computer device.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor; it is the control center of the computer device and connects the various parts of the whole computer device through various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the computer device by running or executing the computer program and/or modules stored in the memory and by invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store the operating system and application programs required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the device (such as audio data or a phone book). In addition, the memory may include a high-speed random access memory, and may also include a non-volatile memory such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one disk storage device, a flash memory device, or another non-volatile solid-state storage device.
It will be appreciated by those skilled in the art that the computer device structure shown in this embodiment is only a partial structure related to the aspect of the present application, and does not constitute a limitation of the computer device to which the present application is applied, and a specific computer device may include more or fewer components, or may combine some components, or have different component arrangements.
In one embodiment, a computer device is provided, including a memory and a processor, where the memory stores a computer program, and the processor implements the method for detecting an image wired table according to any of the foregoing embodiments when executing the computer program.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor, implements the method for detecting an image wired table described in any of the above embodiments.
Those skilled in the art will appreciate that all or part of the procedures in the methods of the above embodiments may be implemented by a computer program instructing the related hardware; the computer program may be stored in a non-transitory computer-readable storage medium, and when executed it may include the procedures of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as there is no contradiction in a combination of technical features, it should be considered to fall within the scope of this description.
The above examples describe only a few embodiments of the application in detail and are not to be construed as limiting the scope of the application. It should be noted that several variations and improvements can be made by those of ordinary skill in the art without departing from the concept of the application, and these all fall within the protection scope of the application. Accordingly, the protection scope of the application shall be determined by the appended claims.

Claims (10)

1. A method for detecting wired tables in images, characterized by comprising:
acquiring a binary image of a wired-table image, and detecting candidate straight line segments based on the binary image;
determining the external connected contour of the pixels connected with each candidate straight line segment in the binary image, and filtering out the candidate straight line segments that do not meet a preset thickness requirement according to the external connected contour;
widening and normalizing each candidate straight line segment, inputting the result into a trained model to predict its category, and filtering non-table line segments out of the candidate straight line segments according to the category;
filtering out, according to the slope, the candidate straight line segments that do not meet the angle requirement;
and generating the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table.
2. The method of claim 1, wherein the acquiring a binary image of the wired-table image and detecting candidate straight line segments based on the binary image comprises:
performing adaptive threshold segmentation on the wired-table image using Otsu's method to obtain a binary image;
inverting the binary image to obtain an inverted binary image;
and performing straight line detection on the binary image and the inverted binary image to obtain the candidate straight line segments.
3. The method of claim 1, wherein the determining the external connected contour of the pixels connected with each candidate straight line segment in the binary image and filtering out the candidate straight line segments that do not meet the preset thickness requirement according to the external connected contour comprises:
finding, from the binary image, the set of pixels connected with the candidate straight line segment;
performing contour fitting on the pixel set to determine the external connected contour, obtaining the set of external connected contour pixels;
calculating the line distance between each pixel in the external connected contour pixel set and the candidate straight line segment, and calculating the standard deviation of these line distances;
and discarding the candidate straight line segments for which the difference between the line distance and the standard deviation is larger than a thickness threshold.
4. The method of claim 1, wherein the widening and normalizing each candidate straight line segment, inputting the result into a trained model to predict its category, and filtering non-table line segments out of the candidate straight line segments according to the category comprises:
extending the line segment from the start point and end point of the candidate straight line segment to obtain extension coordinates;
constructing coordinates from the coordinate origin, the extension length and the distance between the start point and end point coordinates as standard points, taking the extension coordinates as base points, and calculating a transformation matrix based on the standard points and the base points;
normalizing the wired-table image using the transformation matrix and then cropping the corresponding sub-image region to obtain a line-segment sub-image of the candidate straight line segment;
and inputting the line-segment sub-image into the trained model to predict its category, and discarding the candidate straight line segments whose category is non-table line segment.
5. The method of claim 4, wherein the extending the line segment from the start point and end point of the candidate straight line segment to obtain extension coordinates comprises:
determining the extension length as the product of the length of the candidate straight line segment and a length threshold;
and, with the start point and end point as centers, extending to both sides along the direction perpendicular to the candidate straight line segment by the extension length to obtain the extension coordinates.
6. The method of claim 1, wherein the filtering out, according to the slope, the candidate straight line segments that do not meet the angle requirement comprises:
clustering the candidate straight line segments based on their slopes to determine horizontal line segments and vertical line segments;
calculating the standard slope of the horizontal line segments and of the vertical line segments, respectively, to obtain a horizontal standard slope and a vertical standard slope;
calculating the corresponding angle difference from the slope of each horizontal line segment and the horizontal standard slope, and the corresponding angle difference from the slope of each vertical line segment and the vertical standard slope;
and filtering out the candidate straight line segments whose angle difference is larger than an angle threshold.
7. The method of claim 1, wherein the generating the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table comprises:
constructing a blank binary image of the same size as the wired-table image;
drawing straight lines in the blank binary image based on the candidate straight line segments remaining after filtering to obtain a table-line image;
and, after dilating and eroding the table-line image, searching for connected regions and drawing the border of the table according to the minimum bounding rectangle of each connected region to obtain the wired table.
8. An apparatus for detecting wired tables in images, characterized by comprising:
a line detection module, configured to acquire a binary image of a wired-table image and detect candidate straight line segments based on the binary image;
a thickness filtering module, configured to determine the external connected contour of the pixels connected with each candidate straight line segment in the binary image, and filter out the candidate straight line segments that do not meet the preset thickness requirement according to the external connected contour;
a model filtering module, configured to widen and normalize each candidate straight line segment, input the result into a trained model to predict its category, and filter non-table line segments out of the candidate straight line segments according to the category;
an angle filtering module, configured to filter out, according to the slope, the candidate straight line segments that do not meet the angle requirement;
and a wire frame generation module, configured to generate the wire frame of the table based on the candidate straight line segments remaining after filtering to obtain the wired table.
9. A computer device comprising a processor and a memory, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the method for detecting wired tables in images of any one of claims 1-7.
10. A computer-readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the method for detecting wired tables in images of any one of claims 1-7.
CN202310440295.3A 2023-04-23 2023-04-23 Method, device, computer equipment and storage medium for detecting image wired form Pending CN116721434A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310440295.3A CN116721434A (en) 2023-04-23 2023-04-23 Method, device, computer equipment and storage medium for detecting image wired form

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310440295.3A CN116721434A (en) 2023-04-23 2023-04-23 Method, device, computer equipment and storage medium for detecting image wired form

Publications (1)

Publication Number Publication Date
CN116721434A true CN116721434A (en) 2023-09-08

Family

ID=87868534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310440295.3A Pending CN116721434A (en) 2023-04-23 2023-04-23 Method, device, computer equipment and storage medium for detecting image wired form

Country Status (1)

Country Link
CN (1) CN116721434A (en)

Similar Documents

Publication Publication Date Title
CN111476227B (en) Target field identification method and device based on OCR and storage medium
US9367766B2 (en) Text line detection in images
CN110046529B (en) Two-dimensional code identification method, device and equipment
WO2019169772A1 (en) Picture processing method, electronic apparatus, and storage medium
US9235758B1 (en) Robust method to find layout similarity between two documents
US20210295114A1 (en) Method and apparatus for extracting structured data from image, and device
CN110175609B (en) Interface element detection method, device and equipment
CN111860502A (en) Picture table identification method and device, electronic equipment and storage medium
CN110866930B (en) Semantic segmentation auxiliary labeling method and device
Vanetti et al. Gas meter reading from real world images using a multi-net system
CN112101386B (en) Text detection method, device, computer equipment and storage medium
CN113837151B (en) Table image processing method and device, computer equipment and readable storage medium
CN112380978B (en) Multi-face detection method, system and storage medium based on key point positioning
US8027978B2 (en) Image search method, apparatus, and program
He et al. Aggregating local context for accurate scene text detection
CN115171125A (en) Data anomaly detection method
CN112241736A (en) Text detection method and device
CN113780116A (en) Invoice classification method and device, computer equipment and storage medium
CN113269752A (en) Image detection method, device terminal equipment and storage medium
CN109213515B (en) Multi-platform lower buried point normalization method and device and electronic equipment
CN116721434A (en) Method, device, computer equipment and storage medium for detecting image wired form
CN115578736A (en) Certificate information extraction method, device, storage medium and equipment
CN111695441B (en) Image document processing method, device and computer readable storage medium
CN113128496B (en) Method, device and equipment for extracting structured data from image
CN113012189A (en) Image recognition method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination