CN110738219A - Method and device for extracting lines in image, storage medium and electronic device - Google Patents

Info

Publication number
CN110738219A
Authority
CN
China
Prior art keywords
line
target
picture
line segment
lines
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910979506.4A
Other languages
Chinese (zh)
Inventor
龚星
郭双双
李斌
洪科元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910979506.4A priority Critical patent/CN110738219A/en
Publication of CN110738219A publication Critical patent/CN110738219A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and an apparatus for extracting lines in an image, a storage medium and an electronic device. The method comprises: obtaining a target picture to be identified, wherein the target picture comprises lines; inputting the target picture into a target picture segmentation model to obtain a target pixel-level line graph of the target picture; determining the positions of a plurality of line segments according to the pixel points in the target pixel-level line graph; and merging the line segments located on the same straight line among the plurality of line segments according to their positions, so as to obtain the lines on the target picture.

Description

Method and device for extracting lines in image, storage medium and electronic device
Technical Field
The invention relates to the field of computers, and in particular to a method and an apparatus for extracting lines in an image, a storage medium and an electronic device.
Background
In the related art, it is often necessary to acquire specific information about the lines in a picture. Because the lines in a picture may come in a variety of styles, existing means cannot identify lines of every style, so the flexibility of acquiring specific information about the lines in a picture is poor.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a method and an apparatus for extracting lines in an image, a storage medium and an electronic device, so as to at least solve the technical problem in the related art of poor flexibility when acquiring specific information about the lines in an image.
According to one aspect of the embodiment of the invention, a method for extracting lines in an image is provided. The method comprises: obtaining a target picture to be identified, wherein the target picture comprises lines; inputting the target picture into a target picture segmentation model to obtain a target pixel-level line graph of the target picture, wherein the target pixel-level line graph comprises the pixel points of the lines on the target picture, and the target picture segmentation model is a model for identifying lines on images, obtained by training an original picture segmentation model with sample pictures and the sample pixel-level line graphs corresponding to those sample pictures; determining the positions of a plurality of line segments according to the pixel points in the target pixel-level line graph; and merging the line segments located on the same straight line among the plurality of line segments according to their positions, so as to obtain the lines on the target picture.
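The four claimed steps can be pictured as a pipeline (a hypothetical sketch only: `segment_lines`, `hough_segments` and `merge_collinear` are placeholder names standing in for the segmentation model, the Hough-transform step and the merging step, not the patent's implementation):

```python
def extract_lines(picture, segment_lines, hough_segments, merge_collinear):
    # Hypothetical sketch of the claimed pipeline; the three callables are
    # placeholders, none of them is the patent's actual implementation.
    line_map = segment_lines(picture)      # pixel-level line graph
    segments = hough_segments(line_map)    # positions of line segments
    return merge_collinear(segments)       # merge collinear segments

# Tiny stand-ins showing only the data flow:
lines = extract_lines(
    "target.png",                                        # hypothetical path
    segment_lines=lambda p: [(0, 0), (1, 0), (2, 0)],    # fake pixel points
    hough_segments=lambda m: [((0, 0), (1, 0)), ((1, 0), (2, 0))],
    merge_collinear=lambda s: [((0, 0), (2, 0))],
)
print(lines)  # [((0, 0), (2, 0))]
```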
As an alternative embodiment, after the merged lines on the image are combined with the text data identified by a text recognition technology to obtain a table carrying the text data, the method further includes storing the table carrying the text data in a blockchain.
According to another aspect of the embodiment of the present invention, an apparatus for extracting lines from an image is further provided, including: an obtaining unit configured to obtain a target picture to be recognized, where the target picture comprises lines; an input unit configured to input the target picture into a target picture segmentation model to obtain a target pixel-level line graph of the target picture, where the target pixel-level line graph comprises the pixel points of the lines on the target picture, and the target picture segmentation model is a model for recognizing lines on images, obtained by training an original picture segmentation model with sample pictures and the corresponding sample pixel-level line graphs; a determining unit configured to determine the positions of a plurality of line segments according to the pixel points in the target pixel-level line graph; and a merging unit configured to merge the line segments located on the same straight line among the plurality of line segments according to their positions, so as to obtain the lines on the picture.
As an optional example, the target picture includes a table and text data, and the lines form the table. The apparatus further includes: a recognition unit configured to recognize the text data in the picture by a text recognition technique after the line segments located on the same straight line have been merged according to their positions; and a combination unit configured to combine the merged lines with the recognized text data to obtain the table carrying the text data.
As an optional example, the apparatus further includes a storage unit configured to store the table carrying the text data in a blockchain after the merged lines on the image have been combined with the text data identified by the text recognition technology.
According to yet another aspect of the embodiment of the present invention, a computer-readable storage medium is also provided, in which a computer program is stored, wherein the computer program is configured to perform the above method for extracting lines in an image when executed.
According to still another aspect of the embodiment of the present invention, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above method for extracting lines in an image by means of the computer program.
In the embodiment of the invention, a target picture to be identified is obtained, wherein the target picture comprises lines; the target picture is input into a target picture segmentation model to obtain a target pixel-level line graph of the target picture; the positions of a plurality of line segments are determined according to the pixel points in the target pixel-level line graph; and the line segments located on the same straight line among the plurality of line segments are merged according to their positions to obtain the lines on the picture.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention without limiting it.
FIG. 1 is a schematic diagram of an application environment of an alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 2 is a flow chart illustrating an alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 4 is a schematic diagram of another alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 5 is a schematic diagram of still another alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 6 is a schematic diagram of still another alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 7 is a schematic diagram of still another alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 8 is a schematic diagram of still another alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 9 is a schematic diagram of still another alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 10 is a schematic diagram of still another alternative method for extracting lines in an image according to an embodiment of the invention;
FIG. 11 is a schematic structural diagram of an alternative apparatus for extracting lines in an image according to an embodiment of the invention;
FIG. 12 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
To enable those skilled in the art to better understand the technical solution of the present invention, the technical solution in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, rather than all of them.
Furthermore, the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a series of steps or elements is not necessarily limited to the expressly listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to one aspect of the embodiments of the present invention, a method for extracting lines from an image is provided. Optionally, as an alternative implementation, the above method may be applied, but is not limited, to the environment shown in fig. 1.
In this scheme, the server 112 can receive a target picture uploaded by the user equipment 104; the target picture is then input into a target picture segmentation model to obtain a target pixel-level line graph of the target picture; the positions of a plurality of line segments are determined according to the pixel points in the target pixel-level line graph; the line segments located on the same straight line among the plurality of line segments are merged according to their positions to obtain the lines on the picture; and the lines on the picture are then returned to the user equipment 104.
Alternatively, the scheme can also be applied on the user equipment 104 itself: the user equipment 104 acquires a target picture, inputs the target picture into a target picture segmentation model to obtain a target pixel-level line graph of the target picture, determines the positions of a plurality of line segments according to the pixel points in the target pixel-level line graph, merges the line segments located on the same straight line among the plurality of line segments according to their positions to obtain the lines on the picture, and displays the lines on the picture.
In the embodiment of the invention, a target picture to be identified is acquired, wherein the target picture comprises lines; the target picture is input into a target picture segmentation model to obtain a target pixel-level line graph of the target picture; the positions of a plurality of line segments are determined according to the pixel points in the target pixel-level line graph; and the line segments located on the same straight line among the plurality of line segments are merged according to their positions to obtain the lines on the picture.
Optionally, the user device 104 may be, but is not limited to, a terminal such as a mobile phone, a tablet computer, a laptop computer, or a PC, and the network 110 may include, but is not limited to, a wireless network (WIFI and other networks enabling wireless communication) or a wired network (including, but not limited to, a wide area network, a metropolitan area network, or a local area network).
Optionally, as an optional embodiment, as shown in fig. 2, the method for extracting lines in an image includes the following steps:
S202, acquiring a target picture to be identified, wherein the target picture comprises lines;
for example, after the target picture is input into the target picture segmentation model, a target pixel-level line graph is obtained; the target pixel-level line graph comprises a plurality of pixel points, and these pixel points make up the lines in the target picture.
S204, inputting the target picture into a target picture segmentation model to obtain a target pixel-level line graph of the target picture, wherein the target pixel-level line graph comprises the pixel points of the lines on the target picture, and the target picture segmentation model is a model for identifying lines on images, obtained by training an original picture segmentation model with sample pictures and the sample pixel-level line graphs corresponding to those sample pictures;
during training, a sample picture and the sample pixel-level line graph corresponding to that sample picture are obtained first; the sample pixel-level line graph can be obtained through manual labelling. Each sample picture corresponds to one sample pixel-level line graph, and together they form one group of samples. A plurality of groups of samples are obtained and input into the original picture segmentation model to train it. If, for a current sample picture, the matching degree between the pixel-level line graph output by the original picture segmentation model and the sample pixel-level line graph corresponding to that sample picture is greater than a first threshold value, the output of the current model is considered to meet the predetermined condition. Once the total number of output pixel-level line graphs meeting the predetermined condition is greater than a second threshold value, training is complete, and the trained model is used as the target picture segmentation model.
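The stopping criterion described in the training procedure might be sketched as follows (illustrative only: the threshold values, the matching-degree function and the training-step callable are all placeholders, and this sketch loops forever if no output ever qualifies):

```python
def train_until_qualified(model_step, matching_degree, samples,
                          first_threshold, second_threshold):
    # One group of samples = (sample picture, hand-labelled pixel-level line
    # graph). An output "qualifies" when its matching degree with the label
    # exceeds first_threshold; training stops once the number of qualifying
    # outputs exceeds second_threshold.
    qualified = 0
    while qualified <= second_threshold:
        for picture, labelled_map in samples:
            predicted = model_step(picture, labelled_map)  # one training step
            if matching_degree(predicted, labelled_map) > first_threshold:
                qualified += 1
                if qualified > second_threshold:
                    break
    return qualified

# A degenerate "model" that already reproduces its label qualifies every step:
steps = train_until_qualified(
    model_step=lambda pic, lab: lab,
    matching_degree=lambda a, b: 1.0 if a == b else 0.0,
    samples=[("pic", "map")],
    first_threshold=0.95,
    second_threshold=3,
)
print(steps)  # 4
```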
S206, determining the positions of a plurality of line segments according to pixel points in the target pixel level line graph;
optionally, in the present solution, after the target picture has been obtained and input into the target picture segmentation model, and the target pixel-level line graph of the target picture has been output by the model, a Hough transform may be performed on the target pixel-level line graph.
The process of the Hough transform is mainly divided into two steps:
1. Initialize the relevant matrices.
The angle list theta is [0, 1, 2, …, 178, 179], where the angle is the angle between the perpendicular from the origin to the target straight line and the x coordinate axis.
The distance list rho is [-diag_len + 1, …, diag_len - 1, diag_len], where the distance refers to the perpendicular distance from the origin to the target straight line, and diag_len denotes the length of the image diagonal.
The voting matrix votes is initialized to 0; its number of rows is the number of elements in the rho list, and its number of columns is the number of elements in the theta list.
2. For each non-zero pixel point in the line graph, traverse each angle value in the angle list and calculate the perpendicular distance value corresponding to the pixel point at that angle. The angle value and the distance value form a data pair (rho, theta), which corresponds to the position (rho + diag_len, theta) in the voting matrix; the value at that position in the voting matrix is then incremented by 1.
The greater the value at a certain position in the voting matrix, the more pixel points lie on the straight line represented by the (rho, theta) pair corresponding to that position, so that pair represents a straight line with high confidence.
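The two steps above can be sketched in pure Python (a minimal illustration of the voting procedure; the extra slot added to the rho axis to guard against rounding at the diagonal is a choice of this sketch, not a value from the patent):

```python
import math

def hough_votes(points, width, height):
    # Step 1: initialize theta (0..179 degrees), the rho range and the voting
    # matrix. One extra rho slot guards against rounding exactly to diag_len.
    diag_len = int(math.ceil(math.hypot(width, height)))
    votes = [[0] * 180 for _ in range(2 * diag_len + 1)]
    # Step 2: every non-zero pixel votes once per angle; rho is the
    # perpendicular distance from the origin to the candidate line.
    for x, y in points:
        for theta in range(180):
            rad = math.radians(theta)
            rho = int(round(x * math.cos(rad) + y * math.sin(rad)))
            votes[rho + diag_len][theta] += 1
    return votes, diag_len

# Three pixels of the horizontal line y = 5 all vote for (rho=5, theta=90):
votes, diag_len = hough_votes([(0, 5), (1, 5), (2, 5)], width=10, height=10)
print(votes[5 + diag_len][90])  # 3
```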
For example, if the width of a line in the target pixel-level line graph is greater than 1, erosion and dilation operations need to be performed on the target pixel-level line graph multiple times, so as to reduce the width of the line until it is 1 pixel.
The purpose of the erosion and dilation operations is to reduce a line whose width is many pixels to a line whose width is 1 pixel. The procedure is as follows:
1. Judge whether the pixel value of each pixel in the target pixel-level line graph is larger than 127.5; if so, set the pixel value of that pixel to 255, and otherwise set it to 0.
2. Obtain a structuring element with a cross-shaped structural unit through the OpenCV library, to be used for the subsequent erosion and dilation operations.
3. Initialize the skeleton information (skeleton) of the target pixel-level line graph as an all-zero matrix.
4. Repeatedly perform the following operations until the target pixel-level line graph is eroded into an image in which all pixel values are 0:
erode the target pixel-level line graph (image) to obtain an eroded image (eroded); dilate the eroded image to obtain a dilated image; compute the difference between the original target pixel-level line graph and the dilated image; perform an OR operation between this difference image and the skeleton image mentioned above (this step can be regarded as continuously adding in the skeleton information of the original target pixel-level line graph); then take the eroded image as the new target pixel-level line graph and repeat step 4.
5. When all the pixel values of the target pixel-level line graph have been eroded to 0, the skeleton image contains the skeleton information of the original target pixel-level line graph, and the pixel width of each line segment in that image is 1.
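The erosion/dilation loop above can be sketched as follows (a pure-Python toy on 0/1 matrices with the cross-shaped structuring element; a real implementation would use OpenCV's `cv2.erode`/`cv2.dilate`, and morphological skeletons typically keep small corner artifacts):

```python
CROSS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # cross-shaped structuring element

def erode(img):
    # A pixel survives only if it and all four cross-neighbours are set
    # (out-of-bounds neighbours count as 0, i.e. zero padding).
    h, w = len(img), len(img[0])
    return [[1 if img[y][x] and all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy, dx in CROSS) else 0
             for x in range(w)] for y in range(h)]

def dilate(img):
    # A pixel is set if it or any cross-neighbour is set.
    h, w = len(img), len(img[0])
    return [[1 if any(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy, dx in ((0, 0),) + CROSS) else 0
             for x in range(w)] for y in range(h)]

def skeletonize(img):
    # Steps 3-5: OR the pixels lost by erode-then-dilate into an all-zero
    # skeleton matrix until erosion empties the image.
    h, w = len(img), len(img[0])
    skeleton = [[0] * w for _ in range(h)]
    while any(any(row) for row in img):
        eroded = erode(img)
        opened = dilate(eroded)
        for y in range(h):
            for x in range(w):
                if img[y][x] and not opened[y][x]:
                    skeleton[y][x] = 1
        img = eroded
    return skeleton

# A 3-pixel-thick horizontal bar thins to (roughly) a 1-pixel line; note the
# small corner artifacts that morphological skeletons typically retain.
bar = [[0] * 7, [1] * 7, [1] * 7, [1] * 7, [0] * 7]
skeleton = skeletonize(bar)
print(skeleton[2])  # [0, 1, 1, 1, 1, 1, 0]
```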
S208, merging the line segments located on the same straight line among the plurality of line segments according to the positions of the line segments, to obtain the lines on the target picture.
Optionally, in this scheme, after the Hough transform method is used to convert the pixel points in the target pixel-level line graph into a plurality of line segments, the shape of the pattern formed by these line segments is very similar to the lines in the target picture, and each line segment corresponds to coordinates in a planar rectangular coordinate system. The line segments belonging to the same straight line among the plurality of line segments are merged; the shape of the merged result almost matches the shape of the lines in the target picture, and the merged result includes the coordinates of every line segment.
Optionally, in the merging process, any two line segments are sequentially obtained from the plurality of line segments, the two comprising a first line segment and a second line segment, and the following step is executed: obtain the coordinate difference between the center points of the first line segment and the second line segment, and merge the first line segment and the second line segment into one merged line segment when the coordinate difference is smaller than a third threshold value.
Optionally, in the present solution, in the process of merging the first line segment and the second line segment, it needs to be determined whether each line segment is a horizontal line or a vertical line. Optionally, a horizontal line in the present solution is a line segment whose included angle with the abscissa axis of the planar rectangular coordinate system is less than 45 degrees, and a vertical line is a line segment whose included angle with the abscissa axis is greater than 45 degrees.
Further, any two merged line segments are sequentially obtained, the two comprising a first target line segment and a second target line segment, and the following step is executed: merge the first target line segment and the second target line segment when the distance from the center point of the first target line segment to the straight line on which the second target line segment lies is smaller than a fourth threshold value, thereby obtaining the lines on the target picture. After this merging, the line segments belonging to the same straight line have been combined, and the merged result equals the lines on the target picture. As another example, in this second-step merging process, the first target line segment and the second target line segment may be merged only when both the distance from the center point of the first target line segment to the straight line on which the second target line segment lies and the distance from the center point of the second target line segment to the straight line on which the first target line segment lies are smaller than the fourth threshold value, which makes the merging of the target line segments more accurate.
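The two merging criteria described above can be illustrated as follows (a hedged sketch: `merge_if_close` and `point_to_line` are hypothetical helper names, and representing a segment as a pair of endpoint tuples is an assumption of this sketch, not the patent's data structure):

```python
import math

def center(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def merge_if_close(seg_a, seg_b, threshold):
    # First stage: merge two segments when the difference between their
    # center-point coordinates is below the (third) threshold; the merged
    # segment keeps the extreme endpoints of the union.
    (ax, ay), (bx, by) = center(seg_a), center(seg_b)
    if math.hypot(ax - bx, ay - by) < threshold:
        pts = sorted(seg_a + seg_b)
        return (pts[0], pts[-1])
    return None

def point_to_line(p, seg):
    # Second stage helper: distance from a center point to the infinite
    # straight line on which the other segment lies.
    (x1, y1), (x2, y2) = seg
    px, py = p
    num = abs((y2 - y1) * px - (x2 - x1) * py + x2 * y1 - y2 * x1)
    return num / math.hypot(x2 - x1, y2 - y1)

merged = merge_if_close(((0, 0), (2, 0)), ((2, 0), (4, 0)), threshold=3)
print(merged)                                   # ((0, 0), (4, 0))
print(point_to_line((1, 1), ((0, 0), (2, 0))))  # 1.0
```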
Optionally, in the scheme, after the plurality of line segments are merged to obtain the lines on the image, the lines on the image and the text recognition result may be combined to obtain a table carrying the text data in the target picture.
That is to say, when the lines on the target picture form a table, the information of the table is obtained, and the information of the table and the information of the characters in the target picture are merged to obtain the table carrying the character data in the target picture.
Optionally, after the table carrying the data is identified, the table carrying the data may be stored in a blockchain, so as to ensure that neither the table information nor the data information of the table can be modified.
A Blockchain is essentially a decentralized database: a string of data blocks linked to one another by cryptographic methods, where each data block contains a batch of network transaction information used for verifying the validity (anti-counterfeiting) of the information and for generating the next block.
The basic service module is deployed on all blockchain node devices and is used for verifying the validity of a business request and, after consensus is completed, recording the valid request in storage. For a new business request, the basic service first performs interface adaptation analysis and authentication processing (interface adaptation), then encrypts the business information through a consensus algorithm (consensus management), and transmits the encrypted business information to the shared ledger (network communication) for recording and storage. The smart contract module is responsible for contract registration and issuance, contract triggering and contract execution: developers can define contract logic through a certain programming language and issue it on the blockchain (contract registration); according to the logic of the contract terms, the contract is called and executed when triggered; the module also provides functions for monitoring the running state of the contract and the operation of the chain.
The platform product service layer provides the basic capability and an implementation framework of typical applications; based on these basic capabilities and the superposed characteristics of the business, developers can complete the blockchain implementation of the business logic. The application service layer provides the blockchain-scheme-based application service for the business participants to use.
Optionally, the above method for extracting lines in an image may be applied to any field that needs to identify line information in a picture and acquire information, such as the medical field, the statistical field, the accounting field, and the like. For example, the method can be applied to the process of identifying the lines of an invoice in a picture and acquiring its information.
After the target pixel-level line graph of the picture is output, the target pixel-level line graph comprises the pixel points of the lines of the invoice; the positions of the line segments are determined using these pixel points, and the line segments located on the same straight line are then merged using those positions to obtain the line and position information in the invoice.
By this method, no matter what the style of the lines in the picture, the pixel points of the lines can be accurately identified. Further, the positions of the line segments are determined from these pixel points and the lines are obtained from the positions of the line segments, so that the information of the lines is extracted from the target picture, achieving the effect of flexibly obtaining the information of the lines in the picture.
The above method for extracting lines in an image is described below with reference to specific examples and the related drawings.
For example, the target picture is a picture including medical data and lines, and the lines may form a table, as in the picture shown in fig. 3 (medical treatment and medicine are only examples; the content of the table is not limited in this scheme). Fig. 3 includes the lines of a table and the related data content. After the target picture shown in fig. 3 is acquired, it is first input into the target picture segmentation model, and the target picture is identified by the model to obtain an identification result.
The target picture segmentation model in this scheme can be designed based on the network design criteria of U-Net. U-Net-like networks are mainly applied to image segmentation tasks, completing pixel-level classification prediction from image to image through a Fully Convolutional Network (FCN). Optionally, as shown in fig. 4, fig. 4 shows the structure of an optional target picture segmentation model, where Conv denotes a convolutional layer, Deconv denotes a deconvolutional layer, and the two parameters that follow respectively denote the size of the convolution kernel and the number of output channels.
In this design, the network structure adopts small 3x3 convolution kernels; by stacking several of them, the same receptive field as the larger 5x5 and 7x7 kernels is achieved while the number of non-linear layers is increased, so features can be expressed better. Meanwhile, dilated convolution can be adopted to enlarge the receptive field further, yielding a more refined segmentation result.
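The receptive-field claim is easy to verify numerically: with stride 1, each stacked layer adds dilation * (kernel - 1) pixels to the receptive field, so two 3x3 layers see 5x5 and three see 7x7 (a small illustrative helper, not part of the patent):

```python
def receptive_field(num_layers, kernel=3, dilation=1):
    # With stride 1, each stacked layer adds dilation * (kernel - 1) pixels
    # to the receptive field.
    return 1 + num_layers * dilation * (kernel - 1)

print(receptive_field(2))              # 5  (two 3x3 layers match one 5x5)
print(receptive_field(3))              # 7  (three 3x3 layers match one 7x7)
print(receptive_field(3, dilation=2))  # 13 (dilated convs widen it further)
```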
In this design, after a target picture is input, the outputs of 5 convolutional layers are taken; each output feature map is up-sampled using a Deconvolution Layer so that its size matches the others, the 5 same-size feature maps are spliced together (skip connection), and information fusion is then carried out by a convolutional layer to obtain the final straight-line segmentation result. This embodies the concept of multi-scale fusion: fusing multi-scale features helps provide richer information, so the image can be segmented better.
When this network structure is trained, the design considers the pixel-level 2-class cross-entropy losses corresponding to the 5 feature maps (Output1, Output2, Output3, Output4 and Output5) and the fused feature map (i.e. the straight-line segmentation result): that is, 6 loss functions are minimized. If only the 2-class loss function corresponding to the fused feature map were minimized, there would be a precision loss, causing discontinuous edges or the loss of many important edges. In practice, each group of convolutional network layers should act as a single sub-network responsible for generating edge maps within a certain scale range, so it is necessary to consider the output of every group of network layers.
The target picture is identified by the target network model to obtain the target pixel-level line graph. Optionally, in the target pixel-level line graph in this scheme, the data in the target picture other than the table lines may be suppressed: either the table-line pixel values are set to zero and the pixel values at the other positions are set to non-zero, or the table-line pixel values are set to non-zero, such as 255, and the pixel values at the other positions are set to zero. For example, when the picture shown in fig. 3 is input into the target picture segmentation model, the target pixel-level line graph shown in fig. 5 can be obtained; this line graph only includes the lines, and the patterns at the other positions are not visible.
After the target pixel level line graph is acquired, it only provides information at the pixel level, and vectorized line information is needed in order to utilize the coordinate information of the lines in the subsequent steps. Vectorization involves two important steps:
(1) Line thinning: if the vertical lines in the segmentation result are too wide, the Hough transform process identifies a vertical line as a series of short transverse lines. In addition, the width of the straight lines in different images is not uniform, and thinning the straight lines to a single-pixel width helps eliminate the influence of lines of different thicknesses on the Hough transform result.
For example, as shown in fig. 6, the left graph in fig. 6 may be a line in a target pixel level line graph identified by the target picture segmentation model, and the width of the line is relatively wide. At this time, the skeleton information shown on the right side of fig. 6, whose width is 1 pixel, can be extracted through a plurality of erosion and dilation operations.
(2) The Hough transform technique is applied to the skeleton image (skeleton information): the result of the pixel-level straight-line segmentation is converted into vectorized straight lines, and the coordinate information of each straight line in the image is obtained, so that the straight-line information can be utilized in the subsequent process.
A straight line y = kx + b in the rectangular coordinate system, as shown in the left diagram of fig. 7, corresponds to a point (r, θ) in the polar coordinate system: the more curves that intersect at the point (r, θ), the more points the straight line represented by that point is composed of, i.e. a straight line is detected in the rectangular coordinate system. Here r is the shortest distance from the origin to the straight line y = kx + b, and θ is the angle between the perpendicular from the origin to the straight line and the abscissa axis. For example, as shown in fig. 8, fig. 8 may be the vectorized straight-line result obtained by performing the Hough transform on the right diagram of fig. 6; the vectorized result is detected as a plurality of short line segments. It should be noted that the result in fig. 8 is composed of straight line segments, and this scheme does not limit the structure of the short line segments.
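The rectangular-to-polar relation can be illustrated with a short helper that converts a line y = kx + b into its (r, θ) pair, so that every point (x, y) on the line satisfies x·cos θ + y·sin θ = r. This is a minimal sketch; the function name `line_to_hough` is a hypothetical label, not from the patent.

```python
import math

def line_to_hough(k, b):
    """Convert a line y = k*x + b to Hough parameters (r, theta), where
    every point (x, y) on the line satisfies x*cos(theta) + y*sin(theta) = r,
    with r kept non-negative (the shortest distance from the origin)."""
    norm = math.hypot(k, 1.0)
    r = b / norm                      # signed distance along the normal
    cos_t, sin_t = -k / norm, 1.0 / norm
    if r < 0:                         # flip the normal so that r >= 0
        r, cos_t, sin_t = -r, -cos_t, -sin_t
    theta = math.atan2(sin_t, cos_t)  # angle of the perpendicular from the origin
    return r, theta
```

For instance, the horizontal line y = 2 maps to r = 2, θ = π/2, matching the description of r and θ above.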
It should be noted that the target pixel level line graph before the Hough transform can only display the line information of the picture; that information cannot be used directly. The line segments obtained after the Hough transform can be put into use, and these line segments can be regarded as the lines in the target picture.
The straight lines detected by the line segment vectorization process are mostly discontinuous, and they need to be spliced into reasonable continuous straight lines through a knowledge-based method. Meanwhile, in order to solve the problem that a straight line in a deformed image is usually distorted, a mode of fitting the straight line by a plurality of sections of broken lines is provided, so that the final straight-line extraction result better fits the lines in the original image and provides more accurate information for subsequently obtaining the row/column number of each character.
The basic algorithm for line segment clustering is as follows:
Algorithm for clustering lines
Input: the fragmented line segment list lines, the function Dis for calculating the distance between two straight lines, and the line segment merging function Merge
Output: the clustered line segment list clustered_lines
i = 0, clustered_lines = [], flag = [True, True, …] // initialize the line index i, the clustered line list clustered_lines, and the flag list flag indicating whether each line segment has been traversed; the length of flag equals the length of the input line list, and all of its values are initialized to True
(The pseudo-code is shown in figure RE-GDA0002300516480000151.)
The above pseudo-code traverses each line segment; for the currently traversed line segment (cur_line_i), it determines whether it can be spliced with a later traversed line segment (after_line_j), and if the two line segments satisfy the splicing requirements, they are spliced together. The splicing requirements are:
(1) The difference between the vertical coordinates of cur _ line and after _ line is less than a preset coordinate threshold;
(2) the distance between cur _ line and after _ line is less than a preset distance threshold.
And if the splicing requirements are not met, the current line segment cur_line is added to the clustered straight-line list. i and j are integers, j > i.
Specifically, if the two line segments are horizontal lines, the difference between the vertical coordinates of their center points should be smaller than a third threshold (y_thresh); if the two line segments are vertical lines, the difference between the horizontal coordinates of their center points should be smaller than the third threshold (y_thresh). If the condition is met, the two line segments are combined to obtain a combined line segment. Then, if the distance from the center point of one line segment to the straight line of the other line segment, and the distance from the center point of the other line segment to the straight line of the first, are both smaller than a distance threshold (dis_thresh), the combined line segments are merged again to obtain the final result. The final result can be a curve fitted by a plurality of broken lines. For example, as shown in fig. 9, for the curve on the upper side of fig. 9, after Hough transform and merging, the plurality of broken lines on the lower side of fig. 9 is obtained; these broken lines fit the original curve more accurately than a single straight line would.
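A minimal sketch of the first-stage clustering described above, assuming each segment is a pair of endpoints. The greedy loop mirrors the pseudo-code (flag list, splicing the current segment with later segments), with `y_thresh` playing the role of the third threshold. The helper names and the concrete threshold values are illustrative, not from the patent.

```python
def is_horizontal(seg):
    """A horizontal line forms an angle below 45 degrees with the x-axis."""
    (x1, y1), (x2, y2) = seg
    return abs(x2 - x1) >= abs(y2 - y1)

def center(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def merge_two(a, b):
    """Merge two roughly collinear segments into one spanning both."""
    pts = sorted([a[0], a[1], b[0], b[1]])
    return (pts[0], pts[-1])

def cluster_segments(lines, y_thresh=5.0):
    """Greedy clustering: the current segment is spliced with a later
    segment when both are the same kind (horizontal/vertical) and their
    center coordinates (y for horizontal, x for vertical) differ by less
    than y_thresh; otherwise it is added to the clustered list as-is."""
    clustered = []
    flag = [True] * len(lines)          # True = not yet consumed
    for i, cur in enumerate(lines):
        if not flag[i]:
            continue
        for j in range(i + 1, len(lines)):
            if not flag[j]:
                continue
            other = lines[j]
            same_kind = is_horizontal(cur) == is_horizontal(other)
            axis = 1 if is_horizontal(cur) else 0
            if same_kind and abs(center(cur)[axis] - center(other)[axis]) < y_thresh:
                cur = merge_two(cur, other)
                flag[j] = False
        clustered.append(cur)
    return clustered
```

Running this on two nearly collinear horizontal fragments splices them into one span, while a distant segment stays separate.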
For example, when the lines on the image form a table, as shown in fig. 10, the table information obtained by recognition is combined with the text information to obtain a combined table; the row number of each line of data can be displayed in the combined table, and the left side of each row of the table corresponds to a number indicating the row number of that line.
According to the embodiment, after the target picture is obtained, the target picture is input into the pre-trained target picture segmentation model, the target pixel level line graph is obtained through identification of the target picture segmentation model, and the target pixel level line graph comprises the pixel points of the line in the target picture.
As an alternative embodiment, before inputting the target picture into the target picture segmentation model, the method further comprises:
s1, obtaining a sample picture and a sample pixel level line graph corresponding to the sample picture;
s2, inputting the sample pictures and the sample pixel level line graphs corresponding to the sample pictures into an original picture segmentation model, and training the original picture segmentation model, wherein after a training result output by the current original picture segmentation model is obtained, the target number of times that the first pixel level line graphs output by the original picture segmentation model meet a predetermined condition and the total number of first pixel level line graphs output by the original picture segmentation model are determined, and the current original picture segmentation model is determined as the target picture segmentation model in the case that the ratio of the target number to the total number is greater than a first threshold, wherein the predetermined condition is used for indicating that the matching degree between the ith first pixel level line graph, obtained after the ith sample picture is input into the original picture segmentation model, and the ith sample pixel level line graph is greater than a second threshold.
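The acceptance criterion above — the ratio of outputs whose matching degree exceeds the second threshold being greater than the first threshold — can be sketched as a small predicate. The function name and the concrete threshold values are illustrative assumptions.

```python
def is_trained(match_degrees, second_threshold=0.9, first_threshold=0.95):
    """Return True when the ratio of first pixel-level line graphs whose
    matching degree with the labelled sample line graph exceeds
    second_threshold is itself greater than first_threshold."""
    total = len(match_degrees)
    hits = sum(1 for m in match_degrees if m > second_threshold)
    return total > 0 and hits / total > first_threshold
```

Training would continue until this predicate holds, at which point the current model is taken as the target picture segmentation model.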
Optionally, the target picture segmentation model in this scheme is used to identify the pixel points in the target picture after receiving the target picture, identifying the pixel points located on the lines as one type and the pixel points not belonging to the lines as another type, thereby completing the segmentation of the lines from the other contents.
According to the embodiment, the original picture segmentation model is trained by using the sample picture before the target picture segmentation model is used, so that a mature target picture segmentation model is obtained, the flexibility of obtaining the information of lines in the picture is realized, and the identification accuracy of the target picture segmentation model is improved.
As an alternative embodiment, merging the line segments located on the same straight line among the plurality of line segments according to the positions of the line segments to obtain the lines on the image includes:
s1, sequentially obtaining any two line segments from the plurality of line segments, wherein the two line segments comprise a first line segment and a second line segment, and executing the following steps: obtaining the coordinate difference of the center points of the first line segment and the second line segment, and combining the first line segment and the second line segment to obtain a combined line segment in the case that the coordinate difference is smaller than a third threshold;
and S2, sequentially acquiring any two combined line segments, wherein the two combined line segments comprise a first target line segment and a second target line segment, and executing the following step: combining the first target line segment and the second target line segment to obtain a line on the image in the case that the distance from the center point of the first target line segment to the straight line where the second target line segment is located is smaller than a fourth threshold.
Through the embodiment, the plurality of line segments are combined through the two steps, so that a plurality of broken lines used for fitting curves can be obtained, and the accuracy of obtaining the information of the lines in the target picture is improved while the flexibility of obtaining the information of the lines in the picture is realized.
As an alternative embodiment, obtaining the coordinate difference of the center points of the first line segment and the second line segment comprises:
s1, determining the difference of the vertical coordinates of the center points of the first line segment and the second line segment as the coordinate difference in the case that the first line segment and the second line segment are horizontal lines, wherein a horizontal line is a line segment forming an included angle of less than 45 degrees with the horizontal axis of the coordinate system where the line segments are located;
and S2, determining the difference of the abscissas of the center points of the first line segment and the second line segment as the coordinate difference in the case that the first line segment and the second line segment are vertical lines, wherein a vertical line is a line segment forming an included angle of less than 45 degrees with the ordinate axis of the coordinate system.
By the method, through comparing whether the two line segments which are both horizontal lines or both vertical lines meet the conditions, the two line segments with larger differences do not need to be compared in the comparison process, so that the flexibility of obtaining the information of the lines in the picture is realized, and the comparison efficiency of the line segments is also improved.
As an alternative embodiment, before determining the positions of the plurality of line segments according to the pixel points in the target pixel level line graph of the target picture, the method further includes:
s1, performing erosion and dilation operations on the lines formed by the pixel points in the target pixel level line graph of the target picture to reduce the width of the lines.
Through the method, thicker lines in the target pixel level line graph can be converted into lines with the width of 1 pixel, so that accurate vertical lines can be identified instead of a plurality of short transverse lines in the Hough transform process, the flexibility of obtaining the information of the lines in the picture is realized, and the accuracy of Hough transform is improved.
As an alternative embodiment, determining the positions of the plurality of line segments based on the pixel points in the target pixel level line graph includes:
and S1, carrying out Hough transform on the target pixel level line graph to obtain the positions of a plurality of line segments.
By the method, the lines on the target picture, which cannot be used directly, are converted into usable coordinate information of the lines, so that the line information can be operated on structurally; the flexibility of obtaining the information of the lines in the picture is realized, and the efficiency of obtaining that information is improved.
As an alternative embodiment, after merging the line segments located on the same straight line among the plurality of line segments according to the positions of the line segments to obtain the lines on the image, the method further includes:
s1, recognizing the character data in the image through character recognition technology;
and S2, combining the lines on the image obtained by merging with the character data obtained by using the character recognition technology to obtain a table carrying the character data.
Optionally, the lines in this scheme can form a table. Identifying the lines in the image yields the information of the lines. Meanwhile, the characters in the target picture can be identified by using a character recognition technology to obtain the character recognition result, and then the line information of the image and the character recognition result are combined, so that the accurate lines and data information of the table can be obtained.
Through the method, the flexibility of obtaining the information of the lines in the picture is achieved, and meanwhile the combination of the line information and the data information is achieved.
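One simple way to realize the combination of merged table lines with OCR output is to count, for each recognized text item, how many horizontal lines lie above it; that count is the item's row number. This is a hedged sketch — `assign_rows` and its input format are assumptions for illustration, not the patent's interface.

```python
def assign_rows(h_line_ys, text_items):
    """Given the y coordinates of the merged horizontal table lines and
    OCR results as (text, y_center) pairs, attach a row number to each
    text item: row k holds the items lying below line k and above line k+1."""
    ys = sorted(h_line_ys)
    table = {}
    for text, y in text_items:
        row = sum(1 for line_y in ys if line_y < y)  # lines above the text
        table.setdefault(row, []).append(text)
    return table
```

The resulting mapping from row numbers to cell texts is the "table carrying text data" that can then be stored, e.g. in the blockchain mentioned below.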
As an alternative embodiment, after combining the lines on the image obtained by merging with the text data obtained by using the text recognition technology to obtain a table carrying the text data, the method further includes:
and S1, storing the table carrying the text data into a block chain.
By the embodiment, the table carrying the data is stored in the blockchain, so that the data of the table cannot be modified or deleted, and the credibility of the table in use is improved.
It should be noted that, for simplicity of description, the aforementioned method embodiments are described as a series of combinations of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts described, as some steps may occur in other orders or concurrently with other steps in accordance with the invention.
According to another aspect of the embodiments of the present invention, there is also provided a line extraction device in an image for implementing the line extraction method in an image as described above. As shown in fig. 11, the device includes:
(1) an obtaining unit 1102, configured to obtain a target picture to be identified, where the target picture to be identified includes a line;
for example, after the target picture is input into the target picture segmentation model, a target pixel level line graph is obtained, the target pixel level line graph comprises a plurality of pixel points, and the plurality of pixel points are combined into a line in the target picture.
(2) An input unit 1104, configured to input a target picture into a target picture segmentation model to obtain a target pixel-level line graph of the target picture, where the target pixel-level line graph includes pixel points of lines on the target picture, and the target picture segmentation model is a model for identifying lines on an image, where the model is obtained after training an original picture segmentation model using a sample picture and a sample pixel-level line graph corresponding to the sample picture;
when training, a sample picture and the sample pixel level line graph corresponding to it are obtained first; the sample pixel level line graph corresponding to the sample picture can be obtained through manual marking. One sample picture corresponds to one sample pixel level line graph, and together they form one group of samples. A plurality of groups of samples are obtained and input into the original picture segmentation model, and the original picture segmentation model is trained. If, for the current sample picture, the matching degree between the first pixel level line graph output by the original picture segmentation model and the pixel level line graph corresponding to the current sample picture is greater than the second threshold, the first pixel level line graph output by the current model is considered to meet the predetermined condition. If the ratio of the number of times the first pixel level line graphs output by the original picture segmentation model meet the predetermined condition to the total number of outputs is greater than the first threshold, the current original picture segmentation model is determined as the target picture segmentation model.
(3) A determining unit 1106, configured to determine positions of multiple line segments according to pixel points in the target pixel level line graph;
optionally, in the present solution, after the target picture is obtained and input into the target picture segmentation model, after the target pixel level line graph of the target picture is output by the target picture segmentation model, hough transform may be performed on the target pixel level line graph.
The process of Hough transform is mainly divided into two steps:
1. Initialize the relevant matrices
The angle list theta = [0, 1, 2, …, 178, 179], where the angle is the angle between the perpendicular from the origin to the target straight line and the x coordinate axis.
The distance list rho = [-diag_len+1, …, diag_len-1, diag_len], where the distance refers to the perpendicular distance from the origin to the target straight line, and diag_len represents the length of the diagonal.
The voting matrix votes is initialized to all zeros; its number of rows is the number of elements in the rho list, and its number of columns is the number of elements in the theta list.
2. For each non-zero pixel point in the line graph, traverse each angle value in the angle list and calculate the perpendicular distance value corresponding to the pixel point at that angle. The angle value and the distance value form a data pair (rho, theta), which corresponds to the (rho + diag_len, theta) position in the voting matrix; the value at that position is then incremented by 1.
The greater the value at a position in the voting matrix, the higher the confidence that the (rho, theta) value corresponding to that position represents a straight line.
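The two voting steps above can be sketched directly in NumPy. This is a naive reference implementation for illustration — a production system would use a vectorized accumulator or OpenCV's `cv2.HoughLines` instead.

```python
import numpy as np

def hough_votes(binary_img):
    """Voting step of the Hough transform as described above: theta list
    0..179 degrees, rho values offset by diag_len into the row index,
    and each non-zero pixel votes for every (rho, theta) pair it lies on."""
    h, w = binary_img.shape
    diag_len = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(180))
    votes = np.zeros((2 * diag_len + 1, len(thetas)), dtype=np.int64)
    ys, xs = np.nonzero(binary_img)
    for x, y in zip(xs, ys):
        for t_idx, t in enumerate(thetas):
            rho = int(round(x * np.cos(t) + y * np.sin(t)))
            votes[rho + diag_len, t_idx] += 1   # accumulate one vote
    return votes, diag_len
```

Peaks in the returned matrix correspond to high-confidence straight lines; e.g. a horizontal row of pixels at y = 2 produces its strongest votes at theta = 90 degrees, rho = 2.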
For example, if the width of the lines in the target pixel level line graph is greater than 1, the erosion and dilation operations need to be performed on the target pixel level line graph multiple times, so as to reduce the width of the lines in the target pixel level line graph until the width is 1 pixel.
The purpose of the erosion and dilation operations is to adjust a line whose width is many pixels to a line whose width is 1 pixel.
1. And judging whether the pixel value of each pixel in the target pixel level line graph is larger than 127.5, if so, setting the pixel value of the pixel point to be 255, and otherwise, setting the pixel value of the point to be 0.
2. A structuring element with a cross-shaped structural unit is obtained through the OpenCV library and is used for the subsequent erosion and dilation operations.
3. The skeleton information (skeleton) of the target pixel level line graph is initialized as an all-zero matrix.
4. The following operations are repeatedly performed until the target pixel level line graph is eroded into an image having all the pixel values of 0:
Erode the target pixel level line graph (image) to obtain an eroded image (eroded); dilate the eroded image to obtain a dilated image; calculate the difference between the original target pixel level line graph and the dilated image; and OR the difference image with the aforementioned skeleton image. This step can be regarded as continuously accumulating the skeleton information of the original target pixel level line graph. Take the eroded image as the new target pixel level line graph, and repeat step 4.
5. When all the pixel values of the target pixel level line graph have been eroded to 0, the skeleton image represents the skeleton information of the original target pixel level line graph, and the pixel width of each line segment in the image is 1.
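Steps 1–5 can be sketched in NumPy without OpenCV by implementing the cross-element erosion and dilation via padded neighborhood shifts. In practice `cv2.erode`/`cv2.dilate` with `cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))` would normally be used; this self-contained version is an assumption for illustration.

```python
import numpy as np

CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # cross-shaped element

def _neighbors(img):
    """Stack of the image shifted to each cross-element offset (zero-padded)."""
    p = np.pad(img, 1)
    h, w = img.shape
    return np.stack([p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w] for dy, dx in CROSS])

def erode(img):
    """Binary erosion with a 3x3 cross structuring element."""
    return _neighbors(img).min(axis=0)

def dilate(img):
    """Binary dilation with a 3x3 cross structuring element."""
    return _neighbors(img).max(axis=0)

def skeletonize(img):
    """Morphological skeleton following steps 1-5 above: binarize at 127.5,
    repeatedly erode, OR the per-iteration residue (image minus its opening)
    into the skeleton, and stop when the image is eroded to all zeros."""
    img = (img > 127.5).astype(np.uint8)   # step 1: binarize
    skeleton = np.zeros_like(img)          # step 3: all-zero matrix
    while img.any():                       # step 4
        eroded = erode(img)
        opened = dilate(eroded)            # opening of the current image
        skeleton |= img - opened           # residue removed by the opening
        img = eroded
    return skeleton                        # step 5: 1-pixel-wide skeleton
```

Applied to a 3-pixel-thick horizontal bar, this leaves its 1-pixel center line (plus the usual small corner spurs of the morphological skeleton).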
(4) A merging unit 1108, configured to merge, according to the positions of the multiple line segments, the line segments located on the same straight line among the multiple line segments, so as to obtain the lines on the target picture.
Optionally, in this scheme, after the Hough transform method is used to convert the pixel points in the target pixel level line graph into a plurality of line segments, the similarity between the shape of the pattern formed by the plurality of line segments and the lines in the target picture is very high, and the plurality of line segments correspond to coordinates in the planar rectangular coordinate system. The line segments belonging to the same straight line among the plurality of line segments are merged; the shape of the merged result almost matches the shape of the lines in the target picture, and the merged result includes the coordinates of each line segment.
Optionally, in the merging process, any two line segments are sequentially obtained from the plurality of line segments, wherein the two line segments comprise a first line segment and a second line segment, and the following steps are executed: the coordinate difference of the center points of the first line segment and the second line segment is obtained, and the first line segment and the second line segment are merged to obtain a combined line segment in the case that the coordinate difference is smaller than a third threshold.
Optionally, in the present solution, in the process of merging the first line segment and the second line segment, it first needs to be determined whether the first line segment and the second line segment are horizontal lines or vertical lines. Optionally, a horizontal line in this solution is a line segment whose included angle with the abscissa axis of the planar rectangular coordinate system is less than 45 degrees, and a vertical line is a line segment whose included angle with the abscissa axis of the planar rectangular coordinate system is greater than 45 degrees.
Further, any two combined line segments are sequentially obtained, wherein the two combined line segments comprise a first target line segment and a second target line segment, and the following step is executed: the first target line segment and the second target line segment are combined to obtain a line on the target picture in the case that the distance from the center point of the first target line segment to the straight line where the second target line segment is located is smaller than a fourth threshold. After this merging, the line segments belonging to the same straight line have been combined, and the merged result corresponds to the lines on the target picture. As another example, in the second merging step, the two target line segments may be combined only in the case that both distances are smaller than the fourth threshold: the distance from the center point of the first target line segment to the straight line of the second target line segment, and the distance from the center point of the second target line segment to the straight line of the first target line segment. This makes the merging of the target line segments more accurate.
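The fourth-threshold condition relies on the distance from a segment's center point to the infinite straight line through the other segment, which can be computed with the standard cross-product formula. This is a generic helper, not code from the patent.

```python
import math

def point_to_line_dist(p, seg):
    """Distance from point p to the infinite straight line through the
    endpoints of seg, via |cross product| / |direction vector|."""
    (x1, y1), (x2, y2) = seg
    px, py = p
    dx, dy = x2 - x1, y2 - y1
    return abs(dy * (px - x1) - dx * (py - y1)) / math.hypot(dx, dy)
```

Two combined segments would then be merged when this distance, evaluated in both directions, stays below the fourth threshold (dis_thresh).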
Optionally, in the scheme, after the plurality of line segments are combined to obtain the lines on the image, the lines on the image and the data identification result may be combined to obtain a table carrying data in the target image.
That is to say, when the line on the target picture is a table, the information of the table is obtained, and the information of the table and the information of the characters in the target picture are merged to obtain the table carrying the character data in the target picture.
Optionally, after the table carrying the data is identified, the table carrying the data may be stored in the block chain, so that it is ensured that both the table information and the data information of the table carrying the data cannot be modified.
The Blockchain is essentially a decentralized database: a string of data blocks generated and linked using cryptographic methods, where each data block contains a batch of network transaction information used for verifying the validity (anti-counterfeiting) of the information and generating the next block.
The basic service module is deployed on all blockchain node devices and is used for verifying the validity of a business request and, after consensus is completed, recording the valid request on storage. For a new business request, the basic service first performs interface adaptation analysis and authentication processing (interface adaptation), then encrypts the business information through a consensus algorithm (consensus management), transmits the encrypted information to the shared ledger (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering and contract execution: developers can define contract logic through a certain programming language and issue it on the blockchain (contract registration), and according to the logic of the contract terms a key or another event triggers execution to complete the contract logic; the function of upgrading and cancelling contracts is also provided. The operation monitoring module is mainly responsible for deployment during product release, configuration modification, contract setting and cloud adaptation, and for the visual output of the real-time running state during product operation, such as alarms, monitoring network conditions and monitoring the health status of node devices.
The platform product service layer provides basic capability and an implementation framework of typical application, and developers can complete block chain implementation of business logic based on the basic capability and the characteristics of the superposed business. The application service layer provides the application service based on the block chain scheme for the business participants to use.
Optionally, the line extraction device in the image may be applied to any field that needs to identify line information in a picture and acquire information, such as the medical field, the statistical field, the accounting field, and the like. For example, the method is applied to the process of identifying the lines of the invoice in the picture and acquiring information.
After the target pixel level line graph of the picture is output, the target pixel level line graph comprises the pixel points of the lines of the invoice; the positions of the line segments are determined by using the pixel points, and the line segments located on the same straight line among the line segments are then merged by using the positions to obtain the line and position information in the invoice.
By the method, whatever the type of the lines in the picture, the pixel points of the lines can be accurately identified; the next step is to determine the positions of the line segments according to these pixel points and to obtain the lines according to the positions of the line segments, so that the information of the lines is extracted from the target picture and the effect of flexibly obtaining the information of the lines in the picture is achieved.
As an alternative embodiment, the above apparatus further comprises:
(1) the second acquisition unit is used for acquiring a sample picture and a sample pixel level line graph corresponding to the sample picture before the target picture is input into the target picture segmentation model;
(2) the training unit is used for inputting the sample pictures and the sample pixel level line graphs corresponding to the sample pictures into an original picture segmentation model and training the original picture segmentation model, wherein after a training result output by the current original picture segmentation model is obtained, the target number of times that the first pixel level line graphs output by the original picture segmentation model meet a predetermined condition and the total number of first pixel level line graphs output by the original picture segmentation model are determined, and the current original picture segmentation model is determined as the target picture segmentation model in the case that the ratio of the target number to the total number is greater than a first threshold, wherein the predetermined condition is used for indicating that the matching degree between the ith first pixel level line graph, obtained after the ith sample picture is input into the original picture segmentation model, and the ith sample pixel level line graph is greater than a second threshold.
According to the embodiment, the original picture segmentation model is trained by using the sample picture before the target picture segmentation model is used, so that a mature target picture segmentation model is obtained, the flexibility of obtaining the information of lines in the picture is realized, and the identification accuracy of the target picture segmentation model is improved.
As an alternative embodiment, the merging unit includes:
(1) the first processing module is used for sequentially obtaining any two line segments from the plurality of line segments, wherein the two line segments comprise a first line segment and a second line segment, and executing the following steps: obtaining the coordinate difference of the center points of the first line segment and the second line segment, and combining the first line segment and the second line segment to obtain a combined line segment in the case that the coordinate difference is smaller than a third threshold;
(2) and the second processing module is used for sequentially acquiring any two combined line segments, wherein the two combined line segments comprise a first target line segment and a second target line segment, and executing the following step: combining the first target line segment and the second target line segment to obtain the line on the image in the case that the distance from the center point of the first target line segment to the straight line where the second target line segment is located is smaller than a fourth threshold.
Through this embodiment, the plurality of line segments are merged in two steps, so that polylines suitable for fitting curves can be obtained; this makes line-information extraction flexible while improving the accuracy of the line information obtained from the target picture.
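The second merging step can be illustrated as follows. This is a minimal sketch under assumed conventions: a segment is a pair of endpoints, and the fourth-threshold value is hypothetical:

```python
import math

def center(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def point_line_distance(p, seg):
    # perpendicular distance from point p to the infinite line through seg
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    return abs(dy * (p[0] - x1) - dx * (p[1] - y1)) / math.hypot(dx, dy)

def merge_if_collinear(a, b, fourth_threshold=3.0):
    """Merge segment a with segment b when the center of a lies close to
    the straight line through b; otherwise return None."""
    if point_line_distance(center(a), b) >= fourth_threshold:
        return None
    pts = [a[0], a[1], b[0], b[1]]
    (x1, y1), (x2, y2) = b
    # keep the two extreme endpoints along the dominant axis of b
    pts.sort(key=lambda p: p[0] if abs(x2 - x1) >= abs(y2 - y1) else p[1])
    return (pts[0], pts[-1])
```

Applying this pairwise over all merged segments, and repeating until no pair merges, yields the lines on the image.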
As an alternative embodiment, the first processing module is further configured to:
S1, when the first line segment and the second line segment are both horizontal lines, determining the difference between the vertical coordinates of their center points as the coordinate difference, wherein a horizontal line is a line segment forming an included angle of less than 45 degrees with the horizontal axis of the coordinate system in which the line segments are located;
and S2, when the first line segment and the second line segment are both vertical lines, determining the difference between the horizontal coordinates of their center points as the coordinate difference, wherein a vertical line is a line segment forming an included angle of less than 45 degrees with the vertical axis of the coordinate system.
With this method, only pairs of segments that are both horizontal or both vertical are compared, so segments with very different orientations never need to be compared; this makes line-information extraction flexible and also improves the efficiency of segment comparison.
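Steps S1 and S2 can be sketched directly; the 45-degree test reduces to comparing the absolute axis deltas of a segment's endpoints (segment representation assumed as a pair of points):

```python
def center(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def is_horizontal(seg):
    # angle with the horizontal axis below 45 degrees <=> |dy| < |dx|
    (x1, y1), (x2, y2) = seg
    return abs(y2 - y1) < abs(x2 - x1)

def coordinate_difference(a, b):
    """Vertical-coordinate difference of centers for two horizontal lines
    (S1), horizontal-coordinate difference for two vertical lines (S2),
    and None for mixed orientations, which are never compared."""
    (ax, ay), (bx, by) = center(a), center(b)
    if is_horizontal(a) and is_horizontal(b):
        return abs(ay - by)
    if not is_horizontal(a) and not is_horizontal(b):
        return abs(ax - bx)
    return None
```

The first processing module would merge a pair whenever this difference is defined and smaller than the third threshold.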
As an alternative embodiment, the above apparatus further comprises:
(1) the adjusting unit is used for performing erosion and dilation operations on the lines formed by the pixel points in the target pixel-level line graph of the target picture, before the positions of the plurality of line segments are determined from those pixel points, so as to reduce the width of the lines.
In this way, thick lines in the target pixel-level line graph can be thinned to a width of one pixel, so that the Hough transform identifies a single accurate vertical line instead of many short transverse segments; this makes line-information extraction flexible and improves the accuracy of the Hough transform.
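Erosion and dilation are standard morphological operations; in practice OpenCV's cv2.erode and cv2.dilate would be used. The pure-Python erosion below is a minimal sketch of the thinning effect on a binary line map:

```python
def erode(img, iterations=1):
    """Binary erosion with a 3x3 square structuring element.
    img is a list of rows of 0/1 values; each pass strips one pixel from
    every side of a stroke, so repeated erosion thins thick lines."""
    h, w = len(img), len(img[0])
    for _ in range(iterations):
        out = [[0] * w for _ in range(h)]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                # keep a pixel only if its full 3x3 neighbourhood is set
                if all(img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                    out[y][x] = 1
        img = out
    return img
```

A three-pixel-wide horizontal band collapses to its one-pixel-wide center row after a single pass; a following dilation (not shown) can restore length lost at the stroke ends.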
As an alternative embodiment, the determining unit includes:
(1) the third processing module is used for performing a Hough transform on the target pixel-level line graph to obtain the positions of the plurality of line segments.
In this way, raw line pixels on the target picture, which cannot be used directly, are converted into usable line coordinates, so the line information can be processed in a structured way; this makes line-information extraction flexible and improves the efficiency of obtaining line information from the picture.
As an alternative embodiment, the image includes a table and text data, the lines form the table, and the apparatus further includes:
(1) the recognition unit is used for recognizing the text data in the image through a text recognition technology after the line segments located on the same straight line among the plurality of line segments are merged, according to their positions, into the lines on the image;
(2) and the combination unit is used for combining the lines on the image obtained by merging with the text data obtained by the text recognition technology to obtain a table carrying the text data.
Alternatively, the lines in this scheme may form a table. Recognizing the lines in the image yields the line information; meanwhile, the characters in the target picture can be recognized with a text recognition technology to obtain a character recognition result. Combining the line information of the image with the character recognition result then yields accurate line and data information for the table.
In this way, line-information extraction remains flexible while the line information and the data information are combined.
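One plausible way to combine the two results is to treat the merged line coordinates as a cell grid and drop each recognized text into the cell containing its center point. The OCR step itself is assumed, not shown; the (text, x, y) tuple format is a hypothetical convention:

```python
import bisect

def place_text_in_cells(h_lines, v_lines, ocr_results):
    """h_lines / v_lines are the sorted y- and x-coordinates of the
    merged horizontal and vertical table lines; ocr_results is a list of
    (text, x, y) center points as a text recognition engine might
    return them. Returns a row-major table of cell contents."""
    rows, cols = len(h_lines) - 1, len(v_lines) - 1
    table = [["" for _ in range(cols)] for _ in range(rows)]
    for text, x, y in ocr_results:
        r = bisect.bisect_right(h_lines, y) - 1  # row band containing y
        c = bisect.bisect_right(v_lines, x) - 1  # column band containing x
        if 0 <= r < rows and 0 <= c < cols:
            table[r][c] = text
    return table
```

This yields the table carrying the text data that the combination unit is described as producing.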
As an alternative embodiment, the above apparatus further comprises:
and the storage unit is used for storing the table carrying the text data into a blockchain after the lines on the image obtained by merging are combined with the text data obtained by the text recognition technology to obtain the table carrying the text data.
Through this embodiment, the table carrying the data is stored in a blockchain, so that its data cannot be modified or deleted, which improves the credibility of the table in use.
According to still another aspect of an embodiment of the present invention, there is also provided an electronic device for implementing the above method for extracting lines in an image. As shown in fig. 12, the electronic device includes a memory 1202 and a processor 1204; the memory 1202 stores a computer program, and the processor 1204 is configured to execute the steps of any of the method embodiments described above through the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, acquiring a target picture to be recognized, wherein the target picture to be recognized comprises lines;
s2, inputting the target picture into a target picture segmentation model to obtain a target pixel level line graph of the target picture, wherein the target pixel level line graph comprises pixel points of lines on the target picture, and the target picture segmentation model is a model for identifying the lines on the target picture obtained after training an original picture segmentation model by using a sample picture and the sample pixel level line graph corresponding to the sample picture;
s3, determining the positions of a plurality of line segments according to the pixel points in the target pixel level line graph;
and S4, merging the line segments located on the same straight line among the plurality of line segments according to their positions to obtain the lines on the target picture.
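The steps above compose into a short pipeline. The three callables are hypothetical stand-ins for components described elsewhere in this document, not concrete implementations:

```python
def extract_lines(picture, segmentation_model, detect_segments, merge_collinear):
    """End-to-end sketch of steps S1-S4: segmentation_model maps the
    target picture to its pixel-level line graph (S2), detect_segments
    derives segment positions, e.g. via Hough transform (S3), and
    merge_collinear merges segments on the same straight line (S4)."""
    line_graph = segmentation_model(picture)  # S2: pixel-level line graph
    segments = detect_segments(line_graph)    # S3: segment positions
    return merge_collinear(segments)          # S4: merged lines
```

The target picture passed in as the first argument corresponds to step S1, acquiring the picture to be recognized.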
Alternatively, as can be understood by those skilled in the art, the structure shown in fig. 12 is only an illustration; the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. Fig. 12 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 12, or have a different configuration from that shown in fig. 12.
The memory 1202 may be configured to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for extracting lines from an image in the embodiment of the present invention, and the processor 1204 may execute various functional applications and data processing by running the software programs and modules stored in the memory 1202, so as to implement the method for extracting lines from an image in the embodiment of the present invention.
In one example, the transmitting device 1206 includes a Network Interface Controller (NIC), which can be connected to routers and other network devices through a network cable so as to communicate with the internet or a local area network. In another example, the transmitting device 1206 is a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
In addition, the electronic device further includes: a display 1208 for displaying information of lines in the image; and a connection bus 1210 for connecting the respective module parts in the above-described electronic apparatus.
According to a further aspect of an embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is arranged, when executed, to perform the steps of any of the method embodiments described above.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, acquiring a target picture to be recognized, wherein the target picture to be recognized comprises lines;
s2, inputting the target picture into a target picture segmentation model to obtain a target pixel level line graph of the target picture, wherein the target pixel level line graph comprises pixel points of lines on the target picture, and the target picture segmentation model is a model for identifying the lines on the target picture obtained after training an original picture segmentation model by using a sample picture and the sample pixel level line graph corresponding to the sample picture;
s3, determining the positions of a plurality of line segments according to the pixel points in the target pixel level line graph;
and S4, merging the line segments located on the same straight line among the plurality of line segments according to their positions to obtain the lines on the target picture.
Alternatively, in this embodiment, a person skilled in the art may understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by instructing hardware associated with the terminal device through a program, where the program may be stored in a computer-readable storage medium, and the storage medium may include a flash memory disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
It should be understood that the technical solution of the present invention, or the part thereof contributing over the prior art, may be embodied in whole or in part in the form of a software product, which is stored in a storage medium and includes instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; that is, they may be located in one place, or may be distributed over multiple network units.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and these modifications and improvements should also be regarded as falling within the protection scope of the present invention.

Claims (15)

1. A method for extracting lines in an image, comprising:
acquiring a target picture to be identified, wherein the target picture to be identified comprises lines;
inputting the target picture into a target picture segmentation model to obtain a target pixel level line graph of the target picture, wherein the target pixel level line graph comprises pixel points of lines on the target picture, and the target picture segmentation model is a model for identifying the lines on the image, which is obtained after training an original picture segmentation model by using a sample picture and the sample pixel level line graph corresponding to the sample picture;
determining the positions of a plurality of line segments according to the pixel points in the target pixel level line graph;
and merging the line segments located on the same straight line among the plurality of line segments according to the positions of the plurality of line segments to obtain the lines on the target picture.
2. The method of claim 1, wherein prior to inputting the target picture into a target picture segmentation model, the method further comprises:
obtaining the sample picture and a sample pixel level line graph corresponding to the sample picture;
inputting the sample picture and the sample pixel-level line graph corresponding to the sample picture into the original picture segmentation model, and training the original picture segmentation model, wherein after a training result output by the current original picture segmentation model is obtained, the target number of times that a first pixel-level line graph output by the original picture segmentation model meets a predetermined condition and the total number of first pixel-level line graphs output by the original picture segmentation model are determined, and the current original picture segmentation model is determined as the target picture segmentation model when the ratio of the target number to the total number is greater than a first threshold, wherein the predetermined condition indicates that the matching degree between the i-th first pixel-level line graph, obtained after the i-th sample picture is input into the original picture segmentation model, and the i-th sample pixel-level line graph is greater than a second threshold.
3. The method of claim 1, wherein merging the line segments located on the same straight line among the plurality of line segments according to the positions of the plurality of line segments to obtain the lines on the target picture comprises:
sequentially acquiring any two line segments from the plurality of line segments, wherein the two line segments comprise a first line segment and a second line segment, and executing the following steps: acquiring the coordinate difference between the center points of the first line segment and the second line segment, and merging the first line segment and the second line segment into a merged line segment when the coordinate difference is smaller than a third threshold;
and sequentially acquiring any two merged line segments, wherein the two merged line segments comprise a first target line segment and a second target line segment, and merging the first target line segment and the second target line segment into a line on the target picture when the distance from the center point of the first target line segment to the straight line on which the second target line segment lies is smaller than a fourth threshold.
4. The method of claim 3, wherein acquiring the coordinate difference between the center points of the first line segment and the second line segment comprises:
determining the difference between the vertical coordinates of the center points of the first line segment and the second line segment as the coordinate difference when the first line segment and the second line segment are both horizontal lines, wherein a horizontal line is a line segment forming an included angle of less than 45 degrees with the horizontal axis of the coordinate system in which the line segments are located;
and determining the difference between the horizontal coordinates of the center points of the first line segment and the second line segment as the coordinate difference when the first line segment and the second line segment are both vertical lines, wherein a vertical line is a line segment forming an included angle of less than 45 degrees with the vertical axis of the coordinate system.
5. The method of claim 1, wherein prior to determining the locations of a plurality of line segments from the pixel points in a target pixel level line graph of the target picture, the method further comprises:
and performing erosion and dilation operations on the lines formed by the pixel points in the target pixel-level line graph of the target picture so as to reduce the width of the lines.
6. The method of any one of claims 1 to 5, wherein determining the positions of the plurality of line segments according to the pixel points in the target pixel-level line graph comprises:
and performing a Hough transform on the target pixel-level line graph to obtain the positions of the plurality of line segments.
7. The method of any one of claims 1 to 6, wherein the target picture comprises a table and text data, the lines form the table, and after merging the line segments located on the same straight line among the plurality of line segments according to the positions of the line segments, the method further comprises:
identifying the text data in the image through a text recognition technology;
and combining the lines on the target picture obtained by merging with the text data obtained by using the text recognition technology to obtain a table carrying the text data.
8. An apparatus for extracting lines in an image, comprising:
a first obtaining unit, configured to obtain a target picture to be recognized, wherein the target picture to be recognized includes lines;
the input unit is used for inputting the target picture into a target picture segmentation model to obtain a target pixel level line graph of the target picture, wherein the target pixel level line graph comprises pixel points of lines on the target picture, and the target picture segmentation model is a model for identifying the lines on the target picture, which is obtained after an original picture segmentation model is trained by using a sample picture and the sample pixel level line graph corresponding to the sample picture;
the determining unit is used for determining the positions of a plurality of line segments according to the pixel points in the target pixel level line graph;
and the merging unit is used for merging the line segments located on the same straight line among the plurality of line segments according to the positions of the plurality of line segments to obtain the lines on the target picture.
9. The apparatus of claim 8, further comprising:
the second acquisition unit is used for acquiring the sample picture and a sample pixel level line graph corresponding to the sample picture before the target picture is input into a target picture segmentation model;
a training unit, configured to input the sample picture and the sample pixel-level line graph corresponding to the sample picture into the original picture segmentation model and train the original picture segmentation model, wherein after a training result output by the current original picture segmentation model is obtained, the target number of times that a first pixel-level line graph output by the original picture segmentation model meets a predetermined condition and the total number of first pixel-level line graphs output by the original picture segmentation model are determined, and when the ratio of the target number to the total number is greater than a first threshold, the current original picture segmentation model is determined as the target picture segmentation model, wherein the predetermined condition indicates that the matching degree between the i-th first pixel-level line graph, obtained after the i-th sample picture is input into the original picture segmentation model, and the i-th sample pixel-level line graph is greater than a second threshold.
10. The apparatus of claim 8, wherein the merging unit comprises:
the first processing module is used for sequentially obtaining any two line segments from the plurality of line segments, wherein the two line segments comprise a first line segment and a second line segment, and for executing the following steps: obtaining the coordinate difference between the center points of the first line segment and the second line segment, and merging the first line segment and the second line segment into a merged line segment when the coordinate difference is smaller than a third threshold;
and the second processing module is used for sequentially acquiring any two merged line segments, wherein the two merged line segments comprise a first target line segment and a second target line segment, and for executing the following step: merging the first target line segment and the second target line segment into a line on the target picture when the distance from the center point of the first target line segment to the straight line on which the second target line segment lies is smaller than a fourth threshold.
11. The apparatus of claim 10, wherein the first processing module is further configured to:
determine the difference between the vertical coordinates of the center points of the first line segment and the second line segment as the coordinate difference when the first line segment and the second line segment are both horizontal lines, wherein a horizontal line is a line segment forming an included angle of less than 45 degrees with the horizontal axis of the coordinate system in which the line segments are located;
and determine the difference between the horizontal coordinates of the center points of the first line segment and the second line segment as the coordinate difference when the first line segment and the second line segment are both vertical lines, wherein a vertical line is a line segment forming an included angle of less than 45 degrees with the vertical axis of the coordinate system.
12. The apparatus of claim 8, further comprising:
and the adjusting unit is used for performing erosion and dilation operations on the lines formed by the pixel points in the target pixel-level line graph of the target picture, before the positions of the plurality of line segments are determined from those pixel points, so as to reduce the width of the lines.
13. The apparatus according to any one of claims 8 to 12, wherein the determining unit comprises:
and the third processing module is used for performing a Hough transform on the target pixel-level line graph to obtain the positions of the plurality of line segments.
14. A computer-readable storage medium, in which a computer program is stored, characterized in that the computer program, when run, performs the method of any one of claims 1 to 7.
15. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any one of claims 1 to 7 by means of the computer program.
CN201910979506.4A 2019-10-15 2019-10-15 Method and device for extracting lines in image, storage medium and electronic device Pending CN110738219A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910979506.4A CN110738219A (en) 2019-10-15 2019-10-15 Method and device for extracting lines in image, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910979506.4A CN110738219A (en) 2019-10-15 2019-10-15 Method and device for extracting lines in image, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN110738219A true CN110738219A (en) 2020-01-31

Family

ID=69268898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910979506.4A Pending CN110738219A (en) 2019-10-15 2019-10-15 Method and device for extracting lines in image, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN110738219A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612804A (en) * 2020-05-13 2020-09-01 北京达佳互联信息技术有限公司 Image segmentation method and device, electronic equipment and storage medium
CN111709338A (en) * 2020-06-08 2020-09-25 苏州超云生命智能产业研究院有限公司 Method and device for detecting table and training method of detection model
CN111898402A (en) * 2020-06-01 2020-11-06 王昌龙 Intelligent typesetting system
CN112199756A (en) * 2020-10-30 2021-01-08 久瓴(江苏)数字智能科技有限公司 Method and device for automatically determining distance between straight lines
CN112199753A (en) * 2020-10-30 2021-01-08 久瓴(江苏)数字智能科技有限公司 Shear wall generation method and device, electronic equipment and storage medium
CN113139445A (en) * 2021-04-08 2021-07-20 招商银行股份有限公司 Table recognition method, apparatus and computer-readable storage medium
CN113375680A (en) * 2020-03-10 2021-09-10 阿里巴巴集团控股有限公司 Method, device and equipment for merging line elements in electronic map
CN116228634A (en) * 2022-12-07 2023-06-06 辉羲智能科技(上海)有限公司 Distance transformation calculation method, application, terminal and medium for image detection
CN111291758B (en) * 2020-02-17 2023-08-04 北京百度网讯科技有限公司 Method and device for recognizing seal characters
CN113139445B (en) * 2021-04-08 2024-05-31 招商银行股份有限公司 Form recognition method, apparatus, and computer-readable storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103093218A (en) * 2013-01-14 2013-05-08 西南大学 Automatically recognizing form type method and device
CN104129389A (en) * 2014-08-06 2014-11-05 中电海康集团有限公司 Method for effectively judging and recognizing vehicle travelling conditions and device thereof
CN107392139A (en) * 2017-07-18 2017-11-24 海信集团有限公司 A kind of method for detecting lane lines and terminal device based on Hough transformation
CN107679024A (en) * 2017-09-11 2018-02-09 畅捷通信息技术股份有限公司 The method of identification form, system, computer equipment, readable storage medium storing program for executing
CN108061897A (en) * 2017-12-05 2018-05-22 哈尔滨工程大学 A kind of submerged structure environment line feature extraction method based on Forward-Looking Sonar
CN109086714A (en) * 2018-07-31 2018-12-25 国科赛思(北京)科技有限公司 Table recognition method, identifying system and computer installation
US20180373932A1 (en) * 2016-12-30 2018-12-27 International Business Machines Corporation Method and system for crop recognition and boundary delineation
CN109993112A (en) * 2019-03-29 2019-07-09 杭州睿琪软件有限公司 The recognition methods of table and device in a kind of picture
CN110163198A (en) * 2018-09-27 2019-08-23 腾讯科技(深圳)有限公司 A kind of Table recognition method for reconstructing, device and storage medium
CN110210409A (en) * 2019-06-04 2019-09-06 南昌市微轲联信息技术有限公司 Form frame-line detection method and system in table document


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Si Ming, "Research on Table Recognition", China Master's Theses Full-text Database, Information Science and Technology *
Yao Pengwei, "Table Recognition Based on Digital Image Processing", China Master's Theses Full-text Database, Information Science and Technology *
Zhang Zheyuan, "Research on Key Technologies of Autonomous Navigation for Orchard Robots", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291758B (en) * 2020-02-17 2023-08-04 北京百度网讯科技有限公司 Method and device for recognizing seal characters
CN113375680A (en) * 2020-03-10 2021-09-10 阿里巴巴集团控股有限公司 Method, device and equipment for merging line elements in electronic map
CN113375680B (en) * 2020-03-10 2024-03-29 阿里巴巴集团控股有限公司 Method, device and equipment for merging line elements in electronic map
CN111612804A (en) * 2020-05-13 2020-09-01 北京达佳互联信息技术有限公司 Image segmentation method and device, electronic equipment and storage medium
CN111612804B (en) * 2020-05-13 2024-03-19 北京达佳互联信息技术有限公司 Image segmentation method, device, electronic equipment and storage medium
CN111898402A (en) * 2020-06-01 2020-11-06 王昌龙 Intelligent typesetting system
CN111709338B (en) * 2020-06-08 2024-02-27 苏州超云生命智能产业研究院有限公司 Method and device for table detection and training method of detection model
CN111709338A (en) * 2020-06-08 2020-09-25 苏州超云生命智能产业研究院有限公司 Method and device for detecting table and training method of detection model
CN112199756A (en) * 2020-10-30 2021-01-08 久瓴(江苏)数字智能科技有限公司 Method and device for automatically determining distance between straight lines
CN112199753A (en) * 2020-10-30 2021-01-08 久瓴(江苏)数字智能科技有限公司 Shear wall generation method and device, electronic equipment and storage medium
CN112199753B (en) * 2020-10-30 2022-06-28 久瓴(江苏)数字智能科技有限公司 Shear wall generation method and device, electronic equipment and storage medium
CN113139445A (en) * 2021-04-08 2021-07-20 招商银行股份有限公司 Table recognition method, apparatus and computer-readable storage medium
CN113139445B (en) * 2021-04-08 2024-05-31 招商银行股份有限公司 Form recognition method, apparatus, and computer-readable storage medium
CN116228634B (en) * 2022-12-07 2023-12-22 辉羲智能科技(上海)有限公司 Distance transformation calculation method, application, terminal and medium for image detection
CN116228634A (en) * 2022-12-07 2023-06-06 辉羲智能科技(上海)有限公司 Distance transformation calculation method, application, terminal and medium for image detection

Similar Documents

Publication Publication Date Title
CN110738219A (en) Method and device for extracting lines in image, storage medium and electronic device
US11244435B2 (en) Method and apparatus for generating vehicle damage information
CN111931664B (en) Mixed-pasting bill image processing method and device, computer equipment and storage medium
WO2020098250A1 (en) Character recognition method, server, and computer readable storage medium
CN110853033B (en) Video detection method and device based on inter-frame similarity
CN109117773B (en) Image feature point detection method, terminal device and storage medium
CN110689043A (en) Vehicle fine granularity identification method and device based on multiple attention mechanism
CN109086834B (en) Character recognition method, character recognition device, electronic equipment and storage medium
US20230290120A1 (en) Image classification method and apparatus, computer device, and storage medium
CN108334879B (en) Region extraction method, system and terminal equipment
CN113449725B (en) Object classification method, device, equipment and storage medium
CN110569856A (en) Sample labeling method and device, and damage category identification method and device
CN112541443B (en) Invoice information extraction method, invoice information extraction device, computer equipment and storage medium
CN112232203B (en) Pedestrian recognition method and device, electronic equipment and storage medium
CN113705462A (en) Face recognition method and device, electronic equipment and computer readable storage medium
CN110245573A (en) Registration method, apparatus and terminal device based on face recognition
CN115063632A (en) Vehicle damage identification method, device, equipment and medium based on artificial intelligence
CN115222427A (en) Artificial intelligence-based fraud risk identification method and related equipment
CN114972771A (en) Vehicle loss assessment and claim settlement method and device, electronic equipment and storage medium
CN114332809A (en) Image identification method and device, electronic equipment and storage medium
CN110659631A (en) License plate recognition method and terminal equipment
CN112632249A (en) Method and device for displaying different versions of information of product, computer equipment and medium
CN111104965A (en) Vehicle target identification method and device
CN113486848B (en) Document table identification method, device, equipment and storage medium
CN105224957A (en) Image recognition method and system based on a single sample

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40020858
Country of ref document: HK

SE01 Entry into force of request for substantive examination