WO2022134771A1 - Table processing method and apparatus, and electronic device and storage medium

Publication number
WO2022134771A1
WO2022134771A1 PCT/CN2021/124416 CN2021124416W WO2022134771A1 WO 2022134771 A1 WO2022134771 A1 WO 2022134771A1 CN 2021124416 W CN2021124416 W CN 2021124416W WO 2022134771 A1 WO2022134771 A1 WO 2022134771A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
line segment
segment detection
feature vector
Prior art date
Application number
PCT/CN2021/124416
Other languages
French (fr)
Chinese (zh)
Inventor
高超
徐国强
Original Assignee
深圳壹账通智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2022134771A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume

Definitions

  • The present application relates to the technical field of image processing, and in particular to a table processing method, apparatus, electronic device, and storage medium.
  • In addition to solving text position detection and content recognition, an Optical Character Recognition (OCR) system usually needs to parse and restore the layout structure of the document. Tables are the most common and important layout structure, so table parsing has become a key part of digitizing paper documents.
  • The inventors recognize that table detection and structure recognition in the industry is currently based mostly on table frame line detection: the row and column positions of the text are derived from the horizontal and vertical lines in the image, and the structure of the entire table is then derived from them.
  • This approach works well for conventional tables with complete frame lines, but cannot accurately restore the structure of tables without frame lines, three-line tables, or four-line tables whose table lines are invisible. How to improve the accuracy of table restoration is therefore a problem that urgently needs to be solved.
  • Embodiments of the present application provide a table processing method, apparatus, electronic device, and storage medium, which can improve table restoration accuracy.
  • an embodiment of the present application provides a table processing method, the method comprising:
  • the restoration process is performed according to the relationship adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
  • an embodiment of the present application provides a table processing device, the device includes: an acquisition unit, a detection unit, a splicing unit, an input unit, and a restoration unit, wherein,
  • the acquiring unit is configured to acquire a table image of the first table, identify the table image, obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector;
  • the detection unit is configured to perform line segment detection on the table image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector;
  • the splicing unit is configured to splice the first feature vector and the second feature vector to obtain a third feature vector;
  • the input unit is used to input the third feature vector into a preset model, obtain vertex features, and determine a relationship adjacency matrix based on the vertex features;
  • the restoration unit is configured to perform restoration processing according to the relationship adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
  • embodiments of the present application provide an electronic device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor to implement the above table processing method, the method including:
  • the restoration process is performed according to the relationship adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
  • an embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute the table processing method.
  • the method includes:
  • the restoration process is performed according to the relationship adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
  • In the embodiments of the present application, the table image can first be recognized as a whole to obtain an overall recognition result, for example the text box content and content information. Line detection can also be performed, which is equivalent to local image recognition. The two sets of features are then fused so that global vertex information is used effectively, and the model has more layers and a stronger feature extraction capability. In this way, the table can be restored in depth and the accuracy of table restoration can be improved.
  • FIG. 1 is a schematic flowchart of a form processing method provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of another form processing method provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 4 is a block diagram of functional units of a table processing apparatus provided by an embodiment of the present application.
  • the present application may relate to artificial intelligence technology, for example, relevant data may be acquired and processed based on artificial intelligence technology, for example, vertex features may be determined based on artificial intelligence technology.
  • The technical solution of the present application can be applied to table processing scenarios in various fields, such as digital healthcare or financial technology, to improve the accuracy of table restoration and thereby promote the construction of smart cities.
  • The electronic devices involved in the embodiments of this application may include various handheld devices with image processing functions (such as mobile phones, tablet computers, and POS machines), scanners, micro-printers, desktop computers, vehicle-mounted devices, wearable devices (smart watches, smart bracelets, wireless headsets, augmented reality/virtual reality devices, smart glasses), AI robots, computing devices or other processing devices connected to wireless modems, and various forms of user equipment (UE), mobile stations (MS), terminal devices, and so on.
  • FIG. 1 is a schematic flowchart of a table processing method provided by an embodiment of the present application. As shown in the figure, the table processing method is applied to an electronic device and includes:
  • The first table may be any table, for example an Excel table, a table hand-drawn by a user, or a table in a Word file.
  • For example, the electronic device can photograph the first table to obtain the table image and recognize the table image, mainly using OCR technology, to obtain a recognition result that can include the text box content and content information. Feature extraction is then performed on the recognition result to obtain the first feature vector. The extracted features can be at least one of the following: vertex coordinates, center coordinates, width, height, and color of a text box, text character type statistics, text word vectors, sentence vectors, foreground and background colors, fonts, textures, and so on, which are not limited here.
  • The feature extraction algorithm can be at least one of the following: Harris corner detection, scale-invariant feature transform (SIFT), neural network algorithm, wavelet transform, and so on, which are not limited here. Each feature can be represented as a vector to obtain the first feature vector; that is, the first feature vector may include features of the text box dimension of the table as well as features of the content dimension of the table.
  • In a specific implementation, the electronic device can perform OCR on the entire document to obtain text box position and content information, and extract features to obtain an N_t × D feature vector, where N_t is the number of text boxes, D is the feature dimension, and t is a positive integer.
  • The features may include at least one of the following: location features (e.g., vertex coordinates, center coordinates, width, and height of text boxes), language features (e.g., text character type statistics, text word vectors, sentence vectors), and image features (e.g., foreground and background color, font, texture), which are not limited here.
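  • As a minimal sketch of how an N_t × D text-box feature matrix could be assembled, the following Python builds one row per OCR text box from location features and simple character type statistics. The function names, the box format `(x_min, y_min, x_max, y_max)`, and the choice of D = 10 are illustrative assumptions, not details taken from the application; word vectors and image features are omitted.

```python
def text_box_features(box, text):
    """Return a fixed-length feature row for one OCR text box.

    box  -- hypothetical (x_min, y_min, x_max, y_max) in pixels
    text -- recognized string for that box
    """
    x0, y0, x1, y1 = box
    n_chars = max(len(text), 1)  # avoid division by zero for empty text
    digits = sum(ch.isdigit() for ch in text)
    letters = sum(ch.isalpha() for ch in text)
    return [
        float(x0), float(y0), float(x1), float(y1),  # vertex coordinates
        (x0 + x1) / 2.0, (y0 + y1) / 2.0,            # center coordinates
        float(x1 - x0), float(y1 - y0),              # width, height
        digits / n_chars, letters / n_chars,         # character type statistics
    ]


def build_text_feature_matrix(ocr_results):
    """Stack one feature row per text box: the result has shape N_t x D."""
    return [text_box_features(box, text) for box, text in ocr_results]


matrix = build_text_feature_matrix([
    ((0, 0, 10, 4), "ab12"),
    ((20, 0, 40, 4), "Total"),
])
```

Here `matrix` has N_t = 2 rows and D = 10 columns; a real system would likely append learned text and image embeddings to each row.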
  • obtaining the table image of the first table may include the following steps:
  • the environmental parameter may be at least one of the following: ambient light brightness, ambient color temperature, jitter parameter, temperature, humidity, weather, etc., which are not limited herein.
  • The electronic device can obtain the environmental parameters through sensors, which can be at least one of the following: ambient light sensor, temperature sensor, humidity sensor, color temperature sensor, jitter detection sensor, weather sensor, and so on, which are not limited here. The electronic device can include the above sensors and use each of them to detect the corresponding environmental parameter.
  • the shooting parameters may be at least one of the following: ISO, exposure time, flash brightness, flash operating frequency, flash color, anti-shake parameters, white balance parameters, etc., which are not limited herein.
  • the mapping relationship between the preset environmental parameters and the shooting parameters may be pre-stored in the electronic device.
  • In a specific implementation, the electronic device can obtain the target environmental parameters, determine the target shooting parameters corresponding to the target environmental parameters according to the preset mapping relationship between environmental parameters and shooting parameters, and photograph the first table according to the target shooting parameters to obtain the table image. In this way, shooting parameters suited to the environment are obtained, which helps to capture a good table image and to improve the efficiency of subsequent table restoration.
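  • The mapping from environmental parameters to shooting parameters could be stored as a simple lookup table. The sketch below assumes a single environmental parameter (ambient light in lux) and hypothetical thresholds and shooting values; the application does not specify the form or contents of the mapping.

```python
# Hypothetical pre-stored mapping: (upper bound on ambient light in lux, shooting parameters).
SHOOTING_TABLE = [
    (50.0, {"iso": 800, "exposure_ms": 40, "flash": True}),
    (500.0, {"iso": 400, "exposure_ms": 20, "flash": False}),
    (float("inf"), {"iso": 100, "exposure_ms": 5, "flash": False}),
]


def target_shooting_parameters(ambient_lux):
    """Return the shooting parameters of the first band that contains ambient_lux."""
    for upper_bound, params in SHOOTING_TABLE:
        if ambient_lux < upper_bound:
            return dict(params)  # copy so callers cannot mutate the table
    raise ValueError("ambient_lux must be a finite number")


dim_room = target_shooting_parameters(10.0)
office = target_shooting_parameters(200.0)
daylight = target_shooting_parameters(10000.0)
```

A production mapping would probably be multi-dimensional (brightness, color temperature, jitter) and calibrated per device.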
  • the electronic device may perform line segment detection on the table image, and the specific algorithm may be at least one of the following: Hough transform, neural network algorithm, ecological algorithm, etc., which are not limited here.
  • the algorithm corresponding to the feature extraction may be at least one of the following: Harris corner detection, scale-invariant feature transformation algorithm, neural network algorithm, wavelet transformation, etc., which are not limited herein.
  • In a specific implementation, the electronic device can perform line segment detection on the table image to obtain the coordinates of all line segments in the table image, and extract features to obtain an N_l × D feature vector, where N_l is the number of line segments and D is the feature dimension.
  • The features may include at least one of the following: position features (e.g., start point, end point, and midpoint coordinates of a line segment, length, angle) and image features (e.g., foreground and background color, line shape, thickness), which are not limited here.
  • The electronic device can then assemble these features into the second feature vector, for example by converting the different features to the same dimension and splicing them.
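  • The position features of a detected line segment follow directly from its endpoints. The following sketch computes start point, end point, midpoint, length, and an orientation angle folded into [0, 180) degrees; the function name and the feature order are assumptions for illustration, and image features such as thickness are left out.

```python
import math


def line_segment_features(x0, y0, x1, y1):
    """Return [start, end, midpoint, length, angle] features for one segment."""
    length = math.hypot(x1 - x0, y1 - y0)
    # Orientation in degrees, folded so that direction of travel does not matter.
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0
    return [
        float(x0), float(y0), float(x1), float(y1),
        (x0 + x1) / 2.0, (y0 + y1) / 2.0,
        length, angle,
    ]


diagonal = line_segment_features(0, 0, 3, 4)
vertical = line_segment_features(5, 0, 5, 10)
```

Stacking one such row per segment gives an N_l × 8 matrix, which can then be brought to the common dimension D before splicing.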
  • The line segment detection algorithm may be at least one of the following: Hough transform, the Line Segment Detector (LSD), a neural network algorithm, and so on, which are not limited here.
  • In step 102, performing line segment detection on the table image to obtain a line segment detection result may include the following steps:
  • In a specific implementation, the electronic device can determine the boundary contour of the table image. The boundary contour can be set by the user, or obtained by coarse recognition of the table image (for example, recognizing from the periphery inward, recognizing only the peripheral contour, and stopping once the peripheral contour has been recognized), in which case the outermost contour is taken as the boundary contour. The electronic device can then perform texture extraction on the image within the boundary contour to obtain P texture lines, and screen the P texture lines to obtain Q texture lines.
  • For example, it can detect whether the length of each of the P texture lines is within a preset length range, whether the width of each line is within a preset width range, or whether the bending angle of each line is within a preset angle range, and so on.
  • The preset length range, the preset width range, and the preset angle range can be set by the user or by system default.
  • Finally, line segment detection is performed on the Q texture lines to obtain the line segment detection result. In this way, the probability of misrecognition can be reduced.
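  • The screening of P texture lines down to Q texture lines can be sketched as a range filter. The dictionary keys (`length`, `width`, `bend_deg`) and the decision to require all three criteria at once are assumptions; the text allows any one of the checks to be used on its own.

```python
def screen_texture_lines(lines, length_range, width_range, angle_range):
    """Keep only the lines whose length, width, and bending angle are all in range."""
    def in_range(value, bounds):
        low, high = bounds
        return low <= value <= high

    return [
        line for line in lines
        if in_range(line["length"], length_range)
        and in_range(line["width"], width_range)
        and in_range(line["bend_deg"], angle_range)
    ]


# Hypothetical P = 3 texture lines extracted inside the boundary contour.
p_lines = [
    {"id": "ruling", "length": 400.0, "width": 2.0, "bend_deg": 0.5},
    {"id": "speck", "length": 6.0, "width": 2.0, "bend_deg": 0.0},
    {"id": "scribble", "length": 300.0, "width": 3.0, "bend_deg": 35.0},
]
q_lines = screen_texture_lines(p_lines, (50.0, 2000.0), (1.0, 5.0), (0.0, 5.0))
```

Only the straight, sufficiently long line survives, so later line segment detection runs on fewer spurious candidates.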
  • In a specific implementation, the electronic device may bring the first feature vector and the second feature vector to the same size in at least one dimension, and then splice the two to obtain the third feature vector.
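  • One way to read the splicing step is to bring both feature matrices to a common feature dimension and then stack them along the vertex axis, giving (N_t + N_l) × D. The sketch below uses zero padding as a stand-in for whatever projection the actual model uses; both the padding and the stacking axis are assumptions.

```python
def pad_rows(matrix, dim):
    """Right-pad every row with zeros so that all rows have length dim."""
    return [list(row) + [0.0] * (dim - len(row)) for row in matrix]


def splice(text_features, line_features):
    """Stack text-box rows and line-segment rows into one vertex feature matrix."""
    dim = max(len(row) for row in text_features + line_features)
    return pad_rows(text_features, dim) + pad_rows(line_features, dim)


first = [[1.0, 2.0, 3.0]]                # N_t = 1, D_t = 3
second = [[4.0, 5.0], [6.0, 7.0]]        # N_l = 2, D_l = 2
third = splice(first, second)            # 3 vertices, common dimension 3
```

A learned linear projection per vertex type would normally replace the zero padding so that text and line features share one embedding space.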
  • the preset model may be set by the user or the system defaults, and the preset model may be at least one of the following: a Transformer model, a neural network model, etc., which are not limited here.
  • the vertex feature may be at least one of the following: vertex position, vertex number, vertex vector, etc., which are not limited herein.
  • the third feature vector is input into a preset model to obtain vertex features, and based on the vertex features, a relationship adjacency matrix is determined, including the following steps:
  • when the model parameters satisfy a preset condition, outputting the relationship adjacency matrix based on the optimized preset model.
  • The preset condition may be set by the user or by system default. For example, the preset condition is that the preset model converges, or that the recognition accuracy of the preset model reaches a specified threshold, which can likewise be set by the user or by system default.
  • In a specific implementation, the electronic device can input the third feature vector into the preset model to obtain the vertex features, and apply bilinear multiplication to the vertex features to obtain an initial relationship adjacency matrix. It can also obtain a preset relationship adjacency matrix for the vertices, which can be drawn by the user or obtained by connecting the vertices according to a preset rule.
  • The preset rule can be set in advance or by system default, for example a straight-line connection or a connection along an arc of a given curvature, which is not limited here.
  • The electronic device can then optimize the model parameters of the preset model using the result of comparing the initial relationship adjacency matrix with the preset relationship adjacency matrix: the difference between the two matrices is computed to obtain a comparison result, and the comparison result is fed back into the preset model to optimize the model parameters.
  • The model parameters can be control parameters of each module of the preset model, and can be at least one of the following: convolution kernel size, offset, threshold, batch size, and so on, which are not limited here.
  • When the model parameters satisfy the preset condition, the third feature vector can be input into the optimized preset model to obtain and output the final relationship adjacency matrix.
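  • The bilinear multiplication step can be illustrated as scoring every vertex pair with h_i^T W h_j and thresholding the result. In this sketch the weight matrix `W`, the sigmoid squashing, and the 0.5 threshold are assumptions made for illustration; the application only states that bilinear multiplication on the vertex features yields the adjacency matrix, and training against the preset adjacency matrix is not shown.

```python
import math


def sigmoid(score):
    """Numerically stable logistic function."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def bilinear_scores(vertex_features, weight):
    """Return the matrix of scores h_i^T W h_j for all vertex pairs (i, j)."""
    dim = len(weight)
    scores = []
    for h_i in vertex_features:
        # Row vector h_i^T W.
        h_w = [sum(h_i[k] * weight[k][m] for k in range(dim)) for m in range(dim)]
        scores.append([sum(h_w[m] * h_j[m] for m in range(dim)) for h_j in vertex_features])
    return scores


def adjacency_from_scores(scores, threshold=0.5):
    """Binarize pair scores into a 0/1 relationship adjacency matrix."""
    return [[1 if sigmoid(s) > threshold else 0 for s in row] for row in scores]


# Hypothetical 2-dimensional vertex features for three vertices.
H = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
W = [[2.0, -2.0], [-2.0, 2.0]]
A = adjacency_from_scores(bilinear_scores(H, W))
```

With these toy values the first two vertices are predicted to be related to each other and the third only to itself. In practice one such matrix would be produced per relation type, each with its own learned weight matrix.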
  • In a specific implementation, the electronic device can restore the table structure through a post-processing algorithm, which can be implemented as follows: (1) solve the connected components of A0, where each connected component corresponds to one table; (2) for the vertices of each connected component, construct subgraphs from A1 and A2 and solve the maximal cliques of the subgraphs, where each clique corresponds to a row or a column of the table; (3) sort the rows and columns of the same table by position; (4) restore the cells from the sorted rows and columns. If multiple vertices belong to the same row and the same column, they belong to the same cell.
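  • The four post-processing steps can be sketched with a depth-first search for connected components and the basic Bron-Kerbosch algorithm for maximal cliques. The sketch assumes that A0, A1, and A2 are the same-table, same-row, and same-column adjacency matrices respectively, and that `centers` holds hypothetical (x, y) vertex centers used for sorting; these readings are assumptions, not definitions given in the text.

```python
def connected_components(adj):
    """Step 1: group vertex indices into connected components of adj."""
    seen, components = set(), []
    for start in range(len(adj)):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            v = stack.pop()
            component.append(v)
            for u in range(len(adj)):
                if adj[v][u] and u not in seen:
                    seen.add(u)
                    stack.append(u)
        components.append(sorted(component))
    return components


def maximal_cliques(adj, vertices):
    """Step 2: Bron-Kerbosch (no pivoting) on the subgraph induced by vertices."""
    neighbors = {v: {u for u in vertices if u != v and adj[v][u]} for v in vertices}
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(vertices), set())
    return cliques


def restore_tables(a0, a1, a2, centers):
    """Steps 3 and 4: sort rows and columns by position and intersect them into cells."""
    tables = []
    for component in connected_components(a0):
        rows = sorted(maximal_cliques(a1, component), key=lambda c: min(centers[v][1] for v in c))
        cols = sorted(maximal_cliques(a2, component), key=lambda c: min(centers[v][0] for v in c))
        tables.append([[[v for v in row if v in col] for col in cols] for row in rows])
    return tables


# Four text boxes laid out as a 2 x 2 grid.
centers = [(0, 0), (1, 0), (0, 1), (1, 1)]
a0 = [[1] * 4 for _ in range(4)]
a1 = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
a2 = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
tables = restore_tables(a0, a1, a2, centers)
```

Each restored table is a grid of cells, and each cell lists the vertices that share both that row and that column.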
  • the electronic device may perform restoration processing on the relational adjacency matrix through a table restoration algorithm to obtain a second table, where the second table is the table structure of the first table.
  • the table restoration algorithm may be at least one of the following: a neural network algorithm, an inverse transformation algorithm corresponding to a relational adjacency matrix, etc., which are not limited herein.
  • In the embodiments of the present application, the text boxes and line segments in the document can be input into a graph model as vertices, and the relationships between vertices can be predicted to determine whether two text boxes are in the same table, the same row, and the same column, from which the table structure is restored. This makes effective use of information such as the positional and logical relationships between text boxes, supports recognition of tables with incomplete frame lines, and achieves better table detection and structure recognition.
  • A Transformer structure can also be used for vertex feature representation. It makes effective use of global vertex information and has more layers and a stronger feature extraction capability, which can improve the efficiency of document entry and of digitizing paper text, and better promote informatization and digitalization in various industries.
  • when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the table image to obtain a target table image;
  • In step 102, performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector, can then be implemented as follows:
  • the preset image quality evaluation value may be preset or system default.
  • In a specific implementation, the electronic device can perform image quality evaluation on the table image to obtain its target image quality evaluation value.
  • Specifically, the electronic device can use at least one image quality evaluation index to evaluate the table image. The index can be at least one of the following: information entropy, sharpness, mean square error, mean gradient, edge retention, mean gray level, and so on, which are not limited here. When the resulting target image quality evaluation value is greater than the preset image quality evaluation value, step 102 is performed directly. Otherwise, when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, image enhancement processing is performed on the table image to obtain the target table image, and step 102 is then performed on the target table image. The image enhancement algorithm can be at least one of the following: grayscale stretching, histogram equalization, wavelet transform, neural network algorithm, and so on.
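  • The quality gate described above can be sketched with one index and one enhancement algorithm from the lists: mean gradient as the quality index and grayscale stretching as the enhancement. Representing the image as a list of rows of 0 to 255 gray values, and the specific preset value used in the example, are assumptions for illustration.

```python
def mean_gradient(gray):
    """Average absolute difference between horizontally and vertically adjacent pixels."""
    diffs = []
    for y, row in enumerate(gray):
        for x, pixel in enumerate(row):
            if x + 1 < len(row):
                diffs.append(abs(row[x + 1] - pixel))
            if y + 1 < len(gray):
                diffs.append(abs(gray[y + 1][x] - pixel))
    return sum(diffs) / len(diffs) if diffs else 0.0


def grayscale_stretch(gray):
    """Linearly stretch gray values so that they span the full 0..255 range."""
    pixels = [p for row in gray for p in row]
    low, high = min(pixels), max(pixels)
    if high == low:
        return [list(row) for row in gray]  # flat image: nothing to stretch
    return [[round((p - low) * 255 / (high - low)) for p in row] for row in gray]


def prepare_for_line_detection(gray, preset_quality):
    """Return the image unchanged if its quality exceeds the preset value, else enhance it."""
    if mean_gradient(gray) > preset_quality:
        return gray
    return grayscale_stretch(gray)


low_contrast = [[100, 110], [100, 110]]
enhanced = prepare_for_line_detection(low_contrast, preset_quality=20.0)
untouched = prepare_for_line_detection(low_contrast, preset_quality=1.0)
```

The low-contrast image is stretched only when its mean gradient does not exceed the preset value; line segment detection would then run on whichever image is returned.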
  • step A3 image enhancement processing is performed on the table image to obtain the target table image, which may include the following steps:
  • A34 Determine the target deviation degree between the target attribute parameter and the reference attribute parameter;
  • A35 Determine the target image enhancement parameter corresponding to the target deviation degree according to the mapping relationship between the preset deviation degree and the image enhancement parameter;
  • A36 Perform image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
  • In a specific implementation, since the recognition result can include the text box content and content information, the electronic device can obtain the text box content from the recognition result and determine its target attribute parameter. The target attribute parameter may be at least one of the following: average line width of the text box, number of lines of the text box, vertex positions of the text box, number of vertices of the text box, area of the text box, and so on, which are not limited here.
  • The reference attribute parameter can likewise be at least one of the following: average line width of the text box, number of lines of the text box, vertex positions of the text box, number of vertices of the text box, area of the text box, and so on, which are not limited here.
  • The reference attribute parameter may be a fixed attribute of the table. Since the first table exists in advance (for example, the attributes of a printed table are fixed when it is produced), the electronic device can directly obtain the reference attribute parameter of the first table. The reference attribute parameter can also be set by the user based on experience.
  • The electronic device can also pre-store the mapping relationship between the preset deviation degree and the image enhancement parameter. The image enhancement parameter can be at least one of the following: an image enhancement algorithm and the control parameters of that algorithm, which are not limited here.
  • Different image enhancement algorithms have different control parameters.
  • The control parameters determine the degree of image enhancement, and they are usually adjusted so that the image is neither over-enhanced nor under-enhanced.
  • Further, the electronic device can obtain the reference attribute parameter of the first table and determine the target deviation degree between the target attribute parameter and the reference attribute parameter, as follows:
  • Target deviation degree = (reference attribute parameter - target attribute parameter) / reference attribute parameter
  • The electronic device then determines the target image enhancement parameter corresponding to the target deviation degree according to the mapping relationship between the preset deviation degree and the image enhancement parameter, and performs image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
  • The image enhancement parameters are thus adjusted according to the deviation between the table image and the real table, so that the image is enhanced in a targeted manner and over-enhancement or under-enhancement of the table image is avoided.
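  • The deviation formula and the deviation-to-enhancement mapping can be sketched as follows. The band boundaries, the algorithm names, and the `gain` control parameter are hypothetical, and using the absolute value of the deviation for the lookup is an assumption; only the formula itself comes from the text.

```python
def target_deviation(reference, target):
    """Target deviation degree = (reference - target) / reference."""
    if reference == 0:
        raise ValueError("reference attribute parameter must be non-zero")
    return (reference - target) / reference


# Hypothetical pre-stored mapping: (upper bound on |deviation|, enhancement parameters).
ENHANCEMENT_TABLE = [
    (0.1, {"algorithm": "none", "gain": 1.0}),
    (0.3, {"algorithm": "grayscale_stretch", "gain": 1.2}),
    (float("inf"), {"algorithm": "histogram_equalization", "gain": 1.5}),
]


def target_enhancement_parameter(deviation):
    """Pick the enhancement parameters for the band that contains |deviation|."""
    magnitude = abs(deviation)
    for upper_bound, params in ENHANCEMENT_TABLE:
        if magnitude < upper_bound:
            return dict(params)
    raise ValueError("deviation must be a finite number")


# Example: printed rulings are 2.0 px wide, the photographed ones measure 1.5 px.
deviation = target_deviation(reference=2.0, target=1.5)
params = target_enhancement_parameter(deviation)
```

A larger deviation from the reference attribute selects a stronger enhancement, which is the targeted adjustment the text describes.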
  • determining the target image quality evaluation value of the table image may include the following steps:
  • A13 Perform feature point extraction on the target foreground image to obtain a target feature point set, and determine the number of target feature points according to the target feature point set;
  • A14 Determine the target feature point distribution density according to the number of target feature points and the target area;
  • A15 Determine the target image quality evaluation value corresponding to the target feature point distribution density according to the preset mapping relationship between the distribution density of feature points and the image quality evaluation value.
  • a preset mapping relationship between the distribution density of feature points and the image quality evaluation value may be pre-stored in the electronic device. The higher the distribution density of feature points, the better the image quality.
  • In a specific implementation, the electronic device can perform a background removal operation on the table image to obtain the target foreground image, since the background is either blank or a background pre-selected for the table (for example, a watermark or a background image). It can then determine the target area of the target foreground image and extract feature points from the target foreground image to obtain the target feature point set. The feature point extraction algorithm can be at least one of the following: Harris corner detection, scale-invariant feature transform (SIFT), neural network algorithm, and so on, which are not limited here. The electronic device can then determine the number of target feature points from the target feature point set, and determine the target feature point distribution density from the number of target feature points and the target area, that is:
  • Target feature point distribution density = number of target feature points / target area
  • Finally, the electronic device can determine the target image quality evaluation value corresponding to the target feature point distribution density according to the preset mapping relationship between feature point distribution density and image quality evaluation value. Because the background is removed and only the foreground part of the image is evaluated, and because tables contain a lot of blank space, evaluating quality from the feature points within the contour gives an accurate image quality evaluation for table images.
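  • The density formula and its mapping to a quality value can be sketched with a piecewise-linear calibration curve. The calibration points and the use of linear interpolation are assumptions; the text only states that a preset mapping exists and that a higher density corresponds to better quality.

```python
# Hypothetical calibration points: (feature point density, image quality evaluation value).
QUALITY_CURVE = [(0.0, 0.0), (0.5, 60.0), (1.0, 100.0)]


def feature_point_density(num_points, area):
    """Distribution density = number of target feature points / target area."""
    if area <= 0:
        raise ValueError("target area must be positive")
    return num_points / area


def quality_from_density(density):
    """Interpolate linearly between calibration points, clamping at both ends."""
    if density <= QUALITY_CURVE[0][0]:
        return QUALITY_CURVE[0][1]
    for (d0, q0), (d1, q1) in zip(QUALITY_CURVE, QUALITY_CURVE[1:]):
        if density <= d1:
            return q0 + (density - d0) / (d1 - d0) * (q1 - q0)
    return QUALITY_CURVE[-1][1]


density = feature_point_density(num_points=25, area=100)
quality = quality_from_density(density)
```

The resulting value would then be compared with the preset image quality evaluation value to decide whether enhancement is needed.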
  • It can be seen that the table processing method described in the embodiment of the present application acquires the table image of the first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain the first feature vector; performs line segment detection on the table image to obtain a line segment detection result, and performs feature extraction on the line segment detection result to obtain the second feature vector; splices the first feature vector and the second feature vector to obtain the third feature vector; inputs the third feature vector into the preset model to obtain vertex features and determines the relationship adjacency matrix based on the vertex features; and performs restoration processing according to the relationship adjacency matrix to obtain a second table, which is the table structure of the first table.
  • Thus, first, the table image can be recognized as a whole to obtain an overall recognition result, for example the text box content and content information; second, line detection can also be performed, which is equivalent to local image recognition. The two sets of features are fused so that global vertex information is used effectively, and the model has more layers and a stronger feature extraction capability, so the table can be restored in depth and the accuracy of table restoration improved.
  • FIG. 2 is a schematic flowchart of a form processing method provided by an embodiment of the present application, which is applied to an electronic device. As shown in the figure, the form processing method includes:
  • 201 Acquire a table image of a first table, identify the table image, obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector.
  • It can be seen that the table processing method described in the embodiment of the present application acquires the table image of the first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain the first feature vector; determines the target image quality evaluation value of the table image; when the target image quality evaluation value is greater than the preset image quality evaluation value, performs line segment detection on the table image to obtain a line segment detection result, and performs feature extraction on the line segment detection result to obtain the second feature vector; splices the first feature vector and the second feature vector to obtain the third feature vector; inputs the third feature vector into the preset model to obtain vertex features and determines the relationship adjacency matrix based on the vertex features; and performs restoration processing according to the relationship adjacency matrix to obtain a second table, which is the table structure of the first table.
  • Thus, first, the table image can be recognized as a whole to obtain an overall recognition result.
  • FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • The electronic device includes a processor, a memory, a communication interface, and one or more programs. The one or more programs are stored in the memory and are configured to be executed by the processor.
  • the above-mentioned program includes instructions for executing the following steps:
  • the restoration process is performed according to the relationship adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
  • It can be seen that the electronic device described in the embodiment of the present application acquires the table image of the first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain the first feature vector; performs line segment detection on the table image to obtain a line segment detection result, and performs feature extraction on the line segment detection result to obtain the second feature vector; splices the first feature vector and the second feature vector to obtain the third feature vector; inputs the third feature vector into the preset model to obtain vertex features and determines the relationship adjacency matrix based on the vertex features; and performs restoration processing according to the relationship adjacency matrix to obtain a second table, which is the table structure of the first table.
  • Thus, first, the table image can be recognized as a whole to obtain an overall recognition result, such as the text box content and content information; second, line detection can also be performed, which is equivalent to local image recognition. The two sets of features are fused so that global vertex information is used effectively, and the model has more layers and a stronger feature extraction capability. In this way, the table can be restored in depth and the accuracy of table restoration can be improved.
  • the above program includes instructions for performing the following steps:
  • the above program includes instructions for performing the following steps:
  • the operation is performed by bilinear multiplication to obtain an initial relationship adjacency matrix
  • a relation adjacency matrix is output based on the optimized preset model.
  • the above program includes instructions for performing the following steps:
  • according to the preset mapping relationship between environmental parameters and shooting parameters, the target shooting parameters corresponding to the target environmental parameters are determined
  • the first table is photographed according to the target photographing parameters to obtain the table image.
  • the above program also includes an instruction for performing the following steps:
  • image enhancement processing is performed on the table image to obtain a target table image
  • line segment detection is performed on the table image to obtain line segment detection
  • feature extraction is performed on the line segment detection result to obtain a second feature vector, including:
  • the above program includes instructions for performing the following steps:
  • according to the preset mapping relationship between deviation degrees and image enhancement parameters, the target image enhancement parameter corresponding to the target deviation degree is determined
  • the above program includes instructions for performing the following steps:
  • the target image quality evaluation value corresponding to the distribution density of the target feature points is determined according to the preset mapping relationship between the distribution density of feature points and the image quality evaluation value.
  • the electronic device includes corresponding hardware structures and/or software modules for executing each function.
  • the present application can be implemented in hardware or in the form of a combination of hardware and computer software, in combination with the units and algorithm steps of each example described in the embodiments provided herein. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.
  • the electronic device may be divided into functional units according to the foregoing method examples.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units. It should be noted that the division of units in the embodiments of the present application is illustrative, and is only a logical function division, and other division methods may be used in actual implementation.
  • FIG. 4 is a block diagram of functional units of the table processing apparatus 400 involved in the embodiment of the present application.
  • the table processing apparatus 400 includes: an acquisition unit 401, a detection unit 402, a splicing unit 403, an input unit 404 and a restoration unit 405, wherein,
  • the obtaining unit 401 is configured to obtain a table image of a first table, identify the table image, obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector;
  • the detection unit 402 is configured to perform line segment detection on the table image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector;
  • the splicing unit 403 is used for splicing the first eigenvector and the second eigenvector to obtain the third eigenvector;
  • the input unit 404 is configured to input the third feature vector into a preset model, obtain vertex features, and determine a relationship adjacency matrix based on the vertex features;
  • the restoration unit 405 is configured to perform restoration processing according to the relationship adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
  • The table processing apparatus described in the embodiments of the present application acquires a table image of a first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain a first feature vector. It performs line segment detection on the table image to obtain a line segment detection result, and performs feature extraction on the line segment detection result to obtain a second feature vector. It splices the first feature vector and the second feature vector to obtain a third feature vector, inputs the third feature vector into a preset model to obtain vertex features, determines a relationship adjacency matrix based on the vertex features, and performs restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
  • First, the table image can be recognized as a whole to obtain an overall recognition result, for example, text box content and content information. Second, lines can also be detected, which is equivalent to local image recognition. The features of the two are fused so that global vertex information is used effectively; in addition, the number of layers is deeper.
  • the detection unit 402 is specifically configured to:
  • the input unit 404 is specifically used for:
  • the operation is performed by bilinear multiplication to obtain an initial relationship adjacency matrix
  • a relation adjacency matrix is output based on the optimized preset model.
  • the acquiring unit 401 is specifically configured to:
  • according to the preset mapping relationship between environmental parameters and shooting parameters, the target shooting parameters corresponding to the target environmental parameters are determined
  • the first table is photographed according to the target photographing parameters to obtain the table image.
  • the device 400 is also specifically used for:
  • image enhancement processing is performed on the table image to obtain a target table image
  • line segment detection is performed on the table image to obtain line segment detection
  • feature extraction is performed on the line segment detection result to obtain a second feature vector, including:
  • the device 400 is specifically used for:
  • according to the preset mapping relationship between deviation degrees and image enhancement parameters, the target image enhancement parameter corresponding to the target deviation degree is determined
  • the apparatus 400 is specifically configured to:
  • the target image quality evaluation value corresponding to the distribution density of the target feature points is determined according to the preset mapping relationship between the distribution density of feature points and the image quality evaluation value.
  • Embodiments of the present application further provide a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps of any method described in the above method embodiments; the computer includes an electronic device.
  • the storage medium involved in the present application may be a computer-readable storage medium, which may be non-volatile or volatile.
  • Embodiments of the present application further provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps of any method described in the above method embodiments.
  • the computer program product may be a software installation package, and the computer includes an electronic device.
  • the disclosed apparatus may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • the division of the above-mentioned units is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical or other forms.
  • the above-mentioned units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the above-mentioned integrated units if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable memory.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product, and the computer software product is stored in a memory.
  • a computer device which may be a personal computer, an electronic device, or a network device, etc.
  • the aforementioned memory includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media that can store program code.

Abstract

The present application relates to image processing technology, and in particular to a table processing method and apparatus, and an electronic device and a storage medium. The method comprises: acquiring a table image of a first table, performing identification on the table image, so as to obtain an identification result, and performing feature extraction on the identification result, so as to obtain a first feature vector; performing line segment detection on the table image, so as to obtain a line segment detection result, and performing feature extraction on the line segment detection result, so as to obtain a second feature vector; splicing the first feature vector and the second feature vector, so as to obtain a third feature vector; inputting the third feature vector into a preset model, so as to obtain a vertex feature, and determining a relationship adjacency matrix on the basis of the vertex feature; and performing restoration processing according to the relationship adjacency matrix, so as to obtain a second table, wherein the second table has the table structure of the first table. By using the embodiments of the present application, table restoration precision can be improved.

Description

Table processing method, apparatus, electronic device, and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on December 23, 2020, with application number 202011538336.5 and entitled "Table Processing Method, Apparatus, Electronic Device, and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of image processing, and in particular to a table processing method, apparatus, electronic device, and storage medium.
Background
Optical Character Recognition (OCR) technology converts printed text in an image into a text format that a computer can process. It is widely used in scenarios such as data entry and verification and comparison, and has become a key link in the informatization and digitization of all sectors of the national economy. With the continuous development of big data and deep learning, OCR technology has made breakthrough progress; in recognizing scanned printed documents, a character recognition accuracy of more than 99% can usually be achieved.
In addition to solving text position detection and content recognition, an OCR system usually also needs to parse and restore the layout structure of a document. Because the table is the most common and most important layout structure, table parsing has become a key part of digitizing paper documents.
The inventors realized that table detection and structure recognition in the industry are currently based mostly on detecting table ruling lines: the row and column positions of the text are derived from the horizontal and vertical lines in the image, and the structure information of the whole table is then obtained. This approach works well for conventional tables with complete ruling lines, but it cannot accurately restore the structure of tables whose ruling lines are not visible, such as borderless tables, three-line tables, and four-line tables. Therefore, the problem of how to improve table restoration accuracy urgently needs to be solved.
Summary of the Invention
Embodiments of the present application provide a table processing method, apparatus, electronic device, and storage medium, which can improve table restoration accuracy.
In a first aspect, an embodiment of the present application provides a table processing method, the method comprising:
acquiring a table image of a first table, recognizing the table image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector;
performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
splicing the first feature vector and the second feature vector to obtain a third feature vector;
inputting the third feature vector into a preset model to obtain vertex features, and determining a relationship adjacency matrix based on the vertex features;
performing restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
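The splicing step above can be illustrated with a minimal sketch. It assumes, which the text at this point does not state, that each feature vector is stored as a matrix with one row per element (text box or line segment) and a shared feature width, and that splicing stacks the rows so that every row becomes one graph vertex. The function name `splice` and the sample values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def splice(first: Matrix, second: Matrix) -> Matrix:
    """Stack text-box rows and line-segment rows into one vertex matrix.

    Assumption: both inputs use the same feature width, so the rows can
    simply be concatenated one after another.
    """
    widths = {len(row) for row in first + second}
    if len(widths) > 1:
        raise ValueError("all rows must share the same feature dimension")
    # Copy the rows so the caller's matrices are not aliased.
    return [list(row) for row in first] + [list(row) for row in second]


# Two text boxes and one line segment, each described by 3 features.
first = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
second = [[0.7, 0.8, 0.9]]
third = splice(first, second)
```

Under this assumption the third feature vector has as many rows as there are text boxes and line segments combined, which is what a graph model needs if each row is to be treated as a vertex.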
In a second aspect, an embodiment of the present application provides a table processing apparatus, the apparatus comprising an acquisition unit, a detection unit, a splicing unit, an input unit, and a restoration unit, wherein:
the acquisition unit is configured to acquire a table image of a first table, recognize the table image to obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector;
the detection unit is configured to perform line segment detection on the table image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector;
the splicing unit is configured to splice the first feature vector and the second feature vector to obtain a third feature vector;
the input unit is configured to input the third feature vector into a preset model to obtain vertex features, and determine a relationship adjacency matrix based on the vertex features;
the restoration unit is configured to perform restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
In a third aspect, an embodiment of the present application provides an electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor to implement the above table processing method, the method comprising:
acquiring a table image of a first table, recognizing the table image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector;
performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
splicing the first feature vector and the second feature vector to obtain a third feature vector;
inputting the third feature vector into a preset model to obtain vertex features, and determining a relationship adjacency matrix based on the vertex features;
performing restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to implement the above table processing method, the method comprising:
acquiring a table image of a first table, recognizing the table image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector;
performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
splicing the first feature vector and the second feature vector to obtain a third feature vector;
inputting the third feature vector into a preset model to obtain vertex features, and determining a relationship adjacency matrix based on the vertex features;
performing restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
By implementing the embodiments of the present application, first, the table image can be recognized as a whole to obtain an overall recognition result, for example, text box content and content information. Second, lines can also be detected, which is equivalent to local image recognition. The features of the two are fused so that global vertex information is used effectively; in addition, the number of layers is deeper and the feature extraction capability is stronger. In this way, the table can be deeply restored and the accuracy of table restoration can be improved.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a table processing method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of another table processing method provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 4 is a block diagram of the functional units of a table processing apparatus provided by an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", and the like in the description, the claims, and the above drawings of the present application are used to distinguish different objects, not to describe a specific order. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The present application may involve artificial intelligence technology; for example, relevant data may be acquired and processed based on artificial intelligence technology, such as determining vertex features based on artificial intelligence technology. Optionally, the technical solution of the present application can be applied to table processing scenarios in various fields, such as table processing in digital healthcare or in financial technology, to improve the accuracy of table restoration and thereby promote the construction of smart cities.
The electronic devices involved in the embodiments of the present application may include various handheld devices with image processing functions (such as mobile phones, tablet computers, and POS machines), scanners, micro-printers, desktop computers, vehicle-mounted devices, wearable devices (smart watches, smart bracelets, wireless headsets, augmented reality/virtual reality devices, smart glasses), AI robots, computing devices or other processing devices connected to a wireless modem, as well as various forms of user equipment (UE), mobile stations (MS), terminal devices, and so on. For convenience of description, the devices mentioned above are collectively referred to as electronic devices.
The embodiments of the present application are described in detail below.
Please refer to FIG. 1, which is a schematic flowchart of a table processing method provided by an embodiment of the present application. As shown in the figure, the method is applied to an electronic device and includes:
101. Acquire a table image of a first table, recognize the table image to obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector.
In the embodiments of the present application, the first table may be any table, for example, an Excel table, a table hand-drawn by a user, or a table in a Word file. In a specific implementation, the electronic device may photograph the first table to obtain the table image of the first table, and recognize the table image, mainly by using OCR technology, to obtain a recognition result. The recognition result may include text box content and content information. Feature extraction is then performed on the recognition result to obtain the first feature vector. The extracted features may be at least one of the following: vertex coordinates, center coordinates, width, height, and color of a text box, text character type statistics, text word vectors, sentence vectors, foreground and background colors, fonts, textures, and so on, which are not limited here. The feature extraction algorithm may be at least one of the following: Harris corner detection, the scale-invariant feature transform algorithm, a neural network algorithm, wavelet transform, and so on, which are not limited here. Each feature can then be represented in the form of a vector to obtain the first feature vector; that is, the first feature vector may include both features of the text-box dimension of the table and features of the content dimension of the table.
Specifically, the electronic device may perform OCR recognition on the entire document to obtain text box position and content information, and extract features to obtain an N_t × D feature vector, where N_t is the number of text boxes, D is the feature dimension, and t is a positive integer. The features may include at least one of the following: position features (for example, vertex coordinates, center coordinates, width, and height of a text box), language features (for example, text character type statistics, text word vectors, and sentence vectors), image features (for example, foreground and background colors, fonts, and textures), and so on, which are not limited here.
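As a rough illustration of how position features could be arranged into an N_t × D matrix, the sketch below derives vertex coordinates, center coordinates, width, and height from axis-aligned text boxes. The box representation `(x_min, y_min, x_max, y_max)`, the function names, and the choice of D = 8 are assumptions made for the example; language and image features are omitted.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_position_features(box: Box) -> List[float]:
    """Return corner coordinates, center coordinates, width, and height."""
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    center_x, center_y = (x0 + x1) / 2, (y0 + y1) / 2
    return [x0, y0, x1, y1, center_x, center_y, width, height]


def build_feature_matrix(boxes: List[Box]) -> List[List[float]]:
    """One row per text box: an N_t x D matrix with D = 8 in this sketch."""
    return [box_position_features(box) for box in boxes]


boxes = [(10.0, 20.0, 110.0, 50.0), (120.0, 20.0, 220.0, 50.0)]
features = build_feature_matrix(boxes)
```

In practice the language and image features named in the text would be appended to each row, which increases D but leaves the one-row-per-text-box layout unchanged.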
Optionally, in step 101 above, acquiring the table image of the first table may include the following steps:
11. Acquire target environmental parameters;
12. Determine, according to a preset mapping relationship between environmental parameters and shooting parameters, the target shooting parameters corresponding to the target environmental parameters;
13. Photograph the first table according to the target shooting parameters to obtain the table image.
The environmental parameters may be at least one of the following: ambient light brightness, ambient color temperature, shake parameters, temperature, humidity, weather, and so on, which are not limited here. The electronic device may acquire the environmental parameters through sensors, and the sensors may be at least one of the following: an ambient light sensor, a temperature sensor, a humidity sensor, a color temperature sensor, a shake detection sensor, a weather sensor, and so on, which are not limited here. The electronic device may integrate these sensors and use them to detect the environmental parameters. The shooting parameters may be at least one of the following: sensitivity (ISO), exposure time, flash brightness, flash operating frequency, flash color, anti-shake parameters, white balance parameters, and so on, which are not limited here. The preset mapping relationship between environmental parameters and shooting parameters may be stored in the electronic device in advance.
In a specific implementation, the electronic device may acquire the target environmental parameters, determine the corresponding target shooting parameters according to the preset mapping relationship between environmental parameters and shooting parameters, and photograph the first table according to the target shooting parameters to obtain the table image. In this way, shooting parameters suited to the environment can be obtained, which helps to obtain the best table image and to improve the efficiency of subsequent table restoration.
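Steps 11 to 13 amount to a table lookup. The sketch below is one possible form of such a preset mapping, assuming a single environmental parameter (ambient light brightness in lux) and two shooting parameters (ISO and exposure time). The thresholds, the values, and the names `ShootingParams`, `PRESET_MAPPING`, and `target_shooting_params` are illustrative assumptions, not values from the source.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ShootingParams:
    iso: int
    exposure_ms: float


# Illustrative preset mapping: (upper brightness bound in lux, parameters).
# Entries are ordered by bound; the last one covers all brighter scenes.
PRESET_MAPPING: List[Tuple[float, ShootingParams]] = [
    (50.0, ShootingParams(iso=1600, exposure_ms=66.0)),
    (500.0, ShootingParams(iso=400, exposure_ms=16.0)),
    (float("inf"), ShootingParams(iso=100, exposure_ms=4.0)),
]


def target_shooting_params(ambient_lux: float) -> ShootingParams:
    """Return the preset parameters for the first range containing the input."""
    if ambient_lux < 0:
        raise ValueError("ambient brightness cannot be negative")
    for upper_bound, params in PRESET_MAPPING:
        if ambient_lux <= upper_bound:
            return params
    # Only reachable for inputs that compare false everywhere, such as NaN.
    raise ValueError("ambient brightness is not a valid number")
```

A real device would likely key the mapping on several environmental parameters at once; a range table like this one is simply the easiest form to inspect and tune.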
102. Perform line segment detection on the table image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector.
The electronic device may perform line segment detection on the table image using at least one of the following algorithms: Hough transform, neural network algorithm, ecological algorithm, and so on, which is not limited here. The feature extraction algorithm may be at least one of the following: Harris corner detection, scale-invariant feature transform (SIFT), neural network algorithm, wavelet transform, and so on, which is not limited here.
In a specific implementation, for example, the electronic device may perform line segment detection on the table image to obtain the coordinate information of all line segments in the table image, and extract features to obtain an N_l × D feature vector, where N_l is the number of line segments and D is the feature dimension. The features may include at least one of the following: position features (for example, the start point, end point, and midpoint coordinates of a line segment, its length, and its angle) and image features (for example, foreground and background color, line style, and thickness), which is not limited here. The electronic device may then process these features into the second feature vector, for example by converting the different features to the same dimension and concatenating them. The line segment detection algorithm may be at least one of the following: Hough transform, the Line Segment Detector (LSD) algorithm, a neural network algorithm, and so on, which is not limited here.
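As a minimal sketch of the position features named above, the following computes start point, end point, midpoint, length, and angle for each detected segment and stacks them into an N_l × D list of rows. The function names and the choice D = 8 are assumptions for illustration; image features such as color or thickness are omitted.

```python
import math


def segment_features(x1, y1, x2, y2):
    """Position features of one segment: start, end, midpoint, length, angle."""
    length = math.hypot(x2 - x1, y2 - y1)
    angle = math.atan2(y2 - y1, x2 - x1)  # radians, in (-pi, pi]
    return [x1, y1, x2, y2, (x1 + x2) / 2, (y1 + y2) / 2, length, angle]


def feature_matrix(segments):
    """Stack per-segment features into N_l rows of D = 8 values each."""
    return [segment_features(*s) for s in segments]


# Two example segments given as (x1, y1, x2, y2) pixel coordinates.
features = feature_matrix([(0, 0, 3, 4), (0, 0, 10, 0)])
```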
Optionally, in step 102 above, performing line segment detection on the table image to obtain the line segment detection result may include the following steps:
21. Determine the boundary contour of the table image;
22. Perform texture extraction on the image within the boundary contour to obtain P textures, where P is an integer greater than 1;
23. Screen the P textures to obtain Q textures, where Q is an integer greater than 1 and less than P;
24. Perform line segment detection on the Q textures to obtain the line segment detection result.
In a specific implementation, the electronic device may determine the boundary contour of the table image. The boundary contour may be set by the user, or the table image may be coarsely recognized (for example, recognition proceeds from the periphery inward, only the peripheral contour is recognized, and recognition stops once the peripheral contour is found), with the outermost contour taken as the boundary contour. The electronic device may then perform texture extraction on the image within the boundary contour to obtain P textures, and screen the P textures to obtain Q textures. For example, it may check whether the length of each of the P textures is within a preset length range, or whether the width of each texture is within a preset width range, or whether the bending angle of each texture is within a preset angle range, and so on. The preset length range, preset width range, and preset angle range may each be set by the user or default to system values. Finally, line segment detection is performed on the Q textures to obtain the line segment detection result, which reduces the probability of misrecognition.
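The screening in step 23 can be sketched as a range filter. The text lists the length, width, and bending-angle checks as alternatives; this sketch applies all three together, which is an assumption, and the `Texture` class and the default ranges are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Texture:
    length: float    # pixels
    width: float     # pixels
    bend_deg: float  # bending angle in degrees


def screen_textures(textures, length_range=(20, 5000),
                    width_range=(1, 8), bend_range=(0, 5)):
    """Keep the textures whose length, width, and bend all fall in the preset ranges."""
    def within(value, bounds):
        return bounds[0] <= value <= bounds[1]

    return [t for t in textures
            if within(t.length, length_range)
            and within(t.width, width_range)
            and within(t.bend_deg, bend_range)]


# P = 3 candidate textures; the short one and the strongly bent one are dropped.
candidates = [Texture(100, 2, 0), Texture(5, 2, 0), Texture(100, 2, 30)]
kept = screen_textures(candidates)
```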
103. Concatenate the first feature vector and the second feature vector to obtain a third feature vector.
In a specific implementation, the electronic device may process the first feature vector and the second feature vector so that they have the same size in at least one dimension, and then concatenate the two to obtain the third feature vector.
In a specific implementation, for example, the electronic device may concatenate the first feature vector and the second feature vector to obtain the third feature vector, that is, an N × D feature vector E, where N = N_t + N_l.
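A minimal sketch of the concatenation follows. Zero-padding is used here to bring both inputs to a common D; that is a stand-in chosen for illustration (a learned projection would be equally consistent with the text), and the function names are hypothetical.

```python
def pad_rows(rows, dim):
    """Zero-pad every row to `dim` columns (stand-in for a learned projection)."""
    return [list(r) + [0.0] * (dim - len(r)) for r in rows]


def concat_features(text_feats, line_feats):
    """Return the N x D matrix E, with N = N_t + N_l rows."""
    dim = max(len(r) for r in text_feats + line_feats)
    return pad_rows(text_feats, dim) + pad_rows(line_feats, dim)


# N_t = 1 text-box row of width 2, N_l = 2 line-segment rows of width 3.
E = concat_features([[1.0, 2.0]], [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]])
```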
104. Input the third feature vector into a preset model to obtain vertex features, and determine a relation adjacency matrix based on the vertex features.
The preset model may be set by the user or default to a system value, and may be at least one of the following: a Transformer model, a neural network model, and so on, which is not limited here. The vertex features may be at least one of the following: vertex position, number of vertices, vertex vector, and so on, which is not limited here.
In a specific implementation, the electronic device may input the feature vector E into the Transformer model to obtain the vertex feature representation X = Transformer(E), where X has shape N × D. Based on the vertex features X, the relation adjacency matrix A = X W X^T may further be obtained by bilinear multiplication, where W may have shape R × D × D, A may have shape R × N × N, and R is the number of relations. Three relations may be predicted here (R = 3): two vertices are in the same table, the same row, or the same column, denoted A0, A1, and A2 respectively. For training, a target relation adjacency matrix A_gt may be labeled manually; the vertex feature vector E is input to obtain the predicted relation adjacency matrix A, the prediction A is compared with the labeled answer A_gt, and the model parameters may be optimized using gradient descent. Optimization is iterated until model training is complete.
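The bilinear step A = X W X^T can be sketched in plain Python to make the shapes concrete: for each relation r, A[r] = X · W[r] · X^T. The tiny X and W below are made-up values, and the helper names are assumptions; the Transformer that would produce X is not modelled.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def relation_scores(X, W):
    """A[r] = X @ W[r] @ X^T; X is N x D, W is R x D x D, result is R x N x N."""
    Xt = transpose(X)
    return [matmul(matmul(X, Wr), Xt) for Wr in W]


# N = 3 vertices, D = 2 features, R = 3 relations (same table, same row, same column).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity: plain dot products
    [[0.0, 1.0], [1.0, 0.0]],  # swaps the two feature axes
    [[0.0, 0.0], [0.0, 0.0]],  # all-zero relation
]
A = relation_scores(X, W)
```

In practice the scores would typically be passed through a sigmoid and thresholded to obtain 0/1 adjacency entries; the text does not specify that step.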
Optionally, in step 104 above, inputting the third feature vector into the preset model to obtain vertex features and determining the relation adjacency matrix based on the vertex features includes the following steps:
41. Input the third feature vector into the preset model to obtain the vertex features;
42. Based on the vertex features, perform bilinear multiplication to obtain an initial relation adjacency matrix;
43. Obtain a preset relation adjacency matrix predicted based on the vertex features;
44. Optimize the model parameters of the preset model using the comparison result between the initial relation adjacency matrix and the preset relation adjacency matrix;
45. When the model parameters satisfy a preset condition, output the relation adjacency matrix based on the optimized preset model.
The preset condition may be set by the user or default to a system value. For example, the preset condition may be that the preset model converges, or that the recognition accuracy of the preset model reaches a specified threshold, which may itself be set by the user or default to a system value. The electronic device may input the third feature vector into the preset model to obtain the vertex features, perform bilinear multiplication based on the vertex features to obtain the initial relation adjacency matrix, and obtain the preset relation adjacency matrix predicted based on the vertex features. The preset relation adjacency matrix may be drawn by hand by the user, or obtained by connecting the vertices according to a preset rule; the preset rule may be set in advance or default to a system value, for example connection by straight lines or connection along arcs of a certain curvature, which is not limited here. The electronic device may then optimize the model parameters of the preset model using the comparison result between the initial relation adjacency matrix and the preset relation adjacency matrix. Specifically, it may compare the differences between the two matrices to obtain the comparison result and feed that result back into the preset model to optimize the model parameters. The model parameters may be control parameters of the modules of the preset model, and may be at least one of the following: convolution kernel size, bias, threshold, batch size, and so on, which is not limited here. When the model parameters satisfy the preset condition, the relation adjacency matrix is output based on the optimized preset model. Specifically, the third feature vector may be input into the optimized preset model to obtain the final relation adjacency matrix.
105. Perform restoration processing according to the relation adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
After obtaining the relation adjacency matrix A, the electronic device may restore the table structure through a post-processing algorithm. A specific implementation may be as follows: (1) solve for the connected components of A0, each connected component corresponding to one table; (2) for the vertices in each connected component, construct subgraphs from A1 and A2 and solve for the maximal cliques of each subgraph, each clique corresponding to one row or one column of the table; (3) sort the rows and columns of the same table by position; (4) restore the cells from the sorted rows and columns, where multiple vertices that belong to both the same row and the same column belong to the same cell. The electronic device may perform restoration processing on the relation adjacency matrix through a table restoration algorithm to obtain the second table, where the second table is the table structure of the first table. The table restoration algorithm may be at least one of the following: a neural network algorithm, an inverse transformation algorithm corresponding to the relation adjacency matrix, and so on, which is not limited here.
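Post-processing steps (1) and (2) can be sketched with a depth-first search for connected components and the basic Bron-Kerbosch algorithm for maximal cliques. Both are standard choices assumed here, since the text does not name specific algorithms; the helper names and the 5-vertex example (a 2 × 2 table plus one vertex in a separate table) are made up for illustration.

```python
def connected_components(adj):
    """adj: N x N 0/1 matrix. Returns a list of sorted vertex lists."""
    n, seen, comps = len(adj), set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in range(n):
                if adj[u][v] and v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps


def maximal_cliques(adj, vertices):
    """Bron-Kerbosch without pivoting, restricted to `vertices`."""
    nbrs = {u: {v for v in vertices if v != u and adj[u][v]} for u in vertices}
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & nbrs[v], x & nbrs[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(vertices), set())
    return sorted(found)


def sym(n, pairs):
    """Symmetric 0/1 matrix with a unit diagonal and the given edges."""
    m = [[int(i == j) for j in range(n)] for i in range(n)]
    for i, j in pairs:
        m[i][j] = m[j][i] = 1
    return m


# Vertices 0..3 form a 2 x 2 table; vertex 4 belongs to a different table.
A0 = sym(5, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])  # same table
A1 = sym(5, [(0, 1), (2, 3)])                                   # same row
A2 = sym(5, [(0, 2), (1, 3)])                                   # same column

tables = connected_components(A0)
rows = maximal_cliques(A1, tables[0])
cols = maximal_cliques(A2, tables[0])
```

Steps (3) and (4) would then sort `rows` and `cols` by vertex position and intersect them to recover cells; they are left out because they depend on the position features.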
In a specific implementation, the embodiments of this application may input the text boxes and line segments in a document as vertices into a graph model and predict the relations between vertices, obtaining whether two text boxes are in the same table, the same row, or the same column, and thereby restore the table structure. This makes effective use of information such as the positional relations and logical relations between text boxes, supports recognition of tables with incomplete ruling lines, and achieves better table detection and structure recognition. In addition, a Transformer structure may be used for the vertex feature representation, which makes effective use of global vertex information and, with deeper layers, has stronger feature extraction capability. This not only improves the efficiency of document entry and the digitization of paper text, but also better promotes informatization and digitalization across industries.
Optionally, between step 101 and step 102 above, the following steps may also be included:
A1. Determine a target image quality evaluation value of the table image;
A2. When the target image quality evaluation value is greater than a preset image quality evaluation value, execute the step of performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector;
or,
A3. When the target image quality evaluation value is less than or equal to the preset image quality evaluation value, perform image enhancement processing on the table image to obtain a target table image;
In that case, step 102 above, performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector, may be implemented as follows:
Perform line segment detection on the target table image to obtain the line segment detection result, and perform feature extraction on the line segment detection result to obtain the second feature vector.
In the embodiments of this application, the preset image quality evaluation value may be set in advance or default to a system value. The electronic device may perform image quality evaluation on the table image to obtain the target image quality evaluation value of the table image. Specifically, the electronic device may use at least one image quality evaluation index to evaluate the table image, and the image quality evaluation index may be at least one of the following: information entropy, sharpness, mean square error, average gradient, edge preservation, average gray level, and so on, which is not limited here. When the target image quality evaluation value is greater than the preset image quality evaluation value, step 102 is executed. Otherwise, when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, image enhancement processing is performed on the table image to obtain the target table image, and step 102 is then executed on the target table image. The image enhancement algorithm used for the image enhancement processing may be at least one of the following: grayscale stretching, histogram equalization, wavelet transform, neural network algorithm, and so on, which is not limited here. This helps to ensure the accuracy of line detection.
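The quality gate of steps A1 to A3 can be sketched with two of the options the text lists: information entropy as the evaluation index and grayscale stretching as the enhancement. Picking these two, representing the image as a list of rows of 0..255 integers, and the threshold of 1.0 bit are all assumptions for illustration.

```python
import math
from collections import Counter


def entropy(gray):
    """Shannon entropy (in bits) of the gray-level histogram."""
    pixels = [p for row in gray for p in row]
    n = len(pixels)
    return -sum(c / n * math.log2(c / n) for c in Counter(pixels).values())


def stretch(gray):
    """Linear grayscale stretching to the full 0..255 range."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        return [row[:] for row in gray]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in gray]


def image_for_step_102(gray, preset_score):
    """Return the image unchanged if its score exceeds the preset, else enhance it."""
    return gray if entropy(gray) > preset_score else stretch(gray)


flat = [[100, 100], [100, 110]]  # low-contrast example, entropy below 1 bit
rich = [[0, 64], [128, 255]]     # four distinct levels, entropy of 2 bits
enhanced = image_for_step_102(flat, 1.0)
unchanged = image_for_step_102(rich, 1.0)
```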
Further, optionally, in step A3 above, performing image enhancement processing on the table image to obtain the target table image may include the following steps:
A31. Obtain the text box content in the recognition result;
A32. Determine a target attribute parameter of the text box content;
A33. Obtain a reference attribute parameter of the first table;
A34. Determine a target deviation degree between the target attribute parameter and the reference attribute parameter;
A35. Determine the target image enhancement parameter corresponding to the target deviation degree according to a preset mapping relationship between deviation degrees and image enhancement parameters;
A36. Perform image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
As described above, the recognition result may include text box content and content information, so the electronic device may obtain the text box content in the recognition result and determine the target attribute parameter of the text box content. In the embodiments of this application, the target attribute parameter may be at least one of the following: the average line width of the text box, the number of lines of the text box, the vertex positions of the text box, the number of vertices of the text box, the area of the text box, and so on, which is not limited here. The reference attribute parameter may likewise be at least one of the following: the average line width of the text box, the number of lines of the text box, the vertex positions of the text box, the number of vertices of the text box, the area of the text box, and so on, which is not limited here. The reference attribute parameter may be a fixed attribute of the table: because the first table exists beforehand (for example, as a printed table) and its table attributes are inherent settings, the electronic device may obtain the reference attribute parameter of the first table directly. The reference attribute parameter may also be set by the user based on experience. A preset mapping relationship between deviation degrees and image enhancement parameters may also be stored in the electronic device in advance. The image enhancement parameter may be at least one of the following: an image enhancement algorithm and the control parameters of an image enhancement algorithm, which is not limited here. Different image enhancement algorithms have different control parameters. The control parameters determine how strongly the image is enhanced, and adjusting them appropriately avoids over-enhancing or under-enhancing the image.
The electronic device may then obtain the reference attribute parameter of the first table and determine the target deviation degree between the target attribute parameter and the reference attribute parameter. The target deviation degree may be obtained as follows:
Target deviation degree = (reference attribute parameter - target attribute parameter) / reference attribute parameter
Further, the electronic device determines the target image enhancement parameter corresponding to the target deviation degree according to the preset mapping relationship between deviation degrees and image enhancement parameters, and finally performs image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image. The image enhancement parameter is thus adjusted according to the deviation between the table image and the real table, so that the image is enhanced in a targeted manner and over-enhancement or under-enhancement of the table image is avoided.
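The deviation formula and the lookup in step A35 can be sketched as follows. The formula is the one given in the text; the mapping table, the use of the deviation's magnitude, and the idea of a single "gain" control parameter are hypothetical choices made only to have something concrete to look up.

```python
def deviation(reference: float, target: float) -> float:
    """Target deviation degree = (reference - target) / reference."""
    return (reference - target) / reference


# Hypothetical preset mapping: upper bound on |deviation| -> enhancement gain.
ENHANCE_TABLE = [(0.05, 1.0), (0.20, 1.3), (0.50, 1.8)]
MAX_GAIN = 2.5  # used when the deviation exceeds every bound above


def enhancement_gain(dev: float) -> float:
    """Pick the gain of the first band whose bound covers |dev|."""
    for upper, gain in ENHANCE_TABLE:
        if abs(dev) <= upper:
            return gain
    return MAX_GAIN


# Example: the printed table has an average line width of 4.0 px,
# but the photographed text boxes show 3.0 px.
dev = deviation(4.0, 3.0)
gain = enhancement_gain(dev)
```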
Further, optionally, in step A1 above, determining the target image quality evaluation value of the table image may include the following steps:
A11. Perform a background removal operation on the table image to obtain a target foreground image;
A12. Determine the target region area of the target foreground image;
A13. Perform feature point extraction on the target foreground image to obtain a target feature point set, and determine the number of target feature points from the target feature point set;
A14. Determine the target feature point distribution density from the number of target feature points and the target region area;
A15. Determine the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relationship between feature point distribution densities and image quality evaluation values.
In a specific implementation, a preset mapping relationship between feature point distribution densities and image quality evaluation values may be stored in the electronic device in advance. The greater the feature point distribution density, the better the image quality.
In a specific implementation, the electronic device may perform a background removal operation on the table image to obtain the target foreground image, where the background is either the blank part or a background selected for the table in advance (for example, a watermark or a background image). The electronic device may then determine the target region area of the target foreground image and perform feature point extraction on the target foreground image to obtain the target feature point set. The feature point extraction algorithm may be at least one of the following: Harris corner detection algorithm, scale-invariant feature extraction algorithm, neural network algorithm, and so on, which is not limited here. The electronic device may then determine the number of target feature points from the target feature point set, and determine the target feature point distribution density from the number of target feature points and the target region area, namely:
Target feature point distribution density = number of target feature points / target region area
The electronic device may then determine the target image quality evaluation value corresponding to the target feature point distribution density according to the preset mapping relationship between feature point distribution densities and image quality evaluation values. Because the background is removed and the evaluation is based only on the foreground of the image, the evaluation fits the characteristics of tables: a table contains a large amount of blank content, so image quality is evaluated from the feature points within its contour, which allows accurate image quality evaluation of table images.
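Steps A14 and A15 reduce to a division followed by a lookup. In the sketch below the density formula is the one stated above, while the linear mapping to a 0..100 score and the parameter `full_score_density` are hypothetical stand-ins for the preset mapping, chosen so that a higher density yields a higher score.

```python
def feature_point_density(num_points: int, area: float) -> float:
    """Density = number of target feature points / target region area."""
    if area <= 0:
        raise ValueError("target region area must be positive")
    return num_points / area


def quality_score(density: float, full_score_density: float = 0.01) -> float:
    """Hypothetical monotone mapping from density to a 0..100 quality score."""
    return min(100.0, 100.0 * density / full_score_density)


# Example: 500 feature points in a 100000-pixel foreground region.
density = feature_point_density(500, 100_000)
score = quality_score(density)
```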
It can be seen that the table processing method described in the embodiments of this application acquires the table image of the first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain the first feature vector; performs line segment detection on the table image to obtain a line segment detection result, and performs feature extraction on that result to obtain the second feature vector; concatenates the first feature vector and the second feature vector to obtain the third feature vector; inputs the third feature vector into the preset model to obtain vertex features and determines the relation adjacency matrix based on the vertex features; and performs restoration processing according to the relation adjacency matrix to obtain the second table, which is the table structure of the first table. First, the table image can be recognized as a whole to obtain an overall recognition result, for example text box content and content information. Second, lines can also be detected, which is equivalent to local image recognition. The features of the two are fused, global vertex information is used effectively, and the deeper layers give stronger feature extraction capability, so the table can be restored in depth and the accuracy of table restoration is improved.
Please refer to FIG. 2, which is a schematic flowchart of a table processing method provided by an embodiment of this application and applied to an electronic device. As shown in the figure, the table processing method includes:
201. Acquire a table image of a first table, recognize the table image to obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector.
202. Determine a target image quality evaluation value of the table image.
203. When the target image quality evaluation value is greater than a preset image quality evaluation value, perform line segment detection on the table image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector.
204. Concatenate the first feature vector and the second feature vector to obtain a third feature vector.
205. Input the third feature vector into a preset model to obtain vertex features, and determine a relation adjacency matrix based on the vertex features.
206. Perform restoration processing according to the relation adjacency matrix to obtain a second table, where the second table is the table structure of the first table.
For the specific description of steps 201 to 206 above, refer to the corresponding steps described for FIG. 1, which are not repeated here.
It can be seen that the table processing method described in the embodiments of this application acquires the table image of the first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain the first feature vector; determines the target image quality evaluation value of the table image and, when it is greater than the preset image quality evaluation value, performs line segment detection on the table image to obtain a line segment detection result and performs feature extraction on that result to obtain the second feature vector; concatenates the first feature vector and the second feature vector to obtain the third feature vector; inputs the third feature vector into the preset model to obtain vertex features and determines the relation adjacency matrix based on the vertex features; and performs restoration processing according to the relation adjacency matrix to obtain the second table, which is the table structure of the first table. First, the table image can be recognized as a whole to obtain an overall recognition result, for example text box content and content information. Second, lines can also be detected, which is equivalent to local image recognition. The features of the two are fused, global vertex information is used effectively, and the deeper layers give stronger feature extraction capability, so the table can be restored in depth and the accuracy of table restoration is improved.
与上述实施例一致地,请参阅图3,图3是本申请实施例提供的一种电子设备的结构示意图,如图所示,该电子设备包括处理器、存储器、通信接口以及一个或多个程序,上述一个或多个程序被存储在上述存储器中,并且被配置由上述处理器执行,本申请实施例中,上述程序包括用于执行以下步骤的指令:Consistent with the above-mentioned embodiment, please refer to FIG. 3 , which is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in the figure, the electronic device includes a processor, a memory, a communication interface, and one or more A program, the above-mentioned one or more programs are stored in the above-mentioned memory, and are configured to be executed by the above-mentioned processor. In the embodiment of the present application, the above-mentioned program includes instructions for executing the following steps:
acquiring a table image of a first table, recognizing the table image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector;
performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
concatenating the first feature vector and the second feature vector to obtain a third feature vector;
inputting the third feature vector into a preset model to obtain vertex features, and determining a relationship adjacency matrix based on the vertex features;
performing restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
It can be seen that the electronic device described in the embodiments of the present application acquires a table image of a first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain a first feature vector. It performs line segment detection on the table image to obtain a line segment detection result and performs feature extraction on the line segment detection result to obtain a second feature vector. The first feature vector and the second feature vector are concatenated to obtain a third feature vector, the third feature vector is input into a preset model to obtain vertex features, a relationship adjacency matrix is determined based on the vertex features, and restoration processing is performed according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table. Thus, first, the table image is recognized as a whole to obtain an overall recognition result, for example text box content and content information. Second, lines are also detected, which is equivalent to local image recognition. The features of the two are fused and the global vertex information is then used effectively, with a deeper network and therefore a stronger feature extraction capability. As a result, the table can be deeply restored and the accuracy of table restoration is improved.
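As a rough illustration of the concatenation and restoration steps, the sketch below joins the two feature vectors per vertex and then groups vertices using a 0/1 relationship adjacency matrix. The function names are hypothetical, and treating the matrix as a "same row" relation whose connected components form table rows is an assumption; the application does not specify how the restoration processing is carried out.

```python
from typing import List


def concat_features(first: List[List[float]],
                    second: List[List[float]]) -> List[List[float]]:
    """Concatenate, per vertex, the recognition feature and the line feature."""
    if len(first) != len(second):
        raise ValueError("both feature sets must describe the same vertices")
    return [a + b for a, b in zip(first, second)]


def group_by_relation(adjacency: List[List[int]]) -> List[List[int]]:
    """Group vertices into connected components of a 0/1 relation matrix.

    Assumption: adjacency[i][j] == 1 means vertices i and j share a row.
    """
    n = len(adjacency)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, group = [start], []
        while stack:
            v = stack.pop()
            group.append(v)
            for w in range(n):
                if adjacency[v][w] and not seen[w]:
                    seen[w] = True
                    stack.append(w)
        groups.append(sorted(group))
    return groups
```

A column relation could be handled the same way with a second matrix; combining row and column groups would then give each cell its position in the restored structure.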
Optionally, in terms of performing line segment detection on the table image to obtain a line segment detection result, the programs include instructions for performing the following steps:
determining a boundary contour of the table image;
performing texture extraction on the image within the boundary contour to obtain P texture lines, P being an integer greater than 1;
screening the P texture lines to obtain Q texture lines, Q being an integer greater than 1 and less than P;
performing line segment detection on the Q texture lines to obtain the line segment detection result.
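The screening step from P texture lines down to Q can be sketched as follows. The application does not state the screening criterion, so this example assumes that each candidate is a segment `(x1, y1, x2, y2)` and that table rulings are long and nearly horizontal or vertical; `screen_lines`, `min_length`, and `max_skew_deg` are hypothetical names and thresholds.

```python
import math
from typing import List, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def screen_lines(candidates: List[Segment],
                 min_length: float = 20.0,
                 max_skew_deg: float = 5.0) -> List[Segment]:
    """Keep candidates that are long enough and close to horizontal or vertical."""
    kept = []
    for x1, y1, x2, y2 in candidates:
        length = math.hypot(x2 - x1, y2 - y1)
        if length < min_length:
            continue  # too short to be a table ruling
        # Angle to the horizontal axis, folded into the range 0..90 degrees.
        angle = math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))
        if angle <= max_skew_deg or angle >= 90.0 - max_skew_deg:
            kept.append((x1, y1, x2, y2))
    return kept
```

In practice the candidates would come from an edge or texture extractor run inside the boundary contour; the filter here only illustrates why Q is smaller than P.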
Optionally, in terms of inputting the third feature vector into a preset model to obtain vertex features and determining a relationship adjacency matrix based on the vertex features, the programs include instructions for performing the following steps:
inputting the third feature vector into the preset model to obtain the vertex features;
performing a bilinear multiplication operation on the vertex features to obtain an initial relationship adjacency matrix;
acquiring a preset relationship adjacency matrix predicted based on the vertex features;
optimizing model parameters of the preset model according to a comparison result between the initial relationship adjacency matrix and the preset relationship adjacency matrix;
when the model parameters satisfy a preset condition, outputting the relationship adjacency matrix based on the optimized preset model.
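One common reading of the bilinear multiplication step is to score every pair of vertices as $A_{ij} = \sigma(v_i^\top W v_j)$. The sketch below computes only this forward pass. The sigmoid, the weight matrix `weight`, and the 0.5 threshold are assumptions rather than details from the application, and the parameter optimization loop is omitted.

```python
import math
from typing import List


def bilinear_adjacency(vertices: List[List[float]],
                       weight: List[List[float]]) -> List[List[float]]:
    """Return A with A[i][j] = sigmoid(v_i^T W v_j) for every vertex pair."""
    def score(u: List[float], v: List[float]) -> float:
        return sum(u[a] * weight[a][b] * v[b]
                   for a in range(len(u)) for b in range(len(v)))

    return [[1.0 / (1.0 + math.exp(-score(u, v))) for v in vertices]
            for u in vertices]


def binarize(matrix: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn pairwise probabilities into a 0/1 relationship adjacency matrix."""
    return [[1 if p > threshold else 0 for p in row] for row in matrix]
```

During training, the probabilities would be compared with a reference adjacency matrix and the model parameters updated until the preset condition is met.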
Optionally, in terms of acquiring the table image of the first table, the programs include instructions for performing the following steps:
acquiring a target environment parameter;
determining a target shooting parameter corresponding to the target environment parameter according to a preset mapping relationship between environment parameters and shooting parameters;
photographing the first table according to the target shooting parameter to obtain the table image.
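The preset mapping between environment parameters and shooting parameters can be pictured as a lookup table. The sketch below assumes that the environment parameter is ambient brightness in lux and that the shooting parameters are ISO and exposure time; these choices and all of the numbers are illustrative only and do not come from the application.

```python
from bisect import bisect_right

# Hypothetical brightness breakpoints (lux) separating three lighting bands.
LUX_BREAKPOINTS = [50.0, 500.0]

# Hypothetical shooting parameters for each band: dim, indoor, bright.
SHOOTING_PARAMS = [
    {"iso": 800, "exposure_ms": 60},
    {"iso": 200, "exposure_ms": 20},
    {"iso": 100, "exposure_ms": 5},
]


def shooting_params_for(lux: float) -> dict:
    """Look up the shooting parameters mapped to the measured brightness."""
    return SHOOTING_PARAMS[bisect_right(LUX_BREAKPOINTS, lux)]
```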
Optionally, after acquiring the table image of the first table, recognizing the table image to obtain the recognition result, and performing feature extraction on the recognition result to obtain the first feature vector, and before performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector, the programs further include instructions for performing the following steps:
determining a target image quality evaluation value of the table image;
when the target image quality evaluation value is greater than a preset image quality evaluation value, performing the step of performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector;
or,
when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the table image to obtain a target table image, in which case performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector includes:
performing line segment detection on the target table image to obtain the line segment detection result, and performing feature extraction on the line segment detection result to obtain the second feature vector.
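The two branches above amount to a simple gate in front of line segment detection. In this minimal sketch, `image_for_line_detection` and the `enhance` callback are hypothetical names; the comparison mirrors the text, so an image whose score equals the threshold is still enhanced.

```python
from typing import Any, Callable


def image_for_line_detection(image: Any,
                             quality: float,
                             threshold: float,
                             enhance: Callable[[Any], Any]) -> Any:
    """Return the image that line segment detection should run on."""
    if quality > threshold:
        return image        # good enough: use the table image as is
    return enhance(image)   # otherwise use the enhanced target table image
```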
Optionally, in terms of performing image enhancement processing on the table image to obtain the target table image, the programs include instructions for performing the following steps:
acquiring text box content in the recognition result;
determining a target attribute parameter of the text box content;
acquiring a reference attribute parameter of the first table;
determining a target deviation degree between the target attribute parameter and the reference attribute parameter;
determining a target image enhancement parameter corresponding to the target deviation degree according to a preset mapping relationship between deviation degrees and image enhancement parameters;
performing image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
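The steps above can be sketched with a scalar attribute. The application does not say which attribute is compared or how the deviation degree is computed, so this example assumes a relative difference (for instance between a measured and an expected font height) and a hypothetical table that maps the deviation to a contrast gain.

```python
def deviation_degree(target: float, reference: float) -> float:
    """Relative deviation of the measured attribute from its reference value."""
    if reference == 0:
        raise ValueError("reference attribute must be non-zero")
    return abs(target - reference) / abs(reference)


# Hypothetical mapping: (upper bound of deviation, enhancement gain).
ENHANCEMENT_TABLE = [(0.1, 1.0), (0.3, 1.5), (float("inf"), 2.0)]


def enhancement_gain(deviation: float) -> float:
    """Pick the gain of the first band whose upper bound covers the deviation."""
    for upper, gain in ENHANCEMENT_TABLE:
        if deviation <= upper:
            return gain
    raise ValueError("deviation must be a non-negative number")
```

A larger deviation suggests the recognized text departs further from what the table should contain, so a stronger enhancement is selected.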
Optionally, in terms of determining the target image quality evaluation value of the table image, the programs include instructions for performing the following steps:
performing a background removal operation on the table image to obtain a target foreground image;
determining a target region area of the target foreground image;
performing feature point extraction on the target foreground image to obtain a target feature point set, and determining a target number of feature points according to the target feature point set;
determining a target feature point distribution density according to the target number of feature points and the target region area;
determining the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relationship between feature point distribution densities and image quality evaluation values.
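The density computation reduces to a division followed by a table lookup. In the sketch below the density is expressed in feature points per pixel, and the bands in `QUALITY_TABLE` are hypothetical values chosen only to show the shape of the preset mapping.

```python
def feature_point_density(num_points: int, area: float) -> float:
    """Feature points per unit of foreground area."""
    if area <= 0:
        raise ValueError("foreground area must be positive")
    return num_points / area


# Hypothetical mapping: (upper bound of density, image quality evaluation value).
QUALITY_TABLE = [(0.001, 0.2), (0.005, 0.6), (float("inf"), 0.9)]


def quality_score(density: float) -> float:
    """Map a feature point distribution density to a quality evaluation value."""
    for upper, score in QUALITY_TABLE:
        if density <= upper:
            return score
    raise ValueError("density must be a non-negative number")
```

For example, 400 feature points on a 100,000-pixel foreground give a density of 0.004, which falls in the middle band of this table.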
The foregoing mainly describes the solutions of the embodiments of the present application from the perspective of the method-side execution process. It can be understood that, to implement the above functions, the electronic device includes hardware structures and/or software modules corresponding to each function. Those skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments provided herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments of the present application, the electronic device may be divided into functional units according to the above method examples. For example, each functional unit may correspond to one function, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division of units in the embodiments of the present application is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation.
FIG. 4 is a block diagram of the functional units of a table processing apparatus 400 involved in the embodiments of the present application. The table processing apparatus 400 includes an acquisition unit 401, a detection unit 402, a splicing unit 403, an input unit 404, and a restoration unit 405, wherein:
the acquisition unit 401 is configured to acquire a table image of a first table, recognize the table image to obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector;
the detection unit 402 is configured to perform line segment detection on the table image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector;
the splicing unit 403 is configured to concatenate the first feature vector and the second feature vector to obtain a third feature vector;
the input unit 404 is configured to input the third feature vector into a preset model to obtain vertex features, and determine a relationship adjacency matrix based on the vertex features;
the restoration unit 405 is configured to perform restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
It can be seen that the table processing apparatus described in the embodiments of the present application acquires a table image of a first table, recognizes the table image to obtain a recognition result, and performs feature extraction on the recognition result to obtain a first feature vector. It performs line segment detection on the table image to obtain a line segment detection result and performs feature extraction on the line segment detection result to obtain a second feature vector. The first feature vector and the second feature vector are concatenated to obtain a third feature vector, the third feature vector is input into a preset model to obtain vertex features, a relationship adjacency matrix is determined based on the vertex features, and restoration processing is performed according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table. Thus, first, the table image is recognized as a whole to obtain an overall recognition result, for example text box content and content information. Second, lines are also detected, which is equivalent to local image recognition. The features of the two are fused and the global vertex information is then used effectively, with a deeper network and therefore a stronger feature extraction capability. As a result, the table can be deeply restored and the accuracy of table restoration is improved.
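The division into units 401 to 405 can be mirrored in software as five interchangeable callables wired together in a fixed order. This is only a structural sketch: the class name `TableProcessor` and its field names are hypothetical, and each unit is left abstract because the application allows the units to be realized in hardware, software, or a combination of both.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TableProcessor:
    acquire: Callable[[Any], Any]      # unit 401: image -> first feature vector
    detect: Callable[[Any], Any]       # unit 402: image -> second feature vector
    splice: Callable[[Any, Any], Any]  # unit 403: concatenate the two vectors
    infer: Callable[[Any], Any]        # unit 404: third vector -> adjacency matrix
    restore: Callable[[Any], Any]      # unit 405: adjacency matrix -> table structure

    def process(self, image: Any) -> Any:
        """Run the five units in the order given in the description."""
        first = self.acquire(image)
        second = self.detect(image)
        third = self.splice(first, second)
        adjacency = self.infer(third)
        return self.restore(adjacency)
```

Keeping each unit behind a plain callable makes it straightforward to merge or split units, which matches the statement that the division is merely logical.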
In a possible example, in terms of performing line segment detection on the table image to obtain a line segment detection result, the detection unit 402 is specifically configured to:
determine a boundary contour of the table image;
perform texture extraction on the image within the boundary contour to obtain P texture lines, P being an integer greater than 1;
screen the P texture lines to obtain Q texture lines, Q being an integer greater than 1 and less than P;
perform line segment detection on the Q texture lines to obtain the line segment detection result.
Optionally, in terms of inputting the third feature vector into a preset model to obtain vertex features and determining a relationship adjacency matrix based on the vertex features, the input unit 404 is specifically configured to:
input the third feature vector into the preset model to obtain the vertex features;
perform a bilinear multiplication operation on the vertex features to obtain an initial relationship adjacency matrix;
acquire a preset relationship adjacency matrix predicted based on the vertex features;
optimize model parameters of the preset model according to a comparison result between the initial relationship adjacency matrix and the preset relationship adjacency matrix;
when the model parameters satisfy a preset condition, output the relationship adjacency matrix based on the optimized preset model.
Optionally, in terms of acquiring the table image of the first table, the acquisition unit 401 is specifically configured to:
acquire a target environment parameter;
determine a target shooting parameter corresponding to the target environment parameter according to a preset mapping relationship between environment parameters and shooting parameters;
photograph the first table according to the target shooting parameter to obtain the table image.
Optionally, after acquiring the table image of the first table, recognizing the table image to obtain the recognition result, and performing feature extraction on the recognition result to obtain the first feature vector, and before performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector, the apparatus 400 is further specifically configured to:
determine a target image quality evaluation value of the table image;
when the target image quality evaluation value is greater than a preset image quality evaluation value, perform the step of performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector;
or,
when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, perform image enhancement processing on the table image to obtain a target table image, in which case performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector includes:
performing line segment detection on the target table image to obtain the line segment detection result, and performing feature extraction on the line segment detection result to obtain the second feature vector.
Optionally, in terms of performing image enhancement processing on the table image to obtain the target table image, the apparatus 400 is specifically configured to:
acquire text box content in the recognition result;
determine a target attribute parameter of the text box content;
acquire a reference attribute parameter of the first table;
determine a target deviation degree between the target attribute parameter and the reference attribute parameter;
determine a target image enhancement parameter corresponding to the target deviation degree according to a preset mapping relationship between deviation degrees and image enhancement parameters;
perform image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
Optionally, in terms of determining the target image quality evaluation value of the table image, the apparatus 400 is specifically configured to:
perform a background removal operation on the table image to obtain a target foreground image;
determine a target region area of the target foreground image;
perform feature point extraction on the target foreground image to obtain a target feature point set, and determine a target number of feature points according to the target feature point set;
determine a target feature point distribution density according to the target number of feature points and the target region area;
determine the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relationship between feature point distribution densities and image quality evaluation values.
It can be understood that the functions of the program modules of the table processing apparatus in this embodiment may be implemented according to the methods in the above method embodiments. For the specific implementation process, refer to the relevant descriptions of the above method embodiments, which are not repeated here.
An embodiment of the present application further provides a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps of any method described in the above method embodiments. The computer includes an electronic device.
Optionally, the storage medium involved in the present application may be a computer-readable storage medium, which may be non-volatile or volatile.
An embodiment of the present application further provides a computer program product. The computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps of any method described in the above method embodiments. The computer program product may be a software installation package, and the computer includes an electronic device.
It should be noted that, for brevity, the foregoing method embodiments are each described as a series of combined actions. However, those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application certain steps may be performed in other orders or simultaneously. In addition, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division of the above units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical or take other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the present application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Those of ordinary skill in the art can understand that all or some of the steps in the methods of the above embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable memory, and the memory may include a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (20)

  1. A table processing method, wherein the method comprises:
    acquiring a table image of a first table, recognizing the table image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector;
    performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
    concatenating the first feature vector and the second feature vector to obtain a third feature vector;
    inputting the third feature vector into a preset model to obtain vertex features, and determining a relationship adjacency matrix based on the vertex features;
    performing restoration processing according to the relationship adjacency matrix to obtain a second table, the second table being the table structure of the first table.
  2. The method according to claim 1, wherein performing line segment detection on the table image to obtain a line segment detection result comprises:
    determining a boundary contour of the table image;
    performing texture extraction on the image within the boundary contour to obtain P texture lines, P being an integer greater than 1;
    screening the P texture lines to obtain Q texture lines, Q being an integer greater than 1 and less than P;
    performing line segment detection on the Q texture lines to obtain the line segment detection result.
  3. The method according to claim 1 or 2, wherein inputting the third feature vector into a preset model to obtain vertex features and determining a relationship adjacency matrix based on the vertex features comprises:
    inputting the third feature vector into the preset model to obtain the vertex features;
    performing a bilinear multiplication operation on the vertex features to obtain an initial relationship adjacency matrix;
    acquiring a preset relationship adjacency matrix predicted based on the vertex features;
    optimizing model parameters of the preset model according to a comparison result between the initial relationship adjacency matrix and the preset relationship adjacency matrix;
    when the model parameters satisfy a preset condition, outputting the relationship adjacency matrix based on the optimized preset model.
  4. The method according to claim 1 or 2, wherein acquiring the table image of the first table comprises:
    acquiring target environment parameters;
    determining target shooting parameters corresponding to the target environment parameters according to a preset mapping relationship between environment parameters and shooting parameters;
    photographing the first table according to the target shooting parameters to obtain the table image.
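The preset mapping between environment parameters and shooting parameters could be as simple as a lookup table. The sketch below assumes the environment parameter is ambient illuminance in lux and the shooting parameters are ISO and exposure time; the parameter choice, the band boundaries, and every value in the table are hypothetical.

```python
import bisect

# Hypothetical preset mapping: illuminance bands (lux) -> shooting parameters.
LUX_BOUNDS = [50, 500]  # band edges: [0, 50), [50, 500), [500, inf)
SHOOTING_PARAMS = [
    {"iso": 800, "exposure_ms": 40},  # dim environment
    {"iso": 400, "exposure_ms": 20},  # typical indoor lighting
    {"iso": 100, "exposure_ms": 5},   # bright environment
]


def target_shooting_params(ambient_lux):
    """Look up the shooting parameters for the band containing ambient_lux."""
    return SHOOTING_PARAMS[bisect.bisect_right(LUX_BOUNDS, ambient_lux)]
```

For example, `target_shooting_params(300)` selects the middle band, and the camera would then be configured with those values before photographing the first table.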
  5. The method according to claim 1 or 2, wherein after acquiring the table image of the first table, recognizing the table image to obtain the recognition result, and performing feature extraction on the recognition result to obtain the first feature vector, and before performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector, the method further comprises:
    determining a target image quality evaluation value of the table image;
    when the target image quality evaluation value is greater than a preset image quality evaluation value, executing the step of performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector;
    or,
    when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the table image to obtain a target table image, in which case performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector comprises:
    performing line segment detection on the target table image to obtain the line segment detection result, and performing feature extraction on the line segment detection result to obtain the second feature vector.
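The branching in this claim amounts to a quality gate in front of line segment detection. A minimal sketch, in which `choose_detection_input` and the `enhance` callable are hypothetical names and the images are stand-in strings:

```python
def choose_detection_input(table_image, quality, threshold, enhance):
    """Return the image that line segment detection should run on.

    Above the threshold the original image is used unchanged; at or below
    it, the image is enhanced first and the enhanced copy is used instead.
    """
    if quality > threshold:
        return table_image
    return enhance(table_image)


# Stand-in enhancement function for illustration only.
def mark_enhanced(image):
    return image + "+enhanced"


good = choose_detection_input("scan", quality=0.8, threshold=0.6, enhance=mark_enhanced)
poor = choose_detection_input("scan", quality=0.6, threshold=0.6, enhance=mark_enhanced)
```

Note that a quality value exactly equal to the threshold takes the enhancement branch, matching the "less than or equal to" wording of the claim.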
  6. The method according to claim 5, wherein performing image enhancement processing on the table image to obtain the target table image comprises:
    obtaining text box content from the recognition result;
    determining a target attribute parameter of the text box content;
    obtaining a reference attribute parameter of the first table;
    determining a target deviation degree between the target attribute parameter and the reference attribute parameter;
    determining a target image enhancement parameter corresponding to the target deviation degree according to a preset mapping relationship between deviation degrees and image enhancement parameters;
    performing image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
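The deviation-driven choice of enhancement parameter can be sketched as below. The claim leaves the attribute parameter, the deviation formula, and the mapping unspecified, so this example assumes the attribute is a numeric value such as character height in pixels, defines the deviation as a relative difference, and maps it to a hypothetical enhancement strength through fixed bands.

```python
def deviation_degree(target_value, reference_value):
    """Relative deviation of the measured attribute from its reference."""
    return abs(target_value - reference_value) / reference_value


def enhancement_strength(deviation):
    """Map a deviation degree to a hypothetical enhancement strength."""
    # (upper bound of deviation band, strength) pairs, checked in order.
    for upper_bound, strength in ((0.1, 0.0), (0.3, 0.5)):
        if deviation <= upper_bound:
            return strength
    return 1.0


# Measured character height 9 px against a reference of 12 px.
deviation = deviation_degree(9.0, 12.0)
strength = enhancement_strength(deviation)
```

With these assumed bands, a deviation of 0.25 selects a moderate strength, while deviations above 0.3 select the strongest setting.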
  7. The method according to claim 5, wherein determining the target image quality evaluation value of the table image comprises:
    performing a background removal operation on the table image to obtain a target foreground image;
    determining a target region area of the target foreground image;
    performing feature point extraction on the target foreground image to obtain a target feature point set, and determining a target feature point quantity according to the target feature point set;
    determining a target feature point distribution density according to the target feature point quantity and the target region area;
    determining the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relationship between feature point distribution densities and image quality evaluation values.
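The density computation in this claim is the feature point quantity divided by the foreground area. The sketch below pairs it with an assumed mapping to a quality value: a linear ramp that saturates at 1.0. The `saturation_density` parameter and its default are hypothetical, and the feature point count is taken as given rather than computed by a detector.

```python
def feature_point_density(num_points, area):
    """Feature points per unit of foreground area."""
    if area <= 0:
        raise ValueError("foreground area must be positive")
    return num_points / area


def quality_from_density(density, saturation_density=0.01):
    """Map density to a quality value in [0, 1] (assumed linear ramp)."""
    return min(1.0, density / saturation_density)


# 250 feature points found in a foreground region of 50,000 pixels.
density = feature_point_density(250, 50_000)
quality = quality_from_density(density)
```

The resulting quality value would then be compared with the preset image quality evaluation value to decide whether enhancement is needed.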
  8. A table processing apparatus, comprising an acquisition unit, a detection unit, a splicing unit, an input unit and a restoration unit, wherein:
    the acquisition unit is configured to acquire a table image of a first table, recognize the table image to obtain a recognition result, and perform feature extraction on the recognition result to obtain a first feature vector;
    the detection unit is configured to perform line segment detection on the table image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector;
    the splicing unit is configured to splice the first feature vector and the second feature vector to obtain a third feature vector;
    the input unit is configured to input the third feature vector into a preset model to obtain vertex features, and determine a relationship adjacency matrix based on the vertex features;
    the restoration unit is configured to perform restoration processing according to the relationship adjacency matrix to obtain a second table, wherein the second table is the table structure of the first table.
  9. An electronic device, comprising a processor and a memory, wherein the memory is configured to store one or more programs, and the one or more programs are configured to be executed by the processor to implement a table processing method, the method comprising:
    acquiring a table image of a first table, recognizing the table image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector;
    performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
    splicing the first feature vector and the second feature vector to obtain a third feature vector;
    inputting the third feature vector into a preset model to obtain vertex features, and determining a relationship adjacency matrix based on the vertex features;
    performing restoration processing according to the relationship adjacency matrix to obtain a second table, wherein the second table is the table structure of the first table.
  10. The electronic device according to claim 9, wherein performing line segment detection on the table image to obtain the line segment detection result comprises:
    determining a boundary contour of the table image;
    performing texture extraction on the image within the boundary contour to obtain P texture lines, where P is an integer greater than 1;
    screening the P texture lines to obtain Q texture lines, where Q is an integer greater than 1 and less than P;
    performing line segment detection on the Q texture lines to obtain the line segment detection result.
  11. The electronic device according to claim 9 or 10, wherein inputting the third feature vector into the preset model to obtain vertex features, and determining the relationship adjacency matrix based on the vertex features, comprises:
    inputting the third feature vector into the preset model to obtain the vertex features;
    performing a bilinear multiplication operation based on the vertex features to obtain an initial relationship adjacency matrix;
    obtaining a preset relationship adjacency matrix predicted based on the vertex features;
    optimizing model parameters of the preset model according to a comparison result between the initial relationship adjacency matrix and the preset relationship adjacency matrix;
    when the model parameters satisfy a preset condition, outputting the relationship adjacency matrix based on the optimized preset model.
  12. The electronic device according to claim 9 or 10, wherein after acquiring the table image of the first table, recognizing the table image to obtain the recognition result, and performing feature extraction on the recognition result to obtain the first feature vector, and before performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector, the method further comprises:
    determining a target image quality evaluation value of the table image;
    when the target image quality evaluation value is greater than a preset image quality evaluation value, executing the step of performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector;
    or,
    when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the table image to obtain a target table image, in which case performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector comprises:
    performing line segment detection on the target table image to obtain the line segment detection result, and performing feature extraction on the line segment detection result to obtain the second feature vector.
  13. The electronic device according to claim 12, wherein performing image enhancement processing on the table image to obtain the target table image comprises:
    obtaining text box content from the recognition result;
    determining a target attribute parameter of the text box content;
    obtaining a reference attribute parameter of the first table;
    determining a target deviation degree between the target attribute parameter and the reference attribute parameter;
    determining a target image enhancement parameter corresponding to the target deviation degree according to a preset mapping relationship between deviation degrees and image enhancement parameters;
    performing image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
  14. The electronic device according to claim 12, wherein determining the target image quality evaluation value of the table image comprises:
    performing a background removal operation on the table image to obtain a target foreground image;
    determining a target region area of the target foreground image;
    performing feature point extraction on the target foreground image to obtain a target feature point set, and determining a target feature point quantity according to the target feature point set;
    determining a target feature point distribution density according to the target feature point quantity and the target region area;
    determining the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relationship between feature point distribution densities and image quality evaluation values.
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform a table processing method, the method comprising:
    acquiring a table image of a first table, recognizing the table image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector;
    performing line segment detection on the table image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
    splicing the first feature vector and the second feature vector to obtain a third feature vector;
    inputting the third feature vector into a preset model to obtain vertex features, and determining a relationship adjacency matrix based on the vertex features;
    performing restoration processing according to the relationship adjacency matrix to obtain a second table, wherein the second table is the table structure of the first table.
  16. The computer-readable storage medium according to claim 15, wherein performing line segment detection on the table image to obtain the line segment detection result comprises:
    determining a boundary contour of the table image;
    performing texture extraction on the image within the boundary contour to obtain P texture lines, where P is an integer greater than 1;
    screening the P texture lines to obtain Q texture lines, where Q is an integer greater than 1 and less than P;
    performing line segment detection on the Q texture lines to obtain the line segment detection result.
  17. The computer-readable storage medium according to claim 15 or 16, wherein inputting the third feature vector into the preset model to obtain vertex features, and determining the relationship adjacency matrix based on the vertex features, comprises:
    inputting the third feature vector into the preset model to obtain the vertex features;
    performing a bilinear multiplication operation based on the vertex features to obtain an initial relationship adjacency matrix;
    obtaining a preset relationship adjacency matrix predicted based on the vertex features;
    optimizing model parameters of the preset model according to a comparison result between the initial relationship adjacency matrix and the preset relationship adjacency matrix;
    when the model parameters satisfy a preset condition, outputting the relationship adjacency matrix based on the optimized preset model.
  18. The computer-readable storage medium according to claim 15 or 16, wherein after acquiring the table image of the first table, recognizing the table image to obtain the recognition result, and performing feature extraction on the recognition result to obtain the first feature vector, and before performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector, the method further comprises:
    determining a target image quality evaluation value of the table image;
    when the target image quality evaluation value is greater than a preset image quality evaluation value, executing the step of performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector;
    or,
    when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the table image to obtain a target table image, in which case performing line segment detection on the table image to obtain the line segment detection result and performing feature extraction on the line segment detection result to obtain the second feature vector comprises:
    performing line segment detection on the target table image to obtain the line segment detection result, and performing feature extraction on the line segment detection result to obtain the second feature vector.
  19. The computer-readable storage medium according to claim 18, wherein performing image enhancement processing on the table image to obtain the target table image comprises:
    obtaining text box content from the recognition result;
    determining a target attribute parameter of the text box content;
    obtaining a reference attribute parameter of the first table;
    determining a target deviation degree between the target attribute parameter and the reference attribute parameter;
    determining a target image enhancement parameter corresponding to the target deviation degree according to a preset mapping relationship between deviation degrees and image enhancement parameters;
    performing image enhancement processing on the table image according to the target image enhancement parameter to obtain the target table image.
  20. The computer-readable storage medium according to claim 18, wherein determining the target image quality evaluation value of the table image comprises:
    performing a background removal operation on the table image to obtain a target foreground image;
    determining a target region area of the target foreground image;
    performing feature point extraction on the target foreground image to obtain a target feature point set, and determining a target feature point quantity according to the target feature point set;
    determining a target feature point distribution density according to the target feature point quantity and the target region area;
    determining the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relationship between feature point distribution densities and image quality evaluation values.
PCT/CN2021/124416 2020-12-23 2021-10-18 Table processing method and apparatus, and electronic device and storage medium WO2022134771A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011538336.5A CN112668566A (en) 2020-12-23 2020-12-23 Form processing method and device, electronic equipment and storage medium
CN202011538336.5 2020-12-23

Publications (1)

Publication Number Publication Date
WO2022134771A1 true WO2022134771A1 (en) 2022-06-30

Family

ID=75408641

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124416 WO2022134771A1 (en) 2020-12-23 2021-10-18 Table processing method and apparatus, and electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN112668566A (en)
WO (1) WO2022134771A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668566A (en) * 2020-12-23 2021-04-16 深圳壹账通智能科技有限公司 Form processing method and device, electronic equipment and storage medium
CN113536951B (en) * 2021-06-22 2023-11-24 科大讯飞股份有限公司 Form identification method, related device, electronic equipment and storage medium
CN113378789B (en) * 2021-07-08 2023-09-26 京东科技信息技术有限公司 Cell position detection method and device and electronic equipment
CN114780645B (en) * 2022-03-29 2022-10-25 广东科能工程管理有限公司 Data classification processing method and system based on artificial intelligence and cloud platform
CN115983237B (en) * 2023-03-21 2023-06-13 北京亚信数据有限公司 Form type recognition model training, predicting and form data recommending method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2713313A2 (en) * 2012-09-28 2014-04-02 Omron Corporation Image processing system and image processing method
CN109858325A (en) * 2018-12-11 2019-06-07 科大讯飞股份有限公司 A kind of table detection method and device
CN110084117A (en) * 2019-03-22 2019-08-02 中国科学院自动化研究所 Document table line detecting method, system based on binary map segmented projection
CN110210409A (en) * 2019-06-04 2019-09-06 南昌市微轲联信息技术有限公司 Form frame-line detection method and system in table document
CN111079697A (en) * 2019-12-27 2020-04-28 湖南特能博世科技有限公司 Table extraction method and device and electronic equipment
CN111860257A (en) * 2020-07-10 2020-10-30 上海交通大学 Table identification method and system fusing multiple text features and geometric information
CN112668566A (en) * 2020-12-23 2021-04-16 深圳壹账通智能科技有限公司 Form processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112668566A (en) 2021-04-16

Similar Documents

Publication Publication Date Title
WO2022134771A1 (en) Table processing method and apparatus, and electronic device and storage medium
CN110046529B (en) Two-dimensional code identification method, device and equipment
WO2019169772A1 (en) Picture processing method, electronic apparatus, and storage medium
CN108304814B (en) Method for constructing character type detection model and computing equipment
US8634659B2 (en) Image processing apparatus, computer readable medium storing program, and image processing method
CN109241861B (en) Mathematical formula identification method, device, equipment and storage medium
US9330331B2 (en) Systems and methods for offline character recognition
CN114155546B (en) Image correction method and device, electronic equipment and storage medium
CN113139445A (en) Table recognition method, apparatus and computer-readable storage medium
CN110647882A (en) Image correction method, device, equipment and storage medium
CN111488881A (en) Method, device and storage medium for removing handwritten content in text image
CN111275139A (en) Handwritten content removal method, handwritten content removal device, and storage medium
CN110807110B (en) Image searching method and device combining local and global features and electronic equipment
CN110852311A (en) Three-dimensional human hand key point positioning method and device
US9235757B1 (en) Fast text detection
CN111985465A (en) Text recognition method, device, equipment and storage medium
CN113627428A (en) Document image correction method and device, storage medium and intelligent terminal device
CN112396050B (en) Image processing method, device and storage medium
CN112101386B (en) Text detection method, device, computer equipment and storage medium
CN113436222A (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN112597940B (en) Certificate image recognition method and device and storage medium
US9483834B1 (en) Object boundary detection in an image
CN113627423A (en) Circular seal character recognition method and device, computer equipment and storage medium
CN112507938A (en) Geometric feature calculation method, geometric feature recognition method and geometric feature recognition device for text primitives
CN111179287A (en) Portrait instance segmentation method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21908772

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.11.2023)