WO2023279847A1 - Cell position detection method and apparatus, and electronic device - Google Patents


Info

Publication number
WO2023279847A1
Authority
WO
WIPO (PCT)
Prior art keywords
cell
prediction
predicted
cells
adjacency matrix
Prior art date
Application number
PCT/CN2022/092571
Other languages
French (fr)
Chinese (zh)
Inventor
陶大程 (Tao Dacheng)
薛文元 (Xue Wenyuan)
Original Assignee
京东科技信息技术有限公司 (Jingdong Technology Information Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东科技信息技术有限公司
Publication of WO2023279847A1 publication Critical patent/WO2023279847A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • the present disclosure relates to the technical field of computer applications, and in particular to a cell position detection method, device, electronic equipment, computer-readable storage medium, computer program product and computer program.
  • tabular data has the advantages of simplicity, intuition, and ease of processing, and is widely used in people's office life.
  • with the development of artificial intelligence technology, the requirements for automatic recognition of tabular data are becoming ever higher.
  • the position of cells is automatically detected from the table image, so that information extraction can be performed based on the position of the cells.
  • the detected cell position information is incomplete and has poor robustness.
  • the embodiment of the first aspect of the present disclosure proposes a cell position detection method, which can use the predicted cells as nodes, obtain an adjacency matrix based on the first positions of the predicted cells, and then obtain the fusion node features of the predicted cells according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is then obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • the embodiment of the second aspect of the present disclosure provides a device for detecting the position of a cell.
  • the embodiment of the third aspect of the present disclosure provides an electronic device.
  • Embodiments of a fourth aspect of the present disclosure provide a computer-readable storage medium.
  • the embodiment of the fifth aspect of the present disclosure provides a computer program product.
  • the embodiment of the sixth aspect of the present disclosure provides a computer program.
  • the embodiment of the first aspect of the present disclosure proposes a cell position detection method, including: obtaining the first position of each predicted cell in a table image, wherein the first position represents the position of the area occupied by the predicted cell in the table image; obtaining an adjacency matrix of the table image according to the first positions, wherein each predicted cell in the table image is a node and the adjacency matrix represents the positional relationships between the predicted cells; obtaining the fusion node feature of any predicted cell according to the first position of that predicted cell and the adjacency matrix; and obtaining the second position of that predicted cell according to its fusion node feature, wherein the second position represents the row and/or column to which the predicted cell belongs.
  • the cell position detection method of the embodiment of the present disclosure can use the predicted cells as nodes and obtain an adjacency matrix based on their first positions, then obtain the fusion node features of the predicted cells according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is then obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • the first position includes at least one of the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell.
  • the obtaining the adjacency matrix of the table image according to the first position includes: determining the value of the corresponding element in the adjacency matrix based on the first position and the number of the predicted cell.
  • the determining the value of the corresponding element in the adjacency matrix based on the first position and the number of the predicted cell includes: obtaining the number n of predicted cells, and numbering the predicted cells consecutively from 1 to n, wherein n is an integer greater than 1; extracting from the first positions the abscissa and ordinate of the center points of the predicted cells numbered i and j, wherein 1 ≤ i ≤ n and 1 ≤ j ≤ n; obtaining the width and height of the table image, and an adjustment parameter; obtaining a first ratio of the difference between the abscissas of the center points of the predicted cells numbered i and j to the width, and determining the value of the row dimension of the element in row i and column j of the adjacency matrix based on the product of the first ratio and the adjustment parameter; and obtaining a second ratio of the difference between the ordinates of the center points of the predicted cells numbered i and j to the height, and determining the value of the column dimension of the element in row i and column j based on the product of the second ratio and the adjustment parameter.
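The adjacency-matrix construction above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name is hypothetical, the absolute value of the coordinate difference and the shape `(n, n, 2)` (row dimension and column dimension stored per element) are assumptions, and the adjustment parameter value is arbitrary.

```python
import numpy as np

def build_adjacency_matrix(centers, img_w, img_h, adjust=1.0):
    """Sketch of the adjacency-matrix construction described above.

    centers : (n, 2) array of predicted-cell center points (x, y).
    adjust  : the adjustment parameter; per the text it is positively
              correlated with the number of table rows/columns.
    Element (i, j) holds two values: a row-dimension entry from the
    horizontal distance between centers i and j divided by the image
    width, and a column-dimension entry from the vertical distance
    divided by the image height, each scaled by the parameter.
    """
    centers = np.asarray(centers, dtype=float)
    n = centers.shape[0]
    adj = np.zeros((n, n, 2))
    for i in range(n):
        for j in range(n):
            # first ratio: |x_i - x_j| / width, times the adjustment parameter
            adj[i, j, 0] = adjust * abs(centers[i, 0] - centers[j, 0]) / img_w
            # second ratio: |y_i - y_j| / height, times the adjustment parameter
            adj[i, j, 1] = adjust * abs(centers[i, 1] - centers[j, 1]) / img_h
    return adj
```

Because the ratios are normalized by the image size, the resulting entries are comparable across tables of different resolutions.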
  • the obtaining the fusion node feature of any predicted cell according to its first position and the adjacency matrix includes: obtaining the node feature of the predicted cell according to its first position; and inputting the node features and the adjacency matrix into a graph convolutional network (GCN), where the graph convolutional network fuses the node features with the adjacency matrix to generate the fusion node feature of the predicted cell.
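One GCN propagation step of the kind referenced above can be sketched as follows. This is a textbook graph-convolution update under stated assumptions, not the disclosed network: the function name, a single layer, a 2-D adjacency matrix, symmetric normalization with self-loops, and the ReLU activation are all illustrative choices.

```python
import numpy as np

def gcn_layer(node_feats, adj, weight):
    """One graph-convolution step (minimal sketch, not the patented model).

    node_feats : (n, d) node features built from the first positions.
    adj        : (n, n) adjacency matrix encoding pairwise cell positions.
    weight     : (d, d_out) learnable projection matrix.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                    # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt   # symmetric normalization
    # propagate features over the graph, project, and apply ReLU
    return np.maximum(a_norm @ node_feats @ weight, 0.0)
```

Stacking such layers lets each cell's fused feature absorb the positions of its neighbors, which is what lets the second-position head reason about rows and columns.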
  • the obtaining the node feature of any predicted cell according to its first position includes: linearly mapping the first position of the predicted cell to obtain its spatial feature; extracting the visual semantic feature of the predicted cell from the table image based on its first position; and splicing the spatial feature and the visual semantic feature of the predicted cell to obtain its node feature.
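The three steps above (linear mapping, visual-feature extraction, splicing) can be sketched like this. The function name, all tensor shapes, and the mean-pooling of feature-map pixels inside the cell are assumptions for illustration; the disclosure only specifies a linear mapping plus visual semantic features concatenated into one node feature.

```python
import numpy as np

def node_feature(first_pos, feat_map, w_spatial, b_spatial):
    """Build one cell's node feature: a linear mapping of the first
    position (spatial feature) spliced with visual-semantic features
    pooled from the cell's region of an image feature map.

    first_pos : (4,) sequence [cx, cy, w, h] of the predicted cell.
    feat_map  : (H, W, C) visual feature map of the table image.
    """
    # spatial feature: linear projection of the first position
    spatial = w_spatial @ np.asarray(first_pos, dtype=float) + b_spatial
    # visual-semantic feature: pool the feature-map pixels inside the cell
    cx, cy, w, h = first_pos
    x0, x1 = int(cx - w / 2), int(cx + w / 2)
    y0, y1 = int(cy - h / 2), int(cy + h / 2)
    visual = feat_map[y0:y1, x0:x1].mean(axis=(0, 1))
    # splice (concatenate) the two parts into the node feature
    return np.concatenate([spatial, visual])
```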
  • the extracting the visual semantic feature of any predicted cell from the table image based on its first position includes: determining, based on the first position of the predicted cell, the target pixels contained in the predicted cell from the pixels of the table image; and extracting the visual semantic features of the target pixels from the table image as the visual semantic feature of the predicted cell.
  • the obtaining the second position of any predicted cell according to its fusion node feature includes: obtaining, based on the fusion node feature, the predicted probability of the predicted cell at each candidate second position; and obtaining the maximum predicted probability among these, and determining the candidate second position corresponding to the maximum predicted probability as the second position of the predicted cell.
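This argmax-over-candidates decoding can be sketched as a softmax classifier head. The function name and the linear-plus-softmax head are assumptions standing in for whatever trained layer produces the per-candidate probabilities; only the "take the candidate with maximum predicted probability" step is from the text.

```python
import numpy as np

def predict_second_position(fused_feat, w_cls, candidates):
    """Pick the candidate second position with maximum probability.

    fused_feat : (d,) fusion node feature of one predicted cell.
    w_cls      : (k, d) classifier weights, one row per candidate.
    candidates : list of k candidate second positions (e.g. row labels).
    """
    logits = w_cls @ fused_feat
    probs = np.exp(logits - logits.max())   # stable softmax
    probs /= probs.sum()
    return candidates[int(np.argmax(probs))]
```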
  • the obtaining the second position of any predicted cell according to its fusion node feature includes: establishing a target vector for the predicted cell, the target vector including n dimensions, where n is the number of candidate second positions of the predicted cell; obtaining, based on the fusion node feature, the predicted probability that the value of each vector dimension is 0 or 1; obtaining the maximum of these two predicted probabilities and determining the corresponding value as the target value of that vector dimension; and obtaining the second position of the predicted cell based on the sum of the target values over the vector dimensions.
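The target-vector decoding can be sketched as below. The function name is hypothetical, and reading the second position directly off the sum assumes a unary (thermometer-style) encoding in which the leading dimensions are 1; the disclosure states only that the second position is obtained from the sum of the per-dimension target values.

```python
import numpy as np

def second_position_from_vector(prob_one):
    """Decode a second position from per-dimension binary predictions.

    prob_one : (n,) predicted probability that each target-vector
               dimension equals 1 (so 1 - prob_one is the probability
               of 0). Each dimension takes whichever value has the
               larger probability; the second position is read off the
               sum of the resulting target values.
    """
    prob_one = np.asarray(prob_one, dtype=float)
    target = (prob_one >= 0.5).astype(int)   # max of P(0), P(1) per dim
    return int(target.sum())
```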
  • the obtaining the first position of the predicted cell in the table image includes: extracting a detection frame of each predicted cell from the table image, and obtaining the first position of the predicted cell based on the detection frame.
  • the second position includes at least one of a starting row number, an ending row number, a starting column number, and an ending column number of the prediction cell.
  • the embodiment of the second aspect of the present disclosure proposes a cell position detection device, including: a first acquisition module configured to acquire the first position of each predicted cell in a table image, wherein the first position represents the position of the area occupied by the predicted cell in the table image; a second acquisition module configured to obtain an adjacency matrix of the table image according to the first positions, wherein each predicted cell in the table image is a node and the adjacency matrix represents the positional relationships between the predicted cells; a third acquisition module configured to obtain the fusion node feature of any predicted cell according to its first position and the adjacency matrix; and a fourth acquisition module configured to obtain the second position of any predicted cell according to its fusion node feature, wherein the second position represents the row and/or column to which the predicted cell belongs.
  • the cell position detection device of the embodiment of the disclosure can use the predicted cells as nodes and obtain an adjacency matrix based on their first positions, then obtain the fusion node features of the predicted cells according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is then obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • the first position includes at least one of the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell.
  • the second obtaining module is further configured to: determine the value of the corresponding element in the adjacency matrix based on the first position and the number of the predicted cell.
  • the second obtaining module is further configured to: obtain the number n of predicted cells, and number the predicted cells consecutively from 1 to n, wherein n is an integer greater than 1; extract from the first positions the abscissa and ordinate of the center points of the predicted cells numbered i and j, wherein 1 ≤ i ≤ n and 1 ≤ j ≤ n; obtain the width and height of the table image, and an adjustment parameter; obtain a first ratio of the difference between the abscissas of the center points of the predicted cells numbered i and j to the width, and determine the value of the row dimension of the element in row i and column j of the adjacency matrix based on the product of the first ratio and the adjustment parameter; and obtain a second ratio of the difference between the ordinates of the center points of the predicted cells numbered i and j to the height, and determine the value of the column dimension of the element in row i and column j based on the product of the second ratio and the adjustment parameter.
  • the third acquisition module includes: an acquisition unit configured to obtain the node feature of any predicted cell according to its first position; and a fusion unit configured to input the node features and the adjacency matrix into a graph convolutional network (GCN), where the graph convolutional network fuses the node features with the adjacency matrix to generate the fusion node feature of the predicted cell.
  • the acquisition unit includes: a mapping subunit configured to linearly map the first position of any predicted cell to obtain its spatial feature; an extraction subunit configured to extract the visual semantic feature of the predicted cell from the table image based on its first position; and a splicing subunit configured to splice the spatial feature and the visual semantic feature of the predicted cell to obtain its node feature.
  • the extraction subunit is further configured to: determine, based on the first position of any predicted cell, the target pixels contained in the predicted cell from the pixels of the table image; and extract the visual semantic features of the target pixels from the table image as the visual semantic feature of the predicted cell.
  • the fourth obtaining module is further configured to: obtain, based on the fusion node feature of any predicted cell, the predicted probability of the predicted cell at each candidate second position; and obtain the maximum predicted probability among these, and determine the corresponding candidate second position as the second position of the predicted cell.
  • the fourth acquisition module is further configured to: establish a target vector for any predicted cell, the target vector including n dimensions, where n is the number of candidate second positions of the predicted cell; obtain, based on the fusion node feature, the predicted probability that the value of each vector dimension is 0 or 1; obtain the maximum of these two predicted probabilities and determine the corresponding value as the target value of that vector dimension; and obtain the second position of the predicted cell based on the sum of the target values over the vector dimensions.
  • the first acquisition module is further configured to: extract the detection frame of each predicted cell from the table image, and obtain the first position of the predicted cell based on the detection frame.
  • the second position includes at least one of a starting row number, an ending row number, a starting column number, and an ending column number of the prediction cell.
  • the embodiment of the third aspect of the present disclosure proposes an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the cell position detection method described in the embodiment of the first aspect is implemented.
  • when the computer program stored in the memory is executed by the processor, the predicted cells can be used as nodes and an adjacency matrix can be obtained based on their first positions; the fusion node features of the predicted cells are then obtained according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • the embodiment of the fourth aspect of the present disclosure provides a computer-readable storage medium on which a computer program is stored.
  • the program is executed by a processor, the method for detecting the position of a cell as described in the embodiment of the first aspect is implemented.
  • when the computer program stored on the computer-readable storage medium of the embodiment of the present disclosure is executed by a processor, the predicted cells can be used as nodes and an adjacency matrix can be obtained based on their first positions; the fusion node features of the predicted cells are then obtained according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • the embodiment of the fifth aspect of the present disclosure proposes a computer program product, wherein the computer program product includes computer program code, and when the computer program code is run on a computer, the cell position detection method described in the embodiment of the first aspect is implemented.
  • when the computer program code of the computer program product is run on a computer, the predicted cells can be used as nodes and an adjacency matrix can be obtained based on their first positions; the fusion node features of the predicted cells are then obtained according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • the embodiment of the sixth aspect of the present disclosure proposes a computer program, wherein the computer program includes computer program code, and when the computer program code is run on a computer, the computer executes the cell position detection method described in the embodiment of the first aspect.
  • when the computer program code of the computer program of the embodiment of the present disclosure is run on a computer, the predicted cells can be used as nodes and an adjacency matrix can be obtained based on their first positions; the fusion node features of the predicted cells are then obtained according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • FIG. 1 is a schematic flowchart of a method for detecting a cell position according to an embodiment of the present disclosure
  • FIG. 2 is a schematic flow diagram of determining values of corresponding elements in an adjacency matrix in a cell position detection method according to an embodiment of the present disclosure
  • FIG. 3 is a schematic flow diagram of obtaining fusion node features of any predicted cell in a cell position detection method according to an embodiment of the present disclosure
  • FIG. 4 is a schematic flow diagram of obtaining the node characteristics of any predicted cell in the cell position detection method according to an embodiment of the present disclosure
  • FIG. 5 is a schematic flow diagram of obtaining the second position of any predicted cell in the cell position detection method according to an embodiment of the present disclosure
  • FIG. 6 is a schematic flowchart of obtaining the second position of any predicted cell in a cell position detection method according to another embodiment of the present disclosure
  • FIG. 7 is a schematic diagram of a cell position detection model according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a device for detecting cell positions according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • FIG. 1 is a schematic flowchart of a method for detecting a cell position according to an embodiment of the present disclosure.
  • the method for detecting a cell position in an embodiment of the present disclosure includes steps S101-S104.
  • the executor of the cell position detection method in the embodiment of the present disclosure may be a cell position detection device, which can be configured in any electronic device, so that the electronic device can execute the cell position detection method of the embodiment of the present disclosure.
  • the electronic device may be a personal computer (PC), a cloud device, a mobile device, etc.; the mobile device may be a hardware device with an operating system, a touch screen and/or a display, such as a mobile phone, tablet computer, personal digital assistant, wearable device, or vehicle-mounted device.
  • the first position of the predicted cell in the table image may be obtained. It can be understood that a table image may contain at least one prediction cell, and different prediction cells may correspond to different first positions.
  • the first position is used to represent the position of the area occupied by the predicted cell in the table image, that is, the area occupied by the predicted cell in the table image can be determined according to the first position, so the predicted cell can be located according to the first position.
  • the first position includes at least one of the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell. At this time, the area occupied by the predicted cell is a rectangle.
  • cell recognition can be performed on the table image to generate the detection frames of the predicted cells; obtaining the first position of each predicted cell in the table image may then include extracting the detection frame of each predicted cell from the table image and obtaining the first position of the predicted cell based on the detection frame.
  • performing cell recognition on the table image to generate the detection frames of the predicted cells may include applying a cell recognition algorithm to the table image, so that the predicted cells can be located in the table image and their detection frames generated.
  • the cell identification algorithm can be set according to the actual situation, and there is no excessive limitation here.
  • obtaining the first position of the predicted cell based on the detection frame may include obtaining the two-dimensional coordinates of the center point of the detection frame, the width and height of the detection frame, and taking the two-dimensional coordinates of the center point of the detection frame as The two-dimensional coordinates of the center point of the predicted cell, and the width and height of the detection frame are respectively used as the width and height of the predicted cell.
  • each prediction cell in the table image may be regarded as a node, the prediction cells and the nodes have a one-to-one correspondence, and each node is used to represent the corresponding prediction cell.
  • the adjacency matrix is used to represent the positional relationship between predicted cells.
  • the adjacency matrix of the table image can be obtained according to the first position. It can be understood that, according to the first positions of any two predicted cells, the positional relationship between any two predicted cells can be obtained, and then the value of the corresponding element in the adjacency matrix can be obtained.
  • the location relationship includes but is not limited to Euclidean distance, Manhattan distance, etc., which are not limited here.
  • elements in the adjacency matrix may be used to represent undirected edges between nodes corresponding to any two prediction cells.
  • the fusion node feature of any predicted cell can be obtained according to the first position of that predicted cell and the adjacency matrix. This method therefore obtains the fusion node features based on the first positions of the predicted cells and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them, and the obtained fusion node features have better representation.
  • n fusion node features can be obtained according to the n first positions and the adjacency matrix.
  • the second position of any predicted cell can be obtained according to its fusion node feature, that is, the second position of the predicted cell is predicted from its fusion node feature.
  • the second position is used to represent the row and/or column to which the predicted cell belongs, that is, the row and/or column to which the predicted cell belongs in the table can be determined according to the second position, so the predicted cell can be located according to the second position.
  • the second position includes at least one of the number of the starting row, the number of the ending row, the number of the starting column, and the number of the ending column of the predicted cell. It can be understood that the rows and columns in the table can be numbered respectively in advance.
  • the row to which the predicted cell belongs may be determined according to the number of the start row and the number of the end row of the predicted cell. For example, the candidate numbers between the number of the start row and the number of the end row can be obtained, and the number of the start row, the candidate number, and the number of the end row can be determined as the number of the corresponding row, so that according to the number of the determined row Determine the row to which the forecasted cell belongs.
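The row enumeration described above (start number, the candidate numbers in between, and the end number) can be sketched as follows; the function name is hypothetical and row numbers are assumed to be consecutive integers.

```python
def rows_of_cell(start_row, end_row):
    """Enumerate the rows a predicted cell belongs to from its starting
    and ending row numbers: the start number, every candidate number in
    between, and the end number."""
    return list(range(start_row, end_row + 1))
```

The columns to which a cell belongs can be enumerated the same way from its starting and ending column numbers.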
  • the manner of determining the column to which the prediction cell belongs may refer to the manner of determining the row to which the prediction cell belongs, and details are not repeated here.
  • obtaining the second position of any predicted cell according to its fusion node feature may include inputting the fusion node feature of the predicted cell into a position prediction algorithm, which performs position prediction based on the fusion node feature and generates the second position of the predicted cell.
  • the location prediction algorithm can be set according to actual conditions, and there is no excessive limitation here.
  • the predicted cells can be used as nodes, and an adjacency matrix can be obtained based on the first positions of the predicted cells; the fusion node features of the predicted cells are then obtained according to the first positions and the adjacency matrix, so that the fusion node features match the first positions of the predicted cells and the positional relationships between them and have better representation; the second position of each predicted cell is obtained from its fusion node feature, so that the first and second positions of the cells are obtained at the same time, and the detected cell positions are more comprehensive and more robust.
  • obtaining the adjacency matrix of the table image according to the first position in step S102 may include determining values of corresponding elements in the adjacency matrix based on the first position and the number of the predicted cell.
  • the positional relationship between any two predicted cells can be obtained based on the first positions of any two predicted cells, and the target number of the corresponding element in the adjacency matrix can be determined according to the numbers of any two predicted cells , and then the value of the element of the target number in the adjacency matrix can be determined according to the positional relationship between any two predicted cells.
  • the value of the corresponding element in the adjacency matrix is determined, including steps S201-S205.
  • the predicted cells can be numbered consecutively from 1 to n, and the assignment of the numbers 1 to n to cells may be arbitrary. For example, if there are 10 predicted cells, they may be numbered consecutively from 1 to 10.
  • the first position includes the abscissa and ordinate of the center point of the predicted cell, and the abscissas and ordinates of the center points of the predicted cells numbered i and j can be extracted from the first positions.
  • the first position has a corresponding relationship with the number of the predicted cell, and the above corresponding relationship can be queried according to the numbers i and j to obtain the abscissa and ordinate of the center point of the predicted cell with the numbers i and j.
  • a mapping relationship or a mapping table between the first position and the number of the predicted cell can be established in advance, wherein the first position includes the abscissa and ordinate of the center point of the predicted cell; the above mapping relationship or mapping table can then be queried according to the number of the predicted cell to obtain the abscissa and ordinate of the center point of the predicted cell. It should be noted that the above mapping relationship or mapping table can be set according to actual conditions, and is not particularly limited here.
  • obtaining the width and height of the table image may include performing size recognition on the table image according to an image size recognition algorithm to obtain the width and height of the table image.
  • the image size recognition algorithm can be set according to the actual situation, which is not limited here.
  • the adjustment parameters may be set according to actual conditions, and are not limited here.
  • the adjustment parameter is positively correlated with the number of rows and/or columns of the table.
  • the following formula is used to calculate the value of the row dimension of the element in row i and column j in the adjacency matrix:
  • the following formula is used to calculate the value of the column dimension of the element in the i-th row and the j-th column in the adjacency matrix:
  • this method comprehensively considers the influence of the abscissas of the center points of the predicted cells numbered i and j, the width of the table image, and the adjustment parameter on the value of the row dimension of the element in row i and column j of the adjacency matrix, and likewise comprehensively considers the influence of the ordinates of the center points of the predicted cells numbered i and j, the height of the table image, and the adjustment parameter on the value of the column dimension of that element.
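The exact formula image is not reproduced in this text; as a hedged sketch, the following assumes the element values are simply the coordinate-difference ratio scaled by the adjustment parameter `lam` (a plausible reading of the verbal description above, not the patent's verbatim formula):

```python
import numpy as np

def build_adjacency(centers, width, height, lam=1.0):
    """Build a two-channel adjacency matrix from predicted-cell centers.

    centers: (n, 2) sequence of (x, y) center coordinates.
    The linear form lam * |delta| / size below is an assumption made for
    illustration; only the inputs (coordinate differences, image width and
    height, adjustment parameter) follow the description.
    """
    centers = np.asarray(centers, dtype=float)
    dx = np.abs(centers[:, 0][:, None] - centers[:, 0][None, :])
    dy = np.abs(centers[:, 1][:, None] - centers[:, 1][None, :])
    a_row = lam * dx / width   # row dimension: abscissa difference over image width
    a_col = lam * dy / height  # column dimension: ordinate difference over image height
    return np.stack([a_row, a_col], axis=-1)  # shape (n, n, 2)
```

For two cells centered at (10, 20) and (30, 20) in a 100 x 50 image with `lam=2.0`, the row-dimension entry between them is 2 * 20 / 100 = 0.4 and the column-dimension entry is 0.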
  • in step S103, the fusion node features of any predicted cell are obtained according to the first position of the predicted cell and the adjacency matrix, which may include steps S301-S302.
  • the node feature of any predicted cell can be obtained according to the first position of any predicted cell, so that the node feature can match the first position of the predicted cell.
  • obtaining the node features of any predicted cell may include inputting the first position of the predicted cell into a feature extraction algorithm, where the feature extraction algorithm extracts the node features of the predicted cell from the first position.
  • the feature extraction algorithm can be set according to the actual situation, and is not particularly limited here.
  • the node features and the adjacency matrix can be input into a graph convolutional network (Graph Convolutional Network, GCN), and the graph convolutional network fuses the node features with the adjacency matrix to generate the fusion node features of any predicted cell; that is, the graph convolutional network reconstructs the node features using the adjacency matrix to generate the fusion node features.
  • the graph convolutional network can be set according to the actual situation, and is not particularly limited here.
  • the fusion node features are calculated using the following formula: X' = ReLU(AX)
  • where X' is the fusion node feature, X is the node feature, A is the adjacency matrix, and ReLU(·) is the activation function.
  • this method can obtain the node features of any predicted cell according to its first position, input the node features and the adjacency matrix into the graph convolutional network GCN, and have the graph convolutional network fuse the node features with the adjacency matrix to generate the fusion node features of the predicted cell.
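A minimal sketch of this fusion step, assuming a single graph-convolution layer of the form X' = ReLU(AXW); the weight matrix `W` is a hypothetical learnable parameter not named in the text (when omitted, the layer reduces to X' = ReLU(AX) as in the formula above):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def fuse_node_features(X, A, W=None):
    """Fuse node features X (n x d) with the adjacency matrix A (n x n),
    GCN-style: X' = ReLU(A @ X @ W). W defaults to the identity, which
    reduces the layer to X' = ReLU(A @ X)."""
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    if W is None:
        W = np.eye(X.shape[1])
    return relu(A @ X @ W)
```

For example, with X = [[1, -1], [0, 2]] and A = [[1, 0], [1, 1]], the product AX is [[1, -1], [1, 1]] and the ReLU clips the negative entry to zero.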
  • in step S301, the node features of any predicted cell are obtained according to the first position of the predicted cell, which may include steps S401-S403.
  • the first position may be a one-dimensional or multi-dimensional vector.
  • when the first position includes the two-dimensional coordinates of the center point of the predicted cell and the width and height of the predicted cell, the first position is a 4-dimensional vector, which can be represented as b_i = (x_i, y_i, w_i, h_i), where b_i is the first position of the predicted cell numbered i, x_i is the abscissa of the center point of the predicted cell numbered i, y_i is the ordinate of the center point of the predicted cell numbered i, w_i is the width of the predicted cell numbered i, and h_i is the height of the predicted cell numbered i.
  • a linear mapping may be performed on the first position of any prediction cell to obtain the spatial characteristics of any prediction cell. It can be appreciated that the spatial characteristics of any predicted cell match the first location.
  • performing linear mapping on the first position of any predicted cell to obtain its spatial features may include inputting the first position of the predicted cell into a linear mapping algorithm, where the linear mapping algorithm performs linear mapping on the first position to obtain the spatial features of the predicted cell.
  • the linear mapping algorithm can be set according to the actual situation, and there is no excessive limitation here.
  • the visual semantic feature of any predicted cell can be extracted from the table image, so that the visual semantic feature can match the first position of the predicted cell.
  • extracting the visual semantic features of any predicted cell from the table image may include determining, based on the first position of the predicted cell, the area occupied by the predicted cell on the table image, and extracting the visual semantic features from the corresponding area of the table image as the visual semantic features of the predicted cell.
  • extracting the visual semantic features of any predicted cell from the table image may alternatively include determining, based on the first position of the predicted cell, the target pixels contained in the predicted cell from among the pixels of the table image, and extracting the visual semantic features of the target pixels from the table image as the visual semantic features of the predicted cell.
  • the table image includes a plurality of pixel points, and based on the first position of any prediction cell, the target pixel point included in any prediction cell can be determined from the pixels included in the table image. It should be noted that the target pixel point refers to a pixel point located in the area occupied by the prediction cell.
  • extracting the visual semantic features of the target pixels from the table image as the visual semantic features of any predicted cell may include extracting the visual semantic features of each pixel from the table image, and extracting the visual semantic features of the target pixels from these visual semantic features according to a preset extraction algorithm.
  • the extraction algorithm can be set according to the actual situation, and is not particularly limited here; for example, it may be the RoIAlign algorithm.
  • the spatial features and visual semantic features of any prediction cell can be combined horizontally to obtain the node features of any prediction cell.
  • for example, if the spatial features and visual semantic features of any predicted cell are X_s and X_v, which are 256-dimensional and 1024-dimensional vectors respectively, then X_s and X_v can be concatenated horizontally to obtain the node features of the predicted cell as a 1280-dimensional vector.
  • this method can obtain the spatial features and the visual semantic features respectively based on the first position of any predicted cell, and concatenate the spatial features and the visual semantic features to obtain the node features of the predicted cell.
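The linear mapping and concatenation above can be sketched as follows; the mapping matrix `W_map` stands in for a hypothetical learned parameter, and the 256/1024 dimensions follow the text's example:

```python
import numpy as np

def spatial_feature(first_position, W_map):
    """Linearly map the 4-d first position (x, y, w, h) to a spatial
    feature; W_map is a hypothetical learned 4 x d matrix."""
    return np.asarray(first_position, dtype=float) @ W_map

def node_feature(spatial, visual):
    """Concatenate the spatial feature (e.g. 256-d) with the visual
    semantic feature (e.g. 1024-d) into one node feature (1280-d)."""
    return np.concatenate([np.ravel(spatial), np.ravel(visual)])
```

With a 256-dimensional spatial feature and a 1024-dimensional visual semantic feature, the concatenated node feature is 1280-dimensional, as in the example above.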
  • in step S104, the second position of any predicted cell is obtained according to its fusion node features, which may include the following two possible implementations:
  • obtaining the second position of any predicted cell according to its fusion node features in step S104 may include steps S501-S502.
  • taking the second position being the starting row of the predicted cell as an example, if the number of rows in the table is T, the candidate second positions include rows 1, 2, ..., T; based on the fusion node features of any predicted cell, the prediction probabilities of the predicted cell under rows 1, 2, ..., T can then be obtained.
  • the prediction probability of any prediction unit under each candidate second position may be different, and the greater the prediction probability, the greater the possibility that the candidate second position is the second position.
  • the maximum prediction probability is obtained from the prediction probabilities of any predicted cell under the candidate second positions, and the candidate second position corresponding to the maximum prediction probability is determined as the second position of the predicted cell.
  • for example, if the candidate second positions include rows 1, 2, ..., T, the prediction probabilities of any predicted cell under rows 1, 2, ..., T are P_1, P_2, ..., P_T respectively, and the maximum value among P_1, P_2, ..., P_T is P_2, then row 2 can be used as the starting row of the predicted cell.
  • this method can obtain, based on the fusion node features of any predicted cell, the prediction probability of the predicted cell at each candidate second position, obtain the maximum prediction probability from these prediction probabilities, and determine the candidate second position corresponding to the maximum prediction probability as the second position of the predicted cell.
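As a small illustration with hypothetical probabilities, the argmax selection in steps S501-S502 amounts to:

```python
def second_position_from_probs(probs):
    """Return the 1-based candidate row whose prediction probability is
    largest; probs[k] is the probability of candidate row k + 1."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best + 1  # rows are numbered from 1
```

For probabilities (0.1, 0.7, 0.2) over rows 1-3, the maximum is at row 2, so row 2 is taken as the second position.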
  • alternatively, obtaining the second position of any predicted cell according to its fusion node features in step S104 may include steps S601-S604.
  • the target vector includes T dimensions.
  • for example, if the candidate second positions include rows 1, 2, ..., T and the target vector includes T dimensions, then based on the fusion node features of any predicted cell, the prediction probabilities that the 1st, 2nd, ..., T-th vector dimensions of the target vector take the value 0 or 1 can be obtained.
  • the prediction probabilities of any vector dimension taking the value 0 or 1 may differ; a larger prediction probability for the value 0 indicates that the vector dimension is more likely to take the value 0, and conversely, a larger prediction probability for the value 1 indicates that it is more likely to take the value 1. The maximum prediction probability can therefore be obtained from the prediction probabilities of the vector dimension taking the value 0 or 1, and the value corresponding to the maximum prediction probability is determined as the target value of the vector dimension.
  • for example, if the candidate second positions include rows 1, 2, ..., T, the target vector includes T dimensions, the prediction probabilities of the m-th vector dimension of the target vector taking the value 0 or 1 are P_m^0 and P_m^1 respectively, and the maximum of the two is P_m^1, then the target value of the m-th vector dimension of the target vector is 1, where 1 ≤ m ≤ T.
  • the sum of the target values over the vector dimensions of the target vector has a corresponding relationship with the second position; the corresponding relationship can therefore be queried based on the sum of the target values to determine the corresponding second position.
  • the above corresponding relationship may be set according to actual conditions, and is not limited here.
  • the number of each candidate second position can be converted into a candidate vector by using the following formula: v_t = 1 if t ≤ r_i, and v_t = 0 otherwise,
  • where the candidate vector includes n dimensions, n is the number of candidate second positions, v_t is the value of the t-th vector dimension of the candidate vector, r_i is the number of the candidate second position, 0 ≤ r_i ≤ n-1, and 1 ≤ t ≤ n.
  • for example, if the candidate second positions include rows 1, 2, and 3, that is, the numbers of the candidate second positions are 0, 1, and 2, corresponding to rows 1, 2, and 3 respectively, then the numbers 0, 1, and 2 can be converted into the candidate vectors (0,0,0), (1,0,0), and (1,1,0).
  • the number of the second position may be determined as the sum of the target values over all vector dimensions of the target vector plus 1. For example, if the sum of the target values over all vector dimensions of the target vector is 2, it can be determined that the number of the starting row of the predicted cell is 3, that is, the starting row of the predicted cell is the third row.
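The example above (0 → (0,0,0), 1 → (1,0,0), 2 → (1,1,0)) is a thermometer code, and decoding is the sum of the target values plus one; a sketch:

```python
def encode_candidate(r, n):
    """Thermometer-encode candidate number r (0-based) as an n-dim
    vector: dimension t (1-based) is 1 iff t <= r."""
    return tuple(1 if t <= r else 0 for t in range(1, n + 1))

def decode_target_vector(values):
    """Recover the 1-based second position as the sum of the target
    values plus 1 (e.g. sum 2 -> row 3)."""
    return sum(values) + 1
```

Encoding candidate numbers 0, 1, 2 with n = 3 reproduces the three candidate vectors in the text, and decoding (1, 1, 0) gives row 3.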
  • this method can establish a target vector for any predicted cell, determine the value of each vector dimension of the target vector based on the fusion node features of the predicted cell, and obtain the second position of the predicted cell according to the sum of the target values over the vector dimensions; the accuracy of the second position obtained in this way is better.
  • the method for acquiring a second location in the embodiments of the present disclosure is applicable to any type of second location.
  • the method for obtaining the second position in the embodiment of the present disclosure is suitable for determining the number of the start row, the number of the end row, the number of the start column, and the number of the end column of the prediction cell.
  • obtaining the first positions of the predicted cells in the table image in step S101 may include: extracting the visual semantic features of each pixel from the table image; obtaining the recognition probability of each pixel under each category based on the visual semantic features; obtaining the maximum recognition probability from the recognition probabilities of any pixel under the categories, and determining the category corresponding to the maximum recognition probability as the target category of the pixel; identifying the connected domains formed by the pixels whose target category is cell; determining the minimum bounding rectangle of each connected domain as the detection frame of a predicted cell; and obtaining the first position of the predicted cell based on the detection frame.
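A hedged sketch of the connected-domain step: finding 4-connected components of pixels classified as "cell" and taking each component's minimum bounding rectangle as a detection frame (the per-pixel classifier itself is assumed to exist upstream):

```python
from collections import deque

def cell_boxes(mask):
    """Find 4-connected components of 'cell' pixels (value 1) in a 2-D
    mask and return the minimum bounding rectangle of each component as
    (x_min, y_min, x_max, y_max) -- the detection frame of a predicted
    cell. Pure-Python BFS for illustration."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] == 1 and not seen[sy][sx]:
                q = deque([(sy, sx)])
                seen[sy][sx] = True
                ys, xs = [], []
                while q:
                    y, x = q.popleft()
                    ys.append(y)
                    xs.append(x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes
```

For a mask with a 2x2 block of cell pixels in the top-left corner and a single cell pixel at the bottom-right, this yields one 2x2 detection frame and one 1x1 detection frame.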
  • the categories include but are not limited to background, cell, and border line.
  • obtaining the recognition probability of each pixel under each category based on the visual semantic features may include inputting the visual semantic features of any pixel into a classification algorithm, where the classification algorithm performs category prediction according to the visual semantic features to generate the recognition probabilities of the pixel under the categories.
  • the classification algorithm can be set according to the actual situation, and is not particularly limited here.
  • the present disclosure also provides a detection model of the cell position.
  • the input of the detection model is a table image, and the output is the predicted cell in the table image.
  • the detection model includes a visual semantic feature extraction layer, a first classification layer, a node feature extraction layer, a graph reconstruction network layer, and a second classification layer.
  • the visual semantic feature extraction layer is used to extract the visual semantic feature of each pixel from the table image.
  • the first classification layer is used to obtain the recognition probability of each pixel in each category based on the visual semantic features, and then determine the target category corresponding to any pixel according to the recognition probability, and identify the pixel whose target category is the cell A connected domain composed of points, the smallest bounding rectangle of the connected domain is determined as the detection frame of the predicted cell, and the first position of the predicted cell is obtained based on the detection frame.
  • the node feature extraction layer is used to obtain the node feature of any predicted cell according to the first position of any predicted cell.
  • the graph reconstruction network layer is used to fuse the node features with the adjacency matrix to generate the fusion node features of any predicted cell.
  • the second classification layer is used to obtain the second position of any predicted cell according to its fusion node features.
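Putting the layers together, a forward-pass skeleton might look like the following; every callable name and signature here is illustrative only, standing in for the five layers described above plus an adjacency builder for step S102:

```python
def detect_cells(table_image, layers):
    """Sketch of the detection model's forward pass. `layers` maps
    illustrative names to callables for the layers described above."""
    feats = layers["visual_semantic"](table_image)           # per-pixel visual semantic features
    first_positions = layers["first_classification"](feats)  # detection frames -> first positions
    node_feats = layers["node_feature"](first_positions)     # spatial + visual node features
    adjacency = layers["adjacency"](first_positions)         # adjacency matrix from first positions
    fused = layers["graph_reconstruction"](node_feats, adjacency)
    second_positions = layers["second_classification"](fused)
    return first_positions, second_positions
```

The model's input is the table image and its output is both position types for each predicted cell, matching the description above.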
  • the present disclosure also provides a device for detecting a position of a cell.
  • the implementation of the cell position detection method provided in the embodiment of FIG. 6 is also applicable to the cell position detection device provided in the embodiments of the present disclosure, and will not be described in detail here.
  • FIG. 8 is a schematic structural diagram of a device for detecting cell positions according to an embodiment of the present disclosure.
  • the cell position detection device 100 of the embodiment of the present disclosure may include: a first acquisition module 110 , a second acquisition module 120 , a third acquisition module 130 and a fourth acquisition module 140 .
  • the first acquisition module 110 is configured to acquire the first position of a predicted cell in the table image, wherein the first position is used to represent the position of the area occupied by the predicted cell in the table image;
  • the second obtaining module 120 is configured to obtain the adjacency matrix of the table image according to the first position, wherein each of the prediction cells in the table image is a node, and the adjacency matrix is used for Representing the positional relationship between the predicted cells;
  • the third acquisition module 130 is configured to obtain the fusion node features of any predicted cell according to the first position of the predicted cell and the adjacency matrix;
  • the fourth obtaining module 140 is configured to obtain the second position of any predicted cell according to its fusion node features, wherein the second position is used to characterize the row and/or column to which the predicted cell belongs.
  • the first position includes at least one of the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell.
  • the second obtaining module 120 is further configured to: determine the value of a corresponding element in the adjacency matrix based on the first position and the number of the predicted cell.
  • the second obtaining module 120 is further configured to: obtain the number n of predicted cells, and number each predicted cell consecutively from 1 to n, where n is an integer greater than 1; extract, from the first positions, the abscissas and ordinates of the center points of the predicted cells numbered i and j, where 1 ≤ i ≤ n and 1 ≤ j ≤ n; obtain the width and height of the table image and the adjustment parameter; obtain the first ratio of the difference between the abscissas of the center points of the predicted cells numbered i and j to the width, and determine, based on the product of the first ratio and the adjustment parameter, the value of the row dimension of the element in row i and column j of the adjacency matrix; and obtain the second ratio of the difference between the ordinates of the center points of the predicted cells numbered i and j to the height, and determine, based on the product of the second ratio and the adjustment parameter, the value of the column dimension of the element in row i and column j of the adjacency matrix.
  • the third acquisition module 130 includes: an acquisition unit, configured to obtain the node features of any predicted cell according to the first position of the predicted cell;
  • and a fusion unit, configured to input the node features and the adjacency matrix into the graph convolutional network GCN, where the graph convolutional network fuses the node features with the adjacency matrix to generate the fusion node features of the predicted cell.
  • the acquisition unit includes: a mapping subunit, configured to linearly map the first position of any prediction cell to obtain the spatial characteristics of any prediction cell;
  • the extraction subunit is used to extract the visual semantic features of any prediction cell from the table image based on the first position of any prediction cell;
  • and a splicing subunit, configured to concatenate the spatial features and the visual semantic features of any predicted cell to obtain the node features of the predicted cell.
  • the extracting subunit is further configured to: determine, based on the first position of any predicted cell, the target pixels contained in the predicted cell from among the pixels of the table image; and extract the visual semantic features of the target pixels from the table image as the visual semantic features of the predicted cell.
  • the fourth acquisition module 140 is further configured to: obtain, based on the fusion node features of any predicted cell, the prediction probability of the predicted cell under each candidate second position; obtain the maximum prediction probability from these prediction probabilities; and determine the candidate second position corresponding to the maximum prediction probability as the second position of the predicted cell.
  • the fourth obtaining module 140 is further configured to: establish a target vector for any predicted cell, the target vector including n dimensions, where n is the number of candidate second positions of the predicted cell; obtain, based on the fusion node features of the predicted cell, the prediction probability that each vector dimension of the target vector takes the value 0 or 1; obtain the maximum prediction probability from the prediction probabilities of the vector dimension taking the value 0 or 1, and determine the value corresponding to the maximum prediction probability as the target value of the vector dimension; and obtain the second position of the predicted cell based on the sum of the target values over the vector dimensions.
  • the first obtaining module 110 is further configured to: extract the detection frame of each predicted cell from the table image, and obtain the prediction based on the detection frame The first position of the cell.
  • the second position includes at least one of a starting row number, an ending row number, a starting column number, and an ending column number of the prediction cell.
  • the cell position detection device of the embodiments of the present disclosure can use the predicted cells as nodes, obtain the adjacency matrix based on the first positions of the predicted cells, and then obtain the fusion node features of the predicted cells according to the first positions and the adjacency matrix, so that the fusion node features match both the first positions of the predicted cells and the positional relationships between the predicted cells and have a better representation effect; the second positions of the predicted cells are then obtained from the fusion node features, so that the first position and the second position of each cell are obtained at the same time, making the obtained cell positions more comprehensive and more robust.
  • an embodiment of the present disclosure also proposes an electronic device 200, including a memory 210, a processor 220, and a computer program stored in the memory 210 and executable on the processor 220; when the processor 220 executes the program, the cell position detection method proposed in the foregoing embodiments of the present disclosure is implemented.
  • when the computer program stored in the memory is executed by the processor, the predicted cells can be used as nodes, the adjacency matrix can be obtained based on the first positions of the predicted cells, and the fusion node features of the predicted cells can then be obtained according to the first positions and the adjacency matrix, so that the fusion node features match both the first positions of the predicted cells and the positional relationships between the predicted cells and have a better representation effect; the second positions of the predicted cells are obtained according to the fusion node features, so the first position and the second position of each cell can be obtained at the same time, and the obtained cell positions are more comprehensive and more robust.
  • the embodiments of the present disclosure also propose a computer-readable storage medium on which a computer program is stored.
  • when the program is executed by a processor, the cell position detection method proposed in the foregoing embodiments of the present disclosure is implemented.
  • the computer readable storage medium is a non-transitory computer readable storage medium.
  • when the computer program stored in the computer-readable storage medium of the embodiments of the present disclosure is executed by the processor, the predicted cells can be used as nodes, the adjacency matrix can be obtained based on the first positions of the predicted cells, and the fusion node features of the predicted cells can then be obtained according to the first positions and the adjacency matrix, so that the fusion node features match both the first positions of the predicted cells and the positional relationships between the predicted cells and have a better representation effect; the second positions of the predicted cells are obtained according to the fusion node features, so the first position and the second position of each cell can be obtained at the same time, and the obtained cell positions are more comprehensive and more robust.
  • the embodiments of the present disclosure also propose a computer program product including computer program code; when the computer program code is run on a computer, the cell position detection method proposed in the foregoing embodiments of the present disclosure is implemented.
  • the computer program product includes computer program code; when the computer program code is run on a computer, the predicted cells can be used as nodes, the adjacency matrix can be obtained based on the first positions of the predicted cells, and the fusion node features of the predicted cells can then be obtained according to the first positions and the adjacency matrix, so that the fusion node features match both the first positions of the predicted cells and the positional relationships between the predicted cells and have a better representation effect; the second positions of the predicted cells can then be obtained according to the fusion node features, so the first position and the second position of each cell are obtained at the same time, and the obtained cell positions are more comprehensive and more robust.
  • the embodiments of the present disclosure also propose a computer program, wherein the computer program includes computer program code; when the computer program code is run on a computer, the computer executes the cell position detection method proposed in the foregoing embodiments of the present disclosure.
  • when run on a computer, the computer program code enables the computer to use the predicted cells as nodes, obtain the adjacency matrix based on the first positions of the predicted cells, and then obtain the fusion node features of the predicted cells according to the first positions and the adjacency matrix, so that the fusion node features match both the first positions of the predicted cells and the positional relationships between the predicted cells and have a better representation effect; the second positions of the predicted cells can then be obtained according to the fusion node features, so the first position and the second position of each cell are obtained at the same time, and the obtained cell positions are more comprehensive and more robust.
  • first and second are used for descriptive purposes only, and cannot be interpreted as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features.
  • the features defined as “first” and “second” may explicitly or implicitly include at least one of these features.
  • “plurality” means at least two, such as two, three, etc., unless otherwise specifically defined.
  • a "computer-readable medium” may be any device that can contain, store, communicate, propagate or transmit a program for use in or in conjunction with an instruction execution system, device or device.
  • examples of computer-readable media include the following: an electrical connection with one or more wires (electronic device), a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM).
  • the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
  • various parts of the present disclosure may be implemented in hardware, software, firmware or a combination thereof.
  • various steps or methods may be implemented by software or firmware stored in memory and executed by a suitable instruction execution system.
  • for example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: discrete logic circuits with logic gates for implementing logic functions on data signals, application-specific integrated circuits (ASICs) with suitable combinational logic gates, programmable gate arrays (PGA), field programmable gate arrays (FPGA), and so on.
  • each functional unit in the embodiments of the present disclosure may be integrated into one processing module; alternatively, each unit may exist physically on its own, or two or more units may be integrated into one module.
  • the above integrated modules may be implemented in the form of hardware or in the form of software functional modules. If the integrated modules are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.

Abstract

Provided are a cell position detection method and apparatus, and an electronic device. The cell position detection method comprises: obtaining a first position of a prediction cell in a table image, wherein the first position is used for representing the position of a region occupied by the prediction cell in the table image; obtaining an adjacency matrix of the table image according to the first position, wherein each prediction cell in the table image is used as a node, and the adjacency matrix is used for indicating a positional relationship between prediction cells; obtaining a fused node feature of any prediction cell according to the first position of any prediction cell and the adjacency matrix; and obtaining a second position of any prediction cell according to the fused node feature of any prediction cell, wherein the second position is used for representing the row and/or the column to which the prediction cell belongs.

Description

Cell position detection method, apparatus and electronic device
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202110772902.7, filed on July 8, 2021, the entire content of which is hereby incorporated into this application by reference.
Technical Field
The present disclosure relates to the field of computer application technology, and in particular to a cell position detection method and apparatus, an electronic device, a computer-readable storage medium, a computer program product, and a computer program.
Background
Tabular data is concise, intuitive, and easy to process, and is therefore widely used in office work. With the development of artificial intelligence technology, the requirements for automatic recognition of tabular data are growing; for example, cell positions are detected automatically from a table image so that operations such as information extraction can be performed based on those positions. However, the cell position information detected by related-art cell position detection methods is incomplete and lacks robustness.
Summary
To this end, an embodiment of the first aspect of the present disclosure provides a cell position detection method. Each predicted cell is taken as a node, and an adjacency matrix is obtained based on the first positions of the predicted cells; a fused node feature of each predicted cell is then obtained from the first position and the adjacency matrix, so that the fused node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, yielding a better feature representation. A second position of the predicted cell is obtained from the fused node feature, so the first and second positions of a cell are obtained together, making the detected cell position more comprehensive and more robust.
An embodiment of the second aspect of the present disclosure provides a cell position detection apparatus.
An embodiment of the third aspect of the present disclosure provides an electronic device.
An embodiment of the fourth aspect of the present disclosure provides a computer-readable storage medium.
An embodiment of the fifth aspect of the present disclosure provides a computer program product.
An embodiment of the sixth aspect of the present disclosure provides a computer program.
An embodiment of the first aspect of the present disclosure provides a cell position detection method, including: obtaining a first position of a predicted cell in a table image, where the first position represents the position, in the table image, of the region occupied by the predicted cell; obtaining an adjacency matrix of the table image according to the first position, where each predicted cell in the table image is a node and the adjacency matrix represents the positional relationships between the predicted cells; obtaining a fused node feature of any predicted cell according to the first position of that predicted cell and the adjacency matrix; and obtaining a second position of that predicted cell according to its fused node feature, where the second position represents the row and/or column to which the predicted cell belongs.
With the cell position detection method of the embodiments of the present disclosure, each predicted cell can be taken as a node and an adjacency matrix obtained based on the first positions of the predicted cells; a fused node feature of each predicted cell is then obtained from the first position and the adjacency matrix, so that the fused node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, yielding a better feature representation. The second position of the predicted cell is obtained from the fused node feature, so the first and second positions of a cell are obtained together, making the detected cell position more comprehensive and more robust.
In an embodiment of the present disclosure, the first position includes at least one of: the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell.
In an embodiment of the present disclosure, obtaining the adjacency matrix of the table image according to the first position includes: determining the values of the corresponding elements in the adjacency matrix based on the first position and the numbers of the predicted cells.
In an embodiment of the present disclosure, determining the values of the corresponding elements in the adjacency matrix based on the first position and the numbers of the predicted cells includes: obtaining the number n of predicted cells, and numbering the predicted cells consecutively from 1 to n, where n is an integer greater than 1; extracting, from the first positions, the abscissas and ordinates of the center points of the predicted cells numbered i and j, where 1≤i≤n and 1≤j≤n; obtaining the width and height of the table image, and an adjustment parameter; obtaining a first ratio of the difference between the abscissas of the center points of the predicted cells numbered i and j to the width, and determining the value of the row dimension of the element in row i and column j of the adjacency matrix based on the product of the first ratio and the adjustment parameter; and obtaining a second ratio of the difference between the ordinates of the center points of the predicted cells numbered i and j to the height, and determining the value of the column dimension of the element in row i and column j of the adjacency matrix based on the product of the second ratio and the adjustment parameter.
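The adjacency-matrix computation described in this embodiment can be sketched as follows. This is an illustrative sketch only: the function name `build_adjacency`, the NumPy layout (an n×n×2 array holding the row and column dimensions of each element), and treating the adjustment parameter as a single scalar `scale` are assumptions, not part of the disclosure.

```python
import numpy as np

def build_adjacency(centers, img_w, img_h, scale=1.0):
    """Build an n x n x 2 adjacency tensor from predicted-cell center points.

    centers: (n, 2) array of (x, y) center coordinates of the n predicted cells
    img_w, img_h: width and height of the table image
    scale: stand-in for the "adjustment parameter" of the embodiment
    """
    centers = np.asarray(centers, dtype=float)
    dx = centers[:, 0][:, None] - centers[:, 0][None, :]  # x_i - x_j for all i, j
    dy = centers[:, 1][:, None] - centers[:, 1][None, :]  # y_i - y_j for all i, j
    # Element (i, j): row dimension = scale * (x_i - x_j) / width,
    #                 column dimension = scale * (y_i - y_j) / height.
    return np.stack([scale * dx / img_w, scale * dy / img_h], axis=-1)
```

For example, for two cells centered at (10, 20) and (30, 20) in a 100×50 image, the element at row 1, column 2 has row dimension (10 − 30)/100 = −0.2 and column dimension 0.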
In an embodiment of the present disclosure, obtaining the fused node feature of any predicted cell according to the first position of that predicted cell and the adjacency matrix includes: obtaining a node feature of the predicted cell according to its first position; and inputting the node feature and the adjacency matrix into a graph convolutional network (GCN), which fuses the node feature with the adjacency matrix to generate the fused node feature of the predicted cell.
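One way to picture the fusion a GCN performs is a single graph-convolution step: each node's feature is mixed with its neighbors' features through the adjacency matrix and then linearly projected. The sketch below is a minimal single-layer illustration under simplifying assumptions (a plain (n, n) adjacency assumed already normalized, and a fixed weight matrix); a real GCN would normalize the adjacency, learn the weights, and typically stack several layers.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: neighbor aggregation, projection, ReLU.

    adj:    (n, n) adjacency matrix (assumed already normalized)
    feats:  (n, d_in) node features of the n predicted cells
    weight: (d_in, d_out) projection matrix (learned in a real network)
    """
    agg = adj @ feats                     # fuse each node's feature with its neighbors'
    return np.maximum(agg @ weight, 0.0)  # linear projection followed by ReLU
```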
In an embodiment of the present disclosure, obtaining the node feature of any predicted cell according to its first position includes: linearly mapping the first position of the predicted cell to obtain a spatial feature of the predicted cell; extracting, based on the first position, a visual semantic feature of the predicted cell from the table image; and concatenating the spatial feature and the visual semantic feature of the predicted cell to obtain its node feature.
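The three steps of this embodiment (linear mapping, visual feature extraction, concatenation) can be sketched as follows; the learned linear layer is stood in for by a plain matrix `proj`, and the visual semantic feature is taken as a given vector (both are assumptions for illustration).

```python
import numpy as np

def node_feature(first_pos, visual_feat, proj):
    """Concatenate the spatial feature with the visual semantic feature.

    first_pos:   (4,) first position (cx, cy, w, h) of one predicted cell
    visual_feat: (dv,) visual semantic feature extracted from the table image
    proj:        (4, ds) matrix standing in for the learned linear mapping
    """
    spatial = np.asarray(first_pos, dtype=float) @ proj  # linear mapping -> spatial feature
    return np.concatenate([spatial, np.asarray(visual_feat, dtype=float)])
```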
In an embodiment of the present disclosure, extracting the visual semantic feature of any predicted cell from the table image based on its first position includes: determining, based on the first position, the target pixels contained in the predicted cell from among the pixels of the table image; and extracting the visual semantic features of the target pixels from the table image as the visual semantic feature of the predicted cell.
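A sketch of this extraction under simple assumptions: the per-pixel visual semantic features are given as an (H, W, C) feature map, the cell region is taken as an axis-aligned box around the center point, and the target pixels' features are average-pooled. The pooling choice and the feature-map representation are assumptions for illustration, not stated in the disclosure.

```python
import numpy as np

def cell_visual_feature(feature_map, first_pos):
    """Pool the visual semantic feature of one predicted cell.

    feature_map: (H, W, C) per-pixel visual semantic features of the table image
    first_pos:   (cx, cy, w, h) of the predicted cell, in pixel coordinates
                 (the box is assumed to contain at least one pixel)
    """
    cx, cy, w, h = first_pos
    H, W, _ = feature_map.shape
    # Target pixels: pixels inside the cell's region, clipped to the image.
    x0, x1 = max(int(cx - w / 2), 0), min(int(cx + w / 2), W)
    y0, y1 = max(int(cy - h / 2), 0), min(int(cy + h / 2), H)
    region = feature_map[y0:y1, x0:x1]
    # Average-pool the target pixels' features into one (C,) vector.
    return region.reshape(-1, feature_map.shape[2]).mean(axis=0)
```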
In an embodiment of the present disclosure, obtaining the second position of any predicted cell according to its fused node feature includes: obtaining, based on the fused node feature, a predicted probability of the predicted cell for each candidate second position; and taking the maximum of these predicted probabilities and determining the candidate second position corresponding to the maximum predicted probability as the second position of the predicted cell.
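The selection in this embodiment is a plain argmax over the candidate second positions; a minimal sketch (function and argument names are illustrative, and the probabilities are taken as already computed from the fused node feature):

```python
import numpy as np

def pick_second_position(probs, candidates):
    """Return the candidate second position with the maximum predicted probability.

    probs:      per-candidate predicted probabilities of one predicted cell
    candidates: the candidate second positions, aligned with probs
    """
    return candidates[int(np.argmax(probs))]
```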
In an embodiment of the present disclosure, obtaining the second position of any predicted cell according to its fused node feature includes: establishing, for the predicted cell, a target vector with n dimensions, where n is the number of candidate second positions of the predicted cell; obtaining, based on the fused node feature, the predicted probability that each vector dimension takes the value 0 or 1; for each vector dimension, taking the value corresponding to the maximum of the two predicted probabilities as the target value of that dimension; and obtaining the second position of the predicted cell based on the sum of the target values of the vector dimensions.
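The alternative decoding of this embodiment can be sketched as follows: each of the n dimensions keeps the value (0 or 1) with the larger predicted probability, and the second position is the sum of the kept values. The function name and the tie-breaking toward 1 are assumptions for illustration.

```python
import numpy as np

def second_position_from_vector(prob_zero, prob_one):
    """Decode a second position from per-dimension 0/1 probabilities.

    prob_zero[k] / prob_one[k]: predicted probability that dimension k of the
    target vector takes the value 0 / 1.
    """
    # Each dimension's target value is the value with the larger probability
    # (ties resolve to 1 here, an arbitrary choice).
    target = (np.asarray(prob_one) >= np.asarray(prob_zero)).astype(int)
    # The second position is the sum of the target values.
    return int(target.sum())
```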
In an embodiment of the present disclosure, obtaining the first position of a predicted cell in the table image includes: extracting a detection box of each predicted cell from the table image, and obtaining the first position of the predicted cell based on the detection box.
In an embodiment of the present disclosure, the second position includes at least one of: the number of the starting row, the number of the ending row, the number of the starting column, and the number of the ending column of the predicted cell.
An embodiment of the second aspect of the present disclosure provides a cell position detection apparatus, including: a first obtaining module, configured to obtain a first position of a predicted cell in a table image, where the first position represents the position, in the table image, of the region occupied by the predicted cell; a second obtaining module, configured to obtain an adjacency matrix of the table image according to the first position, where each predicted cell in the table image is a node and the adjacency matrix represents the positional relationships between the predicted cells; a third obtaining module, configured to obtain a fused node feature of any predicted cell according to the first position of that predicted cell and the adjacency matrix; and a fourth obtaining module, configured to obtain a second position of that predicted cell according to its fused node feature, where the second position represents the row and/or column to which the predicted cell belongs.
With the cell position detection apparatus of the embodiments of the present disclosure, each predicted cell can be taken as a node and an adjacency matrix obtained based on the first positions of the predicted cells; a fused node feature of each predicted cell is then obtained from the first position and the adjacency matrix, so that the fused node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, yielding a better feature representation. The second position of the predicted cell is obtained from the fused node feature, so the first and second positions of a cell are obtained together, making the detected cell position more comprehensive and more robust.
In an embodiment of the present disclosure, the first position includes at least one of: the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell.
In an embodiment of the present disclosure, the second obtaining module is further configured to determine the values of the corresponding elements in the adjacency matrix based on the first position and the numbers of the predicted cells.
In an embodiment of the present disclosure, the second obtaining module is further configured to: obtain the number n of predicted cells, and number the predicted cells consecutively from 1 to n, where n is an integer greater than 1; extract, from the first positions, the abscissas and ordinates of the center points of the predicted cells numbered i and j, where 1≤i≤n and 1≤j≤n; obtain the width and height of the table image, and an adjustment parameter; obtain a first ratio of the difference between the abscissas of the center points of the predicted cells numbered i and j to the width, and determine the value of the row dimension of the element in row i and column j of the adjacency matrix based on the product of the first ratio and the adjustment parameter; and obtain a second ratio of the difference between the ordinates of the center points of the predicted cells numbered i and j to the height, and determine the value of the column dimension of the element in row i and column j of the adjacency matrix based on the product of the second ratio and the adjustment parameter.
In an embodiment of the present disclosure, the third obtaining module includes: an obtaining unit, configured to obtain a node feature of any predicted cell according to its first position; and a fusion unit, configured to input the node feature and the adjacency matrix into a graph convolutional network (GCN), which fuses the node feature with the adjacency matrix to generate the fused node feature of the predicted cell.
In an embodiment of the present disclosure, the obtaining unit includes: a mapping subunit, configured to linearly map the first position of any predicted cell to obtain a spatial feature of the predicted cell; an extraction subunit, configured to extract, based on the first position, a visual semantic feature of the predicted cell from the table image; and a concatenation subunit, configured to concatenate the spatial feature and the visual semantic feature of the predicted cell to obtain its node feature.
In an embodiment of the present disclosure, the extraction subunit is further configured to: determine, based on the first position of any predicted cell, the target pixels contained in the predicted cell from among the pixels of the table image; and extract the visual semantic features of the target pixels from the table image as the visual semantic feature of the predicted cell.
In an embodiment of the present disclosure, the fourth obtaining module is further configured to: obtain, based on the fused node feature of any predicted cell, a predicted probability of the predicted cell for each candidate second position; and take the maximum of these predicted probabilities and determine the candidate second position corresponding to the maximum predicted probability as the second position of the predicted cell.
In an embodiment of the present disclosure, the fourth obtaining module is further configured to: establish, for any predicted cell, a target vector with n dimensions, where n is the number of candidate second positions of the predicted cell; obtain, based on the fused node feature of the predicted cell, the predicted probability that each vector dimension takes the value 0 or 1; for each vector dimension, take the value corresponding to the maximum of the two predicted probabilities as the target value of that dimension; and obtain the second position of the predicted cell based on the sum of the target values of the vector dimensions.
In an embodiment of the present disclosure, the first obtaining module is further configured to extract a detection box of each predicted cell from the table image and obtain the first position of the predicted cell based on the detection box.
In an embodiment of the present disclosure, the second position includes at least one of: the number of the starting row, the number of the ending row, the number of the starting column, and the number of the ending column of the predicted cell.
An embodiment of the third aspect of the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the cell position detection method described in the embodiment of the first aspect.
With the electronic device of the embodiments of the present disclosure, by the processor executing the computer program stored in the memory, each predicted cell can be taken as a node and an adjacency matrix obtained based on the first positions of the predicted cells; a fused node feature of each predicted cell is then obtained from the first position and the adjacency matrix, so that the fused node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, yielding a better feature representation. The second position of the predicted cell is obtained from the fused node feature, so the first and second positions of a cell are obtained together, making the detected cell position more comprehensive and more robust.
An embodiment of the fourth aspect of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the cell position detection method described in the embodiment of the first aspect.
With the computer-readable storage medium of the embodiments of the present disclosure, by storing a computer program that is executed by a processor, each predicted cell can be taken as a node and an adjacency matrix obtained based on the first positions of the predicted cells; a fused node feature of each predicted cell is then obtained from the first position and the adjacency matrix, so that the fused node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, yielding a better feature representation. The second position of the predicted cell is obtained from the fused node feature, so the first and second positions of a cell are obtained together, making the detected cell position more comprehensive and more robust.
An embodiment of the fifth aspect of the present disclosure provides a computer program product including computer program code which, when run on a computer, implements the cell position detection method described in the embodiment of the first aspect.
With the computer program product of the embodiments of the present disclosure, which includes computer program code, when the computer program code is run on a computer, each predicted cell can be taken as a node and an adjacency matrix obtained based on the first positions of the predicted cells; a fused node feature of each predicted cell is then obtained from the first position and the adjacency matrix, so that the fused node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, yielding a better feature representation. The second position of the predicted cell is obtained from the fused node feature, so the first and second positions of a cell are obtained together, making the detected cell position more comprehensive and more robust.
An embodiment of the sixth aspect of the present disclosure provides a computer program including computer program code which, when run on a computer, causes the computer to execute the cell position detection method described in the embodiment of the first aspect.
With the computer program of the embodiments of the present disclosure, which includes computer program code, when the computer program code is run on a computer, the computer can take each predicted cell as a node and obtain an adjacency matrix based on the first positions of the predicted cells; a fused node feature of each predicted cell is then obtained from the first position and the adjacency matrix, so that the fused node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, yielding a better feature representation. The second position of the predicted cell is obtained from the fused node feature, so the first and second positions of a cell are obtained together, making the detected cell position more comprehensive and more robust.
Additional aspects and advantages of the present disclosure will be set forth in part in the following description, and will in part become apparent from the description or be learned through practice of the present disclosure.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present disclosure will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a cell position detection method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of determining the values of corresponding elements in an adjacency matrix in a cell position detection method according to an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of obtaining the fused node feature of any predicted cell in a cell position detection method according to an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of obtaining the node feature of any predicted cell in a cell position detection method according to an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of obtaining the second position of any predicted cell in a cell position detection method according to an embodiment of the present disclosure;
FIG. 6 is a schematic flowchart of obtaining the second position of any predicted cell in a cell position detection method according to another embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a cell position detection model according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a cell position detection apparatus according to an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in detail below, examples of which are illustrated in the accompanying drawings, where the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present disclosure and should not be construed as limiting it.
The cell position detection method and apparatus, electronic device, and storage medium of the embodiments of the present disclosure are described below with reference to the accompanying drawings.
图1为根据本公开一个实施例的单元格位置的检测方法的流程示意图。FIG. 1 is a schematic flowchart of a method for detecting a cell position according to an embodiment of the present disclosure.
如图1所示,本公开实施例的单元格位置的检测方法,包括步骤S101-S104。As shown in FIG. 1 , the method for detecting a cell position in an embodiment of the present disclosure includes steps S101-S104.
S101,获取表格图像中预测单元格的第一位置,其中,第一位置用于表征预测单元格占用的区域在表格图像中的位置。S101. Acquire a first position of a predicted cell in the table image, where the first position is used to represent a position in the table image of an area occupied by the predicted cell.
需要说明的是,本公开实施例的单元格位置的检测方法的执行主体可为单元格位置的检测装置,本公开实施例的单元格位置的检测装置可以配置在任意电子设备中,以使该电子设备可以执行本公开实施例的单元格位置的检测方法。其中,电子设备可以为个人电脑(Personal Computer,简称PC)、云端设备、移动设备等,移动设备例如可以为手机、平板电脑、个人数字助理、穿戴式设备、车载设备等具有各种操作系统、触摸屏和/或显示屏的硬件设备。It should be noted that the executor of the cell position detection method in the embodiment of the present disclosure may be a cell position detection device, and the cell position detection device in the embodiment of the present disclosure can be configured in any electronic device, so that the The electronic device can execute the detection method of the cell position in the embodiment of the present disclosure. Among them, the electronic device can be a personal computer (Personal Computer, referred to as PC), cloud device, mobile device, etc., and the mobile device can be a mobile phone, tablet computer, personal digital assistant, wearable device, vehicle-mounted device, etc., with various operating systems, Hardware devices for touch screens and/or displays.
本公开的实施例中,可获取表格图像中预测单元格的第一位置。可以理解的是,一个表格图像中可包含至少一个预测单元格,不同的预测单元格可对应不同的第一位置。In the embodiment of the present disclosure, the first position of the predicted cell in the table image may be obtained. It can be understood that a table image may contain at least one prediction cell, and different prediction cells may correspond to different first positions.
需要说明的是,本公开的实施例中,第一位置用于表征预测单元格占用的区域在表格图像中的位置,即可根据第一位置确定预测单元格占用的区域在表格图像中的位置,即可根据第一位置实现预测单元格的定位。It should be noted that, in the embodiments of the present disclosure, the first position is used to represent the position of the area occupied by the predicted cell in the table image, that is, the position of the area occupied by the predicted cell in the table image can be determined according to the first position , the location of the predicted cell can be realized according to the first position.
在一种实施方式中,第一位置包括预测单元格的中心点的二维坐标、预测单元格的宽度、预测单元格的高度中的至少一种,此时预测单元格占用的区域为矩形。In one embodiment, the first position includes at least one of the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell. At this time, the area occupied by the predicted cell is a rectangle.
在一种实施方式中,可对表格图像进行单元格识别,以生成预测单元格的检测框,则获取表格图像中预测单元格的第一位置,可包括从表格图像中提取出每个预测单元格的检测框,并基于检测框获取预测单元格的第一位置。In one embodiment, the cell recognition can be performed on the table image to generate the detection frame of the predicted cell, then obtaining the first position of the predicted cell in the table image may include extracting each predicted cell from the table image The detection frame of the cell, and obtain the first position of the predicted cell based on the detection frame.
在一种实施方式中,对表格图像进行单元格识别,以生成预测单元格的检测框,可包括按照单元格识别算法对表格图像进行单元格识别,从而可从表格图像中定位到预测单元格,以生成预测单元格的检测框。其中,单元格识别算法可根据实际情况进行设置,这里不做过多限定。In one embodiment, performing cell recognition on the table image to generate a detection frame for the predicted cell may include performing cell recognition on the table image according to a cell recognition algorithm, so that the predicted cell can be located from the table image , to generate detection boxes for predicted cells. Wherein, the cell identification algorithm can be set according to the actual situation, and there is no excessive limitation here.
在一种实施方式中,基于检测框获取预测单元格的第一位置,可包括获取检测框的中心点的二维坐标、检测框的宽度和高度,将检测框的中心点的二维坐标作为预测单元格的中心点的二维坐标,将检测框的宽度和高度分别作为预测单元格的宽度和高度。In one embodiment, obtaining the first position of the predicted cell based on the detection frame may include obtaining the two-dimensional coordinates of the center point of the detection frame, the width and height of the detection frame, and taking the two-dimensional coordinates of the center point of the detection frame as The two-dimensional coordinates of the center point of the predicted cell, and the width and height of the detection frame are respectively used as the width and height of the predicted cell.
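The mapping from a detection frame to a first position described above can be sketched as follows; the `DetBox` container and its field names are illustrative assumptions, not part of the disclosure:

```python
from dataclasses import dataclass

# Hypothetical container for a detected cell box; names are illustrative only.
@dataclass
class DetBox:
    cx: float  # abscissa of the box center point
    cy: float  # ordinate of the box center point
    w: float   # box width
    h: float   # box height

def first_position(box: DetBox) -> tuple:
    """Take the detection frame's center coordinates, width and height
    directly as the first position (cx, cy, w, h) of the predicted cell."""
    return (box.cx, box.cy, box.w, box.h)
```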
S102，根据第一位置，得到表格图像的邻接矩阵，其中，表格图像中的每个预测单元格为一个结点，邻接矩阵用于表示预测单元格之间的位置关系。S102. Obtain an adjacency matrix of the table image according to the first position, wherein each prediction cell in the table image is a node, and the adjacency matrix is used to represent the positional relationship between the prediction cells.
本公开的实施例中,可将表格图像中的每个预测单元格作为一个结点,预测单元格和结点具有一一对应关系,每个结点用于表征对应的预测单元格。相应的,邻接矩阵用于表示预测单元格之间的位置关系。In the embodiments of the present disclosure, each prediction cell in the table image may be regarded as a node, the prediction cells and the nodes have a one-to-one correspondence, and each node is used to represent the corresponding prediction cell. Correspondingly, the adjacency matrix is used to represent the positional relationship between predicted cells.
本公开的实施例中,可根据第一位置得到表格图像的邻接矩阵。可以理解的是,可根据任意两个预测单元格的第一位置,得到任意两个预测单元格之间的位置关系,进而得到邻接矩阵中对应元素的取值。其中,位置关系包括但不限于欧式距离、曼哈顿距离等,这里不做过多限定。In the embodiment of the present disclosure, the adjacency matrix of the table image can be obtained according to the first position. It can be understood that, according to the first positions of any two predicted cells, the positional relationship between any two predicted cells can be obtained, and then the value of the corresponding element in the adjacency matrix can be obtained. Wherein, the location relationship includes but is not limited to Euclidean distance, Manhattan distance, etc., which are not limited here.
在一种实施方式中,邻接矩阵中的元素可用于表示任意两个预测单元格对应的结点之间的无向边。In one embodiment, elements in the adjacency matrix may be used to represent undirected edges between nodes corresponding to any two prediction cells.
S103,根据任一预测单元格的第一位置和邻接矩阵,得到任一预测单元格的融合结点特征。S103. According to the first position and the adjacency matrix of any prediction cell, obtain the fusion node feature of any prediction cell.
本公开的实施例中,可根据任一预测单元格的第一位置和邻接矩阵,得到任一预测单元格的融合结点特征。由此,该方法可基于预测单元格的第一位置和邻接矩阵得到融合结点特征,使得融合结点特征可与预测单元格的第一位置和预测单元格之间的位置关系相匹配,得到的预测单元格的融合结点特征的表示效果更好。In the embodiment of the present disclosure, the fusion node feature of any prediction unit can be obtained according to the first position and the adjacency matrix of any prediction unit. Therefore, this method can obtain the fusion node features based on the first position of the prediction cell and the adjacency matrix, so that the fusion node feature can match the first position of the prediction cell and the positional relationship between the prediction cells, and obtain The representation of the fusion node features of predicted cells is better.
例如,假设预测单元格的数量为n个,则获取的预测单元格的第一位置为n个,则可根据n个第一位置和邻接矩阵,得到n个融合结点特征。For example, assuming that the number of prediction cells is n, and the number of first positions of the prediction cells obtained is n, then n fusion node features can be obtained according to the n first positions and the adjacency matrix.
S104,根据任一预测单元格的融合结点特征,得到任一预测单元格的第二位置,其中,第二位置用于表征预测单元格的所属行和/或所属列。S104. Obtain a second position of any prediction cell according to the fusion node feature of any prediction cell, where the second position is used to represent the row and/or column to which the prediction cell belongs.
本公开的实施例中,可根据任一预测单元格的融合结点特征,得到任一预测单元格的第二位置,即可根据任一预测单元格的融合结点特征,对任一预测单元格的第二位置进行预测,得到任一预测单元格的第二位置。In the embodiment of the present disclosure, the second position of any prediction unit can be obtained according to the fusion node characteristics of any prediction unit, that is, according to the fusion node characteristics of any prediction unit, any prediction unit predict the second position of any predicted cell, and obtain the second position of any predicted cell.
需要说明的是,本公开的实施例中,第二位置用于表征预测单元格的所属行和/或所属列,即可根据第二位置确定预测单元格在表格中的所属行和/或所属列,即可根据第二位置实现预测单元格的定位。It should be noted that, in the embodiments of the present disclosure, the second position is used to represent the row and/or column to which the prediction cell belongs, that is, the row and/or column to which the prediction cell belongs in the table can be determined according to the second position. column, the positioning of the predicted cell can be realized according to the second position.
在一种实施方式中,第二位置包括预测单元格的起始行的编号、终止行的编号、起始列的编号、终止列的编号中的至少一种。可以理解的是,可预先对表格中的行、列分别进行编号。In one embodiment, the second position includes at least one of the number of the starting row, the number of the ending row, the number of the starting column, and the number of the ending column of the predicted cell. It can be understood that the rows and columns in the table can be numbered respectively in advance.
在一种实施方式中,可根据预测单元格的起始行的编号、终止行的编号确定预测单元格的所属行。例如,可获取处于起始行的编号和终止行的编号之间的候选编号,将起始行的编号、候选编号、终止行的编号确定为所属行的编号,从而根据确定的所属行的编号确定预测单元格的所属行。需要说明的是,确定预测单元格的所属列的方式可参照上述确定预测单元格的所属行的方式,这里不再赘述。In one embodiment, the row to which the predicted cell belongs may be determined according to the number of the start row and the number of the end row of the predicted cell. For example, the candidate numbers between the number of the start row and the number of the end row can be obtained, and the number of the start row, the candidate number, and the number of the end row can be determined as the number of the corresponding row, so that according to the number of the determined row Determine the row to which the forecasted cell belongs. It should be noted that, the manner of determining the column to which the prediction cell belongs may refer to the manner of determining the row to which the prediction cell belongs, and details are not repeated here.
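The row-membership rule above (the start row number, every candidate number between start and end, and the end row number) can be sketched as follows; the function name is an assumption:

```python
def rows_of_cell(start_row: int, end_row: int) -> list:
    """Rows the predicted cell belongs to: the start row, every candidate
    row number strictly between start and end, and the end row.
    The same logic applies to columns."""
    return list(range(start_row, end_row + 1))
```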
本公开的实施例中,根据任一预测单元格的融合结点特征,得到任一预测单元格的第二位置,可包括将任一预测单元格的融合结点特征输入至位置预测算法中,由位置预测算法根据融合结点特征进行位置预测,生成任一预测单元格的第二位置。其中,位置预测算法可根据实际情况进行设置,这里不做过多限定。In the embodiment of the present disclosure, obtaining the second position of any prediction cell according to the fusion node characteristics of any prediction unit may include inputting the fusion node characteristics of any prediction unit into the position prediction algorithm, The position prediction algorithm is used to predict the position according to the fusion node features, and the second position of any predicted cell is generated. Wherein, the location prediction algorithm can be set according to actual conditions, and there is no excessive limitation here.
综上，根据本公开实施例的单元格位置的检测方法，可将预测单元格作为结点，并基于预测单元格的第一位置得到邻接矩阵，进而根据第一位置和邻接矩阵得到预测单元格的融合结点特征，使得融合结点特征可与预测单元格的第一位置和预测单元格之间的位置关系相匹配，得到的预测单元格的融合结点特征的表示效果更好，并根据融合结点特征得到预测单元格的第二位置，可同时获取单元格的第一位置和第二位置，得到的单元格的位置更加全面，鲁棒性更好。To sum up, according to the cell position detection method of the embodiments of the present disclosure, the predicted cells can be used as nodes, the adjacency matrix can be obtained based on the first positions of the predicted cells, and the fusion node features of the predicted cells can then be obtained from the first positions and the adjacency matrix, so that the fusion node features match both the first positions of the predicted cells and the positional relationships between them, yielding a better representation. The second positions of the predicted cells are then obtained from the fusion node features, so the first position and the second position of each cell can be acquired at the same time, making the obtained cell positions more comprehensive and more robust.
在上述任一实施例的基础上,步骤S102中根据第一位置,得到表格图像的邻接矩阵,可包括基于第一位置和预测单元格的编号,确定邻接矩阵中对应元素的取值。On the basis of any of the above embodiments, obtaining the adjacency matrix of the table image according to the first position in step S102 may include determining values of corresponding elements in the adjacency matrix based on the first position and the number of the predicted cell.
可以理解的是,可基于任意两个预测单元格的第一位置,得到任意两个预测单元格之间的位置关系,并根据任意两个预测单元格的编号确定邻接矩阵中对应元素的目标编号,进而可根据任意两个预测单元格之间的位置关系确定邻接矩阵中目标编号的元素的取值。It can be understood that the positional relationship between any two predicted cells can be obtained based on the first positions of any two predicted cells, and the target number of the corresponding element in the adjacency matrix can be determined according to the numbers of any two predicted cells , and then the value of the element of the target number in the adjacency matrix can be determined according to the positional relationship between any two predicted cells.
在一种实施方式中,如图2所示,基于第一位置和预测单元格的编号,确定邻接矩阵中对应元素的取值,包括步骤S201-S205。In one embodiment, as shown in FIG. 2 , based on the first position and the number of the predicted cell, the value of the corresponding element in the adjacency matrix is determined, including steps S201-S205.
S201,获取预测单元格的数量n,并按照编号1至n对每个预测单元格进行连续编号,其中,n为大于1的整数。S201. Acquire the number n of predicted cells, and sequentially number each predicted cell according to the numbers 1 to n, wherein n is an integer greater than 1.
本公开的实施例中,可按照编号1至n对预测单元格进行连续编号,编号1至n可随机分配。例如,若预测单元格的数量为10,则可按照编号1至10对每个预测单元格进行连续编号。In the embodiment of the present disclosure, the prediction cells can be numbered consecutively according to numbers 1 to n, and the numbers 1 to n can be randomly assigned. For example, if the number of predicted cells is 10, each predicted cell may be numbered consecutively according to numbers 1 to 10.
S202,从第一位置中提取出编号为i、j的预测单元格的中心点的横坐标和纵坐标,其中,1≤i≤n,1≤j≤n。S202. Extract the abscissa and ordinate of the central point of the prediction cell numbered i and j from the first position, where 1≤i≤n, 1≤j≤n.
本公开的实施例中,第一位置包括预测单元格的中心点的横坐标和纵坐标,可从第一位置中提取出编号为i、j的预测单元格的中心点的横坐标和纵坐标。In the embodiment of the present disclosure, the first position includes the abscissa and ordinate of the center point of the predicted cell, and the abscissa and ordinate of the center point of the predicted cell numbered i and j can be extracted from the first position .
其中,1≤i≤n,1≤j≤n,且i、j均为整数。Wherein, 1≤i≤n, 1≤j≤n, and both i and j are integers.
可以理解的是,第一位置与预测单元格的编号具有对应关系,则可根据编号i、j查询上述对应关系,得到编号为i、j的预测单元格的中心点的横坐标和纵坐标。It can be understood that the first position has a corresponding relationship with the number of the predicted cell, and the above corresponding relationship can be queried according to the numbers i and j to obtain the abscissa and ordinate of the center point of the predicted cell with the numbers i and j.
在一种实施方式中,可预先建立第一位置与预测单元格的编号之间的映射关系或者映射表,其中,第一位置包括预测单元格的中心点的横坐标和纵坐标,则可根据预测单元格的编号查询上述映射关系或者映射表,获取预测单元格的中心点的横坐标和纵坐标。应说明的是,上述映射关系或者映射表均可根据实际情况进行设置,这里不做过多限定。In one embodiment, a mapping relationship or a mapping table between the first position and the number of the predicted cell can be established in advance, wherein the first position includes the abscissa and ordinate of the center point of the predicted cell, then it can be calculated according to The number of the predicted cell queries the above mapping relationship or mapping table to obtain the abscissa and ordinate of the center point of the predicted cell. It should be noted that the above mapping relationship or mapping table can be set according to actual conditions, and there is no excessive limitation here.
S203,获取表格图像的宽度和高度,以及调整参数。S203, acquiring the width and height of the table image, and adjusting parameters.
在一种实施方式中,获取表格图像的宽度和高度,可包括按照图像尺寸识别算法对表格图像进行尺寸识别,得到表格图像的宽度和高度。其中,图像尺寸识别算法可根据实际情况进行设置,这里不做过多限定。In one embodiment, obtaining the width and height of the form image may include performing size recognition on the form image according to an image size recognition algorithm to obtain the width and height of the form image. Wherein, the image size recognition algorithm can be set according to the actual situation, which is not limited here.
需要说明的是,本公开的实施例中,调整参数可根据实际情况进行设置,这里不做过多限定。在一种实施方式中,调整参数与表格的行数和/或列数正相关。It should be noted that, in the embodiments of the present disclosure, the adjustment parameters may be set according to actual conditions, and are not limited here. In one embodiment, the adjustment parameter is positively correlated with the number of rows and/or columns of the table.
S204,获取编号为i、j的预测单元格的中心点的横坐标的差值与宽度的第一比值,并基于第一比值和调整参数的乘积确定邻接矩阵中第i行第j列的元素的行维度的取值。S204. Obtain the first ratio of the difference between the abscissa of the central point of the prediction cell numbered i and j to the width, and determine the element in row i and column j in the adjacency matrix based on the product of the first ratio and the adjustment parameter The value of the row dimension of .
在一种实施方式中，采用如下公式计算邻接矩阵中第i行第j列的元素的行维度的取值：In one implementation, the following formula is used to calculate the value of the row dimension of the element in row i and column j of the adjacency matrix:

$A_{ij}^{row} = c \cdot \dfrac{x_i - x_j}{w}$

其中，$A_{ij}^{row}$ 为邻接矩阵中第i行第j列的元素的行维度的取值，$x_i$、$x_j$ 分别为编号为i、j的预测单元格的中心点的横坐标，w为表格图像的宽度，c为调整参数。Here, $A_{ij}^{row}$ is the value of the row dimension of the element in row i and column j of the adjacency matrix, $x_i$ and $x_j$ are the abscissas of the center points of the prediction cells numbered i and j, w is the width of the table image, and c is the adjustment parameter.
可以理解的是,确定邻接矩阵中第i行第j列的元素的行维度的取值还可为其他方式,这里不再赘述。It can be understood that other methods may be used to determine the value of the row dimension of the element in row i and column j in the adjacency matrix, which will not be repeated here.
S205,获取编号为i、j的预测单元格的中心点的纵坐标的差值与高度的第二比值,并基于第二比值和调整参数的乘积确定邻接矩阵中第i行第j列的元素的列维度的取值。S205. Obtain the second ratio of the difference between the vertical coordinates of the center point of the prediction cell numbered i and j to the height, and determine the element in row i and column j in the adjacency matrix based on the product of the second ratio and the adjustment parameter The value of the column dimension of .
在一种实施方式中，采用如下公式计算邻接矩阵中第i行第j列的元素的列维度的取值：In one embodiment, the following formula is used to calculate the value of the column dimension of the element in row i and column j of the adjacency matrix:

$A_{ij}^{col} = c \cdot \dfrac{y_i - y_j}{H}$

其中，$A_{ij}^{col}$ 为邻接矩阵中第i行第j列的元素的列维度的取值，$y_i$、$y_j$ 分别为编号为i、j的预测单元格的中心点的纵坐标，H为表格图像的高度，c为调整参数。Here, $A_{ij}^{col}$ is the value of the column dimension of the element in row i and column j of the adjacency matrix, $y_i$ and $y_j$ are the ordinates of the center points of the prediction cells numbered i and j, H is the height of the table image, and c is the adjustment parameter.
可以理解的是,确定邻接矩阵中第i行第j列的元素的列维度的取值还可为其他方式,这里不再赘述。It can be understood that other methods may be used to determine the value of the column dimension of the element in row i and column j in the adjacency matrix, which will not be repeated here.
由此,该方法可综合考虑编号为i、j的预测单元格的中心点的横坐标、表格图像的宽度、调整参数对邻接矩阵中第i行第j列的元素的行维度的取值的影响,以及综合考虑编号为i、j的预测单元格的中心点的纵坐标、表格图像的高度、调整参数对邻接矩阵中第i行第j列的元素的列维度的取值的影响。Therefore, this method can comprehensively consider the abscissa of the center point of the prediction cell numbered i and j, the width of the table image, and the value of the adjustment parameter to the row dimension of the element in the i-th row and j-th column in the adjacency matrix. influence, and comprehensively consider the ordinate of the center point of the prediction cell numbered i and j, the height of the table image, and the influence of adjustment parameters on the value of the column dimension of the element in row i and column j in the adjacency matrix.
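Steps S204-S205 above can be sketched as follows, assuming the element value is taken directly as the product of the coordinate-difference ratio and the adjustment parameter c (the disclosure notes other variants are possible); all names are illustrative:

```python
import numpy as np

def build_adjacency(centers: np.ndarray, img_w: float, img_h: float,
                    c: float) -> np.ndarray:
    """centers: (n, 2) array of predicted-cell center (x, y) coordinates.
    Returns an (n, n, 2) array whose [..., 0] slice holds the row-dimension
    values c*(x_i - x_j)/w and whose [..., 1] slice holds the column-dimension
    values c*(y_i - y_j)/H, as in steps S204-S205."""
    x = centers[:, 0]
    y = centers[:, 1]
    n = len(centers)
    a = np.empty((n, n, 2))
    a[..., 0] = c * (x[:, None] - x[None, :]) / img_w  # row dimension
    a[..., 1] = c * (y[:, None] - y[None, :]) / img_h  # column dimension
    return a
```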
在上述任一实施例的基础上,如图3所示,步骤S103中根据任一预测单元格的第一位置和邻接矩阵,得到任一预测单元格的融合结点特征,包括步骤S301-S302。On the basis of any of the above embodiments, as shown in Figure 3, in step S103, according to the first position and adjacency matrix of any prediction cell, the fusion node features of any prediction cell are obtained, including steps S301-S302 .
S301,根据任一预测单元格的第一位置,得到任一预测单元格的结点特征。S301. According to the first position of any predicted cell, obtain the node feature of any predicted cell.
本公开的实施例中,可根据任一预测单元格的第一位置,得到任一预测单元格的结点特征,使得结点特征可与预测单元格的第一位置相匹配。In the embodiment of the present disclosure, the node feature of any predicted cell can be obtained according to the first position of any predicted cell, so that the node feature can match the first position of the predicted cell.
在一种实施方式中,根据任一预测单元格的第一位置,得到任一预测单元格的结点特征,可包括将任一预测单元格的第一位置输入至特征提取算法中,由特征提取算法从第一位置中提取出任一预测单元格的结点特征。其中,特征提取算法可根据实际情况进行设置,这里不做过多限定。In one embodiment, according to the first position of any predicted cell, obtaining the node feature of any predicted cell may include inputting the first position of any predicted cell into the feature extraction algorithm, and the feature The extraction algorithm extracts the node features of any prediction cell from the first position. Wherein, the feature extraction algorithm can be set according to the actual situation, and there is no excessive limitation here.
S302,将结点特征和邻接矩阵输入至图卷积网络GCN中,由图卷积网络将结点特征与邻接矩阵进行特征融合,生成任一预测单元格的融合结点特征。S302. Input the node features and adjacency matrix into the graph convolutional network GCN, and perform feature fusion of the node features and the adjacency matrix by the graph convolutional network to generate fusion node features of any prediction unit.
本公开的实施例中,可将结点特征和邻接矩阵输入至图卷积网络(Graph Convolutional Network,GCN)中,由图卷积网络将结点特征与邻接矩阵进行特征融合,生成任一预测单元格的融合结点特征,即可通过图卷积网络利用邻接矩阵重构结点特征,生成融合结点特征。其中,图卷积网络可根据实际情况进行设置,这里不做过多限定。In the embodiment of the present disclosure, the node features and adjacency matrix can be input into the graph convolutional network (Graph Convolutional Network, GCN), and the node feature and the adjacency matrix are fused by the graph convolutional network to generate any prediction The fused node feature of the cell can be reconstructed by using the adjacency matrix through the graph convolutional network to generate the fused node feature. Among them, the graph convolutional network can be set according to the actual situation, and there are no too many restrictions here.
在一种实施方式中,采用如下公式计算融合结点特征:In one embodiment, the fusion node features are calculated using the following formula:
X'=ReLU(GCN(X,A))X'=ReLU(GCN(X,A))
其中,X'为融合结点特征,X为结点特征,A为邻接矩阵,ReLU(·)为激活函数。Among them, X' is the fusion node feature, X is the node feature, A is the adjacency matrix, and ReLU(·) is the activation function.
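A minimal sketch of the fusion step X' = ReLU(GCN(X, A)) above: the disclosure does not fix the internals of the GCN, so a single propagation layer of the common form ReLU(Â X W), with a learned weight W and a nonnegative adjacency for the normalization, is assumed here purely for illustration:

```python
import numpy as np

def gcn_fuse(X: np.ndarray, A: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One illustrative GCN layer: X' = ReLU(A_hat @ X @ W), where A_hat is
    the row-normalized adjacency with self-loops added. The layer form is an
    assumption; the patent only specifies X' = ReLU(GCN(X, A))."""
    n = A.shape[0]
    a_hat = A + np.eye(n)                              # add self-loops
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)   # row-normalize
    return np.maximum(a_hat @ X @ W, 0.0)              # ReLU activation
```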
由此,该方法可根据任一预测单元格的第一位置,得到任一预测单元格的结点特征,并将结点特征和邻接矩阵输入至图卷积网络GCN中,由图卷积网络将结点特征与邻接矩阵进行特征融合,生成任一预测单元格的融合结点特征。Therefore, this method can obtain the node features of any prediction cell according to the first position of any prediction cell, and input the node features and adjacency matrix into the graph convolutional network GCN, and the graph convolutional network The feature fusion of the node features and the adjacency matrix is performed to generate the fusion node features of any prediction cell.
在上述任一实施例的基础上,如图4所示,步骤S301中根据任一预测单元格的第一位置,得到任一预测单元格的结点特征,包括步骤S401-S403。On the basis of any of the above embodiments, as shown in FIG. 4 , in step S301 , according to the first position of any predicted cell, the node features of any predicted cell are obtained, including steps S401 - S403 .
S401,对任一预测单元格的第一位置进行线性映射,得到任一预测单元格的空间特征。S401. Perform linear mapping on the first position of any prediction cell to obtain the spatial characteristics of any prediction cell.
可以理解的是，第一位置可为一维或者多维向量。例如，第一位置包括预测单元格的中心点的二维坐标、预测单元格的宽度和高度时，第一位置为4维向量，可用 $b_i = (x_i, y_i, w_i, h_i)$ 来表示，其中，$b_i$ 为编号为i的预测单元格的第一位置，$x_i$、$y_i$ 分别为编号为i的预测单元格的中心点的横坐标和纵坐标，$w_i$、$h_i$ 分别为编号为i的预测单元格的宽度和高度。It can be understood that the first position may be a one-dimensional or multi-dimensional vector. For example, when the first position includes the two-dimensional coordinates of the center point of the predicted cell together with the width and height of the predicted cell, the first position is a 4-dimensional vector that can be written as $b_i = (x_i, y_i, w_i, h_i)$, where $b_i$ is the first position of the prediction cell numbered i, $x_i$ and $y_i$ are the abscissa and ordinate of its center point, and $w_i$ and $h_i$ are its width and height.
本公开的实施例中,可对任一预测单元格的第一位置进行线性映射,得到任一预测单元格的空间特征。可以理解的是,任一预测单元格的空间特征与第一位置相匹配。In the embodiments of the present disclosure, a linear mapping may be performed on the first position of any prediction cell to obtain the spatial characteristics of any prediction cell. It can be appreciated that the spatial characteristics of any predicted cell match the first location.
在一种实施方式中,对任一预测单元格的第一位置进行线性映射,得到任一预测单元格的空间特征,可包括将任一预测单元格的第一位置输入至线性映射算法,由线性映射算法对第一位置进行线性映射,得到任一预测单元格的空间特征。其中,线性映射算法可根据实际情况进行设置,这里不做过多限定。In one embodiment, performing linear mapping on the first position of any predicted cell to obtain the spatial characteristics of any predicted cell may include inputting the first position of any predicted cell into a linear mapping algorithm, by The linear mapping algorithm performs linear mapping on the first position to obtain the spatial characteristics of any predicted cell. Wherein, the linear mapping algorithm can be set according to the actual situation, and there is no excessive limitation here.
S402,基于任一预测单元格的第一位置,从表格图像中提取出任一预测单元格的视觉语义特征。S402. Based on the first position of any prediction cell, extract the visual semantic features of any prediction cell from the table image.
本公开的实施例中,可基于任一预测单元格的第一位置,从表格图像中提取出任一预测单元格的视觉语义特征,使得视觉语义特征可与预测单元格的第一位置相匹配。In the embodiments of the present disclosure, based on the first position of any predicted cell, the visual semantic feature of any predicted cell can be extracted from the table image, so that the visual semantic feature can match the first position of the predicted cell.
本公开的实施例中,基于任一预测单元格的第一位置,从表格图像中提取出任一预测单元格的视觉语义特征,可包括基于任一预测单元格的第一位置,确定任一预测单元格在表格图像上占用的区域,并从表格图像中对应的区域中提取出视觉语义特征,作为任一预测单元格的视觉语义特征。In the embodiment of the present disclosure, extracting the visual semantic features of any predicted cell from the table image based on the first position of any predicted cell may include determining any predicted The area occupied by the cell on the table image, and the visual semantic feature is extracted from the corresponding area in the table image as the visual semantic feature of any predicted cell.
在一种实施方式中,基于任一预测单元格的第一位置,从表格图像中提取出任一预测单元格的视觉语义特征,可包括基于任一预测单元格的第一位置,从表格图像包含的像素点中确定任一预测单元格包含的目标像素点,并从表格图像中提取出目标像素点的视觉语义特征,作为任一预测单元格的视觉语义特征。In one embodiment, based on the first position of any predicted cell, extracting the visual semantic features of any predicted cell from the table image may include, based on the first position of any predicted cell, extracting from the table image Determine the target pixel contained in any prediction cell from the pixels in the table image, and extract the visual semantic feature of the target pixel from the table image as the visual semantic feature of any prediction cell.
可以理解的是,表格图像包含多个像素点,可基于任一预测单元格的第一位置,从表格图像包含的像素点中确定任一预测单元格包含的目标像素点。应说明的是,目标像素点指的是位于预测单元格占用的区域内的像素点。It can be understood that the table image includes a plurality of pixel points, and based on the first position of any prediction cell, the target pixel point included in any prediction cell can be determined from the pixels included in the table image. It should be noted that the target pixel point refers to a pixel point located in the area occupied by the prediction cell.
在一种实施方式中,从表格图像中提取出目标像素点的视觉语义特征,作为任一预测单元格的视觉语义特征,可包括从表格图像中提取出每个像素点的视觉语义特征,按照预设提取算法从视觉语义特征中提取出目标像素点的视觉语义特征。其中,提取算法可根据实际情况进行设置,这里不做过多限定, 例如可为RoIAlign算法。In one embodiment, extracting the visual semantic feature of the target pixel from the table image as the visual semantic feature of any predicted cell may include extracting the visual semantic feature of each pixel from the table image, according to The preset extraction algorithm extracts the visual semantic features of the target pixels from the visual semantic features. Wherein, the extraction algorithm can be set according to the actual situation, and it is not limited here too much, for example, it can be the RoIAlign algorithm.
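As a rough stand-in for the region feature extraction above (the disclosure names RoIAlign as one option; simple mean pooling over the target pixels inside the cell's box is used here purely for illustration, and all names are assumptions):

```python
import numpy as np

def cell_visual_feature(feat_map: np.ndarray, box: tuple) -> np.ndarray:
    """feat_map: (H, W, C) per-pixel visual semantic features of the table
    image. box: (cx, cy, w, h) first position of the predicted cell.
    Mean-pools the features of the target pixels located inside the box --
    a crude substitute for RoIAlign, for illustration only."""
    cx, cy, w, h = box
    x0, x1 = int(round(cx - w / 2)), int(round(cx + w / 2))
    y0, y1 = int(round(cy - h / 2)), int(round(cy + h / 2))
    region = feat_map[max(y0, 0):y1, max(x0, 0):x1]
    return region.reshape(-1, feat_map.shape[-1]).mean(axis=0)
```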
S403,将任一预测单元格的空间特征和视觉语义特征进行拼接,得到任一预测单元格的结点特征。S403. Concatenate the spatial features and visual semantic features of any prediction cell to obtain node features of any prediction cell.
在一种实施方式中，可将任一预测单元格的空间特征和视觉语义特征进行横向拼接，得到任一预测单元格的结点特征。例如，任一预测单元格的空间特征、视觉语义特征分别为 $X_s$、$X_v$，$X_s$、$X_v$ 分别为256维、1024维的向量，则可将 $X_s$、$X_v$ 进行横向拼接，得到任一预测单元格的结点特征为1280维的向量。In one embodiment, the spatial feature and the visual semantic feature of any prediction cell can be concatenated horizontally to obtain its node feature. For example, if the spatial feature and the visual semantic feature of a prediction cell are $X_s$ (a 256-dimensional vector) and $X_v$ (a 1024-dimensional vector) respectively, then $X_s$ and $X_v$ can be concatenated horizontally to obtain a 1280-dimensional node feature.
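The horizontal concatenation just described can be sketched as follows (the 256/1024 dimensions are the example from the text; the function name is an assumption):

```python
import numpy as np

def node_feature(x_s: np.ndarray, x_v: np.ndarray) -> np.ndarray:
    """Concatenate the spatial feature x_s (e.g. 256-d) and the visual
    semantic feature x_v (e.g. 1024-d) along the feature axis to form
    the node feature (e.g. 1280-d)."""
    return np.concatenate([x_s, x_v], axis=-1)
```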
由此,该方法可分别基于任一预测单元格的第一位置得到空间特征和视觉语义特征,并将空间特征和视觉语义特征进行拼接,得到任一预测单元格的结点特征。Therefore, the method can obtain spatial features and visual semantic features based on the first position of any prediction cell respectively, and splicing the spatial features and visual semantic features to obtain the node features of any prediction cell.
在上述任一实施例的基础上,步骤S104中根据任一预测单元格的融合结点特征,得到任一预测单元格的第二位置,可包括如下两种可能的实施方式:On the basis of any of the above-mentioned embodiments, in step S104, the second position of any prediction unit is obtained according to the fusion node features of any prediction unit, which may include the following two possible implementation modes:
方式1、如图5所示,步骤S104中根据任一预测单元格的融合结点特征,得到任一预测单元格的第二位置,可包括步骤S501-S502。Method 1. As shown in FIG. 5 , in step S104 , according to the fusion node features of any prediction unit, the second position of any prediction unit is obtained, which may include steps S501 - S502 .
S501,基于任一预测单元格的融合结点特征,得到任一预测单元格在每个候选第二位置下的预测概率。S501. Based on the fusion node feature of any prediction unit, obtain the prediction probability of any prediction unit at each candidate second position.
以第二位置为预测单元格的起始行为例,若表格的行数为T,候选第二位置包括第1、2至T行,则可基于任一预测单元格的融合结点特征,得到任一预测单元格在第1、2至T行下的预测概率。Taking the second position as the starting line of the predicted cell as an example, if the number of rows in the table is T, and the candidate second position includes rows 1, 2 to T, then based on the fusion node characteristics of any predicted cell, we can get The predicted probability of any predicted cell under rows 1, 2 to T.
S502,从任一预测单元格在每个候选第二位置下的预测概率中获取最大预测概率,并将最大预测概率对应的候选第二位置确定为任一预测单元格的第二位置。S502. Obtain the maximum prediction probability from the prediction probability of any prediction unit at each candidate second position, and determine the candidate second position corresponding to the maximum prediction probability as the second position of any prediction unit.
In the embodiments of the present disclosure, the prediction probabilities of a predicted cell at the candidate second positions may differ, and a larger prediction probability indicates that the corresponding candidate second position is more likely to be the second position. Therefore, the maximum prediction probability can be obtained from the prediction probabilities of the predicted cell at the candidate second positions, and the candidate second position corresponding to the maximum prediction probability can be determined as the second position of the predicted cell.
Continuing the example in which the second position is the starting row of the predicted cell, if the table has T rows, the candidate second positions include rows 1, 2, ..., T, and the prediction probabilities of a predicted cell at rows 1, 2, ..., T are P1, P2, ..., PT respectively, with P2 being the maximum among them, then row 2 is taken as the starting row of the predicted cell.
Thus, this method can obtain the prediction probability of any predicted cell at each candidate second position based on its fusion node feature, obtain the maximum prediction probability among those probabilities, and determine the candidate second position corresponding to the maximum prediction probability as the second position of the predicted cell.
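The argmax selection of steps S501-S502 can be sketched as follows; the helper name and the probability values are illustrative only, not part of the disclosure:

```python
import numpy as np

def pick_second_position(probs):
    """Implementation 1: choose the candidate second position with the
    highest predicted probability. probs[t] is the probability that the
    cell's second position (e.g. its starting row) is row t+1."""
    return int(np.argmax(probs)) + 1  # convert 0-based index to a row number

# Example: T = 4 candidate rows; row 2 has the largest probability.
probs = [0.1, 0.6, 0.2, 0.1]
print(pick_second_position(probs))  # 2
```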
Implementation 2. As shown in FIG. 6, obtaining the second position of any predicted cell according to the fusion node feature of that predicted cell in step S104 may include steps S601-S604.
S601: for any predicted cell, establish a target vector, the target vector including n dimensions, where n is the number of candidate second positions of the predicted cell.
Taking the second position being the starting row of the predicted cell as an example, if the table has T rows and the candidate second positions include rows 1, 2, ..., T, the target vector includes T dimensions.
S602: based on the fusion node feature of the predicted cell, obtain the prediction probability of each vector dimension of the target vector taking the value 0 or 1.
Continuing the example in which the second position is the starting row of the predicted cell, if the table has T rows, the candidate second positions include rows 1, 2, ..., T, and the target vector includes T dimensions, then based on the fusion node feature of the predicted cell, the prediction probabilities of dimensions 1, 2, ..., T of the target vector taking the value 0 or 1 can be obtained.
S603: obtain the maximum prediction probability between the prediction probabilities of a vector dimension taking the value 0 and taking the value 1, and determine the value corresponding to the maximum prediction probability as the target value of that vector dimension.
In the embodiments of the present disclosure, the prediction probabilities of a vector dimension taking the value 0 or 1 may differ. A larger probability of the value 0 indicates that the dimension is more likely to take the value 0; conversely, a larger probability of the value 1 indicates that the dimension is more likely to take the value 1. Therefore, the maximum prediction probability can be obtained between the two probabilities, and the value corresponding to the maximum prediction probability is determined as the target value of the vector dimension.
Continuing the example in which the second position is the starting row of the predicted cell, if the table has T rows, the candidate second positions include rows 1, 2, ..., T, the target vector includes T dimensions, and the prediction probabilities of the m-th vector dimension of the target vector taking the value 0 and taking the value 1 are P_m^0 and P_m^1 respectively, then if the maximum of P_m^0 and P_m^1 is P_m^1, the target value of the m-th vector dimension of the target vector is 1, where 1 ≤ m ≤ T.
S604: obtain the second position of the predicted cell based on the sum of the target values of the vector dimensions.
In the embodiments of the present disclosure, the sum of the target values of the vector dimensions of the target vector has a correspondence with the second position; this correspondence can therefore be queried based on the sum of the target values to determine the corresponding second position. It should be noted that the above correspondence may be set according to the actual situation and is not limited here.
In one implementation, for the predicted cell numbered i, the number of each candidate second position can be converted into a candidate vector by the following formula:

v_i^t = 1 if t ≤ r_i, and v_i^t = 0 if t > r_i,

where the candidate vector includes n dimensions, n is the number of candidate second positions, v_i^t is the value of the t-th vector dimension of the candidate vector, r_i is the number of the candidate second position, 0 ≤ r_i ≤ n-1, and 1 ≤ t ≤ n.
Continuing the example in which the second position is the starting row of the predicted cell, if the table has 3 rows and the candidate second positions include rows 1, 2, and 3, the numbers of the candidate second positions are 0, 1, and 2, corresponding to rows 1, 2, and 3 respectively. According to the above formula, the candidate position numbers 0, 1, and 2 are converted into the candidate vectors (0, 0, 0), (1, 0, 0), and (1, 1, 0).
In this case, the number of the second position can be determined based on the sum of the target values over all vector dimensions of the target vector plus 1. If the sum of the target values over all vector dimensions of the target vector is 2, the number of the starting row of the predicted cell is determined to be 3, that is, the starting row of the predicted cell is row 3.
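The number-to-vector conversion and the sum-based decoding of Implementation 2 can be sketched as follows (the function names are illustrative, and the "+1" decoding assumes the correspondence used in the 3-row example above):

```python
def number_to_candidate_vector(r, n):
    """Encode the candidate-position number r (0-based) as an n-dimensional
    0/1 vector whose first r dimensions are 1: v_t = 1 if t <= r, else 0."""
    return [1 if t <= r else 0 for t in range(1, n + 1)]

def decode_second_position(target_vector):
    """Decode: second-position number = sum of the target values + 1,
    per the correspondence illustrated in the example above."""
    return sum(target_vector) + 1

print(number_to_candidate_vector(2, 3))   # [1, 1, 0]
print(decode_second_position([1, 1, 0]))  # 3
```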
Thus, for any predicted cell this method can establish a target vector, determine the value of each vector dimension of the target vector based on the fusion node feature of the predicted cell, and obtain the second position of the predicted cell according to the sum of the target values of the vector dimensions, so that the obtained second position is more accurate.
It should be noted that the method for obtaining the second position in the embodiments of the present disclosure is applicable to any type of second position. In one implementation, it is applicable to determining the number of the starting row, the number of the ending row, the number of the starting column, and the number of the ending column of the predicted cell.
On the basis of any of the above embodiments, obtaining the first position of the predicted cells in the table image in step S101 may include: extracting the visual semantic feature of each pixel from the table image; obtaining the recognition probability of each pixel in each category based on its visual semantic feature; obtaining the maximum recognition probability among the recognition probabilities of a pixel in the categories, and determining the category corresponding to the maximum recognition probability as the target category of that pixel; identifying connected domains formed by pixels whose target category is "cell"; determining the minimum bounding rectangle of each connected domain as the detection box of a predicted cell; and obtaining the first position of the predicted cell based on the detection box.
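The connected-domain step above can be sketched as follows, assuming the per-pixel target categories have already been predicted; the integer encoding (0 = background, 1 = cell, 2 = border line) is an assumption for illustration:

```python
from collections import deque

def cell_boxes(label_map):
    """Find 4-connected components of 'cell' pixels (label 1) in a
    per-pixel category map and return the minimum bounding rectangle of
    each component as (x_min, y_min, x_max, y_max)."""
    h, w = len(label_map), len(label_map[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if label_map[y][x] == 1 and not seen[y][x]:
                # BFS over one connected domain of cell pixels.
                q = deque([(y, x)])
                seen[y][x] = True
                ys, xs = [], []
                while q:
                    cy, cx = q.popleft()
                    ys.append(cy)
                    xs.append(cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and label_map[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

# Two cell regions separated by background columns.
grid = [[1, 1, 0, 0, 1, 1],
        [1, 1, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0]]
print(cell_boxes(grid))  # [(0, 0, 1, 1), (4, 0, 5, 1)]
```

In practice the connected-component search would run on the classifier's output map; a library routine such as a labeling function from an image-processing package could replace the hand-rolled BFS.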
The categories include, but are not limited to, background, cell, and border line.
Obtaining the recognition probability of each pixel in each category based on the visual semantic features may include inputting the visual semantic feature of a pixel into a classification algorithm, which performs category prediction according to the visual semantic feature and generates the recognition probability of the pixel in each category. The classification algorithm may be set according to the actual situation and is not limited here.
It should be noted that, for details of obtaining the first position of a predicted cell based on its detection box, reference may be made to the above embodiments, which are not repeated here.
Corresponding to the cell position detection methods provided by the embodiments of FIG. 1 to FIG. 6 above, the present disclosure further provides a cell position detection model. The input of the detection model is a table image, and the output is the first position and the second position of each predicted cell in the table image.
As shown in FIG. 7, the detection model includes a visual semantic feature extraction layer, a first classification layer, a node feature extraction layer, a graph reconstruction network layer, and a second classification layer.
The visual semantic feature extraction layer is used to extract the visual semantic feature of each pixel from the table image.
The first classification layer is used to obtain the recognition probability of each pixel in each category based on the visual semantic features, determine the target category of each pixel according to the recognition probabilities, identify connected domains formed by pixels whose target category is "cell", determine the minimum bounding rectangle of each connected domain as the detection box of a predicted cell, and obtain the first position of the predicted cell based on the detection box.
The node feature extraction layer is used to obtain the node feature of any predicted cell according to the first position of that predicted cell.
The graph reconstruction network layer is used to fuse the node features with the adjacency matrix to generate the fusion node feature of each predicted cell.
The second classification layer is used to obtain the second position of any predicted cell according to the fusion node feature of that predicted cell.
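The fusion performed by the graph reconstruction network layer can be sketched as a single graph-convolution step; the ReLU activation and the row normalization with self-loops are assumptions for illustration, and the actual network may use a different propagation rule or more layers:

```python
import numpy as np

def gcn_fuse(node_feats, adj, weight):
    """One graph-convolution step fusing node features with the adjacency
    matrix: H' = ReLU(A_hat @ H @ W), where A_hat is the row-normalized
    adjacency with self-loops added."""
    a = adj + np.eye(adj.shape[0])            # add self-loops
    a_hat = a / a.sum(axis=1, keepdims=True)  # row-normalize
    return np.maximum(a_hat @ node_feats @ weight, 0.0)

# 3 predicted cells (nodes) with 4-dim node features, projected to 2 dims.
rng = np.random.default_rng(0)
h = rng.normal(size=(3, 4))
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
w = rng.normal(size=(4, 2))
fused = gcn_fuse(h, adj, w)
print(fused.shape)  # (3, 2)
```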
Corresponding to the cell position detection methods provided by the embodiments of FIG. 1 to FIG. 6 above, the present disclosure further provides a cell position detection apparatus. Since the apparatus provided by the embodiments of the present disclosure corresponds to the methods provided by the embodiments of FIG. 1 to FIG. 6, the implementations of the detection method are also applicable to the detection apparatus and are not described in detail in the embodiments of the present disclosure.
FIG. 8 is a schematic structural diagram of a cell position detection apparatus according to an embodiment of the present disclosure.
As shown in FIG. 8, the cell position detection apparatus 100 of the embodiment of the present disclosure may include a first acquisition module 110, a second acquisition module 120, a third acquisition module 130, and a fourth acquisition module 140.
The first acquisition module 110 is configured to acquire the first position of each predicted cell in a table image, where the first position represents the position, in the table image, of the region occupied by the predicted cell.
The second acquisition module 120 is configured to obtain an adjacency matrix of the table image according to the first positions, where each predicted cell in the table image is a node and the adjacency matrix represents the positional relationships between the predicted cells.
The third acquisition module 130 is configured to obtain the fusion node feature of any predicted cell according to the first position of that predicted cell and the adjacency matrix.
The fourth acquisition module 140 is configured to obtain the second position of any predicted cell according to the fusion node feature of that predicted cell, where the second position represents the row and/or column to which the predicted cell belongs.
In an embodiment of the present disclosure, the first position includes at least one of the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell.
In an embodiment of the present disclosure, the second acquisition module 120 is further configured to determine the values of corresponding elements in the adjacency matrix based on the first positions and the numbers of the predicted cells.
In an embodiment of the present disclosure, the second acquisition module 120 is further configured to: acquire the number n of predicted cells and consecutively number each predicted cell from 1 to n, where n is an integer greater than 1; extract, from the first positions, the abscissas and ordinates of the center points of the predicted cells numbered i and j, where 1 ≤ i ≤ n and 1 ≤ j ≤ n; acquire the width and height of the table image, as well as an adjustment parameter; acquire a first ratio of the difference between the abscissas of the center points of the predicted cells numbered i and j to the width, and determine the value of the row dimension of the element at row i, column j of the adjacency matrix based on the product of the first ratio and the adjustment parameter; and acquire a second ratio of the difference between the ordinates of the center points of the predicted cells numbered i and j to the height, and determine the value of the column dimension of the element at row i, column j of the adjacency matrix based on the product of the second ratio and the adjustment parameter.
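The adjacency-matrix construction above can be sketched as follows. The signed center-point difference and the value of the adjustment parameter are assumptions (the text does not state whether the difference is signed), and cell numbering is 0-based here for convenience:

```python
def build_adjacency(centers, img_w, img_h, adjust=1.0):
    """Two-channel adjacency matrix: A[i][j] holds
    (row-dim, col-dim) = (adjust * (x_i - x_j) / img_w,
                          adjust * (y_i - y_j) / img_h),
    where centers[i] = (x, y) is the center point of cell i."""
    n = len(centers)
    return [[(adjust * (centers[i][0] - centers[j][0]) / img_w,
              adjust * (centers[i][1] - centers[j][1]) / img_h)
             for j in range(n)] for i in range(n)]

centers = [(10, 20), (30, 20)]
A = build_adjacency(centers, img_w=100, img_h=50)
print(A[0][1])  # (-0.2, 0.0)
```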
In an embodiment of the present disclosure, the third acquisition module 130 includes: an acquisition unit, configured to obtain the node feature of any predicted cell according to the first position of that predicted cell; and a fusion unit, configured to input the node features and the adjacency matrix into a graph convolutional network (GCN), which fuses the node features with the adjacency matrix to generate the fusion node feature of each predicted cell.
In an embodiment of the present disclosure, the acquisition unit includes: a mapping subunit, configured to linearly map the first position of any predicted cell to obtain the spatial feature of that predicted cell; an extraction subunit, configured to extract the visual semantic feature of the predicted cell from the table image based on its first position; and a splicing subunit, configured to splice the spatial feature and the visual semantic feature of the predicted cell to obtain the node feature of the predicted cell.
In an embodiment of the present disclosure, the extraction subunit is further configured to: determine, based on the first position of any predicted cell, the target pixels contained in that predicted cell from the pixels of the table image; and extract the visual semantic features of the target pixels from the table image as the visual semantic feature of the predicted cell.
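The node-feature construction described by the mapping and splicing subunits can be sketched as a linear projection of the first position followed by concatenation with the cell's visual semantic feature; the projection matrix and all concrete values below are placeholders, not learned parameters from the disclosure:

```python
import numpy as np

def node_feature(first_pos, visual_feat, proj):
    """Node feature of one predicted cell: linearly map its first position
    (cx, cy, w, h) to a spatial feature, then splice it with the visual
    semantic feature pooled from the cell's target pixels."""
    spatial = np.asarray(first_pos) @ proj         # linear mapping
    return np.concatenate([spatial, visual_feat])  # splice the two parts

pos = [0.5, 0.4, 0.2, 0.1]     # normalized cx, cy, width, height
proj = np.full((4, 3), 0.25)   # placeholder projection weights
vis = np.array([1.0, 2.0])     # placeholder pooled visual feature
feat = node_feature(pos, vis, proj)
print(feat.shape)  # (5,)
```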
In an embodiment of the present disclosure, the fourth acquisition module 140 is further configured to: obtain, based on the fusion node feature of any predicted cell, the prediction probability of that predicted cell at each candidate second position; obtain the maximum prediction probability among the prediction probabilities of the predicted cell at the candidate second positions; and determine the candidate second position corresponding to the maximum prediction probability as the second position of the predicted cell.
In an embodiment of the present disclosure, the fourth acquisition module 140 is further configured to: establish, for any predicted cell, a target vector including n dimensions, where n is the number of candidate second positions of the predicted cell; obtain, based on the fusion node feature of the predicted cell, the prediction probability of each vector dimension of the target vector taking the value 0 or 1; obtain the maximum prediction probability between the prediction probabilities of a vector dimension taking the value 0 and taking the value 1, and determine the value corresponding to the maximum prediction probability as the target value of that vector dimension; and obtain the second position of the predicted cell based on the sum of the target values of the vector dimensions.
In an embodiment of the present disclosure, the first acquisition module 110 is further configured to extract the detection box of each predicted cell from the table image, and obtain the first position of the predicted cell based on the detection box.
In an embodiment of the present disclosure, the second position includes at least one of the number of the starting row, the number of the ending row, the number of the starting column, and the number of the ending column of the predicted cell.
The cell position detection apparatus of the embodiments of the present disclosure can take the predicted cells as nodes, obtain the adjacency matrix based on the first positions of the predicted cells, and then obtain the fusion node feature of each predicted cell according to the first positions and the adjacency matrix, so that the fusion node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, giving a better representation. The second position of the predicted cell is then obtained according to the fusion node feature, so that the first position and the second position of a cell can be obtained at the same time, and the obtained cell position is more comprehensive and more robust.
To implement the above embodiments, as shown in FIG. 9, an embodiment of the present disclosure further provides an electronic device 200, including a memory 210, a processor 220, and a computer program stored in the memory 210 and executable on the processor 220. When the processor 220 executes the program, the cell position detection method proposed in the foregoing embodiments of the present disclosure is implemented.
By executing, via the processor, the computer program stored in the memory, the electronic device of the embodiments of the present disclosure can take the predicted cells as nodes, obtain the adjacency matrix based on the first positions of the predicted cells, and then obtain the fusion node feature of each predicted cell according to the first positions and the adjacency matrix, so that the fusion node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, giving a better representation. The second position of the predicted cell is then obtained according to the fusion node feature, so that the first position and the second position of a cell can be obtained at the same time, and the obtained cell position is more comprehensive and more robust.
To implement the above embodiments, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the cell position detection method proposed in the foregoing embodiments of the present disclosure is implemented.
In an embodiment of the present disclosure, the computer-readable storage medium is a non-transitory computer-readable storage medium.
By storing a computer program that is executed by a processor, the computer-readable storage medium of the embodiments of the present disclosure can take the predicted cells as nodes, obtain the adjacency matrix based on the first positions of the predicted cells, and then obtain the fusion node feature of each predicted cell according to the first positions and the adjacency matrix, so that the fusion node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, giving a better representation. The second position is then obtained according to the fusion node feature, so that the first position and the second position of a cell can be obtained at the same time, and the obtained cell position is more comprehensive and more robust.
To implement the above embodiments, an embodiment of the present disclosure further provides a computer program product including computer program code. When the computer program code runs on a computer, the cell position detection method proposed in the foregoing embodiments of the present disclosure is implemented.
When the computer program code of the computer program product of the embodiments of the present disclosure runs on a computer, the predicted cells can be taken as nodes, the adjacency matrix can be obtained based on the first positions of the predicted cells, and the fusion node feature of each predicted cell can then be obtained according to the first positions and the adjacency matrix, so that the fusion node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, giving a better representation. The second position is obtained according to the fusion node feature, so that the first position and the second position of a cell can be obtained at the same time, and the obtained cell position is more comprehensive and more robust.
To implement the above embodiments, an embodiment of the present disclosure further provides a computer program including computer program code. When the computer program code runs on a computer, the computer is caused to execute the cell position detection method proposed in the foregoing embodiments of the present disclosure.
When the computer program code of the computer program of the embodiments of the present disclosure runs on a computer, the computer can take the predicted cells as nodes, obtain the adjacency matrix based on the first positions of the predicted cells, and then obtain the fusion node feature of each predicted cell according to the first positions and the adjacency matrix, so that the fusion node feature matches both the first position of the predicted cell and the positional relationships between predicted cells, giving a better representation. The second position is obtained according to the fusion node feature, so that the first position and the second position of a cell can be obtained at the same time, and the obtained cell position is more comprehensive and more robust.
It should be noted that the foregoing explanations of the embodiments of the cell position detection method are also applicable to the computer device, the computer-readable storage medium, the computer program product, and the computer program in the above embodiments, and are not repeated here.
In the description of this specification, descriptions with reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the described specific features, structures, materials, or characteristics may be combined in a suitable manner in any one or more embodiments or examples. In addition, where there is no mutual contradiction, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality of" means at least two, for example two or three, unless otherwise specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code including one or more executable instructions for implementing the steps of a custom logical function or process. The scope of the preferred embodiments of the present disclosure includes additional implementations in which functions may be executed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, which should be understood by those skilled in the art to which the embodiments of the present disclosure belong.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered a sequenced listing of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in conjunction with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (an electronic device) with one or more wires, a portable computer diskette (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM).
In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and then stored in a computer memory.
It should be understood that various parts of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one of the following techniques known in the art, or a combination thereof: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit with suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like. Although embodiments of the present disclosure have been shown and described above, it should be understood that the above embodiments are exemplary and shall not be construed as limiting the present disclosure; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.
All the embodiments of the present disclosure may be implemented independently or in combination with other embodiments, all of which fall within the scope of protection claimed by the present disclosure.

Claims (16)

  1. A method for detecting a cell position, comprising:
    acquiring a first position of each predicted cell in a table image, wherein the first position characterizes the position, in the table image, of the region occupied by the predicted cell;
    obtaining an adjacency matrix of the table image according to the first positions, wherein each predicted cell in the table image is a node and the adjacency matrix represents the positional relationships between the predicted cells;
    obtaining a fused node feature of any one of the predicted cells according to the first position of the predicted cell and the adjacency matrix; and
    obtaining a second position of the predicted cell according to its fused node feature, wherein the second position characterizes the row and/or column to which the predicted cell belongs.
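The four steps of claim 1 form a pipeline: detected cell boxes → adjacency matrix → graph-fused node features → logical row/column positions. A minimal end-to-end sketch (the difference-based adjacency, the mean aggregation standing in for the learned fusion, and the rank-based readout are illustrative assumptions, not the claimed model):

```python
import numpy as np

def detect_logical_positions(boxes, img_w, img_h):
    """End-to-end sketch of the four steps of claim 1.

    boxes: (n, 4) array of first positions (cx, cy, w, h), one per
    predicted cell. Returns a crude logical column index per cell as a
    stand-in for the learned second-position prediction.
    """
    boxes = np.asarray(boxes, dtype=float)
    cx = boxes[:, 0]
    # Step 2: adjacency matrix from normalized pairwise center offsets.
    adj = (cx[:, None] - cx[None, :]) / img_w
    # Step 3: "fuse" each node's own feature with its neighborhood
    # (mean aggregation standing in for the GCN).
    fused = cx / img_w + adj.mean(axis=1)
    # Step 4: read a logical column index (a "second position") off the
    # fused features -- here simply the cell's rank from left to right.
    return np.argsort(np.argsort(fused))
```

With three cells at x-centers 10, 50, and 90 in a 100-pixel-wide image, the sketch assigns column indices 0, 1, 2 in left-to-right order.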
  2. The method according to claim 1, wherein the first position comprises at least one of the two-dimensional coordinates of the center point of the predicted cell, the width of the predicted cell, and the height of the predicted cell.
  3. The method according to claim 1 or 2, wherein obtaining the adjacency matrix of the table image according to the first positions comprises:
    determining the values of the corresponding elements in the adjacency matrix based on the first positions and the numbers of the predicted cells.
  4. The method according to claim 3, wherein determining the values of the corresponding elements in the adjacency matrix based on the first positions and the numbers of the predicted cells comprises:
    acquiring the number n of predicted cells and numbering the predicted cells consecutively from 1 to n, where n is an integer greater than 1;
    extracting, from the first positions, the abscissas and ordinates of the center points of the predicted cells numbered i and j, where 1 ≤ i ≤ n and 1 ≤ j ≤ n;
    acquiring the width and height of the table image, as well as an adjustment parameter;
    acquiring a first ratio of the difference between the abscissas of the center points of the predicted cells numbered i and j to the width, and determining the value of the row dimension of the element in row i, column j of the adjacency matrix based on the product of the first ratio and the adjustment parameter; and
    acquiring a second ratio of the difference between the ordinates of the center points of the predicted cells numbered i and j to the height, and determining the value of the column dimension of the element in row i, column j of the adjacency matrix based on the product of the second ratio and the adjustment parameter.
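Claim 4 fully specifies the adjacency computation: the element at row i, column j holds, per dimension, the adjustment parameter times the normalized center-point offset between cells i and j. A sketch, under the assumption that the row and column dimensions are stored as two channels of one tensor:

```python
import numpy as np

def build_adjacency(centers, img_w, img_h, adjust=1.0):
    """Adjacency tensor of claim 4.

    centers: (n, 2) array of predicted-cell center points (x, y).
    Element [i, j] holds adjust * (x_i - x_j) / img_w in its row
    dimension and adjust * (y_i - y_j) / img_h in its column dimension.
    """
    centers = np.asarray(centers, dtype=float)
    dx = centers[:, 0][:, None] - centers[:, 0][None, :]  # pairwise x offsets
    dy = centers[:, 1][:, None] - centers[:, 1][None, :]  # pairwise y offsets
    row_dim = adjust * dx / img_w
    col_dim = adjust * dy / img_h
    return np.stack([row_dim, col_dim], axis=-1)  # shape (n, n, 2)
```

The broadcasting trick computes all pairwise differences at once; the diagonal is zero because each cell has no offset from itself.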
  5. The method according to any one of claims 1-4, wherein obtaining the fused node feature of any one of the predicted cells according to the first position of the predicted cell and the adjacency matrix comprises:
    obtaining a node feature of the predicted cell according to its first position; and
    inputting the node features and the adjacency matrix into a graph convolutional network (GCN), which fuses the node features with the adjacency matrix to generate the fused node feature of the predicted cell.
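Claim 5 names a graph convolutional network without fixing the variant. One layer of the standard GCN propagation H' = ReLU(Â X W) is sketched below with symmetric degree normalization; the normalization scheme and the ReLU activation are assumptions, and the claimed network may differ (in particular, the two-channel signed adjacency of claim 4 would need adapting before this normalization applies):

```python
import numpy as np

def gcn_layer(node_feats, adj, weight):
    """One graph-convolution step fusing node features with an adjacency matrix.

    node_feats: (n, d) node features; adj: (n, n) adjacency; weight: (d, d') weights.
    """
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))      # D^{-1/2}
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization
    return np.maximum(a_norm @ node_feats @ weight, 0.0)  # ReLU
```

Stacking two or three such layers lets each cell's feature absorb information from increasingly distant neighbors in the table graph.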
  6. The method according to claim 5, wherein obtaining the node feature of the predicted cell according to its first position comprises:
    performing a linear mapping on the first position of the predicted cell to obtain a spatial feature of the predicted cell;
    extracting, based on the first position of the predicted cell, a visual-semantic feature of the predicted cell from the table image; and
    concatenating the spatial feature and the visual-semantic feature of the predicted cell to obtain its node feature.
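The node-feature construction of claim 6 can be sketched as a linear projection of the box geometry concatenated with a visual feature vector. The projection size and the source of `visual_feat` (e.g. pooling over the cell's pixels, as in claim 7) are assumptions:

```python
import numpy as np

def node_feature(first_pos, visual_feat, w, b):
    """Claim 6 sketch: linearly map the cell's first position
    (cx, cy, width, height) to a spatial feature, then concatenate it
    with the cell's visual-semantic feature to form the node feature."""
    spatial = np.asarray(first_pos) @ w + b        # linear mapping of the box
    return np.concatenate([spatial, visual_feat])  # spatial ++ visual-semantic
```

In practice `w` and `b` would be learned jointly with the GCN, and `visual_feat` would come from a CNN backbone over the table image.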
  7. The method according to claim 6, wherein extracting the visual-semantic feature of the predicted cell from the table image based on its first position comprises:
    determining, based on the first position of the predicted cell, the target pixels contained in the predicted cell from among the pixels contained in the table image; and
    extracting the visual-semantic features of the target pixels from the table image as the visual-semantic feature of the predicted cell.
  8. The method according to any one of claims 1-7, wherein obtaining the second position of the predicted cell according to its fused node feature comprises:
    obtaining, based on the fused node feature of the predicted cell, a prediction probability of the predicted cell for each candidate second position; and
    taking the maximum of the prediction probabilities over the candidate second positions, and determining the candidate second position corresponding to the maximum prediction probability as the second position of the predicted cell.
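Claim 8 describes ordinary single-label classification over the candidate second positions: score each candidate, normalize, and take the argmax. A sketch (the softmax normalization is an assumption; the claim only requires a probability per candidate):

```python
import numpy as np

def second_position_argmax(logits):
    """Claim 8 sketch: softmax the per-candidate scores into prediction
    probabilities and pick the candidate second position (e.g. a row or
    column index) with the maximum probability."""
    probs = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return int(np.argmax(probs)), float(probs.max())
```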
  9. The method according to any one of claims 1-8, wherein obtaining the second position of the predicted cell according to its fused node feature comprises:
    establishing, for the predicted cell, a target vector with n dimensions, where n is the number of candidate second positions of the predicted cell;
    obtaining, based on the fused node feature of the predicted cell, a prediction probability that each vector dimension of the target vector takes the value 0 or 1;
    taking, for each vector dimension, the maximum of the prediction probabilities over the values 0 and 1, and determining the value corresponding to the maximum prediction probability as the target value of that vector dimension; and
    obtaining the second position of the predicted cell based on the sum of the target values of the vector dimensions.
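Claim 9 instead predicts a binary target vector and derives the second position from the sum of its entries; per dimension, taking the more probable of the values 0 and 1 reduces to thresholding the probability of 1 at 0.5. A sketch (reading the sum directly as the second position is an assumption about how the sum maps to a row/column index):

```python
import numpy as np

def second_position_count(p_one):
    """Claim 9 sketch: p_one[k] is the predicted probability that dimension
    k of the n-dimensional target vector equals 1. Each dimension takes its
    more probable value, and the second position is derived from the sum of
    the resulting target values."""
    target = (np.asarray(p_one) >= 0.5).astype(int)  # per-dimension argmax over {0, 1}
    return int(target.sum())
```

Compared with the argmax formulation of claim 8, this counting scheme tolerates tables whose number of rows or columns exceeds the number of classes seen at training time.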
  10. The method according to any one of claims 1-9, wherein acquiring the first position of each predicted cell in the table image comprises:
    extracting a detection box of each predicted cell from the table image, and acquiring the first position of the predicted cell based on the detection box.
  11. The method according to any one of claims 1-10, wherein the second position comprises at least one of the number of the starting row, the number of the ending row, the number of the starting column, and the number of the ending column of the predicted cell.
  12. An apparatus for detecting a cell position, comprising:
    a first acquisition module configured to acquire a first position of each predicted cell in a table image, wherein the first position characterizes the position, in the table image, of the region occupied by the predicted cell;
    a second acquisition module configured to obtain an adjacency matrix of the table image according to the first positions, wherein each predicted cell in the table image is a node and the adjacency matrix represents the positional relationships between the predicted cells;
    a third acquisition module configured to obtain a fused node feature of any one of the predicted cells according to the first position of the predicted cell and the adjacency matrix; and
    a fourth acquisition module configured to obtain a second position of the predicted cell according to its fused node feature, wherein the second position characterizes the row and/or column to which the predicted cell belongs.
  13. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the method for detecting a cell position according to any one of claims 1-11.
  14. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for detecting a cell position according to any one of claims 1-11.
  15. A computer program product comprising computer program code which, when run on a computer, implements the method according to any one of claims 1-11.
  16. A computer program comprising computer program code which, when run on a computer, causes the computer to perform the method according to any one of claims 1-11.
PCT/CN2022/092571 2021-07-08 2022-05-12 Cell position detection method and apparatus, and electronic device WO2023279847A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110772902.7 2021-07-08
CN202110772902.7A CN113378789B (en) 2021-07-08 2021-07-08 Cell position detection method and device and electronic equipment

Publications (1)

Publication Number Publication Date
WO2023279847A1 (en)

Family

ID=77581423

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/092571 WO2023279847A1 (en) 2021-07-08 2022-05-12 Cell position detection method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN113378789B (en)
WO (1) WO2023279847A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071771A (en) * 2023-03-24 2023-05-05 南京燧坤智能科技有限公司 Table reconstruction method and device, nonvolatile storage medium and electronic equipment
CN116503888A (en) * 2023-06-29 2023-07-28 杭州同花顺数据开发有限公司 Method, system and storage medium for extracting form from image

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN113378789B (en) * 2021-07-08 2023-09-26 京东科技信息技术有限公司 Cell position detection method and device and electronic equipment
CN114639107B (en) * 2022-04-21 2023-03-24 北京百度网讯科技有限公司 Table image processing method, apparatus and storage medium

Citations (7)

Publication number Priority date Publication date Assignee Title
US20120260152A1 (en) * 2011-03-01 2012-10-11 Ubiquitous Entertainment Inc. Spreadsheet control program, spreadsheet control apparatus and spreadsheet control method
CN110705213A (en) * 2019-08-23 2020-01-17 平安科技(深圳)有限公司 PDF (Portable document Format) table extraction method and device, terminal and computer readable storage medium
CN110751038A (en) * 2019-09-17 2020-02-04 北京理工大学 PDF table structure identification method based on graph attention machine mechanism
CN111492370A (en) * 2020-03-19 2020-08-04 香港应用科技研究院有限公司 Device and method for recognizing text images of a structured layout
CN111639637A (en) * 2020-05-29 2020-09-08 北京百度网讯科技有限公司 Table identification method and device, electronic equipment and storage medium
CN112200117A (en) * 2020-10-22 2021-01-08 长城计算机软件与系统有限公司 Form identification method and device
CN113378789A (en) * 2021-07-08 2021-09-10 京东数科海益信息科技有限公司 Cell position detection method and device and electronic equipment

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN109961008A (en) * 2019-02-13 2019-07-02 平安科技(深圳)有限公司 Form analysis method, medium and computer equipment based on text location identification
CN109934226A (en) * 2019-03-13 2019-06-25 厦门美图之家科技有限公司 Key area determines method, apparatus and computer readable storage medium
CN109948507B (en) * 2019-03-14 2021-05-07 北京百度网讯科技有限公司 Method and device for detecting table
CN112100426A (en) * 2020-09-22 2020-12-18 哈尔滨工业大学(深圳) Method and system for searching general table information based on visual and text characteristics
CN112668566A (en) * 2020-12-23 2021-04-16 深圳壹账通智能科技有限公司 Form processing method and device, electronic equipment and storage medium



Also Published As

Publication number Publication date
CN113378789B (en) 2023-09-26
CN113378789A (en) 2021-09-10


Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE