CN113688693A - Adjacent table processing method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN113688693A
Authority
CN
China
Prior art keywords
feature
cell
character
processed
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110866186.9A
Other languages
Chinese (zh)
Inventor
杨桂秀
杨洋
李锋
张琛
万化
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co Ltd
Original Assignee
Shanghai Pudong Development Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Development Bank Co Ltd filed Critical Shanghai Pudong Development Bank Co Ltd
Priority to CN202110866186.9A
Publication of CN113688693A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions

Abstract

The application relates to a method and an apparatus for processing adjacent tables, a computer device and a storage medium. The method comprises the following steps: acquiring adjacent tables to be processed; extracting the cell feature of each cell in the adjacent tables to be processed; calculating a first feature in a first direction from the extracted cell features, wherein the first direction is the merging direction of the adjacent tables to be processed; obtaining a second feature by feature-splicing the parts in a second direction corresponding to each element of the first feature; and judging, according to the second feature, whether the adjacent tables to be processed need to be merged. Because the second feature combines the cell information with the information in both the first direction and the second direction, it carries richer information, so the judgment of whether the adjacent tables to be processed need to be merged is more accurate.

Description

Adjacent table processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for processing adjacent tables, a computer device, and a storage medium.
Background
With the development of artificial intelligence technology, artificial intelligence has found wide application in fields such as natural language processing and image processing. When text is recognized, a table may be split across adjacent pages, adjacent columns or adjacent line breaks, and it must be judged during recognition whether the parts on either side of the break belong to one table and should be merged.
In the conventional technology, whether tables should be merged is judged simply from table dividing lines.
However, when no table dividing line is present, it cannot be accurately determined whether the rows of adjacent tables need to be merged.
Disclosure of Invention
In view of the above, it is necessary to provide an adjacent table processing method, apparatus, computer device and storage medium that can improve the accuracy of this judgment.
A method of neighbor table processing, the method comprising:
acquiring adjacent tables to be processed;
extracting cell features of each cell in the adjacent to-be-processed table to obtain the cell features of each cell;
calculating to obtain a first feature in a first direction according to the extracted cell feature, wherein the first direction is the merging direction of the adjacent tables to be processed;
obtaining a second feature by performing feature splicing on a part in a second direction corresponding to each element in the first feature;
and judging whether the adjacent tables to be processed need to be combined or not according to the second characteristic.
In one embodiment, the extracting cell features of each cell in the adjacent to-be-processed table includes:
coding each character in the table to be processed to obtain a coding result;
extracting character features of the coding result to obtain character features corresponding to each character;
and obtaining the cell characteristics of each cell according to the character characteristics.
In one embodiment, the encoding each character in the table to be processed to obtain an encoding result includes:
and coding each character in the table to be processed by at least one method of an automatic learning model, a Word2vec model and a pre-training language model to obtain a coding result.
In one embodiment, the extracting the character features of the encoding result to obtain the character feature corresponding to each character includes:
inputting the coding result to a feature extraction layer to obtain initial character features, wherein the feature extraction layer is realized by at least one of a pre-training language model, a convolution network and a cyclic neural network;
and processing the initial character features through a first activation function to obtain the corresponding character features.
In one embodiment, the obtaining the cell feature of each cell according to the character feature includes:
and obtaining statistic characteristics of character characteristics of the characters in each cell as the cell characteristics.
In one embodiment, the first feature and the second feature are obtained by extracting direction initial features through at least one of a pre-training language model, a convolutional network and a recurrent neural network, and processing the direction initial features through a second activation function.
In one embodiment, each element in the first feature vector comprises two parts to be merged in a second direction; the obtaining of the second feature by feature splicing of the portion in the second direction corresponding to each element in the first feature includes:
and respectively splicing two parts of each element in the first characteristic in a second direction to obtain a second characteristic, wherein the second characteristic comprises the character characteristic, the first direction information and the second direction information of the table to be processed.
In one embodiment, the determining whether the adjacent to-be-processed tables need to be merged according to the second feature includes:
and judging whether the adjacent tables to be processed need to be merged or not according to the second characteristic through a full-connection network and a third activation function.
A neighbor table processing apparatus, the neighbor table processing apparatus comprising:
the to-be-processed table acquisition module is used for acquiring adjacent to-be-processed tables;
the cell feature extraction module is used for extracting cell features of each cell in the adjacent to-be-processed table to obtain the cell features of each cell;
the first feature extraction module is used for calculating and obtaining first features in a first direction according to the extracted cell features, wherein the first direction is the merging direction of the adjacent tables to be processed;
the second feature extraction module is used for performing feature splicing on a part in a second direction corresponding to each element in the first feature to obtain a second feature;
and the judging module is used for judging whether the adjacent tables to be processed need to be combined or not according to the second characteristic.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the method in any of the above embodiments when the processor executes the computer program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method in any of the above-mentioned embodiments.
According to the adjacent table processing method and apparatus, the computer device and the storage medium, the cell feature of each cell is first extracted, the first feature in the first direction is then calculated from the cell features, and the second feature in the second direction is calculated from the first feature in the first direction. The second feature therefore includes the cell information, the information in the first direction and the information in the second direction, carries richer information, and makes the judgment of whether the adjacent tables to be processed need to be merged more accurate.
Drawings
FIG. 1 is a flow diagram illustrating a method for processing neighboring tables according to one embodiment;
FIG. 2 is a diagram of a cross-page to-be-processed table in one embodiment;
FIG. 3 is a diagram of a cross-column to-be-processed table in one embodiment;
FIG. 4 is a diagram of a cross-page to-be-processed table in another embodiment;
FIG. 5 is a system architecture diagram of a neighbor table processing method in one embodiment;
FIG. 6 is a model structure diagram of an adjacent table processing method in one embodiment;
FIG. 7 is a block diagram showing the structure of a neighboring table processing apparatus according to one embodiment;
FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In an embodiment, as shown in fig. 1, a method for processing a neighboring table is provided, and this embodiment is illustrated by applying the method to a terminal, it is to be understood that the method may also be applied to a server, and may also be applied to a system including the terminal and the server, and is implemented by interaction between the terminal and the server. In this embodiment, the method includes the steps of:
s102: and acquiring adjacent tables to be processed.
Specifically, the adjacent tables to be processed are tables split across adjacent pages, adjacent columns or adjacent line breaks, as shown in fig. 2 to fig. 4, where fig. 2 is a schematic diagram of a cross-page to-be-processed table in one embodiment, fig. 3 is a schematic diagram of a cross-column to-be-processed table in one embodiment, and fig. 4 is a schematic diagram of a cross-page to-be-processed table in another embodiment.
In fig. 2, the first row represents the last row of cells of the previous page, and the second row represents the first row of cells of the next page. In fig. 3, the first row represents the last row of cells of the previous column, and the second row represents the first row of cells of the subsequent column. In fig. 4, the first column represents the last column of cells of the previous page, and the second column represents the first column of cells of the subsequent page.
In practical application, the adjacent tables to be processed are the input, and the output indicates whether the corresponding cells need to be merged. A row or a column may be appended to the output result; for example, in fig. 2 and fig. 3 a row is appended in which each position carries a flag indicating whether the corresponding cells need to be merged, e.g. 1 when they need to be merged and 0 otherwise.
S104: and extracting the cell features of each cell in the adjacent to-be-processed table to obtain the cell features of each cell.
Specifically, the cell features are calculated from the character features of all characters in each cell, so that the obtained cell feature can characterize all characters in the cell. The physical meaning of the cell feature covers, for example, the part of speech of each character in the cell and the probability of characters forming words.
The terminal can process the cells in parallel to obtain the cell feature of each cell. Optionally, the cell features may be extracted in an artificial intelligence manner; for example, the terminal may extract the character features of all characters in a cell with a feature extractor and compute the cell feature from statistics of those character features, for example by summation, averaging or taking the maximum, which is not limited herein. Summation means taking the sum of all character feature vectors of the cell as the cell feature vector; averaging means taking the mean of all character feature vectors of the cell as the cell feature vector; taking the maximum means taking, for each dimension, the maximum value of that dimension over all character feature vectors in the cell as the value of that dimension of the cell feature vector.
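Purely as an illustrative sketch (PyTorch and all shapes below are assumptions of this example, not part of the original disclosure), the three statistics can be computed as follows:

    import torch

    # character features of one cell: (num_chars, feat_dim); values assumed
    char_feats = torch.randn(7, 100)

    cell_sum = char_feats.sum(dim=0)         # summation of all character vectors
    cell_avg = char_feats.mean(dim=0)        # average of all character vectors
    cell_max = char_feats.max(dim=0).values  # per-dimension maximum

    # each statistic is a (feat_dim,) vector that can serve as the cell feature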
S106: and calculating to obtain a first feature in a first direction according to the extracted cell features, wherein the first direction is the merging direction of the adjacent to-be-processed tables.
Specifically, the first direction refers to a merging direction of adjacent tables to be processed, and taking fig. 2 and fig. 3 as an example, the merging direction is vertical, so the first direction is vertical, and the first feature in the first direction is a vertical first feature, that is, a column feature.
The terminal connects or combines the cell features of the corresponding columns to obtain the column features. Specifically, the terminal reads the cell features in the second direction, i.e. the row direction in fig. 2 and 3, so when calculating the first feature it only needs to input the two rows of cell features, i.e. the two cell feature vectors in the second direction, into the first feature extraction layer and obtain the first feature through an activation function. Any feature extraction layer may be selected as the first feature extraction layer; preferably, at least one of a pre-trained language model, a convolutional network and a recurrent neural network is adopted, where the pre-trained language model includes transformer, bert and all other pre-trained language models, the convolutional network may be a CNN, a multilayer CNN or the like, and the recurrent neural network may include RNN, GRU, Dynamic LSTM, bidirectional LSTM, single-layer LSTM, multilayer LSTM, bidirectional multilayer LSTM, etc.; one or more of the above models may be freely combined.
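A minimal sketch of this step, with an ordinary LSTM standing in for the Dynamic LSTM and with assumed shapes (n columns, 50-dimensional cell features); none of these choices are prescribed by the text above:

    import torch
    import torch.nn as nn

    n_cols, cell_dim = 6, 50
    row1 = torch.randn(1, n_cols, cell_dim)  # cell features of the upper row
    row2 = torch.randn(1, n_cols, cell_dim)  # cell features of the lower row

    # each column element pairs the two cells that may be merged
    cols = torch.cat([row1, row2], dim=-1)   # (1, n_cols, 2 * cell_dim)

    col_lstm = nn.LSTM(2 * cell_dim, 2 * cell_dim, batch_first=True)
    first_feature, _ = col_lstm(cols)          # one element per column
    first_feature = torch.relu(first_feature)  # activation function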
S108: and obtaining a second feature by performing feature splicing on the part in the second direction corresponding to each element in the first feature.
Specifically, the second direction is the direction complementary to the first direction: if the first direction is vertical, the second direction is horizontal, and if the first direction is horizontal, the second direction is vertical. Again taking fig. 2 as an example, the second direction is horizontal, so the second feature is a row feature.
Each element in the first feature consists of two parts, the two parts corresponding to different rows/columns in the second direction. The terminal separates the two parts in the second direction of each element in the first feature, and then splices them according to the rows/columns in the second direction to obtain the second feature.
Specifically, the terminal separates the two parts of each first-feature element and then splices the corresponding parts of all first features to obtain the second feature. Any feature extraction layer may be selected as the second feature extraction layer; preferably, at least one of a pre-trained language model, a convolutional network and a recurrent neural network is adopted, where the pre-trained language model includes transformer, bert and all other pre-trained language models, the convolutional network may be a CNN, a multilayer CNN or the like, and the recurrent neural network may include RNN, GRU, Dynamic LSTM, bidirectional LSTM, single-layer LSTM, multilayer LSTM, bidirectional multilayer LSTM, etc.; one or more of the above models may be freely combined.
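Continuing the sketch above (the shared dimension-reduction layer and all shapes are assumptions, mirroring the worked example later in this description):

    import torch
    import torch.nn as nn

    n_cols, d = 6, 100
    first_feature = torch.randn(1, n_cols, 2 * d)  # output of the previous step

    # separate the two parts of each element (one part per row to be merged)
    upper, lower = first_feature[..., :d], first_feature[..., d:]

    # reduce each part, then splice them again column by column
    reduce = nn.Sequential(nn.Linear(d, d // 2), nn.ReLU())
    spliced = torch.cat([reduce(upper), reduce(lower)], dim=-1)  # (1, n_cols, d)

    row_lstm = nn.LSTM(d, d, batch_first=True)
    second_feature, _ = row_lstm(spliced)        # second feature, per column
    second_feature = torch.relu(second_feature)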
S110: and judging whether the adjacent tables to be processed need to be combined or not according to the second characteristic.
Specifically, the second feature obtained through splicing includes the cell information, the information in the first direction and the information in the second direction, so it carries more information; the terminal inputs the second feature into a classifier to judge whether the corresponding cells in the first direction need to be merged.
The terminal inputs the second feature into the classifier and passes the classifier output through an activation function to obtain the final output, a vector in which each dimension indicates whether the two cells at the corresponding position in the first direction need to be merged, expressed as a probability or as a flag. Taking fig. 2 and fig. 3 as an example, each dimension of the output vector indicates whether the two cells of the corresponding column need to be merged: the row feature is input to the classifier Linear to obtain a 2-dimensional vector, the vector elements are mapped to (0, 1) through a Softmax activation function, and the index of the maximum of the two-dimensional vector is selected as the predicted label, with 0 meaning no merging and 1 meaning merging.
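A sketch of this classification step (the two-class Linear head and the shapes are assumptions consistent with the example above):

    import torch
    import torch.nn as nn

    n_cols, d = 6, 100
    second_feature = torch.randn(1, n_cols, d)  # spliced row features

    classifier = nn.Linear(d, 2)            # Linear classifier
    logits = classifier(second_feature)     # (1, n_cols, 2)
    probs = torch.softmax(logits, dim=-1)   # elements mapped to (0, 1)
    labels = probs.argmax(dim=-1)           # per column: 1 = merge, 0 = do not merge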
This adjacent table processing method first extracts the cell feature of each cell, then calculates the first feature in the first direction from the cell features, and calculates the second feature in the second direction from the first feature in the first direction. The second feature therefore includes the cell information, the information in the first direction and the information in the second direction, carries richer information, and makes the judgment of whether the adjacent tables to be processed need to be merged more accurate.
In one embodiment, the cell feature extraction for each cell in the adjacent to-be-processed table includes: coding each character in the table to be processed to obtain a coding result; extracting character features of the coding result to obtain character features corresponding to each character; and obtaining the cell characteristics of each cell according to the character characteristics.
Specifically, the terminal first obtains two sentences. Taking fig. 2 and fig. 3 as an example, one row of the table corresponds to one sentence; taking fig. 4 as an example, one column corresponds to one sentence. Within each sentence, a specific symbol marks the boundaries between cells, e.g. the characters of two adjacent cells are connected by the specific symbol. The terminal inputs the two sentences into a character vector encoding layer to obtain the encoding result of each character, then performs character feature extraction on the encoding results to obtain the character feature of each character, and finally calculates each cell feature from the character features of the characters in that cell.
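For illustration only, joining the cells of one row into such a sentence might look as follows; the separator symbol and the cell contents are assumptions of this sketch:

    # last row of the previous page and first row of the next page (assumed data)
    prev_row = ["Item", "Amount", "Date"]
    next_row = ["Total", "1024", "2021-07-29"]

    SEP = "|"  # the specific symbol marking cell boundaries (an assumption)
    sentence_1 = SEP.join(prev_row)  # "Item|Amount|Date"
    sentence_2 = SEP.join(next_row)  # "Total|1024|2021-07-29"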
In the following, fig. 2 and 3 are taken as examples. Assume the sentence pair to be predicted is [{C_11, C_12, C_13, ... C_1n} {C_21, C_22, C_23, ... C_2n}], where C_ij denotes the cell in row i and column j (i = 1, 2, i.e. at most two rows; j = 1, 2, ..., 100, i.e. at most 100 columns) and n <= 100; C_ij = [V_ij1, V_ij2, ... V_ijk] denotes the characters in a cell (k <= 100), e.g. V_ij1 is the first character of cell C_ij.
Each character V passes through a fully connected layer to obtain its encoding result, expressed as a word vector v, giving C_ij = [v_ij1, v_ij2, ... v_ijk].
The terminal flattens the cells of the two rows into {C_11, C_12, C_13, ... C_1n, C_21, C_22, C_23, ... C_2n}, where C_ij = [v_ij1, v_ij2, ... v_ijk]. Through the character feature extraction layer this becomes C1_ij = [h1_ij1, h1_ij2, ... h1_ijk], where h1_ijr is a 100-dimensional vector representing the character feature (in feature-vector form) of the r-th character of row i, column j. Cell feature extraction then selects a statistic over each dimension of [h1_ij1, h1_ij2, ... h1_ijk] to form a vector h1_ij as the vector of the cell in row i, column j; for example, the maximum of each dimension is selected, so this step yields a 100-dimensional feature vector, i.e. the cell feature, for each cell. Preferably, so that the dimension of the subsequent first feature extraction meets the requirement, the terminal passes the cell features through a fully connected layer and a Relu activation function to reduce their dimension, the reduced cell features being {C1_11, C1_12, C1_13, ... C1_1n, C1_21, C1_22, C1_23, ... C1_2n}, where each C1_ij has vector dimension 50.
Optionally, the encoding processing is performed on each character in the table to be processed to obtain an encoding result, where the encoding processing includes: and coding each character in the table to be processed by at least one method of an automatic learning model, a Word2vec model and a pre-training language model to obtain a coding result.
Specifically, the vector encoding layer may be selected as: a randomly initialised linear fully connected layer whose character vector representation is learned automatically during model training; a Word2vec model, such as a word vector model trained by skip_gram or cbow; or the word vectors of a pre-trained language model such as bert, RoBERTa, ELMo, electra or GPT. Optionally, a free combination of one or more of the above may also be used as the character encoding layer, such as the word vectors of word2vec plus those of bert, or the word vectors of bert plus those of GPT.
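A sketch of the first option, a randomly initialised layer whose character vectors are learned during training; the vocabulary size and vector dimension are assumptions:

    import torch
    import torch.nn as nn

    vocab_size, emb_dim = 5000, 100
    encode = nn.Embedding(vocab_size, emb_dim)  # learned character vector table

    char_ids = torch.tensor([[12, 7, 431]])  # characters mapped via a dictionary
    char_vectors = encode(char_ids)          # (1, 3, emb_dim) encoding result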
In one embodiment, extracting the character features of the encoding result to obtain the character features corresponding to each character includes: inputting the coding result to a feature extraction layer to obtain initial character features, wherein the feature extraction layer is realized by at least one of a pre-training language model, a convolution network and a recurrent neural network; and processing the initial features through the first activation function to obtain corresponding character features.
Specifically, character feature extraction may first extract the initial features of the characters through a feature extraction layer and then process the initial features through a first activation function to obtain the corresponding character features. The feature extraction layer is implemented by at least one of a pre-trained language model, a convolutional network and a recurrent neural network; the pre-trained language model may be transformer, bert or another pre-trained language model, the convolutional network may be a CNN, a multilayer CNN or the like, and the recurrent neural network may include RNN, GRU, Dynamic LSTM, bidirectional LSTM, single-layer LSTM, multilayer LSTM, bidirectional multilayer LSTM, etc. One or more of these models may be freely combined, for example character features extracted by a BERT model after a bidirectional LSTM, and linear layers or activation functions may be appended to the model as required by the actual case, to adjust dimensions and introduce non-linear factors.
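A sketch of one such combination, a bidirectional LSTM followed by Relu as the first activation function; the hidden size (chosen so the output keeps the input dimension) is an assumption:

    import torch
    import torch.nn as nn

    emb_dim = 100
    char_vectors = torch.randn(1, 3, emb_dim)  # encoding results from the previous step

    bilstm = nn.LSTM(emb_dim, emb_dim // 2, bidirectional=True, batch_first=True)
    initial_feats, _ = bilstm(char_vectors)    # initial character features
    char_features = torch.relu(initial_feats)  # first activation function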
In one embodiment, obtaining the cell feature of each cell according to the character feature includes: and obtaining statistic characteristics of character characteristics of the characters in each cell as cell characteristics.
Specifically, the terminal processes the character features in each cell to obtain the cell feature; the cell feature may be calculated from statistics of all character features in the cell, for example by summation, averaging or taking the maximum, which is not limited herein. Summation means taking the sum of all character feature vectors of the cell as the cell feature vector; averaging means taking the mean of all character feature vectors of the cell as the cell feature vector; taking the maximum means taking, for each dimension, the maximum value of that dimension over all character feature vectors in the cell as the value of that dimension of the cell feature vector.
In the above embodiment, different layers can be freely combined through different models to extract the cell features.
In one embodiment, the first feature and the second feature are obtained by extracting direction initial features through at least one of a pre-training language model, a convolutional network and a recurrent neural network, and processing the direction initial features through a second activation function.
Specifically, the first feature is obtained by inputting the cell features of the two rows/columns of cells to be merged into the feature extraction layer and obtaining the feature representation of each column/row through an activation function. The second feature is the row/column feature obtained by vector-splicing the obtained column/row features.
The first feature and the second feature are each obtained by extracting direction initial features through at least one of a pre-training language model, a convolutional network and a recurrent neural network. For the first feature, one or more of these models may be freely combined, for example column features extracted by a GRU model after a CNN, and linear layers or activation functions may be appended as required by the actual case to adjust dimensions and introduce non-linear factors. The second feature may likewise use a free combination of one or more of them, for example a transformer model extracting the features after a Dynamic_LSTM, which is not limited herein.
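A sketch of the "GRU after CNN" combination mentioned above for the first feature; the kernel size, dimensions and trailing Relu are assumptions:

    import torch
    import torch.nn as nn

    n_cols, d = 6, 100
    cols = torch.randn(1, n_cols, d)  # per-column inputs (two cells each)

    cnn = nn.Conv1d(d, d, kernel_size=3, padding=1)
    gru = nn.GRU(d, d, batch_first=True)

    x = cnn(cols.transpose(1, 2)).transpose(1, 2)  # Conv1d expects (N, C, L)
    direction_initial, _ = gru(torch.relu(x))      # direction initial features
    first_feature = torch.relu(direction_initial)  # second activation function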
Specifically, still taking the above example: the columns {C1.1, C1.2, C1.3, ... C1.n}, where C1.j = [h1_1j, h1_2j], pass through a Dynamic_LSTM to obtain the column vectors {C2.1, C2.2, C2.3, ... C2.n}, where C2.j = [h2_1j, h2_2j] and h2_1j has dimension 100.
The terminal then reduces the column vector dimension: {C2.1, C2.2, C2.3, ... C2.n} passes through a fully connected layer and a Relu activation function to obtain {C3.1, C3.2, C3.3, ... C3.n}, where C3.j = [h3_1j, h3_2j] and h3_1j has dimension 50, so C3.j has dimension 100; since the encoder dimension is 100, the splicing is performed after this dimension reduction so that the input to each encoder stays consistent at 100 dimensions.
In one embodiment, each element in the first feature vector comprises two parts to be merged in the second direction; obtaining a second feature by performing feature splicing on a part in a second direction corresponding to each element in the first feature, wherein the second feature comprises: and respectively splicing two parts of each element in the first characteristic in the second direction to obtain a second characteristic, wherein the second characteristic comprises the character characteristic of the table to be processed, the first direction information and the second direction information.
Specifically, the terminal vector-splices the column vectors {C3.1, C3.2, C3.3, ... C3.n} to obtain {C3_1, C3_2, C3_3, ... C3_n}, where C3_j = [h3_1j, h3_2j], h3_1j has dimension 50 and C3_j has dimension 100. {C3_1, C3_2, C3_3, ... C3_n} passes through a Dynamic_LSTM, a fully connected layer and a Relu activation function to obtain {C4_1, C4_2, C4_3, ... C4_n}, where C4_j has dimension 50; {C4_1, C4_2, C4_3, ... C4_n} then passes through a fully connected layer to obtain {T_1, T_2, T_3, ... T_n}, where n is the number of columns.
In the above embodiments, the first feature and the second feature are extracted in various ways.
In one embodiment, the determining whether the adjacent to-be-processed tables need to be merged according to the second feature includes: and judging whether the adjacent tables to be processed need to be merged or not according to the second characteristic through the full-connection network and the third activation function.
Specifically, the terminal inputs the second feature into a classifier to determine whether the cells need to be merged, wherein the classifier may be a full-connection network and a third activation function.
It should be noted that the first activation function, the second activation function, and the third activation function may be a Sigmoid function, a Tanh function, a ReLU function, a Softmax function, or other activation functions, and are not limited in this respect.
Specifically, still taking the above example: the dimension of T_j is the number of labels to be predicted. A softmax applied to T_j yields the corresponding label, so each T_j yields one label, i.e. the judgment of whether the upper and lower cells (or the left and right cells) of the corresponding column need to be merged.
Specifically, please refer to fig. 5 and 6, in which fig. 5 is a system architecture diagram of a method for processing an adjacent table in an embodiment, and fig. 6 is a model structure diagram of the method for processing an adjacent table in an embodiment.
In this embodiment, the system architecture includes an input layer, a character encoding layer, a character feature extraction layer, a cell feature extraction layer, a first feature extraction layer, a second feature extraction layer, a classifier and an output layer. The selection range of the models for each layer is as described above and is not repeated here; it should be noted that the models listed for each layer in the system architecture of fig. 5 can be freely combined with the models of the other layers. For example, all feature extraction may use bert; or character feature extraction may use lstm, second feature extraction bert and first feature extraction cnn. As another use case of the system architecture, the character encoding layer may use GPT, character feature extraction a recurrent neural network (LSTM, GRU or CNN), row feature extraction a transformer, column feature extraction elmo, the cell feature extraction method averaging, and the activation function sigmoid or tanh; all of these fall within the protection scope of the present application.
Taking fig. 6 as an example, if the character encoding layer selects Linear, the cell feature extractor uses the MAX method, the other feature extraction layers select Dynamic_LSTM, and the activation functions use Relu and Softmax, the model structure of the adjacent table processing method can be described as follows:
the terminal firstly converts the characters and the labels into a data form required by the model according to dictionary mapping to realize text vectorization, inputs the text vectors into the model, and obtains character vector representation through a Linear coding layer. And then the terminal obtains the feature vector of the character of each character in the cell by the character feature extraction layer Dynamic _ LSTM and the activation function Relu. And extracting the characteristics of the cell by the terminal through an MAX method, namely taking the maximum value of each dimension of the character characteristic vector in the cell as the value of the dimension of the cell characteristics. In this way, the terminal inputs the cell feature vectors of two rows into the column feature extraction layer Dynamic _ LSTM, and then obtains the feature representation of the column, i.e. the first feature above, by activating the function Relu. And then the terminal splices the column feature vectors of two rows, inputs the column feature extraction layer Dynamic _ LSTM, and then obtains the feature representation of the row, namely the second feature in the above through the activation function Relu, if N columns exist, N row feature representations are obtained, and after the previous steps of feature extraction layers, the row feature vectors contain character information of the cells, row information of the cells and column information of the cells, so that the row feature vectors can be used as subsequent tables to be merged, judged and input.
Finally, the terminal inputs the row features into the classifier Linear to obtain a two-dimensional vector, maps the vector elements to (0, 1) through the Softmax activation function, and selects the index of the maximum value of the two-dimensional vector as the predicted label, where 0 means no merging and 1 means merging.
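Putting the layers of this configuration together as a single sketch; a plain LSTM stands in for Dynamic_LSTM, and every dimension, name and the handling of the two parts below are assumptions rather than the patented implementation:

    import torch
    import torch.nn as nn

    class AdjacentTableMerger(nn.Module):
        """Encode -> character features -> MAX cell pooling -> column
        features -> row features -> per-column merge label (sketch)."""

        def __init__(self, vocab_size=5000, emb=100, cell=50):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb)            # Linear encoding layer
            self.char_lstm = nn.LSTM(emb, emb, batch_first=True)  # character features
            self.reduce = nn.Linear(emb, cell)                    # cell dimension reduction
            self.col_lstm = nn.LSTM(2 * cell, 2 * cell, batch_first=True)  # first feature
            self.row_lstm = nn.LSTM(2 * cell, cell, batch_first=True)      # second feature
            self.cls = nn.Linear(cell, 2)                         # merge / no merge

        def pool_cell(self, char_ids):
            vecs = self.embed(char_ids.unsqueeze(0))   # (1, k, emb)
            feats, _ = self.char_lstm(vecs)
            feats = torch.relu(feats)                  # Relu after the LSTM
            pooled = feats.max(dim=1).values           # MAX over the characters
            return torch.relu(self.reduce(pooled))     # (1, cell)

        def forward(self, row1, row2):
            # row1, row2: lists with one LongTensor of character ids per cell
            c1 = torch.cat([self.pool_cell(c) for c in row1])  # (n_cols, cell)
            c2 = torch.cat([self.pool_cell(c) for c in row2])
            cols = torch.cat([c1, c2], dim=-1).unsqueeze(0)    # (1, n_cols, 2*cell)
            first, _ = self.col_lstm(cols)                     # column (first) features
            second, _ = self.row_lstm(torch.relu(first))       # row (second) features
            logits = self.cls(torch.relu(second))              # (1, n_cols, 2)
            return torch.softmax(logits, dim=-1).argmax(dim=-1)  # 1 = merge, 0 = keep

    model = AdjacentTableMerger()
    row1 = [torch.tensor([3, 14, 15]), torch.tensor([9, 26])]  # two cells, upper row
    row2 = [torch.tensor([5, 3]), torch.tensor([8])]           # two cells, lower row
    print(model(row1, row2))  # e.g. tensor([[0, 1]]); untrained, so arbitrary

Untrained, the printed labels are arbitrary; in practice such a module would be trained on labelled pairs of adjacent table rows with a cross-entropy loss.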
Where Linear as described above is a fully connected neural network layer, Dynamic _ LSTM is a Dynamic LSTM neural network, Relu is an activation function, and Softmax is a normalized exponential function.
In this way, when judging whether adjacent tables should be merged, the judgment is made with a neural network: the cells, columns and rows are processed in turn to obtain a second feature that contains the character information, row information and column information of the cells, and this feature is then processed by the model, ensuring the accuracy of the result output by the classifier.
It should be understood that, although the steps in the flowcharts of fig. 1 and 6 are shown in an order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 1 and 6 may include multiple sub-steps or stages, which need not be completed at the same time but may be performed at different times, and which need not be performed sequentially but may be performed in turn or alternately with other steps or with sub-steps of other steps.
In one embodiment, as shown in fig. 7, there is provided a neighbor table processing apparatus including: a table to be processed acquiring module 701, a cell feature extracting module 702, a first feature extracting module 703, a second feature extracting module 704 and a judging module 705, wherein:
a to-be-processed table acquiring module 701, configured to acquire an adjacent to-be-processed table;
a cell feature extraction module 702, configured to perform cell feature extraction on each cell in adjacent to-be-processed tables to obtain a cell feature of each cell;
a first feature extraction module 703, configured to calculate, according to the extracted cell features, to obtain first features in a first direction, where the first direction is a merging direction of adjacent to-be-processed tables;
a second feature extraction module 704, configured to obtain a second feature by performing feature splicing on a portion in a second direction corresponding to each element in the first feature;
the determining module 705 is configured to determine whether adjacent to-be-processed tables need to be merged according to the second characteristic.
In one embodiment, the cell feature extraction module 702 includes:
the encoding unit is used for encoding each character in the table to be processed to obtain an encoding result;
the character feature extraction unit is used for extracting character features of the coding result to obtain character features corresponding to each character;
and the cell feature extraction unit is used for obtaining the cell features of each cell according to the character features.
In one embodiment, the encoding unit is configured to perform encoding processing on each character in the table to be processed by using at least one of an automatic learning model, a Word2vec model, and a pre-training language model to obtain an encoding result.
In one embodiment, the character feature extracting unit includes:
the character initial feature extraction subunit is used for inputting the coding result to the feature extraction layer to obtain the character initial features, and the feature extraction layer is realized by at least one method of a pre-training language model, a convolution network and a cyclic neural network;
and the character feature extraction subunit is used for processing the initial features through the first activation function to obtain corresponding character features.
In one embodiment, the cell feature extraction unit is configured to obtain statistic features of the character features of the characters in each cell as the cell features.
In one embodiment, the first feature and the second feature are obtained by extracting direction initial features through at least one of a pre-training language model, a convolutional network and a recurrent neural network, and processing the direction initial features through a second activation function.
In one embodiment, each element in the first feature vector comprises two parts to be merged in the second direction; the second feature extraction module 704 is configured to splice two parts of each element in the first feature in the second direction to obtain a second feature, where the second feature includes a character feature of the to-be-processed table, first direction information, and second direction information.
In one embodiment, the determining module 705 is configured to determine whether the adjacent tables to be processed need to be merged according to the second characteristic through the fully connected network and the third activation function.
For the specific definition of the adjacent table processing apparatus, reference may be made to the above definition of the adjacent table processing method, which is not described herein again. The respective modules in the above-described adjacent table processing apparatus may be wholly or partially implemented by software, hardware, and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a method of adjacent table processing. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program: acquiring adjacent tables to be processed; extracting cell features of each cell in adjacent tables to be processed to obtain the cell features of each cell; calculating to obtain a first feature in a first direction according to the extracted cell feature, wherein the first direction is the merging direction of adjacent tables to be processed; obtaining a second feature by performing feature splicing on a part in a second direction corresponding to each element in the first feature; and judging whether the adjacent tables to be processed need to be combined or not according to the second characteristic.
In one embodiment, the cell feature extraction performed on each cell in the adjacent to-be-processed table by the processor when the computer program is executed includes: coding each character in the table to be processed to obtain a coding result; extracting character features of the coding result to obtain character features corresponding to each character; and obtaining the cell characteristics of each cell according to the character characteristics.
In one embodiment, the encoding of each character in the table to be processed, which is implemented when the processor executes the computer program, obtains an encoding result, and includes: and coding each character in the table to be processed by at least one method of an automatic learning model, a Word2vec model and a pre-training language model to obtain a coding result.
In one embodiment, the performing, by a processor executing a computer program, character feature extraction on an encoding result to obtain a character feature corresponding to each character includes: inputting the coding result to a feature extraction layer to obtain initial character features, wherein the feature extraction layer is realized by at least one of a pre-training language model, a convolution network and a recurrent neural network; and processing the initial features through the first activation function to obtain corresponding character features.
In one embodiment, the deriving the cell feature for each cell from the character feature when the processor executes the computer program comprises: and obtaining statistic characteristics of character characteristics of the characters in each cell as cell characteristics.
In one embodiment, the first feature and the second feature involved when the processor executes the computer program are obtained by extracting direction initial features through at least one of a pre-training language model, a convolutional network and a recurrent neural network, and processing the direction initial features through a second activation function.
In one embodiment, each element of the first feature vector involved in the execution of the computer program by the processor comprises two parts to be merged in the second direction; the obtaining of the second feature by feature concatenation of the portion in the second direction corresponding to each element in the first feature, which is realized when the processor executes the computer program, includes: and respectively splicing two parts of each element in the first characteristic in the second direction to obtain a second characteristic, wherein the second characteristic comprises the character characteristic of the table to be processed, the first direction information and the second direction information.
In one embodiment, the determining whether the adjacent tables to be processed need to be merged according to the second feature when the processor executes the computer program includes: and judging whether the adjacent tables to be processed need to be merged or not according to the second characteristic through the full-connection network and the third activation function.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring adjacent tables to be processed; extracting cell features of each cell in adjacent tables to be processed to obtain the cell features of each cell; calculating to obtain a first feature in a first direction according to the extracted cell feature, wherein the first direction is the merging direction of adjacent tables to be processed; obtaining a second feature by performing feature splicing on a part in a second direction corresponding to each element in the first feature; and judging whether the adjacent tables to be processed need to be combined or not according to the second characteristic.
In one embodiment, the cell feature extraction for each cell in the adjacent to-be-processed table, which is implemented when the computer program is executed by the processor, includes: coding each character in the table to be processed to obtain a coding result; extracting character features of the coding result to obtain character features corresponding to each character; and obtaining the cell characteristics of each cell according to the character characteristics.
In one embodiment, the encoding of each character in the table to be processed to obtain the encoding result when the computer program is executed by the processor includes: and coding each character in the table to be processed by at least one method of an automatic learning model, a Word2vec model and a pre-training language model to obtain a coding result.
In one embodiment, the performing, when executed by a processor, character feature extraction on the encoding result to obtain a character feature corresponding to each character includes: inputting the coding result to a feature extraction layer to obtain initial character features, wherein the feature extraction layer is realized by at least one of a pre-training language model, a convolution network and a recurrent neural network; and processing the initial features through the first activation function to obtain corresponding character features.
In one embodiment, the deriving of the cell feature for each cell from the character feature when the computer program is executed by the processor comprises: and obtaining statistic characteristics of character characteristics of the characters in each cell as cell characteristics.
In one embodiment, the first feature and the second feature involved when the computer program is executed by the processor are obtained by extracting direction initial features through at least one of a pre-training language model, a convolutional network and a recurrent neural network, and processing the direction initial features through a second activation function.
In one embodiment, each element of the first feature vector involved in the execution of the computer program by the processor comprises two parts to be merged in the second direction; the second feature obtained by feature concatenation of a portion in a second direction corresponding to each element in the first feature when the computer program is executed by the processor includes: and respectively splicing two parts of each element in the first characteristic in the second direction to obtain a second characteristic, wherein the second characteristic comprises the character characteristic of the table to be processed, the first direction information and the second direction information.
In one embodiment, the determining whether the adjacent tables to be processed need to be merged according to the second feature when the computer program is executed by the processor includes: and judging whether the adjacent tables to be processed need to be merged or not according to the second characteristic through the full-connection network and the third activation function.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program, which may be stored in a non-volatile computer-readable storage medium and which, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory may include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (11)

1. A method for processing a neighbor table, the method comprising:
acquiring adjacent tables to be processed;
extracting cell features of each cell in the adjacent to-be-processed table to obtain the cell features of each cell;
calculating to obtain a first feature in a first direction according to the extracted cell feature, wherein the first direction is the merging direction of the adjacent tables to be processed;
obtaining a second feature by performing feature splicing on a part in a second direction corresponding to each element in the first feature;
and judging whether the adjacent tables to be processed need to be combined or not according to the second characteristic.
2. The method according to claim 1, wherein the extracting cell features of each cell in the adjacent table to be processed comprises:
coding each character in the table to be processed to obtain a coding result;
extracting character features of the coding result to obtain character features corresponding to each character;
and obtaining the cell characteristics of each cell according to the character characteristics.
3. The method according to claim 2, wherein the encoding each character in the table to be processed to obtain an encoding result comprises:
and coding each character in the table to be processed by at least one method of an automatic learning model, a Word2vec model and a pre-training language model to obtain a coding result.
4. The method according to claim 2, wherein the extracting the character features of the encoding result to obtain the character features corresponding to each character comprises:
inputting the coding result to a feature extraction layer to obtain initial character features, wherein the feature extraction layer is realized by at least one of a pre-training language model, a convolution network and a cyclic neural network;
and processing the initial features through a first activation function to obtain corresponding character features.
5. The method of claim 2, wherein obtaining the cell feature of each cell according to the character feature comprises:
and obtaining statistic characteristics of character characteristics of the characters in each cell as the cell characteristics.
6. The method according to any one of claims 1 to 5, wherein the first feature and the second feature are obtained by extracting direction initial features through at least one of a pre-training language model, a convolutional network and a recurrent neural network, and processing the direction initial features through a second activation function.
7. The method according to any one of claims 1 to 5, wherein each element in the first feature vector comprises two parts to be merged in the second direction; the obtaining of the second feature by feature splicing of the portion in the second direction corresponding to each element in the first feature includes:
and respectively splicing two parts of each element in the first characteristic in a second direction to obtain a second characteristic, wherein the second characteristic comprises the character characteristic, the first direction information and the second direction information of the table to be processed.
8. The method according to any one of claims 1 to 5, wherein the determining whether the adjacent tables to be processed need to be merged according to the second feature comprises:
and judging whether the adjacent tables to be processed need to be merged or not according to the second characteristic through a full-connection network and a third activation function.
9. An adjacent form processing apparatus, characterized in that the adjacent form processing apparatus comprises:
the to-be-processed table acquisition module is used for acquiring adjacent to-be-processed tables;
the cell feature extraction module is used for extracting cell features of each cell in the adjacent to-be-processed table to obtain the cell features of each cell;
the first feature extraction module is used for calculating and obtaining first features in a first direction according to the extracted cell features, wherein the first direction is the merging direction of the adjacent tables to be processed;
the second feature extraction module is used for performing feature splicing on a part in a second direction corresponding to each element in the first feature to obtain a second feature;
and the judging module is used for judging whether the adjacent tables to be processed need to be combined or not according to the second characteristic.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 8.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
CN202110866186.9A 2021-07-29 2021-07-29 Adjacent table processing method and device, computer equipment and storage medium Pending CN113688693A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110866186.9A CN113688693A (en) 2021-07-29 2021-07-29 Adjacent table processing method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113688693A 2021-11-23

Family

ID=78578249

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110866186.9A Pending CN113688693A (en) 2021-07-29 2021-07-29 Adjacent table processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113688693A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6006240A (en) * 1997-03-31 1999-12-21 Xerox Corporation Cell identification in table analysis
JP2003216886A (en) * 2002-01-22 2003-07-31 Matsushita Electric Ind Co Ltd Device, method, and program for table editing and recording medium
US20190050379A1 (en) * 2017-08-11 2019-02-14 Emro Co., Ltd. Method for providing data management service having automatic cell merging function and service providing server for performing the method
CN112560545A (en) * 2019-09-10 2021-03-26 珠海金山办公软件有限公司 Method and device for identifying form direction and electronic equipment
CN112632927A (en) * 2020-12-30 2021-04-09 上海犀语科技有限公司 Table fragment link restoration method and system based on semantic processing
CN113177397A (en) * 2021-04-21 2021-07-27 平安消费金融有限公司 Table adjusting method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination