CN116595380A - Training method of table title classification model and classification method of table title

Training method of table title classification model and classification method of table title

Info

Publication number
CN116595380A
CN116595380A (application CN202310699463.0A)
Authority
CN
China
Prior art keywords
classification
sample
title
input matrix
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310699463.0A
Other languages
Chinese (zh)
Inventor
袁建
郭磊
贾家琛
郑子辰
李小翔
邸智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaneng Tendering Co ltd
Huaneng Clean Energy Research Institute
Original Assignee
Huaneng Tendering Co ltd
Huaneng Clean Energy Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaneng Tendering Co ltd, Huaneng Clean Energy Research Institute
Priority to CN202310699463.0A
Publication of CN116595380A
Legal status: Pending

Classifications

    • G06F 18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 — Pattern recognition; classification techniques
    • G06F 40/18 — Handling natural language data; editing, e.g. inserting or deleting, of tables in spreadsheets
    • G06F 40/258 — Handling natural language data; heading extraction; automatic titling; numbering
    • G06N 3/044 — Neural network architectures; recurrent networks, e.g. Hopfield networks
    • Y02D 10/00 — Climate change mitigation in ICT; energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a training method for a table title classification model, a table title classification method, an apparatus, a device, and a storage medium. The training method for the table title classification model comprises the following steps: acquiring a table title sample and a sample classification label corresponding to the table title sample; generating a local input matrix and a global input matrix based on the table title sample and the sample classification label; inputting the local input matrix into a local classification unit to obtain a first feature vector; inputting the global input matrix into a global classification unit to obtain a second feature vector; and inputting the first feature vector and the second feature vector into an MLP for feature mapping to obtain a prediction classification result, and training the table title classification model according to the prediction classification result and the sample classification label. Through the technical scheme of the application, the trained table title classification model can improve the efficiency and accuracy of table title classification.

Description

Training method of table title classification model and classification method of table title
Technical Field
The application relates to the technical field of data processing, in particular to a training method of a table title classification model and a table title classification method.
Background
The itemized quotation form is one of the most important documents in a bid. The titles of itemized quotation forms in bidding documents vary widely and lack uniformity and standardization, which makes evaluation and comparison by reviewers difficult. To facilitate the evaluation and comparison of bids across different bidding documents, the titles of itemized quotation forms need to be classified into a fixed set of categories.
In the related art, the titles of itemized quotation forms are mostly classified manually, which is inefficient and error-prone.
Disclosure of Invention
The application provides a training method for a table title classification model, a table title classification method, a table title classification device, an electronic device, and a storage medium, which can improve the efficiency and accuracy of table title classification.
In a first aspect, an embodiment of the present application provides a training method for a table title classification model, where the table title classification model includes a local classification unit, a global classification unit, and a multi-layer perceptron (MLP), and the method includes: acquiring a table title sample and a sample classification label corresponding to the table title sample; generating a local input matrix and a global input matrix based on the table title sample and the sample classification label; inputting the local input matrix into the local classification unit to obtain a first feature vector; inputting the global input matrix into the global classification unit to obtain a second feature vector; and inputting the first feature vector and the second feature vector into the MLP for feature mapping to obtain a prediction classification result, and training the table title classification model according to the prediction classification result and the sample classification label.
According to this technical scheme, a local input matrix and a global input matrix can be generated based on the acquired table title sample and its corresponding sample classification label and input into the table title classification model to obtain a prediction result; the model is then trained against the sample classification label, yielding a table title classification model that classifies table titles accurately and thereby improves the efficiency and accuracy of table title classification.
In one implementation, generating the local input matrix and the global input matrix based on the table title sample and the sample classification label includes: performing lexical analysis on the table title sample to obtain a table title sample sequence corresponding to the table title sample; generating a sample word vector corresponding to the table title sample based on the table title sample sequence; and generating the local input matrix and the global input matrix based on the sample word vector and the sample classification label.
In an alternative implementation, generating the local input matrix and the global input matrix based on the sample word vector and the sample classification label includes: taking the sample word vector corresponding to each table title sample as a row vector to generate the local input matrix; classifying the table title samples according to the sample classification labels to obtain table title samples of different categories; and splicing the sample word vectors corresponding to the table title samples of each category into one row vector to generate the global input matrix.
In this technical scheme, lexical analysis and vectorization can be performed on the table title samples to generate the corresponding local and global input matrices, which are input into the table title classification model to obtain a prediction result; the model is then trained against the sample classification labels, yielding a table title classification model that classifies table titles accurately and improves the efficiency and accuracy of table title classification.
In one implementation, training the table title classification model according to the prediction classification result and the sample classification label includes: acquiring a loss function based on the prediction classification result and the sample classification label; and calculating gradients according to the loss function and performing back propagation, so as to update the model parameters of the table title classification model by gradient descent.
In this technical scheme, a local input matrix and a global input matrix can be generated based on the acquired table title sample and its corresponding sample classification label and input into the table title classification model to obtain a prediction result; a loss function is then acquired from the prediction classification result and the sample classification label, and the table title classification model is trained on that loss function, yielding a model that classifies table titles accurately and improves the efficiency and accuracy of table title classification.
In one implementation, the local classification unit is a recurrent neural network (RNN) and/or the global classification unit is a multi-layer neural network based on the Transformer architecture, each layer of which comprises at least one multi-head attention layer and at least one fully connected layer.
In a second aspect, an embodiment of the present application provides a method for classifying a table title, including: acquiring the table title text of a table to be classified; and inputting the table title text into a table title classification model to obtain a classification result, wherein the table title classification model is trained by the method described in the first aspect.
In a third aspect, the present application provides a training apparatus for a table title classification model, the table title classification model including a local classification unit, a global classification unit, and a multi-layer perceptron (MLP), the apparatus comprising: an acquisition module for acquiring a table title sample and a sample classification label corresponding to the table title sample; a generation module for generating a local input matrix and a global input matrix based on the table title sample and the sample classification label; a first processing module for inputting the local input matrix into the local classification unit to obtain a first feature vector; a second processing module for inputting the global input matrix into the global classification unit to obtain a second feature vector; and a training module for inputting the first feature vector and the second feature vector into the MLP for feature mapping to obtain a prediction classification result, and training the table title classification model according to the prediction classification result and the sample classification label.
In one implementation, the generation module is specifically configured to: perform lexical analysis on the table title sample to obtain a table title sample sequence corresponding to the table title sample; generate a sample word vector corresponding to the table title sample based on the table title sample sequence; and generate the local input matrix and the global input matrix based on the sample word vector and the sample classification label.
In an alternative implementation, the generation module is specifically configured to: take the sample word vector corresponding to each table title sample as a row vector to generate the local input matrix; classify the table title samples according to the sample classification labels to obtain table title samples of different categories; and splice the sample word vectors corresponding to the table title samples of each category into one row vector to generate the global input matrix.
In one implementation, the training module is specifically configured to: acquire a loss function based on the prediction classification result and the sample classification label; and calculate gradients according to the loss function and perform back propagation, so as to update the model parameters of the table title classification model by gradient descent.
In one implementation, the local classification unit is a recurrent neural network (RNN) and/or the global classification unit is a multi-layer neural network based on the Transformer architecture, each layer of which comprises at least one multi-head attention layer and at least one fully connected layer.
In a fourth aspect, an embodiment of the present application provides a table title classification apparatus, including: an acquisition module for acquiring the table title text of a table to be classified; and a classification module for inputting the table title text into a table title classification model to obtain a classification result, wherein the table title classification model is trained by the method described in the first aspect.
In a fifth aspect, an embodiment of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method of the table title classification model as described in the first aspect or to perform the classification method of the table title as described in the second aspect.
In a sixth aspect, embodiments of the present application provide a computer readable storage medium storing instructions that, when executed, cause a method as described in the first aspect to be implemented or cause a method as described in the second aspect to be implemented.
In a seventh aspect, an embodiment of the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the training method of the table-title classification model according to the first aspect, or implements the steps of the table-title classification method according to the second aspect.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:
FIG. 1 is a schematic diagram of a training method for a table title classification model according to an embodiment of the present application;
FIG. 2 is a schematic diagram of another training method for a table title classification model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of yet another training method for a table title classification model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a method for classifying form titles according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a training device for a table title classification model according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a table title classifying device according to an embodiment of the present application;
FIG. 7 is a schematic block diagram of an example electronic device that may be used to implement embodiments of the present application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the description of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, both A and B exist, or B exists alone. Ordinal terms such as "first" and "second" in the present application are merely for convenience of description and neither limit the scope of the embodiments of the present application nor indicate a sequence.
Referring to fig. 1, fig. 1 is a schematic diagram of a training method for a table title classification model according to an embodiment of the application. The table title classification model trained by this method can be used to classify the titles of various quotation forms in bidding documents. As shown in fig. 1, the method may include, but is not limited to, the following steps:
step S101: and acquiring a form title sample and a sample classification label corresponding to the form title sample.
For example, a quotation form in a bidding document is used as a sample form, and its table title is obtained as a table title sample; the table title sample is manually classified and labeled to obtain the corresponding sample classification label.
Step S102: a local input matrix and a global input matrix are generated based on the table header samples and the sample class labels.
Specifically, a local input matrix containing the sample features of each table title sample is generated based on the table title samples, and a global input matrix containing the sample features of table title samples of the same category is generated in combination with the sample classification labels.
Step S103: and inputting the local input matrix into a local classification unit to obtain a first feature vector.
Specifically, the local input matrix is input into the local classification unit, which performs feature extraction on it to obtain the first feature vector.
Step S104: and inputting the global input matrix into a global classification unit to obtain a second feature vector.
Specifically, the global input matrix is input into the global classification unit, which performs feature extraction on it to obtain the second feature vector.
Step S105: inputting the first feature vector and the second feature vector into the MLP for feature mapping, obtaining a prediction classification result, and training a table title classification model according to the prediction classification result and the sample classification label.
For example, the first feature vector and the second feature vector are added and then input into the MLP (multi-layer perceptron) for mapping, so that the fused vector is mapped to a specified dimension (i.e., the number of preset table title categories). The mapping result is processed with an argmax function to obtain the prediction result, the table title classification model is trained according to the prediction classification result and the sample classification label, and the model is tested with pre-acquired test set data to obtain an evaluation index, until the evaluation index of the model exceeds an index threshold.
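A minimal sketch of this mapping step, assuming PyTorch; the feature dimension, hidden width, and number of categories are illustrative assumptions and not taken from the patent.

```python
# Illustrative sketch: fuse the two feature vectors by addition, map them
# with an MLP to the preset table-title categories, and take argmax.
import torch
import torch.nn as nn

NUM_CLASSES = 8      # assumed number of preset table title categories
FEATURE_DIM = 128    # assumed dimension of the two feature vectors

mlp = nn.Sequential(
    nn.Linear(FEATURE_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, NUM_CLASSES),
)

def predict(first_vec: torch.Tensor, second_vec: torch.Tensor) -> int:
    """Add the local and global feature vectors, map with the MLP,
    and take argmax over the class dimension to get the predicted label."""
    logits = mlp(first_vec + second_vec)
    return int(torch.argmax(logits, dim=-1))
```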
In the embodiment of the present application, the evaluation index of the model may be an accuracy rate or an F1 Score.
By implementing this embodiment of the application, a local input matrix and a global input matrix can be generated based on the acquired table title sample and its corresponding sample classification label and input into the table title classification model to obtain a prediction result, so that the model is trained against the sample classification label, yielding a table title classification model that classifies table titles accurately and improves the efficiency and accuracy of table title classification.
In one implementation, lexical analysis and vectorization may be performed on the table title samples to generate the corresponding local and global input matrices. As an example, please refer to fig. 2, which is a schematic diagram of another training method for a table title classification model according to an embodiment of the present application. As shown in fig. 2, the method may include, but is not limited to, the following steps:
step S201: and acquiring a form title sample and a sample classification label corresponding to the form title sample.
In this embodiment of the application, step S201 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
Step S202: and performing lexical analysis on the table title sample to obtain a table title sample sequence corresponding to the table title sample.
For example, a lexical analysis tool such as THULAC (Tsinghua University Lexical Analyzer for Chinese) is used to perform lexical analysis on the table title samples, segmenting each sample into words to obtain the table title sample sequence corresponding to each table title sample.
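A short sketch of this segmentation step with the THULAC Python package; the sample title string below is invented for illustration.

```python
# Hedged sketch of the word-segmentation step using THULAC (pip install thulac).
import thulac

segmenter = thulac.thulac(seg_only=True)  # segmentation only, no POS tags

title = "设备分项报价表"  # an invented itemized quotation form title
sample_sequence = segmenter.cut(title, text=True).split()
print(sample_sequence)  # list of words forming the table title sample sequence
```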
Step S203: and generating a sample word vector corresponding to the table title sample based on the table title sample sequence.
For example, a word vector generation model (e.g., word2vec) is used to vectorize the table title sample sequence to generate the sample word vector corresponding to each table title sample.
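A hedged sketch of this step using gensim's word2vec implementation; the hyperparameters, the toy sequences, and the averaging of word vectors into a single sample word vector are illustrative assumptions — the patent does not fix how word vectors are reduced to one vector per title.

```python
# Sketch of the vectorization step with gensim (pip install gensim).
from gensim.models import Word2Vec
import numpy as np

# sequences: word-segmented table title samples from the previous step (toy data)
sequences = [["设备", "分项", "报价", "表"], ["材料", "报价", "明细"]]

w2v = Word2Vec(sentences=sequences, vector_size=64, window=3, min_count=1)

def title_vector(seq):
    """One possible reduction (an assumption): average the word vectors of a
    title to obtain a single sample word vector for that table title sample."""
    return np.mean([w2v.wv[w] for w in seq], axis=0)
```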
Step S204: a local input matrix and a global input matrix are generated based on the sample word vector and the sample classification labels.
For example, the local input matrix is generated from the sample word vector corresponding to each table title sample, and the global input matrix, which groups sample word vectors of the same category, is generated from the sample word vectors together with the sample classification labels.
In an alternative implementation, generating the local input matrix and the global input matrix based on the sample word vectors and the sample classification labels may include the following steps: taking the sample word vector corresponding to each table title sample as a row vector to generate the local input matrix; classifying the table title samples according to the sample classification labels to obtain table title samples of different categories; and splicing the sample word vectors corresponding to the table title samples of each category into one row vector to generate the global input matrix.
For example, the sample word vector corresponding to each table title sample is used as an independent row vector of the matrix, so that the local input matrix is generated from the sample word vectors of all table title samples. The table title samples are classified based on the sample classification labels, the sample word vectors of the table title samples of the same category are spliced into one row vector, and the global input matrix is generated from the row vectors of the different categories.
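A minimal numpy sketch of this construction, using invented toy vectors and labels; the padding of category rows to a common width is an assumption, since the patent does not specify how rows of unequal length are handled.

```python
import numpy as np

dim = 64
vectors = [np.random.rand(dim) for _ in range(5)]   # one vector per title sample
labels = [0, 1, 0, 2, 1]                            # sample classification labels

# Local input matrix: each sample word vector is an independent row vector.
local_matrix = np.stack(vectors)                    # shape (5, dim)

# Global input matrix: splice the vectors of each category into one row,
# then pad so all category rows have equal length (padding is an assumption).
by_category = {}
for vec, lab in zip(vectors, labels):
    by_category.setdefault(lab, []).append(vec)
rows = [np.concatenate(v) for _, v in sorted(by_category.items())]
width = max(len(r) for r in rows)
global_matrix = np.stack([np.pad(r, (0, width - len(r))) for r in rows])
```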
In an embodiment of the present application, the global classification unit is a multi-layer neural network based on the Transformer architecture, and each layer of the neural network includes at least one multi-head attention layer and at least one fully connected layer.
Step S205: and inputting the local input matrix into a local classification unit to obtain a first feature vector.
In this embodiment of the application, step S205 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
Step S206: and inputting the global input matrix into a global classification unit to obtain a second feature vector.
In this embodiment of the application, step S206 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
Step S207: inputting the first feature vector and the second feature vector into the MLP for feature mapping, obtaining a prediction classification result, and training a table title classification model according to the prediction classification result and the sample classification label.
In this embodiment of the application, step S207 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
By implementing this embodiment of the application, lexical analysis and vectorization can be performed on the table title samples to generate the corresponding local and global input matrices, which are input into the table title classification model to obtain a prediction result; the model is then trained against the sample classification labels, yielding a table title classification model that classifies table titles accurately and improves the efficiency and accuracy of table title classification.
In one implementation, a loss function may be obtained based on the prediction classification result and the sample classification label, and the table title classification model is then trained on that loss function. As an example, please refer to fig. 3, which is a schematic diagram of a training method for a table title classification model according to another embodiment of the present application. As shown in fig. 3, the method may include, but is not limited to, the following steps:
step S301: and acquiring a form title sample and a sample classification label corresponding to the form title sample.
In this embodiment of the application, step S301 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
Step S302: a local input matrix and a global input matrix are generated based on the table header samples and the sample class labels.
In this embodiment of the application, step S302 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
Step S303: and inputting the local input matrix into a local classification unit to obtain a first feature vector.
In this embodiment of the application, step S303 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
Step S304: and inputting the global input matrix into a global classification unit to obtain a second feature vector.
In this embodiment of the application, step S304 may be implemented in any of the ways described in the embodiments of the present application; it is not limited here and is not described again.
Step S305: inputting the first feature vector and the second feature vector into the MLP to perform feature mapping, and obtaining a prediction classification result.
For example, the first feature vector and the second feature vector are added and then input to the MLP for mapping, and the mapping result is processed by using the argmax function to obtain the prediction classification result.
Step S306: and obtaining a loss function based on the prediction classification result and the sample classification label.
Wherein, in an embodiment of the present application, the loss function may be a cross entropy loss function.
For example, a corresponding cross entropy loss function is calculated based on the prediction classification result and the sample classification label.
Step S307: and calculating gradients according to the loss functions and carrying out back propagation to update model parameters of the table title classification model in a gradient descent mode.
Specifically, the gradients are calculated according to the loss function and back-propagated, and the model parameters of the table title classification model are updated using gradient descent.
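A minimal sketch of steps S306 and S307, assuming PyTorch, a stand-in linear classifier, and toy data; the real model would combine the RNN-based local unit, the Transformer-based global unit, and the MLP.

```python
import torch
import torch.nn as nn

model = nn.Linear(64, 8)                      # placeholder classifier (assumed)
local_matrix = torch.randn(5, 64)             # toy local input matrix
sample_labels = torch.tensor([0, 1, 0, 2, 1]) # toy sample classification labels

criterion = nn.CrossEntropyLoss()             # the cross entropy loss function
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

logits = model(local_matrix)                  # prediction scores per class
loss = criterion(logits, sample_labels)       # loss from predictions and labels

optimizer.zero_grad()
loss.backward()                               # compute gradients, back-propagate
optimizer.step()                              # gradient-descent parameter update
```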
By implementing this embodiment of the application, a local input matrix and a global input matrix can be generated based on the acquired table title sample and its corresponding sample classification label and input into the table title classification model to obtain a prediction result; a loss function is then acquired based on the prediction classification result and the sample classification label, and the table title classification model is trained on that loss function, yielding a model that classifies table titles accurately and improves the efficiency and accuracy of table title classification.
In some embodiments of the application, the local classification unit is a recurrent neural network (RNN) and/or the global classification unit is a multi-layer neural network based on the Transformer architecture, each layer of which comprises at least one multi-head attention layer and at least one fully connected layer.
As one example, the local classification unit is an RNN (Recurrent Neural Network). The input of the local classification unit can be denoted {x_0, x_1, …, x_t, x_{t+1}, …}, its output {o_0, o_1, …, o_t, o_{t+1}, …}, and the output of the hidden units in the local classification unit {s_0, s_1, …, s_t, s_{t+1}, …}. The hidden layer output of the local classification unit is then:

s_t = σ(U·x_t + W·s_{t-1})

where U and W are parameter matrices and s_t, the state of the hidden layer at step t, is the memory cell of the RNN; s_t is calculated from the current input and the hidden layer state of the previous step, and σ is a nonlinear activation function. The output o_t at step t is:

o_t = softmax(V·s_t)

where V is a parameter matrix.
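A direct numpy transcription of the two formulas above — a minimal sketch that assumes tanh as the nonlinear activation σ and illustrative dimensions; the patent does not fix these choices.

```python
import numpy as np

d_in, d_hid, d_out = 64, 32, 8
U = np.random.randn(d_hid, d_in)   # parameter matrix U
W = np.random.randn(d_hid, d_hid)  # parameter matrix W
V = np.random.randn(d_out, d_hid)  # parameter matrix V

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_forward(xs):
    """xs: sequence of input vectors x_0..x_T; returns outputs o_0..o_T."""
    s = np.zeros(d_hid)                 # hidden state (the RNN memory cell)
    outputs = []
    for x in xs:
        s = np.tanh(U @ x + W @ s)      # s_t = sigma(U x_t + W s_{t-1})
        outputs.append(softmax(V @ s))  # o_t = softmax(V s_t)
    return outputs
```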
As another example, the global classification unit is a multi-layer neural network based on the Transformer architecture, each layer of which includes at least one multi-head attention layer and at least one fully connected layer.
It will be appreciated that, since the same table titles can appear in tables of different categories, such data affect the expressive power of the model. The vector matrix is therefore obtained by splicing the sample word vectors of each category along dimension 1, and the sequence information of each table title sample is fused into the vector representation. The positional representation of a word in a table title sample is calculated as follows:
PE_(pos, 2i) = sin(pos / 10000^(2i / d_model))
PE_(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
where pos is the position of the word in the table title sample, i is the dimension index, and d_model is a hyperparameter giving the dimension of the representation vector. The output of the multi-head attention layer is as follows:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

where Q = q·W^Q, K = k·W^K, V = v·W^V; q, k and v are generated from the input matrix of the multi-head attention layer; W^Q, W^K and W^V are preset weight matrices; and d_k is the dimension of the k vectors. The outputs of the attention heads computed from Q, K and V are concatenated, and the output of one layer of the neural network is obtained through a fully connected operation.
Wherein, in some embodiments of the application, the global classification unit may comprise a 6-layer neural network.
As yet another example, the local classification unit is a recurrent neural network (RNN) and the global classification unit is a multi-layer neural network based on the Transformer architecture, each layer of which comprises at least one multi-head attention layer and at least one fully connected layer.
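The following numpy sketch illustrates the two global-unit building blocks described above — the sinusoidal positional representation and a single attention head. The dimensions, and the use of one head rather than several, are illustrative assumptions.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal position representation; d_model is assumed even."""
    pe = np.zeros((seq_len, d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    pe[:, 0::2] = np.sin(pos / 10000 ** (i / d_model))   # PE(pos, 2i)
    pe[:, 1::2] = np.cos(pos / 10000 ** (i / d_model))   # PE(pos, 2i+1)
    return pe

def attention(q, k, v, W_Q, W_K, W_V):
    """Single attention head: Q = q W_Q, K = k W_K, V = v W_V, then
    softmax(Q K^T / sqrt(d_k)) V. Multi-head attention concatenates several
    such heads before the fully connected layer."""
    Q, K, V = q @ W_Q, k @ W_K, v @ W_V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```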
Referring to fig. 4, fig. 4 is a schematic diagram of a method for classifying table titles according to an embodiment of the present application. The method may be used to classify the titles of various quotation forms in a bidding document. As shown in fig. 4, the method may include, but is not limited to, the following steps:
step S401: and acquiring a form title text of the form to be classified.
For example, the table title text of a quotation form in a bidding document is obtained.
Step S402: and inputting the form title text into a form title classification model to obtain a classification result.
The table title classification model is obtained by training based on the training method of the table title classification model provided by any embodiment of the application.
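As an illustration only, a minimal inference sketch that chains the earlier sketches together; `segmenter`, `title_vector`, and `model` refer to the hypothetical objects defined in those sketches and are not part of the patent.

```python
import torch

def classify_title(title_text, segmenter, model):
    words = segmenter.cut(title_text, text=True).split()  # word segmentation
    vec = title_vector(words)                             # sample word vector
    logits = model(torch.tensor(vec).float())             # class scores
    return int(torch.argmax(logits, dim=-1))              # predicted category
```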
By implementing this embodiment of the application, table titles to be classified can be classified with the trained table title classification model, improving the efficiency and accuracy of table title classification.
Referring to fig. 5, fig. 5 is a schematic diagram of a training device for a table title classification model according to an embodiment of the application. As shown in fig. 5, the apparatus 500 includes: an obtaining module 501, configured to obtain a table title sample and a sample classification label corresponding to the table title sample; a generating module 502, configured to generate a local input matrix and a global input matrix based on the table title sample and the sample classification label; a first processing module 503, configured to input a local input matrix into a local classification unit, and obtain a first feature vector; a second processing module 504, configured to input the global input matrix into the global classification unit, and obtain a second feature vector; the training module 505 is configured to input the first feature vector and the second feature vector into the MLP for feature mapping, obtain a prediction classification result, and train the table title classification model according to the prediction classification result and the sample classification label.
In one implementation, the generating module 502 is specifically configured to: performing lexical analysis on the table title sample to obtain a table title sample sequence corresponding to the table title sample; generating a sample word vector corresponding to the table title sample based on the table title sample sequence; a local input matrix and a global input matrix are generated based on the sample word vector and the sample classification labels.
In an alternative implementation, the generating module 502 is specifically configured to: take the sample word vector corresponding to each table title sample as a row vector to generate the local input matrix; classify the table title samples according to the sample classification labels to obtain table title samples of different categories; and splice the sample word vectors corresponding to the table title samples of each category into one row vector to generate the global input matrix.
In one implementation, the training module 505 is specifically configured to: acquiring a loss function based on the prediction classification result and the sample classification label; and calculating gradients according to the loss functions and carrying out back propagation to update model parameters of the table title classification model in a gradient descent mode.
In one implementation, the local classification unit is an RNN and/or the global classification unit is a Transformer-based multi-layer neural network, each layer of which includes at least one multi-head attention layer and at least one fully connected layer.
With the device provided by this embodiment of the application, a local input matrix and a global input matrix can be generated based on the acquired table title sample and its corresponding sample classification label and input into the table title classification model to obtain a prediction result, so that the model can be trained against the sample classification label, yielding a table title classification model that classifies table titles accurately and improves the efficiency and accuracy of table title classification.
Referring to fig. 6, fig. 6 is a schematic diagram of a table title classification device according to an embodiment of the application. As shown in fig. 6, the apparatus 600 includes: an obtaining module 601, configured to obtain the table title text of a table to be classified; and a classification module 602, configured to input the table title text into a table title classification model to obtain a classification result, where the table title classification model is trained using the training method for the table title classification model provided by any embodiment of the application.
With the device provided by this embodiment of the application, table titles to be classified can be classified based on the trained table title classification model, improving the efficiency and accuracy of table title classification.
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in the method embodiments and will not be repeated here.
Based on the embodiment of the application, the application also provides electronic equipment, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method of the table title classification model of any of the foregoing embodiments, or to perform the classification method of the table title of any of the foregoing embodiments.
Based on the embodiments of the application, the application further provides a computer readable storage medium storing computer instructions that cause a computer to execute the training method of the table title classification model according to any one of the foregoing embodiments, or the classification method of the table title provided by the embodiments of the present application.
Referring now to fig. 7, shown in fig. 7 is a schematic block diagram of an example electronic device that may be used to implement an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the applications described and/or claimed herein.
As shown in fig. 7, the device 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 702 or loaded from a storage unit 708 into a random access memory (RAM) 703. The RAM 703 may also store the various programs and data required for the operation of the device 700. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in device 700 are connected to I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of various general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 701 performs the methods and processes described above, for example the training method of the table title classification model or the classification method of the table title. For example, in some embodiments, the training method of the table title classification model and/or the classification method of the table title may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the training method of the table title classification model or of the classification method of the table title described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the training method of the table title classification model, or the classification method of the table title, in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above can be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present application may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present application, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a cathode ray tube (CRT) or liquid crystal display (LCD) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or a middleware component (e.g., an application server), or a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network; their relationship arises from computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, a host product in a cloud computing service system that overcomes the defects of difficult management and weak service scalability in traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system or a server combined with a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solution of the present application are achieved, and are not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (10)

1. A method of training a table title classification model, the table title classification model comprising a local classification unit, a global classification unit, and a multi-layer perceptron MLP, the method comprising:
acquiring a table title sample and a sample classification label corresponding to the table title sample;
generating a local input matrix and a global input matrix based on the table title sample and the sample classification label;
inputting the local input matrix into the local classification unit to obtain a first feature vector;
inputting the global input matrix into the global classification unit to obtain a second feature vector;
inputting the first feature vector and the second feature vector into the MLP for feature mapping, obtaining a prediction classification result, and training the table title classification model according to the prediction classification result and the sample classification label.
2. The method of claim 1, wherein the generating a local input matrix and a global input matrix based on the table title sample and the sample classification label comprises:
performing lexical analysis on the table title sample to obtain a table title sample sequence corresponding to the table title sample;
generating a sample word vector corresponding to the table title sample based on the table title sample sequence;
generating the local input matrix and the global input matrix based on the sample word vector and the sample classification label.
3. The method of claim 2, wherein the generating the local input matrix and the global input matrix based on the sample word vector and the sample classification label comprises:
taking the sample word vector corresponding to each table title sample as a row vector to generate the local input matrix;
classifying the form title samples according to the sample classification labels to obtain the form title samples of different categories;
and splicing the sample word vectors corresponding to the table title samples of each category into one row vector to generate the global input matrix.
4. The method of claim 1, wherein the training the table title classification model according to the prediction classification result and the sample classification label comprises:
acquiring a loss function based on the prediction classification result and the sample classification label;
and calculating gradients according to the loss function and carrying out back propagation so as to update model parameters of the table title classification model in a gradient descent mode.
5. The method of claim 1, wherein the local classification unit is a recurrent neural network RNN and/or the global classification unit is a multi-layer neural network based on a Transformer architecture, each layer of the neural network comprising at least one multi-headed attention layer and at least one fully connected layer.
6. A method for classifying a table title, comprising:
acquiring the table title text of a table to be classified;
inputting the table title text into a table title classification model to obtain a classification result; wherein the table title classification model is trained based on the method of any one of claims 1 to 5.
7. A training device for a table title classification model, wherein the table title classification model comprises a local classification unit, a global classification unit and a multi-layer perceptron MLP, the device comprising:
the acquisition module is used for acquiring a table title sample and a sample classification label corresponding to the table title sample;
a generation module for generating a local input matrix and a global input matrix based on the table title sample and the sample classification label;
the first processing module is used for inputting the local input matrix into the local classification unit to obtain a first feature vector;
the second processing module is used for inputting the global input matrix into the global classification unit to obtain a second feature vector;
and the training module is used for inputting the first feature vector and the second feature vector into the MLP for feature mapping, obtaining a prediction classification result, and training the table title classification model according to the prediction classification result and the sample classification label.
8. A table title classification apparatus, comprising:
the acquisition module is used for acquiring the table title text of the table to be classified;
the classification module is used for inputting the table title text into a table title classification model to obtain a classification result; wherein the table title classification model is trained based on the method of any one of claims 1 to 5.
9. An electronic device, comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method of the table title classification model of any one of claims 1 to 5 or to perform the classification method of the table title of claim 6.
10. A computer-readable storage medium storing instructions that, when executed, cause a training method of a table title classification model according to any one of claims 1 to 5 to be implemented, or cause a classification method of a table title according to claim 6 to be implemented.
CN202310699463.0A 2023-06-13 2023-06-13 Training method of table title classification model and classification method of table title Pending CN116595380A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310699463.0A CN116595380A (en) 2023-06-13 2023-06-13 Training method of table title classification model and classification method of table title

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310699463.0A CN116595380A (en) 2023-06-13 2023-06-13 Training method of table title classification model and classification method of table title

Publications (1)

Publication Number Publication Date
CN116595380A 2023-08-15

Family

ID=87599212

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310699463.0A Pending CN116595380A (en) 2023-06-13 2023-06-13 Training method of table title classification model and classification method of table title

Country Status (1)

Country Link
CN (1) CN116595380A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination