CN112395841B - BERT-based method for automatically filling blank text - Google Patents

BERT-based method for automatically filling blank text

Info

Publication number
CN112395841B
Authority
CN
China
Prior art keywords
word
matrix
words
layer
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011291822.1A
Other languages
Chinese (zh)
Other versions
CN112395841A (en)
Inventor
柯逍
卢恺翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202011291822.1A
Publication of CN112395841A
Application granted
Publication of CN112395841B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/10 - Text processing
    • G06F40/166 - Editing, e.g. inserting or deleting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/285 - Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis

Abstract

The invention provides a BERT-based method for automatically filling in blank text, which comprises the following steps. Step S1: take the public CLOTH cloze dataset as the training data base, preprocess the dataset with a tokenizer, and extract the article content and the cloze options. Step S2: pre-train a deep bidirectional representation model by jointly conditioning on the context in all layers of the processed dataset; use the pre-trained model to provide a language model, fine-tune it with an extra output layer, and finally form an encoder by combining the position information of the questions with the language model. Step S3: stack a fully connected layer, a GELU activation layer, a normalization layer and another fully connected layer in sequence to form a decoder, and feed the encoder output into the decoder for decoding. Step S4: predict the word that should appear at each blank from the decoder output. The invention uses artificial intelligence to predict and proofread text containing blanks and assists proofreaders in checking and publishing books.

Description

BERT-based method for automatically filling blank text
Technical Field
The invention relates to the technical field of pattern recognition and natural language processing, and in particular to a BERT-based method for automatically filling in blank text.
Background
In recent years, artificial intelligence has developed rapidly, and using deep learning to handle conversational understanding in everyday life, that is, natural language processing, has become a popular technology. Natural language processing is an important research field within computer science and artificial intelligence; it mainly studies whether machines can correctly understand human language in order to perform functions such as translation and question answering.
The aim of automatically filling in blank text is to use deep learning to automatically complete, or automatically check, the missing or erroneous content of unpublished books containing a large amount of text. By exploiting the BERT model's ability to capture contextual semantics and long-range semantic information, the context of an article can be understood, enabling both automatic completion of blank passages and automatic checking of erroneous content.
Disclosure of Invention
The invention provides a BERT-based method for automatically filling in blank text, which uses artificial intelligence to predict and proofread text containing blanks and assists proofreaders in checking and publishing books.
The invention adopts the following technical scheme.
A BERT-based method for automatically filling in blank text, the method comprising the following steps;
step S1: take the articles in the public CLOTH cloze dataset as the training data base, preprocess the CLOTH dataset with a tokenizer, and extract the article content and the cloze options;
step S2: pre-train a deep bidirectional representation model by jointly conditioning on the context in all layers of the processed dataset; use the pre-trained model to provide a language model, fine-tune it with an extra output layer, and finally form an encoder by combining the position information of the questions with the language model;
step S3: stack a fully connected layer, a GELU activation layer, a normalization layer and another fully connected layer in sequence to form a decoder, and feed the encoder output into the decoder for decoding;
step S4: predict the word that should appear at the blank from the decoder output, i.e. the resulting word probability vector.
Step S1 specifically includes the following steps;
step S11: acquire the public CLOTH cloze dataset;
step S12: use the tokenizer corresponding to the chosen pre-training model to tokenize the articles and candidate options in the CLOTH dataset and convert them into indices in the corresponding dictionary;
step S13: record the position of each blank in the corresponding token sequence, and convert the standard answers from letters into numbers in order;
step S14: after preprocessing, each article in the CLOTH dataset is organized into five fields: sample name, article token IDs, option token IDs, query (blank) locations and answers.
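By way of illustration only, the following Python sketch shows one possible form of the preprocessing in steps S11-S14. It assumes the HuggingFace transformers library and a CLOTH-style sample in which blanks are marked with an underscore; the field names of the resulting dictionary are illustrative assumptions, not prescribed by the method.

```python
# Illustrative preprocessing sketch for steps S11-S14 (field names are assumptions).
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# A CLOTH-style sample: the article marks each blank with "_".
sample = {
    "article": "Tom was very _ because he passed the exam .",
    "options": [["sad", "happy", "angry", "tired"]],
    "answers": ["B"],
}

# Step S12: tokenize the article and the options and map tokens to dictionary indices.
tokens = tokenizer.tokenize(sample["article"].replace("_", "[MASK]"))
article_ids = tokenizer.convert_tokens_to_ids(tokens)
option_ids = [[tokenizer.convert_tokens_to_ids(tokenizer.tokenize(o)) for o in group]
              for group in sample["options"]]

# Step S13: record the position of each blank and convert letter answers to numbers.
query_locations = [i for i, tid in enumerate(article_ids) if tid == tokenizer.mask_token_id]
answer_ids = [ord(a) - ord("A") for a in sample["answers"]]

# Step S14: the five fields kept for each article after preprocessing (names illustrative).
processed = {
    "sample_name": "sample_0001",
    "article_ids": article_ids,
    "option_ids": option_ids,
    "query_locations": query_locations,
    "answers": answer_ids,
}
print(processed["query_locations"], processed["answers"])  # e.g. [3] [1]
```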
Step S2 specifically includes the following steps;
step S21: obtain the representation vector X of each word of the input sentence, where X is the sum of the word embedding of the word and the embedding of the word's position;
step S22: the self-attention mechanism in the Transformer encoder is implemented with three matrices: a query matrix Q, a key matrix K and a value matrix V; first, the words of the input sentence are embedded into a matrix X in which each row represents one word of the sentence; X is multiplied by the weight matrices W_Q, W_K and W_V used by the pre-training model to obtain the matrices Q, K and V respectively;
step S23: multiply the query matrix Q by the key matrix K to score each word of the sentence against every other word, where a higher score means two words are more closely associated; the scores are then divided by the square root of the key-vector dimension d_k to stabilize the gradients; a softmax function then makes all scores positive and makes them sum to 1; finally, the softmax scores are multiplied by the value matrix V to obtain the output of the self-attention layer at this position, denoted by the matrix Z; this is expressed as:
Z = softmax(Q·K^T / √d_k) · V    (formula one)
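By way of illustration only, the following Python (NumPy) sketch computes formula one for steps S22-S23; the matrix sizes and random weights are illustrative stand-ins for the weights of a pre-trained model.

```python
# Scaled dot-product self-attention sketch for steps S22-S23 (illustrative sizes).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

n, d_model, d_k = 4, 8, 8                      # n words, representation size, key dimension
rng = np.random.default_rng(0)

X = rng.normal(size=(n, d_model))              # word representation matrix (step S21)
W_Q = rng.normal(size=(d_model, d_k))          # stand-ins for the pre-trained weight matrices
W_K = rng.normal(size=(d_model, d_k))
W_V = rng.normal(size=(d_model, d_k))

Q, K, V = X @ W_Q, X @ W_K, X @ W_V            # step S22
scores = Q @ K.T / np.sqrt(d_k)                # word-to-word scores, scaled by sqrt(d_k)
Z = softmax(scores) @ V                        # formula one: Z = softmax(QK^T / sqrt(d_k)) V
print(Z.shape)                                 # (4, 8)
```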
step S24: after Z is obtained, it is passed to the next module of the encoder, a feed-forward neural network; this module has two fully connected layers, the activation function of the first layer is ReLU and the second layer is linear, and it can be expressed as:
FFN(Z) = max(0, Z·W1 + b1)·W2 + b2    (formula two);
where W1 and W2 are weight matrices and b1 and b2 are bias vectors; FFN(Z) is taken as the output of the Transformer encoder;
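By way of illustration only, a short NumPy sketch of formula two for step S24, continuing the attention example above with illustrative dimensions.

```python
# Feed-forward module of step S24 (formula two), applied to the attention output Z.
import numpy as np

def ffn(Z, W1, b1, W2, b2):
    # FFN(Z) = max(0, Z W1 + b1) W2 + b2: a ReLU layer followed by a linear layer.
    return np.maximum(0, Z @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(1)
n, d_k, d_ff = 4, 8, 32
Z = rng.normal(size=(n, d_k))
W1, b1 = rng.normal(size=(d_k, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_k)), np.zeros(d_k)
print(ffn(Z, W1, b1, W2, b2).shape)            # (4, 8): same shape as the input
```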
step S25: use the language model provided by the pre-training model, and then fine-tune this existing language model through an additional output layer so that it better suits the downstream task of filling in blank text;
step S26: the added position information of the questions and the Transformer-based language model together form the encoder of the method for automatically completing blank text;
step S27: the obtained word representation matrix is fed into the encoder of the method, and the encoding information matrix C of all words of the sentence is obtained after 6 encoder modules; the word vector matrix is denoted X with dimensions n × d, where n is the number of words in the sentence and d is the dimension of the representation vector; the matrix dimensions of the output of each encoder module are identical to those of its input.
Step S3 specifically includes the following steps;
step S31: the decoder is formed by stacking, in sequence, a fully connected layer, a GELU activation layer, a normalization layer and another fully connected layer;
step S32: the encoding information matrix C output by the encoder is passed to the decoder, which predicts the (n+1)-th word from the n words analysed so far, in order;
step S33: when predicting the (n+1)-th word, the words after it must be hidden by a mask operation; during training, 15% of the words in each input sequence are masked at random and the model is then asked to predict these masked words;
step S34: to prevent selected words from being masked every time, which would keep the model from ever seeing them during subsequent fine-tuning, a further measure is taken in the masking operation: with 80% probability the selected word is replaced with [MASK], with 10% probability it is replaced with a random word, and with 10% probability it is left unchanged.
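By way of illustration only, the following Python sketch applies the 15% masking rate and the 80/10/10 replacement rule of steps S33-S34; it assumes the HuggingFace transformers tokenizer and, for brevity, does not exclude special tokens from masking.

```python
# Masking sketch for steps S33-S34: mask 15% of tokens with the 80/10/10 replacement rule.
import random
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def mask_tokens(token_ids, mask_prob=0.15):
    labels = [-100] * len(token_ids)           # -100 marks positions the loss ignores
    masked = list(token_ids)
    for i, tid in enumerate(token_ids):
        if random.random() < mask_prob:
            labels[i] = tid                    # the model must recover the original word here
            r = random.random()
            if r < 0.8:                        # 80%: replace with [MASK]
                masked[i] = tokenizer.mask_token_id
            elif r < 0.9:                      # 10%: replace with a random dictionary word
                masked[i] = random.randrange(tokenizer.vocab_size)
            # remaining 10%: keep the original word unchanged
    return masked, labels

ids = tokenizer.encode("the cat sat on the mat", add_special_tokens=True)
print(mask_tokens(ids))
```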
Step S4 specifically includes the following steps;
step S41: use the word probability vector output by the decoder to predict, against a preset dictionary, the word that should appear at the blank;
step S42: write the predicted word into the blank of the specified text or the original text.
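By way of illustration only, a minimal sketch of steps S41-S42 with an illustrative dictionary and probability vector: the word with the highest probability is selected and written back into the blank.

```python
# Sketch for steps S41-S42: choose the word at a blank from the word probability vector.
import numpy as np

vocab = ["sad", "happy", "angry", "tired", "the", "cat"]   # illustrative preset dictionary
probs = np.array([0.05, 0.70, 0.10, 0.05, 0.05, 0.05])     # decoder output for one blank

predicted_word = vocab[int(np.argmax(probs))]               # step S41: most probable word
text = "Tom was very _ because he passed the exam ."
print(text.replace("_", predicted_word, 1))                 # step S42: write it into the blank
```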
The CLOTH cloze dataset is released by Carnegie Mellon University; its full name is Large-scale Cloze Test Dataset Created by Teachers.
Compared with the prior art, the invention has the following beneficial effects:
1. Compared with other existing methods, the BERT-based method for automatically filling in blank text constructed by the invention benefits from the bidirectional Transformer module, which can effectively understand context semantics.
2. The dataset does not require a large amount of labelled text; a well-performing language model can be trained using the pre-trained model provided by Google together with the CLOTH dataset.
3. The self-attention mechanism in the Transformer module can imitate the attention-focusing behaviour of human perception, so the hidden contextual relations in a sentence can be understood by relating local features or ignoring useless ones.
4. The performance of the language model can be further optimized, and the accuracy further improved, with methods such as data expansion, data augmentation and data ensembling.
The invention provides a BERT-based method to address the problems of large amounts of unlabelled corpora, language models with too many parameters, and the inability to understand context effectively.
The invention uses the self-attention mechanism proposed in BERT, which can imitate the attention-focusing behaviour of human perception, effectively extract hidden connections in the text, reduce the model parameters, and understand the hidden contextual relations in a sentence by relating local features or ignoring useless ones.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
fig. 1 is a schematic diagram of the principle of the present invention.
Detailed Description
As shown in fig. 1, a BERT-based method for automatically filling in blank text comprises the following steps;
step S1: take the articles in the public CLOTH cloze dataset as the training data base, preprocess the CLOTH dataset with a tokenizer, and extract the article content and the cloze options;
step S2: pre-train a deep bidirectional representation model by jointly conditioning on the context in all layers of the processed dataset; use the pre-trained model to provide a language model, fine-tune it with an extra output layer, and finally form an encoder by combining the position information of the questions with the language model;
step S3: stack a fully connected layer, a GELU activation layer, a normalization layer and another fully connected layer in sequence to form a decoder, and feed the encoder output into the decoder for decoding;
step S4: predict the word that should appear at the blank from the decoder output, i.e. the resulting word probability vector.
Step S1 specifically includes the following steps;
step S11: acquire the public CLOTH cloze dataset;
step S12: use the tokenizer corresponding to the chosen pre-training model to tokenize the articles and candidate options in the CLOTH dataset and convert them into indices in the corresponding dictionary;
step S13: record the position of each blank in the corresponding token sequence, and convert the standard answers from letters into numbers in order;
step S14: after preprocessing, each article in the CLOTH dataset is organized into five fields: sample name, article token IDs, option token IDs, query (blank) locations and answers.
Step S2 specifically includes the following steps;
step S21: obtain the representation vector X of each word of the input sentence, where X is the sum of the word embedding of the word and the embedding of the word's position;
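By way of illustration only, a NumPy sketch of step S21 with illustrative embedding tables: the representation of each word is the element-wise sum of its word embedding and the embedding of its position.

```python
# Step S21 sketch: X is the word embedding plus the position embedding (illustrative sizes).
import numpy as np

vocab_size, max_len, d_model = 100, 16, 8
rng = np.random.default_rng(2)
word_emb = rng.normal(size=(vocab_size, d_model))    # stand-in word-embedding table
pos_emb = rng.normal(size=(max_len, d_model))        # stand-in position-embedding table

token_ids = [5, 17, 42, 7]                           # an illustrative 4-word sentence
X = word_emb[token_ids] + pos_emb[: len(token_ids)]  # one row per word of the sentence
print(X.shape)                                       # (4, 8)
```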
step S22: the self-attention mechanism in the Transformer encoder is implemented with three matrices: a query matrix Q, a key matrix K and a value matrix V; first, the words of the input sentence are embedded into a matrix X in which each row represents one word of the sentence; X is multiplied by the weight matrices W_Q, W_K and W_V used by the pre-training model to obtain the matrices Q, K and V respectively;
step S23: multiply the query matrix Q by the key matrix K to score each word of the sentence against every other word, where a higher score means two words are more closely associated; the scores are then divided by the square root of the key-vector dimension d_k to stabilize the gradients; a softmax function then makes all scores positive and makes them sum to 1; finally, the softmax scores are multiplied by the value matrix V to obtain the output of the self-attention layer at this position, denoted by the matrix Z; this is expressed as:
Z = softmax(Q·K^T / √d_k) · V    (formula one)
step S24: after Z is obtained, it is passed to the next module of the encoder, a feed-forward neural network; this module has two fully connected layers, the activation function of the first layer is ReLU and the second layer is linear, and it can be expressed as:
FFN(Z) = max(0, Z·W1 + b1)·W2 + b2    (formula two);
where W1 and W2 are weight matrices and b1 and b2 are bias vectors; FFN(Z) is taken as the output of the Transformer encoder;
step S25: use the language model provided by the pre-training model, and then fine-tune this existing language model through an additional output layer so that it better suits the downstream task of filling in blank text;
step S26: the added position information of the questions and the Transformer-based language model together form the encoder of the method for automatically completing blank text;
step S27: the obtained word representation matrix is fed into the encoder of the method, and the encoding information matrix C of all words of the sentence is obtained after 6 encoder modules; the word vector matrix is denoted X with dimensions n × d, where n is the number of words in the sentence and d is the dimension of the representation vector; the matrix dimensions of the output of each encoder module are identical to those of its input.
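By way of illustration only, the following PyTorch sketch stacks six encoder modules as in step S27 and shows that the output keeps the n × d shape of the input; the built-in Transformer encoder is used here as an assumed stand-in for the pre-trained BERT layers.

```python
# Step S27 sketch: six stacked encoder modules, each preserving the (n x d) shape.
import torch
import torch.nn as nn

n, d = 10, 768                                        # n words, representation dimension d
layer = nn.TransformerEncoderLayer(d_model=d, nhead=12, dim_feedforward=3072, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=6)  # 6 encoder modules

X = torch.randn(1, n, d)                              # word representation matrix X (n x d)
C = encoder(X)                                        # encoding information matrix C
print(C.shape)                                        # torch.Size([1, 10, 768]), same as the input
```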
Step S3 specifically includes the following steps;
step S31: the decoder is formed by stacking, in sequence, a fully connected layer, a GELU activation layer, a normalization layer and another fully connected layer;
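By way of illustration only, a PyTorch sketch of the decoder of step S31; it assumes the normalization layer is a LayerNorm, as in BERT's masked language modelling head, and the dimensions are illustrative.

```python
# Decoder sketch for step S31: fully connected -> GELU -> normalization -> fully connected.
import torch
import torch.nn as nn

d_model, vocab_size = 768, 30522
decoder = nn.Sequential(
    nn.Linear(d_model, d_model),    # first fully connected layer
    nn.GELU(),                      # GELU activation layer
    nn.LayerNorm(d_model),          # normalization layer (assumed to be LayerNorm)
    nn.Linear(d_model, vocab_size), # second fully connected layer -> word probability logits
)

C = torch.randn(1, 10, d_model)     # encoding information matrix from the encoder
word_logits = decoder(C)            # one score per dictionary word at each position
print(word_logits.shape)            # torch.Size([1, 10, 30522])
```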
step S32: the encoding information matrix C output by the encoder is passed to the decoder, which predicts the (n+1)-th word from the n words analysed so far, in order;
step S33: when predicting the (n+1)-th word, the words after it must be hidden by a mask operation; during training, 15% of the words in each input sequence are masked at random and the model is then asked to predict these masked words;
step S34: to prevent selected words from being masked every time, which would keep the model from ever seeing them during subsequent fine-tuning, a further measure is taken in the masking operation: with 80% probability the selected word is replaced with [MASK], with 10% probability it is replaced with a random word, and with 10% probability it is left unchanged.
Step S4 specifically includes the following steps;
step S41: use the word probability vector output by the decoder to predict, against a preset dictionary, the word that should appear at the blank;
step S42: write the predicted word into the blank of the specified text or the original text.
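By way of illustration only, an end-to-end example of filling a blank with an off-the-shelf pre-trained BERT through the HuggingFace fill-mask pipeline; this is a stand-in for the fine-tuned encoder and decoder described above, not the claimed model itself.

```python
# End-to-end illustration: fill a blank with a generic pre-trained BERT (stand-in model).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
text = "Tom was very [MASK] because he passed the exam."
for candidate in fill(text, top_k=3):                 # the three most probable dictionary words
    print(candidate["token_str"], round(candidate["score"], 3))
```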
The public CLOTH cloze dataset is released by Carnegie Mellon University; its full name is Large-scale Cloze Test Dataset Created by Teachers.
In particular, this embodiment provides a BERT-based method to address the problems of large amounts of unlabelled corpora, language models with too many parameters, and the inability to understand context effectively; the problem of large amounts of unlabelled corpora is effectively solved by using the pre-training idea proposed by BERT. The invention uses the self-attention mechanism proposed in BERT, which can imitate the attention-focusing behaviour of human perception, effectively extract hidden connections in the text, reduce the model parameters, and understand the hidden contextual relations in a sentence by relating local features or ignoring useless ones.
The above description is only a preferred embodiment of the present invention; all equivalent changes and modifications made in accordance with the claims of the present invention shall fall within the scope of the present invention.

Claims (4)

1. A BERT-based method for automatically filling in blank text, characterized in that the method comprises the following steps;
step S1: take the articles in the public CLOTH cloze dataset as the training data base, preprocess the CLOTH dataset with a tokenizer, and extract the article content and the cloze options;
step S2: pre-train a deep bidirectional representation model by jointly conditioning on the context in all layers of the processed dataset; use the pre-trained model to provide a language model, fine-tune it with an extra output layer, and finally form an encoder by combining the position information of the questions with the language model;
step S3: stack a fully connected layer, a GELU activation layer, a normalization layer and another fully connected layer in sequence to form a decoder, and feed the encoder output into the decoder for decoding;
step S4: predict the word that should appear at the blank from the decoder output, i.e. the resulting word probability vector;
step S2 specifically includes the following steps;
step S21: obtain the representation vector X of each word of the input sentence, where X is the sum of the word embedding of the word and the embedding of the word's position;
step S22: the self-attention mechanism in the Transformer encoder is implemented with three matrices: a query matrix Q, a key matrix K and a value matrix V; first, the words of the input sentence are embedded into a matrix X in which each row represents one word of the sentence; X is multiplied by the weight matrices W_Q, W_K and W_V used by the pre-training model to obtain the matrices Q, K and V respectively;
step S23: multiply the query matrix Q by the key matrix K to score each word of the sentence against every other word, where a higher score means two words are more closely associated; the scores are then divided by the square root of the key-vector dimension d_k to stabilize the gradients; a softmax function then makes all scores positive and makes them sum to 1; finally, the softmax scores are multiplied by the value matrix V to obtain the output of the self-attention layer at this position, denoted by the matrix Z; this is expressed as:
Z = softmax(Q·K^T / √d_k) · V    (formula one)
step S24: after Z is obtained, it is passed to the next module of the encoder, a feed-forward neural network; this module has two fully connected layers, the activation function of the first layer is ReLU and the second layer is linear, expressed as:
FFN(Z) = max(0, Z·W1 + b1)·W2 + b2    (formula two);
where W1 and W2 are weight matrices and b1 and b2 are bias vectors; FFN(Z) is taken as the output of the Transformer encoder;
step S25: use the language model provided by the pre-training model, and then fine-tune this existing language model through an additional output layer so that it suits the downstream task of filling in blank text;
step S26: the added position information of the questions and the Transformer-based language model together form the encoder of the method for automatically completing blank text;
step S27: the obtained word representation matrix is fed into the encoder of the method, and the encoding information matrix C of all words of the sentence is obtained after 6 encoder modules; the word vector matrix is denoted X with dimensions n × d, where n is the number of words in the sentence and d is the dimension of the representation vector; the matrix dimensions of the output of each encoder module are identical to those of its input;
step S3 specifically includes the following steps;
step S31: the decoder is formed by stacking, in sequence, a fully connected layer, a GELU activation layer, a normalization layer and another fully connected layer;
step S32: the encoding information matrix C output by the encoder is passed to the decoder, which predicts the (n+1)-th word from the n words analysed so far, in order;
step S33: when predicting the (n+1)-th word, the words after it are hidden by a mask operation; during training, 15% of the words in each input sequence are masked at random and the model is then asked to predict these masked words;
step S34: to prevent selected words from being masked every time, which would keep the model from ever seeing them during subsequent fine-tuning, a further measure is taken in the masking operation: with 80% probability the selected word is replaced with [MASK], with 10% probability it is replaced with a random word, and with 10% probability it is left unchanged.
2. The BERT-based method for automatically filling in blank text according to claim 1, characterized in that step S1 specifically includes the following steps;
step S11: acquire the public CLOTH cloze dataset;
step S12: use the tokenizer corresponding to the chosen pre-training model to tokenize the articles and candidate options in the CLOTH dataset and convert them into indices in the corresponding dictionary;
step S13: record the position of each blank in the corresponding token sequence, and convert the standard answers from letters into numbers in order;
step S14: after preprocessing, each article in the CLOTH dataset is organized into five fields: sample name, article token IDs, option token IDs, query (blank) locations and answers.
3. The BERT-based method for automatically filling in blank text according to claim 1, characterized in that step S4 specifically includes the following steps;
step S41: use the word probability vector output by the decoder to predict, against a preset dictionary, the word that should appear at the blank;
step S42: write the predicted word into the blank of the specified text or the original text.
4. The BERT-based method for automatically filling in blank text according to claim 1, characterized in that the public CLOTH cloze dataset is released by Carnegie Mellon University and its full name is Large-scale Cloze Test Dataset Created by Teachers.
CN202011291822.1A 2020-11-18 2020-11-18 BERT-based method for automatically filling blank text Active CN112395841B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011291822.1A CN112395841B (en) 2020-11-18 2020-11-18 BERT-based method for automatically filling blank text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011291822.1A CN112395841B (en) 2020-11-18 2020-11-18 BERT-based method for automatically filling blank text

Publications (2)

Publication Number Publication Date
CN112395841A CN112395841A (en) 2021-02-23
CN112395841B true CN112395841B (en) 2022-05-13

Family

ID=74607313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011291822.1A Active CN112395841B (en) 2020-11-18 2020-11-18 BERT-based method for automatically filling blank text

Country Status (1)

Country Link
CN (1) CN112395841B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113345574B (en) * 2021-05-26 2022-03-22 复旦大学 Traditional Chinese medicine stomachache health preserving scheme obtaining device based on BERT language model and CNN model
CN113268996A (en) * 2021-06-02 2021-08-17 网易有道信息技术(北京)有限公司 Method for expanding corpus, training method for translation model and product
CN114896986B (en) * 2022-06-07 2024-04-05 北京百度网讯科技有限公司 Method and device for enhancing training data of semantic recognition model
CN117273067B (en) * 2023-11-20 2024-02-02 上海芯联芯智能科技有限公司 Dialogue response method and device based on large language model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502627A (en) * 2019-08-28 2019-11-26 上海海事大学 A kind of answer generation method based on multilayer Transformer polymerization encoder
CN110737769A (en) * 2019-10-21 2020-01-31 南京信息工程大学 pre-training text abstract generation method based on neural topic memory
CN111723547A (en) * 2020-05-25 2020-09-29 河海大学 Text automatic summarization method based on pre-training language model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10747427B2 (en) * 2017-02-01 2020-08-18 Google Llc Keyboard automatic language identification and reconfiguration

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502627A (en) * 2019-08-28 2019-11-26 上海海事大学 A kind of answer generation method based on multilayer Transformer polymerization encoder
CN110737769A (en) * 2019-10-21 2020-01-31 南京信息工程大学 pre-training text abstract generation method based on neural topic memory
CN111723547A (en) * 2020-05-25 2020-09-29 河海大学 Text automatic summarization method based on pre-training language model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
An improved personalized query citation recommendation method; Li Fei et al.; Wanfang Data Journal Database; 2019-10-24; pages 1-5 *

Also Published As

Publication number Publication date
CN112395841A (en) 2021-02-23

Similar Documents

Publication Publication Date Title
CN112395841B (en) BERT-based method for automatically filling blank text
CN110795552B (en) Training sample generation method and device, electronic equipment and storage medium
CN111858932A (en) Multiple-feature Chinese and English emotion classification method and system based on Transformer
CN111581350A (en) Multi-task learning, reading and understanding method based on pre-training language model
CN111966812B (en) Automatic question answering method based on dynamic word vector and storage medium
CN110232113B (en) Method and system for improving question and answer accuracy of knowledge base
CN110276069A (en) A kind of Chinese braille mistake automatic testing method, system and storage medium
CN114881042B (en) Chinese emotion analysis method based on graph-convolution network fusion of syntactic dependency and part of speech
CN109740164A (en) Based on the matched electric power defect rank recognition methods of deep semantic
Agić et al. Baselines and test data for cross-lingual inference
CN113836895A (en) Unsupervised machine reading understanding method based on large-scale problem self-learning
CN111125333A (en) Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
CN114757184B (en) Method and system for realizing knowledge question and answer in aviation field
CN115761753A (en) Retrieval type knowledge prefix guide visual question-answering method fused with knowledge graph
Hämäläinen et al. Revisiting NMT for normalization of early English letters
CN115952263A (en) Question-answering method fusing machine reading understanding
Savci et al. Comparison of pre-trained language models in terms of carbon emissions, time and accuracy in multi-label text classification using AutoML
CN114282592A (en) Deep learning-based industry text matching model method and device
Ajees et al. A named entity recognition system for Malayalam using neural networks
Chowanda et al. Generative Indonesian conversation model using recurrent neural network with attention mechanism
CN112182151A (en) Reading understanding task identification method and device based on multiple languages
CN117131877A (en) Text detection method and system based on contrast learning
CN115455144A (en) Data enhancement method of completion type space filling type for small sample intention recognition
CN114579706A (en) Automatic subjective question evaluation method based on BERT neural network and multitask learning
Jiang et al. Analysis and improvement of external knowledge usage in machine multi-choice reading comprehension tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant