CN113743095A - Unified pre-training method for Chinese question generation based on word lattice and relative position embedding - Google Patents
Unified pre-training method for Chinese question generation based on word lattice and relative position embedding
- Publication number
- CN113743095A CN113743095A CN202110814546.0A CN202110814546A CN113743095A CN 113743095 A CN113743095 A CN 113743095A CN 202110814546 A CN202110814546 A CN 202110814546A CN 113743095 A CN113743095 A CN 113743095A
- Authority
- CN
- China
- Prior art keywords
- training
- model
- relative position
- domain
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000012549 training Methods 0.000 title claims abstract description 96
- 238000000034 method Methods 0.000 title claims abstract description 26
- 230000007246 mechanism Effects 0.000 claims description 17
- 239000011159 matrix material Substances 0.000 claims description 13
- 230000000873 masking effect Effects 0.000 claims description 6
- 230000008569 process Effects 0.000 claims description 6
- 238000004364 calculation method Methods 0.000 claims description 5
- 238000013136 deep learning model Methods 0.000 claims description 3
- 238000013508 migration Methods 0.000 claims description 3
- 230000005012 migration Effects 0.000 claims description 3
- 238000007781 pre-processing Methods 0.000 claims description 3
- 230000000694 effects Effects 0.000 abstract description 6
- 238000003058 natural language processing Methods 0.000 description 5
- 238000011161 development Methods 0.000 description 3
- 230000002708 enhancing effect Effects 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000004913 activation Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000002457 bidirectional effect Effects 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000015654 memory Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 238000005728 strengthening Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a unified pre-training method for Chinese question generation based on word lattice and relative position embedding, which comprises the following steps: performing domain pre-training on the RoBERTa parameters; generating a target-domain dictionary quickly and accurately in a semi-supervised, semi-manual manner; fusing the relative position information of the input Chinese characters and words into a Transformer layer according to the dictionary; performing task pre-training of a newly built Transformer layer on a large amount of open-domain question-answer data; and training and inference for question generation. Because the relative position information of each character and each domain word is added to the model input, the model can learn richer positional relations and generates better questions for target-domain input. Domain pre-training and task pre-training are also applied to the model to enhance its inference ability in a specific domain. On the same question-answer dataset, the proposed model achieves better results.
Description
Technical Field
The invention belongs to the field of question generation in Chinese natural language processing, and provides a unified pre-training method for Chinese question generation based on word lattice and relative position embedding.
Background
With the development of the internet and information technology, a huge amount of information has flooded onto the internet, which in turn has driven the development of artificial intelligence. In natural language processing, intelligent systems usually have to process large amounts of input text, so finding better ways to process large corpora is valuable.
Question generation (QG) is a research hotspot in natural language processing. Because human thinking is active and creative, traditional rule-based question-answering systems struggle to produce satisfactory question sentences; at the same time, computing power has improved greatly in recent years, so many deep-learning-based question-answering systems have gradually come into use. One very important application area is education: students encounter a large amount of professional knowledge and vocabulary during their studies, and to become familiar with this knowledge they need to consolidate their memory through practice questions. A question generation system can help teachers pose relevant questions in a domain and relieve teaching pressure. Question generation can also be used in chatbots to enhance human-computer interaction. In summary, generating high-quality questions can not only advance research in natural language processing but also promote fields such as education and psychotherapy. It is therefore of great practical significance to study systems that can pose high-quality questions.
In recent years, Transformer-based models have developed rapidly. The Transformer introduced the attention mechanism, which can effectively capture contextual information from the input corpus. Training a Transformer model on large amounts of text allows it to learn the implicit contextual relationships in natural language. Models such as BERT, RoBERTa, GPT-2 and UniLM all perform well in NLP. These models can be transferred to different downstream tasks: after pre-training, a small amount of labeled text is enough to make the model converge on a downstream task, and the transferred model performs markedly better on it. The UniLM language model combines the masking ideas of several other models and, depending on the task, adopts bidirectional, left-to-right, right-to-left, and sequence-to-sequence masking strategies, each of which makes the model better at a different kind of task. For example, in text generation, the left-to-right masking strategy improves generation ability. For Chinese language models, Cui et al. pre-trained a Chinese RoBERTa model with whole-word masking: since several characters in Chinese often combine into a new complete word, masking all the tokens of a word together captures word-boundary relations better, and the model reached state-of-the-art results among pre-trained models on multiple Chinese datasets. However, because the vocabulary and semantics of each domain in Chinese differ greatly, such a model cannot achieve good results in every professional domain.
Disclosure of Invention
The invention aims to provide a unified pre-training method for Chinese question generation based on word lattice and relative position embedding, which integrates domain word lattice (Lattice) embedding and relative position embedding, and adds domain pre-training and task pre-training. This improves the generation accuracy of the model in the target domain and produces meaningful question sentences more efficiently.
The technical scheme adopted by the invention is as follows.
The unified pre-training method for Chinese question generation based on word lattice and relative position embedding comprises the following steps:
step 1, performing domain pre-training on the RoBERTa parameters;
step 2, generating a target-domain dictionary quickly and accurately in a semi-supervised, semi-manual manner;
step 3, constructing a special mask matrix to improve the generation ability of the model;
step 4, constructing a special relative position embedding matrix, and fusing the relative position information of the input Chinese characters and words into a Transformer layer according to the dictionary of step 2;
step 5, having the newly built Transformer layer inherit the layer-12 parameters of the RoBERTa model, and performing task pre-training on a large amount of open-domain question-answer data;
and step 6, training and inference for question generation.
The present invention is also characterized in that,
the step 1 comprises the following specific steps:
The initial parameters of the model's Transformer blocks in domain pre-training are taken from a base RoBERTa trained on a Wikipedia corpus; model pre-training is then carried out on domain text crawled from the internet. Pre-training uses RoBERTa's bidirectional masked-language-model mechanism together with whole-word masking. The dictionary used for whole-word masking is a publicly available open-domain dictionary, which meets the needs of pre-training. These two mechanisms improve the model's pre-training.
The specific steps of the step 2 are as follows:
To acquire the target-domain dictionary more quickly, the invention uses a semi-supervised, semi-manual approach to speed up dictionary generation. First, electronic documents of the target domain and a large-scale open-domain dictionary are selected manually; the target-domain documents are fed into a named-entity-recognition deep learning model, and the entities it recognizes are added to the domain dictionary. Next, the words of the large-scale open-domain dictionary are indexed against the target-domain text in a rule-based manner, and the words found in the index are added to the target-domain dictionary. Finally, the resulting domain dictionary is reviewed manually to form the final domain vocabulary.
The specific steps of the step 3 are as follows:
During training, the original text and the target question sentence are concatenated and fed into the model. Tokens in the first half of the input can attend to text in both directions, while tokens in the second half can attend only to the first half on their left and to the preceding part of the second half.
The specific steps of the step 4 are as follows:
Word lattice and relative position embedding can add the positional relationship between each character or word to the attention calculation, strengthening the attention mechanism in the Transformer. The invention therefore uses relative position encoding for every character and word during the task pre-training stage. At the same time, the relative position encoding explicitly expresses the positional information between every pair of words.
The step 5 comprises the following specific steps:
To save computational resources and adapt to smaller manually labeled datasets, a migration scheme based on pre-trained models is required to provide sufficient general encyclopedic knowledge and domain information. The Transformer layer that integrates the word lattice and relative position encoding therefore inherits the last-layer parameters of the RoBERTa model pre-trained in step 1, migrating both the encyclopedic and the domain knowledge.
Because the model has many parameters while manually labeled question-answer data are scarce, task pre-training is added: the model is pre-trained on a large amount of open-domain question-answer data crawled from the web, which strengthens its question generation ability.
Step 6 comprises the following steps:
The question-answer text of the target domain is used to train the UniLM language model whose last encoder module has been replaced and which has undergone task pre-training; the cross entropy between the model's decoded predictions and the questions given in the original training data is computed, and the model is optimized with the Adam optimizer. Inference mainly uses beam search.
The invention has the beneficial effects that:
the invention provides a Chinese question generation unified pre-training method based on word lattice and relative position embedding, which is based on a Transformer and a constructed field dictionary of a target field. The core of the model lies in the Embedding of additional information such as lattice Embedding (domain vocabulary Embedding) and relative position Embedding (relative position Embedding) which are specially designed for the particularity of problem generation, and domain pre-training and task pre-training are also applied to the model for enhancing the inference capability of the model in a specific domain. Because the relative position information of each list and the domain vocabulary is added in the model input, the model can learn more position relations and has better effect when generating problems aiming at the target domain input. The model provided by the invention has better effect based on the same question and answer data set.
Drawings
FIG. 1 is a flow chart of the unified pre-training model for Chinese question generation based on word lattice and relative position embedding according to the present invention.
FIG. 2 shows the method for generating a domain dictionary according to the present invention.
FIG. 3 is a diagram of the manner in which the invention performs word lattice embedding with relative position encoding and domain vocabulary.
FIG. 4 shows the Seq2Seq mask matrix M in step 3 of the present invention.
Detailed Description
The unified pre-training method for Chinese question generation based on word lattice and relative position embedding according to the present invention is described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1:
Step 1: domain pre-training of the RoBERTa parameters
Step 1.1: acquiring domain pre-training data
The initial parameters of the model's Transformer blocks in domain pre-training are taken from a base RoBERTa trained on a Wikipedia corpus; model pre-training is then carried out on domain text crawled from the internet. Pre-training uses RoBERTa's bidirectional masked-language-model mechanism together with whole-word masking. The dictionary used for whole-word masking is a publicly available open-domain dictionary, which meets the needs of pre-training. These two mechanisms improve the model's pre-training.
Step 1.2: RoBERTa's bidirectional masking pre-training mechanism
The model uses a bidirectional whole-word masking prediction mechanism, allowing tokens to attend to text in both directions. To suit Chinese, whole-word masking pre-training is used, because in Chinese the meaning expressed by a whole word can be completely different from that of the individual characters it is split into. In this way, context information can be efficiently encoded, producing a contextual representation. Concretely, the model randomly replaces characters or words with "[MASK]": 15% of the tokens in a sequence are selected; of these, 80% are replaced by "[MASK]", 10% are replaced by other characters or words, and 10% are left unchanged.
A cross-entropy loss is then computed between the predicted tokens and the original tokens for training.
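The whole-word masking scheme described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the toy vocabulary, and the assumption that the input is already segmented into words are all ours.

```python
import random

MASK = "[MASK]"
VOCAB = ["天", "气", "学", "习", "模", "型"]  # toy replacement vocabulary (assumption)

def whole_word_mask(words, mask_ratio=0.15, seed=0):
    """Mask whole words: all tokens of a selected word are handled together.
    Of the selected words, 80% become [MASK], 10% a random token, 10% unchanged."""
    rng = random.Random(seed)
    out, labels = [], []
    for word in words:                      # each word is a list of character tokens
        if rng.random() < mask_ratio:       # select this whole word for masking
            r = rng.random()
            for tok in word:
                labels.append(tok)          # the model must predict the original
                if r < 0.8:
                    out.append(MASK)
                elif r < 0.9:
                    out.append(rng.choice(VOCAB))
                else:
                    out.append(tok)
        else:
            out.extend(word)
            labels.extend([None] * len(word))   # no prediction target here
    return out, labels

masked, labels = whole_word_mask([["天", "气"], ["很"], ["好"]], mask_ratio=0.5)
```

A cross-entropy loss over the non-`None` label positions then drives pre-training, as the text above describes.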
Step 2: quickly and accurately generating a target field dictionary by using a semi-supervised and semi-manual mode
The specific steps are as follows: named entity recognition acquisition dictionary
Referring to fig. 2, in order to acquire the target domain dictionary more quickly, the invention uses a semi-supervised and semi-manual mode to accelerate the dictionary generation efficiency. And manually selecting a document of the target field, inputting the document of the target field into the named entity recognition deep learning model, and adding the entity recognized by the model into the field dictionary.
Step 2.2: acquiring dictionary entries via a rule-based approach
A large-scale open-domain dictionary is selected and indexed against the target-domain text in a rule-based manner, and the words found in the index are added to the target-domain dictionary. Finally, the resulting domain dictionary is reviewed manually to form the final domain vocabulary.
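The two acquisition routes above can be combined in a short sketch. The function name, the simple substring test used as the "rule-based index", and the toy data are illustrative assumptions; the result is still meant for manual review, as the text says.

```python
def build_domain_dictionary(domain_texts, open_domain_words, ner_entities):
    """Union of NER-extracted entities and the open-domain dictionary words
    that actually occur in the target-domain corpus (rule-based indexing)."""
    corpus = "".join(domain_texts)
    matched = {w for w in open_domain_words if w in corpus}  # rule-based index
    return sorted(matched | set(ner_entities))               # reviewed manually later

vocab = build_domain_dictionary(
    ["卷积神经网络用于图像识别"],        # target-domain text
    ["神经网络", "数据库", "图像识别"],  # large-scale open-domain dictionary
    ["卷积神经网络"],                    # entities recognized by a NER model
)
```

Open-domain words that never occur in the domain corpus ("数据库" here) are excluded, which keeps the dictionary specific to the target domain.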
Step 3: constructing a special mask matrix to improve the generation ability of the model
During training, the original text and the target question sentence are concatenated and fed into the model. Tokens in the first half of the input can attend to text in both directions, while tokens in the second half can attend only to the first half and to the preceding part of the second half. For example, given the sequence "[SOS] t1 t2 t3 [EOS] t3 t4 t5", the three tokens t1 t2 t3 can attend only to the first five tokens, while t3 t4 t5 can attend to themselves and to all tokens preceding themselves.
Referring to FIG. 4, the matrix M is shown. S1 denotes the first half of the concatenated input sequence; its elements are all set to "0", indicating that its tokens can attend to all token information in the first half. S2 denotes the second half; the elements blocking the first half's view of it are set to "-∞", while the second half can still attend to the first-half information. To improve the model's text generation ability, the S1 rows of M attend to both preceding and following information, and the S2 rows attend only to preceding information including themselves. In the lower-right submatrix, the upper-triangular elements are set to "-∞" and the remaining elements to "0", indicating that text after the current token cannot be attended to. This specially constructed mask matrix is added to the attention score matrix in the encoder, enhancing the model's generation ability.
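The construction of the mask matrix M can be sketched as follows; the function name is ours, and plain Python lists stand in for the tensor used in practice.

```python
NEG_INF = float("-inf")

def seq2seq_mask(src_len, tgt_len):
    """Seq2Seq mask matrix M (0 = may attend, -inf = blocked), added to the
    attention scores. Source tokens attend bidirectionally within the source;
    target tokens attend to the whole source and causally to the target."""
    n = src_len + tgt_len
    m = [[NEG_INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(src_len):
            m[i][j] = 0.0                      # every token sees the source half
    for i in range(src_len, n):
        for j in range(src_len, i + 1):
            m[i][j] = 0.0                      # target token sees itself + preceding targets
    return m

M = seq2seq_mask(2, 2)
```

Adding M to the raw attention scores before the softmax drives the blocked positions' weights to zero, which is exactly the effect described for the S1/S2 blocks above.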
Step 4: constructing a special relative position embedding matrix, and fusing the relative position information of the characters and words in the input into a Transformer layer according to the dictionary of step 2
Referring to FIG. 3, at this stage the input sequence is matched against the domain dictionary to obtain the words it contains; the head and tail indices of each word in the input sequence are recorded, representing the word's start and end positions. For a single character, the start and end indices are the same.
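The lattice construction just described, recording a (head, tail) span for every character and every matched dictionary word, can be sketched as follows; the function name and toy dictionary are assumptions.

```python
def lattice_spans(chars, dictionary):
    """Record (unit, head, tail) for every character and every dictionary
    word found in the character sequence; single characters have head == tail."""
    spans = [(c, i, i) for i, c in enumerate(chars)]   # character spans
    text = "".join(chars)
    for word in dictionary:
        start = text.find(word)
        while start != -1:                             # all occurrences of the word
            spans.append((word, start, start + len(word) - 1))
            start = text.find(word, start + 1)
    return spans

spans = lattice_spans(list("重庆人和药店"), ["重庆", "人和", "药店", "重庆人"])
```

Matched words are effectively appended after the character sequence with their original head/tail indices, which is what the relative-position computation below consumes.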
Relative position embedding adds the positional relationship between each token or word to the attention calculation, enhancing the attention mechanism in the Transformer. The model therefore uses relative position encoding for each token and word during the task pre-training phase. At the same time, the relative position encoding explicitly expresses the positional information between every pair of words.
Specifically, head[i] and tail[i] are the head and tail indices of the i-th span sp_i. Four relative position matrices are computed as follows:

d_{ij}^{(hh)} = head[i] - head[j]
d_{ij}^{(ht)} = head[i] - tail[j]
d_{ij}^{(th)} = tail[i] - head[j]
d_{ij}^{(tt)} = tail[i] - tail[j]

where each d_{ij} represents a distance between the head or tail of span i and the head or tail of span j. The final relative position information is computed as follows and passed through an activation function:

R_{ij} = ReLU( W_r ( p_{d_{ij}^{(hh)}} ⊕ p_{d_{ij}^{(th)}} ⊕ p_{d_{ij}^{(ht)}} ⊕ p_{d_{ij}^{(tt)}} ) )

where p_d is computed according to the official BERT absolute position embedding, W_r is a learnable parameter, and ⊕ denotes tensor concatenation; the result has dimension [hidden_size] and represents the positional association information between tokens.

So that the model can fully learn this association information, the self-attention mechanism of the Transformer is used, defined in the following form:

A_{ij} = E_{x_i}^T W_q^T W_{k,E} E_{x_j} + E_{x_i}^T W_q^T W_{k,R} R_{ij} + u^T W_{k,E} E_{x_j} + v^T W_{k,R} R_{ij}

where the W's are weight matrices and u, v are also learnable parameters. The final output is the token embedding tensor infused with the relative position information.
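The relative-position fusion for a set of spans can be sketched numerically as follows. This follows the text's description (four head/tail distances, a position table p_d, concatenation, a learnable projection, ReLU); the random stand-ins for p_d and W_r, the clipping range, and the function name are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_position_fusion(heads, tails, d_model=8, max_dist=16):
    """Four signed head/tail distances per span pair, each looked up in a
    position table standing in for p_d, concatenated and projected by a
    learnable W_r, then passed through a ReLU activation."""
    heads, tails = np.asarray(heads), np.asarray(tails)
    d_hh = heads[:, None] - heads[None, :]
    d_ht = heads[:, None] - tails[None, :]
    d_th = tails[:, None] - heads[None, :]
    d_tt = tails[:, None] - tails[None, :]

    table = rng.standard_normal((2 * max_dist + 1, d_model))  # stand-in for p_d
    def p(d):
        return table[np.clip(d, -max_dist, max_dist) + max_dist]

    concat = np.concatenate([p(d_hh), p(d_th), p(d_ht), p(d_tt)], axis=-1)
    W_r = rng.standard_normal((4 * d_model, d_model))         # learnable parameter
    return np.maximum(concat @ W_r, 0.0)                      # ReLU activation

# three spans: two single characters and one word covering positions 0..2
R = relative_position_fusion([0, 1, 0], [0, 1, 2])
```

The resulting R_{ij} tensor is what the modified self-attention consumes alongside the token embeddings.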
Step 5: the newly built Transformer layer inherits the layer-12 parameters of the RoBERTa model, and task pre-training is performed on a large amount of open-domain question-answer data
To save computational resources and accommodate smaller datasets, a migration scheme based on pre-trained models is required to provide sufficient general encyclopedic knowledge. The Transformer layer fused with relative position encoding therefore inherits layer 12 of the RoBERTa parameters trained on encyclopedic knowledge, transferring that knowledge.
Because the model has many parameters, task pre-training is added: the model is pre-trained on a large amount of open-domain question-answer data crawled from the web, strengthening its question generation ability.
In the embodiments, all methods were tested on the PyTorch platform using a GTX 2080 Ti GPU. During pre-training, the maximum sequence length was set to 512. The Adam optimizer used β1 = 0.9 and β2 = 0.99; the learning rate was set to 1e-4, the dropout ratio to 0.1, the weight decay to 1e-3, and the batch size to 2. For fine-tuning, the dropout ratio was set to 0.2, and each configuration was trained for 200 epochs with a dynamic learning rate.
Step 6: training and inference for question generation
The question-answer text of the target domain is used to train the UniLM model fused with relative position encoding and task pre-trained; the cross entropy between the model's decoded predictions and the questions given in the original training data is computed, and the resulting gradients are used to optimize the model with the Adam optimizer. Inference mainly uses beam search.
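The beam search used at inference time can be sketched as follows. The interface is an assumption: `step_fn` stands in for one decoding step of the trained model, returning candidate next tokens with their log-probabilities.

```python
import math

def beam_search(step_fn, bos, eos, beam_size=3, max_len=10):
    """Minimal beam search: keep the beam_size highest-scoring prefixes,
    finishing a hypothesis when it emits the end-of-sequence token."""
    beams = [([bos], 0.0)]          # (token sequence, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_fn(seq):
                hyp = (seq + [tok], score + lp)
                (finished if tok == eos else candidates).append(hyp)
        if not candidates:
            break
        beams = sorted(candidates, key=lambda h: -h[1])[:beam_size]
    finished.extend(beams)          # unfinished beams compete too
    return max(finished, key=lambda h: h[1])[0]

# Toy "model": prefers token "a", forces end-of-sequence after three tokens.
def toy_step(seq):
    if len(seq) >= 3:
        return [("</s>", 0.0)]
    return [("a", math.log(0.6)), ("b", math.log(0.3)), ("</s>", math.log(0.1))]

best = beam_search(toy_step, "<s>", "</s>")
```

Unlike greedy decoding, the beam keeps several partial questions alive, so a locally weaker token can still lead to the globally best question sentence.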
The unified pre-training method for Chinese question generation based on word lattice and relative position embedding differs from previous absolute position encoding by feeding the model more position information: after the computation, the model knows not only the positional relationship between adjacent characters but also the relationship between every character and every word. Meanwhile, the use of domain pre-training and task pre-training improves the model's generation ability in the target domain to a certain extent.
Claims (7)
1. A unified pre-training method for Chinese question generation based on word lattice and relative position embedding, characterized in that domain pre-training and task pre-training are used, and a domain dictionary is generated in a semi-supervised, semi-manual manner. In the task pre-training stage, the head and tail position indices of the domain words appearing in the input sequence are first recorded, and the indexed words are appended after the input sequence. The relative positions between every word and character are then recorded and fed into the newly built Transformer module that replaces the last layer of the UniLM model. Finally, the generated question is decoded by a decoder. The method comprises the following steps:
step 1, performing domain pre-training on the RoBERTa parameters;
step 2, generating a target-domain dictionary quickly and accurately in a semi-supervised, semi-manual manner;
step 3, constructing a special mask matrix to improve the generation ability of the model;
step 4, constructing a special relative position embedding matrix, and fusing the relative position information of the input Chinese characters and words into a Transformer layer according to the dictionary of step 2;
step 5, having the newly built Transformer layer inherit the layer-12 parameters of the RoBERTa model, and performing task pre-training on a large amount of open-domain question-answer data;
and step 6, training and inference for question generation.
2. The unified pre-training method for Chinese question generation based on word lattice and relative position embedding according to claim 1, wherein step 1 comprises the following specific steps:
the initial parameters of the model's Transformer blocks in domain pre-training are taken from a base RoBERTa trained on a Wikipedia corpus; model pre-training is then carried out on domain text crawled from the internet. Pre-training uses RoBERTa's bidirectional masked-language-model mechanism together with whole-word masking. The dictionary used for whole-word masking is a publicly available open-domain dictionary, which meets the needs of pre-training. These two mechanisms improve the model's pre-training.
3. The unified pre-training method for Chinese question generation based on word lattice and relative position embedding according to claim 1, wherein step 2 comprises the following specific steps:
to acquire the target-domain dictionary more quickly, a semi-supervised, semi-manual approach is used to speed up dictionary generation. First, electronic documents of the target domain and a large-scale open-domain dictionary are selected manually; the target-domain documents are fed into a named-entity-recognition deep learning model, and the entities it recognizes are added to the domain dictionary. Next, the words of the large-scale open-domain dictionary are indexed against the target-domain text in a rule-based manner, and the words found in the index are added to the target-domain dictionary. Finally, the resulting domain dictionary is reviewed manually to form the final domain vocabulary.
4. The unified pre-training method for Chinese question generation based on word lattice and relative position embedding according to claim 1, wherein step 3 comprises the following specific steps:
during training, the original text and the target question sentence are concatenated and fed into the model. Tokens in the first half of the input can attend to text in both directions, while tokens in the second half can attend only to the first half on their left and to the preceding part of the second half.
5. The unified pre-training method for Chinese question generation based on word lattice and relative position embedding according to claim 1, wherein step 4 comprises the following specific steps:
relative position embedding adds the positional relationship between each character or word to the attention calculation, strengthening the attention mechanism in the Transformer. Relative position encoding is therefore used for every character and word during the task pre-training stage. At the same time, the relative position encoding explicitly expresses the positional information between every pair of words.
6. The unified pre-training method for Chinese question generation based on word lattice and relative position embedding according to claim 1, wherein step 5 comprises the following specific steps:
to save computational resources and adapt to smaller manually labeled datasets, a migration scheme based on pre-trained models is required to provide sufficient general encyclopedic knowledge and domain information. The Transformer layer that integrates the word lattice and relative position encoding therefore inherits the last-layer parameters of the RoBERTa model domain-pre-trained in step 1, migrating both the encyclopedic and the domain knowledge.
Because the model has many parameters while manually labeled question-answer data are scarce, task pre-training is added: the model is pre-trained on a large amount of open-domain question-answer data crawled from the web, which strengthens its question generation ability.
7. The unified pre-training method for Chinese question generation based on word lattice and relative position embedding according to claim 1, wherein step 6 comprises the following specific steps:
the question-answer text of the target domain is used to train the UniLM language model whose last encoder module has been replaced and which has undergone task pre-training; the cross entropy between the model's decoded predictions and the questions given in the original training data is computed, and the model is optimized with the Adam optimizer. Inference mainly uses beam search.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110814546.0A CN113743095B (en) | 2021-07-19 | 2021-07-19 | Unified pre-training method for Chinese question generation based on word lattice and relative position embedding
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110814546.0A CN113743095B (en) | 2021-07-19 | 2021-07-19 | Chinese problem generation unified pre-training method based on word lattice and relative position embedding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113743095A true CN113743095A (en) | 2021-12-03 |
CN113743095B CN113743095B (en) | 2024-09-20 |
Family
ID=78728839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110814546.0A Active CN113743095B (en) | 2021-07-19 | 2021-07-19 | Chinese problem generation unified pre-training method based on word lattice and relative position embedding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113743095B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818699A (en) * | 2022-04-30 | 2022-07-29 | 一贯智服(杭州)技术有限公司 | Associated knowledge generation method, auxiliary labeling system and application |
CN117235240A (en) * | 2023-11-14 | 2023-12-15 | 神州医疗科技股份有限公司 | Multi-model result fusion question-answering method and system based on asynchronous consumption queue |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030191625A1 (en) * | 1999-11-05 | 2003-10-09 | Gorin Allen Louis | Method and system for creating a named entity language model |
US20070061022A1 (en) * | 1991-12-23 | 2007-03-15 | Hoffberg-Borghesani Linda I | Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore |
CN111046179A (en) * | 2019-12-03 | 2020-04-21 | 哈尔滨工程大学 | Text classification method for open network question in specific field |
CN111274764A (en) * | 2020-01-23 | 2020-06-12 | 北京百度网讯科技有限公司 | Language generation method and device, computer equipment and storage medium |
JP2020140710A (en) * | 2019-02-26 | 2020-09-03 | 株式会社リコー | Training method for neural machine translation model, apparatus, and storage medium |
JP2020140709A (en) * | 2019-02-26 | 2020-09-03 | 株式会社リコー | Training method for neural machine translation model, apparatus, and storage medium |
CN111639163A (en) * | 2020-04-29 | 2020-09-08 | 深圳壹账通智能科技有限公司 | Problem generation model training method, problem generation method and related equipment |
KR102194837B1 (en) * | 2020-06-30 | 2020-12-23 | 건국대학교 산학협력단 | Method and apparatus for answering knowledge-based question |
CN112270193A (en) * | 2020-11-02 | 2021-01-26 | 重庆邮电大学 | Chinese named entity identification method based on BERT-FLAT |
WO2021012519A1 (en) * | 2019-07-19 | 2021-01-28 | 平安科技(深圳)有限公司 | Artificial intelligence-based question and answer method and apparatus, computer device, and storage medium |
CN112487139A (en) * | 2020-11-27 | 2021-03-12 | 平安科技(深圳)有限公司 | Text-based automatic question setting method and device and computer equipment |
CN112559702A (en) * | 2020-11-10 | 2021-03-26 | 西安理工大学 | Transformer-based natural language problem generation method in civil construction information field |
CN112989834A (en) * | 2021-04-15 | 2021-06-18 | 杭州一知智能科技有限公司 | Named entity identification method and system based on flat grid enhanced linear converter |
CN113011189A (en) * | 2021-03-26 | 2021-06-22 | 深圳壹账通智能科技有限公司 | Method, device and equipment for extracting open entity relationship and storage medium |
WO2021139231A1 (en) * | 2020-06-30 | 2021-07-15 | 平安科技(深圳)有限公司 | Triage method and apparatus based on neural network model, and computer device |
Non-Patent Citations (1)
Title |
---|
GUO Xiaoran et al.: "Chinese Named Entity Recognition Based on Transformer Encoder", Journal of Jilin University (Engineering and Technology Edition), vol. 51, no. 3, 31 May 2021 (2021-05-31) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818699A (en) * | 2022-04-30 | 2022-07-29 | 一贯智服(杭州)技术有限公司 | Associated knowledge generation method, auxiliary labeling system and application |
CN117235240A (en) * | 2023-11-14 | 2023-12-15 | 神州医疗科技股份有限公司 | Multi-model result fusion question-answering method and system based on asynchronous consumption queue |
CN117235240B (en) * | 2023-11-14 | 2024-02-20 | 神州医疗科技股份有限公司 | Multi-model result fusion question-answering method and system based on asynchronous consumption queue |
Also Published As
Publication number | Publication date |
---|---|
CN113743095B (en) | 2024-09-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110134771B (en) | Implementation method of multi-attention-machine-based fusion network question-answering system | |
CN111310471B (en) | Travel named entity identification method based on BBLC model | |
CN112559702B (en) | Method for generating natural language problem in civil construction information field based on Transformer | |
CN109992669B (en) | Keyword question-answering method based on language model and reinforcement learning | |
CN111125333B (en) | Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism | |
CN113743095B (en) | Chinese problem generation unified pre-training method based on word lattice and relative position embedding | |
Wang et al. | Knowledge base question answering with attentive pooling for question representation | |
CN112364132A (en) | Similarity calculation model and system based on dependency syntax and method for building system | |
CN111428104A (en) | Epilepsy auxiliary medical intelligent question-answering method based on viewpoint type reading understanding | |
CN114757184B (en) | Method and system for realizing knowledge question and answer in aviation field | |
CN114387537A (en) | Video question-answering method based on description text | |
CN115238691A (en) | Knowledge fusion based embedded multi-intention recognition and slot filling model | |
CN116521857A (en) | Method and device for abstracting multi-text answer abstract of question driven abstraction based on graphic enhancement | |
Mathur et al. | A scaled‐down neural conversational model for chatbots | |
CN117932066A (en) | Pre-training-based 'extraction-generation' answer generation model and method | |
CN114218921A (en) | Problem semantic matching method for optimizing BERT | |
CN113392656A (en) | Neural machine translation method fusing push-and-knock network and character coding | |
CN113360606A (en) | Knowledge graph question-answer joint training method based on Filter | |
Chowanda et al. | Generative Indonesian conversation model using recurrent neural network with attention mechanism | |
CN114328853B (en) | Chinese problem generation method based on Unilm optimized language model | |
CN115309886A (en) | Artificial intelligent text creation method based on multi-mode information input | |
CN115759102A (en) | Chinese poetry wine culture named entity recognition method | |
CN113010676B (en) | Text knowledge extraction method, device and natural language inference system | |
CN115203388A (en) | Machine reading understanding method and device, computer equipment and storage medium | |
Lv et al. | StyleBERT: Chinese pretraining by font style information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |