CN110991160A - Intelligent automatic creation system for study-abroad documents - Google Patents

Intelligent automatic creation system for study-abroad documents

Info

Publication number
CN110991160A
CN110991160A (application CN201911353042.2A)
Authority
CN
China
Prior art keywords
training
model
study
data
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911353042.2A
Other languages
Chinese (zh)
Inventor
和逸伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Tomorrow Singularity Education Technology Co Ltd
Original Assignee
Suzhou Tomorrow Singularity Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Tomorrow Singularity Education Technology Co Ltd
Priority to CN201911353042.2A
Publication of CN110991160A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses an intelligent automatic creation system for study-abroad documents, which relates to the technical field of natural language processing applications and comprises data preprocessing, model construction, model training, auxiliary labeling and new-document generation, wherein the data preprocessing comprises loading data, converting data and dividing the data into mini-batches, and the model construction comprises an input layer, an LSTM layer, an output layer, a training error, a loss rate (loss) and an optimizer. With the disclosed system, a user who intends to study abroad only needs to input personalized data, such as undergraduate school and major, target school and major, university grades, English scores, personal abilities, skills, talents and hobbies, to quickly generate a high-quality application document. This greatly reduces application failures caused by poor document quality during the application process and has promising prospects for safeguarding the many university students who wish to study abroad.

Description

Intelligent automatic creation system for study-abroad documents
Technical Field
The invention relates to the technical field of natural language processing applications, in particular to an intelligent automatic creation system for study-abroad documents.
Background
At present there are general-purpose creation platforms such as the Baidu AI intelligent creation platform and the Alibaba creation platform, but their technical systems are oriented toward news writing, shopping content, hot-topic tracking and the like, and no vertically customized solution exists for creating study-abroad application documents; the industry therefore urgently needs a dedicated system capable of intelligent automatic creation of study-abroad documents. The existing technology lacks excellent machine-learning algorithms and large-scale training data for information extraction, student-information mining and the interpretation of schools and majors, and cannot meet the demands of document creation in knowledge extraction, knowledge-graph construction and strategy training. The intelligent automatic creation system for study-abroad documents is therefore provided.
Disclosure of Invention
The purpose of the invention is to provide an intelligent automatic creation system for study-abroad documents that can quickly generate a high-quality application document, greatly reducing application failures caused by poor document quality during the application process and safeguarding university students who wish to study abroad.
In order to achieve the above purpose, the invention provides the following technical scheme: an intelligent automatic creation system for study-abroad documents, comprising data preprocessing, model construction, model training, auxiliary labeling and new-document generation, wherein the data preprocessing comprises loading data, converting data and dividing the data into mini-batches, and the model construction comprises an input layer, an LSTM layer, an output layer, a training error, a loss rate (loss) and an optimizer.
Preferably, the most important function of the data preprocessing is to establish a dictionary and a reverse dictionary: a text file is used as input to train an RNN model, which is then used to generate text similar to the training data, and a dictionary (word -> ID) and a reverse dictionary (ID -> word) for every word are obtained from the training sample (100,000 study-abroad documents); each article is turned into a vector of IDs through the dictionary, the ID vector is then converted into word vectors via an embedding lookup (embedding_lookup), and the training labels (train_label) are obtained by shifting the training data (train_data) back one position.
Preferably, in the model construction, the LSTM basic model is generated with the tf.nn.rnn_cell.BasicLSTMCell provided by TensorFlow, and finally sequence_loss_by_example is used to obtain the loss function as the training target; the network model has 512 LSTM cells, and model parameters, constants and training parameters are set to train the model. At each step of training, 3 symbols are retrieved from the training data and converted into integers to form an input vector; after the symbols are converted into an integer vector in the input-dictionary format, optimization is performed, and the accuracy and loss during training are accumulated to monitor the training process. Typically 50,000 iterations are sufficient to reach acceptable accuracy; a prediction and accuracy instance is recorded once per training interval (every 1,000 steps), the loss and optimizer are designed accordingly, and the accuracy of the LSTM can be improved by adding layers.
Preferably, the model training processes all the study-abroad documents and finally converts them, through build_dataset(), into a dictionary, a document vector and a reverse dictionary, yielding the preprocessed document set; a 2-layer LSTM framework is adopted, each layer with 128 hidden-layer nodes and batch_size set to 64, and notably the training data are shuffled at the end of every training epoch. Generating the output looks simple, but in practice the LSTM produces a 112-element prediction probability vector for the next symbol and normalizes it with the softmax() function, and the index of the element with the highest probability value is the predicted symbol's index in the reverse dictionary.
Preferably, the auxiliary labeling annotates feature-vector information of each category, such as major, school, rules drawn from past experience, the current state of the profession, the profession's development history and direction, the student's personal history, personal interests and hobbies, and personal cultivation; this further labeling improves the construction efficiency of the training data set by 8 times on average and helps the automatic document-writing system better understand the composition, internal logic, grammar and polishing requirements of excellent documents, so that the model's understanding is optimized more quickly and accurately. In natural language processing many tasks can be converted into sequence labeling tasks, i.e. classifying word/character sequences, such as Named Entity Recognition (NER), part-of-speech tagging and event extraction; named entity recognition is described here. NER refers to recognizing entities with specific meanings in text, mainly including school names, major names, personal hobbies, undergraduate school and major, internship experience, English scores, project experience and the like; NER is an important basic tool in application fields such as information extraction, question-answering systems, syntactic analysis and machine translation, and serves as an important step of structured information extraction. For the labeling model, a CRF model is adopted to perform the sequence labeling task, and in the labeling part a CRF layer implements the sequence labeling.
Preferably, the new document is generated from the series of parameters saved during the long model training; the parameters are used to generate text: when a character is input, the model predicts the next character, and feeding the new character back into the model generates characters continuously, thereby forming a text. To reduce noise, the 5 most probable predictions are kept and one is selected at random; for example, when h is input and the five most probable results are [o, e, i, u, b], one of the five is chosen at random as the new character, and this random factor reduces noise in the generation. The first 32 predicted values of the sample-generated document are taken; if another sequence is input, i.e. customized according to the user's personalized information, another study-abroad document is generated automatically.
Preferably, the intelligent automatic creation system for study-abroad documents comprises the following steps:
step one, preprocessing the data, including loading data, converting data, dividing the data into mini-batches, and establishing the dictionary and reverse-dictionary functions;
step two, constructing the model from the data, the model construction comprising an input layer, an LSTM layer, an output layer, a training error, a loss rate (loss) and an optimizer;
step three, training the model according to the constructed model, the model training comprising a two-layer LSTM framework and training on the study-abroad documents;
step four, performing auxiliary labeling on the feature-vector information of each category, including named entity recognition and a sequence labeling task performed with a CRF (conditional random field) model;
and step five, finally generating a new document, including taking the first 32 predicted values of the generated study-abroad document.
Compared with the prior art, the invention has the following beneficial effects: with the intelligent automatic creation system for study-abroad documents, a user who intends to study abroad only needs to input personalized data, such as undergraduate school and major, target school and major, university grades, English scores, personal abilities, skills, talents and hobbies, to quickly generate a high-quality application document; this greatly reduces application failures caused by poor document quality during the application process, safeguards the many university students who wish to study abroad, and has promising application prospects.
Drawings
FIG. 1 is a flow chart of the intelligent automatic creation system for study-abroad documents of the present invention.
Detailed Description
In order to make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further described below in connection with specific embodiments.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present invention provides a technical solution: an intelligent automatic creation system for study-abroad documents, comprising data preprocessing, model construction, model training, auxiliary labeling and new-document generation, wherein the data preprocessing comprises loading data, converting data and dividing the data into mini-batches, and the model construction comprises an input layer, an LSTM layer, an output layer, a training error, a loss rate (loss) and an optimizer.
As shown at S10 in fig. 1, the most important function of the data preprocessing is to establish a dictionary and a reverse dictionary: an RNN model is trained using a text file as input and then used to generate text similar to the training data, and a dictionary (word -> ID) and a reverse dictionary (ID -> word) for every word are obtained from the training sample (100,000 study-abroad documents); each article is turned into a vector of IDs through the dictionary, the ID vector is then converted into word vectors via an embedding lookup (embedding_lookup), and the training labels (train_label) are obtained by shifting the training data (train_data) back one position.
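As an illustration of this step, the following is a minimal Python sketch; the build_dataset name follows the description, while the sample corpus, vocabulary size and <UNK> handling are assumptions added for the example.

```python
# A minimal sketch of the dictionary / reverse-dictionary preprocessing.
import collections

def build_dataset(words, vocab_size=5000):
    """Build a word->ID dictionary and an ID->word reverse dictionary."""
    counts = [("<UNK>", -1)]
    counts.extend(collections.Counter(words).most_common(vocab_size - 1))
    dictionary = {word: i for i, (word, _) in enumerate(counts)}
    reverse_dictionary = {i: word for word, i in dictionary.items()}
    # Turn the article into a vector of IDs; unknown words map to <UNK>.
    data = [dictionary.get(word, 0) for word in words]
    return data, dictionary, reverse_dictionary

words = "I want to study computer science abroad".split()
train_data, dictionary, reverse_dictionary = build_dataset(words, vocab_size=10)
# The training labels are the training data shifted back by one position.
train_label = train_data[1:] + train_data[:1]
```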
As shown at S20 in fig. 1, in the model construction, the LSTM basic model is generated with the tf.nn.rnn_cell.BasicLSTMCell provided by TensorFlow, and finally sequence_loss_by_example is used to obtain the loss function as the training target; the network model has 512 LSTM cells, and model parameters, constants and training parameters are set to train the model. At each step of training, 3 symbols are retrieved from the training data and converted into integers to form an input vector; after the symbols are converted into an integer vector in the input-dictionary format, optimization is performed, and the accuracy and loss during training are accumulated to monitor the training process. Typically 50,000 iterations are sufficient to reach acceptable accuracy; a prediction and accuracy instance is recorded once per training interval (every 1,000 steps), the loss and optimizer are designed accordingly, and the accuracy of the LSTM can be improved by adding layers.
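A sketch of this construction is given below, assuming the TensorFlow 1.x API named in the description (tf.nn.rnn_cell.BasicLSTMCell and sequence_loss_by_example from tf.contrib.legacy_seq2seq); the tensor shapes and the choice of the Adam optimizer are assumptions, since the patent does not reproduce its actual code.

```python
# Model construction sketch (TensorFlow 1.x assumed).
import tensorflow as tf

lstm_size, num_steps, batch_size, vocab_size = 512, 50, 64, 5000

inputs = tf.placeholder(tf.int32, [batch_size, num_steps])
targets = tf.placeholder(tf.int32, [batch_size, num_steps])

# Input layer: look up word vectors for the integer IDs.
embedding = tf.get_variable("embedding", [vocab_size, lstm_size])
emb_inputs = tf.nn.embedding_lookup(embedding, inputs)

# LSTM layer: the basic cell named in the description, with 512 units.
cell = tf.nn.rnn_cell.BasicLSTMCell(lstm_size)
outputs, state = tf.nn.dynamic_rnn(cell, emb_inputs, dtype=tf.float32)

# Output layer: project the hidden states onto vocabulary logits.
logits = tf.layers.dense(tf.reshape(outputs, [-1, lstm_size]), vocab_size)

# Loss: sequence_loss_by_example as the training target.
loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
    [logits], [tf.reshape(targets, [-1])],
    [tf.ones([batch_size * num_steps])])
cost = tf.reduce_sum(loss) / batch_size
train_op = tf.train.AdamOptimizer(0.001).minimize(cost)  # optimizer assumed
```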
As shown at S30 in fig. 1, the model training processes all the study-abroad documents and finally converts them, through build_dataset(), into a dictionary, a document vector and a reverse dictionary, yielding the preprocessed document set; a 2-layer LSTM framework is adopted, each layer with 128 hidden-layer nodes and batch_size set to 64, and notably the training data are shuffled at the end of every training epoch. Generating the output looks simple, but in practice the LSTM produces a 112-element prediction probability vector for the next symbol and normalizes it with the softmax() function, and the index of the element with the highest probability value is the predicted symbol's index in the reverse dictionary.
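The two-layer stacking, per-epoch shuffle and reverse-dictionary lookup described here can be sketched as follows; the MultiRNNCell wiring and the NumPy batching helper are assumptions about how the stated configuration might be assembled.

```python
# Sketch of the 2-layer, 128-unit training configuration (TensorFlow 1.x assumed).
import numpy as np
import tensorflow as tf

lstm_size, num_layers, batch_size = 128, 2, 64

# Stack two 128-unit LSTM cells, as the description specifies.
cells = [tf.nn.rnn_cell.BasicLSTMCell(lstm_size) for _ in range(num_layers)]
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(cells)

def epoch_batches(data, labels, batch_size=64):
    """Shuffle the training data once per epoch, then yield mini-batches.
    `data` and `labels` are NumPy arrays of equal length."""
    order = np.random.permutation(len(data))
    for start in range(0, len(data) - batch_size + 1, batch_size):
        idx = order[start:start + batch_size]
        yield data[idx], labels[idx]

def predicted_symbol(prob_vector, reverse_dictionary):
    """The index of the largest softmax probability is the predicted
    symbol's index in the reverse dictionary."""
    return reverse_dictionary[int(np.argmax(prob_vector))]
```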
As shown at S40 in fig. 1, the auxiliary labeling annotates feature-vector information of each category, such as major, school, rules drawn from past experience, the current state of the profession, the profession's development history and direction, the student's personal history, personal interests and hobbies, and personal cultivation; this further labeling improves the construction efficiency of the training data set by 8 times on average and helps the automatic document-writing system better understand the composition, internal logic, grammar and polishing requirements of excellent documents, so that the model's understanding is optimized more quickly and accurately. In natural language processing many tasks can be converted into sequence labeling tasks, i.e. classifying word/character sequences, such as Named Entity Recognition (NER), part-of-speech tagging and event extraction; named entity recognition is described here. NER refers to recognizing entities with specific meanings in text, mainly including school names, major names, personal hobbies, undergraduate school and major, internship experience, English scores, project experience and the like; NER is an important basic tool in application fields such as information extraction, question-answering systems, syntactic analysis and machine translation, and serves as an important step of structured information extraction. For the labeling model, a CRF model is adopted to perform the sequence labeling task, and in the labeling part a CRF layer implements the sequence labeling.
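For the labeling component, a plausible sketch is a bidirectional LSTM feeding a CRF layer, matching the Bi-LSTM and CRF terms defined later in this description; it assumes TensorFlow 1.x (tf.contrib.crf), and the tag count and dimensions are illustrative only.

```python
# Bi-LSTM + CRF sequence labeling sketch (TensorFlow 1.x assumed).
import tensorflow as tf

num_tags, hidden_size, vocab_size, emb_dim = 9, 128, 5000, 100

word_ids = tf.placeholder(tf.int32, [None, None])   # [batch, seq_len]
tag_ids = tf.placeholder(tf.int32, [None, None])    # gold NER tags
seq_lens = tf.placeholder(tf.int32, [None])         # true sequence lengths

embedding = tf.get_variable("ner_embedding", [vocab_size, emb_dim])
emb = tf.nn.embedding_lookup(embedding, word_ids)

# Bidirectional LSTM encoder over the sentence.
fw = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
bw = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    fw, bw, emb, sequence_length=seq_lens, dtype=tf.float32)
scores = tf.layers.dense(tf.concat([out_fw, out_bw], axis=-1), num_tags)

# CRF layer: maximize the log-likelihood of the gold tag sequences.
log_lik, transition = tf.contrib.crf.crf_log_likelihood(scores, tag_ids, seq_lens)
crf_loss = tf.reduce_mean(-log_lik)
# Decoding at prediction time: tf.contrib.crf.crf_decode(scores, transition, seq_lens)
```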
As shown at S50 in fig. 1, the new document is generated from the series of parameters saved during the long model training; the parameters are used to generate text: when a character is input, the model predicts the next character, and feeding the new character back into the model generates characters continuously, thereby forming a text. To reduce noise, the 5 most probable predictions are kept and one is selected at random; for example, when h is input and the five most probable results are [o, e, i, u, b], one of the five is chosen at random as the new character, and this random factor reduces noise in the generation. The first 32 predicted values of the sample-generated document are taken; if another sequence is input, i.e. customized according to the user's personalized information, another study-abroad document is generated automatically.
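The top-5 random selection can be sketched in a few lines of NumPy; the function name pick_top_n and the example probability vector are assumptions for illustration.

```python
# Sketch of the top-5 sampling used to reduce noise during generation.
import numpy as np

def pick_top_n(preds, reverse_dictionary, top_n=5):
    """Keep the 5 most probable next symbols, renormalize their
    probabilities, and pick one at random as the new character."""
    preds = np.asarray(preds, dtype=np.float64).squeeze()
    keep = np.argsort(preds)[-top_n:]         # indices of the top-5 predictions
    probs = preds[keep] / preds[keep].sum()   # renormalize over the top-5
    choice = np.random.choice(keep, p=probs)
    return reverse_dictionary[int(choice)]

# Example: after inputting 'h', suppose the five most probable results
# are [o, e, i, u, b]; one of them is returned at random.
reverse_dictionary = {0: "o", 1: "e", 2: "i", 3: "u", 4: "b", 5: "x"}
preds = [0.30, 0.25, 0.20, 0.12, 0.08, 0.05]
print(pick_top_n(preds, reverse_dictionary))
```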
The deep neural network platform TensorFlow used in the invention was originally developed by researchers and engineers of the Google Brain team (part of Google's machine intelligence research organization) for machine learning and deep neural network research.
The recurrent neural network RNN:
RNN is a very popular model that has proven powerful in many NLP tasks. The RNN-based language model (RNNLM) has two applications: first, scoring each sequence by its likelihood of occurring in the real world, which in effect measures grammatical and semantic correctness (such a language model is typically part of a machine translation system); second, generating new text.
The long short-term memory model, LSTM, is short for Long Short-Term Memory;
the LSTM is a special RNN model proposed to solve the vanishing-gradient problem of the RNN model; traditional RNNs are trained with BPTT, and over long time spans the residual error to be propagated back shrinks exponentially, so the network weights update slowly and the long-term memory effect of the RNN cannot be realized; a storage unit is therefore needed to store memory, and the LSTM model was proposed;
Bi-LSTM is short for Bi-directional Long Short-Term Memory Units and refers to a bidirectional LSTM; CRF is short for Conditional Random Field; the intelligent automatic creation system for study-abroad documents refers to the Singularity study-abroad intelligent document self-creation system.
1. Parameter optimization
Before model training, some parameters are initialized, mainly: batch_size, the number of sequences in a single batch, set to 64; num_steps, the number of characters in a single sequence, set to 50; lstm_size, the number of hidden-layer nodes, set to 128; num_layers, the number of LSTM layers, set to 3; learning_rate, set to 0.001; and keep_prob, the proportion of nodes retained by the dropout layer during training, set to 80%.
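Collected in one place, these initializations might look as follows; this is a sketch only, as the patent does not show its configuration code.

```python
# Initial hyperparameters as listed above.
batch_size = 64        # number of sequences in a single batch
num_steps = 50         # number of characters in a single sequence
lstm_size = 128        # number of hidden-layer nodes
num_layers = 3         # number of LSTM layers
learning_rate = 0.001
keep_prob = 0.8        # proportion of nodes retained by dropout during training
```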
2. Optimization of training models
An RNN can run into exploding and vanishing gradients; LSTM solves the vanishing-gradient problem, but gradients can still explode, so we adopt gradient clipping to prevent gradient explosion. That is, a threshold is set, and whenever the gradient exceeds this threshold it is reset to the threshold size, which ensures the gradient never becomes very large. In addition, the optimization algorithm applies this clipping and gradually decreases the learning rate.
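Under TensorFlow 1.x, the described clipping is conventionally done with tf.clip_by_global_norm; the sketch below assumes a threshold of 5.0 and a stand-in loss, since the patent states the technique but not the values.

```python
# Gradient-clipping sketch (TensorFlow 1.x assumed).
import tensorflow as tf

learning_rate, grad_clip = 0.001, 5.0  # clip threshold is an assumption
w = tf.get_variable("w", [10, 10])
cost = tf.reduce_sum(tf.square(w))     # stand-in for the model's training loss

# Reset gradients that exceed the threshold back to the threshold size.
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars), grad_clip)
train_op = tf.train.AdamOptimizer(learning_rate).apply_gradients(zip(grads, tvars))
```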
3. Personalized assisted labeling
Unlike traditional general-purpose labeling schemes, the system adopts a labeling scheme personalized for study-abroad documents, annotating feature-vector information of each category such as major, school, rules drawn from past experience, the current state of the profession, the profession's development history and direction, the student's personal history, personal interests and hobbies, and personal cultivation; this further labeling improves the construction efficiency of the training data set by 8 times on average and helps the automatic document-writing system better understand the composition, internal logic, grammar and polishing requirements of excellent documents, so that the model's understanding is optimized more quickly and accurately.
The main steps of the intelligent automatic creation system for study-abroad documents are as follows:
step one, preprocessing the data, including loading data, converting data, dividing the data into mini-batches, and establishing the dictionary and reverse-dictionary functions;
step two, constructing the model from the data, the model construction comprising an input layer, an LSTM layer, an output layer, a training error, a loss rate (loss) and an optimizer;
step three, training the model according to the constructed model, the model training comprising a two-layer LSTM framework and training on the study-abroad documents;
step four, performing auxiliary labeling on the feature-vector information of each category, including named entity recognition and a sequence labeling task performed with a CRF (conditional random field) model;
and step five, finally generating a new document, including taking the first 32 predicted values of the generated study-abroad document.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (7)

1. An intelligent automatic creation system for study-abroad documents, comprising data preprocessing, model construction, model training, auxiliary labeling and new-document generation, characterized in that: the data preprocessing comprises loading data, converting data and dividing the data into mini-batches, and the model construction comprises an input layer, an LSTM layer, an output layer, a training error, a loss rate (loss) and an optimizer.
2. The intelligent automatic creation system for study-abroad documents according to claim 1, characterized in that: the most important function of the data preprocessing is to establish a dictionary and a reverse dictionary: a text file is used as input to train an RNN model, which is then used to generate text similar to the training data, and a dictionary (word -> ID) and a reverse dictionary (ID -> word) for every word are obtained from the training sample (100,000 study-abroad documents); each article is turned into a vector of IDs through the dictionary, the ID vector is then converted into word vectors via an embedding lookup (embedding_lookup), and the training labels (train_label) are obtained by shifting the training data (train_data) back one position.
3. The intelligent automatic creation system for study-abroad documents according to claim 1, characterized in that: in the model construction, the LSTM basic model is generated with the tf.nn.rnn_cell.BasicLSTMCell provided by TensorFlow, and finally sequence_loss_by_example is used to obtain the loss function as the training target; the network model has 512 LSTM cells, and model parameters, constants and training parameters are set to train the model; at each step of training, 3 symbols are retrieved from the training data and converted into integers to form an input vector; after the symbols are converted into an integer vector in the input-dictionary format, optimization is performed, and the accuracy and loss during training are accumulated to monitor the training process; typically 50,000 iterations are sufficient to reach acceptable accuracy; a prediction and accuracy instance is recorded once per training interval (every 1,000 steps), the loss and optimizer are designed accordingly, and the accuracy of the LSTM can be improved by adding layers.
4. The intelligent automatic creation system for study-abroad documents according to claim 1, characterized in that: the model training processes all the study-abroad documents and finally converts them, through build_dataset(), into a dictionary, a document vector and a reverse dictionary, yielding the preprocessed document set; a 2-layer LSTM framework is adopted, each layer with 128 hidden-layer nodes and batch_size set to 64, and notably the training data are shuffled at the end of every training epoch; generating the output looks simple, but in practice the LSTM produces a 112-element prediction probability vector for the next symbol and normalizes it with the softmax() function, and the index of the element with the highest probability value is the predicted symbol's index in the reverse dictionary.
5. The intelligent automatic creation system for study-abroad documents according to claim 1, characterized in that: the auxiliary labeling annotates feature-vector information of each category, such as major, school, rules drawn from past experience, the current state of the profession, the profession's development history and direction, the student's personal history, personal interests and hobbies, and personal cultivation; this further labeling improves the construction efficiency of the training data set by 8 times on average and helps the automatic document-writing system better understand the composition, internal logic, grammar and polishing requirements of excellent documents, so that the model's understanding is optimized more quickly and accurately; in natural language processing many tasks can be converted into sequence labeling tasks, i.e. classifying word/character sequences, such as Named Entity Recognition (NER), part-of-speech tagging and event extraction, with named entity recognition described here; NER refers to recognizing entities with specific meanings in text, mainly including school names, major names, personal hobbies, undergraduate school and major, internship experience, English scores, project experience and the like; NER is an important basic tool in application fields such as information extraction, question-answering systems, syntactic analysis and machine translation, and serves as an important step of structured information extraction; for the labeling model, a CRF model is adopted to perform the sequence labeling task, and in the labeling part a CRF layer implements the sequence labeling.
6. The intelligent automatic creation system for study-abroad documents according to claim 1, characterized in that: the new document is generated from the series of parameters saved during the long model training; the parameters are used to generate text: when a character is input, the model predicts the next character, and feeding the new character back into the model generates characters continuously, thereby forming a text; to reduce noise, the 5 most probable predictions are kept and one is selected at random; for example, when h is input and the five most probable results are [o, e, i, u, b], one of the five is chosen at random as the new character, and this random factor reduces noise in the generation; the first 32 predicted values of the sample-generated document are taken, and if another sequence is input, i.e. customized according to the user's personalized information, another study-abroad document is generated automatically.
7. The intelligent automatic creation system for study-abroad documents according to claim 1, characterized in that the system comprises the following steps:
step one, preprocessing the data, including loading data, converting data, dividing the data into mini-batches, and establishing the dictionary and reverse-dictionary functions;
step two, constructing the model from the data, the model construction comprising an input layer, an LSTM layer, an output layer, a training error, a loss rate (loss) and an optimizer;
step three, training the model according to the constructed model, the model training comprising a two-layer LSTM framework and training on the study-abroad documents;
step four, performing auxiliary labeling on the feature-vector information of each category, including named entity recognition and a sequence labeling task performed with a CRF (conditional random field) model;
and step five, finally generating a new document, including taking the first 32 predicted values of the generated study-abroad document.
CN201911353042.2A 2019-12-25 2019-12-25 Intelligent automatic creation system for study-abroad documents Pending CN110991160A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911353042.2A CN110991160A (en) 2019-12-25 2019-12-25 Intelligent automatic creation system for study-abroad documents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911353042.2A CN110991160A (en) 2019-12-25 2019-12-25 Intelligent automatic creation system for study-abroad documents

Publications (1)

Publication Number Publication Date
CN110991160A (en) 2020-04-10

Family

ID=70076495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911353042.2A Pending CN110991160A (en) 2019-12-25 2019-12-25 Intelligent automatic creation system for study-abroad documents

Country Status (1)

Country Link
CN (1) CN110991160A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818651A (en) * 2021-01-21 2021-05-18 北京明略软件系统有限公司 Intelligent recommendation writing method and system based on enterprise WeChat


Similar Documents

Publication Publication Date Title
CN108984745B (en) Neural network text classification method fusing multiple knowledge maps
CN107729309B (en) Deep learning-based Chinese semantic analysis method and device
CN111310471B (en) Travel named entity identification method based on BBLC model
CN110489523B (en) Fine-grained emotion analysis method based on online shopping evaluation
CN107871158A (en) A kind of knowledge mapping of binding sequence text message represents learning method and device
CN111966812B (en) Automatic question answering method based on dynamic word vector and storage medium
CN107145484A (en) A kind of Chinese word cutting method based on hidden many granularity local features
CN110263325A (en) Chinese automatic word-cut
CN111738002A (en) Ancient text field named entity identification method and system based on Lattice LSTM
CN110276069A (en) A kind of Chinese braille mistake automatic testing method, system and storage medium
CN116070602B (en) PDF document intelligent labeling and extracting method
CN111159412A (en) Classification method and device, electronic equipment and readable storage medium
CN112699685B (en) Named entity recognition method based on label-guided word fusion
CN113761868B (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN111222318A (en) Trigger word recognition method based on two-channel bidirectional LSTM-CRF network
CN112836051B (en) Online self-learning court electronic file text classification method
CN111400494A (en) Sentiment analysis method based on GCN-Attention
CN111709225B (en) Event causal relationship discriminating method, device and computer readable storage medium
CN113505589A (en) BERT model-based MOOC learner cognitive behavior identification method
CN113051922A (en) Triple extraction method and system based on deep learning
CN110222338A (en) A kind of mechanism name entity recognition method
CN116245097A (en) Method for training entity recognition model, entity recognition method and corresponding device
CN113312918B (en) Word segmentation and capsule network law named entity identification method fusing radical vectors
CN111428502A (en) Named entity labeling method for military corpus
CN113312498B (en) Text information extraction method for embedding knowledge graph by undirected graph

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20200410)