CN113326700B - ALBert-based complex heavy equipment entity extraction method - Google Patents

ALBert-based complex heavy equipment entity extraction method

Info

Publication number
CN113326700B
CN113326700B CN202110217185.1A CN202110217185A CN113326700B CN 113326700 B CN113326700 B CN 113326700B CN 202110217185 A CN202110217185 A CN 202110217185A CN 113326700 B CN113326700 B CN 113326700B
Authority
CN
China
Prior art keywords
albert
model
entity
heavy equipment
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110217185.1A
Other languages
Chinese (zh)
Other versions
CN113326700A (en)
Inventor
李军怀
陈苗苗
王怀军
曹霆
于蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology
Priority to CN202110217185.1A
Publication of CN113326700A
Application granted
Publication of CN113326700B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an ALBert-based entity extraction method for complex heavy equipment, implemented according to the following steps: step 1, collecting texts in the field of complex heavy equipment and constructing a corpus; step 2, pre-training the ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert; step 3, labeling entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set; step 4, training a model by feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model; step 5, creating a dictionary Dict; and step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result. The invention can complete the entity extraction task in the field of complex heavy equipment.

Description

ALBert-based complex heavy equipment entity extraction method
Technical Field
The invention belongs to the technical field of knowledge graphs, and particularly relates to an ALBert-based entity extraction method for complex heavy equipment.
Background
Complex heavy equipment is among the important basic equipment of the manufacturing industry and an important guarantee for social and economic development and the national defense industry, and it is regarded as equipment of national importance. As high-end equipment, heavy equipment is widely applied in key industries and fields such as energy, transportation, shipbuilding, engineering machinery, metallurgy, aerospace, and the military industry. Heavy equipment has a long development cycle with many complex stages, including preliminary investigation, design, manufacturing, purchasing, matching, installation, commissioning, delivery, quality control, and after-sales service. These processes generate a large amount of knowledge, much of which is stored as text.
With the development of new internet technologies, effective management and reuse of knowledge in the equipment manufacturing industry can better support the whole process of design, production, and operation and maintenance. The knowledge graph is an efficient way of organizing and managing knowledge. Entity extraction is one of the key links in knowledge graph construction, and its accuracy determines the accuracy of the knowledge graph to a certain extent. Entity extraction from complex heavy equipment text therefore lays a foundation for subsequent knowledge graph construction, effective knowledge management, and knowledge reuse.
Disclosure of Invention
The invention aims to provide an ALBert-based entity extraction method for complex heavy equipment, which can complete the entity extraction task in the field of complex heavy equipment.
The technical scheme adopted by the invention is an ALBert-based entity extraction method for complex heavy equipment, implemented according to the following steps:
Step 1, collecting texts in the field of complex heavy equipment and constructing a corpus;
Step 2, pre-training the ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
Step 3, labeling entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set;
Step 4, training a model, namely feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model;
Step 5, creating a dictionary Dict;
Step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
The present invention is also characterized in that,
In step 1, the web crawler framework Scrapy is used to capture information related to complex heavy equipment from web pages and store it as text files, and the stored text is merged with manually collected existing documents from the complex heavy equipment field to form the data source; the data source is then processed to remove special symbols, formulas, and measurement units; the processed data serves as the corpus and is stored as a text file.
In step 2, the ALBert model takes single Chinese characters as input. A start identifier [CLS] is added before the first character of each sentence and an end identifier [SEP] is added at the end of each sentence. For each input character, ALBert outputs a representation vector that fuses the semantic information of the text. On the basis of the pre-trained ALBert model, the parameters of the subsequent connected layers are fine-tuned on the corpus from the data source, while ALBert's internal parameters do not participate in training, which yields the fine-tuned ALBert model.
In step 3, entity labeling is done manually using the BIO labeling scheme: the first character of an entity is labeled B-Type, the remaining characters of an entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity category.
The model training in step 4 is specifically as follows:
Step 4.1, inputting the training set and validation set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
Step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit (BGRU) to obtain the score of each character for every label;
Step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each character for every label;
Step 4.4, constraining the label sequence with a conditional random field (CRF) to reduce the probability of invalid label sequences;
Step 4.5, obtaining the trained entity extraction model.
Step 5 is specifically as follows:
Related names, including but not limited to parts, combinations, and product names, are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict.
Step 6 is specifically as follows:
Step 6.1, for a large number of texts to be extracted, all the texts are imported into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a secondary extraction to obtain the final entity extraction result;
Step 6.2, for entity extraction from individual sentences, an online recognition mode is used: the sentence to be extracted is pasted into an online recognition window, the model obtained in step 4 is called, and the extraction result is given in combination with the dictionary Dict.
The beneficial effects of the invention are as follows. The ALBert-based entity extraction method for complex heavy equipment labels text from existing documents and web pages in the field to serve as the corpus, realizes word embedding with the fine-tuned ALBert, and trains the entity extraction model with the deep learning algorithm BGRU-Attention-CRF. To improve the accuracy of entity extraction and to account for the specialized terminology of the complex heavy equipment industry, a domain dictionary is added. When a new corpus is input, the trained model identifies the entities in it and combines them with the dictionary to give the final entity extraction result.
Drawings
FIG. 1 is a general flow chart of the ALBert-based complex heavy equipment entity extraction method of the present invention;
FIG. 2 is a flow chart of the deep learning algorithm ALBert-BGRU-Attention-CRF used to build the entity extraction model in the ALBert-based complex heavy equipment entity extraction method of the present invention.
Detailed Description
The invention will be described in detail below with reference to the drawings and the detailed description.
The flow chart of the ALBert-based complex heavy equipment entity extraction method is shown in FIG. 1. On the basis of data collection and processing, an entity extraction model is trained with the deep-learning-based ALBert-BGRU-Attention-CRF algorithm; a preliminary entity extraction is performed on the corpus to be extracted, and the final extraction result is obtained in combination with a dictionary (Dict). The method is implemented according to the following steps:
Step 1, collecting texts in the field of complex heavy equipment, and constructing a corpus;
In step 1, the web crawler framework Scrapy is used to capture information related to complex heavy equipment from web pages and store it as text files, and the stored text is merged with manually collected existing documents from the complex heavy equipment field to form the data source; the data source is then processed to remove special symbols, formulas, and measurement units; the processed data serves as the corpus and is stored as a text file.
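The cleaning part of step 1 can be illustrated with a minimal sketch. The source does not list which symbols, formulas, or units are removed, so the regular expressions and the function name `clean_text` below are assumptions chosen only for illustration.

```python
import re

# Illustrative patterns only: the source does not specify the exact
# symbols, formula syntax, or measurement units that are removed.
FORMULA = re.compile(r"[A-Za-z0-9_]+\s*=\s*[^，。；,;\n]+")
UNIT = re.compile(r"\d+(?:\.\d+)?\s*(?:mm|cm|kg|MPa|kN|MN|kW|t|m)(?![A-Za-z])")
SPECIAL = re.compile(r"[★◆■●※→#*@&^~|<>\\]")
BLANKS = re.compile(r"[ \t\u3000]+")


def clean_text(raw: str) -> str:
    """Remove simple formulas, numeric quantities with units, and special symbols."""
    text = FORMULA.sub("", raw)    # e.g. "F=p*A"
    text = UNIT.sub("", text)      # e.g. "125 MN"
    text = SPECIAL.sub("", text)   # e.g. "★"
    return BLANKS.sub("", text)    # drop spaces and tabs, keep line breaks


print(clean_text("★挤压机 最大挤压力 125 MN。"))
```

A real corpus would likely need a longer unit list and a more careful formula detector; the order of the substitutions matters because formulas may themselves contain special symbols.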
Step 2, pre-training the ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
In step 2, the ALBert model takes single Chinese characters as input. A start identifier [CLS] is added before the first character of each sentence and an end identifier [SEP] is added at the end of each sentence. For each input character, ALBert outputs a representation vector that fuses the semantic information of the text. On the basis of the pre-trained ALBert model, the parameters of the subsequent connected layers are fine-tuned on the corpus from the data source, while ALBert's internal parameters do not participate in training, which yields the fine-tuned ALBert model.
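The input format described for step 2 can be sketched as follows. The function name `to_albert_tokens` is hypothetical, and mapping the tokens to vocabulary ids, which a real ALBert tokenizer would perform, is omitted.

```python
from typing import List


def to_albert_tokens(sentence: str) -> List[str]:
    """Split a sentence into single characters and wrap it in [CLS] ... [SEP]."""
    chars = [ch for ch in sentence if not ch.isspace()]
    return ["[CLS]"] + chars + ["[SEP]"]


print(to_albert_tokens("金属挤压机"))
```

Each element between the two identifiers corresponds to one representation vector in the model output.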
Step 3, labeling entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set;
In step 3, entity labeling is done manually using the BIO labeling scheme: the first character of an entity is labeled B-Type, the remaining characters of an entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity category.
For step 3, a web page system for manual annotation and automatic format conversion was developed. Entity labeling is done manually on the developed data annotation web page.
The pseudo code of the entity labeling and text format conversion algorithm is as follows:
Input: text data to be labeled;
Output: labeled data with tags;
1. Text preprocessing:
1.1. Remove line feeds and blank spaces from the text, and display the text in the tidied format;
1.2. Create a tag array and initialize the tag of every character in the text to O;
2. Entity labeling:
2.1. Click a tag type, select the entity corresponding to that tag type, and set the tags of the selected entity's text to the corresponding tag type;
2.2. If full-text labeling is enabled, search the full text and set the tags of all entities with the same name to the selected tag type;
3. Generate labeled data in the standard format: output the text character by character, appending to each character its corresponding tag followed by a line feed;
Return: the labeled data in the standard format;
Entities are labeled with the BIO scheme: the first character of an entity is labeled B-Type, the remaining characters of an entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity category.
For example, given the corpus sentence "金属挤压机是实现金属挤压加工的最主要设备。" ("The metal extrusion machine is the most important equipment for realizing metal extrusion processing."), the characters are labeled as: 金 B-Product, 属 I-Product, 挤 I-Product, 压 I-Product, 机 I-Product, 是 O, 实 O, 现 O, 金 B-Way, 属 I-Way, 挤 I-Way, 压 I-Way, 加 I-Way, 工 I-Way, 的 O, 最 O, 主 O, 要 O, 设 O, 备 O, 。 O
Non-entity information is labeled "O"; "B-Product" marks the first character of a "product" entity and "I-Product" marks its non-first characters; "B-Way" marks the first character of a "processing mode" entity and "I-Way" marks its non-first characters.
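The labeling pseudo code and the BIO example can be sketched in a few lines. This is an illustration under stated assumptions rather than the annotation system itself: the function names are hypothetical, the Chinese wording of the example sentence is an assumed rendering of the original, full-text labeling is always enabled, and a single space is assumed as the separator between a character and its tag.

```python
from typing import List, Tuple


def bio_labels(text: str, entities: List[Tuple[str, str]]) -> List[str]:
    """Tag every occurrence of each (name, type) pair; all other characters stay O."""
    labels = ["O"] * len(text)
    for name, etype in entities:
        start = text.find(name)
        while start != -1:
            end = start + len(name)
            # Skip spans that would overwrite an entity labeled earlier.
            if all(label == "O" for label in labels[start:end]):
                labels[start] = f"B-{etype}"
                for i in range(start + 1, end):
                    labels[i] = f"I-{etype}"
            start = text.find(name, end)
    return labels


def to_training_lines(text: str, labels: List[str]) -> str:
    """Emit one 'character tag' pair per line (separator is an assumption)."""
    return "".join(f"{ch} {label}\n" for ch, label in zip(text, labels))


sentence = "金属挤压机是实现金属挤压加工的最主要设备。"
tags = bio_labels(sentence, [("金属挤压机", "Product"), ("金属挤压加工", "Way")])
print(to_training_lines(sentence, tags))
```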
Step 4, training a model, namely feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model; the flow chart is shown in FIG. 2.
The model training in step 4 is specifically as follows:
Step 4.1, inputting the training set and validation set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
Step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit BGRU (Bidirectional Gated Recurrent Unit) to obtain the score of each character for every label;
Step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each character for every label;
Step 4.4, constraining the label sequence with a conditional random field CRF (Conditional Random Field) to reduce the probability of invalid label sequences;
Step 4.5, obtaining the trained entity extraction model.
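The effect of the CRF constraint in step 4.4 can be illustrated with a simplified decoder. A trained CRF learns soft transition scores; the sketch below instead assumes hard BIO rules (an I-Type label may only follow B-Type or I-Type of the same type) and runs Viterbi decoding over hypothetical per-character label scores, so it is an approximation of the idea rather than the patented model.

```python
from typing import Dict, List

NEG_INF = float("-inf")


def allowed(prev: str, curr: str) -> bool:
    """BIO rule: I-X may only follow B-X or I-X; B-X and O may follow anything."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True


def viterbi(scores: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label path that respects the BIO rule."""
    tags = list(scores[0])
    # A sequence may not start with an I- label.
    best = {t: (NEG_INF if t.startswith("I-") else scores[0][t]) for t in tags}
    back = []
    for step in scores[1:]:
        new_best, pointers = {}, {}
        for curr in tags:
            prev_score, prev_tag = max((best[p], p) for p in tags if allowed(p, curr))
            new_best[curr] = prev_score + step[curr]
            pointers[curr] = prev_tag
        back.append(pointers)
        best = new_best
    path = [max(best, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores for three characters.
scores = [
    {"O": 0.1, "B-Product": 2.0, "I-Product": 0.5},
    {"O": 1.0, "B-Product": 0.2, "I-Product": 0.9},
    {"O": 0.3, "B-Product": 0.1, "I-Product": 1.5},
]
greedy = [max(s, key=s.get) for s in scores]
print(greedy)           # contains the invalid transition O -> I-Product
print(viterbi(scores))  # constrained path
```

Picking the best label independently for each character yields an invalid sequence here, while the constrained decoding returns a valid one, which is the behavior step 4.4 is meant to encourage.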
The pseudo code for training the entity extraction model is as follows:
Input: training set and validation set;
Output: entity extraction model;
1. Import the training set and validation set;
2. Import the fine-tuned ALBert model;
3. Feed the word vectors into BGRU-Attention-CRF;
4. Specify the model parameters;
5. Input the training set and validation set to start training;
Return: the entity extraction model.
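The source does not state which form of Attention is used for the weighting in step 4.3. As one possible reading, the sketch below assumes scaled dot-product self-attention applied directly to the per-character score vectors, written in plain Python so that the arithmetic is visible; the function names and the input values are hypothetical.

```python
import math
from typing import List


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(scores: List[List[float]]) -> List[List[float]]:
    """Replace each row by an attention-weighted average of all rows."""
    dim = len(scores[0])
    out = []
    for query in scores:
        sims = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in scores]
        weights = softmax(sims)
        out.append([sum(w * row[j] for w, row in zip(weights, scores)) for j in range(dim)])
    return out


# Hypothetical BGRU output: one score vector per character, one entry per label.
bgru_scores = [[2.0, 0.1, 0.5], [0.2, 1.0, 0.9], [0.1, 0.3, 1.5]]
for row in self_attention(bgru_scores):
    print([round(v, 3) for v in row])
```

Each output row is a convex combination of the input rows, so every weighted score stays within the range of the original scores for that label.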
Step 5, creating a dictionary Dict;
Step 5 is specifically as follows:
Related names, including but not limited to parts, combinations, and product names, are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict.
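Building the dictionary from a table can be sketched as follows. The source does not describe the layout of the detailed information table, so the CSV format, the column names, the sample rows, and the entity type `Part` are assumptions made for illustration.

```python
import csv
import io
from typing import Dict


def build_dict(table_csv: str, columns: Dict[str, str]) -> Dict[str, str]:
    """Map every non-empty name in the chosen columns to an entity type."""
    entity_dict: Dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(table_csv)):
        for column, etype in columns.items():
            name = (row.get(column) or "").strip()
            if name:
                entity_dict[name] = etype
    return entity_dict


# Hypothetical table with a product column and a part column.
table = "product,part\n金属挤压机,挤压筒\n金属挤压机,挤压轴\n"
print(build_dict(table, {"product": "Product", "part": "Part"}))
```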
Step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
Step 6 is specifically as follows:
Step 6.1, for a large number of texts to be extracted, all the texts are imported into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a secondary extraction to obtain the final entity extraction result;
Step 6.2, for entity extraction from individual sentences, an online recognition mode is used: the sentence to be extracted is pasted into an online recognition window, the model obtained in step 4 is called, and the extraction result is given in combination with the dictionary Dict.
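The secondary extraction in step 6.1 can be sketched as a merge of model output with dictionary matches. The source does not say how conflicts between the two are resolved; the sketch assumes that model spans take precedence and that a dictionary match is added only when it does not overlap an existing span. The function names, the span format, and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def dict_matches(text: str, entity_dict: Dict[str, str]) -> List[Span]:
    """Find every occurrence of every dictionary name, longest names first."""
    spans = []
    for name in sorted(entity_dict, key=len, reverse=True):
        start = text.find(name)
        while start != -1:
            spans.append((start, start + len(name), entity_dict[name]))
            start = text.find(name, start + len(name))
    return spans


def merge(model_spans: List[Span], dict_spans: List[Span]) -> List[Span]:
    """Keep all model spans and add dictionary spans that overlap nothing kept so far."""
    merged = list(model_spans)
    for span in dict_spans:
        if all(span[1] <= kept[0] or span[0] >= kept[1] for kept in merged):
            merged.append(span)
    return sorted(merged)


text = "挤压筒是金属挤压机的部件"
model_spans = [(4, 9, "Product")]  # hypothetical preliminary model result
entity_dict = {"挤压筒": "Part", "挤压机": "Product"}
print(merge(model_spans, dict_matches(text, entity_dict)))
```

In this example the dictionary recovers the part name the model missed, while the shorter dictionary entry nested inside the model's product span is discarded.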

Claims (3)

1. An ALBert-based complex heavy equipment entity extraction method, characterized by comprising the following steps:
Step 1, collecting texts in the field of complex heavy equipment and constructing a corpus;
In step 1, the web crawler framework Scrapy is used to capture information related to complex heavy equipment from web pages and store it as text files, and the stored text is merged with manually collected existing documents from the complex heavy equipment field to form the data source; the data source is then processed to remove special symbols, formulas, and measurement units; the processed data serves as the corpus and is stored as a text file;
Step 2, pre-training the ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
In step 2, the ALBert model takes single Chinese characters as input, a start identifier [CLS] is added before the first character of each sentence, an end identifier [SEP] is added at the end of each sentence, ALBert outputs for each input character a representation vector that fuses the semantic information of the text, the parameters of the subsequent connected layers are fine-tuned on the corpus from the data source on the basis of the pre-trained ALBert model, and ALBert's internal parameters do not participate in training, so that a fine-tuned ALBert model is obtained;
Step 3, labeling entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set;
In step 3, entity labeling is done manually using the BIO labeling scheme, the first character of an entity is labeled B-Type, the remaining characters of an entity are labeled I-Type, non-entity characters and punctuation marks are labeled O, and Type denotes the entity category;
Step 4, training a model, namely feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model;
The model training in step 4 is specifically as follows:
Step 4.1, inputting the training set and validation set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
Step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit BGRU to obtain the score of each character for every label;
Step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each character for every label;
Step 4.4, constraining the label sequence with a conditional random field CRF to reduce the probability of invalid label sequences;
Step 4.5, obtaining the trained entity extraction model;
Step 5, creating a dictionary Dict;
Step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
2. The ALBert-based complex heavy equipment entity extraction method according to claim 1, wherein step 5 is specifically as follows:
Related names, including parts, combinations, and product names, are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict.
3. The ALBert-based complex heavy equipment entity extraction method according to claim 2, wherein step 6 is specifically as follows:
Step 6.1, for a large number of texts to be extracted, all the texts are imported into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a secondary extraction to obtain the final entity extraction result;
Step 6.2, for entity extraction from individual sentences, an online recognition mode is used: the sentence to be extracted is pasted into an online recognition window, the model obtained in step 4 is called, and the extraction result is given in combination with the dictionary Dict.
CN202110217185.1A 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method Active CN113326700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110217185.1A CN113326700B (en) 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110217185.1A CN113326700B (en) 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method

Publications (2)

Publication Number Publication Date
CN113326700A CN113326700A (en) 2021-08-31
CN113326700B CN113326700B (en) 2024-05-14

Family

ID=77414448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110217185.1A Active CN113326700B (en) 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method

Country Status (1)

Country Link
CN (1) CN113326700B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106815293A (en) * 2016-12-08 2017-06-09 中国电子科技集团公司第三十二研究所 System and method for constructing knowledge graph for information analysis
CN109359293A (en) * 2018-09-13 2019-02-19 内蒙古大学 Mongolian name entity recognition method neural network based and its identifying system
CN110188347A (en) * 2019-04-29 2019-08-30 西安交通大学 Relation extraction method is recognized between a kind of knowledget opic of text-oriented
CN110598203A (en) * 2019-07-19 2019-12-20 中国人民解放军国防科技大学 Military imagination document entity information extraction method and device combined with dictionary
CN110990525A (en) * 2019-11-15 2020-04-10 华融融通(北京)科技有限公司 Natural language processing-based public opinion information extraction and knowledge base generation method
CN111199152A (en) * 2019-12-20 2020-05-26 西安交通大学 Named entity identification method based on label attention mechanism
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
CN111860882A (en) * 2020-06-17 2020-10-30 国网江苏省电力有限公司 Method and device for constructing power grid dispatching fault processing knowledge graph
CN111950540A (en) * 2020-07-24 2020-11-17 浙江师范大学 Knowledge point extraction method, system, device and medium based on deep learning
CN112036185A (en) * 2020-11-04 2020-12-04 长沙树根互联技术有限公司 Method and device for constructing named entity recognition model based on industrial enterprise

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10049103B2 (en) * 2017-01-17 2018-08-14 Xerox Corporation Author personality trait recognition from short texts with a deep compositional learning approach

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
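Patent text classification combining ALBERT and bidirectional gated recurrent unit; Wen Chaodong et al.; Journal of Computer Applications (《计算机应用》); 2021-02-10; Vol. 41, No. 2; full text *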

Also Published As

Publication number Publication date
CN113326700A (en) 2021-08-31

Similar Documents

Publication Publication Date Title
CN109992782B (en) Legal document named entity identification method and device and computer equipment
CN110597997B (en) Military scenario text event extraction corpus iterative construction method and device
CN109670041A (en) A kind of band based on binary channels text convolutional neural networks is made an uproar illegal short text recognition methods
CN107943911A (en) Data pick-up method, apparatus, computer equipment and readable storage medium storing program for executing
CN112100388A (en) Method for analyzing emotional polarity of long text news public sentiment
CN112417854A (en) Chinese document abstraction type abstract method
CN110287482A (en) Semi-automation participle corpus labeling training device
CN110705272A (en) Named entity identification method for automobile engine fault diagnosis
CN112051986A (en) Code search recommendation device and method based on open source knowledge
CN111914555A (en) Automatic relation extraction system based on Transformer structure
CN111460147B (en) Title short text classification method based on semantic enhancement
CN116050397A (en) Method, system, equipment and storage medium for generating long text abstract
CN114969294A (en) Expansion method of sound-proximity sensitive words
CN111444720A (en) Named entity recognition method for English text
CN110968661A (en) Event extraction method and system, computer readable storage medium and electronic device
CN113901224A (en) Knowledge distillation-based secret-related text recognition model training method, system and device
Akhoundzade et al. Persian sentiment lexicon expansion using unsupervised learning methods
CN113326700B (en) ALBert-based complex heavy equipment entity extraction method
CN111339779A (en) Named entity identification method for Vietnamese
CN114595687B (en) Laos text regularization method based on BiLSTM
CN113901172A (en) Case-related microblog evaluation object extraction method based on keyword structure codes
CN110990385A (en) Software for automatically generating news headlines based on Sequence2Sequence
CN116629387B (en) Text processing method and processing system for training under missing condition
Wang et al. Bacteria and Biotope Entity Recognition Using A Dictionary-Enhanced Neural Network Model
CN114386389B (en) Aspect emotion analysis method based on joint learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant