CN113408288A - Named entity identification method based on BERT and BiGRU-CRF - Google Patents

Named entity identification method based on BERT and BiGRU-CRF

Info

Publication number
CN113408288A
Authority
CN
China
Prior art keywords
bert
model
text data
bigru
crf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110725919.7A
Other languages
Chinese (zh)
Inventor
陈鹏杰
谢胜利
孙为军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202110725919.7A
Publication of CN113408288A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a named entity identification method based on BERT and BiGRU-CRF in the field of computer technology. The method comprises: obtaining comment text data of the e-commerce industry through a web crawler and labeling the text data; preprocessing the labeled text data and constructing a training data set and a verification data set; and training a BiGRU-CRF model and a BERT model on the training data set and the verification data set. Word-vector representations of sentences are trained with a BERT pre-trained language model whose conventional training task is modified so that fragment masking replaces conventional word masking; a BiGRU model then extracts further semantic information about the current word and its context; and a CRF algorithm extracts the optimal labels consistent with the context.

Description

Named entity identification method based on BERT and BiGRU-CRF
Technical Field
The invention relates to the technical field of computers, in particular to a named entity identification method based on BERT and BiGRU-CRF.
Background
With the rapid development of internet technology, electronic commerce has become a familiar part of daily life. Most e-commerce platforms provide a review area where consumers can comment on products online. Consumers can choose products by reading the reviews, while merchants can gauge customer satisfaction from the same reviews, spot shortcomings in the transaction process by checking them promptly, and make timely improvements, which matters greatly for the continued development of a shop.
However, the rapid growth of the mobile internet has left e-commerce platforms with a large and unwieldy volume of reviews, which makes it difficult for consumers to obtain accurate product information in a short time. Merchants likewise have difficulty picking out useful consumer feedback from the mass of reviews. Efficiently mining the information contained in these reviews therefore helps guide consumer behavior, prompts merchants to improve their service or product quality, and directly affects the economic benefit of the e-commerce platform.
As review text keeps pouring in and users post freely, review formats are inconsistent and grammatical rules are hard to capture. If natural language processing relies on manual effort, experts cannot build rules and corpora as fast as the review data grows, so the demand cannot be met, the workload is enormous, and human resources are wasted.
An effective solution to the problems in the related art has not been proposed yet.
Disclosure of Invention
To address the problems in the related art, the invention provides a named entity identification method based on BERT and BiGRU-CRF that overcomes the technical problems of the existing related art.
The technical solution of the invention is realized as follows:
A named entity identification method based on BERT and BiGRU-CRF comprises the following steps:
obtaining comment text data of the e-commerce industry through a web crawler, and labeling the text data;
preprocessing the labeled text data, constructing a training data set and a verification data set, and training a BiGRU-CRF model and a BERT model on the training data set and the verification data set;
training word-vector representations of sentences with a BERT pre-trained language model, in which the conventional BERT training task is modified so that fragment masking replaces conventional word masking;
further extracting semantic information about the current word and its context with a BiGRU model;
and extracting, with a CRF algorithm, the optimal labels consistent with the context.
Further, labeling the text data comprises the following steps:
training a model, in an incremental learning manner, on the part of the text data that has already been labeled, and using the trained model to predict labels for the remaining unlabeled text data;
and directly adopting predictions whose confidence is higher than a preset threshold as the labels of the text data, and manually labeling the text data whose confidence is lower than the preset threshold.
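The confidence-based routing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_by_confidence`, the tuple layout, the tag names and the threshold value 0.9 are all assumptions, and sending a prediction that sits exactly on the threshold to manual review is a choice made here because the text only covers the "higher" and "lower" cases.

```python
from typing import List, Tuple

# (text, predicted tags, model confidence); the layout is a hypothetical choice.
Prediction = Tuple[str, List[str], float]


def split_by_confidence(
    predictions: List[Prediction], threshold: float = 0.9
) -> Tuple[List[Tuple[str, List[str]]], List[str]]:
    """Route each prediction to auto-labeling or to manual review."""
    auto_labeled: List[Tuple[str, List[str]]] = []
    needs_review: List[str] = []
    for text, tags, confidence in predictions:
        if confidence > threshold:
            # High confidence: accept the model's tags as the label.
            auto_labeled.append((text, tags))
        else:
            # Low (or borderline) confidence: hand the text to an annotator.
            needs_review.append(text)
    return auto_labeled, needs_review


preds = [
    ("ships fast", ["O", "O"], 0.97),
    ("battery dies", ["B-PART", "O"], 0.55),
]
auto, review = split_by_confidence(preds, threshold=0.9)
```

In an incremental loop, the accepted pairs and the manually corrected texts would be merged back into the labeled pool before the next round of training.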
Further, using fragment masking in place of conventional word masking comprises the following steps:
randomly selecting a masking length according to a geometric distribution, then randomly selecting a starting position according to a uniform distribution, and finally masking a run of characters of that length in the sentence.
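The sampling procedure above can be sketched in a few lines. The success probability `p=0.2`, the cap `max_len=10` and the `[MASK]` placeholder are illustrative assumptions; the text does not state the parameters of the geometric distribution or how many fragments are masked per sentence, so this sketch masks a single fragment.

```python
import random
from typing import List, Tuple


def sample_span_length(rng: random.Random, p: float = 0.2, max_len: int = 10) -> int:
    """Draw a length k >= 1 with P(k) proportional to (1 - p)^(k - 1) * p, clipped."""
    length = 1
    while length < max_len and rng.random() >= p:
        length += 1
    return length


def mask_one_span(
    tokens: List[str],
    rng: random.Random,
    p: float = 0.2,
    max_len: int = 10,
    mask_token: str = "[MASK]",
) -> Tuple[List[str], int, int]:
    """Mask one contiguous fragment; returns (masked tokens, start, length)."""
    if not tokens:
        return [], 0, 0
    # 1) fragment length from the geometric distribution
    length = min(sample_span_length(rng, p, max_len), len(tokens))
    # 2) start position uniformly over all positions where the fragment fits
    start = rng.randint(0, len(tokens) - length)
    # 3) cover the fragment
    masked = list(tokens)
    for i in range(start, start + length):
        masked[i] = mask_token
    return masked, start, length


rng = random.Random(0)
tokens = list("abcdefghij")
masked, start, length = mask_one_span(tokens, rng)
```

Because the masked positions are contiguous, characters that belong to the same word or phrase tend to be hidden together rather than one at a time.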
Further, extracting semantic information about the current word and its context with the BiGRU model includes extracting the entities contained in the text data to be predicted with the trained BiGRU-CRF model, as follows:
modeling the text data sequence in both the forward and backward directions with a bidirectional GRU model;
and scoring the entire prediction path with a conditional random field (CRF), which constrains the relations between label results, and extracting the entities contained in the text data.
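Scoring a whole label path under a CRF is commonly done with Viterbi decoding; the sketch below shows that standard technique, which the text does not name explicitly. The emission scores stand in for BiGRU outputs, and the three-label set (O, B, I) and the penalty of -100 for the transition O to I are assumptions chosen to show how a transition constraint overrides a locally best label.

```python
from typing import List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> Tuple[List[int], float]:
    """Return the highest-scoring label path and its score.

    emissions[t][j]   : score of label j at position t (assumed BiGRU output)
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for ending at label j on this step.
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path back from the best final label.
    last = max(range(n_labels), key=lambda j: score[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Labels: 0 = O, 1 = B, 2 = I. An entity may not start with I, so O -> I is penalized.
transitions = [
    [0.0, 0.0, -100.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
emissions = [
    [2.0, 1.0, 0.0],  # position 0: O looks best on its own
    [0.0, 0.0, 3.0],  # position 1: I looks best on its own
]
path, best = viterbi_decode(emissions, transitions)
```

Picking the best label per position would give O followed by I, which the transition scores forbid; decoding the full path yields B followed by I instead.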
Further, the method also comprises the following steps:
if the text data is a long text, predicting the relation between the text data to be predicted and the entities with the trained BERT model, comprising:
obtaining an original BERT model, in which the [CLS] token represents the overall type features of the sentences in the article and [SEP] separates the multiple sentences of the input article;
combining the BERT input with the entities extracted upstream and encoding it with the structure {[CLS] article sentence [SEP] subject [object] [SEP]};
concatenating the subject entity vector, the sentence vector and the object entity vector, and predicting the relation type through a fully connected layer and softmax; wherein a = [a1, a2, ..., an] denotes the subject entity vector and b = [b1, b2, ..., bn] denotes the object entity vector.
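A relation-model input of this kind could be assembled as plain text before tokenization, as in the sketch below. The source renders the tail of the structure as "subject [object]", so the exact separator between subject and object is unclear; placing a [SEP] between them is an assumption made here, and `build_relation_input` is a hypothetical helper, not part of any library.

```python
def build_relation_input(
    sentence: str,
    subject: str,
    obj: str,
    cls: str = "[CLS]",
    sep: str = "[SEP]",
) -> str:
    """Join sentence, subject entity and object entity into one input string.

    Layout (assumed): [CLS] sentence [SEP] subject [SEP] object [SEP]
    """
    return f"{cls} {sentence} {sep} {subject} {sep} {obj} {sep}"


example = build_relation_input(
    "the phone arrived in two days", "phone", "two days"
)
```

A real tokenizer would normally insert the special tokens itself; the string form is used here only to make the layout visible.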
The beneficial effects of the invention are as follows:
The named entity identification method based on BERT and BiGRU-CRF obtains comment text data of the e-commerce industry through a web crawler and labels it; preprocesses the labeled text data and constructs a training data set and a verification data set; trains a BiGRU-CRF model and a BERT model on those data sets; extracts the entities contained in the text data to be predicted with the trained BiGRU-CRF model; and predicts the relations between the text data to be predicted and the entities with the trained BERT model, establishing connections between entities and then analyzing semantic information further along those connections. Building on the named entity recognition task and on deep-learning natural language processing, the BiGRU-CRF model extracts the required entities from the text, and the BERT model takes the text together with the extracted entities and predicts the relations among the entities.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a named entity identification method based on BERT and BiGRU-CRF according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a named entity identification method based on BERT and BiGRU-CRF according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
According to the embodiment of the invention, a named entity identification method based on BERT and BiGRU-CRF is provided.
As shown in FIGS. 1-2, the named entity identification method based on BERT and BiGRU-CRF according to an embodiment of the present invention includes the following steps:
Step one: obtaining comment text data of the e-commerce industry through a web crawler, and labeling the text data;
Step two: preprocessing the labeled text data to construct a training data set and a verification data set, and training a BiGRU-CRF model and a BERT model on the training data set and the verification data set;
Step three: extracting the entities contained in the text data to be predicted with the trained BiGRU-CRF model;
Step four: predicting the relation between the text data to be predicted and the entities with the trained BERT model;
Step five: further analyzing the semantic information according to the connection relations.
With this technical solution, the invention addresses named entity recognition on e-commerce reviews and provides an automatic information extraction method based on deep-learning natural language processing. Building on named entity recognition for comment text, it provides a high-precision named entity extraction method for extracting information from unstructured comment text.
In addition, labeling the text data includes: training a model on part of the labeled text data in an incremental learning manner, and predicting the remaining unlabeled text data with the trained model; predictions whose confidence is higher than the preset threshold are adopted directly as labels of the text data, and text data whose confidence is lower than the preset threshold is re-labeled manually. This greatly improves the efficiency of data labeling and reduces labor cost.
An entity library of product comment data for the e-commerce industry is also constructed, and the labeled text data is augmented with it. Entities are replaced at random with entities of the same type from the library. For general entity types such as time and amount, random generators covering different ways of writing numbers are built, and the times, amounts and similar values in the original labeled text data are randomly regenerated and replaced. This expands the labeled data set and helps the model learn the language knowledge of the context, improving robustness and accuracy compared with common methods.
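The entity-replacement augmentation described in this passage might look like the sketch below. Everything concrete in it is an assumption for illustration: the contents of `ENTITY_LIBRARY`, the type names PRODUCT, PLACE, AMOUNT and TIME, the output formats of the two generators, and the whitespace-tokenized English example (the actual data would be Chinese review text). Entity spans are assumed not to overlap.

```python
import random
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)

# Hypothetical entity library; a real one would be built from labeled reviews.
ENTITY_LIBRARY: Dict[str, List[str]] = {
    "PRODUCT": ["phone", "laptop", "rice cooker"],
    "PLACE": ["Guangzhou", "Shenzhen"],
}


def random_amount(rng: random.Random) -> str:
    """Generate an amount in one of several surface forms."""
    value = rng.randint(1, 9999)
    return rng.choice([f"{value} yuan", f"{value}.00 yuan", f"RMB {value}"])


def random_time(rng: random.Random) -> str:
    """Generate a duration in one of several surface forms."""
    days = rng.randint(1, 30)
    return rng.choice([f"{days} days", f"{days} d", f"{days * 24} hours"])


def augment(
    tokens: List[str], spans: List[Span], rng: random.Random
) -> Tuple[List[str], List[Span]]:
    """Replace each labeled entity with a same-type substitute and re-align spans."""
    out: List[str] = []
    new_spans: List[Span] = []
    cursor = 0
    for start, end, etype in sorted(spans):
        out.extend(tokens[cursor:start])  # copy the text before the entity
        if etype == "AMOUNT":
            replacement = random_amount(rng)
        elif etype == "TIME":
            replacement = random_time(rng)
        else:
            original = " ".join(tokens[start:end])
            # Unknown types fall back to the original entity text.
            replacement = rng.choice(ENTITY_LIBRARY.get(etype, [original]))
        new_tokens = replacement.split()
        new_spans.append((len(out), len(out) + len(new_tokens), etype))
        out.extend(new_tokens)
        cursor = end
    out.extend(tokens[cursor:])  # copy the text after the last entity
    return out, new_spans


rng = random.Random(0)
tokens = "the phone cost 300 yuan".split()
spans = [(1, 2, "PRODUCT"), (3, 5, "AMOUNT")]
aug_tokens, aug_spans = augment(tokens, spans, rng)
```

Recomputing the spans matters because a replacement can have a different token count than the entity it replaces, and the labels must still line up with the new sentence.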
In addition, extracting the entities contained in the text data to be predicted with the trained BiGRU-CRF model comprises: modeling the text data sequence in the forward and backward directions with a bidirectional GRU model; and scoring the entire prediction path with a conditional random field (CRF), which constrains the relations between label results, to extract the entities contained in the text data. In this embodiment, a deep-learning sequence labeling model performs named entity recognition on each sentence, tagging it so that the entities present in the text are identified.
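To make the forward and backward modeling concrete, here is a dependency-free sketch of a bidirectional GRU pass. It uses one common GRU formulation without bias terms and with random, untrained weights; the sizes, the weight range and the class name `GRUCell` are assumptions, and a real system would use a deep-learning framework rather than Python lists.

```python
import math
import random
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix: List[Vector], vec: Vector) -> Vector:
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


class GRUCell:
    """Minimal GRU cell: update gate z, reset gate r, candidate state n."""

    def __init__(self, input_size: int, hidden_size: int, rng: random.Random):
        def mat(rows: int, cols: int) -> List[Vector]:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_size = hidden_size
        self.W = {g: mat(hidden_size, input_size) for g in "zrn"}   # input weights
        self.U = {g: mat(hidden_size, hidden_size) for g in "zrn"}  # recurrent weights

    def step(self, x: Vector, h: Vector) -> Vector:
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Interpolate between the candidate state and the previous state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def run(cell: GRUCell, inputs: List[Vector]) -> List[Vector]:
    h = [0.0] * cell.hidden_size
    outputs = []
    for x in inputs:
        h = cell.step(x, h)
        outputs.append(h)
    return outputs


def bigru(inputs: List[Vector], fwd: GRUCell, bwd: GRUCell) -> List[Vector]:
    """Concatenate left-to-right and right-to-left hidden states per position."""
    forward = run(fwd, inputs)
    backward = run(bwd, inputs[::-1])[::-1]  # re-align to the original order
    return [f + b for f, b in zip(forward, backward)]


rng = random.Random(0)
fwd, bwd = GRUCell(3, 4, rng), GRUCell(3, 4, rng)
seq = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
states = bigru(seq, fwd, bwd)
```

Each position ends up with a vector that summarizes both what came before it and what comes after it, which is the input a CRF layer would then score.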
In addition, if the text data is a long text, the relation between the text data to be predicted and the entities is predicted with the trained BERT model, and relation connections are established between the related entities, comprising the following steps:
obtaining an original BERT model, in which the [CLS] token represents the overall type features of the sentences in the article and [SEP] separates the multiple sentences of the input article;
combining the BERT input with the entities extracted upstream and encoding it with the structure {[CLS] article sentence [SEP] subject [object] [SEP]};
concatenating the subject entity vector, the sentence vector and the object entity vector, and predicting the relation type through a fully connected layer and softmax;
wherein a = [a1, a2, ..., an] denotes the subject entity vector and b = [b1, b2, ..., bn] denotes the object entity vector.
The method trains an entity recognition model and a relation prediction model separately on the labeled data:
constructing a BiGRU-CRF model and training it for entity recognition, in which a bidirectional GRU models the sequence in the forward and backward directions and a conditional random field (CRF) constrains the relations between label results;
constructing a BERT model as the relation analysis model, processing its input data into the structure {[CLS] article sentence [SEP] subject [object] [SEP]}, and training the BERT model;
predicting the entities that appear in the text to be predicted with the entity recognition model BiGRU-CRF;
and combining the entities extracted in the previous step with one another, merging them with the original text in the format of the training data, and feeding the result to the model to predict the relations among the entities.
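The last step, combining extracted entities into candidates for the relation model, can be sketched as below. Treating every ordered pair of distinct entities as a candidate is an assumption: the text does not say whether all pairs are tried or how duplicates are handled. `relation_candidates` is a hypothetical helper name.

```python
from itertools import permutations
from typing import List, Tuple


def relation_candidates(
    sentence: str, entities: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (sentence, subject, object) for every ordered pair of distinct entities."""
    # dict.fromkeys drops repeated mentions while keeping first-seen order.
    unique = list(dict.fromkeys(entities))
    return [(sentence, subj, obj) for subj, obj in permutations(unique, 2)]


text = "the phone from Shenzhen cost 300 yuan"
candidates = relation_candidates(text, ["phone", "Shenzhen", "300 yuan", "phone"])
```

Ordered pairs are used because the subject and object roles are not interchangeable; each tuple would then be formatted like the training data and scored by the relation model.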
The invention provides a BERT model with an improved masking strategy for analyzing the semantic relations between entities. BERT uses a multi-layer self-attention mechanism to build a bidirectional encoded representation of text, extracting semantic and syntactic information at successive levels from low to high. The conventional BERT model trains a language model through the Masked Language Model task: 15% of the words in a text are masked and the model predicts the masked words. The invention replaces this conventional word masking with segment masking: a masking length is randomly selected according to a geometric distribution, a starting position is then randomly selected according to a uniform distribution, and finally a run of characters of that length in the sentence is masked. Transfer learning with BERT generally provides strong support for downstream tasks.
The original BERT model uses the [CLS] token to represent the overall type features of a sentence and uses [SEP] to separate multiple input sentences. Exploiting this special input structure, the invention proposes a BERT structure for semantic relation analysis: the BERT input is combined with the entities extracted upstream and encoded with the structure {[CLS] article sentence [SEP] subject [object] [SEP]}. Let a = [a1, a2, ..., an] denote the subject entity vector and b = [b1, b2, ..., bn] denote the object entity vector. The sentence vector, the subject entity vector and the object entity vector are concatenated, and the relation type is finally predicted through a fully connected layer and softmax:
V′ = W[concat(a, b)] + λ;
p = softmax(V′).
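The two formulas can be traced with a small numeric sketch. It follows the formula as printed, concatenating only a and b, and reads λ as a bias vector, which is an interpretation rather than something the text states; the weights and vectors are made-up numbers, and a trained model would learn W and λ.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_relation(
    a: List[float], b: List[float], W: List[List[float]], bias: List[float]
) -> List[float]:
    """Compute p = softmax(W [concat(a, b)] + bias)."""
    x = list(a) + list(b)  # concat(a, b)
    v = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, bias)
    ]
    return softmax(v)


# Two relation types, entity vectors of size 2 (toy values).
a = [1.0, 0.0]
b = [0.0, 1.0]
W = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]
bias = [0.0, 0.0]
probs = predict_relation(a, b, W, bias)
```

Here the pre-softmax scores are 2 and 0, so the first relation type receives the larger probability; the predicted relation is the index of the largest entry in `probs`.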
To address the problem that word masking in the conventional BERT masking strategy splits apart the information of semantically related words, a segment masking approach is adopted: a masking length is randomly selected according to a geometric distribution, a starting position is then randomly selected according to a uniform distribution, and finally a run of characters of that length in the sentence is masked. The invention also removes text noise through syntactic relation analysis. To address blurred entity regions in the text, each extracted entity is taken as a semantic head word; text within a limited number of steps that has a dependency relation with the entity is kept, and unnecessary text noise caused by the diffuse distribution of entities is removed. This resolves the blurred-entity-region problem and improves the accuracy of entity relation analysis. The entity pairs are then added to the model, and features are extracted to predict the relation between the entities. To prevent overfitting, the entity pair whose relation is to be predicted is masked in the sentence with [MASK]. Compared with existing relation extraction approaches, the method yields a clear improvement.
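Keeping only the text within a limited number of dependency steps of an entity can be sketched as a breadth-first search over the dependency tree. The parse itself is taken as given: the `heads` array, the example sentence and the step limit are all illustrative assumptions, and the text does not say which parser or step limit is used.

```python
from collections import deque
from typing import Dict, List


def keep_near_entity(
    tokens: List[str],
    heads: List[int],
    entity_indices: List[int],
    max_steps: int = 2,
) -> List[str]:
    """Keep tokens within max_steps dependency edges of any entity token.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    """
    # Treat dependency edges as undirected for distance purposes.
    neighbors: List[List[int]] = [[] for _ in tokens]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbors[child].append(head)
            neighbors[head].append(child)

    distance: Dict[int, int] = {i: 0 for i in entity_indices}
    queue = deque(entity_indices)
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand past the step limit
        for nxt in neighbors[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    # Preserve the original word order of the kept tokens.
    return [tok for i, tok in enumerate(tokens) if i in distance]


tokens = ["honestly", "the", "screen", "cracked", "after", "one", "week", "lol"]
heads = [3, 2, 3, -1, 6, 6, 3, 3]  # hypothetical parse; "cracked" is the root
kept = keep_near_entity(tokens, heads, entity_indices=[2], max_steps=1)
```

With "screen" as the entity and one step allowed, only its determiner and its governing verb survive; raising `max_steps` widens the retained context.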
In summary, the technical solution of the invention provides a semantic relation extraction method for extracting information from unstructured text, based on deep-learning natural language processing. The BiGRU-CRF model first extracts the entities contained in a text, such as product entities, amounts, times, places and organizations; the BERT model then takes the text together with the extracted entities and predicts the relations between the entities, relation connections are established between related entities, and semantic analysis of the text is completed along those connections. The result is an automatic, high-precision information extraction method for the semantic relation analysis of unstructured documents. The invention is a natural language analysis process that combines natural language processing with a deep-learning pre-trained model and was arrived at through practice, exploration and research; it predicts well, is computationally efficient, and is well targeted. In engineering practice, compared with the rule-based extraction and general-purpose techniques commonly used in text data mining projects, the method achieves higher accuracy and faster processing. The invention further has the following three advantages:
1) An entity library of e-commerce comment data is constructed, and entities are replaced at random with entities of the same type from the library. For general entity types such as time and amount, random generators covering different ways of writing numbers are built, and the times, amounts and similar values in the original labeled text data are randomly regenerated and replaced, so that the model learns the language knowledge of the context better.
2) An incremental learning approach is adopted: a model trained on part of the labeled data predicts another batch of unlabeled text data, predictions whose confidence is higher than a threshold are adopted directly as labels, and low-confidence data is re-labeled manually. This greatly improves the efficiency of data labeling and reduces labor cost.
3) Because entity regions in e-commerce comment text are blurred, irrelevant information in the text is removed as noise through syntactic relation analysis. Each extracted entity is taken as a semantic head word, noise is removed according to the syntactic relations, and text within a limited number of steps that has a dependency relation with the entity is kept. This shortens the analyzed text as far as possible while preserving the original context structure, and removes unnecessary text noise caused by the diffuse distribution of entities. Training speed and prediction accuracy improve substantially, and the speed disadvantage of pre-trained models in training and inference on long sentences is considerably eased.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (5)

1. A named entity identification method based on BERT and BiGRU-CRF, characterized by comprising the following steps:
obtaining comment text data of the e-commerce industry through a web crawler, and labeling the text data;
preprocessing the labeled text data, constructing a training data set and a verification data set, and training a BiGRU-CRF model and a BERT model on the training data set and the verification data set;
training word-vector representations of sentences with a BERT pre-trained language model, in which the conventional BERT training task is modified so that fragment masking replaces conventional word masking;
further extracting semantic information about the current word and its context with a BiGRU model;
and extracting, with a CRF algorithm, the optimal labels consistent with the context.
2. The named entity identification method based on BERT and BiGRU-CRF as claimed in claim 1, wherein labeling the text data comprises the following steps:
training a model, in an incremental learning manner, on the part of the text data that has already been labeled, and using the trained model to predict labels for the remaining unlabeled text data;
and directly adopting predictions whose confidence is higher than a preset threshold as the labels of the text data, and manually labeling the text data whose confidence is lower than the preset threshold.
3. The named entity identification method based on BERT and BiGRU-CRF as claimed in claim 2, wherein using fragment masking in place of conventional word masking comprises the following steps:
randomly selecting a masking length according to a geometric distribution, then randomly selecting a starting position according to a uniform distribution, and finally masking a run of characters of that length in the sentence.
4. The named entity identification method based on BERT and BiGRU-CRF as claimed in claim 3, wherein extracting semantic information about the current word and its context with the BiGRU model includes extracting the entities contained in the text data to be predicted with the trained BiGRU-CRF model, as follows:
modeling the text data sequence in both the forward and backward directions with a bidirectional GRU model;
and scoring the entire prediction path with a conditional random field (CRF), which constrains the relations between label results, and extracting the entities contained in the text data.
5. The named entity identification method based on BERT and BiGRU-CRF as claimed in claim 4, further comprising the following steps:
if the text data is a long text, predicting the relation between the text data to be predicted and the entities with the trained BERT model, comprising:
obtaining an original BERT model, in which the [CLS] token represents the overall type features of the sentences in the article and [SEP] separates the multiple sentences of the input article;
combining the BERT input with the entities extracted upstream and encoding it with the structure {[CLS] article sentence [SEP] subject [object] [SEP]};
concatenating the subject entity vector, the sentence vector and the object entity vector, and predicting the relation type through a fully connected layer and softmax; wherein a = [a1, a2, ..., an] denotes the subject entity vector and b = [b1, b2, ..., bn] denotes the object entity vector.
CN202110725919.7A 2021-06-29 2021-06-29 Named entity identification method based on BERT and BiGRU-CRF Withdrawn CN113408288A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110725919.7A CN113408288A (en) 2021-06-29 2021-06-29 Named entity identification method based on BERT and BiGRU-CRF

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110725919.7A CN113408288A (en) 2021-06-29 2021-06-29 Named entity identification method based on BERT and BiGRU-CRF

Publications (1)

Publication Number Publication Date
CN113408288A true CN113408288A (en) 2021-09-17

Family

ID=77680045

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110725919.7A Withdrawn CN113408288A (en) 2021-06-29 2021-06-29 Named entity identification method based on BERT and BiGRU-CRF

Country Status (1)

Country Link
CN (1) CN113408288A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113935310A (en) * 2021-09-22 2022-01-14 三一重机有限公司 Customer demand information extraction method and device
CN114510943A (en) * 2022-02-18 2022-05-17 北京大学 Incremental named entity identification method based on pseudo sample playback
CN114510948A (en) * 2021-11-22 2022-05-17 北京中科凡语科技有限公司 Machine translation detection method and device, electronic equipment and readable storage medium
CN114861600A (en) * 2022-07-07 2022-08-05 之江实验室 NER-oriented Chinese clinical text data enhancement method and device
CN115221882A (en) * 2022-07-28 2022-10-21 平安科技(深圳)有限公司 Named entity identification method, device, equipment and medium
CN115587594A (en) * 2022-09-20 2023-01-10 广东财经大学 Network security unstructured text data extraction model training method and system
CN117010390A (en) * 2023-07-04 2023-11-07 北大荒信息有限公司 Company entity identification method, device, equipment and medium based on bidding information

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113935310A (en) * 2021-09-22 2022-01-14 三一重机有限公司 Customer demand information extraction method and device
CN114510948A (en) * 2021-11-22 2022-05-17 北京中科凡语科技有限公司 Machine translation detection method and device, electronic equipment and readable storage medium
CN114510943A (en) * 2022-02-18 2022-05-17 北京大学 Incremental named entity identification method based on pseudo sample playback
CN114510943B (en) * 2022-02-18 2024-05-28 北京大学 Incremental named entity recognition method based on pseudo sample replay
CN114861600A (en) * 2022-07-07 2022-08-05 之江实验室 NER-oriented Chinese clinical text data enhancement method and device
CN114861600B (en) * 2022-07-07 2022-12-13 之江实验室 NER-oriented Chinese clinical text data enhancement method and device
US11972214B2 (en) 2022-07-07 2024-04-30 Zhejiang Lab Method and apparatus of NER-oriented chinese clinical text data augmentation
CN115221882A (en) * 2022-07-28 2022-10-21 平安科技(深圳)有限公司 Named entity identification method, device, equipment and medium
CN115221882B (en) * 2022-07-28 2023-06-20 平安科技(深圳)有限公司 Named entity identification method, device, equipment and medium
CN115587594A (en) * 2022-09-20 2023-01-10 广东财经大学 Network security unstructured text data extraction model training method and system
CN115587594B (en) * 2022-09-20 2023-06-30 广东财经大学 Unstructured text data extraction model training method and system for network security
CN117010390A (en) * 2023-07-04 2023-11-07 北大荒信息有限公司 Company entity identification method, device, equipment and medium based on bidding information

Similar Documents

Publication Publication Date Title
CN113408288A (en) Named entity identification method based on BERT and BiGRU-CRF
CN109766524B (en) Method and system for extracting combined purchasing recombination type notice information
CN112417888A (en) Method for analyzing sparse semantic relationship by combining BilSTM-CRF algorithm and R-BERT algorithm
CN113868432B (en) Automatic knowledge graph construction method and system for iron and steel manufacturing enterprises
CN113191148B (en) Rail transit entity identification method based on semi-supervised learning and clustering
Zhang et al. Aspect-based sentiment analysis for user reviews
CN111259153B (en) Attribute-level emotion analysis method of complete attention mechanism
CN113204967B (en) Resume named entity identification method and system
CN110889786A (en) Legal action insured advocate security use judging service method based on LSTM technology
CN112861541A (en) Commodity comment sentiment analysis method based on multi-feature fusion
CN113094502A (en) Multi-granularity takeaway user comment sentiment analysis method
CN112818698A (en) Fine-grained user comment sentiment analysis method based on dual-channel model
CN115391570A (en) Method and device for constructing emotion knowledge graph based on aspects
LU506520B1 (en) A sentiment analysis method based on multimodal review data
Zhang et al. Efficiency improvement of function point-based software size estimation with deep learning model
Caciularu et al. Cross-document language modeling
CN115438709A (en) Code similarity detection method based on code attribute graph
CN110750646A (en) Attribute description extracting method for hotel comment text
CN114388108A (en) User feedback analysis method based on multi-task learning
Velmurugan et al. Mining implicit and explicit rules for customer data using natural language processing and apriori algorithm
CN116805010A (en) Multi-data chain integration and fusion knowledge graph construction method oriented to equipment manufacturing
CN116186241A (en) Event element extraction method and device based on semantic analysis and prompt learning, electronic equipment and storage medium
CN115098637A (en) Text semantic matching method and system based on Chinese character shape-pronunciation-meaning multi-element knowledge
CN114297408A (en) Relation triple extraction method based on cascade binary labeling framework
CN112800762A (en) Element content extraction method for processing text with format style

Legal Events

Date Code Title Description
PB01 Publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20210917