CN114154504A - Chinese named entity recognition algorithm based on multi-information enhancement - Google Patents

Chinese named entity recognition algorithm based on multi-information enhancement Download PDF

Info

Publication number
CN114154504A
CN114154504A (application CN202111472663.XA)
Authority
CN
China
Prior art keywords
information
layer
entity
speech
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111472663.XA
Other languages
Chinese (zh)
Inventor
黄胜
廖星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202111472663.XA priority Critical patent/CN114154504A/en
Publication of CN114154504A publication Critical patent/CN114154504A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Character Discrimination (AREA)
  • Machine Translation (AREA)

Abstract

Chinese named entity recognition methods that combine character and word information currently achieve good results, and methods that further enhance the input with glyph information have improved performance further. However, two problems remain unsolved: insufficient semantic information in the input, and recognition errors caused by nested entities. To address these issues, the MIEM (Multi-Information Enhancement Method) model is proposed herein. MIEM first enriches the input features by adding part-of-speech information to the embedding layer, and adds a nested-entity position matrix, encoded from a binary-tree structure, to the positional encoding. The embedded information is then encoded with a self-attention mechanism, and a designed MD layer (More Details layer) replaces the conventional residual structure to widen the model's receptive field and capture additional information. This design enriches the input representation, strengthens entity-boundary information, and mitigates the problems of unclear entity boundaries and of nested entities degrading recognition accuracy. Finally, a neural network built on the enhanced embedding and positional-encoding information is constructed to resolve the recognition errors caused by nested entities in Chinese named entity recognition.

Description

Chinese named entity recognition algorithm based on multi-information enhancement
Technical Field
The invention relates to the fields of deep learning and natural language processing, and in particular to a named entity recognition method based on multi-information enhancement.
Background
As artificial intelligence develops, natural language processing (NLP) is applied ever more widely. Named Entity Recognition (NER) is a fundamental NLP technology whose accuracy determines the quality of many downstream tasks, such as machine translation, question answering, search matching, and semantic analysis. The entities recognized by NER fall into 3 broad categories (entity, time, and numeric), 7 subcategories (person name, place name, organization name, time, date, currency, and percentage), plus proper nouns. NER is essentially a sequence-labeling problem whose goal is to accurately identify entities in text and assign each to a class; at present, however, recognition accuracy on social media such as microblogs remains low.
On the one hand, Chinese characters have more complex semantics than English, and the same word can be expressed in more varied ways. English words carry some natural part-of-speech information: "action", "education", and "organization" share the suffix "-tion" and are all nouns; likewise "adjustable", "respectable", and "reasonable" share the suffix "-able" and are all adjectives. Many similar patterns exist, so English words carry part-of-speech cues that Chinese words lack. On the other hand, entity nesting is common in ordinary usage: among the entities appearing in a text, a shorter entity may be contained inside a longer one, and many sentences contain such nested entities. For example, "American Project Management Association" is a nested entity: the whole phrase names an organization, while "America" within it is a place name. Such cases make entity recognition difficult, and the existence of nested entities is an important factor limiting recognition accuracy.
In natural language processing, named entity recognition was first performed on segmented words, with the main drawback that segmentation errors propagate; later, character-based methods avoided this problem but lost the underlying word information. Since both purely character-based and purely segmentation-based Chinese NER have shortcomings, Zhang and Yang (Yue Zhang, Jie Yang. Chinese NER Using Lattice LSTM [C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL, 2018: 1554-1564.) combined character and word information for the Chinese NER task. More recently, Wu et al. (Shuang Wu, Xiaoning Song, Zhenhua Feng. MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, ACL, 2021: 1529-1539.) added character radical information to the input at the embedding layer, obtaining a certain improvement. Despite this research trend, the accuracy of Chinese NER still needs improvement in some domains, such as social media.
In summary, given the problems of nested entities and low accuracy in deep-learning-based NER networks, a named entity recognition method based on multi-information enhancement is designed. By enhancing both the embedded information and the position information, the model learns richer input features as well as the structure of nested entities, improving the accuracy of Chinese named entity recognition.
Disclosure of Invention
The invention aims to design a multi-information-enhanced Chinese named entity recognition algorithm that accurately extracts entities from text, and, based on this method, to fine-tune a pre-trained model for the specific target domain of named entity recognition so as to achieve the best effect.
The invention provides a Chinese named entity recognition method based on multi-information enhancement. An embedding module processes the input sentence: part-of-speech information is added to the input, transferred from word level to character level, and fused in the embedding layer with character and word information to form the input features. A nested-entity position matrix encoded from a binary-tree structure is fed, together with the input features, into a self-attention mechanism that models the embedded information; a feedforward neural network with the proposed novel residual structure then captures finer details of the attention output to obtain a deep representation. Finally, a conditional random field learns the dependencies between labels and produces the final entity predictions.
The invention mainly comprises two parts: an embedded information enhancement method and a position coded information enhancement method.
The method specifically comprises the following steps:
1. Acquire an input sentence, perform part-of-speech tagging on it, transfer the tags to character level, and fuse character, word, and part-of-speech information into the final input features;
2. Construct the multi-information-enhanced Chinese named entity recognition network, whose core components are part-of-speech information enhancement and nested-entity matrix information enhancement;
3. Pre-train the network on open-source datasets;
4. Fine-tune the pre-trained network, via transfer learning, on a small self-built labeled Chinese NER dataset;
5. Run the fine-tuned network on the prepared test set to obtain the final predicted entities.
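The part-of-speech transfer in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the (word, tag) pairs are hypothetical tagger output standing in for spaCy, and `pos_to_char_level` is an invented helper name.

```python
# Spread word-level part-of-speech tags down to character level before
# fusing with character and word embeddings (step 1 above).

def pos_to_char_level(tagged_words):
    """Give every character of a word that word's POS tag."""
    chars, char_pos = [], []
    for word, pos in tagged_words:
        for ch in word:
            chars.append(ch)
            char_pos.append(pos)
    return chars, char_pos

# "重庆市" / "长江" / "大桥" with assumed (illustrative) tags
tagged = [("重庆市", "PROPN"), ("长江", "PROPN"), ("大桥", "NOUN")]
chars, char_pos = pos_to_char_level(tagged)
# chars    -> ['重', '庆', '市', '长', '江', '大', '桥']
# char_pos -> ['PROPN'] * 5 + ['NOUN'] * 2
```

Each character now carries the tag of the word it came from, so character, word, and part-of-speech features can be fused into one input representation.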
The multi-information-enhanced Chinese named entity recognition network in the above steps is the core of the invention; it provides a dual enhancement of embedded information and positional-encoding information.
In the embedding layer, the input is first preprocessed: words corresponding to the characters are matched, and part-of-speech information is added with the natural-language-processing library spaCy. The input lemmas are then matched against pre-trained character and word vectors, and the resulting embeddings, after passing through a linear layer, serve as the model input.
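The lemma-matching step above, including random initialization for tokens without a pre-trained vector, can be sketched as follows. The tiny vector table is hypothetical; a real system would load pre-trained Chinese character and word vectors instead.

```python
import numpy as np

# Embedding lookup with random initialization for out-of-vocabulary tokens,
# as described for the embedding layer. The table below is a stand-in for a
# pre-trained vocabulary.

rng = np.random.default_rng(42)
dim = 4
pretrained = {
    "重": rng.normal(size=dim),
    "庆": rng.normal(size=dim),
    "重庆": rng.normal(size=dim),
}

def lookup(token):
    """Return the pre-trained vector, or a random init for OOV tokens."""
    vec = pretrained.get(token)
    return vec if vec is not None else rng.normal(size=dim)

# "市长" has no pre-trained vector here, so it is randomly initialized.
emb = np.stack([lookup(t) for t in ["重", "庆", "重庆", "市长"]])
assert emb.shape == (4, dim)
```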
The attention module adopts the Transformer-XL attention calculation. For the positional-encoding part, the binary-tree-based nested-entity matrix encoding is combined with the positional encoding of the FLAT network, so that nested-entity information is preserved without losing the relations between the other lemmas. The attention scores are computed as:

A_{ij} = Q_i K_j^T + Q_i R_{ij}^T + u K_j^T + v R_{ij}^T

Att(A, V) = softmax(A) V

where i denotes the i-th lemma and ij the relation between the i-th and j-th lemmas; Q, K, and V are different linear transformations of the input matrix, which contains the character, word, and part-of-speech features fused in the embedding layer; u and v are learnable parameters. The positional encodings R_Binary and R_FLAT in the attention mechanism model the positions of the lemmas in the input sentence (the encoding scheme of R_Binary is shown in the figure), and the complete positional encoding is realized by concatenating the two:

R_{ij} = [R_Binary_{ij} ; R_FLAT_{ij}]
In the feedforward neural network module, the learned distributed feature representation is mapped to the label space by a linear layer. To learn finer-grained features, the proposed MD layer replaces the original residual structure, captures detail features, and outputs a feature matrix. For the overall output structure, the outputs of two parallel networks are summed as the input to the CRF, reducing error and improving the robustness of the network.
Due to the adoption of the technical scheme, the invention has the following advantages:
1. english words have some natural part-of-speech information, such as the word suffix "-tion", "-able" indicates nouns and adjective parts-of-speech. Compared with English, Chinese characters have more complex semantics, and the expression of the same word has more diversity, but has no such characteristics. Then, if the part-of-speech information is added for the input of the Chinese named entity recognition, the model can learn more abundant information, and can learn more semantic information by adding the part-of-speech information, so that the performance of the entity recognition model is improved. Therefore, the invention uses the natural language processing tool spaCy to label part of speech information, adds part of speech information to the embedding layer, and transfers label information of words to character information to better endow semantic features of input information, and the obtained form is shown as the embedding layer in the drawing. For the embedding mode of input, the invention uses the pre-trained word list to match the input character and word vector, and for the condition that no character or word vector exists, the invention carries out random initialization processing. The original character-based representation is represented by the word matched with the character, and finally the part-of-speech information obtained by the natural language processing tool is added to obtain the total input, wherein the input comprises the character information, the matched word information and the part-of-speech information.
2. The invention provides a positional encoding that carries entity-nesting information, combining the relative positions of lemmas with the positional relations between nested entities, to counter the effect of nested entities on Chinese NER accuracy. In the self-attention module, the position information is fused with the input information, so the model can actively attend to both the semantic and the positional relations among lemmas.
3. For the residual part of the feedforward neural network, to obtain a larger receptive field, the invention proposes a novel residual structure, the MD Layer (More Details Layer), to capture more hidden information; its position in the model is shown in the drawings. The drawing shows the MD layer implementation: the input features are first amplified N-fold by a linear layer, the amplified features are then sliced, and the slices are summed to obtain the final output, keeping the dimensionality unchanged.
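The amplify-slice-sum procedure of the MD layer can be sketched numerically as follows. This is an illustrative sketch, not the patent's code: the weight matrix is a random placeholder where a trained model would have learned parameters, and the patent reports N = 2 as the best setting.

```python
import numpy as np

# MD (More Details) layer sketch: amplify features N-fold through a linear
# layer, slice the result into N equal chunks, and sum the chunks so the
# output dimension matches the input.

def md_layer(x, W, n):
    """x: (seq_len, d); W: (d, n*d) amplifying linear map; returns (seq_len, d)."""
    amplified = x @ W                         # (seq_len, n*d)
    chunks = np.split(amplified, n, axis=-1)  # n slices of shape (seq_len, d)
    return np.sum(chunks, axis=0)             # sum of slices: dimension preserved

rng = np.random.default_rng(0)
d, n = 8, 2
x = rng.normal(size=(5, d))
W = rng.normal(size=(d, n * d))
out = md_layer(x, W, n)
assert out.shape == x.shape  # (5, 8): dimensionality unchanged
```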
Drawings
To make the purpose, technical scheme, and beneficial effects of the invention clearer, the following drawings are provided:
FIG. 1 is a flow chart of the Chinese named entity recognition method based on multi-information enhancement according to the present invention;
FIG. 2 is a schematic diagram of a binary tree based position encoding structure according to the present invention;
FIG. 3 is a schematic diagram of a binary tree structure of a matrix form of position information encoding according to the present invention;
FIG. 4 is a schematic diagram of the attention-mechanism calculation module of the present invention;
fig. 5 is a schematic diagram of an MD layer implementation of the present invention.
Detailed description of the preferred embodiments
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
The invention provides a Chinese named entity recognition algorithm based on multi-information enhancement, which specifically comprises the following steps as shown in figure 1:
Step 1: input a sentence and perform two simple preprocessing operations on it, word matching and part-of-speech matching;
Step 2: construct a neural network fusing part-of-speech information and nested-entity positional encoding, and feed the lemmas into the network for learning;
Step 3: train attention over the input features with the attention mechanism, so that the model automatically attends to a position when the feature appears again later;
Step 4: feed the self-attention output into a linear layer for feature learning, where features are encoded by the MD layer to capture finer details;
Step 5: feed the encoder output into a CRF (Conditional Random Field) to obtain the final predicted entities.
Detailed Description
Step 1: acquire the input sentence and preprocess it in the input preprocessing module with operations such as vocabulary matching and part-of-speech matching to enhance the input representation.
Step 2: feed the preprocessed sentence into the self-attention module, in which a binary-tree-based positional-encoding structure is constructed. A solid circle indicates that the current node forms a word with the next node of its left subtree; taking the sentence "Chongqing Yangtze River Bridge" (重庆市长江大桥) in FIG. 2 as an example, the words formed by two consecutive characters are "Chongqing" (重庆), "mayor" (市长), "Yangtze" (长江), and "bridge" (大桥), each circled with a solid oval. A dotted circle indicates that the current node forms a word with several nodes of its left subtree, namely "Chongqing City" (重庆市) and "Yangtze River Bridge" (长江大桥) in FIG. 2. The binary-tree positional encoding of FIG. 2 is represented by the matrix of FIG. 3: the dotted diagonal represents the links between the nodes of the left subtree, a downward solid arrow marks a word the current node forms with the next node of its left subtree, and a rightward solid arrow marks a word the current node forms with several left-subtree nodes. After this processing, the binary-tree entity position encoding is mapped to a matrix representation; the lemmas of the sentence are encoded accordingly, and the resulting matrix input is shown in FIG. 3. The feature-extraction module is a Transformer encoder whose positional encoding has been replaced as described.
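One plausible reading of the matrix construction above can be sketched in code. The numeric scheme (1 for a two-character word, 2 for a longer word) is an assumption for illustration; the patent fixes only the structure, not the values.

```python
# Illustrative nested-entity position matrix for FIG. 2/3: mark each matched
# word span at (start_char, end_char), distinguishing two-character words
# from longer (nesting) words.

def nested_entity_matrix(n_chars, word_spans):
    """word_spans: (start, end) inclusive character indices of matched words."""
    m = [[0] * n_chars for _ in range(n_chars)]
    for start, end in word_spans:
        m[start][end] = 1 if end - start == 1 else 2
    return m

# "重庆市长江大桥": two-char words 重庆/市长/长江/大桥, longer words 重庆市/长江大桥
spans = [(0, 1), (2, 3), (3, 4), (5, 6), (0, 2), (3, 6)]
m = nested_entity_matrix(7, spans)
assert m[0][1] == 1 and m[3][4] == 1   # two-character words
assert m[0][2] == 2 and m[3][6] == 2   # nested longer words
```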
Step 3: the Chinese named entity recognition network is built with the PyTorch framework. The position of the multi-head attention mechanism in the overall architecture is shown in FIG. 1, and its calculation diagram (Multi-Head Attention) in FIG. 4. The overall calculation is:

A_{ij} = Q_i K_j^T + Q_i R_{ij}^T + u K_j^T + v R_{ij}^T

Att(A, V) = softmax(A) V

where Q, K, and V are different linear transformations of the input vectors, u and v are learnable parameters, and the fused positional encoding R_{ij} is:

R_{ij} = [R_Binary_{ij} ; R_FLAT_{ij}]

R_FLAT_{ij} is computed as:

R_FLAT_{ij} = ReLU( W_r ( p_{d_{ij}^{hh}} ⊕ p_{d_{ij}^{ht}} ⊕ p_{d_{ij}^{th}} ⊕ p_{d_{ij}^{tt}} ) )

where h_i − h_j gives d_{ij}^{hh} and, analogously, t_i − t_j gives d_{ij}^{tt}. The sinusoidal embeddings p_d^{(2k)} and p_d^{(2k+1)} are calculated as:

p_d^{(2k)} = sin( d / 10000^{2k / d_model} )

p_d^{(2k+1)} = cos( d / 10000^{2k / d_model} )

where d_model is the model dimension and the distance d is one of:

d_{ij}^{hh} = head[i] − head[j], d_{ij}^{ht} = head[i] − tail[j], d_{ij}^{th} = tail[i] − head[j], d_{ij}^{tt} = tail[i] − tail[j]

In these formulas, hh denotes the distance from head[i] to head[j] and tt the distance from tail[i] to tail[j], where i and j index the i-th and j-th lemmas.
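The span distances and sinusoidal embeddings in step 3 can be checked numerically with the sketch below. W_r and the ReLU fusion are omitted, and the head/tail values are illustrative (two characters plus one matched word), not taken from the patent.

```python
import numpy as np

# Relative span distances between lemmas i and j, and the sinusoidal
# embedding p_d, following the formulas in step 3.

def span_distances(head, tail, i, j):
    """Return (d_hh, d_ht, d_th, d_tt) for lemmas i and j."""
    return (head[i] - head[j], head[i] - tail[j],
            tail[i] - head[j], tail[i] - tail[j])

def sinusoid(d, d_model):
    """p_d^(2k) = sin(d / 10000^(2k/d_model)), p_d^(2k+1) = cos(same)."""
    k = np.arange(d_model // 2)
    angle = d / np.power(10000.0, 2 * k / d_model)
    p = np.empty(d_model)
    p[0::2] = np.sin(angle)
    p[1::2] = np.cos(angle)
    return p

head = [0, 0, 2]   # lemma 1 is a word spanning characters 0..2
tail = [0, 2, 2]
d_hh, d_ht, d_th, d_tt = span_distances(head, tail, 1, 2)
assert (d_hh, d_ht, d_th, d_tt) == (-2, -2, 0, 0)
p = sinusoid(d_tt, 16)
assert p.shape == (16,) and p[1] == 1.0  # cos(0) = 1 when the distance is 0
```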
Step 4: in the feedforward part of the network, to obtain a larger receptive field, the invention proposes a novel residual structure, the MD Layer (More Details Layer), to capture more hidden information; its position in the model is shown in FIG. 1. FIG. 5 shows the MD layer implementation: the input features are first amplified N-fold by a linear layer, the amplified features are then sliced, and the slices are summed to produce the final output, keeping the dimensionality unchanged. For the present Chinese NER task the value of N was determined experimentally, with N = 2 performing best; in addition, to prevent overfitting during training, layer normalization (LayerNorm) is added to the feedforward part.
Step 5: feed the encoder output into the CRF layer, whose constraint learning over the label information yields the final predicted entities.
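The CRF layer's constrained decoding in step 5 can be illustrated with a minimal Viterbi decode over emission and transition scores. All scores here are toy values chosen for the example; a trained CRF would supply learned transitions (for instance, heavily penalizing an I- tag that does not follow its B- tag).

```python
import numpy as np

# Minimal Viterbi decoding over per-step label scores plus a transition
# matrix, standing in for the CRF layer's prediction.

def viterbi(emissions, transitions):
    """emissions: (T, L) per-step label scores; transitions: (L, L) from->to."""
    T, L = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        total = score[:, None] + transitions + emissions[t]  # (L, L)
        back[t] = np.argmax(total, axis=0)   # best previous label per label
        score = np.max(total, axis=0)
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

labels = ["O", "B-LOC", "I-LOC"]
emissions = np.array([[0.1, 2.0, 0.0],
                      [0.2, 0.0, 1.5],
                      [1.0, 0.3, 0.2]])
transitions = np.array([[0.0, 0.0, -10.0],   # O -> I-LOC strongly penalized
                        [0.0, -1.0, 1.0],
                        [0.5, 0.0, 0.0]])
decoded = [labels[i] for i in viterbi(emissions, transitions)]
assert decoded == ["B-LOC", "I-LOC", "O"]
```

The transition penalty is what lets the CRF enforce label consistency that a per-token classifier cannot.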
Step 6: train the constructed Chinese NER network. Using transfer learning, the network is first pre-trained on open-source data from related domains and then fine-tuned on a self-built labeled Chinese entity recognition dataset.

Claims (6)

1. A Chinese named entity recognition method based on multi-information enhancement, characterized in that text content is processed to obtain the required proper nouns, comprising the following steps:
step 1, collecting text sentences which need to be identified by a user, adding part-of-speech labels to input words through a natural language processing tool spaCy, transferring part-of-speech information of the words to a character level, and fusing characters, words and part-of-speech information to serve as embedded information;
step 2, constructing a Chinese named entity recognition network based on multi-information enhancement, which mainly comprises a part-of-speech information embedding module, a position information coding module of a nested entity matrix and a novel feedforward neural network module based on a detail capturing layer;
and 3, carrying out named entity recognition on the input sentence on the trained neural network to obtain the required entity type.
2. The method for Chinese named entity recognition based on multi-information enhancement as claimed in claim 1, wherein the constructed network comprises an information embedding module, a self-attention module based on nested-entity-matrix position information, a novel feedforward neural network module, and a CRF label-constraint module. The embedding module obtains embedded vector representations of characters and words by matching a pre-trained vocabulary, then adds part-of-speech tagging information and transfers it to the character-level representation; out-of-vocabulary words are randomly initialized. The self-attention module feeds the embedded information and the nested-entity-matrix position information into the self-attention mechanism to obtain the final feature input, where the position-enhancement part adopts a binary-tree-based embedded entity position encoding fused with the positional encoding of the FLAT network. In the feedforward module, the proposed More Details Layer replaces the common residual layer to capture deeper feature information and relearn the features produced by the self-attention mechanism. The CRF (Conditional Random Field) label-constraint module models the dependencies and constraints within the label sequence, learns the relations between labels, and outputs the final prediction.
3. The method as claimed in claim 2, wherein part-of-speech information is added in the embedding layer through spaCy, transferred to the characters, and fused there with the character and word information, providing richer features for the network model.
4. The method for Chinese named entity recognition based on multi-information enhancement as claimed in claim 2, wherein the self-attention module encodes the embedded information with a multi-head attention mechanism and learns long- and short-range dependencies between the input lemmas, the attention being calculated as:

A_{ij} = Q_i K_j^T + Q_i R_{ij}^T + u K_j^T + v R_{ij}^T

Att(A, V) = softmax(A) V

wherein i denotes the i-th lemma and ij the relation between the i-th and j-th lemmas; Q, K, and V are different linear transformations of the input matrix; u and v are learnable parameters; the positional encodings R_Binary and R_FLAT model the positions of the lemmas in the input sentence, and the complete positional encoding is their concatenation:

R_{ij} = [R_Binary_{ij} ; R_FLAT_{ij}]
5. The method as claimed in claim 2, wherein the feedforward neural network module maps the attention output through a linear layer, and the proposed More Details Layer replaces the common residual structure to obtain more detailed feature information.
6. The method for Chinese named entity recognition based on multi-information enhancement as claimed in claim 1, wherein the recognition operation mainly comprises: part-of-speech tagging the input sentence, transferring the tags to the character-level representation, fusing character, word, and part-of-speech information as the embedding-layer output, learning from the embedding information and the nested-entity matrix information in the self-attention mechanism, performing feature mapping through the improved novel feedforward network to obtain an output sequence, and finally feeding the output sequence into the CRF layer for label-constraint learning to obtain the named entities.
CN202111472663.XA 2021-12-06 2021-12-06 Chinese named entity recognition algorithm based on multi-information enhancement Pending CN114154504A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111472663.XA CN114154504A (en) 2021-12-06 2021-12-06 Chinese named entity recognition algorithm based on multi-information enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111472663.XA CN114154504A (en) 2021-12-06 2021-12-06 Chinese named entity recognition algorithm based on multi-information enhancement

Publications (1)

Publication Number Publication Date
CN114154504A true CN114154504A (en) 2022-03-08

Family

ID=80452741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111472663.XA Pending CN114154504A (en) 2021-12-06 2021-12-06 Chinese named entity recognition algorithm based on multi-information enhancement

Country Status (1)

Country Link
CN (1) CN114154504A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115329766A (en) * 2022-08-23 2022-11-11 中国人民解放军国防科技大学 Named entity identification method based on dynamic word information fusion
CN115688777A (en) * 2022-09-28 2023-02-03 北京邮电大学 Named entity recognition system for nested and discontinuous entities of Chinese financial text
CN115688777B (en) * 2022-09-28 2023-05-05 北京邮电大学 Named entity recognition system for nested and discontinuous entities of Chinese financial text

Similar Documents

Publication Publication Date Title
CN111310471B (en) Travel named entity identification method based on BBLC model
CN112989834B (en) Named entity identification method and system based on flat grid enhanced linear converter
CN113128229B (en) Chinese entity relation joint extraction method
CN111738004A (en) Training method of named entity recognition model and named entity recognition method
CN110866401A (en) Chinese electronic medical record named entity identification method and system based on attention mechanism
CN117151220B (en) Entity link and relationship based extraction industry knowledge base system and method
CN111309918A (en) Multi-label text classification method based on label relevance
CN114154504A (en) Chinese named entity recognition algorithm based on multi-information enhancement
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN113657123A (en) Mongolian aspect level emotion analysis method based on target template guidance and relation head coding
CN114153973A (en) Mongolian multi-mode emotion analysis method based on T-M BERT pre-training model
CN112818698A (en) Fine-grained user comment sentiment analysis method based on dual-channel model
CN115169349A (en) Chinese electronic resume named entity recognition method based on ALBERT
CN116340513A (en) Multi-label emotion classification method and system based on label and text interaction
CN115545033A (en) Chinese field text named entity recognition method fusing vocabulary category representation
CN115238693A (en) Chinese named entity recognition method based on multi-word segmentation and multi-layer bidirectional long-short term memory
Park et al. Natural language generation using dependency tree decoding for spoken dialog systems
CN111145914A (en) Method and device for determining lung cancer clinical disease library text entity
CN116522165A (en) Public opinion text matching system and method based on twin structure
CN117390131A (en) Text emotion classification method for multiple fields
CN112989839A (en) Keyword feature-based intent recognition method and system embedded in language model
CN115906854A (en) Multi-level confrontation-based cross-language named entity recognition model training method
CN115759102A (en) Chinese poetry wine culture named entity recognition method
CN113434698B (en) Relation extraction model establishing method based on full-hierarchy attention and application thereof
CN112733526B (en) Extraction method for automatically identifying tax collection object in financial file

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination