CN111178074B - Chinese named entity recognition method based on deep learning - Google Patents

Chinese named entity recognition method based on deep learning

Info

Publication number
CN111178074B
CN111178074B (application CN201911271419.XA)
Authority
CN
China
Prior art keywords
training
layer
model
vectors
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911271419.XA
Other languages
Chinese (zh)
Other versions
CN111178074A (en)
Inventor
罗韬
冯爽
徐天一
赵满坤
于健
喻梅
于瑞国
李雪威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201911271419.XA priority Critical patent/CN111178074B/en
Publication of CN111178074A publication Critical patent/CN111178074A/en
Application granted granted Critical
Publication of CN111178074B publication Critical patent/CN111178074B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a Chinese named entity recognition method based on deep learning, comprising the following steps: 1) embedding word-position-information mixed vectors for the data text; 2) inputting the vectors obtained in step 1) into a Bi-LSTM layer for vector coding, capturing long-range dependencies across the time sequence; 3) inputting the vectors output by the Bi-LSTM layer into a self-attention layer, explicitly learning the dependency relationship between any two characters in a sentence and capturing the internal structure information of the sentence; 4) inputting the output vector sequence into a CRF layer and performing label decoding. The invention is scientifically and reasonably designed, runs on multiple data sets, has strong applicability and high accuracy, and can be applied as a named entity recognition model for multi-field texts.

Description

Chinese named entity recognition method based on deep learning
Technical Field
The invention belongs to the technical fields of natural language processing, knowledge graphs and sequence labeling, relates to deep learning and sequence labeling technology, and in particular relates to a Chinese named entity recognition method based on deep learning.
Background
Named entity recognition belongs to the field of sequence labeling and is a basic task of natural language processing. Its main purpose is to find entities with specific meanings in text, including person names, place names, organization names and certain proper nouns. The recognition task mainly comprises two parts: identifying entity boundaries and determining entity categories (person names, place names, organization names, etc.). Named entities are the basic elements of text and the basic units for understanding the content of an article. Moreover, named entity recognition serves as an underlying basic task for downstream text data processing such as knowledge graph construction, where its accuracy directly influences the final effect of the knowledge graph. A knowledge graph is built on the relations between entities; if entity extraction is wrong, the subsequent determination of entity relations cannot be performed. The same is true of automatic summarization and question-answering systems, where the relevant named entities must be found when sentences are semantically analyzed. Thus, named entity recognition is extremely critical and important for text data processing, particularly natural language processing.
Currently, the commonly used named entity recognition methods include the CRF model, the LSTM model, and the model combining LSTM with CRF, the latter being the most popular at present. Compared with a single model, the hybrid LSTM-CRF model combines the advantages of both: it can memorize dependencies between long-distance sequence positions while exploiting the labeling advantages of the CRF. It is therefore widely applied in the field of named entity recognition, and the present method optimizes and improves on this basis. Zhang et al. studied a new dynamic meta-embedding method in 2019 and applied it to the Chinese NER task. The method creates dynamic, data-specific and task-specific meta-embeddings, so that the meta-embeddings of the same character differ across sentence sequences. Experiments on the MSRA and LiteratureNER datasets validated the model, and the best published results at the time were obtained on LiteratureNER.
Although research in recent years has proposed further methods, these generally do not perform well across multiple data sets; at the same time, there is no universal named entity recognition model that is highly adaptable, accurate, and applicable to texts from multiple fields.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a Chinese named entity recognition method based on deep learning that runs on multiple data sets, has strong applicability and high accuracy, and can be applied as a named entity recognition model for multi-field texts.
The invention solves the technical problem through the following technical solution:
A Chinese named entity recognition method based on deep learning, comprising the following steps:
1) Embedding word position information mixed vectors into the data text;
2) Inputting the vectors obtained in step 1) into a Bi-LSTM layer for vector coding, capturing long-range dependencies across the time sequence;
3) Inputting the vectors output by the Bi-LSTM layer into the self-attention layer, explicitly learning the dependency relationship between any two characters in the sentence and capturing the internal structure information of the sentence;
4) Inputting the output vector sequence into the CRF layer and performing label decoding.
Moreover, the specific operation of step 1) is as follows:
a. establishing a dictionary from the training data set and obtaining a one-hot vector for each word, the length of which is the dictionary length V; mapping the one-hot vectors into low-dimensional dense vectors through a look-up layer, using a pre-trained single-word position vector matrix;
b. concatenating three vectors (the word vector, the character vector from word segmentation, and the character position vector) and using the result as the input of the network model, giving a Chinese token sequence

X = (x_1, x_2, x_3, ..., x_n)

Whether a token x exists in the word lookup table and in the character lookup table is checked. When x exists in both tables, that is, the token consists of a single character, the combination of the two embedding vectors is taken as the distributed representation of the token; otherwise, only the embedding from one of the lookup tables is used as the output of the embedding layer, and the position vector is initialized to the word vector of the word in which the character is located.
Moreover, the specific operation of step 2) is as follows: the word-character mixed vector of each word in the input sequence is fed into the Bi-LSTM layer as one time step of the network, and global features are extracted. Through the bidirectional LSTM network, the hidden output sequence of the forward LSTM, (h→_1, h→_2, ..., h→_n), and the hidden output sequence of the reverse LSTM, (h←_1, h←_2, ..., h←_n), are obtained; the two groups of hidden sequences are spliced position-wise to obtain the complete hidden sequence h_t = [h→_t; h←_t], which is taken as the input of the next layer.
Moreover, the specific operation of step 3) is as follows: for the input at each time step, H = (h_1, h_2, ..., h_n) denotes the output of the Bi-LSTM hidden layer. Following the multi-head attention mechanism, the input vectors are linearly transformed and the dot product is scaled; the attention formula is

Attention(Q, K, V) = softmax(QK^T / sqrt(d)) V

wherein: Q is the query matrix; K is the key matrix; V is the value matrix; d is the dimension of the Bi-LSTM hidden unit and is numerically equal to 2d_h.

Setting Q = K = V = H, the multi-head attention first projects the query, key and value H linearly using different linear projections, then the h projections perform scaled dot-product attention in parallel; finally, these attention results are concatenated and projected again to obtain a new representation.
Moreover, the specific operation of step 4) is as follows: the result is fed into the CRF layer, which contains a transfer matrix representing the transition scores between labels; the score of the label corresponding to each word in the CRF layer is composed of two parts: the emission score output by the previous layer and the transition score from the transfer matrix. Legal constraints between predicted labels are added through the transfer matrix, increasing the grammatical rationality of the labels; finally, Viterbi decoding is used to infer the label sequence with the highest score for label prediction.
The invention has the advantages and beneficial effects that:
the Chinese named entity recognition method based on deep learning can run multiple data sets, has strong applicability and high accuracy, and can be applied to named entity recognition models of multi-field texts.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a graph of the number of iterations versus model F1 values on an MSRA data set in accordance with the present invention;
FIG. 3 is a graph of the number of iterations versus model F1 values on a LiteratureNER dataset according to the present invention;
FIG. 4 is a graph of the number of iterations versus model Accuracy value on an MSRA data set in accordance with the present invention;
FIG. 5 is a graph of the number of iterations versus model Accuracy values on a LiteratureNER dataset according to the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are intended to be illustrative only and not limiting in any way.
As shown in FIG. 1, a Chinese named entity recognition method based on deep learning comprises the following steps:
1) Embedding word position information mixed vectors into the data text;
a. establishing a dictionary from the training data set and obtaining a one-hot vector for each word, the length of which is the dictionary length V; mapping the one-hot vectors into low-dimensional dense vectors through a look-up layer, using a pre-trained single-word position vector matrix;
b. concatenating three vectors (the word vector, the character vector from word segmentation, and the character position vector) and using the result as the input of the network model, giving a Chinese token sequence

X = (x_1, x_2, x_3, ..., x_n)

Whether a token x exists in the word lookup table and in the character lookup table is checked. When x exists in both tables, that is, the token consists of a single character, the combination of the two embedding vectors is taken as the distributed representation of the token; otherwise, only the embedding from one of the lookup tables is used as the output of the embedding layer, and the position vector is initialized to the word vector of the word in which the character is located (a code sketch of this step follows).
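As a minimal illustrative sketch (not fixed parameters of the claimed method), the embedding step above can be realized as follows in Python/PyTorch; all vocabulary sizes and dimensions are assumed values, and in practice the look-up weights would be loaded from the pre-trained vector matrices:

```python
import torch
import torch.nn as nn

class MixedEmbedding(nn.Module):
    """Concatenates character, word and character-position look-ups (step 1).
    All sizes below are illustrative assumptions, not values from the patent."""
    def __init__(self, char_vocab=5000, word_vocab=50000, max_pos=128,
                 char_dim=100, word_dim=100, pos_dim=20):
        super().__init__()
        # look-up layers mapping one-hot indices to low-dimensional dense vectors;
        # in practice these weights come from the pre-trained matrices
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.pos_emb = nn.Embedding(max_pos, pos_dim)  # character position within its word

    def forward(self, char_ids, word_ids, pos_ids):
        # concatenate the three vectors per token: [char; word; position]
        return torch.cat([self.char_emb(char_ids),
                          self.word_emb(word_ids),
                          self.pos_emb(pos_ids)], dim=-1)
```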
2) Inputting the vectors obtained in step 1) into the Bi-LSTM layer for vector coding, capturing long-range dependencies across the time sequence;
the word-character mixed vector of each word in the input sequence is fed into the Bi-LSTM layer as one time step of the network, and global features are extracted. Through the bidirectional LSTM network, the hidden output sequence of the forward LSTM, (h→_1, h→_2, ..., h→_n), and the hidden output sequence of the reverse LSTM, (h←_1, h←_2, ..., h←_n), are obtained; the two groups of hidden sequences are spliced position-wise to obtain the complete hidden sequence h_t = [h→_t; h←_t], which is taken as the input of the next layer (a code sketch follows);
3) Inputting the vectors output by the Bi-LSTM layer into the self-attention layer, explicitly learning the dependency relationship between any two characters in the sentence, and capturing the internal structure information of the sentence;
for the input at each time step, H = (h_1, h_2, ..., h_n) denotes the output of the Bi-LSTM hidden layer. Following the multi-head attention mechanism, the input vectors are linearly transformed and the dot product is scaled; the attention formula is

Attention(Q, K, V) = softmax(QK^T / sqrt(d)) V

wherein: Q is the query matrix; K is the key matrix; V is the value matrix; d is the dimension of the Bi-LSTM hidden unit and is numerically equal to 2d_h.

Setting Q = K = V = H, the multi-head attention first projects the query, key and value H linearly using different linear projections, then the h projections perform scaled dot-product attention in parallel; finally, these attention results are concatenated and projected again to obtain a new representation (sketched in code below);
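The following sketch shows one way to realize the multi-head self-attention above with Q = K = V = H; the head count and model dimension are illustrative assumptions, and each head here scales by sqrt(d_k), the per-head dimension, a common variant of the scaling in the formula:

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        assert d_model % n_heads == 0
        self.h, self.d_k = n_heads, d_model // n_heads
        # distinct linear projections for queries, keys and values, plus output projection
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, H):                      # H: (batch, seq_len, d_model)
        b, n, _ = H.shape
        def split(x):                          # (b, n, d_model) -> (b, heads, n, d_k)
            return x.view(b, n, self.h, self.d_k).transpose(1, 2)
        q, k, v = split(self.w_q(H)), split(self.w_k(H)), split(self.w_v(H))
        # scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        out = torch.softmax(scores, dim=-1) @ v
        # concatenate the heads and project again to obtain the new representation
        return self.w_o(out.transpose(1, 2).reshape(b, n, -1))
```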
4) Inputting the output vector sequence into the CRF layer and performing label decoding; the result is fed into the CRF layer, which contains a transfer matrix representing the transition scores between labels; the score of the label corresponding to each word is composed of two parts: the emission score output by the previous layer and the transition score from the transfer matrix. Legal constraints between predicted labels are added through the transfer matrix, increasing the grammatical rationality of the labels; finally, Viterbi decoding is used to infer the label sequence with the highest score for label prediction (a decoding sketch follows).
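As a sketch of this decoding step, the Viterbi routine below infers the highest-scoring label sequence from per-token emission scores and the learned transfer matrix; tensor shapes and variable names are assumptions for illustration:

```python
import torch

def viterbi_decode(emissions, transitions):
    """emissions: (seq_len, n_tags) scores from the previous layer;
    transitions[i, j]: learned score of moving from tag i to tag j."""
    seq_len, n_tags = emissions.shape
    score = emissions[0]                # best score for each tag at position 0
    history = []
    for t in range(1, seq_len):
        # total[i, j]: best path ending in tag i at t-1, then taking tag j at t
        total = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, best_prev = total.max(dim=0)
        history.append(best_prev)
    # backtrack the highest-scoring tag sequence
    best_tag = int(score.argmax())
    path = [best_tag]
    for best_prev in reversed(history):
        best_tag = int(best_prev[best_tag])
        path.append(best_tag)
    return list(reversed(path))
```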
5) Model training:
a. The network reads the training samples and trains, iterating from 1; training stops once the iteration count exceeds the maximum number K. For each input training data set, the loss of the current output is calculated according to the loss function and used to measure the training degree of the model. If the loss is larger than the preset minimum loss value, the model still needs further training and adjustment, and the network parameters of each layer are updated in turn using the back-propagation algorithm; if the loss is smaller than the preset minimum loss value, the model has reached the training standard, training ends and the program exits.
b. After the current batch of training data has been traversed, the validation set is used to verify the training degree of the model. If the current validation result is better than the best historical result, the current training is effective and the performance of the model is still rising, so training can continue; the current data is recorded and the best historical result is replaced by the current validation result before the next round of training. If the validation result does not improve over M consecutive rounds, this may indicate that the learning-rate step is too large and the extremum of the minimum loss has just been stepped over; the learning rate can then be reduced appropriately and training continued. This is repeated until the learning rate falls below the system preset value, at which point training ends and the program exits (a sketch of this schedule follows).
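A compact sketch of this training schedule; train_one_epoch and evaluate are hypothetical helper functions standing in for the forward/backward pass and validation-set scoring, and the constants are illustrative, not values fixed by the invention:

```python
# hypothetical helpers: train_one_epoch(model, data, lr) runs one traversal with
# back-propagation; evaluate(model, data) returns the validation F1
K, M, LR_FLOOR = 100, 5, 1e-5        # iteration cap, patience, learning-rate floor
lr, best_f1, stale = 1e-3, 0.0, 0

for epoch in range(1, K + 1):        # training stops once the cap is exceeded
    train_one_epoch(model, train_set, lr)
    f1 = evaluate(model, dev_set)    # verify the training degree after each traversal
    if f1 > best_f1:                 # current training is effective: record and go on
        best_f1, stale = f1, 0
    else:
        stale += 1
        if stale >= M:               # no improvement in M rounds: the step may be
            lr, stale = lr / 2, 0    # too large, so reduce the learning rate (one rule)
            if lr < LR_FLOOR:        # below the preset value: end training and exit
                break
```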
c. After model training is finished, the training condition of the model is tested, and the test process of the model is as follows:
(1) The network parameters obtained through training are loaded into the model, and a test data set is input.
(2) The network receives the test data set, and obtains the final test output through a forward propagation algorithm.
(3) The output sequence of the network model is compared with the correct labeling sequence.
(4) Finally, the precision, recall and F1 value are calculated.
The experiments of this example were performed on Microsoft's MSRA news dataset and the public LiteratureNER dataset, respectively.
The MSRA dataset comes from the SIGHAN 2006 shared task on Chinese named entity recognition. The dataset contains 3 entity types: person, organization and location. Statistics show that it contains 48,998 sentences for training and 4,432 sentences for testing; since the MSRA dataset lacks a validation set, this embodiment takes one tenth of the training set as the validation set.
The LiteratureNER dataset was constructed from hundreds of Chinese literature articles, excluding articles that are too short or too cluttered. It has 7 entity types: person, organization, location, abstraction, time, thing and metric. The dataset is split as follows: 26,320 training sentences, 2,045 validation sentences and 3,016 test sentences.
To demonstrate the superiority of the method, the experimental results of several published methods are selected as baselines for comparison; the final performance of the model is compared with the baseline models in Tables 1 and 2.
In this example, the same evaluation indexes as in previous work are adopted, namely precision (P), recall (R) and F1-score (F1). Precision reflects the ratio of correctly predicted tokens to all predicted tokens; recall reflects the ratio of correctly predicted tokens to all tokens in the data used; F1 is the harmonic mean of precision and recall. The three indexes are calculated as follows:

P = TP / (TP + FP)
R = TP / (TP + FN)
F1 = 2PR / (P + R)

wherein: TP is the number of tokens the model judges positive that are actually positive; FP is the number of tokens the model judges positive that are actually negative; TN is the number of tokens the model judges negative that are actually negative; FN is the number of tokens the model judges negative that are actually positive. (A sketch computing these indexes follows.)
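A small sketch computing the three indexes from the token counts defined above; the example numbers are invented purely for illustration:

```python
def precision_recall_f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0       # precision
    r = tp / (tp + fn) if tp + fn else 0.0       # recall
    f1 = 2 * p * r / (p + r) if p + r else 0.0   # harmonic mean of P and R
    return p, r, f1

# e.g. precision_recall_f1(900, 85, 100) -> (0.9137..., 0.9000, 0.9068...)
```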
Finally, the experiments show that, without using any hand-crafted feature templates, the new model obtains better results on the public datasets: the F1 value reaches 91.37% on MSRA and 73.23% on the LiteratureNER dataset. Both results are better than previous work, improving on Zhang et al. by 0.5% and 0.2% respectively and reaching the current best performance on this task. At the same time, the method runs on multiple data sets, has strong applicability and high accuracy, and can be applied to multi-field texts.
Table 1. Results comparison on the MSRA dataset
Table 2. Results comparison on the LiteratureNER dataset
In the model training process, researchers can judge the training state of the model from the number of iterations, the curve of the labeling effect and the Accuracy curve of the model. Validation Accuracy refers to the ratio of the number of correctly predicted samples in the validation set to the total number of predicted samples, regardless of whether a sample is a positive or a negative example:

Accuracy = (TP + TN) / (TP + FP + TN + FN)
thus, this example demonstrates the scaled up 33-round results of the experimental setup in 100 rounds of iterations, and plots the F1 values of the model on the two data sets, respectively, as shown in fig. 2-5.
As can be seen from the figures, the F1 value of the model converges relatively quickly from the start of iteration, stabilizing after about 15 iterations and thereafter floating within a small range; on the LiteratureNER dataset, the model stabilizes in the initial stage of training. This is related to the composition and size of the two datasets: the MSRA dataset is relatively large, so more training is needed before a steady state is reached. Nevertheless, the F1 curves on both datasets show that the model converges quickly without falling into overfitting and is well suited to the Chinese named entity recognition task.
Although the embodiments of the present invention and the accompanying drawings have been disclosed for illustrative purposes, those skilled in the art will appreciate that: various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims, and therefore the scope of the invention is not limited to the embodiments and the disclosure of the drawings.

Claims (1)

1. A Chinese named entity recognition method based on deep learning is characterized by comprising the following steps: the identification method comprises the following steps:
1) Embedding word position information mixed vectors into the data text;
2) Inputting the vectors obtained in step 1) into a Bi-LSTM layer for vector coding, capturing long-range dependencies across the time sequence;
3) Inputting the vectors output by the Bi-LSTM layer into the self-attention layer, explicitly learning the dependency relationship between any two characters in the sentence and capturing the internal structure information of the sentence;
4) Inputting the output vector sequence into the CRF layer and performing label decoding;
the specific operation of step 1) is as follows:
a. establishing a dictionary from the training data set and obtaining a one-hot vector for each word, the length of which is the dictionary length; mapping the one-hot vectors into low-dimensional dense vectors through a look-up layer, using a pre-trained single-word position vector matrix;
b. concatenating three vectors (the word vector, the character vector from word segmentation, and the character position vector) and using the result as the input of the network model, giving a Chinese token sequence

X = (x_1, x_2, x_3, ..., x_n)

Whether a token x exists in the word lookup table and in the character lookup table is checked; when x exists in both tables, that is, the token consists of a single character, the combination of the two embedding vectors is taken as the distributed representation of the token; otherwise, only the embedding from one of the lookup tables is used as the output of the embedding layer, and the position vector is initialized to the word vector of the word in which the character is located;
the specific operation of step 2) is as follows: the word-character mixed vector of each word in the input sequence is fed into the Bi-LSTM layer as one time step of the network, and global features are extracted; through the bidirectional LSTM network, the hidden output sequence of the forward LSTM, (h→_1, h→_2, ..., h→_n), and the hidden output sequence of the reverse LSTM, (h←_1, h←_2, ..., h←_n), are obtained; the two groups of hidden sequences are spliced position-wise to obtain the complete hidden sequence h_t = [h→_t; h←_t], which is taken as the input of the next layer;
the specific operation of step 3) is as follows:
for the input at each time step, H = (h_1, h_2, ..., h_n) denotes the output of the Bi-LSTM hidden layer; following the multi-head attention mechanism, the input vectors are linearly transformed and the dot product is scaled; the attention formula is

Attention(Q, K, V) = softmax(QK^T / sqrt(d)) V

wherein: Q is the query matrix; K is the key matrix; V is the value matrix; d is the dimension of the Bi-LSTM hidden unit and is numerically equal to 2d_h;

setting Q = K = V = H, the multi-head attention first projects the query, key and value H linearly using different linear projections, then the h projections perform scaled dot-product attention in parallel; finally, these attention results are concatenated and projected again to obtain a new representation;
the specific operation of step 4) is as follows:
the result is fed into the CRF layer, which contains a transfer matrix representing the transition scores between labels; the score of the label corresponding to each word is composed of two parts: the emission score output by the previous layer and the transition score from the transfer matrix; legal constraints between predicted labels are added through the transfer matrix, increasing the grammatical rationality of the labels; finally, Viterbi decoding is used to infer the label sequence with the highest score for label prediction;
5) Model training
a. The network reads the training samples and trains, iterating from 1; training stops once the iteration count exceeds the maximum number T; for each input training data set, the loss of the current output is calculated according to the loss function and used to measure the training degree of the model; if the loss is larger than the preset minimum loss value, the model needs further training and adjustment, and the network parameters of each layer are updated in turn using the back-propagation algorithm; if the loss is smaller than the preset minimum loss value, the model has reached the training standard, training ends and the program exits;
b. after the current batch of training data has been traversed, the validation set is used to verify the training degree of the model; if the current validation result is better than the best historical result, the current training is effective and the performance of the model is still rising, so training continues, the current data is recorded, and the best historical result is replaced by the current validation result before the next round of training; if the validation result does not improve over M consecutive rounds, the learning rate is reduced and training continues; this is repeated until the learning rate falls below the system preset value, at which point training ends and the program exits;
c. after model training is finished, the training condition of the model is tested, and the test process of the model is as follows:
(1) Loading the network parameters obtained through training into a model, and inputting a test data set;
(2) The network receives the test data set, and obtains the final test output through a forward propagation algorithm;
(3) The output sequence of the network model is compared with the correct labeling sequence;
(4) Finally, the precision, recall and F1 value are calculated.
CN201911271419.XA 2019-12-12 2019-12-12 Chinese named entity recognition method based on deep learning Active CN111178074B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911271419.XA CN111178074B (en) 2019-12-12 2019-12-12 Chinese named entity recognition method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911271419.XA CN111178074B (en) 2019-12-12 2019-12-12 Chinese named entity recognition method based on deep learning

Publications (2)

Publication Number Publication Date
CN111178074A CN111178074A (en) 2020-05-19
CN111178074B true CN111178074B (en) 2023-08-25

Family

ID=70650181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911271419.XA Active CN111178074B (en) 2019-12-12 2019-12-12 Chinese named entity recognition method based on deep learning

Country Status (1)

Country Link
CN (1) CN111178074B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666427B (en) * 2020-06-12 2023-05-12 长沙理工大学 Entity relationship joint extraction method, device, equipment and medium
CN112037179B (en) * 2020-08-11 2021-05-11 深圳大学 Method, system and equipment for generating brain disease diagnosis model
CN111967265B (en) * 2020-08-31 2023-09-15 广东工业大学 Chinese word segmentation and entity recognition combined learning method for automatic generation of data set
CN112084336A (en) * 2020-09-09 2020-12-15 浙江综合交通大数据中心有限公司 Entity extraction and event classification method and device for expressway emergency
CN112069823B (en) * 2020-09-17 2021-07-09 华院计算技术(上海)股份有限公司 Information processing method and device
CN112084783B (en) * 2020-09-24 2022-04-12 中国民航大学 Entity identification method and system based on civil aviation non-civilized passengers
CN112464663A (en) * 2020-12-01 2021-03-09 小牛思拓(北京)科技有限公司 Multi-feature fusion Chinese word segmentation method
CN112685549B (en) * 2021-01-08 2022-07-29 昆明理工大学 Document-related news element entity identification method and system integrating discourse semantics
CN113076751A (en) * 2021-02-26 2021-07-06 北京工业大学 Named entity recognition method and system, electronic device and storage medium
CN113449524B (en) * 2021-04-01 2023-04-07 山东英信计算机技术有限公司 Named entity identification method, system, equipment and medium
CN113033206B (en) * 2021-04-01 2022-04-22 重庆交通大学 Bridge detection field text entity identification method based on machine reading understanding
CN113283243B (en) * 2021-06-09 2022-07-26 广东工业大学 Entity and relationship combined extraction method
CN113486173B (en) * 2021-06-11 2023-09-12 南京邮电大学 Text labeling neural network model and labeling method thereof
CN113255294B (en) * 2021-07-14 2021-10-12 北京邮电大学 Named entity recognition model training method, recognition method and device
CN114519355A (en) * 2021-08-25 2022-05-20 浙江万里学院 Medicine named entity recognition and entity standardization method
CN113919358A (en) * 2021-11-03 2022-01-11 厦门市美亚柏科信息股份有限公司 Named entity identification method and system based on active learning
CN116151241B (en) * 2023-04-19 2023-07-07 湖南马栏山视频先进技术研究院有限公司 Entity identification method and device
CN117113997B (en) * 2023-07-25 2024-07-09 四川大学 Chinese named entity recognition method for enhancing dictionary knowledge integration
CN117744656B (en) * 2023-12-21 2024-07-16 湖南工商大学 Named entity identification method and system combining small sample learning and self-checking

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145483A (en) * 2017-04-24 2017-09-08 北京邮电大学 A kind of adaptive Chinese word cutting method based on embedded expression
CN109388807A (en) * 2018-10-30 2019-02-26 中山大学 The method, apparatus and storage medium of electronic health record name Entity recognition
CN109614614A (en) * 2018-12-03 2019-04-12 焦点科技股份有限公司 A kind of BILSTM-CRF name of product recognition methods based on from attention
CN110083710A (en) * 2019-04-30 2019-08-02 北京工业大学 It is a kind of that generation method is defined based on Recognition with Recurrent Neural Network and the word of latent variable structure
CN110334339A (en) * 2019-04-30 2019-10-15 华中科技大学 It is a kind of based on location aware from the sequence labelling model and mask method of attention mechanism
CN110399492A (en) * 2019-07-22 2019-11-01 阿里巴巴集团控股有限公司 The training method and device of disaggregated model aiming at the problem that user's question sentence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021000362A1 (en) * 2019-07-04 2021-01-07 浙江大学 Deep neural network model-based address information feature extraction method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145483A (en) * 2017-04-24 2017-09-08 北京邮电大学 A kind of adaptive Chinese word cutting method based on embedded expression
CN109388807A (en) * 2018-10-30 2019-02-26 中山大学 The method, apparatus and storage medium of electronic health record name Entity recognition
CN109614614A (en) * 2018-12-03 2019-04-12 焦点科技股份有限公司 A kind of BILSTM-CRF name of product recognition methods based on from attention
CN110083710A (en) * 2019-04-30 2019-08-02 北京工业大学 It is a kind of that generation method is defined based on Recognition with Recurrent Neural Network and the word of latent variable structure
CN110334339A (en) * 2019-04-30 2019-10-15 华中科技大学 It is a kind of based on location aware from the sequence labelling model and mask method of attention mechanism
CN110399492A (en) * 2019-07-22 2019-11-01 阿里巴巴集团控股有限公司 The training method and device of disaggregated model aiming at the problem that user's question sentence

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Named Entity Recognition Based on Bi-LSTM and Attention Mechanism; Liu Xiaojun et al.; Journal of Luoyang Institute of Science and Technology; 2019-03-25; Sections 1-3, Abstract, FIG. 2 *

Also Published As

Publication number Publication date
CN111178074A (en) 2020-05-19


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant