CN111160031A - Social media named entity recognition method based on affix perception
- Publication number: CN111160031A (application CN201911289215.9A)
- Authority: CN (China)
- Prior art keywords: word, representation, character, level, text
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/316 (Physics; Computing; Electric digital data processing; Information retrieval of unstructured textual data; Indexing structures)
- G06N3/045 (Physics; Computing; Computing arrangements based on specific computational models; Neural networks; Architecture; Combinations of networks)
- G06N3/084 (Physics; Computing; Computing arrangements based on specific computational models; Neural networks; Learning methods; Backpropagation, e.g. using gradient descent)
Abstract
The invention discloses a social media named entity recognition method based on affix perception, which comprises the following steps: collecting a social media data set annotated with named entities; capturing the embedded representation, character-level representation and affix feature representation of each word, and fusing the three as the final representation of the word; inputting the final word representations into a bidirectional recurrent neural network and a conditional random field, predicting the tag sequence and calculating a loss value; training the model with a stochastic gradient descent algorithm according to the obtained loss value; and inputting text into the trained model to identify the named entities it contains. The invention enriches the semantic representation of words, alleviates the unknown-word problem in social media data and improves the effect of named entity recognition.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a social media named entity recognition method based on affix perception.
Background
In today's world, with the explosive growth of the mobile internet, people are constantly publishing information on social media, and this information accumulates into a huge amount of social media data. Compared with traditional newswire data, data on social media are more timely and contain rich information, and they have gradually become a potential information source for many applications, such as news hotspot tracking, public opinion analysis, and early warning of potential violent incidents. How to mine latent information from social media data has therefore become an important task. Entity extraction is a basic task of information extraction; a powerful entity extraction system is indispensable to building the above applications and has extremely high social and economic value.
In recent years, with the rise of deep neural network models, end-to-end neural models have become the mainstream method for named entity recognition. These methods can be broadly classified by their input representations: word-based, character-based, phrase-based, or any combination of these. Although such methods achieve good performance on news text, their performance drops dramatically on social media data because of the inherent features of social media, such as informal expressions, irregular abbreviations, ungrammatical constructions, and a higher proportion of unknown words.
The applicants have discovered that affixes, as morphemes carrying certain semantics, can to some extent help identify whether a word is part of an entity. For simplicity, only the most common kinds of affixes, namely prefixes and suffixes, are considered. Introducing affix feature representations brings two benefits. First, words with the same affix tend to have similar meanings, so affix representations enrich the semantic representation of words; for example, words with the prefix "auto-", such as "autopen" and "automat", all carry the meaning "automatic". Second, some affixes themselves carry named-entity semantics; for example, the suffix "-ie" derives from Old English and is common in personal names, children's words and colloquial language, so a word ending in "-ie" is likely to be a person's name.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a social media named entity recognition method based on affix perception. The method combines word embeddings, word character-level representations, and word prefix and suffix feature representations, capturing the affix feature representations with a bidirectional recurrent neural network. Introducing affix feature representations enriches the semantic representation of words, alleviates the unknown-word problem in social media data, and improves the effect of named entity recognition. The method generalizes to some extent and is also suitable for named entity recognition in domains such as news.
The purpose of the invention can be realized by the following technical scheme:
a social media named entity recognition method based on affix perception comprises the following steps:
collecting a social media data set marked with named entities, wherein each piece of data comprises an original text and is marked with the named entities;
preprocessing texts in the data set, and constructing index vector representation of the texts at a word level and index vector representation of the texts at a character level;
capturing embedding representation, character level representation and affix characteristic representation of a word by adopting a recurrent neural network and a word embedding technology, and fusing the word embedding representation, the character level representation and the affix characteristic representation to be used as final representation of the word;
inputting the final representation of the obtained word into a bidirectional recurrent neural network and a conditional random field, predicting a tag sequence and calculating a loss value;
training the model by adopting a stochastic gradient descent algorithm according to the obtained loss value;
and inputting the text into the trained model, and identifying the named entity in the text.
Compared with the prior art, the invention has the following beneficial effects:
the method for recognizing the named entity of the social media based on affix perception introduces affix characteristic representation of words on the basis of word embedding and character level representation of the words, enriches semantic representation of the words, relieves the problem of unknown words in social media data, improves the effect of recognizing the named entity, has certain generalization, and is also suitable for recognizing the named entity in the fields of news and the like.
Drawings
Fig. 1 is a flowchart of a social media named entity recognition method based on affix sensing according to the present invention.
Fig. 2 is a schematic diagram of a model used for extracting affix features in this embodiment.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
This embodiment provides a social media named entity recognition method based on affix perception; a flowchart of the method is shown in fig. 1. The method comprises the following steps:
(1) and collecting a social media data set marked with the named entity, wherein each piece of data comprises original text and is marked with the named entity.
The collected social media data set is used as a training set.
(2) And preprocessing the texts in the data set, and constructing the index vector representation of the texts at the word level and the index vector representation of the texts at the character level.
Specifically, the pretreatment comprises:
replacing all lower case letters in the text with corresponding upper case letters;
replacing all digits in the text with 0.
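The two preprocessing rules above can be sketched as follows (the function name `normalize_text` is illustrative, not from the patent):

```python
import re

def normalize_text(text: str) -> str:
    """Apply the two preprocessing rules: uppercase all letters
    and replace every digit with 0."""
    text = text.upper()              # lowercase letters -> corresponding uppercase
    text = re.sub(r"\d", "0", text)  # all digits -> 0
    return text

print(normalize_text("iPhone 11 Pro costs $699"))  # IPHONE 00 PRO COSTS $000
```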
Specifically, the constructing the index representation of the text at a word level and a character level comprises:
(2-1) traversing all texts of the social media data set, and constructing a word dictionary and a character dictionary;
specifically, the word dictionary is constructed by traversing each word of each text in the data set, adding the word to a word list when different words are encountered, and assigning an index to each word according to the adding sequence, wherein the index value is 0, 1, 2 and so on. And the vocabulary obtained after traversing is a word dictionary.
The method of character dictionary construction is as above except that each character of each word of each text is traversed.
(2-2) The word dictionary and character dictionary obtained in step (2-1) are used to serialize the text at the word level and the character level.
Serializing the text at the word level and the character level means one-hot encoding each word in each sentence, forming the corresponding vectors at the word level and at the character level.
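Steps (2-1) and (2-2) can be sketched as follows, assuming whitespace tokenization (the function names are illustrative, not from the patent):

```python
def build_dictionaries(texts):
    """Build word and character dictionaries by traversal order,
    assigning indices 0, 1, 2, ... as new items are encountered."""
    word_dict, char_dict = {}, {}
    for text in texts:
        for word in text.split():
            if word not in word_dict:
                word_dict[word] = len(word_dict)
            for ch in word:
                if ch not in char_dict:
                    char_dict[ch] = len(char_dict)
    return word_dict, char_dict

def serialize(text, word_dict, char_dict):
    """Index-vector representation of a text at the word level
    and at the character level."""
    words = text.split()
    word_ids = [word_dict[w] for w in words]
    char_ids = [[char_dict[c] for c in w] for w in words]
    return word_ids, char_ids

word_dict, char_dict = build_dictionaries(["EU rejects German call", "German call"])
word_ids, char_ids = serialize("German call", word_dict, char_dict)
```

The index vectors stand in for one-hot encodings: an index i is equivalent to a one-hot vector with a 1 in position i.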
(3) Capturing embedded representation, character level representation and affix characteristic representation of a word by adopting a recurrent neural network and a word embedding technology, and fusing the embedded representation, the character level representation and the affix characteristic representation of the word to be used as final representation of the word, wherein the steps comprise:
(3-1) The word-level serialization of a text is expressed as $s = \{w_1, w_2, \ldots, w_n\}$, where $n$ denotes the number of words in the text, $w_i \in \mathbb{R}^{v}$ is the one-hot encoding of the $i$-th word of the sentence, and $v$ is the number of words in the word dictionary. Inputting $s$ into the word embedding layer gives the corresponding word embedding representation $x_i = W^{e} w_i$, where $W^{e} \in \mathbb{R}^{d \times v}$ denotes the trainable parameters of the word embedding layer and $d$ denotes the dimension of the word embedding vector.
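The embedding lookup of step (3-1) can be sketched as follows; the dimensions and the random initialisation are illustrative, and in practice the matrix is learned during training:

```python
import numpy as np

rng = np.random.default_rng(0)
v, d = 1000, 100                   # word-dictionary size v, embedding dimension d
W_e = rng.standard_normal((d, v))  # trainable parameter matrix of the embedding layer

def embed_words(word_ids):
    """x_i = W^e w_i: multiplying W_e by a one-hot vector selects the
    corresponding column, so an index lookup is equivalent and cheaper."""
    return W_e[:, word_ids].T      # shape (n, d): one d-dimensional vector per word

x = embed_words([3, 17, 42])       # embeddings of a 3-word sentence
```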
(3-2) Let the character serialization of the $i$-th word in the text be represented as $w_i = \{c_{i,1}, c_{i,2}, \ldots, c_{i,m}\}$, where $m$ denotes the number of characters contained in the $i$-th word, $c_{i,j} \in \mathbb{R}^{v_c}$ is the one-hot encoding of the $j$-th character of the $i$-th word of the sentence, and $v_c$ is the number of characters in the character dictionary. A bidirectional recurrent neural network extracts the character-level representation of the word. First, each character of the word is input into the character embedding layer to obtain the corresponding character embedding $e_{i,j} = W^{c} c_{i,j}$, where $W^{c}$ is the parameter matrix of the character embedding layer, of size $d_c \times v_c$. The character embeddings are then input into the bidirectional recurrent neural network, giving the forward hidden state vector $\overrightarrow{h}_{i,j}$ and the reverse hidden state vector $\overleftarrow{h}_{i,j}$ of each character. Finally, the last hidden state vector of the forward recurrent network and the last hidden state vector of the reverse recurrent network are spliced to represent the character-level representation of the word: $h_i^{c} = [\overrightarrow{h}_{i,m}; \overleftarrow{h}_{i,1}]$.
(3-3) For simplicity, the first $t$ characters of each word are taken as its prefix and, similarly, the last $t$ characters as its suffix, where $t$ is a hyperparameter. As in step (3-2), let the character serialization of the $i$-th word be $w_i = \{c_{i,1}, c_{i,2}, \ldots, c_{i,m}\}$, where $c_{i,j} \in \mathbb{R}^{v_c}$ is the one-hot encoding of the $j$-th character of the $i$-th word and $v_c$ is the character dictionary size. A bidirectional recurrent neural network is again used: each character of the word is input into the character embedding layer to obtain the corresponding character embedding $e_{i,j} = W^{c} c_{i,j}$, where $W^{c}$ is the $d_c \times v_c$ parameter matrix of the character embedding layer; the embeddings are then input into the bidirectional recurrent network, giving the forward hidden state vector $\overrightarrow{h}_{i,j}$ and the reverse hidden state vector $\overleftarrow{h}_{i,j}$ of each character. The hidden state vectors of the first $t$ characters are then spliced into a matrix of dimension $d_v \times t$; this matrix contains the prefix information. In particular, if the length of a word is less than $t$, the hidden state vectors of all time steps are spliced together, and the resulting matrix has dimension $d_v \times m$. To ensure that the prefix features of all words have a consistent dimension, an averaging operation is performed on this hidden state matrix, i.e. the mean is taken over its second dimension, finally yielding the prefix feature representation $f_i^{pre} \in \mathbb{R}^{d_v}$. Similarly, the same operation on the hidden state vectors corresponding to the last $t$ characters of the word yields the suffix feature representation $f_i^{suf} \in \mathbb{R}^{d_v}$.
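A minimal numpy sketch of step (3-3), with illustrative dimensions and randomly initialised (untrained) RNN parameters; the function names are assumptions, not from the patent:

```python
import numpy as np

rng = np.random.default_rng(1)
d_c, d_h = 8, 16   # character-embedding and hidden dimensions (illustrative)

# Randomly initialised parameters of a vanilla tanh RNN, used here as a
# stand-in for the bidirectional recurrent network (in practice trained).
Wx_f, Wh_f = rng.standard_normal((d_h, d_c)), rng.standard_normal((d_h, d_h))
Wx_b, Wh_b = rng.standard_normal((d_h, d_c)), rng.standard_normal((d_h, d_h))

def run_rnn(char_embs, Wx, Wh):
    """Run a simple tanh RNN over a sequence of character embeddings."""
    h, states = np.zeros(d_h), []
    for e in char_embs:
        h = np.tanh(Wx @ e + Wh @ h)
        states.append(h)
    return states

def affix_features(char_embs, t=3):
    """Prefix/suffix features: concatenate forward and backward hidden
    states per character, then average the states of the first / last t
    characters (all characters when the word is shorter than t)."""
    fwd = run_rnn(char_embs, Wx_f, Wh_f)
    bwd = run_rnn(char_embs[::-1], Wx_b, Wh_b)[::-1]
    H = np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])
    k = min(t, len(char_embs))
    prefix = H[:k].mean(axis=0)   # average over the time dimension
    suffix = H[-k:].mean(axis=0)
    return prefix, suffix

embs = rng.standard_normal((5, d_c))   # embeddings of a 5-character word
p, s = affix_features(embs, t=3)
```

The averaging over the time dimension is what makes the prefix and suffix features of all words the same size, regardless of word length.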
(3-4) The word embedding representation, character-level representation, prefix feature representation and suffix feature representation obtained in steps (3-1) to (3-3) are spliced to obtain the final representation of the word: $r_i = [x_i; h_i^{c}; f_i^{pre}; f_i^{suf}]$.
(4) Inputting the final representation of the obtained word into a bidirectional recurrent neural network and a conditional random field, predicting a tag sequence and calculating a loss value, wherein the steps comprise:
(4-1) The final word representations obtained in step (3) are input into the bidirectional recurrent neural network, and the resulting forward and reverse hidden states are spliced to obtain the word sequence representation $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$.
(4-2) The word sequence representation obtained in step (4-1) is input into a fully connected layer, giving the score of each word on all labels: $P_i = W h_i + b$, where $W$ and $b$ are trainable parameters;
(4-3) Let $y = \{y_1, y_2, \ldots, y_n\}$ denote the predicted tag sequence corresponding to the input text $s$, and let $Y(s)$ denote the set of all possible tag sequences for $s$. The $P_i$ obtained in step (4-2) are input into the conditional random field, and the score of each possible sequence is calculated according to the following formula:

$$\mathrm{score}(s, y) = \sum_{i=1}^{n-1} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$$

where $A$ is the state transition score matrix of size $k \times k$, $A_{i,j}$ is the score of a transition from label $i$ to label $j$, so that $A_{y_i, y_{i+1}}$ is the score (likelihood) of the tag $y_i$ being followed by the tag $y_{i+1}$ in the predicted sequence; $y_i$ is the $i$-th tag in the predicted tag sequence $y$; and $P_{i, y_i}$ is the score (likelihood) of the $i$-th word of the input text $s$ taking the label $y_i$. The highest-scoring tag sequence is taken as the final prediction:

$$y^{*} = \operatorname*{arg\,max}_{\tilde{y} \in Y(s)} \mathrm{score}(s, \tilde{y})$$

Finally, the loss value is calculated as the negative log-likelihood of the gold tag sequence $y$:

$$\mathrm{loss} = -\mathrm{score}(s, y) + \log \sum_{\tilde{y} \in Y(s)} \exp\bigl(\mathrm{score}(s, \tilde{y})\bigr)$$
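The CRF scoring, decoding and loss of step (4-3) can be sketched by brute force over all possible tag sequences (illustrative only: real implementations use Viterbi decoding and the forward algorithm, and any special start/end transition terms are omitted here):

```python
import numpy as np
from itertools import product

def sequence_score(P, A, y):
    """score(s, y): transition scores A[y_i, y_{i+1}] plus the
    per-word label scores P[i, y_i] from the linear layer."""
    trans = sum(A[y[i], y[i + 1]] for i in range(len(y) - 1))
    emit = sum(P[i, y[i]] for i in range(len(y)))
    return trans + emit

def crf_decode_and_loss(P, A, gold):
    """Enumerate all k^n tag sequences: return the highest-scoring
    sequence and the negative log-likelihood of the gold sequence."""
    n, k = P.shape
    seqs = list(product(range(k), repeat=n))
    scores = np.array([sequence_score(P, A, y) for y in seqs])
    best = seqs[int(scores.argmax())]
    log_Z = np.logaddexp.reduce(scores)   # log of the sum over all sequences
    loss = -sequence_score(P, A, tuple(gold)) + log_Z
    return best, loss

rng = np.random.default_rng(2)
P = rng.standard_normal((4, 3))   # scores of 4 words over 3 labels
A = rng.standard_normal((3, 3))   # 3 x 3 transition score matrix
best, loss = crf_decode_and_loss(P, A, gold=(0, 1, 2, 0))
```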
(5) Training the model by adopting a stochastic gradient descent algorithm according to the loss value obtained in step (4) to obtain a trained model;
when the loss value of the model no longer decreases, training is complete.
(6) And (5) inputting the text into the trained model obtained in the step (5), and identifying the named entity in the text.
Fig. 2 is a schematic diagram of a model used for extracting the affix feature representation.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited thereto; any change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention should be construed as an equivalent and is intended to be included in the scope of the present invention.
Claims (10)
1. A social media named entity recognition method based on affix perception is characterized by comprising the following steps:
collecting a social media data set marked with named entities, wherein each piece of data comprises an original text and is marked with the named entities;
preprocessing texts in the data set, and constructing index vector representation of the texts at a word level and index vector representation of the texts at a character level;
capturing embedding representation, character level representation and affix characteristic representation of a word by adopting a recurrent neural network and a word embedding technology, and fusing the word embedding representation, the character level representation and the affix characteristic representation to be used as final representation of the word;
inputting the final representation of the obtained word into a bidirectional recurrent neural network and a conditional random field, predicting a tag sequence and calculating a loss value;
training the model by adopting a stochastic gradient descent algorithm according to the obtained loss value;
and inputting the text into the trained model, and identifying the named entity in the text.
2. The method of claim 1, wherein the pre-processing comprises:
replacing all lower case letters in the text with corresponding upper case letters;
replacing all numbers in the text with 0;
traversing all texts of the social media data set, and constructing a word dictionary and a character dictionary;
the text is serialized at word level and character level using the resulting word dictionary and character dictionary.
3. The method of claim 1, wherein the step of constructing an index vector representation of the text at a word level and an index vector representation of the text at a character level comprises:
traversing all texts of the social media data set, and constructing a word dictionary and a character dictionary;
the text is serialized at word level and character level using the resulting word dictionary and character dictionary.
4. The method of claim 3, wherein the word dictionary is constructed by:
traversing each word of each text in the data set, adding each new word encountered to a word list, and assigning each word an index in order of addition, with index values 0, 1, 2, and so on; the word list obtained after the traversal is the word dictionary;
the construction method of the character dictionary is the same as that of the word dictionary, except that each character of each word of each text is traversed;
the serialization method comprises the following steps:
the text is serialized at a word level and a character level, namely, each word in each sentence is subjected to one-hot coding, and corresponding vectors are formed according to the word level and the character level.
5. The method of claim 1, wherein the step of capturing the embedded representation, the character-level representation, and the affix feature representation of the word and fusing the embedded representation, the character-level representation, and the affix feature representation of the word as a final representation of the word comprises:
inputting the word-level serialized representation of the text into a word embedding layer to obtain corresponding word embedding representation;
extracting a character-level representation of a word by adopting a bidirectional recurrent neural network: firstly, inputting each character of the word into a character embedding layer to obtain the corresponding character embedding; then inputting the character embeddings into the bidirectional recurrent neural network to obtain a forward hidden state vector and a reverse hidden state vector of each character; and finally splicing the last hidden state vector of the forward recurrent network and the last hidden state vector of the reverse recurrent network to represent the character-level representation of the word;
extracting the prefix and suffix feature representations of a word by adopting a bidirectional recurrent neural network: firstly, inputting each character of the word into a character embedding layer to obtain the corresponding character embedding; then inputting the character embeddings into the bidirectional recurrent neural network to obtain a forward hidden state vector and a reverse hidden state vector of each character; and finally splicing the hidden state vectors of the first t characters to obtain a matrix, from which the prefix feature representation of the word is derived; carrying out the same operation on the hidden state vectors corresponding to the last t characters of the word to obtain the suffix feature representation of the word;
and splicing the obtained word embedded representation, the character level representation, the prefix characteristic representation and the suffix characteristic representation of the word to obtain a final representation of the word.
6. The method of claim 5, wherein in extracting the character-level representation of the word, if the length of the word is less than t, the implicit state vectors at all time steps are concatenated together; in order to ensure that the dimension of the prefix features of all words is consistent, an averaging operation is performed on the implicit state matrices, namely, the average value is taken on the second dimension of the matrices, and finally, the prefix feature representation is obtained.
7. The method of claim 1, wherein the step of inputting the resulting final representation of the word into a bidirectional recurrent neural network and a conditional random field, predicting tag sequences and calculating loss values comprises:
inputting the final representation of the obtained word into the bidirectional recurrent neural network, and splicing the resulting forward hidden state and reverse hidden state to obtain a word sequence representation;
inputting the word sequence representation into a full-connection layer, and calculating the score of each word on all labels;
inputting the obtained word sequence representation into a conditional random field, and calculating the score of each possible sequence;
the highest-scoring tag sequence is taken as the final prediction and its loss value is calculated.
8. The method of claim 7, wherein the score for each word over all tags is calculated by the formula:
$P_i = W h_i + b$
where $W$ and $b$ are trainable parameters.
9. The method of claim 7, wherein the score for each possible sequence is calculated by:
$$\mathrm{score}(s, y) = \sum_{i=1}^{n-1} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$$
wherein $A$ represents the state transition score matrix of size $k \times k$; $A_{i,j}$ represents the score of a transition from label $i$ to label $j$, so that $A_{y_i, y_{i+1}}$ indicates the score of the tag $y_i$ being followed by the tag $y_{i+1}$ in the predicted sequence; $y_i$ represents the $i$-th tag in the predicted tag sequence $y$; and $P_{i, y_i}$ represents the score of the $i$-th word of the input text $s$ taking the label $y_i$;
the highest-scoring tag sequence is taken as the final prediction result:
$$y^{*} = \operatorname*{arg\,max}_{\tilde{y} \in Y(s)} \mathrm{score}(s, \tilde{y})$$
Priority Applications (2)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911289215.9A | 2019-12-13 | 2019-12-13 | Social media named entity identification method based on affix perception |
| PCT/CN2020/112923 (WO2021114745A1) | 2019-12-13 | 2020-09-02 | Named entity recognition method employing affix perception for use in social media |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN111160031A | 2020-05-15 |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021114745A1 (en) * | 2019-12-13 | 2021-06-17 | 华南理工大学 | Named entity recognition method employing affix perception for use in social media |
CN112270194A (en) * | 2020-11-03 | 2021-01-26 | 沈阳雅译网络技术有限公司 | Named entity identification method based on gradient neural network structure search |
CN112270194B (en) * | 2020-11-03 | 2023-07-18 | 沈阳雅译网络技术有限公司 | Named entity identification method based on gradient neural network structure search |
CN113033206A (en) * | 2021-04-01 | 2021-06-25 | 重庆交通大学 | Bridge detection field text entity identification method based on machine reading understanding |
CN113609857A (en) * | 2021-07-22 | 2021-11-05 | 武汉工程大学 | Legal named entity identification method and system based on cascade model and data enhancement |
CN113609857B (en) * | 2021-07-22 | 2023-11-28 | 武汉工程大学 | Legal named entity recognition method and system based on cascade model and data enhancement |
CN115757325A (en) * | 2023-01-06 | 2023-03-07 | 珠海金智维信息科技有限公司 | Intelligent conversion method and system for XES logs |
CN115757325B (en) * | 2023-01-06 | 2023-04-18 | 珠海金智维信息科技有限公司 | Intelligent conversion method and system for XES log |
Also Published As
Publication number | Publication date |
---|---|
WO2021114745A1 (en) | 2021-06-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111160031A (en) | Social media named entity identification method based on affix perception | |
CN110119765B (en) | Keyword extraction method based on Seq2Seq framework | |
CN110737763A (en) | Chinese intelligent question-answering system and method integrating knowledge graph and deep learning | |
CN111291195B (en) | Data processing method, device, terminal and readable storage medium | |
CN112101041B (en) | Entity relationship extraction method, device, equipment and medium based on semantic similarity | |
Zhang et al. | Learning Chinese word embeddings from stroke, structure and pinyin of characters | |
CN113392209B (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN113569050B (en) | Method and device for automatically constructing government affairs domain knowledge graph based on deep learning | |
CN112100332A (en) | Word embedding expression learning method and device and text recall method and device | |
Suman et al. | Why pay more? A simple and efficient named entity recognition system for tweets | |
CN108108468A (en) | Short text sentiment analysis method and apparatus based on concepts and text emotion | |
CN113722490B (en) | Visually rich document information extraction method based on key-value matching relations | |
Jiang et al. | An LSTM-CNN attention approach for aspect-level sentiment classification | |
CN111814477B (en) | Dispute focus discovery method and device based on dispute focus entity and terminal | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
CN112069312A (en) | Text classification method based on entity recognition and electronic device | |
CN113704416A (en) | Word sense disambiguation method and device, electronic equipment and computer-readable storage medium | |
CN114757184B (en) | Method and system for realizing knowledge question and answer in aviation field | |
CN114912453A (en) | Chinese legal document named entity identification method based on enhanced sequence features | |
CN113961666A (en) | Keyword recognition method, apparatus, device, medium, and computer program product | |
CN112784602A (en) | News emotion entity extraction method based on remote supervision | |
CN114064901B (en) | Book comment text classification method based on knowledge graph word meaning disambiguation | |
CN115630145A (en) | Multi-granularity emotion-based conversation recommendation method and system | |
CN111159405B (en) | Irony detection method based on background knowledge | |
CN116595023A (en) | Address information updating method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2020-05-15 |