CN111178080A - Named entity identification method and system based on structured information - Google Patents

Publication number
CN111178080A
Authority
CN
China
Prior art keywords
word
sentence
processing
structural
obtaining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010002138.0A
Other languages
Chinese (zh)
Other versions
CN111178080B (en)
Inventor
周彬
牛迪
任天成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Tuya Information Technology Co Ltd
Original Assignee
Hangzhou Tuya Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Tuya Information Technology Co Ltd
Priority to CN202010002138.0A
Publication of CN111178080A
Application granted
Publication of CN111178080B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a named entity identification method based on structured information, comprising: performing structured processing on a sentence and obtaining a processing result; obtaining structured features according to the processing result; and preprocessing the sentence or performing sequence labeling on it according to the distribution of the structured features. Compared with the prior art, the application has the following beneficial effects: it provides a Chinese named entity recognition method based on the structural information of characters and words, in which a domain dictionary is constructed for word segmentation to ensure the accuracy of entity boundaries, and the semantic information carried by each character and word is analyzed from its structured features and then used as the basis for judging entities.

Description

Named entity identification method and system based on structured information
Technical Field
The application relates to the field of named entity identification, in particular to a named entity identification method based on structured information.
Background
Named Entity Recognition (NER), also known as proper name recognition, is a task in information extraction and has a wide application range. Named entities generally refer to entities in text that have a particular meaning or strong reference, and typically include names of people, places, organizations, dates and times, and the like. The NER system extracts the entities from the unstructured input text and can identify more classes of entities according to business requirements.
Currently, named entity recognition algorithms mainly use machine learning and deep learning models to perform sequence labeling on a single sentence. Sequence labeling means tagging each character or word in a sentence; for example, in "I love China", "China" is tagged as a place-name entity. In practical applications, however, one word may refer to several entities: "Red" in "play the TV series Red" is a TV-series-name entity, whereas "red" in "set the light to red" is a light color, that is, an ordinary word. This phenomenon is also called word ambiguity. When processing such ambiguous words, current models use only the representation of the characters or words and not deeper structural features, so they have difficulty predicting entities accurately. In the example above, an existing algorithm would, with high probability, judge "red" in both sentences to be a TV series name.
On the other hand, current named entity recognition algorithms also suffer from incorrect boundary division. Chinese named entity recognition algorithms can be divided into word-based methods and character-based methods. A word-based method first performs word segmentation and then judges entities on the resulting words. Word segmentation errors, however, cause entity boundary errors, and the problem is severe in the open domain because cross-domain word segmentation remains unsolved. For example, "Nanjing Yangtze River Bridge" may be segmented incorrectly, so that a fragment such as "river bridge" is split off and then judged by current algorithms to be a person's name. A character-based method needs no prior word segmentation and judges directly, for each character, whether it is part of an entity. Although this avoids some word segmentation errors, it cannot use explicit word and word-sequence information.
Disclosure of Invention
The main objective of the present application is to provide a named entity identification method based on structured information, which includes:
performing structured processing on a sentence and obtaining a processing result;
obtaining structured features according to the processing result; and
preprocessing the sentence or performing sequence labeling on it according to the distribution of the structured features.
Optionally, performing the structured processing and obtaining the processing result comprises: performing word segmentation on the sentence by using databases, then performing matching retrieval on each word in the sentence, and obtaining one of three processing results for each word: 1) the word does not exist in any database; 2) the word exists in only one database; 3) the word exists in two or more databases.
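The per-word matching retrieval can be sketched as follows. This is a minimal illustration, assuming each "database" is an in-memory set of words keyed by a type name; the names `DATABASES`, `WordResult` and `lookup`, and the sample entries, are hypothetical and do not come from the application.

```python
from enum import Enum

# Hypothetical toy databases; the application does not specify their contents.
DATABASES = {
    "place": {"China"},
    "tv_series": {"Red"},
    "color": {"Red"},
    "common": {"I", "love"},
}


class WordResult(Enum):
    IN_NONE = 1      # result 1): the word is in no database
    IN_ONE = 2       # result 2): the word is in exactly one database
    IN_MULTIPLE = 3  # result 3): the word is in two or more databases


def lookup(word):
    """Return the processing result and the sorted names of matching databases."""
    hits = sorted(name for name, words in DATABASES.items() if word in words)
    if not hits:
        return WordResult.IN_NONE, hits
    if len(hits) == 1:
        return WordResult.IN_ONE, hits
    return WordResult.IN_MULTIPLE, hits
```

Returning the matching database names together with the result keeps the information that later steps need: the rule layer uses the single matching type, and the sequence labeling model can encode the full set of matches as a feature.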
Optionally, obtaining the structured feature according to the processing result includes:
according to the processing result, obtaining the structured features of each word in the sentence, and obtaining one of three types of distribution result for the sentence according to the structured features of its words: 1) each word in the sentence exists in only one database; 2) one or more words in the sentence do not exist in any database; 3) one or more words in the sentence exist in two or more databases.
Optionally, according to the distribution of the structural features, the sentence preprocessing or sequence labeling includes:
when the distribution result of the structured features belongs to category 1), a rule layer gives each word a corresponding label according to the type of the database containing it; and when the distribution result belongs to category 2) or 3), the sentence is preprocessed, and each word in the sentence is then labeled by a sequence labeling model so as to judge its entity type.
Optionally, the sequence labeling model adopts a deep learning method, including:
constructing word features by vectorizing each word in the sentence to obtain a word vector;
constructing character features by inputting the word vectors into a convolutional neural network to obtain the character features;
performing matching retrieval on each word in the sentence through the databases to obtain the structured feature of each word, and vectorizing the structured feature to obtain a structured feature vector;
splicing (concatenating) the word vector, the character features and the structured feature vector, and inputting the spliced result into a bidirectional long short-term memory model to obtain a hidden-layer vector;
inputting the hidden-layer vector into a fully connected network to obtain logits, i.e. the unnormalized probability distribution in deep learning; and
inputting the logits into a conditional random field to obtain the entity labels.
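The structured feature vector and the splicing step can be sketched as follows. The application only says that the structured feature is "vectorized"; the multi-hot encoding used here (one slot per database) is an assumption, as are the names `DICTIONARY_NAMES`, `structured_feature_vector` and `splice`.

```python
# Hypothetical, fixed ordering of the available databases.
DICTIONARY_NAMES = ["place", "person", "tv_series", "common"]


def structured_feature_vector(matching_dictionaries):
    """Multi-hot encoding: one slot per database, 1.0 if the word is in it."""
    return [1.0 if name in matching_dictionaries else 0.0
            for name in DICTIONARY_NAMES]


def splice(word_vector, char_features, structured_vector):
    """Concatenate the three feature groups into one input vector for the BiLSTM."""
    return list(word_vector) + list(char_features) + list(structured_vector)
```

With this encoding, an ambiguous word that appears in several databases has several slots set, and a word found in no database is encoded as all zeros, so the model input distinguishes the three per-word lookup results.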
Optionally, the pre-processing comprises: stop word recognition, numerical value conversion and wrongly written character correction.
According to an aspect of the present application, there is provided a named entity recognition system based on structured information, including: a preprocessing module, a structured processing module, a rule layer module and a sequence labeling model module, wherein
the structured processing module is used for performing structured processing on sentences, obtaining processing results, and obtaining structured features according to the processing results; the preprocessing module is used for preprocessing sentences according to the distribution of the structured features; and the rule layer module is used for performing sequence labeling on sentences, through the sequence labeling model module, according to the distribution of the structured features.
The application also discloses a computer device, which comprises a memory, a processor and a computer program stored in the memory and capable of being executed by the processor, wherein the processor realizes the method of any one of the above items when executing the computer program.
The application also discloses a computer-readable storage medium, which is a non-volatile readable storage medium having stored therein a computer program that, when executed by a processor, implements the method of any one of the above.
The present application also discloses a computer program product comprising computer readable code which, when executed by a computer device, causes the computer device to perform the method of any of the above.
Compared with the prior art, the method has the following beneficial effects:
named entity recognition is performed on the basis of the structural information of characters and words: the characters and words in a sentence are analyzed to obtain structured features, and the semantic information contained in those features serves as the basis for judging entity types;
in the vectorized input of the sequence labeling model, word vectors, structured features obtained through dictionary matching retrieval, and character features obtained through CNN encoding are fused; and
the named entity recognition system uses the structured features to divide sentences into three categories, whose entity types are judged by the rule layer or by the sequence labeling model, respectively.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, serve to provide a further understanding of the application and to enable other features, objects, and advantages of the application to be more apparent. The drawings and their description illustrate the embodiments of the invention and do not limit it. In the drawings:
FIGS. 1-3 are schematic flow diagrams of a named entity identification method based on structured information according to an embodiment of the present application;
FIGS. 4-5 are schematic diagrams of sequence labeling processes according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a computer device according to one embodiment of the present application; and
FIG. 7 is a schematic diagram of a computer-readable storage medium according to one embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to fig. 1 to fig. 3, an embodiment of the present application provides a named entity identification method based on structured information, including:
performing structured processing on a sentence and obtaining a processing result;
obtaining structured features according to the processing result; and
preprocessing the sentence or performing sequence labeling on it according to the distribution of the structured features.
In an embodiment of the present application, performing the structured processing and obtaining the processing result includes: performing word segmentation on the sentence by using databases, then performing matching retrieval on each word in the sentence, and obtaining one of three processing results for each word: 1) the word does not exist in any database; 2) the word exists in only one database; 3) the word exists in two or more databases.
In an embodiment of the present application, obtaining the structural feature according to the processing result includes:
according to the processing result, obtaining the structured features of each word in the sentence, and obtaining one of three types of distribution result for the sentence according to the structured features of its words: 1) each word in the sentence exists in only one database; 2) one or more words in the sentence do not exist in any database; 3) one or more words in the sentence exist in two or more databases.
In an embodiment of the present application, performing sentence preprocessing or sequence labeling according to the distribution of the structural features includes:
when the distribution result of the structured features belongs to category 1), a rule layer gives each word a corresponding label according to the type of the database containing it; and when the distribution result belongs to category 2) or 3), the sentence is preprocessed, and each word in the sentence is then labeled by a sequence labeling model so as to judge its entity type.
Referring to FIGS. 4-5, in an embodiment of the present application, the sequence labeling model adopts a deep learning method, including:
constructing word features by vectorizing each word in the sentence to obtain a word vector;
constructing character features by inputting the word vectors into a convolutional neural network to obtain the character features;
performing matching retrieval on each word in the sentence through the databases to obtain the structured feature of each word, and vectorizing the structured feature to obtain a structured feature vector;
splicing (concatenating) the word vector, the character features and the structured feature vector, and inputting the spliced result into a bidirectional long short-term memory model to obtain a hidden-layer vector;
inputting the hidden-layer vector into a fully connected network to obtain logits, i.e. the unnormalized probability distribution in deep learning; and
inputting the logits into a conditional random field to obtain the entity labels.
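The character-feature step can be illustrated with a small convolution over character embeddings. This is a sketch in plain Python, assuming a single convolution layer with ReLU and max-over-time pooling; the application does not give the network's layout, and the function name `char_cnn`, the window width and the filter layout are all assumptions.

```python
def char_cnn(char_vectors, filters, width=2):
    """1-D convolution over consecutive characters, ReLU, then max pooling.

    char_vectors: list of per-character embedding vectors of equal length.
    filters: list of filters, each a flat weight list of length width * dim.
    Returns one feature value per filter for the whole word.
    """
    dim = len(char_vectors[0])
    # Zero-pad so that a word shorter than the window still yields one window.
    padded = char_vectors + [[0.0] * dim] * max(0, width - len(char_vectors))
    windows = [
        [x for vec in padded[i:i + width] for x in vec]
        for i in range(len(padded) - width + 1)
    ]
    features = []
    for weights in filters:
        activations = [
            max(0.0, sum(w * x for w, x in zip(weights, window)))
            for window in windows
        ]
        features.append(max(activations))
    return features
```

Max pooling over all windows gives a fixed-length feature vector regardless of how many characters the word contains, which is what allows it to be concatenated with the word vector.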
In an embodiment of the present application, the preprocessing includes: stop word recognition, numerical value conversion and wrongly written character correction.
The application also provides a named entity recognition system based on structured information, which comprises: preprocessing, structural processing, a rule layer and a sequence labeling model.
Step 1: Structured processing.
In this step, dictionaries are used to segment the sentence into words, each word in the sentence is then looked up by matching retrieval, and one of three results is obtained for each word:
① the word does not exist in any dictionary;
② the word exists in only one dictionary;
③ the word exists in two or more dictionaries.
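Dictionary-based word segmentation can be sketched with forward maximum matching. The application does not name a segmentation algorithm, so this choice is an assumption, as are the function name `segment`, the `max_len` parameter and the toy vocabulary; the sample sentence "我爱中国" means "I love China".

```python
def segment(sentence, vocabulary, max_len=4):
    """Forward maximum matching: at each position take the longest known word."""
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # Fall back to a single character when nothing in the vocabulary matches.
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


# Union of all dictionary entries (hypothetical toy data).
vocab = {"我", "爱", "中国"}
print(segment("我爱中国", vocab))  # ['我', '爱', '中国']
```

Because the vocabulary is the union of the domain dictionaries, every multi-character word produced by the segmenter is by construction a dictionary entry, which is how dictionary-driven segmentation ties entity boundaries to the dictionaries.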
Step 2: Structured feature analysis.
In this step, the structured features of each word in the sentence are obtained from the processing result of Step 1, and the sentence is assigned to one of three categories according to the distribution of its words' structured features:
① each word in the sentence exists in only one dictionary;
② one or more words in the sentence do not exist in any dictionary;
③ one or more words in the sentence exist in two or more dictionaries.
If the structured feature distribution of the sentence belongs to category ①, the method proceeds directly to Step 4; if it belongs to category ② or ③, the method proceeds to Step 3.
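The sentence-level categorization and routing can be sketched as follows, assuming the per-word lookup has already produced, for each word, the number of dictionaries that contain it. The function names are hypothetical. The text does not say which category applies when a sentence contains both an unknown word and an ambiguous word; this sketch reports category 2 in that case, which does not affect routing because both categories lead to Step 3.

```python
def sentence_category(hit_counts):
    """Classify a non-empty sentence from per-word dictionary hit counts.

    hit_counts[i] is the number of dictionaries that contain word i.
    """
    if all(count == 1 for count in hit_counts):
        return 1  # every word is in exactly one dictionary
    if any(count == 0 for count in hit_counts):
        return 2  # some word is in no dictionary
    return 3      # some word is in two or more dictionaries


def next_step(category):
    """Category 1 goes straight to Step 4; categories 2 and 3 go to Step 3."""
    return 4 if category == 1 else 3
```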
Step 3: Sentence preprocessing.
The sentence is processed by the preprocessing module, including but not limited to stop word recognition, numerical transformation, wrongly written character correction, etc.
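A minimal sketch of such a preprocessing module, operating on an already segmented sentence. The lookup tables are hypothetical toy data, and dropping recognized stop words is an assumption: the application says only that stop words are recognized, not what is done with them.

```python
STOP_WORDS = {"please", "the"}                          # hypothetical stop-word list
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3"}   # hypothetical numeral mapping
TYPO_FIXES = {"ligth": "light"}                         # hypothetical correction table


def preprocess(words):
    """Apply typo correction, numeral conversion and stop-word filtering in order."""
    cleaned = []
    for word in words:
        word = TYPO_FIXES.get(word, word)      # wrongly written character correction
        word = NUMBER_WORDS.get(word, word)    # numerical conversion
        if word in STOP_WORDS:                 # stop-word recognition (dropped here)
            continue
        cleaned.append(word)
    return cleaned
```

For example, `preprocess(["please", "set", "the", "ligth", "to", "level", "two"])` returns `["set", "light", "to", "level", "2"]`.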
Step 4: Sequence labeling.
If the sentence arrives directly from Step 2, that is, each word in the sentence exists in only one dictionary, it enters the rule layer of the system. Since each word belongs to only one dictionary, the rule layer gives each word a label according to the type of that dictionary. For example, if "China" belongs only to the place-name dictionary, it is given the place-name entity label; if "I" belongs only to the common-word dictionary, it is given the common-word label.
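The rule layer amounts to a lookup from dictionary type to label. A minimal sketch, assuming two toy dictionaries and hypothetical label names (`PLACE` for place-name entities, `O` for common words):

```python
# Hypothetical dictionaries and the label assigned to each dictionary type.
DICTIONARIES = {"place": {"China"}, "common": {"I", "love"}}
LABELS = {"place": "PLACE", "common": "O"}


def rule_layer(words):
    """Label each word by the type of the single dictionary that contains it."""
    labels = []
    for word in words:
        hits = [name for name, entries in DICTIONARIES.items() if word in entries]
        if len(hits) != 1:
            # The rule layer only applies to category-1 sentences.
            raise ValueError(f"{word!r} is not in exactly one dictionary")
        labels.append(LABELS[hits[0]])
    return labels
```

For the sentence "I love China" segmented as `["I", "love", "China"]`, this yields `["O", "O", "PLACE"]` without invoking the model at all.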
If the sentence arrives from Step 3, each word in the preprocessed sentence is labeled by the sequence labeling model, thereby judging the entity type of the word. The sequence labeling model adopts a deep learning method; its structure is shown in FIGS. 4-5 and can be divided into the following sub-steps:
Step 4.1:
Construct word features: vectorize each word in the sentence to obtain a word vector.
Step 4.2:
Construct character features: a word may consist of one or more characters; each character is vectorized and then input into a CNN (Convolutional Neural Network) to obtain the character features.
Step 4.3:
Perform matching retrieval on each word in the sentence using the dictionaries to obtain the structured feature of each word, and vectorize that feature to obtain a structured feature vector.
Step 4.4:
Splice (concatenate) the word vector, the character features and the structured feature vector to form the input of a BiLSTM (Bi-directional Long Short-Term Memory model), which outputs the hidden-layer vector h.
Step 4.5:
Input the vector h into an FC (Fully Connected) layer, which outputs logits (the unnormalized probability distribution in deep learning).
Step 4.6:
Input the logits into a CRF (Conditional Random Field) to generate the final entity labels (person name, place name, common word, etc.).
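The final CRF step can be illustrated with Viterbi decoding, the standard way to find the best label sequence in a linear-chain CRF. The application does not describe the decoding procedure, so this is an assumption; the function name and the toy scores used in the usage note are hypothetical, and a trained model would supply the real `logits` and `transitions`.

```python
def viterbi(logits, transitions):
    """Return the highest-scoring label index sequence.

    logits[t][j]: unnormalized score of label j at position t (FC layer output).
    transitions[i][j]: score of moving from label i to label j (CRF parameters).
    """
    num_labels = len(logits[0])
    score = list(logits[0])
    backpointers = []
    for emit in logits[1:]:
        best_prev, new_score = [], []
        for j in range(num_labels):
            # Best previous label for ending at label j at this position.
            i_best = max(range(num_labels),
                         key=lambda i: score[i] + transitions[i][j])
            best_prev.append(i_best)
            new_score.append(score[i_best] + transitions[i_best][j] + emit[j])
        backpointers.append(best_prev)
        score = new_score
    # Trace back from the best final label.
    path = [max(range(num_labels), key=lambda j: score[j])]
    for best_prev in reversed(backpointers):
        path.append(best_prev[path[-1]])
    return path[::-1]
```

With labels 0 = common word and 1 = place name, logits `[[2.0, 0.0], [0.0, 1.0], [1.0, 0.9]]` and all-zero transitions decode to `[0, 1, 0]`. Penalizing the transition from label 1 to label 0 with `[[0.0, 0.0], [-5.0, 0.0]]` changes the result to `[0, 1, 1]`, which shows how the transition scores let the CRF prefer consistent label sequences over position-wise maxima.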
After the sentence passes through the step 4, the entity label corresponding to each word is judged through a rule layer or an algorithm model. At this point, the named entity identification process based on the structured information ends.
Compared with the prior art, the method has the following beneficial effects:
named entity recognition is performed on the basis of the structural information of characters and words: the characters and words in a sentence are analyzed to obtain structured features, and the semantic information contained in those features serves as the basis for judging entity types;
in the vectorized input of the sequence labeling model, word vectors, structured features obtained through dictionary matching retrieval, and character features obtained through CNN encoding are fused; and
the named entity recognition system uses the structured features to divide sentences into three categories, whose entity types are judged by the rule layer or by the sequence labeling model, respectively.
A Chinese named entity recognition method based on the structural information of characters and words is thus provided: a domain dictionary is constructed for word segmentation to ensure the accuracy of entity boundaries, and the semantic information carried by each character and word is analyzed from its structured features and then used as the basis for judging entities.
Referring to FIG. 6, the present application further provides a computer device including a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein the processor implements any one of the methods described above when executing the computer program.
Referring to FIG. 7, the present application further provides a computer-readable storage medium, which is a non-volatile readable storage medium having stored therein a computer program that, when executed by a processor, implements any one of the methods described above.
A computer program product comprising computer readable code which, when executed by a computer device, causes the computer device to perform the method of any of the above.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A named entity identification method based on structured information is characterized by comprising the following steps:
performing structured processing on a sentence and obtaining a processing result;
obtaining structured features according to the processing result; and
preprocessing the sentence or performing sequence labeling on it according to the distribution of the structured features.
2. The method for named entity recognition based on structured information as claimed in claim 1, wherein performing the structured processing and obtaining the processing result comprises: performing word segmentation on the sentence by using databases, then performing matching retrieval on each word in the sentence, and obtaining one of three processing results for each word: 1) the word does not exist in any database; 2) the word exists in only one database; 3) the word exists in two or more databases.
3. The named entity recognition method based on structured information as claimed in claim 2, wherein obtaining the structured features according to the processing result comprises:
according to the processing result, obtaining the structured features of each word in the sentence, and obtaining one of three types of distribution result for the sentence according to the structured features of its words: 1) each word in the sentence exists in only one database; 2) one or more words in the sentence do not exist in any database; 3) one or more words in the sentence exist in two or more databases.
4. The method for named entity recognition based on structured information as claimed in claim 3, wherein the sentence preprocessing or sequence labeling according to the distribution of the structured features comprises:
when the distribution result of the structured features belongs to category 1), a rule layer gives each word a corresponding label according to the type of the database containing it; and when the distribution result belongs to category 2) or 3), the sentence is preprocessed, and each word in the sentence is then labeled by a sequence labeling model so as to judge its entity type.
5. The named entity recognition method based on structured information as claimed in claim 4, wherein the sequence labeling model adopts a deep learning method, comprising:
constructing word features by vectorizing each word in the sentence to obtain a word vector;
constructing character features by inputting the word vectors into a convolutional neural network to obtain the character features;
performing matching retrieval on each word in the sentence through the databases to obtain the structured feature of each word, and vectorizing the structured feature to obtain a structured feature vector;
splicing (concatenating) the word vector, the character features and the structured feature vector, and inputting the spliced result into a bidirectional long short-term memory model to obtain a hidden-layer vector;
inputting the hidden-layer vector into a fully connected network to obtain logits, i.e. the unnormalized probability distribution in deep learning; and
inputting the logits into a conditional random field to obtain the entity labels.
6. The method for named entity recognition based on structured information as claimed in claim 5, wherein the preprocessing comprises: stop word recognition, numerical value conversion and wrongly written character correction.
7. A named entity recognition system based on structured information, comprising: a preprocessing module, a structured processing module, a rule layer module and a sequence labeling model module, wherein
the structured processing module is used for performing structured processing on sentences, obtaining processing results, and obtaining structured features according to the processing results; the preprocessing module is used for preprocessing sentences according to the distribution of the structured features; and the rule layer module is used for performing sequence labeling on sentences, through the sequence labeling model module, according to the distribution of the structured features.
8. A computer device comprising a memory, a processor and a computer program stored in the memory and executable by the processor, wherein the processor implements the method of any one of claims 1-6 when executing the computer program.
9. A computer-readable storage medium, being a non-transitory readable storage medium having stored therein a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1-6.
10. A computer program product comprising computer readable code that, when executed by a computer device, causes the computer device to perform the method of any of claims 1-6.
CN202010002138.0A 2020-01-02 2020-01-02 Named entity identification method and system based on structured information Active CN111178080B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010002138.0A CN111178080B (en) 2020-01-02 2020-01-02 Named entity identification method and system based on structured information

Publications (2)

Publication Number Publication Date
CN111178080A (en) 2020-05-19
CN111178080B (en) 2023-07-18

Family

ID=70654364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010002138.0A Active CN111178080B (en) 2020-01-02 2020-01-02 Named entity identification method and system based on structured information

Country Status (1)

Country Link
CN (1) CN111178080B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115212A (en) * 2020-09-29 2020-12-22 中国工商银行股份有限公司 Parameter identification method and device and electronic equipment
CN112861539A (en) * 2021-03-16 2021-05-28 云知声智能科技股份有限公司 Nested named entity recognition method and device, electronic equipment and storage medium
CN114925694A (en) * 2022-05-11 2022-08-19 厦门大学 Method for improving biomedical named body recognition by utilizing entity discrimination information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631948A (en) * 2013-12-11 2014-03-12 北京京东尚科信息技术有限公司 Identifying method of named entities
CN108491373A (en) * 2018-02-01 2018-09-04 北京百度网讯科技有限公司 A kind of entity recognition method and system
US20180365211A1 (en) * 2015-12-11 2018-12-20 Beijing Gridsum Technology Co., Ltd. Method and Device for Recognizing Domain Named Entity
CN109858018A (en) * 2018-12-25 2019-06-07 中国科学院信息工程研究所 A kind of entity recognition method and system towards threat information
CN109960728A (en) * 2019-03-11 2019-07-02 北京市科学技术情报研究所(北京市科学技术信息中心) A kind of open field conferencing information name entity recognition method and system
CN110008473A (en) * 2019-04-01 2019-07-12 云知声(上海)智能科技有限公司 A kind of medical text name Entity recognition mask method based on alternative manner
CN110134969A (en) * 2019-05-27 2019-08-16 北京奇艺世纪科技有限公司 A kind of entity recognition method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈慧炜 (Chen Huiwei): "Research on Text Information Extraction from Criminal Cases", China Excellent Master's Theses Electronic Journal *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115212A (en) * 2020-09-29 2020-12-22 中国工商银行股份有限公司 Parameter identification method and device and electronic equipment
CN112115212B (en) * 2020-09-29 2023-10-03 中国工商银行股份有限公司 Parameter identification method and device and electronic equipment
CN112861539A (en) * 2021-03-16 2021-05-28 云知声智能科技股份有限公司 Nested named entity recognition method and device, electronic equipment and storage medium
CN112861539B (en) * 2021-03-16 2023-12-15 云知声智能科技股份有限公司 Nested named entity recognition method, apparatus, electronic device and storage medium
CN114925694A (en) * 2022-05-11 2022-08-19 厦门大学 Method for improving biomedical named body recognition by utilizing entity discrimination information
CN114925694B (en) * 2022-05-11 2024-06-04 厦门大学 Method for improving biomedical named body recognition by using entity discrimination information

Also Published As

Publication number Publication date
CN111178080B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
CN107622050A (en) Text sequence labeling system and method based on Bi LSTM and CRF
CN111931506B (en) Entity relationship extraction method based on graph information enhancement
CN110727779A (en) Question-answering method and system based on multi-model fusion
CN111325029B (en) Text similarity calculation method based on deep learning integrated model
CN113191148B (en) Rail transit entity identification method based on semi-supervised learning and clustering
CN111783394A (en) Training method of event extraction model, event extraction method, system and equipment
CN111309910A (en) Text information mining method and device
CN111061882A (en) Knowledge graph construction method
CN111178080B (en) Named entity identification method and system based on structured information
CN112434535A (en) Multi-model-based factor extraction method, device, equipment and storage medium
CN113312922B (en) Improved chapter-level triple information extraction method
CN113360582B (en) Relation classification method and system based on BERT model fusion multi-entity information
CN111259153A (en) Attribute-level emotion analysis method of complete attention mechanism
CN109522396B (en) Knowledge processing method and system for national defense science and technology field
CN108763192B (en) Entity relation extraction method and device for text processing
CN111401065A (en) Entity identification method, device, equipment and storage medium
CN111782793A (en) Intelligent customer service processing method, system and equipment
CN112800184A (en) Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction
CN115455202A (en) Emergency event affair map construction method
CN115481635A (en) Address element analysis method and system
CN114881043A (en) Deep learning model-based legal document semantic similarity evaluation method and system
CN111400449A (en) Regular expression extraction method and device
CN114648029A (en) Electric power field named entity identification method based on BiLSTM-CRF model
CN110929518A (en) Text sequence labeling algorithm using overlapping splitting rule
CN113869054A (en) Deep learning-based electric power field project feature identification method

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant