CN116720520A - Method and system for rapid recognition of alias entities in text data - Google Patents
Method and system for rapid recognition of alias entities in text data
- Publication number
- CN116720520A (application CN202310983821.0A)
- Authority
- CN
- China
- Prior art keywords
- entity
- text
- word
- text data
- named entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310983821.0A CN116720520B (zh) | 2023-08-07 | 2023-08-07 | Method and system for rapid recognition of alias entities in text data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310983821.0A CN116720520B (zh) | 2023-08-07 | 2023-08-07 | Method and system for rapid recognition of alias entities in text data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116720520A (zh) | 2023-09-08 |
CN116720520B (zh) | 2023-11-03 |
Family
ID=87871938
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310983821.0A Active CN116720520B (zh) | Method and system for rapid recognition of alias entities in text data | 2023-08-07 | 2023-08-07 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116720520B (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117312578A (zh) * | 2023-11-28 | 2023-12-29 | 烟台云朵软件有限公司 | Method and system for constructing an intangible cultural heritage inheritance graph |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109753653A (zh) * | 2018-12-25 | 2019-05-14 | 金蝶软件(中国)有限公司 | Entity name recognition method and apparatus, computer device, and storage medium |
CN110472248A (zh) * | 2019-08-22 | 2019-11-19 | 广东工业大学 | Method for recognizing named entities in Chinese text |
CN112257449A (zh) * | 2020-11-13 | 2021-01-22 | 腾讯科技(深圳)有限公司 | Named entity recognition method and apparatus, computer device, and storage medium |
CN112632997A (zh) * | 2020-12-14 | 2021-04-09 | 河北工程大学 | Chinese entity recognition method based on fusion of BERT and Word2Vec vectors |
CN113051900A (zh) * | 2021-04-30 | 2021-06-29 | 中国平安人寿保险股份有限公司 | Synonym recognition method and apparatus, computer device, and storage medium |
CN113065349A (zh) * | 2021-03-15 | 2021-07-02 | 国网河北省电力有限公司 | Named entity recognition method based on conditional random fields |
CN114372466A (zh) * | 2021-12-27 | 2022-04-19 | 军事科学院系统工程研究院系统总体研究所 | Alias entity recognition method and apparatus, computer device, medium, and program product |
US20230030086A1 (en) * | 2021-07-28 | 2023-02-02 | OntogenAI, Inc. | System and method for generating ontologies and retrieving information using the same |
CN116055472A (zh) * | 2023-02-07 | 2023-05-02 | 烟台云朵软件有限公司 | Multi-terminal unified service access system and method |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109753653A (zh) * | 2018-12-25 | 2019-05-14 | 金蝶软件(中国)有限公司 | Entity name recognition method and apparatus, computer device, and storage medium |
CN110472248A (zh) * | 2019-08-22 | 2019-11-19 | 广东工业大学 | Method for recognizing named entities in Chinese text |
CN112257449A (zh) * | 2020-11-13 | 2021-01-22 | 腾讯科技(深圳)有限公司 | Named entity recognition method and apparatus, computer device, and storage medium |
CN112632997A (zh) * | 2020-12-14 | 2021-04-09 | 河北工程大学 | Chinese entity recognition method based on fusion of BERT and Word2Vec vectors |
CN113065349A (zh) * | 2021-03-15 | 2021-07-02 | 国网河北省电力有限公司 | Named entity recognition method based on conditional random fields |
CN113051900A (zh) * | 2021-04-30 | 2021-06-29 | 中国平安人寿保险股份有限公司 | Synonym recognition method and apparatus, computer device, and storage medium |
US20230030086A1 (en) * | 2021-07-28 | 2023-02-02 | OntogenAI, Inc. | System and method for generating ontologies and retrieving information using the same |
CN114372466A (zh) * | 2021-12-27 | 2022-04-19 | 军事科学院系统工程研究院系统总体研究所 | Alias entity recognition method and apparatus, computer device, medium, and program product |
CN116055472A (zh) * | 2023-02-07 | 2023-05-02 | 烟台云朵软件有限公司 | Multi-terminal unified service access system and method |
Non-Patent Citations (2)
Title |
---|
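TIANYUE CHEN et al.: "RoBERT-Agr: An Entity Relationship Extraction Model of Massive Agricultural Text Based on the RoBERTa and CRF Algorithm", 2023 IEEE 8TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (ICBDA), pages 113-120 * |
Fan Tao (范涛) et al.: "Research on multimodal named entity recognition for local chronicles based on deep transfer learning", Journal of the China Society for Scientific and Technical Information (情报学报), vol. 41, no. 4, pages 412-423 * |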
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
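CN117312578A (zh) * | 2023-11-28 | 2023-12-29 | 烟台云朵软件有限公司 | Method and system for constructing an intangible cultural heritage inheritance graph |
CN117312578B (zh) * | 2023-11-28 | 2024-02-23 | 烟台云朵软件有限公司 | Method and system for constructing an intangible cultural heritage inheritance graph |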
Also Published As
Publication number | Publication date |
---|---|
CN116720520B (zh) | 2023-11-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
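CN110298037B (zh) | Text recognition method using convolutional neural network matching based on an enhanced attention mechanism | |
CN112115238B (zh) | Question answering method and system based on BERT and a knowledge base | |
CN109271529B (zh) | Method for constructing a dual-script knowledge graph for Cyrillic Mongolian and Traditional Mongolian | |
CN107943784B (zh) | Relation extraction method based on generative adversarial networks | |
CN105404632B (zh) | System and method for sequence labeling of biomedical text based on deep neural networks | |
CN110232192A (zh) | Method and apparatus for named entity recognition of electric power terminology | |
CN111209401A (zh) | System and method for sentiment polarity classification of online public opinion text | |
CN110727779A (zh) | Question answering method and system based on multi-model fusion | |
CN111639171A (zh) | Knowledge graph question answering method and apparatus | |
CN111931506A (zh) | Entity relation extraction method based on graph information enhancement | |
CN109960728A (zh) | Method and system for named entity recognition of open-domain conference information | |
CN115858758A (zh) | Intelligent customer service knowledge graph system with recognition of multiple kinds of unstructured data | |
CN111274804A (zh) | Case information extraction method based on named entity recognition | |
CN116720520B (zh) | Method and system for rapid recognition of alias entities in text data | |
CN116151256A (zh) | Few-shot named entity recognition method based on multi-task and prompt learning | |
CN112860898B (zh) | Short text box clustering method, system, device, and storage medium | |
CN114298035A (zh) | Text recognition and desensitization method and system | |
CN116127090A (zh) | Method for constructing an aviation system knowledge graph based on fusion and semi-supervised information extraction | |
CN114064901B (zh) | Book review text classification method based on knowledge-graph word sense disambiguation | |
CN115292490A (zh) | Analysis algorithm for policy interpretation semantics | |
CN113901224A (zh) | Method, system, and apparatus for training a classified-text recognition model based on knowledge distillation | |
CN117454898A (zh) | Method and apparatus for standardized output of legal entities from input text | |
CN112307756A (zh) | Chinese word segmentation method based on Bi-LSTM and character-word fusion | |
CN116186067A (zh) | Method and device for storing and querying industrial data tables | |
CN113868389B (zh) | Data query method and apparatus based on natural language text, and computer device |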
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | CB03 | Change of inventor or designer information | Inventor after: Huang Xin, Dai Pengfei, Zhou Chunjie, Zhang Zhen, Wang Qingwei. Inventor before: Dai Pengfei, Zhou Chunjie, Zhang Zhen, Wang Qingwei. |
 | TR01 | Transfer of patent right | Effective date of registration: 2023-11-21. Address after: 264000 Maker Space, Floor 1, Building 4, No. 1, Lanhai Road, High tech Zone, Yantai, Shandong. Patentee after: Yantai cloud Software Co.,Ltd.; Hulunbuir Cultural and Tourism Development Center. Address before: 264000 Maker Space, Floor 1, Building 4, No. 1, Lanhai Road, High tech Zone, Yantai, Shandong. Patentee before: Yantai cloud Software Co.,Ltd. |