CN109684631A - Named entity extraction method, device and medium - Google Patents
- Publication number
- CN109684631A (application number CN201811516849.9A)
- Authority
- CN
- China
- Prior art keywords
- entity
- text
- name
- named entity
- expression formula
- Prior art date
- Legal status
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Coarse type | Second identifier |
---|---|
Organization name | ORG |
Person name | PER |
Place name | LOC |
Currency | CUR |
Date | TIM |
…… | …… |
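The table above pairs each coarse entity type with a short tag (the "second identifier"). As a minimal sketch of how such a mapping might be held in code, the Python snippet below stores the five listed rows in a dictionary. The names `SECOND_IDENTIFIER`, `COARSE_TYPE`, and `label_entity` are assumptions made for illustration and do not come from the patent; the table is elided in the source, so the mapping is incomplete.

```python
from typing import Tuple

# Second identifiers for coarse entity types, copied from the table above.
# The source table ends with "……", so further types exist but are not listed here.
SECOND_IDENTIFIER = {
    "organization name": "ORG",  # 机构名
    "person name": "PER",        # 人名
    "place name": "LOC",         # 地名
    "currency": "CUR",           # 货币
    "date": "TIM",               # 日期
}

# Reverse lookup: second identifier -> coarse type.
COARSE_TYPE = {tag: name for name, tag in SECOND_IDENTIFIER.items()}


def label_entity(text: str, coarse_type: str) -> Tuple[str, str]:
    """Hypothetical helper: pair an extracted span with its second identifier."""
    return text, SECOND_IDENTIFIER[coarse_type]


print(label_entity("Beijing", "place name"))
print(COARSE_TYPE["TIM"])
```

A plain dictionary keeps the lookup explicit and raises `KeyError` for any coarse type that the elided part of the table would have covered.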
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811516849.9A CN109684631A (zh) | 2018-12-12 | 2018-12-12 | Named entity extraction method, device and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811516849.9A CN109684631A (zh) | 2018-12-12 | 2018-12-12 | Named entity extraction method, device and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109684631A (zh) | 2019-04-26 |
Family
ID=66187199
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811516849.9A Pending CN109684631A (zh) | Named entity extraction method, device and medium | 2018-12-12 | 2018-12-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109684631A (zh) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
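CN111415751A (zh) * | 2020-03-19 | 2020-07-14 | Beijing Jiahe Haisen Health Technology Co., Ltd. | Topic segmentation method, device and system for electronic medical record data |
CN111782907A (zh) * | 2020-07-01 | 2020-10-16 | Beijing Zhiyin Zhihui Technology Co., Ltd. | News classification method, device and electronic equipment |
WO2021051872A1 (zh) * | 2019-09-18 | 2021-03-25 | Ping An Technology (Shenzhen) Co., Ltd. | Entity recognition method, apparatus, device and computer-readable storage medium |
CN112800767A (zh) * | 2021-01-31 | 2021-05-14 | Unisound Intelligent Technology Co., Ltd. | Method and system for checking basic patient information in medical record text |
CN113158677A (zh) * | 2021-05-13 | 2021-07-23 | Zhujian Intelligent Technology (Shanghai) Co., Ltd. | Named entity recognition method and system |
WO2024124409A1 (zh) * | 2022-12-13 | 2024-06-20 | Hangzhou DtDream Technology Co., Ltd. | Regular expression generation method and apparatus, electronic device and storage medium |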
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060047500A1 (en) * | 2004-08-31 | 2006-03-02 | Microsoft Corporation | Named entity recognition using compiler methods |
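CN105138515A (zh) * | 2015-09-02 | 2015-12-09 | Baidu Online Network Technology (Beijing) Co., Ltd. | Named entity recognition method and device |
CN107608949A (zh) * | 2017-10-16 | 2018-01-19 | Beijing Ultrapower Software Co., Ltd. | Text information extraction method and device based on a semantic model |
CN107729480A (zh) * | 2017-10-16 | 2018-02-23 | Beijing Ultrapower Software Co., Ltd. | Text information extraction method and device for a limited region |
CN108647194A (zh) * | 2018-04-28 | 2018-10-12 | Beijing Ultrapower Software Co., Ltd. | Information extraction method and device |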
- 2018-12-12: CN application CN201811516849.9A, published as CN109684631A (zh), status: active, Pending
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
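WO2021051872A1 (zh) * | 2019-09-18 | 2021-03-25 | Ping An Technology (Shenzhen) Co., Ltd. | Entity recognition method, apparatus, device and computer-readable storage medium |
CN111415751A (zh) * | 2020-03-19 | 2020-07-14 | Beijing Jiahe Haisen Health Technology Co., Ltd. | Topic segmentation method, device and system for electronic medical record data |
CN111415751B (zh) * | 2020-03-19 | 2023-08-08 | Beijing Jiahe Haisen Health Technology Co., Ltd. | Topic segmentation method, device and system for electronic medical record data |
CN111782907A (zh) * | 2020-07-01 | 2020-10-16 | Beijing Zhiyin Zhihui Technology Co., Ltd. | News classification method, device and electronic equipment |
CN111782907B (zh) * | 2020-07-01 | 2024-03-01 | Beijing Zhiyin Zhihui Technology Co., Ltd. | News classification method, device and electronic equipment |
CN112800767A (zh) * | 2021-01-31 | 2021-05-14 | Unisound Intelligent Technology Co., Ltd. | Method and system for checking basic patient information in medical record text |
CN112800767B (zh) * | 2021-01-31 | 2023-11-21 | Unisound Intelligent Technology Co., Ltd. | Method and system for checking basic patient information in medical record text |
CN113158677A (zh) * | 2021-05-13 | 2021-07-23 | Zhujian Intelligent Technology (Shanghai) Co., Ltd. | Named entity recognition method and system |
CN113158677B (zh) * | 2021-05-13 | 2023-04-07 | Zhujian Intelligent Technology (Shanghai) Co., Ltd. | Named entity recognition method and system |
WO2024124409A1 (zh) * | 2022-12-13 | 2024-06-20 | Hangzhou DtDream Technology Co., Ltd. | Regular expression generation method and apparatus, electronic device and storage medium |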
Similar Documents
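Publication | Title | Publication Date |
---|---|---|
CN109684631A (zh) | Named entity extraction method, device and medium | |
CN111125331B (zh) | Semantic recognition method, device, electronic equipment and computer-readable storage medium | |
CN109902307B (zh) | Named entity recognition method, and training method and device for a named entity recognition model | |
US11055327B2 (en) | Unstructured data parsing for structured information | |
CN110427623A (zh) | Semi-structured document knowledge extraction method, device, electronic equipment and storage medium | |
US20180053107A1 (en) | Aspect-based sentiment analysis | |
CN112149421A (zh) | Entity recognition method for the software programming domain based on BERT embedding | |
CN111222305A (zh) | Information structuring method and device | |
CN104572625A (zh) | Named entity recognition method | |
CN111723569A (zh) | Event extraction method, device and computer-readable storage medium | |
CN113743111B (zh) | Financial risk prediction method and device based on text pre-training and multi-task learning | |
CN113779358A (zh) | Event detection method and system | |
CN110852079A (zh) | Automatic document table-of-contents generation method, device and computer-readable storage medium | |
CN112905868A (zh) | Event extraction method, device, equipment and storage medium | |
CN116501898B (zh) | Financial text event extraction method and device suitable for few-shot and biased data | |
CN110825827A (zh) | Entity relation recognition model training method, entity relation recognition method, and device | |
CN107590119B (zh) | Person attribute information extraction method and device | |
CN111178080B (zh) | Named entity recognition method and system based on structured information | |
CN114153978A (zh) | Model training method, information extraction method, device, equipment and storage medium | |
CN115455202A (zh) | Method for constructing an event logic graph for emergency events | |
CN114218951B (zh) | Training method for an entity recognition model, entity recognition method, and device | |
CN109344390A (zh) | Khmer entity recognition method based on a multi-feature neural network | |
Sarkar | A hidden Markov model based system for entity extraction from social media English text at FIRE 2015 | |
CN113139558A (zh) | Method and device for determining multi-level classification labels of an article | |
CN110929521B (zh) | Model generation method, entity recognition method, device and storage medium | |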
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20190426; Assignee: DINFO (BEIJING) SCIENCE DEVELOPMENT Co.,Ltd.; Assignor: ULTRAPOWER SOFTWARE Co.,Ltd.; Contract record no.: X2019990000214; Denomination of invention: Named entity extraction method and device and medium; License type: Exclusive License; Record date: 20191127 |
CB02 | Change of applicant information | Address after: Room 818, 8/F, 34 Haidian Street, Haidian District, Beijing 100080; Applicant after: ULTRAPOWER SOFTWARE Co.,Ltd.; Address before: Room 601, Block A, Wanliu New Building, No. 28 Wanquanzhuang Road, Haidian District, Beijing 100089; Applicant before: ULTRAPOWER SOFTWARE Co.,Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190426 |