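CN113239196A - Entity classification model training and prediction method based on digital humanities - Google Patents
Entity classification model training and prediction method based on digital humanities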
- Publication number: CN113239196A
- Authority: CN (China)
- Prior art keywords: entity, vector, entities, classification model, training
- Prior art date
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Concepts
Concept | Category | Sections | Count |
---|---|---|---|
method | Methods | title, claims, abstract, description | 25 |
classification model | Methods | title, claims, abstract, description | 23 |
training | Methods | title, claims, abstract, description | 20 |
vector | Substances | claims, abstract, description | 34 |
artificial neural network | Methods | claims, description | 3 |
bidirectional effect | Effects | claims, description | 2 |
diagram | Methods | description | 6 |
research | Methods | description | 4 |
construction | Methods | description | 3 |
matrix material | Substances | description | 3 |
chemical reaction | Methods | description | 2 |
deep learning | Methods | description | 2 |
engineering process | Methods | description | 2 |
extraction | Methods | description | 2 |
irregular | Effects | description | 2 |
neural network model | Methods | description | 2 |
testing method | Methods | description | 2 |
cleaning | Methods | description | 1 |
complement effect | Effects | description | 1 |
deep learning model | Methods | description | 1 |
function | Effects | description | 1 |
information processing | Effects | description | 1 |
labelling | Methods | description | 1 |
machine learning | Methods | description | 1 |
natural language processing | Methods | description | 1 |
pre-processing | Methods | description | 1 |
processing | Methods | description | 1 |
screening | Methods | description | 1 |
short-term memory | Effects | description | 1 |
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110515349.9A CN113239196B (zh) | 2021-05-12 | 2021-05-12 | Entity classification model training and prediction method based on digital humanities |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110515349.9A CN113239196B (zh) | 2021-05-12 | 2021-05-12 | Entity classification model training and prediction method based on digital humanities |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113239196A (zh) | 2021-08-10 |
CN113239196B (zh) | 2024-07-09 |
Family
ID=77133947
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110515349.9A Active CN113239196B (zh) | Entity classification model training and prediction method based on digital humanities | 2021-05-12 | 2021-05-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113239196B (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113792543A (zh) * | 2021-09-14 | 2021-12-14 | 安徽咪鼠科技有限公司 | Writing method, device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
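CN109992782A (zh) * | 2019-04-02 | 2019-07-09 | 深圳市华云中盛科技有限公司 | Legal document named entity recognition method, device and computer equipment |
CN110807098A (zh) * | 2019-09-24 | 2020-02-18 | 武汉智美互联科技有限公司 | DGA domain name detection method based on BiRNN deep learning |
CN111324742A (zh) * | 2020-02-10 | 2020-06-23 | 同方知网(北京)技术有限公司 | Method for constructing a digital humanities knowledge graph |
CN112487817A (zh) * | 2020-12-14 | 2021-03-12 | 北京明略软件系统有限公司 | Named entity recognition model training method, sample labeling method, device and equipment |
CN112613316A (zh) * | 2020-12-31 | 2021-04-06 | 北京师范大学 | Method and system for generating an ancient Chinese annotation model |
CN112765984A (zh) * | 2020-12-31 | 2021-05-07 | 平安资产管理有限责任公司 | Named entity recognition method, device, computer equipment and storage medium |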
Non-Patent Citations (1)
Title |
---|
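Li Bin et al.: "Research on annotation and visualization of ancient document texts from a digital humanities perspective: taking the Zuo Zhuan knowledge base as an example", 《大学图书馆学报》, no. 5, 31 October 2020 (2020-10-31), pages 72 * |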
Also Published As
Publication number | Publication date |
---|---|
CN113239196B (zh) | 2024-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
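CN109829159B (zh) | Integrated automatic lexical analysis method and system for ancient Chinese text | |
CN112231477B (zh) | Text classification method based on an improved capsule network | |
CN111401061A (zh) | Method for identifying opinion sentences in case-related news based on BERT and BiLSTM-Attention | |
WO2017162134A1 (zh) | Electronic device and method for text processing | |
CN109189925A (zh) | Word vector model based on pointwise mutual information and CNN-based text classification method | |
CN111368086A (zh) | Sentiment classification method for opinion sentences in case-related news based on a CNN-BiLSTM+attention model | |
CN109670182B (zh) | Classification method for massive extremely short texts based on text hash vectorized representation | |
CN112732864B (zh) | Document retrieval method based on dense pseudo-query vector representation | |
CN109657061B (zh) | Ensemble classification method for massive multi-word short texts | |
CN108595643A (zh) | Text feature extraction and classification method based on a multi-class node convolutional recurrent network | |
CN111400494B (zh) | Sentiment analysis method based on GCN-Attention | |
CN111008266A (zh) | Training method and device for a text analysis model, and text analysis method and device | |
CN110826298B (zh) | Sentence encoding method used in an intelligent auxiliary security classification system | |
CN114298035A (zh) | Text recognition and desensitization method and system | |
Kozhevnikov et al. | Research of the text data vectorization and classification algorithms of machine learning | |
CN114372475A (zh) | Sentiment analysis method and system for online public opinion based on the RoBERTa model | |
CN112100413A (zh) | Cross-modal hash retrieval method | |
CN112489689A (zh) | Cross-database speech emotion recognition method and device based on multi-scale difference adversarial learning | |
CN109918507B (zh) | Improved text classification method based on TextCNN | |
CN113191150B (zh) | Named entity recognition method for Chinese medical text with multi-feature fusion | |
CN117454220A (zh) | Data grading and classification method, device, equipment and storage medium | |
CN113673252A (zh) | Automatic join recommendation method for data tables based on field semantics | |
CN113239196A (zh) | Entity classification model training and prediction method based on digital humanities | |
CN113190681B (zh) | Fine-grained text classification method based on capsule network masked memory attention | |
CN112926323B (zh) | Chinese named entity recognition method based on multi-level residual convolution and attention mechanism |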
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address |
Address after: Room B201, B202, B203, B205, B206, B207, B208, B209, B210, 2nd Floor, Building B-2, Zhongguancun Dongsheng Science and Technology Park, No. 66 Xixiaokou Road, Haidian District, Beijing (Dongsheng area)
Patentee after: Tongfangzhiwang Digital Technology Co.,Ltd.
Country or region after: China
Patentee after: TONGFANG KNOWLEDGE NETWORK (BEIJING) TECHNOLOGY Co.,Ltd.
Address before: Room B201, B202, B203, B205, B206, B207, B208, B209, B210, 2nd Floor, Building B-2, Zhongguancun Dongsheng Science and Technology Park, No. 66 Xixiaokou Road, Haidian District, Beijing (Dongsheng area)
Patentee before: TONGFANG KNOWLEDGE NETWORK DIGITAL PUBLISHING TECHNOLOGY CO.,LTD.
Country or region before: China
Patentee before: TONGFANG KNOWLEDGE NETWORK (BEIJING) TECHNOLOGY Co.,Ltd. |