CN113657106B - Feature selection method based on normalized term frequency weight - Google Patents
Feature selection method based on normalized term frequency weight
- Publication number
- CN113657106B
- Authority
- CN
- China
- Prior art keywords
- feature
- class
- document
- word
- documents
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Concepts
Concept | Type | Sections | Count |
---|---|---|---|
selection method | Methods | title, claims, abstract, description | 13 |
method | Methods | claims, abstract, description | 10 |
normalization | Methods | claims, abstract, description | 8 |
training | Methods | claims, description | 32 |
testing method | Methods | claims, description | 18 |
processing | Methods | claims, description | 10 |
support-vector machine | Methods | claims, description | 8 |
effects | Effects | claims, description | 4 |
evaluation | Methods | claims, description | 4 |
pre-processing | Methods | claims, description | 4 |
classification model | Methods | claims, description | 3 |
cross-validation | Methods | claims, description | 3 |
pruning | Methods | claims, description | 3 |
algorithm | Methods | abstract, description | 20 |
function | Effects | abstract, description | 9 |
extract | Substances | abstract | 1 |
chi-square test | Methods | description | 2 |
engineering process | Methods | description | 2 |
filtration | Methods | description | 2 |
approach | Methods | description | 1 |
beneficial effect | Effects | description | 1 |
calculation method | Methods | description | 1 |
classification algorithm | Methods | description | 1 |
comparative effect | Effects | description | 1 |
data mining | Methods | description | 1 |
experimental method | Methods | description | 1 |
information processing | Effects | description | 1 |
machine learning | Methods | description | 1 |
material | Substances | description | 1 |
measurement | Methods | description | 1 |
natural language processing | Methods | description | 1 |
screening | Methods | description | 1 |
waste material | Substances | description | 1 |
Abstract
Description
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110758265.8A (CN113657106B) | 2021-07-05 | | Feature selection method based on normalized term frequency weight |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110758265.8A (CN113657106B) | 2021-07-05 | | Feature selection method based on normalized term frequency weight |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113657106A | 2021-11-16 |
CN113657106B | 2024-06-21 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
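CN105224695A * | 2015-11-12 | 2016-01-06 | Central South University | Text feature quantification method and device based on information entropy, and text classification method and device |
CN108108462A * | 2017-12-29 | 2018-06-01 | Henan University of Science and Technology | Text sentiment analysis method based on feature classification |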
Similar Documents
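Publication | Title |
---|---|
CN107193959B | Plain-text-oriented enterprise entity classification method |
CN108363810B | Text classification method and device |
CN108898479B | Method and device for constructing a credit evaluation model |
CN104750844B | TF-IGM-based text feature vector generation method and device, and text classification method and device |
CN107391772B | Text classification method based on naive Bayes |
WO2022126810A1 | Text clustering method |
CN103995876A | Text classification method based on chi-square statistics and the SMO algorithm |
CN105975518B | Text classification system and method using expected cross entropy feature selection based on information entropy |
CN109271517B | IG TF-IDF text feature vector generation and text classification method |
CN104834940A | Support vector machine-based disease classification method for medical imaging examinations |
CN102622373A | Statistical text classification system and method based on the TF*IDF algorithm |
CN111143842A | Malicious code detection method and system |
CN111144106B | Two-stage text feature selection method for imbalanced datasets |
CN111709439B | Feature selection method based on term frequency deviation rate factor |
CN109522544A | Sentence vector calculation method, text classification method, and system based on the chi-square test |
CN106570076A | Computer text classification system |
CN111539451A | Sample data optimization method, apparatus, device, and storage medium |
CN111309577A | Method for constructing an execution time prediction model for Spark batch applications |
CN109376235B | Feature selection method based on document-level term frequency reordering |
CN113626604A | Web page text classification system based on the maximum margin criterion |
CN113657106B | Feature selection method based on normalized term frequency weight |
CN106991171A | Topic discovery method based on a smart campus information service platform |
CN116204647A | Method and device for establishing a target contrastive learning model and for text clustering |
CN113792141A | Feature selection method based on covariance measure factor |
CN113657106A | Feature selection method based on normalized term frequency weight |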
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 2024-04-09. Address after: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province. Applicant after: Shenzhen Wanzhida Technology Co.,Ltd. Country or region after: China. Address before: 710048 Shaanxi province Xi'an Beilin District Jinhua Road No. 5. Applicant before: Xi'an University of Technology. Country or region before: China. |
TA01 | Transfer of patent application right | Effective date of registration: 2024-05-24. Address after: Room 304, 3rd Floor, Building 21, Zone 2, Tiantong Zhongyuan, Dongxiaokou Town, Changping District, Beijing, 100000. Applicant after: It's Also A Pleasure For Youpeng (Beijing) Technology Co.,Ltd. Country or region after: China. Address before: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province. Applicant before: Shenzhen Wanzhida Technology Co.,Ltd. Country or region before: China. |
GR01 | Patent grant | |