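CN111709463A - Feature selection method based on index collaborative measurement - Google Patents
Feature selection method based on index collaborative measurement (基于指数协同度量的特征选择方法)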
- Publication number
- CN111709463A (application number CN202010474513.1A)
- Authority
- CN
- China
- Prior art keywords
- documents
- data set
- term
- category
- class
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Concept | Type | Sections | Count |
---|---|---|---|
selection method | Methods | title, claims, abstract, description | 15 |
measurement | Methods | title, claims, abstract, description | 13 |
algorithm | Methods | claims, abstract, description | 35 |
method | Methods | claims, abstract, description | 16 |
preprocessing | Methods | claims, abstract, description | 10 |
testing method | Methods | claims, abstract, description | 8 |
training | Methods | claims, abstract, description | 8 |
cross-validation | Methods | claims, abstract, description | 4 |
evaluation | Methods | claims, abstract, description | 4 |
calculation method | Methods | claims, description | 7 |
support-vector machine | Methods | claims, description | 7 |
defect | Effects | abstract, description | 2 |
chi-square test | Methods | description | 8 |
classification model | Methods | description | 4 |
filtration | Methods | description | 4 |
analytical method | Methods | description | 2 |
emotion | Effects | description | 2 |
beneficial effect | Effects | description | 1 |
classification algorithm | Methods | description | 1 |
construction | Methods | description | 1 |
data mining | Methods | description | 1 |
detection method | Methods | description | 1 |
development | Methods | description | 1 |
effects | Effects | description | 1 |
engineering process | Methods | description | 1 |
experimental method | Methods | description | 1 |
measurement method | Methods | description | 1 |
research | Methods | description | 1 |
sequencing technique | Methods | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010474513.1A CN111709463B (zh) | 2020-05-29 | 2020-05-29 | Feature selection method based on index collaborative measurement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010474513.1A CN111709463B (zh) | 2020-05-29 | 2020-05-29 | Feature selection method based on index collaborative measurement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111709463A (zh) | 2020-09-25 |
CN111709463B (zh) | 2024-02-02 |
Family
ID=72538867
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010474513.1A Active CN111709463B (zh) | Feature selection method based on index collaborative measurement | 2020-05-29 | 2020-05-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111709463B (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113792141A (zh) * | 2021-08-20 | 2021-12-14 | Xi'an University of Technology | Feature selection method based on covariance measurement factor |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040059697A1 (en) * | 2002-09-24 | 2004-03-25 | Forman George Henry | Feature selection for two-class classification systems |
US20100223276A1 (en) * | 2007-03-27 | 2010-09-02 | Faleh Jassem Al-Shameri | Automated Generation of Metadata for Mining Image and Text Data |
CN111144106A (zh) * | 2019-12-20 | 2020-05-12 | Shandong University of Science and Technology | Two-stage text feature selection method for imbalanced data sets |
- 2020-05-29: CN application CN202010474513.1A filed; patent CN111709463B (zh), status Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040059697A1 (en) * | 2002-09-24 | 2004-03-25 | Forman George Henry | Feature selection for two-class classification systems |
US20100223276A1 (en) * | 2007-03-27 | 2010-09-02 | Faleh Jassem Al-Shameri | Automated Generation of Metadata for Mining Image and Text Data |
CN111144106A (zh) * | 2019-12-20 | 2020-05-12 | Shandong University of Science and Technology | Two-stage text feature selection method for imbalanced data sets |
Non-Patent Citations (2)
Title |
---|
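Yang Fengqin et al.: "Feature selection method based on paragraph and category distribution", Journal of Chinese Computer Systems (《小型微型计算机系统》) * |
Wang Tingting et al.: "Optimization of the LDA model and selection of the number of topics: a case study of scientific and technical literature", Data Analysis and Knowledge Discovery (《数据分析与知识发现》) * |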
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
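CN113792141A (zh) * | 2021-08-20 | 2021-12-14 | Xi'an University of Technology | Feature selection method based on covariance measurement factor |
CN113792141B (zh) * | 2021-08-20 | 2024-07-05 | Guangdong Yunxi Technology Co., Ltd. (广东云熹科技有限公司) | Feature selection method based on covariance measurement factor |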
Also Published As
Publication number | Publication date |
---|---|
CN111709463B (zh) | 2024-02-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
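CN107391772B (zh) | | A text classification method based on naive Bayes |
CN105512311B (zh) | | An adaptive feature selection method based on chi-square statistics |
CN105975518B (zh) | | Text classification system and method using expected cross entropy feature selection based on information entropy |
CN111709439B (zh) | | Feature selection method based on a term frequency deviation rate factor |
CN103995876A (zh) | | A text classification method based on chi-square statistics and the SMO algorithm |
TW201737118A (zh) | | Method and device for web page text classification, and method and device for web page text recognition |
CN110826618A (zh) | | A personal credit risk assessment method based on random forest |
CN109271517B (zh) | | IG TF-IDF text feature vector generation and text classification method |
CN103886108B (zh) | | A feature selection and weight calculation method for imbalanced text sets |
Baygın | | Classification of text documents based on Naive Bayes using N-Gram features |
CN109376235B (zh) | | Feature selection method based on document-level term frequency reordering |
CN111144106A (zh) | | Two-stage text feature selection method for imbalanced data sets |
US11960521B2 (en) | | Text classification system based on feature selection and method thereof |
CN113032573B (zh) | | A large-scale text classification method and system combining topic semantics with the TF*IDF algorithm |
CN111709463B (zh) | | Feature selection method based on index collaborative measurement |
CN110348497B (zh) | | A text representation method based on WT-GloVe word vector construction |
JP5929532B2 (ja) | | Event detection device, event detection method, and event detection program |
CN116881451A (zh) | | Text classification method based on machine learning |
CN114996446B (zh) | | A text classification method, device, and storage medium |
CN113657106B (zh) | | Feature selection method based on normalized term frequency weights |
CN111382273B (zh) | | A text classification method using feature selection based on attraction factors |
Rao et al. | | A Framework for Hate Speech Detection using Different ML Algorithms |
CN111159410A (zh) | | A text sentiment classification method, system, device, and storage medium |
CN113792141A (zh) | | Feature selection method based on covariance measurement factor |
Yuan et al. | | An empirical study of filter-based feature selection algorithms using noisy training data |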
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 2023-11-07; Address after: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province; Applicant after: Shenzhen Wanzhida Technology Co., Ltd.; Address before: 710048 Shaanxi Province, Xi'an, Beilin District, Jinhua Road No. 5; Applicant before: Xi'an University of Technology |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2024-04-29; Address after: No. 125, 2nd Floor, Unit 7, Building 62, Huangang Road, Tiexi District, Anshan City, Liaoning Province, 114000; Patentee after: Wang Yang; Country or region after: China; Patentee after: Che Chengwei; Address before: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province; Patentee before: Shenzhen Wanzhida Technology Co., Ltd.; Country or region before: China |
TR01 | Transfer of patent right | Effective date of registration: 2024-07-31; Address after: Room 618, 6th Floor, Building 1, No. 5 Beihuangmuchang North Street, Tongzhou District, Beijing 101100; Patentee after: Beijing Lanqiao Technology Co., Ltd.; Country or region after: China; Address before: No. 125, 2nd Floor, Unit 7, Building 62, Huangang Road, Tiexi District, Anshan City, Liaoning Province, 114000; Patentee before: Wang Yang; Country or region before: China; Patentee before: Che Chengwei |