CN111709463B - Feature selection method based on exponential collaborative measurement - Google Patents
- Publication number
- CN111709463B
- Authority
- CN
- China
- Prior art keywords
- documents
- category
- data set
- term
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Concept | Type | Sections | Count |
---|---|---|---|
selection method | Methods | title, claims, abstract, description | 15 |
measurement | Methods | title, claims, abstract, description | 11 |
algorithm | Methods | claims, abstract, description | 31 |
method | Methods | claims, abstract, description | 17 |
preprocessing | Methods | claims, abstract, description | 10 |
testing method | Methods | claims, abstract, description | 8 |
training | Methods | claims, abstract, description | 8 |
cross-validation | Methods | claims, abstract, description | 4 |
evaluation | Methods | claims, abstract, description | 4 |
calculation method | Methods | claims, description | 7 |
support-vector machine | Methods | claims, description | 7 |
defect | Effects | abstract, description | 2 |
chi-square test | Methods | description | 8 |
classification model | Methods | description | 4 |
filtration | Methods | description | 3 |
analytical method | Methods | description | 2 |
beneficial effect | Effects | description | 2 |
emotion | Effects | description | 2 |
classification algorithm | Methods | description | 1 |
construction | Methods | description | 1 |
data mining | Methods | description | 1 |
detection method | Methods | description | 1 |
development | Methods | description | 1 |
developmental process | Effects | description | 1 |
effects | Effects | description | 1 |
technology | Methods | description | 1 |
measurement method | Methods | description | 1 |
research | Methods | description | 1 |
segmentation | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010474513.1A CN111709463B (zh) | 2020-05-29 | 2020-05-29 | Feature selection method based on exponential collaborative measurement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010474513.1A CN111709463B (zh) | 2020-05-29 | 2020-05-29 | Feature selection method based on exponential collaborative measurement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111709463A (zh) | 2020-09-25 |
CN111709463B (zh) | 2024-02-02 |
Family
ID=72538867
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202010474513.1A CN111709463B (zh) | Active | 2020-05-29 | 2020-05-29 | Feature selection method based on exponential collaborative measurement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111709463B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113792141A (zh) * | 2021-08-20 | 2021-12-14 | Xi'an University of Technology | Feature selection method based on covariance measurement factor |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7415445B2 (en) * | 2002-09-24 | 2008-08-19 | Hewlett-Packard Development Company, L.P. | Feature selection for two-class classification systems |
US8145677B2 (en) * | 2007-03-27 | 2012-03-27 | Faleh Jassem Al-Shameri | Automated generation of metadata for mining image and text data |
CN111144106B (zh) * | 2019-12-20 | 2023-05-02 | Shandong University of Science and Technology | Two-stage text feature selection method for imbalanced data sets |
Also Published As
Publication number | Publication date |
---|---|
CN111709463A (zh) | 2020-09-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
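CN107391772B (zh) | | Text classification method based on naive Bayes |
CN110287328B (zh) | | Text classification method, apparatus, device, and computer-readable storage medium |
CN103886108B (zh) | | Feature selection and weight calculation method for imbalanced text sets |
CN111709439B (zh) | | Feature selection method based on word frequency deviation rate factor |
CN103425777B (zh) | | Intelligent SMS classification and search method based on improved Bayesian classification |
CN109271517B (zh) | | IG TF-IDF text feature vector generation and text classification method |
CN103995876A (zh) | | Text classification method based on chi-square statistics and the SMO algorithm |
CN108304371B (zh) | | Method, apparatus, computer device, and storage medium for hot content mining |
CN109145114B (zh) | | Social network event detection method based on the Kleinberg online state machine |
CN107729520B (zh) | | File classification method, apparatus, computer device, and computer-readable medium |
CN107294834A (zh) | | Method and apparatus for identifying spam email |
CN106156163B (zh) | | Text classification method and apparatus |
Baygın | | Classification of text documents based on Naive Bayes using N-Gram features |
CN110826618A (zh) | | Personal credit risk assessment method based on random forest |
CN109325125B (zh) | | Social network rumor detection method based on CNN optimization |
CN109376235B (zh) | | Feature selection method based on document-level word frequency reordering |
CN102945246A (zh) | | Method and apparatus for processing network information data |
CN103914551A (zh) | | Method for semantic information expansion and feature selection of microblogs |
CN111709463B (zh) | | Feature selection method based on exponential collaborative measurement |
CN108462624B (zh) | | Spam email identification method, apparatus, and electronic device |
CN111079427A (zh) | | Spam email identification method and system |
KR101585644B1 (ko) | | Document classification apparatus and method using word association analysis, and computer program therefor |
CN113468538A (zh) | | Method for constructing a vulnerability attack database based on similarity measurement |
JP5929532B2 (ja) | | Event detection device, event detection method, and event detection program |
US20230214415A1 (en) | | Text classification system based on feature selection and method thereof |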
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | TA01 | Transfer of patent application right | Effective date of registration: 2023-11-07. Address after: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province. Applicant after: Shenzhen Wanzhida Technology Co.,Ltd. Address before: 710048 Shaanxi Province, Xi'an, Beilin District, Jinhua Road No. 5. Applicant before: Xi'an University of Technology. |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 2024-04-29. Address after: No. 125, 2nd Floor, Unit 7, Building 62, Huangang Road, Tiexi District, Anshan City, Liaoning Province, 114000. Patentee after: Wang Yang; Che Chengwei. Country or region after: China. Address before: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province. Patentee before: Shenzhen Wanzhida Technology Co.,Ltd. Country or region before: China. |