CN110134791B - Data processing method, electronic device, and storage medium - Google Patents
Data processing method, electronic device, and storage medium
- Publication number
- CN110134791B (application CN201910424547.7A)
- Authority
- CN
- China
- Prior art keywords
- information
- clustering
- generalized
- cluster
- data processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Concepts (machine-extracted)
Name | Sections | Count |
---|---|---|
processing method | title, claims, abstract, description | 34 |
extraction | claims, abstract, description | 7 |
method | claims, description | 12 |
exclusion | claims, description | 2 |
diagram | description | 6 |
partitioning | description | 3 |
fragment | description | 2 |
material | description | 2 |
partition | description | 2 |
substitution reaction | description | 2 |
deep learning | description | 1 |
technology | description | 1 |
mobile communication | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910424547.7A (CN110134791B, zh) | 2019-05-21 | 2019-05-21 | Data processing method, electronic device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910424547.7A (CN110134791B, zh) | 2019-05-21 | 2019-05-21 | Data processing method, electronic device, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110134791A (zh) | 2019-08-16 |
CN110134791B (zh) | 2022-03-08 |
Family
ID=67572057
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910424547.7A (Active; CN110134791B, zh) | Data processing method, electronic device, and storage medium | 2019-05-21 | 2019-05-21 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110134791B (zh) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
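CN102945244A (zh) * | 2012-09-24 | 2013-02-27 | Nanjing University | Method for detecting and filtering duplicate Chinese web page documents based on full-stop feature strings |
CN103823809A (zh) * | 2012-11-16 | 2014-05-28 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method for classifying query phrases, method for optimizing the classification, and apparatus therefor |
CN104091054A (zh) * | 2014-06-26 | 2014-10-08 | Institute of Automation, Chinese Academy of Sciences | Short-text-oriented mass incident early warning method and system |
WO2016158768A1 (ja) * | 2015-03-30 | 2016-10-06 | MegaChips Corporation | Clustering device and machine learning device |
CN107451187A (zh) * | 2017-06-23 | 2017-12-08 | Tianjin University of Science and Technology | Method for discovering sub-topics in semi-structured short text collections based on a mutual-constraint topic model |
CN107516110A (zh) * | 2017-08-22 | 2017-12-26 | South China University of Technology | Semantic clustering method for medical question answering based on ensemble convolutional encoding |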
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5359399B2 (ja) * | 2009-03-11 | 2013-12-04 | Sony Corporation | Text analysis device and method, and program |
CN102831128B (zh) * | 2011-06-15 | 2015-03-25 | Fujitsu Limited | Method and device for classifying information about persons with the same name on the Internet |
US20160335544A1 (en) * | 2015-05-12 | 2016-11-17 | Claudia Bretschneider | Method and Apparatus for Generating a Knowledge Data Model |
CN106610965A (zh) * | 2015-10-21 | 2017-05-03 | Beijing Hansi Anxin Technology Co., Ltd. | Method and device for determining common subsequences of text strings |
- 2019-05-21: CN201910424547.7A filed; patent CN110134791B (zh), status Active
Also Published As
Publication number | Publication date |
---|---|
CN110134791A (zh) | 2019-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
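CN106250513B (zh) | | Event-modeling-based personalized event classification method and system |
Uysal et al. | | The impact of feature extraction and selection on SMS spam filtering |
EP3508990A1 (en) | | Natural language library generation method and device |
US10671813B2 (en) | | Performing actions based on determined intent of messages |
CN105022733B (zh) | | Dinfo‑oec text analysis and mining method and device |
CN107644106B (zh) | | Method, terminal device, and storage medium for automatically mining business intermediaries |
CN108616654A (zh) | | Message reminder method, apparatus, terminal, and computer-readable storage medium |
CN109716370B (zh) | | System and method for transmitting responses in a messaging application |
CN107832440B (zh) | | Data mining method, apparatus, server, and computer-readable storage medium |
CN111078742B (zh) | | User classification model training method, user classification method, and apparatus |
Yoo et al. | | Classification scheme of unstructured text document using TF-IDF and naive bayes classifier |
CN110442733A (zh) | | Topic generation method, apparatus, device, and medium |
CN112184169A (zh) | | Dynamic planning method, apparatus, device, and storage medium for user to-do items |
CN112632215A (zh) | | Community detection method and system based on a word-pair semantic topic model |
CN104424187A (zh) | | Method and apparatus for recommending friends to client users |
CN114861746A (zh) | | Big-data-based anti-fraud identification method, apparatus, and related device |
CN111415196A (zh) | | Advertisement recall method, apparatus, server, and storage medium |
CN110134791B (zh) | | Data processing method, electronic device, and storage medium |
Kaliyar et al. | | SMS spam filtering on multiple background datasets using machine learning techniques: A novel approach |
CN116597443A (zh) | | Material tag processing method, apparatus, electronic device, and medium |
CN110738048A (zh) | | Keyword extraction method, apparatus, and terminal device |
CN113011152B (zh) | | Text processing method, apparatus, device, and computer-readable storage medium |
CN113868410A (zh) | | SMS interception method, apparatus, device, and medium based on user interests |
CN104881395A (zh) | | Method and system for obtaining vector similarity in a matrix |
Urmi et al. | | A Proposal of Systematic SMS Spam Detection Model Using Supervised Machine Learning Classifiers |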
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CP01 | Change in the name or title of a patent holder | Address after: East of 1st floor, No.36 Haidian Street, Haidian District, Beijing, 100080; Patentee after: Beijing Teddy Future Technology Co.,Ltd.; Address before: East of 1st floor, No.36 Haidian Street, Haidian District, Beijing, 100080; Patentee before: Beijing Teddy Bear Mobile Technology Co.,Ltd. |
| | CP03 | Change of name, title or address | Address after: East of 1st floor, No.36 Haidian Street, Haidian District, Beijing, 100080; Patentee after: Beijing Teddy Bear Mobile Technology Co.,Ltd.; Address before: 100085 07a36, block D, 7 / F, No.28, information road, Haidian District, Beijing; Patentee before: BEIJING TEDDY BEAR MOBILE TECHNOLOGY Co.,Ltd. |