CN102819576A - Data mining method and system based on microblog - Google Patents

Publication number
CN102819576A
CN102819576A CN2012102546853A CN201210254685A CN102819576A CN 102819576 A CN102819576 A CN 102819576A CN 2012102546853 A CN2012102546853 A CN 2012102546853A CN 201210254685 A CN201210254685 A CN 201210254685A CN 102819576 A CN102819576 A CN 102819576A
Authority
CN
China
Prior art keywords
data
microblogging
module
machine learning
knowledge base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012102546853A
Other languages
Chinese (zh)
Inventor
郝文
白昱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUXI YAZUO ONLINE TECHNOLOGY DEVELOPMENT Co Ltd
Original Assignee
WUXI YAZUO ONLINE TECHNOLOGY DEVELOPMENT Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-07-23
Filing date
2012-07-23
Publication date
2012-12-12
Application filed by WUXI YAZUO ONLINE TECHNOLOGY DEVELOPMENT Co Ltd
Priority to CN2012102546853A
Publication of CN102819576A
Legal status: Pending

Abstract

The invention provides a microblog-based data mining method and system for the food service industry. The method comprises a training process and a judgment process. In the training process, relying on a knowledge base system, microblog sample data undergoes text preprocessing such as word segmentation and feature extraction and is used as the input of a machine learning algorithm; the algorithm creates a classifier and gives the classifier the criteria for classification. In the judgment process, relying on the knowledge base system, microblog data undergoes the same text preprocessing, the preprocessed data is sent to the classifier, and the positive or negative review result returned by the classifier is received. The invention also provides a microblog-based data mining system.

Description

A data mining method and system based on microblog
Technical field
The present invention relates to the field of data mining technology, and in particular to a data mining method and system based on microblog.
Background art
In the catering industry there is currently no product, based on microblog data mining, that provides data support for enterprise management decisions or consumer spending decisions.
Summary of the invention
The technical problem solved by the present invention is how to provide evaluation information about a product or service.
To solve the above problem, an embodiment of the invention provides a data mining method based on microblog, comprising the following processes:
Training process: relying on the knowledge base system, perform text preprocessing such as word segmentation and feature extraction on microblog sample data; use the result as the input of a machine learning algorithm; then, through the machine learning algorithm, create a classifier and give the classifier the criteria for classification;
Judgment process: relying on the knowledge base system, perform text preprocessing such as word segmentation and feature extraction on microblog data; send the preprocessed microblog data to the classifier; and receive the positive or negative review result returned by the classifier.
Further, as a preferred scheme, the knowledge base system is a multi-level tree-structured knowledge base system.
Further, as a preferred scheme, the machine learning algorithm is an extended Bayesian algorithm.
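The training and judgment processes can be illustrated with a minimal sketch. The text does not define the extended Bayesian algorithm, so the sketch below substitutes an ordinary multinomial naive Bayes classifier with add-one smoothing. The names `segment`, `NaiveBayesClassifier`, `train` and `judge`, the whitespace tokenizer standing in for word segmentation, and the toy English samples are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def segment(text):
    """Stand-in for word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    This is a plain naive Bayes stand-in, not the extended Bayesian
    algorithm named in the text, which is not specified there.
    """

    def __init__(self):
        self.label_counts = Counter()            # samples seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()

    def train(self, samples):
        """Training process: samples is an iterable of (text, label) pairs."""
        for text, label in samples:
            self.label_counts[label] += 1
            for word in segment(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def judge(self, text):
        """Judgment process: return the label with the highest log-posterior."""
        total_samples = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, sample_count in self.label_counts.items():
            score = math.log(sample_count / total_samples)
            label_total = sum(self.word_counts[label].values())
            for word in segment(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled microblog samples.
samples = [
    ("the noodles were delicious and the service was friendly", "positive"),
    ("great hotpot and fresh ingredients", "positive"),
    ("the soup was cold and the waiter was rude", "negative"),
    ("terrible dumplings and slow service", "negative"),
]
classifier = NaiveBayesClassifier()
classifier.train(samples)
result = classifier.judge("delicious hotpot and friendly waiter")
```

With these four toy samples, `result` is `"positive"`. A real deployment would need a Chinese word segmenter and far more labeled data than this sketch uses.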
An embodiment of the invention also provides a data mining system based on microblog, comprising:
Training module: relying on the knowledge base system module, first performs text preprocessing such as word segmentation and feature extraction on microblog sample data through a first text preprocessing module; the result is then used as the input of a machine learning module, which, through a machine learning algorithm, creates a classifier module and gives the classifier module the criteria for classification;
Judgment module: relying on the knowledge base system, first performs text preprocessing such as word segmentation and feature extraction on microblog data through a second text preprocessing module; sends the preprocessed microblog data to the classifier module; and receives the positive or negative review result returned by the classifier module and shows it on a display terminal;
Knowledge base system module: provides data to the first text preprocessing module, the second text preprocessing module and the machine learning module.
Because microblog data mining technology is used to provide evaluation information about a product or service, the invention helps catering enterprises discover the strengths and weaknesses of their own products or services and provides data support for enterprise management decisions.
Description of drawings
The present invention and many of its attendant advantages can be understood more completely by referring to the following detailed description in conjunction with the accompanying drawings. The drawings described here provide a further understanding of the invention and form a part of it; the illustrative embodiments and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of an embodiment of the mining method of the present invention;
Fig. 2 is a block diagram of an embodiment of the mining system of the present invention.
Embodiment
Embodiments of the invention are described below with reference to Figs. 1-2.
As shown in Fig. 1, a data mining method based on microblog comprises the following processes:
S1, training process: relying on the knowledge base system, perform text preprocessing such as word segmentation and feature extraction on microblog sample data; use the result as the input of a machine learning algorithm; then, through the machine learning algorithm, create a classifier and give the classifier the criteria for classification;
S2, judgment process: relying on the knowledge base system, perform text preprocessing such as word segmentation and feature extraction on microblog data; send the preprocessed microblog data to the classifier; and receive the positive or negative review result returned by the classifier.
The knowledge base system is a multi-level tree-structured knowledge base system. The machine learning algorithm is an extended Bayesian algorithm.
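One way a multi-level tree-structured knowledge base could support feature extraction is sketched below. The text does not describe the contents or layout of the tree, so the class `KnowledgeNode`, the function `extract_features`, and the catering concepts used here are hypothetical. The sketch keeps only tokens that the tree knows and tags each one with its chain of parent concepts.

```python
class KnowledgeNode:
    """One node of a hypothetical multi-level knowledge tree."""

    def __init__(self, name):
        self.name = name
        self.children = {}

    def add_path(self, path):
        """Insert a chain of nested concepts, e.g. ["food", "staple", "noodles"]."""
        node = self
        for name in path:
            if name not in node.children:
                node.children[name] = KnowledgeNode(name)
            node = node.children[name]
        return node

    def find_path(self, term, prefix=()):
        """Depth-first search; return the concept names leading to term, or None."""
        for name, child in self.children.items():
            path = prefix + (name,)
            if name == term:
                return path
            found = child.find_path(term, path)
            if found is not None:
                return found
        return None


def extract_features(tokens, knowledge_base):
    """Keep tokens known to the knowledge base, tagged with their ancestors."""
    features = []
    for token in tokens:
        path = knowledge_base.find_path(token)
        if path is not None:
            features.append("/".join(path))
    return features


# Hypothetical three-level tree for the catering domain.
root = KnowledgeNode("catering")
root.add_path(["food", "staple", "noodles"])
root.add_path(["food", "staple", "dumplings"])
root.add_path(["service", "staff", "waiter"])

features = extract_features(["the", "noodles", "and", "waiter"], root)
```

Here `features` is `["food/staple/noodles", "service/staff/waiter"]`: tokens absent from the tree are dropped, and the path prefix lets a classifier generalize from a specific dish to its parent category.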
As shown in Fig. 2, a data mining system based on microblog comprises:
Training module 1: relying on the knowledge base system module, first performs text preprocessing such as word segmentation and feature extraction on microblog sample data through the first text preprocessing module 11; the result is then used as the input of the machine learning module 12, which, through a machine learning algorithm, creates the classifier module 13 and gives the classifier module 13 the criteria for classification;
Judgment module 2: relying on the knowledge base system, first performs text preprocessing such as word segmentation and feature extraction on microblog data through the second text preprocessing module 21; sends the preprocessed microblog data to the classifier module 13; and receives the positive or negative review result returned by the classifier module 13 and shows it on the display terminal 22;
Knowledge base system module 3: provides data to the first text preprocessing module 11, the second text preprocessing module 21 and the machine learning module 12.
Embodiments of the invention have been described in detail above, but it will be readily apparent to those skilled in the art that many variations are possible without substantially departing from the inventive point and effect of the present invention. Such variations therefore also fall within the protection scope of the present invention.
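The module relationships in Fig. 2 can be sketched as a set of small classes. This is only an illustration of the wiring: the class names, the stop-word list supplied by the knowledge base, and the word-count scoring rule are assumptions, and the scoring rule is a placeholder rather than the extended Bayesian algorithm. For brevity the knowledge base here feeds only the two preprocessing modules, whereas the text also has it supply data to the machine learning module 12.

```python
from collections import Counter


class KnowledgeBaseModule:
    """Module 3: supplies shared data (here, a hypothetical stop-word list)."""

    def __init__(self, stop_words):
        self.stop_words = set(stop_words)


class TextPreprocessingModule:
    """Modules 11 and 21: word segmentation plus feature extraction."""

    def __init__(self, knowledge_base):
        self.knowledge_base = knowledge_base

    def process(self, text):
        tokens = text.lower().split()  # whitespace split stands in for segmentation
        return [t for t in tokens if t not in self.knowledge_base.stop_words]


class ClassifierModule:
    """Module 13: scores features against per-label word counts."""

    def __init__(self, word_counts):
        self.word_counts = word_counts

    def classify(self, features):
        scores = {label: sum(counts[f] for f in features)
                  for label, counts in self.word_counts.items()}
        return max(scores, key=scores.get)


class TrainingModule:
    """Module 1: preprocesses labeled samples and creates the classifier."""

    def __init__(self, knowledge_base):
        self.preprocessor = TextPreprocessingModule(knowledge_base)  # module 11

    def train(self, samples):
        word_counts = {}
        for text, label in samples:  # placeholder for machine learning module 12
            features = self.preprocessor.process(text)
            word_counts.setdefault(label, Counter()).update(features)
        return ClassifierModule(word_counts)


class JudgmentModule:
    """Module 2: preprocesses new posts and returns the classifier's verdict."""

    def __init__(self, knowledge_base, classifier):
        self.preprocessor = TextPreprocessingModule(knowledge_base)  # module 21
        self.classifier = classifier

    def judge(self, text):
        return self.classifier.classify(self.preprocessor.process(text))


knowledge_base = KnowledgeBaseModule(stop_words=["the", "was", "and"])
classifier = TrainingModule(knowledge_base).train([
    ("the fish was fresh and tasty", "positive"),
    ("the fish was stale and salty", "negative"),
])
verdict = JudgmentModule(knowledge_base, classifier).judge("tasty and fresh")
print(verdict)  # printing stands in for display terminal 22
```

Both preprocessing modules share one knowledge base instance, so training data and new posts are segmented and filtered in the same way.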

Claims (4)

1. A data mining method based on microblog, characterized in that it comprises the following processes: training process: relying on the knowledge base system, perform text preprocessing such as word segmentation and feature extraction on microblog sample data; use the result as the input of a machine learning algorithm; then, through the machine learning algorithm, create a classifier and give the classifier the criteria for classification;
Judgment process: relying on the knowledge base system, perform text preprocessing such as word segmentation and feature extraction on microblog data; send the preprocessed microblog data to the classifier; and receive the positive or negative review result returned by the classifier.
2. The data mining method based on microblog according to claim 1, characterized in that the knowledge base system is a multi-level tree-structured knowledge base system.
3. The data mining method based on microblog according to claim 1, characterized in that the machine learning algorithm is an extended Bayesian algorithm.
4. A data mining system based on microblog, characterized in that it comprises:
Training module: relying on the knowledge base system module, first performs text preprocessing such as word segmentation and feature extraction on microblog sample data through a first text preprocessing module; the result is then used as the input of a machine learning module, which, through a machine learning algorithm, creates a classifier module and gives the classifier module the criteria for classification;
Judgment module: relying on the knowledge base system, first performs text preprocessing such as word segmentation and feature extraction on microblog data through a second text preprocessing module; sends the preprocessed microblog data to the classifier module; and receives the positive or negative review result returned by the classifier module and shows it on a display terminal;
Knowledge base system module: provides data to the first text preprocessing module, the second text preprocessing module and the machine learning module.
CN2012102546853A 2012-07-23 2012-07-23 Data mining method and system based on microblog Pending CN102819576A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012102546853A CN102819576A (en) 2012-07-23 2012-07-23 Data mining method and system based on microblog

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012102546853A CN102819576A (en) 2012-07-23 2012-07-23 Data mining method and system based on microblog

Publications (1)

Publication Number Publication Date
CN102819576A true CN102819576A (en) 2012-12-12

Family

ID=47303687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012102546853A Pending CN102819576A (en) 2012-07-23 2012-07-23 Data mining method and system based on microblog

Country Status (1)

Country Link
CN (1) CN102819576A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101556582A (en) * 2008-04-09 2009-10-14 上海复旦光华信息科技股份有限公司 System for analyzing and predicting netizen interest in forum
CN102012985A (en) * 2010-11-19 2011-04-13 国网电力科学研究院 Sensitive data dynamic identification method based on data mining

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐禾芳等: "《基于搜索引擎和数据挖掘的博客营销》", 《商场现代化》, no. 527, 31 January 2008 (2008-01-31) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530402A (en) * 2013-10-23 2014-01-22 北京航空航天大学 Method for identifying microblog key users based on improved Page Rank
CN105868193A (en) * 2015-01-19 2016-08-17 富士通株式会社 Device and method used to detect product relevant information in electronic text
CN104615718A (en) * 2015-02-05 2015-05-13 北京航空航天大学 Hierarchical analysis method for social network emergency
CN104615718B (en) * 2015-02-05 2017-12-15 北京航空航天大学 The Hierarchy Analysis Method of social networks accident
CN104616198A (en) * 2015-02-12 2015-05-13 哈尔滨工业大学 P2P (peer-to-peer) network lending risk prediction system based on text analysis
CN104616198B (en) * 2015-02-12 2018-01-26 哈尔滨工业大学 A kind of P2P network loan Risk Forecast Systems based on text analyzing

Similar Documents

Publication Publication Date Title
CN109255565B (en) Address attribution identification and logistics task distribution method and device
CN109685052A (en) Method for processing text images, device, electronic equipment and computer-readable medium
CN104536953B (en) A kind of recognition methods of text emotional valence and device
CN104464733A (en) Multi-scene managing method and device of voice conversation
WO2017000716A3 (en) Image management method and device, and terminal device
WO2013014667A3 (en) System and methods for computerized machine-learning based authentication of electronic documents including use of linear programming for classification
WO2018045241A3 (en) Detection of anomalies in multivariate data
CN103955660A (en) Method for recognizing batch two-dimension code images
CN102819576A (en) Data mining method and system based on microblog
CN106886873A (en) The conjunction folk prescription method and conjunction single system of a kind of e-commerce order
CN105205081A (en) Article recommendation method and device
CN105404540A (en) Robot remote upgrading method, system and remote server
IN2015DE02745A (en)
CN104268134A (en) Subjective and objective classifier building method and system
CN104166725A (en) Phishing website detection method
CN105550253A (en) Method and device for obtaining type relation
CN105704691A (en) Scam short message recognition method and device
CN103093218B (en) The method of automatic identification form types and device
Ando et al. Globalization and domestic operations: Applying the JC/JD method to Japanese manufacturing firms
CN205038674U (en) Logistics management system based on computer
CN102662962B (en) Dynamic display method based on webpage elements
CN107992508B (en) Chinese mail signature extraction method and system based on machine learning
CN103218420A (en) Method and device for extracting page titles
CN103177374A (en) Service recommending method and service recommending system
CN103853536B (en) The method and apparatus that Service tracing is realized based on state transition diagram

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20121212